├── CHANGELOG.md
├── Unraid Auto Dataset Watcher & Converter v1
│   ├── README.md
│   └── Unraid Auto Dataset Watcher & Converter v1.sh
├── README.md
└── Unraid Auto Dataset Watcher & Converter v2.sh
/CHANGELOG.md:
--------------------------------------------------------------------------------
1 | **Changelog for Unraid Auto Dataset Watcher & Converter** 2 | 3 | 4 | 5 | [v2.0] - 2023-09-11 6 | Added 7 | 8 | New function to auto stop only Docker containers whose appdata is not ZFS based before conversion. 9 | New function to auto stop only VMs whose vdisk folder is not a dataset before conversion. 10 | Ability to add extra datasets to source_datasets_array. Users can now have the script process as many datasets as they like. 11 | 12 | Improved 13 | 14 | Various safety checks: 15 | Check if sources exist. 16 | Check if sources are datasets. 17 | Determine if there's any work to be done before the script runs. The script will not execute if there's no work needed. 18 | 19 | [v1.2] - 2023-09-09 20 | Added 21 | 22 | New function normalize_name to normalize German umlauts in dataset names. 23 | 24 | [v1.1] - 2023-07-16 25 | Improved 26 | 27 | Explicit logging when cleanup is disabled. Enhanced feedback regarding rsync operations and errors. 28 | 29 | [v1.0] - Original Release 30 | Added 31 | 32 | Initial release of the script designed for Unraid servers to manage ZFS datasets. 33 | Features: 34 | Stop Services: Ability to stop Docker containers and VMs. 35 | Rename Original Directories: Appends "_temp" suffix to directories to be converted. 36 | Create New Datasets: Converts directories into ZFS datasets. 37 | Populate New Datasets: Copies data from temporary directory to new dataset. 38 | Cleanup: Optional removal of temporary directories. 39 | Restart Services: Restarts Docker containers and VMs after operations. 40 |
--------------------------------------------------------------------------------
/Unraid Auto Dataset Watcher & Converter v1/README.md:
--------------------------------------------------------------------------------
1 | # Unraid Auto Dataset Watcher & Converter 2 | 3 | ```This version has been superseded by Unraid Auto Dataset Watcher & Converter v2``` 4 | 5 | 6 | This script, designed to run on an Unraid server using the User Scripts plugin, is a useful tool for managing your ZFS datasets. It actively monitors specified datasets, checking to ensure all top-level folders are actually ZFS datasets themselves. If any regular directories are detected, the script converts them into datasets. 7 | 8 | This functionality proves especially beneficial when managing, for example, an appdata share set up as a dataset. For instance, when a new Docker container is installed on Unraid, it generates that new container's appdata as a folder within the appdata dataset. This script identifies such instances and converts these folders into individual datasets. Ensuring each Docker container's appdata is isolated in its own dataset allows for precise snapshotting, greatly facilitating rollback operations in case of any issues with a particular container. It provides similar benefits for VMs, transforming newly created VM vdisks - which are typically established as folders - into datasets. These capabilities contribute towards more effective management and recovery of your Docker and VM data. 9 | 10 | ## Pre-requisites 11 | Before using the script, ensure the following: 12 | 13 | - Unraid server (version 6.12 or higher) with ZFS support.
14 | - [User Scripts](https://forums.unraid.net/topic/48286-plugin-user-scripts/) plugin is installed. 15 | - (Optional) [ZFS Master plugin](https://forums.unraid.net/topic/122261-plugin-zfs-master/) plugin is installed for enhanced ZFS functionality. 16 | - Plugins are installed via Unraid's Community Apps 17 | 18 | ## Setup 19 | 20 | 1. Install the User Scripts plugin on your Unraid server. 21 | 2. Add a new script and paste in the provided script. 22 | 3. Edit the script's variables according to your specific server configuration and needs. 23 | 24 | ## Variables 25 | The variables are located at the top of the script. The script as is contains demo variables which you should change to suit your needs. 26 | 27 | ``` 28 | dry_run="no" 29 | source_pool="cyberflux" 30 | source_dataset="appdata" 31 | should_stop_containers="yes" 32 | containers_to_keep_running=("Emby" "container2") 33 | should_stop_vms="yes" 34 | vms_to_keep_running=("Home Assistant" "vm2") 35 | cleanup="yes" 36 | replace_spaces="no" 37 | ``` 38 | 39 | - `dry_run`: This allows you to test the script without making changes to the system. If set to "yes", the script will print out what it would do without actually executing the commands. 40 | - `source_pool` and `source_dataset`: These are the ZFS pool name and dataset name where your source data resides which you want the script look for 'regular' directories to convert. 41 | - `should_stop_containers` and `should_stop_vms`: These decide whether the script should stop all Docker containers and VMs while it is running. 42 | - `containers_to_keep_running` and `vms_to_keep_running`: These are arrays where you can list the names of specific Docker containers and VMs that should not be stopped by the script. 43 | If you know certain containers or VMs do not need to be stopped (for example, these containers have appdata that is already a dataset or the container ie Plex its appdata is not in a different location. 44 | - `cleanup`: If set to "yes", the script will remove temporary data that was copied to create the new datasets. 45 | - `replace_spaces`: If set to "yes", the script will replace spaces in the names of datasets with underscores. Useful in some situations. 46 | 47 | ## Usage 48 | 49 | Install the script using the Unraid Userscripts Plugin. You can set it to run on a schedule according to your needs. 50 | 51 | When running this script manually, it is recommended to run it in the background then view logs to see the progress. This is especially important when running the script for the first time or when there is a large amount of data, as it may take some time. If you run the script in the foreground, the browser page needs to be kept open, otherwise, the script will terminate prematurely. 52 | 53 | ## Safeguards 54 | 55 | This script has been designed with several safeguards to prevent data loss and disruption: 56 | 57 | - The script will not stop Docker containers or VMs that are listed in the `containers_to_keep_running` or `vms_to_keep_running` arrays. This prevents unnecessary disruption to these services. 58 | - The script will not create a new dataset if there is not enough space in the ZFS pool. This prevents overfilling the pool and causing issues with existing data. 59 | - The `dry_run` option allows you to see what the script would do without it making any changes. This is useful for testing and debugging the script. 60 | 61 | ## Simplified Working Principle 62 | 63 | Here's how this script operates: 64 | 65 | 1. 
**Stopping Services**: If configured to do so, the script will first stop Docker containers and VMs running on your Unraid server. This prevents any data being written to or read from the directories that will be converted, ensuring data consistency and preventing potential corruption. However, you have the option to exclude certain containers or VMs if they do not require stopping, such as if they are already on separate datasets. 66 | 67 | 2. **Renaming Original Directories**: For each directory identified to be converted into a ZFS dataset, the script first renames it by appending a "_temp" suffix. This is done to prevent name conflicts when creating the new dataset and to safeguard the original data. 68 | 69 | 3. **Creating New Datasets**: The script then attempts to create a new ZFS dataset with the same name as the original directory. If the new dataset is successfully created, it moves on to the next step. If not (due to an error or insufficient space), the script will skip this directory and proceed to the next one. 70 | 71 | 4. **Populating New Datasets**: Once the new dataset is created, the script copies the data from the renamed (temporary) directory into the new dataset. This step is crucial as it ensures that all the original data is preserved in the new dataset. 72 | 73 | 5. **Cleanup**: If the `cleanup` variable is set to "yes", the script will delete the renamed directory and its contents after the data has been successfully copied to the new dataset. This process frees up space in the parent dataset. However, if the dataset creation or data copying fails for any reason, the renamed directory will not be removed, providing an opportunity for you to investigate the issue. 74 | 75 | 6. **Restarting Services**: Finally, if the script stopped any Docker containers or VMs at the start, it will restart these services. This ensures your applications continue running with minimal downtime. 76 | 77 | Remember, you can use the `dry_run` mode to simulate the script operation without making any actual changes. This mode allows you to see what the script would do before letting it operate on your data. This is especially useful for understanding how the script would interact with your specific configuration. 78 | -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 | # Unraid Auto Dataset Watcher & Converter v2 2 | 3 | 4 | This script is for converting directories into ZFS datasets on an Unraid server and runs in the User Scripts plugin. 5 | It's proficient in processing appdata from Docker Containers, vdisks from VMs and various other locations within a single run. For directories storing appdata or VM vdisk data, the script is able to detect active containers or VMs that are using these folders. It will automatically stop these containers or VMs prior to initiating the conversion. 6 | Set to operate on a schedule via Unraid user scripts, this tool then can continue to monitor datasets, making certain that their associated child folders remain as datasets. This is especially valuable when, for instance, installing a new container: its appdata will be converted automatically. Such functionality is invaluable for users keen on snapshotting individual containers, VMs, or various data structures. 7 | 8 | ## Overview 9 | 10 | The script will do the following 11 | 12 | - Evaluate whether a directory qualifies for conversion from a folder to a ZFS dataset. 
13 | - Intelligently stop relevant Docker containers or VMs that are tied to directories earmarked for ZFS dataset conversion. 14 | - Generate a new ZFS dataset and transfer the content of the folder to this dataset. 15 | - Restart the Docker containers or VMs once the conversion wraps up. 16 | - Provide a detailed report on what has been successfully converted. 17 | 18 | ## Pre-requisites 19 | 20 | Before using the script, ensure the following: 21 | 22 | - Unraid server (version 6.12 or higher) with ZFS support. 23 | - [User Scripts](https://forums.unraid.net/topic/48286-plugin-user-scripts/) plugin is installed. 24 | - (Optional) [ZFS Master plugin](https://forums.unraid.net/topic/122261-plugin-zfs-master/) plugin is installed for enhanced ZFS functionality. 25 | - Plugins are installed via Unraid's Community Apps 26 | 27 | ## Setup 28 | 29 | 1. Install the User Scripts plugin on your Unraid server. 30 | 2. Add a new script and paste in the provided script. 31 | 3. Edit the script's variables according to your specific server configuration and needs. 32 | 33 | ## Variables 34 | 35 | - `dry_run`: Set to "yes" if you only want to simulate a run without making any changes. Set to "no" to actually run the conversion. 36 | 37 | #### Docker Containers: 38 | 39 | If you want the script to process Docker appdata - 40 | 41 | - `should_process_containers`: Set to "yes" this tells the script the location is container appdata so it can safely deal with it. 42 | - `source_pool_where_appdata_is`: Specify the source pool containing the appdata. 43 | - `source_dataset_where_appdata_is`: Specify the source dataset for appdata. 44 | 45 | #### Virtual Machines: 46 | 47 | If you want the script to process VM vdisks - 48 | 49 | - `should_process_vms`: Set to "yes" this tells the script the location contains vdisks so it can safely deal with it. 50 | - `source_pool_where_vm_domains_are`: Specify the source pool containing the VM domains. 51 | - `source_dataset_where_vm_domains_are`: Specify the source dataset for VM domains. 52 | - `vm_forceshutdown_wait`: Duration (in seconds) to wait before force stopping a VM if it doesn't shut down gracefully. 53 | 54 | ### Additional User-Defined Datasets: 55 | 56 | This is where you can add other datasets (non appdata ot vm ones) to be processed by the script: 57 | 58 | - `source_datasets_array`: Specify custom paths in the format pool/dataset, e.g., "tank/mydata". 59 | 60 | 61 | ## Running the Script 62 | 63 | After you have configured the script, follow these steps: 64 | 65 | 1. Save any changes you've made to the script. 66 | 2. Run the script using the User Scripts plugin. For the initial run, if there are a significant number of folders requiring conversion, click the 'Run in Background' button. This ensures that you won't have to keep the browser window open, as closing it would otherwise terminate the script. 67 | 3. Configure the script to operate on a schedule that suits your needs, ensuring automated and timely conversions. 68 | 69 | ------------------------------------------------------------------ 70 | ------------------------------------------------------------------ 71 | 72 | **Simplified Working Principle:** 73 | 74 | 1. **Initialization**: 75 | 76 | - The script is initialized with several configuration parameters. 77 | - `dry_run`: If set to "yes", the script won't make any real changes but will only output what would happen. 78 | - `should_process_containers`: If set to "yes", the script will process and convert Docker containers' appdata. 
79 | - `should_process_vms`: If set to "yes", the script will process and convert Virtual Machines' disk folders. 80 | 2. **Dataset Path Check**: 81 | 82 | - If the user wants to process Docker containers or VMs, their corresponding dataset paths are added to the `source_datasets_array`. 83 | 3. **Utilities**: 84 | 85 | - `find_real_location()`: Identifies the actual physical location of a given path. 86 | - `is_zfs_dataset()`: Checks if a given path is a ZFS dataset. 87 | 4. **Stopping Containers**: 88 | 89 | - For each running Docker container: 90 | - If a container has bind mounts: 91 | - The script identifies the real location of the bind mounts. 92 | - If the bind mounts lie within the designated source appdata, the script checks if they're located within a ZFS dataset. 93 | - If the appdata is not a ZFS dataset (i.e., it's a folder), the script stops the container, intending to convert the folder to a ZFS dataset later on. 94 | 5. **Stopping VMs**: 95 | 96 | - For each running VM: 97 | - The script identifies the VM's disk. 98 | - If the disk's real location lies within the designated source VM domains, the script checks if it's inside a ZFS dataset. 99 | - If the VM's disk is not a ZFS dataset (i.e., it's a folder), the script attempts to shut down the VM. If it does not shut down within a specified wait time, the VM is forcefully stopped. 100 | 6. **Creating Datasets**: 101 | 102 | - For each folder in the designated source paths: 103 | - If the folder is not already a ZFS dataset: 104 | - The script checks if there's enough space to create a new dataset. 105 | - The folder is renamed with a "_temp" suffix. 106 | - A new ZFS dataset is created. 107 | - Contents from the "_temp" folder are copied (rsync'd) into the new ZFS dataset. 108 | - If the copying is successful and cleanup is enabled, the "_temp" folder is deleted. 109 | 7. **Restarting Containers & VMs**: 110 | 111 | - If containers were stopped earlier, they're restarted after the dataset conversions. 112 | - If VMs were stopped earlier, they're restarted after the dataset conversions. 113 | 8. **Logging**: 114 | 115 | - The script logs all actions taken, from the initial dataset path checks to the stopping and restarting of containers and VMs. 116 | 117 | _Key Concepts_: 118 | 119 | - **Bind Mount**: A type of mount where a source directory or file is superimposed onto a destination, making its contents accessible from the destination. Used heavily in Unraid Docker templates. 120 | 121 | - **ZFS Dataset**: A ZFS dataset can be thought of as a sort of advanced folder with features like compression, quota, and snapshot capabilities. 122 | 123 | - **rsync**: A fast, versatile utility for copying files and directories. It's often used for mirroring and backups. Keeps timestamps and permissions etc 124 | 125 | 126 | **How script Works**: 127 | 128 | 1. The script first checks whether it should process Docker containers or VMs based on the user's settings. 129 | 2. For Docker containers, the script examines their bind mounts. If any bind mount's true location resides inside a regular folder (and not a ZFS dataset) in the designated source path for appdata, that container is stopped. 130 | 3. Similarly, for VMs, the script checks the true location of their disks. VMs with disks residing inside regular folders in the designated source path for VM domains are stopped. 131 | 4. With the necessary containers and VMs stopped, the script converts relevant folders in the source paths into ZFS datasets. 132 | 5. 
Once the conversion process is done, the script restarts the containers and VMs it had stopped. 133 | 6. Prints the results. 134 | 135 | **CONTRIBUTE TO THE PROJECT** 136 | 137 | Your insights and expertise can make a difference! If you've identified improvements or have suggestions for the script, I'd truly appreciate your contributions. Help me make this tool even better. 138 | 139 | I'm open to feedback, code enhancements, or new ideas. 140 | 141 | 142 | **DISCLAIMER** 143 | 144 | While this script has been thoroughly tested and is believed to be reliable, unforeseen edge cases may arise. By using this software, you acknowledge potential risks and agree to use it at your own discretion. The author assumes no responsibility for any unintended outcomes. 145 | 146 | Use wisely and responsibly!!! 147 |
--------------------------------------------------------------------------------
/Unraid Auto Dataset Watcher & Converter v1/Unraid Auto Dataset Watcher & Converter v1.sh:
--------------------------------------------------------------------------------
1 | #!/bin/bash 2 | #set -x 3 | # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # 4 | # # Script for watching a dataset and auto updating regular folders converting them to datasets # # 5 | # # (needs Unraid 6.12 or above) # # 6 | # # by - SpaceInvaderOne # # 7 | # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # 8 | 9 | 10 | # real run or dry run 11 | dry_run="no" # Set to "yes" for a dry run. Change to "no" to run for real 12 | 13 | # 14 | # Main Variables 15 | source_pool="cyberflux" #this is the zpool in which your source dataset resides (note this does NOT start with /mnt/) 16 | source_dataset="appdata" #this is the name of the dataset you want to check and convert its child directories to datasets 17 | should_stop_containers="yes" # Setting to "yes" will stop all containers except those listed below. This should be set to yes if watching the appdata share 18 | containers_to_keep_running=("Emby" "container2") #Containers that you do not want to be stopped (see readme) 19 | should_stop_vms="no" #Setting to "yes" will stop all VMs except those listed below. This should be set to yes if watching the domain share 20 | vms_to_keep_running=("Home Assistant" "vm2") #VMs that you do not want to be stopped (see readme) 21 | cleanup="yes" #Setting to yes will clean up after running (see readme) 22 | replace_spaces="no" # Set this to "no" to keep spaces in the dataset names 23 | # 24 | # 25 | #Advanced variables - you do not need to change these. 26 | source_path="${source_pool}/${source_dataset}" 27 | mount_point="/mnt" 28 | stopped_containers=() 29 | stopped_vms=() 30 | converted_folders=() 31 | buffer_zone=11 # this is a buffer zone for additional free space needed in the dataset, set as a percentage value between 1 and 100. 32 | # it should be set a little higher than the minimum free space floor you have set in the Unraid GUI for the zpool 33 | 34 | # 35 | # This function is to stop running Docker containers if required 36 | stop_docker_containers() { 37 | if [ "$should_stop_containers" = "yes" ]; then 38 | echo "Checking Docker containers..." 39 | for container in $(docker ps -q); do 40 | container_name=$(docker container inspect --format '{{.Name}}' "$container" | cut -c 2-) 41 | if !
[[ " ${containers_to_keep_running[@]} " =~ " ${container_name} " ]]; then 42 | echo "Stopping Docker container ${container_name}..." 43 | if [ "$dry_run" != "yes" ]; then 44 | docker stop "$container" 45 | stopped_containers+=($container) # Save the id of the stopped container 46 | else 47 | echo "Dry Run: Docker container ${container_name} would be stopped" 48 | fi 49 | fi 50 | done 51 | fi 52 | } 53 | 54 | # 55 | # this function is to stoprunning VMs if required 56 | stop_virtual_machines() { 57 | if [ "$should_stop_vms" = "yes" ]; then 58 | echo "Checking VMs..." 59 | # Get the list of running vms 60 | running_vms=$(virsh list --name | awk NF) 61 | oldIFS=$IFS 62 | IFS=$'\n' 63 | 64 | for vm in $running_vms; do 65 | # restore the IFS 66 | IFS=$oldIFS 67 | 68 | # Check if VM is in the array of VMs to keep running 69 | if ! [[ " ${vms_to_keep_running[@]} " =~ " ${vm} " ]]; then 70 | echo "Stopping VM $vm..." 71 | if [ "$dry_run" != "yes" ]; then 72 | # Shutdown the VM then wait for it to stop 73 | virsh shutdown "$vm" 74 | for i in {1..18}; do 75 | if virsh domstate "$vm" | grep -q 'shut off'; then 76 | break 77 | fi 78 | if ((i == 18)); then 79 | virsh destroy "$vm" 80 | fi 81 | sleep 5 82 | done 83 | stopped_vms+=("$vm") # Save the name of the stopped VM 84 | else 85 | echo "Dry Run: VM $vm would be stopped" 86 | fi 87 | fi 88 | # cchange IFS back to handle newline only for the next loop iteration 89 | IFS=$'\n' 90 | done 91 | # restore the IFS 92 | IFS=$oldIFS 93 | fi 94 | } 95 | 96 | # 97 | # Function to start Docker containers which had been stopped earlier 98 | start_docker_containers() { 99 | if [ "$should_stop_containers" = "yes" ]; then 100 | for container in ${stopped_containers[@]}; do 101 | echo "Restarting Docker container $(docker container inspect --format '{{.Name}}' "$container")..." 102 | if [ "$dry_run" != "yes" ]; then 103 | docker start "$container" 104 | else 105 | echo "Dry Run: Docker container $(docker container inspect --format '{{.Name}}' "$container") would be restarted" 106 | fi 107 | done 108 | fi 109 | } 110 | 111 | 112 | # 113 | # function starts VMs that had been stopped earlier 114 | start_virtual_machines() { 115 | if [ "$should_stop_vms" = "yes" ]; then 116 | for vm in "${stopped_vms[@]}"; do 117 | echo "Restarting VM $vm..." 118 | if [ "$dry_run" != "yes" ]; then 119 | virsh start "$vm" 120 | else 121 | echo "Dry Run: VM $vm would be started" 122 | fi 123 | done 124 | fi 125 | } 126 | 127 | # function normalize umlauts 128 | normalize_name() { 129 | local original_name="$1" 130 | # Replace German umlauts with ASCII approximations 131 | local normalized_name=$(echo "$original_name" | 132 | sed 's/ä/ae/g; s/ö/oe/g; s/ü/ue/g; 133 | s/Ä/Ae/g; s/Ö/Oe/g; s/Ü/Ue/g; 134 | s/ß/ss/g') 135 | echo "$normalized_name" 136 | } 137 | 138 | 139 | # 140 | # main function creating/converting to datasets and copying data within 141 | create_datasets() { 142 | for entry in "${mount_point}/${source_path}"/*; do 143 | base_entry=$(basename "$entry") 144 | if [[ "$base_entry" != *_temp ]]; then 145 | base_entry_no_spaces=$(if [ "$replace_spaces" = "yes" ]; then echo "$base_entry" | tr ' ' '_'; else echo "$base_entry"; fi) 146 | normalized_base_entry=$(normalize_name "$base_entry_no_spaces") 147 | if zfs list -o name | grep -q "^${source_path}/${normalized_base_entry}$"; then 148 | echo "Skipping dataset ${entry}..." 149 | elif [ -d "$entry" ]; then 150 | echo "Processing folder ${entry}..." 
151 | folder_size=$(du -sb "$entry" | cut -f1) # This is in bytes 152 | folder_size_hr=$(du -sh "$entry" | cut -f1) # This is in human readable 153 | echo "Folder size: $folder_size_hr" 154 | buffer_zone_size=$((folder_size * buffer_zone / 100)) 155 | if zfs list | grep -q "$source_path" && (( $(zfs list -o avail -p -H "${source_path}") >= buffer_zone_size )); then 156 | echo "Creating and populating new dataset ${source_path}/${normalized_base_entry}..." 157 | if [ "$dry_run" != "yes" ]; then 158 | mv "$entry" "${mount_point}/${source_path}/${normalized_base_entry}_temp" 159 | if zfs create "${source_path}/${normalized_base_entry}"; then 160 | rsync -a "${mount_point}/${source_path}/${normalized_base_entry}_temp/" "${mount_point}/${source_path}/${normalized_base_entry}/" 161 | rsync_exit_status=$? 162 | if [ "$cleanup" = "yes" ] && [ $rsync_exit_status -eq 0 ]; then 163 | echo "Validating copy..." 164 | source_file_count=$(find "${mount_point}/${source_path}/${normalized_base_entry}_temp" -type f | wc -l) 165 | destination_file_count=$(find "${mount_point}/${source_path}/${normalized_base_entry}" -type f | wc -l) 166 | source_total_size=$(du -sb "${mount_point}/${source_path}/${normalized_base_entry}_temp" | cut -f1) 167 | destination_total_size=$(du -sb "${mount_point}/${source_path}/${normalized_base_entry}" | cut -f1) 168 | if [ "$source_file_count" -eq "$destination_file_count" ] && [ "$source_total_size" -eq "$destination_total_size" ]; then 169 | echo "Validation successful, cleanup can proceed." 170 | rm -r "${mount_point}/${source_path}/${normalized_base_entry}_temp" 171 | converted_folders+=("$entry") # Save the name of the converted folder 172 | else 173 | echo "Validation failed. Source and destination file count or total size do not match." 174 | echo "Source files: $source_file_count, Destination files: $destination_file_count" 175 | echo "Source total size: $source_total_size, Destination total size: $destination_total_size" 176 | fi 177 | elif [ "$cleanup" = "no" ]; then 178 | echo "Cleanup is disabled.. Skipping cleanup for ${entry}" 179 | else 180 | echo "Rsync encountered an error. 
Skipping cleanup for ${entry}" 181 | fi 182 | else 183 | echo "Failed to create new dataset ${source_path}/${normalized_base_entry}" 184 | fi 185 | fi 186 | else 187 | echo "Skipping folder ${entry} due to insufficient space" 188 | fi 189 | fi 190 | fi 191 | done 192 | } 193 | 194 | print_new_datasets() { 195 | echo "The following folders were successfully converted to datasets:" 196 | for folder in "${converted_folders[@]}"; do 197 | echo "$folder" 198 | done 199 | } 200 | # 201 | # 202 | # Run the functions 203 | stop_docker_containers 204 | stop_virtual_machines 205 | create_datasets 206 | start_docker_containers 207 | start_virtual_machines 208 | print_new_datasets 209 | 210 | -------------------------------------------------------------------------------- /Unraid Auto Dataset Watcher & Converter v2.sh: -------------------------------------------------------------------------------- 1 | #!/bin/bash 2 | # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # 3 | # # Script for watching a dataset and auto updating regular folders converting them to datasets # # 4 | # # (needs Unraid 6.12 or above) # # 5 | # # by - SpaceInvaderOne # # 6 | # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # 7 | #set -x 8 | 9 | ## Please consider this script in beta at the moment. 10 | ## new functions 11 | ## Auto stop only docker containers whose appdata is not zfs based. 12 | ## Auto stop only vms whose vdisk folder is not a dataset 13 | ## Add extra datasets to auto update to source_datasets_array 14 | ## Normalises German umlauts into ascii 15 | ## Various safety and other checks 16 | 17 | # --------------------------------------- 18 | # Main Variables 19 | # --------------------------------------- 20 | 21 | # real run or dry run 22 | dry_run="no" # Set to "yes" for a dry run. Change to "no" to run for real 23 | 24 | # Paths 25 | # --------------------------------------- 26 | 27 | # Process Docker Containers 28 | should_process_containers="no" # set to "yes" to process and convert appdata. set paths below 29 | source_pool_where_appdata_is="sg1_storage" #source pool 30 | source_dataset_where_appdata_is="appdata" #source appdata dataset 31 | 32 | # Process Virtual Machines 33 | should_process_vms="no" # set to "yes" to process and convert vm vdisk folders. set paths below 34 | source_pool_where_vm_domains_are="darkmatter_disks" # source pool 35 | source_dataset_where_vm_domains_are="domains" # source domains dataset 36 | vm_forceshutdown_wait="90" # how long to wait for vm to shutdown without force stopping it 37 | 38 | # Additional User-Defined Datasets 39 | # Add more paths as needed in the format pool/dataset in quotes, for example: "tank/mydata" 40 | source_datasets_array=( 41 | # ... user-defined paths here ... 42 | ) 43 | 44 | cleanup="yes" 45 | replace_spaces="no" 46 | 47 | # --------------------------------------- 48 | # Advanced Variables - No need to modify 49 | # --------------------------------------- 50 | 51 | # Check if container processing is set to "yes". If so, add location to array and create bind mount compare variable. 
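# The ^[Yy]es$ match below accepts "yes" or "Yes". The resulting source_path_appdata is stored as
# pool/dataset without a /mnt prefix; stop_docker_containers later prepends /mnt when comparing
# container bind mounts against it, and the same pattern applies to source_path_vms further down.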
52 | if [[ "$should_process_containers" =~ ^[Yy]es$ ]]; then 53 | source_datasets_array+=("${source_pool_where_appdata_is}/${source_dataset_where_appdata_is}") 54 | source_path_appdata="$source_pool_where_appdata_is/$source_dataset_where_appdata_is" 55 | fi 56 | 57 | # Check if VM processing is set to "yes". If so, add location to array and create vdisk compare variable. 58 | if [[ "$should_process_vms" =~ ^[Yy]es$ ]]; then 59 | source_datasets_array+=("${source_pool_where_vm_domains_are}/${source_dataset_where_vm_domains_are}") 60 | source_path_vms="$source_pool_where_vm_domains_are/$source_dataset_where_vm_domains_are" 61 | fi 62 | 63 | mount_point="/mnt" 64 | stopped_containers=() 65 | stopped_vms=() 66 | converted_folders=() 67 | buffer_zone=11 68 | 69 | #-------------------------------- 70 | # FUNCTIONS START HERE # 71 | #-------------------------------- 72 | 73 | #------------------------------------------------------------------------------------------------- 74 | # this function finds the real location of union folder ie unraid /mnt/user 75 | # 76 | find_real_location() { 77 | local path="$1" 78 | 79 | if [[ ! -e $path ]]; then 80 | echo "Path not found." 81 | return 1 82 | fi 83 | 84 | for disk_path in /mnt/*/; do 85 | if [[ "$disk_path" != "/mnt/user/" && -e "${disk_path%/}${path#/mnt/user}" ]]; then 86 | echo "${disk_path%/}${path#/mnt/user}" 87 | return 0 88 | fi 89 | done 90 | 91 | echo "Real location not found." 92 | return 2 93 | } 94 | 95 | #--------------------------- 96 | # this function checks if location is an actively mounted ZFS dataset or not 97 | # 98 | is_zfs_dataset() { 99 | local location="$1" 100 | 101 | if zfs list -H -o mounted,mountpoint | grep -q "^yes"$'\t'"$location$"; then 102 | return 0 103 | else 104 | return 1 105 | fi 106 | } 107 | 108 | #----------------------------------------------------------------------------------------------------------------------------------- # 109 | # this function checks the running containers and sees if bind mounts are folders or datasets and shuts down containers if needed # 110 | stop_docker_containers() { 111 | if [ "$should_process_containers" = "yes" ]; then 112 | echo "Checking Docker containers..." 113 | 114 | for container in $(docker ps -q); do 115 | local container_name=$(docker container inspect --format '{{.Name}}' "$container" | cut -c 2-) 116 | local bindmounts=$(docker inspect --format '{{ range .Mounts }}{{ if eq .Type "bind" }}{{ .Source }}{{printf "\n"}}{{ end }}{{ end }}' $container) 117 | 118 | if [ -z "$bindmounts" ]; then 119 | echo "Container ${container_name} has no bind mounts so nothing to convert. No need to stop the container." 120 | continue 121 | fi 122 | 123 | local stop_container=false 124 | 125 | while IFS= read -r bindmount; do 126 | if [[ "$bindmount" == /mnt/user/* ]]; then 127 | bindmount=$(find_real_location "$bindmount") 128 | if [[ $? -ne 0 ]]; then 129 | echo "Error finding real location for $bindmount in container $container_name." 130 | continue 131 | fi 132 | fi 133 | 134 | # check if bind mount matches source_path_appdata, if not, skip it 135 | if [[ "$bindmount" != "/mnt/$source_path_appdata"* ]]; then 136 | continue 137 | fi 138 | 139 | local immediate_child=$(echo "$bindmount" | sed -n "s|^/mnt/$source_path_appdata/||p" | cut -d "/" -f 1) 140 | local combined_path="/mnt/$source_path_appdata/$immediate_child" 141 | 142 | is_zfs_dataset "$combined_path" 143 | if [[ $? -eq 1 ]]; then 144 | echo "The appdata for container ${container_name} is not a ZFS dataset (it's a folder). 
Container will be stopped so it can be converted to a dataset." 145 | stop_container=true 146 | break 147 | fi 148 | done <<< "$bindmounts" # send bindmounts into the loop 149 | 150 | if [ "$stop_container" = true ]; then 151 | docker stop "$container" 152 | stopped_containers+=("$container_name") 153 | else 154 | echo "Container ${container_name} is not required to be stopped as it is already a separate dataset." 155 | fi 156 | done 157 | 158 | if [ "${#stopped_containers[@]}" -gt 0 ]; then 159 | echo "The container/containers ${stopped_containers[*]} has/have been stopped during conversion and will be restarted afterwards." 160 | fi 161 | fi 162 | } 163 | #---------------------------------------------------------------------------------- 164 | # this function restarts any containers that had to be stopped 165 | # 166 | start_docker_containers() { 167 | if [ "$should_process_containers" = "yes" ]; then 168 | for container_name in "${stopped_containers[@]}"; do 169 | echo "Restarting Docker container $container_name..." 170 | if [ "$dry_run" != "yes" ]; then 171 | docker start "$container_name" 172 | else 173 | echo "Dry Run: Docker container $container_name would be restarted" 174 | fi 175 | done 176 | fi 177 | } 178 | 179 | 180 | # ---------------------------------------------------------------------------------- 181 | #this function gets dataset path from the full vdisk path 182 | # 183 | get_dataset_path() { 184 | local fullpath="$1" 185 | # Extract dataset path 186 | echo "$fullpath" | rev | cut -d'/' -f2- | rev 187 | } 188 | 189 | #------------------------------------------ 190 | # this function getsvdisk info from a vm 191 | # 192 | get_vm_disk() { 193 | local vm_name="$1" 194 | # Redirecting debug output to stderr 195 | echo "Fetching disk for VM: $vm_name" >&2 196 | 197 | # Get target (like hdc, hda, etc.) 198 | local vm_target=$(virsh domblklist "$vm_name" --details | grep disk | awk '{print $3}') 199 | 200 | # Check if target was found 201 | if [ -n "$vm_target" ]; then 202 | # Get the disk for the given target 203 | local vm_disk=$(virsh domblklist "$vm_name" | grep "$vm_target" | awk '{$1=""; print $0}' | sed 's/^[ \t]*//;s/[ \t]*$//') 204 | # Redirecting debug output to stderr 205 | echo "Found disk for $vm_name at target $vm_target: $vm_disk" >&2 206 | echo "$vm_disk" 207 | else 208 | # Redirecting error output to stderr 209 | echo "Disk not found for VM: $vm_name" >&2 210 | return 1 211 | fi 212 | } 213 | 214 | #----------------------------------------------------------------------------------------------------------------------------------- 215 | # this function checks the vdisks any running vm. If visks is not inside a dataset it will stop the vm for processing the conversion 216 | stop_virtual_machines() { 217 | if [ "$should_process_vms" = "yes" ]; then 218 | echo "Checking running VMs..." 219 | 220 | while IFS= read -r vm; do 221 | if [ -z "$vm" ]; then 222 | # Skip if VM name is empty 223 | continue 224 | fi 225 | 226 | local vm_disk=$(get_vm_disk "$vm") 227 | 228 | # If the disk is not set, skip this vm 229 | if [ -z "$vm_disk" ]; then 230 | echo "No disk found for VM $vm. Skipping..." 231 | continue 232 | fi 233 | 234 | # Check if VM disk is in a folder and matches source_path_vms 235 | if [[ "$vm_disk" == /mnt/user/* ]]; then 236 | vm_disk=$(find_real_location "$vm_disk") 237 | if [[ $? -ne 0 ]]; then 238 | echo "Error finding real location for $vm_disk in VM $vm." 
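                    # The path could not be resolved to a physical location, so this VM is skipped.
                    # Note that $vm_disk now holds the text echoed by find_real_location (e.g.
                    # "Real location not found.") rather than the original path, because the command
                    # substitution above captured that output.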
239 | continue 240 | fi 241 | fi 242 | 243 | # Check if vm_disk matches source_path_vms, if not, skip it 244 | if [[ "$vm_disk" != "/mnt/$source_path_vms"* ]]; then 245 | continue 246 | fi 247 | 248 | local dataset_path=$(get_dataset_path "$vm_disk") 249 | local immediate_child=$(echo "$dataset_path" | sed -n "s|^/mnt/$source_path_vms/||p" | cut -d "/" -f 1) 250 | local combined_path="/mnt/$source_path_vms/$immediate_child" 251 | 252 | is_zfs_dataset "$combined_path" 253 | if [[ $? -eq 1 ]]; then 254 | echo "The vdisk for VM ${vm} is not a ZFS dataset (it's a folder). VM will be stopped so it can be converted to a dataset." 255 | 256 | if [ "$dry_run" != "yes" ]; then 257 | virsh shutdown "$vm" 258 | 259 | # waiting loop for the VM to shutdown 260 | local start_time=$(date +%s) 261 | while virsh dominfo "$vm" | grep -q 'running'; do 262 | sleep 5 263 | local current_time=$(date +%s) 264 | if (( current_time - start_time >= $vm_forceshutdown_wait )); then 265 | echo "VM $vm has not shut down after $vm_forceshutdown_wait seconds. Forcing shutdown now." 266 | virsh destroy "$vm" 267 | break 268 | fi 269 | done 270 | else 271 | echo "Dry Run: VM $vm would be stopped" 272 | fi 273 | stopped_vms+=("$vm") 274 | else 275 | echo "VM ${vm} is not required to be stopped as its vdisk is already in its own dataset." 276 | fi 277 | done < <(virsh list --name | grep -v '^$') # filter empty lines 278 | 279 | if [ "${#stopped_vms[@]}" -gt 0 ]; then 280 | echo "The VM/VMs ${stopped_vms[*]} has/have been stopped during conversion and will be restarted afterwards." 281 | fi 282 | fi 283 | } 284 | 285 | #---------------------------------------------------------------------------------- 286 | # this function restarts any vms that had to be stopped 287 | # 288 | start_virtual_machines() { 289 | if [ "$should_process_vms" = "yes" ]; then 290 | for vm in "${stopped_vms[@]}"; do 291 | echo "Restarting VM $vm..." 292 | if [ "$dry_run" != "yes" ]; then 293 | virsh start "$vm" 294 | else 295 | echo "Dry Run: VM $vm would be restarted" 296 | fi 297 | done 298 | fi 299 | } 300 | 301 | #---------------------------------------------------------------------------------- 302 | # this function normalises umlauts into ascii 303 | # 304 | normalize_name() { 305 | local original_name="$1" 306 | # Replace German umlauts with ASCII approximations 307 | local normalized_name=$(echo "$original_name" | 308 | sed 's/ä/ae/g; s/ö/oe/g; s/ü/ue/g; 309 | s/Ä/Ae/g; s/Ö/Oe/g; s/Ü/Ue/g; 310 | s/ß/ss/g') 311 | echo "$normalized_name" 312 | } 313 | 314 | #---------------------------------------------------------------------------------- 315 | # this function creates the new datasets and does the conversion 316 | # 317 | create_datasets() { 318 | local source_path="$1" 319 | for entry in "${mount_point}/${source_path}"/*; do 320 | base_entry=$(basename "$entry") 321 | if [[ "$base_entry" != *_temp ]]; then 322 | base_entry_no_spaces=$(if [ "$replace_spaces" = "yes" ]; then echo "$base_entry" | tr ' ' '_'; else echo "$base_entry"; fi) 323 | normalized_base_entry=$(normalize_name "$base_entry_no_spaces") 324 | 325 | if zfs list -o name | grep -qE "^${source_path}/${normalized_base_entry}$"; then 326 | echo "Skipping dataset ${entry}..." 327 | elif [ -d "$entry" ]; then 328 | echo "Processing folder ${entry}..." 
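                # Conversion steps for this folder: measure its size, confirm the pool reports at least
                # buffer_zone_size (folder_size * buffer_zone / 100) bytes available, rename the folder
                # to <name>_temp, zfs create the new dataset, rsync the _temp contents into it, then
                # compare file counts and total bytes before the _temp copy is removed (cleanup only
                # happens when cleanup="yes" and rsync exited cleanly; dry_run skips all changes).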
329 | folder_size=$(du -sb "$entry" | cut -f1) # This is in bytes 330 | folder_size_hr=$(du -sh "$entry" | cut -f1) # This is in human readable 331 | echo "Folder size: $folder_size_hr" 332 | buffer_zone_size=$((folder_size * buffer_zone / 100)) 333 | 334 | if zfs list -o name | grep -qE "^${source_path}" && (( $(zfs list -o avail -p -H "${source_path}") >= buffer_zone_size )); then 335 | echo "Creating and populating new dataset ${source_path}/${normalized_base_entry}..." 336 | if [ "$dry_run" != "yes" ]; then 337 | mv "$entry" "${mount_point}/${source_path}/${normalized_base_entry}_temp" 338 | if zfs create "${source_path}/${normalized_base_entry}"; then 339 | rsync -a "${mount_point}/${source_path}/${normalized_base_entry}_temp/" "${mount_point}/${source_path}/${normalized_base_entry}/" 340 | rsync_exit_status=$? 341 | if [ "$cleanup" = "yes" ] && [ $rsync_exit_status -eq 0 ]; then 342 | echo "Validating copy..." 343 | source_file_count=$(find "${mount_point}/${source_path}/${normalized_base_entry}_temp" -type f | wc -l) 344 | destination_file_count=$(find "${mount_point}/${source_path}/${normalized_base_entry}" -type f | wc -l) 345 | source_total_size=$(du -sb "${mount_point}/${source_path}/${normalized_base_entry}_temp" | cut -f1) 346 | destination_total_size=$(du -sb "${mount_point}/${source_path}/${normalized_base_entry}" | cut -f1) 347 | if [ "$source_file_count" -eq "$destination_file_count" ] && [ "$source_total_size" -eq "$destination_total_size" ]; then 348 | echo "Validation successful, cleanup can proceed." 349 | rm -r "${mount_point}/${source_path}/${normalized_base_entry}_temp" 350 | converted_folders+=("$entry") # Save the name of the converted folder 351 | else 352 | echo "Validation failed. Source and destination file count or total size do not match." 353 | echo "Source files: $source_file_count, Destination files: $destination_file_count" 354 | echo "Source total size: $source_total_size, Destination total size: $destination_total_size" 355 | fi 356 | elif [ "$cleanup" = "no" ]; then 357 | echo "Cleanup is disabled.. Skipping cleanup for ${entry}" 358 | else 359 | echo "Rsync encountered an error. Skipping cleanup for ${entry}" 360 | fi 361 | else 362 | echo "Failed to create new dataset ${source_path}/${normalized_base_entry}" 363 | fi 364 | fi 365 | else 366 | echo "Skipping folder ${entry} due to insufficient space" 367 | fi 368 | fi 369 | fi 370 | done 371 | } 372 | 373 | 374 | 375 | #---------------------------------------------------------------------------------- 376 | # this function prints what has been converted 377 | # 378 | print_new_datasets() { 379 | echo "The following folders were successfully converted to datasets:" 380 | for folder in "${converted_folders[@]}"; do 381 | echo "$folder" 382 | done 383 | } 384 | 385 | #---------------------------------------------------------------------------------- 386 | # this function checks if there any folders to covert in the array and if not exits. Also checks sources are valid locations 387 | # 388 | can_i_go_to_work() { 389 | echo "Checking if anything needs converting" 390 | 391 | # Check if the array is empty 392 | if [ ${#source_datasets_array[@]} -eq 0 ]; then 393 | echo "No sources are defined." 394 | echo "If you're expecting to process 'appdata' or VMs, ensure the respective variables are set to 'yes'." 395 | echo "For other datasets, please add their paths to 'source_datasets_array'." 396 | echo "No work for me to do. Exiting..." 
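        # Nothing was added to source_datasets_array, so there is nothing to convert - bail out
        # before touching containers, VMs or datasets.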
397 | exit 1 398 | fi 399 | 400 | local folder_count=0 401 | local total_sources=${#source_datasets_array[@]} 402 | local sources_with_only_datasets=0 403 | 404 | for source_path in "${source_datasets_array[@]}"; do 405 | # Check if source exists 406 | if [[ ! -e "${mount_point}/${source_path}" ]]; then 407 | echo "Error: Source ${mount_point}/${source_path} does not exist. Please ensure the specified path is correct." 408 | exit 1 409 | fi 410 | 411 | # Check if source is a dataset 412 | if ! zfs list -o name | grep -q "^${source_path}$"; then 413 | echo "Error: Source ${source_path} is a folder. Sources must be a dataset to host child datasets. Please verify your configuration." 414 | exit 1 415 | else 416 | echo "Source ${source_path} is a dataset and valid for processing ..." 417 | fi 418 | 419 | local current_source_folder_count=0 420 | for entry in "${mount_point}/${source_path}"/*; do 421 | base_entry=$(basename "$entry") 422 | if [ -d "$entry" ] && ! zfs list -o name | grep -q "^${source_path}/$(echo "$base_entry")$"; then 423 | 424 | current_source_folder_count=$((current_source_folder_count + 1)) 425 | fi 426 | done 427 | 428 | if [ "$current_source_folder_count" -eq 0 ]; then 429 | echo "All children in ${mount_point}/${source_path} are already datasets. No work to do for this source." 430 | sources_with_only_datasets=$((sources_with_only_datasets + 1)) 431 | else 432 | echo "Folders found in ${source_path} that need converting..." 433 | fi 434 | 435 | folder_count=$((folder_count + current_source_folder_count)) 436 | done 437 | 438 | if [ "$folder_count" -eq 0 ]; then 439 | echo "All children in all sources are already datasets. No work to do... Exiting" 440 | exit 1 441 | fi 442 | } 443 | 444 | 445 | #------------------------------------------------------------------------------------- 446 | # this function runs through a loop sending all datasets to process the create_datasets 447 | # 448 | convert() { 449 | for dataset in "${source_datasets_array[@]}"; do 450 | create_datasets "$dataset" 451 | done 452 | } 453 | 454 | #-------------------------------- 455 | # RUN THE FUNCTIONS # 456 | #-------------------------------- 457 | can_i_go_to_work 458 | stop_docker_containers 459 | stop_virtual_machines 460 | convert 461 | start_docker_containers 462 | start_virtual_machines 463 | print_new_datasets 464 | 465 | --------------------------------------------------------------------------------