├── LICENSE
├── README.md
└── rotcheck

/LICENSE:
--------------------------------------------------------------------------------
The MIT License (MIT)

Copyright (c) 2019, Jamie Nguyen .

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in
all copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
THE SOFTWARE.

--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
# rotcheck

A simple shell script to **recursively generate, update and verify checksums**
for files you care about. It's **useful for detecting bit rot**.

It's written in POSIX shell, but requires GNU coreutils, BusyBox or some other
collection that includes similar checksum tools.

## Advantages compared with other tools

Simple and should work forever (because it's POSIX shell). It's also trivial
to edit the code for your own needs.
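
If you do want to adapt it, the core cycle is small. The following is a minimal
sketch of the generate-then-verify idea, not the script's actual
implementation. It assumes GNU coreutils `sha512sum` is installed; the function
names `make_checksums` and `verify_checksums_in` are made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: record a checksum for every regular file under
# directory $1 into the file $2. The redirection is evaluated before the
# `cd`, so $2 is resolved relative to the caller's working directory.
make_checksums() {
    ( cd "$1" && find . -type f -exec sha512sum -- {} + ) > "$2"
}

# Hypothetical helper: verify the checksum list in file $2 against
# directory $1. The list is fed on stdin so that $2 does not need to be an
# absolute path. Returns non-zero if any file no longer matches.
verify_checksums_in() {
    ( cd "$1" && sha512sum -c ) < "$2"
}
```

For example, `make_checksums /backups /tmp/sums` followed later by
`verify_checksums_in /backups /tmp/sums` would report any file whose contents
changed in between. Keep the checksum list outside the scanned directory in
this sketch, otherwise the list itself gets checksummed.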

## Disadvantages compared with other tools

Verifying checksums is just as fast as any other tool, but generating or
updating checksums is slower (because it's POSIX shell).

Maybe use a faster tool if you have millions of files and regularly update your
checksums.

# Installation

Download [this
script](https://raw.githubusercontent.com/jamielinux/rotcheck/master/rotcheck)
to somewhere in your `$PATH`. For example:

```
curl -O https://raw.githubusercontent.com/jamielinux/rotcheck/master/rotcheck
sudo cp rotcheck /usr/local/bin/rotcheck
sudo chmod +x /usr/local/bin/rotcheck
rotcheck -h
```

# Usage

Create the first checksum file (located at `./.rotcheck`):

```shell
$ cd /backups
$ rotcheck -a
```

You've added some new files and need to append some checksums:

```shell
$ cd /backups
$ rotcheck -av
ADDED: ./backups/foo/one.tar.gz
ADDED: ./backups/foo/two.tar.gz
```

You've edited some files and need to update the checksums:

```shell
$ cd /backups
$ rotcheck -uv
CHANGED: ./backups/bar/three.tar.gz
```

Verify checksums:

```shell
$ cd /backups
$ rotcheck -c
./backups/baz/bitrot.tar.gz: FAILED
sha512sum: WARNING: 1 of 49231 computed checksums did NOT match
```

## Full help text

```
Usage: rotcheck MODE [OPTIONS]
   or: rotcheck MODE [OPTIONS] -- [DIRECTORY]... [ARBITRARY FIND OPTION]...
Recursively generate, update and verify checksums.

MODES:
  -a    APPEND mode: Record checksums for any files without a checksum
        already. Never modify existing checksums.
  -c    CHECK mode: Check that files' checksums are the same.
  -d    DELETE mode: Remove checksums for files that don't exist.
  -u    APPEND-AND-UPDATE mode: Like append-only mode, but also update
        checksums for files with a modification date newer than the
        checksum file. (NB: Also see `-M`.)

OPTIONS:
  -b COMMAND    Checksum command to use. Default: sha512sum
  -f FILE       File to store checksums. For relative paths, prefix with "./"
                or the checksum file will be checksummed. Default: ./.rotcheck
  -h            Display this help.
  -n            Don't follow symlinks. The default is to follow symlinks.
  -v            Be more verbose when adding, deleting, changing or verifying
                checksums.
  -w            Warn about improperly formatted checksum lines.
  -x            Exclude all hidden files and directories when generating
                checksums. The default is to include them.
  -M            Use with `-u` to update checksums regardless of modification
                time. This is very slow so avoid if possible; try `touch`
                instead to bump the modification time of specific files.
                WARNING: The checksums might have changed due to bit rot so
                use this option with care!

  (specific to GNU coreutils >= 8.25)
  -i            Ignore missing files when verifying checksums.


Supported commands:
  GNU coreutils:
    md5sum, sha1sum, sha224sum, sha256sum, sha384sum, sha512sum, b2sum

  BusyBox (applets must be symlinked):
    md5sum, sha1sum, sha256sum, sha512sum, sha3sum

  BSD & macOS (install GNU coreutils):
    gmd5sum, gsha1sum, gsha224sum, gsha256sum, gsha384sum, gsha512sum, gb2sum


Examples:
  # Create checksum file (located at "./.rotcheck"):
  rotcheck -a

  # You've added some new files and need to append some checksums:
  rotcheck -va

  # You've edited some files and need to update the checksums (for files with
  # a modification time newer than the checksum file):
  rotcheck -vu

  # Verify checksums:
  rotcheck -c

  # Search other directories instead of the current directory.
  # WARNING: checksums might get duplicated if mixing relative and absolute
  # paths, or if you change the way you specify directory paths!
  rotcheck -a -- /mnt/archive-2018/ /mnt/archive-2019/

  # Exclude .git folders (these arguments are passed directly to find):
  rotcheck -a -- ! -path '*/\.git/*'

```

--------------------------------------------------------------------------------
/rotcheck:
--------------------------------------------------------------------------------
#!/bin/sh
set -uf
IFS="$(printf '\n\t')"
LC_ALL="C"

# Copyright (C) 2019 Jamie Nguyen
#
# A simple shell script to recursively generate, update and verify checksums
# for files you care about. It's useful for detecting bit rot.
#
# It's written in POSIX shell, but requires GNU coreutils, BusyBox or some
# other collection that includes similar checksum tools.

VERSION=1.1.2
COMMAND="sha512sum"
CHECKFILE="./.rotcheck"

APPEND_MODE=0
CHECK_MODE=0
DELETE_MODE=0
UPDATE_MODE=0

IGNORE_MISSING=0
FOLLOW_SYMLINKS=1
VERBOSE=0
WARN_FORMATTING=0
EXCLUDE_HIDDEN=0
FORCE_UPDATE=0

usage() {
    cat << EOF
rotcheck $VERSION
Usage: rotcheck MODE [OPTIONS]
   or: rotcheck MODE [OPTIONS] -- [DIRECTORY]... [ARBITRARY FIND OPTION]...
Recursively generate, update and verify checksums.

MODES:
  -a    APPEND mode: Record checksums for any files without a checksum
        already. Never modify existing checksums.
  -c    CHECK mode: Check that files' checksums are the same.
  -d    DELETE mode: Remove checksums for files that don't exist.
  -u    APPEND-AND-UPDATE mode: Like append-only mode, but also update
        checksums for files with a modification date newer than the
        checksum file. (NB: Also see \`-M\`.)

OPTIONS:
  -b COMMAND    Checksum command to use. Default: sha512sum
  -f FILE       File to store checksums. For relative paths, prefix with "./"
                or the checksum file will be checksummed. Default: ./.rotcheck
  -h            Display this help.
  -n            Don't follow symlinks. The default is to follow symlinks.
  -v            Be more verbose when adding, deleting, changing or verifying
                checksums.
  -w            Warn about improperly formatted checksum lines.
  -x            Exclude all hidden files and directories when generating
                checksums. The default is to include them.
  -M            Use with \`-u\` to update checksums regardless of modification
                time. This is very slow so avoid if possible; try \`touch\`
                instead to bump the modification time of specific files.
                WARNING: The checksums might have changed due to bit rot so
                use this option with care!

  (specific to GNU coreutils >= 8.25)
  -i            Ignore missing files when verifying checksums.


Supported commands:
  GNU coreutils:
    md5sum, sha1sum, sha224sum, sha256sum, sha384sum, sha512sum, b2sum

  BusyBox (applets must be symlinked):
    md5sum, sha1sum, sha256sum, sha512sum, sha3sum

  BSD & macOS (install GNU coreutils):
    gmd5sum, gsha1sum, gsha224sum, gsha256sum, gsha384sum, gsha512sum, gb2sum


Examples:
  # Create checksum file (located at "./.rotcheck"):
  rotcheck -a

  # You've added some new files and need to append some checksums:
  rotcheck -va

  # You've edited some files and need to update the checksums (for files with
  # a modification time newer than the checksum file):
  rotcheck -vu

  # Verify checksums:
  rotcheck -c

  # Search other directories instead of the current directory.
  # WARNING: checksums might get duplicated if mixing relative and absolute
  # paths, or if you change the way you specify directory paths!
  rotcheck -a -- /mnt/archive-2018/ /mnt/archive-2019/

  # Exclude .git folders (these arguments are passed directly to find):
  rotcheck -a -- ! -path '*/\\.git/*'

EOF
    exit 0
}

fail() {
    printf '%s\n' "$@"; exit 1
}

# Curiously, I stumbled across a bug in bash-3.0.16 (c. 2004) or older
# where \0177 (DEL) isn't handled properly. See the `find_safe` function below.
# bash-3.1 (c. 2005), dash-0.5.2 (c. 2005), and zsh-3.1 (c. 2000) all work
# and probably others too.
# The expansions are quoted: an unquoted empty expansion would leave `[ -n ]`,
# which is always true.
if [ -n "${BASH+x}" ] && [ -n "${BASH_VERSION+x}" ]; then
    if printf '%s' "${BASH_VERSION:-x}" | grep -qE '^[0-2]+|^3\.0'; then
        fail "bash-3.0.16 and older are broken." \
             "Try bash>=3.1, dash, zsh, or another POSIX shell."
    fi
fi

# Command-line arguments. `getopts` is POSIX, while `getopt` is not.
[ $# -gt 0 ] && [ "$1" = "--help" ] && usage
while getopts ":acdub:f:hinvwxM" opt; do
    case "$opt" in
        a) APPEND_MODE=1;;
        c) CHECK_MODE=1;;
        d) DELETE_MODE=1;;
        u) UPDATE_MODE=1;;
        b) COMMAND="$OPTARG";;
        f) CHECKFILE="$OPTARG";;
        h) usage;;
        i) IGNORE_MISSING=1;;
        n) FOLLOW_SYMLINKS=0;;
        v) VERBOSE=1;;
        w) WARN_FORMATTING=1;;
        x) EXCLUDE_HIDDEN=1;;
        M) FORCE_UPDATE=1;;
        \?) fail "-$OPTARG: Invalid argument";;
        :) fail "-$OPTARG requires an argument";;
    esac
done; shift $(($OPTIND - 1))

# A few sanity checks.
MODE=$(($APPEND_MODE + $CHECK_MODE + $DELETE_MODE + $UPDATE_MODE))
if [ $MODE -eq 0 ]; then
    fail "Please specify one of -a, -c, -d, or -u." \
         "See \`rotcheck -h\` for help with usage."
elif [ $MODE -gt 1 ]; then
    fail "You can only use one of -a, -c, -d, or -u options." \
         "See \`rotcheck -h\` for help with usage."
elif [ $CHECK_MODE -eq 1 ] || [ $DELETE_MODE -eq 1 ]; then
    if [ ! -f "$CHECKFILE" ]; then
        fail "$CHECKFILE: No such file." \
             "Try running \`rotcheck -a\` first, or see \`rotcheck -h\`."
    fi
elif ! command -v "$COMMAND" >/dev/null 2>/dev/null; then
    fail "$COMMAND: command not found" \
         "Try specifying a supported command using \`rotcheck -b COMMAND\`." \
         "You may need to install GNU coreutils or BusyBox." \
         "On *BSD, GNU coreutils commands begin with 'g', like 'gsha512sum'." \
         "See \`rotcheck -h\` for help with usage."
fi

# When printing text to terminal, make sure it won't do anything unexpected.
printf_sanitized() {
    printf '%s' "$@" | tr -d '[:cntrl:]' | iconv -cs -f UTF-8 -t UTF-8
    printf '\n'
}

verify_checksums() {
    IGNORE="" ; [ $IGNORE_MISSING -eq 1 ] && IGNORE="--ignore-missing"
    WARN="" ; [ $WARN_FORMATTING -eq 1 ] && WARN="-w"
    $COMMAND -c $WARN $IGNORE -- "$CHECKFILE"
}

# Just verify checksums.
if [ $CHECK_MODE -eq 1 ]; then
    # Only GNU coreutils supports `--quiet`, so use `grep -v` instead.
    # Unfortunately, pipefail isn't POSIX so to return the exit status from the
    # checksum command, we have to be clever (aka crazy) with file descriptors
    # and subshells instead.
    if [ $VERBOSE -eq 1 ]; then
        verify_checksums
        exit $?
    else
        # fd 4: the script's real stdout, where grep writes the filtered output.
        exec 4>&1
        (
            # fd 3: the pipe that carries the exit status to `read` below.
            exec 3>&1
            (
                # 2>&1 preserves order of stdout/stderr.
                verify_checksums 2>&1; printf '%d' $? 1>&3
            ) | grep -Ev ': OK$' 1>&4
            exec 3>&-
        ) | ( read -r retval; exit $retval ); retval=$?
        exec 4>&-
        exit $retval
    fi
fi

# Delete checksums for files that no longer exist.
if [ $DELETE_MODE -eq 1 ]; then
    i=1
    for file in $(cut -d ' ' -f 3- -- "$CHECKFILE"); do
        # `sed -i` isn't POSIX (nor is `mktemp`), so use `ex` instead.
        if [ ! -f "$file" ]; then
            cat << EOF | ex -s -- "$CHECKFILE"
${i}d
x
EOF
            # Print what checksums were deleted.
            if [ $VERBOSE -eq 1 ]; then
                printf '%s' "DELETED: "
                printf_sanitized "$file"
            fi
        else
            # Only increment the line number if we didn't delete a line.
            i=$(($i + 1))
        fi
    done
    exit $?
fi

# For safety and sanity, ignore all filenames that have control characters
# like newline, tab, delete etc.
find_safe() {
    FIND_L=""
    FIND_FOLLOW=""
    if [ $FOLLOW_SYMLINKS -eq 1 ]; then
        # Old versions of findutils don't have -L. Use it if available.
        if find -L / -maxdepth 0 -type d >/dev/null 2>/dev/null; then
            FIND_L="-L"
        else
            FIND_FOLLOW="-follow"
        fi
    fi

    # POSIX find requires that you specify the search path either first
    # or immediately after -H/-L. Use current directory by default unless
    # user has specified a path.
    FIND_DOT="./"
    if [ $# -gt 0 ]; then
        first_char="$(printf '%s' "$1" | cut -c 1)"
        # Replace search path unless first arg is a non-path `find` option.
        if [ "$first_char" != "-" ] \
            && [ "$first_char" != "!" ] && [ "$first_char" != "(" ]; then
            FIND_DOT=""
        fi
    fi

    HIDDEN=""
    [ $EXCLUDE_HIDDEN -eq 1 ] && HIDDEN='*/\.*'

    find $FIND_L $FIND_DOT "$@" $FIND_FOLLOW \
        -type f ! -path "$CHECKFILE" ! -path "$HIDDEN" \
        ! -name "$(printf '*%b*' '\0001')" ! -name "$(printf '*%b*' '\0002')" \
        ! -name "$(printf '*%b*' '\0003')" ! -name "$(printf '*%b*' '\0004')" \
        ! -name "$(printf '*%b*' '\0005')" ! -name "$(printf '*%b*' '\0006')" \
        ! -name "$(printf '*%b*' '\0007')" ! -name "$(printf '*%b*' '\0010')" \
        ! -name "$(printf '*%b*' '\0011')" ! -name "$(printf '*%b*' '\0012')" \
        ! -name "$(printf '*%b*' '\0013')" ! -name "$(printf '*%b*' '\0014')" \
        ! -name "$(printf '*%b*' '\0015')" ! -name "$(printf '*%b*' '\0016')" \
        ! -name "$(printf '*%b*' '\0017')" ! -name "$(printf '*%b*' '\0020')" \
        ! -name "$(printf '*%b*' '\0021')" ! -name "$(printf '*%b*' '\0022')" \
        ! -name "$(printf '*%b*' '\0023')" ! -name "$(printf '*%b*' '\0024')" \
        ! -name "$(printf '*%b*' '\0025')" ! -name "$(printf '*%b*' '\0026')" \
        ! -name "$(printf '*%b*' '\0027')" ! -name "$(printf '*%b*' '\0030')" \
        ! -name "$(printf '*%b*' '\0031')" ! -name "$(printf '*%b*' '\0032')" \
        ! -name "$(printf '*%b*' '\0033')" ! -name "$(printf '*%b*' '\0034')" \
        ! -name "$(printf '*%b*' '\0035')" ! -name "$(printf '*%b*' '\0036')" \
        ! -name "$(printf '*%b*' '\0037')" ! -name "$(printf '*%b*' '\0177')"
}

find_updated_files() {
    if [ $FORCE_UPDATE -eq 1 ]; then
        find_safe "$@"
    else
        find_safe "$@" -newer "$CHECKFILE"
    fi
}

# This function could be replaced entirely with the much simpler:
#   cut -d ' ' -f 3- "$CHECKFILE" | grep -Fxn -- "$file" | cut -d ':' -f 1
# But this function is slightly faster as it avoids passing huge chunks of text
# (ie, the whole checksum file minus the first column) through a pipe.
get_line_number() {
    # Avoid `grep -E` as filename characters might get interpreted (eg, $).
    for l in $(grep -Fn -- "$file" "$CHECKFILE" | cut -d ':' -f 1); do
        if sed -n -e "${l}p" -- "$CHECKFILE" \
            | cut -d ' ' -f 3- | grep -Fxq -- "$file" >/dev/null; then
            printf '%d' "$l"
            return 0
        fi
    done
    printf '%d' "0"
}

umask 077
# For files with a modification date newer than the checksum file, if there's
# an existing checksum then update it. Otherwise append a new checksum.
if [ $UPDATE_MODE -eq 1 ] && [ -f "$CHECKFILE" ]; then
    for file in $(find_updated_files "$@"); do
        line_num="$(get_line_number)"
        if [ ${line_num:-0} -eq 0 ]; then
            # No checksum yet, so append one.
            $COMMAND -- "$file" >> "$CHECKFILE"
        else
            old="$(sed -n -e "${line_num}p" -- "$CHECKFILE" | cut -d ' ' -f 1)"
            new="$($COMMAND -- "$file")"
            # Should never happen, but double check these aren't empty:
            if [ -z "${old:+x}" ] || [ -z "${new:+x}" ]; then
                continue
            fi
            # `sed -i` isn't POSIX (nor is `mktemp`), so use `ex` instead.
            if [ "$old" != "${new%% *}" ]; then
                cat << EOF | ex -s -- "$CHECKFILE"
${line_num}c
$new
.
x
EOF
                # Bail immediately if something went wrong.
                [ $? -ne 0 ] && fail "Failed to update checksum file."

                # Print what checksums were changed.
                if [ $VERBOSE -eq 1 ]; then
                    printf '%s' "CHANGED: "
                    printf_sanitized "$file"
                fi
            fi
        fi
    done
fi

# Append checksums for files that have no checksum yet.
if [ $APPEND_MODE -eq 1 ] || [ $UPDATE_MODE -eq 1 ]; then
    for file in $(find_safe "$@"); do
        # Use `grep -F` so filename characters aren't interpreted as a regular
        # expression (eg, $ or [).
        # The first grep isn't strictly needed, but grep+cut+grep is faster
        # than just cut+grep here.
        if [ ! -f "$CHECKFILE" ] || ! grep -F -- "$file" "$CHECKFILE" \
            | cut -d ' ' -f 3- | grep -Fxq -- "$file"; then
            if ! $COMMAND -- "$file" >> "$CHECKFILE"; then
                fail "Failed to write to checksum file."
            fi

            # Print what checksums were appended.
            if [ $VERBOSE -eq 1 ]; then
                printf '%s' "ADDED: "
                printf_sanitized "$file"
            fi
        fi
    done
fi

--------------------------------------------------------------------------------
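
The CHECK-mode branch of the script above returns the checksum command's exit
status through a filtering pipe without relying on `pipefail`. The same
file-descriptor technique can be tried in isolation. This is only a sketch:
the wrapper function `run_filtered` is a hypothetical name that does not exist
in rotcheck, and it assumes file descriptors 3 and 4 are not already in use.

```shell
#!/bin/sh
# Hypothetical wrapper: run "$@" with stderr merged into stdout, hide every
# output line ending in ": OK", and return the exit status of "$@" itself
# rather than the exit status of grep.
run_filtered() {
    # fd 4 keeps a handle on the caller's stdout for grep's output.
    exec 4>&1
    (
        # Inside this subshell stdout is the pipe to `read`; keep it as fd 3.
        exec 3>&1
        (
            # Run the command, then send its exit status down fd 3.
            "$@" 2>&1; printf '%d' $? 1>&3
        ) | grep -Ev ': OK$' 1>&4
        exec 3>&-
    ) | ( read -r retval; exit "$retval" ); retval=$?
    exec 4>&-
    return $retval
}
```

For instance, `run_filtered sha512sum -c -- ./.rotcheck` would print only the
lines that are not `: OK` while still reporting failure through its return
status.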