├── COPYING
├── README.md
└── awsbackup.sh

/COPYING:
--------------------------------------------------------------------------------
DO WHAT THE FUCK YOU WANT TO PUBLIC LICENSE
Version 2, December 2004

Copyright (C) 2004 Sam Hocevar

Everyone is permitted to copy and distribute verbatim or modified
copies of this license document, and changing it is allowed as long
as the name is changed.

DO WHAT THE FUCK YOU WANT TO PUBLIC LICENSE
TERMS AND CONDITIONS FOR COPYING, DISTRIBUTION AND MODIFICATION

0. You just DO WHAT THE FUCK YOU WANT TO.

--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
# awsbackup.sh

This is a bash script I use for storing my private picture collection in AWS S3 Glacier Deep Archive.
I store around 120GB of pictures (~550 archives) for only ~0.12 USD/month!

It was designed with the following criteria in mind:
* Uploaded archives never change; only new archives are added from time to time.
* The total number of archives stays below 10,000.
* No exotic dependencies; it should run on any Linux system.
* I keep two local copies of all archives; AWS is used only as the off-site backup, i.e. for the "My house burned to the ground." scenario.
* Bit-rot in the local copies can be detected.
* File contents and file names are encrypted, so Amazon cannot read any of my data.
* I use a passphrase of more than 40 characters instead of a public/private key pair. This is less convenient, but otherwise I would have to worry about losing the private key, too.
* It should be as simple as possible; restoring an archive should be possible without any scripts, too.

# How it works

Every local copy of the backup has the following file structure:
```
awsbackup/1999-02-17_Holiday.tar.xz.enc
awsbackup/2014-08-31_Wedding.tar.xz.enc
awsbackup/2017-09-01_Holiday_2.tar.xz.enc
awsbackup/...[many more]...
ETAGS.txt
SHA256.txt
awsbackup.sh
```
All actual content goes into the folder ```awsbackup```.
Additionally, some text files containing checksums are stored, too.
A new local copy can be made by simply copying all of the files above somewhere else.

Whenever I want to add a new archive with pictures, I copy them from my smartphone or camera to my laptop.
Let's assume the new pictures are stored in a folder called ```DCIM```.
Then I run the command ```awsbackup.sh add ~/DCIM/ 2019-05-02_Business_Trip```, which creates a new xz-compressed, encrypted and checksummed archive.
This is an offline operation which does not require internet access.
I can delete the DCIM folder afterwards, as I now have one local, encrypted copy of it.

Afterwards I usually run ```awsbackup.sh cloud-sync```, which uploads all local archives that are not yet stored in the cloud.
It also warns about files which are stored in the cloud but have no local copy.
The integrity of all files in AWS is checked by calculating the ETag hash locally and comparing it to the one returned by AWS S3.

From time to time, I copy the local archives on my laptop to other local copies, for instance on an external HDD.
When I do this, I usually also run ```awsbackup.sh local-verify``` in the local copies, which basically just runs ```sha256sum -c SHA256.txt``` in order to verify the integrity of the local files against the stored SHA256 hashes.
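Because an archive is nothing more than ```tar | xz | openssl```, a restore also works without this script.
A minimal sketch, assuming the default of 1000000 PBKDF2 iterations from ```awsbackup.sh``` and an illustrative archive name (openssl prompts for the passphrase):
```
# decrypt, decompress and unpack a single archive by hand
openssl enc -d -salt -aes-256-ctr -pbkdf2 -iter 1000000 \
    -in awsbackup/1999-02-17_Holiday.tar.xz.enc | unxz | tar xv
```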
# Dependencies

You will usually need to install the following tools:

* xz
* openssl
* jq
* aws-cli

Additionally, the following trivial ones are most likely already installed on your Linux system:

* bash
* md5sum
* sha256sum
* dd
* xxd
* cut
* tar

--------------------------------------------------------------------------------
/awsbackup.sh:
--------------------------------------------------------------------------------
#!/usr/bin/env bash
#
# ./awsbackup.sh - personal backup with AWS S3 Glacier Deep Archive
#
# written by Klaus Eisentraut, May 2019, last update May 2020
#
# This work is free. It comes without any warranty, to the extent permissible
# by applicable law. You can redistribute it and/or modify it under the
# terms of the Do What The Fuck You Want To Public License, Version 2,
# as published by Sam Hocevar. See http://www.wtfpl.net/ for more details.
#

# -----------------------------------------------------------------------------
# TODO: adjust everything below for your personal needs
PASSWORD_LOCATION='/tmp/backup-password' # for convenience, the encryption password can be stored (temporarily!) in this file
ITERATIONS=1000000                   # number of PBKDF2 iterations
BUCKET="my-bucket"                   # name of the AWS bucket
LOCAL="/mnt/backup"                  # path to the directory where the local copy of the archives is stored
FOLDER="awsbackup"                   # name of the subfolder in the bucket and in the local directory; this is where the actual data goes
MULTIPART_CHUNKSIZE=$((8*1024*1024)) # must match your AWS CLI settings! The default is 8MiB for both values.
MULTIPART_THRESHOLD=$((8*1024*1024))
STORAGE_CLASS="DEEP_ARCHIVE"         # AWS S3 storage class, DEEP_ARCHIVE is the cheapest one for long-term archiving
# We need to catch wrong passwords because a backup with a wrong encryption password is useless.
# For the initial setup, run
#   echo "backupsalt" | openssl enc -e -nosalt -aes-256-cbc -pbkdf2 -iter "$ITERATIONS" -pass pass:"$pass" | xxd -p
# and copy 4 hexadecimal characters out of it (they can be taken from the middle, too).
# This catches 1 - 29/16^4 > 99.9% of all wrong passwords while not giving any
# meaningful advantage to a brute-force attacker.
PASSWORD_HASH='b4e4'
# TODO: adjust everything above
# -----------------------------------------------------------------------------

# fail on all errors
set -e

# This helper function reads the password either from a file, or from stdin.
function getPassword {
    if [ -f "$PASSWORD_LOCATION" ]; then
        pass=$(cat "$PASSWORD_LOCATION")
    else
        echo -n "Enter password: "
        read -r -s pass; echo ""   # -s keeps the passphrase off the terminal
    fi
    # check the password against the stored hash fragment
    if [[ ! $(echo "backupsalt" | openssl enc -e -nosalt -aes-256-cbc -pbkdf2 -iter "$ITERATIONS" -pass pass:"$pass" | xxd -p) =~ "$PASSWORD_HASH" ]]; then
        echo "FATAL ERROR: Password wrong, exit."
        exit 1
    fi
}

# en/decrypt a filename deterministically into a URL-safe string
# please be aware that we use CBC mode here; CTR mode with -nosalt would be very dangerous!
function encryptString {
    echo -n "$1" | openssl enc -e -nosalt -aes-256-cbc -pbkdf2 -iter "$ITERATIONS" -pass pass:"$pass" -a -A | tr '/+' '_-'
}
function decryptString {
    echo -n "$1" | tr '_-' '/+' | openssl enc -d -nosalt -aes-256-cbc -pbkdf2 -iter "$ITERATIONS" -pass pass:"$pass" -a -A
}
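# NOTE: the filename encryption above is deliberately deterministic (-nosalt,
# so key and IV are derived from the passphrase alone): cloud-sync must be able
# to re-derive the exact same ciphertext name locally in order to look a file
# up in the bucket. A usage sketch (the ciphertext shown is a placeholder, not
# real output):
#   enc=$(encryptString "1999-02-17_Holiday.tar.xz.enc")  # -> URL-safe base64, e.g. "kf3Js..._-Q="
#   decryptString "$enc"                                  # -> "1999-02-17_Holiday.tar.xz.enc"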
# check local archives for bit-rot by calculating SHA256 sums and comparing them against the hashes calculated during creation
function localverify {
    echo "local verification has started, please be patient"
    cd "$LOCAL"/ >/dev/null
    sha256sum -c --quiet SHA256.txt
    cd - >/dev/null
    # sha256sum would have aborted this script otherwise (set -e)
    echo "SUCCESS: local verification did not detect errors"
}

# calculate the AWS S3 etag of a local file, see https://stackoverflow.com/a/19896823
function etagHash {
    filename="$1"
    if [[ ! -f "$filename" ]]; then echo "FATAL ERROR: wrong usage of etagHash"; exit 1; fi
    size=$(du -b "$filename" | cut -f1)
    if [[ "$size" -lt "$MULTIPART_THRESHOLD" ]]; then
        # etag is simply the md5sum
        md5sum "$filename" | cut -d ' ' -f 1
    else
        # etag is the MD5 of the concatenated per-chunk MD5s, suffixed with "-<number of chunks>"
        part=0
        offset=0
        tmp=$(mktemp /tmp/awsbackup.XXXXXXX)
        while [[ "$offset" -lt "$size" ]]; do
            dd if="$filename" bs="$MULTIPART_CHUNKSIZE" skip="$part" count=1 2>/dev/null | md5sum | cut --bytes=-32 | tr -d '\n' >> "$tmp"
            part=$(( part + 1 ))
            offset=$(( part * MULTIPART_CHUNKSIZE ))
        done
        echo "$(xxd -r -p "$tmp" | md5sum | cut --bytes=-32)-$part"
        rm -f "$tmp"
    fi
}

function add {
    # check if the folder exists
    if [ ! -d "$1" ]; then
        echo "FATAL ERROR: Directory '$1' does not exist and cannot be archived!"
        exit 1
    fi
    # check if the name has the format YYYY-MM-DD_description ('X' is allowed as a placeholder for unknown date parts)
    if [[ ! "$2" =~ [12][0-9X][0-9X]{2}-[01X][0-9X]-[0-3X][0-9X]_[a-zA-Z_\-]+ ]]; then
        echo "FATAL ERROR: Name '$2' is invalid! Only names which have a format like 2001-12-2X_Christmas_Vacation are accepted."
        exit 1
    fi
    # check if the archive already exists
    if [[ -f "$LOCAL/$FOLDER/$2.tar.xz.enc" ]]; then
        echo "FATAL ERROR: Archive '$2' already exists!"
        exit 1
    fi

    # get the encryption password
    getPassword

    # tar, compress, encrypt, write and checksum the archive
    # workaround with a temporary directory, because the top-level folder inside the TAR archive must be named like the archive
    tmp=$(mktemp -d /tmp/awsbackup.tmp.folder.XXXXXXXX)
    ln -s "$(readlink -f "$1")" "$tmp/$2"   # readlink -f makes this work for relative and absolute paths
    cd "$tmp"
    sha2=$(tar cvh "$2" | xz -9e -C sha256 | openssl enc -e -salt -aes-256-ctr -pbkdf2 -iter "$ITERATIONS" -pass pass:"$pass" | tee "$LOCAL/$FOLDER/$2.tar.xz.enc" | sha256sum | cut --bytes=-64)
    unlink "$tmp/$2"
    rmdir "$tmp"
    cd - >/dev/null

    # add to the inventory
    etag=$(etagHash "$LOCAL/$FOLDER/$2.tar.xz.enc")
    echo "$etag ./$FOLDER/$2.tar.xz.enc" >> "$LOCAL"/ETAGS.txt
    echo "$sha2 ./$FOLDER/$2.tar.xz.enc" >> "$LOCAL"/SHA256.txt

    # display success
    echo "SUCCESS: Created local copy. Please run \"cloud-sync\" command now."
}
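# Example (illustrative paths): a folder of pictures becomes one immutable,
# encrypted archive in the local copy:
#   ./awsbackup.sh add ~/DCIM/ 2019-05-02_Business_Trip
# which writes /mnt/backup/awsbackup/2019-05-02_Business_Trip.tar.xz.enc and
# appends one inventory line each to ETAGS.txt and SHA256.txt.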
# This function does the following:
# - upload local archives which are not already stored in the cloud
# - warn if there are any files in the cloud for which we do not have a local copy
function cloudsync {
    aws s3api list-objects --bucket "$BUCKET" > "$LOCAL"/list-objects.txt
    getPassword
    # NOTE: only the 20 most recent inventory entries are verified/uploaded here;
    # remove the "tail -n20" if you want to re-check every archive.
    tail -n20 "$LOCAL"/ETAGS.txt | while read -r etag name; do
        if [[ ! "$etag" =~ ^([0-9a-f]{32})(-[0-9]{1,5})?$ ]]; then echo "FATAL ERROR: etag $etag invalid!"; exit 1; fi
        if [[ ! -s "$LOCAL/$name" ]]; then echo "FATAL ERROR: file $LOCAL/$name invalid!"; exit 1; fi
        filename=$(basename "$name")
        encfilename=$(encryptString "$filename")
        etagAWS=$(jq ".Contents[] | select(.Key|test(\"$FOLDER/$encfilename\")) | .ETag" "$LOCAL"/list-objects.txt | tr -d '"\\ ')
        if [[ -z "$etagAWS" ]]; then
            echo "TODO: $filename is missing in cloud, will be uploaded."
            aws s3 cp --storage-class "$STORAGE_CLASS" "$LOCAL/$FOLDER/$filename" "s3://$BUCKET/$FOLDER/$encfilename"
        elif [[ "$etag" == "$etagAWS" ]]; then
            echo "OK: $filename."
        else
            echo "FATAL ERROR: $filename is in cloud, but corrupt! Please check manually."
            exit 1
        fi
    done
    # now, check that we have an etag/SHA256 entry for every local file, too
    for i in "$LOCAL/$FOLDER/"*; do
        if ! grep -Fq "$(basename "$i")" "$LOCAL/ETAGS.txt"; then echo "FATAL ERROR: $i does not have an ETAG!"; exit 1; fi
        if ! grep -Fq "$(basename "$i")" "$LOCAL/SHA256.txt"; then echo "FATAL ERROR: $i does not have a SHA256!"; exit 1; fi
    done
    # now, check that every file in the cloud exists locally, too
    jq ".Contents[] | select(.Key|test(\"$FOLDER/\")) | .ETag" "$LOCAL"/list-objects.txt | tr -d '"\\ ' | while read -r etag; do
        if ! grep -Fq "$etag" "$LOCAL/ETAGS.txt"; then echo "FATAL ERROR: Etag $etag exists in cloud, but not in local copy!"; exit 1; fi
    done
    echo "SUCCESS: Cloud and local files are in sync."
}

function list {
    # accept both a plain archive name and a full path to the .tar.xz.enc file
    name="${1%%.tar.xz.enc}"
    name="${name##$LOCAL\/$FOLDER/}"
    if [ ! -f "$LOCAL/$FOLDER/$name.tar.xz.enc" ]; then echo "FATAL ERROR: file $LOCAL/$FOLDER/$name.tar.xz.enc does not exist!"; exit 1; fi

    getPassword
    openssl enc -d -salt -aes-256-ctr -pbkdf2 -iter "$ITERATIONS" -pass pass:"$pass" -in "$LOCAL/$FOLDER/$name.tar.xz.enc" | unxz | tar tv
}

function unpack {
    # accept both a plain archive name and a full path to the .tar.xz.enc file
    name="${1%%.tar.xz.enc}"
    name="${name##$LOCAL\/$FOLDER/}"
    if [ ! -f "$LOCAL/$FOLDER/$name.tar.xz.enc" ]; then echo "FATAL ERROR: file $LOCAL/$FOLDER/$name.tar.xz.enc does not exist!"; exit 1; fi

    getPassword
    cd /tmp
    openssl enc -d -salt -aes-256-ctr -pbkdf2 -iter "$ITERATIONS" -pass pass:"$pass" -in "$LOCAL/$FOLDER/$name.tar.xz.enc" | unxz | tar xv
    echo "SUCCESS: unpacking successful, please look at /tmp/$name!"
}

function list-all {
    cd "$LOCAL/$FOLDER/"
    for i in *.tar.xz.enc; do
        list "$i"
    done
}
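# The checksum function below streams the decrypted archive through GNU tar's
# --to-command: with -v, tar prints each member name, and sha256sum prints a
# "<hash>  -" line for each member's contents. The read-loop pairs every hash
# line with the member name printed just before it, yielding one
# "<sha256> <member>" line per file without extracting anything to disk.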
function checksum {
    # accept both a plain archive name and a full path to the .tar.xz.enc file
    name="${1%%.tar.xz.enc}"
    name="${name##$LOCAL\/$FOLDER/}"
    if [ ! -f "$LOCAL/$FOLDER/$name.tar.xz.enc" ]; then echo "FATAL ERROR: file $LOCAL/$FOLDER/$name.tar.xz.enc does not exist!"; exit 1; fi

    getPassword
    openssl enc -d -salt -aes-256-ctr -pbkdf2 -iter "$ITERATIONS" -pass pass:"$pass" -in "$LOCAL/$FOLDER/$name.tar.xz.enc" | tar xJv --to-command=sha256sum | while read -r i; do
        if [[ "$i" =~ ^[0-9a-f]{64}\ \ -$ ]]; then
            h="${i:0:64}"
            echo "$h $f"
        else
            f="$i"
        fi
    done
}

function checksum-all {
    cd "$LOCAL/$FOLDER/"
    for i in *.tar.xz.enc; do
        checksum "$i"
    done
}

function usage {
    echo "./awsbackup.sh - Please use one of the following options:"
    echo ""
    echo "  add ./folder/to/be/backed-up 1999-01-XX_Pictures_Vacation"
    echo "      - create a compressed & encrypted archive out of a folder on the local computer"
    echo "      - you should run \"cloud-sync\" afterwards"
    echo "  cloud-sync"
    echo "      - upload local data to AWS S3 Glacier Deep Archive"
    echo "      - check for consistency and integrity between the local copy & the cloud"
    echo "  local-verify"
    echo "      - verify local data (no internet access necessary)"
    echo "  store-password"
    echo "      - store the password unsafely (!) until the next reboot"
    echo "  remove-password"
    echo "      - remove the stored password"
    echo "  unpack 1999-01-XX_Pictures_Vacation"
    echo "      - decrypt and unpack an archive to /tmp/"
    echo "  list 1999-01-XX_Pictures_Vacation"
    echo "      - decrypt and list the contents of an archive"
    echo "  list-all / checksum 1999-01-XX_Pictures_Vacation / checksum-all"
    echo "      - list or checksum the contents of one or all local archives"
    exit 1
}

case "$1" in
    add)             add "$2" "$3" ;;
    local-verify)    localverify ;;
    cloud-sync)      cloudsync ;;
    store-password)  getPassword; echo "$pass" > "$PASSWORD_LOCATION" ;;
    remove-password) rm -vf "$PASSWORD_LOCATION" ;;
    unpack)          unpack "$2" ;;
    list)            list "$2" ;;
    list-all)        list-all ;;
    checksum)        checksum "$2" ;;
    checksum-all)    checksum-all ;;
    *)               usage ;;
esac

# overwrite the password in memory before exiting
pass=01234567890123456789012345678901234567890123456789

# sync local files to disk
sync
--------------------------------------------------------------------------------