├── CONTRIBUTING.md ├── README.md ├── LICENSE └── cassandra-cloud-backup.sh /CONTRIBUTING.md: -------------------------------------------------------------------------------- 1 | Want to contribute? Great! First, read this page (including the small print at the end). 2 | 3 | ### Before you contribute 4 | Before we can use your code, you must sign the 5 | [Google Individual Contributor License Agreement](https://cla.developers.google.com/about/google-individual) 6 | (CLA), which you can do online. The CLA is necessary mainly because you own the 7 | copyright to your changes, even after your contribution becomes part of our 8 | codebase, so we need your permission to use and distribute your code. We also 9 | need to be sure of various other things—for instance that you'll tell us if you 10 | know that your code infringes on other people's patents. You don't have to sign 11 | the CLA until after you've submitted your code for review and a member has 12 | approved it, but you must do it before we can put your code into our codebase. 13 | Before you start working on a larger contribution, you should get in touch with 14 | us first through the issue tracker with your idea so that we can help out and 15 | possibly guide you. Coordinating up front makes it much easier to avoid 16 | frustration later on. 17 | 18 | ### Code reviews 19 | All submissions, including submissions by project members, require review. We 20 | use GitHub pull requests for this purpose. 21 | 22 | ### The small print 23 | Contributions made by corporations are covered by a different agreement than 24 | the one above, the 25 | [Software Grant and Corporate Contributor License Agreement](https://cla.developers.google.com/about/google-corporate). 
28 | -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 | Cassandra Backup and Restore with Google Cloud Storage 2 | ==================== 3 | Shell script for creating and managing Cassandra backups using Google Cloud Storage. 4 | ## Features 5 | - Take snapshot backups 6 | - Copy incremental backup files 7 | - Compress with gzip or bzip2 to save space 8 | - Prune old incremental and snapshot files 9 | - Execute dry-run mode to identify target files 10 | 11 | ## Requirements 12 | - Google Cloud SDK installed, with the gsutil utility configured for authentication 13 | - An existing Google Cloud Storage bucket 14 | - Linux system with a Bash shell 15 | - Cassandra 2+ 16 | 17 | 18 | ## Usage 19 | ./cassandra-cloud-backup.sh [ options ] <command> 20 | 21 | ### Examples 22 | - Take a full snapshot, gzip-compress it with nice level 15, use the /var/lib/cassandra/backups directory to stage the backup before 23 | uploading it to the GCS bucket, and clear old incremental and snapshot files 24 | 25 | `./cassandra-cloud-backup.sh -b gs://cassandra-backups123/ -zCc -N 15 -d /var/lib/cassandra/backups backup` 26 | 27 | - Do a dry run of a full snapshot with verbose output and create a list of files that would have been copied 28 | 29 | `./cassandra-cloud-backup.sh -b gs://cassandra-backups123/ -vn backup` 30 | 31 | - Back up and bzip2-compress copies of the most recent incremental backup files since the last incremental backup 32 | 33 | `./cassandra-cloud-backup.sh -b gs://cassandra-backups123/ -ji -d /var/lib/cassandra/backups backup` 34 | 35 | - Restore a backup without prompting from the given bucket path and keep the old files locally 36 | 37 | `./cassandra-cloud-backup.sh -b gs://cass-bk123/backups/host01/snpsht/2016-01-20_18-57/ -fk -d /var/lib/cassandra/backups restore` 38 | 39 | - List inventory of available backups stored in Google Cloud Storage 40 | 41 | `
./cassandra-cloud-backup.sh -b gs://cass-bk123 inventory` 42 | 43 | - List inventory of available backups stored in Google Cloud Storage for a different server 44 | 45 | `./cassandra-cloud-backup.sh -b gs://cass-bk123 inventory -a testserver01` 46 | 47 | ### Commands: 48 | 49 | - backup --- Backup the Cassandra node based on the passed-in options 50 | - restore --- Restore the Cassandra node from a specific snapshot backup 51 | - inventory --- List available backups 52 | - commands --- List available commands 53 | - options --- List available options 54 | 55 | ### Options: 56 | Flags: 57 | 58 | -a, --alt-hostname 59 | Specify an alternate server name to be used in the bucket path construction. Used 60 | to create or retrieve backups from different servers 61 | 62 | -B, backup 63 | Default action is to take a backup 64 | 65 | -b, --gcsbucket 66 | Google Cloud Storage bucket used in deployment and by the cluster. 67 | 68 | -c, --clear-old-ss 69 | Clear any old snapshots taken prior to this backup run to save space. 70 | Additionally, this will clear any old incremental backup files taken immediately 71 | following a successful snapshot. This option does nothing with the -i flag 72 | 73 | -C, --clear-old-inc 74 | Clear any old incremental backups taken prior to the current snapshot 75 | 76 | -d, --backupdir 77 | The directory in which to store the backup files; be sure that this directory 78 | has enough space and the appropriate permissions 79 | 80 | -D, --download-only 81 | During a restore this will only download the target files from GCS 82 | 83 | -f, --force 84 | Used to force the restore without a confirmation prompt 85 | 86 | -h, --help 87 | Print this help message. 88 | 89 | -H, --home-dir 90 | This is the $CASSANDRA_HOME directory and is only used if the data_directories, commitlog_directory, 91 | or the saved_caches_directory values cannot be parsed out of the yaml file. 92 | 93 | -i, --incremental 94 | Copy the incremental backup files and do not take a snapshot. 
Can only 95 | be run when compression is enabled with -z or -j 96 | 97 | -j, --bzip 98 | Compresses the backup files with bzip2 prior to pushing to Google Cloud Storage. 99 | This option will use additional local disk space; set --target-gz-dir 100 | to use an alternate disk location if free space is an issue 101 | 102 | -k, --keep-old 103 | Set this flag on restore to keep a local copy of the old data files 104 | Set this flag on backup to keep a local copy of the compressed backup and schema dump 105 | 106 | -l, --log-dir 107 | Activate logging to file 'CassandraBackup${DATE}.log' from stdout 108 | Include an optional directory path to write the file 109 | Default path is /var/log/cassandra 110 | 111 | -n, --noop 112 | Will attempt a dry run and verify all the settings are correct 113 | 114 | -N, --nice 115 | Set the process priority, default 10 116 | 117 | -p 118 | The Cassandra user password, if required for security 119 | 120 | -r, restore 121 | Restore a backup; requires a --gcsbucket path and an optional --backupdir 122 | 123 | -s, --split-size 124 | Split the resulting tar archive into the configured size in megabytes, default 100M 125 | 126 | -S, --service-name 127 | Specify the service name for Cassandra, default is 'cassandra'; used to stop and start the service 128 | 129 | -T, --target-gz-dir 130 | Override the directory to save compressed files in case compression is used 131 | default is --backupdir/compressed, also used to decompress for restore 132 | 133 | -u 134 | The Cassandra user account, if required for security 135 | 136 | -U, --auth-file 137 | A file that contains authentication credentials for cqlsh and nodetool consisting of 138 | two lines: 139 | CASSANDRA_USER=username 140 | CASSANDRA_PASS=password 141 | 142 | -v, --verbose 143 | When provided will print additional information to the log file 144 | 145 | -y, --yaml 146 | Path to the Cassandra yaml configuration file 147 | default: /etc/cassandra/cassandra.yaml 148 | 149 | -z, --zip 150 | Compresses the 
backup files with gzip prior to pushing to Google Cloud Storage. 151 | This option will use additional local disk space; set --target-gz-dir 152 | to use an alternate disk location if free space is an issue 153 | 154 | 155 | ### Cron Examples 156 | - Full gzip-compressed snapshot every day at 1:30 am with nice level 10 157 | 158 | `30 1 * * * /path_to_scripts/cassandra-cloud-backup.sh -zvCc -N 10 -b gs://cass-bk123 -d /var/lib/cassandra/backups backup > /var/log/cassandra/$(date +\%Y\%m\%d\%H\%M\%S)-fbackup.log 2>&1` 159 | 160 | - Incremental gzip-compressed backups copied every hour with nice level 10 161 | 162 | `0 * * * * /path_to_scripts/cassandra-cloud-backup.sh -viz -N 10 -b gs://cass-bk123 -d /var/lib/cassandra/backups backup > /var/log/cassandra/$(date +\%Y\%m\%d\%H\%M\%S)-ibackup.log 2>&1` 163 | 164 | ### Notes 165 | 166 | The script must be run with sufficient privileges to be able to stop/start processes and create/delete directories and files within the working directories. 167 | 168 | The restore command is designed to perform a simple restore of a full snapshot. In the event that you want to restore incremental backups you should start by restoring the last full snapshot prior to your target incremental backup file and manually move the files from each incremental backup in chronological order leading up to the target incremental backup file. The schema dump is included in the snapshot backups, but if necessary it must also be restored manually. 169 | 170 | Snapshots are taken at the system level; the script currently does not support backup or restore of an individual keyspace or column family. 171 | 172 | To enable incremental backups, set `incremental_backups: true` in the cassandra.yaml file. 173 | 174 | ### License 175 | Copyright 2016 Google Inc. All Rights Reserved. 176 | 177 | Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with the License. 
You may obtain a copy of the License at 178 | http://www.apache.org/licenses/LICENSE-2.0 179 | Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS-IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License. 180 | 181 | This is not an official Google product. 182 | -------------------------------------------------------------------------------- /LICENSE: -------------------------------------------------------------------------------- 1 | 2 | Apache License 3 | Version 2.0, January 2004 4 | http://www.apache.org/licenses/ 5 | 6 | TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION 7 | 8 | 1. Definitions. 9 | 10 | "License" shall mean the terms and conditions for use, reproduction, 11 | and distribution as defined by Sections 1 through 9 of this document. 12 | 13 | "Licensor" shall mean the copyright owner or entity authorized by 14 | the copyright owner that is granting the License. 15 | 16 | "Legal Entity" shall mean the union of the acting entity and all 17 | other entities that control, are controlled by, or are under common 18 | control with that entity. For the purposes of this definition, 19 | "control" means (i) the power, direct or indirect, to cause the 20 | direction or management of such entity, whether by contract or 21 | otherwise, or (ii) ownership of fifty percent (50%) or more of the 22 | outstanding shares, or (iii) beneficial ownership of such entity. 23 | 24 | "You" (or "Your") shall mean an individual or Legal Entity 25 | exercising permissions granted by this License. 26 | 27 | "Source" form shall mean the preferred form for making modifications, 28 | including but not limited to software source code, documentation 29 | source, and configuration files. 
30 | 31 | "Object" form shall mean any form resulting from mechanical 32 | transformation or translation of a Source form, including but 33 | not limited to compiled object code, generated documentation, 34 | and conversions to other media types. 35 | 36 | "Work" shall mean the work of authorship, whether in Source or 37 | Object form, made available under the License, as indicated by a 38 | copyright notice that is included in or attached to the work 39 | (an example is provided in the Appendix below). 40 | 41 | "Derivative Works" shall mean any work, whether in Source or Object 42 | form, that is based on (or derived from) the Work and for which the 43 | editorial revisions, annotations, elaborations, or other modifications 44 | represent, as a whole, an original work of authorship. For the purposes 45 | of this License, Derivative Works shall not include works that remain 46 | separable from, or merely link (or bind by name) to the interfaces of, 47 | the Work and Derivative Works thereof. 48 | 49 | "Contribution" shall mean any work of authorship, including 50 | the original version of the Work and any modifications or additions 51 | to that Work or Derivative Works thereof, that is intentionally 52 | submitted to Licensor for inclusion in the Work by the copyright owner 53 | or by an individual or Legal Entity authorized to submit on behalf of 54 | the copyright owner. For the purposes of this definition, "submitted" 55 | means any form of electronic, verbal, or written communication sent 56 | to the Licensor or its representatives, including but not limited to 57 | communication on electronic mailing lists, source code control systems, 58 | and issue tracking systems that are managed by, or on behalf of, the 59 | Licensor for the purpose of discussing and improving the Work, but 60 | excluding communication that is conspicuously marked or otherwise 61 | designated in writing by the copyright owner as "Not a Contribution." 
62 | 63 | "Contributor" shall mean Licensor and any individual or Legal Entity 64 | on behalf of whom a Contribution has been received by Licensor and 65 | subsequently incorporated within the Work. 66 | 67 | 2. Grant of Copyright License. Subject to the terms and conditions of 68 | this License, each Contributor hereby grants to You a perpetual, 69 | worldwide, non-exclusive, no-charge, royalty-free, irrevocable 70 | copyright license to reproduce, prepare Derivative Works of, 71 | publicly display, publicly perform, sublicense, and distribute the 72 | Work and such Derivative Works in Source or Object form. 73 | 74 | 3. Grant of Patent License. Subject to the terms and conditions of 75 | this License, each Contributor hereby grants to You a perpetual, 76 | worldwide, non-exclusive, no-charge, royalty-free, irrevocable 77 | (except as stated in this section) patent license to make, have made, 78 | use, offer to sell, sell, import, and otherwise transfer the Work, 79 | where such license applies only to those patent claims licensable 80 | by such Contributor that are necessarily infringed by their 81 | Contribution(s) alone or by combination of their Contribution(s) 82 | with the Work to which such Contribution(s) was submitted. If You 83 | institute patent litigation against any entity (including a 84 | cross-claim or counterclaim in a lawsuit) alleging that the Work 85 | or a Contribution incorporated within the Work constitutes direct 86 | or contributory patent infringement, then any patent licenses 87 | granted to You under this License for that Work shall terminate 88 | as of the date such litigation is filed. 89 | 90 | 4. Redistribution. 
You may reproduce and distribute copies of the 91 | Work or Derivative Works thereof in any medium, with or without 92 | modifications, and in Source or Object form, provided that You 93 | meet the following conditions: 94 | 95 | (a) You must give any other recipients of the Work or 96 | Derivative Works a copy of this License; and 97 | 98 | (b) You must cause any modified files to carry prominent notices 99 | stating that You changed the files; and 100 | 101 | (c) You must retain, in the Source form of any Derivative Works 102 | that You distribute, all copyright, patent, trademark, and 103 | attribution notices from the Source form of the Work, 104 | excluding those notices that do not pertain to any part of 105 | the Derivative Works; and 106 | 107 | (d) If the Work includes a "NOTICE" text file as part of its 108 | distribution, then any Derivative Works that You distribute must 109 | include a readable copy of the attribution notices contained 110 | within such NOTICE file, excluding those notices that do not 111 | pertain to any part of the Derivative Works, in at least one 112 | of the following places: within a NOTICE text file distributed 113 | as part of the Derivative Works; within the Source form or 114 | documentation, if provided along with the Derivative Works; or, 115 | within a display generated by the Derivative Works, if and 116 | wherever such third-party notices normally appear. The contents 117 | of the NOTICE file are for informational purposes only and 118 | do not modify the License. You may add Your own attribution 119 | notices within Derivative Works that You distribute, alongside 120 | or as an addendum to the NOTICE text from the Work, provided 121 | that such additional attribution notices cannot be construed 122 | as modifying the License. 
123 | 124 | You may add Your own copyright statement to Your modifications and 125 | may provide additional or different license terms and conditions 126 | for use, reproduction, or distribution of Your modifications, or 127 | for any such Derivative Works as a whole, provided Your use, 128 | reproduction, and distribution of the Work otherwise complies with 129 | the conditions stated in this License. 130 | 131 | 5. Submission of Contributions. Unless You explicitly state otherwise, 132 | any Contribution intentionally submitted for inclusion in the Work 133 | by You to the Licensor shall be under the terms and conditions of 134 | this License, without any additional terms or conditions. 135 | Notwithstanding the above, nothing herein shall supersede or modify 136 | the terms of any separate license agreement you may have executed 137 | with Licensor regarding such Contributions. 138 | 139 | 6. Trademarks. This License does not grant permission to use the trade 140 | names, trademarks, service marks, or product names of the Licensor, 141 | except as required for reasonable and customary use in describing the 142 | origin of the Work and reproducing the content of the NOTICE file. 143 | 144 | 7. Disclaimer of Warranty. Unless required by applicable law or 145 | agreed to in writing, Licensor provides the Work (and each 146 | Contributor provides its Contributions) on an "AS IS" BASIS, 147 | WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or 148 | implied, including, without limitation, any warranties or conditions 149 | of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A 150 | PARTICULAR PURPOSE. You are solely responsible for determining the 151 | appropriateness of using or redistributing the Work and assume any 152 | risks associated with Your exercise of permissions under this License. 153 | 154 | 8. Limitation of Liability. 
In no event and under no legal theory, 155 | whether in tort (including negligence), contract, or otherwise, 156 | unless required by applicable law (such as deliberate and grossly 157 | negligent acts) or agreed to in writing, shall any Contributor be 158 | liable to You for damages, including any direct, indirect, special, 159 | incidental, or consequential damages of any character arising as a 160 | result of this License or out of the use or inability to use the 161 | Work (including but not limited to damages for loss of goodwill, 162 | work stoppage, computer failure or malfunction, or any and all 163 | other commercial damages or losses), even if such Contributor 164 | has been advised of the possibility of such damages. 165 | 166 | 9. Accepting Warranty or Additional Liability. While redistributing 167 | the Work or Derivative Works thereof, You may choose to offer, 168 | and charge a fee for, acceptance of support, warranty, indemnity, 169 | or other liability obligations and/or rights consistent with this 170 | License. However, in accepting such obligations, You may act only 171 | on Your own behalf and on Your sole responsibility, not on behalf 172 | of any other Contributor, and only if You agree to indemnify, 173 | defend, and hold each Contributor harmless for any liability 174 | incurred by, or claims asserted against, such Contributor by reason 175 | of your accepting any such warranty or additional liability. 176 | 177 | END OF TERMS AND CONDITIONS 178 | 179 | APPENDIX: How to apply the Apache License to your work. 180 | 181 | To apply the Apache License to your work, attach the following 182 | boilerplate notice, with the fields enclosed by brackets "[]" 183 | replaced with your own identifying information. (Don't include 184 | the brackets!) The text should be enclosed in the appropriate 185 | comment syntax for the file format. 
We also recommend that a 186 | file or class name and description of purpose be included on the 187 | same "printed page" as the copyright notice for easier 188 | identification within third-party archives. 189 | 190 | Copyright [yyyy] [name of copyright owner] 191 | 192 | Licensed under the Apache License, Version 2.0 (the "License"); 193 | you may not use this file except in compliance with the License. 194 | You may obtain a copy of the License at 195 | 196 | http://www.apache.org/licenses/LICENSE-2.0 197 | 198 | Unless required by applicable law or agreed to in writing, software 199 | distributed under the License is distributed on an "AS IS" BASIS, 200 | WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 201 | See the License for the specific language governing permissions and 202 | limitations under the License. 203 | -------------------------------------------------------------------------------- /cassandra-cloud-backup.sh: -------------------------------------------------------------------------------- 1 | #!/bin/bash 2 | # 3 | # Copyright 2016 Google Inc. All Rights Reserved. 4 | # 5 | # Licensed under the Apache License, Version 2.0 (the "License"); 6 | # you may not use this file except in compliance with the License. 7 | # You may obtain a copy of the License at 8 | # 9 | # http://www.apache.org/licenses/LICENSE-2.0 10 | # 11 | # Unless required by applicable law or agreed to in writing, software 12 | # distributed under the License is distributed on an "AS-IS" BASIS, 13 | # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 14 | # See the License for the specific language governing permissions and 15 | # limitations under the License. 16 | # 17 | # 18 | # Description : Take snapshot and incremental backups of Cassandra and copy them to Google Cloud Storage 19 | # Optionally restore full system from snapshot 20 | # This is not an official Google product. 
21 | # 22 | VERSION='1.0' 23 | SCRIPT_NAME="cassandra-cloud-backup.sh" 24 | # Exit on any error 25 | set -e 26 | # Prints the usage for this script 27 | function print_usage() { 28 | echo "Cassandra Backup to Google Cloud Storage Version: ${VERSION}" 29 | cat <<'EOF' 30 | Usage: ./cassandra-cloud-backup.sh [ options ] <command> 31 | Description: 32 | Utility for creating and managing Cassandra backups with Google Cloud Storage. 33 | Run with admin-level privileges. 34 | 35 | The backup command can use gzip or bzip2 for compression, and split large files 36 | into multiple smaller files. If incremental backups are enabled in 37 | Cassandra, this script can incrementally copy them as they are created, saving 38 | time and space. Additionally, this script can be used to clean up old snapshot 39 | and incremental files locally. 40 | 41 | The restore command is designed to perform a simple restore of a full snapshot. 42 | In the event that you want to restore incremental backups you should start by 43 | restoring the last full snapshot prior to your target incremental backup file 44 | and manually move the files from each incremental backup in chronological order 45 | leading up to the target incremental backup file. The schema dump and token ring 46 | are included in the snapshot backups, but if necessary they must also be restored 47 | manually. 48 | 49 | Flags: 50 | -a, --alt-hostname 51 | Specify an alternate server name to be used in the bucket path construction. Used 52 | to create or retrieve backups from different servers 53 | 54 | -B, backup 55 | Default action is to take a backup 56 | 57 | -b, --gcsbucket 58 | Google Cloud Storage bucket used in deployment and by the cluster. 59 | 60 | -c, --clear-old-ss 61 | Clear any old snapshots taken prior to this backup run to save space. 62 | Additionally, this will clear any old incremental backup files taken immediately 63 | following a successful snapshot. 
This option does nothing with the -i flag 64 | 65 | -C, --clear-old-inc 66 | Clear any old incremental backups taken prior to the current snapshot 67 | 68 | -d, --backupdir 69 | The directory in which to store the backup files; be sure that this directory 70 | has enough space and the appropriate permissions 71 | 72 | -D, --download-only 73 | During a restore this will only download the target files from GCS 74 | 75 | -f, --force 76 | Used to force the restore without a confirmation prompt 77 | 78 | -h, --help 79 | Print this help message. 80 | 81 | -H, --home-dir 82 | This is the $CASSANDRA_HOME directory and is only used if the data_directories, 83 | commitlog_directory, or the saved_caches_directory values cannot be parsed out of the 84 | yaml file. 85 | 86 | -i, --incremental 87 | Copy the incremental backup files and do not take a snapshot. Can only 88 | be run when compression is enabled with -z or -j 89 | 90 | -j, --bzip 91 | Compresses the backup files with bzip2 prior to pushing to Google Cloud Storage. 92 | This option will use additional local disk space; set --target-gz-dir 93 | to use an alternate disk location if free space is an issue 94 | 95 | -k, --keep-old 96 | Set this flag on restore to keep a local copy of the old data files 97 | Set this flag on backup to keep a local copy of the compressed backup, schema dump, 98 | and token ring 99 | 100 | -l, --log-dir 101 | Activate logging to file 'CassandraBackup${DATE}.log' from stdout 102 | Include an optional directory path to write the file 103 | Default path is /var/log/cassandra 104 | 105 | -L, --inc-commit-logs 106 | Add commit logs to the backup archive. 
WARNING: This option can cause the script to 107 | fail on an active server as the files roll over 108 | 109 | -n, --noop 110 | Will attempt a dry run and verify all the settings are correct 111 | 112 | -N, --nice 113 | Set the process priority, default 10 114 | 115 | -p 116 | The Cassandra user password, if required for security 117 | 118 | -r, restore 119 | Restore a backup; requires a --gcsbucket path and an optional --backupdir 120 | 121 | -s, --split-size 122 | Split the resulting tar archive into the configured size in megabytes, default 100M 123 | 124 | -S, --service-name 125 | Specify the service name for Cassandra, default is 'cassandra'; used to stop and start the service 126 | 127 | -T, --target-gz-dir 128 | Override the directory to save compressed files in case compression is used 129 | default is --backupdir/compressed, also used to decompress for restore 130 | 131 | -u 132 | The Cassandra user account, if required for security 133 | 134 | -U, --auth-file 135 | A file that contains authentication credentials for cqlsh and nodetool consisting of 136 | two lines: 137 | CASSANDRA_USER=username 138 | CASSANDRA_PASS=password 139 | 140 | -v, --verbose 141 | When provided will print additional information to the log file 142 | 143 | -w, --with-caches 144 | For posterity's sake, use this flag to save the read caches in a backup, although it 145 | likely represents a waste of space 146 | 147 | -y, --yaml 148 | Path to the Cassandra yaml configuration file 149 | default: /etc/cassandra/cassandra.yaml 150 | 151 | -z, --zip 152 | Compresses the backup files with gzip prior to pushing to Google Cloud Storage. 153 | This option will use additional local disk space; set --target-gz-dir 154 | to use an alternate disk location if free space is an issue 155 | 156 | Commands: 157 | backup, restore, inventory, commands, options 158 | 159 | backup Backup the Cassandra node based on the passed-in options 160 | 161 | restore Restore the Cassandra node from a specific snapshot backup 162 | or 
download an incremental backup locally and extract it 163 | 164 | inventory List available backups 165 | 166 | commands List available commands 167 | 168 | options List available options 169 | 170 | Examples: 171 | Take a full snapshot, gzip-compress it with nice=15, 172 | upload it into the GCS bucket, and clear old incremental and snapshot files 173 | ./cassandra-cloud-backup.sh -b gs://cassandra-backups123/ -zCc -N 15 backup 174 | 175 | Do a dry run of a full snapshot with verbose output and 176 | create a list of files that would have been copied 177 | ./cassandra-cloud-backup.sh -b gs://cassandra-backups123/ -vn backup 178 | 179 | Back up and bzip2-compress copies of the most recent incremental 180 | backup files since the last incremental backup 181 | ./cassandra-cloud-backup.sh -b gs://cassandra-backups123/ -ji backup 182 | 183 | Restore a backup without prompting from the specified bucket path and keep the old files locally 184 | ./cassandra-cloud-backup.sh -b gs://cass-bk123/backups/host01/snpsht/2016-01-20_18-57/ -fk restore 185 | 186 | Restore a specific backup to a custom CASSANDRA_HOME directory with secure credentials in 187 | a password.txt file, with Cassandra running as a Linux service named cass 188 | ./cassandra-cloud-backup.sh -b gs://cass-bk123/backups/host01/snpsht/2016-01-20_18-57/ \ 189 | -y /opt/cass/conf/cassandra.yaml -H /opt/cass -U password.txt -S cass restore 190 | 191 | List inventory of available backups stored in Google Cloud Storage 192 | ./cassandra-cloud-backup.sh -b gs://cass-bk123 inventory 193 | 194 | EOF 195 | } 196 | 197 | # List all commands for command completion. 198 | function commands() { 199 | print_usage | sed -n -e '/^Commands:/,/^$/p' | tail -n +2 | head -n -1 | tr -d ',' 200 | } 201 | 202 | # List all options for command completion. 
203 | function options() { 204 | print_usage | grep -E '^ *-' | tr -d ',' 205 | } 206 | 207 | # Override the date function 208 | function prepare_date() { 209 | date "$@" 210 | } 211 | 212 | # Prefix a date prior to echo output 213 | function loginfo() { 214 | 215 | if ${LOG_OUTPUT}; then 216 | echo "$(prepare_date +%F_%H:%M:%S): ${*}" >> "${LOG_FILE}" 217 | else 218 | echo "$(prepare_date +%F_%H:%M:%S): ${*}" 219 | fi 220 | } 221 | 222 | # Only used if -v --verbose is passed in 223 | function logverbose() { 224 | if ${VERBOSE}; then 225 | loginfo "VERBOSE: ${*}" 226 | fi 227 | } 228 | 229 | # Pass errors to stderr. 230 | function logerror() { 231 | loginfo "ERROR: ${*}" >&2 232 | # Avoid 'let ERROR_COUNT++': its post-increment returns the old value (0), which exits under 'set -e' 233 | ERROR_COUNT=$((ERROR_COUNT + 1)) 234 | } 235 | 236 | # Bad option was found. 237 | function print_help() { 238 | logerror "Unknown Option Encountered. For help run '${SCRIPT_NAME} --help'" 239 | print_usage 240 | exit 1 241 | } 242 | 243 | # Validate that all configuration options are correct and no conflicting options are set 244 | function validate() { 245 | touch_logfile 246 | single_script_check 247 | set_auth_string 248 | verbose_vars 249 | loginfo "***************VALIDATING INPUT******************" 250 | if [ -z ${GSUTIL} ]; then 251 | logerror "Cannot find the gsutil utility; please make sure it is in the PATH" 252 | exit 1 253 | fi 254 | if [ -z ${GCS_BUCKET} ]; then 255 | logerror "Please pass in the GCS Bucket to use with this script" 256 | exit 1 257 | else 258 | if ! ${GSUTIL} ls ${GCS_BUCKET} &> /dev/null; then 259 | logerror "Cannot access Google Cloud Storage bucket ${GCS_BUCKET}; make sure" \ 260 | " it exists" 261 | exit 1 262 | fi 263 | fi 264 | if [ ${ACTION} != "inventory" ]; then 265 | if [ -z ${NODETOOL} ]; then 266 | logerror "Cannot find the nodetool utility; please make sure it is in the PATH" 267 | fi 268 | if [ -z ${CQLSH} ]; then 269 | logerror "Cannot find the cqlsh utility; please make sure it is in the PATH" 270 | fi 271 | if [ ! 
-f ${YAML_FILE} ]; then 271 | logerror "Yaml File ${YAML_FILE} does not exist and --yaml argument is missing" 272 | else 273 | #different values are needed for backup and for restore 274 | eval "parse_yaml_${ACTION}" 275 | fi 276 | 277 | if [ -z ${data_file_directories} ]; then 278 | if [ -z ${CASS_HOME} ]; then 279 | logerror "Cannot parse data_file_directories from ${YAML_FILE} and --home-dir argument" \ 280 | " is missing, which should be the \$CASSANDRA_HOME path" 281 | else 282 | data_file_directories="${CASS_HOME}/data" 283 | fi 284 | fi 285 | if ${INCLUDE_COMMIT_LOGS}; then 286 | loginfo "WARNING: Backing up Commit Logs can cause script to fail if server is under load" 287 | fi 288 | if [ -z ${commitlog_directory} ]; then 289 | if [ -z ${CASS_HOME} ]; then 290 | logerror "Cannot parse commitlog_directory from ${YAML_FILE} and --home-dir argument" \ 291 | " is missing, which should be the \$CASSANDRA_HOME path" 292 | else 293 | commitlog_directory="${CASS_HOME}/data/commitlog" 294 | fi 295 | fi 296 | if [ ! -d ${commitlog_directory} ]; then 297 | logerror "commitlog_directory does not exist: ${commitlog_directory} " 298 | fi 299 | if ${INCLUDE_CACHES}; then 300 | loginfo "Backing up saved caches can waste space and time, but proceeding as requested" 301 | fi 302 | if [ -z ${saved_caches_directory} ]; then 303 | if [ -z ${CASS_HOME} ]; then 304 | logerror "Cannot parse saved_caches_directory from ${YAML_FILE} and --home-dir argument" \ 305 | " is missing, which should be the \$CASSANDRA_HOME path" 306 | else 307 | saved_caches_directory="${CASS_HOME}/data/saved_caches" 308 | fi 309 | fi 310 | if [ ! -d ${saved_caches_directory} ]; then 311 | logerror "saved_caches_directory does not exist: ${saved_caches_directory} " 312 | fi 313 | if [ ! -d ${data_file_directories} ]; then 314 | logerror "data_file_directories does not exist: ${data_file_directories} " 315 | fi 316 | #BACKUP_DIR is used to stage backups and stage restores, so create it either way 317 | if [ !
-d ${BACKUP_DIR} ]; then 318 | loginfo "Creating backup directory ${BACKUP_DIR}" 319 | mkdir -p ${BACKUP_DIR} 320 | fi 321 | if ${SPLIT_FILE}; then 322 | SPLIT_FILE_SUFFIX="${TAR_EXT}-${SPLIT_FILE_SUFFIX}" 323 | fi 324 | if [ ! -d ${COMPRESS_DIR} ]; then 325 | loginfo "Creating compression target directory" 326 | mkdir -p ${COMPRESS_DIR} 327 | fi 328 | if [ -z ${TAR} ] || [ -z ${NICE} ]; then 329 | logerror "The tar and nice utilities must be present for this script to run." 330 | fi 331 | if [ ${ACTION} = "restore" ]; then 332 | GCS_LS=$(${GSUTIL} ls ${GCS_BUCKET} | head -n1) 333 | loginfo "GCS first file listed: ${GCS_LS}" 334 | if grep -q 'incr' <<< "${GCS_LS}"; then 335 | loginfo "Detected incremental backup requested for restore. This script " \ 336 | "will only download the files locally" 337 | DOWNLOAD_ONLY=true 338 | INCREMENTAL=true 339 | SUFFIX="incr" 340 | else 341 | if grep -q 'snpsht' <<< "${GCS_LS}"; then 342 | loginfo "Detected full snapshot backup requested for restore." 343 | else 344 | logerror "Detected a Google Cloud Storage bucket path that is not a backup" \ 345 | " location.
Make sure the --gcsbucket is the full path to a specific backup" 346 | fi 347 | fi 348 | if grep -q "tgz" <<< "${GCS_LS}"; then 349 | loginfo "Detected compressed .tgz file for restore" 350 | COMPRESSION=true 351 | TAR_EXT="tgz" 352 | TAR_CFLAG="-z" 353 | fi 354 | if grep -q "tbz" <<< "${GCS_LS}"; then 355 | loginfo "Detected compressed .tbz file for restore" 356 | COMPRESSION=true 357 | TAR_EXT="tbz" 358 | TAR_CFLAG="-j" 359 | fi 360 | if grep -q "tar" <<< "${GCS_LS}"; then 361 | loginfo "Detected uncompressed .tar file for restore" 362 | COMPRESSION=false 363 | TAR_EXT="tar" 364 | TAR_CFLAG="" 365 | fi 366 | RESTORE_FILE=$(awk -F"/" '{print $NF}' <<< "${GCS_LS}") 367 | if [[ "${RESTORE_FILE}" != *.${TAR_EXT} ]] ; then 368 | #detect split files named ${TAR_EXT}-* 369 | if [[ "${RESTORE_FILE}" == ${TAR_EXT}-* ]]; then 370 | SPLIT_FILE=true 371 | loginfo "Split file restore detected" 372 | else 373 | logerror "Restore target is not a tar file: ${GCS_BUCKET}" 374 | fi 375 | fi 376 | if [[ ! ${GCS_BUCKET} =~ ^.*\.${TAR_EXT}$ ]]; then 377 | if ${SPLIT_FILE}; then 378 | #remove the trailing digits and replace the suffix 379 | RESTORE_FILE="${RESTORE_FILE%${SUFFIX}*}${SUFFIX}*" 380 | GCS_BUCKET="${GCS_BUCKET%/}/${RESTORE_FILE}" 381 | else 382 | GCS_BUCKET="${GCS_BUCKET%/}/${RESTORE_FILE}" 383 | loginfo "Fixed up restore bucket path: ${GCS_BUCKET}" 384 | fi 385 | fi 386 | 387 | if grep -q "," <<< "${seed_provider_class_name_parameters_seeds}"; then 388 | loginfo "Restore target node is likely part of a cluster.
Restore script" \ 389 | " will not start node automatically" 390 | AUTO_RESTART=false 391 | fi 392 | loginfo "creating staging directory for restore: ${BACKUP_DIR}/restore" 393 | mkdir -p "${BACKUP_DIR}/restore" 394 | else 395 | if ${INCREMENTAL}; then 396 | if ${CLEAR_INCREMENTALS}; then 397 | logerror "--clear-old-inc option is not compatible with --incremental option" 398 | fi 399 | if ${CLEAR_SNAPSHOTS}; then 400 | logerror "--incremental option is not compatible with --clear-old-ss option" 401 | fi 402 | if [ -z ${incremental_backups} ] || [ ${incremental_backups} = false ]; then 403 | logerror "Cannot copy incremental backups until 'incremental_backups' is true " \ 404 | "in ${YAML_FILE} " 405 | fi 406 | if [ ! -f "${BACKUP_DIR}/last_inc_backup_time" ]; then 407 | touch "${BACKUP_DIR}/last_inc_backup_time" 408 | fi 409 | else 410 | if [ ${CLEAR_INCREMENTALS} = true ] && [ ${incremental_backups} != true ]; then 411 | logerror "Cannot clear incremental backups because 'incremental_backups' is " \ 412 | "false in ${YAML_FILE} " 413 | fi 414 | if [ ! -d "${SCHEMA_DIR}" ]; then 415 | loginfo "Creating schema dump directory: ${SCHEMA_DIR}" 416 | mkdir -p "${SCHEMA_DIR}" 417 | fi 418 | if [ ! 
-d "${TOKEN_RING_DIR}" ]; then 419 | loginfo "Creating token ring dump directory: ${TOKEN_RING_DIR}" 420 | mkdir -p "${TOKEN_RING_DIR}" 421 | fi 422 | fi 423 | fi 424 | else 425 | # ${ACTION} = "inventory" 426 | parse_yaml_inventory 427 | fi 428 | 429 | logverbose "ERROR_COUNT: ${ERROR_COUNT}" 430 | 431 | if [ ${ERROR_COUNT} -gt 0 ]; then 432 | loginfo "*************ERRORS WHILE VALIDATING INPUT*************" 433 | exit 1 434 | fi 435 | loginfo "*************SUCCESSFULLY VALIDATED INPUT**************" 436 | } 437 | 438 | # Print out all the important variables if -v is set 439 | function verbose_vars() { 440 | logverbose "************* PRINTING VARIABLES ****************\n" 441 | logverbose "ACTION: ${ACTION}" 442 | logverbose "AUTO_RESTART: ${AUTO_RESTART}" 443 | logverbose "BACKUP_DIR: ${BACKUP_DIR}" 444 | logverbose "CASSANDRA_PASS: ${CASSANDRA_PASS}" 445 | logverbose "CASSANDRA_USER: ${CASSANDRA_USER}" 446 | logverbose "CASSANDRA_OG: ${CASSANDRA_OG}" 447 | logverbose "CLEAR_INCREMENTALS: ${CLEAR_INCREMENTALS}" 448 | logverbose "CLEAR_SNAPSHOTS: ${CLEAR_SNAPSHOTS}" 449 | logverbose "COMPRESS_DIR: ${COMPRESS_DIR}" 450 | logverbose "COMPRESSION: ${COMPRESSION}" 451 | logverbose "CQLSH: ${CQLSH}" 452 | logverbose "CQLSH_DEFAULT_HOST: ${CQLSH_DEFAULT_HOST}" 453 | logverbose "DATE: ${DATE}" 454 | logverbose "DOWNLOAD_ONLY: ${DOWNLOAD_ONLY}" 455 | logverbose "DRY_RUN: ${DRY_RUN}" 456 | logverbose "FORCE_RESTORE: ${FORCE_RESTORE}" 457 | logverbose "GCS_BUCKET: ${GCS_BUCKET}" 458 | logverbose "GCS_TMPDIR: ${GCS_TMPDIR}" 459 | logverbose "GSUTIL: ${GSUTIL}" 460 | logverbose "HOSTNAME: ${HOSTNAME}" 461 | logverbose "INCREMENTAL: ${INCREMENTAL}" 462 | logverbose "INCLUDE_CACHES: ${INCLUDE_CACHES}" 463 | logverbose "INCLUDE_COMMIT_LOGS: ${INCLUDE_COMMIT_LOGS}" 464 | logverbose "KEEP_OLD_FILES: ${KEEP_OLD_FILES}" 465 | logverbose "LOG_DIR: ${LOG_DIR}" 466 | logverbose "LOG_FILE: ${LOG_FILE}" 467 | logverbose "LOG_OUTPUT: ${LOG_OUTPUT}" 468 | logverbose "NICE: ${NICE}" 469 | 
logverbose "NICE_LEVEL: ${NICE_LEVEL}" 470 | logverbose "NODETOOL: ${NODETOOL}" 471 | logverbose "SCHEMA_DIR: ${SCHEMA_DIR}" 472 | logverbose "TOKEN_RING_DIR: ${TOKEN_RING_DIR}" 473 | logverbose "SERVICE_NAME: ${SERVICE_NAME}" 474 | logverbose "SNAPSHOT_NAME: ${SNAPSHOT_NAME}" 475 | logverbose "SPLIT_FILE: ${SPLIT_FILE}" 476 | logverbose "SPLIT_SIZE: ${SPLIT_SIZE}" 477 | logverbose "SUFFIX: ${SUFFIX}" 478 | logverbose "TARGET_LIST_FILE: ${TARGET_LIST_FILE}" 479 | logverbose "USER_OPTIONS: ${USER_OPTIONS}" 480 | logverbose "USER_FILE: ${USER_FILE}" 481 | logverbose "YAML_FILE: ${YAML_FILE}" 482 | logverbose "************* DONE PRINTING VARIABLES ************\n" 483 | } 484 | 485 | # Check that script is not running more than once 486 | function single_script_check() { 487 | local grep_script 488 | #wraps a [] around the first letter to trick the grep statement into ignoring itself 489 | grep_script="$(echo ${SCRIPT_NAME} | sed 's/^/\[/' | sed 's/^\(.\{2\}\)/\1\]/')" 490 | logverbose "checking that script isn't already running" 491 | logverbose "grep_script: ${grep_script}" 492 | status="$(ps -feww | grep -w "${grep_script}" \ 493 | | awk -v pid="$(echo $$)" '$2 != pid { print $2 }')" 494 | if [ ! -z "${status}" ]; then 495 | logerror " ${SCRIPT_NAME} : Process is already running. Aborting" 496 | exit 1; 497 | fi 498 | } 499 | 500 | # Create the log file if requested 501 | function touch_logfile() { 502 | if [ "${LOG_OUTPUT}" = true ] && [ !
-f "${LOG_FILE}" ]; then 503 | touch "${LOG_FILE}" 504 | fi 505 | } 506 | 507 | # List available backups in GCS 508 | function inventory() { 509 | loginfo "Available Snapshots:" 510 | ${GSUTIL} ls -d "${GCS_BUCKET}/backups/${HOSTNAME}/snpsht/*" 511 | if [ -z $incremental_backups ] || [ $incremental_backups = false ]; then 512 | loginfo "Incremental Backups are not enabled for Cassandra" 513 | fi 514 | loginfo "Available Incremental Backups:" 515 | ${GSUTIL} ls -d "${GCS_BUCKET}/backups/${HOSTNAME}/incr/*" 516 | } 517 | 518 | # This is the main backup function that orchestrates all the options 519 | # to create the backup set and then push it to GCS 520 | function backup() { 521 | create_gcs_backup_path 522 | clear_backup_file_list 523 | if ${CLEAR_SNAPSHOTS}; then 524 | clear_snapshots 525 | fi 526 | if ${INCREMENTAL}; then 527 | find_incrementals 528 | else 529 | export_schema 530 | export_token_ring 531 | take_snapshot 532 | find_snapshots 533 | fi 534 | copy_other_files 535 | if ${SPLIT_FILE}; then 536 | split_archive 537 | else 538 | archive_compress 539 | fi 540 | copy_to_gcs 541 | save_last_inc_backup_time 542 | backup_cleanup 543 | if ${CLEAR_INCREMENTALS}; then 544 | clear_incrementals 545 | fi 546 | } 547 | 548 | #specific variables are needed for backup 549 | function parse_yaml_backup() { 550 | loginfo "Parsing Cassandra Yaml Config Values" 551 | fields=('data_file_directories' \ 552 | 'commitlog_directory' \ 553 | 'saved_caches_directory' \ 554 | 'incremental_backups' \ 555 | 'native_transport_port' \ 556 | 'rpc_address') 557 | parse_yaml ${YAML_FILE} 558 | } 559 | 560 | #specific variables are needed for restore 561 | function parse_yaml_restore() { 562 | loginfo "Parsing Cassandra Yaml Config Values" 563 | fields=('data_file_directories' \ 564 | 'commitlog_directory' \ 565 | 'saved_caches_directory' \ 566 | 'incremental_backups' \ 567 | 'seed_provider_class_name_parameters_seeds') 568 | 569 | parse_yaml ${YAML_FILE} 570 | } 571 | 572 | function 
parse_yaml_inventory() { 573 | fields=('incremental_backups') 574 | parse_yaml ${YAML_FILE} 575 | } 576 | 577 | # Based on https://gist.github.com/pkuczynski/8665367 578 | # 579 | # Works for arrays of hashes, and some hashes with arrays 580 | # Variable names will be underscore delimited based on nested parent names 581 | # Send in yaml file as first argument and create a global array named $fields 582 | # with necessary yaml field names fully underscore delimited to match nesting 583 | # then this will register those variables into the shell's scope if they exist in Yaml 584 | # $VERBOSE=true will also print the full values 585 | # Defaults to indent of 4 so d=4 586 | # To use with indent of 2 change d to 2 587 | function parse_yaml() { 588 | local s 589 | local w 590 | local fs 591 | local d 592 | s='[[:space:]]*' 593 | w='[a-zA-Z0-9_]*' 594 | fs="$(echo @|tr @ '\034')" 595 | d=4 596 | eval $( 597 | sed -ne "s|^\($s\)\($w\)$s:$s\"\(.*\)\"$s\$|\1$fs\2$fs\3|p" \ 598 | -e "s|^\($s\)-\?$s\($w\)$s[:-]$s\(.*\)$s\$|\1$fs\2$fs\3|p" $1 | 599 | awk -F"$fs" -v names="${fields[*]}" ' 600 | BEGIN { split(names,n," ") } 601 | { sc=length($1) % "'$d'"; 602 | if ( sc == 0 ) { 603 | indent = length($1)/"'$d'" 604 | } else { 605 | indent = (length($1)+("'$d'"-sc))/"'$d'" 606 | } 607 | vname[indent] = $2; 608 | for (i in vname) {if (i > indent){ delete vname[i];}} 609 | if (length($3) > 0 ) { 610 | vn=""; 611 | for (i=0; i<indent; i++) { 612 | if (length(vname[i]) > 0) {vn=(vn)(vname[i])("_");} 613 | } 614 | ap=""; 615 | if($2 ~ /^ *$/ && vn ~ /_$/) { vn=substr(vn,1,length(vn)-1);ap="+" } 616 | for ( name in n ) { 617 | if ( $2 == n[name] || vn == n[name] || (vn)($2) == n[name]) { 618 | printf("%s%s%s=(\"%s\")\n", vn, $2, ap, $3); 619 | if ("'"$VERBOSE"'" == "true"){ 620 | printf(";logverbose %s%s%s=\\(\\\"%s\\\"\\);", vn, $2, ap, $3); 621 | } 622 | } 623 | } 624 | } 625 | }' 626 | ) 627 | }
then 632 | if [ -n "${USER_FILE}" ] && [ -f "${USER_FILE}" ]; then 633 | source "${USER_FILE}" 634 | fi 635 | if [ -z "${CASSANDRA_USER}" ] || [ -z "${CASSANDRA_PASS}" ]; then 636 | logerror "Cassandra authentication values are missing or empty CASSANDRA_USER or CASSANDRA_PASS" 637 | fi 638 | USER_OPTIONS=" -u ${CASSANDRA_USER} --password ${CASSANDRA_PASS} " 639 | fi 640 | } 641 | 642 | # Set the backup path bucket URL 643 | function create_gcs_backup_path() { 644 | GCS_BACKUP_PATH="${GCS_BUCKET}/backups/${HOSTNAME}/${SUFFIX}/${DATE}/" 645 | loginfo "Will use target backup directory: ${GCS_BACKUP_PATH}" 646 | } 647 | 648 | # In case there is an existing backup file list, clear it out 649 | function clear_backup_file_list() { 650 | loginfo "Clearing target list file: ${TARGET_LIST_FILE}" 651 | > "${TARGET_LIST_FILE}" 652 | } 653 | 654 | # Use nodetool to take a snapshot with a specific name 655 | function take_snapshot() { 656 | loginfo "Taking Snapshot ${SNAPSHOT_NAME}" 657 | #later used to remove older incrementals 658 | SNAPSHOT_TIME=$(prepare_date "+%F %H:%M:%S") 659 | if ${DRY_RUN}; then 660 | loginfo "DRY RUN: ${NODETOOL} ${USER_OPTIONS} snapshot -t ${SNAPSHOT_NAME} " 661 | else 662 | ${NODETOOL} ${USER_OPTIONS} snapshot -t "${SNAPSHOT_NAME}" #2>&1 > ${LOG_FILE} 663 | loginfo "Completed Snapshot ${SNAPSHOT_NAME}" 664 | fi 665 | } 666 | 667 | # Export the whole schema for safety 668 | function export_schema() { 669 | loginfo "Exporting Schema to ${SCHEMA_DIR}/${DATE}-schema.cql" 670 | local cqlsh_host=${rpc_address:-$CQLSH_DEFAULT_HOST} 671 | local cmd 672 | cmd="${CQLSH} ${cqlsh_host} ${native_transport_port} ${USER_OPTIONS} -e 'DESC SCHEMA;'" 673 | if ${DRY_RUN}; then 674 | loginfo "DRY RUN: ${cmd} > ${SCHEMA_DIR}/${DATE}-schema.cql" 675 | else 676 | #cqlsh does not behave consistently when executed directly from inside a script 677 | bash -c "${cmd} > ${SCHEMA_DIR}/${DATE}-schema.cql" 678 | fi 679 | echo "${SCHEMA_DIR}/${DATE}-schema.cql" >> 
"${TARGET_LIST_FILE}" 680 | } 681 | 682 | # Export the whole token ring for safety 683 | function export_token_ring() { 684 | loginfo "Exporting Token Ring to ${TOKEN_RING_DIR}/${DATE}-token-ring" 685 | local cmd 686 | cmd="${NODETOOL} ${USER_OPTIONS} ring" 687 | if ${DRY_RUN}; then 688 | loginfo "DRY RUN: ${cmd} > ${TOKEN_RING_DIR}/${DATE}-token-ring" 689 | else 690 | bash -c "${cmd} > ${TOKEN_RING_DIR}/${DATE}-token-ring" 691 | fi 692 | echo "${TOKEN_RING_DIR}/${DATE}-token-ring" >> "${TARGET_LIST_FILE}" 693 | } 694 | 695 | 696 | # Copy the commit logs, saved caches directory and the yaml config file 697 | function copy_other_files() { 698 | loginfo "Copying caches, commitlogs and config file paths to backup list" 699 | #resolves issue #2 700 | if ${INCLUDE_COMMIT_LOGS}; then 701 | find "${commitlog_directory}" -type f >> "${TARGET_LIST_FILE}" 702 | fi 703 | #resolves issue #3 704 | if ${INCLUDE_CACHES}; then 705 | find "${saved_caches_directory}" -type f >> "${TARGET_LIST_FILE}" 706 | fi 707 | echo "${YAML_FILE}" >> "${TARGET_LIST_FILE}" 708 | } 709 | 710 | # Since incrementals are automatically created as needed 711 | # this script has to find them for each keyspace and then 712 | # compare against a timestamp file to copy only the newest files 713 | function find_incrementals() { 714 | loginfo "Locating Incremental backup files" 715 | LAST_INC_BACKUP_TIME="$(head -n 1 ${BACKUP_DIR}/last_inc_backup_time)" 716 | #take the time before the backup list is compiled 717 | local time_before_find=$(prepare_date "+%F %H:%M:%S") 718 | for i in "${data_file_directories[@]}" 719 | do 720 | if [ -n "${LAST_INC_BACKUP_TIME}" ]; then 721 | find ${i} -mindepth 4 -maxdepth 4 -path '*/backups/*' -type f \ 722 | \( -name "*.db" -o -name "*.crc32" -o -name "*.txt" \) \ 723 | -newermt "${LAST_INC_BACKUP_TIME}" >> "${TARGET_LIST_FILE}" 724 | else 725 | find ${i} -mindepth 4 -maxdepth 4 -path '*/backups/*' -type f \ 726 | \( -name "*.db" -o -name "*.crc32" -o -name "*.txt" \) \ 727 | >>
"${TARGET_LIST_FILE}" 728 | fi 729 | done 730 | #if the file is empty then no new files were found 731 | if [ $(wc -l < "${TARGET_LIST_FILE}") -lt 1 ]; then 732 | loginfo "No new incremental files detected, aborting backup" 733 | exit 0 734 | fi 735 | #store time right before backup list creation to update after successful backup 736 | LAST_INC_BACKUP_TIME=${time_before_find} 737 | } 738 | 739 | # After successful backup, update last_inc_backup_time file 740 | function save_last_inc_backup_time() { 741 | if ! ${DRY_RUN}; then 742 | echo "${LAST_INC_BACKUP_TIME}" > ${BACKUP_DIR}/last_inc_backup_time 743 | fi 744 | } 745 | 746 | # Find snapshots to include in backup 747 | function find_snapshots() { 748 | loginfo "Locating Snapshot ${SNAPSHOT_NAME}" 749 | for i in "${data_file_directories[@]}" 750 | do 751 | find ${i} -path "*/snapshots/${SNAPSHOT_NAME}/*" -type f >> "${TARGET_LIST_FILE}" 752 | done 753 | } 754 | 755 | # Compress contents of backup directory 756 | function archive_compress() { 757 | loginfo "Creating Archive file: ${COMPRESS_DIR}/${ARCHIVE_FILE}" 758 | local cmd 759 | cmd="${NICE} -n${NICE_LEVEL} ${TAR} -pc ${TAR_CFLAG} -f " 760 | cmd+="${COMPRESS_DIR}/${ARCHIVE_FILE} --files-from=${TARGET_LIST_FILE}" 761 | if ${DRY_RUN}; then 762 | loginfo "DRY RUN: ${cmd}" 763 | else 764 | eval "${cmd}" 765 | fi 766 | } 767 | 768 | #For large backup files, this will split the file into multiple smaller files 769 | #which allows for more efficient upload / download from Google Cloud Storage 770 | function split_archive() { 771 | loginfo "Compressing and splitting backup" 772 | local cmd 773 | cmd="(cd ${COMPRESS_DIR} && ${NICE} -n${NICE_LEVEL} ${TAR} -pc ${TAR_CFLAG} -f - " 774 | cmd+="--files-from=${TARGET_LIST_FILE} " 775 | cmd+=" | split -d -b${SPLIT_SIZE} - ${SPLIT_FILE_SUFFIX})" 776 | if ${DRY_RUN}; then 777 | loginfo "DRY RUN: ${cmd}" 778 | else 779 | eval "${cmd}" 780 | fi 781 | } 782 | 783 | # Remove old snapshots to free space 784 |
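The `split_archive`/`restore_split` pair streams a tar archive through `split` on the way out and reassembles the chunks with `cat` on the way back. A minimal sketch of that round trip in a throwaway directory (paths here are illustrative; the real script works in `${COMPRESS_DIR}` and `${BACKUP_DIR}/restore`, and like the script this assumes GNU tar and `split -d`):

```shell
workdir="$(mktemp -d)"
mkdir -p "${workdir}/data" "${workdir}/parts" "${workdir}/restore"
printf 'hello\n' > "${workdir}/data/a.txt"

# Backup side: stream a gzipped tar to stdout and cut it into fixed-size,
# numbered chunks (tgz-00, tgz-01, ...), as split_archive does.
( cd "${workdir}" && tar -pcz -f - data | split -d -b 1k - parts/tgz- )

# Restore side: concatenate the chunks in order and extract from stdin,
# as restore_split does.
( cd "${workdir}/restore" && cat "${workdir}"/parts/tgz-* | tar -xz -f - )

restored="$(cat "${workdir}/restore/data/a.txt")"
rm -rf "${workdir}"
```

Because the numeric chunk suffixes sort lexically (`tgz-00`, `tgz-01`, ...), a plain glob feeds them to `cat` in the correct order.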
function clear_snapshots() { 785 | loginfo "Clearing old Snapshots" 786 | if ${DRY_RUN}; then 787 | loginfo "DRY RUN: did not clear snapshots" 788 | else 789 | $NODETOOL ${USER_OPTIONS} clearsnapshot 790 | fi 791 | } 792 | 793 | # If requested the old incremental backup files will be pruned following the fresh snapshot 794 | #only files not newer than the snapshot time (SNAPSHOT_TIME) are removed 795 | function clear_incrementals() { 796 | loginfo "Clearing old incremental backups" 797 | for i in "${data_file_directories[@]}" 798 | do 799 | if ${DRY_RUN}; then 800 | loginfo "DRY RUN: did not clear old incremental backups" 801 | else 802 | find ${i} -mindepth 4 -maxdepth 4 -path '*/backups/*' -type f \ 803 | \( -name "*.db" -o -name "*.crc32" -o -name "*.txt" \) \ 804 | \! -newermt "${SNAPSHOT_TIME}" -exec rm -f ${VERBOSE_RM} {} \; 805 | fi 806 | done 807 | } 808 | 809 | # Copy the backup files up to the GCS bucket 810 | function copy_to_gcs() { 811 | loginfo "Copying files to ${GCS_BACKUP_PATH}" 812 | if ${DRY_RUN}; then 813 | if ${SPLIT_FILE}; then 814 | loginfo "DRY RUN: ${GSUTIL} -m cp ${COMPRESS_DIR}/${SPLIT_FILE_SUFFIX}* ${GCS_BACKUP_PATH}" 815 | else 816 | loginfo "DRY RUN: ${GSUTIL} cp ${COMPRESS_DIR}/${ARCHIVE_FILE} ${GCS_BACKUP_PATH}" 817 | fi 818 | else 819 | if ${SPLIT_FILE}; then 820 | ${GSUTIL} -m cp "${COMPRESS_DIR}/${SPLIT_FILE_SUFFIX}*" "${GCS_BACKUP_PATH}" 821 | else 822 | ${GSUTIL} cp "${COMPRESS_DIR}/${ARCHIVE_FILE}" "${GCS_BACKUP_PATH}" 823 | fi 824 | fi 825 | } 826 | 827 | # This will optionally go through and delete files generated by the backup 828 | # if the -k --keep-old flag is set then it will not delete these files 829 | function backup_cleanup() { 830 | if ${DRY_RUN}; then 831 | loginfo "DRY RUN: Would have deleted old backup files" 832 | else 833 | if ${KEEP_OLD_FILES}; then 834 | loginfo "Keeping backup files:" 835 | loginfo " ${COMPRESS_DIR}/*" 836 | loginfo " ${SCHEMA_DIR}/${DATE}-schema.cql" 837 | loginfo "
${TOKEN_RING_DIR}/${DATE}-token-ring" 838 | else 839 | loginfo "Deleting backup files" 840 | find "${COMPRESS_DIR}/" -type f -exec rm -f ${VERBOSE_RM} {} \; 841 | find "${SCHEMA_DIR}/" -type f -exec rm -f ${VERBOSE_RM} {} \; 842 | find "${TOKEN_RING_DIR}/" -type f -exec rm -f ${VERBOSE_RM} {} \; 843 | rm -f ${VERBOSE_RM} ${TARGET_LIST_FILE} 844 | fi 845 | fi 846 | } 847 | 848 | # This restore function is designed to perform a simple restore of a full snapshot 849 | # In the event that you want to restore incremental backups you should start by 850 | # restoring the last full snapshot prior to your target incremental backup file 851 | # and manually move the files from each incremental file in chronological order 852 | # leading up to the target incremental backup file 853 | function restore() { 854 | loginfo "****NOTE: Simple restore procedure activated*****************" 855 | loginfo "****NOTE: Restore requires a full snapshot backup************" 856 | loginfo "****NOTE: Incremental backups must be manually restored******\n" 857 | restore_get_files 858 | if ${DOWNLOAD_ONLY} ; then 859 | loginfo "Backup file downloaded to ${BACKUP_DIR}/restore, this script will only" \ 860 | " restore a full snapshot" 861 | loginfo "You must manually restore incremental files in sequence after first" \ 862 | "restoring the last full snapshot taken prior to your incremental file's creation date" 863 | exit 0 864 | else 865 | restore_confirm 866 | restore_stop_cassandra 867 | restore_files 868 | restore_start_cassandra 869 | restore_cleanup 870 | fi 871 | } 872 | 873 | # Orchestrate the retrieval and extraction of the files to recover 874 | function restore_get_files() { 875 | loginfo "Starting file retrieval process" 876 | if ${DRY_RUN}; then 877 | loginfo "DRY RUN: Would have cleared restore dir ${BACKUP_DIR}/restore/*" 878 | else 879 | rm -rf ${VERBOSE_RM} ${BACKUP_DIR}/restore/* 880 | fi 881 | if ${SPLIT_FILE}; then 882 | restore_split_from_gcs 883 | else 884 | 
restore_compressed_from_gcs 885 | fi 886 | 887 | } 888 | 889 | # Download the split backup files from GCS 890 | function restore_split_from_gcs() { 891 | loginfo "Downloading restore files from GCS" 892 | if ${DRY_RUN}; then 893 | loginfo "DRY RUN: ${GSUTIL} -m cp -r ${GCS_BUCKET} ${COMPRESS_DIR}" 894 | else 895 | ${GSUTIL} -m cp -r "${GCS_BUCKET}" "${COMPRESS_DIR}" 896 | fi 897 | restore_split 898 | } 899 | 900 | # Retrieve the compressed backup file 901 | function restore_compressed_from_gcs() { 902 | if ${DRY_RUN}; then 903 | loginfo "DRY RUN: ${GSUTIL} cp ${GCS_BUCKET} ${COMPRESS_DIR}" 904 | else 905 | #copy the tar.gz file 906 | ${GSUTIL} cp "${GCS_BUCKET}" "${COMPRESS_DIR}" 907 | fi 908 | restore_decompress 909 | } 910 | 911 | # Extract the compressed backup file 912 | function restore_decompress() { 913 | loginfo "Decompressing restore files" 914 | local cmd 915 | cmd="${NICE} -n${NICE_LEVEL} ${TAR} -x ${TAR_CFLAG} " 916 | cmd+="-f ${COMPRESS_DIR}/${RESTORE_FILE} -C ${BACKUP_DIR}/restore/" 917 | if ${DRY_RUN}; then 918 | loginfo "DRY RUN: ${cmd}" 919 | else 920 | eval "${cmd}" 921 | fi 922 | } 923 | 924 | # Concatenate the split backup files and extract them 925 | function restore_split() { 926 | loginfo "Concatenating split archive and extracting files" 927 | local cmd 928 | cmd="(cd ${BACKUP_DIR}/restore/ && ${NICE} -n${NICE_LEVEL} " 929 | cmd+="cat ${COMPRESS_DIR}/${RESTORE_FILE} | ${TAR} -x ${TAR_CFLAG} " 930 | cmd+="-f - -C ${BACKUP_DIR}/restore/ )" 931 | if ${DRY_RUN}; then 932 | loginfo "DRY RUN: ${cmd}" 933 | else 934 | eval "${cmd}" 935 | fi 936 | } 937 | 938 | # The archive commands save permissions but the new directories need this 939 | # @param directory path to chown 940 | function restore_fix_perms() { 941 | loginfo "Fixing file ownership" 942 | if ${DRY_RUN}; then 943 | loginfo "DRY RUN: chown -R ${CASSANDRA_OG} ${1} " 944 | else 945 | chown -R ${CASSANDRA_OG} ${1} 946 | fi 947 | } 948 | 949 | # Do the heavy lifting of moving the files from
the restore directory back to the 950 | # correct target directories. This will also rename the current important directories 951 | # in order to keep a local copy to roll back. It then moves the restored snapshot files into place 952 | function restore_files() { 953 | loginfo "Attempting to restore files" 954 | #temporarily move current files 955 | if ${DRY_RUN}; then 956 | loginfo "DRY RUN: Copying files from ${BACKUP_DIR}/restore/" 957 | else 958 | for i in "${data_file_directories[@]}" 959 | do 960 | loginfo "Renaming ${i} to ${i}_old_${DATE} if anything fails, manually rename it" 961 | mv "${i}" "${i}_old_${DATE}" 962 | done 963 | 964 | loginfo "Renaming ${commitlog_directory} to ${commitlog_directory}_old_${DATE} "\ 965 | "if anything fails, manually rename it" 966 | mv "${commitlog_directory}" "${commitlog_directory}_old_${DATE}" 967 | 968 | loginfo "Renaming ${saved_caches_directory} to ${saved_caches_directory}_old_${DATE}"\ 969 | " if anything fails, manually rename it" 970 | mv "${saved_caches_directory}" "${saved_caches_directory}_old_${DATE}" 971 | 972 | #copy the full paths back to the root directory, excluding the yaml file 973 | mkdir -p "${commitlog_directory}" 974 | restore_fix_perms "${commitlog_directory}" 975 | mkdir -p "${saved_caches_directory}" 976 | restore_fix_perms "${saved_caches_directory}" 977 | loginfo "Performing rsync commitlogs and caches from restore directory to full path" 978 | if [ -d "${BACKUP_DIR}/restore${commitlog_directory}" ]; then 979 | rsync -aH ${VERBOSE_RSYNC} ${BACKUP_DIR}/restore${commitlog_directory}/* ${commitlog_directory}/ 980 | fi 981 | if [ -d "${BACKUP_DIR}/restore${saved_caches_directory}" ]; then 982 | rsync -aH ${VERBOSE_RSYNC} ${BACKUP_DIR}/restore${saved_caches_directory}/* ${saved_caches_directory}/ 983 | fi 984 | 985 | for i in "${data_file_directories[@]}" 986 | do 987 | #have to recreate it since we moved the old one for safety 988 | mkdir -p ${i} && restore_fix_perms ${i} 989 | loginfo "Performing rsync data files from
restore directory to full path ${i}" 990 | rsync -aH ${VERBOSE_RSYNC} ${BACKUP_DIR}/restore${i}/* ${i}/ 991 | loginfo "Moving snapshot files up two directories to their keyspace base directories" 992 | #assume the snap* pattern is safe since no other 993 | # snapshots should have been copied in the backup process 994 | find ${i} -mindepth 2 -path '*/snapshots/snap*/*' -type f \ 995 | -exec bash -c 'dir={}&& cd ${dir%/*} && mv {} ../..' \; 996 | restore_fix_perms ${i} 997 | done 998 | fi 999 | } 1000 | 1001 | # Stop the Cassandra service after flushing the transaction logs 1002 | # since we're doing a full restore in this case flushing is irrelevant 1003 | # but in future versions of this script there should be the option 1004 | # to restore a specific keyspace and stopping would require a flush first 1005 | function restore_stop_cassandra() { 1006 | if ${DRY_RUN}; then 1007 | loginfo "DRY RUN: Flushing and Stopping Cassandra" 1008 | loginfo "DRY RUN: $NODETOOL ${USER_OPTIONS} " \ 1009 | "flush; service $SERVICE_NAME stop" 1010 | else 1011 | set +e 1012 | #the following status script often throws an error, ignore it 1013 | if $NODETOOL ${USER_OPTIONS} status | grep -q "Connection refused"; then 1014 | loginfo "Attempted to Stop Cassandra service but it seems to already be stopped" 1015 | else 1016 | $NODETOOL ${USER_OPTIONS} flush 1017 | loginfo "Stopping Cassandra Service ${SERVICE_NAME} and sleep for 10 seconds" 1018 | service ${SERVICE_NAME} stop 1019 | sleep 10 1020 | fi 1021 | set -e 1022 | fi 1023 | } 1024 | 1025 | # If Cassandra is not part of a cluster then restart it. 
If it is part of a cluster, 1026 | # the restore must be completed for every node before restarting them, or the newer 1027 | # data on the other nodes will overwrite the old data that was just restored 1028 | function restore_start_cassandra() { 1029 | if ${DRY_RUN}; then 1030 | loginfo "DRY RUN: Starting Cassandra" 1031 | else 1032 | if "${AUTO_RESTART}"; then 1033 | service ${SERVICE_NAME} start 1034 | fi 1035 | fi 1036 | } 1037 | 1038 | # This will optionally go through and delete any copies of old data files 1039 | #if the -k --keep-old flag is set then it will not delete these files 1040 | function restore_cleanup() { 1041 | if ${DRY_RUN}; then 1042 | loginfo "DRY RUN: Would have deleted old data files" 1043 | else 1044 | if ${KEEP_OLD_FILES}; then 1045 | loginfo "Keeping old files:" 1046 | loginfo " ${commitlog_directory}_old_${DATE}" 1047 | loginfo " ${saved_caches_directory}_old_${DATE}" 1048 | loginfo " ${BACKUP_DIR}/restore/" 1049 | else 1050 | loginfo "Deleting old files" 1051 | rm -rf ${VERBOSE_RM} "${commitlog_directory}_old_${DATE}" 1052 | rm -rf ${VERBOSE_RM} "${saved_caches_directory}_old_${DATE}" 1053 | rm -rf ${VERBOSE_RM} "${BACKUP_DIR}/restore/" 1054 | fi 1055 | 1056 | for i in "${data_file_directories[@]}" 1057 | do 1058 | if ${KEEP_OLD_FILES}; then 1059 | loginfo "keeping old data: ${i}_old_${DATE}" 1060 | else 1061 | loginfo "deleting old data: ${i}_old_${DATE}" 1062 | rm -rf ${VERBOSE_RM} "${i}_old_${DATE}" 1063 | fi 1064 | done 1065 | rm -rf ${VERBOSE_RM} ${BACKUP_DIR}/restore 1066 | rm -rf ${VERBOSE_RM} ${COMPRESS_DIR:?"aborting bad compress_dir"}/* 1067 | fi 1068 | } 1069 | 1070 | #restore should be performed by a person 1071 | #the -f option will force restore without confirmation 1072 | function restore_confirm() { 1073 | 1074 | if ${FORCE_RESTORE}; then 1075 | return 1076 | fi 1077 | while true 1078 | do 1079 | read -p "Confirm: Stop Cassandra and restore the files \ 1080 | from ${BACKUP_DIR}/restore? 
Y or N" ans 1081 | case $ans in 1082 | [yY]* ) 1083 | echo "Okay, commencing restore"; 1084 | break 1085 | ;; 1086 | [nN]* ) 1087 | loginfo "Exiting restore" 1088 | exit 0 1089 | break 1090 | ;; 1091 | 1092 | * ) 1093 | echo "Enter Y or N, please."; 1094 | ;; 1095 | esac 1096 | done 1097 | } 1098 | 1099 | # Transform long options to short ones 1100 | for arg in "$@"; do 1101 | shift 1102 | case "$arg" in 1103 | 1104 | "backup") set -- "$@" "-B" ;; 1105 | "restore") set -- "$@" "-r" ;; 1106 | "commands") 1107 | commands 1108 | exit 0 1109 | ;; 1110 | "options") 1111 | options 1112 | exit 0 1113 | ;; 1114 | "inventory") set -- "$@" "-I" ;; 1115 | "--alt-hostname") set -- "$@" "-a" ;; 1116 | "--auth-file") set -- "$@" "-U" ;; 1117 | "--gcsbucket") set -- "$@" "-b" ;; 1118 | "--backupdir") set -- "$@" "-d" ;; 1119 | "--bzip") set -- "$@" "-j" ;; 1120 | "--clear-old-ss") set -- "$@" "-c" ;; 1121 | "--clear-old-inc") set -- "$@" "-C" ;; 1122 | "--download-only") set -- "$@" "-D" ;; 1123 | "--force") set -- "$@" "-f" ;; 1124 | "--help") set -- "$@" "-h" ;; 1125 | "--home-dir") set -- "$@" "-H" ;; 1126 | "--inc-commit-logs") set -- "$@" "-L" ;; 1127 | "--incremental") set -- "$@" "-i" ;; 1128 | "--log-dir") set -- "$@" "-l" ;; 1129 | "--keep-old") set -- "$@" "-k" ;; 1130 | "--noop") set -- "$@" "-n" ;; 1131 | "--nice") set -- "$@" "-N" ;; 1132 | "--service-name") set -- "$@" "-S" ;; 1133 | "--split-size") set -- "$@" "-s" ;; 1134 | "--target-gz-dir") set -- "$@" "-T" ;; 1135 | "--verbose") set -- "$@" "-v" ;; 1136 | "--with-caches") set -- "$@" "-w" ;; 1137 | "--yaml") set -- "$@" "-y" ;; 1138 | "--zip") set -- "$@" "-z" ;; 1139 | *) set -- "$@" "$arg" 1140 | esac 1141 | done 1142 | 1143 | while getopts 'a:b:BcCd:DfhH:iIjkl:LnN:p:rs:S:T:u:U:vwy:z' OPTION 1144 | do 1145 | case $OPTION in 1146 | a) 1147 | HOSTNAME=${OPTARG} 1148 | ;; 1149 | b) 1150 | GCS_BUCKET=${OPTARG%/} 1151 | ;; 1152 | B) 1153 | ACTION="backup" 1154 | ;; 1155 | c) 1156 | CLEAR_SNAPSHOTS=true 1157 | ;; 
    C)
      CLEAR_INCREMENTALS=true
      ;;
    d)
      BACKUP_DIR=${OPTARG}
      ;;
    D)
      DOWNLOAD_ONLY=true
      ;;
    f)
      FORCE_RESTORE=true
      ;;
    h)
      print_usage
      exit 0
      ;;
    H)
      CASS_HOME=${OPTARG%/}
      ;;
    i)
      INCREMENTAL=true
      ;;
    I)
      ACTION="inventory"
      ;;
    j)
      BZIP=true
      COMPRESSION=true
      TAR_CFLAG="-j"
      TAR_EXT="tbz"
      ;;
    k)
      KEEP_OLD_FILES=true
      ;;
    l)
      LOG_OUTPUT=true
      [ -d "${OPTARG}" ] && LOG_DIR=${OPTARG%/}
      ;;
    L)
      INCLUDE_COMMIT_LOGS=true
      ;;
    n)
      DRY_RUN=true
      ;;
    N)
      NICE_LEVEL=${OPTARG}
      ;;
    p)
      CASSANDRA_PASS=${OPTARG}
      ;;
    r)
      ACTION="restore"
      ;;
    s)
      SPLIT_SIZE="${OPTARG//[a-zA-Z]/}M" # strip any letters, then append M
      SPLIT_FILE=true
      ;;
    S)
      SERVICE_NAME=${OPTARG}
      ;;
    T)
      COMPRESS_DIR=${OPTARG%/}
      ;;
    u)
      CASSANDRA_USER=${OPTARG}
      USE_AUTH=true
      ;;
    U)
      USER_FILE=${OPTARG}
      USE_AUTH=true
      ;;
    v)
      VERBOSE=true
      ;;
    w)
      INCLUDE_CACHES=true
      ;;
    y)
      YAML_FILE=${OPTARG}
      ;;
    z)
      COMPRESSION=true
      TAR_CFLAG="-z"
      TAR_EXT="tgz"
      ;;
    ?)
      print_help
      ;;
  esac
done

ACTION=${ACTION:-backup} # backup, restore, or inventory
AGE=5 # incremental backups last modified more than this many minutes ago are pruned
AUTO_RESTART=true # set to false if Cassandra is part of a cluster
BACKUP_DIR=${BACKUP_DIR:-/cassandra/backups} # backups base directory
BZIP=${BZIP:-false} # use bzip2 compression
CASSANDRA_PASS=${CASSANDRA_PASS:-''} # password for Cassandra CQLSH account
CASSANDRA_USER=${CASSANDRA_USER:-''} # username for Cassandra CQLSH account
CASSANDRA_OG="cassandra:cassandra" # modify this if you changed the system cassandra user and group
CLEAR_INCREMENTALS=${CLEAR_INCREMENTALS:-false} # delete incrementals after the snapshot
CLEAR_SNAPSHOTS=${CLEAR_SNAPSHOTS:-false} # clear old snapshots before taking a new one
COMPRESS_DIR=${COMPRESS_DIR:-${BACKUP_DIR}/compressed} # directory to house the backup archive
COMPRESSION=${COMPRESSION:-false} # flag to use tar+gzip
CQLSH="$(which cqlsh)" # path to the cqlsh command
CQLSH_DEFAULT_HOST="127.0.0.1" # cqlsh host - currently hard coded
DATE="$(date +%F_%H-%M)" # nicely formatted date string for files
DOWNLOAD_ONLY=${DOWNLOAD_ONLY:-false} # user flag, also set when an incremental restore is requested
DRY_RUN=${DRY_RUN:-false} # only print what would have been executed
ERROR_COUNT=0 # used in the validation step; the script exits if > 0
FORCE_RESTORE=${FORCE_RESTORE:-false} # bypass the restore confirmation prompt
GSUTIL="$(which gsutil)" # path to the gsutil script
HOSTNAME=${HOSTNAME:-"$(hostname)"} # used for the GCS backup location
INCLUDE_CACHES=${INCLUDE_CACHES:-false} # include the saved caches for posterity
INCLUDE_COMMIT_LOGS=${INCLUDE_COMMIT_LOGS:-false} # include the commit logs for extra safety
INCREMENTAL=${INCREMENTAL:-false} # back up only incremental files
KEEP_OLD_FILES=${KEEP_OLD_FILES:-false} # keep the pre-restore copies of old data files
LOG_DIR=${LOG_DIR:-/var/log/cassandra} # where to write the log files
LOG_FILE="${LOG_DIR}/CassandraBackup${DATE}.log" # script log file
LOG_OUTPUT=${LOG_OUTPUT:-false} # write output to the log file instead of stdout
NICE="$(which nice)" # path to nice, for low-impact tar
NICE_LEVEL=${NICE_LEVEL:-10} # 10 is the default nice level
NODETOOL="$(which nodetool)" # path to nodetool
USER_OPTIONS="" # nodetool and cqlsh options
SCHEMA_DIR="${BACKUP_DIR}/schema" # schema backups directory
TOKEN_RING_DIR="${BACKUP_DIR}/token_ring" # token ring backups directory
SERVICE_NAME=${SERVICE_NAME:-cassandra} # sometimes the service name is different
SNAPSHOT_NAME=snap-${DATE} # name of the new snapshot to take
SNAPSHOT_TIME="" # used to keep track of when the snapshot was taken
SPLIT_SIZE=${SPLIT_SIZE:-"100M"} # size of the pieces if the split command is used
SPLIT_FILE=${SPLIT_FILE:-false} # whether or not to run split on the backup archive
SUFFIX="snpsht" # differentiates the two types of backup files
${INCREMENTAL} && SUFFIX="incr" # in incremental mode, change the file suffix
TAR="$(which tar)" # path to the tar command
TAR_EXT=${TAR_EXT:-tar} # archive extension: tar by default, tgz/tbz when compression is requested
TAR_CFLAG=${TAR_CFLAG:-""} # tar compression flag: -z for gzip, -j for bzip2
#TARGET_SCHEMA=${TARGET_SCHEMA:-schema} # restore a specific schema, not implemented yet
USE_AUTH=${USE_AUTH:-false} # use cqlsh authentication
VERBOSE=${VERBOSE:-false} # print detailed information
VERBOSE_RSYNC="" # add more detail to rsync when verbose mode is active
${VERBOSE} && VERBOSE_RSYNC="-v --progress"
VERBOSE_RM="" # add more detail to rm when verbose mode is active
${VERBOSE} && VERBOSE_RM="-v"
YAML_FILE=${YAML_FILE:-/etc/cassandra/cassandra.yaml} # Cassandra config file
ARCHIVE_FILE="cass-${DATE}-${SUFFIX}.${TAR_EXT}"
SPLIT_FILE_SUFFIX="cass-${DATE}-${SUFFIX}"
TARGET_LIST_FILE="${BACKUP_DIR}/${SUFFIX}_backup_files-${DATE}"
LAST_INC_BACKUP_TIME="" # used to keep track of the incremental backup time

# Validate input
validate
# Execute the requested action
eval $ACTION
--------------------------------------------------------------------------------
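The script's two-pass argument handling — rewrite long options into their short forms with `set --`, then parse everything with a single `getopts` loop — can be exercised standalone. A minimal sketch of the same technique (a hypothetical demo, not part of cassandra-cloud-backup.sh; the `parse_demo` wrapper and its option set are invented for illustration):

```shell
#!/usr/bin/env bash
# Sketch of the long-to-short option translation technique used above.
parse_demo() {
  local VERBOSE=false BUCKET=""
  # Pass 1: translate long options into short ones by rebuilding "$@".
  # "for arg in \"$@\"" snapshots the argument list before the body runs,
  # so the shift/set -- mutations inside the loop are safe.
  for arg in "$@"; do
    shift
    case "$arg" in
      "--verbose")   set -- "$@" "-v" ;;
      "--gcsbucket") set -- "$@" "-b" ;;
      *)             set -- "$@" "$arg" ;; # pass everything else through
    esac
  done
  # Pass 2: a single getopts loop now handles both spellings.
  local OPTIND=1 OPTION
  while getopts 'b:v' OPTION; do
    case $OPTION in
      b) BUCKET=${OPTARG} ;;
      v) VERBOSE=true ;;
    esac
  done
  echo "verbose=${VERBOSE} bucket=${BUCKET}"
}

parse_demo --verbose --gcsbucket gs://example-bucket
# prints: verbose=true bucket=gs://example-bucket
```

Note that an option's value (here `gs://example-bucket`) is shifted and re-appended unchanged by the `*)` arm, so it still immediately follows its rewritten flag when `getopts` consumes it.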