├── .gitignore ├── CHANGES ├── CHECKLIST ├── COPYING ├── INSTALL ├── Makefile.am ├── README ├── README.md ├── README_nbd.md ├── TODO ├── block_cache.c ├── block_cache.h ├── block_part.c ├── block_part.h ├── cleanup.sh ├── compress.c ├── compress.h ├── configure.ac ├── dcache.c ├── dcache.h ├── ec_protect.c ├── ec_protect.h ├── erase.c ├── erase.h ├── examples └── create_zpool.py ├── fuse_ops.c ├── fuse_ops.h ├── hash.c ├── hash.h ├── http_io.c ├── http_io.h ├── main.c ├── nbdkit.c ├── nbdkit.h ├── rebuild.sh ├── reset.c ├── reset.h ├── s3b_config.c ├── s3b_config.h ├── s3backer.1.in ├── s3backer.h ├── sslcompat.c ├── test_io.c ├── test_io.h ├── tester.c ├── util.c ├── util.h ├── zero_cache.c └── zero_cache.h /.gitignore: -------------------------------------------------------------------------------- 1 | aclocal.m4 2 | autom4te.cache 3 | config.h 4 | config.h.in 5 | config.log 6 | config.status 7 | configure 8 | debian/Makefile 9 | debian/Makefile.in 10 | .deps 11 | gitrev.c 12 | *.la 13 | .libs/ 14 | libtool 15 | *.lo 16 | m4 17 | Makefile 18 | Makefile.in 19 | *.o 20 | s3backer 21 | s3backer.1 22 | scripts 23 | stamp-h1 24 | tags 25 | TAGS 26 | tester 27 | -------------------------------------------------------------------------------- /CHANGES: -------------------------------------------------------------------------------- 1 | Version 2.1.5 released May 22, 2025 2 | 3 | - Add new flag "--statsFileMirror" (issue #237) 4 | - Update to FUSE 3.x (#239) 5 | 6 | Version 2.1.4 released May 10, 2025 7 | 8 | - Add new flag "--sharedDiskMode" (issue #236) 9 | - Fix use of deprecated cURL constant (pr #228) 10 | - Conditionalize nbdkit "block_size" callback (issue #226) 11 | - Skip nbd-client(8) cleanup step if it failed to start properly (#227). 
12 | - Add new flag "--accessEC2IAM-IMDSv2" (issue #228) 13 | 14 | Version 2.1.3 released June 12, 2024 15 | 16 | - Implement nbdkit "block_size" callback 17 | - Added configure flag "--enable-nbd" (issue #223) 18 | - Fix spurious "cache file is truncated" error (issue #222) 19 | - Fix bugs in dcache.c when USE_FALLOCATE enabled (issue #224) 20 | 21 | Version 2.1.2 released April 23, 2024 22 | 23 | - Automatically recreate nbdkit socket's directory if missing (issue #219) 24 | - Recalculate date and authorization on retry (issue #214) 25 | 26 | Version 2.1.1 released October 23, 2023 27 | 28 | - Fixed bug with bulk delete HTTP requests (issue #211) 29 | - Fixed empty payload bug when retrying an upload (issue #212) 30 | - Check for errors returned by curl_easy_setopt() 31 | - Added --noCurlCache flag to facilitate debugging 32 | - Bump wait time for nbdkit to start from 1s to 5s 33 | 34 | Version 2.1.0 released September 23, 2023 35 | 36 | - Avoid sending an empty Accept-Encoding header (issue #208) 37 | - Use fallocate(FALLOC_FL_PUNCH_HOLE) for empty disk cache blocks (issue #200) 38 | - Made sync(1) work on s3backer file to flush cached data (issue #197) 39 | - Fixed bug where s3b_compress=deflate NBD flag would fail (issue #195) 40 | - Fixed bug in zero cache "current_cache_size" stats value (issue #204) 41 | - Reuse cURL handles after normal HTTP error codes (issue #199) 42 | - Run modprobe(8) if needed when starting with the --nbd flag (issue #203) 43 | - Use newer OpenSSL 3.0 API functions when available 44 | 45 | Version 2.0.2 released July 17, 2022 46 | 47 | - Fixed bugs with the --nbd flag when not also using -f flag (issue #188) 48 | - Fixed free() of invalid pointer bug in zero_cache.c (issue #191) 49 | - Added support for zstd compression (pr #187) 50 | 51 | Version 2.0.1 released June 4, 2022 52 | 53 | - Fix null pointer dereference bug in zero cache. 
54 | 55 | Version 2.0.0 released June 4, 2022 56 | 57 | - Support running as a Network Block Device (NBD) server (issue #178) 58 | - Fix slow write performance with large block sizes since 1.6.0 (issue #185) 59 | - Disable MD5 cache by default now that Amazon S3 is consistent (issue #183) 60 | - Fix bug where "(null)" was appearing in usage message (issue #161) 61 | - Give more meaningful usage error when size limits are exceeded (issue #162) 62 | - Added flag "--http11" to restrict cURL to HTTP 1.1 (issue #168) 63 | - Clean up mount token if FUSE itself fails to start (issue #175) 64 | - Detect HTTP redirects and emit appropriate hint/warning (issue #174) 65 | - Added man page reference to Linux bcache (issue #169) 66 | - Added "--blockCacheFileAdvise" flag (pr #176) 67 | - Treat 3xx HTTP response codes as errors 68 | 69 | Version 1.6.3 released October 2, 2021 70 | 71 | - Fixed bug with `--listBlocks' threads doing redundant overlapping queries 72 | - Refactor to support multiple different compression algorithms 73 | - Added support for bulk deletion of blocks (issue #46) 74 | - Fixed mutex handling bug in block cache (issue #152) 75 | - Release mutexes prior to destruction (issue #151) 76 | - Fixed use-after-free bug in XML parser (pr #154) 77 | 78 | Version 1.6.2 released June 14, 2021 79 | 80 | - Require test directory to be absolute unless `-f' flag given (issue #143) 81 | - Show HTTP error response payload content when `--debug-http' flag given 82 | - List blocks in the background; added `--listBlocksThreads' (issue #24) 83 | - Don't let IAM credentials expire while listing blocks (issue #146) 84 | - Fixed bug parsing "--configFile" inside comma-separated option list 85 | 86 | Version 1.6.1 released December 30, 2020 87 | 88 | - Fixed integer overflow bug setting "x-amz-meta-s3backer-filesize" (issue #141) 89 | 90 | Version 1.6.0 released December 23, 2020 91 | 92 | - Added `--sse-key-id' flag (issue #137) 93 | - Log a more useful error message when IAM 
credentials not found (issue #136) 94 | - Warn on startup if disk space is insufficient for disk cache (issue #138) 95 | - Added zero block cache to better handle fstrim operations (issue #139) 96 | 97 | Version 1.5.6 released October 30, 2020 98 | 99 | - Fixed phantom write error when server-side-encryption used (issue #135) 100 | - Allow bucket names with "subdirectory" for prefix (issue #130) 101 | - Added `--configFile' flag (issue #129) 102 | - Added `--accessKeyEnv' flag (issue #128) 103 | - Removed deprecated `--rrs' flag 104 | 105 | Version 1.5.5 released August 22, 2020 106 | 107 | - Added `--no-vhost' flag (issue #117) 108 | - Added `--blockCacheNumProtected' flag (pr #119) 109 | - Added `--test-errors', `--test-delays', and `--test-discard' 110 | - Disallow stream encryption ciphers (issue #123) 111 | 112 | Version 1.5.4 released October 8, 2019 113 | 114 | - Only set "x-amz-server-side-encryption" header with PUT requests (issue #116) 115 | - Don't kill IAM thread unless actually started (issue #115). 
116 | 117 | Version 1.5.3 released August 9, 2019 118 | 119 | - Fixed bug where IAM update thread was killed after fork (issue #115) 120 | - Fixed use-after-free bug in block_cache_verified() (issue #113) 121 | - Fixed use-after-free bug when updating IAM credentials (pr #114) 122 | - Fixed bug in test mode that was causing bogus I/O errors 123 | 124 | Version 1.5.2 released July 9, 2019 125 | 126 | - Fixed bug where block cache would not work when run in the background (issue #112) 127 | - Fixed bug where we were not parsing HTTP headers case-insensitively (pr #11) 128 | - Bail out during `--listBlocks' if we see an object name past our block range 129 | - Added `--blockHashPrefix' flag (issue #80) 130 | 131 | Version 1.5.1 released April 15, 2019 132 | 133 | - Fixed a few places where fixed-sized buffers were too small (issue #108) 134 | - Don't claim cache hit if partial write required reading the block (pr #103) 135 | - Exit process with error code if s3backer store setup fails at startup 136 | - Reset statistics if stats file is unlinked (issue #106) 137 | 138 | Version 1.5.0 released June 9, 2018 139 | 140 | - Add support for recovering dirty blocks in the disk cache (issue #87) 141 | - Replaced boolean 'mounted' flag with a unique 32-bit mount token (issue #87) 142 | - Wait for min_write_delay before access after write error (issue #76) 143 | - Configure TCP keep-alive on HTTP connections (issue #78) 144 | - Added support for server side encryption (pull #81) 145 | 146 | Version 1.4.4 released February 1, 2017 147 | 148 | - Added `--defaultContentEncoding' for non-compliant backends (issue #68) 149 | - Fixed auth bug when prefix contains URL-encodable char (issue #69) 150 | - Remove restriction preventing streaming encryption modes (issue #70) 151 | 152 | Version 1.4.3 released July 25, 2016 153 | 154 | - Add support for STANDARD_IA storage class (issue #59) 155 | - Set "Accept-Encoding" header appropriately (issue #48) 156 | - Fix build issue with OpenSSL 
1.1.0 (issue #64) 157 | 158 | Version 1.4.2 released September 1, 2015 159 | 160 | - Update license to fix OpenSSL vs. GPL conflict 161 | - Remove obsolete Debian files 162 | - Fix typos in help output 163 | 164 | Version 1.4.1 released May 4, 2015 165 | 166 | - Fix use-after-free bug configuring base URL (github issue #44) 167 | 168 | Version 1.4.0 released April 17, 2015 169 | 170 | - Added support for authentication version 4 (issue #51) 171 | - Added support for credentials via IAM role from EC2 meta-data (issue #48) 172 | - Fixed bug where `--erase' did not clear the mounted flag 173 | - Moved from Google project hosting to GitHub 174 | - Fixed compile problem on FreeBSD 175 | 176 | Version 1.3.7 (r496) released 18 July 2013 177 | 178 | - Add `--keyLength' for overriding generated encryption key length 179 | 180 | Version 1.3.6 (r493) released 16 July 2013 181 | 182 | - Fix use of MAX_HOST_NAME in http_io.c (issue #42) 183 | - Fix encryption key generation bug (on some systems) 184 | 185 | Version 1.3.5 (r485) released 29 May 2013 186 | 187 | - Check for duplicate mount at startup (issue #10) 188 | - Remove obsolete backward-compatibility block size check 189 | 190 | Version 1.3.4 (r476) released 2 Apr 2013 191 | 192 | - Support FUSE fallocate() call to zero unused blocks 193 | 194 | Version 1.3.3 (r463) released 7 Apr 2012 195 | 196 | - Fix bug in validation of --baseURL parameter (issue #34) 197 | - Accept 404 Not Found as a valid response to a DELETE (issue #35) 198 | - Added a fix for building on Mac OS X (issue #32) 199 | 200 | Version 1.3.2 (r451) released 14 May 2011 201 | 202 | - Added `--directIO' flag to disable kernel caching of the backed file. 203 | - Fixed bug where the stats file was not up to date (issue #26). 204 | - Fixed bug with `--blockCacheMaxDirty' not working (issue #25). 205 | - Added automatic block cache disk file resizing (issue #23). 206 | - Added `--maxUploadSpeed' and `--maxDownloadSpeed' flags. 
207 | - Added `--rrs' flag to support Reduced Redundancy Storage. 208 | - Fixed missing warning for `--baseURL' when missing trailing slash. 209 | 210 | Version 1.3.1 (r413) released 19 Oct 2009 211 | 212 | - Added `--blockCacheMaxDirty' flag. 213 | - Fixed cURL handle leak when cancelling in-progress writes. 214 | - Updated Mac OS X build instructions and added Snow Leopard support. 215 | 216 | Version 1.3.0 (r392) released 27 Sep 2009 217 | 218 | - Added support for local cache files that can persist across restarts. 219 | - Added built-in support for encryption and authentication. 220 | - In-progress writes are now cancelled when a duplicate write occurs. 221 | - Changed default for `--blockCacheWriteDelay' from zero to 250ms. 222 | - Fix obscure and unlikely deadlock bug in ec_protect.c. 223 | - Allow configurable compression level via --compress=LEVEL. 224 | - Fix bug that caused spurious "impossible expected MD5" log messages. 225 | 226 | Version 1.2.3 (r333) released 15 May 2009 227 | 228 | - Added `--vhost' flag for virtual hosted style URLs in all requests. 229 | - Don't send LOG_DEBUG messages to syslog unless --debug flag given. 230 | - Fix race condition when generating HTTP Date: headers. 231 | - Allow command line flags to be specified in /etc/fstab. 232 | 233 | Version 1.2.2 (r316) released 20 Dec 2008 234 | 235 | - Added `--compress' flag enabling compression of file blocks. 236 | Note: compressed blocks are not compatible with versions < 1.2.2. 237 | - Disable the MD5 cache when the `--readOnly' flag is given. 238 | - Make `--md5CacheTime=0' really mean `infinite' as promised in man page. 239 | - Added `--debug-http' flag for debugging HTTP headers. 240 | - Don't let block and MD5 caches be configured larger than necessary. 241 | - Fixed a few minor issues with statistics reporting. 242 | 243 | Version 1.2.1 (r300) released 23 Oct 2008 244 | 245 | - Added `--erase' and `--quiet' command line flags. 246 | - Added `--blockCacheSync' command line flag. 
247 | - Fixed extra copying slowdown when using large block sizes (issue #5). 248 | - Eliminate extra copy of blocks when written by block_cache worker threads. 249 | - Fixed bug in EC layer where dirty data might not be flushed at shutdown. 250 | - Fixed bug where 'http' was shown instead of 'https' in mount(8) output 251 | when the --ssl flag was given. 252 | 253 | Version 1.2.0 (r248) released 12 Sep 2008 254 | 255 | - Use new custom hash table implementation; this removes glib dependency. 256 | - Replaced `--assumeEmpty' flag with safer and more useful `--listBlocks'. 257 | - Fixed bug where the zero block optimization got disabled when the 258 | MD5 cache was disabled. 259 | - Supply `-o allow_other' option by default, since default mode is 0600. 260 | - Fixed bug where cp(1)'ing the backed file gave `Illegal seek' error. 261 | - Use FUSE version 25 API so code builds on older O/S distributions. 262 | 263 | Version 1.1.1 (r202) released 5 Aug 2008 264 | 265 | - Added `--ssl' as an alias for `--baseURL https://s3.amazonaws.com/'. 266 | - Added `--insecure' and `--cacert' flags to configure cURL SSL checks. 267 | - Implemented `--blockCacheWriteDelay' and `--blockCacheTimeout' flags. 268 | - Implemented read-ahead using `--readAhead' and `--readAheadTrigger' flags. 269 | - Set FUSE max_readahead option to zero by default since we do it too now. 270 | - Added new `--test' flag which turns on local test mode. 271 | - Display the URL, bucket, and prefix in the output of mount(8). 272 | - Fixed bug where an error during auto-detection would cause a segfault. 273 | - Fixed bug where read errors from the underlying store were being ignored 274 | by the block cache layer. 275 | 276 | Version 1.1.0 (r150) released 26 July 2008 277 | 278 | - Added a block cache with parallel writes which vastly improves performance. 279 | - Added a new `stats' file to the filesystem containing various statistics. 280 | - Added `--noAutoDetect' flag to disable auto-detection at startup. 
281 | - Fixed a few small race conditions and memory leaks. 282 | - Return zeroes for unwritten blocks with `assumeEmpty'. 283 | 284 | Version 1.0.5 (r111) released 15 July 2008 285 | 286 | - Avoid reuse of CURL instance after receiving any HTTP error (issue #3) 287 | - On MacOS, prevent kernel timeouts prior to our own timeout (issue #2) 288 | - Replaced `--connectTimeout' and `--ioTimeout' with `--timeout' because 289 | CURL's I/O timeout includes in it the connection time as well. 290 | 291 | Version 1.0.4 (r82) released 9 July 2008 292 | 293 | - Retry on all HTTP error codes, not just 500 or greater. Tests show that 294 | a valid request can return a 4xx response due to network issues. 295 | - Added `--fileMode' and `--readOnly' flags. 296 | - Added `--assumeEmpty' flag. 297 | - Support 'E' for 'exabytes'. 298 | - Port to Mac OS (issue #1) 299 | 300 | Version 1.0.3 (r39) released 30 June 2008 301 | 302 | - Implement exponential backoff: replace ``--maxRetry'' and ``--retryPause'' 303 | with ``--initialRetryPause'' and ``--maxRetryPause''. 304 | - Fix `--accessType' flag which was not being properly handled. 305 | - Improvements to the man page. 306 | 307 | Version 1.0.2 (r25) released 20 June 2008 308 | 309 | - Fix bug in setting User-Agent HTTP header. 310 | - Fix glitch in man page. 311 | 312 | Version 1.0.1 (r18) released 20 June 2008 313 | 314 | - Store filesystem size in meta-data associated with the first block and 315 | use it to auto-detect filesystem block and file sizes if not specified. 316 | As a result, `--size' flag is now optional. 317 | - Log a warning and zero remaining bytes when we encounter a short read. 318 | - Add User-Agent HTTP header to all HTTP requests. 319 | - Include SVN revision in version string. 320 | - Don't log every HTTP operation unless `-d' is passed. 321 | - Added `--force' flag. 
322 | 323 | Version 1.0.0 released 19 June 2008 324 | 325 | - Initial release 326 | -------------------------------------------------------------------------------- /CHECKLIST: -------------------------------------------------------------------------------- 1 | Checklist for releasing version VERSION 2 | --------------------------------------- 3 | 4 | Final check 5 | make distcheck 6 | test tarball builds and works on Linux, MacOS... 7 | 8 | Tag release and release tarball 9 | sh cleanup.sh 10 | verify everything is clean 11 | update CHANGES with today's date and VERSION 12 | edit configure.ac and update with VERSION 13 | git commit 14 | git tag -a -m 'Tagging release VERSION' VERSION 15 | sh rebuild.sh && ./configure && make distcheck 16 | upload tarball to Amazon S3 17 | 18 | s3backer project 19 | update wikified man page 20 | update wiki Download page 21 | send email to s3backer-devel google group 22 | 23 | OBS 24 | update OBS project 25 | 26 | -------------------------------------------------------------------------------- /COPYING: -------------------------------------------------------------------------------- 1 | 2 | In addition to the license below, as a special exception, the copyright holders 3 | give permission to link the code of portions of this program with the OpenSSL 4 | library under certain conditions as described in each individual source file, 5 | and distribute linked combinations including the two. 6 | 7 | You must obey the GNU General Public License in all respects for all of the code 8 | used other than OpenSSL. If you modify file(s) with this exception, you may 9 | extend this exception to your version of the file(s), but you are not obligated 10 | to do so. If you do not wish to do so, delete this exception statement from 11 | your version. If you delete this exception statement from all source files in 12 | the program, then also delete it here. 
13 | 14 | ---------------------- 15 | 16 | GNU GENERAL PUBLIC LICENSE 17 | Version 2, June 1991 18 | 19 | Copyright (C) 1989, 1991 Free Software Foundation, Inc., 20 | 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA 21 | Everyone is permitted to copy and distribute verbatim copies 22 | of this license document, but changing it is not allowed. 23 | 24 | Preamble 25 | 26 | The licenses for most software are designed to take away your 27 | freedom to share and change it. By contrast, the GNU General Public 28 | License is intended to guarantee your freedom to share and change free 29 | software--to make sure the software is free for all its users. This 30 | General Public License applies to most of the Free Software 31 | Foundation's software and to any other program whose authors commit to 32 | using it. (Some other Free Software Foundation software is covered by 33 | the GNU Lesser General Public License instead.) You can apply it to 34 | your programs, too. 35 | 36 | When we speak of free software, we are referring to freedom, not 37 | price. Our General Public Licenses are designed to make sure that you 38 | have the freedom to distribute copies of free software (and charge for 39 | this service if you wish), that you receive source code or can get it 40 | if you want it, that you can change the software or use pieces of it 41 | in new free programs; and that you know you can do these things. 42 | 43 | To protect your rights, we need to make restrictions that forbid 44 | anyone to deny you these rights or to ask you to surrender the rights. 45 | These restrictions translate to certain responsibilities for you if you 46 | distribute copies of the software, or if you modify it. 47 | 48 | For example, if you distribute copies of such a program, whether 49 | gratis or for a fee, you must give the recipients all the rights that 50 | you have. You must make sure that they, too, receive or can get the 51 | source code. 
And you must show them these terms so they know their 52 | rights. 53 | 54 | We protect your rights with two steps: (1) copyright the software, and 55 | (2) offer you this license which gives you legal permission to copy, 56 | distribute and/or modify the software. 57 | 58 | Also, for each author's protection and ours, we want to make certain 59 | that everyone understands that there is no warranty for this free 60 | software. If the software is modified by someone else and passed on, we 61 | want its recipients to know that what they have is not the original, so 62 | that any problems introduced by others will not reflect on the original 63 | authors' reputations. 64 | 65 | Finally, any free program is threatened constantly by software 66 | patents. We wish to avoid the danger that redistributors of a free 67 | program will individually obtain patent licenses, in effect making the 68 | program proprietary. To prevent this, we have made it clear that any 69 | patent must be licensed for everyone's free use or not licensed at all. 70 | 71 | The precise terms and conditions for copying, distribution and 72 | modification follow. 73 | 74 | GNU GENERAL PUBLIC LICENSE 75 | TERMS AND CONDITIONS FOR COPYING, DISTRIBUTION AND MODIFICATION 76 | 77 | 0. This License applies to any program or other work which contains 78 | a notice placed by the copyright holder saying it may be distributed 79 | under the terms of this General Public License. The "Program", below, 80 | refers to any such program or work, and a "work based on the Program" 81 | means either the Program or any derivative work under copyright law: 82 | that is to say, a work containing the Program or a portion of it, 83 | either verbatim or with modifications and/or translated into another 84 | language. (Hereinafter, translation is included without limitation in 85 | the term "modification".) Each licensee is addressed as "you". 
86 | 87 | Activities other than copying, distribution and modification are not 88 | covered by this License; they are outside its scope. The act of 89 | running the Program is not restricted, and the output from the Program 90 | is covered only if its contents constitute a work based on the 91 | Program (independent of having been made by running the Program). 92 | Whether that is true depends on what the Program does. 93 | 94 | 1. You may copy and distribute verbatim copies of the Program's 95 | source code as you receive it, in any medium, provided that you 96 | conspicuously and appropriately publish on each copy an appropriate 97 | copyright notice and disclaimer of warranty; keep intact all the 98 | notices that refer to this License and to the absence of any warranty; 99 | and give any other recipients of the Program a copy of this License 100 | along with the Program. 101 | 102 | You may charge a fee for the physical act of transferring a copy, and 103 | you may at your option offer warranty protection in exchange for a fee. 104 | 105 | 2. You may modify your copy or copies of the Program or any portion 106 | of it, thus forming a work based on the Program, and copy and 107 | distribute such modifications or work under the terms of Section 1 108 | above, provided that you also meet all of these conditions: 109 | 110 | a) You must cause the modified files to carry prominent notices 111 | stating that you changed the files and the date of any change. 112 | 113 | b) You must cause any work that you distribute or publish, that in 114 | whole or in part contains or is derived from the Program or any 115 | part thereof, to be licensed as a whole at no charge to all third 116 | parties under the terms of this License. 
117 | 118 | c) If the modified program normally reads commands interactively 119 | when run, you must cause it, when started running for such 120 | interactive use in the most ordinary way, to print or display an 121 | announcement including an appropriate copyright notice and a 122 | notice that there is no warranty (or else, saying that you provide 123 | a warranty) and that users may redistribute the program under 124 | these conditions, and telling the user how to view a copy of this 125 | License. (Exception: if the Program itself is interactive but 126 | does not normally print such an announcement, your work based on 127 | the Program is not required to print an announcement.) 128 | 129 | These requirements apply to the modified work as a whole. If 130 | identifiable sections of that work are not derived from the Program, 131 | and can be reasonably considered independent and separate works in 132 | themselves, then this License, and its terms, do not apply to those 133 | sections when you distribute them as separate works. But when you 134 | distribute the same sections as part of a whole which is a work based 135 | on the Program, the distribution of the whole must be on the terms of 136 | this License, whose permissions for other licensees extend to the 137 | entire whole, and thus to each and every part regardless of who wrote it. 138 | 139 | Thus, it is not the intent of this section to claim rights or contest 140 | your rights to work written entirely by you; rather, the intent is to 141 | exercise the right to control the distribution of derivative or 142 | collective works based on the Program. 143 | 144 | In addition, mere aggregation of another work not based on the Program 145 | with the Program (or with a work based on the Program) on a volume of 146 | a storage or distribution medium does not bring the other work under 147 | the scope of this License. 148 | 149 | 3. 
You may copy and distribute the Program (or a work based on it, 150 | under Section 2) in object code or executable form under the terms of 151 | Sections 1 and 2 above provided that you also do one of the following: 152 | 153 | a) Accompany it with the complete corresponding machine-readable 154 | source code, which must be distributed under the terms of Sections 155 | 1 and 2 above on a medium customarily used for software interchange; or, 156 | 157 | b) Accompany it with a written offer, valid for at least three 158 | years, to give any third party, for a charge no more than your 159 | cost of physically performing source distribution, a complete 160 | machine-readable copy of the corresponding source code, to be 161 | distributed under the terms of Sections 1 and 2 above on a medium 162 | customarily used for software interchange; or, 163 | 164 | c) Accompany it with the information you received as to the offer 165 | to distribute corresponding source code. (This alternative is 166 | allowed only for noncommercial distribution and only if you 167 | received the program in object code or executable form with such 168 | an offer, in accord with Subsection b above.) 169 | 170 | The source code for a work means the preferred form of the work for 171 | making modifications to it. For an executable work, complete source 172 | code means all the source code for all modules it contains, plus any 173 | associated interface definition files, plus the scripts used to 174 | control compilation and installation of the executable. However, as a 175 | special exception, the source code distributed need not include 176 | anything that is normally distributed (in either source or binary 177 | form) with the major components (compiler, kernel, and so on) of the 178 | operating system on which the executable runs, unless that component 179 | itself accompanies the executable. 
180 | 181 | If distribution of executable or object code is made by offering 182 | access to copy from a designated place, then offering equivalent 183 | access to copy the source code from the same place counts as 184 | distribution of the source code, even though third parties are not 185 | compelled to copy the source along with the object code. 186 | 187 | 4. You may not copy, modify, sublicense, or distribute the Program 188 | except as expressly provided under this License. Any attempt 189 | otherwise to copy, modify, sublicense or distribute the Program is 190 | void, and will automatically terminate your rights under this License. 191 | However, parties who have received copies, or rights, from you under 192 | this License will not have their licenses terminated so long as such 193 | parties remain in full compliance. 194 | 195 | 5. You are not required to accept this License, since you have not 196 | signed it. However, nothing else grants you permission to modify or 197 | distribute the Program or its derivative works. These actions are 198 | prohibited by law if you do not accept this License. Therefore, by 199 | modifying or distributing the Program (or any work based on the 200 | Program), you indicate your acceptance of this License to do so, and 201 | all its terms and conditions for copying, distributing or modifying 202 | the Program or works based on it. 203 | 204 | 6. Each time you redistribute the Program (or any work based on the 205 | Program), the recipient automatically receives a license from the 206 | original licensor to copy, distribute or modify the Program subject to 207 | these terms and conditions. You may not impose any further 208 | restrictions on the recipients' exercise of the rights granted herein. 209 | You are not responsible for enforcing compliance by third parties to 210 | this License. 211 | 212 | 7. 
If, as a consequence of a court judgment or allegation of patent 213 | infringement or for any other reason (not limited to patent issues), 214 | conditions are imposed on you (whether by court order, agreement or 215 | otherwise) that contradict the conditions of this License, they do not 216 | excuse you from the conditions of this License. If you cannot 217 | distribute so as to satisfy simultaneously your obligations under this 218 | License and any other pertinent obligations, then as a consequence you 219 | may not distribute the Program at all. For example, if a patent 220 | license would not permit royalty-free redistribution of the Program by 221 | all those who receive copies directly or indirectly through you, then 222 | the only way you could satisfy both it and this License would be to 223 | refrain entirely from distribution of the Program. 224 | 225 | If any portion of this section is held invalid or unenforceable under 226 | any particular circumstance, the balance of the section is intended to 227 | apply and the section as a whole is intended to apply in other 228 | circumstances. 229 | 230 | It is not the purpose of this section to induce you to infringe any 231 | patents or other property right claims or to contest validity of any 232 | such claims; this section has the sole purpose of protecting the 233 | integrity of the free software distribution system, which is 234 | implemented by public license practices. Many people have made 235 | generous contributions to the wide range of software distributed 236 | through that system in reliance on consistent application of that 237 | system; it is up to the author/donor to decide if he or she is willing 238 | to distribute software through any other system and a licensee cannot 239 | impose that choice. 240 | 241 | This section is intended to make thoroughly clear what is believed to 242 | be a consequence of the rest of this License. 243 | 244 | 8. 
If the distribution and/or use of the Program is restricted in 245 | certain countries either by patents or by copyrighted interfaces, the 246 | original copyright holder who places the Program under this License 247 | may add an explicit geographical distribution limitation excluding 248 | those countries, so that distribution is permitted only in or among 249 | countries not thus excluded. In such case, this License incorporates 250 | the limitation as if written in the body of this License. 251 | 252 | 9. The Free Software Foundation may publish revised and/or new versions 253 | of the General Public License from time to time. Such new versions will 254 | be similar in spirit to the present version, but may differ in detail to 255 | address new problems or concerns. 256 | 257 | Each version is given a distinguishing version number. If the Program 258 | specifies a version number of this License which applies to it and "any 259 | later version", you have the option of following the terms and conditions 260 | either of that version or of any later version published by the Free 261 | Software Foundation. If the Program does not specify a version number of 262 | this License, you may choose any version ever published by the Free Software 263 | Foundation. 264 | 265 | 10. If you wish to incorporate parts of the Program into other free 266 | programs whose distribution conditions are different, write to the author 267 | to ask for permission. For software which is copyrighted by the Free 268 | Software Foundation, write to the Free Software Foundation; we sometimes 269 | make exceptions for this. Our decision will be guided by the two goals 270 | of preserving the free status of all derivatives of our free software and 271 | of promoting the sharing and reuse of software generally. 272 | 273 | NO WARRANTY 274 | 275 | 11. BECAUSE THE PROGRAM IS LICENSED FREE OF CHARGE, THERE IS NO WARRANTY 276 | FOR THE PROGRAM, TO THE EXTENT PERMITTED BY APPLICABLE LAW. 
EXCEPT WHEN 277 | OTHERWISE STATED IN WRITING THE COPYRIGHT HOLDERS AND/OR OTHER PARTIES 278 | PROVIDE THE PROGRAM "AS IS" WITHOUT WARRANTY OF ANY KIND, EITHER EXPRESSED 279 | OR IMPLIED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF 280 | MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE. THE ENTIRE RISK AS 281 | TO THE QUALITY AND PERFORMANCE OF THE PROGRAM IS WITH YOU. SHOULD THE 282 | PROGRAM PROVE DEFECTIVE, YOU ASSUME THE COST OF ALL NECESSARY SERVICING, 283 | REPAIR OR CORRECTION. 284 | 285 | 12. IN NO EVENT UNLESS REQUIRED BY APPLICABLE LAW OR AGREED TO IN WRITING 286 | WILL ANY COPYRIGHT HOLDER, OR ANY OTHER PARTY WHO MAY MODIFY AND/OR 287 | REDISTRIBUTE THE PROGRAM AS PERMITTED ABOVE, BE LIABLE TO YOU FOR DAMAGES, 288 | INCLUDING ANY GENERAL, SPECIAL, INCIDENTAL OR CONSEQUENTIAL DAMAGES ARISING 289 | OUT OF THE USE OR INABILITY TO USE THE PROGRAM (INCLUDING BUT NOT LIMITED 290 | TO LOSS OF DATA OR DATA BEING RENDERED INACCURATE OR LOSSES SUSTAINED BY 291 | YOU OR THIRD PARTIES OR A FAILURE OF THE PROGRAM TO OPERATE WITH ANY OTHER 292 | PROGRAMS), EVEN IF SUCH HOLDER OR OTHER PARTY HAS BEEN ADVISED OF THE 293 | POSSIBILITY OF SUCH DAMAGES. 294 | 295 | END OF TERMS AND CONDITIONS 296 | 297 | How to Apply These Terms to Your New Programs 298 | 299 | If you develop a new program, and you want it to be of the greatest 300 | possible use to the public, the best way to achieve this is to make it 301 | free software which everyone can redistribute and change under these terms. 302 | 303 | To do so, attach the following notices to the program. It is safest 304 | to attach them to the start of each source file to most effectively 305 | convey the exclusion of warranty; and each file should have at least 306 | the "copyright" line and a pointer to where the full notice is found. 
307 | 308 | 309 | Copyright (C) 310 | 311 | This program is free software; you can redistribute it and/or modify 312 | it under the terms of the GNU General Public License as published by 313 | the Free Software Foundation; either version 2 of the License, or 314 | (at your option) any later version. 315 | 316 | This program is distributed in the hope that it will be useful, 317 | but WITHOUT ANY WARRANTY; without even the implied warranty of 318 | MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the 319 | GNU General Public License for more details. 320 | 321 | You should have received a copy of the GNU General Public License along 322 | with this program; if not, write to the Free Software Foundation, Inc., 323 | 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA. 324 | 325 | Also add information on how to contact you by electronic and paper mail. 326 | 327 | If the program is interactive, make it output a short notice like this 328 | when it starts in an interactive mode: 329 | 330 | Gnomovision version 69, Copyright (C) year name of author 331 | Gnomovision comes with ABSOLUTELY NO WARRANTY; for details type `show w'. 332 | This is free software, and you are welcome to redistribute it 333 | under certain conditions; type `show c' for details. 334 | 335 | The hypothetical commands `show w' and `show c' should show the appropriate 336 | parts of the General Public License. Of course, the commands you use may 337 | be called something other than `show w' and `show c'; they could even be 338 | mouse-clicks or menu items--whatever suits your program. 339 | 340 | You should also get your employer (if you work as a programmer) or your 341 | school, if any, to sign a "copyright disclaimer" for the program, if 342 | necessary. Here is a sample; alter the names: 343 | 344 | Yoyodyne, Inc., hereby disclaims all copyright interest in the program 345 | `Gnomovision' (which makes passes at compilers) written by James Hacker. 
346 | 347 | , 1 April 1989 348 | Ty Coon, President of Vice 349 | 350 | This General Public License does not permit incorporating your program into 351 | proprietary programs. If your program is a subroutine library, you may 352 | consider it more useful to permit linking proprietary applications with the 353 | library. If this is what you want to do, use the GNU Lesser General 354 | Public License instead of this License. 355 | -------------------------------------------------------------------------------- /INSTALL: -------------------------------------------------------------------------------- 1 | 2 | Simplified instructions: 3 | 4 | 1. Ensure you have the following software packages installed: 5 | 6 | fuse-devel 7 | libcurl-devel 8 | libexpat-devel 9 | libopenssl-devel 10 | libzstd-devel 11 | nbdkit-devel 12 | pkg-config 13 | zlib-devel 14 | 15 | 2. ./configure && make && sudo make install 16 | 17 | Please see 18 | 19 | https://github.com/archiecobbs/s3backer/wiki/Build-and-Install 20 | 21 | for more build and install information. 22 | -------------------------------------------------------------------------------- /Makefile.am: -------------------------------------------------------------------------------- 1 | 2 | # 3 | # s3backer - FUSE-based single file backing store via Amazon S3 4 | # 5 | # Copyright 2008-2023 Archie L. Cobbs 6 | # 7 | # This program is free software; you can redistribute it and/or 8 | # modify it under the terms of the GNU General Public License 9 | # as published by the Free Software Foundation; either version 2 10 | # of the License, or (at your option) any later version. 11 | # 12 | # This program is distributed in the hope that it will be useful, 13 | # but WITHOUT ANY WARRANTY; without even the implied warranty of 14 | # MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the 15 | # GNU General Public License for more details. 
16 | # 17 | # You should have received a copy of the GNU General Public License 18 | # along with this program; if not, write to the Free Software 19 | # Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 20 | # 02110-1301, USA. 21 | # 22 | # In addition, as a special exception, the copyright holders give 23 | # permission to link the code of portions of this program with the 24 | # OpenSSL library under certain conditions as described in each 25 | # individual source file, and distribute linked combinations including 26 | # the two. 27 | # 28 | # You must obey the GNU General Public License in all respects for all 29 | # of the code used other than OpenSSL. If you modify file(s) with this 30 | # exception, you may extend this exception to your version of the 31 | # file(s), but you are not obligated to do so. If you do not wish to do 32 | # so, delete this exception statement from your version. If you delete 33 | # this exception statement from all source files in the program, then 34 | # also delete it here. 
35 | 36 | # Setup build for executables: s3backer and tester 37 | bin_PROGRAMS= s3backer 38 | 39 | noinst_PROGRAMS= tester 40 | 41 | noinst_HEADERS= s3backer.h \ 42 | block_cache.h \ 43 | block_part.h \ 44 | dcache.h \ 45 | ec_protect.h \ 46 | zero_cache.h \ 47 | erase.h \ 48 | fuse_ops.h \ 49 | hash.h \ 50 | nbdkit.h \ 51 | util.h \ 52 | compress.h \ 53 | http_io.h \ 54 | reset.h \ 55 | test_io.h \ 56 | s3b_config.h 57 | 58 | # Setup build for NBD plugin shared library (if NBDKit is available) 59 | if NBDKIT_FOUND 60 | 61 | nbdplugindir= $(NBDKIT_PLUGINDIR) 62 | 63 | nbdplugin_LTLIBRARIES= nbdkit-s3backer-plugin.la 64 | 65 | nbdkit_s3backer_plugin_la_SOURCES = \ 66 | nbdkit.c \ 67 | block_cache.c \ 68 | block_part.c \ 69 | dcache.c \ 70 | ec_protect.c \ 71 | zero_cache.c \ 72 | erase.c \ 73 | fuse_ops.c \ 74 | hash.c \ 75 | util.c \ 76 | compress.c \ 77 | http_io.c \ 78 | reset.c \ 79 | s3b_config.c \ 80 | test_io.c \ 81 | sslcompat.c \ 82 | gitrev.c 83 | 84 | nbdkit_s3backer_plugin_la_LDFLAGS = \ 85 | -module -avoid-version -shared 86 | 87 | endif 88 | 89 | # See https://www.gnu.org/software/automake/manual/html_node/Objects-created-both-with-libtool-and-without.html 90 | s3backer_CFLAGS= $(AM_CFLAGS) 91 | tester_CFLAGS= $(AM_CFLAGS) 92 | 93 | # libtool random 94 | ACLOCAL_AMFLAGS= -I m4 95 | 96 | man_MANS= s3backer.1 97 | 98 | docdir= $(datadir)/doc/packages/$(PACKAGE) 99 | 100 | doc_DATA= CHANGES COPYING README INSTALL TODO 101 | 102 | EXTRA_DIST= CHANGES s3backer.1 103 | 104 | s3backer_SOURCES= main.c \ 105 | block_cache.c \ 106 | block_part.c \ 107 | dcache.c \ 108 | ec_protect.c \ 109 | zero_cache.c \ 110 | erase.c \ 111 | fuse_ops.c \ 112 | hash.c \ 113 | util.c \ 114 | compress.c \ 115 | http_io.c \ 116 | reset.c \ 117 | s3b_config.c \ 118 | test_io.c \ 119 | sslcompat.c \ 120 | gitrev.c 121 | 122 | tester_SOURCES= tester.c \ 123 | block_cache.c \ 124 | block_part.c \ 125 | dcache.c \ 126 | ec_protect.c \ 127 | zero_cache.c \ 128 | erase.c \ 129 | hash.c \ 
130 | util.c \ 131 | compress.c \ 132 | http_io.c \ 133 | reset.c \ 134 | s3b_config.c \ 135 | test_io.c \ 136 | sslcompat.c \ 137 | gitrev.c 138 | 139 | AM_CFLAGS= $(FUSE_CFLAGS) $(NBDKIT_CFLAGS) 140 | 141 | gitrev.c: 142 | printf 'const char *const s3backer_version = "%s";\n' "`git describe`" > gitrev.c 143 | -------------------------------------------------------------------------------- /README: -------------------------------------------------------------------------------- 1 | s3backer - FUSE-based single file backing store via Amazon S3 2 | 3 | Overview 4 | s3backer is a filesystem that contains a single file backed by the Amazon 5 | Simple Storage Service (Amazon S3). As a filesystem, it is very simple: 6 | it provides a single normal file having a fixed size. Underneath, the 7 | file is divided up into blocks, and the content of each block is stored 8 | in a unique Amazon S3 object. In other words, what s3backer provides is 9 | really more like an S3-backed virtual hard disk device, rather than a 10 | filesystem. 11 | 12 | In typical usage, a `normal' filesystem is mounted on top of the file 13 | exported by the s3backer filesystem using a loopback mount (or disk image 14 | mount on Mac OS X). 15 | 16 | This arrangement has several benefits compared to more complete S3 17 | filesystem implementations: 18 | 19 | o By not attempting to implement a complete filesystem, which is a com- 20 | plex undertaking and difficult to get right, s3backer can stay very 21 | lightweight and simple. Only three HTTP operations are used: GET, 22 | PUT, and DELETE. All of the experience and knowledge about how to 23 | properly implement filesystems that already exists can be reused. 24 | 25 | o By utilizing existing filesystems, you get full UNIX filesystem 26 | semantics. Subtle bugs or missing functionality relating to hard 27 | links, extended attributes, POSIX locking, etc. are avoided. 
28 | 29 | o The gap between normal filesystem semantics and Amazon S3 ``eventual 30 | consistency'' is more easily and simply solved when one can interpret 31 | S3 objects as simple device blocks rather than filesystem objects 32 | (see below). 33 | 34 | o When storing your data on Amazon S3 servers, which are not under your 35 | control, the ability to encrypt and authenticate data becomes a crit- 36 | ical issue. s3backer supports secure encryption and authentication. 37 | Alternately, the encryption capability built into the Linux loopback 38 | device can be used. 39 | 40 | o Since S3 data is accessed over the network, local caching is also 41 | very important for performance reasons. Since s3backer presents the 42 | equivalent of a virtual hard disk to the kernel, most of the filesys- 43 | tem caching can be done where it should be: in the kernel, via the 44 | kernel's page cache. However s3backer also includes its own internal 45 | block cache for increased performance, using asynchronous worker 46 | threads to take advantage of the parallelism inherent in the network. 47 | 48 | Consistency Guarantees 49 | Amazon S3 makes relatively weak guarantees relating to the timing and 50 | consistency of reads vs. writes (collectively known as ``eventual consis- 51 | tency''). s3backer includes logic and configuration parameters to work 52 | around these limitations, allowing the user to guarantee consistency to 53 | whatever level desired, up to and including 100% detection and avoidance 54 | of incorrect data. These are: 55 | 56 | 1. s3backer enforces a minimum delay between consecutive PUT or DELETE 57 | operations on the same block. This ensures that Amazon S3 doesn't 58 | receive these operations out of order. 59 | 60 | 2. s3backer maintains an internal block MD5 checksum cache, which 61 | enables automatic detection and rejection of `stale' blocks returned 62 | by GET operations. 
63 | 64 | This logic is configured by the following command line options: 65 | --md5CacheSize, --md5CacheTime, and --minWriteDelay. 66 | 67 | Zeroed Block Optimization 68 | As a simple optimization, s3backer does not store blocks containing all 69 | zeroes; instead, they are simply deleted. Conversely, reads of non-exis- 70 | tent blocks will contain all zeroes. In other words, the backed file is 71 | always maximally sparse. 72 | 73 | As a result, blocks do not need to be created before being used and no 74 | special initialization is necessary when creating a new filesystem. 75 | 76 | When the --listBlocks flag is given, s3backer will list all existing 77 | blocks at startup so it knows ahead of time exactly which blocks are 78 | empty. 79 | 80 | File and Block Size Auto-Detection 81 | As a convenience, whenever the first block of the backed file is written, 82 | s3backer includes as meta-data (in the ``x-amz-meta-s3backer-filesize'' 83 | header) the total size of the file. Along with the size of the block 84 | itself, this value can be checked and/or auto-detected later when the 85 | filesystem is remounted, eliminating the need for the --blockSize or 86 | --size flags to be explicitly provided and avoiding accidental mis-inter- 87 | pretation of an existing filesystem. 88 | 89 | Block Cache 90 | s3backer includes support for an internal block cache to increase perfor- 91 | mance. The block cache is completely separate from the MD5 cache 92 | which only stores MD5 checksums transiently and whose sole purpose is to 93 | mitigate ``eventual consistency''. The block cache is a traditional 94 | cache containing cached data blocks. When full, clean blocks are evicted 95 | as necessary in LRU order. 96 | 97 | Reads of cached blocks will return immediately with no network traffic. 98 | Writes to the cache also return immediately and trigger an asynchronous 99 | write operation to the network via a separate worker thread.
Because the 100 | kernel typically writes blocks through FUSE filesystems one at a time, 101 | performing writes asynchronously allows s3backer to take advantage of the 102 | parallelism inherent in the network, vastly improving write performance. 103 | 104 | The block cache can be configured to store the cached data in a local 105 | file instead of in memory. This permits larger cache sizes and allows 106 | s3backer to reload cached data after a restart. Reloaded data is veri- 107 | fied before reuse. 108 | 109 | The block cache is configured by the following command line options: 110 | --blockCacheFile, --blockCacheNoVerify, --blockCacheSize, 111 | --blockCacheSync, --blockCacheThreads, --blockCacheTimeout, and 112 | --blockCacheWriteDelay. 113 | 114 | Read Ahead 115 | s3backer implements a simple read-ahead algorithm in the block cache. 116 | When a configurable number of blocks are read in order, block cache 117 | worker threads are awoken to begin reading subsequent blocks into the 118 | block cache. Read ahead continues as long as the kernel continues read- 119 | ing blocks sequentially. The kernel typically requests blocks one at a 120 | time, so having multiple worker threads already reading the next few 121 | blocks improves read performance by taking advantage of the parallelism 122 | inherent in the network. 123 | 124 | Note that the kernel implements a read ahead algorithm as well; its 125 | behavior should be taken into consideration. By default, s3backer passes 126 | the -o max_readahead=0 option to FUSE. 127 | 128 | Read ahead is configured by the --readAhead and --readAheadTrigger com- 129 | mand line options. 130 | 131 | Encryption and Authentication 132 | s3backer supports encryption via the --encrypt, --password, and 133 | --passwordFile flags. When encryption is enabled, SHA1 HMAC authentica- 134 | tion is also automatically enabled, and s3backer rejects any blocks that 135 | are not properly encrypted and signed. 
136 | 137 | Encrypting at the s3backer layer is preferable to encrypting at an upper 138 | layer (e.g., at the loopback device layer), because if the data s3backer 139 | sees is already encrypted it can't optimize away zeroed blocks or do 140 | meaningful compression. 141 | 142 | Read-Only Access 143 | An Amazon S3 account is not required in order to use s3backer. The 144 | filesystem must already exist and have S3 objects with ACL's configured 145 | for public read access (see --accessType below); users should perform the 146 | loopback mount with the read-only flag (see mount(8)) and provide the 147 | --readOnly flag to s3backer. This mode of operation facilitates the cre- 148 | ation of public, read-only filesystems. 149 | 150 | Simultaneous Mounts 151 | Although it functions over the network, the s3backer filesystem is not a 152 | distributed filesystem and does not support simultaneous read/write 153 | mounts. (This is not something you would normally do with a hard-disk 154 | partition either.) s3backer does not detect this situation; it is up to 155 | the user to ensure that it doesn't happen. 156 | 157 | Statistics File 158 | s3backer populates the filesystem with a human-readable statistics file. 159 | See --statsFilename below. 160 | 161 | Logging 162 | In normal operation s3backer will log via syslog(3). When run with the 163 | -d or -f flags, s3backer will log to standard error. 164 | 165 | ------------------------------------------------------------------------ 166 | 167 | Home page: https://github.com/archiecobbs/s3backer 168 | 169 | ------------------------------------------------------------------------ 170 | 171 | See INSTALL for installation instructions. After installing, see the s3backer(1) 172 | man page for how to run it. 173 | 174 | See COPYING for license. 175 | 176 | See CHANGES for change history. 177 | 178 | Enjoy!
179 | -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 | **s3backer** is a filesystem that contains a single file backed by the [Amazon Simple Storage Service](http://aws.amazon.com/s3) (Amazon S3). As a filesystem, it is very simple: it provides a single normal file having a fixed size. Underneath, the file is divided up into blocks, and the content of each block is stored in a unique Amazon S3 object. In other words, what **s3backer** provides is really more like an S3-backed virtual hard disk device, rather than a filesystem. 2 | 3 | In typical usage, a normal filesystem is mounted on top of the file exported by the **s3backer** filesystem using a loopback mount (or disk image mount on Mac OS X). 4 | 5 | This arrangement has several benefits compared to more complete S3 filesystem implementations: 6 | 7 | * By not attempting to implement a complete filesystem, which is a complex undertaking and difficult to get right, **s3backer** can stay very lightweight and simple. Only three HTTP operations are used: GET, PUT, and DELETE. All of the experience and knowledge about how to properly implement filesystems that already exists can be reused. 8 | 9 | * By utilizing existing filesystems, you get full UNIX filesystem semantics. Subtle bugs or missing functionality relating to hard links, extended attributes, POSIX locking, etc. are avoided. 10 | 11 | * The gap between normal filesystem semantics and Amazon S3 ``eventual consistency'' is more easily and simply solved when one can interpret S3 objects as simple device blocks rather than filesystem objects (see below). 12 | 13 | * When storing your data on Amazon S3 servers, which are not under your control, the ability to encrypt data becomes a critical issue. **s3backer** supports secure encryption and authentication. 
Alternately, the encryption capability built into the Linux loopback device can be used. 14 | 15 | * Since S3 data is accessed over the network, local caching is also very important for performance reasons. Since **s3backer** presents the equivalent of a virtual hard disk to the kernel, most of the filesystem caching can be done where it should be: in the kernel, via the kernel's page cache. However **s3backer** also includes its own internal block cache for increased performance, using asynchronous worker threads to take advantage of the parallelism inherent in the network. 16 | 17 | ### Consistency Guarantees 18 | Amazon S3 makes relatively weak guarantees relating to the timing and consistency of reads vs. writes (collectively known as "eventual consistency"). **s3backer** includes logic and configuration parameters to work around these limitations, allowing the user to guarantee consistency to whatever level desired, up to and including 100% detection and avoidance of incorrect data. These are: 19 | 20 | 1. **s3backer** enforces a minimum delay between consecutive PUT or DELETE operations on the same block. This ensures that Amazon S3 doesn't receive these operations out of order. 21 | 1. **s3backer** maintains an internal block MD5 checksum cache, which enables automatic detection and rejection of `stale' blocks returned by GET operations. 22 | 23 | This logic is configured by the following command line options: `--md5CacheSize`, `--md5CacheTime`, and `--minWriteDelay`. 24 | 25 | ### Zeroed Block Optimization 26 | As a simple optimization, **s3backer** does not store blocks containing all zeroes; instead, they are simply deleted. Conversely, reads of non-existent blocks will contain all zeroes. In other words, the backed file is always maximally sparse. 27 | 28 | As a result, blocks do not need to be created before being used and no special initialization is necessary when creating a new filesystem. 
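The zeroed block rule described in this section can be illustrated with a minimal C sketch. This is not code from s3backer: the names `block_is_all_zeroes`, `choose_write_action`, `BLOCK_PUT` and `BLOCK_DELETE` are hypothetical and exist only to show the decision, assuming a block is simply a byte buffer of known length.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical outcome of a block write: store the object or remove it. */
enum block_action { BLOCK_PUT, BLOCK_DELETE };

/*
 * Return true if every byte in the block is zero.
 * Illustrative helper only; the name is not taken from s3backer.
 */
static bool
block_is_all_zeroes(const void *data, size_t len)
{
    const unsigned char *p = data;
    size_t i;

    for (i = 0; i < len; i++) {
        if (p[i] != 0)
            return false;
    }
    return true;
}

/*
 * Decide what a write should do: an all-zero block is never stored, so the
 * corresponding object is deleted; any other block is uploaded.
 */
static enum block_action
choose_write_action(const void *data, size_t len)
{
    return block_is_all_zeroes(data, len) ? BLOCK_DELETE : BLOCK_PUT;
}
```

Under this rule a read of a block with no stored object returns a zero-filled buffer, which is why a freshly created file needs no initialization.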
29 | 30 | When the `--listBlocks` flag is given, s3backer will list all existing blocks at startup so it knows ahead of time exactly which blocks are empty. 31 | 32 | ### File and Block Size Auto-Detection 33 | As a convenience, whenever the first block of the backed file is written, **s3backer** includes as meta-data (in the `x-amz-meta-s3backer-filesize` header) the total size of the file. Along with the size of the block itself, this value can be checked and/or auto-detected later when the filesystem is remounted, eliminating the need for the `--blockSize` or `--size` flags to be explicitly provided and avoiding accidental mis-interpretation of an existing filesystem. 34 | 35 | ### Block Cache 36 | **s3backer** includes support for an internal block cache to increase performance. The block cache is completely separate from the MD5 cache which only stores MD5 checksums transiently and whose sole purpose is to mitigate ``eventual consistency''. The block cache is a traditional cache containing cached data blocks. When full, clean blocks are evicted as necessary in LRU order. 37 | 38 | Reads of cached blocks will return immediately with no network traffic. Writes to the cache also return immediately and trigger an asynchronous write operation to the network via a separate worker thread. Because the kernel typically writes blocks through FUSE filesystems one at a time, performing writes asynchronously allows **s3backer** to take advantage of the parallelism inherent in the network, vastly improving write performance. 39 | 40 | The block cache can be configured to store the cached data in a local file instead of in memory. This permits larger cache sizes and allows **s3backer** to reload cached data after a restart. Reloaded data is verified via MD5 checksum with Amazon S3 before reuse.
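The eviction rule stated in this section (when the cache is full, clean blocks are evicted in LRU order) can be sketched as follows. This is only an illustration under simplifying assumptions, not the data structure used in `block_cache.c`: the `struct cache_entry` layout, the `last_used` counter, and the function `find_lru_clean` are all hypothetical.

```c
#include <stddef.h>

/* Hypothetical cache slot; field names are illustrative only. */
struct cache_entry {
    unsigned int block_num;     /* block stored in this slot */
    int in_use;                 /* nonzero if the slot holds a block */
    int dirty;                  /* nonzero while the block still awaits writing */
    unsigned long last_used;    /* logical timestamp of the most recent access */
};

/*
 * Return the index of the least recently used clean entry, or -1 if every
 * occupied entry is dirty. Dirty entries are skipped because their data has
 * not yet been written to the network and must not be discarded.
 */
static int
find_lru_clean(const struct cache_entry *entries, size_t count)
{
    int victim = -1;
    size_t i;

    for (i = 0; i < count; i++) {
        if (!entries[i].in_use || entries[i].dirty)
            continue;
        if (victim == -1 || entries[i].last_used < entries[victim].last_used)
            victim = (int)i;
    }
    return victim;
}
```

A linear scan keeps the sketch short; a real cache would more likely keep clean entries on a list ordered by access time so that the victim can be found without scanning.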
41 | 42 | The block cache is configured by the following command line options: `--blockCacheFile`, `--blockCacheNoVerify`, `--blockCacheSize`, `--blockCacheThreads` and `--blockCacheWriteDelay`. 43 | 44 | ### Read Ahead 45 | **s3backer** implements a simple read-ahead algorithm in the block cache. When a configurable number of blocks are read in order, block cache worker threads are awoken to begin reading subsequent blocks into the block cache. Read ahead continues as long as the kernel continues reading blocks sequentially. The kernel typically requests blocks one at a time, so having multiple worker threads already reading the next few blocks improves read performance by taking advantage of the parallelism inherent in the network. 46 | 47 | Note that the kernel implements a read ahead algorithm as well; its behavior should be taken into consideration. By default, **s3backer** passes the `-o max_readahead=0` option to FUSE. 48 | 49 | Read ahead is configured by the `--readAhead` and `--readAheadTrigger` command line options. 50 | 51 | ### Encryption and Authentication 52 | **s3backer** supports encryption via the `--encrypt`, `--password`, and `--passwordFile` flags. When encryption is enabled, SHA1 HMAC authentication is also automatically enabled, and s3backer rejects any blocks that are not properly encrypted and signed. 53 | 54 | Encrypting at the s3backer layer is preferable to encrypting at an upper layer (e.g., at the loopback device layer), because if the data s3backer sees is already encrypted it can't optimize away zeroed blocks or do meaningful compression. 55 | 56 | ### Compression 57 | **s3backer** supports block-level compression, which minimizes transfer time and storage costs. 58 | 59 | Compression is configured via the `--compress` flag. Compression is automatically enabled when encryption is enabled. 60 | 61 | ### Read-Only Access 62 | An Amazon S3 account is not required in order to use **s3backer**.
Of course a filesystem must already exist and have S3 objects with ACL's configured for public read access (see `--accessType` below); users should perform the loopback mount with the read-only flag (see mount(8)) and provide the `--readOnly` flag to **s3backer**. This mode of operation facilitates the creation of public, read-only filesystems. 63 | 64 | ### Simultaneous Mounts 65 | Although it functions over the network, the **s3backer** filesystem is not a distributed filesystem and does not support simultaneous read/write mounts. (This is not something you would normally do with a hard-disk partition either.) **s3backer** does not detect this situation; it is up to the user to ensure that it doesn't happen. 66 | 67 | ### Statistics File 68 | **s3backer** populates the filesystem with a human-readable statistics file. See `--statsFilename` below. 69 | 70 | ### Logging 71 | In normal operation **s3backer** will log via `syslog(3)`. When run with the `-d` or `-f` flags, **s3backer** will log to standard error. 72 | 73 | ### OK, Where to Next? 74 | 75 | **[Try it out!](https://github.com/archiecobbs/s3backer/wiki/Running-the-Demo)** No Amazon S3 account is required. 76 | 77 | See the [ManPage](https://github.com/archiecobbs/s3backer/wiki/ManPage) for further documentation and the [CHANGES](https://github.com/archiecobbs/s3backer/blob/master/CHANGES) file for release notes. 78 | 79 | Join the [s3backer-devel](http://groups.google.com/group/s3backer-devel) group to participate in discussion and development of **s3backer**. 80 | -------------------------------------------------------------------------------- /README_nbd.md: -------------------------------------------------------------------------------- 1 | Using the NBD plugin 2 | -------------------- 3 | 4 | Instead of using s3backer to provide a FUSE file system with a single file that is then loop-mounted to provide a block device, it can also act as a Network Block Device (NBD) server.
In this case, the kernel will directly provide a `/dev/nbdX` block device that is backed by s3backer. NBD is supported on Linux, FreeBSD, and other systems. 5 | 6 | Architecturally, using a network block device makes more sense than using a FUSE file system since it is both simpler and better matches the intended use of either feature. In theory, NBD-mode should use less memory and give higher throughput and lower latency because: 7 | 8 | - The kernel no longer serializes write and read requests but issues them concurrently. 9 | - Read and write request size can exceed 128 kB 10 | - The system can still be reliably hibernated (a running FUSE daemon may prevent this) 11 | - Requests pass through the VFS only once, not twice 12 | - Data is present in the page cache only once, not twice 13 | 14 | However, this mode of s3backer operation is still experimental. It is possible that in practice, performance is actually inferior due to implementation details in any of s3backer, nbdkit, FUSE, or NBD. Please report any improvements, degradations, or bugs that you observe. 15 | 16 | To use NBD-mode, make sure you have [nbdkit](https://github.com/libguestfs/nbdkit) installed, then build and install s3backer normally. You can then run s3backer in NBD mode using the `--nbd` flag. In this mode, specify an NBD device such as `/dev/nbd0` instead of a mount point. Then `/dev/nbd0` can be used with regular filesystem commands (`mkfs` et al). You must run s3backer as root when using the `--nbd` flag. 17 | 18 | To disconnect the block device, use: 19 | 20 | ``` 21 | $ nbd-client -d /dev/nbd0 22 | ``` 23 | 24 | The `nbdkit` server (and s3backer plugin) will still be running; to have it disconnect automatically when the client disconnects, add `--ndb-flags --filter=exitlast` to the `s3backer` command line. 25 | 26 | You can also invoke the installed `s3backer` NBD plugin directly using `nbdkit(1)`. See the `s3backer(1)` and `nbdkit(1)` man pages for details.
27 | 28 | Performance Tuning 29 | ------------------ 30 | 31 | - Set `/sys/block/nbdX/queue/max_sectors_kb` to the block size configured in s3backer (not smaller to avoid needless reading/writing of partial blocks, and not larger to maximize concurrency) 32 | 33 | - Experiment with different numbers of threads (`--threads` option to nbdkit). Especially for small block sizes and slow connections, the default is probably not optimal. 34 | 35 | - When using journaling filesystems (like ext4), disable journaling to prevent the same 36 | data being sent over the network twice. 37 | 38 | - When using ZFS, assemble the zpool from two NBDs (reading two distinct S3 buckets): (1) a *special* vdev with a small s3backer block size, and (2) a regular (*disk*) vdev with a larger s3backer block size. Set the *special_small_blocks* property of your ZFS datasets to *recordsize* - 1. The small block size should be well below *recordsize* (but no smaller than 2^*ashift*), and the large block size some multiple of *recordsize*. 39 | 40 | - When using ZFS, disable synchronous requests (through the zfs `sync=disabled` property) or configure a separate *log* vdev for the zpool that is backed by local storage. In either case, this avoids data from synchronous requests being sent over the network twice. 41 | 42 | - If you aren't using ZFS, try to avoid synchronous writes by other means. If this can't 43 | be done at the filesystem level, you can wrap userspace applications with 44 | [eatmydata](https://www.flamingspork.com/projects/libeatmydata/). If neither of these is 45 | possible (and only then), you can reduce the impact by enabling s3backer's on-disk cache 46 | (but see below). 47 | 48 | - Avoid using the s3backer block cache (both in-memory and on-disk variants) unless you 49 | absolutely require it for cross-reboot cache persistence or to reduce the performance 50 | penalty of synchronous requests.
For any other purpose, you should be able to get better 51 | results by adjusting the maximum size of NBD requests, the s3backer block size, the file 52 | system block size, and the page cache parameters (`/proc/sys/vm/*`). 53 | -------------------------------------------------------------------------------- /TODO: -------------------------------------------------------------------------------- 1 | TODO 2 | 3 | - support alternate backends, generalize `--test' to `--backend=localfs', etc. 4 | - Add "extents" support to NBD plugin 5 | 6 | -------------------------------------------------------------------------------- /block_cache.h: -------------------------------------------------------------------------------- 1 | 2 | /* 3 | * s3backer - FUSE-based single file backing store via Amazon S3 4 | * 5 | * Copyright 2008-2023 Archie L. Cobbs 6 | * 7 | * This program is free software; you can redistribute it and/or 8 | * modify it under the terms of the GNU General Public License 9 | * as published by the Free Software Foundation; either version 2 10 | * of the License, or (at your option) any later version. 11 | * 12 | * This program is distributed in the hope that it will be useful, 13 | * but WITHOUT ANY WARRANTY; without even the implied warranty of 14 | * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the 15 | * GNU General Public License for more details. 16 | * 17 | * You should have received a copy of the GNU General Public License 18 | * along with this program; if not, write to the Free Software 19 | * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 20 | * 02110-1301, USA. 21 | * 22 | * In addition, as a special exception, the copyright holders give 23 | * permission to link the code of portions of this program with the 24 | * OpenSSL library under certain conditions as described in each 25 | * individual source file, and distribute linked combinations including 26 | * the two. 
27 | * 28 | * You must obey the GNU General Public License in all respects for all 29 | * of the code used other than OpenSSL. If you modify file(s) with this 30 | * exception, you may extend this exception to your version of the 31 | * file(s), but you are not obligated to do so. If you do not wish to do 32 | * so, delete this exception statement from your version. If you delete 33 | * this exception statement from all source files in the program, then 34 | * also delete it here. 35 | */ 36 | 37 | // Configuration info structure for block_cache 38 | struct block_cache_conf { 39 | u_int block_size; 40 | u_int cache_size; 41 | u_int write_delay; 42 | u_int max_dirty; 43 | u_int synchronous; 44 | u_int timeout; 45 | u_int num_threads; 46 | u_int read_ahead; 47 | u_int read_ahead_trigger; 48 | u_int no_verify; 49 | u_int fadvise; 50 | u_int recover_dirty_blocks; 51 | u_int perform_flush; 52 | u_int num_protected; 53 | const char *cache_file; 54 | log_func_t *log; 55 | }; 56 | 57 | // Statistics structure for block_cache 58 | struct block_cache_stats { 59 | u_int initial_size; 60 | u_int current_size; 61 | double dirty_ratio; 62 | u_int read_hits; 63 | u_int read_misses; 64 | u_int write_hits; 65 | u_int write_misses; 66 | u_int verified; 67 | u_int mismatch; 68 | u_int out_of_memory_errors; 69 | }; 70 | 71 | // block_cache.c 72 | extern struct s3backer_store *block_cache_create(struct block_cache_conf *config, struct s3backer_store *inner); 73 | extern void block_cache_get_stats(struct s3backer_store *s3b, struct block_cache_stats *stats); 74 | extern void block_cache_clear_stats(struct s3backer_store *s3b); 75 | 76 | -------------------------------------------------------------------------------- /block_part.c: -------------------------------------------------------------------------------- 1 | 2 | /* 3 | * s3backer - FUSE-based single file backing store via Amazon S3 4 | * 5 | * Copyright 2008-2023 Archie L. 
Cobbs 6 | * 7 | * This program is free software; you can redistribute it and/or 8 | * modify it under the terms of the GNU General Public License 9 | * as published by the Free Software Foundation; either version 2 10 | * of the License, or (at your option) any later version. 11 | * 12 | * This program is distributed in the hope that it will be useful, 13 | * but WITHOUT ANY WARRANTY; without even the implied warranty of 14 | * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the 15 | * GNU General Public License for more details. 16 | * 17 | * You should have received a copy of the GNU General Public License 18 | * along with this program; if not, write to the Free Software 19 | * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 20 | * 02110-1301, USA. 21 | * 22 | * In addition, as a special exception, the copyright holders give 23 | * permission to link the code of portions of this program with the 24 | * OpenSSL library under certain conditions as described in each 25 | * individual source file, and distribute linked combinations including 26 | * the two. 27 | * 28 | * You must obey the GNU General Public License in all respects for all 29 | * of the code used other than OpenSSL. If you modify file(s) with this 30 | * exception, you may extend this exception to your version of the 31 | * file(s), but you are not obligated to do so. If you do not wish to do 32 | * so, delete this exception statement from your version. If you delete 33 | * this exception statement from all source files in the program, then 34 | * also delete it here. 
35 | */ 36 | 37 | #include "s3backer.h" 38 | #include "block_part.h" 39 | #include "util.h" 40 | 41 | // Block read/write lock states: 0x00-0xfe: there are this many readers; 0xff: there is one writer 42 | #define BLOCK_IDLE ((u_int8_t)0x00) 43 | #define BLOCK_WRITING ((u_int8_t)0xff) 44 | #define BLOCK_READERS_MAX ((u_int8_t)0xfe) // inclusive upper bound 45 | 46 | // Internal state 47 | struct block_part { 48 | u_int block_size; 49 | s3b_block_t num_blocks; 50 | pthread_mutex_t mutex; 51 | pthread_cond_t wakeup; 52 | u_int8_t *block_states; // read/write locks for each block 53 | }; 54 | 55 | struct block_part * 56 | block_part_create(u_int block_size, s3b_block_t num_blocks) 57 | { 58 | struct block_part *priv; 59 | int r; 60 | 61 | if ((priv = malloc(sizeof(*priv))) == NULL) 62 | return NULL; 63 | memset(priv, 0, sizeof(*priv)); 64 | priv->block_size = block_size; 65 | priv->num_blocks = num_blocks; 66 | if ((priv->block_states = calloc(num_blocks, sizeof(*priv->block_states))) == NULL) { 67 | r = errno; 68 | goto fail1; 69 | } 70 | if ((r = pthread_mutex_init(&priv->mutex, NULL)) != 0) 71 | goto fail2; 72 | if ((r = pthread_cond_init(&priv->wakeup, NULL)) != 0) 73 | goto fail3; 74 | 75 | // Done 76 | return priv; 77 | 78 | // Fail 79 | fail3: 80 | pthread_mutex_destroy(&priv->mutex); 81 | fail2: 82 | free(priv->block_states); 83 | fail1: 84 | free(priv); 85 | errno = r; 86 | return NULL; 87 | } 88 | 89 | void 90 | block_part_destroy(struct block_part **block_partp) 91 | { 92 | struct block_part *const priv = *block_partp; 93 | 94 | if (priv == NULL) 95 | return; 96 | *block_partp = NULL; 97 | pthread_cond_destroy(&priv->wakeup); 98 | pthread_mutex_destroy(&priv->mutex); 99 | free(priv->block_states); 100 | free(priv); 101 | } 102 | 103 | /* 104 | * Read a partial block by reading the whole block, then copying the part we want. 105 | * 106 | * For any given block, we can do this simultaneously with other readers, but not while there are any writers. 
107 | */ 108 | int 109 | block_part_read_block_part(struct s3backer_store *s3b, struct block_part *const priv, const struct boundary_edge *const edge) 110 | { 111 | u_int8_t block_state; 112 | u_char *buf; 113 | int r; 114 | 115 | // Sanity check 116 | assert(edge->offset <= priv->block_size); 117 | assert(edge->length > 0); 118 | assert(edge->length < priv->block_size); 119 | assert(edge->offset + edge->length <= priv->block_size); 120 | 121 | // Does the next layer down support partial block reads natively? 122 | if (s3b->read_block_part != NULL) 123 | return (*s3b->read_block_part)(s3b, edge->block, edge->offset, edge->length, edge->data); 124 | 125 | // Allocate buffer 126 | if ((buf = malloc(priv->block_size)) == NULL) 127 | return errno; 128 | 129 | // Increment readers count 130 | pthread_mutex_lock(&priv->mutex); 131 | while (1) { 132 | switch ((block_state = priv->block_states[(size_t)edge->block])) { 133 | case BLOCK_WRITING: 134 | case BLOCK_READERS_MAX: 135 | pthread_cond_wait(&priv->wakeup, &priv->mutex); 136 | continue; 137 | default: 138 | break; 139 | } 140 | priv->block_states[(size_t)edge->block] = (u_int8_t)(block_state + 1); // increment #readers 141 | break; 142 | } 143 | pthread_mutex_unlock(&priv->mutex); 144 | 145 | // Read entire block 146 | if ((r = (*s3b->read_block)(s3b, edge->block, buf, NULL, NULL, 0)) != 0) 147 | goto done; 148 | 149 | // Copy out desired fragment 150 | memcpy(edge->data, buf + edge->offset, edge->length); 151 | 152 | done: 153 | // Decrement readers count 154 | pthread_mutex_lock(&priv->mutex); 155 | block_state = priv->block_states[(size_t)edge->block]; 156 | assert(block_state != BLOCK_IDLE); 157 | assert(block_state != BLOCK_WRITING); 158 | if (block_state == BLOCK_READERS_MAX) // there might be a waiting reader 159 | pthread_cond_signal(&priv->wakeup); 160 | block_state = (u_int8_t)(block_state - 1); // decrement #readers 161 | if ((priv->block_states[(size_t)edge->block] = block_state) == BLOCK_IDLE) // there 
might be a waiting writer 162 | pthread_cond_signal(&priv->wakeup); 163 | pthread_mutex_unlock(&priv->mutex); 164 | 165 | // Done 166 | free(buf); 167 | return r; 168 | } 169 | 170 | /* 171 | * Write a partial block by reading the whole block, patching it, and writing it back. 172 | * 173 | * If edge->data is NULL then write zeros. 174 | * 175 | * For any given block, we can only do this if there are no other simultaneous readers or writers. 176 | */ 177 | int 178 | block_part_write_block_part(struct s3backer_store *s3b, struct block_part *const priv, const struct boundary_edge *const edge) 179 | { 180 | const void *data = edge->data != NULL ? edge->data : zero_block; // if edge->data is NULL then write zeros 181 | u_char *buf; 182 | int r; 183 | 184 | // Sanity check 185 | assert(edge->offset <= priv->block_size); 186 | assert(edge->length > 0); 187 | assert(edge->length < priv->block_size); 188 | assert(edge->offset + edge->length <= priv->block_size); 189 | 190 | // Does the next layer down support partial block writes natively? 
191 | if (s3b->write_block_part != NULL) 192 | return (*s3b->write_block_part)(s3b, edge->block, edge->offset, edge->length, edge->data); 193 | 194 | // Allocate buffer 195 | if ((buf = malloc(priv->block_size)) == NULL) 196 | return errno; 197 | 198 | // Grab exclusive lock on this block 199 | pthread_mutex_lock(&priv->mutex); 200 | while (1) { 201 | if (priv->block_states[(size_t)edge->block] != BLOCK_IDLE) { 202 | pthread_cond_wait(&priv->wakeup, &priv->mutex); 203 | continue; 204 | } 205 | priv->block_states[(size_t)edge->block] = BLOCK_WRITING; // grab exclusive write lock 206 | break; 207 | } 208 | pthread_mutex_unlock(&priv->mutex); 209 | 210 | // Read entire block 211 | if ((r = (*s3b->read_block)(s3b, edge->block, buf, NULL, NULL, 0)) != 0) 212 | goto done; 213 | 214 | // Write in supplied fragment 215 | memcpy(buf + edge->offset, data, edge->length); 216 | 217 | // Write back entire block 218 | r = (*s3b->write_block)(s3b, edge->block, buf, NULL, NULL, NULL); 219 | 220 | done: 221 | // Release exclusive lock on this block 222 | pthread_mutex_lock(&priv->mutex); 223 | assert(priv->block_states[(size_t)edge->block] == BLOCK_WRITING); 224 | priv->block_states[(size_t)edge->block] = BLOCK_IDLE; // release exclusive write lock 225 | pthread_cond_signal(&priv->wakeup); // there might be a waiting reader or writer 226 | pthread_mutex_unlock(&priv->mutex); 227 | 228 | // Done 229 | free(buf); 230 | return r; 231 | } 232 | -------------------------------------------------------------------------------- /block_part.h: -------------------------------------------------------------------------------- 1 | 2 | /* 3 | * s3backer - FUSE-based single file backing store via Amazon S3 4 | * 5 | * Copyright 2008-2023 Archie L. 
Cobbs 6 | * 7 | * This program is free software; you can redistribute it and/or 8 | * modify it under the terms of the GNU General Public License 9 | * as published by the Free Software Foundation; either version 2 10 | * of the License, or (at your option) any later version. 11 | * 12 | * This program is distributed in the hope that it will be useful, 13 | * but WITHOUT ANY WARRANTY; without even the implied warranty of 14 | * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the 15 | * GNU General Public License for more details. 16 | * 17 | * You should have received a copy of the GNU General Public License 18 | * along with this program; if not, write to the Free Software 19 | * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 20 | * 02110-1301, USA. 21 | * 22 | * In addition, as a special exception, the copyright holders give 23 | * permission to link the code of portions of this program with the 24 | * OpenSSL library under certain conditions as described in each 25 | * individual source file, and distribute linked combinations including 26 | * the two. 27 | * 28 | * You must obey the GNU General Public License in all respects for all 29 | * of the code used other than OpenSSL. If you modify file(s) with this 30 | * exception, you may extend this exception to your version of the 31 | * file(s), but you are not obligated to do so. If you do not wish to do 32 | * so, delete this exception statement from your version. If you delete 33 | * this exception statement from all source files in the program, then 34 | * also delete it here. 
35 | */ 36 | 37 | // Forward decl's 38 | struct block_part; 39 | struct boundary_edge; 40 | 41 | // block_part.c 42 | extern struct block_part *block_part_create(u_int block_size, s3b_block_t num_blocks); 43 | extern int block_part_read_block_part(struct s3backer_store *s3b, struct block_part *block_part, const struct boundary_edge *edge); 44 | extern int block_part_write_block_part(struct s3backer_store *s3b, struct block_part *block_part, const struct boundary_edge *edge); 45 | extern void block_part_destroy(struct block_part **block_partp); 46 | -------------------------------------------------------------------------------- /cleanup.sh: -------------------------------------------------------------------------------- 1 | #!/bin/sh 2 | 3 | # 4 | # Script to clean out generated GNU auto* gunk. 5 | # 6 | 7 | set -e 8 | 9 | echo "cleaning up" 10 | rm -rf autom4te*.cache scripts aclocal.m4 configure config.log config.status .deps stamp-h1 11 | rm -f config.h.in config.h.in~ config.h 12 | rm -f *.lo *.la libtool 13 | rm -rf .libs scripts m4 tags TAGS 14 | find . \( -name Makefile -o -name Makefile.in \) -print0 | xargs -0 rm -f 15 | rm -f gitrev.c s3backer.spec 16 | rm -f *.o s3backer{,.1} tester 17 | rm -f s3backer-?.?.?.tar.gz 18 | 19 | -------------------------------------------------------------------------------- /compress.c: -------------------------------------------------------------------------------- 1 | 2 | /* 3 | * s3backer - FUSE-based single file backing store via Amazon S3 4 | * 5 | * Copyright 2008-2023 Archie L. Cobbs 6 | * 7 | * This program is free software; you can redistribute it and/or 8 | * modify it under the terms of the GNU General Public License 9 | * as published by the Free Software Foundation; either version 2 10 | * of the License, or (at your option) any later version. 
11 | * 12 | * This program is distributed in the hope that it will be useful, 13 | * but WITHOUT ANY WARRANTY; without even the implied warranty of 14 | * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the 15 | * GNU General Public License for more details. 16 | * 17 | * You should have received a copy of the GNU General Public License 18 | * along with this program; if not, write to the Free Software 19 | * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 20 | * 02110-1301, USA. 21 | * 22 | * In addition, as a special exception, the copyright holders give 23 | * permission to link the code of portions of this program with the 24 | * OpenSSL library under certain conditions as described in each 25 | * individual source file, and distribute linked combinations including 26 | * the two. 27 | * 28 | * You must obey the GNU General Public License in all respects for all 29 | * of the code used other than OpenSSL. If you modify file(s) with this 30 | * exception, you may extend this exception to your version of the 31 | * file(s), but you are not obligated to do so. If you do not wish to do 32 | * so, delete this exception statement from your version. If you delete 33 | * this exception statement from all source files in the program, then 34 | * also delete it here. 
35 | */ 36 | 37 | #include "s3backer.h" 38 | #include "compress.h" 39 | 40 | #if ZSTD 41 | #include <zstd.h> 42 | #endif 43 | 44 | // Internal helpers 45 | static int *parse_integer_level(const char *string); 46 | static void free_integer_level(void *levelp); 47 | 48 | // Compression hooks - Deflate 49 | static comp_cfunc_t deflate_compress; 50 | static comp_dfunc_t deflate_decompress; 51 | static comp_lparse_t deflate_lparse; 52 | 53 | #if ZSTD 54 | 55 | // Compression hooks - Zstd 56 | static comp_cfunc_t zstd_compress; 57 | static comp_dfunc_t zstd_decompress; 58 | static comp_lparse_t zstd_lparse; 59 | #endif 60 | 61 | // Compression algorithms 62 | const struct comp_alg comp_algs[] = { 63 | #if COMP_ALG_ZLIB != 0 64 | #error incorrect COMP_ALG_ZLIB 65 | #endif 66 | 67 | // Deflate 68 | { 69 | .name= "deflate", 70 | .cfunc= deflate_compress, 71 | .dfunc= deflate_decompress, 72 | .lparse= deflate_lparse, 73 | .lfree= free_integer_level 74 | }, 75 | 76 | #if ZSTD 77 | // Zstandard 78 | { 79 | .name= "zstd", 80 | .cfunc= zstd_compress, 81 | .dfunc= zstd_decompress, 82 | .lparse= zstd_lparse, 83 | .lfree= free_integer_level 84 | }, 85 | #endif 86 | }; 87 | const size_t num_comp_algs = sizeof(comp_algs) / sizeof(*comp_algs); 88 | 89 | /**************************************************************************** 90 | * GENERAL PURPOSE * 91 | ****************************************************************************/ 92 | 93 | const struct comp_alg * 94 | comp_find(const char *name) 95 | { 96 | size_t i; 97 | 98 | for (i = 0; i < num_comp_algs; i++) { 99 | const struct comp_alg *calg = &comp_algs[i]; 100 | 101 | if (strcasecmp(name, calg->name) == 0) 102 | return calg; 103 | } 104 | return NULL; 105 | } 106 | 107 | /**************************************************************************** 108 | * DEFLATE * 109 | ****************************************************************************/ 110 | 111 | static int 112 | deflate_compress(log_func_t *log, const void *input, 
size_t inlen, void **outputp, size_t *outlenp, void *levelp) 113 | { 114 | u_long clen; 115 | void *cbuf; 116 | int level; 117 | int r; 118 | 119 | // Allocate buffer 120 | clen = compressBound(inlen); 121 | if ((cbuf = malloc(clen)) == NULL) { 122 | r = errno; 123 | (*log)(LOG_ERR, "malloc: %s", strerror(r)); 124 | return r; 125 | } 126 | 127 | // Extract compression level 128 | level = levelp != NULL ? *(int *)levelp : Z_DEFAULT_COMPRESSION; 129 | 130 | // Compress data 131 | r = compress2(cbuf, &clen, input, inlen, level); 132 | switch (r) { 133 | case Z_OK: 134 | *outputp = cbuf; 135 | *outlenp = clen; 136 | return 0; 137 | case Z_MEM_ERROR: 138 | (*log)(LOG_ERR, "zlib compress: %s", strerror(ENOMEM)); 139 | r = ENOMEM; 140 | break; 141 | default: 142 | (*log)(LOG_ERR, "zlib compress: error %d", r); 143 | r = EIO; 144 | break; 145 | } 146 | 147 | // Fail 148 | free(cbuf); 149 | return r; 150 | } 151 | 152 | static int 153 | deflate_decompress(log_func_t *log, const void *input, size_t inlen, void *output, size_t *outlenp) 154 | { 155 | u_long uclen = *outlenp; 156 | int r; 157 | 158 | switch ((r = uncompress(output, &uclen, input, inlen))) { 159 | case Z_OK: 160 | *outlenp = uclen; 161 | return 0; 162 | case Z_MEM_ERROR: 163 | (*log)(LOG_ERR, "zlib uncompress: %s", strerror(ENOMEM)); 164 | return ENOMEM; 165 | case Z_BUF_ERROR: 166 | (*log)(LOG_ERR, "zlib uncompress: %s", "decompressed block is oversize"); 167 | return EIO; 168 | case Z_DATA_ERROR: 169 | (*log)(LOG_ERR, "zlib uncompress: %s", "data is corrupted or truncated"); 170 | return EIO; 171 | default: 172 | (*log)(LOG_ERR, "zlib uncompress: error %d", r); 173 | return EIO; 174 | } 175 | } 176 | 177 | static void * 178 | deflate_lparse(const char *string) 179 | { 180 | int *levelp; 181 | 182 | // Parse level 183 | if ((levelp = parse_integer_level(string)) == NULL) 184 | goto invalid; 185 | 186 | // Check level 187 | switch (*levelp) { 188 | case Z_DEFAULT_COMPRESSION: 189 | case Z_NO_COMPRESSION: 190 | 
break; 191 | default: 192 | if (*levelp < Z_BEST_SPEED || *levelp > Z_BEST_COMPRESSION) { 193 | free(levelp); 194 | goto invalid; 195 | } 196 | break; 197 | } 198 | 199 | // Done 200 | return levelp; 201 | 202 | invalid: 203 | warnx("invalid deflate compression level \"%s\"", string); 204 | return NULL; 205 | } 206 | 207 | #if ZSTD 208 | 209 | /**************************************************************************** 210 | * ZSTD * 211 | ****************************************************************************/ 212 | 213 | static int 214 | zstd_compress(log_func_t *log, const void *input, size_t inlen, void **outputp, size_t *outlenp, void *levelp) 215 | { 216 | u_long clen; 217 | void *cbuf; 218 | int level; 219 | int r; 220 | 221 | // Allocate buffer 222 | clen = ZSTD_compressBound(inlen); 223 | if ((cbuf = malloc(clen)) == NULL) { 224 | r = errno; 225 | (*log)(LOG_ERR, "malloc: %s", strerror(r)); 226 | return r; 227 | } 228 | 229 | // Extract compression level 230 | level = levelp != NULL ? 
*(int *)levelp : ZSTD_CLEVEL_DEFAULT; 231 | 232 | // Compress data 233 | clen = ZSTD_compress(cbuf, clen, input, inlen, level); 234 | if (ZSTD_isError(clen)) { 235 | (*log)(LOG_ERR, "zstd compress: error, %s", ZSTD_getErrorName(clen)); 236 | free(cbuf); 237 | return EIO; 238 | } 239 | 240 | // Done 241 | *outputp = cbuf; 242 | *outlenp = clen; 243 | return 0; 244 | } 245 | 246 | static int 247 | zstd_decompress(log_func_t *log, const void *input, size_t inlen, void *output, size_t *outlenp) 248 | { 249 | size_t code; 250 | 251 | // Decompress 252 | code = ZSTD_decompress(output, *outlenp, input, inlen); 253 | if (ZSTD_isError(code)) { 254 | (*log)(LOG_ERR, "zstd uncompress: %s", ZSTD_getErrorName(code)); 255 | return EIO; 256 | } 257 | 258 | // Done 259 | *outlenp = code; 260 | return 0; 261 | } 262 | 263 | static void * 264 | zstd_lparse(const char *string) 265 | { 266 | int *levelp; 267 | 268 | // Parse level 269 | if ((levelp = parse_integer_level(string)) == NULL) 270 | goto invalid; 271 | 272 | // Check level 273 | if (*levelp < ZSTD_minCLevel() || *levelp > ZSTD_maxCLevel()) { 274 | free(levelp); 275 | goto invalid; 276 | } 277 | 278 | // Done 279 | return levelp; 280 | 281 | invalid: 282 | warnx("invalid zstd compression level \"%s\"", string); 283 | return NULL; 284 | } 285 | #endif 286 | 287 | /**************************************************************************** 288 | * INTERNAL HELPERS * 289 | ****************************************************************************/ 290 | 291 | static int * 292 | parse_integer_level(const char *string) 293 | { 294 | char *endptr; 295 | long level; 296 | int *levelp; 297 | 298 | // Parse level 299 | errno = 0; 300 | level = strtol(string, &endptr, 10); 301 | if ((errno == ERANGE && (level == LONG_MIN || level == LONG_MAX)) 302 | || (errno != 0 && level == 0) 303 | || *endptr != '\0' 304 | || (int)level != level) 305 | return NULL; 306 | 307 | // Store in buffer 308 | if ((levelp = malloc(sizeof(*levelp))) == 
NULL) { 309 | warn("malloc"); 310 | return NULL; 311 | } 312 | *levelp = (int)level; 313 | return levelp; 314 | } 315 | 316 | static void 317 | free_integer_level(void *levelp) 318 | { 319 | free(levelp); 320 | } 321 | -------------------------------------------------------------------------------- /compress.h: -------------------------------------------------------------------------------- 1 | 2 | /* 3 | * s3backer - FUSE-based single file backing store via Amazon S3 4 | * 5 | * Copyright 2008-2023 Archie L. Cobbs 6 | * 7 | * This program is free software; you can redistribute it and/or 8 | * modify it under the terms of the GNU General Public License 9 | * as published by the Free Software Foundation; either version 2 10 | * of the License, or (at your option) any later version. 11 | * 12 | * This program is distributed in the hope that it will be useful, 13 | * but WITHOUT ANY WARRANTY; without even the implied warranty of 14 | * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the 15 | * GNU General Public License for more details. 16 | * 17 | * You should have received a copy of the GNU General Public License 18 | * along with this program; if not, write to the Free Software 19 | * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 20 | * 02110-1301, USA. 21 | * 22 | * In addition, as a special exception, the copyright holders give 23 | * permission to link the code of portions of this program with the 24 | * OpenSSL library under certain conditions as described in each 25 | * individual source file, and distribute linked combinations including 26 | * the two. 27 | * 28 | * You must obey the GNU General Public License in all respects for all 29 | * of the code used other than OpenSSL. If you modify file(s) with this 30 | * exception, you may extend this exception to your version of the 31 | * file(s), but you are not obligated to do so. If you do not wish to do 32 | * so, delete this exception statement from your version. 
If you delete 33 | * this exception statement from all source files in the program, then 34 | * also delete it here. 35 | */ 36 | 37 | /* 38 | * Compression function 39 | * 40 | * Returns 0 on success, otherwise (positive) error code. 41 | * 42 | * log - where to log errors 43 | * input - the data to compress 44 | * inlen - length of input 45 | * outputp - on successful return, points to compressed data in a malloc'd buffer 46 | * outlenp - on successful return, length of *outputp 47 | * level - compression level info from parse function, or NULL for default 48 | */ 49 | typedef int comp_cfunc_t(log_func_t *log, const void *input, size_t inlen, void **outputp, size_t *outlenp, void *level); 50 | 51 | /* 52 | * Decompression function 53 | * 54 | * Returns 0 on success, otherwise (positive) error code. 55 | * 56 | * log - where to log errors 57 | * input - the data to decompress 58 | * inlen - length of input 59 | * output - buffer for decompressed data, having length at least *outlenp 60 | * outlenp 61 | * - on invocation, points to maximum possible/expected length of decompressed data 62 | * - on successful return, points to the actual length of decompressed data 63 | */ 64 | typedef int comp_dfunc_t(log_func_t *log, const void *input, size_t inlen, void *output, size_t *outlenp); 65 | 66 | /* 67 | * Compression level parsing function. 68 | * 69 | * Returns opaque level information on success, otherwise prints error to stderr and returns NULL. 70 | * 71 | * This function is only invoked if a "--compress-level=xxx" flag is given on the command line. If so, then the "xxx" must 72 | * be parsed successfully by this function and a non-NULL result returned. The returned result (or NULL if no such command 73 | * line flag was given) is passed as the "level" parameter to the compression function. 74 | * 75 | * string - compression level string 76 | */ 77 | typedef void *comp_lparse_t(const char *string); 78 | 79 | /* 80 | * Compression level free function. 
81 | * 82 | * Frees value previously returned by comp_lparse_t. Must gracefully handle NULL. 83 | */ 84 | typedef void comp_lfree_t(void *value); 85 | 86 | // Compression algorithms 87 | struct comp_alg { 88 | const char *name; 89 | comp_cfunc_t *cfunc; 90 | comp_dfunc_t *dfunc; 91 | comp_lparse_t *lparse; 92 | comp_lfree_t *lfree; 93 | }; 94 | 95 | // Globals 96 | extern const size_t num_comp_algs; 97 | extern const struct comp_alg comp_algs[]; 98 | 99 | // Functions 100 | extern const struct comp_alg *comp_find(const char *name); 101 | -------------------------------------------------------------------------------- /configure.ac: -------------------------------------------------------------------------------- 1 | # 2 | # s3backer - FUSE-based single file backing store via Amazon S3 3 | # 4 | # Copyright 2008-2023 Archie L. Cobbs 5 | # 6 | # This program is free software; you can redistribute it and/or 7 | # modify it under the terms of the GNU General Public License 8 | # as published by the Free Software Foundation; either version 2 9 | # of the License, or (at your option) any later version. 10 | # 11 | # This program is distributed in the hope that it will be useful, 12 | # but WITHOUT ANY WARRANTY; without even the implied warranty of 13 | # MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the 14 | # GNU General Public License for more details. 15 | # 16 | # You should have received a copy of the GNU General Public License 17 | # along with this program; if not, write to the Free Software 18 | # Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 19 | # 02110-1301, USA. 20 | # 21 | # In addition, as a special exception, the copyright holders give 22 | # permission to link the code of portions of this program with the 23 | # OpenSSL library under certain conditions as described in each 24 | # individual source file, and distribute linked combinations including 25 | # the two. 
26 | # 27 | # You must obey the GNU General Public License in all respects for all 28 | # of the code used other than OpenSSL. If you modify file(s) with this 29 | # exception, you may extend this exception to your version of the 30 | # file(s), but you are not obligated to do so. If you do not wish to do 31 | # so, delete this exception statement from your version. If you delete 32 | # this exception statement from all source files in the program, then 33 | # also delete it here. 34 | 35 | AC_INIT([s3backer FUSE filesystem backed by Amazon S3],[2.1.5],[https://github.com/archiecobbs/s3backer],[s3backer]) 36 | AC_CONFIG_AUX_DIR(scripts) 37 | AM_INIT_AUTOMAKE(foreign) 38 | LT_INIT() 39 | dnl AM_MAINTAINER_MODE 40 | AC_PREREQ([2.69]) 41 | AC_PREFIX_DEFAULT(/usr) 42 | AC_PROG_MAKE_SET 43 | 44 | [CFLAGS="-g -O3 -pipe -Wall -Wcast-align -Wchar-subscripts -Wcomment -Wformat -Wimplicit -Wmissing-declarations -Wmissing-prototypes -Wnested-externs -Wno-long-long -Wparentheses -Wpointer-arith -Wredundant-decls -Wreturn-type -Wswitch -Wtrigraphs -Wuninitialized -Wunused -Wwrite-strings -Wshadow -Wstrict-prototypes -Wcast-qual $CFLAGS"] 45 | AC_SUBST(CFLAGS) 46 | 47 | # Not sure why this is needed but it seems to help 48 | AC_CONFIG_MACRO_DIRS([m4]) 49 | 50 | # Compile flags for Linux. 
See https://stackoverflow.com/a/29201732 51 | AC_DEFINE(_GNU_SOURCE, 1, GNU functions) 52 | AC_DEFINE(_BSD_SOURCE, 1, BSD functions) 53 | AC_DEFINE(_DEFAULT_SOURCE, 1, Default functions) 54 | 55 | # Compile flags for Mac OS 56 | AC_DEFINE(_DARWIN_C_SOURCE, 1, MacOS functions) 57 | 58 | # Compile flags for FUSE 59 | AC_DEFINE(FUSE_USE_VERSION, 35, FUSE API version) 60 | 61 | # Check for required programs 62 | AC_PROG_INSTALL 63 | AC_PROG_CC 64 | 65 | # Check for required pkg-config'd stuff 66 | PKG_PROG_PKG_CONFIG(0.19) 67 | PKG_CHECK_MODULES(FUSE, fuse3, 68 | [CFLAGS="${CFLAGS} ${FUSE_CFLAGS}" 69 | LDFLAGS="${LDFLAGS} ${FUSE_LIBS}"], 70 | [AC_MSG_ERROR(["fuse3" not found in pkg-config])]) 71 | 72 | # Check for zstd (optional) 73 | PKG_CHECK_MODULES([ZSTD], libzstd, 74 | [AC_DEFINE([ZSTD], [1], [Whether zstd is available]) 75 | CFLAGS="${CFLAGS} ${ZSTD_CFLAGS}" 76 | LDFLAGS="${LDFLAGS} ${ZSTD_LIBS}"], 77 | [true]) 78 | 79 | # Check if NBD support is enabled 80 | AS_IF([test "x$enable_nbd" != xno], [ 81 | 82 | # Check for NBDKit 83 | PKG_CHECK_MODULES([NBDKIT], [nbdkit >= 1.24.1], 84 | [AC_DEFINE([NBDKIT], [1], [Whether NBDKit is available]) 85 | NBDKIT_FOUND=true], 86 | [NBDKIT_FOUND=false]) 87 | 88 | # If NBDKit was found, find the plugin directory and executable 89 | AS_IF([test "x$NBDKIT_FOUND" = xtrue], [ 90 | 91 | # Find nbdkit plugins directory 92 | AC_MSG_CHECKING([NBDKit plugins directory]) 93 | PKG_CHECK_VAR([NBDKIT_PLUGINDIR], [nbdkit], [plugindir]) 94 | AS_IF([test "x$NBDKIT_PLUGINDIR" = "x"], 95 | [AC_MSG_FAILURE([Unable to identify the NBDKit plugin directory.])], 96 | [AC_MSG_RESULT([$NBDKIT_PLUGINDIR])]) 97 | 98 | # Find nbdkit(1) 99 | AC_PATH_PROGS([NBDKIT_EXECUTABLE], [nbdkit]) 100 | AS_IF([test "x$NBDKIT_EXECUTABLE" = x], [AC_MSG_ERROR([required executable nbdkit not found])]) 101 | AC_DEFINE_UNQUOTED([NBDKIT_EXECUTABLE], ["$NBDKIT_EXECUTABLE"], [path to nbdkit(1) executable]) 102 | 103 | # Find nbd-client(8) 104 | 
AC_PATH_PROGS([NBD_CLIENT_EXECUTABLE], [nbd-client]) 105 | AS_IF([test "x$NBD_CLIENT_EXECUTABLE" = x], [AC_MSG_ERROR([required executable nbd-client not found])]) 106 | AC_DEFINE_UNQUOTED([NBD_CLIENT_EXECUTABLE], ["$NBD_CLIENT_EXECUTABLE"], [path to nbd-client(8) executable]) 107 | 108 | # Find modprobe(8) (optional) 109 | AC_PATH_PROGS([MODPROBE_EXECUTABLE], [modprobe]) 110 | AC_DEFINE_UNQUOTED([MODPROBE_EXECUTABLE], ["$MODPROBE_EXECUTABLE"], [path to modprobe(8) executable]) 111 | ]) 112 | ], [NBDKIT_FOUND=false]) 113 | AM_CONDITIONAL([NBDKIT_FOUND], [test "x$NBDKIT_FOUND" = xtrue]) 114 | AC_SUBST([NBDKIT_FOUND]) 115 | 116 | # Define our directory for nbdkit(1) UNIX socket files, defaulting to /run/s3backer-nbd 117 | AC_ARG_VAR([S3B_NBD_DIR], [directory containing UNIX socket files used by nbd-client]) 118 | AS_IF([test "x$S3B_NBD_DIR" = x], [S3B_NBD_DIR="/run/s3backer-nbd"]) 119 | #AC_SUBST([S3B_NBD_DIR]) 120 | AC_DEFINE_UNQUOTED([S3B_NBD_DIR], ["$S3B_NBD_DIR"], [nbdkit UNIX socket file directory]) 121 | 122 | # Check for required libraries 123 | AC_CHECK_LIB(curl, curl_version,, 124 | [AC_MSG_ERROR([required library libcurl missing])]) 125 | AC_CHECK_LIB(crypto, BIO_new,, 126 | [AC_MSG_ERROR([required library libcrypto missing])]) 127 | AC_CHECK_LIB(expat, XML_ParserCreate,, 128 | [AC_MSG_ERROR([required library expat missing])]) 129 | AC_CHECK_LIB(fuse3, fuse_version,, 130 | [AC_MSG_ERROR([required library libfuse3 missing])]) 131 | AC_CHECK_LIB(z, compressBound,, 132 | [AC_MSG_ERROR([required library zlib missing])]) 133 | AC_COMPILE_IFELSE([AC_LANG_PROGRAM([[ 134 | #include <curl/curl.h> 135 | long x = CURLOPT_TCP_KEEPALIVE; 136 | int y = CURLINFO_CONTENT_LENGTH_DOWNLOAD_T; 137 | ]])],, [AC_MSG_ERROR([unable to compile with curl, or curl version is < 7.55.0])]) 138 | 139 | # Avoid libcurl 8.11.0. 
Ref: https://github.com/archiecobbs/s3backer/issues/232 140 | AC_COMPILE_IFELSE([AC_LANG_PROGRAM([[ 141 | #include <curl/curl.h> 142 | #if LIBCURL_VERSION_MAJOR == 8 && LIBCURL_VERSION_MINOR == 11 && LIBCURL_VERSION_PATCH == 0 143 | #error 144 | #endif 145 | ]])],, [AC_MSG_ERROR([curl 8.11.0 is broken; upgrade/downgrade to another version. See issue 232 for details.])]) 146 | 147 | # Set some O/S specific stuff 148 | case `uname -s` in 149 | Darwin|FreeBSD) 150 | AC_CHECK_LIB(pthread, pthread_create,, 151 | [AC_MSG_ERROR([required library libpthread missing])]) 152 | ;; 153 | Linux) 154 | LDFLAGS="${LDFLAGS} -pthread" 155 | ;; 156 | *) 157 | ;; 158 | esac 159 | 160 | # Check for some O/S specific functions 161 | AC_CHECK_DECLS(fdatasync) 162 | AC_CHECK_DECLS([posix_fadvise], [], [], [[#include <fcntl.h>]]) 163 | AC_CHECK_DECLS([prctl, PR_SET_IO_FLUSHER], [], [], [[#include <sys/prctl.h>]]) 164 | AC_CHECK_DECLS([fallocate, FALLOC_FL_PUNCH_HOLE, FALLOC_FL_KEEP_SIZE], [], [], [[#include <fcntl.h>]]) 165 | 166 | # Check for required header files 167 | AC_CHECK_HEADERS(assert.h ctype.h curl/curl.h err.h errno.h expat.h pthread.h stdarg.h stddef.h stdint.h stdio.h stdlib.h string.h syslog.h time.h unistd.h sys/queue.h sys/statvfs.h openssl/bio.h openssl/buffer.h openssl/evp.h openssl/hmac.h openssl/md5.h zlib.h, [], 168 | [AC_MSG_ERROR([required header file '$ac_header' missing])]) 169 | 170 | # Optional features 171 | AC_ARG_ENABLE(assertions, 172 | AS_HELP_STRING([--enable-assertions],[enable debugging sanity checks (default NO)]), 173 | [test x"$enableval" = "xyes" || AC_DEFINE(NDEBUG, 1, [disable assertions])], 174 | [AC_DEFINE(NDEBUG, 1, [disable assertions])]) 175 | AC_ARG_ENABLE(gprof, 176 | AS_HELP_STRING([--enable-gprof],[Compile and link with gprof(1) support (default NO)]), 177 | [test x"$enableval" = "xyes" && CFLAGS="${CFLAGS} -pg"]) 178 | AC_ARG_ENABLE(Werror, 179 | AS_HELP_STRING([--enable-Werror],[enable compilation with -Werror flag (default NO)]), 180 | [test x"$enableval" = "xyes" && 
CFLAGS="${CFLAGS} -Werror"]) 181 | AC_ARG_ENABLE(sanitize, 182 | AS_HELP_STRING([--enable-sanitize],[enable compilation with -fsanitize=address and -fsanitize=undefined (default NO)]), 183 | [test x"$enableval" = "xyes" && CFLAGS="${CFLAGS} -fsanitize=address -fsanitize=undefined"]) 184 | AC_ARG_ENABLE(nbd, 185 | AS_HELP_STRING([--enable-nbd],[include NBD support if nbdkit is found (default YES)])) 186 | 187 | # Generated files 188 | AC_CONFIG_FILES(Makefile) 189 | AC_CONFIG_FILES(s3backer.1) 190 | AC_CONFIG_HEADERS(config.h) 191 | 192 | # Go 193 | AC_OUTPUT 194 | -------------------------------------------------------------------------------- /dcache.h: -------------------------------------------------------------------------------- 1 | 2 | /* 3 | * s3backer - FUSE-based single file backing store via Amazon S3 4 | * 5 | * Copyright 2008-2023 Archie L. Cobbs 6 | * 7 | * This program is free software; you can redistribute it and/or 8 | * modify it under the terms of the GNU General Public License 9 | * as published by the Free Software Foundation; either version 2 10 | * of the License, or (at your option) any later version. 11 | * 12 | * This program is distributed in the hope that it will be useful, 13 | * but WITHOUT ANY WARRANTY; without even the implied warranty of 14 | * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the 15 | * GNU General Public License for more details. 16 | * 17 | * You should have received a copy of the GNU General Public License 18 | * along with this program; if not, write to the Free Software 19 | * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 20 | * 02110-1301, USA. 21 | * 22 | * In addition, as a special exception, the copyright holders give 23 | * permission to link the code of portions of this program with the 24 | * OpenSSL library under certain conditions as described in each 25 | * individual source file, and distribute linked combinations including 26 | * the two. 
27 | * 28 | * You must obey the GNU General Public License in all respects for all 29 | * of the code used other than OpenSSL. If you modify file(s) with this 30 | * exception, you may extend this exception to your version of the 31 | * file(s), but you are not obligated to do so. If you do not wish to do 32 | * so, delete this exception statement from your version. If you delete 33 | * this exception statement from all source files in the program, then 34 | * also delete it here. 35 | */ 36 | 37 | /* 38 | * Simple on-disk persistent cache. 39 | */ 40 | 41 | // Declarations 42 | struct s3b_dcache; 43 | struct block_cache_conf; 44 | 45 | /* 46 | * Startup visitor callback. Each non-empty slot in the disk cache is visited. 47 | * 48 | * The "etag" pointer is NULL for dirty blocks, and not NULL for clean blocks. 49 | */ 50 | typedef int s3b_dcache_visit_t(void *arg, s3b_block_t dslot, s3b_block_t block_num, const u_char *etag); 51 | 52 | // dcache.c 53 | extern int s3b_dcache_open(struct s3b_dcache **dcachep, 54 | struct block_cache_conf *config, s3b_dcache_visit_t *visitor, void *arg, u_int visit_dirty); 55 | extern void s3b_dcache_close(struct s3b_dcache *dcache); 56 | extern u_int s3b_dcache_size(struct s3b_dcache *dcache); 57 | extern int s3b_dcache_alloc_block(struct s3b_dcache *priv, u_int *dslotp); 58 | extern int s3b_dcache_record_block(struct s3b_dcache *priv, u_int dslot, s3b_block_t block_num, const u_char *etag); 59 | extern int s3b_dcache_erase_block(struct s3b_dcache *priv, u_int dslot); 60 | extern int s3b_dcache_free_block(struct s3b_dcache *dcache, u_int dslot); 61 | extern int s3b_dcache_read_block(struct s3b_dcache *dcache, u_int dslot, void *dest, u_int off, u_int len); 62 | extern int s3b_dcache_write_block(struct s3b_dcache *dcache, u_int dslot, const void *src, u_int off, u_int len); 63 | extern int s3b_dcache_fsync(struct s3b_dcache *dcache); 64 | extern int s3b_dcache_has_mount_token(struct s3b_dcache *priv); 65 | extern int 
s3b_dcache_set_mount_token(struct s3b_dcache *priv, int32_t *old_valuep, int32_t new_value); 66 | 67 | -------------------------------------------------------------------------------- /ec_protect.h: -------------------------------------------------------------------------------- 1 | 2 | /* 3 | * s3backer - FUSE-based single file backing store via Amazon S3 4 | * 5 | * Copyright 2008-2023 Archie L. Cobbs 6 | * 7 | * This program is free software; you can redistribute it and/or 8 | * modify it under the terms of the GNU General Public License 9 | * as published by the Free Software Foundation; either version 2 10 | * of the License, or (at your option) any later version. 11 | * 12 | * This program is distributed in the hope that it will be useful, 13 | * but WITHOUT ANY WARRANTY; without even the implied warranty of 14 | * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the 15 | * GNU General Public License for more details. 16 | * 17 | * You should have received a copy of the GNU General Public License 18 | * along with this program; if not, write to the Free Software 19 | * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 20 | * 02110-1301, USA. 21 | * 22 | * In addition, as a special exception, the copyright holders give 23 | * permission to link the code of portions of this program with the 24 | * OpenSSL library under certain conditions as described in each 25 | * individual source file, and distribute linked combinations including 26 | * the two. 27 | * 28 | * You must obey the GNU General Public License in all respects for all 29 | * of the code used other than OpenSSL. If you modify file(s) with this 30 | * exception, you may extend this exception to your version of the 31 | * file(s), but you are not obligated to do so. If you do not wish to do 32 | * so, delete this exception statement from your version. If you delete 33 | * this exception statement from all source files in the program, then 34 | * also delete it here. 
35 | */ 36 | 37 | // Configuration info structure for ec_protect store 38 | struct ec_protect_conf { 39 | u_int block_size; 40 | u_int min_write_delay; 41 | u_int cache_time; 42 | u_int cache_size; 43 | log_func_t *log; 44 | }; 45 | 46 | // Statistics structure for ec_protect store 47 | struct ec_protect_stats { 48 | u_int current_cache_size; 49 | u_int cache_data_hits; 50 | uint64_t cache_full_delay; 51 | uint64_t repeated_write_delay; 52 | u_int out_of_memory_errors; 53 | }; 54 | 55 | // ec_protect.c 56 | extern struct s3backer_store *ec_protect_create(struct ec_protect_conf *config, struct s3backer_store *inner); 57 | extern void ec_protect_get_stats(struct s3backer_store *s3b, struct ec_protect_stats *stats); 58 | extern void ec_protect_clear_stats(struct s3backer_store *s3b); 59 | 60 | -------------------------------------------------------------------------------- /erase.c: -------------------------------------------------------------------------------- 1 | 2 | /* 3 | * s3backer - FUSE-based single file backing store via Amazon S3 4 | * 5 | * Copyright 2008-2023 Archie L. Cobbs 6 | * 7 | * This program is free software; you can redistribute it and/or 8 | * modify it under the terms of the GNU General Public License 9 | * as published by the Free Software Foundation; either version 2 10 | * of the License, or (at your option) any later version. 11 | * 12 | * This program is distributed in the hope that it will be useful, 13 | * but WITHOUT ANY WARRANTY; without even the implied warranty of 14 | * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the 15 | * GNU General Public License for more details. 16 | * 17 | * You should have received a copy of the GNU General Public License 18 | * along with this program; if not, write to the Free Software 19 | * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 20 | * 02110-1301, USA. 
21 | * 22 | * In addition, as a special exception, the copyright holders give 23 | * permission to link the code of portions of this program with the 24 | * OpenSSL library under certain conditions as described in each 25 | * individual source file, and distribute linked combinations including 26 | * the two. 27 | * 28 | * You must obey the GNU General Public License in all respects for all 29 | * of the code used other than OpenSSL. If you modify file(s) with this 30 | * exception, you may extend this exception to your version of the 31 | * file(s), but you are not obligated to do so. If you do not wish to do 32 | * so, delete this exception statement from your version. If you delete 33 | * this exception statement from all source files in the program, then 34 | * also delete it here. 35 | */ 36 | 37 | #include "s3backer.h" 38 | #include "block_cache.h" 39 | #include "ec_protect.h" 40 | #include "zero_cache.h" 41 | #include "fuse_ops.h" 42 | #include "http_io.h" 43 | #include "test_io.h" 44 | #include "s3b_config.h" 45 | #include "erase.h" 46 | #include "util.h" 47 | 48 | #define BLOCKS_PER_DOT 0x100 49 | #define MAX_QUEUE_LENGTH 100000 50 | #define NUM_ERASURE_THREADS 25 51 | 52 | // Erasure state 53 | struct erase_state { 54 | struct s3backer_store *s3b; 55 | s3b_block_t queue[MAX_QUEUE_LENGTH]; 56 | u_int qlen; 57 | pthread_t threads[NUM_ERASURE_THREADS]; 58 | int quiet; 59 | int stopping; 60 | uintmax_t count; 61 | bitmap_t *seen; 62 | pthread_mutex_t mutex; 63 | pthread_cond_t thread_wakeup; 64 | pthread_cond_t queue_not_full; 65 | pthread_cond_t queue_empty; 66 | }; 67 | 68 | // Internal functions 69 | static block_list_func_t erase_list_callback; 70 | static void *erase_thread_main(void *arg); 71 | 72 | int 73 | s3backer_erase(struct s3b_config *config) 74 | { 75 | struct erase_state state; 76 | struct erase_state *const priv = &state; 77 | char response[10]; 78 | int num_threads; 79 | int ok = 0; 80 | int r; 81 | 82 | // Double check with user 83 | if 
(!config->force) { 84 | warnx("\"--erase\" flag given: erasing all blocks in %s", config->description); 85 | fprintf(stderr, "s3backer: is this correct? [y/N] "); 86 | *response = '\0'; 87 | if (fgets(response, sizeof(response), stdin) != NULL) { 88 | while (*response && isspace(response[strlen(response) - 1])) 89 | response[strlen(response) - 1] = '\0'; 90 | } 91 | if (strcasecmp(response, "y") != 0 && strcasecmp(response, "yes") != 0) { 92 | warnx("not confirmed"); 93 | goto fail0; 94 | } 95 | } 96 | 97 | // Initialize state 98 | memset(priv, 0, sizeof(*priv)); 99 | priv->quiet = config->quiet; 100 | if ((r = pthread_mutex_init(&priv->mutex, NULL)) != 0) { 101 | warnx("pthread_mutex_init: %s", strerror(r)); 102 | goto fail0; 103 | } 104 | if ((r = pthread_cond_init(&priv->thread_wakeup, NULL)) != 0) { 105 | warnx("pthread_cond_init: %s", strerror(r)); 106 | goto fail1; 107 | } 108 | if ((r = pthread_cond_init(&priv->queue_not_full, NULL)) != 0) { 109 | warnx("pthread_cond_init: %s", strerror(r)); 110 | goto fail2; 111 | } 112 | if ((r = pthread_cond_init(&priv->queue_empty, NULL)) != 0) { 113 | warnx("pthread_cond_init: %s", strerror(r)); 114 | goto fail3; 115 | } 116 | if ((priv->seen = bitmap_init(config->num_blocks, 0)) == NULL) { 117 | r = errno; 118 | warnx("calloc: %s", strerror(r)); 119 | goto fail4; 120 | } 121 | for (num_threads = 0; num_threads < NUM_ERASURE_THREADS; num_threads++) { 122 | if ((r = pthread_create(&priv->threads[num_threads], NULL, erase_thread_main, priv)) != 0) 123 | goto fail5; 124 | } 125 | 126 | // Logging 127 | if (!config->quiet) { 128 | fprintf(stderr, "s3backer: erasing non-zero blocks..."); 129 | fflush(stderr); 130 | } 131 | 132 | // Create temporary lower layer 133 | if ((priv->s3b = config->test ? test_io_create(&config->test_io) : http_io_create(&config->http_io)) == NULL) { 134 | warnx(config->test ? 
"test_io_create" : "http_io_create"); 135 | goto fail5; 136 | } 137 | 138 | // Iterate over non-zero blocks 139 | if ((r = (*priv->s3b->survey_non_zero)(priv->s3b, erase_list_callback, priv)) != 0) { 140 | warnx("can't list blocks: %s", strerror(r)); 141 | goto fail5; 142 | } 143 | 144 | // Wait for queue to drain 145 | pthread_mutex_lock(&priv->mutex); 146 | while (priv->qlen > 0) 147 | pthread_cond_wait(&priv->queue_empty, &priv->mutex); 148 | CHECK_RETURN(pthread_mutex_unlock(&priv->mutex)); 149 | 150 | // Clear mount token 151 | if ((r = (*priv->s3b->set_mount_token)(priv->s3b, NULL, 0)) != 0) { 152 | warnx("can't clear mount token: %s", strerror(r)); 153 | goto fail5; 154 | } 155 | 156 | // Success 157 | ok = 1; 158 | 159 | // Clean up 160 | fail5: 161 | pthread_mutex_lock(&priv->mutex); 162 | priv->stopping = 1; 163 | pthread_cond_broadcast(&priv->thread_wakeup); 164 | CHECK_RETURN(pthread_mutex_unlock(&priv->mutex)); 165 | while (num_threads > 0) { 166 | if ((r = pthread_join(priv->threads[--num_threads], NULL)) != 0) 167 | warnx("pthread_join: %s", strerror(r)); 168 | } 169 | if (priv->s3b != NULL) { 170 | if (ok && !config->quiet) { 171 | fprintf(stderr, "done\n"); 172 | warnx("erased %ju non-zero blocks", priv->count); 173 | } 174 | (*priv->s3b->shutdown)(priv->s3b); 175 | (*priv->s3b->destroy)(priv->s3b); 176 | } 177 | bitmap_free(&priv->seen); 178 | fail4: 179 | pthread_cond_destroy(&priv->queue_empty); 180 | fail3: 181 | pthread_cond_destroy(&priv->queue_not_full); 182 | fail2: 183 | pthread_cond_destroy(&priv->thread_wakeup); 184 | fail1: 185 | pthread_mutex_destroy(&priv->mutex); 186 | fail0: 187 | return ok ? 
0 : -1; 188 | } 189 | 190 | static int 191 | erase_list_callback(void *arg, const s3b_block_t *block_nums, u_int num_blocks) 192 | { 193 | struct erase_state *const priv = arg; 194 | 195 | pthread_mutex_lock(&priv->mutex); 196 | while (num_blocks-- > 0) { 197 | const s3b_block_t block_num = *block_nums++; 198 | 199 | if (bitmap_test(priv->seen, block_num)) 200 | continue; // already reported to us 201 | bitmap_set(priv->seen, block_num, 1); 202 | while (priv->qlen == MAX_QUEUE_LENGTH) 203 | pthread_cond_wait(&priv->queue_not_full, &priv->mutex); 204 | priv->queue[priv->qlen++] = block_num; 205 | } 206 | pthread_cond_broadcast(&priv->thread_wakeup); 207 | CHECK_RETURN(pthread_mutex_unlock(&priv->mutex)); 208 | return 0; 209 | } 210 | 211 | static void * 212 | erase_thread_main(void *arg) 213 | { 214 | struct erase_state *const priv = arg; 215 | s3b_block_t block_num; 216 | int r; 217 | 218 | // Acquire lock 219 | pthread_mutex_lock(&priv->mutex); 220 | 221 | // Erase blocks until there are no more 222 | while (1) { 223 | 224 | // Is there a block to erase? 225 | if (priv->qlen > 0) { 226 | 227 | // Grab next block 228 | if (priv->qlen == MAX_QUEUE_LENGTH) 229 | pthread_cond_broadcast(&priv->queue_not_full); 230 | block_num = priv->queue[--priv->qlen]; 231 | if (priv->qlen == 0) 232 | pthread_cond_signal(&priv->queue_empty); 233 | 234 | // Do block deletion 235 | CHECK_RETURN(pthread_mutex_unlock(&priv->mutex)); 236 | r = (*priv->s3b->write_block)(priv->s3b, block_num, NULL, NULL, NULL, NULL); 237 | pthread_mutex_lock(&priv->mutex); 238 | 239 | // Check for error 240 | if (r != 0) { 241 | warnx("can't delete block %0*jx: %s", S3B_BLOCK_NUM_DIGITS, (uintmax_t)block_num, strerror(r)); 242 | continue; 243 | } 244 | 245 | // Update count and output a dot 246 | if ((++priv->count % BLOCKS_PER_DOT) == 0 && !priv->quiet) { 247 | fprintf(stderr, "."); 248 | fflush(stderr); 249 | } 250 | 251 | // Spin again 252 | continue; 253 | } 254 | 255 | // Are we done? 
256 | if (priv->stopping) 257 | break; 258 | 259 | // Wait for something to do 260 | pthread_cond_wait(&priv->thread_wakeup, &priv->mutex); 261 | } 262 | 263 | // Done 264 | CHECK_RETURN(pthread_mutex_unlock(&priv->mutex)); 265 | return NULL; 266 | } 267 | 268 | -------------------------------------------------------------------------------- /erase.h: -------------------------------------------------------------------------------- 1 | 2 | /* 3 | * s3backer - FUSE-based single file backing store via Amazon S3 4 | * 5 | * Copyright 2008-2023 Archie L. Cobbs 6 | * 7 | * This program is free software; you can redistribute it and/or 8 | * modify it under the terms of the GNU General Public License 9 | * as published by the Free Software Foundation; either version 2 10 | * of the License, or (at your option) any later version. 11 | * 12 | * This program is distributed in the hope that it will be useful, 13 | * but WITHOUT ANY WARRANTY; without even the implied warranty of 14 | * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the 15 | * GNU General Public License for more details. 16 | * 17 | * You should have received a copy of the GNU General Public License 18 | * along with this program; if not, write to the Free Software 19 | * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 20 | * 02110-1301, USA. 21 | * 22 | * In addition, as a special exception, the copyright holders give 23 | * permission to link the code of portions of this program with the 24 | * OpenSSL library under certain conditions as described in each 25 | * individual source file, and distribute linked combinations including 26 | * the two. 27 | * 28 | * You must obey the GNU General Public License in all respects for all 29 | * of the code used other than OpenSSL. If you modify file(s) with this 30 | * exception, you may extend this exception to your version of the 31 | * file(s), but you are not obligated to do so. 
If you do not wish to do 32 | * so, delete this exception statement from your version. If you delete 33 | * this exception statement from all source files in the program, then 34 | * also delete it here. 35 | */ 36 | 37 | // erase.c 38 | extern int s3backer_erase(struct s3b_config *config); 39 | 40 | -------------------------------------------------------------------------------- /examples/create_zpool.py: -------------------------------------------------------------------------------- 1 | #!/usr/bin/env python3 2 | # 3 | # This script creates an NBD-backed zpool. It needs to be run with root permissions. 4 | # 5 | # Some explanatory remarks on the procedure: 6 | # 7 | # Generally, the larger the s3backer block size, the better we utilize the network 8 | # bandwidth and the less we pay for HTTP requests. However, larger block sizes come at 9 | # the price of read/write amplification (when partial blocks have to be read or written). 10 | # 11 | # To eliminate all read/write amplification, the s3backer block size must match the 12 | # "modification unit" of the filesystem. Lowering it below this value has no further 13 | # advantages. 14 | # 15 | # For ZFS, the block size varies between 2^ashift and *recordsize*. We also want the gap 16 | # between these values to be reasonably large (otherwise compression efficiency will be 17 | # limited). However, as far as s3backer is concerned, this means that any read/write could 18 | # be for as little as 2^ashift. 19 | # 20 | # Unfortunately, with a 4 kB block size (ashift=12), we're dealing not just with a large 21 | # number of HTTP requests ($1.31 for PUT'ing 1 GB) but also with at least 10% request 22 | # overhead (assuming 512 bytes of HTTP metadata) and at least 20x "time overhead" (latency 23 | # of 30 ms is 20x higher than raw transmission time of 4 kB at 3 MB/s). 24 | # 25 | # In theory, we could compensate for the extra latency by using a large number of 26 | # parallel connections. 
In practice, this is difficult to accomplish because s3backer 27 | # splits incoming write requests and processes them sequentially. Reducing the size of NBD 28 | # requests to 4 kB would address this, but then we'd probably run into kernel limits on 29 | # the number of pending NBD requests. So instead we'd have to enable the s3backer on-disk 30 | # cache, which then incurs its own overhead. 31 | # 32 | # Furthermore, a large number of files will consist of full *recordsize* blocks, so 33 | # processing those in 4 kB units adds needless overhead in many cases. 34 | # 35 | # Therefore, we almost certainly do not want to set the s3backer block size as low 36 | # as 2^ashift. 37 | # 38 | # Luckily, for ZFS we can also do better than picking some intermediate value. We assemble 39 | # the zpool from a regular (disk) vdev (backed by one S3 bucket) and a *special* vdev 40 | # (backed by a second S3 bucket). The *special* vdev will be used for all metadata 41 | # as well as file blocks up to *special_small_blocks* in size. 42 | # 43 | # This means that the minimum modification unit for the regular vdev will be at least 44 | # *special_small_blocks*, and we can set a corresponding s3backer block size, while 45 | # smaller blocks will be written to the *special* vdev for which we can then use a smaller 46 | # s3backer block size. To maximize the benefit, we set *special_small_blocks* to 47 | # *recordsize-1* (128 kB-1), and the large s3 block size to a multiple of this (5x). 48 | # 49 | # For the small s3backer block size we use, somewhat arbitrarily, 32 kB. At this value, 50 | # the HTTP overhead is 1.5%, and the time to transfer the data (10 ms at 3 MB/s) is 51 | # well below the overall Starlink connection latency (~30 ms). Therefore, read 52 | # amplification is negligible and write amplification will incur only the round-trip 53 | # latency for the additional read(). 
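# The overhead figures quoted in the remarks above can be checked with a few lines of
# arithmetic. The following is a minimal sketch, not part of the original script. It
# assumes an S3 PUT price of $0.005 per 1,000 requests (an assumed figure that happens
# to reproduce the quoted $1.31), treats "1 GB" as 2^30 bytes and "kB" as 1024 bytes,
# and uses hypothetical names (block_overheads and the constants below).

```python
# Assumed inputs; only the 512 bytes, 3 MB/s and 30 ms values come from the remarks above.
PUT_PRICE_PER_1000_USD = 0.005    # assumption: S3 PUT request price
HTTP_METADATA_BYTES = 512         # per-request HTTP metadata, as stated in the remarks
LINK_BYTES_PER_SEC = 3_000_000    # 3 MB/s uplink, as stated in the remarks
LATENCY_SEC = 0.030               # ~30 ms connection latency, as stated in the remarks


def block_overheads(block_size_bytes):
    """Return cost and overhead estimates for one s3backer block size."""
    requests_per_gib = (1 << 30) // block_size_bytes
    transfer_sec = block_size_bytes / LINK_BYTES_PER_SEC
    return {
        # Cost of PUT'ing 1 GiB using blocks of this size
        'put_cost_usd': requests_per_gib / 1000 * PUT_PRICE_PER_1000_USD,
        # HTTP metadata relative to the payload size
        'metadata_ratio': HTTP_METADATA_BYTES / block_size_bytes,
        # Raw transmission time for one block
        'transfer_sec': transfer_sec,
        # How many times longer the latency is than the raw transmission time
        'latency_ratio': LATENCY_SEC / transfer_sec,
    }


if __name__ == '__main__':
    for kb in (4, 32, 512):
        o = block_overheads(kb * 1024)
        print(f'{kb:4d} kB: PUT cost/GiB ${o["put_cost_usd"]:.2f}, '
              f'metadata {o["metadata_ratio"]:.1%}, '
              f'transfer {o["transfer_sec"] * 1000:.1f} ms, '
              f'latency/transfer {o["latency_ratio"]:.1f}x')
```

# Under these assumptions, the 4 kB row corresponds to the "at least 10%" and "at least
# 20x" figures, and the 32 kB row to the 1.5% and 10 ms figures used to justify the small
# block size.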
54 | # 55 | 56 | KEYFILE = "/home/nikratio/lib/s3_zfs.key" 57 | BUCKET_NAME = 'nikratio-backup' 58 | BUCKET_REGION = 'eu-west-2' 59 | ZPOOL_NAME = 's3backup' 60 | DSET_NAME = 'vostro' 61 | 62 | S3_BLOCKSIZE_SMALL_KB=32 63 | S3_BLOCKSIZE_LARGE_KB=512 64 | ZFS_BLOCKSIZE_THRESHOLD_KB=128 65 | ZFS_RECORDSIZE_KB=128 66 | 67 | # Where to mount the zpool 68 | ZPOOL_DIR = '/zpools' 69 | 70 | 71 | import subprocess 72 | from contextlib import ExitStack 73 | import os.path 74 | import sys 75 | import time 76 | import shlex 77 | import tempfile 78 | 79 | if os.geteuid() != 0: 80 | print('This script must be run as root.', file=sys.stderr) 81 | sys.exit(3) 82 | 83 | def wait_for(pred, *args, timeout=5): 84 | waited = 0 85 | while waited < timeout: 86 | if pred(*args): 87 | return 88 | time.sleep(1) 89 | waited += 1 90 | raise TimeoutError() 91 | 92 | 93 | def run(cmdline, wait=True, **kwargs): 94 | print('Running', shlex.join(cmdline)) 95 | if wait: 96 | return subprocess.run(cmdline, **kwargs, check=True) 97 | else: 98 | return subprocess.Popen(cmdline, **kwargs) 99 | 100 | 101 | def find_unused_nbd(): 102 | if not os.path.exists('/sys/block/nbd0'): 103 | raise RuntimeError("Can't find NBDs - is the nbd module loaded?") 104 | 105 | for devno in range(20): 106 | if not os.path.exists(f'/sys/block/nbd{devno}/pid'): 107 | return devno 108 | 109 | raise RuntimeError("Can't find any available NBDs") 110 | 111 | 112 | with ExitStack() as exit_stack: 113 | 114 | tempdir = exit_stack.enter_context(tempfile.TemporaryDirectory()) 115 | 116 | nbdkits = {} 117 | def cleanup(): 118 | for proc in nbdkits.values(): 119 | try: 120 | proc.wait(5) 121 | except subprocess.TimeoutExpired: 122 | proc.terminate() 123 | exit_stack.callback(cleanup) 124 | 125 | sockets = {} 126 | for kind in ('sb', 'lb'): # "small blocks" and "large blocks" 127 | socket_name = f'{tempdir}/nbd_socket_{kind}' 128 | cmdline = [ 129 | 'nbdkit', '--unix', socket_name, '--foreground', 130 | '--filter=exitlast', 
'--threads', '16', 's3backer', 131 | f's3b_region={BUCKET_REGION}', 's3b_size=50G', 's3b_force=true', 132 | f'bucket={BUCKET_NAME}/{kind}'] 133 | 134 | if kind == 'lb': 135 | cmdline.append(f's3b_blockSize={S3_BLOCKSIZE_LARGE_KB}K') 136 | else: 137 | assert kind == 'sb' 138 | cmdline.append(f's3b_blockSize={S3_BLOCKSIZE_SMALL_KB}K') 139 | nbdkits[kind] = run(cmdline, wait=False) 140 | sockets[kind] = socket_name 141 | 142 | devices = {} 143 | for (kind, socket) in sockets.items(): 144 | wait_for(os.path.exists, socket) 145 | devno = find_unused_nbd() 146 | devname = f'/dev/nbd{devno}' 147 | run(['nbd-client', '-unix', socket, devname]) 148 | devices[kind] = devname 149 | exit_stack.callback(run, ['nbd-client', '-d', devname]) 150 | 151 | with open(f'/sys/block/nbd{devno}/queue/max_sectors_kb', 'w') as fh: 152 | if kind == 'lb': 153 | print(str(S3_BLOCKSIZE_LARGE_KB), file=fh) 154 | else: 155 | assert kind == 'sb' 156 | print(str(S3_BLOCKSIZE_SMALL_KB), file=fh) 157 | 158 | cmdline = ['zpool', 'create', '-R', ZPOOL_DIR ] 159 | for arg in ('ashift=9', 'autotrim=on', 'failmode=continue'): 160 | cmdline.append('-o') 161 | cmdline.append(arg) 162 | for arg in ('acltype=posixacl', 'relatime=on', 'xattr=sa', 'compression=zstd-19', 163 | 'dedup=on', 'sync=disabled', f'special_small_blocks={ZFS_BLOCKSIZE_THRESHOLD_KB*1024}', 164 | 'redundant_metadata=most', f'recordsize={ZFS_RECORDSIZE_KB*1024}', 165 | 'encryption=on', 'keyformat=passphrase', f'keylocation=file://{KEYFILE}'): 166 | cmdline.append('-O') 167 | cmdline.append(arg) 168 | cmdline += [ ZPOOL_NAME, devices['lb'], 'special', devices['sb'] ] 169 | run(cmdline) 170 | exit_stack.callback(run, ['zpool', 'export', ZPOOL_NAME]) 171 | 172 | run(['zfs', 'create', f'{ZPOOL_NAME}/{DSET_NAME}']) 173 | -------------------------------------------------------------------------------- /fuse_ops.h: -------------------------------------------------------------------------------- 1 | 2 | /* 3 | * s3backer - FUSE-based single file 
backing store via Amazon S3 4 | * 5 | * Copyright 2008-2023 Archie L. Cobbs 6 | * 7 | * This program is free software; you can redistribute it and/or 8 | * modify it under the terms of the GNU General Public License 9 | * as published by the Free Software Foundation; either version 2 10 | * of the License, or (at your option) any later version. 11 | * 12 | * This program is distributed in the hope that it will be useful, 13 | * but WITHOUT ANY WARRANTY; without even the implied warranty of 14 | * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the 15 | * GNU General Public License for more details. 16 | * 17 | * You should have received a copy of the GNU General Public License 18 | * along with this program; if not, write to the Free Software 19 | * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 20 | * 02110-1301, USA. 21 | * 22 | * In addition, as a special exception, the copyright holders give 23 | * permission to link the code of portions of this program with the 24 | * OpenSSL library under certain conditions as described in each 25 | * individual source file, and distribute linked combinations including 26 | * the two. 27 | * 28 | * You must obey the GNU General Public License in all respects for all 29 | * of the code used other than OpenSSL. If you modify file(s) with this 30 | * exception, you may extend this exception to your version of the 31 | * file(s), but you are not obligated to do so. If you do not wish to do 32 | * so, delete this exception statement from your version. If you delete 33 | * this exception statement from all source files in the program, then 34 | * also delete it here. 
35 | */ 36 | 37 | // Forward decl's 38 | struct s3b_config; 39 | struct s3backer_store; 40 | struct block_part; 41 | 42 | // Stats mirror state values 43 | #define STATS_MIRROR_INITIAL 0 44 | #define STATS_MIRROR_RUNNING 1 45 | #define STATS_MIRROR_SHUTDOWN 2 46 | 47 | // Function types 48 | typedef void printer_t(void *prarg, const char *fmt, ...) __attribute__ ((__format__ (__printf__, 2, 3))); 49 | typedef void print_stats_t(void *prarg, printer_t *printer); 50 | typedef void clear_stats_t(void); 51 | 52 | // Configuration info structure for fuse_ops 53 | struct fuse_ops_conf { 54 | struct s3b_config *s3bconf; 55 | print_stats_t *print_stats; 56 | clear_stats_t *clear_stats; 57 | int read_only; 58 | int direct_io; 59 | const char *filename; 60 | const char *stats_filename; 61 | const char *stats_mirror_path; 62 | u_int stats_mirror_interval; 63 | uid_t uid; 64 | gid_t gid; 65 | u_int block_size; 66 | s3b_block_t num_blocks; 67 | int file_mode; 68 | log_func_t *log; 69 | }; 70 | 71 | // Private information 72 | struct fuse_ops_private { 73 | struct s3backer_store *s3b; 74 | struct block_part *block_part; 75 | u_int block_bits; 76 | off_t file_size; 77 | time_t start_time; 78 | time_t file_atime; 79 | time_t file_mtime; 80 | time_t stats_atime; 81 | 82 | // Stats mirror 83 | pthread_t stats_mirror_thread; 84 | volatile int stats_mirror_state; 85 | }; 86 | 87 | // fuse_ops.c 88 | extern const struct fuse_operations *fuse_ops_create(struct fuse_ops_conf *config, struct s3backer_store *s3b); 89 | extern void fuse_ops_destroy(void); 90 | 91 | -------------------------------------------------------------------------------- /hash.c: -------------------------------------------------------------------------------- 1 | 2 | /* 3 | * s3backer - FUSE-based single file backing store via Amazon S3 4 | * 5 | * Copyright 2008-2023 Archie L. 
Cobbs 6 | * 7 | * This program is free software; you can redistribute it and/or 8 | * modify it under the terms of the GNU General Public License 9 | * as published by the Free Software Foundation; either version 2 10 | * of the License, or (at your option) any later version. 11 | * 12 | * This program is distributed in the hope that it will be useful, 13 | * but WITHOUT ANY WARRANTY; without even the implied warranty of 14 | * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the 15 | * GNU General Public License for more details. 16 | * 17 | * You should have received a copy of the GNU General Public License 18 | * along with this program; if not, write to the Free Software 19 | * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 20 | * 02110-1301, USA. 21 | * 22 | * In addition, as a special exception, the copyright holders give 23 | * permission to link the code of portions of this program with the 24 | * OpenSSL library under certain conditions as described in each 25 | * individual source file, and distribute linked combinations including 26 | * the two. 27 | * 28 | * You must obey the GNU General Public License in all respects for all 29 | * of the code used other than OpenSSL. If you modify file(s) with this 30 | * exception, you may extend this exception to your version of the 31 | * file(s), but you are not obligated to do so. If you do not wish to do 32 | * so, delete this exception statement from your version. If you delete 33 | * this exception statement from all source files in the program, then 34 | * also delete it here. 35 | */ 36 | 37 | /* 38 | * This is a simple closed hash table implementation with linear probing. 39 | * We pre-allocate the hash array based on the expected maximum size. 40 | */ 41 | 42 | #include "s3backer.h" 43 | #include "hash.h" 44 | 45 | // Definitions 46 | #define LOAD_FACTOR 0.666666 47 | #define FIRST(hash, key) (s3b_hash_index((hash), (key))) 48 | #define NEXT(hash, index) ((index) + 1 < (hash)->alen ? 
(index) + 1 : 0) 49 | #define EMPTY(value) ((value) == NULL) 50 | #define VALUE(hash, index) ((hash)->array[(index)]) 51 | #define KEY(value) (*(s3b_block_t *)(value)) 52 | 53 | // Hash table structure 54 | struct s3b_hash { 55 | u_int maxkeys; // max capacity 56 | u_int numkeys; // number of keys in table 57 | u_int alen; // hash array length 58 | void *array[0]; // hash array 59 | }; 60 | 61 | // Declarations 62 | static u_int s3b_hash_index(struct s3b_hash *hash, s3b_block_t key); 63 | 64 | // Public functions 65 | 66 | int 67 | s3b_hash_create(struct s3b_hash **hashp, u_int maxkeys) 68 | { 69 | struct s3b_hash *hash; 70 | u_int alen; 71 | 72 | if (maxkeys >= (u_int)(UINT_MAX * LOAD_FACTOR) - 1) 73 | return EINVAL; 74 | alen = (u_int)(maxkeys / LOAD_FACTOR) + 1; 75 | if ((hash = calloc(1, sizeof(*hash) + alen * sizeof(*hash->array))) == NULL) 76 | return ENOMEM; 77 | hash->maxkeys = maxkeys; 78 | hash->alen = alen; 79 | *hashp = hash; 80 | return 0; 81 | } 82 | 83 | void 84 | s3b_hash_destroy(struct s3b_hash *hash) 85 | { 86 | free(hash); 87 | } 88 | 89 | u_int 90 | s3b_hash_size(struct s3b_hash *hash) 91 | { 92 | return hash->numkeys; 93 | } 94 | 95 | void * 96 | s3b_hash_get(struct s3b_hash *hash, s3b_block_t key) 97 | { 98 | u_int i; 99 | 100 | for (i = FIRST(hash, key); 1; i = NEXT(hash, i)) { 101 | void *const value = VALUE(hash, i); 102 | 103 | if (EMPTY(value)) 104 | return NULL; 105 | if (KEY(value) == key) 106 | return value; 107 | } 108 | } 109 | 110 | /* 111 | * Add/replace entry. 112 | * 113 | * Note that the value being replaced (if any) is referenced by this function, 114 | * so it should not be free'd until after this function returns. 
115 | */ 116 | void * 117 | s3b_hash_put(struct s3b_hash *hash, void *value) 118 | { 119 | const s3b_block_t key = KEY(value); 120 | u_int i; 121 | 122 | for (i = FIRST(hash, key); 1; i = NEXT(hash, i)) { 123 | void *const value2 = VALUE(hash, i); 124 | 125 | if (EMPTY(value2)) 126 | break; 127 | if (KEY(value2) == key) { 128 | VALUE(hash, i) = value; // replace existing value having the same key with new value 129 | return value2; 130 | } 131 | } 132 | assert(hash->numkeys < hash->maxkeys); 133 | VALUE(hash, i) = value; 134 | hash->numkeys++; 135 | return NULL; 136 | } 137 | 138 | /* 139 | * Optimization of s3b_hash_put() for when it is known that no matching entry exists. 140 | */ 141 | void 142 | s3b_hash_put_new(struct s3b_hash *hash, void *value) 143 | { 144 | const s3b_block_t key = KEY(value); 145 | u_int i; 146 | 147 | for (i = FIRST(hash, key); 1; i = NEXT(hash, i)) { 148 | void *const value2 = VALUE(hash, i); 149 | 150 | if (EMPTY(value2)) 151 | break; 152 | assert(KEY(value2) != key); 153 | } 154 | assert(hash->numkeys < hash->maxkeys); 155 | VALUE(hash, i) = value; 156 | hash->numkeys++; 157 | } 158 | 159 | void 160 | s3b_hash_remove(struct s3b_hash *hash, s3b_block_t key) 161 | { 162 | u_int i; 163 | u_int j; 164 | u_int k; 165 | 166 | // Find entry 167 | for (i = FIRST(hash, key); 1; i = NEXT(hash, i)) { 168 | void *const value = VALUE(hash, i); 169 | 170 | if (EMPTY(value)) // no such entry 171 | return; 172 | if (KEY(value) == key) // entry found 173 | break; 174 | } 175 | 176 | // Repair subsequent entries as necessary 177 | for (j = NEXT(hash, i); 1; j = NEXT(hash, j)) { 178 | void *const value = VALUE(hash, j); 179 | 180 | if (value == NULL) 181 | break; 182 | k = FIRST(hash, KEY(value)); 183 | if (j > i ? 
(k <= i || k > j) : (k <= i && k > j)) { 184 | VALUE(hash, i) = value; 185 | i = j; 186 | } 187 | } 188 | 189 | // Remove entry 190 | assert(VALUE(hash, i) != NULL); 191 | VALUE(hash, i) = NULL; 192 | hash->numkeys--; 193 | } 194 | 195 | int 196 | s3b_hash_foreach(struct s3b_hash *hash, s3b_hash_visit_t *visitor, void *arg) 197 | { 198 | u_int i; 199 | 200 | for (i = 0; i < hash->alen; i++) { 201 | void *const value = VALUE(hash, i); 202 | int r; 203 | 204 | if (value != NULL && (r = (*visitor)(arg, value)) != 0) 205 | return r; 206 | } 207 | return 0; 208 | } 209 | 210 | /* 211 | * Jenkins one-at-a-time hash 212 | */ 213 | static u_int 214 | s3b_hash_index(struct s3b_hash *hash, s3b_block_t key) 215 | { 216 | u_int value = 0; 217 | int i; 218 | 219 | for (i = 0; i < sizeof(key); i++) { 220 | value += ((u_char *)&key)[i]; 221 | value += (value << 10); 222 | value ^= (value >> 6); 223 | } 224 | value += (value << 3); 225 | value ^= (value >> 11); 226 | value += (value << 15); 227 | return value % hash->alen; 228 | } 229 | 230 | -------------------------------------------------------------------------------- /hash.h: -------------------------------------------------------------------------------- 1 | 2 | /* 3 | * s3backer - FUSE-based single file backing store via Amazon S3 4 | * 5 | * Copyright 2008-2023 Archie L. Cobbs 6 | * 7 | * This program is free software; you can redistribute it and/or 8 | * modify it under the terms of the GNU General Public License 9 | * as published by the Free Software Foundation; either version 2 10 | * of the License, or (at your option) any later version. 11 | * 12 | * This program is distributed in the hope that it will be useful, 13 | * but WITHOUT ANY WARRANTY; without even the implied warranty of 14 | * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the 15 | * GNU General Public License for more details. 
16 | * 17 | * You should have received a copy of the GNU General Public License 18 | * along with this program; if not, write to the Free Software 19 | * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 20 | * 02110-1301, USA. 21 | * 22 | * In addition, as a special exception, the copyright holders give 23 | * permission to link the code of portions of this program with the 24 | * OpenSSL library under certain conditions as described in each 25 | * individual source file, and distribute linked combinations including 26 | * the two. 27 | * 28 | * You must obey the GNU General Public License in all respects for all 29 | * of the code used other than OpenSSL. If you modify file(s) with this 30 | * exception, you may extend this exception to your version of the 31 | * file(s), but you are not obligated to do so. If you do not wish to do 32 | * so, delete this exception statement from your version. If you delete 33 | * this exception statement from all source files in the program, then 34 | * also delete it here. 35 | */ 36 | 37 | /* 38 | * Our hash table implementation. 39 | * 40 | * We make the following simplifying assumptions: 41 | * 42 | * 1. Keys are of type s3b_block_t 43 | * 2. Values are structures in which the first field is the key 44 | * 3. 
No attempts will be made to overload the table 45 | */ 46 | 47 | // Definitions 48 | typedef int s3b_hash_visit_t(void *arg, void *value); 49 | 50 | // Declarations 51 | struct s3b_hash; 52 | 53 | // hash.c 54 | extern int s3b_hash_create(struct s3b_hash **hashp, u_int maxkeys); 55 | extern void s3b_hash_destroy(struct s3b_hash *hash); 56 | extern u_int s3b_hash_size(struct s3b_hash *hash); 57 | extern void *s3b_hash_get(struct s3b_hash *hash, s3b_block_t key); 58 | extern void *s3b_hash_put(struct s3b_hash *hash, void *value); 59 | extern void s3b_hash_put_new(struct s3b_hash *hash, void *value); 60 | extern void s3b_hash_remove(struct s3b_hash *hash, s3b_block_t key); 61 | extern int s3b_hash_foreach(struct s3b_hash *hash, s3b_hash_visit_t *visitor, void *arg); 62 | 63 | -------------------------------------------------------------------------------- /http_io.h: -------------------------------------------------------------------------------- 1 | 2 | /* 3 | * s3backer - FUSE-based single file backing store via Amazon S3 4 | * 5 | * Copyright 2008-2023 Archie L. Cobbs 6 | * 7 | * This program is free software; you can redistribute it and/or 8 | * modify it under the terms of the GNU General Public License 9 | * as published by the Free Software Foundation; either version 2 10 | * of the License, or (at your option) any later version. 11 | * 12 | * This program is distributed in the hope that it will be useful, 13 | * but WITHOUT ANY WARRANTY; without even the implied warranty of 14 | * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the 15 | * GNU General Public License for more details. 16 | * 17 | * You should have received a copy of the GNU General Public License 18 | * along with this program; if not, write to the Free Software 19 | * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 20 | * 02110-1301, USA. 
21 | * 22 | * In addition, as a special exception, the copyright holders give 23 | * permission to link the code of portions of this program with the 24 | * OpenSSL library under certain conditions as described in each 25 | * individual source file, and distribute linked combinations including 26 | * the two. 27 | * 28 | * You must obey the GNU General Public License in all respects for all 29 | * of the code used other than OpenSSL. If you modify file(s) with this 30 | * exception, you may extend this exception to your version of the 31 | * file(s), but you are not obligated to do so. If you do not wish to do 32 | * so, delete this exception statement from your version. If you delete 33 | * this exception statement from all source files in the program, then 34 | * also delete it here. 35 | */ 36 | 37 | // Upload/download indexes 38 | #define HTTP_DOWNLOAD 0 39 | #define HTTP_UPLOAD 1 40 | 41 | // Authentication types 42 | #define AUTH_VERSION_AWS2 "aws2" 43 | #define AUTH_VERSION_AWS4 "aws4" 44 | 45 | // Storage classes 46 | #define STORAGE_CLASS_STANDARD "STANDARD" 47 | #define STORAGE_CLASS_STANDARD_IA "STANDARD_IA" 48 | #define STORAGE_CLASS_ONEZONE_IA "ONEZONE_IA" 49 | #define STORAGE_CLASS_REDUCED_REDUNDANCY "REDUCED_REDUNDANCY" 50 | #define STORAGE_CLASS_INTELLIGENT_TIERING "INTELLIGENT_TIERING" 51 | #define STORAGE_CLASS_GLACIER "GLACIER" 52 | #define STORAGE_CLASS_DEEP_ARCHIVE "DEEP_ARCHIVE" 53 | #define STORAGE_CLASS_OUTPOSTS "OUTPOSTS" 54 | 55 | // Server side encryption types 56 | #define SSE_AES256 "AES256" 57 | #define SSE_AWS_KMS "aws:kms" 58 | 59 | // Configuration info structure for http_io store 60 | struct http_io_conf { 61 | char *accessId; 62 | char *accessKey; 63 | char *iam_token; 64 | const char *accessType; 65 | const char *ec2iam_role; 66 | int ec2iam_imdsv2; // use IMDSv2 instead of IMDSv1 67 | const char *storage_class; 68 | const char *authVersion; 69 | const char *baseURL; 70 | const char *vhostURL; // "baseURL" in --vhost format 71 | 
const char *region; 72 | const char *bucket; 73 | const char *prefix; 74 | const char *user_agent; 75 | const char *cacert; 76 | const char *password; 77 | const char *encryption; 78 | const char *default_ce; 79 | u_int key_length; 80 | int debug; 81 | int debug_http; 82 | int http_11; // restrict to HTTP 1.1 83 | int quiet; 84 | int no_curl_cache; // don't cache cURL handles 85 | const struct comp_alg *compress_alg; // compression algorithm, or NULL for none 86 | void *compress_level; // compression level info 87 | int vhost; // use virtual host style URL 88 | bitmap_t *nonzero_bitmap; // is set to NULL by http_io_create() 89 | int blockHashPrefix; 90 | int insecure; 91 | u_int block_size; 92 | s3b_block_t num_blocks; 93 | int list_blocks_threads; 94 | u_int timeout; 95 | u_int initial_retry_pause; 96 | u_int max_retry_pause; 97 | uintmax_t max_speed[2]; 98 | log_func_t *log; 99 | const char *sse; 100 | const char *sse_key_id; 101 | }; 102 | 103 | // Statistics structure for http_io store 104 | struct http_io_evst { 105 | u_int count; // number of occurrences 106 | double time; // total time taken 107 | }; 108 | 109 | struct http_io_stats { 110 | 111 | // Block stats 112 | u_int normal_blocks_read; 113 | u_int normal_blocks_written; 114 | u_int zero_blocks_read; 115 | u_int zero_blocks_written; 116 | u_int empty_blocks_read; // only when nonzero_bitmap != NULL 117 | u_int empty_blocks_written; // only when nonzero_bitmap != NULL 118 | 119 | // HTTP transfer stats 120 | struct http_io_evst http_heads; // total successful 121 | struct http_io_evst http_gets; // total successful 122 | struct http_io_evst http_puts; // total successful 123 | struct http_io_evst http_deletes; // total successful 124 | u_int http_unauthorized; 125 | u_int http_forbidden; 126 | u_int http_stale; 127 | u_int http_redirect; 128 | u_int http_verified; 129 | u_int http_mismatch; 130 | u_int http_5xx_error; 131 | u_int http_4xx_error; 132 | u_int http_3xx_error; 133 | u_int http_other_error; 
134 | u_int http_canceled_writes; 135 | 136 | // CURL stats 137 | u_int curl_handles_created; 138 | u_int curl_handles_reused; 139 | u_int curl_timeouts; 140 | u_int curl_connect_failed; 141 | u_int curl_host_unknown; 142 | u_int curl_out_of_memory; 143 | u_int curl_other_error; 144 | 145 | // Retry stats 146 | u_int num_retries; 147 | uint64_t retry_delay; 148 | 149 | // Misc 150 | u_int out_of_memory_errors; 151 | }; 152 | 153 | // http_io.c 154 | extern struct s3backer_store *http_io_create(struct http_io_conf *config); 155 | extern void http_io_get_stats(struct s3backer_store *s3b, struct http_io_stats *stats); 156 | extern void http_io_clear_stats(struct s3backer_store *s3b); 157 | extern int http_io_parse_block(const char *prefix, s3b_block_t num_blocks, 158 | int blockHashPrefix, const char *name, s3b_block_t *hash_valuep, s3b_block_t *block_nump); 159 | extern void http_io_format_block_hash(int blockHashPrefix, char *block_hash_buf, size_t bufsiz, s3b_block_t block_num); 160 | extern int http_io_list_blocks(struct s3backer_store *s3b, block_list_func_t *callback, void *arg); 161 | 162 | -------------------------------------------------------------------------------- /main.c: -------------------------------------------------------------------------------- 1 | 2 | /* 3 | * s3backer - FUSE-based single file backing store via Amazon S3 4 | * 5 | * Copyright 2008-2023 Archie L. Cobbs 6 | * 7 | * This program is free software; you can redistribute it and/or 8 | * modify it under the terms of the GNU General Public License 9 | * as published by the Free Software Foundation; either version 2 10 | * of the License, or (at your option) any later version. 11 | * 12 | * This program is distributed in the hope that it will be useful, 13 | * but WITHOUT ANY WARRANTY; without even the implied warranty of 14 | * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the 15 | * GNU General Public License for more details. 
16 | * 17 | * You should have received a copy of the GNU General Public License 18 | * along with this program; if not, write to the Free Software 19 | * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 20 | * 02110-1301, USA. 21 | * 22 | * In addition, as a special exception, the copyright holders give 23 | * permission to link the code of portions of this program with the 24 | * OpenSSL library under certain conditions as described in each 25 | * individual source file, and distribute linked combinations including 26 | * the two. 27 | * 28 | * You must obey the GNU General Public License in all respects for all 29 | * of the code used other than OpenSSL. If you modify file(s) with this 30 | * exception, you may extend this exception to your version of the 31 | * file(s), but you are not obligated to do so. If you do not wish to do 32 | * so, delete this exception statement from your version. If you delete 33 | * this exception statement from all source files in the program, then 34 | * also delete it here. 
35 | */ 36 | 37 | #include "s3backer.h" 38 | #include "block_cache.h" 39 | #include "ec_protect.h" 40 | #include "zero_cache.h" 41 | #include "fuse_ops.h" 42 | #include "http_io.h" 43 | #include "test_io.h" 44 | #include "s3b_config.h" 45 | #include "erase.h" 46 | #include "reset.h" 47 | #include "util.h" 48 | #include "nbdkit.h" 49 | 50 | #if NBDKIT 51 | 52 | // Some definitions 53 | #define NBD_CLIENT_BLOCK_SIZE 4096 54 | #define NBDKIT_STARTUP_WAIT_PAUSE (long)50 55 | #define MAX_NBDKIT_STARTUP_WAIT_MILLIS (long)5000 56 | #define NBD_MODULE_NAME "nbd" 57 | 58 | // Internal state 59 | static const int forward_signals[] = { SIGHUP, SIGINT, SIGQUIT, SIGTERM }; 60 | static const int num_forward_signals = sizeof(forward_signals) / sizeof(*forward_signals); 61 | 62 | // Internal functions 63 | static int trampoline_to_nbd(int argc, char **argv); 64 | static void handle_signal(int signal); 65 | static void try_to_load_nbd_module(void); 66 | #endif 67 | 68 | // Global pointer to config 69 | static struct s3b_config *config; 70 | 71 | int 72 | main(int argc, char **argv) 73 | { 74 | const struct fuse_operations *fuse_ops; 75 | struct s3backer_store *s3b; 76 | int nbd = 0; 77 | int i; 78 | int r; 79 | 80 | // Look for "--nbd" flag 81 | for (i = 1; i < argc; i++) { 82 | const char *param = argv[i]; 83 | if (*param != '-' || strcmp(param, "--") == 0) 84 | break; 85 | if (strcmp(param, "--nbd") == 0) { 86 | nbd = 1; 87 | break; 88 | } 89 | } 90 | 91 | // Handle `--nbd' flag 92 | if (nbd) { 93 | #if NBDKIT 94 | if ((r = trampoline_to_nbd(argc, argv)) == 2) { 95 | usage(); 96 | r = 1; 97 | } 98 | return r; 99 | #else 100 | errx(1, "invalid flag \"--nbd\": %s was not built with NBD support", PACKAGE); 101 | #endif 102 | } 103 | 104 | // Get configuration 105 | if ((config = s3backer_get_config(argc, argv, 0, 0)) == NULL) 106 | return 1; 107 | if (config->nbd) 108 | errx(1, "the \"--nbd\" flag is not supported in config files (must be on the command line)"); 109 | 110 | // 
Handle `--erase' flag 111 | if (config->erase) { 112 | if (s3backer_erase(config) != 0) 113 | return 1; 114 | return 0; 115 | } 116 | 117 | // Handle `--reset' flag 118 | if (config->reset) { 119 | if (s3backer_reset(config) != 0) 120 | return 1; 121 | return 0; 122 | } 123 | 124 | // Create backing store 125 | if ((s3b = s3backer_create_store(config)) == NULL) 126 | err(1, "error creating s3backer_store"); 127 | 128 | // Start logging to syslog now 129 | if (!config->foreground) 130 | set_config_log(config, syslog_logger); 131 | 132 | // Setup FUSE operation hooks 133 | if ((fuse_ops = fuse_ops_create(&config->fuse_ops, s3b)) == NULL) { 134 | (*s3b->shutdown)(s3b); 135 | (*s3b->destroy)(s3b); 136 | return 1; 137 | } 138 | 139 | // Start 140 | (*config->log)(LOG_INFO, "s3backer process %lu for %s started", (u_long)getpid(), config->mount); 141 | if ((r = fuse_main(config->fuse_args.argc, config->fuse_args.argv, fuse_ops, NULL)) != 0) 142 | (*config->log)(LOG_ERR, "error %s FUSE: fuse_main() returned %d", r < 7 ? 
"starting" : "running", r); 143 | 144 | // Done 145 | fuse_ops_destroy(); 146 | return r; 147 | } 148 | 149 | #if NBDKIT 150 | static int 151 | trampoline_to_nbd(int argc, char **argv) 152 | { 153 | struct string_array command_line; 154 | struct string_array nbd_flags; 155 | struct string_array nbd_params; 156 | struct child_proc exit_proc; 157 | struct sigaction act; 158 | struct timespec pause; 159 | const char *bucket_param; 160 | const char *device_param; 161 | int skip_client_cleanup; 162 | char *unix_socket; 163 | long elapsed_millis; 164 | int file_created; 165 | pid_t server_pid; 166 | pid_t client_pid; 167 | pid_t exit_pid; 168 | struct stat sb; 169 | int i; 170 | 171 | // Initialize 172 | memset(&command_line, 0, sizeof(command_line)); 173 | memset(&nbd_flags, 0, sizeof(nbd_flags)); 174 | memset(&nbd_params, 0, sizeof(nbd_params)); 175 | 176 | // Find and extract any "--nbd", "--nbd-flag", and "--nbd-param" flags 177 | for (i = 1; i < argc; i++) { 178 | struct string_array *nbd_list; 179 | char *flag = argv[i]; 180 | char *value; 181 | if (*flag != '-') 182 | break; 183 | if (strcmp(flag, "--") == 0) { 184 | i++; 185 | break; 186 | } 187 | if (strncmp(flag, "--nbd", 5) != 0) 188 | continue; 189 | memmove(argv + i, argv + i + 1, (--argc - i) * sizeof(*argv)); // squish it 190 | i--; 191 | if (strcmp(flag, "--nbd") == 0) // the "--nbd" flag that got us here 192 | continue; 193 | if ((value = strchr(flag, '=')) == NULL) { 194 | warnx("invalid flag \"%s\"", flag); 195 | return 2; 196 | } 197 | *value++ = '\0'; 198 | if (strcmp(flag, "--nbd-flag") == 0) 199 | nbd_list = &nbd_flags; 200 | else if (strcmp(flag, "--nbd-param") == 0) 201 | nbd_list = &nbd_params; 202 | else { 203 | warnx("invalid flag \"%s\"", flag); 204 | return 2; 205 | } 206 | if (add_string(nbd_list, "%s", value) == -1) 207 | err(1, "add_string"); 208 | } 209 | 210 | // There should be two remaining parameters 211 | switch (argc - i) { 212 | case 2: 213 | bucket_param = argv[i]; 214 | 
device_param = argv[i + 1]; 215 | break; 216 | default: 217 | return 2; 218 | } 219 | 220 | // Get configuration (parse only) 221 | if ((config = s3backer_get_config(argc, argv, 1, 1)) == NULL) 222 | return 1; 223 | 224 | // Auto-load the nbd kernel module if needed 225 | if (stat(device_param, &sb) == -1 && errno == ENOENT) 226 | try_to_load_nbd_module(); 227 | 228 | // Get info about /dev/nbdX block device 229 | if (stat(device_param, &sb) == -1) { 230 | if (errno == EPERM || errno == EACCES) 231 | errx(1, "must be run as root when the \"--nbd\" flag is used"); 232 | err(1, "%s", device_param); 233 | } 234 | 235 | // Determine the UNIX socket file uniquely corresponding to the block device 236 | if (asprintf(&unix_socket, "%s/%0*jx_%0*jx", S3B_NBD_DIR, 237 | (int)(sizeof(dev_t) * 2), (uintmax_t)sb.st_dev, (int)(sizeof(ino_t) * 2), (uintmax_t)sb.st_ino) == -1) 238 | err(1, "asprintf"); 239 | 240 | // (Re)create UNIX socket directory if needed 241 | (void)mkdir(S3B_NBD_DIR, 0700); 242 | 243 | // Delete leftover UNIX socket file from last time, if any 244 | (void)unlink(unix_socket); 245 | 246 | // Verify we have sufficient privileges 247 | if (stat(unix_socket, &sb) == -1 && errno != ENOENT) { 248 | if (errno == EPERM || errno == EACCES) 249 | errx(1, "must be run as root when the \"--nbd\" flag is used"); 250 | err(1, "%s", unix_socket); 251 | } 252 | 253 | // Initialize nbdkit(1) command line 254 | if (add_string(&command_line, "%s", NBDKIT_EXECUTABLE) == -1 255 | || (config->debug && add_string(&command_line, "--verbose") == -1) 256 | || (config->foreground && add_string(&command_line, "--foreground") == -1) 257 | || (config->fuse_ops.read_only && add_string(&command_line, "--read-only") == -1) 258 | || add_string(&command_line, "--filter=exitlast") == -1 // exit when nbd-client disconnects 259 | || add_string(&command_line, "--unix") == -1 260 | || add_string(&command_line, "%s", unix_socket) == -1) 261 | err(1, "add_string"); 262 | 263 | // Add any custom 
"--nbd-flag" flags 264 | for (i = 0; i < nbd_flags.num_strings; i++) { 265 | if (add_string(&command_line, "%s", nbd_flags.strings[i]) == -1) 266 | err(1, "add_string"); 267 | } 268 | free_strings(&nbd_flags); 269 | 270 | // Add plugin name 271 | if (add_string(&command_line, "%s", PACKAGE) == -1) 272 | err(1, "add_string"); 273 | 274 | // Add s3backer plugin parameters, converting "--foo bar" to "s3b_foo=bar" and "--foo" to "s3b_foo=true" 275 | for (i = 1; i < argc; i++) { 276 | char *param = argv[i]; 277 | char *value; 278 | 279 | // Detect when we've seen the last flag 280 | if (*param != '-' || strcmp(param, "--") == 0) 281 | break; 282 | 283 | // Skip flags we've already handled 284 | if (strcmp(param, "-f") == 0 || strcmp(param, "-d") == 0) 285 | continue; 286 | 287 | // Only accept --doubleDashFlags from here on out 288 | if (param[1] != '-') { 289 | warnx("invalid flag \"%s\"", param); 290 | return 2; 291 | } 292 | param += 2; 293 | 294 | // Get flag name and value (if any) 295 | if ((value = strchr(param, '=')) != NULL) 296 | *value++ = '\0'; 297 | switch (is_valid_s3b_flag(param)) { 298 | case 1: 299 | if (value != NULL && strcasecmp(value, "true") != 0) { 300 | warnx("boolean flag \"--%s\" value must be \"true\"", param); 301 | return 2; 302 | } 303 | break; 304 | case 2: 305 | if (value == NULL) { 306 | warnx("flag \"--%s\" requires a value", param); 307 | return 2; 308 | } 309 | break; 310 | case 3: // flag works either with or without a value 311 | break; 312 | default: 313 | warnx("invalid flag \"--%s\"", param); 314 | return 2; 315 | } 316 | 317 | // Add corresponding nbdkit parameter 318 | if (add_string(&command_line, "%s%s=%s", NBD_S3B_PARAM_PREFIX, param, value != NULL ? 
value : "true") == -1) 319 | err(1, "add_string"); 320 | } 321 | 322 | // Add bucket[/subdir] param 323 | if (add_string(&command_line, "%s=%s", NBD_BUCKET_PARAMETER_NAME, bucket_param) == -1) 324 | err(1, "add_string"); 325 | 326 | // Add any custom "--nbd-param" params 327 | for (i = 0; i < nbd_params.num_strings; i++) { 328 | if (add_string(&command_line, "%s", nbd_params.strings[i]) == -1) 329 | err(1, "add_string"); 330 | } 331 | free_strings(&nbd_params); 332 | 333 | // Fire up nbdkit 334 | server_pid = start_child_process(config, NBDKIT_EXECUTABLE, &command_line); 335 | free_strings(&command_line); 336 | 337 | // If we're not running in the foreground, nbdkit is going to fork off so go ahead and wait for it to exit 338 | if (!config->foreground) { 339 | 340 | // Wait for exit 341 | if ((exit_pid = wait_for_child_to_exit(config, &exit_proc, 0, 0)) != server_pid) { 342 | if (exit_pid == (pid_t)-1) 343 | err(1, "got signal during setup"); 344 | err(1, "wait() returned %d", (int)exit_pid); 345 | } 346 | 347 | // Verify normal exit 348 | if (!WIFEXITED(exit_proc.wstatus) || WEXITSTATUS(exit_proc.wstatus) != 0) 349 | exit(1); 350 | } 351 | 352 | // If we're not running in the foreground, spit out a message and daemonize 353 | if (!config->foreground) { 354 | warnx("connecting %s to %s and daemonizing", bucket_param, device_param); 355 | if (daemon(0, 0) == -1) 356 | err(1, "daemon"); 357 | set_config_log(config, syslog_logger); 358 | daemonized = 1; 359 | if (config->debug) 360 | daemon_debug(config, "successfully daemonized as process %d", (int)getpid()); 361 | } 362 | 363 | // Wait for socket file to come into existence 364 | file_created = 0; 365 | for (elapsed_millis = 0; elapsed_millis <= MAX_NBDKIT_STARTUP_WAIT_MILLIS; elapsed_millis += NBDKIT_STARTUP_WAIT_PAUSE) { 366 | if (stat(unix_socket, &sb) == 0) { 367 | file_created = 1; 368 | break; 369 | } 370 | if (errno != ENOENT) 371 | daemon_err(config, 1, "%s", unix_socket); 372 | pause.tv_sec = 0; 373 | 
pause.tv_nsec = NBDKIT_STARTUP_WAIT_PAUSE * (long)1000000; 374 | (void)nanosleep(&pause, NULL); 375 | } 376 | if (!file_created) 377 | daemon_errx(config, 1, "%s failed to start within %lums", NBDKIT_EXECUTABLE, MAX_NBDKIT_STARTUP_WAIT_MILLIS); 378 | 379 | // Build nbd-client command line 380 | if (add_string(&command_line, "%s", NBD_CLIENT_EXECUTABLE) == -1 381 | || add_string(&command_line, "-unix") == -1 382 | || add_string(&command_line, "%s", unix_socket) == -1 383 | || add_string(&command_line, "-block-size") == -1 384 | || add_string(&command_line, "%u", NBD_CLIENT_BLOCK_SIZE) == -1 385 | || add_string(&command_line, "-nofork") == -1 386 | || (config->fuse_ops.read_only && add_string(&command_line, "-readonly") == -1) 387 | || add_string(&command_line, "%s", device_param) == -1) 388 | daemon_err(config, 1, "add_string"); 389 | 390 | // Fire up nbd-client 391 | client_pid = start_child_process(config, NBD_CLIENT_EXECUTABLE, &command_line); 392 | free_strings(&command_line); 393 | 394 | // Setup so if we get a death signal, we terminate our child processes (via SIGTERM) 395 | memset(&act, 0, sizeof(act)); 396 | act.sa_handler = &handle_signal; 397 | for (i = 0; i < num_forward_signals; i++) { 398 | if (sigaction(forward_signals[i], &act, NULL) == -1) 399 | daemon_err(config, 1, "sigaction"); 400 | } 401 | 402 | // Wait for the first child process to exit or a signal to be received, but ignore exit of nbd-client 403 | skip_client_cleanup = 0; 404 | while (1) { 405 | int abnormal_exit; 406 | 407 | // Wait for next child to exit or signal 408 | exit_pid = wait_for_child_to_exit(config, &exit_proc, !config->foreground, 0); 409 | 410 | // If we get a signal, or no more child processes left (foreground mode only), then we're done 411 | if (exit_pid == (pid_t)0 || exit_pid == (pid_t)-1) 412 | break; 413 | 414 | // Did the process exit abnormally? 
415 | abnormal_exit = !WIFEXITED(exit_proc.wstatus) || WEXITSTATUS(exit_proc.wstatus) != 0; 416 | 417 | // We are expecting nbd-client to exit immediately; but if it had an error, skip the corresponding cleanup 418 | if (exit_pid == client_pid) { 419 | client_pid = (pid_t)-2; // don't match pid again 420 | if (abnormal_exit) 421 | skip_client_cleanup = 1; 422 | } 423 | 424 | // If process exited abnormally, bail out 425 | if (abnormal_exit) 426 | break; 427 | } 428 | 429 | // Logging 430 | daemon_debug(config, "shutting down %s NBD server", PACKAGE); 431 | 432 | // Run "nbd-client -d" to help clean up 433 | if (!skip_client_cleanup) { 434 | if (add_string(&command_line, "%s", NBD_CLIENT_EXECUTABLE) == -1 435 | || add_string(&command_line, "-d") == -1 436 | || add_string(&command_line, "%s", device_param) == -1) 437 | daemon_err(config, 1, "add_string"); 438 | client_pid = start_child_process(config, NBD_CLIENT_EXECUTABLE, &command_line); 439 | free_strings(&command_line); 440 | } 441 | 442 | // Kill all other child processes 443 | kill_remaining_children(config, client_pid, SIGTERM); 444 | 445 | // Wait for all processes to exit 446 | while (1) { 447 | if (wait_for_child_to_exit(config, NULL, 0, SIGTERM) == 0) 448 | break; 449 | } 450 | 451 | // Delete UNIX socket file 452 | (void)unlink(unix_socket); 453 | 454 | // Done 455 | return 0; 456 | } 457 | 458 | // Somebody killed us, so we need to kill our child processes as well. 
459 | static void 460 | handle_signal(int signal) 461 | { 462 | if (config->debug) 463 | daemon_debug(config, "got signal %d", signal); 464 | } 465 | 466 | // Run "modprobe nbd" 467 | static void 468 | try_to_load_nbd_module() 469 | { 470 | struct string_array modprobe_params; 471 | struct child_proc exit_proc; 472 | const char *modprobe_name; 473 | const char *slash; 474 | pid_t modprobe_pid; 475 | pid_t exit_pid; 476 | 477 | // See if modprobe(8) was found 478 | if (*MODPROBE_EXECUTABLE == '\0') 479 | return; 480 | 481 | // Get executable base name 482 | slash = strrchr(MODPROBE_EXECUTABLE, '/'); 483 | modprobe_name = slash != NULL ? slash + 1 : MODPROBE_EXECUTABLE; 484 | 485 | // Set up process invocation 486 | memset(&modprobe_params, 0, sizeof(modprobe_params)); 487 | if (add_string(&modprobe_params, "%s", modprobe_name) == -1 488 | || add_string(&modprobe_params, "%s", NBD_MODULE_NAME) == -1) 489 | err(1, "add_string"); 490 | 491 | // Execute process and wait for it to finish 492 | modprobe_pid = start_child_process(config, MODPROBE_EXECUTABLE, &modprobe_params); 493 | free_strings(&modprobe_params); 494 | if ((exit_pid = wait_for_child_to_exit(config, &exit_proc, 0, 0)) != modprobe_pid) { 495 | if (exit_pid == (pid_t)-1) 496 | err(1, "got signal waiting for modprobe(8)"); 497 | err(1, "wait() returned %d", (int)exit_pid); 498 | } 499 | 500 | // Verify normal exit 501 | if (!WIFEXITED(exit_proc.wstatus) || WEXITSTATUS(exit_proc.wstatus) != 0) 502 | exit(1); 503 | } 504 | #endif 505 | -------------------------------------------------------------------------------- /nbdkit.h: -------------------------------------------------------------------------------- 1 | 2 | /* 3 | * s3backer - FUSE-based single file backing store via Amazon S3 4 | * 5 | * Copyright 2008-2023 Archie L. 
Cobbs 6 | * 7 | * This program is free software; you can redistribute it and/or 8 | * modify it under the terms of the GNU General Public License 9 | * as published by the Free Software Foundation; either version 2 10 | * of the License, or (at your option) any later version. 11 | * 12 | * This program is distributed in the hope that it will be useful, 13 | * but WITHOUT ANY WARRANTY; without even the implied warranty of 14 | * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the 15 | * GNU General Public License for more details. 16 | * 17 | * You should have received a copy of the GNU General Public License 18 | * along with this program; if not, write to the Free Software 19 | * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 20 | * 02110-1301, USA. 21 | * 22 | * In addition, as a special exception, the copyright holders give 23 | * permission to link the code of portions of this program with the 24 | * OpenSSL library under certain conditions as described in each 25 | * individual source file, and distribute linked combinations including 26 | * the two. 27 | * 28 | * You must obey the GNU General Public License in all respects for all 29 | * of the code used other than OpenSSL. If you modify file(s) with this 30 | * exception, you may extend this exception to your version of the 31 | * file(s), but you are not obligated to do so. If you do not wish to do 32 | * so, delete this exception statement from your version. If you delete 33 | * this exception statement from all source files in the program, then 34 | * also delete it here. 
35 | */ 36 | 37 | #define NBD_BUCKET_PARAMETER_NAME "bucket" 38 | #define NBD_S3B_PARAM_PREFIX "s3b_" 39 | #define NBD_S3B_PARAM_PREFIX_LEN (sizeof(NBD_S3B_PARAM_PREFIX) - 1) 40 | -------------------------------------------------------------------------------- /rebuild.sh: -------------------------------------------------------------------------------- 1 | #!/bin/bash 2 | 3 | # Bail on error 4 | set -e 5 | 6 | # Reset 7 | . ./cleanup.sh 8 | 9 | # Create directory to avoid warning 10 | mkdir -p m4 11 | 12 | # Apply autofoo magic 13 | autoreconf -iv 14 | 15 | # Configure for debug 16 | ./configure --enable-Werror --enable-assertions 17 | 18 | # Build 19 | make -j 8 20 | -------------------------------------------------------------------------------- /reset.c: -------------------------------------------------------------------------------- 1 | 2 | /* 3 | * s3backer - FUSE-based single file backing store via Amazon S3 4 | * 5 | * Copyright 2008-2023 Archie L. Cobbs 6 | * 7 | * This program is free software; you can redistribute it and/or 8 | * modify it under the terms of the GNU General Public License 9 | * as published by the Free Software Foundation; either version 2 10 | * of the License, or (at your option) any later version. 11 | * 12 | * This program is distributed in the hope that it will be useful, 13 | * but WITHOUT ANY WARRANTY; without even the implied warranty of 14 | * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the 15 | * GNU General Public License for more details. 16 | * 17 | * You should have received a copy of the GNU General Public License 18 | * along with this program; if not, write to the Free Software 19 | * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 20 | * 02110-1301, USA. 
21 | * 22 | * In addition, as a special exception, the copyright holders give 23 | * permission to link the code of portions of this program with the 24 | * OpenSSL library under certain conditions as described in each 25 | * individual source file, and distribute linked combinations including 26 | * the two. 27 | * 28 | * You must obey the GNU General Public License in all respects for all 29 | * of the code used other than OpenSSL. If you modify file(s) with this 30 | * exception, you may extend this exception to your version of the 31 | * file(s), but you are not obligated to do so. If you do not wish to do 32 | * so, delete this exception statement from your version. If you delete 33 | * this exception statement from all source files in the program, then 34 | * also delete it here. 35 | */ 36 | 37 | #include "s3backer.h" 38 | #include "block_cache.h" 39 | #include "ec_protect.h" 40 | #include "zero_cache.h" 41 | #include "fuse_ops.h" 42 | #include "http_io.h" 43 | #include "test_io.h" 44 | #include "s3b_config.h" 45 | #include "reset.h" 46 | #include "dcache.h" 47 | 48 | int 49 | s3backer_reset(struct s3b_config *config) 50 | { 51 | struct s3backer_store *s3b = NULL; 52 | struct s3b_dcache *dcache = NULL; 53 | struct stat cache_file_stat; 54 | int ok = 0; 55 | int r; 56 | 57 | // Logging 58 | if (!config->quiet) 59 | warnx("resetting mount token for %s", config->description); 60 | 61 | // Create temporary lower layer 62 | if ((s3b = config->test ? test_io_create(&config->test_io) : http_io_create(&config->http_io)) == NULL) { 63 | warnx(config->test ? 
"test_io_create" : "http_io_create"); 64 | goto fail; 65 | } 66 | 67 | // Clear mount token 68 | if ((r = (*s3b->set_mount_token)(s3b, NULL, 0)) != 0) { 69 | warnx("error clearing s3 mount token: %s", strerror(r)); 70 | goto fail; 71 | } 72 | 73 | // Open disk cache file, if any, and clear the mount token there too 74 | if (config->block_cache.cache_file != NULL) { 75 | if (stat(config->block_cache.cache_file, &cache_file_stat) == -1) { 76 | if (errno != ENOENT) { 77 | warn("error opening cache file \"%s\"", config->block_cache.cache_file); 78 | goto fail; 79 | } 80 | } else { 81 | if ((r = s3b_dcache_open(&dcache, &config->block_cache, NULL, NULL, 0)) != 0) 82 | warnx("error opening cache file \"%s\": %s", config->block_cache.cache_file, strerror(r)); 83 | else if ((r = s3b_dcache_set_mount_token(dcache, NULL, 0)) != 0) 84 | warnx("error clearing mount token in \"%s\": %s", config->block_cache.cache_file, strerror(r)); 85 | } 86 | } 87 | 88 | // Success 89 | if (!config->quiet) 90 | warnx("done"); 91 | ok = 1; 92 | 93 | fail: 94 | // Clean up 95 | if (dcache != NULL) 96 | s3b_dcache_close(dcache); 97 | if (s3b != NULL) { 98 | (*s3b->shutdown)(s3b); 99 | (*s3b->destroy)(s3b); 100 | } 101 | return ok ? 0 : -1; 102 | } 103 | 104 | -------------------------------------------------------------------------------- /reset.h: -------------------------------------------------------------------------------- 1 | 2 | /* 3 | * s3backer - FUSE-based single file backing store via Amazon S3 4 | * 5 | * Copyright 2008-2023 Archie L. Cobbs 6 | * 7 | * This program is free software; you can redistribute it and/or 8 | * modify it under the terms of the GNU General Public License 9 | * as published by the Free Software Foundation; either version 2 10 | * of the License, or (at your option) any later version. 
11 | * 12 | * This program is distributed in the hope that it will be useful, 13 | * but WITHOUT ANY WARRANTY; without even the implied warranty of 14 | * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the 15 | * GNU General Public License for more details. 16 | * 17 | * You should have received a copy of the GNU General Public License 18 | * along with this program; if not, write to the Free Software 19 | * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 20 | * 02110-1301, USA. 21 | * 22 | * In addition, as a special exception, the copyright holders give 23 | * permission to link the code of portions of this program with the 24 | * OpenSSL library under certain conditions as described in each 25 | * individual source file, and distribute linked combinations including 26 | * the two. 27 | * 28 | * You must obey the GNU General Public License in all respects for all 29 | * of the code used other than OpenSSL. If you modify file(s) with this 30 | * exception, you may extend this exception to your version of the 31 | * file(s), but you are not obligated to do so. If you do not wish to do 32 | * so, delete this exception statement from your version. If you delete 33 | * this exception statement from all source files in the program, then 34 | * also delete it here. 35 | */ 36 | 37 | // reset.c 38 | extern int s3backer_reset(struct s3b_config *config); 39 | 40 | -------------------------------------------------------------------------------- /s3b_config.h: -------------------------------------------------------------------------------- 1 | 2 | /* 3 | * s3backer - FUSE-based single file backing store via Amazon S3 4 | * 5 | * Copyright 2008-2023 Archie L. Cobbs 6 | * 7 | * This program is free software; you can redistribute it and/or 8 | * modify it under the terms of the GNU General Public License 9 | * as published by the Free Software Foundation; either version 2 10 | * of the License, or (at your option) any later version. 
11 | * 12 | * This program is distributed in the hope that it will be useful, 13 | * but WITHOUT ANY WARRANTY; without even the implied warranty of 14 | * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the 15 | * GNU General Public License for more details. 16 | * 17 | * You should have received a copy of the GNU General Public License 18 | * along with this program; if not, write to the Free Software 19 | * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 20 | * 02110-1301, USA. 21 | * 22 | * In addition, as a special exception, the copyright holders give 23 | * permission to link the code of portions of this program with the 24 | * OpenSSL library under certain conditions as described in each 25 | * individual source file, and distribute linked combinations including 26 | * the two. 27 | * 28 | * You must obey the GNU General Public License in all respects for all 29 | * of the code used other than OpenSSL. If you modify file(s) with this 30 | * exception, you may extend this exception to your version of the 31 | * file(s), but you are not obligated to do so. If you do not wish to do 32 | * so, delete this exception statement from your version. If you delete 33 | * this exception statement from all source files in the program, then 34 | * also delete it here. 
35 | */ 36 | 37 | // Overall application configuration info 38 | struct s3b_config { 39 | 40 | // Various sub-module configurations 41 | struct block_cache_conf block_cache; 42 | struct fuse_ops_conf fuse_ops; 43 | struct zero_cache_conf zero_cache; 44 | struct ec_protect_conf ec_protect; 45 | struct http_io_conf http_io; 46 | struct test_io_conf test_io; 47 | 48 | // Common/global stuff 49 | const char *accessFile; 50 | const char *mount; 51 | char description[768]; 52 | u_int block_size; 53 | off_t file_size; 54 | s3b_block_t num_blocks; 55 | const char *bucket; 56 | const char *prefix; 57 | const char *accessKeyEnv; 58 | int blockHashPrefix; 59 | int foreground; 60 | int debug; 61 | int erase; 62 | int reset; 63 | int quiet; 64 | int force; 65 | int test; 66 | int ssl; 67 | int nbd; 68 | int shared_disk_mode; 69 | int no_auto_detect; 70 | int list_blocks; 71 | struct fuse_args fuse_args; 72 | log_func_t *log; 73 | 74 | // These are only used during command line parsing 75 | const char *file_size_str; 76 | const char *block_size_str; 77 | const char *password_file; 78 | const char *max_speed_str[2]; 79 | const char *compress_alg; 80 | const char *compress_level; 81 | int compress_flag; 82 | int encrypt; 83 | }; 84 | 85 | // Options 86 | extern struct s3b_config *s3backer_get_config(int argc, char **argv, int nbd, int parse_only); 87 | extern struct s3b_config *s3backer_get_config2(int argc, char **argv, int nbd, int parse_only, fuse_opt_proc_t unknown_handler); 88 | extern struct s3backer_store *s3backer_create_store(struct s3b_config *config); 89 | extern int is_valid_s3b_flag(const char *flag); 90 | extern void s3b_cleanup(void); 91 | extern void dump_config(const struct s3b_config *config); 92 | extern void usage(void); 93 | 94 | -------------------------------------------------------------------------------- /s3backer.h: -------------------------------------------------------------------------------- 1 | 2 | /* 3 | * s3backer - FUSE-based single file backing 
store via Amazon S3 4 | * 5 | * Copyright 2008-2023 Archie L. Cobbs 6 | * 7 | * This program is free software; you can redistribute it and/or 8 | * modify it under the terms of the GNU General Public License 9 | * as published by the Free Software Foundation; either version 2 10 | * of the License, or (at your option) any later version. 11 | * 12 | * This program is distributed in the hope that it will be useful, 13 | * but WITHOUT ANY WARRANTY; without even the implied warranty of 14 | * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the 15 | * GNU General Public License for more details. 16 | * 17 | * You should have received a copy of the GNU General Public License 18 | * along with this program; if not, write to the Free Software 19 | * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 20 | * 02110-1301, USA. 21 | * 22 | * In addition, as a special exception, the copyright holders give 23 | * permission to link the code of portions of this program with the 24 | * OpenSSL library under certain conditions as described in each 25 | * individual source file, and distribute linked combinations including 26 | * the two. 27 | * 28 | * You must obey the GNU General Public License in all respects for all 29 | * of the code used other than OpenSSL. If you modify file(s) with this 30 | * exception, you may extend this exception to your version of the 31 | * file(s), but you are not obligated to do so. If you do not wish to do 32 | * so, delete this exception statement from your version. If you delete 33 | * this exception statement from all source files in the program, then 34 | * also delete it here. 
35 | */ 36 | 37 | #include "config.h" 38 | 39 | #include 40 | #include 41 | #if HAVE_SYS_STATVFS_H 42 | #include 43 | #endif 44 | #include 45 | #include 46 | #if HAVE_DECL_PRCTL 47 | #include 48 | #endif 49 | 50 | // Add some queue.h definitions missing on Linux 51 | #ifndef LIST_FIRST 52 | #define LIST_FIRST(head) ((head)->lh_first) 53 | #endif 54 | #ifndef LIST_NEXT 55 | #define LIST_NEXT(item, field) ((item)->field.le_next) 56 | #endif 57 | #ifndef TAILQ_FIRST 58 | #define TAILQ_FIRST(head) ((head)->tqh_first) 59 | #endif 60 | #ifndef TAILQ_NEXT 61 | #define TAILQ_NEXT(item, field) ((item)->field.tqe_next) 62 | #endif 63 | 64 | #include 65 | #include 66 | #include 67 | #include 68 | #include 69 | #include 70 | #include 71 | #include 72 | #include 73 | #include 74 | #include 75 | #include 76 | #include 77 | #include 78 | #include 79 | #include 80 | #include 81 | #include 82 | #include 83 | #include 84 | #include 85 | 86 | #include 87 | #include 88 | #include 89 | #include 90 | #include 91 | #include 92 | 93 | #include 94 | #include 95 | 96 | #if defined __APPLE__ || defined __FreeBSD__ 97 | extern char **environ; 98 | #endif 99 | 100 | #ifndef FUSE_OPT_KEY_DISCARD 101 | #define FUSE_OPT_KEY_DISCARD -4 102 | #endif 103 | 104 | // Bail out on error (implies bug) 105 | #define CHECK_RETURN(x) do { \ 106 | const int _r = (x); \ 107 | (void)_r; \ 108 | assert(_r == 0); \ 109 | } while (0) 110 | 111 | // In case we don't have glibc >= 2.18 112 | #ifndef FALLOC_FL_KEEP_SIZE 113 | #define FALLOC_FL_KEEP_SIZE 0x01 114 | #endif 115 | #ifndef FALLOC_FL_PUNCH_HOLE 116 | #define FALLOC_FL_PUNCH_HOLE 0x02 117 | #endif 118 | 119 | // Special mount token value for shared disk mode 120 | #define SHARED_DISK_MOUNT_TOKEN ((int32_t)0x7fffffff) 121 | 122 | // Integral type for holding a block number 123 | typedef uint32_t s3b_block_t; 124 | 125 | // Integral type used for bitmaps 126 | typedef uint32_t bitmap_t; 127 | 128 | /* 129 | * How many hex digits we will use to print a block 
number. 130 | */ 131 | #define S3B_BLOCK_NUM_DIGITS ((int)(sizeof(s3b_block_t) * 2)) 132 | 133 | // Logging function type 134 | typedef void log_func_t(int level, const char *fmt, ...) __attribute__ ((__format__ (__printf__, 2, 3))); 135 | 136 | // Interactive non-zero block list callback function type. Returns zero for success, else positive errno to abort. 137 | typedef int block_list_func_t(void *arg, const s3b_block_t *block_nums, u_int num_blocks); 138 | 139 | // Block write cancel check function type 140 | typedef int check_cancel_t(void *arg, s3b_block_t block_num); 141 | 142 | // Backing store instance structure 143 | struct s3backer_store { 144 | 145 | /* 146 | * Implementation private data 147 | */ 148 | void *data; 149 | 150 | /* 151 | * Create any background pthreads that may be required. 152 | * 153 | * This must be invoked prior to any of the following functions: 154 | * 155 | * o block_read 156 | * o block_read_part 157 | * o block_write 158 | * o block_write_part 159 | * 160 | * It should be invoked after the initial process fork() because it may create pthreads. 161 | * 162 | * Returns: 163 | * 164 | * 0 Success 165 | * Other Other error 166 | */ 167 | int (*create_threads)(struct s3backer_store *s3b); 168 | 169 | /* 170 | * Get meta-data associated with the underlying store. 171 | * 172 | * The information we acquire is: 173 | * o Block size 174 | * o Total size 175 | * 176 | * Returns: 177 | * 178 | * 0 Success 179 | * ENOENT Information not found 180 | * Other Other error 181 | */ 182 | int (*meta_data)(struct s3backer_store *s3b, off_t *file_sizep, u_int *block_sizep); 183 | 184 | /* 185 | * Read and (optionally) set the mount token. The mount token is any 32 bit integer value greater than zero, 186 | * where the special value 0x7fffffff means shared disk mode. 187 | * 188 | * Previous value, if any, is returned in *old_valuep (if not NULL). A returned value of zero means there was 189 | * no previous value. 
190 | * 191 | * new_value can be: 192 | * < 0 Don't change anything, just read the existing value, if any 193 | * = 0 Clear the flag 194 | * > 0 Set flag to new_value 195 | * 196 | * Returns zero on success or a (positive) errno value on error. 197 | */ 198 | int (*set_mount_token)(struct s3backer_store *s3b, int32_t *old_valuep, int32_t new_value); 199 | 200 | /* 201 | * Read one block. Never-written-to blocks will return all zeros. 202 | * 203 | * If not NULL, 'actual_etag' should be filled in with a value suitable for the 'expect_etag' parameter, 204 | * or all zeros if unknown. 205 | * 206 | * If 'expect_etag' is not NULL: 207 | * - expect_etag should be the value returned from a previous call to read_block() or write_block(). 208 | * - If strict != 0, expect_etag must be the value returned from the most recent call to write_block() 209 | * and the data must match it or else an error is returned. Aside from this check, read normally. 210 | * - If strict == 0: 211 | * - If block's ETag does not match expect_etag, expect_etag is ignored and the block is read normally 212 | * - If block's ETag matches expect_etag, the implementation may either: 213 | * - Ignore expect_etag and read the block normally; OR 214 | * - Return EEXIST; the block may or may not also be read normally into *dest 215 | * 216 | * Returns zero on success or a (positive) errno value on error. 217 | * May return ENOTCONN if create_threads() has not yet been invoked. 218 | */ 219 | int (*read_block)(struct s3backer_store *s3b, s3b_block_t block_num, void *dest, 220 | u_char *actual_etag, const u_char *expect_etag, int strict); 221 | 222 | /* 223 | * Read part of one block. 224 | * 225 | * This is an optional function; if not supported, this hook may be null. 226 | * 227 | * Returns zero on success or a (positive) errno value on error. 228 | * May return ENOTCONN if create_threads() has not yet been invoked. 
229 | */ 230 | int (*read_block_part)(struct s3backer_store *s3b, s3b_block_t block_num, u_int off, u_int len, void *dest); 231 | 232 | /* 233 | * Write one block. 234 | * 235 | * Passing src == NULL is equivalent to passing a block containing all zeros. 236 | * 237 | * If check_cancel != NULL, then it may be invoked periodically during the write. If so, and it ever 238 | * returns a non-zero value, then this function may choose to abort the write and return ECONNABORTED. 239 | * 240 | * Upon successful return, etag (if not NULL) will get updated with a value suitable for the 'expect_etag' 241 | * parameter of read_block(); if the block is all zeros, etag will be zeroed. 242 | * 243 | * Returns zero on success or a (positive) errno value on error. 244 | * May return ENOTCONN if create_threads() has not yet been invoked. 245 | */ 246 | int (*write_block)(struct s3backer_store *s3b, s3b_block_t block_num, const void *src, u_char *etag, 247 | check_cancel_t *check_cancel, void *arg); 248 | 249 | /* 250 | * Write part of one block. 251 | * 252 | * This is an optional function; if not supported, this hook may be null. 253 | * 254 | * Returns zero on success or a (positive) errno value on error. 255 | * May return ENOTCONN if create_threads() has not yet been invoked. 256 | */ 257 | int (*write_block_part)(struct s3backer_store *s3b, s3b_block_t block_num, u_int off, u_int len, const void *src); 258 | 259 | /* 260 | * Bulk block zeroing (i.e., deletion). 261 | * 262 | * If a block to be deleted does not exist, then that's not an error - just do nothing in that case. 263 | * 264 | * Returns zero on success or a (positive) errno value if one or more blocks exist but cannot be deleted. 265 | */ 266 | int (*bulk_zero)(struct s3backer_store *s3b, const s3b_block_t *block_nums, u_int num_blocks); 267 | 268 | /* 269 | * Flush any outstanding changes for the specified blocks to persistent storage. 
270 | * 271 | * This function will block until (at least) all of the specified blocks are persisted. 272 | * 273 | * If "timeout" is greater than zero, impose a maximum time in milliseconds to wait for the flush to succeed; 274 | * if it takes any longer, return ETIMEDOUT. 275 | * 276 | * If "block_nums" is NULL, then "num_blocks" is ignored and this means all "dirty" blocks should be flushed. 277 | * 278 | * If any attempts are made by other threads to write to any of the specified blocks while this function is waiting 279 | * (this includes the scenario in which a write is in progress in another thread when this function is invoked), then 280 | * it is not defined whether those other writes will also be flushed. 281 | */ 282 | int (*flush_blocks)(struct s3backer_store *s3b, const s3b_block_t *block_nums, u_int num_blocks, long timeout); 283 | 284 | /* 285 | * Identify all blocks that are, or could possibly be, non-zero. 286 | * 287 | * The callback must be invoked for all blocks which could possibly be non-zero. Note: the same block 288 | * may be reported more than once to "callback". 289 | * 290 | * If "callback" ever returns a non-zero value, survey should be aborted and an error returned. 291 | * 292 | * It's possible for "shutdown" to be invoked before this method returns; if so, the survey should be aborted 293 | * and ECANCELED returned. 294 | * 295 | * This method should never be invoked more than once at a time. 296 | * 297 | * Returns zero on success or a (positive) errno value on error. 298 | */ 299 | int (*survey_non_zero)(struct s3backer_store *s3b, block_list_func_t *callback, void *arg); 300 | 301 | /* 302 | * Shutdown this instance. Sync any dirty data to the underlying data store (as required). 303 | * 304 | * The only other concurrent activity that should possibly be happening when this function is invoked 305 | * is an invocation of "survey_non_zero". 
The only functions that may be invoked after this one are 306 | * "set_mount_token" and "destroy". 307 | */ 308 | int (*shutdown)(struct s3backer_store *s3b); 309 | 310 | /* 311 | * Destroy this instance. Free all resources. Caller must have invoked "shutdown" prior to this. 312 | */ 313 | void (*destroy)(struct s3backer_store *s3b); 314 | }; 315 | 316 | // gitrev.c 317 | extern const char *const s3backer_version; 318 | 319 | // Issue #64 OpenSSL 1.1.0 compatibility - sslcompat.c 320 | #if OPENSSL_VERSION_NUMBER < 0x10100000L 321 | HMAC_CTX *HMAC_CTX_new(void); 322 | void HMAC_CTX_free(HMAC_CTX *ctx); 323 | EVP_MD_CTX *EVP_MD_CTX_new(void); 324 | void EVP_MD_CTX_free(EVP_MD_CTX *ctx); 325 | #endif 326 | -------------------------------------------------------------------------------- /sslcompat.c: -------------------------------------------------------------------------------- 1 | /* 2 | * s3backer - FUSE-based single file backing store via Amazon S3 3 | * 4 | * Copyright 2008-2023 Archie L. Cobbs 5 | * 6 | * This program is free software; you can redistribute it and/or 7 | * modify it under the terms of the GNU General Public License 8 | * as published by the Free Software Foundation; either version 2 9 | * of the License, or (at your option) any later version. 10 | * 11 | * This program is distributed in the hope that it will be useful, 12 | * but WITHOUT ANY WARRANTY; without even the implied warranty of 13 | * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the 14 | * GNU General Public License for more details. 15 | * 16 | * You should have received a copy of the GNU General Public License 17 | * along with this program; if not, write to the Free Software 18 | * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 19 | * 02110-1301, USA. 
20 | * 21 | * In addition, as a special exception, the copyright holders give 22 | * permission to link the code of portions of this program with the 23 | * OpenSSL library under certain conditions as described in each 24 | * individual source file, and distribute linked combinations including 25 | * the two. 26 | * 27 | * You must obey the GNU General Public License in all respects for all 28 | * of the code used other than OpenSSL. If you modify file(s) with this 29 | * exception, you may extend this exception to your version of the 30 | * file(s), but you are not obligated to do so. If you do not wish to do 31 | * so, delete this exception statement from your version. If you delete 32 | * this exception statement from all source files in the program, then 33 | * also delete it here. 34 | */ 35 | 36 | #include "s3backer.h" 37 | 38 | // Issue #64 OpenSSL 1.1.0 compatibility 39 | #if OPENSSL_VERSION_NUMBER < 0x10100000L 40 | 41 | /* 42 | * OpenSSL does not allow for HMAC_CTX or EVP_MD_CTX to be allocated on the 43 | * stack. Instead it provides a set of _new and _free functions for dynamic 44 | * allocation that do not exist in the older versions of the library. For 45 | * older OpenSSL versions we provide our own implementations of these missing 46 | * functions. 
47 | */ 48 | 49 | HMAC_CTX *HMAC_CTX_new(void) 50 | { 51 | HMAC_CTX *ctx = OPENSSL_malloc(sizeof(*ctx)); 52 | if (ctx != NULL) { 53 | HMAC_CTX_init(ctx); 54 | } 55 | return ctx; 56 | } 57 | 58 | void HMAC_CTX_free(HMAC_CTX *ctx) 59 | { 60 | if (ctx != NULL) { 61 | HMAC_CTX_cleanup(ctx); 62 | OPENSSL_free(ctx); 63 | } 64 | } 65 | 66 | EVP_MD_CTX *EVP_MD_CTX_new(void) 67 | { 68 | EVP_MD_CTX *ctx = OPENSSL_malloc(sizeof(*ctx)); 69 | if (NULL != ctx) { 70 | EVP_MD_CTX_init(ctx); 71 | } 72 | return ctx; 73 | } 74 | 75 | void EVP_MD_CTX_free(EVP_MD_CTX *ctx) 76 | { 77 | if (ctx != NULL) { 78 | EVP_MD_CTX_cleanup(ctx); 79 | OPENSSL_free(ctx); 80 | } 81 | } 82 | 83 | #endif 84 | -------------------------------------------------------------------------------- /test_io.h: -------------------------------------------------------------------------------- 1 | 2 | /* 3 | * s3backer - FUSE-based single file backing store via Amazon S3 4 | * 5 | * Copyright 2008-2023 Archie L. Cobbs 6 | * 7 | * This program is free software; you can redistribute it and/or 8 | * modify it under the terms of the GNU General Public License 9 | * as published by the Free Software Foundation; either version 2 10 | * of the License, or (at your option) any later version. 11 | * 12 | * This program is distributed in the hope that it will be useful, 13 | * but WITHOUT ANY WARRANTY; without even the implied warranty of 14 | * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the 15 | * GNU General Public License for more details. 16 | * 17 | * You should have received a copy of the GNU General Public License 18 | * along with this program; if not, write to the Free Software 19 | * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 20 | * 02110-1301, USA. 
21 | * 22 | * In addition, as a special exception, the copyright holders give 23 | * permission to link the code of portions of this program with the 24 | * OpenSSL library under certain conditions as described in each 25 | * individual source file, and distribute linked combinations including 26 | * the two. 27 | * 28 | * You must obey the GNU General Public License in all respects for all 29 | * of the code used other than OpenSSL. If you modify file(s) with this 30 | * exception, you may extend this exception to your version of the 31 | * file(s), but you are not obligated to do so. If you do not wish to do 32 | * so, delete this exception statement from your version. If you delete 33 | * this exception statement from all source files in the program, then 34 | * also delete it here. 35 | */ 36 | 37 | // Configuration info structure for test_io store 38 | struct test_io_conf { 39 | u_int block_size; 40 | s3b_block_t num_blocks; 41 | const char *bucket; 42 | const char *prefix; 43 | log_func_t *log; 44 | int blockHashPrefix; 45 | int random_errors; 46 | int random_delays; 47 | int discard_data; 48 | int debug; 49 | }; 50 | 51 | // test_io.c 52 | extern struct s3backer_store *test_io_create(struct test_io_conf *config); 53 | extern int test_io_list_blocks(struct s3backer_store *s3b, block_list_func_t *callback, void *arg); 54 | 55 | -------------------------------------------------------------------------------- /tester.c: -------------------------------------------------------------------------------- 1 | 2 | /* 3 | * s3backer - FUSE-based single file backing store via Amazon S3 4 | * 5 | * Copyright 2008-2023 Archie L. Cobbs 6 | * 7 | * This program is free software; you can redistribute it and/or 8 | * modify it under the terms of the GNU General Public License 9 | * as published by the Free Software Foundation; either version 2 10 | * of the License, or (at your option) any later version. 
11 | * 12 | * This program is distributed in the hope that it will be useful, 13 | * but WITHOUT ANY WARRANTY; without even the implied warranty of 14 | * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the 15 | * GNU General Public License for more details. 16 | * 17 | * You should have received a copy of the GNU General Public License 18 | * along with this program; if not, write to the Free Software 19 | * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 20 | * 02110-1301, USA. 21 | * 22 | * In addition, as a special exception, the copyright holders give 23 | * permission to link the code of portions of this program with the 24 | * OpenSSL library under certain conditions as described in each 25 | * individual source file, and distribute linked combinations including 26 | * the two. 27 | * 28 | * You must obey the GNU General Public License in all respects for all 29 | * of the code used other than OpenSSL. If you modify file(s) with this 30 | * exception, you may extend this exception to your version of the 31 | * file(s), but you are not obligated to do so. If you do not wish to do 32 | * so, delete this exception statement from your version. If you delete 33 | * this exception statement from all source files in the program, then 34 | * also delete it here. 
35 | */ 36 | 37 | #include "s3backer.h" 38 | #include "block_cache.h" 39 | #include "ec_protect.h" 40 | #include "zero_cache.h" 41 | #include "fuse_ops.h" 42 | #include "http_io.h" 43 | #include "test_io.h" 44 | #include "s3b_config.h" 45 | #include "util.h" 46 | 47 | // Definitions 48 | #define NUM_THREADS 10 49 | #define DELAY_BASE 0 50 | #define DELAY_RANGE 50 51 | #define READ_FACTOR 2 52 | #define ZERO_FACTOR 3 53 | 54 | // Block states 55 | struct block_state { 56 | u_int writing; // block is currently being written by a thread 57 | u_int counter; // counts writes to the block 58 | u_int content; // most recently written content 59 | }; 60 | 61 | // Internal functions 62 | static void *test_thread_main(void *arg); 63 | static void logit(int id, const char *fmt, ...) __attribute__ ((__format__ (__printf__, 2, 3))); 64 | static void catch_signal(int sig); 65 | static uint64_t get_time(void); 66 | 67 | // Internal variables 68 | static pthread_mutex_t mutex; 69 | static pthread_mutex_t log_mutex; 70 | static struct s3b_config *config; 71 | static struct s3backer_store *store; 72 | static struct block_state *blocks; 73 | static uint64_t start_time; 74 | static volatile int stop_threads; 75 | 76 | int 77 | main(int argc, char **argv) 78 | { 79 | s3b_block_t block_num; 80 | pthread_t threads[NUM_THREADS]; 81 | sigset_t sigs; 82 | int sig; 83 | int i; 84 | int r; 85 | 86 | // Get configuration 87 | if ((config = s3backer_get_config(argc, argv, 0, 0)) == NULL) 88 | exit(1); 89 | if (config->block_size < sizeof(u_int)) 90 | errx(1, "block size too small"); 91 | 92 | // Open store 93 | logit(-1, "creating s3backer store"); 94 | if ((store = s3backer_create_store(config)) == NULL) 95 | err(1, "s3backer_create_store"); 96 | 97 | // Startup background threads 98 | logit(-1, "starting background threads"); 99 | if ((r = (*store->create_threads)(store)) != 0) 100 | errx(1, "create_threads: %s", strerror(r)); 101 | 102 | // Allocate block states 103 | if ((blocks = calloc(config->num_blocks, 
sizeof(*blocks))) == NULL) 104 | err(1, "calloc"); 105 | 106 | // Random initialization 107 | srandom((u_int)time(NULL)); 108 | if (pthread_mutex_init(&mutex, NULL) != 0) 109 | err(1, "mutex init"); 110 | if (pthread_mutex_init(&log_mutex, NULL) != 0) 111 | err(1, "mutex init"); 112 | start_time = get_time(); 113 | 114 | // Zero all blocks 115 | logit(-1, "started zeroing all blocks"); 116 | for (block_num = 0; block_num < config->num_blocks; block_num++) { 117 | if ((r = (*store->write_block)(store, block_num, zero_block, NULL, NULL, NULL)) != 0) 118 | errx(1, "write error: %s", strerror(r)); 119 | if (block_num % 1000 == 999) 120 | logit(-1, "zeroed %ju blocks", (uintmax_t)block_num + 1); 121 | } 122 | logit(-1, "finished zeroing all blocks"); 123 | 124 | // Create my threads 125 | logit(-1, "starting tester threads"); 126 | for (i = 0; i < NUM_THREADS; i++) 127 | pthread_create(&threads[i], NULL, test_thread_main, (void *)(intptr_t)i); 128 | 129 | // Wait for signal 130 | logit(-1, "waiting for termination signal"); 131 | signal(SIGHUP, catch_signal); 132 | signal(SIGINT, catch_signal); 133 | signal(SIGTERM, catch_signal); 134 | signal(SIGQUIT, catch_signal); 135 | sigemptyset(&sigs); 136 | sigaddset(&sigs, SIGHUP); 137 | sigaddset(&sigs, SIGINT); 138 | sigaddset(&sigs, SIGTERM); 139 | sigaddset(&sigs, SIGQUIT); 140 | if ((r = sigwait(&sigs, &sig)) != 0) 141 | errx(1, "sigwait: %s", strerror(r)); 142 | logit(-1, "got termination signal"); 143 | 144 | // Stop threads and wait for them to exit 145 | logit(-1, "stopping tester threads"); 146 | stop_threads = 1; 147 | for (i = 0; i < NUM_THREADS; i++) 148 | pthread_join(threads[i], NULL); 149 | 150 | // Done 151 | logit(-1, "done"); 152 | return 0; 153 | } 154 | 155 | static void * 156 | test_thread_main(void *arg) 157 | { 158 | const int id = (int)(intptr_t)arg; 159 | u_char data[config->block_size]; 160 | s3b_block_t block_num; 161 | int millis; 162 | int r; 163 | 164 | // Loop 165 | while (!stop_threads) { 166 | 167 | // Sleep 168 | millis = DELAY_BASE 
+ (random() % DELAY_RANGE); 169 | usleep(millis * 1000); 170 | 171 | // Pick a random block 172 | block_num = random() % config->num_blocks; 173 | 174 | // Randomly read or write it 175 | if ((random() % READ_FACTOR) != 0) { 176 | struct block_state *const state = &blocks[block_num]; 177 | struct block_state before; 178 | struct block_state after; 179 | 180 | // Snapshot block state 181 | pthread_mutex_lock(&mutex); 182 | memcpy(&before, state, sizeof(before)); 183 | CHECK_RETURN(pthread_mutex_unlock(&mutex)); 184 | 185 | // Do the read 186 | logit(id, "rd %0*jx START", S3B_BLOCK_NUM_DIGITS, (uintmax_t)block_num); 187 | if ((r = (*store->read_block)(store, block_num, data, NULL, NULL, 0)) != 0) { 188 | logit(id, "****** READ ERROR: %s", strerror(r)); 189 | continue; 190 | } 191 | 192 | // Snapshot block state again 193 | pthread_mutex_lock(&mutex); 194 | memcpy(&after, state, sizeof(after)); 195 | CHECK_RETURN(pthread_mutex_unlock(&mutex)); 196 | 197 | // Verify content, but only if no write occurred while we were reading 198 | if (before.writing == 0 && after.writing == 0 && before.counter == after.counter) { 199 | if (memcmp(data, &before.content, sizeof(before.content)) != 0) { 200 | logit(id, "got wrong content block %0*jx", S3B_BLOCK_NUM_DIGITS, (uintmax_t)block_num); 201 | exit(1); 202 | } 203 | } 204 | logit(id, "rd %0*jx content=0x%02x%02x%02x%02x COMPLETE", S3B_BLOCK_NUM_DIGITS, (uintmax_t)block_num, 205 | data[0], data[1], data[2], data[3]); 206 | } else { 207 | struct block_state *const state = &blocks[block_num]; 208 | u_int content; 209 | 210 | // Update block state 211 | pthread_mutex_lock(&mutex); 212 | if (state->writing) { // only one writer at a time 213 | CHECK_RETURN(pthread_mutex_unlock(&mutex)); 214 | continue; 215 | } 216 | state->writing = 1; 217 | CHECK_RETURN(pthread_mutex_unlock(&mutex)); 218 | 219 | // Write block 220 | content = (random() % ZERO_FACTOR) != 0 ? 
0 : (u_int)random(); 221 | memcpy(data, &content, sizeof(content)); 222 | memset(data + sizeof(content), 0, config->block_size - sizeof(content)); 223 | logit(id, "wr %0*jx content=0x%02x%02x%02x%02x START", S3B_BLOCK_NUM_DIGITS, (uintmax_t)block_num, 224 | data[0], data[1], data[2], data[3]); 225 | if ((r = (*store->write_block)(store, block_num, data, NULL, NULL, NULL)) != 0) 226 | logit(id, "****** WRITE ERROR: %s", strerror(r)); 227 | logit(id, "wr %0*jx content=0x%02x%02x%02x%02x %s%s", S3B_BLOCK_NUM_DIGITS, (uintmax_t)block_num, 228 | data[0], data[1], data[2], data[3], r != 0 ? "FAILED: " : "COMPLETE", r != 0 ? strerror(r) : ""); 229 | 230 | // Update block state 231 | pthread_mutex_lock(&mutex); 232 | if (r == 0) { 233 | state->counter++; 234 | state->content = content; 235 | } 236 | state->writing = 0; 237 | CHECK_RETURN(pthread_mutex_unlock(&mutex)); 238 | } 239 | } 240 | 241 | // Done 242 | return NULL; 243 | } 244 | 245 | static void 246 | logit(int id, const char *fmt, ...) 247 | { 248 | uint64_t timestamp = get_time() - start_time; 249 | va_list args; 250 | 251 | pthread_mutex_lock(&log_mutex); 252 | if (id == -1) 253 | printf("%u.%03u [--] ", (u_int)(timestamp / 1000), (u_int)(timestamp % 1000)); 254 | else 255 | printf("%u.%03u [%02d] ", (u_int)(timestamp / 1000), (u_int)(timestamp % 1000), id); 256 | va_start(args, fmt); 257 | vfprintf(stdout, fmt, args); 258 | printf("\n"); 259 | fflush(stdout); 260 | va_end(args); 261 | CHECK_RETURN(pthread_mutex_unlock(&log_mutex)); 262 | } 263 | 264 | static uint64_t 265 | get_time(void) 266 | { 267 | struct timeval tv; 268 | 269 | gettimeofday(&tv, NULL); 270 | return (uint64_t)tv.tv_sec * 1000 + (uint64_t)tv.tv_usec / 1000; 271 | } 272 | 273 | static void 274 | catch_signal(int sig) 275 | { 276 | // do nothing 277 | } 278 | -------------------------------------------------------------------------------- /util.h: -------------------------------------------------------------------------------- 1 | 2 | /* 3 | * 
s3backer - FUSE-based single file backing store via Amazon S3 4 | * 5 | * Copyright 2008-2023 Archie L. Cobbs 6 | * 7 | * This program is free software; you can redistribute it and/or 8 | * modify it under the terms of the GNU General Public License 9 | * as published by the Free Software Foundation; either version 2 10 | * of the License, or (at your option) any later version. 11 | * 12 | * This program is distributed in the hope that it will be useful, 13 | * but WITHOUT ANY WARRANTY; without even the implied warranty of 14 | * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the 15 | * GNU General Public License for more details. 16 | * 17 | * You should have received a copy of the GNU General Public License 18 | * along with this program; if not, write to the Free Software 19 | * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 20 | * 02110-1301, USA. 21 | * 22 | * In addition, as a special exception, the copyright holders give 23 | * permission to link the code of portions of this program with the 24 | * OpenSSL library under certain conditions as described in each 25 | * individual source file, and distribute linked combinations including 26 | * the two. 27 | * 28 | * You must obey the GNU General Public License in all respects for all 29 | * of the code used other than OpenSSL. If you modify file(s) with this 30 | * exception, you may extend this exception to your version of the 31 | * file(s), but you are not obligated to do so. If you do not wish to do 32 | * so, delete this exception statement from your version. If you delete 33 | * this exception statement from all source files in the program, then 34 | * also delete it here. 
35 | */ 36 | 37 | // Forward decl's 38 | struct s3b_config; 39 | 40 | // A list of block numbers 41 | struct block_list { 42 | s3b_block_t *blocks; 43 | s3b_block_t num_alloc; 44 | s3b_block_t num_blocks; 45 | }; 46 | 47 | // Block boundary condition handling info 48 | struct boundary_edge { 49 | char *data; 50 | s3b_block_t block; 51 | u_int offset; 52 | u_int length; 53 | }; 54 | struct boundary_info { 55 | 56 | // Header portion 57 | struct boundary_edge header; 58 | 59 | // Center block-aligned portion 60 | char *mid_data; 61 | s3b_block_t mid_block_start; 62 | size_t mid_block_count; 63 | 64 | // Footer portion 65 | struct boundary_edge footer; 66 | }; 67 | 68 | // A list of strings 69 | struct string_array { 70 | char **strings; 71 | size_t num_alloc; 72 | size_t num_strings; 73 | }; 74 | 75 | // A child process 76 | struct child_proc { 77 | const char *name; 78 | pid_t pid; 79 | int wstatus; 80 | }; 81 | 82 | // Globals 83 | extern int log_enable_debug; 84 | extern int daemonized; 85 | extern const void *zero_block; 86 | 87 | // Misc 88 | extern int parse_size_string(const char *s, const char *description, u_int max_bytes, uintmax_t *valp); 89 | extern void unparse_size_string(char *buf, int bmax, uintmax_t value); 90 | extern void describe_size(char *buf, int bmax, uintmax_t value); 91 | extern void syslog_logger(int level, const char *fmt, ...) __attribute__ ((__format__ (__printf__, 2, 3))); 92 | extern void stderr_logger(int level, const char *fmt, ...) __attribute__ ((__format__ (__printf__, 2, 3))); 93 | extern int find_string_in_table(const char *const *table, const char *value); 94 | extern int block_is_zeros(const void *data); 95 | extern int snvprintf(char *buf, int bufsize, const char *format, ...) 
__attribute__ ((__format__ (__printf__, 3, 4))); 96 | extern char *prefix_log_format(int level, const char *fmt); 97 | extern void calculate_boundary_info(struct boundary_info *info, u_int block_size, const void *buf, size_t size, off_t offset); 98 | extern int fsync_path(const char *path, int must_exist); 99 | extern int add_string(struct string_array *array, const char *fmt, ...) __attribute__ ((__format__ (__printf__, 2, 3))); 100 | extern void free_strings(struct string_array *array); 101 | extern int init_zero_block(u_int block_size); 102 | extern void set_config_log(struct s3b_config *config, log_func_t *log); 103 | extern int popcount32(uint32_t value); 104 | 105 | // Versions of <err.h> that work properly even when daemonized 106 | extern void daemon_debug(const struct s3b_config *config, const char *fmt, ...) 107 | __attribute__ ((__format__ (__printf__, 2, 3))); 108 | extern void daemon_warn(const struct s3b_config *config, const char *fmt, ...) 109 | __attribute__ ((__format__ (__printf__, 2, 3))); 110 | extern void daemon_warnx(const struct s3b_config *config, const char *fmt, ...) 111 | __attribute__ ((__format__ (__printf__, 2, 3))); 112 | extern void daemon_err(const struct s3b_config *config, int value, const char *fmt, ...) 113 | __attribute__ ((__noreturn__, __format__ (__printf__, 3, 4))); 114 | extern void daemon_errx(const struct s3b_config *config, int value, const char *fmt, ...) 
115 | __attribute__ ((__noreturn__, __format__ (__printf__, 3, 4))); 116 | 117 | // Forking and child process management 118 | extern pid_t start_child_process(const struct s3b_config *config, const char *executable, struct string_array *params); 119 | extern void kill_remaining_children(const struct s3b_config *config, pid_t except, int signal); 120 | extern pid_t wait_for_child_to_exit(const struct s3b_config *config, struct child_proc *proc, int sleep_if_none, int expect_signal); 121 | extern void apply_process_tweaks(void); 122 | 123 | // Bitmaps 124 | extern bitmap_t *bitmap_init(s3b_block_t num_blocks, int value); 125 | extern void bitmap_free(bitmap_t **bitmapp); 126 | extern size_t bitmap_size(s3b_block_t num_blocks); 127 | extern int bitmap_test(const bitmap_t *bitmap, s3b_block_t block_num); 128 | extern void bitmap_set(bitmap_t *bitmap, s3b_block_t block_num, int value); 129 | extern void bitmap_and(bitmap_t *dst, const bitmap_t *src, s3b_block_t num_blocks); 130 | extern void bitmap_or(bitmap_t *dst, const bitmap_t *src, s3b_block_t num_blocks); 131 | extern size_t bitmap_or2(bitmap_t *dst, const bitmap_t *src, s3b_block_t num_blocks); 132 | extern void bitmap_not(bitmap_t *bitmap, s3b_block_t num_blocks); 133 | 134 | // Block lists 135 | extern void block_list_init(struct block_list *list); 136 | extern int block_list_append(struct block_list *list, s3b_block_t block_num); 137 | extern void block_list_free(struct block_list *list); 138 | 139 | // Generic s3backer_store functions 140 | extern int generic_bulk_zero(struct s3backer_store *s3b, const s3b_block_t *block_nums, u_int num_blocks); 141 | 142 | // Hashing 143 | struct hmac_engine; 144 | struct hmac_ctx; 145 | 146 | extern struct hmac_engine *hmac_engine_create(void); 147 | extern void hmac_engine_free(struct hmac_engine *engine); 148 | extern struct hmac_ctx *hmac_new_sha1(struct hmac_engine *engine, const void *key, size_t keylen); 149 | extern struct hmac_ctx *hmac_new_sha256(struct 
hmac_engine *engine, const void *key, size_t keylen); 150 | extern void hmac_reset(struct hmac_ctx *ctx, const void *key, size_t keylen); 151 | extern void hmac_update(struct hmac_ctx *ctx, const void *data, size_t len); 152 | extern void hmac_final(struct hmac_ctx *ctx, u_char *result); 153 | extern int hmac_result_length(struct hmac_ctx *ctx); 154 | extern void hmac_free(struct hmac_ctx *ctx); 155 | 156 | extern void md5_quick(const void *data, size_t len, u_char *result); 157 | -------------------------------------------------------------------------------- /zero_cache.h: -------------------------------------------------------------------------------- 1 | 2 | /* 3 | * s3backer - FUSE-based single file backing store via Amazon S3 4 | * 5 | * Copyright 2008-2023 Archie L. Cobbs 6 | * 7 | * This program is free software; you can redistribute it and/or 8 | * modify it under the terms of the GNU General Public License 9 | * as published by the Free Software Foundation; either version 2 10 | * of the License, or (at your option) any later version. 11 | * 12 | * This program is distributed in the hope that it will be useful, 13 | * but WITHOUT ANY WARRANTY; without even the implied warranty of 14 | * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the 15 | * GNU General Public License for more details. 16 | * 17 | * You should have received a copy of the GNU General Public License 18 | * along with this program; if not, write to the Free Software 19 | * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 20 | * 02110-1301, USA. 21 | * 22 | * In addition, as a special exception, the copyright holders give 23 | * permission to link the code of portions of this program with the 24 | * OpenSSL library under certain conditions as described in each 25 | * individual source file, and distribute linked combinations including 26 | * the two. 
27 | * 28 | * You must obey the GNU General Public License in all respects for all 29 | * of the code used other than OpenSSL. If you modify file(s) with this 30 | * exception, you may extend this exception to your version of the 31 | * file(s), but you are not obligated to do so. If you do not wish to do 32 | * so, delete this exception statement from your version. If you delete 33 | * this exception statement from all source files in the program, then 34 | * also delete it here. 35 | */ 36 | 37 | // Configuration info structure for zero_cache store 38 | struct zero_cache_conf { 39 | u_int block_size; 40 | s3b_block_t num_blocks; 41 | int list_blocks; 42 | log_func_t *log; 43 | }; 44 | 45 | // Statistics structure for zero_cache store 46 | struct zero_cache_stats { 47 | s3b_block_t current_cache_size; 48 | u_int read_hits; 49 | u_int write_hits; 50 | }; 51 | 52 | // zero_cache.c 53 | extern struct s3backer_store *zero_cache_create(struct zero_cache_conf *config, struct s3backer_store *inner); 54 | extern void zero_cache_init_nonzero(struct s3backer_store *s3b, const u_int *non_zero); 55 | extern void zero_cache_get_stats(struct s3backer_store *s3b, struct zero_cache_stats *stats); 56 | extern void zero_cache_clear_stats(struct s3backer_store *s3b); 57 | 58 | --------------------------------------------------------------------------------
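/examples/range_split_sketch.c (illustrative only, not a file in the repository):

util.h declares `struct boundary_info` and `calculate_boundary_info()`, which describe an arbitrary byte range as a partial header, a block-aligned center, and a partial footer. The implementation is not part of this listing, so the following is only a minimal sketch of how such a split could be computed. The names `struct range_split` and `split_range()` are hypothetical, and the sketch tracks block numbers and lengths only, not the data pointers that `struct boundary_info` also carries.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

// Hypothetical stand-in for util.h's "struct boundary_info" (not the s3backer type)
struct range_split {
    uint64_t head_block;    // block holding the partial head (valid if head_len > 0)
    size_t head_offset;     // byte offset of the head within head_block
    size_t head_len;        // bytes in the partial head, or zero if none
    uint64_t mid_block;     // first fully covered block (valid if mid_count > 0)
    size_t mid_count;       // number of fully covered blocks
    uint64_t tail_block;    // block holding the partial tail (valid if tail_len > 0)
    size_t tail_len;        // bytes in the partial tail, which starts at offset zero
};

// Split the byte range [offset, offset + size) on block_size boundaries
static void
split_range(struct range_split *rs, size_t block_size, uint64_t offset, size_t size)
{
    uint64_t block;
    size_t misalign;

    assert(block_size > 0);
    memset(rs, 0, sizeof(*rs));
    block = offset / block_size;
    misalign = (size_t)(offset % block_size);

    // Partial head: the range starts in the middle of a block
    if (misalign != 0) {
        const size_t room = block_size - misalign;

        rs->head_block = block;
        rs->head_offset = misalign;
        rs->head_len = size < room ? size : room;
        size -= rs->head_len;
        block++;
    }

    // Center: zero or more whole blocks
    rs->mid_block = block;
    rs->mid_count = size / block_size;
    size -= rs->mid_count * block_size;
    block += rs->mid_count;

    // Partial tail: whatever is left over is less than one block
    if (size > 0) {
        rs->tail_block = block;
        rs->tail_len = size;
    }
}
```

With a 4096 byte block size, a 10000 byte range starting at offset 1000 splits into a 3096 byte head in block 0, one whole block (block 1), and a 2808 byte tail in block 2. A caller would typically handle the head and tail with read-modify-write cycles and write the center blocks directly.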