├── .gitignore ├── DESIGN.md ├── LICENSE ├── README.md ├── build.sh ├── db ├── db.go └── migrations.go ├── flake.lock ├── flake.nix ├── go.mod ├── go.sum ├── main.go ├── scanner ├── scanner.go └── summer.go ├── timing └── timing.go ├── vtl.sh └── writer ├── writer.go └── writer_test.go /.gitignore: -------------------------------------------------------------------------------- 1 | /mixtape 2 | /mixtape.db* 3 | /.envrc 4 | -------------------------------------------------------------------------------- /DESIGN.md: -------------------------------------------------------------------------------- 1 | _This whole design is still subject to radical changes. Until the 2 | first tape is written for keeps, everything is up for grabs._ 3 | 4 | # Motivation 5 | 6 | ## Computer backup is not WORM backup 7 | 8 | Most backup software these days aims to keep an incremental history 9 | over time of many small files, stored as efficiently as possible. This 10 | usually involves some form of deduplication using rolling hashes, as 11 | found in Restic, Borg and Bup. 12 | 13 | This is a pretty great solution for backing up your home directory, or 14 | important files on a server. Slightly modified, it's also a decent 15 | strategy for block-level backups such as those found in virtual 16 | machine management software. 17 | 18 | Systems based on these principles make a few core assumptions: 19 | 20 | - **Storage is expensive**; therefore effort should be expended to 21 | minimize the size of the backup set. 22 | - **Data will change over time**, and the deltas will be small; 23 | therefore effort should be expended to minimize the overhead of 24 | storing N small deltas. 25 | - **Random access is cheap**, and getting cheaper (compare spinning 26 | disks to SSDs); therefore storing a file non-linearly in exchange 27 | for reduced overhead (e.g. by chunking it into a content-addressed 28 | store) has minimal cost compared to the benefits. 
29 | - **The dataset is (relatively) small**, likely at most a few GiB, 30 | and thus can be cheaply stored on any number of clouds or local 31 | media. 32 | 33 | Now, say that instead of a typical home directory, you have a 34 | collection of WORM (Write-Once Read-Many) data. For example, 35 | photographers with packrat tendencies accumulate large amounts of 36 | heavy raw photo files. It's a similar story for video-enthusiastic 37 | people, e.g. youtubers who want to preserve raw footage against future 38 | need. Additionally, say that you have an LTO tape drive, or even 39 | fancier, a robot library. 40 | 41 | All the assumptions above are incorrect when applied to this kind of 42 | data, stored on tape: 43 | 44 | - **Storage is cheap**: LTO tape media has hovered around $7-10/TiB 45 | for decades, and it's not hard to find promotions as low as 46 | $3/TiB. Compare to $20/TiB for NAS-grade hard drives, 47 | $24.6/TiB/month stored + $92/TiB read for AWS S3, or 48 | ~$5/TiB/month + $10/TiB read for the cheaper cloud options. 49 | - **Data is largely unchanging**: once a photo or video file has been 50 | written out once, it's likely to never change again. New derived 51 | files (RAW to JPEG, transcodes to lower resolution) may be created, 52 | but the next major write event in the file's life is probably 53 | deletion. 54 | - **Random access is expensive**: each seek on a tape can take 55 | several _minutes_, even though sequential I/O can run at hundreds 56 | of MiB/s. It's even worse if you need to access a different tape: 57 | spooling and ejecting the currently loaded tape, loading a new tape 58 | and seeking on it can add up to 10 minutes or more, and that's if 59 | the tape is readily available in the robotic library. 60 | - **The dataset is huge**: even modest amounts of video will promptly 61 | reach into the hundreds of GiB without trying too hard. 
"look at my 62 | shameful pile of looseleaf hard drives full of content" is a 63 | popular type of video for professional youtubers. 64 | 65 | Conclusion: tape backup of WORM data requires a solution that 66 | isn't well served by standard backup software. 67 | 68 | ## Tape backup software is built for CERN, not for you 69 | 70 | Tape-aware backup software is designed for use at massive enterprise 71 | scale: hundreds of data sources, streaming backups to dozens of tape 72 | drives in large robots, with exquisitely complicated schedules for 73 | taking, expiring, storing, consolidating and verifying backups. 74 | 75 | I'm sure this software works very well for the likes of CERN, but I 76 | have 1 NAS full of bytes, 1 tape drive, and I want to move A to 77 | B. Existing software doesn't scale down to that. 78 | 79 | When I first got my tape drive, I tried setting up both Bacula and 80 | Bareos, the two flagship open-source tape-aware software suites. Each 81 | took me several _days_ of reading manuals to configure, required 82 | running 3 daemons and a database server on my single computer to do 83 | anything, was brittle and hard to monitor, and was unable to 84 | saturate my tape drive's write bandwidth (a measly 160MiB/s - my ZFS 85 | setup can read at over 500MiB/s sustained). 86 | 87 | Conclusion: smaller-scale backup to tape is an underserved niche. As 88 | far as I can tell, the people who attempt it end up either suffering 89 | through one of the enterprise stacks, or just formatting the tapes as LTFS 90 | and treating them as a weirdly-shaped hard drive whose content they have 91 | to curate by hand. 92 | 93 | ## Tape backup software is built for people who don't lose stuff 94 | 95 | The tape-aware software suites write to tapes in a custom format that 96 | cannot be read back out "naively" using basic tools. 
This means that a 97 | "found tape" from the back of your closet is all but useless: if it's 98 | old enough, the software has moved on and can no longer read its own 99 | older format, and you're left with an undifferentiated stream of bytes 100 | to reverse engineer. 101 | 102 | And even if you're lucky on that front, individual tapes in these 103 | systems are not usually "standalone": you need an out-of-band 104 | "catalog" to tell you what bytes the tapes contain, and where to find 105 | the files you care about. 106 | 107 | This isn't universal: some of these tools have recovery software that 108 | can painstakingly reconstruct catalog information from the raw 109 | data. The ones I've seen do this by reading out the entire contents of 110 | every tape you can give them, which takes 12-16h per tape if you can 111 | keep the read throughput up. 112 | 113 | I think it's fair to say that the solutions that do offer low-level 114 | disaster recovery expect you to never have to use them, and instead 115 | assume you'll be able to host an extremely highly available and 116 | resilient catalog "somewhere else" for years or decades. Enterprises 117 | and large research labs are readily capable of such continuity across 118 | years, but I don't trust myself to still have access to my Bacula 119 | catalog in 2035. 120 | 121 | Conclusion: right now, I can't write backups to tape and trust that 122 | I'll be able to use them easily in 10 years. 123 | 124 | # Goals 125 | 126 | So, that's why I'm writing my own backup software. My goals: 127 | 128 | - Optimized for WORM data that doesn't deduplicate. Whole files are 129 | the finest level of granularity available. 130 | - Optimized for tape media. Full restores should spend >99% of their 131 | time in sequential reads, and restoring a single file should require no 132 | more than 1 or 2 seeks. 133 | - Optimized for recoverability. Each tape should be self-describing 134 | as much as possible. 
A reasonably technical Unix knower should be 135 | able to restore files from a tape with no prior knowledge of the 136 | on-tape format. 137 | - Optimized for low "continuity of care". Failing to maintain a 138 | catalog database for 10 years should result in minor inconvenience 139 | at most. Given the software and a trained operator, recovery from 140 | catalog loss should take no more than 30 minutes per tape, 141 | including the latency of a robot library manipulating the tape. 142 | - File-oriented policymaking. Backup sets, backup jobs and so forth 143 | are implementation details; what I care about is that I have N 144 | independent copies of a particular file. 145 | - Built-in verification support. This shouldn't deserve mention, but 146 | it's surprisingly uncommon at the low end to actually test 147 | backups. The software should make it as easy as possible. 148 | - Optimized for reasonably modern tape hardware. The goal is mass 149 | storage, so I'm not going to make it work for DAT or 8-track. The 150 | loose aim is that things LTO-5-ish drives can do will be enough. 151 | 152 | In addition, as a secondary goal, it'd be nice to be able to use this 153 | software on hard drives as well, treating them like "weird tapes". I 154 | have a bunch of older drives lying around that could get a second life 155 | as backup storage, with suitable respect paid to the lower expected 156 | longevity of an old unplugged drive sitting on a shelf. 157 | 158 | # Aside: the tape I/O interface 159 | 160 | Tape exposes an API that is almost, but not quite entirely unlike 161 | other modern storage. It's stream- and record-oriented storage, with 162 | explicit seeking of the hardware read/write head, and a few other 163 | things like filemarks which straight up don't exist elsewhere. 164 | 165 | You talk to a tape drive with SCSI commands, same as everything these 166 | days. 
When the tape is first loaded, the drive head is positioned at 167 | the well-known Beginning of Tape (BOT) location. Reading and writing 168 | are done in "records", which are blocks of bytes typically somewhere 169 | between 512b and 4MiB. 170 | 171 | The record size is left entirely to the user. Every read returns 1 172 | record and advances the tape to the next, even if you provided a 173 | target buffer that's too small for the record that was read. If you 174 | issue a read with a 512b buffer, and the next record on the tape is 175 | 1MiB, you'll get the first 512b and a "short read" flag. The next 176 | read will _not_ give you the remainder of that short read; instead 177 | you'll receive the bytes of the next record. It's your problem to know 178 | what record size is in use, and provide appropriately sized buffers. 179 | 180 | In addition to data blocks, you can write out "file marks", which are 181 | a physical manifestation of an EOF. If you write out two tarballs 182 | separated by a file mark, trying to `dd` from the tape drive device 183 | will yield the first tarball and stop when it hits the 184 | filemark. Running `dd` again will yield the second tarball. 185 | 186 | Drives can be told to seek to a record number (slowly, because the 187 | record size and physical layout on the tape isn't as simple as 188 | "multiply #records by length of 1 record and spool that many meters of 189 | tape"), to a file mark (faster, because file marks are designed to be 190 | "highly visible" even when the drive is spooling at high speed), to 191 | BOT (IOW, full rewind), or to End of Media (EOM) which is the logical 192 | "end of data you've written", and isn't related to the physical End of 193 | Tape (EOT). In general, you can't go spelunking around in the space 194 | between EOM and EOT; the best you can do is seek to EOM and write more 195 | stuff. 196 | 197 | Tapes have a nominal capacity written on them. For example, LTO-6 198 | lists 2.5TiB per tape. 
However, tape is a fickle medium, and as a 199 | result 1 byte of data may not always take the same amount of tape to 200 | store. To deal with this, cartridges include a bit more physical tape 201 | than strictly necessary, to account for space lost to manufacturing 202 | defects or other issues with the medium. IOW, the nominal capacity is 203 | a _minimum_, but you may be able to get a bit more data onto the tape 204 | past that figure. Of a sample of a half-dozen brand new LTO-6 tapes, I 205 | got anywhere from 1GiB to 50GiB "bonus" space at the end of the tape. 206 | 207 | To account for this variable storage size, drives emit "early warning" 208 | notifications when the drive is getting close to EOT. Starting in 209 | LTO-5, the application can also move the warning zone earlier in the 210 | tape, effectively being told ahead of time when the tape is down to 211 | N-ish bytes of remaining space (-ish because see above about the 212 | variability of the medium). 213 | 214 | # Design overview: file format 215 | 216 | The on-tape file format uses well-known, well-documented, open source 217 | stuff only. Where possible, the most mature software available, to 218 | ensure that recovery doesn't depend on that one npm package that went 219 | away in the Great Javascript Wars of 2031. 220 | 221 | To that end, the on-tape format uses 3 basic building blocks: 222 | 223 | - [tar](https://en.wikipedia.org/wiki/Tar_(computing)), the venerable 224 | archive format literally designed for tape. 225 | - [sqlite](https://www.sqlite.org), a ubiquitous piece of software 226 | with _really_ [long-term support](https://www.sqlite.org/lts.html). 227 | - [age](https://github.com/FiloSottile/age), simple authenticated 228 | encryption for files, that isn't GPG. 
229 | 230 | ``` 231 | +--------------------------------+ 232 | | | 233 | | Archaeology .tar | 234 | | | 235 | +--------------------------------+ 236 | | EOF | 237 | +--------------------------------+ 238 | | | 239 | | Index 1 sqlite DB | 240 | | | 241 | +--------------------------------+ 242 | | EOF | 243 | +--------------------------------+ 244 | | | 245 | . Archive 1 .tar . 246 | . . 247 | . . 248 | | | 249 | +--------------------------------+ 250 | | EOF | 251 | +--------------------------------+ 252 | | | 253 | | Index 2 sqlite DB | 254 | | | 255 | +--------------------------------+ 256 | | EOF | 257 | +--------------------------------+ 258 | | | 259 | . Archive 2 .tar . 260 | . . 261 | . . 262 | | | 263 | +--------------------------------+ 264 | | EOF | 265 | +--------------------------------+ 266 | | | 267 | | Empty index sqlite DB | 268 | | | 269 | +--------------------------------+ 270 | | EOF | 271 | +--------------------------------+ 272 | ``` 273 | 274 | The main event on a tape is an uncompressed tar file containing the 275 | files being backed up. If the data being backed up is compressible and 276 | compression is desired, compression is applied to individual files 277 | within the archive. Initial versions will not support compression, 278 | since WORM-worthy data tends to compress poorly or be already 279 | compressed anyway. 280 | 281 | Tar's big downside is that it's not indexed, so to find a single file 282 | you have to scrub through the entire archive. To avoid that, the tar 283 | file is preceded on the tape by a sqlite database file that provides 284 | an index of the files in the tar archive. Given this database, you can 285 | trivially look up the start record + length of any file in the 286 | archive. 287 | 288 | The index and archive are written out as separate files on the tape, 289 | i.e. they are separated by a file mark. 
This simplifies readout with 290 | non-specialized software: just issue two `dd`s back to back, and 291 | you'll get 2 files out, one that file(1) will identify as a sqlite 292 | database, and another that will identify as a tar file. Even if you 293 | have no clue what these files are, sqlite databases are 294 | self-describing (load them into the `sqlite3` tool and run `.schema`), 295 | and tools to inspect and unpack tar files are ridiculously ubiquitous. 296 | 297 | Both the index and archive are stored encrypted with age. age ciphertext 298 | clearly advertises itself as such with a plaintext header, so even 299 | with an encryption layer the "kind of file I have to deal with" can be 300 | found out by trivial inspection (and presumably file(1) will someday 301 | learn to recognize age ciphertext, making it even easier). 302 | 303 | In cases where individual backup jobs don't fill a tape, this 304 | index+tar file pair may be repeated on the tape, so successive `dd`s 305 | will yield alternating index and archive files until EOM is reached. 306 | 307 | In cases where a backup job is larger than the tape, the software 308 | breaks the job down into N (index, archive) pairs that individually 309 | fit on each tape. There is no support for "tape continuations", where 310 | an archive stops partway and continues on a different tape. This may 311 | lead to sub-optimal tape utilization if the files available for 312 | bin-packing don't stack neatly into tape-sized chunks. In return, each 313 | tape is an island that can be processed without knowledge of others: 314 | there's never an archive without an associated index, and readers 315 | don't have to recognize and handle a custom "file continues elsewhere" 316 | mark on the tape. 317 | 318 | For reasons explained in the next section, at the very end of the 319 | tape, after the last pair of (index, archive) files, one final "empty" 320 | index is present as the final file on the tape. 
321 | 322 | There are still a few bits of information missing for someone trying 323 | to recover a tape with no prior knowledge of its format: what record 324 | size was used to write the tape? Where's this "age" and "sqlite" 325 | software you speak of? If said software is lost, what encoding and 326 | algorithms did they use? 327 | 328 | To address that, each tape begins with an unencrypted, uncompressed 329 | tar file, whose contents identify the version of this format that 330 | was used to write the tape (so our software can quickly find out what 331 | read procedure to use), and bundle a bunch of "archaeology" data to 332 | teach a future reader how to read the rest of the tape. For example, 333 | the archive could contain: 334 | 335 | - A text file describing the on-tape format. 336 | - Text files describing the file formats of tar, sqlite, and age. 337 | - Text files describing the algorithms used by age (STREAM with 338 | ChaCha20-Poly1305, X25519, scrypt, etc.). 339 | - The backup program binary that wrote the tape. 340 | - The corresponding source code of the backup program, including 341 | transitive dependencies. 342 | - A copy of the `sqlite3` binary and corresponding source code. 343 | 344 | Compared to the overall capacity of a tape, this archive is tiny, and 345 | the "waste" can easily be justified. 346 | 347 | # Design overview: software 348 | 349 | The backup program's very basic: you give it a bunch of roots to 350 | scan. It finds all files within, groups them into tape-sized bundles 351 | in the format above, and writes them out to tape. It persists the 352 | files it's written, and the tapes it's written to, in a sqlite catalog 353 | database. 354 | 355 | From this catalog, it can offer some reporting on which files are 356 | backed up, which versions (in case of WORM data that turned out to be 357 | more Write-Seldom than Write-Once), where they are (which tape and 358 | what location on tape), and so forth. 
359 | 360 | To mitigate the impact of losing the catalog, the entire catalog is 361 | copied into every index file that's written out to tape. That is, the 362 | index file on tape is a database that contains both an index of the 363 | following archive on tape, and a full copy of the catalog as it was 364 | right before the index file was written out. 365 | 366 | Additionally, a full copy of the catalog is written out to the end of 367 | full tapes (this is the "empty" index file alluded to 368 | previously). This way, recovery from catalog loss is very easy: load 369 | the last-used tape, seek to the last index on the tape, read it out, 370 | and discard any actual index data within. What remains is a copy of 371 | the catalog that describes everything as it was at the end of the last 372 | backup job. 373 | 374 | Similarly, even if the tapes are scattered far and wide, a single tape 375 | becomes fully described in at most one sequential read, because the 376 | last index on the tape contains the catalog up to but not including 377 | the archive that (optionally) follows, and an index of the archive 378 | that (optionally) follows. 379 | 380 | In addition to fully describing itself, the recovered catalog also 381 | describes all the tapes and backed-up data that predate the writing of 382 | the tape in hand, so with every new tape found you recover much more 383 | than the minimum necessary metadata, hopefully speeding up further 384 | recovery. 385 | 386 | Empirically, the catalog database is quite small. Even with all the 387 | redundant copies, the catalogs should occupy much less than 1GiB of 388 | each tape - 0.04% in the case of LTO-6. 
389 | -------------------------------------------------------------------------------- /LICENSE: -------------------------------------------------------------------------------- 1 | ANTI-CAPITALIST SOFTWARE LICENSE (v 1.4) 2 | 3 | Copyright © 2021 David Anderson 4 | 5 | This is anti-capitalist software, released for free use by individuals 6 | and organizations that do not operate by capitalist principles. 7 | 8 | Permission is hereby granted, free of charge, to any person or 9 | organization (the "User") obtaining a copy of this software and 10 | associated documentation files (the "Software"), to use, copy, modify, 11 | merge, distribute, and/or sell copies of the Software, subject to the 12 | following conditions: 13 | 14 | 1. The above copyright notice and this permission notice shall be 15 | included in all copies or modified versions of the Software. 16 | 17 | 2. The User is one of the following: 18 | a. An individual person, laboring for themselves 19 | b. A non-profit organization 20 | c. An educational institution 21 | d. An organization that seeks shared profit for all of its members, 22 | and allows non-members to set the cost of their labor 23 | 24 | 3. If the User is an organization with owners, then all owners are 25 | workers and all workers are owners with equal equity and/or equal 26 | vote. 27 | 28 | 4. If the User is an organization, then the User is not law 29 | enforcement or military, or working for or under either. 30 | 31 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT EXPRESS OR IMPLIED WARRANTY 32 | OF ANY KIND, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF 33 | MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND 34 | NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS BE LIABLE FOR ANY 35 | CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, 36 | TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE 37 | SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. 
38 | -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 | # Mixtape 2 | 3 | Backup software for tape users with lots of WORM data. [Draft 4 | design](DESIGN.md) 5 | 6 | ## License 7 | 8 | This codebase is _not_ open-source software (or free, or "libre") at 9 | this time. It is licensed under the [Anti-capitalist software 10 | license](LICENSE), which places restrictions on who can use the 11 | software. 12 | 13 | ## Contributing 14 | 15 | This project is source-available but closed to code contributions at 16 | this time. This is in part because I don't want to deal with PRs and 17 | arguing over features, and in part so that I retain the freedom to 18 | relicense in future. 19 | 20 | Bug reports and feature requests filed as GitHub issues are welcome. 21 | -------------------------------------------------------------------------------- /build.sh: -------------------------------------------------------------------------------- 1 | #!/usr/bin/env sh 2 | 3 | go build -tags osusergo,netgo -ldflags='-extldflags=-static' . 4 | -------------------------------------------------------------------------------- /db/db.go: -------------------------------------------------------------------------------- 1 | package db 2 | 3 | import ( 4 | "context" 5 | "fmt" 6 | 7 | "github.com/jmoiron/sqlx" 8 | "github.com/tailscale/sqlite" 9 | ) 10 | 11 | type DB struct { 12 | *sqlx.DB 13 | } 14 | 15 | func Open(path string) (*DB, error) { 16 | db, err := sqlx.Connect("sqlite3", "file:"+path) 17 | if err != nil { 18 | return nil, fmt.Errorf("opening %q: %w", path, err) 19 | } 20 | 21 | // Limit to a single Conn that never expires, so per-Conn state 22 | // remains the same. 
23 | db.SetConnMaxLifetime(0) 24 | db.SetConnMaxIdleTime(0) 25 | db.SetMaxOpenConns(1) 26 | 27 | conn, err := db.Conn(context.Background()) 28 | if err != nil { 29 | db.Close() 30 | return nil, fmt.Errorf("getting DB conn: %w", err) 31 | } 32 | const init = ` 33 | PRAGMA journal_mode=WAL; 34 | PRAGMA temp_store=MEMORY; 35 | ` 36 | if err := sqlite.ExecScript(conn, init); err != nil { 37 | db.Close() 38 | return nil, fmt.Errorf("initializing %q: %w", path, err) 39 | } 40 | if err := conn.Close(); err != nil { // return the pool's only Conn before migrating 41 | db.Close() 42 | return nil, fmt.Errorf("returning Conn to pool: %w", err) 43 | } 44 | if err := migrate(db); err != nil { 45 | db.Close() 46 | return nil, fmt.Errorf("applying migrations to %q: %w", path, err) 47 | } 48 | 49 | return &DB{db}, nil 50 | } 51 | -------------------------------------------------------------------------------- /db/migrations.go: -------------------------------------------------------------------------------- 1 | package db 2 | 3 | import ( 4 | "context" 5 | "fmt" 6 | 7 | "github.com/jmoiron/sqlx" 8 | ) 9 | 10 | func migrate(db *sqlx.DB) error { 11 | tx, err := db.BeginTxx(context.Background(), nil) 12 | if err != nil { 13 | return fmt.Errorf("DB migration begin transaction: %w", err) 14 | } 15 | defer tx.Rollback() 16 | 17 | var idx int 18 | err = tx.Get(&idx, "PRAGMA user_version") 19 | if err != nil { 20 | return fmt.Errorf("getting latest applied migration: %w", err) 21 | } 22 | 23 | if idx == len(migrations) { 24 | return nil // already fully migrated, nothing needed 25 | } else if idx > len(migrations) { 26 | return fmt.Errorf("database is at version %d, which is more recent than this binary understands", idx) 27 | } 28 | 29 | for i, f := range migrations[idx:] { 30 | if err := f(tx); err != nil { 31 | return fmt.Errorf("migration to version %d failed: %w", idx+i+1, err) 32 | } 33 | } 34 | 35 | // For some reason, ? substitution doesn't work in PRAGMA 36 | // statements; sqlite reports a parse error. 
37 | if _, err := tx.Exec(fmt.Sprintf("PRAGMA user_version=%d", len(migrations))); err != nil { 38 | return fmt.Errorf("recording new DB version: %w", err) 39 | } 40 | if err := tx.Commit(); err != nil { 41 | return fmt.Errorf("DB migration commit transaction: %w", err) 42 | } 43 | 44 | return nil 45 | } 46 | 47 | func sql(idl ...string) func(*sqlx.Tx) error { 48 | return func(tx *sqlx.Tx) error { 49 | for _, stmt := range idl { 50 | if _, err := tx.Exec(stmt); err != nil { 51 | return err 52 | } 53 | } 54 | return nil 55 | } 56 | } 57 | 58 | var migrations = []func(*sqlx.Tx) error{ 59 | sql(`CREATE TABLE dirty ( 60 | id INTEGER PRIMARY KEY, 61 | path TEXT, 62 | mtime_sec INTEGER, 63 | mtime_nano INTEGER, 64 | size INTEGER, 65 | dev INTEGER, 66 | inode INTEGER, 67 | readable INTEGER, 68 | dirty INTEGER 69 | )`, 70 | `CREATE UNIQUE INDEX dirty_path_idx ON dirty (path)`, 71 | `CREATE TABLE files ( 72 | id INTEGER PRIMARY KEY, 73 | path TEXT, 74 | size INTEGER, 75 | blake2s TEXT, 76 | firstseen_sec INTEGER 77 | )`, 78 | `CREATE UNIQUE INDEX files_path_hash_idx on files (path, blake2s)`), 79 | sql(`ALTER TABLE files RENAME COLUMN blake2s TO hash`), 80 | sql(`CREATE TABLE media ( 81 | id INTEGER PRIMARY KEY, 82 | serial TEXT, 83 | kind TEXT, 84 | capacity INTEGER 85 | )`, 86 | `CREATE UNIQUE INDEX media_serial on media (serial)`, 87 | `CREATE TABLE bundle ( 88 | id INTEGER PRIMARY KEY, 89 | medium INTEGER, 90 | index_size INTEGER, 91 | data_size INTEGER, 92 | prepared_sec INTEGER, 93 | 94 | written_sec INTEGER, 95 | index_on_tape INTEGER 96 | )`, 97 | `CREATE TABLE bundle_file ( 98 | id INTEGER PRIMARY KEY, 99 | bundle_id INTEGER, 100 | file_id INTEGER, 101 | offset INTEGER 102 | )`), 103 | sql(`CREATE VIEW latest_files (id, path, size, hash, firstseen_sec) 104 | AS SELECT id,path,size,hash,MAX(firstseen_sec) FROM files GROUP BY path`), 105 | sql(`CREATE VIEW file_replicas (id, replicas) 106 | AS SELECT a.id,iif(b.file_id is null, 0, count(*)) 107 | FROM files AS a 108 
| LEFT JOIN bundle_file AS b ON a.id=b.file_id 109 | GROUP BY a.path`), 110 | } 111 | -------------------------------------------------------------------------------- /flake.lock: -------------------------------------------------------------------------------- 1 | { 2 | "nodes": { 3 | "flake-utils": { 4 | "locked": { 5 | "lastModified": 1659877975, 6 | "narHash": "sha256-zllb8aq3YO3h8B/U0/J1WBgAL8EX5yWf5pMj3G0NAmc=", 7 | "owner": "numtide", 8 | "repo": "flake-utils", 9 | "rev": "c0e246b9b83f637f4681389ecabcb2681b4f3af0", 10 | "type": "github" 11 | }, 12 | "original": { 13 | "owner": "numtide", 14 | "repo": "flake-utils", 15 | "type": "github" 16 | } 17 | }, 18 | "nixpkgs": { 19 | "locked": { 20 | "lastModified": 1662818301, 21 | "narHash": "sha256-uRjbKN924ptf5CvQ4cfki3R9nIm5EhrJBeb/xUxwfcM=", 22 | "owner": "NixOS", 23 | "repo": "nixpkgs", 24 | "rev": "a25f0b9bbdfedee45305da5d1e1410c5bcbd48f6", 25 | "type": "github" 26 | }, 27 | "original": { 28 | "owner": "NixOS", 29 | "ref": "nixpkgs-unstable", 30 | "repo": "nixpkgs", 31 | "type": "github" 32 | } 33 | }, 34 | "root": { 35 | "inputs": { 36 | "flake-utils": "flake-utils", 37 | "nixpkgs": "nixpkgs" 38 | } 39 | } 40 | }, 41 | "root": "root", 42 | "version": 7 43 | } 44 | -------------------------------------------------------------------------------- /flake.nix: -------------------------------------------------------------------------------- 1 | { 2 | inputs = { 3 | nixpkgs.url = "github:NixOS/nixpkgs/nixpkgs-unstable"; 4 | flake-utils.url = "github:numtide/flake-utils"; 5 | }; 6 | 7 | outputs = { nixpkgs, flake-utils, ... 
}: 8 | flake-utils.lib.eachDefaultSystem (system: 9 | let 10 | pkgs = import nixpkgs { inherit system; }; 11 | in { 12 | devShell = pkgs.mkShell { 13 | packages = [ 14 | pkgs.gcc 15 | pkgs.glibc.static 16 | pkgs.go_1_19 17 | pkgs.gotools 18 | pkgs.sqlite-interactive 19 | pkgs.openiscsi 20 | pkgs.lsscsi 21 | ]; 22 | }; 23 | }); 24 | } 25 | -------------------------------------------------------------------------------- /go.mod: -------------------------------------------------------------------------------- 1 | module go.universe.tf/mixtape 2 | 3 | go 1.19 4 | 5 | require ( 6 | github.com/google/go-cmp v0.5.8 7 | github.com/jmoiron/sqlx v1.3.4 8 | github.com/tailscale/sqlite v0.0.0-20211031232420-49007156918b 9 | golang.org/x/crypto v0.0.0-20211117183948-ae814b36b871 10 | ) 11 | 12 | require ( 13 | github.com/mattn/go-sqlite3 v1.14.8 // indirect 14 | golang.org/x/sys v0.0.0-20220722155257-8c9f86f7a55f // indirect 15 | ) 16 | -------------------------------------------------------------------------------- /go.sum: -------------------------------------------------------------------------------- 1 | github.com/go-sql-driver/mysql v1.5.0 h1:ozyZYNQW3x3HtqT1jira07DN2PArx2v7/mN66gGcHOs= 2 | github.com/go-sql-driver/mysql v1.5.0/go.mod h1:DCzpHaOWr8IXmIStZouvnhqoel9Qv2LBy8hT2VhHyBg= 3 | github.com/google/go-cmp v0.5.8 h1:e6P7q2lk1O+qJJb4BtCQXlK8vWEO8V1ZeuEdJNOqZyg= 4 | github.com/google/go-cmp v0.5.8/go.mod h1:17dUlkBOakJ0+DkrSSNjCkIjxS6bF9zb3elmeNGIjoY= 5 | github.com/jmoiron/sqlx v1.3.4 h1:wv+0IJZfL5z0uZoUjlpKgHkgaFSYD+r9CfrXjEXsO7w= 6 | github.com/jmoiron/sqlx v1.3.4/go.mod h1:2BljVx/86SuTyjE+aPYlHCTNvZrnJXghYGpNiXLBMCQ= 7 | github.com/lib/pq v1.2.0 h1:LXpIM/LZ5xGFhOpXAQUIMM1HdyqzVYM13zNdjCEEcA0= 8 | github.com/lib/pq v1.2.0/go.mod h1:5WUZQaWbwv1U+lTReE5YruASi9Al49XbQIvNi/34Woo= 9 | github.com/mattn/go-sqlite3 v1.14.6/go.mod h1:NyWgC/yNuGj7Q9rpYnZvas74GogHl5/Z4A/KQRfk6bU= 10 | github.com/mattn/go-sqlite3 v1.14.8 h1:gDp86IdQsN/xWjIEmr9MF6o9mpksUgh0fu+9ByFxzIU= 11 | 
github.com/mattn/go-sqlite3 v1.14.8/go.mod h1:NyWgC/yNuGj7Q9rpYnZvas74GogHl5/Z4A/KQRfk6bU= 12 | github.com/tailscale/sqlite v0.0.0-20211031232420-49007156918b h1:Z8BQQAx/G8wyi6grH4jRNi88T18pbwnbldm3ZsA2UhQ= 13 | github.com/tailscale/sqlite v0.0.0-20211031232420-49007156918b/go.mod h1:/SYRiJgVz5pu9YGUBGIX5LTu4bOWF9mKB/LbyYRw6L0= 14 | golang.org/x/crypto v0.0.0-20211117183948-ae814b36b871 h1:/pEO3GD/ABYAjuakUS6xSEmmlyVS4kxBNkeA9tLJiTI= 15 | golang.org/x/crypto v0.0.0-20211117183948-ae814b36b871/go.mod h1:IxCIyHEi3zRg3s0A5j5BB6A9Jmi73HwBIUl50j+osU4= 16 | golang.org/x/sys v0.0.0-20220722155257-8c9f86f7a55f h1:v4INt8xihDGvnrfjMDVXGxw9wrfxYyCjk0KbXjhR55s= 17 | golang.org/x/sys v0.0.0-20220722155257-8c9f86f7a55f/go.mod h1:oPkhp1MJrh7nUepCBck5+mAzfO9JrbApNNgaTdGDITg= 18 | -------------------------------------------------------------------------------- /main.go: -------------------------------------------------------------------------------- 1 | package main 2 | 3 | import ( 4 | "log" 5 | "net/http" 6 | _ "net/http/pprof" 7 | "os" 8 | "path/filepath" 9 | "time" 10 | 11 | "go.universe.tf/mixtape/bundler" 12 | "go.universe.tf/mixtape/db" 13 | "go.universe.tf/mixtape/scanner" 14 | ) 15 | 16 | func main() { 17 | go http.ListenAndServe("[::]:1234", nil) 18 | 19 | db, err := db.Open("mixtape.db") 20 | if err != nil { 21 | log.Fatal(err) 22 | } 23 | 24 | roots, err := roots(os.Args[1:]) 25 | if err != nil { 26 | log.Fatal(err) 27 | } 28 | 29 | for { 30 | log.Print("Scanning roots ", roots) 31 | if err := scanner.Scan(db, os.DirFS("/"), roots); err != nil { 32 | log.Print("Error during scan: ", err) 33 | } 34 | 35 | log.Print("Updating sums") 36 | if err := scanner.Sum(db, os.DirFS("/")); err != nil { 37 | log.Print("Error during sum updating: ", err) 38 | } 39 | 40 | log.Print("Creating bundle") 41 | if err := bundler.Bundle(db); err != nil { 42 | log.Print("Error during bundling: ", err) 43 | } 44 | 45 | time.Sleep(10 * time.Second) 46 | } 47 | 48 | db.Close() 49 | } 50 | 51 | 
func roots(rs []string) ([]string, error) { 52 | for i := range rs { 53 | root, err := filepath.Abs(rs[i]) 54 | if err != nil { 55 | return nil, err 56 | } 57 | rs[i] = root[1:] 58 | } 59 | return rs, nil 60 | } 61 | -------------------------------------------------------------------------------- /scanner/scanner.go: -------------------------------------------------------------------------------- 1 | package scanner 2 | 3 | import ( 4 | "errors" 5 | "fmt" 6 | "io/fs" 7 | "log" 8 | "sort" 9 | "syscall" 10 | 11 | "github.com/google/go-cmp/cmp" 12 | "go.universe.tf/mixtape/db" 13 | ) 14 | 15 | // Scan adds all files in roots to the DB's dirty table, marking 16 | // potentially-changed files dirty as it goes. 17 | func Scan(d *db.DB, fsys fs.FS, roots []string) error { 18 | for _, root := range roots { 19 | if !fs.ValidPath(root) { 20 | return fmt.Errorf("invalid root path %q", root) 21 | } 22 | } 23 | 24 | var disk []*fileInfo 25 | for _, root := range roots { 26 | err := fs.WalkDir(fsys, root, func(path string, ent fs.DirEntry, err error) error { 27 | inf, err := toInfo(path, ent, err) 28 | if err != nil { 29 | return err 30 | } 31 | if inf != nil { 32 | disk = append(disk, inf) 33 | } 34 | return nil 35 | }) 36 | if err != nil { 37 | return fmt.Errorf("scanning root %q: %v", root, err) 38 | } 39 | } 40 | 41 | sort.Slice(disk, func(i, j int) bool { 42 | return disk[i].Path < disk[j].Path 43 | }) 44 | 45 | var last []*fileInfo 46 | tx, err := d.Beginx() 47 | if err != nil { 48 | return fmt.Errorf("begin tx: %w", err) 49 | } 50 | defer tx.Rollback() 51 | err = tx.Select(&last, "SELECT path,mtime_sec,mtime_nano,size,dev,inode,readable FROM dirty") 52 | if err != nil { 53 | return fmt.Errorf("reading previous dirty state: %w", err) 54 | } 55 | 56 | // Sort in Go rather than SQL, to avoid unicode collation differences. 
57 | sort.Slice(last, func(i, j int) bool { 58 | return last[i].Path < last[j].Path 59 | }) 60 | 61 | insert, update, delete := diff(disk, last) 62 | 63 | for _, st := range insert { 64 | if _, err := tx.Exec("INSERT INTO dirty (path, mtime_sec, mtime_nano, size, dev, inode, readable, dirty) VALUES (?,?,?,?,?,?,?,1)", st.Path, st.Msec, st.Mnano, st.Size, st.Dev, st.Inode, st.Readable); err != nil { 65 | return fmt.Errorf("inserting fileinfo for %q: %w", st.Path, err) 66 | } 67 | } 68 | for _, st := range update { 69 | if _, err := tx.Exec("UPDATE dirty SET mtime_sec=?,mtime_nano=?,size=?,dev=?,inode=?,readable=?,dirty=1 WHERE path=?", st.Msec, st.Mnano, st.Size, st.Dev, st.Inode, st.Readable, st.Path); err != nil { 70 | return fmt.Errorf("updating fileinfo for %q: %w", st.Path, err) 71 | } 72 | } 73 | for _, st := range delete { 74 | if _, err := tx.Exec("DELETE FROM dirty WHERE path=?", st.Path); err != nil { 75 | return fmt.Errorf("deleting fileinfo for %q: %w", st.Path, err) 76 | } 77 | } 78 | if err := tx.Commit(); err != nil { 79 | return fmt.Errorf("committing dirty file set: %w", err) 80 | } 81 | 82 | return nil 83 | } 84 | 85 | func diff(disk, db []*fileInfo) (insert, update, delete []*fileInfo) { 86 | for len(disk) > 0 && len(db) > 0 { 87 | a, b := disk[0], db[0] 88 | switch { 89 | case *a == *b: // In sync 90 | disk = disk[1:] 91 | db = db[1:] 92 | case a.Path == b.Path: // metadata changed 93 | log.Println("file changed: ", cmp.Diff(*a, *b)) 94 | update = append(update, a) 95 | disk = disk[1:] 96 | db = db[1:] 97 | case a.Path < b.Path: // On disk but not in DB, fresh insert 98 | insert = append(insert, a) 99 | disk = disk[1:] 100 | default: // In DB but not on disk, delete 101 | delete = append(delete, b) 102 | db = db[1:] 103 | } 104 | } 105 | insert = append(insert, disk...) 106 | delete = append(delete, db...) 
107 | return insert, update, delete 108 | } 109 | 110 | type fileInfo struct { 111 | Path string 112 | Msec int64 `db:"mtime_sec"` 113 | Mnano int64 `db:"mtime_nano"` 114 | Size uint64 115 | Dev uint64 116 | Inode uint64 117 | Readable bool 118 | } 119 | 120 | func toInfo(path string, ent fs.DirEntry, err error) (*fileInfo, error) { 121 | if ent.Type() != 0 { 122 | // Process all subdirs, ignore other irregular files. 123 | return nil, nil 124 | } 125 | 126 | info, err := ent.Info() 127 | if errors.Is(err, fs.ErrNotExist) { 128 | // Delete race, nothing to do. 129 | return nil, nil 130 | } else if errors.Is(err, fs.ErrPermission) { 131 | return &fileInfo{ 132 | Path: path, 133 | Readable: false, 134 | }, nil 135 | } 136 | 137 | inf := &fileInfo{ 138 | Path: path, 139 | Msec: info.ModTime().Unix(), 140 | Mnano: int64(info.ModTime().Nanosecond()), 141 | Size: uint64(info.Size()), 142 | Readable: true, 143 | } 144 | if sys, ok := info.Sys().(*syscall.Stat_t); ok { 145 | inf.Dev = sys.Dev 146 | inf.Inode = sys.Ino 147 | } 148 | return inf, nil 149 | } 150 | -------------------------------------------------------------------------------- /scanner/summer.go: -------------------------------------------------------------------------------- 1 | package scanner 2 | 3 | import ( 4 | "context" 5 | "database/sql" 6 | "encoding/hex" 7 | "errors" 8 | "fmt" 9 | "io" 10 | "io/fs" 11 | "log" 12 | "sync" 13 | "time" 14 | 15 | "go.universe.tf/mixtape/db" 16 | "golang.org/x/crypto/blake2s" 17 | ) 18 | 19 | var done = errors.New("done") 20 | 21 | var hashBuf = sync.Pool{ 22 | New: func() interface{} { return make([]byte, 10*1024*1024) }, 23 | } 24 | 25 | func Sum(d *db.DB, fsys fs.FS) error { 26 | for { 27 | err := sum(context.Background(), d, fsys) 28 | if errors.Is(err, done) { 29 | return nil 30 | } else if err != nil { 31 | return err 32 | } 33 | } 34 | } 35 | 36 | func sum(ctx context.Context, d *db.DB, fsys fs.FS) error { 37 | scanTime := time.Now().Unix() 38 | 39 | tx, err := 
d.Beginx() 40 | if err != nil { 41 | return fmt.Errorf("begin tx: %w", err) 42 | } 43 | defer tx.Rollback() 44 | 45 | var f struct { 46 | ID int64 `db:"id"` 47 | Path string `db:"path"` 48 | } 49 | err = tx.Get(&f, "SELECT id,path FROM dirty WHERE dirty=1 LIMIT 1") 50 | if errors.Is(err, sql.ErrNoRows) { 51 | return done 52 | } else if err != nil { 53 | return fmt.Errorf("getting dirty file to sum: %w", err) 54 | } 55 | 56 | log.Printf("hashing %q", f.Path) 57 | 58 | h, sz, err := sumOne(ctx, fsys, f.Path) 59 | if err != nil { 60 | return fmt.Errorf("hashing %q: %w", f.Path, err) 61 | } 62 | 63 | if _, err := tx.Exec("INSERT INTO files (path, size, hash, firstseen_sec) VALUES (?,?,?,?) ON CONFLICT DO NOTHING", f.Path, sz, h, scanTime); err != nil { 64 | return fmt.Errorf("recording file hash for %q: %w", f.Path, err) 65 | } 66 | if _, err := tx.Exec("UPDATE dirty SET dirty=0 WHERE id=?", f.ID); err != nil { 67 | return fmt.Errorf("clearing dirty bit on %q: %w", f.Path, err) 68 | } 69 | 70 | if err := tx.Commit(); err != nil { 71 | return fmt.Errorf("commit tx: %w", err) 72 | } 73 | return nil 74 | } 75 | 76 | func sumOne(ctx context.Context, fsys fs.FS, path string) (h string, sz int64, err error) { 77 | f, err := fsys.Open(path) 78 | if err != nil { 79 | return "", 0, fmt.Errorf("opening %q: %w", path, err) 80 | } 81 | defer f.Close() 82 | hasher, _ := blake2s.New256(nil) 83 | buf := hashBuf.Get().([]byte) 84 | 85 | sz, err = io.CopyBuffer(hasher, f, buf) 86 | if err != nil { 87 | return "", 0, err 88 | } 89 | 90 | return hex.EncodeToString(hasher.Sum(nil)), sz, nil 91 | } 92 | -------------------------------------------------------------------------------- /timing/timing.go: -------------------------------------------------------------------------------- 1 | package timing 2 | 3 | import ( 4 | "bytes" 5 | "fmt" 6 | "sync" 7 | "time" 8 | ) 9 | 10 | type Phase struct { 11 | Name string 12 | Duration time.Duration 13 | } 14 | 15 | type Rec struct { 16 | mu 
sync.Mutex 17 | start time.Time 18 | phases []Phase 19 | } 20 | 21 | func (r *Rec) finishLocked() { 22 | if !r.start.IsZero() { 23 | r.phases[len(r.phases)-1].Duration = time.Since(r.start) 24 | } 25 | } 26 | 27 | func (r *Rec) Phase(name string) { 28 | r.mu.Lock() 29 | defer r.mu.Unlock() 30 | r.finishLocked() 31 | r.start = time.Now() 32 | r.phases = append(r.phases, Phase{Name: name}) 33 | } 34 | 35 | func (r *Rec) Done() Timings { 36 | r.mu.Lock() 37 | defer r.mu.Unlock() 38 | r.finishLocked() 39 | r.start = time.Time{} 40 | ph := r.phases 41 | r.phases = nil 42 | return ph 43 | } 44 | 45 | type Timings []Phase 46 | 47 | func (t Timings) Total() time.Duration { 48 | var ret time.Duration 49 | for _, p := range t { 50 | ret += p.Duration 51 | } 52 | return ret 53 | } 54 | 55 | func (t Timings) DebugString() string { 56 | var b bytes.Buffer 57 | for _, p := range t { 58 | fmt.Fprintf(&b, "%s %v\n", p.Name, p.Duration) 59 | } 60 | fmt.Fprintf(&b, "total %v\n", t.Total()) 61 | return b.String() 62 | } 63 | -------------------------------------------------------------------------------- /vtl.sh: -------------------------------------------------------------------------------- 1 | #!/usr/bin/env bash 2 | 3 | set -euo pipefail 4 | 5 | function list() { 6 | sudo iscsiadm --mode discovery -t sendtargets --portal virtual-tape 7 | } 8 | 9 | case "$1" in 10 | list) 11 | list 12 | ;; 13 | mount) 14 | loader=$(list | grep autoloader | grep 100. | cut -f2 -d' ') 15 | sudo iscsiadm --mode node --targetname "$loader" --portal virtual-tape --login 16 | drive=$(list | grep drive | grep 100. | cut -f2 -d' ') 17 | sudo iscsiadm --mode node --targetname "$drive" --portal virtual-tape --login 18 | ;; 19 | umount) 20 | loader=$(list | grep autoloader | grep 100. | cut -f2 -d' ') 21 | sudo iscsiadm --mode node --targetname "$loader" --portal virtual-tape --logout || true 22 | drive=$(list | grep drive | grep 100. 
| cut -f2 -d' ') 23 | sudo iscsiadm --mode node --targetname "$drive" --portal virtual-tape --logout || true 24 | ;; 25 | esac 26 | -------------------------------------------------------------------------------- /writer/writer.go: -------------------------------------------------------------------------------- 1 | package writer 2 | 3 | import ( 4 | "archive/tar" 5 | "bufio" 6 | "encoding/hex" 7 | "errors" 8 | "io" 9 | "io/fs" 10 | "time" 11 | 12 | "golang.org/x/crypto/blake2s" 13 | ) 14 | 15 | type Config struct { 16 | Out io.Writer 17 | FSys fs.FS 18 | Paths []string 19 | MaxBytes int64 20 | Progress func(*Progress) error 21 | } 22 | 23 | type File struct { 24 | Path string 25 | Hash string 26 | Size int64 27 | Offset int64 28 | } 29 | 30 | type Progress struct { 31 | Start time.Time 32 | End time.Time 33 | Written []File 34 | NotFound int 35 | TooBig int 36 | Bytes int64 37 | } 38 | 39 | type Report struct { 40 | Start time.Time 41 | End time.Time 42 | Bytes int64 43 | Files int 44 | NotFound int 45 | TooBig int 46 | } 47 | 48 | const ( 49 | // writeEndBuffer is how much slack we leave at the end of writing, to 50 | // account for "unknown unknowns" like the media not having quite as 51 | // many bytes as we hoped, or the writer over-writing a little bit due 52 | // to header inflation. 53 | writeEndBuffer = 1 << 20 // 1M 54 | // writeBufSize is the size of the write buffer when outputting 55 | // tarfiles. 56 | writeBufSize = 4 << 20 // 4M 57 | ) 58 | 59 | // WriteTar files to a sink as a tarball, as specified in cfg. 
60 | func WriteTar(cfg *Config) (*Report, error) { 61 | bw := bufio.NewWriterSize(cfg.Out, writeBufSize) 62 | cw := &countingWriter{w: bw} 63 | tw := tar.NewWriter(cw) 64 | 65 | remaining := cfg.MaxBytes - writeEndBuffer 66 | wantProgress := cfg.Progress != nil 67 | ret := &Report{ 68 | Start: time.Now(), 69 | } 70 | pr := &Progress{ 71 | Start: ret.Start, 72 | } 73 | recordBytes := func() { 74 | pr.Bytes += cw.N 75 | ret.Bytes += cw.N 76 | cw.N = 0 77 | } 78 | defer func() { 79 | recordBytes() // so final byte count is correct on error 80 | ret.End = time.Now() 81 | }() 82 | reportProgress := func() error { 83 | if !wantProgress { 84 | return nil 85 | } 86 | pr.End = time.Now() 87 | if err := cfg.Progress(pr); err != nil { 88 | return err 89 | } 90 | pr.Start = pr.End 91 | pr.Written = pr.Written[:0] 92 | pr.NotFound = 0 93 | pr.TooBig = 0 94 | pr.Bytes = 0 95 | return nil 96 | } 97 | 98 | for _, path := range cfg.Paths { 99 | fi, err := fs.Stat(cfg.FSys, path) 100 | if errors.Is(err, fs.ErrNotExist) { 101 | pr.NotFound++ 102 | ret.NotFound++ 103 | continue 104 | } else if err != nil { 105 | return ret, err 106 | } 107 | 108 | if fi.Size() > remaining { 109 | pr.TooBig++ 110 | ret.TooBig++ 111 | continue 112 | } 113 | 114 | f := File{ 115 | Path: path, 116 | Offset: cw.N, 117 | } 118 | hdr := &tar.Header{ 119 | Name: path, 120 | Size: fi.Size(), 121 | Mode: 0600, 122 | } 123 | if err := tw.WriteHeader(hdr); err != nil { 124 | return ret, err 125 | } 126 | hash, sz, err := writeFile(tw, cfg.FSys, path) 127 | // TODO: could check for the tar writer reporting a size 128 | // mismatch, and possibly rewind the writer and try again/move 129 | // on to a different file. 130 | if err != nil { 131 | return ret, err 132 | } 133 | f.Size = sz 134 | f.Hash = hash 135 | if wantProgress { 136 | pr.Written = append(pr.Written, f) 137 | } 138 | // Flush explicitly even though it's not necessary, so that 139 | // the next file's offset is correct. 
140 | if err := tw.Flush(); err != nil { 141 | return ret, err 142 | } 143 | recordBytes() 144 | ret.Files++ 145 | 146 | if now := time.Now(); now.After(pr.Start.Add(10 * time.Second)) { 147 | if err := reportProgress(); err != nil { 148 | return ret, err 149 | } 150 | } 151 | } 152 | 153 | if wantProgress && pr.Bytes > 0 { 154 | if err := reportProgress(); err != nil { 155 | return ret, err 156 | } 157 | } 158 | 159 | if err := tw.Close(); err != nil { 160 | return ret, err 161 | } 162 | if err := bw.Flush(); err != nil { 163 | return ret, err 164 | } 165 | return ret, nil 166 | } 167 | 168 | // writeFile writes path from fsys into w, and returns the blake2s 169 | // hash and number of copied bytes on success. 170 | func writeFile(w io.Writer, fsys fs.FS, path string) (string, int64, error) { 171 | f, err := fsys.Open(path) 172 | if err != nil { 173 | return "", 0, err 174 | } 175 | 176 | hasher, _ := blake2s.New256(nil) 177 | mw := io.MultiWriter(w, hasher) 178 | n, err := io.Copy(mw, f) 179 | if err != nil { 180 | return "", 0, err 181 | } 182 | return hex.EncodeToString(hasher.Sum(nil)), n, nil 183 | } 184 | 185 | type countingWriter struct { 186 | w io.Writer 187 | N int64 188 | } 189 | 190 | func (cw *countingWriter) Write(bs []byte) (int, error) { 191 | n, err := cw.w.Write(bs) 192 | cw.N += int64(n) 193 | return n, err 194 | } 195 | 196 | // func Write(d *db.DB, destMedia, replicationGoal int) error { 197 | // tx, err := d.Beginx() 198 | // if err != nil { 199 | // return err 200 | // } 201 | // defer tx.Rollback() 202 | 203 | // const filesNeedingMoreReplication = ` 204 | // SELECT a.id,a.path 205 | // FROM latest_files AS a 206 | // INNER JOIN file_replicas AS b ON a.id=b.id 207 | // WHERE replicas < ? 208 | 209 | // EXCEPT 210 | 211 | // SELECT c.id,c.path 212 | // FROM bundle AS a 213 | // INNER JOIN bundle_file AS b ON a.id=b.bundle_id 214 | // INNER JOIN files AS c ON b.file_id=c.id 215 | // WHERE a.medium = ? 
216 | 217 | // ORDER BY a.path 218 | // ` 219 | // var files []*file 220 | // } 221 | 222 | // type file struct { 223 | // Id uint64 224 | // Path string 225 | // Size int64 226 | // } 227 | 228 | // func filesToWrite(d *db.DB, destMedia, replicationGoal int) ([]*file, error) { 229 | 230 | // var files []*file 231 | // err = tx.Select(&files, filesNeedingMoreReplication, replicationGoal, destMedia) 232 | // if err != nil { 233 | // return nil, fmt.Errorf("reading under-replicated files: %w", err) 234 | // } 235 | 236 | // return files, nil 237 | // } 238 | -------------------------------------------------------------------------------- /writer/writer_test.go: -------------------------------------------------------------------------------- 1 | package writer 2 | 3 | import ( 4 | "archive/tar" 5 | "encoding/hex" 6 | "errors" 7 | "fmt" 8 | "io" 9 | "io/fs" 10 | "log" 11 | "os" 12 | "path/filepath" 13 | "testing" 14 | "time" 15 | 16 | "golang.org/x/crypto/blake2s" 17 | ) 18 | 19 | func TestWriter(t *testing.T) { 20 | fsys := &zeroFS{ 21 | files: map[string]int64{ 22 | "data/test/0001": 10, 23 | "data/test/0002": 100, 24 | "data/test/0003": 1000, 25 | }, 26 | } 27 | 28 | f, err := os.CreateTemp("", "mixtape_writer") 29 | if err != nil { 30 | t.Fatalf("creating tmpfile: %v", err) 31 | } 32 | defer os.Remove(f.Name()) 33 | 34 | var written []File 35 | progress := func(pr *Progress) error { 36 | written = append(written, pr.Written...) 
37 | return nil 38 | } 39 | 40 | cfg := &Config{ 41 | Out: f, 42 | FSys: fsys, 43 | Paths: []string{ 44 | "data/test/0001", 45 | "data/test/0002", 46 | "data/test/0003", 47 | }, 48 | MaxBytes: 10 << 20, // MiB 49 | Progress: progress, 50 | } 51 | 52 | res, err := Write(cfg) 53 | if err != nil { 54 | t.Fatalf("Write(cfg): %v", err) 55 | } 56 | 57 | if want, got := len(cfg.Paths), len(written); got != want { 58 | t.Fatalf("Progress callback gave wrong file count, got %d want %d", got, want) 59 | } 60 | for i, fi := range written { 61 | if got, want := fi.Path, cfg.Paths[i]; got != want { 62 | t.Errorf("progress file %d wrong path, got %q want %q", i, got, want) 63 | } 64 | if got, want := fi.Size, fsys.files[fi.Path]; got != want { 65 | t.Errorf("progress file %d (%q) wrong size, got %d want %d", i, fi.Path, got, want) 66 | } 67 | if got, want := fi.Hash, zeroHash(fi.Size); got != want { 68 | t.Errorf("progress file %d (%q) wrong hash, got %q want %q", i, fi.Path, got, want) 69 | } 70 | } 71 | 72 | if minBytes := int64(1110); res.Bytes < minBytes { 73 | t.Errorf("didn't write enough bytes, got %d want at least %d", res.Bytes, minBytes) 74 | } 75 | if got, want := res.Files, len(cfg.Paths); got != want { 76 | t.Errorf("didn't write enough files, got %d want %d", got, want) 77 | } 78 | if res.NotFound > 0 { 79 | t.Errorf("%d files were not found", res.NotFound) 80 | } 81 | if res.TooBig > 0 { 82 | t.Errorf("%d files did not fit", res.TooBig) 83 | } 84 | 85 | if _, err := f.Seek(0, os.SEEK_SET); err != nil { 86 | t.Fatalf("seeking to start of archive: %v", err) 87 | } 88 | 89 | tr := tar.NewReader(f) 90 | for i, path := range cfg.Paths { 91 | hdr, err := tr.Next() 92 | if errors.Is(err, io.EOF) { 93 | t.Fatalf("%d files missing from archive: %v", len(cfg.Paths[i:]), cfg.Paths[i:]) 94 | } else if err != nil { 95 | t.Fatalf("Advancing tar reader: %v", err) 96 | } 97 | if hdr.Name != path { 98 | t.Errorf("archive file %d wrong name, got %q want %q", i, hdr.Name, path) 99 | 
} 100 | if hdr.Typeflag != tar.TypeReg { 101 | t.Errorf("archive file %d wrong type, got %v want %v", i, hdr.Typeflag, tar.TypeReg) 102 | } 103 | if got, want := hdr.Size, fsys.files[path]; got != want { 104 | t.Errorf("archive file %d wrong size, got %v want %v", i, got, want) 105 | } 106 | if _, err = io.ReadAll(tr); err != nil { 107 | t.Fatalf("reading file %d: %v", i, err) 108 | } 109 | } 110 | if _, err := tr.Next(); err != io.EOF { 111 | t.Fatal("unexpected extra file in archive") 112 | } 113 | } 114 | 115 | func BenchmarkWriter(b *testing.B) { 116 | devnull, err := os.OpenFile("/dev/null", os.O_WRONLY, 0) 117 | if err != nil { 118 | log.Fatal("couldn't open /dev/null") 119 | } 120 | defer devnull.Close() 121 | b.Run("inmem", func(b *testing.B) { benchToWriter(b, io.Discard, false) }) 122 | b.Run("syscalls", func(b *testing.B) { benchToWriter(b, devnull, true) }) 123 | } 124 | 125 | func benchToWriter(b *testing.B, w io.Writer, syscalls bool) { 126 | sizes := []int64{ 127 | 10, 128 | 1 << 10, // 1k 129 | 1 << 20, // 1M 130 | 10 << 20, // 10M 131 | } 132 | if !testing.Short() { 133 | sizes = append(sizes, 134 | 100<<20, // 100M 135 | 1<<30, // 1G 136 | ) 137 | } 138 | for _, filesize := range sizes { 139 | b.Run(fmt.Sprint(filesize), func(b *testing.B) { 140 | benchToWriterSize(b, w, filesize, syscalls) 141 | }) 142 | } 143 | } 144 | 145 | func benchToWriterSize(b *testing.B, w io.Writer, sz int64, syscalls bool) { 146 | const filesPerArchive = 10 147 | fsys := &zeroFS{ 148 | files: map[string]int64{ 149 | "test": sz, 150 | }, 151 | syscalls: syscalls, 152 | } 153 | 154 | cfg := &Config{ 155 | Out: io.Discard, 156 | FSys: fsys, 157 | Paths: make([]string, 0, filesPerArchive), 158 | MaxBytes: 1 << 40, 159 | } 160 | for i := 0; i < filesPerArchive; i++ { 161 | cfg.Paths = append(cfg.Paths, "test") 162 | } 163 | b.ReportAllocs() 164 | b.ResetTimer() 165 | for n := 0; n < b.N; n++ { 166 | res, err := Write(cfg) 167 | if err != nil { 168 | b.Fatalf("write failed: 
%v", err) 169 | } 170 | b.SetBytes(res.Bytes) 171 | } 172 | } 173 | 174 | type zeroFS struct { 175 | files map[string]int64 176 | syscalls bool 177 | } 178 | 179 | func (fsys *zeroFS) Open(path string) (fs.File, error) { 180 | sz, ok := fsys.files[path] 181 | if !ok { 182 | return nil, fs.ErrNotExist 183 | } 184 | 185 | if fsys.syscalls { 186 | f, err := os.Open("/dev/zero") 187 | if err != nil { 188 | return nil, err 189 | } 190 | return &nullFile{ 191 | File: f, 192 | r: io.LimitReader(f, sz), 193 | }, nil 194 | } 195 | 196 | return &zeroFile{N: sz}, nil 197 | } 198 | 199 | func (fsys *zeroFS) Stat(path string) (fs.FileInfo, error) { 200 | sz, ok := fsys.files[path] 201 | if !ok { 202 | return nil, fs.ErrNotExist 203 | } 204 | return stat{filepath.Base(path), sz}, nil 205 | } 206 | 207 | type stat struct { 208 | name string 209 | sz int64 210 | } 211 | 212 | func (s stat) Name() string { return s.name } 213 | func (s stat) Size() int64 { return s.sz } 214 | func (s stat) Mode() fs.FileMode { return 0600 } 215 | func (s stat) ModTime() time.Time { return time.Time{} } 216 | func (s stat) IsDir() bool { return false } 217 | func (s stat) Sys() any { return nil } 218 | 219 | type nullFile struct { 220 | *os.File 221 | r io.Reader 222 | } 223 | 224 | func (f *nullFile) Read(bs []byte) (int, error) { 225 | return f.r.Read(bs) 226 | } 227 | 228 | type zeroFile struct { 229 | N int64 230 | off int64 231 | closed bool 232 | } 233 | 234 | func (f *zeroFile) Stat() (fs.FileInfo, error) { 235 | panic("don't call zeroFile.Stat") 236 | } 237 | 238 | func (f *zeroFile) Read(bs []byte) (int, error) { 239 | if f.closed { 240 | return 0, fs.ErrClosed 241 | } 242 | if f.off == f.N { 243 | return 0, io.EOF 244 | } 245 | 246 | end := f.off + int64(len(bs)) 247 | if end > f.N { 248 | end = f.N 249 | } 250 | n := end - f.off 251 | f.off = end 252 | return int(n), nil 253 | } 254 | 255 | func (f *zeroFile) Close() error { 256 | f.closed = true 257 | return nil 258 | } 259 | 260 | func 
zeroHash(n int64) string { 261 | bs := make([]byte, int(n)) 262 | hasher, _ := blake2s.New256(nil) 263 | if _, err := hasher.Write(bs); err != nil { 264 | panic("hasher write failed") 265 | } 266 | return hex.EncodeToString(hasher.Sum(nil)) 267 | } 268 | --------------------------------------------------------------------------------