├── README.md
├── bookmark
└── tagbak


/README.md:
--------------------------------------------------------------------------------
# TagBak

A command line utility for saving and restoring OS X tag information. TagBak was inspired by [Michael Simons](http://info.michael-simons.eu/2013/10/25/archiving-os-x-mavericks-tags-and-other-data-with-git/).

---

TagBak can store tags for all files in the current folder and all subfolders. It can then restore the tags to the state they were in at the last run. Files which had existing tags will have any current tags replaced. Files that didn't have tags at the time of the run but have since been tagged will be left with their new tags.

TagBak is intended for use with services that do not currently preserve tag data. If you run a remote backup, for instance, and the service strips tags, you can run TagBak prior to a backup and have the metadata for the backed-up files stored with them. Upon restore, TagBak can read the metadata file and restore the state of the tags as they were at the time of backup.

It can also be used with Git repositories and the like. Run `tagbak store` once, and add the resulting `.metadata.stash` file to the repository. Use hooks (probably post-commit and post-receive) to update it before pushing and restore after pulling from another endpoint.

## Installation

Put the `tagbak` script in a folder in your PATH, such as `/usr/local/bin/`. Make it executable by running `chmod a+x /path/to/tagbak`. Optionally install the `bookmark` utility (see below).

## Usage

`tagbak store` will create a `.metadata.stash` file in the current directory (or one specified with an argument, e.g. `tagbak store ~/Dropbox`). Unless the `-s` switch is given, a progress readout will show the current status of the storage task. On my MacBook Air, it takes about 10 seconds per 100 files and the resulting file is about 10k per 100 files.
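
If you're curious what ends up in the stash: judging from the `store` and `load` methods in the script, it is a gzip-compressed `Marshal` dump of an array of hashes with `filename`, `bookmark`, and `data` keys. The sketch below only illustrates that format. `write_stash` and `read_stash` are hypothetical helper names that are not part of TagBak, and the sample record is made up.

```ruby
require 'zlib'
require 'tmpdir'

# Hypothetical helper: write records in the same shape TagBak's `store` uses
# (a gzipped Marshal dump of an array of hashes).
def write_stash(path, records)
  Zlib::GzipWriter.open(path) { |gz| gz.write(Marshal.dump(records)) }
end

# Hypothetical helper: read the records back, mirroring TagBak's `load`.
def read_stash(path)
  records = nil
  Zlib::GzipReader.open(path) { |gz| records = Marshal.load(gz.read) }
  records
end

Dir.mktmpdir do |dir|
  stash = File.join(dir, '.metadata.stash')

  # Made-up record; real 'data' values are the hex output of `xattr -px`.
  write_stash(stash, [
    { 'filename' => 'notes/todo.txt', 'bookmark' => '', 'data' => '62 70 6C 69' }
  ])

  read_stash(stash).each do |record|
    puts "#{record['filename']}: #{record['data'].length} characters of hex tag data"
  end
end
```

`Marshal.load` can instantiate arbitrary Ruby objects, so only read stash files you created yourself.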

`tagbak restore` will find the nearest `.metadata.stash` file, looking up the folder tree if necessary, and restore the tags from the current folder down the tree.

`tagbak info` will give you a file list, record counts, and the file size for the nearest `.metadata.stash` file.

## Additional command line options

    Usage: tagbak [options] (store|restore|info) [dir [dir...]]
        -s, --silent                     Run silently
            --cli_util PATH              Use alternate location for bookmark tool (default /usr/local/bin/bookmark)
            --ignore_pattern PATTERN     File pattern (regex) to ignore (default "\.(git|svn)\/.*$")
            --debug LEVEL                Debug level [1-3] (default 1)
            --stash_name FILENAME        Use alternate stash file name (default .metadata.stash)
        -h, --help                       Display this screen

## bookmark-cli

If the [bookmark utility](https://github.com/ttscoff/bookmark-cli) is installed in `/usr/local/bin/bookmark`, TagBak will store bookmark information with the metadata. This makes it possible to restore tags on files that have moved or been renamed, but only within the local drive and only if their file data hasn't changed. Restoring from a backup service or an external drive will destroy this data, so it's only a valid precaution in a few cases. Storing this data doesn't cause any major slowdown and the resulting stashes are still a very manageable size, so it doesn't really hurt to include it.

## Guarantee

None. Any data loss or failure to perform is not my responsibility. Run at your own risk.

That being said, the storage process only affects the `.metadata.stash` file and if it's aborted, a backup of that file is restored. The restore process only changes the tag attribute of the affected files, so at worst, you might fail to restore your tags, which in most cases doesn't leave you in any worse position than you were in when you needed to run TagBak.
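
The backup-and-rollback behavior described above can be sketched roughly like this. It mirrors the `backup_stash`, `restore_backup_stash`, and `clean_backup_stash` methods in the script, but `with_stash_backup` is a hypothetical wrapper written only for illustration. Unlike the script, it re-raises the error after rolling back, and it does not cover the Ctrl-C case, which the script handles separately with a signal trap.

```ruby
require 'fileutils'
require 'tmpdir'

# Hypothetical wrapper: copy the stash aside, run the block, and put the
# copy back if the block raises. The backup is always removed afterwards.
def with_stash_backup(stash_file)
  backup = stash_file + '~'
  FileUtils.cp(stash_file, backup) if File.exist?(stash_file)
  yield
rescue StandardError
  # Roll back to the previous stash, then let the caller see the error.
  FileUtils.mv(backup, stash_file) if File.exist?(backup)
  raise
ensure
  FileUtils.rm(backup) if File.exist?(backup)
end

Dir.mktmpdir do |dir|
  stash = File.join(dir, '.metadata.stash')
  File.write(stash, 'previous contents')

  begin
    with_stash_backup(stash) do
      File.write(stash, 'half-written')
      raise 'simulated failure'
    end
  rescue RuntimeError
    puts File.read(stash) # prints "previous contents"
  end
end
```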

--------------------------------------------------------------------------------
/bookmark:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/ttscoff/tagbak/86144aaece6086a39da4c226c6b9b3c8867a4a8f/bookmark


--------------------------------------------------------------------------------
/tagbak:
--------------------------------------------------------------------------------
#!/usr/bin/ruby
# TODO: Save tags should check for existing file and merge data?
# TODO: Ability to restore tags on specific files, handy in a git hook to only re-tag changed files.
# mdfind -0 -onlyin ~ 'kMDItemUserTags = "=*"c' | xargs -0 ~/scripts/tagbak

%w[fileutils zlib logger shellwords optparse].each do |filename|
  require filename
end

class TagBak

  attr_writer :base_dir, :ignore_pattern

  def initialize(base = Dir.getwd, options = {})
    # Locate the .metadata.stash file, working up the directory tree
    @opts = options
    @base_dir = base
    @stash_name = @opts[:stash_name] || '.metadata.stash'
    @cli_util = @opts[:cli_util] || '/usr/local/bin/bookmark'
    # Single quotes keep the backslashes, so the dot is matched literally
    pattern = @opts[:ignore_pattern] || '\.(git|svn)\/.*$'
    @ignore_pattern = %r{#{pattern}}
    @stash_file ||= File.join(@base_dir, @stash_name)
    @log = Logger.new(STDERR)
    if @opts[:debug]
      case @opts[:debug].to_i
      when 1
        @log.level = Logger::WARN
      when 2
        @log.level = Logger::INFO
      else
        @log.level = Logger::DEBUG
      end
    else
      @log.level = Logger::INFO
    end

    @log.progname = self.class.name
    @log.formatter = proc { |severity, datetime, progname, msg|
      abbr_sev = case severity
        when 'WARN' then "> "
        when 'ERROR' then ">> "
        when 'FATAL' then "FATAL > "
        when 'INFO' then "* "
        else "(d) "
      end

      unless @opts[:silent]
        "#{abbr_sev}#{datetime.strftime('%H:%M:%S')} #{progname}: #{msg}\n"
      end
    }

    Dir.chdir(@base_dir)
    @start_time = Time.now
  end

  def debug
    @log.level = Logger::DEBUG
  end

  def store_tags
    datastore = []
    files = %x{mdfind -onlyin . 'kMDItemUserTags = "*"'}
    files = files.encode('utf-8', 'binary', :invalid => :replace, :undef => :replace)
    files = files.split("\n")
    files.delete_if {|file| file =~ @ignore_pattern }

    trap("SIGINT") { exit_cleanly }

    begin
      backup_stash
      store([]) # create stash and clobber existing file

      total = files.length

      clear
      @log.info("Storing tags for #{total} files in #{@base_dir}")
      term_width = get_term_width
      files.each_with_index { |file, i|
        # Escape the base dir so regex metacharacters in the path are literal
        filename = file.sub(/#{Regexp.escape(@base_dir)}\//,"")
        clear
        printf "\r[%5d / %5d] %s\r", i + 1, total, filename[0..(term_width - 30)] unless @opts[:silent]
        if datastore.length >= 500
          @log.debug("Storing data to #{@stash_file}")
          datastore = swapout(datastore)
        end
        # inode = File.stat(file).ino
        if bookmark_util_installed?
          bookmark = %x{#{@cli_util} save #{Shellwords.escape(file)}}.strip
        else
          bookmark = ""
        end
        data = %x{xattr -px com.apple.metadata:_kMDItemUserTags #{Shellwords.escape(file)} 2> /dev/null}
        if data && data.length > 0
          datastore << {
            'filename' => filename,
            # 'inode' => inode,
            'bookmark' => bookmark,
            'data' => data
          }

        end
      }
      swapout(datastore)
      finish_timer("storing tags for #{files.length} files")
    rescue => e
      # Report the failure, then put the previous stash back
      @log.error("Storing tags failed: #{e.message}")
      restore_backup_stash
    ensure
      clean_backup_stash
    end
  end

  def bookmark_util_installed?
    File.file?(@cli_util)
  end

  def restore_tags

    unless stash_found?
      # stash_found? has already logged the error
      Process.exit 1
    end

    datastore = load
    files = datastore.select { |h| File.join(File.dirname(@stash_file), h['filename']) =~ /#{Regexp.escape(@base_dir)}/ }

    total = files.length

    clear
    @log.info("Restoring tags for #{total} files in #{@base_dir}")

    files.each_with_index { |entry, i|
      file = File.expand_path(entry['filename'])
      clear
      printf "[%5d / %5d] %s\r", i + 1, total, file unless @opts[:silent]
      unless File.exist?(file)
        # @log.debug("File #{file} no longer exists in #{File.dirname(file)}, trying inode")
        # find_result = %x{find #{Shellwords.escape(@base_dir)} -inum #{entry['inode']}}.strip
        if bookmark_util_installed?
          @log.debug("File #{file} no longer exists in #{File.dirname(file)}, trying bookmark")
          find_result = %x{#{@cli_util} find #{Shellwords.escape(entry['bookmark'])}}.strip

          if find_result != ""
            file = File.expand_path(find_result)
          else
            @log.debug("bookmark no longer exists")
            next
          end
        end
      end

      next unless File.exist?(file)

      hex = entry['data']

      if hex && hex.length > 0
        %x{xattr -wx com.apple.metadata:_kMDItemUserTags "#{hex}" #{Shellwords.escape(file)}}
        @log.debug("Restored tags for: #{file}")
      end
    }
    finish_timer("restoring tags for #{files.length} files")
  end

  def stash_info
    if stash_found?
      stash = load
      curr_dir = @base_dir.sub(/^#{Regexp.escape(File.dirname(@stash_file))}\/?/,'').sub(/\/$/,'')
      puts curr_dir
      curr_stash = stash.dup.delete_if {|file|
        file['filename'] !~ /^#{Regexp.escape(curr_dir)}/
      }
      curr_stash.each {|file|
        $stdout.printf "%s\n", file['filename'] unless @opts[:silent]
      }
      $stdout.puts "------------------------------------"

      $stdout.puts "Located stash at #{@stash_file}"
      $stdout.puts "Stash Size: #{readable_file_size(File.stat(@stash_file).size)}"
      $stdout.puts "Records for current dir: #{curr_stash.length}"
      $stdout.puts "Total Records: #{stash.length}"
    else
      $stdout.puts "No stash file found for current directory"
    end
  end

  private

  def get_tags(file)
    rawtags = %x{mdls -raw -name 'kMDItemUserTags' #{Shellwords.escape(file)}}
    rawtags.gsub(/\n?[\(\)]\n?/m,'').strip.split(/,\n\s+/)
  end

  # Walk up the directory tree until a stash file is found
  def stash_found?
    until File.exist?(@stash_file)
      if File.expand_path(Dir.getwd).split('/').length > 2
        Dir.chdir('..')
        @stash_file = File.join(Dir.getwd, @stash_name)
      else
        break
      end
    end

    unless File.exist?(@stash_file)
      @log.error("No metadata stash found for current tree.")
      return false
    else
      clear
      @log.info("Using metadata stash from #{@stash_file}")
      return true
    end
  end

  # Add data in memory to disk stash
  def swapout(arr)
    stored_data = load_or_create
    stored_data.concat(arr)
    if store(stored_data)
      []
    else
      false
    end
  end

  # Store a gzipped Marshal dump of the record array
  def store(obj)
    begin
      marshal_dump = Marshal.dump(obj)
      file = Zlib::GzipWriter.new(File.new(@stash_file,'w'))
      file.write marshal_dump
      file.close
    rescue
      @log.error("Failed to write stash to disk")
      return false
    end
    return true
  end

  # Load the gzipped Marshal dump back into an array of records
  def load
    file = Zlib::GzipReader.open(@stash_file)
    begin
      Marshal.load(file.read)
    ensure
      file.close
    end
  rescue Zlib::GzipFile::Error
    @log.error("Error reading #{@stash_file}")
    raise "Error reading #{@stash_file}"
  end

  def load_or_create
    unless File.exist?(@stash_file)
      store([])
    end
    load
  end

  def backup_stash
    if File.exist?(@stash_file)
      FileUtils.cp @stash_file, @stash_file+'~'
    end
  end

  def restore_backup_stash
    if File.exist?(@stash_file+'~')
      FileUtils.mv @stash_file+'~', @stash_file
    end
    clean_backup_stash
  end

  def clean_backup_stash
    if File.exist?(@stash_file+'~')
      FileUtils.rm @stash_file+'~'
    end
  end

  def finish_timer(task)
    finish_time = Time.now - @start_time
    total_min = "%d" % (finish_time / 60)
    total_sec = "%02d" % (finish_time % 60)
    clear
    @log.info("Finished #{task} in #{total_min}:#{total_sec}")
  end

  # Blank out the current terminal line
  def clear
    print "\r"
    print(" " * (get_term_width - 1))
    print "\r"
  end

  def get_term_width
    begin
      if ENV['COLUMNS'] =~ /^\d+$/
        ENV['COLUMNS'].to_i
      else
        cols = `tput cols`
        cols.length > 0 ? cols.to_i : 80
      end
    rescue
      80
    end
  end

  def exit_cleanly
    restore_backup_stash
    Process.exit 1
  end

  # Return the file size with a readable style.
  def readable_file_size(size)
    gigsize = 1073741824.0
    megsize = 1048576.0
    ksize = 1024.0
    case
    when size == 1
      "1 Byte"
    when size < ksize
      "%d Bytes" % size
    when size < megsize
      "%.2f KB" % (size / ksize)
    when size < gigsize
      "%.2f MB" % (size / megsize)
    else
      "%.2f GB" % (size / gigsize)
    end
  end

end # TagBak class

### CLI

def usage
  $stderr.puts "Usage: #{File.basename(__FILE__)} [options] (store|restore|info) [dir [dir...]]"
  $stderr.puts "#{File.basename(__FILE__)} -h for more info."
  Process.exit 1
end

def perform_action(type, base_dir, options)
  t = TagBak.new(base_dir, options)

  case type
  when 'restore'
    t.restore_tags
  when 'store'
    t.store_tags
  when 'info'
    t.stash_info
  end
end

options = {}
optparse = OptionParser.new do|opts|
  opts.banner = "Usage: #{File.basename(__FILE__)} [options] (store|restore|info) [dir [dir...]]"

  options[:silent] = false
  opts.on( '-s', '--silent', 'Run silently' ) do
    options[:silent] = true
  end

  opts.on('--cli_util PATH', 'Use alternate location for bookmark tool (default /usr/local/bin/bookmark)') do |cli|
    options[:cli_util] = cli
  end

  opts.on('--ignore_pattern PATTERN', 'File pattern (regex) to ignore (default "\.(git|svn)\/.*$")') do |ignore|
    options[:ignore_pattern] = ignore
  end

  opts.on('--debug LEVEL', 'Debug level [1-3] (default 1)') do |debug|
    options[:debug] = debug
  end

  opts.on('--stash_name FILENAME', 'Use alternate stash file name (default .metadata.stash)' ) do |stash|
    options[:stash_name] = stash
  end

  opts.on( '-h', '--help', 'Display this screen' ) do
    puts opts
    exit
  end
end

optparse.parse!

usage if ARGV.length == 0
rel_base = Dir.getwd

# The patterns are anchored at both ends so that a directory argument such as
# "src" is not mistaken for the "store" command.
type = 'store'
case ARGV[0]
when /^r(estore)?$/
  type = 'restore'
  ARGV.shift
when /^s(tore|ave)?$/
  type = 'store'
  ARGV.shift
when /^i(nfo)?$/
  type = 'info'
  ARGV.shift
else
  usage unless File.exist?(File.expand_path(ARGV[0]))
end

if ARGV.length == 0
  base_dir = Dir.getwd
  perform_action(type, base_dir, options)
else
  ARGV.each {|arg|
    target = arg =~ /^\// ? arg : File.join(rel_base, arg)
    if File.directory? File.expand_path(target)
      base_dir = File.expand_path(target)
      perform_action(type, base_dir, options)
    else
      $stderr.puts "#{arg} is not a directory"
      next
    end
  }
end

--------------------------------------------------------------------------------