├── LICENSE
├── README
└── imapbackup.py


/LICENSE:
--------------------------------------------------------------------------------
Copyright (C) 2007, Rui Carmo, Michael Leonhard

All rights reserved.

Redistribution and use in source and binary forms, with or without
modification, are permitted provided that the following conditions are met:

    * Redistributions of source code must retain the above copyright notice, this
      list of conditions and the following disclaimer.
    * Redistributions in binary form must reproduce the above copyright notice,
      this list of conditions and the following disclaimer in the documentation
      and/or other materials provided with the distribution.
    * Neither the name of the nor the names of its contributors may
      be used to endorse or promote products derived from this software without
      specific prior written permission.

THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND
ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED
WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE
DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE LIABLE FOR
ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES
(INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES;
LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON
ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
(INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS
SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.

--------------------------------------------------------------------------------
/README:
--------------------------------------------------------------------------------
NOTE: This is an old version.  imapbackup now lives at https://github.com/rcarmo/imapbackup .

IMAP Backup Tool
http://github.com/mleonhard/imapbackup
http://tamale.net/imapbackup/

This program incrementally backs up IMAP folders to local mbox
files.  New messages are appended to the folder's mbox file.


= Features =

* Downloads all IMAP folders
* Stores messages in mbox, mbox.gz, or mbox.bz2
* Each folder downloads to its own mbox file, eg. Inbox.Drafts.mbox
* Downloads only new messages, appends them to the mbox file
* IMAP4 SSL, supporting client and server certificates
* Accesses IMAP account in read-only mode.  Does not affect message 'seen' status.


= Usage =

Usage: imapbackup [OPTIONS] -s HOST -u USERNAME [-p PASSWORD]
 -a --append-to-mboxes     Append new messages to mbox files. (default)
 -y --yes-overwrite-mboxes Overwrite existing mbox files instead of appending.
 -n --compress=none        Use one plain mbox file for each folder. (default)
 -z --compress=gzip        Use mbox.gz files.  Appending may be very slow.
 -b --compress=bzip2       Use mbox.bz2 files. Appending not supported: use -y.
 -f LIST --folders=LIST    Specify which folders to use.  Comma separated list.
 -e --ssl                  Use SSL.  Port defaults to 993.
 -k KEY --key=KEY          PEM private key file for SSL.  Specify cert, too.
 -c CERT --cert=CERT       PEM certificate chain for SSL.  Specify key, too.
                           Python's SSL module doesn't check the cert chain.
 -s HOST --server=HOST     Address of server, port optional, eg. mail.com:143
 -u USER --user=USER       Username to log into server
 -p PASS --pass=PASS       Prompts for password if not specified.

NOTE: mbox files are created in the current working directory.
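
The -s HOST option above accepts an optional port.  As a rough sketch of that
rule (written for Python 3, unlike the script itself, which targets Python 2.5;
split_host_port is a hypothetical helper name, not a function in
imapbackup.py), the host and port can be separated like this, falling back to
993 with SSL and 143 without:

```python
def split_host_port(server, use_ssl=False):
    """Split 'host[:port]' the way the -s option is documented.

    Hypothetical helper for illustration only; imapbackup.py does this
    inside its own config checking code.
    """
    host, sep, port_text = server.partition(":")
    if sep and port_text:
        port = int(port_text)  # raises ValueError for a non-numeric port
        if not 0 <= port <= 65535:
            raise ValueError("port must be an integer between 0 and 65535")
    else:
        # No port given: 993 when SSL is requested, otherwise 143.
        port = 993 if use_ssl else 143
    return host, port


print(split_host_port("mail.com:143"))
print(split_host_port("mail.com", use_ssl=True))
```

An empty port, as in "mail.com:", is treated the same as no port at all.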


= Example =

$ python imapbackup.py -s shevek.tamale.net -u michael -e -f INBOX
Password:
Connecting to 'shevek.tamale.net' TCP port 993, SSL
Logging in as 'michael'
Finding Folders: 67 folders
Folder INBOX: 1231 messages
File INBOX.mbox /
WARNING: Message #117 in INBOX.mbox has a malformed Message-Id header.
File INBOX.mbox -
WARNING: Message #269 in INBOX.mbox has a malformed Message-Id header.
File INBOX.mbox -
WARNING: Message #498 in INBOX.mbox has a malformed Message-Id header.
File INBOX.mbox /
WARNING: Message #609 in INBOX.mbox has a malformed Message-Id header.
File INBOX.mbox |
WARNING: Message #976 in INBOX.mbox has a malformed Message-Id header.
File INBOX.mbox -
WARNING: Message #1042 in INBOX.mbox has a malformed Message-Id header.
File INBOX.mbox: 1230 messages
Downloading 1 new messages to INBOX.mbox: 1.89 KB total, 1.89 KB for largest message
Disconnecting
$


= Compatibility =

Python 2.5.  May work on Python 2.4 with minor tweaks.


= Changes =

2009-12-12 v1.4c
  * Use hashlib module instead of deprecated sha module - Ronan Sheth
  * Added --folders argument - Giuseppe Scrivano
  * imapbackup disappeared from Rui Carmo's site.  This version found at:
    https://gist.github.com/raw/273418/fe7c59f69ba57c40dde8c8c33d6105f46f458df8/imapbackup.py
2008-12-08 v1.4b
  * Fetch with BODY.PEEK instead of BODY, to avoid marking Gmail messages as
    read - Brandon Long (Gmail team)
2007-05-28 v1.4a
  * SSL support!  Can use a private key file and server certificate chain file.
    Unfortunately, Python's ssl module doesn't check the certificate chain.  This
    needs to be fixed.
  * You can now specify the port number as part of the server name.  Example:
    imapbackup.py -u user -s mail.com:1234
  * Cleaned up code.  Used pylint to find code that didn't comply with best
    practices.
2007-05-27 v1.3b
  * Fixed bug in error message printout.
2007-05-26 v1.3a
  * Better support for result of LIST command.  Fixes the problem of some folders
    not getting backed up from Courier IMAPd
  * Improved usage printout, made parameters more consistent.
  * Added support for socket._fileobject.recv bugfix on Windows
2007-03-27 v1.2e
  * By Rui Carmo.  Downloaded from:
    http://the.taoofmac.com/space/Projects/imapbackup
    http://web.archive.org/web/20071011040436/http://the.taoofmac.com/space/Projects/imapbackup

--------------------------------------------------------------------------------
/imapbackup.py:
--------------------------------------------------------------------------------
#!/usr/bin/env python

"""IMAP Incremental Backup Script v.1.4c (Dec 12 2009) found on http://the.taoofmac.com/space/Projects/imapbackup"""
__version__ = "1.4c"
__author__ = "Rui Carmo (http://the.taoofmac.com)"
__copyright__ = "(C) 2006 Rui Carmo. Code under BSD License.\n(C)"
__contributors__ = "Bob Ippolito, Michael Leonhard, Giuseppe Scrivano, Ronan Sheth, Brandon Long"

# = Contributors =
# Brandon Long (Gmail team): Reminder to use BODY.PEEK instead of BODY
# Ronan Sheth: hashlib patch (this now requires Python 2.5, although reverting it back is trivial)
# Giuseppe Scrivano: Added support for folders.
# Michael Leonhard: LIST result parsing, SSL support, revamped argument processing,
#                   moved spinner into class, extended recv fix to Windows
# Bob Ippolito: fix for MemoryError on socket recv, http://python.org/sf/1092502
# Rui Carmo: original author, up to v1.2e

# = TODO =
# - Add proper exception handlers to scan_file() and download_messages()
# - Migrate mailbox usage from rfc822 module to email module
# - Investigate using the noseek mailbox/email option to improve speed
# - Use the email module to normalize downloaded messages
#   and add missing Message-Id
# - Test parse_list() and its descendants on other imapds
# - Test bzip2 support
# - Add option to download only subscribed folders
# - Add regex option to filter folders
# - Use a single IMAP command to get Message-IDs
# - Use a single IMAP command to fetch the messages
# - Add option to turn off spinner.  Since sys.stdin.isatty() doesn't work on
#   Windows, redirecting output to a file results in junk output.
# - Patch Python's ssl module to do proper checking of certificate chain
# - Patch Python's ssl module to raise good exceptions
# - Submit patch of socket._fileobject.read
# - Improve imaplib module with LIST parsing code, submit patch
# DONE:
# v1.3c
# - Add SSL support
# - Support host:port
# - Cleaned up code using PyLint to identify problems
#   pylint -f html --indent-string="  " --max-line-length=90 imapbackup.py > report.html
import getpass, os, gc, sys, time, platform, getopt
import mailbox, imaplib, socket
import re, hashlib, gzip, bz2

class SkipFolderException(Exception):
  """Indicates aborting processing of current folder, continue with next folder."""
  pass

class Spinner:
  """Prints out message with cute spinner, indicating progress"""

  def __init__(self, message):
    """Spinner constructor"""
    self.glyphs = "|/-\\"
    self.pos = 0
    self.message = message
    sys.stdout.write(message)
    sys.stdout.flush()
    self.spin()

  def spin(self):
    """Rotate the spinner"""
    if sys.stdin.isatty():
      sys.stdout.write("\r" + self.message + " " + self.glyphs[self.pos])
      sys.stdout.flush()
      self.pos = (self.pos+1) % len(self.glyphs)

  def stop(self):
    """Erase the spinner from the screen"""
    if sys.stdin.isatty():
      sys.stdout.write("\r" + self.message + "  ")
      sys.stdout.write("\r" + self.message)
      sys.stdout.flush()

def pretty_byte_count(num):
  """Converts integer into a human friendly count of bytes, eg: 12.243 MB"""
  if num == 1:
    return "1 byte"
  elif num < 1024:
    return "%s bytes" % (num)
  elif num < 1048576:
    return "%.2f KB" % (num/1024.0)
  elif num < 1073741824:
    return "%.3f MB" % (num/1048576.0)
  elif num < 1099511627776:
    return "%.3f GB" % (num/1073741824.0)
  else:
    return "%.3f TB" % (num/1099511627776.0)


# Regular expressions for parsing
MSGID_RE = re.compile("^Message\-Id\: (.+)", re.IGNORECASE + re.MULTILINE)
BLANKS_RE = re.compile(r'\s+', re.MULTILINE)

# Constants
UUID = '19AF1258-1AAF-44EF-9D9A-731079D6FAD7' # Used to generate Message-Ids

def download_messages(server, filename, messages, config):
  """Download messages from folder and append to mailbox"""

  if config['overwrite']:
    if os.path.exists(filename):
      print "Deleting", filename
      os.remove(filename)
    return []
  else:
    assert('bzip2' != config['compress'])

  # Open disk file
  if config['compress'] == 'gzip':
    mbox = gzip.GzipFile(filename, 'ab', 9)
  elif config['compress'] == 'bzip2':
    mbox = bz2.BZ2File(filename, 'wb', 512*1024, 9)
  else:
    mbox = file(filename, 'ab')

  # the folder has already been selected by scan_folder()

  # nothing to do
  if not len(messages):
    print "New messages: 0"
    mbox.close()
    return

  spinner = Spinner("Downloading %s new messages to %s" % (len(messages), filename))
  total = biggest = 0

  # each new message
  for msg_id in messages.keys():
    # This "From" and the terminating newline below delimit messages
    # in mbox files.  The date is asctime-style, so the month is written
    # as an abbreviated name (%b), not as a number (%m).
    buf = "From nobody %s\n" % time.strftime('%a %b %d %H:%M:%S %Y')
    # If this is one of our synthesised Message-IDs, insert it before
    # the other headers
    if UUID in msg_id:
      buf = buf + "Message-Id: %s\n" % msg_id
    mbox.write(buf)

    # fetch message
    typ, data = server.fetch(messages[msg_id], "RFC822")
    assert('OK' == typ)
    text = data[0][1].strip().replace('\r','')
    mbox.write(text)
    mbox.write('\n\n')

    size = len(text)
    biggest = max(size, biggest)
    total += size

    del data
    gc.collect()
    spinner.spin()

  mbox.close()
  spinner.stop()
  print ": %s total, %s for largest message" % (pretty_byte_count(total),
                                                pretty_byte_count(biggest))

def scan_file(filename, compress, overwrite):
  """Gets IDs of messages in the specified mbox file"""
  # file will be overwritten
  if overwrite:
    return []
  else:
    assert('bzip2' != compress)

  # file doesn't exist
  if not os.path.exists(filename):
    print "File %s: not found" % (filename)
    return []

  spinner = Spinner("File %s" % (filename))

  # open the file
  if compress == 'gzip':
    mbox = gzip.GzipFile(filename,'rb')
  elif compress == 'bzip2':
    mbox = bz2.BZ2File(filename,'rb')
  else:
    mbox = file(filename,'rb')

  messages = {}

  # each message
  i = 0
  for message in mailbox.PortableUnixMailbox(mbox):
    header = ''
    # We assume all messages on disk have message-ids
    try:
      header = ''.join(message.getfirstmatchingheader('message-id'))
    except KeyError:
      # No message ID was found. Warn the user and move on
      print
      print "WARNING: Message #%d in %s" % (i, filename),
      print "has no Message-Id header."

    header = BLANKS_RE.sub(' ', header.strip())
    try:
      msg_id = MSGID_RE.match(header).group(1)
      if msg_id not in messages.keys():
        # avoid adding dupes
        messages[msg_id] = msg_id
    except AttributeError:
      # Message-Id was found but could somehow not be parsed by regexp
      # (highly bloody unlikely)
      print
      print "WARNING: Message #%d in %s" % (i, filename),
      print "has a malformed Message-Id header."
    spinner.spin()
    i = i + 1

  # done
  mbox.close()
  spinner.stop()
  print ": %d messages" % (len(messages.keys()))
  return messages

def scan_folder(server, foldername):
  """Gets IDs of messages in the specified folder, returns id:num dict"""
  messages = {}
  spinner = Spinner("Folder %s" % (foldername))
  try:
    typ, data = server.select(foldername, readonly=True)
    if 'OK' != typ:
      raise SkipFolderException("SELECT failed: %s" % (data))
    num_msgs = int(data[0])

    # each message
    for num in range(1, num_msgs+1):
      # Retrieve Message-Id, making sure we don't mark all messages as read
      typ, data = server.fetch(num, '(BODY.PEEK[HEADER.FIELDS (MESSAGE-ID)])')
      if 'OK' != typ:
        raise SkipFolderException("FETCH %s failed: %s" % (num, data))

      header = data[0][1].strip()
      # remove newlines inside Message-Id (a dumb Exchange trait)
      header = BLANKS_RE.sub(' ', header)
      try:
        msg_id = MSGID_RE.match(header).group(1)
        if msg_id not in messages.keys():
          # avoid adding dupes
          messages[msg_id] = num
      except (IndexError, AttributeError):
        # Some messages may have no Message-Id, so we'll synthesise one
        # (this usually happens with Sent, Drafts and .Mac news)
        typ, data = server.fetch(num, '(BODY[HEADER.FIELDS (FROM TO CC DATE SUBJECT)])')
        if 'OK' != typ:
          raise SkipFolderException("FETCH %s failed: %s" % (num, data))
        header = data[0][1].strip()
        header = header.replace('\r\n','\t')
        messages['<' + UUID + '.' + hashlib.sha1(header).hexdigest() + '>'] = num
      spinner.spin()
  finally:
    spinner.stop()
    print ":",

  # done
  print "%d messages" % (len(messages.keys()))
  return messages

def parse_paren_list(row):
  """Parses the nested list of attributes at the start of a LIST response"""
  # eat starting paren
  assert(row[0] == '(')
  row = row[1:]

  result = []

  # NOTE: RFC3501 doesn't fully define the format of name attributes
  name_attrib_re = re.compile("^\s*(\\\\[a-zA-Z0-9_]+)\s*")

  # eat name attributes until ending paren
  while row[0] != ')':
    # recurse
    if row[0] == '(':
      paren_list, row = parse_paren_list(row)
      result.append(paren_list)
    # consume name attribute
    else:
      match = name_attrib_re.search(row)
      assert(match != None)
      name_attrib = row[match.start():match.end()]
      row = row[match.end():]
      #print "MATCHED '%s' '%s'" % (name_attrib, row)
      name_attrib = name_attrib.strip()
      result.append(name_attrib)

  # eat ending paren
  assert(')' == row[0])
  row = row[1:]

  # done!
  return result, row

def parse_string_list(row):
  """Parses the quoted and unquoted strings at the end of a LIST response"""
  slist = re.compile('\s*(?:"([^"]+)")\s*|\s*(\S+)\s*').split(row)
  return [s for s in slist if s]

def parse_list(row):
  """Parses response of LIST command into a list"""
  row = row.strip()
  paren_list, row = parse_paren_list(row)
  string_list = parse_string_list(row)
  assert(len(string_list) == 2)
  return [paren_list] + string_list

def get_hierarchy_delimiter(server):
  """Queries the imapd for the hierarchy delimiter, eg. '.' in INBOX.Sent"""
  # see RFC 3501 page 39 paragraph 4
  typ, data = server.list('', '')
  assert(typ == 'OK')
  assert(len(data) == 1)
  lst = parse_list(data[0]) # [attribs, hierarchy delimiter, root name]
  hierarchy_delim = lst[1]
  # NIL if there is no hierarchy
  if 'NIL' == hierarchy_delim:
    hierarchy_delim = '.'
  return hierarchy_delim

def get_names(server, compress):
  """Get list of folders, returns [(FolderName,FileName)]"""

  spinner = Spinner("Finding Folders")

  # Get hierarchy delimiter
  delim = get_hierarchy_delimiter(server)
  spinner.spin()

  # Get LIST of all folders
  typ, data = server.list()
  assert(typ == 'OK')
  spinner.spin()

  names = []

  # parse each LIST, find folder name
  for row in data:
    lst = parse_list(row)
    foldername = lst[2]
    suffix = {'none':'', 'gzip':'.gz', 'bzip2':'.bz2'}[compress]
    filename = '.'.join(foldername.split(delim)) + '.mbox' + suffix
    names.append((foldername, filename))

  # done
  spinner.stop()
  print ": %s folders" % (len(names))
  return names

def print_usage():
  """Prints usage, exits"""
  # " "
  print "Usage: imapbackup [OPTIONS] -s HOST -u USERNAME [-p PASSWORD]"
  print " -a --append-to-mboxes     Append new messages to mbox files. (default)"
  print " -y --yes-overwrite-mboxes Overwrite existing mbox files instead of appending."
  print " -n --compress=none        Use one plain mbox file for each folder. (default)"
  print " -z --compress=gzip        Use mbox.gz files.  Appending may be very slow."
  print " -b --compress=bzip2       Use mbox.bz2 files. Appending not supported: use -y."
  print " -f LIST --folders=LIST    Specify which folders to use.  Comma separated list."
  print " -e --ssl                  Use SSL.  Port defaults to 993."
  print " -k KEY --key=KEY          PEM private key file for SSL.  Specify cert, too."
  print " -c CERT --cert=CERT       PEM certificate chain for SSL.  Specify key, too."
  print "                           Python's SSL module doesn't check the cert chain."
  print " -s HOST --server=HOST     Address of server, port optional, eg. mail.com:143"
  print " -u USER --user=USER       Username to log into server"
  print " -p PASS --pass=PASS       Prompts for password if not specified."
  print "\nNOTE: mbox files are created in the current working directory."
  sys.exit(2)

def process_cline():
  """Uses getopt to process command line, returns (config, warnings, errors)"""
  # read command line
  try:
    short_args = "aynzbek:c:s:u:p:f:"
    long_args = ["append-to-mboxes", "yes-overwrite-mboxes", "compress=",
                 "ssl", "keyfile=", "certfile=", "server=", "user=", "pass=", "folders="]
    opts, extraargs = getopt.getopt(sys.argv[1:], short_args, long_args)
  except getopt.GetoptError:
    print_usage()

  warnings = []
  config = {'compress':'none', 'overwrite':False, 'usessl':False}
  errors = []

  # empty command line
  if not len(opts) and not len(extraargs):
    print_usage()

  # process each command line option, save in config
  for option, value in opts:
    if option in ("-a", "--append-to-mboxes"):
      config['overwrite'] = False
    elif option in ("-y", "--yes-overwrite-mboxes"):
      warnings.append("Existing mbox files will be overwritten!")
      config["overwrite"] = True
    elif option == "-n":
      config['compress'] = 'none'
    elif option == "-z":
      config['compress'] = 'gzip'
    elif option == "-b":
      config['compress'] = 'bzip2'
    elif option == "--compress":
      if value in ('none', 'gzip', 'bzip2'):
        config['compress'] = value
      else:
        errors.append("Invalid compression type specified.")
    elif option in ("-e", "--ssl"):
      config['usessl'] = True
    elif option in ("-k", "--keyfile"):
      config['keyfilename'] = value
    elif option in ("-f", "--folders"):
      config['folders'] = value
    elif option in ("-c", "--certfile"):
      config['certfilename'] = value
    elif option in ("-s", "--server"):
      config['server'] = value
    elif option in ("-u", "--user"):
      config['user'] = value
    elif option in ("-p", "--pass"):
      config['pass'] = value
    else:
      errors.append("Unknown option: " + option)

  # don't ignore extra arguments
  for arg in extraargs:
    errors.append("Unknown argument: " + arg)

  # done processing command line
  return (config, warnings, errors)

def check_config(config, warnings, errors):
  """Checks the config for consistency, returns (config, warnings, errors)"""

  if config['compress'] == 'bzip2' and config['overwrite'] == False:
    errors.append("Cannot append new messages to mbox.bz2 files.  Please specify -y.")
  if config['compress'] == 'gzip' and config['overwrite'] == False:
    warnings.append(
      "Appending new messages to mbox.gz files is very slow.  Please Consider\n"
      "  using -y and compressing the files yourself with gzip -9 *.mbox")
  if 'server' not in config :
    errors.append("No server specified.")
  if 'user' not in config:
    errors.append("No username specified.")
  if ('keyfilename' in config) ^ ('certfilename' in config):
    errors.append("Please specify both key and cert or neither.")
  if 'keyfilename' in config and not config['usessl']:
    errors.append("Key specified without SSL.  Please use -e or --ssl.")
  if 'certfilename' in config and not config['usessl']:
    errors.append("Certificate specified without SSL.  Please use -e or --ssl.")
  if 'server' in config and ':' in config['server']:
    # get host and port strings
    bits = config['server'].split(':', 1)
    config['server'] = bits[0]
    # port specified, convert it to int
    if len(bits) > 1 and len(bits[1]) > 0:
      try:
        port = int(bits[1])
        if port > 65535 or port < 0:
          raise ValueError
        config['port'] = port
      except ValueError:
        errors.append("Invalid port.  Port must be an integer between 0 and 65535.")
  return (config, warnings, errors)

def get_config():
  """Gets config from command line and console, returns config"""
  # config = {
  #   'compress': 'none' or 'gzip' or 'bzip2'
  #   'overwrite': True or False
  #   'server': String
  #   'port': Integer
  #   'user': String
  #   'pass': String
  #   'usessl': True or False
  #   'keyfilename': String or None
  #   'certfilename': String or None
  # }

  config, warnings, errors = process_cline()
  config, warnings, errors = check_config(config, warnings, errors)

  # show warnings
  for warning in warnings:
    print "WARNING:", warning

  # show errors, exit
  for error in errors:
    print "ERROR:", error
  if len(errors):
    sys.exit(2)

  # prompt for password, if necessary
  if 'pass' not in config:
    config['pass'] = getpass.getpass()

  # defaults
  if not 'port' in config:
    if config['usessl']:
      config['port'] = 993
    else:
      config['port'] = 143

  # done!
  return config

def connect_and_login(config):
  """Connects to the server and logs in.  Returns IMAP4 object."""
  try:
    assert(not (('keyfilename' in config) ^ ('certfilename' in config)))

    if config['usessl'] and 'keyfilename' in config:
      print "Connecting to '%s' TCP port %d," % (config['server'], config['port']),
      print "SSL, key from %s," % (config['keyfilename']),
      print "cert from %s " % (config['certfilename'])
      server = imaplib.IMAP4_SSL(config['server'], config['port'],
                                 config['keyfilename'], config['certfilename'])
    elif config['usessl']:
      print "Connecting to '%s' TCP port %d, SSL" % (config['server'], config['port'])
      server = imaplib.IMAP4_SSL(config['server'], config['port'])
    else:
      print "Connecting to '%s' TCP port %d" % (config['server'], config['port'])
      server = imaplib.IMAP4(config['server'], config['port'])

    print "Logging in as '%s'" % (config['user'])
    server.login(config['user'], config['pass'])
  except socket.gaierror, e:
    (err, desc) = e
    print "ERROR: problem looking up server '%s' (%s %s)" % (config['server'], err, desc)
    sys.exit(3)
  except socket.error, e:
    if str(e) == "SSL_CTX_use_PrivateKey_file error":
      print "ERROR: error reading private key file '%s'" % (config['keyfilename'])
    elif str(e) == "SSL_CTX_use_certificate_chain_file error":
      # report the certificate file here, not the key file
      print "ERROR: error reading certificate chain file '%s'" % (config['certfilename'])
    else:
      print "ERROR: could not connect to '%s' (%s)" % (config['server'], e)

    sys.exit(4)

  return server

def main():
  """Main entry point"""
  try:
    config = get_config()
    server = connect_and_login(config)
    names = get_names(server, config['compress'])

    if config.get('folders'):
      dirs = map (lambda x: x.strip(), config.get('folders').split(','))
      names = filter (lambda x: x[0] in dirs, names)

    #for n in range(len(names)):
    #  print n, names[n]

    for name_pair in names:
      try:
        foldername, filename = name_pair
        fol_messages = scan_folder(server, foldername)
        fil_messages = scan_file(filename, config['compress'], config['overwrite'])

        new_messages = {}
        for msg_id in fol_messages:
          if msg_id not in fil_messages:
            new_messages[msg_id] = fol_messages[msg_id]

        #for f in new_messages:
        #  print "%s : %s" % (f, new_messages[f])

        download_messages(server, filename, new_messages, config)

      except SkipFolderException, e:
        print e

    print "Disconnecting"
    server.logout()
  except socket.error, e:
    (err, desc) = e
    print "ERROR: %s %s" % (err, desc)
    sys.exit(4)
  except imaplib.IMAP4.error, e:
    print "ERROR:", e
    sys.exit(5)


# From http://www.pixelbeat.org/talks/python/spinner.py
def cli_exception(typ, value, traceback):
  """Handle CTRL-C by printing newline instead of ugly stack trace"""
  if not issubclass(typ, KeyboardInterrupt):
    sys.__excepthook__(typ, value, traceback)
  else:
    sys.stdout.write("\n")
    sys.stdout.flush()

if sys.stdin.isatty():
  sys.excepthook = cli_exception



# Hideous fix to counteract http://python.org/sf/1092502
# (which should have been fixed ages ago.)
# Also see http://python.org/sf/1441530
def _fixed_socket_read(self, size=-1):
  data = self._rbuf
  if size < 0:
    # Read until EOF
    buffers = []
    if data:
      buffers.append(data)
    self._rbuf = ""
    if self._rbufsize <= 1:
      recv_size = self.default_bufsize
    else:
      recv_size = self._rbufsize
    while True:
      data = self._sock.recv(recv_size)
      if not data:
        break
      buffers.append(data)
    return "".join(buffers)
  else:
    # Read until size bytes or EOF seen, whichever comes first
    buf_len = len(data)
    if buf_len >= size:
      self._rbuf = data[size:]
      return data[:size]
    buffers = []
    if data:
      buffers.append(data)
    self._rbuf = ""
    while True:
      left = size - buf_len
      recv_size = min(self._rbufsize, left) # the actual fix
      data = self._sock.recv(recv_size)
      if not data:
        break
      buffers.append(data)
      n = len(data)
      if n >= left:
        self._rbuf = data[left:]
        buffers[-1] = data[:left]
        break
      buf_len += n
    return "".join(buffers)

# Platform detection to enable socket patch
if 'Darwin' in platform.platform() and '2.3.5' == platform.python_version():
  socket._fileobject.read = _fixed_socket_read
if 'Windows' in platform.platform():
  socket._fileobject.read = _fixed_socket_read

if __name__ == '__main__':
  gc.enable()
  main()

--------------------------------------------------------------------------------
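
The incremental step in imapbackup.py comes down to two ideas: every message
is keyed by its Message-Id (a synthesised one when the header is missing), and
only ids present on the server but absent from the local mbox are downloaded.
The sketch below restates that in Python 3 rather than the script's Python 2,
so treat it as an approximation, not the script's own code.  The names
synthesise_message_id and select_new_messages are hypothetical; the UUID
constant and the SHA-1 construction are taken from scan_folder() above.

```python
import hashlib

# Same constant imapbackup.py uses to mark synthesised Message-Ids.
UUID = '19AF1258-1AAF-44EF-9D9A-731079D6FAD7'


def synthesise_message_id(header_bytes):
    """Build a stand-in Message-Id from a hash of selected headers.

    Hypothetical helper: mirrors the '<UUID.sha1hex>' form used in
    scan_folder(), taking the raw header block as bytes.
    """
    folded = header_bytes.strip().replace(b'\r\n', b'\t')
    return '<' + UUID + '.' + hashlib.sha1(folded).hexdigest() + '>'


def select_new_messages(folder_ids, file_ids):
    """Return the {msg_id: num} entries that are not yet in the mbox file."""
    return {msg_id: num for msg_id, num in folder_ids.items()
            if msg_id not in file_ids}


headers = b"From: a@example.org\r\nSubject: hello\r\n"
fake_id = synthesise_message_id(headers)

on_server = {'<known@example.org>': 1, fake_id: 2}
on_disk = {'<known@example.org>': '<known@example.org>'}
print(select_new_messages(on_server, on_disk))
```

Because the synthesised id depends only on the hashed headers, the same
message produces the same id on every run, which is what lets a later run
recognise it as already backed up.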