├── favorites.txt
├── mentions.txt
├── tweets.txt
├── dmessages-sent.txt
├── dmessages-received.txt
├── lastID.txt
├── twitter-credentials
├── archive-tweets.py
├── archive-dm-received.py
├── archive-faves.py
├── archive-mentions.py
├── archive-dm-sent.py
└── README.md

/favorites.txt:
--------------------------------------------------------------------------------
1 |
--------------------------------------------------------------------------------
/mentions.txt:
--------------------------------------------------------------------------------
1 |
--------------------------------------------------------------------------------
/tweets.txt:
--------------------------------------------------------------------------------
1 |
--------------------------------------------------------------------------------
/dmessages-sent.txt:
--------------------------------------------------------------------------------
1 |
--------------------------------------------------------------------------------
/dmessages-received.txt:
--------------------------------------------------------------------------------
1 |
--------------------------------------------------------------------------------
/lastID.txt:
--------------------------------------------------------------------------------
1 | twitter: 100
2 | mention: 100
3 | favorit: 100
4 | dm-rec: 100
5 | dm-sent: 100
--------------------------------------------------------------------------------
/twitter-credentials:
--------------------------------------------------------------------------------
1 | consumerKey: aaa
2 | consumerSecret: bbb
3 | token: ccc-ddd
4 | tokenSecret: eee
5 |
--------------------------------------------------------------------------------
/archive-tweets.py:
--------------------------------------------------------------------------------
1 | #!/usr/bin/python
2 |
3 | import tweepy
4 | import pytz
5 | import os
6 |
7 | # Parameters.
8 | me = 'username'
9 | urlprefix = 'http://twitter.com/%s/status/' % me
10 | tweetdir = os.environ['HOME'] + '/Dropbox/twitter/'
11 | tweetfile = tweetdir + 'tweets.txt'
12 | idfile = tweetdir + 'lastID.txt'
13 | datefmt = '%B %-d, %Y at %-I:%M %p'
14 | homeTZ = pytz.timezone('US/Central')
15 | utc = pytz.utc
16 |
17 | def setup_api():
18 |     """Authorize the use of the Twitter API."""
19 |     a = {}
20 |     with open(os.environ['HOME'] + '/.twitter-credentials') as credentials:
21 |         for line in credentials:
22 |             k, v = line.split(': ')
23 |             a[k] = v.strip()
24 |     auth = tweepy.OAuthHandler(a['consumerKey'], a['consumerSecret'])
25 |     auth.set_access_token(a['token'], a['tokenSecret'])
26 |     return tweepy.API(auth)
27 |
28 | # Authorize.
29 | api = setup_api()
30 |
31 | # Get the ID of the last downloaded tweet.
32 | a = {}
33 | with open(idfile, 'r') as f:
34 |     for line in f:
35 |         k, v = line.split(': ')
36 |         a[k] = v.strip()
37 | lastID = a['twitter']
38 |
39 | # Collect all the tweets since the last one.
40 | tweets = api.user_timeline(me, since_id=lastID, count=200, include_rts=True)
41 |
42 | # Write them out to the tweets.txt file.
43 | with open(tweetfile, 'a') as f:
44 |     for t in reversed(tweets):
45 |         ts = utc.localize(t.created_at).astimezone(homeTZ)
46 |         lines = ['',
47 |                  t.text,
48 |                  ts.strftime(datefmt),
49 |                  urlprefix + t.id_str,
50 |                  '- - - - -',
51 |                  '']
52 |         f.write('\n'.join(lines).encode('utf8'))
53 |         lastID = t.id_str
54 |
55 | # Update the ID of the last downloaded tweet.
56 | with open(idfile, 'r') as f:
57 |     data = f.readlines()
58 |
59 | data[0] = 'twitter: ' + lastID + '\n'
60 |
61 | with open(idfile, 'w') as f:
62 |     f.writelines(data)
63 |
--------------------------------------------------------------------------------
/archive-dm-received.py:
--------------------------------------------------------------------------------
1 | #!/usr/bin/python
2 |
3 | import tweepy
4 | import pytz
5 | import os
6 |
7 | # Parameters.
8 | me = 'username'
9 | urlprefix = 'http://twitter.com/'
10 | tweetdir = os.environ['HOME'] + '/Dropbox/twitter/'
11 | tweetfile = tweetdir + 'dmessages-received.txt'
12 | idfile = tweetdir + 'lastID.txt'
13 | datefmt = '%B %-d, %Y at %-I:%M %p'
14 | homeTZ = pytz.timezone('US/Central')
15 | utc = pytz.utc
16 |
17 | def setup_api():
18 |     """Authorize the use of the Twitter API."""
19 |     a = {}
20 |     with open(os.environ['HOME'] + '/.twitter-credentials') as credentials:
21 |         for line in credentials:
22 |             k, v = line.split(': ')
23 |             a[k] = v.strip()
24 |     auth = tweepy.OAuthHandler(a['consumerKey'], a['consumerSecret'])
25 |     auth.set_access_token(a['token'], a['tokenSecret'])
26 |     return tweepy.API(auth)
27 |
28 | # Authorize.
29 | api = setup_api()
30 |
31 | # Get the ID of the last downloaded message.
32 | a = {}
33 | with open(idfile, 'r') as f:
34 |     for line in f:
35 |         k, v = line.split(': ')
36 |         a[k] = v.strip()
37 | lastID = a['dm-rec']
38 |
39 | # Collect all the direct messages since the last one.
40 | tweets = api.direct_messages(since_id=lastID)
41 |
42 | # Write them out to the dmessages-received.txt file.
43 | with open(tweetfile, 'a') as f:
44 |     for t in reversed(tweets):
45 |         ts = utc.localize(t.created_at).astimezone(homeTZ)
46 |         lines = ['',
47 |                  t.text,
48 |                  '',
49 |                  'at ' + ts.strftime(datefmt) + ' from [@' + t.sender.screen_name + '](' + urlprefix + t.sender.screen_name + ')',
50 |                  '- - - - -',
51 |                  '']
52 |         f.write('\n'.join(lines).encode('utf8'))
53 |         lastID = t.id_str
54 |
55 | # Update the ID of the last downloaded message.
56 | with open(idfile, 'r') as f:
57 |     data = f.readlines()
58 |
59 | data[3] = 'dm-rec: ' + lastID + '\n'
60 |
61 | with open(idfile, 'w') as f:
62 |     f.writelines(data)
63 |
--------------------------------------------------------------------------------
/archive-faves.py:
--------------------------------------------------------------------------------
1 | #!/usr/bin/python
2 |
3 | import tweepy
4 | import pytz
5 | import os
6 |
7 | # Parameters.
8 | me = 'username'
9 | urlprefix = 'http://twitter.com/'
10 | tweetdir = os.environ['HOME'] + '/Dropbox/twitter/'
11 | tweetfile = tweetdir + 'favorites.txt'
12 | idfile = tweetdir + 'lastID.txt'
13 | datefmt = '%B %-d, %Y at %-I:%M %p'
14 | homeTZ = pytz.timezone('US/Central')
15 | utc = pytz.utc
16 |
17 | def setup_api():
18 |     """Authorize the use of the Twitter API."""
19 |     a = {}
20 |     with open(os.environ['HOME'] + '/.twitter-credentials') as credentials:
21 |         for line in credentials:
22 |             k, v = line.split(': ')
23 |             a[k] = v.strip()
24 |     auth = tweepy.OAuthHandler(a['consumerKey'], a['consumerSecret'])
25 |     auth.set_access_token(a['token'], a['tokenSecret'])
26 |     return tweepy.API(auth)
27 |
28 | # Authorize.
29 | api = setup_api()
30 |
31 | # Get the ID of the last downloaded favorite.
32 | a = {}
33 | with open(idfile, 'r') as f:
34 |     for line in f:
35 |         k, v = line.split(': ')
36 |         a[k] = v.strip()
37 | lastID = a['favorit']
38 |
39 | # Collect all the favorites since the last one.
40 | tweets = api.favorites(since_id=lastID)
41 |
42 | # Write them out to the favorites.txt file.
43 | with open(tweetfile, 'a') as f:
44 |     for t in reversed(tweets):
45 |         ts = utc.localize(t.created_at).astimezone(homeTZ)
46 |         lines = ['',
47 |                  t.text,
48 |                  '',
49 |                  'at [' + ts.strftime(datefmt) + '](' + urlprefix + t.user.screen_name + '/status/' + t.id_str + ') by [@' + t.user.screen_name + '](' + urlprefix + t.user.screen_name + ')',
50 |                  '- - - - -',
51 |                  '']
52 |         f.write('\n'.join(lines).encode('utf8'))
53 |         lastID = t.id_str
54 |
55 | # Update the ID of the last downloaded favorite.
56 | with open(idfile, 'r') as f:
57 |     data = f.readlines()
58 |
59 | data[2] = 'favorit: ' + lastID + '\n'
60 |
61 | with open(idfile, 'w') as f:
62 |     f.writelines(data)
63 |
--------------------------------------------------------------------------------
/archive-mentions.py:
--------------------------------------------------------------------------------
1 | #!/usr/bin/python
2 |
3 | import tweepy
4 | import pytz
5 | import os
6 |
7 | # Parameters.
8 | me = 'username'
9 | urlprefix = 'http://twitter.com/'
10 | tweetdir = os.environ['HOME'] + '/Dropbox/twitter/'
11 | tweetfile = tweetdir + 'mentions.txt'
12 | idfile = tweetdir + 'lastID.txt'
13 | datefmt = '%B %-d, %Y at %-I:%M %p'
14 | homeTZ = pytz.timezone('US/Central')
15 | utc = pytz.utc
16 |
17 | def setup_api():
18 |     """Authorize the use of the Twitter API."""
19 |     a = {}
20 |     with open(os.environ['HOME'] + '/.twitter-credentials') as credentials:
21 |         for line in credentials:
22 |             k, v = line.split(': ')
23 |             a[k] = v.strip()
24 |     auth = tweepy.OAuthHandler(a['consumerKey'], a['consumerSecret'])
25 |     auth.set_access_token(a['token'], a['tokenSecret'])
26 |     return tweepy.API(auth)
27 |
28 | # Authorize.
29 | api = setup_api()
30 |
31 | # Get the ID of the last downloaded mention.
32 | a = {}
33 | with open(idfile, 'r') as f:
34 |     for line in f:
35 |         k, v = line.split(': ')
36 |         a[k] = v.strip()
37 | lastID = a['mention']
38 |
39 | # Collect all the mentions since the last one.
40 | tweets = api.mentions(since_id=lastID)
41 |
42 | # Write them out to the mentions.txt file.
43 | with open(tweetfile, 'a') as f:
44 |     for t in reversed(tweets):
45 |         ts = utc.localize(t.created_at).astimezone(homeTZ)
46 |         lines = ['',
47 |                  t.text,
48 |                  '',
49 |                  'at [' + ts.strftime(datefmt) + '](' + urlprefix + t.user.screen_name + '/status/' + t.id_str + ') by [@' + t.user.screen_name + '](' + urlprefix + t.user.screen_name + ')',
50 |                  '- - - - -',
51 |                  '']
52 |         f.write('\n'.join(lines).encode('utf8'))
53 |         lastID = t.id_str
54 |
55 | # Update the ID of the last downloaded mention.
56 | with open(idfile, 'r') as f:
57 |     data = f.readlines()
58 |
59 | data[1] = 'mention: ' + lastID + '\n'
60 |
61 | with open(idfile, 'w') as f:
62 |     f.writelines(data)
63 |
--------------------------------------------------------------------------------
/archive-dm-sent.py:
--------------------------------------------------------------------------------
1 | #!/usr/bin/python
2 |
3 | import tweepy
4 | import pytz
5 | import os
6 |
7 | # Parameters.
8 | me = 'username'
9 | urlprefix = 'http://twitter.com/'
10 | tweetdir = os.environ['HOME'] + '/Dropbox/twitter/'
11 | tweetfile = tweetdir + 'dmessages-sent.txt'
12 | idfile = tweetdir + 'lastID.txt'
13 | datefmt = '%B %-d, %Y at %-I:%M %p'
14 | homeTZ = pytz.timezone('US/Central')
15 | utc = pytz.utc
16 |
17 | def setup_api():
18 |     """Authorize the use of the Twitter API."""
19 |     a = {}
20 |     with open(os.environ['HOME'] + '/.twitter-credentials') as credentials:
21 |         for line in credentials:
22 |             k, v = line.split(': ')
23 |             a[k] = v.strip()
24 |     auth = tweepy.OAuthHandler(a['consumerKey'], a['consumerSecret'])
25 |     auth.set_access_token(a['token'], a['tokenSecret'])
26 |     return tweepy.API(auth)
27 |
28 | # Authorize.
29 | api = setup_api()
30 |
31 | # Get the ID of the last downloaded direct message.
32 | a = {}
33 | with open(idfile, 'r') as f:
34 |     for line in f:
35 |         k, v = line.split(': ')
36 |         a[k] = v.strip()
37 | lastID = a['dm-sent']
38 |
39 | # Collect all the direct messages since the last one.
40 | tweets = api.sent_direct_messages(since_id=lastID)
41 |
42 | # Write them out to the dmessages-sent.txt file.
43 | with open(tweetfile, 'a') as f:
44 |     for t in reversed(tweets):
45 |         ts = utc.localize(t.created_at).astimezone(homeTZ)
46 |         lines = ['',
47 |                  t.text,
48 |                  '',
49 |                  'at ' + ts.strftime(datefmt) + ' to [@' + t.recipient.screen_name + '](' + urlprefix + t.recipient.screen_name + ') from [@' + me + '](http://www.twitter.com/' + me + ')',
50 |                  '- - - - -',
51 |                  '']
52 |         f.write('\n'.join(lines).encode('utf8'))
53 |         lastID = t.id_str
54 |
55 | # Update the ID of the last downloaded message.
56 | with open(idfile, 'r') as f:
57 |     data = f.readlines()
58 |
59 | data[4] = 'dm-sent: ' + lastID + '\n'
60 |
61 | with open(idfile, 'w') as f:
62 |     f.writelines(data)
63 |
--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
1 | # Tweet archiver #
2 |
3 | These Python scripts and the related files will download your tweets, mentions, favorites and direct messages from Twitter and save them in plaintext archive files on your computer. They are intended to be run periodically to add recent items to existing archive files. The original script (in a slightly earlier version, only for archiving tweets) is described in [this blog post][1]. If you already have an archive of tweets in ThinkUp, you might find [this post][2] useful in turning it into a plaintext archive. If you are starting an archive from scratch, [this post and script][3] by Tim Bueno will be helpful.
4 |
5 | The files in the repository are:
6 |
7 | * A collection of scripts that perform the archiving, which can be stored anywhere and should be run periodically via a system like `cron` or `launchd`.
8 |     - `archive-tweets.py`: archives tweets to `tweets.txt`.
9 |     - `archive-mentions.py`: archives mentions to `mentions.txt`.
10 |     - `archive-faves.py`: archives favorites to `favorites.txt`.
11 |     - `archive-dm-received.py`: archives direct messages received from other users to `dmessages-received.txt`.
12 |     - `archive-dm-sent.py`: archives direct messages sent to other users to `dmessages-sent.txt`.
13 | * `tweets.txt`, `mentions.txt`, `favorites.txt`, `dmessages-received.txt`, `dmessages-sent.txt`. These are the archive files themselves, currently empty. They should be kept in a folder named `twitter` inside your Dropbox folder.
14 | * `twitter-credentials`. This file should be renamed `.twitter-credentials` and saved in your home directory. The values for `consumerKey`, `consumerSecret`, `token`, and `tokenSecret` must be provided by Twitter. Go to the [Twitter developer site][4], click the "Create an app" link, and follow the instructions given there for creating an app. When you're done, you'll be given the four credentials (long strings of letters and numbers) you'll need. If you want to archive your direct messages, you'll also need to allow your app to "read, write and access direct messages" (configured in the "Settings" tab).
15 | * `lastID.txt`. This file holds the ID numbers of the most recently archived tweet, mention, favorite, etc.; it currently holds dummy values you'll need to change. It should be kept in the same folder as the archive files.
16 |
17 | You don't *have* to keep your archive in Dropbox, but that's a convenient place to be able to access your tweets from any of your computers. The directory for the archive can be changed by editing Line 10 of `archive-$entity.py`.
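As an alternative to editing Line 10 in every script, the archive directory could be read from the environment. The following is only a sketch: `TWEET_ARCHIVE_DIR` and `archive_dir` are hypothetical names chosen for illustration, and the scripts as shipped do not read any such variable.

```python
import os


def archive_dir(environ=os.environ):
    """Return the archive directory, always ending in a slash.

    TWEET_ARCHIVE_DIR is a hypothetical variable name (an assumption, not
    something the repository defines). When it is unset, fall back to the
    Dropbox location that the scripts hard-code on Line 10.
    """
    default = os.path.join(environ.get('HOME', ''), 'Dropbox', 'twitter')
    path = environ.get('TWEET_ARCHIVE_DIR', default)
    # The scripts build file names by string concatenation, so keep the slash.
    return path if path.endswith('/') else path + '/'


tweetdir = archive_dir()
tweetfile = tweetdir + 'tweets.txt'
idfile = tweetdir + 'lastID.txt'
```

With this in place, something like `TWEET_ARCHIVE_DIR=/some/other/folder` in the `cron` or `launchd` environment would redirect all of the scripts at once.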
18 |
19 | You also don't *have* to archive every type of message; you can delete the associated script and empty archive file if you want. However, keep the dummy values in `lastID.txt` unless you edit Line 59 of the remaining scripts, which sets the line of `lastID.txt` that gets updated.
20 |
21 | [1]: http://www.leancrew.com/all-this/2012/07/archiving-tweets-without-ifttt/
22 | [2]: http://www.leancrew.com/all-this/2012/07/archiving-tweets/
23 | [3]: http://www.timbueno.com/2012/07/07/rolling-my-own-automatic-tweet-archiver
24 | [4]: https://dev.twitter.com/
--------------------------------------------------------------------------------
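The README's caveat about keeping the dummy values in `lastID.txt` comes from each script rewriting a fixed line position in that file. One way to avoid that dependence is to update the entry by key rather than by index. This is a minimal sketch, assuming the same `key: value` layout; `update_last_id` is a hypothetical helper, not part of the repository.

```python
def update_last_id(idfile, key, new_id):
    """Rewrite the `key: value` line for `key`, leaving other lines untouched.

    Hypothetical helper for illustration; the shipped scripts instead assign
    to a fixed index of the list returned by readlines().
    """
    with open(idfile, 'r') as f:
        lines = f.readlines()

    prefix = key + ': '
    found = False
    for i, line in enumerate(lines):
        if line.startswith(prefix):
            lines[i] = prefix + str(new_id) + '\n'
            found = True

    if not found:
        # Make sure the previous last line is terminated before appending.
        if lines and not lines[-1].endswith('\n'):
            lines[-1] += '\n'
        lines.append(prefix + str(new_id) + '\n')

    with open(idfile, 'w') as f:
        f.writelines(lines)
```

A script would then call, for example, `update_last_id(idfile, 'mention', lastID)` in place of its final three statements, and lines for unused message types could be deleted from `lastID.txt` without touching the other scripts.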