├── sample_bot.py
├── LICENSE
├── initializeDatabase.py
├── README.md
└── redditbot.py


/sample_bot.py:
--------------------------------------------------------------------------------
imgurID = ""
imgurSecret = ""
rengeName = ""
rengePass = ""

# User Agent Format - "platform:identifier:version - purpose (by /u/username)"
rengeVersion = "v0.1"
rengeUserAgent = "OS X:com.tsunderedev.renge_chon:" + rengeVersion + " - saves pictures from various subreddits (by /u/hizinfiz)"

sublist = ["twodeeart", "animewallpaper", "moescape", "patchuu", "animevectorwallpapers",
           "animephonewallpapers", "iwallpaper", "wallpaper", "wallpaperdump", "wallpapers",
           "WQHD_Wallpaper", "multiwall", "widescreenwallpaper", "ultrahdwallpapers", "bigwallpapers",
           "gamewalls", "minimalwallpaper", "wallpaperpacks", "earthporn", "spaceporn",
           "imaginarybattlefields", "imaginaryfuturewar", "skyrimporn"]
--------------------------------------------------------------------------------
/LICENSE:
--------------------------------------------------------------------------------
The MIT License (MIT)

Copyright (c) 2015 hizinfiz

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.
IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.

--------------------------------------------------------------------------------
/initializeDatabase.py:
--------------------------------------------------------------------------------
# Run this code the first time you use the bot. It will create a
# database called images.db for you, and initialize a table for storing
# Reddit submission IDs and Imgur image IDs as text.

import sqlite3
import sys

def makeDB():
    '''
    Creates the database
    '''
    db = sqlite3.connect("images.db")
    db.execute("CREATE TABLE images (reddit text, imgur text)")
    db.close()

def test():
    '''
    Tests to see if the database works
    '''
    db = sqlite3.connect("images.db")
    # I'll add code later
    db.close()

def reset():
    '''
    Resets the database
    '''
    db = sqlite3.connect("images.db")
    db.execute("DROP TABLE IF EXISTS images")
    db.execute("CREATE TABLE images (reddit text, imgur text)")
    db.close()

def main():
    if len(sys.argv) > 1:
        if sys.argv[1] == "make":
            print("Making database")
            makeDB()
        elif sys.argv[1] == "test":
            test()
        elif sys.argv[1] == "reset":
            reset()
        else:
            print("Invalid argument.")
    else:
        print("Argument required.")
        print("Use 'make' if running for first time")
        print("Use 'test' to test the database")
        print("Use 'reset' to reset the database")

if __name__ == "__main__": main()
--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
# Image Downloader Bot

This is a Reddit
bot that I'm working on that downloads images from various subreddits. I'm currently learning Python, and thought this would be a fun project to work on, especially because I have a minor (not really) wallpaper addiction.

This bot was written in Python 3.4.

## Installation

1. `git clone http://github.com/hizinfiz/Reddit-Image-Downloader-Bot.git`
1. Install `praw` and `imgurpython` (which provides `ImgurClient`) using pip. `sqlite3` ships with Python, so there's nothing to install for it.
1. Register an [Imgur Client](https://api.imgur.com/oauth2/addclient).
   Under "Authorization Type", you can select "OAuth 2 authorization without a callback URL". I'm including this because when I first did it, I had to Google to find out which one I should choose.
1. Create `bot.py` in your Python directory. This allows you to privately store your Imgur client ID, secret, and bot login credentials without worrying about them getting out online. I also use `bot.py` (for now) to store the list of subreddits I'd like my bot to download wallpapers from.
   The directory on OS X is:
   `/Library/Frameworks/Python.framework/Versions/3.4/lib/python3.4`
   I've included `sample_bot.py` as an example of what to do.
1. Come up with a list of subreddits you'd like to grab images from. I included the subreddits I'm checking in `sample_bot.py`.

## Running the bot

For now, the bot doesn't log in to Reddit; it just runs via the command line. I'm planning to add logging in later, so that I can communicate with the bot without having to access it via the command line.

Before starting the bot, you should run `initializeDatabase.py make`. This will initialize the database that keeps track of what images you've saved already.

If at any time you want to reset the database, `initializeDatabase.py reset` will do that for you.

Then, you can run `redditbot.py` and it'll get on its way!

## Current Functionality

1. Save the top 25 posts from each subreddit you've specified. It will only look at Imgur posts; everything else will be ignored. If a user decides to link to their Imgur profile page, it'll ignore that as well.
2. Keep a log of each image saved so that you don't end up with duplicates.
   This works across subreddits as well because I find that people like to submit their wallpapers to multiple subreddits at the same time.
   However, it won't detect reuploads of the same image, so unfortunately you're out of luck there. I believe there's software available that can do that for you, so why bother making my own.
3. On the first run, it'll make a directory named `Images` for you. The first time you save images from a subreddit, it'll make a directory for that subreddit in `Images`.
4. It also likes to crap out if you save images too quickly because of the Imgur API rate limit. That's something I'll fix in the next few days when I have time. For now, I just made it take short breaks after each save and long breaks after each album. I would recommend skipping /r/wallpaperdump because it doesn't like that subreddit very much.

## Planned Functionality

1. Actually log into Reddit and automatically save from /hot once per day.
2. Fix the Imgur API rate limit issue.
3. Save from other image sources. Planned sources include iminus, awwnime, and deviantart. This is mostly based on whatever I'm most interested in saving from; I'm not planning on hitting every single major image hosting service.
4. Accept requests via Reddit PM
   - Add/Remove subreddit (probably going to have to move the subreddit list over to the database rather than keep it in bot.py)
   - Save Top - All Time, Past Year, Past Month, Past Week & specify how many to save (default = 100)
5. If I give it an image ID, tell me the Reddit post
6. Tell me if I incorrectly added a subreddit.
7. Ignore images below a certain resolution because nobody wants low res wallpapers.
8. Sort based on resolution into desktop and mobile wallpapers

## License

See [LICENSE](https://github.com/hizinfiz/Reddit-Image-Downloader-bot/blob/master/LICENSE)
--------------------------------------------------------------------------------
/redditbot.py:
--------------------------------------------------------------------------------
from imgurpython.helpers.error import ImgurClientError
from imgurpython import ImgurClient
import bot
import praw
import os
import re
import sqlite3
import time
import urllib.request  # "import urllib" alone does not load the request submodule

# Imgur API client ID
aID = bot.imgurID
# Imgur API client secret
aSecret = bot.imgurSecret
# Reddit bot username
aName = bot.rengeName
# Reddit bot password
aPass = bot.rengePass

# Reddit useragent
aUserAgent = bot.rengeUserAgent

# List of subreddits to pull from
aSublist = bot.sublist
# These are for whenever I'd like to test just one subreddit or a subset of the subreddits
aTestSublist = aSublist[0:23]
aTestSub = aSublist[0]

def createDir(subreddit):
    '''
    Checks to see if a directory for the subreddit exists. If
    it doesn't exist, it creates that directory.
    '''
    if not os.path.exists(os.path.abspath("images/{}/".format(subreddit))):
        os.makedirs(os.path.abspath("images/{}/".format(subreddit)))

def getImgurID(url):
    '''
    Takes a url and strips the domain and file type. Returns a
    valid Imgur image ID
    '''
    imgid = re.sub(r"(?i)https?://(\w+\.)*imgur\.com/(r/\w*/)*", "", url)
    # Drop the file extension (everything from the first "." onward)
    return imgid.split(".")[0]

def getImgurAlbumID(url):
    '''
    Takes a url and strips the domain. Returns a valid Imgur
    album ID.
    '''
    albid = re.sub(r"(?i)https?://(\w+\.)*imgur\.com/(a|gallery)/(r/\w*/)*", "", url)
    # Drop everything from the first "#" or "?" onward
    return re.split(r"[#?]", albid)[0]

def saveImage(client, subreddit, url, reddID):
    '''
    Takes an image url, checks the database to see if the image has
    already been saved, and skips it if so. Otherwise it checks the
    file type, creates an appropriate savepath, and downloads the image.

    In the future, the check will be performed after getImgurID()
    is called in checkIfSaved() in order to conserve API calls and
    prevent rate limiting. It will also be compared against a database
    that contains Reddit submission IDs and Imgur Image/Album IDs in
    order to allow for checking against other subreddit posts since
    usually users will submit an image to more than one subreddit.
    '''
    print(" Image: {}".format(url))
    imgid = getImgurID(url)

    if checkIfSaved(reddID, imgid):
        print(" Saving")
        # Check to see if saving the image returns an error
        try:
            # Fetch the image metadata once to conserve Imgur API calls
            image = client.get_image(imgid)

            # Determine the save path by image type
            if "jpeg" in image.type:
                savepath = os.path.abspath("images/{}/{}.jpg".format(subreddit, imgid))
            elif "png" in image.type:
                savepath = os.path.abspath("images/{}/{}.png".format(subreddit, imgid))
            else:
                # Other types (e.g. GIFs) aren't handled yet; without this
                # branch savepath would be undefined below
                print(" Unsupported type: {}".format(image.type))
                return

            # Check to see if the image has already been saved.
            # If it has, skip that image.
            # I'm going to replace this in the future with a database check
            # that occurs earlier in the parent if statement
            if not os.path.exists(savepath):
                urllib.request.urlretrieve(image.link, savepath)
                time.sleep(2)
            else:
                print(" just kidding :P")
        except ImgurClientError as e:
            print(e.error_message)
            print(e.status_code)
            # I want to have it send a PM to me here if the error code matches certain things
    else:
        print(" Skipping")

def saveAlbum(client, subreddit, url, reddID):
    '''
    Takes post url and gets the album ID. It then iterates
    through the album and calls saveImage().
    '''
    print(" Album: {}".format(url))
    albid = getImgurAlbumID(url)

    print(" Saving")
    for image in client.get_album_images(albid):
        saveImage(client, subreddit, image.link, reddID)

    # this is to prevent rate limits after saving a very large album
    print(" Album done. Sleeping.")
    time.sleep(30)


def checkIfSaved(reddID, imgid):
    '''
    Takes a Reddit post ID and Imgur image ID and checks the
    database to see if it's been entered. Returns True (after
    recording the pair) if the image still needs to be saved, and
    False if it's already in the database.
    '''
    db = sqlite3.connect("images.db")

    c = db.execute("SELECT * FROM images WHERE imgur = ?", (imgid,))

    if c.fetchone() is None:  # If it's not found in imgur, save
        db.execute("insert into images (reddit, imgur) values (?, ?)", (reddID, imgid))
        db.commit()
        db.close()
        return True
    else:  # If it's found in imgur, don't save
        db.close()
        return False

def main():
    # Authentication with Reddit and Imgur
    r = praw.Reddit(aUserAgent)
    client = ImgurClient(aID, aSecret)


    # for sub in aTestSublist:
    for sub in aSublist:
        subreddit = r.get_subreddit(sub)

        # Use the plain subreddit name for directory and file paths
        createDir(sub)

        print("SAVING PICTURES FROM /r/{}".format(sub))

        # Grab the current hot posts in the subreddit
        for post in subreddit.get_hot():
            if (("/a/" in post.url) or ("/gallery" in post.url)) and ("imgur.com" in post.url):
                saveAlbum(client, sub, post.url, post.id)
            elif "imgur.com" in post.url:
                saveImage(client, sub, post.url, post.id)
            else:  # This is for non-imgur links
                print("Found non-Imgur link")
                time.sleep(1)

            # For now, just wait 1 second between posts.
            # Don't want to break the PRAW API rate limit
            time.sleep(1)

        # Wait 1 minute between subreddits (currently disabled)
        # time.sleep(60)



if __name__ == "__main__":
    main()
--------------------------------------------------------------------------------
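The ID helpers in `redditbot.py` (`getImgurID()` and `getImgurAlbumID()`) peel the domain off with a regular expression and then trim the extension, `#fragment`, or `?query` separately. As a rough alternative sketch, not the author's method, the standard library's `urllib.parse` can do the splitting. The name `imgur_id_from_url` is hypothetical and does not exist in the repository. It also assumes that the ID is always the last path segment, which holds for the image, album, gallery, and `/r/<subreddit>/` URL shapes the bot handles.

```python
import posixpath
from urllib.parse import urlparse


def imgur_id_from_url(url):
    """Hypothetical helper (not part of redditbot.py): return the last
    path segment of an Imgur URL without its file extension."""
    # urlparse() separates the path from any "?query" or "#fragment"
    path = urlparse(url).path
    # The ID is assumed to be the final path segment, e.g. "/a/XyZ12" -> "XyZ12"
    last_segment = posixpath.basename(path.rstrip("/"))
    # Drop a trailing extension such as ".jpg" or ".png"
    return posixpath.splitext(last_segment)[0]
```

For a URL with no path segment at all (for example a bare profile subdomain), this sketch returns an empty string, which a caller could treat as "nothing to save". Unlike the two original helpers, it does not tell albums from single images. The `/a/` and `/gallery` check in `main()` would still be needed for that.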