├── requirements.txt
├── LICENSE
├── README.md
└── leccap_dl.py


/requirements.txt:
--------------------------------------------------------------------------------
appdirs==1.4.3
args==0.1.0
clint==0.5.1
packaging==16.8
pyparsing==2.2.0
requests==2.13.0
selenium==3.3.1
six==1.10.0

--------------------------------------------------------------------------------
/LICENSE:
--------------------------------------------------------------------------------
MIT License

Original Work Copyright (c) 2016 Maximilian Najork
Modified Work Copyright (c) 2017 Aditya Bhatt

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.

--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
# LeccapDownloader

An automated lecture recording downloader designed to work with https://leccap.engin.umich.edu.

This implementation downloads selected lectures for a given course into a given directory.

Requires Google Chrome/Firefox/Edge/Safari, python3, and Selenium (`pip install -r requirements.txt`).

## Setup

1. Download the most recent WebDriver for your browser and operating system
    * ChromeDriver: https://sites.google.com/a/chromium.org/chromedriver/downloads
    * FirefoxDriver (Marionette): https://github.com/mozilla/geckodriver/releases [**NOT WORKING WITH QUANTUM**]
    * Edge: https://developer.microsoft.com/en-us/microsoft-edge/tools/webdriver/ [Will **NOT** work unless source code is modified]
    * Safari: https://webkit.org/blog/6900/webdriver-support-in-safari-10/ [Will **NOT** work unless source code is modified]
2. Extract the WebDriver zip and add the binary to your PATH
    * Alternatively, just put the binary (chromedriver or geckodriver) in the same folder as leccap_dl.py
    * Or, specify the location of chromedriver or geckodriver using `-wdc` or `-wdf` respectively
3. `pip install -r requirements.txt`
    * This installs:
        * selenium: needed to drive the browser without user input
        * requests: needed to download video files
        * clint: needed for the progress bar on downloads

## Usage
`python leccap_dl.py`

`python leccap_dl.py [-h] [-t] [-i COURSE_UID] [-o OUTPUT_DIRECTORY] [-wdf WEB_DRIVER_FIREFOX] [-wdc WEB_DRIVER_CHROME]`

**Name** | **Type** | **Description**
--- | --- | ---
`-h [--help]` | flag | Show help message
`-t [--threaded]` | flag | **Optional.** Runs each download in a separate thread. Minimal performance increase, no progress bar.
`-i [--course-uid] COURSE_UID` | string | **Optional.** The unique course identifier, found at the end of the course's leccap URL. This is not the same as the unique identifier of an individual lecture recording. Passing it skips the course menu; if it is omitted, a menu of classes appears.
`-o [--output-directory] OUTPUT_DIRECTORY` | string | **Optional.** The directory to output downloaded files to. Defaults to the current directory.
`-wdf [--web-driver-firefox] WEB_DRIVER_FIREFOX` | string | **Optional.** The location of the geckodriver. If omitted, the driver is searched for on PATH, then in the current directory.
`-wdc [--web-driver-chrome] WEB_DRIVER_CHROME` | string | **Optional.** The location of the chromedriver. If omitted, the driver is searched for on PATH, then in the current directory.


#### Example
`python leccap_dl.py`

`python leccap_dl.py -t --course-uid n3yotibeo2l5zofckkx -o /home/user/videos -wdf /usr/local/bin/geckodriver`

`python leccap_dl.py -o /home/user/videos -wdc /usr/local/bin/chromedriver`

--------------------------------------------------------------------------------
/leccap_dl.py:
--------------------------------------------------------------------------------
#!/usr/bin/env python3

import argparse
import requests
from clint.textui import progress
import getpass
from selenium import webdriver
import threading
import os
import re
import sys
import datetime

FILE_EXT = ".mp4"

LOGIN_URL = "https://weblogin.umich.edu/"
LECCAP = "https://leccap.engin.umich.edu/leccap"
LECCAP_BASE_URL = LECCAP + "/viewer/s/"
YEAR = datetime.datetime.now().year

def parse_args():
    parser = argparse.ArgumentParser(
        description="An automated leccap recording downloader",
        epilog="example: python leccap_dl.py -i hsfrlzcioe7xc71tu1w [-o /home/user/videos] [-t]")
    parser.add_argument("-t", "--threaded",
                        help="if used, each download will be put in a new thread",
                        action="store_true")
    parser.add_argument("-i", "--course-uid",
                        help="the unique leccap course identifier")
    parser.add_argument("-o", "--output-directory",
                        default='.',
                        help="directory to output files (default: current directory [.])")
    parser.add_argument("-wdf", "--web-driver-firefox",
                        help="specify location of firefox WebDriver if not in current directory or PATH")
    parser.add_argument("-wdc", "--web-driver-chrome",
                        help="specify location of chrome WebDriver if not in current directory or PATH")

    return parser.parse_args()

def select_course(browser):
    # interactively pick a course; 'p'/'n' move to the previous/next year
    year = YEAR
    while True:
        # find available courses for the selected year
        browser.get(LECCAP + "/" + str(year))
        class_uid = []
        for i, item in enumerate(browser.find_elements_by_class_name("list-group-item")):
            class_uid.append(item.get_attribute("href").split("/")[-1])
            print("[%d] Class: %s" % (i, item.text))

        if class_uid:
            prompt = "Select class or p/n to change year [%i]: " % year
        else:
            print("No classes found")
            prompt = "p/n to change year [%i]: " % year

        # re-prompt until the answer is p, n, or an index that is in range
        choice = input(prompt)
        while choice not in ('p', 'n') and not (
                re.fullmatch('[0-9]+', choice) and int(choice) < len(class_uid)):
            choice = input(prompt)

        if choice == 'p':
            year -= 1
        elif choice == 'n':
            # never go past the current year
            year = min(year + 1, YEAR)
        else:
            return class_uid[int(choice)]

def main():
    args = parse_args()

    uniqname = input("Uniqname: ")
    password = getpass.getpass("Password: ")

    # initialize browser
    browser = init_browser(args.web_driver_chrome, args.web_driver_firefox)

    browser.implicitly_wait(60) # seconds

    # attempt login
    browser.get(LOGIN_URL)
    browser.find_element_by_id("login").send_keys(uniqname)
    browser.find_element_by_id("password").send_keys(password)
    browser.find_element_by_id("loginSubmit").click()

    # go to course leccap page, showing the course menu if no uid was given
    course_uid = args.course_uid or select_course(browser)
    browser.get(LECCAP_BASE_URL + course_uid)

    # scrape lecture urls
    lecture_urls = []
    lecture_names = []
    i = 0
    for rec_btn in browser.find_elements_by_class_name("recording-button"):
        lec_url = rec_btn.get_attribute("href")
        lecture_urls.append(lec_url)
        # get lecture names and dates
        for rec_info in rec_btn.find_elements_by_class_name("recording-info"):
            date = rec_info.find_element_by_class_name("recording-date").text
            name = rec_info.find_element_by_class_name("recording-title").text
            lecture_names.append(name)
            print("[%d] Name: %s \t Date: %s" % (i, name, date))
            i += 1

    # get src urls for selected lectures
    video_re = re.compile(r'[0-9]+|\*')
    total_vid = i - 1

    def video_select(item):
        # accept '*' or an in-range index; fullmatch rejects tokens such as "1a"
        if not video_re.fullmatch(item):
            return False
        if item != '*' and int(item) > total_vid:
            return False
        return True

    vids_list = []
    while not vids_list:
        print("Select video(s) to download (space delimited). * to download all")
        vids_list = list(filter(video_select, input().split()))
    select_indices = [x if x == '*' else int(x) for x in vids_list]
    if '*' in select_indices:
        true_urls = [(lecture_urls[i], lecture_names[i]) for i in range(len(lecture_urls))]
    else:
        true_urls = [(lecture_urls[i], lecture_names[i]) for i in select_indices]
    video_urls = []
    for (lec_url, _) in true_urls:
        browser.get(lec_url)
        vid_url = browser.find_element_by_tag_name("video").get_attribute("src")
        video_urls.append(vid_url)

    # exit browser
    browser.quit()

    # download videos to output directory
    threads = []
    for i in range(len(video_urls)):
        lec_name = true_urls[i][1]
        # remove bad characters from lecture filename
        lec_name = re.sub('[;/?:"=|*]', '-', lec_name)
        # os.path.join copes with a trailing slash on the output directory
        filename = os.path.join(args.output_directory, lec_name + FILE_EXT)
        if args.threaded:
            threads.append(threading.Thread(target=download_file, args=(filename, video_urls[i])))
        else:
            print("downloading " + filename + " from " + video_urls[i])
            r = requests.get(video_urls[i], stream=True)
            with open(filename, 'wb') as f:
                total_length = int(r.headers.get('content-length'))
                for chunk in progress.bar(r.iter_content(chunk_size=1024), expected_size=(total_length/1024) + 1):
                    if chunk:
                        f.write(chunk)
    if args.threaded:
        for t in threads:
            t.start()

def download_file(filename, url):
    print("downloading " + filename + " from " + url)
    r = requests.get(url, stream=True)
    with open(filename, 'wb') as f:
        for chunk in r.iter_content(chunk_size=1024):
            if chunk:
                f.write(chunk)

def init_browser(chrome, firefox):
    # driver binaries in the current directory carry an .exe suffix on Windows
    ext = '.exe' if sys.platform == 'win32' else ''

    # candidates are tried in order; the first one that starts is used
    candidates = []
    if chrome:
        candidates.append(lambda: webdriver.Chrome(chrome))
    elif firefox:
        candidates.append(lambda: webdriver.Firefox(executable_path=firefox))
    candidates += [
        webdriver.Chrome,   # chromedriver on PATH
        webdriver.Firefox,  # geckodriver on PATH
        lambda: webdriver.Chrome('./chromedriver' + ext),
        lambda: webdriver.Firefox(executable_path='./geckodriver' + ext),
    ]

    for start in candidates:
        try:
            return start()
        except Exception:
            continue

    print("Please add Chrome/Firefox WebDriver to path or current directory", file=sys.stderr)
    sys.exit(1)

if __name__ == '__main__':
    main()

--------------------------------------------------------------------------------
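
The two pure-logic steps in `main()` (turning the space-delimited answer into lecture indices, and cleaning a lecture title into a file name) can be tried without a browser. The sketch below is only an illustration: `parse_selection` and `safe_filename` are hypothetical helper names that do not exist in `leccap_dl.py`, and the behaviour shown (invalid or out-of-range tokens are dropped, `*` selects everything) is an assumption modelled on the prompt and the character class used in the script.

```python
import re

def parse_selection(tokens, count):
    """Map user tokens to lecture indices (hypothetical helper, not in leccap_dl.py).

    '*' selects every lecture; tokens that are not plain in-range
    integers are silently dropped.
    """
    indices = []
    for token in tokens:
        if token == '*':
            return list(range(count))
        # fullmatch rejects partial matches such as "1a"
        if re.fullmatch(r'[0-9]+', token) and int(token) < count:
            indices.append(int(token))
    return indices

def safe_filename(title, ext=".mp4"):
    """Replace the characters the script treats as unsafe with '-' (hypothetical helper)."""
    return re.sub(r'[;/?:"=|*]', '-', title) + ext

if __name__ == '__main__':
    print(parse_selection("0 2 7 x".split(), 3))
    print(safe_filename('Lec 1: Intro/Setup'))
```

An empty list from `parse_selection` would correspond to the case where the script asks for the selection again.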