├── requirements.txt
├── LICENSE
├── README.md
└── leccap_dl.py


/requirements.txt:
--------------------------------------------------------------------------------
1 | appdirs==1.4.3
2 | args==0.1.0
3 | clint==0.5.1
4 | packaging==16.8
5 | pyparsing==2.2.0
6 | requests==2.13.0
7 | selenium==3.3.1
8 | six==1.10.0
9 | 


--------------------------------------------------------------------------------
/LICENSE:
--------------------------------------------------------------------------------
 1 | MIT License
 2 | 
 3 | Original Work Copyright (c) 2016 Maximilian Najork
 4 | Modified Work Copyright (c) 2017 Aditya Bhatt
 5 | 
 6 | Permission is hereby granted, free of charge, to any person obtaining a copy
 7 | of this software and associated documentation files (the "Software"), to deal
 8 | in the Software without restriction, including without limitation the rights
 9 | to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
10 | copies of the Software, and to permit persons to whom the Software is
11 | furnished to do so, subject to the following conditions:
12 | 
13 | The above copyright notice and this permission notice shall be included in all
14 | copies or substantial portions of the Software.
15 | 
16 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
17 | IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
18 | FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
19 | AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
20 | LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
21 | OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
22 | SOFTWARE.
23 | 


--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
 1 | # LeccapDownloader
 2 | 
 3 | An automated lecture recording downloader designed to work with https://leccap.engin.umich.edu.
 4 | 
 5 | This implementation downloads selected lectures for a given course into a given directory.
 6 | 
 7 | Requires Google Chrome or Firefox (Edge and Safari work only if the source code is modified), Python 3, and Selenium (`pip install -r requirements.txt`).
 8 | 
 9 | ## Setup
10 | 
11 | 1. Download the most recent WebDriver for your browser and operating system
12 |     * ChromeDriver: https://sites.google.com/a/chromium.org/chromedriver/downloads
13 |     * FirefoxDriver (Marionette): https://github.com/mozilla/geckodriver/releases [**NOT WORKING WITH QUANTUM**]
14 |     * Edge: https://developer.microsoft.com/en-us/microsoft-edge/tools/webdriver/ [Will **NOT** work unless source code is modified]
15 |     * Safari: https://webkit.org/blog/6900/webdriver-support-in-safari-10/ [Will **NOT** work unless source code is modified]
16 | 2. Extract the WebDriver zip and add the binary to your PATH
17 |     * Alternatively, put the binary (chromedriver or geckodriver) in the same folder as leccap_dl.py
18 |     * Or specify the location of chromedriver or geckodriver using `-wdc` or `-wdf`, respectively
19 | 3. `pip install -r requirements.txt`
20 |     * This installs:
21 |          * selenium: needed to navigate browser without user input
22 |          * requests: needed to download video files
23 |          * clint: needed for progress bar on downloads
24 | 
25 | ## Usage
26 | `python leccap_dl.py`
27 | 
28 | `python leccap_dl.py [-h] [-t] [-i COURSE_UID] [-o OUTPUT_DIRECTORY] [-wdf WEB_DRIVER_FIREFOX] [-wdc WEB_DRIVER_CHROME]`
29 | 
30 | **Name** | **Type** | **Description**
31 | --- | --- | ---
32 | `-h [--help]` | flag | Show help message
33 | `-t [--threaded]`| flag | **Optional.** Runs each download in a separate thread. Minimal performance increase, no progress bar.
34 | `-i [--course-uid] COURSE_UID` | string | **Optional.** The unique course identifier, which can be found at the end of the leccap URL. Note that this is not the same as the unique identifier for an individual lecture recording. This allows for quick downloads if the course uid is known. If not, a menu of classes will appear.
35 | `-o [--output-directory] OUTPUT_DIRECTORY` | string | **Optional.** The directory to output downloaded files to. Defaults to current directory.
36 | `-wdf [--web-driver-firefox] WEB_DRIVER_FIREFOX` | string | **Optional.** The location of the geckodriver. Defaults to PATH then current directory if not provided.
37 | `-wdc [--web-driver-chrome] WEB_DRIVER_CHROME` | string | **Optional.** The location of the chromedriver. Defaults to PATH then current directory if not provided.
38 | 
39 | 
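The "PATH then current directory" lookup described for `-wdf` and `-wdc` can be sketched with the standard library alone. This is an illustration, not how `leccap_dl.py` does it: the script simply tries the Selenium constructors in turn. The helper name `resolve_driver` and its parameters are hypothetical.

```python
import os
import shutil


def resolve_driver(name, explicit=None, cwd="."):
    """Return a path to the driver binary `name`, or None if it is not found.

    Hypothetical helper. Lookup order: explicit path, then PATH, then `cwd`.
    """
    # 1. A path given on the command line (e.g. via -wdc or -wdf) wins.
    if explicit and os.path.isfile(explicit):
        return explicit
    # 2. Otherwise search the directories listed in PATH.
    on_path = shutil.which(name)
    if on_path:
        return on_path
    # 3. Finally, look for the binary in the given directory.
    local = os.path.join(cwd, name)
    if os.path.isfile(local):
        return os.path.abspath(local)
    return None
```

On Windows the binary in the current directory would be named `chromedriver.exe` or `geckodriver.exe`, so a caller would need to pass that name for step 3 to match.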
40 | 
41 | #### Example
42 | `python leccap_dl.py`
43 | 
44 | `python leccap_dl.py -t --course-uid n3yotibeo2l5zofckkx -o /home/user/videos -wdf /usr/local/bin/geckodriver`
45 | 
46 | `python leccap_dl.py -o /home/user/videos -wdc /usr/local/bin/chromedriver`
47 | 
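The `--course-uid` value is the last path segment of the course URL. A minimal sketch of pulling it out of a pasted URL follows. The helper name `course_uid_from_url` is hypothetical, and the URL shape is the one the script builds itself (`LECCAP_BASE_URL` plus the uid).

```python
from urllib.parse import urlparse


def course_uid_from_url(url):
    """Return the last non-empty path segment of a leccap course URL."""
    path = urlparse(url).path
    # Drop empty segments so a trailing slash does not yield an empty uid.
    segments = [s for s in path.split("/") if s]
    if not segments:
        raise ValueError("no path segment in URL: %r" % url)
    return segments[-1]


uid = course_uid_from_url(
    "https://leccap.engin.umich.edu/leccap/viewer/s/n3yotibeo2l5zofckkx"
)
print(uid)  # prints n3yotibeo2l5zofckkx
```

The result can then be passed as `python leccap_dl.py -i <uid>`.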


--------------------------------------------------------------------------------
/leccap_dl.py:
--------------------------------------------------------------------------------
  1 | #!/usr/bin/env python3
  2 | 
  3 | import argparse
  4 | import requests
  5 | from clint.textui import progress
  6 | import getpass
  7 | from selenium import webdriver
  8 | import threading
  9 | import time
 10 | import re
 11 | import sys
 12 | import datetime
 13 | 
 14 | FILE_EXT = ".mp4"
 15 | 
 16 | LOGIN_URL = "https://weblogin.umich.edu/"
 17 | LECCAP = "https://leccap.engin.umich.edu/leccap"
 18 | LECCAP_BASE_URL = LECCAP + "/viewer/s/"
 19 | YEAR = datetime.datetime.now().year
 20 | 
 21 | def parse_args():
 22 | 	parser = argparse.ArgumentParser(\
 23 | 		description="An automated leccap recording downloader",\
 24 | 		epilog="example: python leccap_dl.py -i hsfrlzcioe7xc71tu1w [-o /home/user/videos] [-t]")
 25 | 	parser.add_argument("-t", "--threaded",\
 26 | 		help="if used, each download will be put in a new thread",\
 27 | 		action="store_true")
 28 | 	parser.add_argument("-i","--course-uid",\
 29 | 		help="the unique leccap course identifier")
 30 | 	parser.add_argument("-o", "--output-directory",\
 31 | 		default='.',\
 32 | 		help="directory to output files (default: current directory [.])")
 33 | 	parser.add_argument("-wdf", "--web-driver-firefox",\
 34 | 		help="specify location of firefox WebDriver if not in current directory or PATH")
 35 | 	parser.add_argument("-wdc", "--web-driver-chrome",\
 36 | 		help="specify location of chrome WebDriver if not in current directory or PATH")
 37 | 
 38 | 	return parser.parse_args()
 39 | 
 40 | def main():
 41 | 	args = parse_args()
 42 | 
 43 | 	uniqname = input("Uniqname: ")
 44 | 	password = getpass.getpass("Password: ")
 45 | 
 46 | 	# initialize browser
 47 | 	browser = init_browser(args.web_driver_chrome, args.web_driver_firefox)
 48 | 
 49 | 	browser.implicitly_wait(60) # seconds
 50 | 
 51 | 	# attempt login
 52 | 	browser.get(LOGIN_URL)
 53 | 	browser.find_element_by_id("login").send_keys(uniqname)
 54 | 	browser.find_element_by_id("password").send_keys(password)
 55 | 	browser.find_element_by_id("loginSubmit").click()
 56 | 
 57 | 	class_select = re.compile('^[0-9]+$|^n$|^p$')
 58 | 	if args.course_uid:
 59 | 		# go to course leccap page
 60 | 		leccap_course_url = LECCAP_BASE_URL + args.course_uid
 61 | 		browser.get(leccap_course_url)
 62 | 	else:
 63 | 		# find available courses
 64 | 		year = YEAR
 65 | 		browser.get(LECCAP+"/"+str(year))
 66 | 		i = 0
 67 | 		class_uid = []
 68 | 		for classes in browser.find_elements_by_class_name("list-group-item"):
 69 | 			class_uid.append(classes.get_attribute("href").split("/")[-1])
 70 | 			print("[%d] Class: %s" % (i, classes.text))
 71 | 			i += 1
 72 | 		class_index = input("Select class or p/n to change year [%i]: "%year)
 73 | 		while not class_select.search(class_index):
 74 | 			class_index = input("Select class or p/n to change year [%i]: "%year)
 75 | 		while class_index == 'p' or class_index == 'n':
 76 | 			# find available courses
 77 | 			if class_index == 'p':
 78 | 				year -= 1
 79 | 			elif class_index == 'n':
 80 | 				year += 1
 81 | 			if year > YEAR:
 82 | 				year = YEAR
 83 | 			browser.get(LECCAP+"/"+str(year))
 84 | 			i = 0
 85 | 			class_uid = []
 86 | 			c = browser.find_elements_by_class_name("list-group-item")
 87 | 			if not c:
 88 | 				print("No classes found")
 89 | 				class_index = input("p/n to change year [%i]: "%year)
 90 | 				while class_index != 'p' and class_index != 'n':
 91 | 					class_index = input("p/n to change year [%i]: "%year)
 92 | 			else:
 93 | 				for classes in c:
 94 | 					class_uid.append(classes.get_attribute("href").split("/")[-1])
 95 | 					print("[%d] Class: %s" % (i, classes.text))
 96 | 					i += 1
 97 | 				class_index = input("Select class or p/n to change year [%i]: "%year)
 98 | 				while not class_select.search(class_index):
 99 | 					class_index = input("Select class or p/n to change year [%i]: "%year)
100 | 
101 | 		leccap_course_url = LECCAP_BASE_URL + class_uid[int(class_index)]
102 | 		browser.get(leccap_course_url)
103 | 	
104 | 	# scrape lecture urls
105 | 	lecture_urls = []
106 | 	lecture_names = []
107 | 	i = 0
108 | 	for rec_btn in browser.find_elements_by_class_name("recording-button"):
109 | 		lec_url = rec_btn.get_attribute("href")
110 | 		lecture_urls.append(lec_url)
111 | 		# get lecture names and dates
112 | 		for rec_info in rec_btn.find_elements_by_class_name("recording-info"):
113 | 			date = rec_info.find_element_by_class_name("recording-date").text
114 | 			name = rec_info.find_element_by_class_name("recording-title").text
115 | 			lecture_names.append(name)
116 | 			print("[%d] Name: %s \t Date: %s" % (i, name, date))
117 | 		i += 1
118 | 
119 | 	# get src urls for selected lectures
120 | 	true_urls = []
121 | 	video_re = re.compile(r'^(?:[0-9]+|\*)$')  # anchored: reject input such as "1a" that int() cannot parse
122 | 	total_vid = i - 1
123 | 	print("Select video(s) to download (space delimited). * to download all")
124 | 	vids_list = input().split()
125 | 	def video_select(item):
126 | 		if not video_re.search(item):
127 | 			return False
128 | 		if str(item).isnumeric():
129 | 			if int(item) < 0 or int(item) > total_vid:
130 | 				return False
131 | 		return True
132 | 	vids_list = list(filter(video_select, vids_list))
133 | 	while not vids_list:
134 | 		print("Select video(s) to download (space delimited). * to download all")
135 | 		vids_list = input().split()
136 | 		vids_list = list(filter(video_select, vids_list))
137 | 	select_indices = [x if x == '*' else int(x) for x in vids_list]
138 | 	if '*' in select_indices:
139 | 		true_urls = [(lecture_urls[i], lecture_names[i]) for i in range(len(lecture_urls))]
140 | 	else:
141 | 		true_urls = [(lecture_urls[i], lecture_names[i]) for i in select_indices]
142 | 	video_urls = []
143 | 	for (lec_url, useless) in true_urls:
144 | 		browser.get(lec_url)
145 | 		vid_url = browser.find_element_by_tag_name("video").get_attribute("src")
146 | 		video_urls.append(vid_url)
147 | 
148 | 	# exit browser
149 | 	browser.quit()
150 | 
151 | 	# remove ending slash of output directory
152 | 	output_directory = args.output_directory
153 | 	if output_directory[-1] == '/':
154 | 		output_directory = output_directory[:-1]
155 | 
156 | 	# download videos to output directory
157 | 	threads = []
158 | 	for i in range(len(video_urls)):
159 | 		lec_name = true_urls[i][1]
160 | 		# remove bad characters from lecture filename
161 | 		lec_name = re.sub('[;/?:"=|*]','-',lec_name)
162 | 		filename =  output_directory + '/' + lec_name + FILE_EXT
163 | 		if args.threaded:
164 | 			threads.append(threading.Thread(target=download_file, args=(filename, video_urls[i])))
165 | 		else:
166 | 			print("downloading " + filename + " from " + video_urls[i])
167 | 			r = requests.get(video_urls[i], stream=True)
168 | 			with open(filename, 'wb') as f:
169 | 				total_length = int(r.headers.get('content-length'))
170 | 				for chunk in progress.bar(r.iter_content(chunk_size=1024), expected_size=(total_length/1024) + 1):
171 | 					if chunk:
172 | 						f.write(chunk)
173 | 	if args.threaded:
174 | 		for i in threads:
175 | 			i.start()
176 | 
177 | def download_file(filename, url):
178 | 	print("downloading " + filename + " from " + url)
179 | 	r = requests.get(url, stream=True)
180 | 	# the with block closes the file even if the download raises
181 | 	with open(filename, 'wb') as f:
182 | 		for chunk in r.iter_content(chunk_size=1024):
183 | 			if chunk:
184 | 				f.write(chunk)
185 | 
186 | def init_browser(chrome, firefox):
187 | 	if chrome:
188 | 		try:
189 | 			browser = webdriver.Chrome(chrome)
190 | 		except Exception:
191 | 			try:
192 | 				browser = webdriver.Chrome()
193 | 			except Exception:
194 | 				try:
195 | 					browser = webdriver.Firefox()
196 | 				except Exception:
197 | 					if sys.platform == 'win32':
198 | 						try:
199 | 							browser = webdriver.Chrome('./chromedriver.exe')
200 | 						except Exception:
201 | 							try:
202 | 								browser = webdriver.Firefox(executable_path='./geckodriver.exe')
203 | 							except Exception:
204 | 								print("Please add Chrome/Firefox WebDriver to path or current directory", file=sys.stderr)
205 | 								sys.exit(1)
206 | 					else:
207 | 						try:
208 | 							browser = webdriver.Chrome('./chromedriver')
209 | 						except Exception:
210 | 							try:
211 | 								browser = webdriver.Firefox(executable_path='./geckodriver')
212 | 							except Exception:
213 | 								print("Please add Chrome/Firefox WebDriver to path or current directory", file=sys.stderr)
214 | 								sys.exit(1)
215 | 	elif firefox:
216 | 		try:
217 | 			browser = webdriver.Firefox(executable_path=firefox)
218 | 		except Exception:
219 | 			try:
220 | 				browser = webdriver.Chrome()
221 | 			except Exception:
222 | 				try:
223 | 					browser = webdriver.Firefox()
224 | 				except Exception:
225 | 					if sys.platform == 'win32':
226 | 						try:
227 | 							browser = webdriver.Chrome('./chromedriver.exe')
228 | 						except Exception:
229 | 							try:
230 | 								browser = webdriver.Firefox(executable_path='./geckodriver.exe')
231 | 							except Exception:
232 | 								print("Please add Chrome/Firefox WebDriver to path or current directory", file=sys.stderr)
233 | 								sys.exit(1)
234 | 					else:
235 | 						try:
236 | 							browser = webdriver.Chrome('./chromedriver')
237 | 						except Exception:
238 | 							try:
239 | 								browser = webdriver.Firefox(executable_path='./geckodriver')
240 | 							except Exception:
241 | 								print("Please add Chrome/Firefox WebDriver to path or current directory", file=sys.stderr)
242 | 								sys.exit(1)
243 | 	else:	
244 | 		try:
245 | 			browser = webdriver.Chrome()
246 | 		except Exception:
247 | 			try:
248 | 				browser = webdriver.Firefox()
249 | 			except Exception:
250 | 				if sys.platform == 'win32':
251 | 					try:
252 | 						browser = webdriver.Chrome('./chromedriver.exe')
253 | 					except Exception:
254 | 						try:
255 | 							browser = webdriver.Firefox(executable_path='./geckodriver.exe')
256 | 						except Exception:
257 | 							print("Please add Chrome/Firefox WebDriver to path or current directory", file=sys.stderr)
258 | 							sys.exit(1)
259 | 				else:
260 | 					try:
261 | 						browser = webdriver.Chrome('./chromedriver')
262 | 					except Exception:
263 | 						try:
264 | 							browser = webdriver.Firefox(executable_path='./geckodriver')
265 | 						except Exception:
266 | 							print("Please add Chrome/Firefox WebDriver to path or current directory", file=sys.stderr)
267 | 							sys.exit(1)
268 | 	return browser
269 | 
270 | if __name__ == '__main__':
271 | 	main()
272 | 


--------------------------------------------------------------------------------