├── GO.sh
├── README.md
├── create-slide-urls.py
├── create-slide-video.py
└── download-slides.sh


/GO.sh:
--------------------------------------------------------------------------------
python3 create-slide-urls.py && mkdir -p slides && ./download-slides.sh


--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
# slideslive-downloader
Download SlidesLive presentations (video + slides) + 🎉👍🙌👌✨👏**SYNCED SLIDES VIDEO**👏✨👌🙌👍🎉

## what is it
I wanted to download the ICLR 2020 videos, so I hacked this together. It should work for any presentation hosted on slideslive.com.

## how to use
- Some values are hardcoded (presentation ID, slide count, XML filename, output resolution); change them as appropriate.
- These scripts do not download the main video. I grab it in the browser by finding the mp4 file in the page source.
- The Python scripts import `cv2` (OpenCV) and `progressbar` (the progressbar2 package), and `download-slides.sh` needs GNU `parallel` and `wget`. Look at the source and install whatever is missing.

e.g. Beyond “tabula rasa” in reinforcement learning: agents that remember, adapt, and generalize
https://slideslive.com/38926829/beyond-tabula-rasa-in-reinforcement-learning-agents-that-remember-adapt-and-generalize

slide URL: https://d2ygwrecguqg66.cloudfront.net/data/presentations/38926829/slides/big/0-0001.jpg
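
Both asset URLs in this example contain the numeric presentation ID from the talk URL. A minimal sketch of deriving them, assuming the CloudFront host and path layout of this one example also apply to your talk (they may not, so always confirm against the page source). `presentation_id`, `slide_url`, and `xml_url` are hypothetical helper names, not part of this repo.

```python
import re

# Assumption: host and path layout are copied from the example URLs in this
# README; other presentations may be served from a different host.
BASE = "https://d2ygwrecguqg66.cloudfront.net/data/presentations"


def presentation_id(talk_url):
    """Return the numeric ID from a slideslive.com talk URL."""
    match = re.search(r"slideslive\.com/(\d+)", talk_url)
    if match is None:
        raise ValueError(f"no presentation ID in {talk_url!r}")
    return match.group(1)


def slide_url(pid, index):
    """URL of the index-th slide image (1-based, zero-padded to 4 digits)."""
    return f"{BASE}/{pid}/slides/big/0-{index:04d}.jpg"


def xml_url(pid):
    """URL of the XML file holding the slide sync times."""
    return f"{BASE}/{pid}/{pid}.xml"


pid = presentation_id(
    "https://slideslive.com/38926829/beyond-tabula-rasa-in-reinforcement-learning-agents-that-remember-adapt-and-generalize"
)
print(slide_url(pid, 1))
print(xml_url(pid))
```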

slide XML: https://d2ygwrecguqg66.cloudfront.net/data/presentations/38926829/38926829.xml
(contains the sync times for the slides)
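
To see what the sync file gives you, here is a small sketch that parses it and reports how long each slide stays on screen. The element names (`slide`, `orderId`, `timeSec`, `slideName`) are the ones `create-slide-video.py` reads; the root element name and all values in `SAMPLE` are made up for illustration, and `hold_times` is a hypothetical helper.

```python
import xml.etree.ElementTree as ET

# Made-up sample containing only the fields this sketch uses; a real file
# has one <slide> element per slide change.
SAMPLE = """
<videoContent>
  <slide><orderId>1</orderId><timeSec>0</timeSec><slideName>0-0001</slideName></slide>
  <slide><orderId>2</orderId><timeSec>12</timeSec><slideName>0-0002</slideName></slide>
  <slide><orderId>3</orderId><timeSec>47</timeSec><slideName>0-0003</slideName></slide>
</videoContent>
"""


def hold_times(xml_text):
    """Return (slide_name, seconds_on_screen) for every slide but the last."""
    root = ET.fromstring(xml_text.strip())
    # Tuples sort by their first item, so this orders the slides by orderId.
    slides = sorted(
        (
            int(s.find("orderId").text),
            int(s.find("timeSec").text),
            s.find("slideName").text,
        )
        for s in root.iter("slide")
    )
    # A slide is shown until the next one starts; the last slide has no
    # end time in the XML, so it is left out here.
    return [
        (name, next_start - start)
        for (_, start, name), (_, next_start, _) in zip(slides, slides[1:])
    ]


for name, seconds in hold_times(SAMPLE):
    print(f"{name}: {seconds}s")
```

At one frame per second, these durations are exactly the number of frames each slide gets in the generated slides video.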

1. Download the mp4 from the page.
2. Find the slide URL in the page source.
3. Find the slide XML in the page source and download it into this directory. It contains the sync times for the slides.
4. Edit `create-slide-urls.py` and set the correct `slide_count` and `url`.
5. Run `GO.sh`. It writes `slide-urls.txt` and downloads the slide images into `slides/`.
6. Edit `create-slide-video.py` so it points at your XML file.
7. Run `python3 create-slide-video.py` to turn the slide images into `slides.mp4`, a video that is synced with the main mp4.
8. Use a video editor such as DaVinci Resolve to place the two videos side by side and render the result.

done!
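
If you would rather script the side-by-side step than use a video editor, ffmpeg's `hstack` filter is one option. This is a swapped-in tool, not something this repo uses. The sketch below only builds the command. It assumes `ffmpeg` is installed, that the main video was saved as `main.mp4` (a placeholder name), and that scaling both videos to a height of 576 is acceptable; `side_by_side_cmd` is a hypothetical helper.

```python
import shlex


def side_by_side_cmd(main="main.mp4", slides="slides.mp4",
                     out="side-by-side.mp4", height=576):
    """Build an ffmpeg command that puts two videos next to each other."""
    # hstack needs inputs of equal height, so scale both first.
    # A width of -2 keeps the aspect ratio and rounds to an even number.
    graph = (
        f"[0:v]scale=-2:{height}[main];"
        f"[1:v]scale=-2:{height}[slides];"
        "[main][slides]hstack=inputs=2[v]"
    )
    return [
        "ffmpeg",
        "-i", main,
        "-i", slides,
        "-filter_complex", graph,
        "-map", "[v]",
        "-map", "0:a?",  # keep the talk's audio track if it has one
        out,
    ]


cmd = side_by_side_cmd()
print(shlex.join(cmd))
# To actually render: import subprocess; subprocess.run(cmd, check=True)
```

Printing the command first makes it easy to inspect or tweak before spending time on a full render.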

### notes
http://www.betr-rl.ml/2020/program/


--------------------------------------------------------------------------------
/create-slide-urls.py:
--------------------------------------------------------------------------------
# Number of slides in the presentation; change this for each talk.
slide_count = 1102

# Write one image URL per line; download-slides.sh reads this file.
with open('slide-urls.txt', 'w') as f:
	for i in range(1, slide_count + 1):
		# Slide images are named 0-0001.jpg, 0-0002.jpg, ...
		filename = f"0-{i:04d}.jpg"
		url = f"https://d2ygwrecguqg66.cloudfront.net/data/presentations/38926829/slides/big/{filename}"
		f.write(url + '\n')


--------------------------------------------------------------------------------
/create-slide-video.py:
--------------------------------------------------------------------------------
import operator
import xml.etree.ElementTree as ET

import cv2
import progressbar

# Hardcoded values; change them for each presentation.
XML_FILE = '38926829.xml'
FRAME_SIZE = (1024, 576)  # (width, height) of the output video
FPS = 1.0  # one frame per second, matching the resolution of timeSec


class Slide:
	def __init__(self, order_id, time_sec, time, slide_name):
		self.order_id = order_id
		self.time_sec = time_sec
		self.time = time
		self.slide_name = slide_name

	def __repr__(self):
		# str() falls back to __repr__, so one method covers both.
		return f"orderId: {self.order_id} timeSec: {self.time_sec} time: {self.time} slideName: {self.slide_name}"


slides = []

for child in ET.parse(XML_FILE).getroot().iter('slide'):
	slides.append(Slide(
		order_id=int(child.find('orderId').text),
		time_sec=int(child.find('timeSec').text),
		time=int(child.find('time').text),
		slide_name=child.find('slideName').text,
	))

slides.sort(key=operator.attrgetter('order_id'))

fourcc = cv2.VideoWriter_fourcc(*'mp4v')
out = cv2.VideoWriter('slides.mp4', fourcc, FPS, FRAME_SIZE)

for i in progressbar.progressbar(range(len(slides))):
	path = f"slides/{slides[i].slide_name}.jpg"
	frame = cv2.imread(path)
	if frame is None:
		# cv2.imread returns None instead of raising when it cannot read a file.
		raise FileNotFoundError(path)
	# Frames that do not match FRAME_SIZE may be silently skipped by VideoWriter.
	frame = cv2.resize(frame, FRAME_SIZE)
	if i + 1 < len(slides):
		# Hold the slide until the next one starts (one frame per second).
		frame_count = slides[i + 1].time_sec - slides[i].time_sec
	else:
		# The XML has no end time for the last slide, so write it once.
		frame_count = 1
	if i == 0:
		# Assuming timeSec counts from the start of the main video, show the
		# first slide from t=0 so later slides stay aligned.
		frame_count += slides[0].time_sec
	for _ in range(frame_count):
		out.write(frame)

out.release()


--------------------------------------------------------------------------------
/download-slides.sh:
--------------------------------------------------------------------------------
parallel -j 16 wget -P slides/ < slide-urls.txt


--------------------------------------------------------------------------------