├── .gitignore
├── .travis.yml
├── LICENSE
├── README.md
├── download_links.py
├── main.py
└── requirements.txt

/.gitignore:
--------------------------------------------------------------------------------
1 | *.pyc
--------------------------------------------------------------------------------
/.travis.yml:
--------------------------------------------------------------------------------
1 | language: python
2 | python:
3 |   - "3.8"
4 |   - "3.7"
5 |   - "3.6"
6 |   - "3.5"
7 | cache: pip
8 | install:
9 |   - pip install -r requirements.txt
10 | script:
11 |   - python main.py -h
12 |
--------------------------------------------------------------------------------
/LICENSE:
--------------------------------------------------------------------------------
1 | MIT License
2 |
3 | Copyright (c) 2019 Gunjan Nandy
4 |
5 | Permission is hereby granted, free of charge, to any person obtaining a copy
6 | of this software and associated documentation files (the "Software"), to deal
7 | in the Software without restriction, including without limitation the rights
8 | to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
9 | copies of the Software, and to permit persons to whom the Software is
10 | furnished to do so, subject to the following conditions:
11 |
12 | The above copyright notice and this permission notice shall be included in all
13 | copies or substantial portions of the Software.
14 |
15 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
16 | IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
17 | FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
18 | AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
19 | LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
20 | OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
21 | SOFTWARE.
22 |
--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
1 | # tutorial-pdf-downloader
2 |
3 | [](https://travis-ci.com/Gunjan933/tutorial-pdf-downloader) [](https://snyk.io//test/github/Gunjan933/tutorial-pdf-downloader?targetFile=requirements.txt) [](https://github.com/Gunjan933/tutorial-pdf-downloader/graphs/contributors) [](#python-support)
4 |
5 | Downloads full tutorial PDFs from **[Javatpoint](https://www.javatpoint.com/)**, **[Tutorialspoint](https://www.tutorialspoint.com/)**, and other websites.
6 |
7 | ## Disclaimer / Please note:
8 |
9 | These websites provide free, quality education by showing advertisements; that is their only source of income. Don't overuse this script, as it puts heavy pressure on their servers. It is meant for those who cannot afford a stable, sustained internet connection. I believe education should be free for all, but that doesn't mean I support piracy; this is for educational purposes only. Always support their work, either by paying or by visiting their websites.
10 |
11 | ## Usage
12 |
13 | ### First install dependencies:
14 | Make sure you have pip installed, then run:
15 |
16 | ```console
17 | pip install --user -r requirements.txt
18 | ```
19 | ### Set up download links:
20 | * Copy links to any tutorials from **[Javatpoint](https://www.javatpoint.com/)**, **[Tutorialspoint](https://www.tutorialspoint.com/)**, or **both**, and paste them into `download_links.py`.
21 |   - To see examples, open **[download_links.py](./download_links.py)**; a minimal illustrative entry is also shown after this list.
22 | * If you want to download all the listed tutorials from both sites, jump to the next step.
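For illustration, `download_links.py` simply defines a `download_list` collection of tutorial URLs; commented-out entries are ignored. The URLs below are only examples:

```python
download_list = {
    "https://www.javatpoint.com/apache-ant-tutorial",
    # "https://www.tutorialspoint.com/java/index.htm",
}
```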
23 |
24 | ### Run the Downloader:
25 | * To download each link from `download_links.py` that you set up earlier:
26 |
27 | ```console
28 | python main.py -u
29 | ```
30 | * To download all tutorials from Javatpoint:
31 |
32 | ```console
33 | python main.py -j
34 | ```
35 | * To download all tutorials from Tutorialspoint:
36 |
37 | ```console
38 | python main.py -t
39 | ```
40 | * To download all tutorials from Javatpoint and Tutorialspoint:
41 |
42 | ```console
43 | python main.py -a
44 | ```
45 | * To check usage or help:
46 |
47 | ```console
48 | python main.py -h
49 | ```
50 |
51 | ## Save location
52 |
53 | PDFs are saved in a `downloads` folder next to the cloned repository, i.e., in the parent directory into which you cloned it:
54 | ```
55 | -- some_parent_directory
56 |    |
57 |    |-- tutorial-pdf-downloader
58 |    |    |
59 |    |    |--main.py
60 |    |    |--download_links.py
61 |    |    |--requirements.txt
62 |    |    |--README.md
63 |    |
64 |    |
65 |    |-- downloads
66 |         |
67 |         |--artificial-intelligence
68 |         |--mobile-computing
69 | ```
70 |
71 | ## Changelog
72 | * Added **[Tutorialspoint](https://www.tutorialspoint.com/)** support.
73 |
74 | ## Future work
75 | * ~~Add **[Tutorialspoint](https://www.tutorialspoint.com/)** support.~~
76 | * Add a GUI.
77 |
78 | ## Bugs / Issues
79 | * If you find any, please report them in the issues section.
80 |
81 | ## Contribute
82 | * Any contributions or suggestions are welcome.
--------------------------------------------------------------------------------
/download_links.py:
--------------------------------------------------------------------------------
1 | download_list = {
2 |     # "https://www.javatpoint.com/aptitude/quantitative" ,
3 |     # "https://www.tutorialspoint.com/react_native/react_native_alert.htm" ,
4 |     "https://www.javatpoint.com/apache-ant-tutorial" ,
5 |     # "https://www.tutorialspoint.com/java/index.htm" ,
6 | }
7 |
--------------------------------------------------------------------------------
/main.py:
--------------------------------------------------------------------------------
1 | from download_links import *
2 | import multiprocessing as mp
3 | from bs4 import BeautifulSoup
4 | from multiprocessing import Process
5 | from multiprocessing import Pool as process_pool
6 | from multiprocessing.dummy import Pool as thread_pool
7 | import requests, time, os, sys, weasyprint, argparse, fnmatch, shutil
8 |
9 | def tutorialspoint_get_page(url):  # fetch one Tutorialspoint page, retrying until it succeeds
10 |     domain_name = "https://www.tutorialspoint.com"
11 |     while True:
12 |         try:
13 |             page_response = requests.get(url, headers={'User-Agent': 'Chrome'}, timeout=5)  # browser-like UA, 5 s timeout
14 |             soup = BeautifulSoup(page_response.content, "html.parser")
15 |             str_soup = str(soup)
16 |         except Exception:
17 |             print(" >> Could not connect to " + url.split("/")[-1] + ", retrying")
18 |             time.sleep(1)
19 |             continue
20 |         else:
21 |             print(" >> Downloaded " + url.split("/")[-1])
22 |             break
23 |     page = str_soup[str_soup.find('
\n'
62 | head = head.replace("
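The dump of `main.py` is truncated above, so its PDF-generation step is not shown here. As a rough illustration of how a fetched tutorial page could be turned into a PDF with WeasyPrint (which `main.py` imports), here is a minimal, hypothetical sketch; the function name, output folder, and file-naming scheme are assumptions for illustration, not the project's actual code:

```python
# Hypothetical sketch only -- not taken from main.py, whose PDF step is truncated above.
import os
import requests
import weasyprint

def save_page_as_pdf(url, out_dir="downloads"):  # illustrative name and output folder
    os.makedirs(out_dir, exist_ok=True)
    response = requests.get(url, headers={"User-Agent": "Chrome"}, timeout=5)
    response.raise_for_status()
    # base_url lets WeasyPrint resolve the page's relative image and stylesheet links.
    pdf_name = url.rstrip("/").split("/")[-1].replace(".htm", "") + ".pdf"
    weasyprint.HTML(string=response.text, base_url=url).write_pdf(os.path.join(out_dir, pdf_name))

if __name__ == "__main__":
    save_page_as_pdf("https://www.javatpoint.com/apache-ant-tutorial")
```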