├── .gitignore
├── .pre-commit-config.yaml
├── CHANGELOG.md
├── LICENSE.txt
├── MANIFEST.in
├── README.md
├── bookcut
    ├── Settings.ini
    ├── __init__.py
    ├── article.py
    ├── bibliography.py
    ├── book.py
    ├── book_details.py
    ├── bookcut.py
    ├── booklist.py
    ├── downloader.py
    ├── libgen.py
    ├── mirror_checker.py
    ├── organise.py
    ├── repositories.py
    ├── search.py
    └── settings.py
├── conftest.py
├── pytest.ini
├── setup.py
└── tests
    ├── __init__.py
    ├── test_book.py
    ├── test_bookcut.py
    ├── test_main.py
    └── test_mirror_checker.py


/.gitignore:
--------------------------------------------------------------------------------
__pycache__/
bookdb.db
BookCat.egg-info/
build/
dist/
BookCut.egg-info/
resources.json

--------------------------------------------------------------------------------
/.pre-commit-config.yaml:
--------------------------------------------------------------------------------
repos:
  - repo: https://github.com/psf/black
    rev: 21.6b0
    hooks:
      - id: black
        language_version: python3

--------------------------------------------------------------------------------
/CHANGELOG.md:
--------------------------------------------------------------------------------
# Changelog
All notable changes to this project will be documented in this file.


## [1.3.7]

### Add
- Black autoformatting

### Fix
- Issue#7


## [1.3.6]
- Repos option added. ArXiv is now included.
- Fixed known bugs

## [1.3.5] - 06_Jun_2021
- Fixed Issue#9

### Fixed
- Organise options known bug that could not move some files fixed.

## [1.3.1] - 11_Aug_2020

### Fixed
- Organise options known bug that could not move some files fixed.

## [1.3.0] - 10_Aug_2020

### Added
- Configuration mode, user now can change some basic settings of BookCut like
  destination folder and clear screen option.
  The user can also add more LibGen mirrors.

### Fixed
- Some raise errors and some known bugs.

## [1.2.3] - 07_Aug_2020

### Fixed
- [Bug] Issue#6 Forced list error.


## [1.2.2] - 03_Aug_2020

### Fixed
- [Bug] Issue with FileNotFoundError, when no home/Documents directory existed.


## [1.2.1] - 03_Aug_2020

### Added
- Version option added ('bookcut --version')
- List forced flag, for downloading the found books automatically.
- Added this file (CHANGELOG.md).

### Fixed
- Search option bug, when no results found.
- PEP8 fixes

## [1.2.0] - 03_Aug_2020
### Added
- Details option
- Bibliography option which returns a list of all the books by an author, with the option to save it to a .txt file

--------------------------------------------------------------------------------
/LICENSE.txt:
--------------------------------------------------------------------------------
MIT License

Copyright (c) [2020] [Costis94]

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.
IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.

--------------------------------------------------------------------------------
/MANIFEST.in:
--------------------------------------------------------------------------------
include bookcut/Settings.ini
include README.md

--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
[![Downloads](https://pepy.tech/badge/bookcut)](https://pepy.tech/project/bookcut) ![pypi](https://img.shields.io/pypi/v/pip.svg)
[![Code style: black](https://img.shields.io/badge/code%20style-black-000000.svg)](https://github.com/psf/black)


BookCut is a Python command-line tool that helps the user download **free e-books**,
**organise** them in folders by genre, **retrieve** book details by *ISBN* or *title*,
get a list of **all the books by a writer** and save it to a .txt file.

*With the help of LibGen, ArXiv and OpenLibrary.*


## REQUIREMENTS

* Python 3
* python3-pip


## Installation

* **Install with pip:**

```bash
pip install bookcut
# or, if you also have Python 2 installed
pip3 install bookcut
```


## Usage

### Searching and downloading books:

* Download a **single** book:

```bash
bookcut book -b "White Fang" -a "Jack London"
```

* Download a **list** of books:

```bash
bookcut list -f "FreeEbooksToDownload.txt"
```

* Organise a **folder** full of e-books into folders according to genre:

```bash
bookcut organise -d "full/path/to/folder"
```
***
* Search **LibGen**, output the results and download an e-book:

```bash
bookcut search -t 'Homer Odyssey'
```

* Search more book repositories with the **--repos** option:
```bash
bookcut search -t 'Homer Odyssey' --repos 'libgen,arxiv'
```
**Available book repositories: Libgen, ArXiv**
***

* Get the **details** of a book by **title and author**, or simply **ISBN**.

```bash
bookcut details -b 'Homer Iliad'
```

* Get a list of *all the books* by an **author**, with an option to save it to a .txt file:

```bash
bookcut all-books --author 'Stephen King'
```
***
### Searching and downloading articles:
Now you can use bookcut to search and download **scientific articles**.

- Search with the Digital Object Identifier:
```
bookcut article --doi "10.1126/science.196.4287.293"
```
- Search with the exact title:
```
bookcut article --title "Ribulose Bisphosphate Carboxylase A Two-Layered, Square-Shaped Molecule of Symmetry"
```
****
### Configuration
* You can also change some basic settings of BookCut.
  For more, check:

```bash
bookcut config --help
```

## TO-DO
* Add Tests
* Add documentation
* Add more sources with free e-books
* Fix organiser so it can use all types of files
* Add a logger.

## Copyrights
Please use the bookcut app to download **only free e-books** that are legally distributed through *BookCut repositories.*
Bookcut contributors bear no responsibility for the use of the tool.

## Contributing
Pull requests are welcome; this is my first project, so be kind.
For major changes, please open an issue first to discuss what you would like to change.

Please make sure to update tests as appropriate.

## License
[MIT](https://choosealicense.com/licenses/mit/)

--------------------------------------------------------------------------------
/bookcut/Settings.ini:
--------------------------------------------------------------------------------
[LibGen]
mirrors = https://libgen.lc/,http://libgen.li/,http://185.39.10.101/,http://genesis.lib/

[Repositories]
available_repos = arxiv,libgen

[ArticleRepositories]
article_repos = openaccessbutton

[Settings]
clean_screen = True
destination = None

--------------------------------------------------------------------------------
/bookcut/__init__.py:
--------------------------------------------------------------------------------
from importlib import metadata

__version__ = metadata.version("bookcut")

--------------------------------------------------------------------------------
/bookcut/article.py:
--------------------------------------------------------------------------------
from bookcut.repositories import open_access_button
from bookcut.downloader import filename_refubrished
from bookcut.search import search_downloader
from click import confirm

"""
Article.py is used by the
article command and searches repositories for
published articles.
"""


def article_search(doi, title):
    try:
        article_json_data = open_access_button(doi, title)
        url = article_json_data["url"]
        metadata = article_json_data["metadata"]
        title = metadata["title"]
        filename = filename_refubrished(title)
        filename = filename + ".pdf"
        ask_for_downloading(filename, url)
    except KeyError:
        print("\nCan not find the given article.\nPlease try another search!")


def ask_for_downloading(articlefilename, url):
    ask = confirm(f"Do you want to download:\n {articlefilename}")
    if ask is True:
        search_downloader(articlefilename, url)
    else:
        print("Aborted!")

--------------------------------------------------------------------------------
/bookcut/bibliography.py:
--------------------------------------------------------------------------------
import requests
import json
import re
from difflib import SequenceMatcher
import os
from bookcut.mirror_checker import pageStatus

"""This file is used by the all-books command.
It searches OpenLibrary for all books written by an
author, and lets the user save the list to a .txt file"""

OPEN_LIBRARY_URL = "http://www.openlibrary.org"


def main(author, similarity):
    # returns all the books written by an author from openlibrary
    # using similarity for filtering the results
    status = pageStatus(OPEN_LIBRARY_URL)
    if status is not False:
        search_url = "http://openlibrary.org/search.json?author=" + author
        jason = requests.get(search_url)
        jason = jason.text
        data = json.loads(jason)
        data = data["docs"]
        if data != []:
            metr = 0
            books = []
            for i in range(0, len(data) - 1):
                title = data[metr]["title"]
                metr = metr + 1
                books.append(title)
            mylist = list(dict.fromkeys(books))

            # Filtering
            # results: trying to erase similar titles
            words = [
                " the ",
                "The ",
                " THE ",
                " The",
                " a ",
                " A ",
                " and ",
                " of ",
                " from ",
                "on",
                "The",
                "in",
            ]

            noise_re = re.compile(
                "\\b(%s)\\W" % ("|".join(map(re.escape, words))), re.I
            )
            clean_mylist = [noise_re.sub("", p) for p in mylist]

            # keep a title only if it is not too similar to one already kept
            unique_titles = []
            for candidate in clean_mylist:
                if not any(
                    similar(candidate, kept, similarity) for kept in unique_titles
                ):
                    unique_titles.append(candidate)
            clean_mylist = unique_titles

            clean_mylist.sort()
            print(" ~Books found to OpenLibrary Database:\n")
            for i in clean_mylist:
                print(i)
            return clean_mylist
        else:
            print("(!) No valid author name, or bad internet connection.")
            print("Please try again!")
            return None


def similar(a, b, similarity):
    """function which checks similarity between two strings"""
    ratio = SequenceMatcher(None, a, b).ratio()
    if ratio > similarity and ratio < 1:
        return True
    else:
        return False


def save_to_txt(lista, path, author):
    # save the books list to txt file.
    for content in lista:
        name = f"{author}_bibliography.txt"
        full_path = os.path.join(path, name)
        with open(full_path, "a", encoding="utf-8") as f1:
            f1.write(content + " " + author + os.linesep)
    print("\nList saved at: ", full_path, "\n")

--------------------------------------------------------------------------------
/bookcut/book.py:
--------------------------------------------------------------------------------
from bookcut.mirror_checker import settingParser
import mechanize
from bs4 import BeautifulSoup as Soup
from bookcut.libgen import file_name
from click import confirm
from bookcut.downloader import downloading
from bookcut.repositories import arxiv, libgen_repo
import pandas as pd
from bookcut.search import choose_a_book


def libgen_book_find(
    title, author, publisher, destination, extension, force, libgenurl
):
    """searching @ LibGen for a single book"""
    try:
        # pass the requested extension (the original passed the builtin `type`)
        book = Booksearch(title, author, publisher, extension, libgenurl)
        result = book.search()
        extensions = result["extensions"]
        tb = result["table_data"]
        mirrors = result["mirrors"]
        file_details = book.give_result(extensions, tb, mirrors, extension)
        if file_details is not None:
            book.cursor(file_details["url"], destination, file_details["file"], force)
    except TypeError:
        # TODO add logger error
        pass


def book_searching_in_repos(term, repos):
    # search a book in various Repositories
    if repos is None:
        libgen_data = libgen_repo(term)
        return libgen_data
    repos = repos.split(",")
    repos = [i.strip(" ") for i in repos]
    available_repos = settingParser("Repositories", "available_repos")
    df = pd.DataFrame({"Author(s)": [], "Title": [], "Size": [], "Extension": []})
    for i in repos:
        if i in available_repos:
            if i == "arxiv":
                arxiv_data = arxiv(term)
                df = pd.concat([df, arxiv_data],
                    ignore_index=True)
            if i == "libgen":
                libgen_data = libgen_repo(term)
                df = pd.concat([df, libgen_data], ignore_index=True)
    choose_a_book(df)


class Booksearch:
    """searching libgen original page and returns book details and mirror link"""

    def __init__(self, title, author, publisher, filetype, libgenurl):
        self.title = title
        self.author = author
        self.publisher = publisher
        self.filetype = filetype
        self.mirror = None
        self.libgenurl = libgenurl

    def search(self):
        """searching libgen and returns table data, extensions and links"""
        br = mechanize.Browser()
        br.set_handle_robots(False)  # ignore robots
        br.set_handle_refresh(False)  #
        br.addheaders = [("User-agent", "Firefox")]

        br.open(self.libgenurl)
        br.select_form("libgen")
        input_form = self.title + " " + self.author + " " + self.publisher
        br.form["req"] = input_form
        ac = br.submit()
        html_from_page = ac
        soup = Soup(html_from_page, "html.parser")
        links_table = soup.find_all("table")[3]
        table_data = []
        mirrors = []
        extensions = []
        for i in links_table:
            try:
                td = i.find_all("td")
                for tr in td:
                    # scrape mirror links
                    temp = tr.find("a", href=True)
                    mirror_page = temp["href"]
                    # add also mirror link
                    if mirror_page.startswith("http") is False:
                        mirror_page = self.libgenurl + temp["href"]
                    else:
                        mirror_page = temp["href"]
                    mirrors.append(mirror_page)
            except Exception as e:
                print(e)

        # Parse Details from table_data
        table = soup.find_all("table")[2]
        for i in table:
            try:
                td = i.find_all("td")
                row = tr.find_all("tr")
                row = [tr.text for tr in td]
                table_data.append(row)
                extensions.append(row[8])
                table_details = dict()
                table_details["extensions"] = extensions
                table_details["table_data"] = table_data
                table_details["mirrors"] = mirrors
                return table_details
            except Exception as e:
                pass

    def give_result(self, extensions, table_data, mirrors, filetype):
        try:
            if filetype is not None:
                temp = 0
                for i in extensions:
                    if filetype == i:
                        result = dict()
                        result["url"] = mirrors[temp]
                        result["file"] = extensions[temp]
                        print("\nDownloading Link: FOUND")
                        return result
                    temp = temp + 1
            else:
                # return the first result
                result = dict()
                result["url"] = mirrors[0]
                result["file"] = extensions[0]
                print("\nDownloading Link: FOUND")
                return result
        except IndexError:
            print("Downloading Link:NOT FOUND\n")
            print("================================")

    def cursor(self, url, destination_folder, extension, forced):
        """asking the user to download a chosen book or to abort"""
        title = str(self.title)
        author = str(self.author)
        nameofbook = file_name(url)
        if nameofbook is None:
            nameofbook = (
                title.replace("\n", "") + author.replace("\n", "") + "." + extension
            )
        if forced is not True:
            ask = confirm(f"Do you want to download:\n {nameofbook}")
            if ask is True:
                downloading(
                    url, title, author, nameofbook, destination_folder, extension
                )
        else:
            downloading(url, title, author, nameofbook, destination_folder, extension)

--------------------------------------------------------------------------------
/bookcut/book_details.py:
--------------------------------------------------------------------------------
import requests
import json
from requests import ConnectionError
from bookcut.mirror_checker import pageStatus

"""This file is used by the details command.
Its main use is to search OpenLibrary for a book's details.
Its input can be the name of the book or the ISBN.
"""

OPEN_LIBRARY_URL = "http://www.openlibrary.org"


def main(term):
    # searches OpenLibrary and prints the details of a book
    try:
        if term is None:
            term = input(
                "Please enter the book and the author, or the ISBN of the book."
            )
        term = term.replace(" ", "+")
        pageStatus(OPEN_LIBRARY_URL)
        search_url = "http://openlibrary.org/search.json?q=" + term
        jason = requests.get(search_url)
        jason = jason.text
        data = json.loads(jason)
        try:
            data = data["docs"][0]
        except IndexError:
            data = None
            print("Invalid search, please try again.")

        if data is not None:
            author = data["author_name"][0]
            title = data["title_suggest"]
            isbn = data["isbn"]
            first_publish_year = data["first_publish_year"]
            try:
                lang = data["language"]
            except KeyError:
                lang = None

            print("Results for search: ", term, "\n")
            print("Title:", title)
            print("Author(s):", author, "\n")
            print("ISBN(s):", isbn, "\n")
            if lang is not None:
                # the original printed the label without the value
                print("Language(s): ", lang)
            print("\nFirst published: ", first_publish_year)
    except ConnectionError:
        url = OPEN_LIBRARY_URL
        print(
            "Unable to connect to:",
            url,
            "\nPlease check your internet connection and try again later.",
        )
    except json.decoder.JSONDecodeError:
        print("An error occurred while retrieving the data.")
        print("Please try again later.")

--------------------------------------------------------------------------------
/bookcut/bookcut.py:
--------------------------------------------------------------------------------
import click
import pyfiglet
from os import name, system
from bookcut import __version__
from bookcut.mirror_checker import main as mirror_checker, settingParser
from bookcut.book import libgen_book_find, book_searching_in_repos
from bookcut.organise import main_organiser
from bookcut.search import choose_a_book
from bookcut.book_details import main as detailing
from bookcut.bibliography import main as allbooks
from bookcut.bibliography import save_to_txt
from bookcut.article import article_search
from bookcut.settings import initial_config, mirrors_append, read_settings
from bookcut.settings import (
    screen_setting,
    print_settings,
    set_destination,
    path_checker,
)
from bookcut.booklist import booklist_main
from bookcut.repositories import libgen_repo
from bookcut.libgen import md5_search


@click.group(name="commands")
@click.version_option(version=__version__)
def entry():
    """
    for a single book download you can \n
    bookcut.py book --book "White Fang" --author "Jack London"
    \nor bookcut.py book -b "White Fang" -a "Jack London" \n
    *For a more complete help: bookcut.py [COMMAND] --help\n
    *For example: bookcut.py list --help
    """
    # read the settings ini file and check what value for clean screen
    settings = read_settings()
    clean_screen(settings[0])
    title = pyfiglet.figlet_format("BookCut")
    click.echo(title)
    click.echo("**********************************")
    print("Welcome to BookCut! I'm here to" "\nhelp you to read your favourite books!")
    print("**********************************")


@entry.command(name="list", help="Download a list of ebooks from a .txt file")
@click.option(
    "--file",
    "-f",
    help="A .txt file in which each book is written on a separate line",
    required=True,
)
@click.option(
    "--destination",
    "-d",
    help="The destination folder of the downloaded books",
    default=path_checker(),
)
@click.option(
    "--forced", help="Forced option, accepts all books for downloading", is_flag=True
)
@click.option("--extension", "-ext", help="File type of e-book.")
def download_from_txt(file, destination, forced, extension):
    click.echo("Importing of book list:Started.")
    if forced:
        click.echo(click.style("(!) Forced list downloading:Enabled", fg="green"))
    booklist_main(file, destination, forced, extension)


@entry.command(
    name="book",
    help="Download a book in epub format, by inserting" "\n the title and the author",
)
@click.option("--book", "-b", help="Title of Book", default=" ")
@click.option("--author", "-a", help="The author of the Book", default=" ")
@click.option("--publisher", "-p", default="")
@click.option(
    "--destination",
    "-d",
    help="The destination folder of the downloaded books",
    default=path_checker(),
)
@click.option("--extension", "-ext", help="Filetype of e-book for example:pdf")
@click.option("--forced", is_flag=True)
@click.option("--md5", help="Md5 search for a specific book version.", default=None)
def book(book, author, publisher, destination, extension, forced, md5):
    if book == " " and md5 is None:
        print("Invalid Input! Check for more.")
    elif author != " " and book != " ":
        click.echo(f"\nSearching for {book.capitalize()} by {author.capitalize()}")
    elif book != " ":
        click.echo(f"\nSearching for {book.capitalize()}")
    url = mirror_checker()
    if url is not None:
        if md5 is not None:
            print("\nSearching for book with md5: ", md5)
            md5_search(md5, url, destination)
        else:
            libgen_book_find(
                book, author, publisher, destination, extension, forced, url
            )


def clean_screen(setting):
    """Cleans the terminal screen"""
    if setting == "True":
        if name == "nt":
            _ = system("cls")
        else:
            _ = system("clear")


@entry.command(
    name="organise", help="Organise the ebooks in folders according\n to genre"
)
@click.option(
    "--directory",
    "-d",
    help="Directory of source ",
    required=True,
    default=path_checker(),
)
@click.option(
    "--output",
    "-o",
    help="The destination folder of organised books",
    default=path_checker(),
)
def organiser(directory, output):
    print("\nBookCut is starting to \norganise your books!")
    main_organiser(directory)


@entry.command(name="all-books", help="Search and return all the books from an author")
@click.option("--author", "-a", required=True, help="Author name")
@click.option(
    "--ratio", "-r", help="Ratio for filtering book results", default="0.7", type=float
)
def bibliography(author, ratio):
    print(f"\nStart searching for all books by {author.capitalize()}:")
    lista = allbooks(author, ratio)
    if lista is not None:
        print("**********************************")
        # keep asking until the user answers Y or N
        while True:
            choice = input("\nDo you wish to save the list? [Y/n]: ")
            choice = choice.capitalize()
            if choice == "Y":
                save_to_txt(lista, path_checker(), author)
                break
            elif choice == "N":
                print("Aborted.")
                break


@entry.command(
    name="search",
    help="Search LibGen or other repositories and choose a book to download",
)
@click.option("--term", "-t", help="Term for searching")
@click.option("--repos", default=None)
def searching(term, repos):
    print("Searching for:", term.capitalize())
    # set default libgen search
    if repos is None:
        libgen_data = libgen_repo(term)
        choose_a_book(libgen_data)
    else:
        book_searching_in_repos(term, repos)


@entry.command(name="details", help="Search the details of a book")
@click.option(
    "--book",
    "-b",
    help="Enter book & author or the ISBN number.",
    required=True,
    default=None,
)
def details(book):
    detailing(book)


@entry.command(name="article", help="Search for an article")
@click.option("--doi", "-d", help="Enter D.O.I. of the article", default=None)
@click.option("--title", "-t", help="Enter title of article", default=None)
def article(doi, title):
    # `doi or title is not None` parsed as `doi or (title is not None)`
    if doi is not None or title is not None:
        article_search(doi, title)
    else:
        print("Not correct input. \nPlease use: bookcut article --help")


@entry.command(name="config", help="BookCut configuration settings")
@click.option("--libgen_add", help="Add a Libgen mirror to mirrors list", default=None)
@click.option(
    "--restore", help="Restores the settings file to initial state", is_flag=True
)
@click.option("--settings", help="Prints the current BookCut settings", is_flag=True)
@click.option(
    "--clean_screen",
    help="You can choose if BookCut will" " clean terminal screen",
    is_flag=True,
)
@click.option("--download_folder", help="Set BookCut's download folder", default=None)
def configure_mode(restore, libgen_add, settings, clean_screen, download_folder):
    if restore:
        prompt = click.confirm("\n Are you sure you want to restore Settings?")
        if prompt is True:
            initial_config()
        else:
            click.echo("Aborted!")
    elif libgen_add is not None:
        click.echo(f"Adding {libgen_add} to mirrors list")
        mirrors_append(libgen_add)
    elif settings:
        print_settings()
    elif clean_screen:
        prompt = click.confirm("\nDo you want Bookcut to clean command line?")
        if prompt is True:
            screen_setting("True")
        else:
            screen_setting("False")
    elif download_folder is not None:
        set_destination(download_folder)
    else:
        print(
            "Usage: bookcut config [OPTIONS]",
            "\nTry 'bookcut config --help' for help.\n",
            "\nError: Missing option or flag.",
        )


if __name__ == "__main__":
    entry()

--------------------------------------------------------------------------------
/bookcut/booklist.py:
--------------------------------------------------------------------------------
from bookcut.book import libgen_book_find
from bookcut.mirror_checker import main as mirror_checker


def file_list(filename):
    """checks if the input file is a .txt file and adds
    each separate line
    as a book to the list 'Lines'.
    Afterwards, returns this list to download_from_txt
    """

    if filename.endswith(".txt"):
        try:
            # read all lines, skipping blank ones; the file is closed on exit
            with open(filename, "r", encoding="utf-8") as file1:
                Lines = [line for line in file1 if line != "\n"]
            return Lines
        except FileNotFoundError:
            print("Error:No such file or directory:", filename)
    else:
        print("\nError:Not correct file type. Please insert a '.txt' file")


def booklist_main(file, destination, forced, extension):
    """executes with the list command"""
    Lines = file_list(file)
    if Lines is not None:
        print("List imported successfully!")
        url = mirror_checker()
        if url is not None:
            temp = 1
            many = len(Lines)
            for a in Lines:
                if a != "":
                    print(
                        f"~[{temp}/{many}] Searching for:",
                        a,
                    )
                    temp = temp + 1
                    libgen_book_find(a, "", "", destination, extension, forced, url)

--------------------------------------------------------------------------------
/bookcut/downloader.py:
--------------------------------------------------------------------------------
import requests
from tqdm import tqdm
import os
from bs4 import BeautifulSoup as Soup


def downloading(link, name, author, file, destination_folder, type):
    """finds the first available book and sends the link to file_downloader"""
    page = requests.get(link)
    soup = Soup(page.content, "html.parser")

    searcher = [a["href"] for a in soup.find_all(href=True) if a.text]
    searcher_link = searcher[0]
    if searcher_link.startswith("http") is False:
        until_dot = link.split("//")
        searcher_link = until_dot[0] + "//" + until_dot[1] + searcher_link
    file_downloader(searcher_link, name, author, file, destination_folder, type)


def file_downloader(href, name, author, file, destination_folder, type):
    """Downloads the book file to the user's
    folder"""
    response = requests.get(href, stream=True)
    total_size = int(response.headers.get("content-length"))
    inMb = total_size / 1000000
    inMb = round(inMb, 2)
    print("\nDownloading...", "\nTotal file size:", inMb, "MB")

    # Folder to download books
    filename = file
    if filename != "":
        pass
    else:
        filename = name + " - " + author + type
    path = destination_folder

    filename = os.path.join(path, filename)

    try:
        with open(filename, "wb") as f:
            """For progress bar"""
            with tqdm(total=total_size, unit="iB", unit_scale=True) as pbar:
                for ch in response.iter_content(chunk_size=1024):
                    if ch:
                        f.write(ch)
                        pbar.update(len(ch))

        print("================================\nFile saved as:", filename)
    except FileNotFoundError:
        print("ERROR! Does the destination folder exist? ")


def pathfinder():
    path = os.path.expanduser("~/Documents/BookCut")
    if os.path.isdir(path):
        pass
    else:
        os.makedirs(path)
    return path


def filename_refubrished(filename):
    # for valid filenames without special characters
    special_char = [":", "/", '""', "?", "*", "<", ">", "|"]
    for i in special_char:
        filename = filename.replace(i, " ")
    return filename

--------------------------------------------------------------------------------
/bookcut/libgen.py:
--------------------------------------------------------------------------------
from bs4 import BeautifulSoup as soupa
import requests
from bookcut.downloader import file_downloader
from click import confirm
from bookcut.search import RESULT_ERROR


def epub_finder(soup):
    table = soup.find("table", attrs={"class": "c"})
    tb = table.find_all("tr")
    data = []
    epub = "epub"
    for row in tb:
        col = row.find_all("td")
        col = [ele.text.strip() for ele in col]
        xxx = [ele for ele in col if ele]
18 | false_results = ["[1]", "[2]", "[3]", "[4]", "[5]"] 19 | if false_results == xxx: 20 | pass 21 | else: 22 | data.append(xxx) 23 | del data[0] 24 | count = 0 25 | for a in data: 26 | if epub in a: 27 | break 28 | else: 29 | count = count + 1 30 | return count 31 | 32 | 33 | def file_name(url): 34 | print("URL: ", url) 35 | page = requests.get(url) 36 | try: 37 | soup = soupa(page.content, "html.parser") 38 | r = soup.find("input")["value"] 39 | r = r.replace("\n", "") 40 | return r 41 | except TypeError: 42 | return None 43 | 44 | 45 | def md5_search(md5, url, destination): 46 | try: 47 | # used by the book command: searches LibGen for a specific book with a given md5 value 48 | mirror_url = url + "/ads.php?md5=" + md5 49 | req = requests.get(mirror_url) 50 | soup = soupa(req.content, "html.parser") 51 | html = soup.find("input", attrs={"id": "textarea-example"}) 52 | filename = html["value"] 53 | url_soup = soup.findAll("table", attrs={"id": "main"}) 54 | 55 | urls = [] 56 | for j in url_soup: 57 | a = j.findAll("a", href=True) 58 | for i in a: 59 | urls.append(i["href"]) 60 | download_url = url + urls[0] 61 | question = confirm(f"Do you want to download:\n{filename}") 62 | if question is True: 63 | file_downloader(download_url, "", "", filename, destination, "") 64 | else: 65 | print("Aborted!") 66 | except TypeError: 67 | print(RESULT_ERROR) 68 | -------------------------------------------------------------------------------- /bookcut/mirror_checker.py: -------------------------------------------------------------------------------- 1 | import requests 2 | from requests import ConnectionError 3 | import configparser 4 | import os 5 | 6 | 7 | CONNECTION_ERROR_MESSAGE = ( 8 | "\nUnable to connect to: {} " 9 | "\nPlease check your internet connection and try again later." 
10 | ) 11 | 12 | 13 | def settingParser(section, value): 14 | "Parsing data from Settings.ini" 15 | config = configparser.ConfigParser() 16 | module_path = os.path.dirname(os.path.realpath(__file__)) 17 | settings_ini = os.path.join(module_path, "Settings.ini") 18 | config.read(settings_ini) 19 | mirrors = config.get(section, value) 20 | mirrors = mirrors.split(",") 21 | return mirrors 22 | 23 | 24 | def main(verbose=True): 25 | """Check which LibGen mirror is available""" 26 | 27 | mirrors = settingParser("LibGen", "mirrors") 28 | for url in mirrors: 29 | try: 30 | r = requests.head(url) 31 | if r.status_code == 200 or r.status_code == 301: 32 | status = True 33 | if status is True: 34 | if verbose is True: 35 | print("Connected to:", url) 36 | return url 37 | break 38 | else: 39 | print("No mirrors available or no Internet Connection!") 40 | except: 41 | pass 42 | 43 | 44 | def pageStatus(url, verbose=True): 45 | try: 46 | request = requests.head(url) 47 | if request.status_code == 200 or request.status_code == 301: 48 | if verbose is True: 49 | print("Connected to:", url) 50 | return True 51 | except ConnectionError: 52 | pass 53 | print(CONNECTION_ERROR_MESSAGE.format(url)) 54 | return False 55 | 56 | 57 | if __name__ == "__main__": 58 | main() 59 | -------------------------------------------------------------------------------- /bookcut/organise.py: -------------------------------------------------------------------------------- 1 | from bookcut.mirror_checker import pageStatus 2 | import os 3 | import shutil 4 | import requests 5 | import json 6 | 7 | OPEN_LIBRARY_URL = "http://www.openlibrary.org" 8 | 9 | 10 | def main_organiser(directory): 11 | status = pageStatus(OPEN_LIBRARY_URL) 12 | if status is not False: 13 | book_list = get_books(directory) 14 | # lists only the files in the given directory 15 | namepath = [] 16 | with os.scandir(directory) as entries: 17 | for entry in entries: 18 | if entry.is_file(): 19 | namepath.append(entry.name) 20 | for i in 
range(0, len(book_list)): 21 | print("File:", namepath[i]) 22 | try: 23 | """splitting file name to author and book title for using as 24 | searching terms to OpenLibrary""" 25 | a = book_list[i].split("by") 26 | book = a[1] 27 | author = a[0] 28 | a = scraper(book, author) 29 | print("\n***", book, " ", author) 30 | a = a["genre"] 31 | filename = namepath[i] 32 | cutpaste(directory, a, filename) 33 | except IndexError: 34 | try: 35 | a = book_list[i].split("-") 36 | book = a[1] 37 | author = a[0] 38 | a = scraper(book, author) 39 | print("\n***", book, " ", author) 40 | a = a["genre"] 41 | filename = namepath[i] 42 | cutpaste(directory, a, filename) 43 | except IndexError: 44 | print("Unable to organise this file.\n") 45 | pass 46 | 47 | 48 | def get_books(dir): 49 | """filtering epub, pdf, txt, mobi, djvu files in the given directory 50 | and return a list with all filenames""" 51 | epub_list = [] 52 | for file in os.listdir(dir): 53 | if file.endswith(".epub"): 54 | renamed = file.replace(".epub", "") 55 | renamed = renamed.replace("_", " ") 56 | epub_list.append(renamed) 57 | elif file.endswith(".pdf"): 58 | renamed = file.replace(".pdf", "") 59 | renamed = renamed.replace("_", " ") 60 | epub_list.append(renamed) 61 | elif file.endswith(".txt"): 62 | renamed = file.replace(".txt", "") 63 | renamed = renamed.replace("_", " ") 64 | epub_list.append(renamed) 65 | elif file.endswith(".mobi"): 66 | renamed = file.replace(".mobi", "") 67 | renamed = renamed.replace("_", " ") 68 | epub_list.append(renamed) 69 | elif file.endswith(".djvu"): 70 | renamed = file.replace(".djvu", "") 71 | renamed = renamed.replace("_", " ") 72 | epub_list.append(renamed) 73 | return epub_list 74 | 75 | 76 | def scraper(book, author): 77 | """parsing the book category from OpenLibrary""" 78 | try: 79 | book = book.replace(" ", "+") 80 | author = author.replace(" ", "+") 81 | 82 | search_url = "http://openlibrary.org/search.json?q=" + book + "+" + author 83 | jason = 
requests.get(search_url) 84 | jason = jason.text 85 | data = json.loads(jason) 86 | json_formatted_str = json.dumps(data, indent=2) 87 | 88 | book_values = {} 89 | isbn = None 90 | author_name = None 91 | title = None 92 | subject = None 93 | try: 94 | # TODO: add a feature to check all docs 95 | 96 | data = data["docs"][0] 97 | except IndexError: 98 | data = None 99 | if data is not None: 100 | try: 101 | isbn = data["isbn"][0] 102 | except KeyError: 103 | pass 104 | try: 105 | author_name = data["author_name"][0] 106 | except KeyError: 107 | pass 108 | try: 109 | title = data["title_suggest"] 110 | except KeyError: 111 | pass 112 | try: 113 | subject = data["subject"] 114 | except KeyError: 115 | pass 116 | 117 | book_values.update([("isbn", isbn), ("author", author_name), ("title", title)]) 118 | if subject is not None: 119 | for a in subject: 120 | x = genre_finder(a) 121 | if x is not None: 122 | subject = x 123 | break 124 | else: 125 | subject = "Uncategorized" 126 | else: 127 | subject = "Uncategorized" 128 | book_values.update({"genre": subject}) 129 | return book_values 130 | except requests.ConnectionError: 131 | url = "http://www.openlibrary.org" 132 | print( 133 | "Unable to connect to:", 134 | url, 135 | "\nPlease check your internet connection and try again later.", 136 | ) 137 | return None 138 | 139 | 140 | def genre_finder(sub): 141 | genres = [ 142 | "Classics", 143 | "Literary", 144 | "Fiction", 145 | "Historical Fiction", 146 | "Romance", 147 | "Horror", 148 | "Mystery", 149 | "Suspense", 150 | "Fantasy", 151 | "Action", 152 | "Adventure", 153 | "Science Fiction", 154 | "History", 155 | "Biography", 156 | "Autobiography", 157 | "Poetry", 158 | "Art", 159 | "Music", 160 | "Humor", 161 | "Religion", 162 | "Mythology", 163 | "Philosophy", 164 | "Health", 165 | "Science", 166 | "Social Science", 167 | "Psychology", 168 | "Self-help", 169 | "Nonfiction", 170 | ] 171 | 172 | if sub in genres: 173 | return sub 174 | else: 175 | return None 176 | 177 
| 178 | def cutpaste(dir, genre, file): 179 | """Check if genre folder exists if not it creates one""" 180 | path = os.path.join(dir, genre) 181 | if os.path.isdir(path): 182 | pass 183 | else: 184 | os.mkdir(path) 185 | print("Created folder:", genre) 186 | filepath = os.path.join(path, file) 187 | 188 | from_path = os.path.join(dir, file) 189 | dest_path = os.path.join(dir, genre, file) 190 | shutil.move(from_path, dest_path) 191 | print("File moved to: ", genre, "\n", "\n", "********************") 192 | -------------------------------------------------------------------------------- /bookcut/repositories.py: -------------------------------------------------------------------------------- 1 | from bs4 import BeautifulSoup as soup 2 | from bookcut.mirror_checker import ( 3 | pageStatus, 4 | main as mirror_checker, 5 | CONNECTION_ERROR_MESSAGE, 6 | ) 7 | import mechanize 8 | import pandas as pd 9 | import requests 10 | 11 | ARCHIV_URL = "https://export.arxiv.org/find/grp_cs,grp_econ,grp_eess,grp_math,grp_physics,grp_q-bio,grp_q-fin,grp_stat" 12 | ARCHIV_BASE = "https://export.arxiv.org" 13 | OPEN_ACCESS_BUTTON = "https://api.openaccessbutton.org/find" 14 | 15 | 16 | def arxiv(term): 17 | # Searching Arxiv.org and returns a DataFrame with the founded results. 
18 | status = pageStatus(ARCHIV_URL) 19 | if status: 20 | br = mechanize.Browser() 21 | br.set_handle_robots(False) # ignore robots 22 | br.set_handle_refresh(False) # 23 | br.addheaders = [("User-agent", "Firefox")] 24 | 25 | br.open(ARCHIV_URL) 26 | br.select_form(nr=0) 27 | input_form = term 28 | br.form["query"] = input_form 29 | ac = br.submit() 30 | html_from_page = ac 31 | html_soup = soup(html_from_page, "html.parser") 32 | 33 | t = html_soup.findAll("div", {"class": "list-title mathjax"}) 34 | titles = [] 35 | for i in t: 36 | raw = i.text 37 | raw = raw.replace("Title: ", "") 38 | raw = raw.replace("\n", "") 39 | titles.append(raw) 40 | authors = [] 41 | auth_soup = html_soup.findAll("div", {"class": "list-authors"}) 42 | for i in auth_soup: 43 | raw = i.text 44 | raw = raw.replace("Authors:", "") 45 | raw = raw.replace("\n", "") 46 | authors.append(raw) 47 | extensions = [] 48 | urls = [] 49 | ext = html_soup.findAll("span", {"class": "list-identifier"}) 50 | for i in ext: 51 | a = i.findAll("a") 52 | link = a[1]["href"] 53 | extensions.append(str(a[1].text)) 54 | urls.append(ARCHIV_BASE + link) 55 | 56 | arxiv_df = pd.DataFrame( 57 | { 58 | "Title": titles, 59 | "Author(s)": authors, 60 | "Url": urls, 61 | "Extension": extensions, 62 | } 63 | ) 64 | 65 | return arxiv_df 66 | else: 67 | print(CONNECTION_ERROR_MESSAGE.format("ArXiv")) 68 | return None 69 | 70 | 71 | def libgen_repo(term): 72 | # Searching LibGen and returns results DataFrame 73 | try: 74 | url = mirror_checker() 75 | if url is not None: 76 | br = mechanize.Browser() 77 | br.set_handle_robots(False) # ignore robots 78 | br.set_handle_refresh(False) # 79 | br.addheaders = [("User-agent", "Firefox")] 80 | 81 | br.open(url) 82 | br.select_form("libgen") 83 | input_form = term 84 | br.form["req"] = input_form 85 | ac = br.submit() 86 | html_from_page = ac 87 | html_soup = soup(html_from_page, "html.parser") 88 | table = html_soup.find_all("table")[2] 89 | 90 | table_data = [] 91 | mirrors = [] 
92 | extensions = [] 93 | 94 | for i in table: 95 | j = 0 96 | try: 97 | td = i.find_all("td") 98 | for tr in td: 99 | # scrape mirror links 100 | if j == 9: 101 | temp = tr.find("a", href=True) 102 | mirrors.append(temp["href"]) 103 | j = j + 1 104 | row = [tr.text for tr in td] 105 | table_data.append(row) 106 | extensions.append(row[8]) 107 | except: 108 | pass 109 | 110 | # Clean result page 111 | for j in table_data: 112 | j.pop(0) 113 | del j[8:15] 114 | headers = [ 115 | "Author(s)", 116 | "Title", 117 | "Publisher", 118 | "Year", 119 | "Pages", 120 | "Language", 121 | "Size", 122 | "Extension", 123 | ] 124 | 125 | tabular = pd.DataFrame(table_data) 126 | tabular.columns = headers 127 | tabular["Url"] = mirrors 128 | return tabular 129 | except ValueError: 130 | # create emptyDataframe 131 | df = pd.DataFrame() 132 | return df 133 | 134 | 135 | def open_access_button(doi, title): 136 | status = pageStatus(OPEN_ACCESS_BUTTON) 137 | if status: 138 | if doi is not None: 139 | query = {"doi": doi} 140 | else: 141 | query = {"title": title} 142 | req = requests.get(OPEN_ACCESS_BUTTON, params=query) 143 | response = req.json() 144 | return response 145 | else: 146 | print(CONNECTION_ERROR_MESSAGE.format("Open Access Button")) 147 | -------------------------------------------------------------------------------- /bookcut/search.py: -------------------------------------------------------------------------------- 1 | from bookcut.mirror_checker import main as mirror_checker 2 | from bookcut.downloader import filename_refubrished 3 | from bookcut.settings import path_checker 4 | from bs4 import BeautifulSoup as Soup 5 | import mechanize 6 | import pandas as pd 7 | import os 8 | import requests 9 | from tqdm import tqdm 10 | 11 | RESULT_ERROR = "\nNo results found or bad Internet connection.\nPlease try again!" 
12 | 13 | 14 | def search_downloader(file, href): 15 | # search_downloader downloads the book 16 | response = requests.get(href, stream=True) 17 | total_size = int(response.headers.get("content-length")) 18 | inMb = total_size / 1000000 19 | inMb = round(inMb, 2) 20 | filename = file 21 | print("\nDownloading...\n", "Total file size:", inMb, "MB") 22 | 23 | path = path_checker() 24 | 25 | filename = os.path.join(path, filename) 26 | # progress bar 27 | buffer_size = 1024 28 | progress = tqdm( 29 | response.iter_content(buffer_size), 30 | f"{file}", 31 | total=total_size, 32 | unit="B", 33 | unit_scale=True, 34 | unit_divisor=1024, 35 | ) 36 | with open(filename, "wb") as f: 37 | for data in progress: 38 | # write data read to the file 39 | f.write(data) 40 | # update the progress bar manually 41 | progress.update(len(data)) 42 | print("================================\nFile saved as:", filename) 43 | 44 | 45 | def link_finder(link, mirror_used): 46 | # link_ finder is searching Libgen for download link and filename 47 | page = requests.get(link) 48 | soup = Soup(page.content, "html.parser") 49 | searcher = [a["href"] for a in soup.find_all(href=True) if a.text] 50 | try: 51 | filename = soup.find("input")["value"] 52 | except TypeError: 53 | filename = None 54 | if searcher[0].startswith("http") is False: 55 | searcher[0] = mirror_used + searcher[0] 56 | results = [filename, searcher[0]] 57 | return results 58 | 59 | 60 | def search(term): 61 | # This function is used when searching to LibGen with the command 62 | # bookcut search -t "keyword" 63 | 64 | url = mirror_checker() 65 | if url is not None: 66 | br = mechanize.Browser() 67 | br.set_handle_robots(False) # ignore robots 68 | br.set_handle_refresh(False) # 69 | br.addheaders = [("User-agent", "Firefox")] 70 | 71 | br.open(url) 72 | br.select_form("libgen") 73 | input_form = term 74 | br.form["req"] = input_form 75 | ac = br.submit() 76 | html_from_page = ac 77 | soup = Soup(html_from_page, "html.parser") 78 
| table = soup.find_all("table")[2] 79 | 80 | table_data = [] 81 | mirrors = [] 82 | extensions = [] 83 | 84 | for i in table: 85 | j = 0 86 | try: 87 | td = i.find_all("td") 88 | for tr in td: 89 | # scrape mirror links 90 | if j == 9: 91 | temp = tr.find("a", href=True) 92 | mirrors.append(temp["href"]) 93 | j = j + 1 94 | row = [tr.text for tr in td] 95 | table_data.append(row) 96 | extensions.append(row[8]) 97 | 98 | except: 99 | pass 100 | 101 | # Clean result page 102 | for j in table_data: 103 | j.pop(0) 104 | del j[8:15] 105 | headers = [ 106 | "Author(s)", 107 | "Title", 108 | "Publisher", 109 | "Year", 110 | "Pages", 111 | "Language", 112 | "Size", 113 | "Extension", 114 | ] 115 | 116 | try: 117 | tabular = pd.DataFrame(table_data) 118 | tabular.index += 1 119 | tabular.columns = headers 120 | print(tabular) 121 | choices = [] 122 | temp = len(mirrors) + 1 123 | for i in range(1, temp): 124 | choices.append(str(i)) 125 | choices.append("C") 126 | choices.append("c") 127 | while True: 128 | tell_me = str( 129 | input( 130 | "\n\nPlease enter a number from 1 to {number}" 131 | ' to download a book or press "C" to abort' 132 | " search: ".format(number=len(extensions)) 133 | ) 134 | ) 135 | if tell_me in choices: 136 | if tell_me == "C" or tell_me == "c": 137 | print("Aborted!") 138 | return None 139 | else: 140 | c = int(tell_me) - 1 141 | results = [mirrors[c], extensions[c]] 142 | return results 143 | except ValueError: 144 | print("\nNo results found or bad Internet connection.") 145 | print("Please,try again.") 146 | return None 147 | else: 148 | print("\nNo results found or bad Internet connection.") 149 | print("Please,try again.") 150 | 151 | 152 | def single_search(): 153 | def search(term): 154 | # This function is used when searching to LibGen with the command 155 | # bookcut search -t "keyword" 156 | 157 | url = mirror_checker() 158 | if url is not None: 159 | br = mechanize.Browser() 160 | br.set_handle_robots(False) # ignore robots 161 | 
br.set_handle_refresh(False) # 162 | br.addheaders = [("User-agent", "Firefox")] 163 | 164 | br.open(url) 165 | br.select_form("libgen") 166 | input_form = term 167 | br.form["req"] = input_form 168 | ac = br.submit() 169 | html_from_page = ac 170 | soup = Soup(html_from_page, "html.parser") 171 | table = soup.find_all("table")[2] 172 | 173 | table_data = [] 174 | mirrors = [] 175 | extensions = [] 176 | 177 | for i in table: 178 | j = 0 179 | try: 180 | td = i.find_all("td") 181 | for tr in td: 182 | # scrape mirror links 183 | if j == 9: 184 | temp = tr.find("a", href=True) 185 | mirrors.append(temp["href"]) 186 | j = j + 1 187 | row = [tr.text for tr in td] 188 | table_data.append(row) 189 | extensions.append(row[8]) 190 | 191 | except: 192 | pass 193 | 194 | # Clean result page 195 | for j in table_data: 196 | j.pop(0) 197 | del j[8:15] 198 | headers = [ 199 | "Author(s)", 200 | "Title", 201 | "Publisher", 202 | "Year", 203 | "Pages", 204 | "Language", 205 | "Size", 206 | "Extension", 207 | ] 208 | 209 | try: 210 | tabular = pd.DataFrame(table_data) 211 | tabular.index += 1 212 | tabular.columns = headers 213 | print(tabular) 214 | choices = [] 215 | temp = len(mirrors) + 1 216 | for i in range(1, temp): 217 | choices.append(str(i)) 218 | choices.append("C") 219 | choices.append("c") 220 | while True: 221 | tell_me = str( 222 | input( 223 | "\n\nPlease enter a number from 1 to {number}" 224 | ' to download a book or press "C" to abort' 225 | " search: ".format(number=len(extensions)) 226 | ) 227 | ) 228 | if tell_me in choices: 229 | if tell_me == "C" or tell_me == "c": 230 | print("Aborted!") 231 | return None 232 | else: 233 | c = int(tell_me) - 1 234 | print(mirrors[c], " ", extensions[c]) 235 | results = [mirrors[c], extensions[c]] 236 | return results 237 | except ValueError: 238 | print("\nNo results found or bad Internet connection.") 239 | print("Please,try again.") 240 | return None 241 | else: 242 | print("\nNo results found or bad Internet 
connection.") 243 | print("Please,try again.") 244 | 245 | 246 | def choose_a_book(dataframe): 247 | # asks the user which book to download from the printed DataFrame 248 | if dataframe.empty is False: 249 | dataframe.index += 1 250 | print(dataframe[["Author(s)", "Title", "Size", "Extension"]]) 251 | 252 | urls = dataframe["Url"].to_list() 253 | titles = dataframe["Title"].to_list() 254 | extensions = dataframe["Extension"].to_list() 255 | choices = [] 256 | temp = len(urls) + 1 257 | for i in range(1, temp): 258 | choices.append(str(i)) 259 | choices.append("C") 260 | choices.append("c") 261 | try: 262 | while True: 263 | tell_me = str( 264 | input( 265 | "\n\nPlease enter a number from 1 to {number}" 266 | ' to download a book or press "C" to abort' 267 | " search: ".format(number=len(urls)) 268 | ) 269 | ) 270 | if tell_me in choices: 271 | if tell_me == "C" or tell_me == "c": 272 | print("Aborted!") 273 | return None 274 | else: 275 | c = int(tell_me) - 1 276 | filename = titles[c] + "." 
+ extensions[c] 277 | filename = filename_refubrished(filename) 278 | if urls[c].startswith("https://export.arxiv.org/"): 279 | search_downloader(filename, urls[c]) 280 | return False 281 | else: 282 | mirror_used = mirror_checker(False) 283 | link = mirror_used + urls[c] 284 | details = link_finder(link, mirror_used) 285 | file_link = details[1] 286 | search_downloader(filename, file_link) 287 | return False 288 | except ValueError: 289 | print(RESULT_ERROR) 290 | print("Please,try again.") 291 | return None 292 | else: 293 | print(RESULT_ERROR) 294 | -------------------------------------------------------------------------------- /bookcut/settings.py: -------------------------------------------------------------------------------- 1 | import configparser 2 | import os 3 | from bookcut.downloader import pathfinder 4 | 5 | 6 | def initial_config(): 7 | """function to create settings .ini file, used also for restore settings""" 8 | try: 9 | write_config = configparser.ConfigParser() 10 | module_path = os.path.dirname(os.path.realpath(__file__)) 11 | settings_ini = os.path.join(module_path, "Settings.ini") 12 | 13 | write_config.add_section("LibGen") 14 | write_config.add_section("Settings") 15 | mirrors = "https://libgen.lc/,http://libgen.li/,http://185.39.10.101/,http://genesis.lib/" 16 | write_config.set("LibGen", "mirrors", mirrors) 17 | write_config.set("Settings", "clean_screen", "True") 18 | write_config.set("Settings", "destination", "None") 19 | 20 | cfgfile = open(settings_ini, "w") 21 | write_config.write(cfgfile) 22 | cfgfile.close() 23 | except PermissionError as error: 24 | print("\n", error) 25 | print("You have to be administrator to change BookCut settings. 
") 26 | 27 | 28 | def mirrors_append(url): 29 | """function to append the LibGen mirrors list""" 30 | 31 | try: 32 | 33 | # READ EXISTING LIST 34 | config = configparser.ConfigParser() 35 | module_path = os.path.dirname(os.path.realpath(__file__)) 36 | settings_ini = os.path.join(module_path, "Settings.ini") 37 | 38 | config.read(settings_ini) 39 | mirrors = config.get("LibGen", "mirrors") 40 | mirrors = mirrors + "," + url 41 | 42 | # APPEND LIST 43 | mirrors = str(mirrors) 44 | config.set("LibGen", "mirrors", mirrors) 45 | 46 | # WRITE TO INI FILE 47 | cfgfile = open(settings_ini, "w") 48 | config.write(cfgfile) 49 | cfgfile.close() 50 | 51 | # Succefully message 52 | print("\nSuccesfully added to list!:") 53 | mirrors = mirrors.split(",") 54 | for i in mirrors: 55 | print(i) 56 | except PermissionError as error: 57 | print("\n", error) 58 | print("You have to be administrator to change BookCut settings. ") 59 | 60 | 61 | def read_settings(): 62 | # read the config file settings and printing them 63 | 64 | # get ini file path 65 | config = configparser.ConfigParser() 66 | module_path = os.path.dirname(os.path.realpath(__file__)) 67 | settings_ini = os.path.join(module_path, "Settings.ini") 68 | 69 | # get values 70 | config.read(settings_ini) 71 | clean_screen = config.get("Settings", "clean_screen") 72 | destination = config.get("Settings", "destination") 73 | settings = [clean_screen, destination] 74 | return settings 75 | 76 | 77 | def print_settings(): 78 | """Prints settings""" 79 | settings = read_settings() 80 | 81 | print("\nBookCut Settings:\n") 82 | print("1.Clean Screen Option Enabled: ", settings[0]) 83 | print("2.Destination Folder Path: ", settings[1]) 84 | 85 | 86 | def screen_setting(input): 87 | """clean screen settings adjust""" 88 | try: 89 | config = configparser.ConfigParser() 90 | module_path = os.path.dirname(os.path.realpath(__file__)) 91 | settings_ini = os.path.join(module_path, "Settings.ini") 92 | 93 | config.read(settings_ini) 94 | 
config.set("Settings", "clean_screen", input) 95 | cfgfile = open(settings_ini, "w") 96 | config.write(cfgfile) 97 | cfgfile.close() 98 | except PermissionError as error: 99 | print("\n", error) 100 | print("You have to be administrator to change BookCut settings. ") 101 | 102 | 103 | def set_destination(path): 104 | try: 105 | if os.path.isdir(path): 106 | module_path = os.path.dirname(os.path.realpath(__file__)) 107 | settings_ini = os.path.join(module_path, "Settings.ini") 108 | 109 | config = configparser.ConfigParser() 110 | config.read(settings_ini) 111 | config.set("Settings", "destination", path) 112 | cfgfile = open(settings_ini, "w") 113 | config.write(cfgfile) 114 | cfgfile.close() 115 | print("Destination path changed!\n", path) 116 | else: 117 | try: 118 | os.makedirs(path) 119 | print("Created folder: ", path) 120 | except FileNotFoundError as error: 121 | print("\n", error) 122 | print("(!) Not a valid path please try again!") 123 | 124 | except PermissionError as error: 125 | print("\n", error) 126 | print("(!) 
You have to be administrator to change BookCut settings!") 127 | 128 | 129 | def path_checker(): 130 | settings = read_settings() 131 | if settings[1] != "None": 132 | return settings[1] 133 | else: 134 | path = pathfinder() 135 | return path 136 | 137 | 138 | if __name__ == "__main__": 139 | initial_config() 140 | -------------------------------------------------------------------------------- /conftest.py: -------------------------------------------------------------------------------- 1 | def pytest_addoption(parser): 2 | parser.addoption( 3 | "--web", 4 | action="store_true", 5 | dest="web", 6 | default=False, 7 | help="enable tests requiring an internet connection", 8 | ) 9 | 10 | 11 | def pytest_configure(config): 12 | if not config.option.web: 13 | setattr(config.option, "markexpr", "not web") 14 | -------------------------------------------------------------------------------- /pytest.ini: -------------------------------------------------------------------------------- 1 | [pytest] 2 | minversion = 6.0 3 | testpaths = 4 | tests 5 | markers = 6 | web: mark tests which require an internet connection 7 | -------------------------------------------------------------------------------- /setup.py: -------------------------------------------------------------------------------- 1 | import setuptools 2 | import sys 3 | import pathlib 4 | 5 | if sys.version_info.major < 3: 6 | print("\nPython 2 is not supported! \nPlease upgrade to Python 3.\n") 7 | print( 8 | "Installation of BookCut stopped, please try again with\n" 9 | "a newer version of Python!" 
10 | ) 11 | sys.exit(1) 12 | 13 | # The directory containing this file 14 | HERE = pathlib.Path(__file__).parent 15 | 16 | # The text of the README file 17 | README = (HERE / "README.md").read_text() 18 | 19 | setuptools.setup( 20 | name="BookCut", 21 | python_requires=">3.5.2", 22 | version="1.3.7", 23 | author="Costis94", 24 | author_email="gravitymusician@gmail.com", 25 | description="Command Line Interface app to download ebooks", 26 | long_description_content_type="text/markdown", 27 | long_description=README, 28 | url="https://github.com/costis94/bookcut", 29 | packages=setuptools.find_packages(), 30 | classifiers=[ 31 | "Programming Language :: Python :: 3", 32 | "License :: OSI Approved :: MIT License", 33 | "Operating System :: OS Independent", 34 | ], 35 | install_requires=[ 36 | "pandas", 37 | "click>=7.1.2", 38 | "requests", 39 | "beautifulsoup4", 40 | "pyfiglet", 41 | "tqdm", 42 | "mechanize", 43 | ], 44 | extras_require={ 45 | "dev": [ 46 | "pytest", 47 | "pytest-cov", 48 | "pre-commit", 49 | "black", 50 | ] 51 | }, 52 | include_package_data=True, 53 | entry_points=""" 54 | [console_scripts] 55 | bookcut=bookcut.bookcut:entry 56 | """, 57 | ) 58 | -------------------------------------------------------------------------------- /tests/__init__.py: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/costis94/bookcut/88a06bf6e7962f6b013b9f45d23886e255d7a9f2/tests/__init__.py -------------------------------------------------------------------------------- /tests/test_book.py: -------------------------------------------------------------------------------- 1 | import pytest 2 | from bookcut.mirror_checker import main as mirror_checker 3 | from bookcut.book import Booksearch 4 | 5 | 6 | @pytest.mark.web 7 | def test_single_book_download(): 8 | title = "Iliad" 9 | author = "Homer" 10 | publisher = " " 11 | type_format = " " 12 | book = Booksearch(title, author, publisher, type_format, 
mirror_checker()) 13 | result = book.search() 14 | extensions = result["extensions"] 15 | print("extensions: ", extensions) 16 | tb = result["table_data"] 17 | mirrors = result["mirrors"] 18 | assert mirrors[0].startswith("http"), "Not correct format of Mirror URL." 19 | assert type(extensions) is list, "Wrong format of extension details." 20 | file_details = book.give_result(extensions, tb, mirrors, extensions[0]) 21 | -------------------------------------------------------------------------------- /tests/test_bookcut.py: -------------------------------------------------------------------------------- 1 | import pytest 2 | 3 | from click.testing import CliRunner 4 | from bookcut import __version__ 5 | from bookcut.bookcut import entry 6 | 7 | 8 | def test_entry_with_version_option(): 9 | cli_output = CliRunner().invoke(entry, ["--version"]) 10 | assert cli_output.exit_code == 0 11 | assert cli_output.output == f"commands, version {__version__}\n" 12 | -------------------------------------------------------------------------------- /tests/test_main.py: -------------------------------------------------------------------------------- 1 | import unittest 2 | -------------------------------------------------------------------------------- /tests/test_mirror_checker.py: -------------------------------------------------------------------------------- 1 | import pytest 2 | from bookcut.mirror_checker import pageStatus, main as mirror_checker 3 | from bookcut.mirror_checker import requests, CONNECTION_ERROR_MESSAGE 4 | from requests import ConnectionError 5 | 6 | TEST_URL = "http://www.sometesturl.com" 7 | 8 | 9 | @pytest.mark.web 10 | def test_mirror_availability(): 11 | available_mirror = mirror_checker() 12 | assert type(available_mirror) is str, "Not correct type of LibGen Url" 13 | assert available_mirror.startswith("http"), "Not correct LibGen Url." 
14 | 15 | 16 | @pytest.mark.parametrize("status_code", [200, 301]) 17 | def test_openLibraryStatus_output_if_it_can_connect(monkeypatch, capsys, status_code): 18 | def mock_requests_head(_): 19 | return type("_", (), {"status_code": status_code}) 20 | 21 | monkeypatch.setattr(requests, "head", mock_requests_head) 22 | assert pageStatus(TEST_URL) 23 | captured = capsys.readouterr() 24 | assert captured.out == f"Connected to: {TEST_URL}\n" 25 | 26 | 27 | def test_openLibraryStatus_output_for_wrong_status_code(monkeypatch, capsys): 28 | def mock_requests_head(_): 29 | return type("_", (), {"status_code": 42}) 30 | 31 | monkeypatch.setattr(requests, "head", mock_requests_head) 32 | assert not pageStatus(TEST_URL) 33 | captured = capsys.readouterr() 34 | assert captured.out == CONNECTION_ERROR_MESSAGE.format(TEST_URL) + "\n" 35 | 36 | 37 | def test_openLibraryStatus_output_on_connection_error(monkeypatch, capsys): 38 | def mock_requests_head(_): 39 | raise ConnectionError 40 | 41 | monkeypatch.setattr(requests, "head", mock_requests_head) 42 | assert not pageStatus(TEST_URL) 43 | captured = capsys.readouterr() 44 | assert captured.out == CONNECTION_ERROR_MESSAGE.format(TEST_URL) + "\n" 45 | 46 | 47 | @pytest.mark.web 48 | def test_open_libraryStatus(): 49 | status = pageStatus(url="http://www.openlibrary.org") 50 | assert status is not False, "OpenLibrary Status != 200" 51 | 52 | 53 | @pytest.mark.web 54 | def test_archiv_Status(): 55 | status = pageStatus(url="http://export.arxiv.org/") 56 | assert status is not False, "ArXiv Status != 200" 57 | --------------------------------------------------------------------------------
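The tests in tests/test_mirror_checker.py swap `requests.head` for a stub so that `pageStatus` can be exercised offline. The same idea can be sketched without pytest or network access by passing the HEAD callable in as a parameter. This is only an illustrative sketch, not code from the repository: `page_status`, `first_available`, `fake_head`, and the example URLs are hypothetical names, and the builtin `ConnectionError` stands in for `requests.ConnectionError`.

```python
from types import SimpleNamespace

# Status codes treated as "reachable", matching the 200/301 check in pageStatus
OK_CODES = (200, 301)


def page_status(url, head, verbose=True):
    """Hypothetical variant of pageStatus: `head` is injected by the caller
    instead of being looked up as requests.head."""
    try:
        response = head(url)
    except ConnectionError:  # assumption: stands in for requests.ConnectionError
        return False
    if response.status_code in OK_CODES:
        if verbose:
            print("Connected to:", url)
        return True
    return False


def first_available(mirrors, head):
    """Return the first mirror that answers with an accepted status, else None."""
    for url in mirrors:
        if page_status(url, head, verbose=False):
            return url
    return None


def fake_head(url):
    """Stub for a HEAD request: 'down' hosts raise, 'up' hosts answer 200,
    everything else answers 404."""
    if "down" in url:
        raise ConnectionError
    return SimpleNamespace(status_code=200 if "up" in url else 404)
```

Calling `first_available(["http://down.example", "http://up.example"], fake_head)` walks the list in order and skips the unreachable host, which is roughly what the mirror check does against the configured LibGen mirrors. Injecting the callable is one possible design; the existing tests reach the same result with `monkeypatch.setattr`.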