├── .gitignore
├── .pre-commit-config.yaml
├── CHANGELOG.md
├── LICENSE.txt
├── MANIFEST.in
├── README.md
├── bookcut
    ├── Settings.ini
    ├── __init__.py
    ├── article.py
    ├── bibliography.py
    ├── book.py
    ├── book_details.py
    ├── bookcut.py
    ├── booklist.py
    ├── downloader.py
    ├── libgen.py
    ├── mirror_checker.py
    ├── organise.py
    ├── repositories.py
    ├── search.py
    └── settings.py
├── conftest.py
├── pytest.ini
├── setup.py
└── tests
    ├── __init__.py
    ├── test_book.py
    ├── test_bookcut.py
    ├── test_main.py
    └── test_mirror_checker.py


/.gitignore:
--------------------------------------------------------------------------------
1 | __pycache__/
2 | bookdb.db
3 | BookCat.egg-info/
4 | build/
5 | dist/
6 | BookCut.egg-info/
7 | resources.json
8 | 


--------------------------------------------------------------------------------
/.pre-commit-config.yaml:
--------------------------------------------------------------------------------
1 | repos:
2 |   - repo: https://github.com/psf/black
3 |     rev: 21.6b0
4 |     hooks:
5 |       - id: black
6 |         language_version: python3
7 | 


--------------------------------------------------------------------------------
/CHANGELOG.md:
--------------------------------------------------------------------------------
 1 | # Changelog
 2 | All notable changes to this project will be documented in this file.
 3 | 
 4 | 
 5 | ## [1.3.7]
 6 | 
 7 | ### Added
 8 | - Black autoformatting
 9 | 
10 | ### Fixed
11 | - Issue#7
12 | 
13 | 
14 | ## [1.3.6]
15 | - Repos option added. ArXiv is now included.
16 | - Fixed known bugs
17 | 
18 | ## [1.3.5] - 06_Jun_2021
19 | - Fixed Issue#9
20 | 
21 | ### Fixed
22 | - Organise options known bug that could not move some files fixed.
23 | 
24 | ## [1.3.1] - 11_Aug_2020
25 | 
26 | ### Fixed
27 | - Organise options known bug that could not move some files fixed.
28 | 
29 | ## [1.3.0] - 10_Aug_2020
30 | 
31 | ### Added
32 | - Configuration mode: users can now change some basic BookCut settings, such as
33 | the destination folder and the clear-screen option, and can add more LibGen mirrors.
34 | 
35 | ### Fixed
36 | - Some raise errors and some known bugs.
37 | 
38 | ## [1.2.3] - 07_Aug_2020
39 | 
40 | ### Fixed
41 | - [Bug] Issue#6 Forced list error.
42 | 
43 | 
44 | ## [1.2.2] - 03_Aug_2020
45 | 
46 | ### Fixed
47 | - [Bug] Issue with FileNotFoundError, when no home/Documents directory existed.
48 | 
49 | 
50 | ## [1.2.1] - 03_Aug_2020
51 | 
52 | ### Added
53 | - Version option added ('bookcut --version')
54 | - List forced flag, for downloading the found books automatically.
55 | - Added this file (CHANGELOG.md).
56 | 
57 | ### Fixed
58 | - Search option bug, when no results found.
59 | - PEP8 fixes
60 | 
61 | ## [1.2.0] - 03_Aug_2020
62 | ### Added
63 | - Details option
64 | - Bibliography option which returns a list with all the books from an author, with the option to save to .txt file
65 | 


--------------------------------------------------------------------------------
/LICENSE.txt:
--------------------------------------------------------------------------------
 1 | MIT License
 2 | 
 3 | Copyright (c) [2020] [Costis94]
 4 | 
 5 | Permission is hereby granted, free of charge, to any person obtaining a copy
 6 | of this software and associated documentation files (the "Software"), to deal
 7 | in the Software without restriction, including without limitation the rights
 8 | to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
 9 | copies of the Software, and to permit persons to whom the Software is
10 | furnished to do so, subject to the following conditions:
11 | 
12 | The above copyright notice and this permission notice shall be included in all
13 | copies or substantial portions of the Software.
14 | 
15 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
16 | IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
17 | FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
18 | AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
19 | LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
20 | OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
21 | SOFTWARE.
22 | 


--------------------------------------------------------------------------------
/MANIFEST.in:
--------------------------------------------------------------------------------
1 | include bookcut/Settings.ini
2 | include README.md
3 | 


--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
  1 | 
  2 | <img src="https://i.imgur.com/ZUX2ehE.png" width="256" height="69">
  3 | 
  4 | [![Downloads](https://pepy.tech/badge/bookcut)](https://pepy.tech/project/bookcut) ![pypi](https://img.shields.io/pypi/v/pip.svg)
  5 | [![Code style: black](https://img.shields.io/badge/code%20style-black-000000.svg)](https://github.com/psf/black)
  6 | 
  7 | 
  8 | BookCut is a Python command-line tool that helps users download **free e-books**,
  9 | **organise** them in folders by genre, **retrieve** book details by *ISBN* or *title*,
 10 | and get a list of **all the books by a writer**, which can be saved to a .txt file.
 11 | 
 12 | *With the help of LibGen, ArXiv and OpenLibrary.*
 13 | 
 14 | 
 15 | ## REQUIREMENTS
 16 | 
 17 | * Python 3
 18 | * python3-pip
 19 | 
 20 | 
 21 | ## Installation
 22 | 
 23 | * **Install with pip:**
 24 | 
 25 | ```bash
 26 | pip install bookcut
 27 | # or, if Python 2 is also installed
 28 | pip3 install bookcut
 29 | ```
 30 | 
 31 | 
 32 | ## Usage
 33 | 
 34 | ### Searching and downloading books:
 35 | 
 36 | * Download a **single** book:
 37 | 
 38 | ```bash
 39 | bookcut book -b "White Fang" -a "Jack London"
 40 | ```
 41 | 
 42 | * Download a **list** of books:
 43 | 
 44 | ```bash
 45 | bookcut list -f "FreeEbooksToDownload.txt"
 46 | ```
 47 | 
 48 | * Organise a **folder** full of e-books to folders according to genre:
 49 | 
 50 | ```bash
 51 | bookcut organise -d "full/path/to/folder"
 52 | ```
 53 | ***
 54 | * Search **LibGen**, output the results and download an e-book:
 55 | 
 56 | ```bash
 57 | bookcut search -t 'Homer Odyssey'
 58 | ```
 59 | 
 60 | * Search more book repositories with the **--repos** option:
 61 | ``` bash
 62 | bookcut search -t 'Homer Odyssey' --repos 'libgen,arxiv'
 63 | ```
 64 |   **Available book repositories: Libgen, ArXiv**
 65 | ***
 66 | 
 67 | * Get the **details** of a book by **title and author**, or simply **ISBN**.
 68 | 
 69 | ```bash
 70 | bookcut details -b 'Homer Iliad'
 71 | ```
 72 | 
 73 | * Get a list of *all the books* by an **author**, with an option to save it to .txt:
 74 | 
 75 | ```bash
 76 | bookcut all-books --author 'Stephen King'
 77 | ```
 78 | ***
 79 | ### Searching and downloading articles:
 80 | Now you can use bookcut to search and download **scientific articles**.
 81 | 
 82 |  - Search with the Digital Object Identifier:
 83 | ```
 84 |  bookcut article --doi "10.1126/science.196.4287.293"
 85 | ```
 86 | - Search with the exact title:
 87 | ```  
 88 |  bookcut article --title "Ribulose Bisphosphate Carboxylase  A Two-Layered, Square-Shaped Molecule of Symmetry"
 89 | ```
 90 | ****
 91 | ### Configuration
 92 | * You can also change some basic BookCut settings. For more, check:
 93 | 
 94 | ```bash
 95 | bookcut config --help
 96 | ```
 97 | 
 98 | ## TO-DO
 99 | * Add Tests
100 | * Add documentation
101 | * Add more sources with free e-books
102 | * Fix organiser so it can use all types of files
103 | * Add a logger.
104 | 
105 | ## Copyrights
106 | Please use the bookcut app to download **only free e-books** that are legally distributed through *BookCut repositories.*
107 | Bookcut contributors are not responsible for how the tool is used.
108 | ## Contributing
109 | Pull requests are welcome. This is my first project, so be kind.
110 | For major changes, please open an issue first to discuss what you would like to change.
111 | 
112 | Please make sure to update tests as appropriate.
113 | 
114 | ## License
115 | [MIT](https://choosealicense.com/licenses/mit/)
116 | 


--------------------------------------------------------------------------------
/bookcut/Settings.ini:
--------------------------------------------------------------------------------
 1 | [LibGen]
 2 | mirrors = https://libgen.lc/,http://libgen.li/,http://185.39.10.101/,http://genesis.lib/
 3 | 
 4 | [Repositories]
 5 | available_repos = arxiv,libgen
 6 | 
 7 | [ArticleRepositories]
 8 | article_repos = openaccessbutton
 9 | 
10 | [Settings]
11 | clean_screen = True
12 | destination = None
13 | 
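
The settings above are plain INI, so they can be read with the standard-library `configparser`. The following is only a minimal sketch under stated assumptions: the inline string stands in for the file on disk, `read_list` is a hypothetical helper, and BookCut's own `settingParser` is not shown in this listing and may work differently.

```python
import configparser

# Inline copy of part of Settings.ini, used here instead of reading the file.
SETTINGS_TEXT = """
[LibGen]
mirrors = https://libgen.lc/,http://libgen.li/,http://185.39.10.101/,http://genesis.lib/

[Repositories]
available_repos = arxiv,libgen

[Settings]
clean_screen = True
destination = None
"""


def read_list(parser, section, option):
    """Hypothetical helper: split a comma-separated option into a list."""
    raw = parser.get(section, option)
    return [item.strip() for item in raw.split(",") if item.strip()]


parser = configparser.ConfigParser()
parser.read_string(SETTINGS_TEXT)

mirrors = read_list(parser, "LibGen", "mirrors")
repos = read_list(parser, "Repositories", "available_repos")
# getboolean() understands "True"/"False" regardless of case
clean_screen = parser.getboolean("Settings", "clean_screen")
# "None" is stored as plain text, so it is mapped to None explicitly
destination = parser.get("Settings", "destination")
destination = None if destination == "None" else destination

print(mirrors, repos, clean_screen, destination)
```

Splitting the comma-separated values into lists makes membership checks exact, whereas testing against the raw string would also accept substrings of a repository name.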


--------------------------------------------------------------------------------
/bookcut/__init__.py:
--------------------------------------------------------------------------------
1 | from importlib import metadata
2 | 
3 | __version__ = metadata.version("bookcut")
4 | 


--------------------------------------------------------------------------------
/bookcut/article.py:
--------------------------------------------------------------------------------
 1 | from bookcut.repositories import open_access_button
 2 | from bookcut.downloader import filename_refubrished
 3 | from bookcut.search import search_downloader
 4 | from click import confirm
 5 | 
 6 | """
 7 | article.py is used by the article command and searches repositories for
 8 | published articles.
 9 | """
10 | 
11 | 
12 | def article_search(doi, title):
13 |     try:
14 |         article_json_data = open_access_button(doi, title)
15 |         url = article_json_data["url"]
16 |         metadata = article_json_data["metadata"]
17 |         title = metadata["title"]
18 |         filename = filename_refubrished(title)
19 |         filename = filename + ".pdf"
20 |         ask_for_downloading(filename, url)
21 |     except KeyError:
22 |         print("\nCannot find the given article.\nPlease try another search!")
23 | 
24 | 
25 | def ask_for_downloading(articlefilename, url):
26 |     ask = confirm(f"Do you want to download:\n {articlefilename}")
27 |     if ask is True:
28 |         search_downloader(articlefilename, url)
29 |     else:
30 |         print("Aborted!")
31 | 


--------------------------------------------------------------------------------
/bookcut/bibliography.py:
--------------------------------------------------------------------------------
 1 | import requests
 2 | import json
 3 | import re
 4 | from difflib import SequenceMatcher
 5 | import os
 6 | from bookcut.mirror_checker import pageStatus
 7 | 
 8 | """This file is used by the all-books command.
 9 |    It searches OpenLibrary for all books written by an
10 |    author and lets the user save the list to a .txt file"""
11 | 
12 | OPEN_LIBRARY_URL = "http://www.openlibrary.org"
13 | 
14 | 
15 | def main(author, similarity):
16 |     # returns all the books written by an author from openlibrary
17 |     # using similarity for filtering the results
18 |     status = pageStatus(OPEN_LIBRARY_URL)
19 |     if status is not False:
20 |         search_url = "http://openlibrary.org/search.json?author=" + author
21 |         jason = requests.get(search_url)
22 |         jason = jason.text
23 |         data = json.loads(jason)
24 |         data = data["docs"]
25 |         if data != []:
26 |             # collect the title of every returned document
27 |             books = []
28 |             for doc in data:
29 |                 if "title" in doc:
30 |                     books.append(doc["title"])
31 |             # dict.fromkeys() drops exact duplicates while keeping order
32 |             mylist = list(dict.fromkeys(books))
33 | 
34 |             #       Filtering results: trying to erase similar titles
35 |             words = [
36 |                 " the ",
37 |                 "The ",
38 |                 " THE ",
39 |                 " The",
40 |                 " a ",
41 |                 " and ",
42 |                 " of ",
43 |                 " from ",
44 |                 "on",
45 |                 "The",
46 |                 "in",
47 |             ]
48 | 
49 |             noise_re = re.compile(
50 |                 "\\b(%s)\\W" % ("|".join(map(re.escape, words))), re.I
51 |             )
52 |             clean_mylist = [noise_re.sub("", p) for p in mylist]
53 | 
54 |             kept = []
55 |             for i in clean_mylist:
56 |                 # keep a title only if no kept title is a near-duplicate
57 |                 if not any(similar(i, j, similarity) for j in kept):
58 |                     kept.append(i)
59 |             clean_mylist = kept
60 |             clean_mylist.sort()
61 |             print(" ~Books found to OpenLibrary Database:\n")
62 |             for i in clean_mylist:
63 |                 print(i)
64 |             return clean_mylist
65 |         else:
66 |             print("(!) No valid author name, or bad internet connection.")
67 |             print("Please try again!")
68 |             return None
69 | 
70 | 
71 | def similar(a, b, similarity):
72 |     """Return True when two strings are similar but not identical."""
73 |     ratio = SequenceMatcher(None, a, b).ratio()
74 |     if ratio > similarity and ratio < 1:
75 |         return True
76 |     else:
77 |         return False
78 | 
79 | 
80 | def save_to_txt(lista, path, author):
81 |     # save the books list to txt file.
82 |     name = f"{author}_bibliography.txt"
83 |     full_path = os.path.join(path, name)
84 |     with open(full_path, "a", encoding="utf-8") as f1:
85 |         for content in lista:
86 |             f1.write(content + " " + author + "\n")
87 |     print("\nList saved at: ", full_path, "\n")
88 | 
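
The similarity filter in `main` can be illustrated in isolation. This is a minimal sketch rather than the project's implementation: `is_near_duplicate` and `filter_titles` are hypothetical names, the sample titles are made up, and unlike `similar` above the comparison also treats identical strings as duplicates.

```python
from difflib import SequenceMatcher


def is_near_duplicate(a, b, threshold):
    """Return True when two titles are at least `threshold` similar."""
    return SequenceMatcher(None, a, b).ratio() >= threshold


def filter_titles(titles, threshold=0.7):
    """Keep the first title of every group of near-duplicates, sorted."""
    kept = []
    for title in titles:
        # compare only against titles already kept, never against the
        # list being iterated, so nothing is skipped or removed mid-loop
        if not any(is_near_duplicate(title, other, threshold) for other in kept):
            kept.append(title)
    return sorted(kept)


# Made-up sample data: the second entry differs only by a trailing period.
titles = ["Carrie", "Carrie.", "It"]
print(filter_titles(titles))
```

Building a second list avoids mutating the list that is being looped over, which is the usual source of skipped items in this kind of filter.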


--------------------------------------------------------------------------------
/bookcut/book.py:
--------------------------------------------------------------------------------
  1 | from bookcut.mirror_checker import settingParser
  2 | import mechanize
  3 | from bs4 import BeautifulSoup as Soup
  4 | from bookcut.libgen import file_name
  5 | from click import confirm
  6 | from bookcut.downloader import downloading
  7 | from bookcut.repositories import arxiv, libgen_repo
  8 | import pandas as pd
  9 | from bookcut.search import choose_a_book
 10 | 
 11 | 
 12 | def libgen_book_find(
 13 |     title, author, publisher, destination, extension, force, libgenurl
 14 | ):
 15 |     """searching @ LibGen for a single book"""
 16 |     try:
 17 |         book = Booksearch(title, author, publisher, extension, libgenurl)
 18 |         result = book.search()
 19 |         extensions = result["extensions"]
 20 |         tb = result["table_data"]
 21 |         mirrors = result["mirrors"]
 22 |         file_details = book.give_result(extensions, tb, mirrors, extension)
 23 |         if file_details is not None:
 24 |             book.cursor(file_details["url"], destination, file_details["file"], force)
 25 |     except TypeError:
 26 |         # TODO add logger error
 27 |         pass
 28 | 
 29 | 
 30 | def book_searching_in_repos(term, repos):
 31 |     # search a book in various Repositories
 32 |     if repos is None:
 33 |         libgen_data = libgen_repo(term)
 34 |         return libgen_data
 35 |     repos = repos.split(",")
 36 |     repos = [i.strip(" ") for i in repos]
 37 |     available_repos = settingParser("Repositories", "available_repos")
 38 |     df = pd.DataFrame({"Author(s)": [], "Title": [], "Size": [], "Extension": []})
 39 |     for i in repos:
 40 |         if i in available_repos:
 41 |             if i == "arxiv":
 42 |                 arxiv_data = arxiv(term)
 43 |                 df = pd.concat([df, arxiv_data], ignore_index=True)
 44 |             if i == "libgen":
 45 |                 libgen_data = libgen_repo(term)
 46 |                 df = pd.concat([df, libgen_data], ignore_index=True)
 47 |     choose_a_book(df)
 48 | 
 49 | 
 50 | class Booksearch:
 51 |     """searching libgen original page and returns book details and mirror link"""
 52 | 
 53 |     def __init__(self, title, author, publisher, filetype, libgenurl):
 54 |         self.title = title
 55 |         self.author = author
 56 |         self.publisher = publisher
 57 |         self.filetype = filetype
 58 |         self.mirror = None
 59 |         self.libgenurl = libgenurl
 60 | 
 61 |     def search(self):
 62 |         """searching libgen and returns table data, extensions and links"""
 63 |         br = mechanize.Browser()
 64 |         br.set_handle_robots(False)  # ignore robots
 65 |         br.set_handle_refresh(False)  #
 66 |         br.addheaders = [("User-agent", "Firefox")]
 67 | 
 68 |         br.open(self.libgenurl)
 69 |         br.select_form("libgen")
 70 |         input_form = self.title + " " + self.author + " " + self.publisher
 71 |         br.form["req"] = input_form
 72 |         ac = br.submit()
 73 |         html_from_page = ac
 74 |         soup = Soup(html_from_page, "html.parser")
 75 |         links_table = soup.find_all("table")[3]
 76 |         table_data = []
 77 |         mirrors = []
 78 |         extensions = []
 79 |         for i in links_table:
 80 |             try:
 81 |                 td = i.find_all("td")
 82 |                 for tr in td:
 83 |                     # scrape mirror links
 84 |                     temp = tr.find("a", href=True)
 85 |                     mirror_page = temp["href"]
 86 |                     # add also mirror link
 87 |                     if mirror_page.startswith("http") is False:
 88 |                         mirror_page = self.libgenurl + temp["href"]
 89 |                     else:
 90 |                         mirror_page = temp["href"]
 91 |                     mirrors.append(mirror_page)
 92 |             except Exception as e:
 93 |                 print(e)
 94 | 
 95 |         # Parse details from table_data
 96 |         table = soup.find_all("table")[2]
 97 |         for i in table:
 98 |             try:
 99 |                 td = i.find_all("td")
100 |                 row = [cell.text for cell in td]
101 |                 table_data.append(row)
102 |                 extensions.append(row[8])
103 |             except Exception:
104 |                 # skip text nodes and rows without an extension column
105 |                 pass
106 |         table_details = dict()
107 |         table_details["extensions"] = extensions
108 |         table_details["table_data"] = table_data
109 |         table_details["mirrors"] = mirrors
110 |         return table_details
111 | 
112 |     def give_result(self, extensions, table_data, mirrors, filetype):
113 |         try:
114 |             if filetype is not None:
115 |                 temp = 0
116 |                 for i in extensions:
117 |                     if filetype == i:
118 |                         result = dict()
119 |                         result["url"] = mirrors[temp]
120 |                         result["file"] = extensions[temp]
121 |                         print("\nDownloading Link: FOUND")
122 |                         return result
123 |                     temp = temp + 1
124 |             else:
125 |                 # return the first result
126 |                 result = dict()
127 |                 result["url"] = mirrors[0]
128 |                 result["file"] = extensions[0]
129 |                 print("\nDownloading Link: FOUND")
130 |                 return result
131 |         except IndexError:
132 |             print("Downloading Link:NOT FOUND\n")
133 |             print("================================")
134 | 
135 |     def cursor(self, url, destination_folder, extension, forced):
136 |         """asking the user to download a chosen book or to abort"""
137 |         title = str(self.title)
138 |         author = str(self.author)
139 |         nameofbook = file_name(url)
140 |         if nameofbook is None:
141 |             nameofbook = (
142 |                 title.replace("\n", "") + author.replace("\n", "") + "." + extension
143 |             )
144 |         if forced is not True:
145 |             ask = confirm(f"Do you want to download:\n {nameofbook}")
146 |             if ask is True:
147 |                 downloading(
148 |                     url, title, author, nameofbook, destination_folder, extension
149 |                 )
150 |         else:
151 |             downloading(url, title, author, nameofbook, destination_folder, extension)
152 | 
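
The handling of the `--repos` value in `book_searching_in_repos` comes down to splitting a comma-separated string and keeping the known names. A minimal sketch, assuming the available repositories are passed as a list; `parse_repos` is a hypothetical helper and the repository name "gutenberg" is only an example of an unknown entry:

```python
def parse_repos(repos, available):
    """Split a comma-separated --repos value into known and unknown names."""
    requested = [name.strip().lower() for name in repos.split(",") if name.strip()]
    selected = [name for name in requested if name in available]
    unknown = [name for name in requested if name not in available]
    return selected, unknown


# "gutenberg" is not in the available list, so it ends up in `unknown`.
selected, unknown = parse_repos("libgen, ArXiv,gutenberg", ["arxiv", "libgen"])
print(selected, unknown)
```

Returning the unknown names separately would let the caller warn about a misspelled repository instead of silently ignoring it.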


--------------------------------------------------------------------------------
/bookcut/book_details.py:
--------------------------------------------------------------------------------
 1 | import requests
 2 | import json
 3 | from requests import ConnectionError
 4 | from bookcut.mirror_checker import pageStatus
 5 | 
 6 | """This file is used by the details command.
 7 |    Its main use is to search OpenLibrary for a book's details.
 8 |    Its input can be the name of the book or the ISBN.
 9 | """
10 | 
11 | OPEN_LIBRARY_URL = "http://www.openlibrary.org"
12 | 
13 | 
14 | def main(term):
15 |     # searching OpenLibrary and prints the details of a book
16 |     try:
17 |         if term is None:
18 |             term = input(
19 |                 "Please enter the book and the author, or the ISBN of the book."
20 |             )
21 |         term = term.replace(" ", "+")
22 |         pageStatus(OPEN_LIBRARY_URL)
23 |         search_url = "http://openlibrary.org/search.json?q=" + term
24 |         jason = requests.get(search_url)
25 |         jason = jason.text
26 |         data = json.loads(jason)
27 |         try:
28 |             data = data["docs"][0]
29 |         except IndexError:
30 |             data = None
31 |             print("Invalid search, please try again.")
32 | 
33 |         if data is not None:
34 |             author = data["author_name"][0]
35 |             title = data["title_suggest"]
36 |             isbn = data["isbn"]
37 |             first_publish_year = data["first_publish_year"]
38 |             try:
39 |                 lang = data["language"]
40 |             except KeyError:
41 |                 lang = None
42 | 
43 |             print("Results for search: ", term, "\n")
44 |             print("Title:", title)
45 |             print("Author(s):", author, "\n")
46 |             print("ISBN(s):", isbn, "\n")
47 |             if lang is not None:
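48 |                 # "language" holds the language codes OpenLibrary returned;
49 |                 # print them as-is, like the ISBN list above
50 |                 print("Language(s): ", lang)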
51 |             print("\nFirst published: ", first_publish_year)
52 |     except ConnectionError:
53 |         url = OPEN_LIBRARY_URL
54 |         print(
55 |             "Unable to connect to:",
56 |             url,
57 |             "\nPlease check your internet connection and try again later.",
58 |         )
59 |     except json.decoder.JSONDecodeError:
60 |         print("An error occurred while retrieving the data.")
61 |         print("Please try again later.")
62 | 


--------------------------------------------------------------------------------
/bookcut/bookcut.py:
--------------------------------------------------------------------------------
  1 | import click
  2 | import pyfiglet
  3 | from os import name, system
  4 | from bookcut import __version__
  5 | from bookcut.mirror_checker import main as mirror_checker, settingParser
  6 | from bookcut.book import libgen_book_find, book_searching_in_repos
  7 | from bookcut.organise import main_organiser
  8 | from bookcut.search import choose_a_book
  9 | from bookcut.book_details import main as detailing
 10 | from bookcut.bibliography import main as allbooks
 11 | from bookcut.bibliography import save_to_txt
 12 | from bookcut.article import article_search
 13 | from bookcut.settings import initial_config, mirrors_append, read_settings
 14 | from bookcut.settings import (
 15 |     screen_setting,
 16 |     print_settings,
 17 |     set_destination,
 18 |     path_checker,
 19 | )
 20 | from bookcut.booklist import booklist_main
 21 | from bookcut.repositories import libgen_repo
 22 | from bookcut.libgen import md5_search
 23 | 
 24 | 
 25 | @click.group(name="commands")
 26 | @click.version_option(version=__version__)
 27 | def entry():
 28 |     """
 29 |         for a single book download you can \n
 30 |         bookcut.py book --book "White Fang" --author "Jack London"
 31 |         \nor  bookcut.py book -b "White Fang" -a "Jack London" \n
 32 |     *For a more complete help:  bookcut.py [COMMAND] --help\n
 33 |     *For example: bookcut.py list --help
 34 |     """
 35 |     # read the settings ini file and check what value for clean screen
 36 |     settings = read_settings()
 37 |     clean_screen(settings[0])
 38 |     title = pyfiglet.figlet_format("BookCut")
 39 |     click.echo(title)
 40 |     click.echo("**********************************")
 41 |     print("Welcome to BookCut! I'm here to" "\nhelp you to read your favourite books!")
 42 |     print("**********************************")
 43 | 
 44 | 
 45 | @entry.command(name="list", help="Download a list of ebooks from a .txt file")
 46 | @click.option(
 47 |     "--file",
 48 |     "-f",
 49 |     help="A .txt file in which books are written in a separate line",
 50 |     required=True,
 51 | )
 52 | @click.option(
 53 |     "--destination",
 54 |     "-d",
 55 |     help="The destination folder of the downloaded books",
 56 |     default=path_checker(),
 57 | )
 58 | @click.option(
 59 |     "--forced", help="Forced option, accepts all books for downloading", is_flag=True
 60 | )
 61 | @click.option("--extension", "-ext", help="File type of e-book.")
 62 | def download_from_txt(file, destination, forced, extension):
 63 |     click.echo("Importing of book list:Started.")
 64 |     if forced:
 65 |         click.echo(click.style("(!) Forced list downloading:Enabled", fg="green"))
 66 |     booklist_main(file, destination, forced, extension)
 67 | 
 68 | 
 69 | @entry.command(
 70 |     name="book",
 71 |     help="Download a book in epub format, by inserting" "\n the title and the author",
 72 | )
 73 | @click.option("--book", "-b", help="Title of Book", default=" ")
 74 | @click.option("--author", "-a", help="The author of the Book", default=" ")
 75 | @click.option("--publisher", "-p", default="")
 76 | @click.option(
 77 |     "--destination",
 78 |     "-d",
 79 |     help="The destination folder of the downloaded books",
 80 |     default=path_checker(),
 81 | )
 82 | @click.option("--extension", "-ext", help="Filetype of e-book for example:pdf")
 83 | @click.option("--forced", is_flag=True)
 84 | @click.option("--md5", help="Md5 search for a specific book version.", default=None)
 85 | def book(book, author, publisher, destination, extension, forced, md5):
 86 |     if book == " " and md5 is None:
 87 |         raise click.UsageError("Invalid input! Check <bookcut book --help>.")
 88 |     elif author != " " and book != " ":
 89 |         click.echo(f"\nSearching for {book.capitalize()} by {author.capitalize()}")
 90 |     elif book != " ":
 91 |         click.echo(f"\nSearching for {book.capitalize()}")
 92 |     url = mirror_checker()
 93 |     if url is not None:
 94 |         if md5 is not None:
 95 |             print("\nSearching for book with md5: ", md5)
 96 |             md5_search(md5, url, destination)
 97 |         else:
 98 |             libgen_book_find(
 99 |                 book, author, publisher, destination, extension, forced, url
100 |             )
101 | 
102 | 
103 | def clean_screen(setting):
104 |     """Cleans the terminal screen"""
105 |     if setting == "True":
106 |         if name == "nt":
107 |             _ = system("cls")
108 |         else:
109 |             _ = system("clear")
110 | 
111 | 
112 | @entry.command(
113 |     name="organise", help="Organise the ebooks in folders according\n to genre"
114 | )
115 | @click.option(
116 |     "--directory",
117 |     "-d",
118 |     help="Directory of source ",
119 |     required=True,
120 |     default=path_checker(),
121 | )
122 | @click.option(
123 |     "--output",
124 |     "-o",
125 |     help="The destination folder of organised books",
126 |     default=path_checker(),
127 | )
128 | def organiser(directory, output):
129 |     print("\nBookCut is starting to \norganise your books!")
130 |     main_organiser(directory)
131 | 
132 | 
133 | @entry.command(name="all-books", help="Search and return all the books from an author")
134 | @click.option("--author", "-a", required=True, help="Author name")
135 | @click.option(
136 |     "--ratio", "-r", help="Ratio for filtering book results", default="0.7", type=float
137 | )
138 | def bibliography(author, ratio):
139 |     print(f"\nStart searching for all books by {author.capitalize()}:")
140 |     lista = allbooks(author, ratio)
141 |     if lista is not None:
142 |         print("**********************************")
143 |         choice = "y or n"
144 |         while choice not in ("Y", "N"):
145 |             choice = input("\nDo you wish to save the list? [Y/n]: ")
146 |             choice = choice.capitalize()
147 |             if choice == "Y":
148 |                 save_to_txt(lista, path_checker(), author)
149 |                 break
150 |             elif choice == "N":
151 |                 print("Aborted.")
152 |                 break
153 | 
154 | 
155 | @entry.command(
156 |     name="search",
157 |     help="Search LibGen or other repositories and choose a book to download",
158 | )
159 | @click.option("--term", "-t", help="Term for searching", required=True)
160 | @click.option("--repos", default=None)
161 | def searching(term, repos):
162 |     print("Searching for:", term.capitalize())
163 |     # set default libgen search
164 |     if repos is None:
165 |         libgen_data = libgen_repo(term)
166 |         choose_a_book(libgen_data)
167 |     else:
168 |         book_searching_in_repos(term, repos)
169 | 
170 | 
171 | @entry.command(name="details", help="Search the details of a book")
172 | @click.option(
173 |     "--book",
174 |     "-b",
175 |     help="Enter book & author or the ISBN number.",
176 |     required=True,
177 |     default=None,
178 | )
179 | def details(book):
180 |     detailing(book)
181 | 
182 | 
183 | @entry.command(name="article", help="Search for an article")
184 | @click.option("--doi", "-d", help="Enter D.O.I. of the article", default=None)
185 | @click.option("--title", "-t", help="Enter title of article", default=None)
186 | def article(doi, title):
187 |     if doi is not None or title is not None:
188 |         article_search(doi, title)
189 |     else:
190 |         print("Not correct input. \nPlease use: bookcut article --help")
191 | 
192 | 
193 | @entry.command(name="config", help="BookCut configuration settings")
194 | @click.option("--libgen_add", help="Add a Libgen mirror to mirrors list", default=None)
195 | @click.option(
196 |     "--restore", help="Restores the settings file to initial state", is_flag=True
197 | )
198 | @click.option("--settings", help="Prints the current BookCut settings", is_flag=True)
199 | @click.option(
200 |     "--clean_screen",
201 |     help="You can choose if BookCut will" " clean terminal screen",
202 |     is_flag=True,
203 | )
204 | @click.option("--download_folder", help="Set BookCut's download folder", default=None)
205 | def configure_mode(restore, libgen_add, settings, clean_screen, download_folder):
206 |     if restore:
207 |         prompt = click.confirm("\n Are you sure you want to restore the settings?")
208 |         if prompt is True:
209 |             initial_config()
210 |         else:
211 |             click.echo("Aborted!")
212 |     elif libgen_add is not None:
213 |         click.echo(f"Adding {libgen_add} to mirrors list")
214 |         mirrors_append(libgen_add)
215 |     elif settings:
216 |         print_settings()
217 |     elif clean_screen:
218 |         prompt = click.confirm("\nDo you want Bookcut to clean command line?")
219 |         if prompt is True:
220 |             screen_setting("True")
221 |         else:
222 |             screen_setting("False")
223 |     elif download_folder is not None:
224 |         set_destination(download_folder)
225 |     else:
226 |         print(
227 |             "Usage: bookcut config [OPTIONS]",
228 |             "\nTry 'bookcut config --help' for help.\n",
229 |             "\nError: Missing option or flag.",
230 |         )
231 | 
232 | 
233 | if __name__ == "__main__":
234 |     entry()
235 | 


--------------------------------------------------------------------------------
/bookcut/booklist.py:
--------------------------------------------------------------------------------
 1 | from bookcut.book import libgen_book_find
 2 | from bookcut.mirror_checker import main as mirror_checker
 3 | 
 4 | 
 5 | def file_list(filename):
 6 |     """Checks that the input file is a .txt file and adds each separate line
 7 |     as a book to the list 'Lines'.
 8 |     The list is then returned to booklist_main.
 9 |     """
10 | 
11 |     if filename.endswith(".txt"):
12 |         try:
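13 |             with open(filename, "r", encoding="utf-8") as file1:
14 |                 Lines = file1.readlines()
15 |             # Build a new list rather than removing items while iterating,
16 |             # which skips the element that follows each removed blank line.
17 |             Lines = [i for i in Lines if i != "\n"]
18 |             return Lines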
19 |         except FileNotFoundError:
20 |             print("Error: No such file or directory:", filename)
21 |     else:
22 |         print("\nError: Incorrect file type. Please provide a '.txt' file")
23 | 
24 | 
25 | def booklist_main(file, destination, forced, extension):
26 |     """executes with the command --list"""
27 |     Lines = file_list(file)
28 |     if Lines is not None:
29 |         print("List imported successfully!")
30 |         url = mirror_checker()
31 |         if url is not None:
32 |             temp = 1
33 |             many = len(Lines)
34 |             for a in Lines:
35 |                 if a != "":
36 |                     print(
37 |                         f"~[{temp}/{many}] Searching for:",
38 |                         a,
39 |                     )
40 |                     temp = temp + 1
41 |                     libgen_book_find(a, "", "", destination, extension, forced, url)
42 | 


--------------------------------------------------------------------------------
/bookcut/downloader.py:
--------------------------------------------------------------------------------
 1 | import requests
 2 | from tqdm import tqdm
 3 | import os
 4 | from bs4 import BeautifulSoup as Soup
 5 | 
 6 | 
 7 | def downloading(link, name, author, file, destination_folder, type):
 8 |     """finds the first available book and sends the link to file_downloader"""
 9 |     page = requests.get(link)
10 |     soup = Soup(page.content, "html.parser")
11 | 
12 |     searcher = [a["href"] for a in soup.find_all(href=True) if a.text]
13 |     searcher_link = searcher[0]
14 |     if searcher_link.startswith("http") is False:
15 |         until_dot = link.split("//")
16 |         searcher_link = until_dot[0] + "//" + until_dot[1] + searcher_link
17 |     file_downloader(searcher_link, name, author, file, destination_folder, type)
18 | 
19 | 
20 | def file_downloader(href, name, author, file, destination_folder, type):
21 |     """Downloads the book file to the user's folder"""
22 |     response = requests.get(href, stream=True)
23 |     total_size = int(response.headers.get("content-length", 0))
24 |     inMb = total_size / 1000000
25 |     inMb = round(inMb, 2)
26 |     print("\nDownloading...", "\nTotal file size:", inMb, "MB")
27 | 
28 |     # Folder to download books
29 |     filename = file
30 |     if filename != "":
31 |         pass
32 |     else:
33 |         filename = name + " - " + author + type
34 |     path = destination_folder
35 | 
36 |     filename = os.path.join(path, filename)
37 | 
38 |     try:
39 |         with open(filename, "wb") as f:
40 |             """For progress bar"""
41 |             with tqdm(total=total_size, unit="iB", unit_scale=True) as pbar:
42 |                 for ch in response.iter_content(chunk_size=1024):
43 |                     if ch:
44 |                         f.write(ch)
45 |                         pbar.update(len(ch))
46 | 
47 |         print("================================\nFile saved as:", filename)
48 |     except FileNotFoundError:
49 |         print("ERROR! Does the destination folder exist? ")
50 | 
51 | 
52 | def pathfinder():
53 |     path = os.path.expanduser("~/Documents/BookCut")
54 |     if os.path.isdir(path):
55 |         pass
56 |     else:
57 |         os.makedirs(path)
58 |     return path
59 | 
60 | 
61 | def filename_refubrished(filename):
62 |     # for valid filenames without special characters
63 |     special_char = [":", "/", '"', "?", "*", "<", ">", "|"]
64 |     for i in special_char:
65 |         filename = filename.replace(i, " ")
66 |     return filename
67 | 


--------------------------------------------------------------------------------
/bookcut/libgen.py:
--------------------------------------------------------------------------------
 1 | from bs4 import BeautifulSoup as soupa
 2 | import requests
 3 | from bookcut.downloader import file_downloader
 4 | from click import confirm
 5 | from bookcut.search import RESULT_ERROR
 6 | 
 7 | 
 8 | def epub_finder(soup):
 9 |     table = soup.find("table", attrs={"class": "c"})
10 |     tb = table.find_all("tr")
11 |     data = []
12 |     epub = "epub"
13 |     for row in tb:
14 |         col = row.find_all("td")
15 |         col = [ele.text.strip() for ele in col]
16 |         xxx = [ele for ele in col if ele]
17 | 
18 |         false_results = ["[1]", "[2]", "[3]", "[4]", "[5]"]
19 |         if false_results == xxx:
20 |             pass
21 |         else:
22 |             data.append(xxx)
23 |     del data[0]
24 |     count = 0
25 |     for a in data:
26 |         if epub in a:
27 |             break
28 |         else:
29 |             count = count + 1
30 |     return count
31 | 
32 | 
33 | def file_name(url):
34 |     print("URL: ", url)
35 |     page = requests.get(url)
36 |     try:
37 |         soup = soupa(page.content, "html.parser")
38 |         r = soup.find("input")["value"]
39 |         r = r.replace("\n", "")
40 |         return r
41 |     except TypeError:
42 |         return None
43 | 
44 | 
45 | def md5_search(md5, url, destination):
46 |     try:
47 |         # used by the book command: searches LibGen for a specific book with a given md5 value
48 |         mirror_url = url + "/ads.php?md5=" + md5
49 |         req = requests.get(mirror_url)
50 |         soup = soupa(req.content, "html.parser")
51 |         html = soup.find("input", attrs={"id": "textarea-example"})
52 |         filename = html["value"]
53 |         url_soup = soup.findAll("table", attrs={"id": "main"})
54 | 
55 |         urls = []
56 |         for j in url_soup:
57 |             a = j.findAll("a", href=True)
58 |             for i in a:
59 |                 urls.append(i["href"])
60 |         download_url = url + urls[0]
61 |         question = confirm(f"Do you want to download:\n{filename}")
62 |         if question is True:
63 |             file_downloader(download_url, "", "", filename, destination, "")
64 |         else:
65 |             print("Aborted!")
66 |     except TypeError:
67 |         print(RESULT_ERROR)
68 | 


--------------------------------------------------------------------------------
/bookcut/mirror_checker.py:
--------------------------------------------------------------------------------
 1 | import requests
 2 | from requests import ConnectionError
 3 | import configparser
 4 | import os
 5 | 
 6 | 
 7 | CONNECTION_ERROR_MESSAGE = (
 8 |     "\nUnable to connect to: {} "
 9 |     "\nPlease check your internet connection and try again later."
10 | )
11 | 
12 | 
13 | def settingParser(section, value):
14 |     "Parsing data from Settings.ini"
15 |     config = configparser.ConfigParser()
16 |     module_path = os.path.dirname(os.path.realpath(__file__))
17 |     settings_ini = os.path.join(module_path, "Settings.ini")
18 |     config.read(settings_ini)
19 |     mirrors = config.get(section, value)
20 |     mirrors = mirrors.split(",")
21 |     return mirrors
22 | 
23 | 
24 | def main(verbose=True):
25 |     """Check which LibGen mirror is available"""
26 | 
27 |     mirrors = settingParser("LibGen", "mirrors")
28 |     for url in mirrors:
29 |         try:
30 |             r = requests.head(url)
31 |             # Assign status on every pass so it is never undefined or stale.
32 |             status = r.status_code == 200 or r.status_code == 301
33 |             if status is True:
34 |                 if verbose is True:
35 |                     print("Connected to:", url)
36 |                 return url
37 |             else:
38 |                 print("No mirrors available or no Internet Connection!")
39 |         except requests.RequestException:
40 |             # Unreachable mirror: try the next one.
41 |             pass
42 | 
43 | 
44 | def pageStatus(url, verbose=True):
45 |     try:
46 |         request = requests.head(url)
47 |         if request.status_code == 200 or request.status_code == 301:
48 |             if verbose is True:
49 |                 print("Connected to:", url)
50 |             return True
51 |     except ConnectionError:
52 |         pass
53 |     print(CONNECTION_ERROR_MESSAGE.format(url))
54 |     return False
55 | 
56 | 
57 | if __name__ == "__main__":
58 |     main()
59 | 


--------------------------------------------------------------------------------
/bookcut/organise.py:
--------------------------------------------------------------------------------
  1 | from bookcut.mirror_checker import pageStatus
  2 | import os
  3 | import shutil
  4 | import requests
  5 | import json
  6 | 
  7 | OPEN_LIBRARY_URL = "http://www.openlibrary.org"
  8 | 
  9 | 
 10 | def main_organiser(directory):
 11 |     status = pageStatus(OPEN_LIBRARY_URL)
 12 |     if status is not False:
 13 |         book_list = get_books(directory)
 14 |         # lists only the files in the given directory
 15 |         namepath = []
 16 |         with os.scandir(directory) as entries:
 17 |             for entry in entries:
 18 |                 if entry.is_file():
 19 |                     namepath.append(entry.name)
 20 |         for i in range(0, len(book_list)):
 21 |             print("File:", namepath[i])
 22 |             try:
 23 |                 """splitting file name to author and book title for using as
 24 |                 searching terms to OpenLibrary"""
 25 |                 a = book_list[i].split("by")
 26 |                 book = a[1]
 27 |                 author = a[0]
 28 |                 a = scraper(book, author)
 29 |                 print("\n***", book, "  ", author)
 30 |                 a = a["genre"]
 31 |                 filename = namepath[i]
 32 |                 cutpaste(directory, a, filename)
 33 |             except IndexError:
 34 |                 try:
 35 |                     a = book_list[i].split("-")
 36 |                     book = a[1]
 37 |                     author = a[0]
 38 |                     a = scraper(book, author)
 39 |                     print("\n***", book, "  ", author)
 40 |                     a = a["genre"]
 41 |                     filename = namepath[i]
 42 |                     cutpaste(directory, a, filename)
 43 |                 except IndexError:
 44 |                     print("Unable to organise this file.\n")
 45 |                     pass
 46 | 
 47 | 
 48 | def get_books(dir):
 49 |     """filtering epub, pdf, txt, mobi, djvu files in the given directory
 50 |     and return a list with all filenames"""
 51 |     epub_list = []
 52 |     for file in os.listdir(dir):
 53 |         if file.endswith(".epub"):
 54 |             renamed = file.replace(".epub", "")
 55 |             renamed = renamed.replace("_", " ")
 56 |             epub_list.append(renamed)
 57 |         elif file.endswith(".pdf"):
 58 |             renamed = file.replace(".pdf", "")
 59 |             renamed = renamed.replace("_", " ")
 60 |             epub_list.append(renamed)
 61 |         elif file.endswith(".txt"):
 62 |             renamed = file.replace(".txt", "")
 63 |             renamed = renamed.replace("_", " ")
 64 |             epub_list.append(renamed)
 65 |         elif file.endswith(".mobi"):
 66 |             renamed = file.replace(".mobi", "")
 67 |             renamed = renamed.replace("_", " ")
 68 |             epub_list.append(renamed)
 69 |         elif file.endswith(".djvu"):
 70 |             renamed = file.replace(".djvu", "")
 71 |             renamed = renamed.replace("_", " ")
 72 |             epub_list.append(renamed)
 73 |     return epub_list
 74 | 
 75 | 
 76 | def scraper(book, author):
 77 |     """parsing the book category from OpenLibrary"""
 78 |     try:
 79 |         book = book.replace(" ", "+")
 80 |         author = author.replace(" ", "+")
 81 | 
 82 |         search_url = "http://openlibrary.org/search.json?q=" + book + "+" + author
 83 |         jason = requests.get(search_url)
 84 |         jason = jason.text
 85 |         data = json.loads(jason)
 86 |         json_formatted_str = json.dumps(data, indent=2)
 87 | 
 88 |         book_values = {}
 89 |         isbn = None
 90 |         author_name = None
 91 |         title = None
 92 |         subject = None
 93 |         try:
 94 |             # TODO: to add feature to check all docs
 95 | 
 96 |             data = data["docs"][0]
 97 |         except IndexError:
 98 |             data = None
 99 |         if data is not None:
100 |             try:
101 |                 isbn = data["isbn"][0]
102 |             except KeyError:
103 |                 pass
104 |             try:
105 |                 author_name = data["author_name"][0]
106 |             except KeyError:
107 |                 pass
108 |             try:
109 |                 title = data["title_suggest"]
110 |             except KeyError:
111 |                 pass
112 |             try:
113 |                 subject = data["subject"]
114 |             except KeyError:
115 |                 pass
116 | 
117 |         book_values.update([("isbn", isbn), ("author", author_name), ("title", title)])
118 |         if subject is not None:
119 |             for a in subject:
120 |                 x = genre_finder(a)
121 |                 if x is not None:
122 |                     subject = x
123 |                     break
124 |                 else:
125 |                     subject = "Uncategorized"
126 |         else:
127 |             subject = "Uncategorized"
128 |         book_values.update({"genre": subject})
129 |         return book_values
130 |     except requests.ConnectionError:
131 |         url = OPEN_LIBRARY_URL
132 |         print(
133 |             "Unable to connect to:",
134 |             url,
135 |             "\nPlease check your internet connection and try again later.",
136 |         )
137 |         return None
138 | 
139 | 
140 | def genre_finder(sub):
141 |     genres = [
142 |         "Classics",
143 |         "Literary",
144 |         "Fiction",
145 |         "Historical Fiction",
146 |         "Romance",
147 |         "Horror",
148 |         "Mystery",
149 |         "Suspense",
150 |         "Fantasy",
151 |         "Action",
152 |         "Adventure",
153 |         "Science Fiction",
154 |         "History",
155 |         "Biography",
156 |         "Autobiography",
157 |         "Poetry",
158 |         "Art",
159 |         "Music",
160 |         "Humor",
161 |         "Religion",
162 |         "Mythology",
163 |         "Philosophy",
164 |         "Health",
165 |         "Science",
166 |         "Social Science",
167 |         "Psychology",
168 |         "Self-help",
169 |         "Nonfiction",
170 |     ]
171 | 
172 |     if sub in genres:
173 |         return sub
174 |     else:
175 |         return None
176 | 
177 | 
178 | def cutpaste(dir, genre, file):
179 |     """Creates the genre folder if it does not exist, then moves the file into it"""
180 |     path = os.path.join(dir, genre)
181 |     if os.path.isdir(path):
182 |         pass
183 |     else:
184 |         os.mkdir(path)
185 |         print("Created folder:", genre)
186 |         filepath = os.path.join(path, file)
187 | 
188 |     from_path = os.path.join(dir, file)
189 |     dest_path = os.path.join(dir, genre, file)
190 |     shutil.move(from_path, dest_path)
191 |     print("File moved to: ", genre, "\n", "\n", "********************")
192 | 


--------------------------------------------------------------------------------
/bookcut/repositories.py:
--------------------------------------------------------------------------------
  1 | from bs4 import BeautifulSoup as soup
  2 | from bookcut.mirror_checker import (
  3 |     pageStatus,
  4 |     main as mirror_checker,
  5 |     CONNECTION_ERROR_MESSAGE,
  6 | )
  7 | import mechanize
  8 | import pandas as pd
  9 | import requests
 10 | 
 11 | ARCHIV_URL = "https://export.arxiv.org/find/grp_cs,grp_econ,grp_eess,grp_math,grp_physics,grp_q-bio,grp_q-fin,grp_stat"
 12 | ARCHIV_BASE = "https://export.arxiv.org"
 13 | OPEN_ACCESS_BUTTON = "https://api.openaccessbutton.org/find"
 14 | 
 15 | 
 16 | def arxiv(term):
 17 |     # Searches Arxiv.org and returns a DataFrame with the results found.
 18 |     status = pageStatus(ARCHIV_URL)
 19 |     if status:
 20 |         br = mechanize.Browser()
 21 |         br.set_handle_robots(False)  # ignore robots
 22 |         br.set_handle_refresh(False)  #
 23 |         br.addheaders = [("User-agent", "Firefox")]
 24 | 
 25 |         br.open(ARCHIV_URL)
 26 |         br.select_form(nr=0)
 27 |         input_form = term
 28 |         br.form["query"] = input_form
 29 |         ac = br.submit()
 30 |         html_from_page = ac
 31 |         html_soup = soup(html_from_page, "html.parser")
 32 | 
 33 |         t = html_soup.findAll("div", {"class": "list-title mathjax"})
 34 |         titles = []
 35 |         for i in t:
 36 |             raw = i.text
 37 |             raw = raw.replace("Title: ", "")
 38 |             raw = raw.replace("\n", "")
 39 |             titles.append(raw)
 40 |         authors = []
 41 |         auth_soup = html_soup.findAll("div", {"class": "list-authors"})
 42 |         for i in auth_soup:
 43 |             raw = i.text
 44 |             raw = raw.replace("Authors:", "")
 45 |             raw = raw.replace("\n", "")
 46 |             authors.append(raw)
 47 |         extensions = []
 48 |         urls = []
 49 |         ext = html_soup.findAll("span", {"class": "list-identifier"})
 50 |         for i in ext:
 51 |             a = i.findAll("a")
 52 |             link = a[1]["href"]
 53 |             extensions.append(str(a[1].text))
 54 |             urls.append(ARCHIV_BASE + link)
 55 | 
 56 |         arxiv_df = pd.DataFrame(
 57 |             {
 58 |                 "Title": titles,
 59 |                 "Author(s)": authors,
 60 |                 "Url": urls,
 61 |                 "Extension": extensions,
 62 |             }
 63 |         )
 64 | 
 65 |         return arxiv_df
 66 |     else:
 67 |         print(CONNECTION_ERROR_MESSAGE.format("ArXiv"))
 68 |         return None
 69 | 
 70 | 
 71 | def libgen_repo(term):
 72 |     # Searches LibGen and returns a DataFrame of results
 73 |     try:
 74 |         url = mirror_checker()
 75 |         if url is not None:
 76 |             br = mechanize.Browser()
 77 |             br.set_handle_robots(False)  # ignore robots
 78 |             br.set_handle_refresh(False)  #
 79 |             br.addheaders = [("User-agent", "Firefox")]
 80 | 
 81 |             br.open(url)
 82 |             br.select_form("libgen")
 83 |             input_form = term
 84 |             br.form["req"] = input_form
 85 |             ac = br.submit()
 86 |             html_from_page = ac
 87 |             html_soup = soup(html_from_page, "html.parser")
 88 |             table = html_soup.find_all("table")[2]
 89 | 
 90 |             table_data = []
 91 |             mirrors = []
 92 |             extensions = []
 93 | 
 94 |             for i in table:
 95 |                 j = 0
 96 |                 try:
 97 |                     td = i.find_all("td")
 98 |                     for tr in td:
 99 |                         # scrape mirror links
100 |                         if j == 9:
101 |                             temp = tr.find("a", href=True)
102 |                             mirrors.append(temp["href"])
103 |                         j = j + 1
104 |                     row = [tr.text for tr in td]
105 |                     table_data.append(row)
106 |                     extensions.append(row[8])
107 |                 except:
108 |                     pass
109 | 
110 |             # Clean result page
111 |             for j in table_data:
112 |                 j.pop(0)
113 |                 del j[8:15]
114 |             headers = [
115 |                 "Author(s)",
116 |                 "Title",
117 |                 "Publisher",
118 |                 "Year",
119 |                 "Pages",
120 |                 "Language",
121 |                 "Size",
122 |                 "Extension",
123 |             ]
124 | 
125 |             tabular = pd.DataFrame(table_data)
126 |             tabular.columns = headers
127 |             tabular["Url"] = mirrors
128 |             return tabular
129 |     except ValueError:
130 |         # create an empty DataFrame
131 |         df = pd.DataFrame()
132 |         return df
133 | 
134 | 
135 | def open_access_button(doi, title):
136 |     status = pageStatus(OPEN_ACCESS_BUTTON)
137 |     if status:
138 |         if doi is not None:
139 |             query = {"doi": doi}
140 |         else:
141 |             query = {"title": title}
142 |         req = requests.get(OPEN_ACCESS_BUTTON, params=query)
143 |         response = req.json()
144 |         return response
145 |     else:
146 |         print(CONNECTION_ERROR_MESSAGE.format("Open Access Button"))
147 | 


--------------------------------------------------------------------------------
/bookcut/search.py:
--------------------------------------------------------------------------------
  1 | from bookcut.mirror_checker import main as mirror_checker
  2 | from bookcut.downloader import filename_refubrished
  3 | from bookcut.settings import path_checker
  4 | from bs4 import BeautifulSoup as Soup
  5 | import mechanize
  6 | import pandas as pd
  7 | import os
  8 | import requests
  9 | from tqdm import tqdm
 10 | 
 11 | RESULT_ERROR = "\nNo results found or bad Internet connection.\nPlease try again!"
 12 | 
 13 | 
 14 | def search_downloader(file, href):
 15 |     # search_downloader downloads the book
 16 |     response = requests.get(href, stream=True)
 17 |     total_size = int(response.headers.get("content-length", 0))
 18 |     inMb = total_size / 1000000
 19 |     inMb = round(inMb, 2)
 20 |     filename = file
 21 |     print("\nDownloading...\n", "Total file size:", inMb, "MB")
 22 | 
 23 |     path = path_checker()
 24 | 
 25 |     filename = os.path.join(path, filename)
 26 |     # progress bar
 27 |     buffer_size = 1024
 28 |     progress = tqdm(
 29 |         response.iter_content(buffer_size),
 30 |         f"{file}",
 31 |         total=total_size,
 32 |         unit="B",
 33 |         unit_scale=True,
 34 |         unit_divisor=1024,
 35 |     )
 36 |     with open(filename, "wb") as f:
 37 |         for data in progress.iterable:
 38 |             # write data read to the file
 39 |             f.write(data)
 40 |             # update the progress bar manually
 41 |             progress.update(len(data))
 42 |     print("================================\nFile saved as:", filename)
 43 | 
 44 | 
 45 | def link_finder(link, mirror_used):
 46 |     # link_finder searches LibGen for the download link and filename
 47 |     page = requests.get(link)
 48 |     soup = Soup(page.content, "html.parser")
 49 |     searcher = [a["href"] for a in soup.find_all(href=True) if a.text]
 50 |     try:
 51 |         filename = soup.find("input")["value"]
 52 |     except TypeError:
 53 |         filename = None
 54 |     if searcher[0].startswith("http") is False:
 55 |         searcher[0] = mirror_used + searcher[0]
 56 |     results = [filename, searcher[0]]
 57 |     return results
 58 | 
 59 | 
 60 | def search(term):
 61 |     # This function is used when searching LibGen with the command
 62 |     # bookcut search -t "keyword"
 63 | 
 64 |     url = mirror_checker()
 65 |     if url is not None:
 66 |         br = mechanize.Browser()
 67 |         br.set_handle_robots(False)  # ignore robots
 68 |         br.set_handle_refresh(False)  #
 69 |         br.addheaders = [("User-agent", "Firefox")]
 70 | 
 71 |         br.open(url)
 72 |         br.select_form("libgen")
 73 |         input_form = term
 74 |         br.form["req"] = input_form
 75 |         ac = br.submit()
 76 |         html_from_page = ac
 77 |         soup = Soup(html_from_page, "html.parser")
 78 |         table = soup.find_all("table")[2]
 79 | 
 80 |         table_data = []
 81 |         mirrors = []
 82 |         extensions = []
 83 | 
 84 |         for i in table:
 85 |             j = 0
 86 |             try:
 87 |                 td = i.find_all("td")
 88 |                 for tr in td:
 89 |                     # scrape mirror links
 90 |                     if j == 9:
 91 |                         temp = tr.find("a", href=True)
 92 |                         mirrors.append(temp["href"])
 93 |                     j = j + 1
 94 |                 row = [tr.text for tr in td]
 95 |                 table_data.append(row)
 96 |                 extensions.append(row[8])
 97 | 
 98 |             except:
 99 |                 pass
100 | 
101 |         # Clean result page
102 |         for j in table_data:
103 |             j.pop(0)
104 |             del j[8:15]
105 |         headers = [
106 |             "Author(s)",
107 |             "Title",
108 |             "Publisher",
109 |             "Year",
110 |             "Pages",
111 |             "Language",
112 |             "Size",
113 |             "Extension",
114 |         ]
115 | 
116 |         try:
117 |             tabular = pd.DataFrame(table_data)
118 |             tabular.index += 1
119 |             tabular.columns = headers
120 |             print(tabular)
121 |             choices = []
122 |             temp = len(mirrors) + 1
123 |             for i in range(1, temp):
124 |                 choices.append(str(i))
125 |             choices.append("C")
126 |             choices.append("c")
127 |             while True:
128 |                 tell_me = str(
129 |                     input(
130 |                         "\n\nPlease enter a number from 1 to {number}"
131 |                         ' to download a book or press "C" to abort'
132 |                         " search: ".format(number=len(extensions))
133 |                     )
134 |                 )
135 |                 if tell_me in choices:
136 |                     if tell_me == "C" or tell_me == "c":
137 |                         print("Aborted!")
138 |                         return None
139 |                     else:
140 |                         c = int(tell_me) - 1
141 |                         results = [mirrors[c], extensions[c]]
142 |                         return results
143 |         except ValueError:
144 |             print("\nNo results found or bad Internet connection.")
145 |             print("Please, try again.")
146 |             return None
147 |     else:
148 |         print("\nNo results found or bad Internet connection.")
149 |         print("Please, try again.")
150 | 
151 | 
152 | def single_search():
153 |     def search(term):
154 |         # This function is used when searching LibGen with the command
155 |         # bookcut search -t "keyword"
156 | 
157 |         url = mirror_checker()
158 |         if url is not None:
159 |             br = mechanize.Browser()
160 |             br.set_handle_robots(False)  # ignore robots
161 |             br.set_handle_refresh(False)  #
162 |             br.addheaders = [("User-agent", "Firefox")]
163 | 
164 |             br.open(url)
165 |             br.select_form("libgen")
166 |             input_form = term
167 |             br.form["req"] = input_form
168 |             ac = br.submit()
169 |             html_from_page = ac
170 |             soup = Soup(html_from_page, "html.parser")
171 |             table = soup.find_all("table")[2]
172 | 
173 |             table_data = []
174 |             mirrors = []
175 |             extensions = []
176 | 
177 |             for i in table:
178 |                 j = 0
179 |                 try:
180 |                     td = i.find_all("td")
181 |                     for tr in td:
182 |                         # scrape mirror links
183 |                         if j == 9:
184 |                             temp = tr.find("a", href=True)
185 |                             mirrors.append(temp["href"])
186 |                         j = j + 1
187 |                     row = [tr.text for tr in td]
188 |                     table_data.append(row)
189 |                     extensions.append(row[8])
190 | 
191 |                 except:
192 |                     pass
193 | 
194 |             # Clean result page
195 |             for j in table_data:
196 |                 j.pop(0)
197 |                 del j[8:15]
198 |             headers = [
199 |                 "Author(s)",
200 |                 "Title",
201 |                 "Publisher",
202 |                 "Year",
203 |                 "Pages",
204 |                 "Language",
205 |                 "Size",
206 |                 "Extension",
207 |             ]
208 | 
209 |             try:
210 |                 tabular = pd.DataFrame(table_data)
211 |                 tabular.index += 1
212 |                 tabular.columns = headers
213 |                 print(tabular)
214 |                 choices = []
215 |                 temp = len(mirrors) + 1
216 |                 for i in range(1, temp):
217 |                     choices.append(str(i))
218 |                 choices.append("C")
219 |                 choices.append("c")
220 |                 while True:
221 |                     tell_me = str(
222 |                         input(
223 |                             "\n\nPlease enter a number from 1 to {number}"
224 |                             ' to download a book or press "C" to abort'
225 |                             " search: ".format(number=len(extensions))
226 |                         )
227 |                     )
228 |                     if tell_me in choices:
229 |                         if tell_me == "C" or tell_me == "c":
230 |                             print("Aborted!")
231 |                             return None
232 |                         else:
233 |                             c = int(tell_me) - 1
234 |                             print(mirrors[c], "   ", extensions[c])
235 |                             results = [mirrors[c], extensions[c]]
236 |                             return results
237 |             except ValueError:
238 |                 print("\nNo results found or bad Internet connection.")
239 |                 print("Please, try again.")
240 |                 return None
241 |         else:
242 |             print("\nNo results found or bad Internet connection.")
243 |             print("Please, try again.")
244 | 
245 | 
246 | def choose_a_book(dataframe):
247 |     # asks the user which book to download from the printed DataFrame
248 |     if dataframe.empty is False:
249 |         dataframe.index += 1
250 |         print(dataframe[["Author(s)", "Title", "Size", "Extension"]])
251 | 
252 |         urls = dataframe["Url"].to_list()
253 |         titles = dataframe["Title"].to_list()
254 |         extensions = dataframe["Extension"].to_list()
255 |         choices = []
256 |         temp = len(urls) + 1
257 |         for i in range(1, temp):
258 |             choices.append(str(i))
259 |         choices.append("C")
260 |         choices.append("c")
261 |         try:
262 |             while True:
263 |                 tell_me = str(
264 |                     input(
265 |                         "\n\nPlease enter a number from 1 to {number}"
266 |                         ' to download a book or press "C" to abort'
267 |                         " search: ".format(number=len(urls))
268 |                     )
269 |                 )
270 |                 if tell_me in choices:
271 |                     if tell_me == "C" or tell_me == "c":
272 |                         print("Aborted!")
273 |                         return None
274 |                     else:
275 |                         c = int(tell_me) - 1
276 |                         filename = titles[c] + "." + extensions[c]
277 |                         filename = filename_refubrished(filename)
278 |                         if urls[c].startswith("https://export.arxiv.org/"):
279 |                             search_downloader(filename, urls[c])
280 |                             return False
281 |                         else:
282 |                             mirror_used = mirror_checker(False)
283 |                             link = mirror_used + urls[c]
284 |                             details = link_finder(link, mirror_used)
285 |                             file_link = details[1]
286 |                             search_downloader(filename, file_link)
287 |                             return False
288 |         except ValueError:
289 |             print(RESULT_ERROR)
290 |             print("Please, try again.")
291 |             return None
292 |     else:
293 |         print(RESULT_ERROR)
294 | 


--------------------------------------------------------------------------------
/bookcut/settings.py:
--------------------------------------------------------------------------------
  1 | import configparser
  2 | import os
  3 | from bookcut.downloader import pathfinder
  4 | 
  5 | 
  6 | def initial_config():
  7 |     """Create the Settings.ini file; also used to restore the default settings."""
  8 |     try:
  9 |         write_config = configparser.ConfigParser()
 10 |         module_path = os.path.dirname(os.path.realpath(__file__))
 11 |         settings_ini = os.path.join(module_path, "Settings.ini")
 12 | 
 13 |         write_config.add_section("LibGen")
 14 |         write_config.add_section("Settings")
 15 |         mirrors = "https://libgen.lc/,http://libgen.li/,http://185.39.10.101/,http://genesis.lib/"
 16 |         write_config.set("LibGen", "mirrors", mirrors)
 17 |         write_config.set("Settings", "clean_screen", "True")
 18 |         write_config.set("Settings", "destination", "None")
 19 | 
 20 |         cfgfile = open(settings_ini, "w")
 21 |         write_config.write(cfgfile)
 22 |         cfgfile.close()
 23 |     except PermissionError as error:
 24 |         print("\n", error)
 25 |         print("You have to be administrator to change BookCut settings. ")
 26 | 
 27 | 
 28 | def mirrors_append(url):
 29 |     """Append a URL to the LibGen mirrors list."""
 30 | 
 31 |     try:
 32 | 
 33 |         # READ EXISTING LIST
 34 |         config = configparser.ConfigParser()
 35 |         module_path = os.path.dirname(os.path.realpath(__file__))
 36 |         settings_ini = os.path.join(module_path, "Settings.ini")
 37 | 
 38 |         config.read(settings_ini)
 39 |         mirrors = config.get("LibGen", "mirrors")
 40 |         mirrors = mirrors + "," + url
 41 | 
 42 |         # APPEND LIST
 43 |         mirrors = str(mirrors)
 44 |         config.set("LibGen", "mirrors", mirrors)
 45 | 
 46 |         # WRITE TO INI FILE
 47 |         cfgfile = open(settings_ini, "w")
 48 |         config.write(cfgfile)
 49 |         cfgfile.close()
 50 | 
 51 |         # Success message
 52 |         print("\nSuccessfully added to list:")
 53 |         mirrors = mirrors.split(",")
 54 |         for i in mirrors:
 55 |             print(i)
 56 |     except PermissionError as error:
 57 |         print("\n", error)
 58 |         print("You have to be administrator to change BookCut settings. ")
 59 | 
 60 | 
 61 | def read_settings():
 62 |     # read the settings from the config file and return them as a list
 63 | 
 64 |     # get ini file path
 65 |     config = configparser.ConfigParser()
 66 |     module_path = os.path.dirname(os.path.realpath(__file__))
 67 |     settings_ini = os.path.join(module_path, "Settings.ini")
 68 | 
 69 |     # get values
 70 |     config.read(settings_ini)
 71 |     clean_screen = config.get("Settings", "clean_screen")
 72 |     destination = config.get("Settings", "destination")
 73 |     settings = [clean_screen, destination]
 74 |     return settings
 75 | 
 76 | 
 77 | def print_settings():
 78 |     """Prints settings"""
 79 |     settings = read_settings()
 80 | 
 81 |     print("\nBookCut Settings:\n")
 82 |     print("1.Clean Screen Option Enabled: ", settings[0])
 83 |     print("2.Destination Folder Path: ", settings[1])
 84 | 
 85 | 
 86 | def screen_setting(input):
 87 |     """Adjust the clean screen setting."""
 88 |     try:
 89 |         config = configparser.ConfigParser()
 90 |         module_path = os.path.dirname(os.path.realpath(__file__))
 91 |         settings_ini = os.path.join(module_path, "Settings.ini")
 92 | 
 93 |         config.read(settings_ini)
 94 |         config.set("Settings", "clean_screen", input)
 95 |         cfgfile = open(settings_ini, "w")
 96 |         config.write(cfgfile)
 97 |         cfgfile.close()
 98 |     except PermissionError as error:
 99 |         print("\n", error)
100 |         print("You have to be administrator to change BookCut settings. ")
101 | 
102 | 
103 | def set_destination(path):
104 |     try:
105 |         if os.path.isdir(path):
106 |             module_path = os.path.dirname(os.path.realpath(__file__))
107 |             settings_ini = os.path.join(module_path, "Settings.ini")
108 | 
109 |             config = configparser.ConfigParser()
110 |             config.read(settings_ini)
111 |             config.set("Settings", "destination", path)
112 |             cfgfile = open(settings_ini, "w")
113 |             config.write(cfgfile)
114 |             cfgfile.close()
115 |             print("Destination path changed!\n", path)
116 |         else:
117 |             try:
118 |                 os.makedirs(path)
119 |                 print("Created folder: ", path)
120 |             except FileNotFoundError as error:
121 |                 print("\n", error)
122 |                 print("(!) Not a valid path please try again!")
123 | 
124 |     except PermissionError as error:
125 |         print("\n", error)
126 |         print("(!) You have to be administrator to change BookCut settings!")
127 | 
128 | 
129 | def path_checker():
130 |     settings = read_settings()
131 |     if settings[1] != "None":
132 |         return settings[1]
133 |     else:
134 |         path = pathfinder()
135 |         return path
136 | 
137 | 
138 | if __name__ == "__main__":
139 |     initial_config()
140 | 
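
The setters in settings.py (`mirrors_append`, `screen_setting`, `set_destination`) each repeat the same read, modify, write cycle on Settings.ini. A minimal sketch of how that cycle could be factored into one helper follows. The name `update_setting` and the caller-supplied `ini_path` argument are assumptions made for illustration and are not part of BookCut.

```python
import configparser


def update_setting(ini_path, section, option, value):
    """Hypothetical helper (not part of BookCut): read-modify-write one option."""
    config = configparser.ConfigParser()
    # ConfigParser.read() silently skips a file that does not exist yet
    config.read(ini_path)
    if not config.has_section(section):
        config.add_section(section)
    # ConfigParser only stores strings, so convert the value first
    config.set(section, option, str(value))
    # the context manager closes the file even if write() raises
    with open(ini_path, "w") as cfgfile:
        config.write(cfgfile)
    return config.get(section, option)
```

With a helper of this shape, a call such as `update_setting(settings_ini, "Settings", "clean_screen", "True")` could stand in for the open/write/close sequence, and the `PermissionError` handling could stay in the callers as it is now.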


--------------------------------------------------------------------------------
/conftest.py:
--------------------------------------------------------------------------------
 1 | def pytest_addoption(parser):
 2 |     parser.addoption(
 3 |         "--web",
 4 |         action="store_true",
 5 |         dest="web",
 6 |         default=False,
 7 |         help="enable tests requiring an internet connection",
 8 |     )
 9 | 
10 | 
11 | def pytest_configure(config):
12 |     if not config.option.web:
13 |         setattr(config.option, "markexpr", "not web")
14 | 


--------------------------------------------------------------------------------
/pytest.ini:
--------------------------------------------------------------------------------
1 | [pytest]
2 | minversion = 6.0
3 | testpaths =
4 |     tests
5 | markers =
6 |     web: mark tests which require an internet connection
7 | 


--------------------------------------------------------------------------------
/setup.py:
--------------------------------------------------------------------------------
 1 | import setuptools
 2 | import sys
 3 | import pathlib
 4 | 
 5 | if sys.version_info.major < 3:
 6 |     print("\nPython 2 is not supported! \nPlease upgrade to Python 3.\n")
 7 |     print(
 8 |         "Installation of BookCut stopped, please try again with\n"
 9 |         "a newer version of Python!"
10 |     )
11 |     sys.exit(1)
12 | 
13 | # The directory containing this file
14 | HERE = pathlib.Path(__file__).parent
15 | 
16 | # The text of the README file
17 | README = (HERE / "README.md").read_text()
18 | 
19 | setuptools.setup(
20 |     name="BookCut",
21 |     python_requires=">3.5.2",
22 |     version="1.3.7",
23 |     author="Costis94",
24 |     author_email="gravitymusician@gmail.com",
25 |     description="Command Line Interface app to download ebooks",
26 |     long_description_content_type="text/markdown",
27 |     long_description=README,
28 |     url="https://github.com/costis94/bookcut",
29 |     packages=setuptools.find_packages(),
30 |     classifiers=[
31 |         "Programming Language :: Python :: 3",
32 |         "License :: OSI Approved :: MIT License",
33 |         "Operating System :: OS Independent",
34 |     ],
35 |     install_requires=[
36 |         "pandas",
37 |         "click>=7.1.2",
38 |         "requests",
39 |         "beautifulsoup4",
40 |         "pyfiglet",
41 |         "tqdm",
42 |         "mechanize",
43 |     ],
44 |     extras_require={
45 |         "dev": [
46 |             "pytest",
47 |             "pytest-cov",
48 |             "pre-commit",
49 |             "black",
50 |         ]
51 |     },
52 |     include_package_data=True,
53 |     entry_points="""
54 |             [console_scripts]
55 |             bookcut=bookcut.bookcut:entry
56 |             """,
57 | )
58 | 


--------------------------------------------------------------------------------
/tests/__init__.py:
--------------------------------------------------------------------------------


--------------------------------------------------------------------------------
/tests/test_book.py:
--------------------------------------------------------------------------------
 1 | import pytest
 2 | from bookcut.mirror_checker import main as mirror_checker
 3 | from bookcut.book import Booksearch
 4 | 
 5 | 
 6 | @pytest.mark.web
 7 | def test_single_book_download():
 8 |     title = "Iliad"
 9 |     author = "Homer"
10 |     publisher = " "
11 |     type_format = " "
12 |     book = Booksearch(title, author, publisher, type_format, mirror_checker())
13 |     result = book.search()
14 |     extensions = result["extensions"]
15 |     print("extensions: ", extensions)
16 |     tb = result["table_data"]
17 |     mirrors = result["mirrors"]
18 |     assert mirrors[0].startswith("http"), "Not correct format of Mirror URL."
19 |     assert type(extensions) is list, "Wrong format of extension details."
20 |     file_details = book.give_result(extensions, tb, mirrors, extensions[0])
21 | 


--------------------------------------------------------------------------------
/tests/test_bookcut.py:
--------------------------------------------------------------------------------
 1 | import pytest
 2 | 
 3 | from click.testing import CliRunner
 4 | from bookcut import __version__
 5 | from bookcut.bookcut import entry
 6 | 
 7 | 
 8 | def test_entry_with_version_option():
 9 |     cli_output = CliRunner().invoke(entry, ["--version"])
10 |     assert cli_output.exit_code == 0
11 |     assert cli_output.output == f"commands, version {__version__}\n"
12 | 


--------------------------------------------------------------------------------
/tests/test_main.py:
--------------------------------------------------------------------------------
1 | import unittest
2 | 


--------------------------------------------------------------------------------
/tests/test_mirror_checker.py:
--------------------------------------------------------------------------------
 1 | import pytest
 2 | from bookcut.mirror_checker import pageStatus, main as mirror_checker
 3 | from bookcut.mirror_checker import requests, CONNECTION_ERROR_MESSAGE
 4 | from requests import ConnectionError
 5 | 
 6 | TEST_URL = "http://www.sometesturl.com"
 7 | 
 8 | 
 9 | @pytest.mark.web
10 | def test_mirror_availability():
11 |     available_mirror = mirror_checker()
12 |     assert type(available_mirror) is str, "Not correct type of LibGen Url"
13 |     assert available_mirror.startswith("http"), "Not correct LibGen Url."
14 | 
15 | 
16 | @pytest.mark.parametrize("status_code", [200, 301])
17 | def test_openLibraryStatus_output_if_it_can_connect(monkeypatch, capsys, status_code):
18 |     def mock_requests_head(_):
19 |         return type("_", (), {"status_code": status_code})
20 | 
21 |     monkeypatch.setattr(requests, "head", mock_requests_head)
22 |     assert pageStatus(TEST_URL)
23 |     captured = capsys.readouterr()
24 |     assert captured.out == f"Connected to: {TEST_URL}\n"
25 | 
26 | 
27 | def test_openLibraryStatus_output_for_wrong_status_code(monkeypatch, capsys):
28 |     def mock_requests_head(_):
29 |         return type("_", (), {"status_code": 42})
30 | 
31 |     monkeypatch.setattr(requests, "head", mock_requests_head)
32 |     assert not pageStatus(TEST_URL)
33 |     captured = capsys.readouterr()
34 |     assert captured.out == CONNECTION_ERROR_MESSAGE.format(TEST_URL) + "\n"
35 | 
36 | 
37 | def test_openLibraryStatus_output_on_connection_error(monkeypatch, capsys):
38 |     def mock_requests_head(_):
39 |         raise ConnectionError
40 | 
41 |     monkeypatch.setattr(requests, "head", mock_requests_head)
42 |     assert not pageStatus(TEST_URL)
43 |     captured = capsys.readouterr()
44 |     assert captured.out == CONNECTION_ERROR_MESSAGE.format(TEST_URL) + "\n"
45 | 
46 | 
47 | @pytest.mark.web
48 | def test_open_libraryStatus():
49 |     status = pageStatus(url="http://www.openlibrary.org")
50 |     assert status is not False, "OpenLibrary Status != 200"
51 | 
52 | 
53 | @pytest.mark.web
54 | def test_archiv_Status():
55 |     status = pageStatus(url="http://export.arxiv.org/")
56 |     assert status is not False, "arXiv Status != 200"
57 | 
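
The mocked tests above build a fake response with `type("_", (), {"status_code": ...})`, a throwaway class that carries only the attribute the code under test reads. The sketch below shows that pattern in isolation. `page_is_up` is a hypothetical stand-in written for this example, not the real `pageStatus`; its behavior (truthy for 200 and 301, falsy otherwise or on a connection error) is inferred from the assertions in this file. It also catches the built-in `ConnectionError`, whereas the tests use `requests.ConnectionError`.

```python
def fake_response(status_code):
    """Build a throwaway class whose only attribute is status_code."""
    return type("_", (), {"status_code": status_code})


def page_is_up(url, head):
    """Hypothetical stand-in for pageStatus; `head` is an injected callable."""
    try:
        # 200 and 301 are the codes the parametrized test treats as success
        return head(url).status_code in (200, 301)
    except ConnectionError:
        # built-in exception used here to keep the sketch dependency-free
        return False


def failing_head(_):
    """Simulate a network failure."""
    raise ConnectionError


print(page_is_up("http://www.sometesturl.com", lambda _: fake_response(200)))
print(page_is_up("http://www.sometesturl.com", lambda _: fake_response(42)))
print(page_is_up("http://www.sometesturl.com", failing_head))
```

Passing the `head` callable in as an argument is one alternative to `monkeypatch.setattr(requests, "head", ...)`; the tests in this file patch the module attribute instead.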


--------------------------------------------------------------------------------