├── .gitignore ├── CONTRIBUTING.md ├── README.md └── calibre-recipes ├── AOSABook.recipe ├── A_Mathematical_Theory_of_Communication.recipe ├── Android_Studio_Development_Essentials.recipe ├── Android_Training_Course_In_Chinese.recipe ├── AngularJS_Tutorial_Cn.recipe ├── CS183_Peter_Thiel.recipe ├── Computer_Science_from_the_Bottom_Up.recipe ├── Designing_Evolvable_Web_APIs_with_ASP_NET.recipe ├── Dive_Into_Python_3.recipe ├── Explore_Flask.recipe ├── Extending_and_Embedding_PHP_zh_CN.recipe ├── Forecasting_Principles_and_Practice.recipe ├── Free_as_in_Freedom.recipe ├── Game_Programming_Patterns.recipe ├── Git_Community_Book.recipe ├── Git_Pocket_Guide.recipe ├── High_Performance_Browser_Networking.recipe ├── House_Transcripts.recipe ├── Interactive_Data_Visualization_for_the_Web.recipe ├── Introduction_to_Linux.recipe ├── Learn_Python_the_Hard_Way.recipe ├── Learn_Vimscript_the_Hard_Way.recipe ├── Learn_Vimscript_the_Hard_Way_Zhcn.recipe ├── Makefile ├── Mastering_Perl.recipe ├── Nature_of_Code_the.recipe ├── Pro_Git_ZH.recipe ├── Programming_JavaScript_Applications.recipe ├── Python_Cookbook.recipe ├── SICP.recipe ├── Test_Driven_Web_Development_with_Python.recipe ├── The_Definitive_Guide_to_Yii_2.0.recipe └── Tutorials_about_Development_for_Android.recipe /.gitignore: -------------------------------------------------------------------------------- 1 | .DS_Store 2 | *.mobi 3 | debug/ 4 | 5 | ### JetBrains template 6 | # Covers JetBrains IDEs: IntelliJ, RubyMine, PhpStorm, AppCode, PyCharm, CLion 7 | 8 | *.iml 9 | 10 | ## Directory-based project format: 11 | .idea/ 12 | # if you remove the above rule, at least ignore the following: 13 | 14 | # User-specific stuff: 15 | # .idea/workspace.xml 16 | # .idea/tasks.xml 17 | # .idea/dictionaries 18 | 19 | # Sensitive or high-churn files: 20 | # .idea/dataSources.ids 21 | # .idea/dataSources.xml 22 | # .idea/sqlDataSources.xml 23 | # .idea/dynamic.xml 24 | # .idea/uiDesigner.xml 25 | 26 | # Gradle: 27 | # .idea/gradle.xml 
28 | # .idea/libraries 29 | 30 | # Mongo Explorer plugin: 31 | # .idea/mongoSettings.xml 32 | 33 | ## File-based project format: 34 | *.ipr 35 | *.iws 36 | 37 | ## Plugin-specific files: 38 | 39 | # IntelliJ 40 | /out/ 41 | 42 | # mpeltonen/sbt-idea plugin 43 | .idea_modules/ 44 | 45 | # JIRA plugin 46 | atlassian-ide-plugin.xml 47 | 48 | # Crashlytics plugin (for Android Studio and IntelliJ) 49 | com_crashlytics_export_strings.xml 50 | crashlytics.properties 51 | crashlytics-build.properties 52 | 53 | 54 | -------------------------------------------------------------------------------- /CONTRIBUTING.md: -------------------------------------------------------------------------------- 1 | # Guidelines for Contributing to Kindle-Open-Books 2 | 3 | First, I'd like to thank everyone who has helped make this project 4 | better. Contributions are welcome at any time :) 5 | 6 | As an open source project, Kindle-Open-Books welcomes contributions of many forms. 7 | 8 | Examples of contributions include: 9 | 10 | * Submit a new recipe. 11 | * Improve an existing recipe. 12 | * Bug reports. 13 | * Patch reviews. 14 | * Or any suggestions for this project. 15 | 16 | To keep this project healthy and easy to maintain, here are some 17 | guidelines. 18 | 19 | ## Submit a new recipe 20 | 21 | Here are the steps in detail. 22 | 23 | 1. Fork this repo. 24 | 1. Clone your repo. 25 | 1. Check out a new branch, e.g. "Add_Dive_into_Python_3". 26 | 1. Develop the new recipe. 27 | 1. Put your recipe into the calibre-recipes folder. The recipe's file name should be 28 | capitalized words joined with '_'; have a look at the recipes already in the folder, 29 | e.g. "Dive_Into_Python_3.recipe". 30 | 1. Update the README. Add your recipe to the right category (English/Chinese) and 31 | keep the list sorted alphabetically. 32 | 1. Open a pull request. Thanks again!
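The file-naming step above can be sketched in a few lines of Python. This is only an illustration: `recipe_filename` is a hypothetical helper that is not part of this repository, and the rule it applies (collapse every run of non-alphanumeric characters into a single `_`) is an assumption inferred from the existing file names. A few existing recipes, such as the Yii 2.0 guide, keep extra characters.

```python
import re


def recipe_filename(book_title):
    """Hypothetical helper: build a recipe file name from a book title."""
    # Assumed rule: every run of characters other than letters and digits
    # becomes one underscore; the title's own capitalization is kept.
    stem = re.sub(r'[^A-Za-z0-9]+', '_', book_title).strip('_')
    return stem + '.recipe'


if __name__ == '__main__':
    print(recipe_filename('Dive Into Python 3'))
```

The result is meant as a starting point; compare it with the neighbouring files in `calibre-recipes` before committing.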
33 | -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 | This project converts open source materials to a Kindle-supported format (`.mobi`). 2 | 3 | The conversion is limited to books under open source licenses. This project does not include any generated `.mobi` files; it only includes the `.recipe` files for Calibre. 4 | 5 | # About calibre recipes 6 | 7 | [Calibre](http://calibre-ebook.com/) is a free e-book management tool. It can generate e-books by scraping RSS feeds or HTML content, driven by a Calibre recipe written in Python. For more details on Calibre recipes, refer to the [Calibre Manual](http://manual.calibre-ebook.com/news.html). 8 | 9 | List of recipes in the `calibre-recipes` folder: 10 | 11 | ## English 12 | 13 | + AOSABook.recipe - [The Architecture of Open Source Applications](http://www.aosabook.org/en/index.html) 14 | + Android_Studio_Development_Essentials.recipe - [Android Studio Development Essentials](http://www.techotopia.com/index.php/Android_Studio_Development_Essentials) 15 | + Computer_Science_from_the_Bottom_Up.recipe - [Computer Science from the Bottom Up](http://www.bottomupcs.com/index.html) 16 | + CS183_Peter_Thiel.recipe - [Notes Essays—Peter Thiel’s CS183: Startup](http://blakemasters.com/peter-thiels-cs183-startup) 17 | + Designing_Evolvable_Web_APIs_with_ASP_NET.recipe - [Designing Evolvable Web APIs with ASP.NET](http://chimera.labs.oreilly.com/books/1234000001708) 18 | + Dive_Into_Python_3.recipe - [Dive Into Python 3](http://www.diveintopython3.net/) 19 | + Explore_Flask.recipe - [Explore Flask](http://exploreflask.com/) 20 | + Forecasting_Principles_and_Practice.recipe - [Forecasting Principles and Practice](http://otexts.com/fpp/) 21 | + Game_Programming_Patterns.recipe - [Game Programming Patterns](http://gameprogrammingpatterns.com) 22 | + Git_Pocket_Guide.recipe - [Git
Pocket Guide](http://chimera.labs.oreilly.com/books/1230000000561) 23 | + High_Performance_Browser_Networking.recipe - [High Performance Browser Networking](http://chimera.labs.oreilly.com/books/1230000000545/index.html) 24 | + House_Transcripts.recipe - [House Transcripts](http://clinic-duty.livejournal.com/12225.html) 25 | + Interactive_Data_Visualization_for_the_Web.recipe - [Interactive Data Visualization for the Web](http://chimera.labs.oreilly.com/books/1230000000345) 26 | + Introduction_to_Linux.recipe - [Introduction to Linux](http://tldp.org/LDP/intro-linux/html/) 27 | + Learn_Python_the_Hard_Way.recipe - [Learn Python The Hard Way, 3rd Edition](http://learnpythonthehardway.org/book/) 28 | + Learn_Vimscript_the_Hard_Way.recipe - [Learn Vimscript the Hard Way](http://learnvimscriptthehardway.stevelosh.com/) 29 | + Mastering_Perl.recipe - [Mastering Perl](http://chimera.labs.oreilly.com/books/1234000001527) 30 | + Nature_of_Code_the.recipe - [The Nature of Code](http://natureofcode.com/book/) 31 | + Programming_JavaScript_Applications.recipe - [Programming JavaScript Applications](http://chimera.labs.oreilly.com/books/1234000000262) 32 | + Python_Cookbook.recipe - [Python Cookbook](http://chimera.labs.oreilly.com/books/1230000000393) 33 | + SICP.recipe - [Structure and Interpretation of Computer Programs](http://mitpress.mit.edu/sicp/full-text/book/book.html) 34 | + Test_Driven_Web_Development_with_Python.recipe - [Test-Driven Web Development with Python](http://chimera.labs.oreilly.com/books/1234000000754) 36 | + Free_as_in_Freedom.recipe - [Free as in Freedom](http://www.oreilly.com/openbook/freedom) 37 | + The_Definitive_Guide_to_Yii_2.0.recipe - [The Definitive Guide to Yii 2.0](http://www.yiiframework.com/doc-2.0/guide-index.html) 38 | + Tutorials_about_Development_for_Android.recipe - [Tutorials about Development for
Android](http://www.vogella.com/tutorials/android.html) 39 | 40 | ## Simplified Chinese 41 | 42 | + A_Mathematical_Theory_of_Communication.recipe - [通信的数学理论](http://www.ituring.com.cn/minibook/611) 43 | + Android_Training_Course_In_Chinese.recipe - [Android官方培训课程中文版](http://hukai.me/android-training-course-in-chinese/) 44 | + AngularJS_Tutorial_Cn.recipe - [AngularJS入门教程](http://www.ituring.com.cn/minibook/303) 45 | + Extending_and_Embedding_PHP_zh_CN.recipe - [PHP扩展开发及内核应用](https://github.com/walu/phpbook/blob/master/index.md) 46 | + Git_Community_Book.recipe - [Git Community Book 中文版](http://gitbook.liuhui998.com/) 47 | + Learn_Vimscript_the_Hard_Way_Zhcn.recipe - [笨方法学Vimscript 简体中文版](http://learnvimscriptthehardway.onefloweroneworld.com/) 48 | + Pro_Git_ZH.recipe - [Pro Git 简体中文版](http://iissnan.com/progit/) 49 | 50 | # Usage 51 | 52 | ## GUI 53 | 54 | 1. Install Calibre ([Download](http://calibre-ebook.com/download)). 55 | 2. Go to `Fetch news`, `Load recipe from file` to add your recipe. 56 | 3. Go to `Fetch news`, `Schedule news download`, `Custom`, select the recipe added in step 2 and click `Download Now`. 57 | 4. For more details, refer to the [Calibre Manual](http://manual.calibre-ebook.com/news.html). 58 | 59 | ## Terminal 60 | 61 | 1. Install Calibre 62 | 63 | * Arch Linux 64 | 65 | ```bash 66 | pacman -S calibre 67 | ``` 68 | 69 | * Debian/Ubuntu 70 | 71 | ```bash 72 | apt-get install calibre 73 | ``` 74 | 75 | * RedHat/Fedora/CentOS 76 | 77 | ```bash 78 | yum -y install calibre 79 | ``` 80 | 81 | * Mac OSX - Requires the [Command Line Tool](http://manual.calibre-ebook.com/cli/cli-index.html). 82 | 83 | 2. Run the following commands in the `calibre-recipes` folder 84 | 85 | If you want to generate all books, just run `make`, e.g.
86 | 87 | ```bash 88 | make 89 | ``` 90 | 91 | Otherwise to generate mobi for a specific book, run 92 | 93 | ```bash 94 | make xxx.mobi 95 | ``` 96 | 97 | For example 98 | 99 | ```bash 100 | make AOSABook.mobi 101 | ``` 102 | 103 | Internally, this will run 104 | 105 | ```bash 106 | ebook-convert AOSABook.recipe AOSABook.mobi 107 | ``` 108 | 109 | It will generate `AOSABook.mobi` in the same folder. 110 | 111 | # Contributing 112 | 113 | Please read the `CONTRIBUTING` document. 114 | -------------------------------------------------------------------------------- /calibre-recipes/AOSABook.recipe: -------------------------------------------------------------------------------- 1 | from calibre.web.feeds.recipes import BasicNewsRecipe 2 | 3 | class AOSABook(BasicNewsRecipe): 4 | 5 | title = 'The Architecture of Open Source Applications' 6 | description = 'In these two books, the authors of four dozen open source applications explain how their software is structured, and why. What are each program\'s major components? How do they interact? And what did their builders learn during their development? In answering these questions, the contributors to these books provide unique insights into how they think.' 
7 | cover_url = 'http://www.aosabook.org/images/cover1.jpg' 8 | 9 | url_prefix = 'http://www.aosabook.org/en/' 10 | no_stylesheets = True 11 | remove_tags = [{ 'class': 'footer' }] 12 | 13 | def get_title(self, link): 14 | return link.contents[0].strip() 15 | 16 | def generate_vol(self, table): 17 | articles = [] 18 | 19 | for link in table.findAll('a'): 20 | if '#' in link['href']: 21 | continue 22 | 23 | title = self.get_title(link) 24 | url = self.url_prefix + link['href'] 25 | a = { 'title': title, 'url': url } 26 | 27 | articles.append(a) 28 | 29 | return articles 30 | 31 | def parse_index(self): 32 | soup = self.index_to_soup(self.url_prefix + 'index.html') 33 | 34 | tables = soup.findAll('table') 35 | articles_vol1 = self.generate_vol(tables[1]) 36 | articles_vol2 = self.generate_vol(tables[0]) 37 | 38 | volumes = [('Volume1', articles_vol1), ('Volume2', articles_vol2)] 39 | 40 | return volumes 41 | -------------------------------------------------------------------------------- /calibre-recipes/A_Mathematical_Theory_of_Communication.recipe: -------------------------------------------------------------------------------- 1 | from calibre.web.feeds.recipes import BasicNewsRecipe 2 | 3 | class A_Mathematical_Theory_of_Communication(BasicNewsRecipe): 4 | 5 | title = '通信的数学理论' 6 | description = '' 7 | cover_url = 'http://www.ituring.com.cn/bookcover/1185.935.jpg' 8 | 9 | url_prefix = 'http://www.ituring.com.cn/' 10 | no_stylesheets = True 11 | keep_only_tags = [{ 'id': 'question-header' }, { 'class': 'post-text' }]; 12 | 13 | def get_title(self, link): 14 | return link.contents[0].strip() 15 | 16 | def parse_index(self): 17 | soup = self.index_to_soup('http://www.ituring.com.cn/minibook/611') 18 | 19 | div = soup.find('div', { 'class': 'minibook-list' }) 20 | 21 | articles = [] 22 | for link in div.findAll('a', { 'class': 'question-hyperlink' }): 23 | title = self.get_title(link) 24 | url = self.url_prefix + link['href'] 25 | a = { 'title': title, 'url': url } 
26 | 27 | articles.append(a) 28 | 29 | ans = [('A_Mathematical_Theory_of_Communication', articles)] 30 | 31 | return ans 32 | 33 | def postprocess_html(self, soup, first_fetch): 34 | first = True 35 | 36 | for text in soup.findAll('div', { 'class': 'post-text' }): 37 | if first: 38 | first = False 39 | else: 40 | text.extract() 41 | 42 | return soup 43 | -------------------------------------------------------------------------------- /calibre-recipes/Android_Studio_Development_Essentials.recipe: -------------------------------------------------------------------------------- 1 | #!/usr/bin/env python 2 | 3 | from calibre.web.feeds.recipes import BasicNewsRecipe 4 | from calibre.ebooks.BeautifulSoup import Tag, NavigableString 5 | from collections import OrderedDict 6 | 7 | 8 | class Android_Studio_Development_Essentials(BasicNewsRecipe): 9 | 10 | title = 'Android Studio Development Essentials' 11 | description = '' 12 | cover_url = 'http://www.techotopia.com/cover_images/android_studio_front_cover_150x120.png' 13 | extra_css = ''' 14 | pre { 15 | border-color: #e0e5ea; 16 | border-style: none none none solid; 17 | border-width: medium medium medium 5px; 18 | } 19 | 20 | blockquote { 21 | border-left: 5px solid #eee; 22 | } 23 | 24 | ''' 25 | url_pre = 'http://www.techotopia.com' 26 | no_stylesheets = True 27 | keep_only_tags = [{ 'id': 'bodyContent' }] 28 | remove_tags = [dict(name='table', attrs={'align':'center'})] 29 | simultaneous_downloads = 5 30 | 31 | def parse_index(self): 32 | return self.guide_parse_index() 33 | 34 | def guide_parse_index(self): 35 | 36 | # By changing the URL on the next line, you can generally download any book from techotopia.com 37 | soup = self.index_to_soup(self.url_pre + '/index.php/Android_Studio_Development_Essentials') 38 | 39 | div = soup.find('div', { 'id': 'bodyContent' }) 40 | 41 | articles = [] 42 | 43 | ol = div.find('ol') 44 | myli = ol.li 45 | while (myli): 46 | 47 | link = myli.a 48 | 49 | title = link['title'] 50 | url =
self.url_pre + link['href'] 51 | 52 | a = { 'title' : title, 'url' : url } 53 | 54 | articles.append(a) 55 | 56 | myli = myli.nextSibling 57 | 58 | ans = [(self.title, articles)] 59 | return ans 60 | 61 | -------------------------------------------------------------------------------- /calibre-recipes/Android_Training_Course_In_Chinese.recipe: -------------------------------------------------------------------------------- 1 | from calibre.web.feeds.recipes import BasicNewsRecipe 2 | 3 | class Android_Training_Course_In_Chinese(BasicNewsRecipe): 4 | max_articles_per_feed = 1000 5 | title = 'Android_Training_Course_In_Chinese' 6 | description = u'来自网友翻译的官方android教程' 7 | cover_url = 'http://hukai.me/android-training-course-in-chinese/android_training.jpg' 8 | 9 | url_prefix = 'http://hukai.me/android-training-course-in-chinese/' 10 | no_stylesheets = True 11 | keep_only_tags = [{ 'class': 'normal' }] 12 | 13 | def get_title(self, link): 14 | return link.get('title') 15 | 16 | def parse_index(self): 17 | soup = self.index_to_soup( self.url_prefix + 'index.html' ) 18 | div = soup.find('div', { 'class':'chapters' }) 19 | 20 | articles = [] 21 | for link in div.findAll('a'): 22 | if '../' in link['href']: 23 | continue 24 | 25 | til = self.get_title(link) 26 | url = self.url_prefix + link['href'] 27 | a = { 'title': til, 'url': url } 28 | 29 | articles.append(a) 30 | 31 | ans = [('Android_Training_Course_In_Chinese', articles)] 32 | 33 | return ans 34 | -------------------------------------------------------------------------------- /calibre-recipes/AngularJS_Tutorial_Cn.recipe: -------------------------------------------------------------------------------- 1 | from calibre.web.feeds.recipes import BasicNewsRecipe 2 | 3 | class AngularJS_Tutorial_Cn(BasicNewsRecipe): 4 | 5 | title = 'AngularJS入门教程' 6 | description = '' 7 | cover_url = 'http://www.ituring.com.cn/download/01YQ9gUjqjyW' 8 | 9 | url_prefix = 'http://www.ituring.com.cn/' 10 | no_stylesheets = True 11 | 
keep_only_tags = [{ 'id': 'question-header' }, { 'class': 'post-text' }]; 12 | 13 | def get_title(self, link): 14 | return link.contents[0].strip() 15 | 16 | def parse_index(self): 17 | soup = self.index_to_soup('http://www.ituring.com.cn/minibook/303') 18 | 19 | div = soup.find('div', { 'class': 'minibook-list' }) 20 | 21 | articles = [] 22 | for link in div.findAll('a', { 'class': 'question-hyperlink' }): 23 | title = self.get_title(link) 24 | url = self.url_prefix + link['href'] 25 | a = { 'title': title, 'url': url } 26 | 27 | articles.append(a) 28 | 29 | ans = [('AngularJS_Tutorial_Cn', articles)] 30 | 31 | return ans 32 | 33 | def postprocess_html(self, soup, first_fetch): 34 | first = True 35 | 36 | for text in soup.findAll('div', { 'class': 'post-text' }): 37 | if first: 38 | first = False 39 | else: 40 | text.extract() 41 | 42 | return soup 43 | -------------------------------------------------------------------------------- /calibre-recipes/CS183_Peter_Thiel.recipe: -------------------------------------------------------------------------------- 1 | from calibre.web.feeds.recipes import BasicNewsRecipe 2 | 3 | class Zero_To_One(BasicNewsRecipe): 4 | 5 | title = 'CS183 Peter Thiel' 6 | description = '' 7 | cover_url = 'http://zerotoonebook.com/images/bookcover.png' 8 | 9 | url_prefix = 'http://blakemasters.com/peter-thiels-cs183-startup' 10 | no_stylesheets = True 11 | keep_only_tags = [{ 'class': 'post-content' }] 12 | 13 | def get_title(self, link): 14 | return link.contents[0].strip() 15 | 16 | def parse_index(self): 17 | soup = self.index_to_soup(self.url_prefix + '') 18 | 19 | div = soup.find('div', { 'class': 'post-content' }) 20 | articles = [] 21 | for link in div.findAll('a'): 22 | if '#' in link['href']: 23 | continue 24 | 25 | if not 'blakemasters.tumblr.com' in link['href']: 26 | continue 27 | 28 | title = self.get_title(link) 29 | url = link['href'] 30 | a = { 'title': title, 'url': url } 31 | 32 | articles.append(a) 33 | 34 | ans = [(title, 
articles)] 35 | 36 | return ans 37 | -------------------------------------------------------------------------------- /calibre-recipes/Computer_Science_from_the_Bottom_Up.recipe: -------------------------------------------------------------------------------- 1 | import re 2 | from calibre.web.feeds.recipes import BasicNewsRecipe 3 | 4 | class CSBU(BasicNewsRecipe): 5 | 6 | title = 'Computer Science from the Bottom Up' 7 | description = 'Computer Science from the Bottom Up — A free, online book designed to teach computer science from the bottom end up. Topics covered include binary and binary logic, operating systems internals, toolchain fundamentals and system library fundamentals' 8 | 9 | url_prefix = 'http://www.bottomupcs.com/' 10 | no_stylesheets = True 11 | remove_tags = [{'name':'header'}, {'name':'footer'}, {'name':'div', 'class':'toc'}] 12 | 13 | def get_title(self, link): 14 | return link.contents[0].strip() 15 | 16 | def parse_index(self): 17 | soup = self.index_to_soup('http://www.bottomupcs.com/index.html') 18 | 19 | toc = soup.find('div', {'class':'toc'}) 20 | 21 | articles = [] 22 | for link in toc.findAll('a'): 23 | pattern = re.compile(r'^.*\.html$') 24 | if not pattern.match(link['href']): 25 | continue 26 | 27 | title = self.get_title(link) 28 | url = self.url_prefix + link['href'] 29 | a = { 'title': title, 'url': url } 30 | 31 | articles.append(a) 32 | 33 | ans = [('Computer Science from the Bottom Up', articles)] 34 | 35 | return ans 36 | -------------------------------------------------------------------------------- /calibre-recipes/Designing_Evolvable_Web_APIs_with_ASP_NET.recipe: -------------------------------------------------------------------------------- 1 | from calibre.web.feeds.recipes import BasicNewsRecipe 2 | 3 | class Designing_Evolvable_Web_APIs_with_ASP_NET(BasicNewsRecipe): 4 | 5 | title = 'Designing Evolvable Web APIs with ASP.NET' 6 | description = '' 7 | cover_url =
'http://akamaicovers.oreilly.com/images/0636920026617/rc_lrg.jpg' 8 | 9 | url_prefix = 'http://chimera.labs.oreilly.com/books/1234000001708/' 10 | no_stylesheets = True 11 | keep_only_tags = [{ 'class': 'chapter' }] 12 | 13 | def get_title(self, link): 14 | return link.contents[0].strip() 15 | 16 | def parse_index(self): 17 | soup = self.index_to_soup(self.url_prefix + 'index.html') 18 | 19 | div = soup.find('div', { 'class': 'toc' }) 20 | 21 | articles = [] 22 | for link in div.findAll('a'): 23 | if '#' in link['href']: 24 | continue 25 | 26 | if not 'ch' in link['href']: 27 | continue 28 | 29 | title = self.get_title(link) 30 | url = self.url_prefix + link['href'] 31 | a = { 'title': title, 'url': url } 32 | 33 | articles.append(a) 34 | 35 | ans = [(title, articles)] 36 | 37 | return ans 38 | -------------------------------------------------------------------------------- /calibre-recipes/Dive_Into_Python_3.recipe: -------------------------------------------------------------------------------- 1 | from calibre.web.feeds.recipes import BasicNewsRecipe 2 | 3 | class Dive_Into_Python_3(BasicNewsRecipe): 4 | title = 'Dive_Into_Python_3' 5 | __author__ = 'lord63' 6 | description = '''Dive_Into_Python_3, this book is freely licensed under the 7 | Creative Commons Attribution Share-Alike license''' 8 | timefmt = '[%Y-%m-%d]' 9 | no_stylesheets = True 10 | INDEX = 'http://www.diveintopython3.net/' 11 | remove_tags = [ {'class': 'v'}, # remove the navigation at the bottom of the page 12 | {'class': 'c'}, 13 | # the next three things is for the Introduction chapter. 
14 | dict(attrs={'start': '-1'}), 15 | dict(name='h2'), 16 | dict(name='p', attrs={'style': 'float:right;width:245px;text-align:center;margin:0 0 0 1.75em'})] 17 | remove_tags_before = dict(name='h1') # remove the unwanted things at the top of the page 18 | 19 | 20 | def parse_index(self): 21 | soup = self.index_to_soup(self.INDEX) 22 | TOC = soup.find('ol', attrs={'start': '-1'}) 23 | # To keep the license, so I add this chapter. 24 | articles = [{'title': 'Introduction', 'url': self.INDEX}] 25 | for tag in TOC.findAll('li'): 26 | title = self.tag_to_string(tag) 27 | url = self.INDEX + tag.a['href'] 28 | article = {'title': title, 'url': url} 29 | articles.append(article) 30 | return [('Dive_Into_Python_3', articles)] 31 | 32 | -------------------------------------------------------------------------------- /calibre-recipes/Explore_Flask.recipe: -------------------------------------------------------------------------------- 1 | from calibre.web.feeds.recipes import BasicNewsRecipe 2 | 3 | class Explore_Flask(BasicNewsRecipe): 4 | 5 | title = 'Explore Flask' 6 | description = '' 7 | 8 | url_prefix = 'http://exploreflask.com/' 9 | no_stylesheets = True 10 | 11 | def get_title(self, link): 12 | return link.contents[0].strip() 13 | 14 | def parse_index(self): 15 | soup = self.index_to_soup(self.url_prefix + 'index.html') 16 | 17 | div = soup.find('div') 18 | articles = [] 19 | for link in div.findAll('a'): 20 | if '#' in link['href']: 21 | continue 22 | 23 | title = self.get_title(link) 24 | url = self.url_prefix + link['href'] 25 | a = { 'title': title, 'url': url } 26 | 27 | articles.append(a) 28 | 29 | ans = [(title, articles)] 30 | 31 | return ans 32 | -------------------------------------------------------------------------------- /calibre-recipes/Extending_and_Embedding_PHP_zh_CN.recipe: -------------------------------------------------------------------------------- 1 | from calibre.web.feeds.recipes import BasicNewsRecipe 2 | 3 | class 
Extending_and_Embedding_PHP_zh_CN(BasicNewsRecipe): 4 | 5 | title = 'Extending and Embedding PHP' 6 | description = 'Extending_and_Embedding_PHP_Chinese' 7 | cover_url = 'http://img3.douban.com/lpic/s1918674.jpg' 8 | 9 | url_prefix = 'https://github.com/' 10 | no_stylesheets = True 11 | keep_only_tags = [{ 'class': 'blob instapaper_body' }] 12 | 13 | def get_title(self, link): 14 | return link.contents[0].strip() 15 | 16 | def parse_index(self): 17 | soup = self.index_to_soup(self.url_prefix + 'walu/phpbook/blob/master/preface.md') 18 | 19 | div = soup.find('div', { 'class': 'blob instapaper_body' }) 20 | 21 | articles = [] 22 | for link in div.findAll('a'): 23 | if '#' in link['href']: 24 | continue 25 | 26 | title = self.get_title(link) 27 | url = self.url_prefix + link['href'] 28 | a = { 'title': title, 'url': url } 29 | 30 | articles.append(a) 31 | 32 | ans = [(self.title, articles)] 33 | 34 | return ans -------------------------------------------------------------------------------- /calibre-recipes/Forecasting_Principles_and_Practice.recipe: -------------------------------------------------------------------------------- 1 | import re 2 | from calibre.web.feeds.recipes import BasicNewsRecipe 3 | from calibre.ebooks.BeautifulSoup import NavigableString 4 | 5 | class Forecasting_Principles_and_Practice(BasicNewsRecipe): 6 | 7 | title = 'Forecasting: principles and practice' 8 | description = '' 9 | 10 | no_stylesheets = True 11 | keep_only_tags = [{ 'id': re.compile('^post-') }] 12 | 13 | def get_title(self, link): 14 | title = [] 15 | for ele in link.contents: 16 | if isinstance(ele, NavigableString): 17 | title.append(ele) 18 | else: 19 | title.append(ele.contents[0]) 20 | 21 | return ' '.join(title) 22 | 23 | def parse_index(self): 24 | soup = self.index_to_soup('http://otexts.com/fpp/') 25 | 26 | articles = [] 27 | for li in soup.findAll('li', { 'class': re.compile('^page_item') }): 28 | link = li.find('a') 29 | title = self.get_title(link) 30 | url =
link['href'] 31 | a = { 'title': title, 'url': url } 32 | 33 | articles.append(a) 34 | 35 | ans = [('Forecasting: principles and practice', articles)] 36 | 37 | return ans 38 | -------------------------------------------------------------------------------- /calibre-recipes/Free_as_in_Freedom.recipe: -------------------------------------------------------------------------------- 1 | import re 2 | from calibre.web.feeds.recipes import BasicNewsRecipe 3 | 4 | clean_chapter_title_re = re.compile(r'^.*:\s*') 5 | 6 | 7 | class Free_as_in_Freedom(BasicNewsRecipe): 8 | 9 | title = 'Free as in Freedom' 10 | description = 'Richard Stallman\'s Crusade for Free Software' 11 | cover_url = 'http://akamaicovers.oreilly.com/images/9780596002879/lrg.jpg' 12 | 13 | url_prefix = 'http://www.oreilly.com/openbook/freedom/' 14 | no_stylesheets = True 15 | remove_javascript = True 16 | remove_empty_feeds = True 17 | keep_only_tags = [dict(name = 'blockquote')] 18 | 19 | def get_title(self, link): 20 | title = link.contents[0].strip() 21 | return clean_chapter_title_re.sub('', title) 22 | 23 | def parse_index(self): 24 | soup = self.index_to_soup(self.url_prefix + 'index.html') 25 | chapter_list = soup.find('blockquote') 26 | 27 | chapters = [] 28 | for link in chapter_list.findAll('a'): 29 | if not 'ch' in link['href']: 30 | continue 31 | 32 | title = self.get_title(link) 33 | url = self.url_prefix + link['href'] 34 | a = { 'title': title, 'url': url } 35 | 36 | chapters.append(a) 37 | 38 | return [(self.title, chapters)] 39 | -------------------------------------------------------------------------------- /calibre-recipes/Game_Programming_Patterns.recipe: -------------------------------------------------------------------------------- 1 | from calibre.web.feeds.recipes import BasicNewsRecipe 2 | 3 | class Game_Programming_Patterns(BasicNewsRecipe): 4 | 5 | title = 'Game Programming Patterns' 6 | description = '' 7 | cover_url = 
'http://1.bp.blogspot.com/-AjFt5OHTuwo/VIFUbzbz0HI/AAAAAAAAOGc/mBKFW9_lFBE/s1600/gpp-splash.jpg' 8 | url_prefix = 'http://gameprogrammingpatterns.com/' 9 | keep_only_tags = [{ 'class': 'content' }] 10 | 11 | 12 | def parse_index(self): 13 | roman_numbers = ['', 'I', 'II', 'III', 'IV', 'V', 'VI', 'VII'] 14 | 15 | soup = self.index_to_soup(self.url_prefix + 'contents.html') 16 | toc_root = soup.find('ol', { 'type': 'I' }) 17 | 18 | articles = [] 19 | section_index = 1 20 | article_index = 0 21 | for link_wrapper in toc_root.findAll('li'): 22 | link = link_wrapper.find('a') 23 | raw_title = link.contents[0].strip() 24 | 25 | if (link_wrapper.find('strong') and link_wrapper.find('ol')): 26 | title = roman_numbers[section_index] + '. ' + raw_title 27 | section_index = section_index + 1 28 | else: 29 | if (article_index == 0): 30 | title = raw_title 31 | else: 32 | title = str(article_index) + '. ' + raw_title 33 | article_index = article_index + 1 34 | 35 | url = self.url_prefix + link['href'] 36 | a = { 'title': title, 'url': url } 37 | articles.append(a) 38 | 39 | ans = [(self.title, articles)] 40 | 41 | return ans 42 | -------------------------------------------------------------------------------- /calibre-recipes/Git_Community_Book.recipe: -------------------------------------------------------------------------------- 1 | from calibre.web.feeds.recipes import BasicNewsRecipe 2 | 3 | class Git_Community_Book(BasicNewsRecipe): 4 | 5 | title = 'Git Community Book' 6 | description = '' 7 | cover_url = 'http://gitbook.liuhui998.com/assets/images/header-book.gif' 8 | 9 | url_prefix = 'http://gitbook.liuhui998.com/' 10 | no_stylesheets = True 11 | # keep_only_tags = [{ 'class': 'span-21' }] 12 | 13 | def get_title(self, link): 14 | return link.contents[0].strip() 15 | 16 | def parse_index(self): 17 | articles = [] 18 | 19 | soup = self.index_to_soup(self.url_prefix + 'index.html') 20 | tag_td = soup.findAll('td') 21 | for volume in tag_td: 22 | if len(volume.contents) < 3: 23 |
                continue

            for link in volume.findAll('a'):
                if '.html' not in link['href']:
                    continue
                til = self.get_title(link)
                url = self.url_prefix + link['href']
                a = { 'title': til, 'url': url }

                articles.append(a)

        ans = [('Git Community Book 中文版', articles)]
        return ans

--------------------------------------------------------------------------------
/calibre-recipes/Git_Pocket_Guide.recipe:
--------------------------------------------------------------------------------
from calibre.web.feeds.recipes import BasicNewsRecipe

class Git_Pocket_Guide(BasicNewsRecipe):

    title = 'Git Pocket Guide'
    description = ''
    cover_url = 'http://akamaicovers.oreilly.com/images/0636920024972/lrg.jpg'

    url_prefix = 'http://chimera.labs.oreilly.com/books/1230000000561/'
    no_stylesheets = True
    keep_only_tags = [{ 'class': 'chapter' }]

    def get_title(self, link):
        return link.contents[0].strip()

    def parse_index(self):
        soup = self.index_to_soup(self.url_prefix + 'index.html')

        div = soup.find('div', { 'class': 'toc' })

        articles = []
        for link in div.findAll('a'):
            # Skip in-page anchors and anything that is not a chapter page.
            if '#' in link['href']:
                continue

            if 'ch' not in link['href']:
                continue

            title = self.get_title(link)
            url = self.url_prefix + link['href']
            a = { 'title': title, 'url': url }

            articles.append(a)

        # Name the feed after the book, not after the last chapter processed.
        ans = [(self.title, articles)]

        return ans

--------------------------------------------------------------------------------
/calibre-recipes/High_Performance_Browser_Networking.recipe:
--------------------------------------------------------------------------------
from calibre.web.feeds.recipes import BasicNewsRecipe
import re

class High_Performance_Browser_Networking(BasicNewsRecipe):

    title = 'High Performance Browser Networking'
    description = ''
    cover_url = 'http://orm-other.s3.amazonaws.com/hpbnsplash/hpbncover.jpg'

    url_prefix = 'http://chimera.labs.oreilly.com/books/1230000000545/'
    no_stylesheets = True
    keep_only_tags = [{ 'class': ['preface', 'chapter', 'index', 'colophon'] }]

    def get_title(self, link):
        return link.contents[0].strip()

    def append_colophon(self, articles):
        colophon = {'title': 'Colophon', 'url': self.url_prefix + 'co01.html'}
        articles.append(colophon)

    def parse_index(self):
        soup = self.index_to_soup(self.url_prefix + 'index.html')

        div = soup.find('div', { 'class': 'toc' })

        # Preface (pr), chapter (ch) and index (ix) pages only.
        p = re.compile(r'(pr|ch|ix)\d+\.html$')
        articles = []
        for link in div.findAll('a'):
            href = link['href']
            if p.match(href):
                title = self.get_title(link)
                url = self.url_prefix + href
                a = { 'title': title, 'url': url }

                articles.append(a)

        self.append_colophon(articles)

        ans = [('High Performance Browser Networking', articles)]
        return ans

--------------------------------------------------------------------------------
/calibre-recipes/House_Transcripts.recipe:
--------------------------------------------------------------------------------
from calibre.web.feeds.recipes import BasicNewsRecipe

class House_Transcripts(BasicNewsRecipe):

    title = 'House Transcripts'
    description = ''
    cover_url = 'http://lolsnaps.com/upload_pic/WhatILearnedFromHouseMD-33384.jpeg'

    url_prefix = 'http://clinic-duty.livejournal.com/'
    no_stylesheets = True
    keep_only_tags = [{ 'class': 'entryText' }]

    def get_title(self, link):
        return link.contents[0].strip()

    def parse_index(self):
        soup = self.index_to_soup('http://clinic-duty.livejournal.com/12225.html')

        div = soup.find('div', { 'class': 'entryText' })

        articles = []
        for link in div.findAll('a'):
            if 'season' in link['href']:
                continue

            # if not 'ch' in link['href']:
            #     continue

            til = self.get_title(link)
            url = link['href']
            a = { 'title': til, 'url': url }

            articles.append(a)

        ans = [('House_Transcripts', articles)]

        return ans

--------------------------------------------------------------------------------
/calibre-recipes/Interactive_Data_Visualization_for_the_Web.recipe:
--------------------------------------------------------------------------------
from calibre.web.feeds.recipes import BasicNewsRecipe

class Interactive_Data_Visualization_for_the_Web(BasicNewsRecipe):

    title = 'Interactive_Data_Visualization_for_the_Web'
    description = ''
    cover_url = 'http://akamaicovers.oreilly.com/images/0636920026938/lrg.jpg'

    url_prefix = 'http://chimera.labs.oreilly.com/books/1230000000345/'
    no_stylesheets = True
    keep_only_tags = [{ 'class': 'chapter' }]

    def get_title(self, link):
        return link.contents[0].strip()

    def parse_index(self):
        soup = self.index_to_soup(self.url_prefix + 'index.html')

        div = soup.find('div', { 'class': 'toc' })

        articles = []
        for link in div.findAll('a'):
            # Skip in-page anchors and anything that is not a chapter page.
            if '#' in link['href']:
                continue

            if 'ch' not in link['href']:
                continue

            title = self.get_title(link)
            url = self.url_prefix + link['href']
            a = { 'title': title, 'url': url }

            articles.append(a)

        # Name the feed after the book, not after the last chapter processed.
        ans = [(self.title, articles)]

        return ans

--------------------------------------------------------------------------------
/calibre-recipes/Introduction_to_Linux.recipe:
--------------------------------------------------------------------------------
from calibre.web.feeds.news import BasicNewsRecipe

class IntroductionToLinux(BasicNewsRecipe):
    title = 'Introduction_to_Linux'
    __author__ = 'soooldier & lord63'
    description = 'Introduction_to_Linux'
    timefmt = '[%Y-%m-%d]'
    no_stylesheets = True
    url_prefix = 'http://tldp.org/LDP/intro-linux/html/'
    keep_only_tags = [dict(name='div', attrs={'class': 'GLOSSARY'}),  # the glossary
                      dict(name='div', attrs={'class': 'chapter'}),   # normal chapter
                      dict(name='div', attrs={'class': 'preface'}),   # the introduction
                      dict(name='div', attrs={'class': 'sect1'}),     # other contents
                      dict(name='div', attrs={'class': 'section'}),   # contents in introduction
                      dict(name='div', attrs={'class': 'appendix'})]  # appendix B


    def get_title(self, tag):
        prefix = ''
        # Mark subsection titles with a prefix so they read as nested entries.
        if tag.parent.parent.name == 'dd':
            prefix = '===='
        # Some titles have no number, for example the 'Introduction'.
        if not isinstance(tag.contents[0], basestring):
            return prefix + tag.a.contents[0].strip()
        else:
            return prefix + tag.contents[0] + tag.a.contents[0].strip()


    def parse_index(self):
        soup = self.index_to_soup(self.url_prefix)
        toc = soup.find('div', {'class': 'TOC'})
        articles = []

        for tag in toc.findAll('dt'):
            # Drop the 'List of Tables' and 'List of Figures' sections.
            if tag.a is None:
                continue
            # Drop the index section.
            if 'i14033.html' in tag.a['href']:
                continue
            title = self.get_title(tag)
            url = self.url_prefix + tag.a['href']
            article = {'title': title, 'url': url}
            articles.append(article)
        ans = [('Introduction_to_Linux', articles)]

        return ans

--------------------------------------------------------------------------------
/calibre-recipes/Learn_Python_the_Hard_Way.recipe:
--------------------------------------------------------------------------------
from calibre.web.feeds.news import BasicNewsRecipe

class LearnPythonTheHardWay(BasicNewsRecipe):
    title = 'learn_python_the_hard_way'
    __author__ = 'lord63'
    description = 'learn_python_the_hard_way'
    timefmt = '[%Y-%m-%d]'
    no_stylesheets = True
    url_prefix = 'http://learnpythonthehardway.org/book/'
    keep_only_tags = [{'class': 'large-12 columns'}]


    def parse_index(self):
        soup = self.index_to_soup(self.url_prefix + 'index.html')
        toc = soup.find('ul', 'simple')
        articles = []
        for tag in toc.findAll('li'):
            title = self.tag_to_string(tag)
            url = self.url_prefix + tag.a['href']
            article = {'title': title, 'url': url}
            articles.append(article)
        ans = [('learn_python_the_hard_way', articles)]
        return ans


    def postprocess_html(self, soup, first_fetch):
        # Keep only the first 'large-12 columns' block and drop the rest.
        first = True
        for text in soup.findAll('div', {'class': 'large-12 columns'}):
            if first:
                first = False
            else:
                text.extract()
        return soup

--------------------------------------------------------------------------------
/calibre-recipes/Learn_Vimscript_the_Hard_Way.recipe:
--------------------------------------------------------------------------------
from calibre.web.feeds.news import BasicNewsRecipe

class LearnVimscriptTheHardWay(BasicNewsRecipe):
    title = 'learn_vimscript_the_hard_way'
    __author__ = 'lord63'
    # The trailing space after 'the' keeps the two string pieces from
    # running together as 'theVim'.
    description = ('Learn Vimscript the Hard Way is a book for users of the '
                   'Vim editor who want to learn how to customize Vim.')
    timefmt = '[%Y-%m-%d]'
    no_stylesheets = True
    url_prefix = 'http://learnvimscriptthehardway.stevelosh.com'
    keep_only_tags = [{'class': 'content fourteen columns offset-by-one'},
                      {'class': 'fourteen columns offset-by-one'},
                      {'class': 'content twelve columns offset-by-one'}]
    remove_tags = [dict(name='section', attrs={'class': 'toc'}),
                   dict(name='div', attrs={'class': 'prevnext'})]

    def parse_index(self):
        soup = self.index_to_soup(self.url_prefix)
        toc = soup.find('section', 'toc')
        articles = [{'title': 'Introduction', 'url': self.url_prefix}]
        for tag in toc.findAll('li'):
            title = self.tag_to_string(tag)
            url = self.url_prefix + tag.a['href']
            article = {'title': title, 'url': url}
            articles.append(article)
        ans = [('learn_vimscript_the_hard_way', articles)]
        return ans

--------------------------------------------------------------------------------
/calibre-recipes/Learn_Vimscript_the_Hard_Way_Zhcn.recipe:
--------------------------------------------------------------------------------
from calibre.web.feeds.news import BasicNewsRecipe

class LearnVimscriptTheHardWay(BasicNewsRecipe):
    # Title: "Learn Vimscript the Hard Way" (Chinese edition).
    title = '笨方法学Vimscript'
    __author__ = 'Steve Losh'
    # Description: "Learn Vimscript the Hard Way is for users who want to
    # learn how to customize the Vim editor."
    description = ('笨方法学Vimscript面向那些想学会如何自定义Vim编辑器的用户')
    timefmt = '[%Y-%m-%d]'
    no_stylesheets = True
    url_prefix = 'http://learnvimscriptthehardway.onefloweroneworld.com'
    keep_only_tags = [{'class': 'content fourteen columns offset-by-one'},
                      {'class': 'fourteen columns offset-by-one'},
                      {'class': 'content twelve columns offset-by-one'}]
    remove_tags = [dict(name='section', attrs={'class': 'toc'}),
                   dict(name='div', attrs={'class': 'prevnext'})]

    def parse_index(self):
        soup = self.index_to_soup(self.url_prefix)
        toc = soup.find('section', 'toc')
        articles = [{'title': 'Introduction', 'url': self.url_prefix}]
        for tag in toc.findAll('li'):
            title = self.tag_to_string(tag)
            url = self.url_prefix + tag.a['href']
            article = {'title': title, 'url': url}
            articles.append(article)
        ans = [('learn_vimscript_the_hard_way_zhcn', articles)]
        return ans

--------------------------------------------------------------------------------
/calibre-recipes/Makefile:
--------------------------------------------------------------------------------
RECIPES := $(shell /bin/ls -A *.recipe)

all: $(RECIPES:.recipe=.mobi)

%.mobi: %.recipe
	ebook-convert $< $@

.PHONY: clean

clean:
	$(RM) *.mobi

--------------------------------------------------------------------------------
/calibre-recipes/Mastering_Perl.recipe:
--------------------------------------------------------------------------------
from calibre.web.feeds.recipes import BasicNewsRecipe

class Mastering_Perl(BasicNewsRecipe):

    title = 'Mastering Perl'
    description = ''
    cover_url = 'http://akamaicovers.oreilly.com/images/9780596527242/lrg.jpg'

    url_prefix = 'http://chimera.labs.oreilly.com/books/1234000001527/'
    no_stylesheets = True
    keep_only_tags = [{ 'class': 'chapter' }]

    def get_title(self, link):
        return link.contents[0].strip()

    def parse_index(self):
        soup = self.index_to_soup(self.url_prefix + 'index.html')

        div = soup.find('div', { 'class': 'toc' })

        articles = []
        for link in div.findAll('a'):
            # Skip in-page anchors and anything that is not a chapter page.
            if '#' in link['href']:
                continue

            if 'ch' not in link['href']:
                continue

            title = self.get_title(link)
            url = self.url_prefix + link['href']
            a = { 'title': title, 'url': url }

            articles.append(a)

        # Name the feed after the book, not after the last chapter processed.
        ans = [(self.title, articles)]

        return ans

--------------------------------------------------------------------------------
/calibre-recipes/Nature_of_Code_the.recipe:
--------------------------------------------------------------------------------
from calibre.web.feeds.recipes import BasicNewsRecipe

class Nature_of_Code_the(BasicNewsRecipe):

    title = 'The Nature of Code'
    description = ''
    cover_url = 'http://www.vjsmag.com/wp-content/uploads/2014/06/vjs-magazine-The-Nature-Code-Simulating-Processing.jpg'
    url_prefix = 'http://natureofcode.com'
    no_stylesheets = True
    keep_only_tags = [{ 'id': 'container' }]

    def parse_index(self):
        soup = self.index_to_soup(self.url_prefix + '/book/')

        div = soup.find('div', { 'id': 'toc-list' })

        articles = []
        for link in div.findAll('a'):
            title = link.contents[0].strip()
            url = self.url_prefix + link['href']
            a = { 'title': title, 'url': url }
            articles.append(a)

        # Name the feed after the book, not after the last chapter processed.
        ans = [(self.title, articles)]

        return ans

--------------------------------------------------------------------------------
/calibre-recipes/Pro_Git_ZH.recipe:
--------------------------------------------------------------------------------
from calibre.web.feeds.recipes import BasicNewsRecipe

class Pro_Git_ZH(BasicNewsRecipe):

    title = 'Pro_Git_ZH'
    description = 'Pro_Git_Chinese'
    cover_url = 'http://iissnan.com/progit/assets/img/pro-git-cover.jpeg'

    url_prefix = 'http://iissnan.com/progit/'
    no_stylesheets = True
    keep_only_tags = [{ 'class': 'col-md-8' }]

    def parse_index(self):
        soup = self.index_to_soup(self.url_prefix)

        div = soup.find('ul', { 'class': 'toc' })

        articles = []
        for link in div.findAll('a'):

            til = link.contents[0].strip()
            url = self.url_prefix + '/' + link['href']
            a = { 'title': til, 'url': url }

            articles.append(a)

        ans = [('Pro_Git_ZH', articles)]

        return ans

--------------------------------------------------------------------------------
/calibre-recipes/Programming_JavaScript_Applications.recipe:
--------------------------------------------------------------------------------
from calibre.web.feeds.recipes import BasicNewsRecipe

class Programming_JavaScript_Applications(BasicNewsRecipe):

    title = 'Programming_JavaScript_Applications'
    description = ''
    cover_url = 'http://akamaicovers.oreilly.com/images/0636920024231/lrg.jpg'

    url_prefix = 'http://chimera.labs.oreilly.com/books/1234000000262/'
    no_stylesheets = True
    keep_only_tags = [{ 'class': 'chapter' }]

    def get_title(self, link):
        return link.contents[0].strip()

    def parse_index(self):
        soup = self.index_to_soup(self.url_prefix + 'index.html')

        div = soup.find('div', { 'class': 'toc' })

        articles = []
        for link in div.findAll('a'):
            # Skip in-page anchors and anything that is not a chapter page.
            if '#' in link['href']:
                continue

            if 'ch' not in link['href']:
                continue

            title = self.get_title(link)
            url = self.url_prefix + link['href']
            a = { 'title': title, 'url': url }

            articles.append(a)

        # Name the feed after the book, not after the last chapter processed.
        ans = [(self.title, articles)]

        return ans

--------------------------------------------------------------------------------
/calibre-recipes/Python_Cookbook.recipe:
--------------------------------------------------------------------------------
from calibre.web.feeds.recipes import BasicNewsRecipe

# The class was previously named Git_Pocket_Guide (copied from another recipe).
class Python_Cookbook(BasicNewsRecipe):

    title = 'Python Cookbook'
    description = ''
    cover_url = 'http://orm-other.s3.amazonaws.com/pythonckbksplash/pythonckbk_cover.jpg'

    url_prefix = 'http://chimera.labs.oreilly.com/books/1230000000393/'
    no_stylesheets = True
    keep_only_tags = [{ 'class': 'chapter' }]

    def get_title(self, link):
        return link.contents[0].strip()

    def parse_index(self):
        soup = self.index_to_soup(self.url_prefix + 'index.html')

        div = soup.find('div', { 'class': 'toc' })

        articles = []
        for link in div.findAll('a'):
            # Skip in-page anchors and anything that is not a chapter page.
            if '#' in link['href']:
                continue

            if 'ch' not in link['href']:
                continue

            til = self.get_title(link)
            url = self.url_prefix + link['href']
            a = { 'title': til, 'url': url }

            articles.append(a)

        ans = [('Python Cookbook', articles)]

        return ans

--------------------------------------------------------------------------------
/calibre-recipes/SICP.recipe:
--------------------------------------------------------------------------------
import re
from calibre.web.feeds.recipes import BasicNewsRecipe

class SICP(BasicNewsRecipe):

    title = 'Structure and Interpretation of Computer Programs'
    description = ''
    cover_url = 'http://mitpress.mit.edu/sicp/full-text/book/cover.jpg'

    url_prefix = 'http://mitpress.mit.edu/sicp/full-text/book/'
    no_stylesheets = True
    remove_tags = [{ 'class': 'navigation' }]

    def get_title(self, link):
        return link.contents[0].strip()

    def parse_index(self):
        soup = self.index_to_soup('http://mitpress.mit.edu/sicp/full-text/book/book-Z-H-4.html')

        articles = []
        for link in soup.findAll('a'):
            if not (link.has_key('name') and link.has_key('href')):
                continue

            # Skip third-level subsections (hrefs ending in d.d.d).
            pattern = re.compile(r'^.*\d\.\d\.\d$')
            if pattern.match(link['href']):
                continue

            title = self.get_title(link)
            url = self.url_prefix + link['href']
            a = { 'title': title, 'url': url }

            articles.append(a)

        ans = [('SICP', articles)]

        return ans

--------------------------------------------------------------------------------
/calibre-recipes/Test_Driven_Web_Development_with_Python.recipe:
--------------------------------------------------------------------------------
from calibre.web.feeds.recipes import BasicNewsRecipe

class Test_Driven_Web_Development_with_Python(BasicNewsRecipe):

    title = 'Test-Driven Web Development with Python'
    description = ''
    cover_url = 'http://akamaicovers.oreilly.com/images/0636920029533/rc_lrg.jpg'

    url_prefix = 'http://chimera.labs.oreilly.com/books/1234000000754/'
    no_stylesheets = True
    keep_only_tags = [{ 'class': 'chapter' }]

    def get_title(self, link):
        return link.contents[0].strip()

    def parse_index(self):
        soup = self.index_to_soup(self.url_prefix + 'index.html')

        div = soup.find('div', { 'class': 'toc' })

        articles = []
        for link in div.findAll('a'):
            # Skip in-page anchors and anything that is not a chapter page.
            if '#' in link['href']:
                continue

            if 'ch' not in link['href']:
                continue

            title = self.get_title(link)
            url = self.url_prefix + link['href']
            a = { 'title': title, 'url': url }

            articles.append(a)

        # Name the feed after the book, not after the last chapter processed.
        ans = [(self.title, articles)]

        return ans

--------------------------------------------------------------------------------
/calibre-recipes/The_Definitive_Guide_to_Yii_2.0.recipe:
--------------------------------------------------------------------------------
#!/usr/bin/env python

import re  # needed by re.search() in guide_parse_index
from calibre.web.feeds.recipes import BasicNewsRecipe
from calibre.ebooks.BeautifulSoup import Tag, NavigableString
from collections import OrderedDict


class Yii_Framework_Guide(BasicNewsRecipe):

    title = 'The Definitive Guide to Yii 2.0'
    description = 'Yii 2.0 Documentation'
    cover_url = 'http://static.yiiframework.com/css/img/logo.png'
    extra_css = '''
        pre {
            border-color: #e0e5ea;
            border-style: none none none solid;
            border-width: medium medium medium 5px;
        }

        blockquote {
            border-left: 5px solid #eee;
        }

    '''
    url_pre = 'http://www.yiiframework.com/doc-2.0/'
    no_stylesheets = True
    keep_only_tags = [{ 'role': 'main' }]
    simultaneous_downloads = 5

    def parse_index(self):
        return self.guide_parse_index()

    def guide_parse_index(self):

        soup = self.index_to_soup(self.url_pre + 'guide-index.html')
        title = 'The Definitive Guide to Yii 2.0'

        div = soup.find('div', { 'role': 'main' })

        section_title_pattern = r'([A-Za-z]*\s*)+'
        feeds = OrderedDict()

        is_first_section = True
        section_title = ''
        # Start with an empty list so a link that precedes the first <h2>
        # does not raise a NameError.
        articles = []

        for item in div.findAll(['h2','li']):

            html_tag = item.name

            if html_tag == 'h2':
                # A new heading closes the previous section: store its articles.
                if not is_first_section:
                    if articles:
                        if section_title not in feeds:
                            feeds[section_title] = []
                        feeds[section_title] += articles
                else:
                    is_first_section = False
                section_title_raw = self.tag_to_string(item)
                section_title = re.search(section_title_pattern, section_title_raw).group(0).strip()
                articles = []
            else:
                link = item.find('a')
                if link:

                    page_url = self.url_pre + link['href']
                    page_title = self.tag_to_string(item)
                    if 'TBD' in page_title:
                        continue
                    articles.append({'title': page_title, 'url': page_url,
                                     'description':'', 'date':''})

        # Store the articles of the last section.
        if articles:
            if section_title not in feeds:
                feeds[section_title] = []
            feeds[section_title] += articles

        # items() works on both Python 2 and Python 3.
        ans = [(key, val) for key, val in feeds.items()]
        return ans
--------------------------------------------------------------------------------
/calibre-recipes/Tutorials_about_Development_for_Android.recipe:
--------------------------------------------------------------------------------
#!/usr/bin/env python

from calibre.web.feeds.recipes import BasicNewsRecipe
from calibre.ebooks.BeautifulSoup import Tag, NavigableString
from collections import OrderedDict


class Tutorials_about_Development_for_Android(BasicNewsRecipe):

    title = 'Tutorials about Development for Android'
    description = ''
    extra_css = '''
        pre {
            border-color: #e0e5ea;
            border-style: none none none solid;
            border-width: medium medium medium 5px;
        }

        blockquote {
            border-left: 5px solid #eee;
        }

    '''
    no_stylesheets = True
    keep_only_tags = [{ 'class': 'article' }]
    simultaneous_downloads = 5

    def parse_index(self):
        return self.guide_parse_index()

    def guide_parse_index(self):

        soup = self.index_to_soup('http://www.vogella.com/tutorials/android.html')

        div = soup.find('div', { 'class': 'tutorialdiv' })

        feeds = OrderedDict()

        container = div.findAll('div', { 'class' : 'tutorialcontainer'})

        for item in container:

            header = item.find('div', { 'class' : 'tutorialheader' })

            section_title = self.tag_to_string(header.h2)
            feeds[section_title] = []
            articles = []

            pages = item.find('div', { 'class' : 'tutorialbody' })
            links = pages.findAll('a')
            for link in links:
                page_url = link['href']
                page_title = self.tag_to_string(link)
                articles.append({'title': page_title, 'url': page_url,
                                 'description':'', 'date':''})

            feeds[section_title] += articles

        # items() works on both Python 2 and Python 3.
        ans = [(key, val) for key, val in feeds.items()]
        return ans

--------------------------------------------------------------------------------