└── README.md


/README.md:
--------------------------------------------------------------------------------
 1 | A collection of technical resources and useful code libraries for writing web crawlers.
 2 | 
 3 | #### Technical Blogs
 4 | * **Official Scrapy documentation (English)**: http://doc.scrapy.org/en/latest/index.html
 5 | * **Official Scrapy documentation (Chinese)**: http://scrapy-chs.readthedocs.org/zh_CN/latest/intro/tutorial.html
 6 | * [Scraping data with Scrapy](http://www.tuicool.com/articles/ZbEFnya)
 7 | * [Scraping Douban Movies with Scrapy](http://www.ituring.com.cn/article/114408#)
 8 | * [Zhihu: How to get started with web crawlers](http://www.zhihu.com/question/20899988)
 9 | * [Scrapy crawler notes (8): core Scrapy operations, crawling Douban images, and database connections](http://www.tuicool.com/articles/7JVvaa)
10 | * [Python crawler learning tutorial series](http://cuiqingcai.com/1052.html)
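11 | * [PhantomJs](http://phantomjs.org/): a headless browser, scriptable in JS, that emulates a real browser to execute a page's JS
12 | * [CasperJs](http://casperjs.org/): built on `phantomjs`; simpler and easier to use than `Phantomjs` itself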
13 | * [PhantomJs quick start](http://www.tuicool.com/articles/beeMNj/)
14 | * [BeautifulSoup tutorial](http://cuiqingcai.com/1319.html)
15 | 
16 | #### Open Source Projects
17 | * [Cola](https://github.com/chineking/cola) a high-level distributed crawler framework
18 | * [Goose](https://github.com/grangier/python-goose) HTML content / article extractor
19 | * [scrapy-redis](https://github.com/darkrho/scrapy-redis) a distributed crawler framework based on `Redis` and `Scrapy`
20 | * [nvie/rq](https://github.com/nvie/rq) a simple task queue implemented in `Python`
21 | * [Bloom Filter](http://billmill.org/bloomfilter-tutorial/): a tutorial on Bloom filters, an efficient data structure for URL filtering and deduplication
22 | * [Scrapyd](https://github.com/scrapy/scrapyd): a tool for deploying and monitoring `scrapy` spiders
23 | * [scrapyd-client](https://github.com/scrapy/scrapyd-client): works with `Scrapyd`, calling `addversion.json` to publish a `Spider`
24 | * [ScrapyJs](https://github.com/scrapinghub/scrapyjs): the officially provided `JS` solution for `scrapy`
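25 | * [RSpider](https://github.com/KDF5000/RSpider): a distributed crawler template based on `Scrapy-redis`; it supports random `user agent` generation, running multiple spiders concurrently, and graphical monitoring of `Scrapy` status through `graphite`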
26 | 
27 | 


--------------------------------------------------------------------------------