├── .gitignore
├── LICENSE
├── NewsSpider
    ├── items.py
    ├── middlewares.py
    ├── pipelines.py
    ├── settings.py
    ├── spiders
    │   ├── __init__.py
    │   ├── netease.py
    │   ├── sina.py
    │   └── sohu.py
    └── utils
    │   └── bloom.py
├── README.md
├── main.py
├── requirements.txt
└── scrapy.cfg

/.gitignore:
--------------------------------------------------------------------------------
log/
data/
.idea/
__pycache__/

--------------------------------------------------------------------------------
/LICENSE:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/hiyoung123/NewsSpider/HEAD/LICENSE

--------------------------------------------------------------------------------
/NewsSpider/items.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/hiyoung123/NewsSpider/HEAD/NewsSpider/items.py

--------------------------------------------------------------------------------
/NewsSpider/middlewares.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/hiyoung123/NewsSpider/HEAD/NewsSpider/middlewares.py

--------------------------------------------------------------------------------
/NewsSpider/pipelines.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/hiyoung123/NewsSpider/HEAD/NewsSpider/pipelines.py

--------------------------------------------------------------------------------
/NewsSpider/settings.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/hiyoung123/NewsSpider/HEAD/NewsSpider/settings.py

--------------------------------------------------------------------------------
/NewsSpider/spiders/__init__.py:
--------------------------------------------------------------------------------
#!/usr/bin/env python
# -*- coding:utf-8 -*-

--------------------------------------------------------------------------------
/NewsSpider/spiders/netease.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/hiyoung123/NewsSpider/HEAD/NewsSpider/spiders/netease.py

--------------------------------------------------------------------------------
/NewsSpider/spiders/sina.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/hiyoung123/NewsSpider/HEAD/NewsSpider/spiders/sina.py

--------------------------------------------------------------------------------
/NewsSpider/spiders/sohu.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/hiyoung123/NewsSpider/HEAD/NewsSpider/spiders/sohu.py

--------------------------------------------------------------------------------
/NewsSpider/utils/bloom.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/hiyoung123/NewsSpider/HEAD/NewsSpider/utils/bloom.py

--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/hiyoung123/NewsSpider/HEAD/README.md

--------------------------------------------------------------------------------
/main.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/hiyoung123/NewsSpider/HEAD/main.py

--------------------------------------------------------------------------------
/requirements.txt:
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
/scrapy.cfg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/hiyoung123/NewsSpider/HEAD/scrapy.cfg

--------------------------------------------------------------------------------