├── .gitignore
├── .gitmodules
├── LICENSE
├── README.md
├── news_web
    ├── frontend
    │   ├── article.html
    │   └── subscription.html
    ├── init_db.py
    ├── lib
    │   ├── __init__.py
    │   ├── db_utils.py
    │   ├── error_code.py
    │   ├── response.py
    │   └── utils.py
    ├── manage.py
    ├── news_web
    │   ├── __init__.py
    │   ├── middlewares.py
    │   ├── settings.py
    │   ├── urls.py
    │   └── wsgi.py
    ├── run_server.sh
    └── web_server
    │   ├── __init__.py
    │   ├── admin.py
    │   ├── apps.py
    │   ├── migrations
    │       └── __init__.py
    │   ├── models.py
    │   ├── tests.py
    │   ├── urls.py
    │   └── views.py
├── newscrawler
    ├── newscrawler
    │   ├── __init__.py
    │   ├── items.py
    │   ├── middlewares.py
    │   ├── pipelines.py
    │   ├── settings.py
    │   ├── spiders
    │   │   ├── __init__.py
    │   │   ├── netease.py
    │   │   └── qq.py
    │   ├── utils.py
    │   ├── wechat_config.py
    │   └── wechat_push.py
    ├── scrapy.cfg
    ├── start_crawl.py
    └── worker.py
├── requirements.txt
└── 论文相关文件
    ├── MongoDB.png
    ├── WechatIMG37.png
    ├── WechatIMG38.png
    ├── WechatIMG39.png
    ├── WechatIMG40.png
    ├── WechatIMG41.jpeg
    ├── WechatIMG42.jpeg
    ├── WechatIMG43.jpeg
    ├── WechatIMG44.jpeg
    ├── nginx配置.png
    ├── scrapy架构.png
    ├── spider实现.png
    ├── useragent.png
    ├── 启动API服务器.png
    ├── 启动spider.png
    ├── 基于网络爬虫的新闻采集和订阅系统的设计与实现_黄雄镖_终稿.pdf
    ├── 新闻推送活动图.png
    ├── 新闻订阅活动图.png
    ├── 爬虫部分目录.png
    ├── 用例图.png
    ├── 系统总体框架.png
    └── 订阅与展示部分目录.png

/.gitignore:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/BillBillBillBill/NewsCrawler/HEAD/.gitignore
--------------------------------------------------------------------------------
/.gitmodules:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/BillBillBillBill/NewsCrawler/HEAD/.gitmodules
--------------------------------------------------------------------------------
/LICENSE:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/BillBillBillBill/NewsCrawler/HEAD/LICENSE
--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
1 | # NewsCrawler
2 | Graduation project: Design and Implementation of a Web-Crawler-Based News Collection and Subscription System
3 |
--------------------------------------------------------------------------------
/news_web/frontend/article.html:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/BillBillBillBill/NewsCrawler/HEAD/news_web/frontend/article.html
--------------------------------------------------------------------------------
/news_web/frontend/subscription.html:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/BillBillBillBill/NewsCrawler/HEAD/news_web/frontend/subscription.html
--------------------------------------------------------------------------------
/news_web/init_db.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/BillBillBillBill/NewsCrawler/HEAD/news_web/init_db.py
--------------------------------------------------------------------------------
/news_web/lib/__init__.py:
--------------------------------------------------------------------------------
1 |
--------------------------------------------------------------------------------
/news_web/lib/db_utils.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/BillBillBillBill/NewsCrawler/HEAD/news_web/lib/db_utils.py
--------------------------------------------------------------------------------
/news_web/lib/error_code.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/BillBillBillBill/NewsCrawler/HEAD/news_web/lib/error_code.py
--------------------------------------------------------------------------------
/news_web/lib/response.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/BillBillBillBill/NewsCrawler/HEAD/news_web/lib/response.py
--------------------------------------------------------------------------------
/news_web/lib/utils.py:
--------------------------------------------------------------------------------
1 |
--------------------------------------------------------------------------------
/news_web/manage.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/BillBillBillBill/NewsCrawler/HEAD/news_web/manage.py
--------------------------------------------------------------------------------
/news_web/news_web/__init__.py:
--------------------------------------------------------------------------------
1 |
--------------------------------------------------------------------------------
/news_web/news_web/middlewares.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/BillBillBillBill/NewsCrawler/HEAD/news_web/news_web/middlewares.py
--------------------------------------------------------------------------------
/news_web/news_web/settings.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/BillBillBillBill/NewsCrawler/HEAD/news_web/news_web/settings.py
--------------------------------------------------------------------------------
/news_web/news_web/urls.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/BillBillBillBill/NewsCrawler/HEAD/news_web/news_web/urls.py
--------------------------------------------------------------------------------
/news_web/news_web/wsgi.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/BillBillBillBill/NewsCrawler/HEAD/news_web/news_web/wsgi.py
--------------------------------------------------------------------------------
/news_web/run_server.sh:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/BillBillBillBill/NewsCrawler/HEAD/news_web/run_server.sh
--------------------------------------------------------------------------------
/news_web/web_server/__init__.py:
--------------------------------------------------------------------------------
1 |
--------------------------------------------------------------------------------
/news_web/web_server/admin.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/BillBillBillBill/NewsCrawler/HEAD/news_web/web_server/admin.py
--------------------------------------------------------------------------------
/news_web/web_server/apps.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/BillBillBillBill/NewsCrawler/HEAD/news_web/web_server/apps.py
--------------------------------------------------------------------------------
/news_web/web_server/migrations/__init__.py:
--------------------------------------------------------------------------------
1 |
--------------------------------------------------------------------------------
/news_web/web_server/models.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/BillBillBillBill/NewsCrawler/HEAD/news_web/web_server/models.py
--------------------------------------------------------------------------------
/news_web/web_server/tests.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/BillBillBillBill/NewsCrawler/HEAD/news_web/web_server/tests.py
--------------------------------------------------------------------------------
/news_web/web_server/urls.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/BillBillBillBill/NewsCrawler/HEAD/news_web/web_server/urls.py
--------------------------------------------------------------------------------
/news_web/web_server/views.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/BillBillBillBill/NewsCrawler/HEAD/news_web/web_server/views.py
--------------------------------------------------------------------------------
/newscrawler/newscrawler/__init__.py:
--------------------------------------------------------------------------------
1 |
--------------------------------------------------------------------------------
/newscrawler/newscrawler/items.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/BillBillBillBill/NewsCrawler/HEAD/newscrawler/newscrawler/items.py
--------------------------------------------------------------------------------
/newscrawler/newscrawler/middlewares.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/BillBillBillBill/NewsCrawler/HEAD/newscrawler/newscrawler/middlewares.py
--------------------------------------------------------------------------------
/newscrawler/newscrawler/pipelines.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/BillBillBillBill/NewsCrawler/HEAD/newscrawler/newscrawler/pipelines.py
--------------------------------------------------------------------------------
/newscrawler/newscrawler/settings.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/BillBillBillBill/NewsCrawler/HEAD/newscrawler/newscrawler/settings.py
--------------------------------------------------------------------------------
/newscrawler/newscrawler/spiders/__init__.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/BillBillBillBill/NewsCrawler/HEAD/newscrawler/newscrawler/spiders/__init__.py
--------------------------------------------------------------------------------
/newscrawler/newscrawler/spiders/netease.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/BillBillBillBill/NewsCrawler/HEAD/newscrawler/newscrawler/spiders/netease.py
--------------------------------------------------------------------------------
/newscrawler/newscrawler/spiders/qq.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/BillBillBillBill/NewsCrawler/HEAD/newscrawler/newscrawler/spiders/qq.py
--------------------------------------------------------------------------------
/newscrawler/newscrawler/utils.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/BillBillBillBill/NewsCrawler/HEAD/newscrawler/newscrawler/utils.py
--------------------------------------------------------------------------------
/newscrawler/newscrawler/wechat_config.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/BillBillBillBill/NewsCrawler/HEAD/newscrawler/newscrawler/wechat_config.py
--------------------------------------------------------------------------------
/newscrawler/newscrawler/wechat_push.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/BillBillBillBill/NewsCrawler/HEAD/newscrawler/newscrawler/wechat_push.py
--------------------------------------------------------------------------------
/newscrawler/scrapy.cfg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/BillBillBillBill/NewsCrawler/HEAD/newscrawler/scrapy.cfg
--------------------------------------------------------------------------------
/newscrawler/start_crawl.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/BillBillBillBill/NewsCrawler/HEAD/newscrawler/start_crawl.py
--------------------------------------------------------------------------------
/newscrawler/worker.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/BillBillBillBill/NewsCrawler/HEAD/newscrawler/worker.py
--------------------------------------------------------------------------------
/requirements.txt:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/BillBillBillBill/NewsCrawler/HEAD/requirements.txt
--------------------------------------------------------------------------------
/论文相关文件/MongoDB.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/BillBillBillBill/NewsCrawler/HEAD/论文相关文件/MongoDB.png
--------------------------------------------------------------------------------
/论文相关文件/WechatIMG37.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/BillBillBillBill/NewsCrawler/HEAD/论文相关文件/WechatIMG37.png
--------------------------------------------------------------------------------
/论文相关文件/WechatIMG38.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/BillBillBillBill/NewsCrawler/HEAD/论文相关文件/WechatIMG38.png
--------------------------------------------------------------------------------
/论文相关文件/WechatIMG39.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/BillBillBillBill/NewsCrawler/HEAD/论文相关文件/WechatIMG39.png
--------------------------------------------------------------------------------
/论文相关文件/WechatIMG40.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/BillBillBillBill/NewsCrawler/HEAD/论文相关文件/WechatIMG40.png
--------------------------------------------------------------------------------
/论文相关文件/WechatIMG41.jpeg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/BillBillBillBill/NewsCrawler/HEAD/论文相关文件/WechatIMG41.jpeg
--------------------------------------------------------------------------------
/论文相关文件/WechatIMG42.jpeg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/BillBillBillBill/NewsCrawler/HEAD/论文相关文件/WechatIMG42.jpeg
--------------------------------------------------------------------------------
/论文相关文件/WechatIMG43.jpeg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/BillBillBillBill/NewsCrawler/HEAD/论文相关文件/WechatIMG43.jpeg
--------------------------------------------------------------------------------
/论文相关文件/WechatIMG44.jpeg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/BillBillBillBill/NewsCrawler/HEAD/论文相关文件/WechatIMG44.jpeg
--------------------------------------------------------------------------------
/论文相关文件/nginx配置.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/BillBillBillBill/NewsCrawler/HEAD/论文相关文件/nginx配置.png
--------------------------------------------------------------------------------
/论文相关文件/scrapy架构.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/BillBillBillBill/NewsCrawler/HEAD/论文相关文件/scrapy架构.png
--------------------------------------------------------------------------------
/论文相关文件/spider实现.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/BillBillBillBill/NewsCrawler/HEAD/论文相关文件/spider实现.png
--------------------------------------------------------------------------------
/论文相关文件/useragent.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/BillBillBillBill/NewsCrawler/HEAD/论文相关文件/useragent.png
--------------------------------------------------------------------------------
/论文相关文件/启动API服务器.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/BillBillBillBill/NewsCrawler/HEAD/论文相关文件/启动API服务器.png
--------------------------------------------------------------------------------
/论文相关文件/启动spider.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/BillBillBillBill/NewsCrawler/HEAD/论文相关文件/启动spider.png
--------------------------------------------------------------------------------
/论文相关文件/基于网络爬虫的新闻采集和订阅系统的设计与实现_黄雄镖_终稿.pdf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/BillBillBillBill/NewsCrawler/HEAD/论文相关文件/基于网络爬虫的新闻采集和订阅系统的设计与实现_黄雄镖_终稿.pdf
--------------------------------------------------------------------------------
/论文相关文件/新闻推送活动图.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/BillBillBillBill/NewsCrawler/HEAD/论文相关文件/新闻推送活动图.png
--------------------------------------------------------------------------------
/论文相关文件/新闻订阅活动图.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/BillBillBillBill/NewsCrawler/HEAD/论文相关文件/新闻订阅活动图.png
--------------------------------------------------------------------------------
/论文相关文件/爬虫部分目录.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/BillBillBillBill/NewsCrawler/HEAD/论文相关文件/爬虫部分目录.png
--------------------------------------------------------------------------------
/论文相关文件/用例图.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/BillBillBillBill/NewsCrawler/HEAD/论文相关文件/用例图.png
--------------------------------------------------------------------------------
/论文相关文件/系统总体框架.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/BillBillBillBill/NewsCrawler/HEAD/论文相关文件/系统总体框架.png
--------------------------------------------------------------------------------
/论文相关文件/订阅与展示部分目录.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/BillBillBillBill/NewsCrawler/HEAD/论文相关文件/订阅与展示部分目录.png
--------------------------------------------------------------------------------