├── .gitignore
├── LICENSE
├── README.md
├── all_words
├── cnki.py
├── collection_utils
    ├── __init__.py
    └── collection_utils.py
├── crawler
    ├── __init__.py
    ├── cnki
    │   ├── __init__.py
    │   ├── cnki_class.py
    │   └── constants.py
    └── crawler.py
├── doc
    └── uml
        └── process.puml
├── feature_extractor.py.lprof
├── feature_extractor
    ├── __init__.py
    └── feature_extractor.py
├── mongo_utils
    ├── __init__.py
    └── mongo_utils.py
└── resource
    ├── logging.conf
    └── stop_word

/.gitignore:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/roliygu/CNKICrawler/HEAD/.gitignore
--------------------------------------------------------------------------------
/LICENSE:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/roliygu/CNKICrawler/HEAD/LICENSE
--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/roliygu/CNKICrawler/HEAD/README.md
--------------------------------------------------------------------------------
/all_words:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/roliygu/CNKICrawler/HEAD/all_words
--------------------------------------------------------------------------------
/cnki.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/roliygu/CNKICrawler/HEAD/cnki.py
--------------------------------------------------------------------------------
/collection_utils/__init__.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/roliygu/CNKICrawler/HEAD/collection_utils/__init__.py
--------------------------------------------------------------------------------
/collection_utils/collection_utils.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/roliygu/CNKICrawler/HEAD/collection_utils/collection_utils.py
--------------------------------------------------------------------------------
/crawler/__init__.py:
--------------------------------------------------------------------------------
__author__ = 'roliy'

--------------------------------------------------------------------------------
/crawler/cnki/__init__.py:
--------------------------------------------------------------------------------
#!/usr/bin/python
# coding=utf-8

__author__ = 'roliy'

--------------------------------------------------------------------------------
/crawler/cnki/cnki_class.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/roliygu/CNKICrawler/HEAD/crawler/cnki/cnki_class.py
--------------------------------------------------------------------------------
/crawler/cnki/constants.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/roliygu/CNKICrawler/HEAD/crawler/cnki/constants.py
--------------------------------------------------------------------------------
/crawler/crawler.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/roliygu/CNKICrawler/HEAD/crawler/crawler.py
--------------------------------------------------------------------------------
/doc/uml/process.puml:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/roliygu/CNKICrawler/HEAD/doc/uml/process.puml
--------------------------------------------------------------------------------
/feature_extractor.py.lprof:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/roliygu/CNKICrawler/HEAD/feature_extractor.py.lprof
--------------------------------------------------------------------------------
/feature_extractor/__init__.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/roliygu/CNKICrawler/HEAD/feature_extractor/__init__.py
--------------------------------------------------------------------------------
/feature_extractor/feature_extractor.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/roliygu/CNKICrawler/HEAD/feature_extractor/feature_extractor.py
--------------------------------------------------------------------------------
/mongo_utils/__init__.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/roliygu/CNKICrawler/HEAD/mongo_utils/__init__.py
--------------------------------------------------------------------------------
/mongo_utils/mongo_utils.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/roliygu/CNKICrawler/HEAD/mongo_utils/mongo_utils.py
--------------------------------------------------------------------------------
/resource/logging.conf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/roliygu/CNKICrawler/HEAD/resource/logging.conf
--------------------------------------------------------------------------------
/resource/stop_word:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/roliygu/CNKICrawler/HEAD/resource/stop_word
--------------------------------------------------------------------------------