├── .gitignore
├── LICENSE
├── PreProcessing
    ├── data_processor.py
    └── format_converter.py
├── README.md
├── Visualize
    └── visualizer.py
├── WebCrawler
    ├── LinkExtraction.py
    ├── async_crawler.py
    ├── crawler_config.py
    ├── main.py
    └── ua_info.py
├── data
    └── raw
    │   └── detail_urls.json
└── requirements.txt

/.gitignore:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/zzz0627/DataScraping-LLMs-FineTuning/HEAD/.gitignore
--------------------------------------------------------------------------------
/LICENSE:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/zzz0627/DataScraping-LLMs-FineTuning/HEAD/LICENSE
--------------------------------------------------------------------------------
/PreProcessing/data_processor.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/zzz0627/DataScraping-LLMs-FineTuning/HEAD/PreProcessing/data_processor.py
--------------------------------------------------------------------------------
/PreProcessing/format_converter.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/zzz0627/DataScraping-LLMs-FineTuning/HEAD/PreProcessing/format_converter.py
--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/zzz0627/DataScraping-LLMs-FineTuning/HEAD/README.md
--------------------------------------------------------------------------------
/Visualize/visualizer.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/zzz0627/DataScraping-LLMs-FineTuning/HEAD/Visualize/visualizer.py
--------------------------------------------------------------------------------
/WebCrawler/LinkExtraction.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/zzz0627/DataScraping-LLMs-FineTuning/HEAD/WebCrawler/LinkExtraction.py
--------------------------------------------------------------------------------
/WebCrawler/async_crawler.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/zzz0627/DataScraping-LLMs-FineTuning/HEAD/WebCrawler/async_crawler.py
--------------------------------------------------------------------------------
/WebCrawler/crawler_config.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/zzz0627/DataScraping-LLMs-FineTuning/HEAD/WebCrawler/crawler_config.py
--------------------------------------------------------------------------------
/WebCrawler/main.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/zzz0627/DataScraping-LLMs-FineTuning/HEAD/WebCrawler/main.py
--------------------------------------------------------------------------------
/WebCrawler/ua_info.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/zzz0627/DataScraping-LLMs-FineTuning/HEAD/WebCrawler/ua_info.py
--------------------------------------------------------------------------------
/data/raw/detail_urls.json:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/zzz0627/DataScraping-LLMs-FineTuning/HEAD/data/raw/detail_urls.json
--------------------------------------------------------------------------------
/requirements.txt:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/zzz0627/DataScraping-LLMs-FineTuning/HEAD/requirements.txt
--------------------------------------------------------------------------------