├── .gitignore
├── requirements.txt
├── .flake8
└── README.md

/.gitignore:
--------------------------------------------------------------------------------
.idea/
.vscode/
*.iml
.env
.DS_Store
venv/
.pytest_cache/
**__pycache__/

--------------------------------------------------------------------------------
/requirements.txt:
--------------------------------------------------------------------------------
flake8==5.0.4
flake8-annotations==2.9.1
flake8-quotes==3.3.1
flake8-variables-names==0.0.5
pep8-naming==0.13.2

--------------------------------------------------------------------------------
/.flake8:
--------------------------------------------------------------------------------
[flake8]
inline-quotes = "
ignore = E203, E266, W503, ANN002, ANN003, ANN101, ANN102, ANN401, N807, N818
max-line-length = 79
max-complexity = 18
select = B,C,E,F,W,T4,B9,ANN,Q0,N8,VNE
exclude = venv, tests

--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
# Scrape books

- Read [the guideline](https://github.com/mate-academy/py-task-guideline/blob/main/README.md) before you start


## Task
In this task you will scrape the https://books.toscrape.com/ website.
For each of the 1000 books, parse the following information:
- title
- price
- amount_in_stock
- rating
- category
- description
- upc

Use the `scrapy` framework for parsing, and implement only one spider to do the job.

When you are done, save all books to a `books.jl` file and commit it.
This task doesn't have auto-tests, so test it manually.
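Since there are no auto-tests, a small script can help with the manual check. The sketch below is one possible approach and is not part of the task. It assumes that each line of `books.jl` holds one JSON object whose keys are named exactly as in the list above. `check_jl_lines` and `check_jl_file` are hypothetical helper names.

```python
import json
from typing import Iterable, List, Tuple

# The seven fields listed in the task description.
REQUIRED_FIELDS = {
    "title",
    "price",
    "amount_in_stock",
    "rating",
    "category",
    "description",
    "upc",
}


def check_jl_lines(lines: Iterable[str]) -> Tuple[int, List[str]]:
    """Return (number of parsed records, list of problems found)."""
    problems: List[str] = []
    count = 0
    for number, line in enumerate(lines, start=1):
        if not line.strip():
            continue  # tolerate blank lines
        try:
            record = json.loads(line)
        except json.JSONDecodeError as error:
            problems.append(f"line {number}: invalid JSON ({error.msg})")
            continue
        count += 1
        if not isinstance(record, dict):
            problems.append(f"line {number}: not a JSON object")
            continue
        missing = REQUIRED_FIELDS - record.keys()
        if missing:
            problems.append(f"line {number}: missing {sorted(missing)}")
    return count, problems


def check_jl_file(path: str = "books.jl") -> Tuple[int, List[str]]:
    """Run the same check on a file; the default path is an assumption."""
    with open(path, encoding="utf-8") as file:
        return check_jl_lines(file)
```

After a crawl, `check_jl_file()` should report 1000 records and an empty problem list if every book was scraped with all seven fields.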

Hints:
- search the scrapy documentation for the information you need;
- use scrapy best practices and learn how to learn new frameworks;
- make your code as clean as possible;
- split the scraping into separate steps to keep the code cleaner.

--------------------------------------------------------------------------------
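One way to read the last hint is to keep text clean-up in small pure functions that the spider calls after it has extracted the raw strings. The sketch below is only an illustration, not the required solution. It assumes the site exposes the rating as a CSS class such as `star-rating Three`, availability as text such as `In stock (22 available)`, and the price as text such as `£51.77`. All function names are hypothetical.

```python
import re

RATING_WORDS = {"One": 1, "Two": 2, "Three": 3, "Four": 4, "Five": 5}


def parse_rating(css_class: str) -> int:
    """Turn a class string such as "star-rating Three" into 3."""
    for word in css_class.split():
        if word in RATING_WORDS:
            return RATING_WORDS[word]
    raise ValueError(f"no rating word in {css_class!r}")


def parse_amount_in_stock(availability: str) -> int:
    """Turn "In stock (22 available)" into 22; 0 if no amount is given."""
    match = re.search(r"\((\d+) available\)", availability)
    return int(match.group(1)) if match else 0


def parse_price(price_text: str) -> float:
    """Turn a price string such as "£51.77" into 51.77."""
    match = re.search(r"\d+(?:\.\d+)?", price_text)
    if match is None:
        raise ValueError(f"no number in {price_text!r}")
    return float(match.group())
```

Because these helpers do not depend on `scrapy`, they can be checked in a Python shell without running a crawl.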