└── README.md

/README.md:
--------------------------------------------------------------------------------
# Audible full scraper
Scrape the metadata and audio samples of Audible's audiobooks using Python/Selenium

Python code that iterates over Audible's search results and scrapes information on the audiobooks, including:
* URL
* Title
* Author
* Narrator
* Price
* Publisher
* Category and sub-category
* Release date
* Length
* Language
* Ratings (separately for each of the 1- to 5-star ratings):
  * Overall
  * Performance
  * Story
* Audio sample (mp3 URL), including the option to download the sample directly

The scraping is performed in parallel, and includes a wait function to avoid overwhelming the server (or more likely, to avoid being banned).

TODO:
* Scrape the ratings and text of user comments (which can then be processed by NLP etc.)
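
The parallel scraping with a wait step described above could look roughly like the sketch below. This is not the repository's actual code: `polite_wait`, `scrape_all`, the `scrape_one` callback, and the default delay values are all hypothetical names and numbers chosen for illustration. It assumes a thread pool from the standard library and a random pause before each request.

```python
import random
import time
from concurrent.futures import ThreadPoolExecutor


def polite_wait(min_seconds=1.0, max_seconds=3.0):
    """Sleep for a random interval and return the delay that was used."""
    delay = random.uniform(min_seconds, max_seconds)
    time.sleep(delay)
    return delay


def scrape_all(urls, scrape_one, max_workers=4, min_wait=1.0, max_wait=3.0):
    """Call scrape_one(url) for every URL in parallel, pausing before each call.

    scrape_one is a hypothetical callback that returns the metadata
    for a single audiobook page (for example, as a dict).
    """
    def task(url):
        # Randomized pause so requests are not sent in a tight burst.
        polite_wait(min_wait, max_wait)
        return scrape_one(url)

    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map returns results in the same order as the input URLs.
        return list(pool.map(task, urls))


if __name__ == "__main__":
    # Stand-in scraper so the sketch runs without a browser or network access.
    fake_scraper = lambda url: {"url": url, "title": "placeholder"}
    print(scrape_all(["book-1", "book-2"], fake_scraper, min_wait=0.0, max_wait=0.1))
```

With Selenium, each worker would most likely need its own WebDriver instance, since a single driver should not be shared across threads. Keeping `max_workers` small and the wait range generous trades speed for a lower chance of being blocked.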