├── LICENSE
├── README.md
├── lessreal-data.csv
└── web.py


/LICENSE:
--------------------------------------------------------------------------------
MIT License

Copyright (c) 2021 coderdudex

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.

--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
# Less-Real-Data
A collection of quotes scraped from less-real.com.

I found the website a long time ago and decided to scrape its whole quote collection before the site goes down, since it is slowly dying.

## Data
There are around 8000 quotes. I collected them in the format ***(id,anime,character,quote)***

ID | Anime | Character | Quote
------------ | ------------- | ------------- | -------------
3639 | (Hyouka) | Oreki Houtarou | I'm not stupid. I'm just too lazy to show how smart I am.

Some of the quotes are missing for some reason. I did not go through all of the roughly 8000 quotes to determine which ones are missing.

## What can this be used for?
This can be used to build an API that serves a collection of anime quotes.

--------------------------------------------------------------------------------
/web.py:
--------------------------------------------------------------------------------
import requests
from bs4 import BeautifulSoup
import pandas as pd

from time import sleep
# from random import randint
from tqdm import tqdm

headers = {"Accept-Language": "en-US,en;q=0.5"}

animes = []
characters = []
quotes = []

# Listing pages are numbered 1 through 442.
pages = range(1, 443)

for page_number in tqdm(pages):

    # Fetch one listing page; keep the response separate from the page number.
    response = requests.get("https://www.less-real.com/?p=" + str(page_number), headers=headers)

    soup = BeautifulSoup(response.text, 'html.parser')
    quote_div = soup.find_all('div', class_='quote')

    for items in quote_div:

        # Anime
        anime = items.find(class_="quoteAnime").text
        animes.append(anime)

        # Character (text of the first link inside the quote block)
        character = items.a.text
        characters.append(character)

        # Quote
        quote = items.find('span', class_="quoteText").text
        quotes.append(quote)

    # Pause briefly between requests to go easy on the server.
    sleep(1)

data = {
    'Anime': animes,
    'Character': characters,
    'Quote': quotes,
}

df = pd.DataFrame(data)

# The DataFrame index is written as the first, unnamed column of the CSV.
df.to_csv('less-real_data.csv')

--------------------------------------------------------------------------------
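The README suggests the data could back an API of anime quotes. A minimal sketch of that idea follows, using only the standard library. It assumes the CSV keeps the `Anime`, `Character` and `Quote` columns written by `web.py`, with the pandas index in an unnamed first column. The helper names `load_quotes` and `quotes_by_anime` are hypothetical, and the sample row is the one shown in the README table; a real run would open the CSV file instead of the inline string.

```python
import csv
import io
import random

# Sample data: the header pandas writes by default plus the README's example row.
SAMPLE_CSV = """,Anime,Character,Quote
3639,(Hyouka),Oreki Houtarou,I'm not stupid. I'm just too lazy to show how smart I am.
"""


def load_quotes(handle):
    """Read CSV rows into dicts, stripping the parentheses around the anime title."""
    rows = []
    for row in csv.DictReader(handle):
        rows.append({
            "id": int(row[""]),  # the index column has an empty header
            "anime": row["Anime"].strip().strip("()"),
            "character": row["Character"].strip(),
            "quote": row["Quote"].strip(),
        })
    return rows


def quotes_by_anime(rows, title):
    """Return every quote whose anime title matches, ignoring case."""
    wanted = title.casefold()
    return [r for r in rows if r["anime"].casefold() == wanted]


quotes = load_quotes(io.StringIO(SAMPLE_CSV))
hyouka = quotes_by_anime(quotes, "hyouka")
print(random.choice(hyouka)["quote"])
```

To run this against the real data, pass an open file handle to `load_quotes`, for example `open("lessreal-data.csv", newline="", encoding="utf-8")`, assuming that file has the same columns.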