├── .gitignore ├── Localebnb.pptx ├── README.md ├── app.py ├── app_production.py ├── data ├── amenities.csv ├── city_list.csv ├── city_list_df.pkl └── neighborhood_list.csv ├── libs ├── __init__.py ├── __init__.pyc ├── airbnb │ ├── __init__.py │ ├── __init__.pyc │ ├── airbnblisting.py │ ├── airbnblisting.pyc │ ├── airbnblisting_2.py │ ├── airbnbneighborhood.py │ ├── airbnbneighborhood.pyc │ ├── airbnbsearchresult.py │ └── airbnbsearchresult.pyc ├── app_helper.py ├── app_helper.pyc ├── extract_features_from_listings.py ├── extract_features_from_neighborhoods.py ├── extract_listings_from_search_results.py ├── generate_mnb_models.py ├── generate_svm_models.py ├── notebook_0_random-updates.ipynb ├── notebook_1_data-eng-etl.ipynb ├── notebook_2_eda-analysis.ipynb ├── notebook_3_predictive-modeling.ipynb ├── notebook_4_generate-models.ipynb ├── scrape_listings.py ├── scrape_neighborhood_list.py ├── scrape_neighborhoods.py └── scrape_search_results.py ├── model_results ├── artsy_results.csv ├── artsy_svc_results.csv ├── dining_results.csv ├── dining_svc_results.csv ├── nightlife_results.csv ├── nightlife_svc_results.csv ├── shopping_results.csv └── shopping_svc_results.csv ├── models ├── mnb_artsy.pkl ├── mnb_artsy_final.pkl ├── mnb_dining.pkl ├── mnb_dining_final.pkl ├── mnb_nightlife.pkl ├── mnb_nightlife_final.pkl ├── mnb_shopping.pkl ├── mnb_shopping_final.pkl ├── svc_artsy_final.pkl ├── svc_dining_final.pkl ├── svc_nightlife_final.pkl ├── svc_shopping_final.pkl ├── tfidf.pkl ├── tfidf_artsy.pkl ├── tfidf_dining.pkl ├── tfidf_nightlife.pkl ├── tfidf_shopping.pkl ├── tfidf_svc.pkl ├── tfidf_svc_artsy.pkl ├── tfidf_svc_dining.pkl ├── tfidf_svc_nightlife.pkl ├── tfidf_svc_shopping.pkl ├── top_words_artsy.pkl ├── top_words_dining.pkl ├── top_words_nightlife.pkl └── top_words_shopping.pkl ├── other └── error_request.pkl ├── static ├── css │ ├── bootstrap-slider.css │ ├── bootstrap-theme.css │ ├── bootstrap-theme.css.map │ ├── bootstrap-theme.min.css │ ├── bootstrap.css │ 
├── bootstrap.css.map │ ├── bootstrap.min.css │ ├── localebnb_main.css │ ├── navbar-fixed-top.css │ └── sortable-theme-bootstrap-for-localebnb.css ├── fonts │ ├── glyphicons-halflings-regular.eot │ ├── glyphicons-halflings-regular.svg │ ├── glyphicons-halflings-regular.ttf │ ├── glyphicons-halflings-regular.woff │ └── glyphicons-halflings-regular.woff2 ├── img │ ├── alamo_square.jpg │ ├── location_heart_ico.png │ ├── location_ico.png │ ├── presentation_insights.jpg │ ├── presentation_methodology.jpg │ ├── presentation_solution.jpg │ └── presentation_title.jpg └── js │ ├── bootstrap-slider.js │ ├── bootstrap.js │ ├── bootstrap.min.js │ ├── ie-emulation-modes-warning.js │ ├── ie10-viewport-bug-workaround.js │ ├── ie8-responsive-file-warning.js │ ├── jquery-1.11.3.min.js │ ├── npm.js │ ├── raw-files.min.js │ ├── sortable.js │ └── sortable.min.js └── templates ├── contact.html ├── index.html ├── listing.html └── search.html /.gitignore: -------------------------------------------------------------------------------- 1 | _internal/ 2 | libs/.ipynb_checkpoints* 3 | other/* 4 | *.pyc -------------------------------------------------------------------------------- /Localebnb.pptx: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/gscottstukey/Localebnb/c48b21995606602a48ac8e3740edb34383ea14ea/Localebnb.pptx -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 | # [Localebnb](http://www.localebnb.co): An Airbnb Contextual Recommendation App 2 | 3 | ### G Scott Stukey, Zipfian Academy, April 2015 - July 2015 4 | 5 | ![](/static/img/presentation_title.jpg) 6 | 7 | 8 | ## Overview 9 | 10 | The motivation for this project was: When booking a private residence, how do you find the perfect neighborhood? 
11 | 12 | It stemmed from my personal frustrations with Airbnb's search functionality while booking in Montreal. I knew that I wanted to stay in a "trendier" neighborhood, but away from tourists & nightlife. While I could search & filter Airbnb's search results by neighborhood, I had no idea which neighborhoods met my criteria! 13 | 14 | Localebnb aimed to be that contextual recommender for Airbnb. 15 | 16 | Using Airbnb listing descriptions (features) + Airbnb's neighborhood guides for traits (target), I built an app that predicts whether a listing is in a neighborhood with a specified trait, and then I use that information to score & re-sort the default search results provided by Airbnb. 17 | 18 | ![](/static/img/presentation_solution.jpg) 19 | 20 | ## How to Use 21 | 22 | *Note: it is best to use this app on desktop with a large window* 23 | * Go to the [Localebnb app](http://localebnb.co) - note: this project is no longer live as of late 2015... 24 | * Enter your search criteria (city, dates, guests), as well as neighborhood trait preferences ('is artsy', 'has shopping', etc.) 25 | * Click "Search Airbnb" - this scrapes Airbnb's search results & listings, predicts the traits for each listing, then scores & re-sorts the search results 26 | * On the search result page, you can re-sort by clicking a column header. You can also change your preferences and see how that changes the search results. 27 | * If you hover over a listing, a pop-up appears in the map. You can click through to the listing's Airbnb page, as well as to additional information about the listing's description 28 | 29 | 30 | ## Dataset 31 | 32 | ![](/static/img/presentation_methodology.jpg) 33 | 34 | I scraped 4 types of pages across Airbnb for data: 35 | * Search Result Pages (e.g. https://www.airbnb.com/s/Portland--OR--United-States?checkin=09%2F18%2F2015&checkout=09%2F21%2F2015) 36 | * ~4000 Listing Pages (e.g. 
https://www.airbnb.com/rooms/14584) 37 | * [City Guides](https://www.airbnb.com/locations) for SF & NYC (e.g. https://www.airbnb.com/locations/san-francisco) 38 | * Neighborhood Guides for all neighborhoods (e.g. https://www.airbnb.com/locations/san-francisco/duboce-triangle) 39 | 40 | I mapped listings to neighborhoods & neighborhoods to traits to come up with my labeled dataset (listings -> traits). I then cleaned up the description using NLP techniques, vectorized the description using TF-IDF, and used a variety of models on this information. SVMs provided the highest accuracy (~78-82%, a 5 pt lift over a Naive Bayes model). Interestingly enough, when attempting to create a 'majority vote' ensemble (NB + SVM + Random Forest), the accuracy decreased slightly relative to each of the individual models. This suggests that each of these 3 models picks up certain features that the other 2 do not. 41 | 42 | I also ran a Doc2Vec (Word2Vec) model using the cleaned descriptions as sentences & the neighborhood traits & cities as labels. However, due to the size of my corpus, this data proved insufficient for use in Localebnb. With a much larger training set, I'd love to revisit this method. 43 | 44 | 45 | ## Additional Applications of this methodology 46 | 47 | There are many applications for this data & methodology. 48 | 49 | Why Airbnb should implement this: 50 | * **User Value:** Increase user satisfaction by increasing relevance 51 | * **Business Value (revenue):** Increase booking rate by reducing bounces (& click fatigue) 52 | * **Business Value (content team):** Guide creation of neighborhood guides in new cities 53 | * ***Word2Vec Bonus 54 | note: The inclusion of a trait model for search results would need to be tested against existing systems. Potential negatives include an increase in options (i.e. the paradox of choice) and/or contextualized search lowering the price of the listings that people book. 
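
As a rough illustration of what such a test would compare against the default ordering, the sketch below re-scores each search result by adding weighted trait predictions to a position-based default score, then re-sorts. It follows the additive form of `score_function` in `app.py`, but the helper names (`rescore`, `rerank`), the listing ids, and the default scores are hypothetical placeholders, not values from the project.

```python
# Hypothetical sketch of trait-based re-ranking; all data below is made up.
TRAITS = ["artsy", "shopping", "dining", "nightlife"]


def rescore(default_score, predictions, weights):
    """Add the weighted trait predictions to the position-based default score."""
    return default_score + sum(predictions[t] * weights[t] for t in TRAITS)


def rerank(listings, weights):
    """Return listing ids sorted by adjusted score, highest first."""
    scored = {
        l["id"]: rescore(l["default_score"], l["predictions"], weights)
        for l in listings
    }
    return sorted(scored, key=scored.get, reverse=True)


# Listings in their default order; predictions are 0/1 per trait (assumed).
listings = [
    {"id": "a", "default_score": 0.30,
     "predictions": {"artsy": 0, "shopping": 1, "dining": 0, "nightlife": 1}},
    {"id": "b", "default_score": 0.20,
     "predictions": {"artsy": 1, "shopping": 0, "dining": 1, "nightlife": 0}},
    {"id": "c", "default_score": 0.10,
     "predictions": {"artsy": 1, "shopping": 0, "dining": 0, "nightlife": 0}},
]

# Preference weights on the same -1.0 .. 1.0 scale the app uses.
weights = {"artsy": 1.0, "shopping": 0.0, "dining": 0.5, "nightlife": -1.0}

order = rerank(listings, weights)
print(order)
```

With these placeholder values, a listing predicted to be in a nightlife neighborhood drops below the others for a user who dislikes nightlife, even though it ranked first by default.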
55 | 56 | reference: 57 | 58 | * https://www.airbnb.com/support/article/39 59 | * http://nerds.airbnb.com/location-relevance/ 60 | * http://nerds.airbnb.com/host-preferences/ 61 | 62 | 63 | ## Extensions 64 | * Scrape more descriptions across more cities beyond SF & NYC (as neighborhood names & major street names were highly predictive in most models) 65 | * Include additional listing information in models 66 | * Make neighborhood traits more fluid by giving partial weight to nearby neighborhoods (utilizing graph analytics) 67 | * Revisit Doc2Vec model on a larger corpus & potential applications of Doc2Vec 68 | 69 | 70 | ## Toolkit + Credits 71 | 1. [iPython & iPython Notebook](http://ipython.org/notebook.html) - IDE for python; used to test code snippets & explore data 72 | 2. [MongoDB](http://www.mongodb.org/) - a NoSQL database; used for storing my scrapes 73 | * [pymongo](https://github.com/mongodb/mongo-python-driver) - A python wrapper for MongoDB 74 | 3. [Requests](http://docs.python-requests.org/en/latest/) - A python library used in scraping tasks for getting webpage html. 75 | 4. [BeautifulSoup](http://www.crummy.com/software/BeautifulSoup/) - A python html-parsing library. It makes it much easier to pull out particular elements from a complex webpage. 76 | 5. [pickle](https://docs.python.org/2/library/pickle.html) - A python library for serializing objects; used for saving requests objects for later parsing 77 | 6. [time](https://docs.python.org/2/library/time.html) & [datetime](https://docs.python.org/2/library/datetime.html) - Python libraries for time related functions; used for logging times of scrapes & pausing the scripts between scraping 78 | 7. - A python library for datetime related functions; used for parsing datetime objects in Pandas 79 | 8. [pandas](http://pandas.pydata.org/) - provides high-performance, easy-to-use data structures and data analysis tools for Python; used for basic data manipulation & some file reading 80 | 9. 
[NumPy](http://www.numpy.org/) - the fundamental package for scientific computing with Python; used for math functionality 81 | 10. [scikit-learn](http://scikit-learn.org/stable/) - data modeling library 82 | 11. [nltk](http://www.nltk.org/) - library for NLP 83 | 12. [Flask](http://flask.pocoo.org/) - a python framework for creating web apps. 84 | 13. [gensim's Word2Vec & Doc2Vec](https://radimrehurek.com/gensim/models/word2vec.html) - a library for learning vector representations of words & documents, which capture the meaning of words from their context. While not included in the final app, some EDA & testing was done with this model. With a larger corpus, it's likely that a Doc2Vec model would be used. 85 | 14. [moz](https://moz.com/blog/google-organic-click-through-rates-in-2014) - the Google SERP CTR by position served as an inspiration & starting point for my scoring system. 86 | 87 | Also, thanks to [Galvanize (a.k.a. Zipfian Academy)](http://www.zipfianacademy.com/) & its instructors for an amazing education. 88 | 89 | A special thank you (/slash/ apology) to Airbnb, whose amazing service was an inspiration for this project. 
I hope you are inspired by Localebnb to explore including neighborhood description search/filtering functionality in your search. 90 | 91 | -G Scott Stukey 92 | 93 | 94 | ## Glossary of Terms 95 | * [TF-IDF aka Term Frequency - Inverse Document Frequency](http://en.wikipedia.org/wiki/Tf%E2%80%93idf) 96 | * [Naive-Bayes Classification](https://en.wikipedia.org/wiki/Naive_Bayes_classifier) 97 | * [Support Vector Machines](https://en.wikipedia.org/wiki/Support_vector_machine) 98 | * [Word2Vec](http://code.google.com/p/word2vec/) 99 | -------------------------------------------------------------------------------- /app.py: -------------------------------------------------------------------------------- 1 | from flask import Flask 2 | from flask import render_template 3 | from flask import request 4 | from pymongo import MongoClient 5 | import pickle 6 | app = Flask(__name__) 7 | from libs.airbnb.airbnbsearchresult import AirBnBSearchResult 8 | from libs.airbnb.airbnblisting import AirBnBListing 9 | from libs.app_helper import initialize_rank_scores 10 | import numpy as np 11 | import time 12 | 13 | DB_NAME = 'airbnb' 14 | SEARCH_COLL_NAME = 'search' 15 | LISTING_COLL_NAME = 'listings' 16 | DEFAULT_RANK_SCORES = initialize_rank_scores() 17 | 18 | AIR_S = AirBnBSearchResult(db_name=DB_NAME, coll_name=SEARCH_COLL_NAME) 19 | AIR_L = AirBnBListing(db_name=DB_NAME, coll_name=LISTING_COLL_NAME) 20 | PAUSE_BETWEEN_LISTING_SCRAPES = 0.5 21 | 22 | # Load my pickled models into the app 23 | # TFIDF = pickle.load(open('models/tfidf.pkl')) 24 | # MODEL_ARTSY = pickle.load(open('models/mnb_artsy.pkl')) 25 | # MODEL_DINING = pickle.load(open('models/mnb_dining.pkl')) 26 | # MODEL_NIGHTLIFE = pickle.load(open('models/mnb_nightlife.pkl')) 27 | # MODEL_SHOPPING = pickle.load(open('models/mnb_shopping.pkl')) 28 | 29 | TFIDF = pickle.load(open('models/tfidf_svc.pkl')) 30 | MODEL_ARTSY = pickle.load(open('models/svc_artsy_final.pkl')) 31 | MODEL_DINING = 
pickle.load(open('models/svc_dining_final.pkl')) 32 | MODEL_NIGHTLIFE = pickle.load(open('models/svc_nightlife_final.pkl')) 33 | MODEL_SHOPPING = pickle.load(open('models/svc_shopping_final.pkl')) 34 | WORDS_ARTSY = pickle.load(open('models/top_words_artsy.pkl')) 35 | WORDS_DINING = pickle.load(open('models/top_words_dining.pkl')) 36 | WORDS_NIGHTLIFE = pickle.load(open('models/top_words_nightlife.pkl')) 37 | WORDS_SHOPPING = pickle.load(open('models/top_words_shopping.pkl')) 38 | 39 | DEFAULT_VAL = 0.1 40 | 41 | 42 | # G SCOTT: Is this the most efficient way to do this? 43 | CITIES = [{'id':1, 'label':"San Francisco", 'city':'San-Francisco', 'state':'CA', 'country':'United-States'}, 44 | {'id':2, 'label':"New York", 'city':'New-York', 'state':'NY', 'country':'United-States'}] 45 | 46 | CITY_DICT = {x['city']:{'id':x['id'], 'label':x['label'], 'state':x['state'], 'country':x['country']} for x in CITIES} 47 | 48 | TRAITS = ['artsy', 'shopping', 'dining', 'nightlife'] 49 | # G SCOTT: Make this a function 50 | WEIGHTS = [{'id':'1', 'value':-1.0, 'label':"hate"}, 51 | {'id':'2', 'value':-0.5, 'label':"meh"}, 52 | {'id':'3', 'value':0.0, 'label':"average"}, 53 | {'id':'4', 'value':0.5, 'label':"like"}, 54 | {'id':'5', 'value':1.0, 'label':"love"}] 55 | 56 | def score_function(default_score, 57 | is_artsy, artsy_weight, 58 | is_shopping, shopping_weight, 59 | is_dining, dining_weight, 60 | is_nightlife, nightlife_weight): 61 | score_multiplier = is_artsy * artsy_weight + \ 62 | is_shopping * shopping_weight + \ 63 | is_dining * dining_weight + \ 64 | is_nightlife * nightlife_weight 65 | return default_score + score_multiplier 66 | 67 | @app.route('/') 68 | def index(): 69 | return render_template('index.html', cities=CITIES, traits=TRAITS, trait_weights=WEIGHTS) 70 | 71 | 72 | @app.route('/search', methods=['POST']) 73 | def search(): 74 | # liveflag = request.form['livecacheRadio'] == "live" 75 | 76 | if request.form['cityRadio'] != "live": 77 | city = 
request.form['cityRadio'] 78 | state = CITY_DICT[city]['state'] 79 | liveflag = False 80 | if city == "New-York": 81 | checkin = '06-23-2015' 82 | checkout = '06-25-2015' 83 | else: 84 | checkin = '06-30-2015' 85 | checkout = '07-02-2015' 86 | else: 87 | city_input = request.form['cityWriteIn'] 88 | city, state = [x.strip() for x in city_input.split(', ')] 89 | city = city.replace(' ','-') # Turn the city into AirBnB format 90 | liveflag = True 91 | checkin_input = request.form['checkinDate'].split('-') 92 | checkout_input = request.form['checkoutDate'].split('-') 93 | checkin = "-".join([checkin_input[1], checkin_input[2], checkin_input[0]]) 94 | checkout = "-".join([checkout_input[1], checkout_input[2], checkout_input[0]]) 95 | # checkin = '06-30-2015' # Update with less contrived example 96 | num_guests = request.form['numGuests'] 97 | 98 | artsy_weight_input = request.form['artsyRadio'] 99 | artsy_weight = float(artsy_weight_input) 100 | shopping_weight_input = request.form['shoppingRadio'] 101 | shopping_weight = float(shopping_weight_input) 102 | dining_weight_input = request.form['diningRadio'] 103 | dining_weight = float(dining_weight_input) 104 | nightlife_weight_input = request.form['nightlifeRadio'] 105 | nightlife_weight = float(nightlife_weight_input) 106 | 107 | 108 | # G SCOTT: Add checkout to params 109 | params = {'city':city, 110 | 'state':state, 111 | 'country':'United-States', 112 | 'checkin': checkin, 113 | 'checkout': checkout, 114 | 'guests':num_guests, 115 | 'price_max':400} 116 | 117 | 118 | AIR_S.set_params(params) 119 | 120 | if liveflag: 121 | AIR_S.scrape_from_web_for_app() # Need to confirm that the scrape went well 122 | listings = AIR_S.extract_listing_ids() 123 | if len(listings) > 10: 124 | listings = listings[:10] # limit the results to the top 10 125 | else: 126 | AIR_S.pull_one_from_db_cached(city=city) 127 | if city == "San-Francisco": 128 | listings = AIR_S.extract_listing_ids()[:11] 129 | else: 130 | listings = 
AIR_S.extract_listing_ids()[:9] 131 | 132 | 133 | 134 | listing_dict = {l:{'id':l} for l in listings} 135 | thumbnail_data = AIR_S.extract_thumbnail_data() 136 | 137 | #initialize values for finding the middle of the map 138 | max_lat = -360.0 139 | min_lat = 360.0 140 | max_lng = -360.0 141 | min_lng = 360.0 142 | 143 | for i, listing in enumerate(listings): 144 | if liveflag: 145 | AIR_L.scrape_from_web_for_app(listing_id=listing) 146 | listing_dict[listing]['url'] = AIR_L.url 147 | time.sleep(PAUSE_BETWEEN_LISTING_SCRAPES) 148 | else: 149 | AIR_L.pull_from_db_cached(listing_id=listing) 150 | # added this after the production DB errored out 151 | listing_dict[listing]['url'] = AIR_L.BASE_ROOM_URL + listing 152 | 153 | # listing_dict[listing]['url'] = AIR_L.url 154 | listing_dict[listing]['default_position'] = i+1 155 | listing_dict[listing]['default_score'] = DEFAULT_RANK_SCORES[i] 156 | 157 | listing_dict[listing]['thumbnail_img'] = thumbnail_data[listing]['thumbnail_img'] 158 | listing_dict[listing]['blurb'] = thumbnail_data[listing]['blurb'] 159 | listing_dict[listing]['thumbnail_price'] = thumbnail_data[listing]['thumbnail_price'] 160 | listing_dict[listing]['listing_type'] = thumbnail_data[listing]['listing_type'] 161 | listing_dict[listing]['lng'] = thumbnail_data[listing]['lng'] 162 | listing_dict[listing]['lat'] = thumbnail_data[listing]['lat'] 163 | 164 | max_lng = max(max_lng, float(listing_dict[listing]['lng'])) 165 | min_lng = min(min_lng, float(listing_dict[listing]['lng'])) 166 | max_lat = max(max_lat, float(listing_dict[listing]['lat'])) 167 | min_lat = min(min_lat, float(listing_dict[listing]['lat'])) 168 | 169 | 170 | if liveflag: 171 | description_clean = AIR_L.extract_clean_description() 172 | if description_clean != "": 173 | vectorized_desc = TFIDF.transform([description_clean]).toarray() 174 | listing_dict[listing]['is_artsy'] = int(MODEL_ARTSY.predict(vectorized_desc)[0]) 175 | listing_dict[listing]['is_shopping'] = 
int(MODEL_SHOPPING.predict(vectorized_desc)[0]) 176 | listing_dict[listing]['is_dining'] = int(MODEL_DINING.predict(vectorized_desc)[0]) 177 | listing_dict[listing]['is_nightlife'] = int(MODEL_NIGHTLIFE.predict(vectorized_desc)[0]) 178 | else: 179 | listing_dict[listing]['is_artsy'] = DEFAULT_VAL 180 | listing_dict[listing]['is_shopping'] = DEFAULT_VAL 181 | listing_dict[listing]['is_dining'] = DEFAULT_VAL 182 | listing_dict[listing]['is_nightlife'] = DEFAULT_VAL 183 | 184 | elif 'description_clean' in AIR_L.d: 185 | description_clean = AIR_L.d['description_clean'] 186 | vectorized_desc = TFIDF.transform([description_clean]).toarray() 187 | listing_dict[listing]['is_artsy'] = int(MODEL_ARTSY.predict(vectorized_desc)[0]) 188 | listing_dict[listing]['is_shopping'] = int(MODEL_SHOPPING.predict(vectorized_desc)[0]) 189 | listing_dict[listing]['is_dining'] = int(MODEL_DINING.predict(vectorized_desc)[0]) 190 | listing_dict[listing]['is_nightlife'] = int(MODEL_NIGHTLIFE.predict(vectorized_desc)[0]) 191 | else: 192 | # G SCOTT: consider checking to see if the neighborhood is artsy or not based on my data 193 | listing_dict[listing]['is_artsy'] = DEFAULT_VAL 194 | listing_dict[listing]['is_shopping'] = DEFAULT_VAL 195 | listing_dict[listing]['is_dining'] = DEFAULT_VAL 196 | listing_dict[listing]['is_nightlife'] = DEFAULT_VAL 197 | 198 | score_multiplier = listing_dict[listing]['is_artsy'] * artsy_weight + \ 199 | listing_dict[listing]['is_shopping'] * shopping_weight + \ 200 | listing_dict[listing]['is_dining'] * dining_weight + \ 201 | listing_dict[listing]['is_nightlife'] * nightlife_weight 202 | 203 | listing_dict[listing]['score'] = score_function(default_score = DEFAULT_RANK_SCORES[i], 204 | is_artsy = listing_dict[listing]['is_artsy'], 205 | artsy_weight = artsy_weight, 206 | is_shopping = listing_dict[listing]['is_shopping'], 207 | shopping_weight = shopping_weight, 208 | is_dining = listing_dict[listing]['is_dining'], 209 | dining_weight = dining_weight, 210 | 
is_nightlife = listing_dict[listing]['is_nightlife'], 211 | nightlife_weight = nightlife_weight) 212 | 213 | 214 | sorted_listings = sorted(listing_dict, key=lambda x:listing_dict[x]['score'], reverse=True) 215 | for i, listing in enumerate(sorted_listings): 216 | listing_dict[listing]['position'] = i+1 217 | 218 | map_center = ((max_lat+min_lat)/2, (max_lng+min_lng)/2) 219 | # return str(checkin_test) 220 | # return city_text 221 | # return str(city['value']) 222 | return render_template('search.html', sorted_listings = sorted_listings, listing_dict=listing_dict, map_center=map_center, city=city, state=state, traits=TRAITS, trait_weights=WEIGHTS, artsy_weight=artsy_weight, shopping_weight=shopping_weight, dining_weight=dining_weight, nightlife_weight=nightlife_weight) 223 | # return str(listing_dict) 224 | # return AIR_S.r.content # For debugging: used to show the cached AirBnB search from MongoDB 225 | 226 | 227 | @app.route('/listing/<listing_id>') 228 | def listing(listing_id): 229 | # cached scenario 230 | # AIR_L. 231 | # thumbnail_data = AIR_S.extract_thumbnail_data() 232 | # listing_index = listings.index(listing_id) 233 | # return str(listing_index) 234 | 235 | AIR_L.pull_from_db_cached(listing_id=listing_id) 236 | description_raw = AIR_L.d['description_raw'] 237 | description_raw_html = description_raw.replace('\n', "<br>") 
238 | description_clean = AIR_L._clean_description(description_raw) 239 | artsy_words=[] 240 | shopping_words=[] 241 | dining_words=[] 242 | nightlife_words=[] 243 | for word in description_clean.split(): 244 | if word in WORDS_ARTSY: 245 | artsy_words.append(word) 246 | if word in WORDS_SHOPPING: 247 | shopping_words.append(word) 248 | if word in WORDS_DINING: 249 | dining_words.append(word) 250 | if word in WORDS_NIGHTLIFE: 251 | nightlife_words.append(word) 252 | 253 | listing_words_artsy = set(artsy_words) 254 | listing_words_shopping = set(shopping_words) 255 | listing_words_dining = set(dining_words) 256 | listing_words_nightlife = set(nightlife_words) 257 | 258 | # return description_raw_html 259 | return render_template('listing.html', description_raw_html=description_raw_html, listing_words_artsy=listing_words_artsy, listing_words_shopping=listing_words_shopping, listing_words_dining=listing_words_dining, listing_words_nightlife=listing_words_nightlife) 260 | 261 | 262 | @app.route('/about') 263 | def about(): 264 | return "https://www.linkedin.com/in/gscottstukey" 265 | 266 | @app.route('/contact') 267 | def contact(): 268 | return render_template('contact.html') 269 | 270 | if __name__ == '__main__': 271 | app.run(host='0.0.0.0', port=7777, debug=True) 272 | 273 | -------------------------------------------------------------------------------- /app_production.py: -------------------------------------------------------------------------------- 1 | from flask import Flask 2 | from flask import render_template 3 | from flask import request 4 | from pymongo import MongoClient 5 | import pickle 6 | app = Flask(__name__) 7 | from libs.airbnb.airbnbsearchresult import AirBnBSearchResult 8 | from libs.airbnb.airbnblisting import AirBnBListing 9 | from libs.app_helper import initialize_rank_scores 10 | import numpy as np 11 | import time 12 | 13 | DB_NAME = 'airbnb2' 14 | SEARCH_COLL_NAME = 'search' 15 | LISTING_COLL_NAME = 'listings' 16 | DEFAULT_RANK_SCORES 
= initialize_rank_scores() 17 | 18 | AIR_S = AirBnBSearchResult(db_name=DB_NAME, coll_name=SEARCH_COLL_NAME) 19 | AIR_L = AirBnBListing(db_name=DB_NAME, coll_name=LISTING_COLL_NAME) 20 | PAUSE_BETWEEN_LISTING_SCRAPES = 1 21 | 22 | # Load my pickled models into the app 23 | # TFIDF = pickle.load(open('models/tfidf.pkl')) 24 | # MODEL_ARTSY = pickle.load(open('models/mnb_artsy.pkl')) 25 | # MODEL_DINING = pickle.load(open('models/mnb_dining.pkl')) 26 | # MODEL_NIGHTLIFE = pickle.load(open('models/mnb_nightlife.pkl')) 27 | # MODEL_SHOPPING = pickle.load(open('models/mnb_shopping.pkl')) 28 | 29 | TFIDF = pickle.load(open('models/tfidf_svc.pkl')) 30 | MODEL_ARTSY = pickle.load(open('models/svc_artsy_final.pkl')) 31 | MODEL_DINING = pickle.load(open('models/svc_dining_final.pkl')) 32 | MODEL_NIGHTLIFE = pickle.load(open('models/svc_nightlife_final.pkl')) 33 | MODEL_SHOPPING = pickle.load(open('models/svc_shopping_final.pkl')) 34 | WORDS_ARTSY = pickle.load(open('models/top_words_artsy.pkl')) 35 | WORDS_DINING = pickle.load(open('models/top_words_dining.pkl')) 36 | WORDS_NIGHTLIFE = pickle.load(open('models/top_words_nightlife.pkl')) 37 | WORDS_SHOPPING = pickle.load(open('models/top_words_shopping.pkl')) 38 | 39 | DEFAULT_VAL = 0.1 40 | 41 | 42 | # G SCOTT: Is this the most efficient way to do this? 
43 | CITIES = [{'id':1, 'label':"San Francisco", 'city':'San-Francisco', 'state':'CA', 'country':'United-States'}, 44 | {'id':2, 'label':"New York", 'city':'New-York', 'state':'NY', 'country':'United-States'}] 45 | 46 | CITY_DICT = {x['city']:{'id':x['id'], 'label':x['label'], 'state':x['state'], 'country':x['country']} for x in CITIES} 47 | 48 | TRAITS = ['artsy', 'shopping', 'dining', 'nightlife'] 49 | # G SCOTT: Make this a function 50 | WEIGHTS = [{'id':'1', 'value':-1.0, 'label':"hate"}, 51 | {'id':'2', 'value':-0.5, 'label':"meh"}, 52 | {'id':'3', 'value':0.0, 'label':"average"}, 53 | {'id':'4', 'value':0.5, 'label':"like"}, 54 | {'id':'5', 'value':1.0, 'label':"love"}] 55 | 56 | def score_function(default_score, 57 | is_artsy, artsy_weight, 58 | is_shopping, shopping_weight, 59 | is_dining, dining_weight, 60 | is_nightlife, nightlife_weight): 61 | score_multiplier = is_artsy * artsy_weight + \ 62 | is_shopping * shopping_weight + \ 63 | is_dining * dining_weight + \ 64 | is_nightlife * nightlife_weight 65 | return default_score + score_multiplier 66 | 67 | @app.route('/') 68 | def index(): 69 | return render_template('index.html', cities=CITIES, traits=TRAITS, trait_weights=WEIGHTS) 70 | 71 | 72 | @app.route('/search', methods=['POST']) 73 | def search(): 74 | # liveflag = request.form['livecacheRadio'] == "live" 75 | 76 | if request.form['cityRadio'] != "live": 77 | city = request.form['cityRadio'] 78 | state = CITY_DICT[city]['state'] 79 | liveflag = False 80 | if city == "New-York": 81 | checkin = '06-23-2015' 82 | checkout = '06-25-2015' 83 | else: 84 | checkin = '06-30-2015' 85 | checkout = '07-02-2015' 86 | else: 87 | city_input = request.form['cityWriteIn'] 88 | city, state = [x.strip() for x in city_input.split(', ')] 89 | city = city.replace(' ','-') # Turn the city into AirBnB format 90 | liveflag = True 91 | checkin_input = request.form['checkinDate'].split('-') 92 | checkout_input = request.form['checkoutDate'].split('-') 93 | checkin = 
"-".join([checkin_input[1], checkin_input[2], checkin_input[0]]) 94 | checkout = "-".join([checkout_input[1], checkout_input[2], checkout_input[0]]) 95 | # checkin = '06-30-2015' # Update with less contrived example 96 | num_guests = request.form['numGuests'] 97 | 98 | artsy_weight_input = request.form['artsyRadio'] 99 | artsy_weight = float(artsy_weight_input) 100 | shopping_weight_input = request.form['shoppingRadio'] 101 | shopping_weight = float(shopping_weight_input) 102 | dining_weight_input = request.form['diningRadio'] 103 | dining_weight = float(dining_weight_input) 104 | nightlife_weight_input = request.form['nightlifeRadio'] 105 | nightlife_weight = float(nightlife_weight_input) 106 | 107 | 108 | # G SCOTT: Add checkout to params 109 | params = {'city':city, 110 | 'state':state, 111 | 'country':'United-States', 112 | 'checkin': checkin, 113 | 'checkout': checkout, 114 | 'guests':num_guests, 115 | 'price_max':400} 116 | 117 | 118 | AIR_S.set_params(params) 119 | 120 | if liveflag: 121 | AIR_S.scrape_from_web_for_app() # Need to confirm that the scrape went well 122 | listings = AIR_S.extract_listing_ids() 123 | if len(listings) > 10: 124 | listings = listings[:10] 125 | else: 126 | AIR_S.pull_one_from_db_cached(city=city) 127 | if city == "San-Francisco": 128 | listings = AIR_S.extract_listing_ids()[:11] 129 | else: 130 | listings = AIR_S.extract_listing_ids()[:9] 131 | 132 | 133 | 134 | listing_dict = {l:{'id':l} for l in listings} 135 | thumbnail_data = AIR_S.extract_thumbnail_data() 136 | 137 | #initialize values for finding the middle of the map 138 | max_lat = -360.0 139 | min_lat = 360.0 140 | max_lng = -360.0 141 | min_lng = 360.0 142 | 143 | for i, listing in enumerate(listings): 144 | if liveflag: 145 | AIR_L.scrape_from_web_for_app(listing_id=listing) 146 | listing_dict[listing]['url'] = AIR_L.url 147 | time.sleep(PAUSE_BETWEEN_LISTING_SCRAPES) 148 | else: 149 | AIR_L.pull_from_db_cached(listing_id=listing) 150 | # added this after the 
production DB errored out 151 | listing_dict[listing]['url'] = AIR_L.BASE_ROOM_URL + listing 152 | 153 | # listing_dict[listing]['url'] = AIR_L.url 154 | listing_dict[listing]['default_position'] = i+1 155 | listing_dict[listing]['default_score'] = DEFAULT_RANK_SCORES[i] 156 | 157 | listing_dict[listing]['thumbnail_img'] = thumbnail_data[listing]['thumbnail_img'] 158 | listing_dict[listing]['blurb'] = thumbnail_data[listing]['blurb'] 159 | listing_dict[listing]['thumbnail_price'] = thumbnail_data[listing]['thumbnail_price'] 160 | listing_dict[listing]['listing_type'] = thumbnail_data[listing]['listing_type'] 161 | listing_dict[listing]['lng'] = thumbnail_data[listing]['lng'] 162 | listing_dict[listing]['lat'] = thumbnail_data[listing]['lat'] 163 | 164 | max_lng = max(max_lng, float(listing_dict[listing]['lng'])) 165 | min_lng = min(min_lng, float(listing_dict[listing]['lng'])) 166 | max_lat = max(max_lat, float(listing_dict[listing]['lat'])) 167 | min_lat = min(min_lat, float(listing_dict[listing]['lat'])) 168 | 169 | 170 | if liveflag: 171 | description_clean = AIR_L.extract_clean_description() 172 | if description_clean != "": 173 | vectorized_desc = TFIDF.transform([description_clean]).toarray() 174 | listing_dict[listing]['is_artsy'] = int(MODEL_ARTSY.predict(vectorized_desc)[0]) 175 | listing_dict[listing]['is_shopping'] = int(MODEL_SHOPPING.predict(vectorized_desc)[0]) 176 | listing_dict[listing]['is_dining'] = int(MODEL_DINING.predict(vectorized_desc)[0]) 177 | listing_dict[listing]['is_nightlife'] = int(MODEL_NIGHTLIFE.predict(vectorized_desc)[0]) 178 | else: 179 | listing_dict[listing]['is_artsy'] = DEFAULT_VAL 180 | listing_dict[listing]['is_shopping'] = DEFAULT_VAL 181 | listing_dict[listing]['is_dining'] = DEFAULT_VAL 182 | listing_dict[listing]['is_nightlife'] = DEFAULT_VAL 183 | 184 | elif 'description_clean' in AIR_L.d: 185 | description_clean = AIR_L.d['description_clean'] 186 | vectorized_desc = TFIDF.transform([description_clean]).toarray() 187 | 
listing_dict[listing]['is_artsy'] = int(MODEL_ARTSY.predict(vectorized_desc)[0]) 188 | listing_dict[listing]['is_shopping'] = int(MODEL_SHOPPING.predict(vectorized_desc)[0]) 189 | listing_dict[listing]['is_dining'] = int(MODEL_DINING.predict(vectorized_desc)[0]) 190 | listing_dict[listing]['is_nightlife'] = int(MODEL_NIGHTLIFE.predict(vectorized_desc)[0]) 191 | else: 192 | # G SCOTT: consider checking to see if the neighborhood is artsy or not based on my data 193 | listing_dict[listing]['is_artsy'] = DEFAULT_VAL 194 | listing_dict[listing]['is_shopping'] = DEFAULT_VAL 195 | listing_dict[listing]['is_dining'] = DEFAULT_VAL 196 | listing_dict[listing]['is_nightlife'] = DEFAULT_VAL 197 | 198 | score_multiplier = listing_dict[listing]['is_artsy'] * artsy_weight + \ 199 | listing_dict[listing]['is_shopping'] * shopping_weight + \ 200 | listing_dict[listing]['is_dining'] * dining_weight + \ 201 | listing_dict[listing]['is_nightlife'] * nightlife_weight 202 | 203 | listing_dict[listing]['score'] = score_function(default_score = DEFAULT_RANK_SCORES[i], 204 | is_artsy = listing_dict[listing]['is_artsy'], 205 | artsy_weight = artsy_weight, 206 | is_shopping = listing_dict[listing]['is_shopping'], 207 | shopping_weight = shopping_weight, 208 | is_dining = listing_dict[listing]['is_dining'], 209 | dining_weight = dining_weight, 210 | is_nightlife = listing_dict[listing]['is_nightlife'], 211 | nightlife_weight = nightlife_weight) 212 | 213 | 214 | sorted_listings = sorted(listing_dict, key=lambda x:listing_dict[x]['score'], reverse=True) 215 | for i, listing in enumerate(sorted_listings): 216 | listing_dict[listing]['position'] = i+1 217 | 218 | map_center = ((max_lat+min_lat)/2, (max_lng+min_lng)/2) 219 | # return str(checkin_test) 220 | # return city_text 221 | # return str(city['value']) 222 | return render_template('search.html', sorted_listings = sorted_listings, listing_dict=listing_dict, map_center=map_center, city=city, state=state, traits=TRAITS, 
trait_weights=WEIGHTS, artsy_weight=artsy_weight, shopping_weight=shopping_weight, dining_weight=dining_weight, nightlife_weight=nightlife_weight) 223 | # return str(listing_dict) 224 | # return AIR_S.r.content # For debugging: used to show the cached AirBnB search from MongoDB 225 | 226 | 227 | @app.route('/listing/<listing_id>') 228 | def listing(listing_id): 229 | # cached scenario 230 | # AIR_L. 231 | # thumbnail_data = AIR_S.extract_thumbnail_data() 232 | # listing_index = listings.index(listing_id) 233 | # return str(listing_index) 234 | 235 | AIR_L.pull_from_db_cached(listing_id=listing_id) 236 | description_raw = AIR_L.d['description_raw'] 237 | description_raw_html = description_raw.replace('\n', "<br>") 238 | description_clean = AIR_L._clean_description(description_raw) 239 | artsy_words=[] 240 | shopping_words=[] 241 | dining_words=[] 242 | nightlife_words=[] 243 | for word in description_clean.split(): 244 | if word in WORDS_ARTSY: 245 | artsy_words.append(word) 246 | if word in WORDS_SHOPPING: 247 | shopping_words.append(word) 248 | if word in WORDS_DINING: 249 | dining_words.append(word) 250 | if word in WORDS_NIGHTLIFE: 251 | nightlife_words.append(word) 252 | 253 | listing_words_artsy = set(artsy_words) 254 | listing_words_shopping = set(shopping_words) 255 | listing_words_dining = set(dining_words) 256 | listing_words_nightlife = set(nightlife_words) 257 | 258 | # return description_raw_html 259 | return render_template('listing.html', description_raw_html=description_raw_html, listing_words_artsy=listing_words_artsy, listing_words_shopping=listing_words_shopping, listing_words_dining=listing_words_dining, listing_words_nightlife=listing_words_nightlife) 260 | 261 | 262 | @app.route('/about') 263 | def about(): 264 | return "https://www.linkedin.com/in/gscottstukey" 265 | 266 | @app.route('/contact') 267 | def contact(): 268 | return render_template('contact.html') 269 | 270 | if __name__ == '__main__': 271 | app.run(host='0.0.0.0', port=80, debug=True) 272 | 273 | -------------------------------------------------------------------------------- /data/amenities.csv: -------------------------------------------------------------------------------- 1 | id,amenity 2 | 1,TV 3 | 2,Cable TV 4 | 3,Internet 5 | 4,Wireless Internet 6 | 5,Air Conditioning 7 | 6,Wheelchair Accessible 8 | 7,Pool 9 | 8,Kitchen 10 | 9,Free Parking on Premises 11 | 11,Smoking Allowed 12 | 12,Pets Allowed 13 | 14,Doorman 14 | 15,Gym 15 | 16,Breakfast 16 | 21,Elevator in Building 17 | 25,Hot Tub 18 | 27,Indoor Fireplace 19 | 28,Buzzer/Wireless Intercom 20 | 30,Heating 21 | 31,Family/Kid Friendly 22 | 32,Suitable for Events 23 | 33,Washer 24 | 34,Dryer 25 | 35,Smoke Detector 26 | 36,Carbon
Monoxide Detector 27 | 37,First Aid Kit 28 | 38,Safety Card 29 | 39,Fire Extinguisher 30 | 40,Essentials 31 | 41,Shampoo 32 | -------------------------------------------------------------------------------- /data/city_list.csv: -------------------------------------------------------------------------------- 1 | city_id,city,state,country 2 | 1,San-Francisco,CA,United-States 3 | 2,New-York,NY,United-States -------------------------------------------------------------------------------- /data/city_list_df.pkl: -------------------------------------------------------------------------------- 1 | ccopy_reg 2 | _reconstructor 3 | p0 4 | (cpandas.core.frame 5 | DataFrame 6 | p1 7 | c__builtin__ 8 | object 9 | p2 10 | Ntp3 11 | Rp4 12 | g0 13 | (cpandas.core.internals 14 | BlockManager 15 | p5 16 | g2 17 | Ntp6 18 | Rp7 19 | ((lp8 20 | cpandas.core.index 21 | _new_Index 22 | p9 23 | (cpandas.core.index 24 | Index 25 | p10 26 | (dp11 27 | S'data' 28 | p12 29 | cnumpy.core.multiarray 30 | _reconstruct 31 | p13 32 | (cnumpy 33 | ndarray 34 | p14 35 | (I0 36 | tp15 37 | S'b' 38 | p16 39 | tp17 40 | Rp18 41 | (I1 42 | (I4 43 | tp19 44 | cnumpy 45 | dtype 46 | p20 47 | (S'O8' 48 | p21 49 | I0 50 | I1 51 | tp22 52 | Rp23 53 | (I3 54 | S'|' 55 | p24 56 | NNNI-1 57 | I-1 58 | I63 59 | tp25 60 | bI00 61 | (lp26 62 | S'city_id' 63 | p27 64 | aS'city' 65 | p28 66 | aS'state' 67 | p29 68 | aS'country' 69 | p30 70 | atp31 71 | bsS'name' 72 | p32 73 | Nstp33 74 | Rp34 75 | ag9 76 | (cpandas.core.index 77 | Int64Index 78 | p35 79 | (dp36 80 | g12 81 | g13 82 | (g14 83 | (I0 84 | tp37 85 | g16 86 | tp38 87 | Rp39 88 | (I1 89 | (I2 90 | tp40 91 | g20 92 | (S'i8' 93 | p41 94 | I0 95 | I1 96 | tp42 97 | Rp43 98 | (I3 99 | S'<' 100 | p44 101 | NNNI-1 102 | I-1 103 | I0 104 | tp45 105 | bI00 106 | S'\x00\x00\x00\x00\x00\x00\x00\x00\x01\x00\x00\x00\x00\x00\x00\x00' 107 | p46 108 | tp47 109 | bsg32 110 | Nstp48 111 | Rp49 112 | a(lp50 113 | g13 114 | (g14 115 | (I0 116 | tp51 117 | g16 118 | tp52 
119 | Rp53 120 | (I1 121 | (I1 122 | I2 123 | tp54 124 | g43 125 | I00 126 | S'\x01\x00\x00\x00\x00\x00\x00\x00\x02\x00\x00\x00\x00\x00\x00\x00' 127 | p55 128 | tp56 129 | bag13 130 | (g14 131 | (I0 132 | tp57 133 | g16 134 | tp58 135 | Rp59 136 | (I1 137 | (I3 138 | I2 139 | tp60 140 | g23 141 | I00 142 | (lp61 143 | S'San-Francisco' 144 | p62 145 | aS'New-York' 146 | p63 147 | aS'CA' 148 | p64 149 | aS'NY' 150 | p65 151 | aS'United-States' 152 | p66 153 | ag66 154 | atp67 155 | ba(lp68 156 | g9 157 | (g10 158 | (dp69 159 | g12 160 | g13 161 | (g14 162 | (I0 163 | tp70 164 | g16 165 | tp71 166 | Rp72 167 | (I1 168 | (I1 169 | tp73 170 | g23 171 | I00 172 | (lp74 173 | g27 174 | atp75 175 | bsg32 176 | Nstp76 177 | Rp77 178 | ag9 179 | (g10 180 | (dp78 181 | g12 182 | g13 183 | (g14 184 | (I0 185 | tp79 186 | g16 187 | tp80 188 | Rp81 189 | (I1 190 | (I3 191 | tp82 192 | g23 193 | I00 194 | (lp83 195 | g28 196 | ag29 197 | ag30 198 | atp84 199 | bsg32 200 | Nstp85 201 | Rp86 202 | a(dp87 203 | S'0.14.1' 204 | p88 205 | (dp89 206 | S'axes' 207 | p90 208 | g8 209 | sS'blocks' 210 | p91 211 | (lp92 212 | (dp93 213 | S'mgr_locs' 214 | p94 215 | c__builtin__ 216 | slice 217 | p95 218 | (I0 219 | I1 220 | I1 221 | tp96 222 | Rp97 223 | sS'values' 224 | p98 225 | g53 226 | sa(dp99 227 | g94 228 | g95 229 | (I1 230 | I4 231 | I1 232 | tp100 233 | Rp101 234 | sg98 235 | g59 236 | sasstp102 237 | bb. 
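The pickle above appears to be a protocol-0 pickle written under Python 2 with pandas 0.14.1 (note the `c__builtin__` and `S'0.14.1'` entries), so it may not load under newer pandas releases. As a hedged sketch, not the project's own loader, the same four columns can be rebuilt from `data/city_list.csv` with the standard library alone. The `load_city_list` helper and the inlined `CITY_LIST_CSV` string are illustrative assumptions; only the column names and rows come from the CSV shown earlier.

```python
import csv
import io

# Contents of data/city_list.csv, inlined so the sketch is self-contained.
CITY_LIST_CSV = u"""city_id,city,state,country
1,San-Francisco,CA,United-States
2,New-York,NY,United-States
"""


def load_city_list(csv_text):
    """Parse city_list.csv text into a list of dicts, one per city.

    Hypothetical helper (not part of the repository); city_id is cast to int.
    """
    rows = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        row = dict(row)                       # plain dict, one per CSV row
        row['city_id'] = int(row['city_id'])  # CSV values arrive as strings
        rows.append(row)
    return rows


cities = load_city_list(CITY_LIST_CSV)
```

In the app itself one would presumably read the file from disk (for example with `open('data/city_list.csv')`) instead of using the inlined string.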
-------------------------------------------------------------------------------- /data/neighborhood_list.csv: -------------------------------------------------------------------------------- 1 | neighborhood_id,neighborhood,neighborhood_url,city_id,city 2 | 0,Alamo Square,/locations/san-francisco/alamo-square,1,san-francisco 3 | 1,Bayview,/locations/san-francisco/bayview,1,san-francisco 4 | 2,Bernal Heights,/locations/san-francisco/bernal-heights,1,san-francisco 5 | 3,Chinatown,/locations/san-francisco/chinatown,1,san-francisco 6 | 4,Civic Center,/locations/san-francisco/civic-center,1,san-francisco 7 | 5,Cole Valley,/locations/san-francisco/cole-valley,1,san-francisco 8 | 6,Cow Hollow,/locations/san-francisco/cow-hollow,1,san-francisco 9 | 7,Dogpatch,/locations/san-francisco/dogpatch,1,san-francisco 10 | 8,Downtown,/locations/san-francisco/downtown,1,san-francisco 11 | 9,Duboce Triangle,/locations/san-francisco/duboce-triangle,1,san-francisco 12 | 10,Excelsior,/locations/san-francisco/excelsior,1,san-francisco 13 | 11,Financial District,/locations/san-francisco/financial-district,1,san-francisco 14 | 12,Fisherman's Wharf,/locations/san-francisco/fisherman-s-wharf,1,san-francisco 15 | 13,Glen Park,/locations/san-francisco/glen-park,1,san-francisco 16 | 14,Haight-Ashbury,/locations/san-francisco/haight-ashbury,1,san-francisco 17 | 15,Hayes Valley,/locations/san-francisco/hayes-valley,1,san-francisco 18 | 16,Inner Sunset,/locations/san-francisco/inner-sunset,1,san-francisco 19 | 17,Japantown,/locations/san-francisco/japantown,1,san-francisco 20 | 18,Lower Haight,/locations/san-francisco/lower-haight,1,san-francisco 21 | 19,Marina,/locations/san-francisco/marina,1,san-francisco 22 | 20,Mission Bay,/locations/san-francisco/mission-bay,1,san-francisco 23 | 21,Mission District,/locations/san-francisco/mission-district,1,san-francisco 24 | 22,Mission Terrace,/locations/san-francisco/mission-terrace,1,san-francisco 25 | 23,Nob 
Hill,/locations/san-francisco/nob-hill,1,san-francisco 26 | 24,Noe Valley,/locations/san-francisco/noe-valley,1,san-francisco 27 | 25,North Beach,/locations/san-francisco/north-beach,1,san-francisco 28 | 26,Outer Sunset,/locations/san-francisco/sunset-district,1,san-francisco 29 | 27,Pacific Heights,/locations/san-francisco/pacific-heights,1,san-francisco 30 | 28,Parkside,/locations/san-francisco/parkside,1,san-francisco 31 | 29,Portola,/locations/san-francisco/portola,1,san-francisco 32 | 30,Potrero Hill,/locations/san-francisco/potrero-hill,1,san-francisco 33 | 31,Presidio Heights,/locations/san-francisco/presidio-heights,1,san-francisco 34 | 32,Richmond District,/locations/san-francisco/richmond-district,1,san-francisco 35 | 33,Russian Hill,/locations/san-francisco/russian-hill,1,san-francisco 36 | 34,SoMa,/locations/san-francisco/soma,1,san-francisco 37 | 35,South Beach,/locations/san-francisco/south-beach,1,san-francisco 38 | 36,Telegraph Hill,/locations/san-francisco/telegraph-hill,1,san-francisco 39 | 37,Tenderloin,/locations/san-francisco/tenderloin,1,san-francisco 40 | 38,The Castro,/locations/san-francisco/the-castro,1,san-francisco 41 | 39,Twin Peaks,/locations/san-francisco/twin-peaks,1,san-francisco 42 | 40,Visitacion Valley,/locations/san-francisco/visitacion-valley,1,san-francisco 43 | 41,Western Addition/NOPA,/locations/san-francisco/western-addition-nopa,1,san-francisco 44 | 42,Alphabet City,/locations/new-york/alphabet-city,2,new-york 45 | 43,Astoria,/locations/new-york/astoria,2,new-york 46 | 44,Battery Park City,/locations/new-york/battery-park-city,2,new-york 47 | 45,Bedford-Stuyvesant,/locations/new-york/bedford-stuyvesant,2,new-york 48 | 46,Boerum Hill,/locations/new-york/boerum-hill,2,new-york 49 | 47,Brooklyn Heights,/locations/new-york/brooklyn-heights,2,new-york 50 | 48,Bushwick,/locations/new-york/bushwick,2,new-york 51 | 49,Carroll Gardens,/locations/new-york/carroll-gardens,2,new-york 52 | 
50,Chelsea,/locations/new-york/chelsea,2,new-york 53 | 51,Chinatown,/locations/new-york/chinatown,2,new-york 54 | 52,Civic Center,/locations/new-york/civic-center,2,new-york 55 | 53,Clinton Hill,/locations/new-york/clinton-hill,2,new-york 56 | 54,Cobble Hill,/locations/new-york/cobble-hill,2,new-york 57 | 55,Crown Heights,/locations/new-york/crown-heights,2,new-york 58 | 56,Downtown Brooklyn,/locations/new-york/downtown-brooklyn,2,new-york 59 | 57,DUMBO,/locations/new-york/dumbo,2,new-york 60 | 58,East Harlem,/locations/new-york/east-harlem,2,new-york 61 | 59,East Village,/locations/new-york/east-village,2,new-york 62 | 60,Financial District,/locations/new-york/financial-district,2,new-york 63 | 61,Flatbush,/locations/new-york/flatbush,2,new-york 64 | 62,Flatiron District,/locations/new-york/flatiron-district,2,new-york 65 | 63,Flushing,/locations/new-york/flushing,2,new-york 66 | 64,Fort Greene,/locations/new-york/fort-greene,2,new-york 67 | 65,Gowanus,/locations/new-york/gowanus,2,new-york 68 | 66,Gramercy Park,/locations/new-york/gramercy-park,2,new-york 69 | 67,Greenpoint,/locations/new-york/greenpoint,2,new-york 70 | 68,Greenwich Village,/locations/new-york/greenwich-village,2,new-york 71 | 69,Harlem,/locations/new-york/harlem,2,new-york 72 | 70,Hell's Kitchen,/locations/new-york/hell-s-kitchen,2,new-york 73 | 71,Hudson Square,/locations/new-york/hudson-square,2,new-york 74 | 72,Inwood,/locations/new-york/inwood,2,new-york 75 | 73,Jackson Heights,/locations/new-york/jackson-heights,2,new-york 76 | 74,Kensington,/locations/new-york/kensington,2,new-york 77 | 75,Kips Bay,/locations/new-york/kips-bay,2,new-york 78 | 76,Lefferts Garden,/locations/new-york/lefferts-garden,2,new-york 79 | 77,Little Italy,/locations/new-york/little-italy,2,new-york 80 | 78,Long Island City,/locations/new-york/long-island-city,2,new-york 81 | 79,Lower East Side,/locations/new-york/lower-east-side,2,new-york 82 | 80,Meatpacking 
District,/locations/new-york/meatpacking-district,2,new-york 83 | 81,Midtown,/locations/new-york/midtown,2,new-york 84 | 82,Midtown East,/locations/new-york/midtown-east,2,new-york 85 | 83,Morningside Heights,/locations/new-york/morningside-heights,2,new-york 86 | 84,Murray Hill,/locations/new-york/murray-hill,2,new-york 87 | 85,Noho,/locations/new-york/noho,2,new-york 88 | 86,Nolita,/locations/new-york/nolita,2,new-york 89 | 87,Park Slope,/locations/new-york/park-slope,2,new-york 90 | 88,Prospect Heights,/locations/new-york/prospect-heights,2,new-york 91 | 89,Red Hook,/locations/new-york/red-hook,2,new-york 92 | 90,Ridgewood,/locations/new-york/ridgewood,2,new-york 93 | 91,Soho,/locations/new-york/soho,2,new-york 94 | 92,South Street Seaport,/locations/new-york/south-street-seaport,2,new-york 95 | 93,Times Square/Theatre District,/locations/new-york/times-square-theatre-district,2,new-york 96 | 94,Tribeca,/locations/new-york/tribeca,2,new-york 97 | 95,Union Square,/locations/new-york/union-square,2,new-york 98 | 96,Upper East Side,/locations/new-york/upper-east-side,2,new-york 99 | 97,Upper West Side,/locations/new-york/upper-west-side,2,new-york 100 | 98,Washington Heights,/locations/new-york/washington-heights,2,new-york 101 | 99,West Village,/locations/new-york/west-village,2,new-york 102 | 100,Williamsburg,/locations/new-york/williamsburg,2,new-york 103 | 101,Windsor Terrace,/locations/new-york/windsor-terrace,2,new-york 104 | -------------------------------------------------------------------------------- /libs/__init__.py: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/gscottstukey/Localebnb/c48b21995606602a48ac8e3740edb34383ea14ea/libs/__init__.py -------------------------------------------------------------------------------- /libs/__init__.pyc: -------------------------------------------------------------------------------- 
https://raw.githubusercontent.com/gscottstukey/Localebnb/c48b21995606602a48ac8e3740edb34383ea14ea/libs/__init__.pyc -------------------------------------------------------------------------------- /libs/airbnb/__init__.py: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/gscottstukey/Localebnb/c48b21995606602a48ac8e3740edb34383ea14ea/libs/airbnb/__init__.py -------------------------------------------------------------------------------- /libs/airbnb/__init__.pyc: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/gscottstukey/Localebnb/c48b21995606602a48ac8e3740edb34383ea14ea/libs/airbnb/__init__.pyc -------------------------------------------------------------------------------- /libs/airbnb/airbnblisting.pyc: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/gscottstukey/Localebnb/c48b21995606602a48ac8e3740edb34383ea14ea/libs/airbnb/airbnblisting.pyc -------------------------------------------------------------------------------- /libs/airbnb/airbnbneighborhood.py: -------------------------------------------------------------------------------- 1 | """ 2 | NOTES: make sure mongod running. 
use `sudo mongod` in terminal 3 | """ 4 | 5 | import requests 6 | from pymongo import MongoClient 7 | import time 8 | import datetime 9 | from bs4 import BeautifulSoup 10 | from unidecode import unidecode 11 | import pickle 12 | 13 | 14 | class AirBnBNeighborhood(object): 15 | """ 16 | Initializes an AirBnBNeighborhood object 17 | This allows you to scrape neighborhood pages or retrieve them from MongoDB 18 | 19 | INPUT: 20 | - db_name (str): 'airbnb' or 'airbnb_test' 21 | - coll_name (str): 'neighborhoods' 22 | """ 23 | 24 | def __init__(self, db_name, coll_name): 25 | """ 26 | This is a class for searching AirBnBNeighborhood 27 | """ 28 | 29 | self.BASE_URL = "https://www.airbnb.com" 30 | 31 | client = MongoClient() 32 | self.db = client[db_name] 33 | self.coll = self.db[coll_name] 34 | 35 | self.neighborhood_id = None 36 | self.neighborhood = "" 37 | self.url = "" 38 | self.city_id = None 39 | self.city = "" 40 | 41 | self.r = None 42 | self.d = {} 43 | 44 | def scrape_and_insert(self, neighborhood_id, neighborhood, neighborhood_url, city_id, city): 45 | """ 46 | Scrapes a neighborhood & inserts the neighborhood into the collection 47 | 48 | INPUT: 49 | (per the 'neighborhood_list.csv' file) 50 | - neighborhood_id (int): 51 | - neighborhood (str): 52 | - neighborhood_url (str): 53 | - city_id (int): 54 | - city (str): 55 | OUTPUT: 56 | - None 57 | """ 58 | self.neighborhood_id = neighborhood_id 59 | self.neighborhood = neighborhood 60 | self.url = self.BASE_URL + neighborhood_url 61 | self.city_id = city_id 62 | self.city = city 63 | 64 | self.r = requests.get(self.url) 65 | 66 | if self.r.status_code == 200: 67 | pkl = pickle.dumps(self.r) 68 | 69 | self.d = {'content': self.r.content, 70 | 'pickle': pkl, 71 | 'time': time.time(), 72 | 'dt': datetime.datetime.utcnow(), 73 | '_id': neighborhood_id, 74 | 'neighborhood': neighborhood, 75 | 'city_id': city_id, 76 | 'city': city, 77 | 'url': self.url, 78 | 'requests_meta': { 79 | 'status_code': self.r.status_code, 80 | 
'is_redirect': self.r.is_redirect, 81 | 'is_ok': self.r.ok, 82 | 'raise_for_status': self.r.raise_for_status(), 83 | 'reason': self.r.reason 84 | } 85 | } 86 | 87 | else: 88 | self.d = {'time': time.time(), 89 | 'dt': datetime.datetime.utcnow(), 90 | '_id': neighborhood_id, 91 | 'neighborhood': neighborhood, 92 | 'city_id': city_id, 93 | 'city': city, 94 | 'url': self.url, 95 | 'error': True, 96 | 'requests_meta': { 97 | 'status_code': self.r.status_code, 98 | 'is_redirect': self.r.is_redirect, 99 | 'is_ok': self.r.ok, 100 | 'raise_for_status': None, # not called here: raise_for_status() raises HTTPError on 4xx/5xx, which would skip the insert below 101 | 'reason': self.r.reason 102 | } 103 | } 104 | 105 | self.coll.insert(self.d) 106 | 107 | def pull_from_db(self, neighborhood_id): 108 | """ 109 | Pulls a previously scraped neighborhood's data from the MongoDB collection 110 | 111 | INPUT: 112 | - neighborhood_id (int or str): the id of the neighborhood you're trying to pull 113 | OUTPUT: None 114 | """ 115 | hood = self.coll.find_one({'_id': neighborhood_id}) 116 | 117 | self.neighborhood_id = hood['_id'] 118 | self.neighborhood = hood['neighborhood'] 119 | self.url = hood['url'] 120 | 121 | self.r = pickle.loads(hood['pickle']) 122 | self.d = hood 123 | 124 | def is_in_collection(self, neighborhood_id=None): 125 | """ 126 | Checks to see if the current neighborhood's data is in the MongoDB collection 127 | Note: This requires self.neighborhood_id to exist, 128 | i.e. 
a neighborhood to have been scraped or pulled 129 | 130 | INPUT: 131 | - neighborhood_id (None or int): 132 | * the id of the neighborhood you're trying to pull 133 | * if None (default), uses self.neighborhood_id 134 | OUTPUT: bool 135 | """ 136 | if neighborhood_id is None: # explicit None check so that neighborhood_id 0 is not treated as missing 137 | hood_id = self.neighborhood_id 138 | else: 139 | hood_id = neighborhood_id 140 | return bool(self.coll.find_one({'_id': hood_id})) 141 | 142 | def is_other_in_collection(self, neighborhood_id): 143 | """ 144 | ********** DEPRECATED *********** 145 | REASON: more efficient to combine this method with is_in_collection() 146 | SOLUTION: use is_in_collection() with an explicit neighborhood_id 147 | ********************************** 148 | 149 | Checks to see if an explicit neighborhood's data is in the MongoDB collection 150 | 151 | INPUT: neighborhood_id (int) 152 | OUTPUT: bool 153 | """ 154 | if not self.coll.find_one({'_id': neighborhood_id}): 155 | return False 156 | else: 157 | return True 158 | 159 | def extract_features(self): 160 | """ 161 | Extracts all of the predefined features of the currently loaded neighborhood 162 | 163 | INPUT: None 164 | OUTPUT: 165 | - dict: the dictionary of the predefined features extracted 166 | """ 167 | features = {} 168 | 169 | soup = BeautifulSoup(self.r.content) 170 | 171 | headline = soup.find('div', {'class': 'center description'}).get_text().strip() 172 | features['headline'] = unidecode(headline) 173 | 174 | description = soup.find('meta', {'name': 'description'})['content'] 175 | features['description'] = unidecode(description) 176 | 177 | traits = [] 178 | traits_html = soup.find('ul', {'class': 'traits'}) 179 | if traits_html is not None: 180 | for trait in traits_html.find_all('span'): 181 | traits.append(trait.get_text()) 182 | features['traits'] = traits 183 | 184 | tags = [] 185 | for tag in soup.find_all('div', {'class': 'neighborhood-tag'}): 186 | tags.append(tag.get_text().strip()) 187 | features['tags'] = tags 188 | 189 | similar_hoods = [] 190 | 
similar_hood_html = soup.find('ul', {'class': 'trait-neighborhoods neighborhoods'}) 191 | if similar_hood_html is not None: 192 | for similar_hood in similar_hood_html.find_all('li'): 193 | similar_hoods.append(similar_hood['data-neighborhood-permalink']) 194 | features['similar_hoods'] = similar_hoods 195 | 196 | # This code doesn't parse out hoods "within" hoods 197 | neighboring_hoods = [] 198 | for neighboring_hood in soup.find('p', {'class': 'lede center'}).find_all('a'): 199 | neighboring_hoods.append(neighboring_hood.get_text()) 200 | features['neighboring_hoods'] = neighboring_hoods 201 | 202 | caption_bar = soup.find('div', {'class': 'caption bar'}).find_all('li') 203 | if caption_bar != []: 204 | public_trans = soup.find('div', {'class': 'caption bar'}).find_all('li')[0].strong.get_text() 205 | features['public_trans'] = public_trans 206 | having_a_car = soup.find('div', {'class': 'caption bar'}).find_all('li')[1].strong.get_text() 207 | features['having_a_car'] = having_a_car 208 | 209 | data_bbox = soup.find('div', {'id': 'inner-map'})['data-bbox'] 210 | features['data_bbox'] = data_bbox 211 | data_x = float(soup.find('div', {'id': 'inner-map'})['data-x']) 212 | features['data_x'] = data_x 213 | data_y = float(soup.find('div', {'id': 'inner-map'})['data-y']) 214 | features['data_y'] = data_y 215 | 216 | return features 217 | 218 | def add_features(self, new_features): 219 | """ 220 | Add features to the currently loaded neighborhood 221 | Note: The neighborhood must already exist in the MongoDB collection 222 | 223 | INPUT: new_features (dict) - a dict of features to add 224 | OUTPUT: None 225 | """ 226 | 227 | self.coll.update({'_id': self.neighborhood_id}, {'$set': new_features}) 228 | 229 | def extract_and_add_features(self): 230 | """ 231 | Runs extract_features() on the currently loaded neighborhood's data, 232 | and then runs add_features() to add them 233 | Note: The neighborhood must already exist in the MongoDB collection 234 | 235 | INPUT: 
None 236 | OUTPUT: None 237 | """ 238 | new_features = self.extract_features() 239 | self.add_features(new_features=new_features) 240 | -------------------------------------------------------------------------------- /libs/airbnb/airbnbneighborhood.pyc: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/gscottstukey/Localebnb/c48b21995606602a48ac8e3740edb34383ea14ea/libs/airbnb/airbnbneighborhood.pyc -------------------------------------------------------------------------------- /libs/airbnb/airbnbsearchresult.py: -------------------------------------------------------------------------------- 1 | """ 2 | NOTES: make sure mongod running. use `sudo mongod` in terminal 3 | """ 4 | 5 | import datetime 6 | import requests 7 | from bs4 import BeautifulSoup 8 | from pymongo import MongoClient 9 | import time 10 | import pickle 11 | from bson.objectid import ObjectId 12 | 13 | 14 | class AirBnBSearchResult(object): 15 | """ 16 | Initializes an AirBnBSearchResult object 17 | This allows you to scrape search result pages or retrieve them from MongoDB 18 | 19 | INPUT: 20 | - db_name (str): 'airbnb' or 'airbnb_test' 21 | - coll_name (str): 'search' 22 | """ 23 | 24 | def __init__(self, db_name, coll_name): 25 | """ 26 | This is a class for searching AirBnBSearchResults 27 | 28 | required params: 29 | strings: city, state, country 30 | dates (as strings in mm-dd-yyyy format): checkin, checkout 31 | ints: guests, start_page, end_page, price_max 32 | """ 33 | 34 | self.SEARCH_RESULT_URL = "https://www.airbnb.com/s/" 35 | 36 | client = MongoClient() 37 | self.db = client[db_name] 38 | self.coll = self.db[coll_name] 39 | 40 | 41 | def set_params(self, params): 42 | """ 43 | Sets the initial parameters of the search result 44 | 45 | INPUT: 46 | - params (dict): this should include: 47 | * city, state, country (strings) 48 | * checkin, checkout (strings, in mm-dd-yyyy format) 49 | * guests (int) 50 | * price_max (int 
or float) 51 | 52 | """ 53 | self.params = params 54 | 55 | self.city = self.params['city'] 56 | self.state = self.params['state'] 57 | self.country = self.params['country'] 58 | self.checkin = self.params['checkin'] 59 | self.checkout = self.params['checkout'] 60 | self.guests = self.params['guests'] 61 | # self.price_min = params['price_min'] 62 | # default to None so the scrape methods can always read self.price_max; requests omits None-valued params from the query string 63 | self.price_max = params.get('price_max') 64 | 65 | 66 | def scrape_all_results(self, start_page=1, end_page=1, insert_into_db=True, pause_between_pages=1.0): 67 | """ 68 | Scrapes multiple pages of a search result info from AirBnB 69 | Runs scrape_from_web() on the currently loaded params, 70 | for each page from start_page to end_page 71 | note: requires set_params() to have been run 72 | 73 | INPUT: 74 | - start_page, end_page (ints): the page numbers to add to the url parameters 75 | - insert_into_db (bool): whether or not to insert the data into the db 76 | - pause_between_pages (float): How long to wait between pages 77 | OUTPUT: None 78 | """ 79 | 80 | self.start_page = start_page 81 | self.end_page = end_page 82 | 83 | for page in xrange(start_page, end_page+1): 84 | self.scrape_from_web(page=page, insert_into_db=insert_into_db) 85 | # no second insert here: scrape_from_web() already inserts when insert_into_db is True 86 | 87 | time.sleep(pause_between_pages) 88 | 89 | def scrape_from_web(self, page=1, insert_into_db = True): 90 | """ 91 | Scrapes a search result's info from AirBnB 92 | note: requires set_params() to have been run 93 | 94 | INPUT: 95 | - page (int): the page number to add to the url parameters 96 | - insert_into_db (bool): whether or not to insert the data into the db 97 | OUTPUT: None 98 | """ 99 | 100 | city_url = '%s--%s--%s' % (self.city, self.state, self.country) 101 | 102 | url_params = {'checkin': self.checkin, 103 | 'checkout': self.checkout, 104 | 'guests': self.guests, 105 | 'price_max': self.price_max} 106 | 107 | if page != 1: 108 | url_params['page'] = page 109 | 110 | self.r = 
requests.get(self.SEARCH_RESULT_URL + city_url, params=url_params) 111 | 112 | if self.r.status_code == 200: 113 | pkl = pickle.dumps(self.r) # pickling the requests object to allow parsing via Beautiful Soup later 114 | 115 | self.d = {'content':self.r.content, 116 | 'pickle': pkl, 117 | 'time': time.time(), 118 | 'dt': datetime.datetime.utcnow(), 119 | 'city': self.city, 120 | 'state': self.state, 121 | 'country': self.country, 122 | 'params': url_params, 123 | 'requests_meta':{ 124 | 'status_code': self.r.status_code, 125 | 'is_redirect': self.r.is_redirect, 126 | 'is_ok': self.r.ok, 127 | 'raise_for_status': self.r.raise_for_status(), 128 | 'reason': self.r.reason 129 | }, 130 | 'url': self.r.url, 131 | 'page': page} 132 | 133 | if insert_into_db: self.insert_into_coll() 134 | 135 | def scrape_from_web_for_app(self): 136 | """ 137 | Scrapes a search result's info from AirBnB 138 | note: requires set_params() to have been run 139 | note: specific for the production instance of the app 140 | 141 | INPUT: None 142 | OUTPUT: None 143 | """ 144 | 145 | city_url = '%s--%s--%s' % (self.city, self.state, self.country) 146 | 147 | url_params = {'checkin': self.checkin, 148 | 'checkout': self.checkout, 149 | 'guests': self.guests, 150 | 'price_max': self.price_max} 151 | 152 | self.r = requests.get(self.SEARCH_RESULT_URL + city_url, params=url_params) 153 | 154 | if self.r.status_code == 200: 155 | self.d = {'content':self.r.content, 156 | 'time': time.time(), 157 | 'dt': datetime.datetime.utcnow(), 158 | 'city': self.city, 159 | 'state': self.state, 160 | 'country': self.country, 161 | 'params': url_params, 162 | 'url': self.r.url} 163 | 164 | def pull_one_from_db(self): 165 | """ 166 | Pulls a previously scraped search result data from the MongoDB collection 167 | The query is based off of the city, state, country & checkin information 168 | Note: requires the user to have run set_params() 169 | 170 | INPUT: None 171 | OUTPUT: None 172 | """ 173 | self.d = 
self.coll.find_one({'city': self.city, 174 | 'state': self.state, 175 | 'country': self.country, 176 | 'params.checkin': self.checkin}) 177 | 178 | self.r = pickle.loads(self.d['pickle']) 179 | 180 | def pull_one_from_db_cached(self, city): 181 | """ 182 | Pulls a previously scraped search result data from the MongoDB collection 183 | The query is based off of the city, state, country & checkin information 184 | Note: requires the user to have run set_params() 185 | 186 | INPUT: None 187 | OUTPUT: None 188 | """ 189 | if city == "New-York": 190 | self.d = self.coll.find_one({'_id': ObjectId("557f158e0540ac02f7fc9b63")}) 191 | else: 192 | self.d = self.coll.find_one({'_id': ObjectId("557f07ed0540ac02f7fc916d")}) 193 | 194 | self.r = pickle.loads(self.d['pickle']) 195 | 196 | def extract_listing_ids(self): 197 | """ 198 | Creates a list of all the listing_ids found on the Search Result page 199 | 200 | INPUT: None 201 | OUTPUT: list (of strs) 202 | """ 203 | listing_ids = [] 204 | soup = BeautifulSoup(self.r.content) 205 | 206 | for listing in soup.find_all("div", {"class": "listing"}): 207 | listing_ids.append(listing['data-id']) 208 | 209 | return listing_ids 210 | 211 | def extract_thumbnail_data(self): 212 | """ 213 | Creates a dictionary for each listing filled with all its critical data 214 | Note: This requires self.r.content to exist, 215 | i.e. 
a search result to have been scraped or pulled 216 | 217 | INPUT: None 218 | OUTPUT: dict (of dicts) 219 | """ 220 | thumbnail_data = {} 221 | soup = BeautifulSoup(self.r.content) 222 | 223 | for listing in soup.find_all("div", {"class": "listing"}): 224 | cur_data = {} 225 | listing_id = listing['data-id'] 226 | cur_data['listing_id'] = listing_id 227 | 228 | cur_data['lat'] = listing['data-lat'] 229 | cur_data['lng'] = listing['data-lng'] 230 | 231 | if listing.find('img') != None: 232 | tmp_img = listing.find('img')['data-urls'][2:] 233 | tmp_img = tmp_img[:tmp_img.find("\"")] 234 | cur_data['thumbnail_img'] = tmp_img 235 | else: 236 | cur_data['thumbnail_img'] = "/no_thumbnail.jpg" 237 | 238 | cur_data['blurb'] = listing['data-name'] 239 | 240 | if listing.find("span"): 241 | cur_data['thumbnail_price'] = listing.find("span").get_text() 242 | else: 243 | cur_data['thumbnail_price'] = "n/a" 244 | 245 | if listing.find("div", {'itemprop': "description"}): 246 | if listing.find("div", {'itemprop': "description"}).find('a'): 247 | tmp = listing.find("div", {'itemprop': "description"}).find('a').get_text() 248 | if tmp.find(u'\xb7') != -1: 249 | tmp = tmp[:tmp.find(u'\xb7')] 250 | cur_data['listing_type'] = tmp 251 | else: 252 | cur_data['listing_type'] = "not available" 253 | else: 254 | cur_data['listing_type'] = "not available" 255 | 256 | thumbnail_data[listing_id] = cur_data 257 | 258 | return thumbnail_data 259 | 260 | 261 | def insert_into_coll(self, overwrite=False): 262 | """ 263 | Inserts the current search result's data into the MongoDB collection 264 | 265 | INPUT: 266 | - overwrite (bool): whether to overwrite if the listing already exists 267 | OUTPUT: None 268 | """ 269 | 270 | self.coll.insert(self.d) 271 | 272 | -------------------------------------------------------------------------------- /libs/airbnb/airbnbsearchresult.pyc: -------------------------------------------------------------------------------- 
https://raw.githubusercontent.com/gscottstukey/Localebnb/c48b21995606602a48ac8e3740edb34383ea14ea/libs/airbnb/airbnbsearchresult.pyc -------------------------------------------------------------------------------- /libs/app_helper.py: -------------------------------------------------------------------------------- 1 | """ 2 | This file contains helper functions for the localeBnB web app 3 | """ 4 | 5 | import numpy as np 6 | 7 | 8 | def initialize_rank_scores(head=None, 9 | tail_high=5, 10 | tail_low=2, 11 | tail_num=16, 12 | log_score=True): 13 | ''' 14 | Initializes scores for the listing ranks; 15 | Combines a custom head and combines it with a linear tail function 16 | Loosely inspired by the Long Tail & Google SERP CTRs: 17 | (https://moz.com/blog/google-organic-click-through-rates-in-2014) 18 | 19 | INPUT: 20 | - head (None or 1d numpy array): 21 | * score represents the "head" of the scores 22 | * if None (default), function uses data from moz.com to score the head 23 | * to avoid using the head, set head=np.array([]) 24 | - tail_high, tail_low, tail_num (ints): 25 | * inputs to an np.linspace() function 26 | * to avoid using the tail, set tail_num=0 27 | - log_score (bool): if True, takes the natural log of the score 28 | 29 | OUTPUT: 30 | - 1d numpy array: an array of length len(head) + tail_num. 
31 | ''' 32 | 33 | if head is not None: 34 | head_scores = head 35 | else: 36 | MOZ_DATA = np.array([25, 15, 11, 8, 6.5]) 37 | head_scores = MOZ_DATA 38 | 39 | tail_scores = np.linspace(tail_high, tail_low, tail_num) 40 | 41 | scores = np.append(head_scores, tail_scores) 42 | if log_score: 43 | scores = np.log(scores) # Take the ln of the score 44 | 45 | return scores 46 | -------------------------------------------------------------------------------- /libs/app_helper.pyc: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/gscottstukey/Localebnb/c48b21995606602a48ac8e3740edb34383ea14ea/libs/app_helper.pyc -------------------------------------------------------------------------------- /libs/extract_features_from_listings.py: -------------------------------------------------------------------------------- 1 | """ 2 | Script to extract features from the Listings 3 | 4 | DEPENDENCIES: 5 | 1) scrape_listings.py > MongoDB 'listings' collection 6 | 7 | NOTES: make sure mongod running.
use `sudo mongod` in terminal 8 | """ 9 | 10 | from airbnb.airbnblisting import AirBnBListing 11 | 12 | DB_NAME = 'airbnb' 13 | COLL_NAME = 'listings' 14 | 15 | 16 | def main(): 17 | air_listing = AirBnBListing(db_name=DB_NAME, coll_name=COLL_NAME) 18 | listing_dict = list(air_listing.coll.find({}, {'_id': 1})) 19 | 20 | for listing in listing_dict: 21 | listing_id = listing['_id'] 22 | air_listing.pull_from_db(listing_id=listing_id) 23 | air_listing.extract_and_add_features() 24 | print "Extracting Features for %s" % listing_id 25 | 26 | if __name__ == '__main__': 27 | main() 28 | -------------------------------------------------------------------------------- /libs/extract_features_from_neighborhoods.py: -------------------------------------------------------------------------------- 1 | """ 2 | Script to extract features from the Neighborhoods 3 | 4 | DEPENDENCIES: 5 | 1) scrape_neighborhoods.py > MongoDB 'neighborhoods' collection 6 | 7 | NOTES: make sure mongod running. use `sudo mongod` in terminal 8 | """ 9 | 10 | from airbnb.airbnbneighborhood import AirBnBNeighborhood 11 | 12 | DB_NAME = 'airbnb' 13 | COLL_NAME = 'neighborhoods' 14 | 15 | 16 | def main(): 17 | air_hood = AirBnBNeighborhood(db_name=DB_NAME, coll_name=COLL_NAME) 18 | hoods_dict = list(air_hood.coll.find({}, {'_id': 1})) 19 | 20 | for hood in hoods_dict: 21 | hood_id = hood['_id'] 22 | air_hood.pull_from_db(neighborhood_id=hood_id) 23 | air_hood.extract_and_add_features() 24 | print "Extracting Features for %s" % hood_id 25 | 26 | 27 | if __name__ == '__main__': 28 | main() 29 | -------------------------------------------------------------------------------- /libs/extract_listings_from_search_results.py: -------------------------------------------------------------------------------- 1 | """ 2 | Inserts all of the listings found in search results into a listings collection. 
3 | 4 | DEPENDENCIES: 5 | 1) scrape_search_results.py > MongoDB 'search' collection 6 | 7 | NOTES: make sure mongod running. use `sudo mongod` in terminal 8 | """ 9 | from bs4 import BeautifulSoup 10 | import pickle 11 | from pymongo import MongoClient 12 | 13 | DB_NAME = 'airbnb' 14 | SEARCH_COLL_NAME = 'search' 15 | LISTING_COLL_NAME = 'listings' 16 | 17 | 18 | def main(): 19 | client = MongoClient() 20 | db = client[DB_NAME] 21 | search_coll = db[SEARCH_COLL_NAME] 22 | listing_coll = db[LISTING_COLL_NAME] 23 | 24 | # loop through all of the search results 25 | for x in search_coll.find({}, {'pickle': 1}): 26 | r = pickle.loads(x['pickle']) 27 | soup = BeautifulSoup(r.content) 28 | 29 | # loop through all of the listings in each search result 30 | for listing in soup.find_all("div", {"class": "listing"}): 31 | try: 32 | listing_id = listing['data-id'] 33 | if not listing_coll.find_one({'_id': listing_id}): 34 | listing_coll.insert({'_id': listing_id}) 35 | print "SUCCESS: Added %s to database" % listing_id 36 | else: 37 | print "DUPLICATE: Already added %s to database" % listing_id 38 | except KeyError: 39 | print "ERROR: No listings" 40 | 41 | 42 | if __name__ == "__main__": 43 | main() 44 | -------------------------------------------------------------------------------- /libs/generate_mnb_models.py: -------------------------------------------------------------------------------- 1 | import pickle 2 | import pandas as pd 3 | from sklearn.feature_extraction.text import TfidfVectorizer 4 | from sklearn.naive_bayes import MultinomialNB 5 | from sklearn.cross_validation import train_test_split 6 | from airbnb.airbnbneighborhood import AirBnBNeighborhood 7 | from airbnb.airbnblisting import AirBnBListing 8 | 9 | MAX_DF_LIST = [.5, .66, .75, .83, .90, 1.0] 10 | MAX_FEATURE_LIST = [500, 1000, 2000, 3000, 4000, 5000, 8000] 11 | MIN_DF_LIST = [1, 2] 12 | RANDOM_STATE_LIST = [1, 42, 1337] # lulz 13 | TRAIT_LIST = ['artsy', 'shopping', 'dining', 'nightlife'] 14 | 15 | 16 | def load_data(): 17
| air_hood = AirBnBNeighborhood(db_name='airbnb', coll_name='neighborhoods') 18 | hood_df = pd.DataFrame(list(air_hood.coll.find({}))) 19 | 20 | air_listing = AirBnBListing(db_name='airbnb', coll_name='listings') 21 | listing_df = pd.DataFrame(list(air_listing.coll.find({}))) 22 | listing_df = listing_df[listing_df['description_raw'].notnull()] 23 | 24 | merged_df = listing_df.merge(right=hood_df[['neighborhood', 'city', 'traits']], 25 | on='neighborhood', 26 | suffixes=('', '_copy')) 27 | return merged_df 28 | 29 | 30 | def add_features(df): 31 | new_df = df.copy() 32 | new_df['artsy'] = ['Artsy' in x for x in new_df['traits']] 33 | new_df['shopping'] = ['Shopping' in x for x in new_df['traits']] 34 | new_df['dining'] = ['Dining' in x for x in new_df['traits']] 35 | new_df['nightlife'] = ['Nightlife' in x for x in new_df['traits']] 36 | return new_df 37 | 38 | 39 | def run_mnb(X_doc, y, mnb, max_df, max_features, min_df, random_state): 40 | # split the data 41 | X_train_doc, X_test_doc, y_train, y_test = \ 42 | train_test_split(X_doc, y, random_state=random_state) 43 | 44 | # Vectorize the training data 45 | tfidf = TfidfVectorizer(max_df=max_df, max_features=max_features, min_df=min_df) 46 | vectorized_corpus = tfidf.fit_transform(X_train_doc) 47 | X_train = vectorized_corpus.toarray() 48 | 49 | # fit the Naive Bayes model 50 | mnb.fit(X_train, y_train) 51 | train_score = mnb.score(X_train, y_train) 52 | 53 | # score it against the test data 54 | X_test = tfidf.transform(X_test_doc).toarray() 55 | test_score = mnb.score(X_test, y_test) 56 | 57 | return (max_df, max_features, min_df, random_state, train_score, test_score) 58 | 59 | 60 | def run_grid_search(X_doc, y, mnb, trait): 61 | results = [] 62 | 63 | for max_df in MAX_DF_LIST: 64 | for max_features in MAX_FEATURE_LIST: 65 | for min_df in MIN_DF_LIST: 66 | for random_state in RANDOM_STATE_LIST: 67 | result = run_mnb(X_doc, y, mnb, max_df, max_features, min_df, random_state) 68 | print result 69 |
results.append(result) 70 | 71 | selected_columns = ['max_df', 'max_features', 'min_df', 'random_state', 'train_score', 'test_score'] 72 | results_df = pd.DataFrame(results, columns=selected_columns) 73 | results_df.to_csv('../models/%s_results.csv' % trait) 74 | 75 | results_df = results_df.groupby(by=['max_df', 'max_features', 'min_df', 'random_state'], as_index=False).mean() 76 | 77 | max_test_score = max(results_df['test_score']) 78 | max_test_results_df = results_df[results_df['test_score'] == max_test_score] 79 | 80 | return max_test_results_df.reset_index() 81 | 82 | 83 | def tiebreaker(mnb, max_test_results_df, X_doc, y): 84 | tiebreaker_df = max_test_results_df.copy() 85 | tiebreaker_df['full_score'] = 0.0 86 | for i in tiebreaker_df.index: 87 | max_df = tiebreaker_df['max_df'][i] 88 | max_features = tiebreaker_df['max_features'][i] 89 | min_df = tiebreaker_df['min_df'][i] 90 | random_state = tiebreaker_df['random_state'][i] 91 | 92 | tfidf = TfidfVectorizer(max_df=max_df, max_features=max_features, min_df=min_df) 93 | vectorized_corpus = tfidf.fit_transform(X_doc) 94 | X = vectorized_corpus.toarray() 95 | 96 | mnb.fit(X, y) 97 | tiebreaker_df.loc[i, 'full_score'] = mnb.score(X, y) # score only this candidate's row 98 | 99 | tiebreaker_df.sort('full_score', ascending=False, inplace=True) # best full-data score first 100 | 101 | return tiebreaker_df.reset_index() 102 | 103 | 104 | def run_winning_model(X_doc, y, max_df, max_features, min_df, mnb, trait): 105 | print 106 | print "THE WINNING MODEL FOR %s IS: %s, %s, %s" % (trait, max_df, max_features, min_df) 107 | 108 | tfidf = TfidfVectorizer(max_df=max_df, max_features=max_features, min_df=min_df) 109 | vectorized_corpus = tfidf.fit_transform(X_doc) 110 | 111 | tfidf_pickle_file = open('../models/tfidf_%s.pkl' % trait, 'wb') 112 | pickle.dump(tfidf, tfidf_pickle_file) 113 | tfidf_pickle_file.close() 114 | 115 | X = vectorized_corpus.toarray() 116 | 117 | mnb.fit(X, y) 118 | mnb_pickle_file = open('../models/mnb_%s_final.pkl' % trait, 'wb') 119 | pickle.dump(mnb, mnb_pickle_file) 120 |
mnb_pickle_file.close() 121 | 122 | 123 | def main(): 124 | df = load_data() 125 | feature_df = add_features(df) 126 | X_doc = list(feature_df['description_clean']) 127 | mnb = MultinomialNB() 128 | 129 | for trait in TRAIT_LIST: 130 | y = feature_df[trait] 131 | max_test_results_df = run_grid_search(X_doc, y, mnb, trait) 132 | 133 | if len(max_test_results_df) > 1: 134 | max_test_results_df = tiebreaker(mnb, max_test_results_df, X_doc, y) 135 | 136 | max_df = max_test_results_df['max_df'][0] 137 | max_features = max_test_results_df['max_features'][0] 138 | min_df = max_test_results_df['min_df'][0] 139 | 140 | run_winning_model(X_doc=X_doc, y=y, max_df=max_df, 141 | max_features=max_features, min_df=min_df, 142 | mnb=mnb, trait=trait) 143 | 144 | 145 | if __name__ == "__main__": 146 | main() 147 | -------------------------------------------------------------------------------- /libs/generate_svm_models.py: -------------------------------------------------------------------------------- 1 | import pickle 2 | import pandas as pd 3 | from sklearn.feature_extraction.text import TfidfVectorizer 4 | from sklearn.svm import LinearSVC 5 | from sklearn.cross_validation import train_test_split 6 | from airbnb.airbnbneighborhood import AirBnBNeighborhood 7 | from airbnb.airbnblisting import AirBnBListing 8 | 9 | # Ran to find models 10 | # MAX_DF_LIST = [.5, .66, .75, .83, .90, 1.0] 11 | # MAX_FEATURE_LIST = [500, 1000, 2000, 3000, 4000, 5000, 8000] 12 | # MIN_DF_LIST = [1,2] 13 | # RANDOM_STATE_LIST = [1, 42, 1337] # lulz 14 | 15 | # Ran as final solution 16 | MAX_DF_LIST = [.90] 17 | MAX_FEATURE_LIST = [8000] 18 | MIN_DF_LIST = [2] 19 | RANDOM_STATE_LIST = [55] 20 | 21 | TRAIT_LIST = ['artsy', 'shopping', 'dining', 'nightlife'] 22 | 23 | 24 | def load_data(): 25 | air_hood = AirBnBNeighborhood(db_name='airbnb', coll_name='neighborhoods') 26 | hood_df = pd.DataFrame(list(air_hood.coll.find({}))) 27 | 28 | air_listing = AirBnBListing(db_name='airbnb',
coll_name='listings') 29 | listing_df = pd.DataFrame(list(air_listing.coll.find({}))) 30 | listing_df = listing_df[listing_df['description_raw'].isnull() == False] 31 | 32 | merged_df = listing_df.merge(right=hood_df[['neighborhood', 'city', 'traits']], 33 | on='neighborhood', suffixes=('', '_copy')) 34 | return merged_df 35 | 36 | 37 | def add_features(df): 38 | new_df = df.copy() 39 | new_df['artsy'] = ['Artsy' in x for x in new_df['traits']] 40 | new_df['shopping'] = ['Shopping' in x for x in new_df['traits']] 41 | new_df['dining'] = ['Dining' in x for x in new_df['traits']] 42 | new_df['nightlife'] = ['Nightlife' in x for x in new_df['traits']] 43 | return new_df 44 | 45 | 46 | def run_svc(X_doc, y, svc, max_df, max_features, min_df, random_state): 47 | # split the data 48 | X_train_doc, X_test_doc, y_train, y_test = train_test_split(X_doc, y, random_state=random_state) 49 | 50 | # Vectorize the training data 51 | tfidf = TfidfVectorizer(max_df=max_df, max_features=max_features, min_df=min_df) 52 | vectorized_corpus = tfidf.fit_transform(X_train_doc) 53 | X_train = vectorized_corpus.toarray() 54 | 55 | # fit the SVM model 56 | svc.fit(X_train, y_train) 57 | train_score = svc.score(X_train, y_train) 58 | 59 | # score it against the test data 60 | X_test = tfidf.transform(X_test_doc).toarray() 61 | test_score = svc.score(X_test, y_test) 62 | 63 | return (max_df, max_features, min_df, random_state, train_score, test_score) 64 | 65 | 66 | def run_grid_search(X_doc, y, svc, trait): 67 | results = [] 68 | 69 | for max_df in MAX_DF_LIST: 70 | for max_features in MAX_FEATURE_LIST: 71 | for min_df in MIN_DF_LIST: 72 | for random_state in RANDOM_STATE_LIST: 73 | result = run_svc(X_doc, y, svc, max_df, max_features, min_df, random_state) 74 | print result 75 | results.append(result) 76 | 77 | results_df = pd.DataFrame(results, columns=['max_df', 'max_features', 'min_df', 'random_state', 'train_score', 'test_score']) 78 | results_df.to_csv('../models/%s_svc_results.csv' % 
trait) 79 | 80 | results_df = results_df.groupby(by=['max_df', 'max_features', 'min_df', 'random_state'], 81 | as_index=False).mean() 82 | 83 | max_test_score = max(results_df['test_score']) 84 | max_test_results_df = results_df[results_df['test_score'] == max_test_score] 85 | 86 | return max_test_results_df.reset_index() 87 | 88 | 89 | def tiebreaker(svc, max_test_results_df, X_doc, y): 90 | tiebreaker_df = max_test_results_df.copy() 91 | tiebreaker_df['full_score'] = 0.0 92 | for i in tiebreaker_df.index: 93 | max_df = tiebreaker_df['max_df'][i] 94 | max_features = tiebreaker_df['max_features'][i] 95 | min_df = tiebreaker_df['min_df'][i] 96 | random_state = tiebreaker_df['random_state'][i] 97 | 98 | tfidf = TfidfVectorizer(max_df=max_df, max_features=max_features, min_df=min_df) 99 | vectorized_corpus = tfidf.fit_transform(X_doc) 100 | X = vectorized_corpus.toarray() 101 | 102 | svc.fit(X, y) 103 | tiebreaker_df.loc[i, 'full_score'] = svc.score(X, y) # score only this candidate's row 104 | 105 | tiebreaker_df.sort('full_score', ascending=False, inplace=True) # best full-data score first 106 | 107 | return tiebreaker_df.reset_index() 108 | 109 | 110 | def run_winning_model(X_doc, y, max_df, max_features, min_df, svc, trait): 111 | print 112 | print "THE WINNING MODEL FOR %s IS: %s, %s, %s" % (trait, max_df, max_features, min_df) 113 | 114 | tfidf = TfidfVectorizer(max_df=max_df, max_features=max_features, min_df=min_df) 115 | vectorized_corpus = tfidf.fit_transform(X_doc) 116 | 117 | tfidf_pickle_file = open('../models/tfidf_svc_%s.pkl' % trait, 'wb') 118 | pickle.dump(tfidf, tfidf_pickle_file) 119 | tfidf_pickle_file.close() 120 | 121 | X = vectorized_corpus.toarray() 122 | 123 | svc.fit(X, y) 124 | svc_pickle_file = open('../models/svc_%s_final.pkl' % trait, 'wb') 125 | pickle.dump(svc, svc_pickle_file) 126 | svc_pickle_file.close() 127 | 128 | 129 | def main(): 130 | df = load_data() 131 | feature_df = add_features(df) 132 | X_doc = list(feature_df['description_clean']) 133 | svc = LinearSVC() 134 | 135 | for trait in TRAIT_LIST: 136 | y =
feature_df[trait] 137 | max_test_results_df = run_grid_search(X_doc, y, svc, trait) 138 | 139 | if len(max_test_results_df) > 1: 140 | max_test_results_df = tiebreaker(svc, max_test_results_df, X_doc, y) 141 | 142 | max_df = max_test_results_df['max_df'][0] 143 | max_features = max_test_results_df['max_features'][0] 144 | min_df = max_test_results_df['min_df'][0] 145 | 146 | run_winning_model(X_doc=X_doc, y=y, max_df=max_df, 147 | max_features=max_features, min_df=min_df, 148 | svc=svc, trait=trait) 149 | 150 | 151 | if __name__ == "__main__": 152 | main() 153 | -------------------------------------------------------------------------------- /libs/scrape_listings.py: -------------------------------------------------------------------------------- 1 | """ 2 | This script is used to scrape AirBnB's listing pages for their content 3 | * scrapes every listing found in the listings collection 4 | 5 | DEPENDENCIES: 6 | 1) extract_listings_from_search_results.py > MongoDB 'listings' collection 7 | 8 | POTENTIAL ISSUES: 9 | 1) Getting Blocked/Banned: 10 | * While running this script, I got blocked several times. 11 | * At a certain point, I used a VPN from Canada to scrape 12 | * Thankfully, I had a chunk of 3500+ listings that I was able to scrape overnight 13 | 14 | NOTES: make sure mongod running.
use `sudo mongod` in terminal 15 | """ 16 | 17 | from airbnb.airbnblisting import AirBnBListing 18 | import time 19 | 20 | DB_NAME = 'airbnb' 21 | COLL_NAME = 'listings' 22 | 23 | 24 | def main(): 25 | air_listing = AirBnBListing(db_name=DB_NAME, coll_name=COLL_NAME) 26 | 27 | # grab a list of listings that haven't yet been scraped 28 | # based on the existence of the 'dt' field 29 | listing_dicts = list(air_listing.coll.find({'dt': {'$exists': 0}}, {'_id': 1})) 30 | 31 | # for each listing not yet pulled, attempt to scrape & insert into the db 32 | 33 | for listing in listing_dicts: 34 | listing_id = listing['_id'] 35 | air_listing.scrape_and_insert(listing_id=listing_id, overwrite=True) 36 | 37 | # print "Scraping & Adding %s" % listing_id 38 | 39 | time.sleep(3) # so as not to get banned from AirBnB 40 | 41 | 42 | if __name__ == '__main__': 43 | main() 44 | -------------------------------------------------------------------------------- /libs/scrape_neighborhood_list.py: -------------------------------------------------------------------------------- 1 | """ 2 | This script is used to grab neighborhoods from cities 3 | * takes in a file ('../data/city_list.csv') 4 | * grabs the cities we wish to scrape from the file 5 | * scrapes AirBnB's city guide for each of the cities to grab the hoods 6 | 7 | DEPENDENCIES: 8 | 1) '../data/city_list.csv' 9 | 10 | import pandas as pd 11 | df = pd.read_csv('../data/city_list.csv') 12 | print df.head(2) 13 | 14 | city_id city state country 15 | 0 1 San-Francisco CA United-States 16 | 1 2 New-York NY United-States 17 | 18 | POTENTIAL ISSUES: 19 | 1) city_id: 20 | * city_id is unique to this project; it is not based on AirBnB's city ids 21 | 2) neighborhood_id: 22 | * arbitrarily assigns neighborhood_ids based on the order they were scraped 23 | * if content changes, then all dependencies are impacted 24 | * if I were to recreate, I would grab 25 | AirBnB's neighborhood_ids on the scrape and use that field 26 | 27 | NOTES: make sure
mongod running. use `sudo mongod` in terminal 28 | """ 29 | 30 | import pandas as pd 31 | import requests 32 | from bs4 import BeautifulSoup 33 | 34 | NEIGHBORHOOD_URL = 'https://www.airbnb.com/locations/' 35 | CITY_FILEPATH = '../data/city_list.csv' 36 | NEIGHBORHOOD_OUTPUT = '../data/neighborhood_list.csv' 37 | 38 | 39 | def main(): 40 | df = pd.read_csv(CITY_FILEPATH) 41 | city_tuples = [(city_id, city.lower()) for city_id, city in zip(df['city_id'], df['city'])] 42 | 43 | neighborhoods = [] 44 | 45 | for city_id, city in city_tuples: 46 | r = requests.get(NEIGHBORHOOD_URL + city) 47 | soup = BeautifulSoup(r.content) 48 | neighborhood_list_raw = soup.find('div', {'class': 'neighborhood-list'}).find_all('a')[1:] 49 | for hood in neighborhood_list_raw: 50 | hood_name = hood.get_text() 51 | hood_url = hood['href'] 52 | neighborhoods.append((hood_name, hood_url, city_id, city)) 53 | 54 | select_cols = ["neighborhood", "neighborhood_url", "city_id", "city"] 55 | hood_df = pd.DataFrame(neighborhoods, columns=select_cols) 56 | hood_df.to_csv(NEIGHBORHOOD_OUTPUT, index=True, index_label='neighborhood_id') 57 | 58 | 59 | if __name__ == "__main__": 60 | main() 61 | -------------------------------------------------------------------------------- /libs/scrape_neighborhoods.py: -------------------------------------------------------------------------------- 1 | """ 2 | This script is used to scrape the neighborhood pages for its content 3 | * takes in a file ('../data/neighborhood_list.csv') 4 | * scrapes AirBnB's neighborhood guide for neighborhoods 5 | 6 | DEPENDENCIES: 7 | 1) scrape_neighborhood_list.py > '../data/neighborhood_list.csv' 8 | 9 | import pandas as pd 10 | df = pd.read_csv('../data/neighborhood_list.csv') 11 | print df.head(2) 12 | 13 | 14 | neighborhood_id neighborhood neighborhood_url \ 15 | 0 0 Alamo Square /locations/san-francisco/alamo-square 16 | 1 1 Bayview /locations/san-francisco/bayview 17 | 18 | city_id city 19 | 0 1 san-francisco 20 | 1 1 
san-francisco 21 | 22 | 23 | print df.tail(2) 24 | 25 | neighborhood_id neighborhood neighborhood_url \ 26 | 100 100 Williamsburg /locations/new-york/williamsburg 27 | 101 101 Windsor Terrace /locations/new-york/windsor-terrace 28 | 29 | city_id city 30 | 100 2 new-york 31 | 101 2 new-york 32 | 33 | 34 | POTENTIAL ISSUES: 35 | 1) city_id: 36 | * city_id is unique to this project; it is not based on AirBnB's city ids 37 | 2) neighborhood_id: 38 | * arbitrarily assigned neighborhood_ids based on the order scraped 39 | * if content changes, then all dependencies are impacted 40 | 41 | NOTES: make sure mongod running. use `sudo mongod` in terminal 42 | """ 43 | 44 | import pandas as pd 45 | from airbnb.airbnbneighborhood import AirBnBNeighborhood 46 | import time 47 | 48 | DB_NAME = 'airbnb' 49 | COLL_NAME = 'neighborhoods' 50 | NEIGHBORHOOD_FILEPATH = '../data/neighborhood_list.csv' 51 | 52 | 53 | def main(): 54 | air_hood = AirBnBNeighborhood(db_name=DB_NAME, coll_name=COLL_NAME) 55 | 56 | df = pd.read_csv(NEIGHBORHOOD_FILEPATH) 57 | hood_list = df.to_dict('records') 58 | 59 | for hood in hood_list: 60 | air_hood.scrape_and_insert(neighborhood_id=hood['neighborhood_id'], 61 | neighborhood=hood['neighborhood'], 62 | neighborhood_url=hood['neighborhood_url'], 63 | city_id=hood['city_id'], 64 | city=hood['city']) 65 | 66 | # print "%s > %s" % (hood['city'], hood['neighborhood']) 67 | 68 | time.sleep(2.5) # so as not to get banned from AirBnB 69 | 70 | 71 | if __name__ == "__main__": 72 | main() 73 | -------------------------------------------------------------------------------- /libs/scrape_search_results.py: -------------------------------------------------------------------------------- 1 | """ 2 | This script is used to scrape AirBnB's search result pages for their content 3 | * takes in a file ('../data/city_list.csv') 4 | * generates a sampling of dates 5 | * scrapes AirBnB's search results 6 | 7 | DEPENDENCIES: 8 | 1) '../data/city_list.csv' 9 | 10 | import pandas as pd 11 | df =
pd.read_csv('../data/city_list.csv') 12 | print df.head(2) 13 | 14 | city_id city state country 15 | 0 1 San-Francisco CA United-States 16 | 1 2 New-York NY United-States 17 | 18 | NOTES: make sure mongod running. use `sudo mongod` in terminal 19 | """ 20 | 21 | import pandas as pd 22 | from pandas.tseries.offsets import DateOffset 23 | from airbnb.airbnbsearchresult import AirBnBSearchResult 24 | import time 25 | 26 | START_DATE = '06-15-2015' 27 | 28 | # NUM_WEEKS = 1 # test runs 29 | NUM_WEEKS = 26 # all other runs 30 | 31 | # NUM_GUESTS = {1} # test runs 32 | # NUM_GUESTS = {1, 2, 4} # used for 1st run 33 | NUM_GUESTS = {1, 4} # used for subsequent runs 34 | 35 | # NUM_PAGES = 1 # test runs 36 | # NUM_PAGES = 10 # used for 1st run 37 | NUM_PAGES = 25 # used for subsequent runs 38 | 39 | CITY_FILEPATH = '../data/city_list.csv' 40 | DB_NAME = 'airbnb' 41 | COLL_NAME = 'search' 42 | 43 | 44 | def import_city_list(filepath): 45 | df = pd.read_csv(filepath) 46 | return df.to_dict('records') 47 | 48 | 49 | def create_date_list(): 50 | dates = pd.date_range(START_DATE, periods=7*NUM_WEEKS, freq='D') 51 | day_of_weeks = [1, 4] # we want only Tuesdays & Fridays 52 | cond = [d.dayofweek in day_of_weeks for d in dates] 53 | sampled_dates = dates[cond] 54 | # for the purposes of scraping, checkout will always be 2 days later 55 | return sampled_dates 56 | 57 | 58 | def main(): 59 | city_list = import_city_list(CITY_FILEPATH) 60 | date_list = create_date_list() 61 | 62 | air_search = AirBnBSearchResult(db_name=DB_NAME, coll_name=COLL_NAME) 63 | 64 | for city in city_list: 65 | for num_guests in NUM_GUESTS: 66 | for date in date_list: 67 | params = {'city': city['city'], 68 | 'state': city['state'], 69 | 'country': city['country'], 70 | 'checkin': date.date().strftime('%m-%d-%Y'), 71 | 'checkout': (date + DateOffset(2)).strftime('%m-%d-%Y'), 72 | 'guests': num_guests, 73 | 'start_page': 1, 74 | 'end_page': NUM_PAGES 75 | } 76 | 77 | air_search.set_params(params) 78 | 
air_search.scrape_all_results(pause_between_pages=1) 79 | 80 | # print "%s > %s > %s" % (city['city'], num_guests, date.date().strftime('%m-%d-%Y')) 81 | 82 | time.sleep(2.5) # so as not to get banned from AirBnB 83 | 84 | 85 | if __name__ == "__main__": 86 | main() 87 | -------------------------------------------------------------------------------- /model_results/artsy_results.csv: -------------------------------------------------------------------------------- 1 | ,max_df,max_features,min_df,random_state,train_score,test_score 2 | 0,0.5,500,1,1,0.7336134453781512,0.7115869017632241 3 | 1,0.5,500,1,42,0.7243697478991596,0.7090680100755667 4 | 2,0.5,500,1,1337,0.7331932773109243,0.6561712846347607 5 | 3,0.5,500,2,1,0.7319327731092437,0.7128463476070529 6 | 4,0.5,500,2,42,0.7239495798319328,0.7115869017632241 7 | 5,0.5,500,2,1337,0.7336134453781512,0.6561712846347607 8 | 6,0.5,1000,1,1,0.7970588235294118,0.7619647355163728 9 | 7,0.5,1000,1,42,0.7941176470588235,0.760705289672544 10 | 8,0.5,1000,1,1337,0.8012605042016807,0.6952141057934509 11 | 9,0.5,1000,2,1,0.7966386554621848,0.7619647355163728 12 | 10,0.5,1000,2,42,0.7941176470588235,0.760705289672544 13 | 11,0.5,1000,2,1337,0.8016806722689076,0.6964735516372796 14 | 12,0.5,2000,1,1,0.8189075630252101,0.7695214105793451 15 | 13,0.5,2000,1,42,0.8184873949579832,0.7707808564231738 16 | 14,0.5,2000,1,1337,0.8273109243697478,0.716624685138539 17 | 15,0.5,2000,2,1,0.8180672268907563,0.7682619647355163 18 | 16,0.5,2000,2,42,0.8172268907563025,0.7732997481108312 19 | 17,0.5,2000,2,1337,0.8264705882352941,0.7178841309823678 20 | 18,0.5,3000,1,1,0.8226890756302521,0.7531486146095718 21 | 19,0.5,3000,1,42,0.8218487394957983,0.760705289672544 22 | 20,0.5,3000,1,1337,0.8189075630252101,0.7002518891687658 23 | 21,0.5,3000,2,1,0.8218487394957983,0.7493702770780857 24 | 22,0.5,3000,2,42,0.8205882352941176,0.7581863979848866 25 | 23,0.5,3000,2,1337,0.8189075630252101,0.7015113350125944 26 |
24,0.5,4000,1,1,0.8151260504201681,0.739294710327456 27 | 25,0.5,4000,1,42,0.8088235294117647,0.7405541561712846 28 | 26,0.5,4000,1,1337,0.8033613445378152,0.6876574307304786 29 | 27,0.5,4000,2,1,0.8142857142857143,0.743073047858942 30 | 28,0.5,4000,2,42,0.8084033613445378,0.7418136020151134 31 | 29,0.5,4000,2,1337,0.8033613445378152,0.6863979848866498 32 | 30,0.5,5000,1,1,0.7974789915966387,0.7317380352644837 33 | 31,0.5,5000,1,42,0.7924369747899159,0.7304785894206549 34 | 32,0.5,5000,1,1337,0.7852941176470588,0.6675062972292192 35 | 33,0.5,5000,2,1,0.7970588235294118,0.7317380352644837 36 | 34,0.5,5000,2,42,0.7924369747899159,0.7304785894206549 37 | 35,0.5,5000,2,1337,0.7865546218487395,0.6675062972292192 38 | 36,0.5,8000,1,1,0.746218487394958,0.6914357682619647 39 | 37,0.5,8000,1,42,0.7394957983193278,0.6889168765743073 40 | 38,0.5,8000,1,1337,0.730672268907563,0.6309823677581864 41 | 39,0.5,8000,2,1,0.7457983193277311,0.6914357682619647 42 | 40,0.5,8000,2,42,0.7403361344537815,0.6889168765743073 43 | 41,0.5,8000,2,1337,0.7315126050420168,0.628463476070529 44 | 42,0.66,500,1,1,0.726890756302521,0.7103274559193955 45 | 43,0.66,500,1,42,0.719327731092437,0.7103274559193955 46 | 44,0.66,500,1,1337,0.7319327731092437,0.654911838790932 47 | 45,0.66,500,2,1,0.7273109243697479,0.7141057934508817 48 | 46,0.66,500,2,42,0.7197478991596639,0.7103274559193955 49 | 47,0.66,500,2,1337,0.7323529411764705,0.654911838790932 50 | 48,0.66,1000,1,1,0.792016806722689,0.7531486146095718 51 | 49,0.66,1000,1,42,0.7886554621848739,0.7544080604534005 52 | 50,0.66,1000,1,1337,0.7932773109243697,0.6889168765743073 53 | 51,0.66,1000,2,1,0.7924369747899159,0.7531486146095718 54 | 52,0.66,1000,2,42,0.7886554621848739,0.7544080604534005 55 | 53,0.66,1000,2,1337,0.7915966386554621,0.6926952141057935 56 | 54,0.66,2000,1,1,0.8172268907563025,0.7644836272040302 57 | 55,0.66,2000,1,42,0.8117647058823529,0.7657430730478589 58 | 56,0.66,2000,1,1337,0.8180672268907563,0.707808564231738 59 | 
57,0.66,2000,2,1,0.8168067226890756,0.7670025188916877 60 | 58,0.66,2000,2,42,0.8109243697478992,0.7632241813602015 61 | 59,0.66,2000,2,1337,0.8176470588235294,0.7090680100755667 62 | 60,0.66,3000,1,1,0.8180672268907563,0.7455919395465995 63 | 61,0.66,3000,1,42,0.8138655462184874,0.7531486146095718 64 | 62,0.66,3000,1,1337,0.8105042016806723,0.6964735516372796 65 | 63,0.66,3000,2,1,0.8184873949579832,0.7455919395465995 66 | 64,0.66,3000,2,42,0.8134453781512605,0.7531486146095718 67 | 65,0.66,3000,2,1337,0.811344537815126,0.6952141057934509 68 | 66,0.66,4000,1,1,0.8025210084033614,0.7380352644836272 69 | 67,0.66,4000,1,42,0.7978991596638656,0.7329974811083123 70 | 68,0.66,4000,1,1337,0.7894957983193277,0.6712846347607053 71 | 69,0.66,4000,2,1,0.8025210084033614,0.7380352644836272 72 | 70,0.66,4000,2,42,0.7962184873949579,0.7329974811083123 73 | 71,0.66,4000,2,1337,0.7903361344537815,0.672544080604534 74 | 72,0.66,5000,1,1,0.7831932773109244,0.7279596977329975 75 | 73,0.66,5000,1,42,0.776890756302521,0.7191435768261965 76 | 74,0.66,5000,1,1337,0.769327731092437,0.6536523929471033 77 | 75,0.66,5000,2,1,0.7831932773109244,0.7279596977329975 78 | 76,0.66,5000,2,42,0.776890756302521,0.7204030226700252 79 | 77,0.66,5000,2,1337,0.769327731092437,0.6536523929471033 80 | 78,0.66,8000,1,1,0.726890756302521,0.6801007556675063 81 | 79,0.66,8000,1,42,0.7235294117647059,0.672544080604534 82 | 80,0.66,8000,1,1337,0.7189075630252101,0.6234256926952141 83 | 81,0.66,8000,2,1,0.7294117647058823,0.6788413098236776 84 | 82,0.66,8000,2,42,0.7239495798319328,0.672544080604534 85 | 83,0.66,8000,2,1337,0.7197478991596639,0.6196473551637279 86 | 84,0.75,500,1,1,0.7256302521008403,0.7141057934508817 87 | 85,0.75,500,1,42,0.719327731092437,0.707808564231738 88 | 86,0.75,500,1,1337,0.7310924369747899,0.6523929471032746 89 | 87,0.75,500,2,1,0.7256302521008403,0.7141057934508817 90 | 88,0.75,500,2,42,0.719327731092437,0.707808564231738 91 | 89,0.75,500,2,1337,0.7310924369747899,0.6523929471032746 
92 | 90,0.75,1000,1,1,0.7915966386554621,0.7518891687657431 93 | 91,0.75,1000,1,42,0.7848739495798319,0.7518891687657431 94 | 92,0.75,1000,1,1337,0.7907563025210084,0.6926952141057935 95 | 93,0.75,1000,2,1,0.7915966386554621,0.7518891687657431 96 | 94,0.75,1000,2,42,0.7852941176470588,0.7518891687657431 97 | 95,0.75,1000,2,1337,0.7899159663865546,0.690176322418136 98 | 96,0.75,2000,1,1,0.8151260504201681,0.760705289672544 99 | 97,0.75,2000,1,42,0.8079831932773109,0.7644836272040302 100 | 98,0.75,2000,1,1337,0.8168067226890756,0.7040302267002518 101 | 99,0.75,2000,2,1,0.8163865546218487,0.7581863979848866 102 | 100,0.75,2000,2,42,0.8079831932773109,0.7632241813602015 103 | 101,0.75,2000,2,1337,0.8168067226890756,0.7052896725440806 104 | 102,0.75,3000,1,1,0.815546218487395,0.7455919395465995 105 | 103,0.75,3000,1,42,0.8100840336134454,0.7518891687657431 106 | 104,0.75,3000,1,1337,0.8100840336134454,0.690176322418136 107 | 105,0.75,3000,2,1,0.8147058823529412,0.7455919395465995 108 | 106,0.75,3000,2,42,0.8100840336134454,0.7518891687657431 109 | 107,0.75,3000,2,1337,0.8100840336134454,0.690176322418136 110 | 108,0.75,4000,1,1,0.8021008403361345,0.7380352644836272 111 | 109,0.75,4000,1,42,0.7936974789915966,0.7329974811083123 112 | 110,0.75,4000,1,1337,0.7907563025210084,0.6687657430730478 113 | 111,0.75,4000,2,1,0.8008403361344538,0.739294710327456 114 | 112,0.75,4000,2,42,0.7941176470588235,0.7329974811083123 115 | 113,0.75,4000,2,1337,0.7903361344537815,0.6700251889168766 116 | 114,0.75,5000,1,1,0.7773109243697479,0.7279596977329975 117 | 115,0.75,5000,1,42,0.776890756302521,0.716624685138539 118 | 116,0.75,5000,1,1337,0.7655462184873949,0.6536523929471033 119 | 117,0.75,5000,2,1,0.7789915966386555,0.7292191435768262 120 | 118,0.75,5000,2,42,0.7747899159663866,0.7153652392947103 121 | 119,0.75,5000,2,1337,0.7655462184873949,0.6536523929471033 122 | 120,0.75,8000,1,1,0.7256302521008403,0.6788413098236776 123 | 121,0.75,8000,1,42,0.7222689075630252,0.6712846347607053 
124 | 122,0.75,8000,1,1337,0.7172268907563025,0.6221662468513854 125 | 123,0.75,8000,2,1,0.7252100840336134,0.6775818639798489 126 | 124,0.75,8000,2,42,0.7201680672268908,0.6738035264483627 127 | 125,0.75,8000,2,1337,0.7189075630252101,0.6196473551637279 128 | 126,0.83,500,1,1,0.723109243697479,0.7115869017632241 129 | 127,0.83,500,1,42,0.7172268907563025,0.707808564231738 130 | 128,0.83,500,1,1337,0.7289915966386554,0.6486146095717884 131 | 129,0.83,500,2,1,0.723109243697479,0.7115869017632241 132 | 130,0.83,500,2,42,0.7172268907563025,0.707808564231738 133 | 131,0.83,500,2,1337,0.7281512605042016,0.6473551637279596 134 | 132,0.83,1000,1,1,0.7894957983193277,0.7506297229219143 135 | 133,0.83,1000,1,42,0.7819327731092437,0.7518891687657431 136 | 134,0.83,1000,1,1337,0.7886554621848739,0.6914357682619647 137 | 135,0.83,1000,2,1,0.7894957983193277,0.7506297229219143 138 | 136,0.83,1000,2,42,0.7819327731092437,0.7518891687657431 139 | 137,0.83,1000,2,1337,0.7878151260504201,0.690176322418136 140 | 138,0.83,2000,1,1,0.8134453781512605,0.7581863979848866 141 | 139,0.83,2000,1,42,0.8067226890756303,0.760705289672544 142 | 140,0.83,2000,1,1337,0.815546218487395,0.7040302267002518 143 | 141,0.83,2000,2,1,0.8138655462184874,0.7581863979848866 144 | 142,0.83,2000,2,42,0.8071428571428572,0.760705289672544 145 | 143,0.83,2000,2,1337,0.8138655462184874,0.7015113350125944 146 | 144,0.83,3000,1,1,0.8138655462184874,0.743073047858942 147 | 145,0.83,3000,1,42,0.8100840336134454,0.7481108312342569 148 | 146,0.83,3000,1,1337,0.8079831932773109,0.6889168765743073 149 | 147,0.83,3000,2,1,0.8134453781512605,0.743073047858942 150 | 148,0.83,3000,2,42,0.8092436974789916,0.7455919395465995 151 | 149,0.83,3000,2,1337,0.8084033613445378,0.6889168765743073 152 | 150,0.83,4000,1,1,0.7978991596638656,0.7355163727959698 153 | 151,0.83,4000,1,42,0.7924369747899159,0.7329974811083123 154 | 152,0.83,4000,1,1337,0.7869747899159664,0.6662468513853904 155 | 
153,0.83,4000,2,1,0.7978991596638656,0.7355163727959698 156 | 154,0.83,4000,2,42,0.7915966386554621,0.7304785894206549 157 | 155,0.83,4000,2,1337,0.7878151260504201,0.6700251889168766 158 | 156,0.83,5000,1,1,0.7760504201680672,0.72544080604534 159 | 157,0.83,5000,1,42,0.7722689075630252,0.7103274559193955 160 | 158,0.83,5000,1,1337,0.7626050420168067,0.6486146095717884 161 | 159,0.83,5000,2,1,0.7764705882352941,0.7279596977329975 162 | 160,0.83,5000,2,42,0.7714285714285715,0.7115869017632241 163 | 161,0.83,5000,2,1337,0.7626050420168067,0.6486146095717884 164 | 162,0.83,8000,1,1,0.7222689075630252,0.6788413098236776 165 | 163,0.83,8000,1,42,0.715546218487395,0.6700251889168766 166 | 164,0.83,8000,1,1337,0.7147058823529412,0.6196473551637279 167 | 165,0.83,8000,2,1,0.7226890756302521,0.6775818639798489 168 | 166,0.83,8000,2,42,0.715546218487395,0.6687657430730478 169 | 167,0.83,8000,2,1337,0.715546218487395,0.6196473551637279 170 | 168,0.9,500,1,1,0.7239495798319328,0.7103274559193955 171 | 169,0.9,500,1,42,0.7168067226890756,0.707808564231738 172 | 170,0.9,500,1,1337,0.7277310924369748,0.6473551637279596 173 | 171,0.9,500,2,1,0.7239495798319328,0.7103274559193955 174 | 172,0.9,500,2,42,0.7168067226890756,0.707808564231738 175 | 173,0.9,500,2,1337,0.7277310924369748,0.6473551637279596 176 | 174,0.9,1000,1,1,0.7890756302521008,0.7506297229219143 177 | 175,0.9,1000,1,42,0.7810924369747899,0.7518891687657431 178 | 176,0.9,1000,1,1337,0.7869747899159664,0.690176322418136 179 | 177,0.9,1000,2,1,0.7865546218487395,0.7493702770780857 180 | 178,0.9,1000,2,42,0.7819327731092437,0.7518891687657431 181 | 179,0.9,1000,2,1337,0.7886554621848739,0.6914357682619647 182 | 180,0.9,2000,1,1,0.8138655462184874,0.7544080604534005 183 | 181,0.9,2000,1,42,0.807563025210084,0.7594458438287154 184 | 182,0.9,2000,1,1337,0.8138655462184874,0.7015113350125944 185 | 183,0.9,2000,2,1,0.8134453781512605,0.7594458438287154 186 | 184,0.9,2000,2,42,0.8063025210084034,0.760705289672544 187 | 
185,0.9,2000,2,1337,0.8151260504201681,0.7015113350125944 188 | 186,0.9,3000,1,1,0.8134453781512605,0.7443324937027708 189 | 187,0.9,3000,1,42,0.8084033613445378,0.7481108312342569 190 | 188,0.9,3000,1,1337,0.8079831932773109,0.6889168765743073 191 | 189,0.9,3000,2,1,0.8130252100840336,0.743073047858942 192 | 190,0.9,3000,2,42,0.8088235294117647,0.7481108312342569 193 | 191,0.9,3000,2,1337,0.8079831932773109,0.6889168765743073 194 | 192,0.9,4000,1,1,0.7987394957983194,0.7367758186397985 195 | 193,0.9,4000,1,42,0.7903361344537815,0.7304785894206549 196 | 194,0.9,4000,1,1337,0.7857142857142857,0.6662468513853904 197 | 195,0.9,4000,2,1,0.7978991596638656,0.7329974811083123 198 | 196,0.9,4000,2,42,0.7903361344537815,0.7304785894206549 199 | 197,0.9,4000,2,1337,0.7857142857142857,0.663727959697733 200 | 198,0.9,5000,1,1,0.7764705882352941,0.7279596977329975 201 | 199,0.9,5000,1,42,0.7705882352941177,0.7090680100755667 202 | 200,0.9,5000,1,1337,0.7605042016806722,0.6486146095717884 203 | 201,0.9,5000,2,1,0.7756302521008404,0.7267002518891688 204 | 202,0.9,5000,2,42,0.7722689075630252,0.7115869017632241 205 | 203,0.9,5000,2,1337,0.761344537815126,0.6486146095717884 206 | 204,0.9,8000,1,1,0.7205882352941176,0.6775818639798489 207 | 205,0.9,8000,1,42,0.715546218487395,0.6700251889168766 208 | 206,0.9,8000,1,1337,0.7138655462184874,0.6196473551637279 209 | 207,0.9,8000,2,1,0.7218487394957983,0.6763224181360201 210 | 208,0.9,8000,2,42,0.7142857142857143,0.6687657430730478 211 | 209,0.9,8000,2,1337,0.7151260504201681,0.6196473551637279 212 | 210,1.0,500,1,1,0.7075630252100841,0.698992443324937 213 | 211,1.0,500,1,42,0.7037815126050421,0.6914357682619647 214 | 212,1.0,500,1,1337,0.7159663865546219,0.6322418136020151 215 | 213,1.0,500,2,1,0.7075630252100841,0.698992443324937 216 | 214,1.0,500,2,42,0.7037815126050421,0.6914357682619647 217 | 215,1.0,500,2,1337,0.715546218487395,0.6322418136020151 218 | 216,1.0,1000,1,1,0.7722689075630252,0.7405541561712846 219 | 
217,1.0,1000,1,42,0.7689075630252101,0.7317380352644837 220 | 218,1.0,1000,1,1337,0.7617647058823529,0.6675062972292192 221 | 219,1.0,1000,2,1,0.7726890756302521,0.7405541561712846 222 | 220,1.0,1000,2,42,0.7680672268907563,0.7317380352644837 223 | 221,1.0,1000,2,1337,0.7659663865546219,0.6700251889168766 224 | 222,1.0,2000,1,1,0.8004201680672269,0.7443324937027708 225 | 223,1.0,2000,1,42,0.7899159663865546,0.743073047858942 226 | 224,1.0,2000,1,1337,0.7911764705882353,0.6838790931989924 227 | 225,1.0,2000,2,1,0.8008403361344538,0.7418136020151134 228 | 226,1.0,2000,2,42,0.7873949579831933,0.7443324937027708 229 | 227,1.0,2000,2,1337,0.792016806722689,0.6838790931989924 230 | 228,1.0,3000,1,1,0.7966386554621848,0.7317380352644837 231 | 229,1.0,3000,1,42,0.7894957983193277,0.7329974811083123 232 | 230,1.0,3000,1,1337,0.7848739495798319,0.6712846347607053 233 | 231,1.0,3000,2,1,0.7970588235294118,0.7317380352644837 234 | 232,1.0,3000,2,42,0.7903361344537815,0.7317380352644837 235 | 233,1.0,3000,2,1337,0.7852941176470588,0.6738035264483627 236 | 234,1.0,4000,1,1,0.7777310924369748,0.7292191435768262 237 | 235,1.0,4000,1,42,0.7676470588235295,0.7103274559193955 238 | 236,1.0,4000,1,1337,0.761344537815126,0.6473551637279596 239 | 237,1.0,4000,2,1,0.7760504201680672,0.7292191435768262 240 | 238,1.0,4000,2,42,0.7680672268907563,0.7115869017632241 241 | 239,1.0,4000,2,1337,0.7617647058823529,0.6473551637279596 242 | 240,1.0,5000,1,1,0.7529411764705882,0.7027707808564232 243 | 241,1.0,5000,1,42,0.7432773109243698,0.6914357682619647 244 | 242,1.0,5000,1,1337,0.7407563025210084,0.6397984886649875 245 | 243,1.0,5000,2,1,0.7529411764705882,0.7027707808564232 246 | 244,1.0,5000,2,42,0.7436974789915967,0.690176322418136 247 | 245,1.0,5000,2,1337,0.7411764705882353,0.6397984886649875 248 | 246,1.0,8000,1,1,0.7058823529411765,0.6675062972292192 249 | 247,1.0,8000,1,42,0.6983193277310924,0.6586901763224181 250 | 248,1.0,8000,1,1337,0.6949579831932773,0.6083123425692695 251 | 
249,1.0,8000,2,1,0.7088235294117647,0.6675062972292192 252 | 250,1.0,8000,2,42,0.6974789915966386,0.6586901763224181 253 | 251,1.0,8000,2,1337,0.6932773109243697,0.6083123425692695 254 | -------------------------------------------------------------------------------- /model_results/artsy_svc_results.csv: -------------------------------------------------------------------------------- 1 | ,max_df,max_features,min_df,random_state,train_score,test_score 2 | 0,0.9,8000,2,55,0.996218487394958,0.8161209068010076 3 | -------------------------------------------------------------------------------- /model_results/dining_results.csv: -------------------------------------------------------------------------------- 1 | ,max_df,max_features,min_df,random_state,train_score,test_score 2 | 0,0.5,500,1,1,0.6869747899159664,0.6700251889168766 3 | 1,0.5,500,1,42,0.688655462184874,0.6649874055415617 4 | 2,0.5,500,1,1337,0.6798319327731093,0.6372795969773299 5 | 3,0.5,500,2,1,0.6873949579831933,0.6675062972292192 6 | 4,0.5,500,2,42,0.6890756302521008,0.6675062972292192 7 | 5,0.5,500,2,1337,0.6798319327731093,0.6372795969773299 8 | 6,0.5,1000,1,1,0.7474789915966387,0.6977329974811083 9 | 7,0.5,1000,1,42,0.7378151260504202,0.698992443324937 10 | 8,0.5,1000,1,1337,0.7432773109243698,0.6763224181360201 11 | 9,0.5,1000,2,1,0.7483193277310924,0.6952141057934509 12 | 10,0.5,1000,2,42,0.7378151260504202,0.698992443324937 13 | 11,0.5,1000,2,1337,0.7453781512605042,0.6763224181360201 14 | 12,0.5,2000,1,1,0.7890756302521008,0.7204030226700252 15 | 13,0.5,2000,1,42,0.7798319327731092,0.7367758186397985 16 | 14,0.5,2000,1,1337,0.788235294117647,0.7115869017632241 17 | 15,0.5,2000,2,1,0.7886554621848739,0.7229219143576826 18 | 16,0.5,2000,2,42,0.7827731092436975,0.7342569269521411 19 | 17,0.5,2000,2,1337,0.7886554621848739,0.7128463476070529 20 | 18,0.5,3000,1,1,0.7936974789915966,0.7103274559193955 21 | 19,0.5,3000,1,42,0.7827731092436975,0.7267002518891688 22 | 
20,0.5,3000,1,1337,0.7861344537815126,0.7052896725440806 23 | 21,0.5,3000,2,1,0.792016806722689,0.7090680100755667 24 | 22,0.5,3000,2,42,0.784453781512605,0.7279596977329975 25 | 23,0.5,3000,2,1337,0.7861344537815126,0.707808564231738 26 | 24,0.5,4000,1,1,0.776890756302521,0.6964735516372796 27 | 25,0.5,4000,1,42,0.7689075630252101,0.7052896725440806 28 | 26,0.5,4000,1,1337,0.765126050420168,0.6952141057934509 29 | 27,0.5,4000,2,1,0.7777310924369748,0.698992443324937 30 | 28,0.5,4000,2,42,0.7676470588235295,0.7052896725440806 31 | 29,0.5,4000,2,1337,0.7642857142857142,0.6914357682619647 32 | 30,0.5,5000,1,1,0.7596638655462185,0.6851385390428212 33 | 31,0.5,5000,1,42,0.7470588235294118,0.6914357682619647 34 | 32,0.5,5000,1,1337,0.7457983193277311,0.6738035264483627 35 | 33,0.5,5000,2,1,0.7592436974789916,0.6851385390428212 36 | 34,0.5,5000,2,42,0.7474789915966387,0.6926952141057935 37 | 35,0.5,5000,2,1337,0.7466386554621849,0.6738035264483627 38 | 36,0.5,8000,1,1,0.7025210084033613,0.654911838790932 39 | 37,0.5,8000,1,42,0.6857142857142857,0.6511335012594458 40 | 38,0.5,8000,1,1337,0.6920168067226891,0.6410579345088161 41 | 39,0.5,8000,2,1,0.7033613445378152,0.654911838790932 42 | 40,0.5,8000,2,42,0.6899159663865546,0.6523929471032746 43 | 41,0.5,8000,2,1337,0.6911764705882353,0.6397984886649875 44 | 42,0.66,500,1,1,0.6722689075630253,0.6612090680100756 45 | 43,0.66,500,1,42,0.6785714285714286,0.6511335012594458 46 | 44,0.66,500,1,1337,0.6701680672268907,0.6360201511335013 47 | 45,0.66,500,2,1,0.6785714285714286,0.663727959697733 48 | 46,0.66,500,2,42,0.6773109243697479,0.6498740554156172 49 | 47,0.66,500,2,1337,0.6722689075630253,0.6372795969773299 50 | 48,0.66,1000,1,1,0.7382352941176471,0.6863979848866498 51 | 49,0.66,1000,1,42,0.7310924369747899,0.698992443324937 52 | 50,0.66,1000,1,1337,0.7411764705882353,0.672544080604534 53 | 51,0.66,1000,2,1,0.7373949579831933,0.6863979848866498 54 | 52,0.66,1000,2,42,0.7310924369747899,0.698992443324937 55 | 
53,0.66,1000,2,1337,0.7390756302521009,0.672544080604534 56 | 54,0.66,2000,1,1,0.7840336134453781,0.7128463476070529 57 | 55,0.66,2000,1,42,0.7743697478991597,0.7216624685138538 58 | 56,0.66,2000,1,1337,0.7815126050420168,0.7052896725440806 59 | 57,0.66,2000,2,1,0.7827731092436975,0.7128463476070529 60 | 58,0.66,2000,2,42,0.7726890756302521,0.7229219143576826 61 | 59,0.66,2000,2,1337,0.7802521008403361,0.7052896725440806 62 | 60,0.66,3000,1,1,0.7878151260504201,0.707808564231738 63 | 61,0.66,3000,1,42,0.7752100840336135,0.716624685138539 64 | 62,0.66,3000,1,1337,0.7789915966386555,0.7015113350125944 65 | 63,0.66,3000,2,1,0.7861344537815126,0.707808564231738 66 | 64,0.66,3000,2,42,0.7756302521008404,0.716624685138539 67 | 65,0.66,3000,2,1337,0.780672268907563,0.7002518891687658 68 | 66,0.66,4000,1,1,0.7726890756302521,0.6914357682619647 69 | 67,0.66,4000,1,42,0.7605042016806722,0.7002518891687658 70 | 68,0.66,4000,1,1337,0.7584033613445378,0.6838790931989924 71 | 69,0.66,4000,2,1,0.7705882352941177,0.6939546599496221 72 | 70,0.66,4000,2,42,0.7596638655462185,0.7027707808564232 73 | 71,0.66,4000,2,1337,0.757563025210084,0.6851385390428212 74 | 72,0.66,5000,1,1,0.7466386554621849,0.6851385390428212 75 | 73,0.66,5000,1,42,0.7323529411764705,0.6788413098236776 76 | 74,0.66,5000,1,1337,0.7294117647058823,0.6675062972292192 77 | 75,0.66,5000,2,1,0.7470588235294118,0.6851385390428212 78 | 76,0.66,5000,2,42,0.7331932773109243,0.6775818639798489 79 | 77,0.66,5000,2,1337,0.7310924369747899,0.6675062972292192 80 | 78,0.66,8000,1,1,0.6882352941176471,0.6511335012594458 81 | 79,0.66,8000,1,42,0.6760504201680673,0.6473551637279596 82 | 80,0.66,8000,1,1337,0.6785714285714286,0.6372795969773299 83 | 81,0.66,8000,2,1,0.6907563025210084,0.6511335012594458 84 | 82,0.66,8000,2,42,0.6794117647058824,0.6473551637279596 85 | 83,0.66,8000,2,1337,0.676890756302521,0.6360201511335013 86 | 84,0.75,500,1,1,0.6697478991596638,0.6574307304785895 87 | 
85,0.75,500,1,42,0.6760504201680673,0.6523929471032746 88 | 86,0.75,500,1,1337,0.6714285714285714,0.6322418136020151 89 | 87,0.75,500,2,1,0.6697478991596638,0.6574307304785895 90 | 88,0.75,500,2,42,0.6760504201680673,0.6523929471032746 91 | 89,0.75,500,2,1337,0.6714285714285714,0.6322418136020151 92 | 90,0.75,1000,1,1,0.7357142857142858,0.6863979848866498 93 | 91,0.75,1000,1,42,0.7302521008403361,0.698992443324937 94 | 92,0.75,1000,1,1337,0.7357142857142858,0.6675062972292192 95 | 93,0.75,1000,2,1,0.7361344537815127,0.6863979848866498 96 | 94,0.75,1000,2,42,0.7302521008403361,0.6977329974811083 97 | 95,0.75,1000,2,1337,0.7361344537815127,0.6675062972292192 98 | 96,0.75,2000,1,1,0.7836134453781513,0.7103274559193955 99 | 97,0.75,2000,1,42,0.773109243697479,0.7204030226700252 100 | 98,0.75,2000,1,1337,0.7810924369747899,0.7015113350125944 101 | 99,0.75,2000,2,1,0.7831932773109244,0.707808564231738 102 | 100,0.75,2000,2,42,0.7714285714285715,0.7204030226700252 103 | 101,0.75,2000,2,1337,0.7802521008403361,0.7002518891687658 104 | 102,0.75,3000,1,1,0.7852941176470588,0.707808564231738 105 | 103,0.75,3000,1,42,0.7739495798319328,0.7128463476070529 106 | 104,0.75,3000,1,1337,0.7785714285714286,0.7002518891687658 107 | 105,0.75,3000,2,1,0.7840336134453781,0.707808564231738 108 | 106,0.75,3000,2,42,0.7735294117647059,0.7128463476070529 109 | 107,0.75,3000,2,1337,0.7785714285714286,0.7002518891687658 110 | 108,0.75,4000,1,1,0.7680672268907563,0.6914357682619647 111 | 109,0.75,4000,1,42,0.757563025210084,0.7002518891687658 112 | 110,0.75,4000,1,1337,0.7542016806722689,0.6826196473551638 113 | 111,0.75,4000,2,1,0.7689075630252101,0.6914357682619647 114 | 112,0.75,4000,2,42,0.7567226890756302,0.7002518891687658 115 | 113,0.75,4000,2,1337,0.7558823529411764,0.6838790931989924 116 | 114,0.75,5000,1,1,0.7445378151260504,0.6826196473551638 117 | 115,0.75,5000,1,42,0.7285714285714285,0.6738035264483627 118 | 116,0.75,5000,1,1337,0.7281512605042016,0.6662468513853904 119 | 
117,0.75,5000,2,1,0.7453781512605042,0.681360201511335 120 | 118,0.75,5000,2,42,0.7315126050420168,0.672544080604534 121 | 119,0.75,5000,2,1337,0.7294117647058823,0.6662468513853904 122 | 120,0.75,8000,1,1,0.6869747899159664,0.6486146095717884 123 | 121,0.75,8000,1,42,0.6743697478991597,0.6473551637279596 124 | 122,0.75,8000,1,1337,0.6739495798319328,0.6335012594458438 125 | 123,0.75,8000,2,1,0.6873949579831933,0.6486146095717884 126 | 124,0.75,8000,2,42,0.676890756302521,0.646095717884131 127 | 125,0.75,8000,2,1337,0.6739495798319328,0.6360201511335013 128 | 126,0.83,500,1,1,0.66890756302521,0.6574307304785895 129 | 127,0.83,500,1,42,0.6710084033613445,0.6486146095717884 130 | 128,0.83,500,1,1337,0.6697478991596638,0.6309823677581864 131 | 129,0.83,500,2,1,0.66890756302521,0.6574307304785895 132 | 130,0.83,500,2,42,0.6710084033613445,0.6486146095717884 133 | 131,0.83,500,2,1337,0.6693277310924369,0.6297229219143576 134 | 132,0.83,1000,1,1,0.7340336134453781,0.6838790931989924 135 | 133,0.83,1000,1,42,0.7285714285714285,0.6926952141057935 136 | 134,0.83,1000,1,1337,0.7310924369747899,0.6687657430730478 137 | 135,0.83,1000,2,1,0.7340336134453781,0.6838790931989924 138 | 136,0.83,1000,2,42,0.7281512605042016,0.6926952141057935 139 | 137,0.83,1000,2,1337,0.7319327731092437,0.6712846347607053 140 | 138,0.83,2000,1,1,0.7823529411764706,0.7065491183879093 141 | 139,0.83,2000,1,42,0.7689075630252101,0.7178841309823678 142 | 140,0.83,2000,1,1337,0.7773109243697479,0.7002518891687658 143 | 141,0.83,2000,2,1,0.7823529411764706,0.707808564231738 144 | 142,0.83,2000,2,42,0.7684873949579832,0.7191435768261965 145 | 143,0.83,2000,2,1337,0.776890756302521,0.7027707808564232 146 | 144,0.83,3000,1,1,0.7798319327731092,0.7052896725440806 147 | 145,0.83,3000,1,42,0.7705882352941177,0.707808564231738 148 | 146,0.83,3000,1,1337,0.7756302521008404,0.7002518891687658 149 | 147,0.83,3000,2,1,0.7794117647058824,0.7052896725440806 150 | 
148,0.83,3000,2,42,0.7710084033613446,0.7065491183879093 151 | 149,0.83,3000,2,1337,0.7739495798319328,0.6952141057934509 152 | 150,0.83,4000,1,1,0.7638655462184873,0.6889168765743073 153 | 151,0.83,4000,1,42,0.7542016806722689,0.6964735516372796 154 | 152,0.83,4000,1,1337,0.7521008403361344,0.681360201511335 155 | 153,0.83,4000,2,1,0.7647058823529411,0.690176322418136 156 | 154,0.83,4000,2,42,0.7546218487394958,0.6952141057934509 157 | 155,0.83,4000,2,1337,0.753781512605042,0.681360201511335 158 | 156,0.83,5000,1,1,0.7415966386554622,0.6788413098236776 159 | 157,0.83,5000,1,42,0.7235294117647059,0.6675062972292192 160 | 158,0.83,5000,1,1337,0.7256302521008403,0.6649874055415617 161 | 159,0.83,5000,2,1,0.742436974789916,0.6788413098236776 162 | 160,0.83,5000,2,42,0.7260504201680672,0.6700251889168766 163 | 161,0.83,5000,2,1337,0.7260504201680672,0.6662468513853904 164 | 162,0.83,8000,1,1,0.6840336134453782,0.6486146095717884 165 | 163,0.83,8000,1,42,0.6718487394957983,0.6435768261964736 166 | 164,0.83,8000,1,1337,0.6718487394957983,0.6335012594458438 167 | 165,0.83,8000,2,1,0.6836134453781513,0.6486146095717884 168 | 166,0.83,8000,2,42,0.6752100840336135,0.646095717884131 169 | 167,0.83,8000,2,1337,0.6710084033613445,0.6347607052896725 170 | 168,0.9,500,1,1,0.66890756302521,0.6561712846347607 171 | 169,0.9,500,1,42,0.6714285714285714,0.6473551637279596 172 | 170,0.9,500,1,1337,0.6680672268907563,0.6297229219143576 173 | 171,0.9,500,2,1,0.66890756302521,0.6561712846347607 174 | 172,0.9,500,2,42,0.6714285714285714,0.6473551637279596 175 | 173,0.9,500,2,1337,0.6680672268907563,0.6297229219143576 176 | 174,0.9,1000,1,1,0.7340336134453781,0.6826196473551638 177 | 175,0.9,1000,1,42,0.7289915966386554,0.6926952141057935 178 | 176,0.9,1000,1,1337,0.7302521008403361,0.6700251889168766 179 | 177,0.9,1000,2,1,0.7298319327731092,0.6801007556675063 180 | 178,0.9,1000,2,42,0.7273109243697479,0.6952141057934509 181 | 179,0.9,1000,2,1337,0.7298319327731092,0.6687657430730478 182 | 
180,0.9,2000,1,1,0.7815126050420168,0.707808564231738 183 | 181,0.9,2000,1,42,0.769327731092437,0.7204030226700252 184 | 182,0.9,2000,1,1337,0.7764705882352941,0.7002518891687658 185 | 183,0.9,2000,2,1,0.7810924369747899,0.7065491183879093 186 | 184,0.9,2000,2,42,0.7672268907563026,0.7191435768261965 187 | 185,0.9,2000,2,1337,0.7773109243697479,0.6977329974811083 188 | 186,0.9,3000,1,1,0.7815126050420168,0.7015113350125944 189 | 187,0.9,3000,1,42,0.769327731092437,0.707808564231738 190 | 188,0.9,3000,1,1337,0.7739495798319328,0.6952141057934509 191 | 189,0.9,3000,2,1,0.7802521008403361,0.7002518891687658 192 | 190,0.9,3000,2,42,0.7684873949579832,0.707808564231738 193 | 191,0.9,3000,2,1337,0.7747899159663866,0.6964735516372796 194 | 192,0.9,4000,1,1,0.765126050420168,0.6889168765743073 195 | 193,0.9,4000,1,42,0.7550420168067227,0.6939546599496221 196 | 194,0.9,4000,1,1337,0.7512605042016807,0.681360201511335 197 | 195,0.9,4000,2,1,0.7630252100840336,0.690176322418136 198 | 196,0.9,4000,2,42,0.753781512605042,0.6952141057934509 199 | 197,0.9,4000,2,1337,0.7516806722689076,0.681360201511335 200 | 198,0.9,5000,1,1,0.7407563025210084,0.6775818639798489 201 | 199,0.9,5000,1,42,0.7218487394957983,0.6675062972292192 202 | 200,0.9,5000,1,1337,0.7226890756302521,0.6624685138539043 203 | 201,0.9,5000,2,1,0.7411764705882353,0.6775818639798489 204 | 202,0.9,5000,2,42,0.7243697478991596,0.6675062972292192 205 | 203,0.9,5000,2,1337,0.7243697478991596,0.663727959697733 206 | 204,0.9,8000,1,1,0.6831932773109244,0.6486146095717884 207 | 205,0.9,8000,1,42,0.6718487394957983,0.646095717884131 208 | 206,0.9,8000,1,1337,0.6701680672268907,0.6309823677581864 209 | 207,0.9,8000,2,1,0.6827731092436975,0.6486146095717884 210 | 208,0.9,8000,2,42,0.6739495798319328,0.646095717884131 211 | 209,0.9,8000,2,1337,0.6705882352941176,0.6322418136020151 212 | 210,1.0,500,1,1,0.6546218487394958,0.646095717884131 213 | 211,1.0,500,1,42,0.6554621848739496,0.6372795969773299 214 | 
212,1.0,500,1,1337,0.6567226890756303,0.628463476070529 215 | 213,1.0,500,2,1,0.6546218487394958,0.646095717884131 216 | 214,1.0,500,2,42,0.6554621848739496,0.6372795969773299 217 | 215,1.0,500,2,1337,0.6563025210084034,0.628463476070529 218 | 216,1.0,1000,1,1,0.7100840336134454,0.6662468513853904 219 | 217,1.0,1000,1,42,0.7058823529411765,0.6662468513853904 220 | 218,1.0,1000,1,1337,0.7058823529411765,0.6523929471032746 221 | 219,1.0,1000,2,1,0.7096638655462185,0.6662468513853904 222 | 220,1.0,1000,2,42,0.7063025210084034,0.663727959697733 223 | 221,1.0,1000,2,1337,0.704201680672269,0.6561712846347607 224 | 222,1.0,2000,1,1,0.7567226890756302,0.6926952141057935 225 | 223,1.0,2000,1,42,0.7495798319327731,0.7065491183879093 226 | 224,1.0,2000,1,1337,0.7521008403361344,0.6838790931989924 227 | 225,1.0,2000,2,1,0.7563025210084033,0.6926952141057935 228 | 226,1.0,2000,2,42,0.7487394957983193,0.7052896725440806 229 | 227,1.0,2000,2,1337,0.7508403361344538,0.6863979848866498 230 | 228,1.0,3000,1,1,0.7571428571428571,0.6876574307304786 231 | 229,1.0,3000,1,42,0.7512605042016807,0.6939546599496221 232 | 230,1.0,3000,1,1337,0.7478991596638656,0.6863979848866498 233 | 231,1.0,3000,2,1,0.7579831932773109,0.6889168765743073 234 | 232,1.0,3000,2,42,0.7504201680672269,0.6876574307304786 235 | 233,1.0,3000,2,1337,0.7478991596638656,0.6863979848866498 236 | 234,1.0,4000,1,1,0.7403361344537815,0.6788413098236776 237 | 235,1.0,4000,1,42,0.7264705882352941,0.6687657430730478 238 | 236,1.0,4000,1,1337,0.7226890756302521,0.6649874055415617 239 | 237,1.0,4000,2,1,0.7407563025210084,0.6788413098236776 240 | 238,1.0,4000,2,42,0.7252100840336134,0.6687657430730478 241 | 239,1.0,4000,2,1337,0.7235294117647059,0.6687657430730478 242 | 240,1.0,5000,1,1,0.7159663865546219,0.663727959697733 243 | 241,1.0,5000,1,42,0.6991596638655462,0.6599496221662469 244 | 242,1.0,5000,1,1337,0.7037815126050421,0.654911838790932 245 | 243,1.0,5000,2,1,0.7168067226890756,0.663727959697733 246 | 
244,1.0,5000,2,42,0.7,0.6599496221662469 247 | 245,1.0,5000,2,1337,0.7033613445378152,0.654911838790932 248 | 246,1.0,8000,1,1,0.6609243697478991,0.6347607052896725 249 | 247,1.0,8000,1,42,0.6508403361344538,0.6309823677581864 250 | 248,1.0,8000,1,1337,0.6546218487394958,0.6196473551637279 251 | 249,1.0,8000,2,1,0.6592436974789916,0.6360201511335013 252 | 250,1.0,8000,2,42,0.6512605042016807,0.6322418136020151 253 | 251,1.0,8000,2,1337,0.6542016806722689,0.6209068010075567 254 | -------------------------------------------------------------------------------- /model_results/dining_svc_results.csv: -------------------------------------------------------------------------------- 1 | ,max_df,max_features,min_df,random_state,train_score,test_score 2 | 0,0.9,8000,2,55,0.9815126050420168,0.7808564231738035 3 | -------------------------------------------------------------------------------- /model_results/nightlife_results.csv: -------------------------------------------------------------------------------- 1 | ,max_df,max_features,min_df,random_state,train_score,test_score 2 | 0,0.5,500,1,1,0.7205882352941176,0.6826196473551638 3 | 1,0.5,500,1,42,0.726890756302521,0.6763224181360201 4 | 2,0.5,500,1,1337,0.707983193277311,0.6687657430730478 5 | 3,0.5,500,2,1,0.7205882352941176,0.681360201511335 6 | 4,0.5,500,2,42,0.726890756302521,0.6750629722921915 7 | 5,0.5,500,2,1337,0.7084033613445379,0.6712846347607053 8 | 6,0.5,1000,1,1,0.7668067226890757,0.6851385390428212 9 | 7,0.5,1000,1,42,0.7596638655462185,0.6838790931989924 10 | 8,0.5,1000,1,1337,0.757563025210084,0.6926952141057935 11 | 9,0.5,1000,2,1,0.7668067226890757,0.6876574307304786 12 | 10,0.5,1000,2,42,0.7596638655462185,0.6838790931989924 13 | 11,0.5,1000,2,1337,0.757563025210084,0.6926952141057935 14 | 12,0.5,2000,1,1,0.8142857142857143,0.7128463476070529 15 | 13,0.5,2000,1,42,0.8063025210084034,0.7103274559193955 16 | 14,0.5,2000,1,1337,0.7991596638655463,0.7153652392947103 17 | 
15,0.5,2000,2,1,0.8147058823529412,0.7115869017632241 18 | 16,0.5,2000,2,42,0.8063025210084034,0.7115869017632241 19 | 17,0.5,2000,2,1337,0.7983193277310925,0.7128463476070529 20 | 18,0.5,3000,1,1,0.8319327731092437,0.7027707808564232 21 | 19,0.5,3000,1,42,0.8210084033613445,0.7040302267002518 22 | 20,0.5,3000,1,1337,0.8172268907563025,0.7103274559193955 23 | 21,0.5,3000,2,1,0.83109243697479,0.7027707808564232 24 | 22,0.5,3000,2,42,0.8222689075630252,0.7040302267002518 25 | 23,0.5,3000,2,1337,0.8180672268907563,0.707808564231738 26 | 24,0.5,4000,1,1,0.8306722689075631,0.6914357682619647 27 | 25,0.5,4000,1,42,0.8184873949579832,0.7027707808564232 28 | 26,0.5,4000,1,1337,0.8184873949579832,0.6952141057934509 29 | 27,0.5,4000,2,1,0.8302521008403362,0.6926952141057935 30 | 28,0.5,4000,2,42,0.8210084033613445,0.7002518891687658 31 | 29,0.5,4000,2,1337,0.8201680672268907,0.6914357682619647 32 | 30,0.5,5000,1,1,0.8239495798319327,0.6863979848866498 33 | 31,0.5,5000,1,42,0.8168067226890756,0.6964735516372796 34 | 32,0.5,5000,1,1337,0.8147058823529412,0.6788413098236776 35 | 33,0.5,5000,2,1,0.8235294117647058,0.6863979848866498 36 | 34,0.5,5000,2,42,0.815546218487395,0.6964735516372796 37 | 35,0.5,5000,2,1337,0.8142857142857143,0.681360201511335 38 | 36,0.5,8000,1,1,0.7928571428571428,0.6486146095717884 39 | 37,0.5,8000,1,42,0.7857142857142857,0.6712846347607053 40 | 38,0.5,8000,1,1337,0.7756302521008404,0.646095717884131 41 | 39,0.5,8000,2,1,0.7936974789915966,0.654911838790932 42 | 40,0.5,8000,2,42,0.7848739495798319,0.6700251889168766 43 | 41,0.5,8000,2,1337,0.7764705882352941,0.646095717884131 44 | 42,0.66,500,1,1,0.7113445378151261,0.6763224181360201 45 | 43,0.66,500,1,42,0.7138655462184874,0.6700251889168766 46 | 44,0.66,500,1,1337,0.6974789915966386,0.6612090680100756 47 | 45,0.66,500,2,1,0.7109243697478992,0.6738035264483627 48 | 46,0.66,500,2,42,0.7142857142857143,0.6700251889168766 49 | 47,0.66,500,2,1337,0.6974789915966386,0.6624685138539043 50 | 
48,0.66,1000,1,1,0.7596638655462185,0.6838790931989924 51 | 49,0.66,1000,1,42,0.7491596638655462,0.6687657430730478 52 | 50,0.66,1000,1,1337,0.7508403361344538,0.6876574307304786 53 | 51,0.66,1000,2,1,0.7600840336134453,0.6838790931989924 54 | 52,0.66,1000,2,42,0.7508403361344538,0.672544080604534 55 | 53,0.66,1000,2,1337,0.7508403361344538,0.6876574307304786 56 | 54,0.66,2000,1,1,0.8088235294117647,0.7040302267002518 57 | 55,0.66,2000,1,42,0.7987394957983194,0.707808564231738 58 | 56,0.66,2000,1,1337,0.7949579831932773,0.7090680100755667 59 | 57,0.66,2000,2,1,0.8100840336134454,0.7040302267002518 60 | 58,0.66,2000,2,42,0.7970588235294118,0.707808564231738 61 | 59,0.66,2000,2,1337,0.7953781512605042,0.7090680100755667 62 | 60,0.66,3000,1,1,0.823109243697479,0.7002518891687658 63 | 61,0.66,3000,1,42,0.8117647058823529,0.6939546599496221 64 | 62,0.66,3000,1,1337,0.8084033613445378,0.7141057934508817 65 | 63,0.66,3000,2,1,0.8247899159663865,0.7002518891687658 66 | 64,0.66,3000,2,42,0.8121848739495798,0.6939546599496221 67 | 65,0.66,3000,2,1337,0.8084033613445378,0.7103274559193955 68 | 66,0.66,4000,1,1,0.8176470588235294,0.6851385390428212 69 | 67,0.66,4000,1,42,0.807563025210084,0.690176322418136 70 | 68,0.66,4000,1,1337,0.8058823529411765,0.6926952141057935 71 | 69,0.66,4000,2,1,0.8172268907563025,0.6863979848866498 72 | 70,0.66,4000,2,42,0.8079831932773109,0.6914357682619647 73 | 71,0.66,4000,2,1337,0.8058823529411765,0.6939546599496221 74 | 72,0.66,5000,1,1,0.8142857142857143,0.6838790931989924 75 | 73,0.66,5000,1,42,0.8029411764705883,0.6863979848866498 76 | 74,0.66,5000,1,1337,0.7991596638655463,0.6712846347607053 77 | 75,0.66,5000,2,1,0.8142857142857143,0.6826196473551638 78 | 76,0.66,5000,2,42,0.8025210084033614,0.6889168765743073 79 | 77,0.66,5000,2,1337,0.7995798319327732,0.6700251889168766 80 | 78,0.66,8000,1,1,0.773109243697479,0.6360201511335013 81 | 79,0.66,8000,1,42,0.7680672268907563,0.663727959697733 82 | 
80,0.66,8000,1,1337,0.7571428571428571,0.6272040302267002 83 | 81,0.66,8000,2,1,0.7735294117647059,0.6372795969773299 84 | 82,0.66,8000,2,42,0.7705882352941177,0.6675062972292192 85 | 83,0.66,8000,2,1337,0.7579831932773109,0.6309823677581864 86 | 84,0.75,500,1,1,0.7092436974789916,0.6687657430730478 87 | 85,0.75,500,1,42,0.7142857142857143,0.663727959697733 88 | 86,0.75,500,1,1337,0.6949579831932773,0.6586901763224181 89 | 87,0.75,500,2,1,0.7092436974789916,0.6687657430730478 90 | 88,0.75,500,2,42,0.7142857142857143,0.663727959697733 91 | 89,0.75,500,2,1337,0.6949579831932773,0.6586901763224181 92 | 90,0.75,1000,1,1,0.7563025210084033,0.6851385390428212 93 | 91,0.75,1000,1,42,0.7470588235294118,0.6649874055415617 94 | 92,0.75,1000,1,1337,0.7516806722689076,0.690176322418136 95 | 93,0.75,1000,2,1,0.7558823529411764,0.6851385390428212 96 | 94,0.75,1000,2,42,0.7470588235294118,0.6662468513853904 97 | 95,0.75,1000,2,1337,0.7521008403361344,0.690176322418136 98 | 96,0.75,2000,1,1,0.8109243697478992,0.7065491183879093 99 | 97,0.75,2000,1,42,0.7949579831932773,0.7090680100755667 100 | 98,0.75,2000,1,1337,0.7932773109243697,0.707808564231738 101 | 99,0.75,2000,2,1,0.811344537815126,0.7027707808564232 102 | 100,0.75,2000,2,42,0.7941176470588235,0.7090680100755667 103 | 101,0.75,2000,2,1337,0.7949579831932773,0.7115869017632241 104 | 102,0.75,3000,1,1,0.8222689075630252,0.6977329974811083 105 | 103,0.75,3000,1,42,0.8092436974789916,0.690176322418136 106 | 104,0.75,3000,1,1337,0.8088235294117647,0.7128463476070529 107 | 105,0.75,3000,2,1,0.8218487394957983,0.7002518891687658 108 | 106,0.75,3000,2,42,0.8105042016806723,0.690176322418136 109 | 107,0.75,3000,2,1337,0.8079831932773109,0.7115869017632241 110 | 108,0.75,4000,1,1,0.8168067226890756,0.6876574307304786 111 | 109,0.75,4000,1,42,0.8058823529411765,0.6914357682619647 112 | 110,0.75,4000,1,1337,0.8096638655462185,0.690176322418136 113 | 111,0.75,4000,2,1,0.8151260504201681,0.6863979848866498 114 | 
112,0.75,4000,2,42,0.8067226890756303,0.6939546599496221 115 | 113,0.75,4000,2,1337,0.8046218487394958,0.6926952141057935 116 | 114,0.75,5000,1,1,0.8105042016806723,0.6801007556675063 117 | 115,0.75,5000,1,42,0.7966386554621848,0.6851385390428212 118 | 116,0.75,5000,1,1337,0.7970588235294118,0.6700251889168766 119 | 117,0.75,5000,2,1,0.8105042016806723,0.6801007556675063 120 | 118,0.75,5000,2,42,0.7966386554621848,0.6863979848866498 121 | 119,0.75,5000,2,1337,0.7970588235294118,0.6687657430730478 122 | 120,0.75,8000,1,1,0.7701680672268908,0.6372795969773299 123 | 121,0.75,8000,1,42,0.7634453781512605,0.6612090680100756 124 | 122,0.75,8000,1,1337,0.7516806722689076,0.6309823677581864 125 | 123,0.75,8000,2,1,0.7714285714285715,0.6347607052896725 126 | 124,0.75,8000,2,42,0.7655462184873949,0.663727959697733 127 | 125,0.75,8000,2,1337,0.753781512605042,0.6322418136020151 128 | 126,0.83,500,1,1,0.7075630252100841,0.6675062972292192 129 | 127,0.83,500,1,42,0.7033613445378152,0.663727959697733 130 | 128,0.83,500,1,1337,0.6928571428571428,0.6574307304785895 131 | 129,0.83,500,2,1,0.7075630252100841,0.6675062972292192 132 | 130,0.83,500,2,42,0.7033613445378152,0.663727959697733 133 | 131,0.83,500,2,1337,0.6899159663865546,0.6612090680100756 134 | 132,0.83,1000,1,1,0.7554621848739496,0.6863979848866498 135 | 133,0.83,1000,1,42,0.7478991596638656,0.6675062972292192 136 | 134,0.83,1000,1,1337,0.7508403361344538,0.6889168765743073 137 | 135,0.83,1000,2,1,0.7554621848739496,0.6863979848866498 138 | 136,0.83,1000,2,42,0.7483193277310924,0.6662468513853904 139 | 137,0.83,1000,2,1337,0.7504201680672269,0.6889168765743073 140 | 138,0.83,2000,1,1,0.8096638655462185,0.7002518891687658 141 | 139,0.83,2000,1,42,0.7945378151260504,0.707808564231738 142 | 140,0.83,2000,1,1337,0.7899159663865546,0.707808564231738 143 | 141,0.83,2000,2,1,0.8084033613445378,0.7002518891687658 144 | 142,0.83,2000,2,42,0.7928571428571428,0.7090680100755667 145 | 
143,0.83,2000,2,1337,0.7924369747899159,0.7090680100755667 146 | 144,0.83,3000,1,1,0.819327731092437,0.6939546599496221 147 | 145,0.83,3000,1,42,0.8063025210084034,0.6876574307304786 148 | 146,0.83,3000,1,1337,0.8029411764705883,0.707808564231738 149 | 147,0.83,3000,2,1,0.8205882352941176,0.6939546599496221 150 | 148,0.83,3000,2,42,0.8067226890756303,0.6889168765743073 151 | 149,0.83,3000,2,1337,0.8046218487394958,0.7040302267002518 152 | 150,0.83,4000,1,1,0.8138655462184874,0.6826196473551638 153 | 151,0.83,4000,1,42,0.8050420168067227,0.6926952141057935 154 | 152,0.83,4000,1,1337,0.8016806722689076,0.6876574307304786 155 | 153,0.83,4000,2,1,0.8105042016806723,0.6826196473551638 156 | 154,0.83,4000,2,42,0.8054621848739496,0.6977329974811083 157 | 155,0.83,4000,2,1337,0.8025210084033614,0.6889168765743073 158 | 156,0.83,5000,1,1,0.8079831932773109,0.6801007556675063 159 | 157,0.83,5000,1,42,0.7936974789915966,0.6838790931989924 160 | 158,0.83,5000,1,1337,0.7945378151260504,0.6675062972292192 161 | 159,0.83,5000,2,1,0.807563025210084,0.6801007556675063 162 | 160,0.83,5000,2,42,0.7911764705882353,0.681360201511335 163 | 161,0.83,5000,2,1337,0.7936974789915966,0.6675062972292192 164 | 162,0.83,8000,1,1,0.7647058823529411,0.6335012594458438 165 | 163,0.83,8000,1,42,0.7567226890756302,0.654911838790932 166 | 164,0.83,8000,1,1337,0.7449579831932773,0.6246851385390428 167 | 165,0.83,8000,2,1,0.7663865546218488,0.6360201511335013 168 | 166,0.83,8000,2,42,0.7605042016806722,0.654911838790932 169 | 167,0.83,8000,2,1337,0.7470588235294118,0.628463476070529 170 | 168,0.9,500,1,1,0.7050420168067227,0.6675062972292192 171 | 169,0.9,500,1,42,0.7046218487394958,0.6649874055415617 172 | 170,0.9,500,1,1337,0.6907563025210084,0.6612090680100756 173 | 171,0.9,500,2,1,0.7050420168067227,0.6675062972292192 174 | 172,0.9,500,2,42,0.7046218487394958,0.6649874055415617 175 | 173,0.9,500,2,1337,0.6907563025210084,0.6612090680100756 176 | 
174,0.9,1000,1,1,0.7554621848739496,0.6863979848866498 177 | 175,0.9,1000,1,42,0.7495798319327731,0.6662468513853904 178 | 176,0.9,1000,1,1337,0.7483193277310924,0.6851385390428212 179 | 177,0.9,1000,2,1,0.7546218487394958,0.6826196473551638 180 | 178,0.9,1000,2,42,0.7508403361344538,0.6662468513853904 181 | 179,0.9,1000,2,1337,0.7491596638655462,0.6889168765743073 182 | 180,0.9,2000,1,1,0.8071428571428572,0.7015113350125944 183 | 181,0.9,2000,1,42,0.7903361344537815,0.7052896725440806 184 | 182,0.9,2000,1,1337,0.7928571428571428,0.7052896725440806 185 | 183,0.9,2000,2,1,0.8071428571428572,0.7027707808564232 186 | 184,0.9,2000,2,42,0.792016806722689,0.7065491183879093 187 | 185,0.9,2000,2,1337,0.7915966386554621,0.7052896725440806 188 | 186,0.9,3000,1,1,0.8184873949579832,0.6939546599496221 189 | 187,0.9,3000,1,42,0.8067226890756303,0.6889168765743073 190 | 188,0.9,3000,1,1337,0.8046218487394958,0.7040302267002518 191 | 189,0.9,3000,2,1,0.8201680672268907,0.6952141057934509 192 | 190,0.9,3000,2,42,0.8058823529411765,0.6914357682619647 193 | 191,0.9,3000,2,1337,0.8046218487394958,0.7040302267002518 194 | 192,0.9,4000,1,1,0.8126050420168067,0.6851385390428212 195 | 193,0.9,4000,1,42,0.8042016806722689,0.6939546599496221 196 | 194,0.9,4000,1,1337,0.8016806722689076,0.6889168765743073 197 | 195,0.9,4000,2,1,0.8134453781512605,0.6826196473551638 198 | 196,0.9,4000,2,42,0.8046218487394958,0.6926952141057935 199 | 197,0.9,4000,2,1337,0.8008403361344538,0.6889168765743073 200 | 198,0.9,5000,1,1,0.8084033613445378,0.6801007556675063 201 | 199,0.9,5000,1,42,0.7903361344537815,0.681360201511335 202 | 200,0.9,5000,1,1337,0.7911764705882353,0.6662468513853904 203 | 201,0.9,5000,2,1,0.8071428571428572,0.6788413098236776 204 | 202,0.9,5000,2,42,0.7903361344537815,0.6801007556675063 205 | 203,0.9,5000,2,1337,0.7924369747899159,0.6662468513853904 206 | 204,0.9,8000,1,1,0.7605042016806722,0.6335012594458438 207 | 205,0.9,8000,1,42,0.7542016806722689,0.654911838790932 208 | 
206,0.9,8000,1,1337,0.742436974789916,0.6234256926952141 209 | 207,0.9,8000,2,1,0.7634453781512605,0.6385390428211587 210 | 208,0.9,8000,2,42,0.7571428571428571,0.6561712846347607 211 | 209,0.9,8000,2,1337,0.7436974789915967,0.6234256926952141 212 | 210,1.0,500,1,1,0.688655462184874,0.6511335012594458 213 | 211,1.0,500,1,42,0.6995798319327731,0.6523929471032746 214 | 212,1.0,500,1,1337,0.6752100840336135,0.6410579345088161 215 | 213,1.0,500,2,1,0.688655462184874,0.6511335012594458 216 | 214,1.0,500,2,42,0.6995798319327731,0.6523929471032746 217 | 215,1.0,500,2,1337,0.6764705882352942,0.6435768261964736 218 | 216,1.0,1000,1,1,0.746218487394958,0.6801007556675063 219 | 217,1.0,1000,1,42,0.7390756302521009,0.6624685138539043 220 | 218,1.0,1000,1,1337,0.7365546218487395,0.681360201511335 221 | 219,1.0,1000,2,1,0.7470588235294118,0.6801007556675063 222 | 220,1.0,1000,2,42,0.7403361344537815,0.663727959697733 223 | 221,1.0,1000,2,1337,0.7365546218487395,0.6851385390428212 224 | 222,1.0,2000,1,1,0.7978991596638656,0.690176322418136 225 | 223,1.0,2000,1,42,0.7827731092436975,0.6977329974811083 226 | 224,1.0,2000,1,1337,0.7785714285714286,0.7040302267002518 227 | 225,1.0,2000,2,1,0.7991596638655463,0.6914357682619647 228 | 226,1.0,2000,2,42,0.7836134453781513,0.6977329974811083 229 | 227,1.0,2000,2,1337,0.7794117647058824,0.7052896725440806 230 | 228,1.0,3000,1,1,0.795798319327731,0.6889168765743073 231 | 229,1.0,3000,1,42,0.7932773109243697,0.6926952141057935 232 | 230,1.0,3000,1,1337,0.7878151260504201,0.6851385390428212 233 | 231,1.0,3000,2,1,0.7978991596638656,0.6889168765743073 234 | 232,1.0,3000,2,42,0.7936974789915966,0.6914357682619647 235 | 233,1.0,3000,2,1337,0.7907563025210084,0.6863979848866498 236 | 234,1.0,4000,1,1,0.7970588235294118,0.6775818639798489 237 | 235,1.0,4000,1,42,0.7857142857142857,0.6851385390428212 238 | 236,1.0,4000,1,1337,0.784453781512605,0.6662468513853904 239 | 237,1.0,4000,2,1,0.7974789915966387,0.6775818639798489 240 | 
238,1.0,4000,2,42,0.7861344537815126,0.6838790931989924 241 | 239,1.0,4000,2,1337,0.7810924369747899,0.6687657430730478 242 | 240,1.0,5000,1,1,0.7848739495798319,0.6649874055415617 243 | 241,1.0,5000,1,42,0.7668067226890757,0.6775818639798489 244 | 242,1.0,5000,1,1337,0.7672268907563026,0.654911838790932 245 | 243,1.0,5000,2,1,0.7857142857142857,0.6662468513853904 246 | 244,1.0,5000,2,42,0.7655462184873949,0.6763224181360201 247 | 245,1.0,5000,2,1337,0.7659663865546219,0.6511335012594458 248 | 246,1.0,8000,1,1,0.7243697478991596,0.6120906801007556 249 | 247,1.0,8000,1,42,0.7239495798319328,0.6259445843828715 250 | 248,1.0,8000,1,1337,0.7109243697478992,0.6007556675062973 251 | 249,1.0,8000,2,1,0.7310924369747899,0.6158690176322418 252 | 250,1.0,8000,2,42,0.7285714285714285,0.628463476070529 253 | 251,1.0,8000,2,1337,0.7100840336134454,0.6020151133501259 254 | -------------------------------------------------------------------------------- /model_results/nightlife_svc_results.csv: -------------------------------------------------------------------------------- 1 | ,max_df,max_features,min_df,random_state,train_score,test_score 2 | 0,0.9,8000,2,55,0.988655462184874,0.7783375314861462 3 | -------------------------------------------------------------------------------- /model_results/shopping_results.csv: -------------------------------------------------------------------------------- 1 | ,max_df,max_features,min_df,random_state,train_score,test_score 2 | 0,0.5,500,1,1,0.7264705882352941,0.6977329974811083 3 | 1,0.5,500,1,42,0.7218487394957983,0.7040302267002518 4 | 2,0.5,500,1,1337,0.7285714285714285,0.690176322418136 5 | 3,0.5,500,2,1,0.7264705882352941,0.6977329974811083 6 | 4,0.5,500,2,42,0.7222689075630252,0.7065491183879093 7 | 5,0.5,500,2,1337,0.7281512605042016,0.690176322418136 8 | 6,0.5,1000,1,1,0.7789915966386555,0.7292191435768262 9 | 7,0.5,1000,1,42,0.7726890756302521,0.7380352644836272 10 | 8,0.5,1000,1,1337,0.769327731092437,0.7216624685138538 11 | 
9,0.5,1000,2,1,0.7785714285714286,0.7279596977329975 12 | 10,0.5,1000,2,42,0.7726890756302521,0.7380352644836272 13 | 11,0.5,1000,2,1337,0.7705882352941177,0.7216624685138538 14 | 12,0.5,2000,1,1,0.8134453781512605,0.7380352644836272 15 | 13,0.5,2000,1,42,0.8189075630252101,0.7506297229219143 16 | 14,0.5,2000,1,1337,0.811344537815126,0.7267002518891688 17 | 15,0.5,2000,2,1,0.8151260504201681,0.7355163727959698 18 | 16,0.5,2000,2,42,0.8180672268907563,0.7493702770780857 19 | 17,0.5,2000,2,1337,0.8109243697478992,0.7267002518891688 20 | 18,0.5,3000,1,1,0.8163865546218487,0.7241813602015114 21 | 19,0.5,3000,1,42,0.8147058823529412,0.7455919395465995 22 | 20,0.5,3000,1,1337,0.8147058823529412,0.7229219143576826 23 | 21,0.5,3000,2,1,0.8163865546218487,0.7279596977329975 24 | 22,0.5,3000,2,42,0.815546218487395,0.743073047858942 25 | 23,0.5,3000,2,1337,0.8159663865546218,0.7229219143576826 26 | 24,0.5,4000,1,1,0.8021008403361345,0.6939546599496221 27 | 25,0.5,4000,1,42,0.8,0.7279596977329975 28 | 26,0.5,4000,1,1337,0.8025210084033614,0.7115869017632241 29 | 27,0.5,4000,2,1,0.803781512605042,0.6952141057934509 30 | 28,0.5,4000,2,42,0.8016806722689076,0.7279596977329975 31 | 29,0.5,4000,2,1337,0.8012605042016807,0.7115869017632241 32 | 30,0.5,5000,1,1,0.7777310924369748,0.6788413098236776 33 | 31,0.5,5000,1,42,0.7802521008403361,0.7191435768261965 34 | 32,0.5,5000,1,1337,0.7760504201680672,0.698992443324937 35 | 33,0.5,5000,2,1,0.7785714285714286,0.6788413098236776 36 | 34,0.5,5000,2,42,0.7785714285714286,0.7191435768261965 37 | 35,0.5,5000,2,1337,0.7760504201680672,0.698992443324937 38 | 36,0.5,8000,1,1,0.7172268907563025,0.6410579345088161 39 | 37,0.5,8000,1,42,0.730672268907563,0.672544080604534 40 | 38,0.5,8000,1,1337,0.7105042016806723,0.6523929471032746 41 | 39,0.5,8000,2,1,0.7184873949579832,0.6410579345088161 42 | 40,0.5,8000,2,42,0.7340336134453781,0.6738035264483627 43 | 41,0.5,8000,2,1337,0.7138655462184874,0.6523929471032746 44 | 
42,0.66,500,1,1,0.7172268907563025,0.6939546599496221 45 | 43,0.66,500,1,42,0.7105042016806723,0.6952141057934509 46 | 44,0.66,500,1,1337,0.7142857142857143,0.6851385390428212 47 | 45,0.66,500,2,1,0.7180672268907563,0.6926952141057935 48 | 46,0.66,500,2,42,0.711764705882353,0.6952141057934509 49 | 47,0.66,500,2,1337,0.7142857142857143,0.6851385390428212 50 | 48,0.66,1000,1,1,0.769327731092437,0.7229219143576826 51 | 49,0.66,1000,1,42,0.7684873949579832,0.7292191435768262 52 | 50,0.66,1000,1,1337,0.7609243697478991,0.7141057934508817 53 | 51,0.66,1000,2,1,0.7697478991596639,0.7216624685138538 54 | 52,0.66,1000,2,42,0.7684873949579832,0.7304785894206549 55 | 53,0.66,1000,2,1337,0.7600840336134453,0.7141057934508817 56 | 54,0.66,2000,1,1,0.8033613445378152,0.7317380352644837 57 | 55,0.66,2000,1,42,0.8046218487394958,0.7493702770780857 58 | 56,0.66,2000,1,1337,0.8008403361344538,0.7267002518891688 59 | 57,0.66,2000,2,1,0.8046218487394958,0.7317380352644837 60 | 58,0.66,2000,2,42,0.8046218487394958,0.7506297229219143 61 | 59,0.66,2000,2,1337,0.8012605042016807,0.7267002518891688 62 | 60,0.66,3000,1,1,0.8054621848739496,0.7191435768261965 63 | 61,0.66,3000,1,42,0.803781512605042,0.7367758186397985 64 | 62,0.66,3000,1,1337,0.803781512605042,0.716624685138539 65 | 63,0.66,3000,2,1,0.8063025210084034,0.7178841309823678 66 | 64,0.66,3000,2,42,0.8042016806722689,0.739294710327456 67 | 65,0.66,3000,2,1337,0.803781512605042,0.7178841309823678 68 | 66,0.66,4000,1,1,0.7894957983193277,0.681360201511335 69 | 67,0.66,4000,1,42,0.7878151260504201,0.7204030226700252 70 | 68,0.66,4000,1,1337,0.7865546218487395,0.7103274559193955 71 | 69,0.66,4000,2,1,0.7903361344537815,0.681360201511335 72 | 70,0.66,4000,2,42,0.7869747899159664,0.7191435768261965 73 | 71,0.66,4000,2,1337,0.7865546218487395,0.7115869017632241 74 | 72,0.66,5000,1,1,0.7638655462184873,0.6738035264483627 75 | 73,0.66,5000,1,42,0.7647058823529411,0.7090680100755667 76 | 
74,0.66,5000,1,1337,0.7596638655462185,0.6876574307304786 77 | 75,0.66,5000,2,1,0.7647058823529411,0.672544080604534 78 | 76,0.66,5000,2,42,0.7647058823529411,0.7090680100755667 79 | 77,0.66,5000,2,1337,0.7592436974789916,0.6876574307304786 80 | 78,0.66,8000,1,1,0.7016806722689075,0.6360201511335013 81 | 79,0.66,8000,1,42,0.7113445378151261,0.6662468513853904 82 | 80,0.66,8000,1,1337,0.6978991596638655,0.6448362720403022 83 | 81,0.66,8000,2,1,0.7021008403361344,0.6347607052896725 84 | 82,0.66,8000,2,42,0.7134453781512605,0.6675062972292192 85 | 83,0.66,8000,2,1337,0.6983193277310924,0.6448362720403022 86 | 84,0.75,500,1,1,0.7130252100840336,0.690176322418136 87 | 85,0.75,500,1,42,0.707983193277311,0.6914357682619647 88 | 86,0.75,500,1,1337,0.7130252100840336,0.6863979848866498 89 | 87,0.75,500,2,1,0.7130252100840336,0.690176322418136 90 | 88,0.75,500,2,42,0.707983193277311,0.6914357682619647 91 | 89,0.75,500,2,1337,0.7130252100840336,0.6863979848866498 92 | 90,0.75,1000,1,1,0.7672268907563026,0.7216624685138538 93 | 91,0.75,1000,1,42,0.7655462184873949,0.7329974811083123 94 | 92,0.75,1000,1,1337,0.7579831932773109,0.7178841309823678 95 | 93,0.75,1000,2,1,0.7672268907563026,0.7216624685138538 96 | 94,0.75,1000,2,42,0.765126050420168,0.7304785894206549 97 | 95,0.75,1000,2,1337,0.7584033613445378,0.7178841309823678 98 | 96,0.75,2000,1,1,0.8050420168067227,0.7267002518891688 99 | 97,0.75,2000,1,42,0.8008403361344538,0.7455919395465995 100 | 98,0.75,2000,1,1337,0.7995798319327732,0.7267002518891688 101 | 99,0.75,2000,2,1,0.8042016806722689,0.7279596977329975 102 | 100,0.75,2000,2,42,0.8021008403361345,0.7455919395465995 103 | 101,0.75,2000,2,1337,0.8004201680672269,0.7279596977329975 104 | 102,0.75,3000,1,1,0.8033613445378152,0.7178841309823678 105 | 103,0.75,3000,1,42,0.8004201680672269,0.7355163727959698 106 | 104,0.75,3000,1,1337,0.8012605042016807,0.7178841309823678 107 | 105,0.75,3000,2,1,0.8025210084033614,0.716624685138539 108 | 
106,0.75,3000,2,42,0.8,0.7380352644836272 109 | 107,0.75,3000,2,1337,0.8004201680672269,0.7178841309823678 110 | 108,0.75,4000,1,1,0.7886554621848739,0.6851385390428212 111 | 109,0.75,4000,1,42,0.7848739495798319,0.7191435768261965 112 | 110,0.75,4000,1,1337,0.784453781512605,0.7090680100755667 113 | 111,0.75,4000,2,1,0.7894957983193277,0.681360201511335 114 | 112,0.75,4000,2,42,0.7848739495798319,0.7204030226700252 115 | 113,0.75,4000,2,1337,0.7831932773109244,0.7052896725440806 116 | 114,0.75,5000,1,1,0.7609243697478991,0.6712846347607053 117 | 115,0.75,5000,1,42,0.7609243697478991,0.707808564231738 118 | 116,0.75,5000,1,1337,0.7558823529411764,0.6851385390428212 119 | 117,0.75,5000,2,1,0.761344537815126,0.6700251889168766 120 | 118,0.75,5000,2,42,0.761344537815126,0.7040302267002518 121 | 119,0.75,5000,2,1337,0.7558823529411764,0.6851385390428212 122 | 120,0.75,8000,1,1,0.6991596638655462,0.6347607052896725 123 | 121,0.75,8000,1,42,0.7100840336134454,0.6612090680100756 124 | 122,0.75,8000,1,1337,0.6957983193277311,0.6448362720403022 125 | 123,0.75,8000,2,1,0.6995798319327731,0.6347607052896725 126 | 124,0.75,8000,2,42,0.7113445378151261,0.663727959697733 127 | 125,0.75,8000,2,1337,0.6945378151260504,0.6448362720403022 128 | 126,0.83,500,1,1,0.7105042016806723,0.6826196473551638 129 | 127,0.83,500,1,42,0.7071428571428572,0.6889168765743073 130 | 128,0.83,500,1,1337,0.7121848739495799,0.6851385390428212 131 | 129,0.83,500,2,1,0.7105042016806723,0.6826196473551638 132 | 130,0.83,500,2,42,0.7071428571428572,0.6889168765743073 133 | 131,0.83,500,2,1337,0.7113445378151261,0.6863979848866498 134 | 132,0.83,1000,1,1,0.7642857142857142,0.7204030226700252 135 | 133,0.83,1000,1,42,0.7621848739495798,0.7267002518891688 136 | 134,0.83,1000,1,1337,0.7567226890756302,0.7103274559193955 137 | 135,0.83,1000,2,1,0.7642857142857142,0.7204030226700252 138 | 136,0.83,1000,2,42,0.7621848739495798,0.7267002518891688 139 | 137,0.83,1000,2,1337,0.7571428571428571,0.7103274559193955 140 
| 138,0.83,2000,1,1,0.8016806722689076,0.7317380352644837 141 | 139,0.83,2000,1,42,0.8004201680672269,0.7418136020151134 142 | 140,0.83,2000,1,1337,0.7983193277310925,0.7279596977329975 143 | 141,0.83,2000,2,1,0.8025210084033614,0.7317380352644837 144 | 142,0.83,2000,2,42,0.8012605042016807,0.7443324937027708 145 | 143,0.83,2000,2,1337,0.7987394957983194,0.72544080604534 146 | 144,0.83,3000,1,1,0.8,0.7141057934508817 147 | 145,0.83,3000,1,42,0.7970588235294118,0.7342569269521411 148 | 146,0.83,3000,1,1337,0.7983193277310925,0.7141057934508817 149 | 147,0.83,3000,2,1,0.8,0.7141057934508817 150 | 148,0.83,3000,2,42,0.7983193277310925,0.7355163727959698 151 | 149,0.83,3000,2,1337,0.7991596638655463,0.7128463476070529 152 | 150,0.83,4000,1,1,0.784453781512605,0.6851385390428212 153 | 151,0.83,4000,1,42,0.7802521008403361,0.7178841309823678 154 | 152,0.83,4000,1,1337,0.7823529411764706,0.7052896725440806 155 | 153,0.83,4000,2,1,0.7852941176470588,0.6826196473551638 156 | 154,0.83,4000,2,42,0.7810924369747899,0.7229219143576826 157 | 155,0.83,4000,2,1337,0.780672268907563,0.7052896725440806 158 | 156,0.83,5000,1,1,0.7529411764705882,0.6687657430730478 159 | 157,0.83,5000,1,42,0.7596638655462185,0.7052896725440806 160 | 158,0.83,5000,1,1337,0.7508403361344538,0.6738035264483627 161 | 159,0.83,5000,2,1,0.7525210084033613,0.6675062972292192 162 | 160,0.83,5000,2,42,0.7588235294117647,0.7027707808564232 163 | 161,0.83,5000,2,1337,0.7516806722689076,0.6750629722921915 164 | 162,0.83,8000,1,1,0.6953781512605042,0.6335012594458438 165 | 163,0.83,8000,1,42,0.7067226890756303,0.6599496221662469 166 | 164,0.83,8000,1,1337,0.6915966386554622,0.6410579345088161 167 | 165,0.83,8000,2,1,0.6941176470588235,0.6335012594458438 168 | 166,0.83,8000,2,42,0.7067226890756303,0.6612090680100756 169 | 167,0.83,8000,2,1337,0.6932773109243697,0.6435768261964736 170 | 168,0.9,500,1,1,0.7109243697478992,0.6851385390428212 171 | 169,0.9,500,1,42,0.7058823529411765,0.6889168765743073 172 | 
170,0.9,500,1,1337,0.7100840336134454,0.6838790931989924 173 | 171,0.9,500,2,1,0.7109243697478992,0.6851385390428212 174 | 172,0.9,500,2,42,0.7058823529411765,0.6889168765743073 175 | 173,0.9,500,2,1337,0.7100840336134454,0.6838790931989924 176 | 174,0.9,1000,1,1,0.7634453781512605,0.7191435768261965 177 | 175,0.9,1000,1,42,0.7617647058823529,0.7241813602015114 178 | 176,0.9,1000,1,1337,0.7571428571428571,0.707808564231738 179 | 177,0.9,1000,2,1,0.7634453781512605,0.7178841309823678 180 | 178,0.9,1000,2,42,0.7634453781512605,0.72544080604534 181 | 179,0.9,1000,2,1337,0.7563025210084033,0.7090680100755667 182 | 180,0.9,2000,1,1,0.7995798319327732,0.7292191435768262 183 | 181,0.9,2000,1,42,0.8004201680672269,0.7443324937027708 184 | 182,0.9,2000,1,1337,0.7974789915966387,0.72544080604534 185 | 183,0.9,2000,2,1,0.8,0.7304785894206549 186 | 184,0.9,2000,2,42,0.8016806722689076,0.743073047858942 187 | 185,0.9,2000,2,1337,0.7949579831932773,0.72544080604534 188 | 186,0.9,3000,1,1,0.7983193277310925,0.7153652392947103 189 | 187,0.9,3000,1,42,0.7974789915966387,0.7317380352644837 190 | 188,0.9,3000,1,1337,0.7962184873949579,0.7141057934508817 191 | 189,0.9,3000,2,1,0.7983193277310925,0.7153652392947103 192 | 190,0.9,3000,2,42,0.7970588235294118,0.7342569269521411 193 | 191,0.9,3000,2,1337,0.7970588235294118,0.716624685138539 194 | 192,0.9,4000,1,1,0.7823529411764706,0.6851385390428212 195 | 193,0.9,4000,1,42,0.7781512605042017,0.7191435768261965 196 | 194,0.9,4000,1,1337,0.7802521008403361,0.707808564231738 197 | 195,0.9,4000,2,1,0.7827731092436975,0.6838790931989924 198 | 196,0.9,4000,2,42,0.7781512605042017,0.7191435768261965 199 | 197,0.9,4000,2,1337,0.7798319327731092,0.7027707808564232 200 | 198,0.9,5000,1,1,0.7516806722689076,0.6599496221662469 201 | 199,0.9,5000,1,42,0.7579831932773109,0.7002518891687658 202 | 200,0.9,5000,1,1337,0.75,0.6763224181360201 203 | 201,0.9,5000,2,1,0.7512605042016807,0.6612090680100756 204 | 
202,0.9,5000,2,42,0.757563025210084,0.7015113350125944 205 | 203,0.9,5000,2,1337,0.7495798319327731,0.6750629722921915 206 | 204,0.9,8000,1,1,0.6945378151260504,0.6322418136020151 207 | 205,0.9,8000,1,42,0.704201680672269,0.6599496221662469 208 | 206,0.9,8000,1,1337,0.6915966386554622,0.6423173803526449 209 | 207,0.9,8000,2,1,0.6936974789915966,0.6322418136020151 210 | 208,0.9,8000,2,42,0.7046218487394958,0.6612090680100756 211 | 209,0.9,8000,2,1337,0.692436974789916,0.6423173803526449 212 | 210,1.0,500,1,1,0.6941176470588235,0.6738035264483627 213 | 211,1.0,500,1,42,0.692436974789916,0.6851385390428212 214 | 212,1.0,500,1,1337,0.6957983193277311,0.6738035264483627 215 | 213,1.0,500,2,1,0.6941176470588235,0.6738035264483627 216 | 214,1.0,500,2,42,0.692436974789916,0.6851385390428212 217 | 215,1.0,500,2,1337,0.696218487394958,0.672544080604534 218 | 216,1.0,1000,1,1,0.7466386554621849,0.707808564231738 219 | 217,1.0,1000,1,42,0.75,0.7040302267002518 220 | 218,1.0,1000,1,1337,0.7411764705882353,0.6952141057934509 221 | 219,1.0,1000,2,1,0.7466386554621849,0.707808564231738 222 | 220,1.0,1000,2,42,0.7495798319327731,0.7040302267002518 223 | 221,1.0,1000,2,1337,0.7411764705882353,0.6964735516372796 224 | 222,1.0,2000,1,1,0.7857142857142857,0.7141057934508817 225 | 223,1.0,2000,1,42,0.7802521008403361,0.7204030226700252 226 | 224,1.0,2000,1,1337,0.7747899159663866,0.7090680100755667 227 | 225,1.0,2000,2,1,0.7861344537815126,0.7128463476070529 228 | 226,1.0,2000,2,42,0.7815126050420168,0.7216624685138538 229 | 227,1.0,2000,2,1337,0.7752100840336135,0.7090680100755667 230 | 228,1.0,3000,1,1,0.7718487394957984,0.6914357682619647 231 | 229,1.0,3000,1,42,0.7701680672268908,0.716624685138539 232 | 230,1.0,3000,1,1337,0.7676470588235295,0.7065491183879093 233 | 231,1.0,3000,2,1,0.7739495798319328,0.6914357682619647 234 | 232,1.0,3000,2,42,0.7689075630252101,0.716624685138539 235 | 233,1.0,3000,2,1337,0.7684873949579832,0.7065491183879093 236 | 
234,1.0,4000,1,1,0.7470588235294118,0.6649874055415617 237 | 235,1.0,4000,1,42,0.7521008403361344,0.7015113350125944 238 | 236,1.0,4000,1,1337,0.7495798319327731,0.6801007556675063 239 | 237,1.0,4000,2,1,0.7483193277310924,0.6662468513853904 240 | 238,1.0,4000,2,42,0.7525210084033613,0.7015113350125944 241 | 239,1.0,4000,2,1337,0.7508403361344538,0.6788413098236776 242 | 240,1.0,5000,1,1,0.7226890756302521,0.6486146095717884 243 | 241,1.0,5000,1,42,0.7327731092436974,0.6763224181360201 244 | 242,1.0,5000,1,1337,0.7184873949579832,0.6536523929471033 245 | 243,1.0,5000,2,1,0.723109243697479,0.6486146095717884 246 | 244,1.0,5000,2,42,0.7327731092436974,0.6788413098236776 247 | 245,1.0,5000,2,1337,0.7180672268907563,0.6536523929471033 248 | 246,1.0,8000,1,1,0.6701680672268907,0.6259445843828715 249 | 247,1.0,8000,1,42,0.6718487394957983,0.6574307304785895 250 | 248,1.0,8000,1,1337,0.6743697478991597,0.6322418136020151 251 | 249,1.0,8000,2,1,0.6701680672268907,0.6259445843828715 252 | 250,1.0,8000,2,42,0.673109243697479,0.6561712846347607 253 | 251,1.0,8000,2,1337,0.6756302521008404,0.6335012594458438 254 | -------------------------------------------------------------------------------- /model_results/shopping_svc_results.csv: -------------------------------------------------------------------------------- 1 | ,max_df,max_features,min_df,random_state,train_score,test_score 2 | 0,0.9,8000,2,55,0.9878151260504202,0.8198992443324937 3 | -------------------------------------------------------------------------------- /models/tfidf.pkl: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/gscottstukey/Localebnb/c48b21995606602a48ac8e3740edb34383ea14ea/models/tfidf.pkl -------------------------------------------------------------------------------- /models/top_words_artsy.pkl: -------------------------------------------------------------------------------- 1 | c__builtin__ 2 | set 3 | p0 4 | ((lp1 5 | Vbushwick 6 | 
p2 7 | aVgolden 8 | p3 9 | aVbars 10 | p4 11 | aVart 12 | p5 13 | aVbart 14 | p6 15 | aVpeople 16 | p7 17 | aVsoma 18 | p8 19 | aVgalleries 20 | p9 21 | aVmission 22 | p10 23 | aVvillage 24 | p11 25 | aVclinton 26 | p12 27 | aVliving 28 | p13 29 | aVgreenpoint 30 | p14 31 | aVspace 32 | p15 33 | aVmuseum 34 | p16 35 | aVstuy 36 | p17 37 | aVwilliamsburg 38 | p18 39 | aVdolores 40 | p19 41 | aVhaight 42 | p20 43 | aVwhich 44 | p21 45 | aVvalencia 46 | p22 47 | aVbeach 48 | p23 49 | aVwe 50 | p24 51 | aVloft 52 | p25 53 | aVbus 54 | p26 55 | aVsafe 56 | p27 57 | aVpark 58 | p28 59 | aVbrooklyn 60 | p29 61 | aVtrain 62 | p30 63 | aVstuyvesant 64 | p31 65 | aVbedford 66 | p32 67 | aVcentral 68 | p33 69 | aVcastro 70 | p34 71 | atp35 72 | Rp36 73 | . -------------------------------------------------------------------------------- /models/top_words_dining.pkl: -------------------------------------------------------------------------------- 1 | c__builtin__ 2 | set 3 | p0 4 | ((lp1 5 | Vprospect 6 | p2 7 | aVupper 8 | p3 9 | aVbart 10 | p4 11 | aVheart 12 | p5 13 | aVhouse 14 | p6 15 | aVmission 16 | p7 17 | aVvillage 18 | p8 19 | aVclose 20 | p9 21 | aVglen 22 | p10 23 | aVunion 24 | p11 25 | aVmuseum 26 | p12 27 | aVwilliamsburg 28 | p13 29 | aVdolores 30 | p14 31 | aVzoo 32 | p15 33 | aVother 34 | p16 35 | aVhaight 36 | p17 37 | aVutica 38 | p18 39 | aVth 40 | p19 41 | aVnopa 42 | p20 43 | aVvalencia 44 | p21 45 | aVfull 46 | p22 47 | aVloft 48 | p23 49 | aVvery 50 | p24 51 | aVexpress 52 | p25 53 | aVpark 54 | p26 55 | aVtrain 56 | p27 57 | aVstuyvesant 58 | p28 59 | aVcentral 60 | p29 61 | aVisland 62 | p30 63 | aVlexington 64 | p31 65 | aValamo 66 | p32 67 | aVsubway 68 | p33 69 | aVside 70 | p34 71 | atp35 72 | Rp36 73 | . 
-------------------------------------------------------------------------------- /models/top_words_nightlife.pkl: -------------------------------------------------------------------------------- 1 | c__builtin__ 2 | set 3 | p0 4 | ((lp1 5 | Vprospect 6 | p2 7 | aVheart 8 | p3 9 | aVbars 10 | p4 11 | aVsquare 12 | p5 13 | aVbart 14 | p6 15 | aVgalleries 16 | p7 17 | aVmission 18 | p8 19 | aVvillage 20 | p9 21 | aVmarket 22 | p10 23 | aVglen 24 | p11 25 | aVunion 26 | p12 27 | aVwest 28 | p13 29 | aVwilliamsburg 30 | p14 31 | aVdolores 32 | p15 33 | aVheights 34 | p16 35 | aVother 36 | p17 37 | aVbest 38 | p18 39 | aVvalencia 40 | p19 41 | aVmanhattan 42 | p20 43 | aVavailable 44 | p21 45 | aVqueen 46 | p22 47 | aVpark 48 | p23 49 | aVbrooklyn 50 | p24 51 | aVtrain 52 | p25 53 | aVgate 54 | p26 55 | aVday 56 | p27 57 | aVjfk 58 | p28 59 | aVchelsea 60 | p29 61 | aVcastro 62 | p30 63 | aVso 64 | p31 65 | aVmoscone 66 | p32 67 | aVminutes 68 | p33 69 | aVgeary 70 | p34 71 | atp35 72 | Rp36 73 | . 
-------------------------------------------------------------------------------- /models/top_words_shopping.pkl: -------------------------------------------------------------------------------- 1 | c__builtin__ 2 | set 3 | p0 4 | ((lp1 5 | Vprospect 6 | p2 7 | aVupper 8 | p3 9 | aVbars 10 | p4 11 | aVheart 12 | p5 13 | aVmission 14 | p6 15 | aVtartine 16 | p7 17 | aVvillage 18 | p8 19 | aVhome 20 | p9 21 | aVnolita 22 | p10 23 | aVboutiques 24 | p11 25 | aVbest 26 | p12 27 | aVliving 28 | p13 29 | aVitaly 30 | p14 31 | aVwilliamsburg 32 | p15 33 | aVdolores 34 | p16 35 | aVlocation 36 | p17 37 | aVth 38 | p18 39 | aVvalencia 40 | p19 41 | aVeast 42 | p20 43 | aVhill 44 | p21 45 | aVnoe 46 | p22 47 | aVpark 48 | p23 49 | aVbrooklyn 50 | p24 51 | aVtrain 52 | p25 53 | aVline 54 | p26 55 | aVentire 56 | p27 57 | aVlower 58 | p28 59 | aVvalley 60 | p29 61 | aVcastro 62 | p30 63 | aVsoho 64 | p31 65 | aVharlem 66 | p32 67 | aVminutes 68 | p33 69 | aVside 70 | p34 71 | atp35 72 | Rp36 73 | . -------------------------------------------------------------------------------- /static/css/bootstrap-slider.css: -------------------------------------------------------------------------------- 1 | /*! ========================================================= 2 | * bootstrap-slider.js 3 | * 4 | * Maintainers: 5 | * Kyle Kemp 6 | * - Twitter: @seiyria 7 | * - Github: seiyria 8 | * Rohit Kalkur 9 | * - Twitter: @Rovolutionary 10 | * - Github: rovolution 11 | * 12 | * ========================================================= 13 | * 14 | * Licensed under the Apache License, Version 2.0 (the "License"); 15 | * you may not use this file except in compliance with the License. 
16 | * You may obtain a copy of the License at 17 | * 18 | * http://www.apache.org/licenses/LICENSE-2.0 19 | * 20 | * Unless required by applicable law or agreed to in writing, software 21 | * distributed under the License is distributed on an "AS IS" BASIS, 22 | * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 23 | * See the License for the specific language governing permissions and 24 | * limitations under the License. 25 | * ========================================================= */ 26 | .slider { 27 | display: inline-block; 28 | vertical-align: middle; 29 | position: relative; 30 | } 31 | .slider.slider-horizontal { 32 | width: 210px; 33 | height: 20px; 34 | } 35 | .slider.slider-horizontal .slider-track { 36 | height: 10px; 37 | width: 100%; 38 | margin-top: -5px; 39 | top: 50%; 40 | left: 0; 41 | } 42 | .slider.slider-horizontal .slider-selection, 43 | .slider.slider-horizontal .slider-track-low, 44 | .slider.slider-horizontal .slider-track-high { 45 | height: 100%; 46 | top: 0; 47 | bottom: 0; 48 | } 49 | .slider.slider-horizontal .slider-tick, 50 | .slider.slider-horizontal .slider-handle { 51 | margin-left: -10px; 52 | margin-top: -5px; 53 | } 54 | .slider.slider-horizontal .slider-tick.triangle, 55 | .slider.slider-horizontal .slider-handle.triangle { 56 | border-width: 0 10px 10px 10px; 57 | width: 0; 58 | height: 0; 59 | border-bottom-color: #0480be; 60 | margin-top: 0; 61 | } 62 | .slider.slider-horizontal .slider-tick-label-container { 63 | white-space: nowrap; 64 | margin-top: 20px; 65 | } 66 | .slider.slider-horizontal .slider-tick-label-container .slider-tick-label { 67 | padding-top: 4px; 68 | display: inline-block; 69 | text-align: center; 70 | } 71 | .slider.slider-vertical { 72 | height: 210px; 73 | width: 20px; 74 | } 75 | .slider.slider-vertical .slider-track { 76 | width: 10px; 77 | height: 100%; 78 | margin-left: -5px; 79 | left: 50%; 80 | top: 0; 81 | } 82 | .slider.slider-vertical .slider-selection { 83 | 
width: 100%; 84 | left: 0; 85 | top: 0; 86 | bottom: 0; 87 | } 88 | .slider.slider-vertical .slider-track-low, 89 | .slider.slider-vertical .slider-track-high { 90 | width: 100%; 91 | left: 0; 92 | right: 0; 93 | } 94 | .slider.slider-vertical .slider-tick, 95 | .slider.slider-vertical .slider-handle { 96 | margin-left: -5px; 97 | margin-top: -10px; 98 | } 99 | .slider.slider-vertical .slider-tick.triangle, 100 | .slider.slider-vertical .slider-handle.triangle { 101 | border-width: 10px 0 10px 10px; 102 | width: 1px; 103 | height: 1px; 104 | border-left-color: #0480be; 105 | margin-left: 0; 106 | } 107 | .slider.slider-disabled .slider-handle { 108 | background-image: -webkit-linear-gradient(top, #dfdfdf 0%, #bebebe 100%); 109 | background-image: -o-linear-gradient(top, #dfdfdf 0%, #bebebe 100%); 110 | background-image: linear-gradient(to bottom, #dfdfdf 0%, #bebebe 100%); 111 | background-repeat: repeat-x; 112 | filter: progid:DXImageTransform.Microsoft.gradient(startColorstr='#ffdfdfdf', endColorstr='#ffbebebe', GradientType=0); 113 | } 114 | .slider.slider-disabled .slider-track { 115 | background-image: -webkit-linear-gradient(top, #e5e5e5 0%, #e9e9e9 100%); 116 | background-image: -o-linear-gradient(top, #e5e5e5 0%, #e9e9e9 100%); 117 | background-image: linear-gradient(to bottom, #e5e5e5 0%, #e9e9e9 100%); 118 | background-repeat: repeat-x; 119 | filter: progid:DXImageTransform.Microsoft.gradient(startColorstr='#ffe5e5e5', endColorstr='#ffe9e9e9', GradientType=0); 120 | cursor: not-allowed; 121 | } 122 | .slider input { 123 | display: none; 124 | } 125 | .slider .tooltip.top { 126 | margin-top: -36px; 127 | } 128 | .slider .tooltip-inner { 129 | white-space: nowrap; 130 | } 131 | .slider .hide { 132 | display: none; 133 | } 134 | .slider-track { 135 | position: absolute; 136 | cursor: pointer; 137 | background-image: -webkit-linear-gradient(top, #f5f5f5 0%, #f9f9f9 100%); 138 | background-image: -o-linear-gradient(top, #f5f5f5 0%, #f9f9f9 100%); 139 | 
background-image: linear-gradient(to bottom, #f5f5f5 0%, #f9f9f9 100%); 140 | background-repeat: repeat-x; 141 | filter: progid:DXImageTransform.Microsoft.gradient(startColorstr='#fff5f5f5', endColorstr='#fff9f9f9', GradientType=0); 142 | -webkit-box-shadow: inset 0 1px 2px rgba(0, 0, 0, 0.1); 143 | box-shadow: inset 0 1px 2px rgba(0, 0, 0, 0.1); 144 | border-radius: 4px; 145 | } 146 | .slider-selection { 147 | position: absolute; 148 | background-image: -webkit-linear-gradient(top, #f9f9f9 0%, #f5f5f5 100%); 149 | background-image: -o-linear-gradient(top, #f9f9f9 0%, #f5f5f5 100%); 150 | background-image: linear-gradient(to bottom, #f9f9f9 0%, #f5f5f5 100%); 151 | background-repeat: repeat-x; 152 | filter: progid:DXImageTransform.Microsoft.gradient(startColorstr='#fff9f9f9', endColorstr='#fff5f5f5', GradientType=0); 153 | -webkit-box-shadow: inset 0 -1px 0 rgba(0, 0, 0, 0.15); 154 | box-shadow: inset 0 -1px 0 rgba(0, 0, 0, 0.15); 155 | -webkit-box-sizing: border-box; 156 | -moz-box-sizing: border-box; 157 | box-sizing: border-box; 158 | border-radius: 4px; 159 | } 160 | .slider-selection.tick-slider-selection { 161 | background-image: -webkit-linear-gradient(top, #89cdef 0%, #81bfde 100%); 162 | background-image: -o-linear-gradient(top, #89cdef 0%, #81bfde 100%); 163 | background-image: linear-gradient(to bottom, #89cdef 0%, #81bfde 100%); 164 | background-repeat: repeat-x; 165 | filter: progid:DXImageTransform.Microsoft.gradient(startColorstr='#ff89cdef', endColorstr='#ff81bfde', GradientType=0); 166 | } 167 | .slider-track-low, 168 | .slider-track-high { 169 | position: absolute; 170 | background: transparent; 171 | -webkit-box-sizing: border-box; 172 | -moz-box-sizing: border-box; 173 | box-sizing: border-box; 174 | border-radius: 4px; 175 | } 176 | .slider-handle { 177 | position: absolute; 178 | width: 20px; 179 | height: 20px; 180 | background-color: #337ab7; 181 | background-image: -webkit-linear-gradient(top, #149bdf 0%, #0480be 100%); 182 | 
background-image: -o-linear-gradient(top, #149bdf 0%, #0480be 100%); 183 | background-image: linear-gradient(to bottom, #149bdf 0%, #0480be 100%); 184 | background-repeat: repeat-x; 185 | filter: progid:DXImageTransform.Microsoft.gradient(startColorstr='#ff149bdf', endColorstr='#ff0480be', GradientType=0); 186 | filter: none; 187 | -webkit-box-shadow: inset 0 1px 0 rgba(255,255,255,.2), 0 1px 2px rgba(0,0,0,.05); 188 | box-shadow: inset 0 1px 0 rgba(255,255,255,.2), 0 1px 2px rgba(0,0,0,.05); 189 | border: 0px solid transparent; 190 | } 191 | .slider-handle.round { 192 | border-radius: 50%; 193 | } 194 | .slider-handle.triangle { 195 | background: transparent none; 196 | } 197 | .slider-handle.custom { 198 | background: transparent none; 199 | } 200 | .slider-handle.custom::before { 201 | line-height: 20px; 202 | font-size: 20px; 203 | content: '\2605'; 204 | color: #726204; 205 | } 206 | .slider-tick { 207 | position: absolute; 208 | width: 20px; 209 | height: 20px; 210 | background-image: -webkit-linear-gradient(top, #f9f9f9 0%, #f5f5f5 100%); 211 | background-image: -o-linear-gradient(top, #f9f9f9 0%, #f5f5f5 100%); 212 | background-image: linear-gradient(to bottom, #f9f9f9 0%, #f5f5f5 100%); 213 | background-repeat: repeat-x; 214 | filter: progid:DXImageTransform.Microsoft.gradient(startColorstr='#fff9f9f9', endColorstr='#fff5f5f5', GradientType=0); 215 | -webkit-box-shadow: inset 0 -1px 0 rgba(0, 0, 0, 0.15); 216 | box-shadow: inset 0 -1px 0 rgba(0, 0, 0, 0.15); 217 | -webkit-box-sizing: border-box; 218 | -moz-box-sizing: border-box; 219 | box-sizing: border-box; 220 | filter: none; 221 | opacity: 0.8; 222 | border: 0px solid transparent; 223 | } 224 | .slider-tick.round { 225 | border-radius: 50%; 226 | } 227 | .slider-tick.triangle { 228 | background: transparent none; 229 | } 230 | .slider-tick.custom { 231 | background: transparent none; 232 | } 233 | .slider-tick.custom::before { 234 | line-height: 20px; 235 | font-size: 20px; 236 | content: '\2605'; 237 
| color: #726204; 238 | } 239 | .slider-tick.in-selection { 240 | background-image: -webkit-linear-gradient(top, #89cdef 0%, #81bfde 100%); 241 | background-image: -o-linear-gradient(top, #89cdef 0%, #81bfde 100%); 242 | background-image: linear-gradient(to bottom, #89cdef 0%, #81bfde 100%); 243 | background-repeat: repeat-x; 244 | filter: progid:DXImageTransform.Microsoft.gradient(startColorstr='#ff89cdef', endColorstr='#ff81bfde', GradientType=0); 245 | opacity: 1; 246 | } 247 | -------------------------------------------------------------------------------- /static/css/localebnb_main.css: -------------------------------------------------------------------------------- 1 | @font-face{font-family:Circular;src:url("https://a1.muscache.com/airbnb/static/o2.1/build/fonts/Circular_Air-Book-22799398756cc42454a77735013a3378.eot");src:url("https://a1.muscache.com/airbnb/static/o2.1/build/fonts/Circular_Air-Book-22799398756cc42454a77735013a3378.eot?#") format("eot"),url("https://a1.muscache.com/airbnb/static/o2.1/build/fonts/Circular_Air-Book-030dcebde359eb3be354ab21c34a89ce.woff") format("woff"),url("https://a1.muscache.com/airbnb/static/o2.1/build/fonts/Circular_Air-Book-287e910a06039c130e488343a7564c39.svg") format("svg");font-weight:normal;font-style:normal;} 2 | @font-face{font-family:Circular;src:url("https://a2.muscache.com/airbnb/static/o2.1/build/fonts/Circular_Air-Book_Italic-35e1cf57d93dc4eb3db11cc2448cb91f.eot");src:url("https://a2.muscache.com/airbnb/static/o2.1/build/fonts/Circular_Air-Book_Italic-35e1cf57d93dc4eb3db11cc2448cb91f.eot?#") format("eot"),url("https://a1.muscache.com/airbnb/static/o2.1/build/fonts/Circular_Air-Book_Italic-1db902f5b85bbb0964e2994434edbe16.woff") format("woff"),url("https://a1.muscache.com/airbnb/static/o2.1/build/fonts/Circular_Air-Book_Italic-0d9eb203b260869dc2470ae35162fc1e.svg") format("svg");font-weight:normal;font-style:italic;} 3 | 
@font-face{font-family:Circular;src:url("https://a0.muscache.com/airbnb/static/o2.1/build/fonts/Circular_Air-Bold-d74b6eea213711f97770fccaf37a7644.eot");src:url("https://a0.muscache.com/airbnb/static/o2.1/build/fonts/Circular_Air-Bold-d74b6eea213711f97770fccaf37a7644.eot?#") format("eot"),url("https://a2.muscache.com/airbnb/static/o2.1/build/fonts/Circular_Air-Bold-ba3e389678777af817295255589ca6f5.woff") format("woff"),url("https://a2.muscache.com/airbnb/static/o2.1/build/fonts/Circular_Air-Bold-3831f8bc07e9e70a9b42b4be3ea1a32c.svg") format("svg");font-weight:700;font-style:normal;} 4 | body{font-family:Circular,sans-serif;font-size:14px;line-height:1.43;-webkit-font-smoothing:antialiased;-moz-osx-font-smoothing:grayscale;} 5 | 6 | -------------------------------------------------------------------------------- /static/css/navbar-fixed-top.css: -------------------------------------------------------------------------------- 1 | body { 2 | min-height: 2000px; 3 | padding-top: 70px; 4 | } 5 | -------------------------------------------------------------------------------- /static/css/sortable-theme-bootstrap-for-localebnb.css: -------------------------------------------------------------------------------- 1 | /* line 2, ../sass/_sortable.sass */ 2 | table[data-sortable] { 3 | border-collapse: collapse; 4 | border-spacing: 0; 5 | } 6 | /* line 6, ../sass/_sortable.sass */ 7 | table[data-sortable] th { 8 | vertical-align: bottom; 9 | font-weight: bold; 10 | } 11 | /* line 10, ../sass/_sortable.sass */ 12 | table[data-sortable] th, table[data-sortable] td { 13 | text-align: left; 14 | padding: 10px; 15 | } 16 | /* line 14, ../sass/_sortable.sass */ 17 | table[data-sortable] th:not([data-sortable="false"]) { 18 | -webkit-user-select: none; 19 | -moz-user-select: none; 20 | -ms-user-select: none; 21 | -o-user-select: none; 22 | user-select: none; 23 | -webkit-tap-highlight-color: rgba(0, 0, 0, 0); 24 | -webkit-touch-callout: none; 25 | cursor: pointer; 26 | } 27 | /* 
line 26, ../sass/_sortable.sass */ 28 | table[data-sortable] th:after { 29 | content: ""; 30 | visibility: hidden; 31 | display: inline-block; 32 | vertical-align: inherit; 33 | height: 0; 34 | width: 0; 35 | border-width: 5px; 36 | border-style: solid; 37 | border-color: transparent; 38 | margin-right: 1px; 39 | margin-left: 10px; 40 | float: right; 41 | } 42 | /* line 40, ../sass/_sortable.sass */ 43 | table[data-sortable] th[data-sorted="true"]:after { 44 | visibility: visible; 45 | } 46 | /* line 43, ../sass/_sortable.sass */ 47 | table[data-sortable] th[data-sorted-direction="descending"]:after { 48 | border-top-color: inherit; 49 | margin-top: 8px; 50 | } 51 | /* line 47, ../sass/_sortable.sass */ 52 | table[data-sortable] th[data-sorted-direction="ascending"]:after { 53 | border-bottom-color: inherit; 54 | margin-top: 3px; 55 | } 56 | 57 | /* line 5, ../sass/sortable-theme-bootstrap.sass */ 58 | table[data-sortable].sortable-theme-bootstrap { 59 | font-size: 14px; 60 | line-height: 20px; 61 | color: #333333; 62 | background: white; 63 | } 64 | /* line 12, ../sass/sortable-theme-bootstrap.sass */ 65 | table[data-sortable].sortable-theme-bootstrap thead th { 66 | border-bottom: 2px solid #e0e0e0; 67 | } 68 | /* line 15, ../sass/sortable-theme-bootstrap.sass */ 69 | table[data-sortable].sortable-theme-bootstrap tbody td { 70 | border-top: 1px solid #e0e0e0; 71 | } 72 | /* line 18, ../sass/sortable-theme-bootstrap.sass */ 73 | table[data-sortable].sortable-theme-bootstrap th[data-sorted="true"] { 74 | color: #3a87ad; 75 | background: #d9edf7; 76 | border-bottom-color: #bce8f1; 77 | } 78 | /* line 23, ../sass/sortable-theme-bootstrap.sass */ 79 | table[data-sortable].sortable-theme-bootstrap th[data-sorted="true"][data-sorted-direction="descending"]:after { 80 | border-top-color: #3a87ad; 81 | } 82 | /* line 26, ../sass/sortable-theme-bootstrap.sass */ 83 | table[data-sortable].sortable-theme-bootstrap 
th[data-sorted="true"][data-sorted-direction="ascending"]:after { 84 | border-bottom-color: #3a87ad; 85 | } 86 | /* line 31, ../sass/sortable-theme-bootstrap.sass */ 87 | table[data-sortable].sortable-theme-bootstrap.sortable-theme-bootstrap-striped tbody > tr:nth-child(odd) > td { 88 | background-color: #f9f9f9; 89 | } 90 | -------------------------------------------------------------------------------- /static/fonts/glyphicons-halflings-regular.eot: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/gscottstukey/Localebnb/c48b21995606602a48ac8e3740edb34383ea14ea/static/fonts/glyphicons-halflings-regular.eot -------------------------------------------------------------------------------- /static/fonts/glyphicons-halflings-regular.ttf: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/gscottstukey/Localebnb/c48b21995606602a48ac8e3740edb34383ea14ea/static/fonts/glyphicons-halflings-regular.ttf -------------------------------------------------------------------------------- /static/fonts/glyphicons-halflings-regular.woff: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/gscottstukey/Localebnb/c48b21995606602a48ac8e3740edb34383ea14ea/static/fonts/glyphicons-halflings-regular.woff -------------------------------------------------------------------------------- /static/fonts/glyphicons-halflings-regular.woff2: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/gscottstukey/Localebnb/c48b21995606602a48ac8e3740edb34383ea14ea/static/fonts/glyphicons-halflings-regular.woff2 -------------------------------------------------------------------------------- /static/img/alamo_square.jpg: -------------------------------------------------------------------------------- 
https://raw.githubusercontent.com/gscottstukey/Localebnb/c48b21995606602a48ac8e3740edb34383ea14ea/static/img/alamo_square.jpg -------------------------------------------------------------------------------- /static/img/location_heart_ico.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/gscottstukey/Localebnb/c48b21995606602a48ac8e3740edb34383ea14ea/static/img/location_heart_ico.png -------------------------------------------------------------------------------- /static/img/location_ico.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/gscottstukey/Localebnb/c48b21995606602a48ac8e3740edb34383ea14ea/static/img/location_ico.png -------------------------------------------------------------------------------- /static/img/presentation_insights.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/gscottstukey/Localebnb/c48b21995606602a48ac8e3740edb34383ea14ea/static/img/presentation_insights.jpg -------------------------------------------------------------------------------- /static/img/presentation_methodology.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/gscottstukey/Localebnb/c48b21995606602a48ac8e3740edb34383ea14ea/static/img/presentation_methodology.jpg -------------------------------------------------------------------------------- /static/img/presentation_solution.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/gscottstukey/Localebnb/c48b21995606602a48ac8e3740edb34383ea14ea/static/img/presentation_solution.jpg -------------------------------------------------------------------------------- /static/img/presentation_title.jpg: -------------------------------------------------------------------------------- 
https://raw.githubusercontent.com/gscottstukey/Localebnb/c48b21995606602a48ac8e3740edb34383ea14ea/static/img/presentation_title.jpg -------------------------------------------------------------------------------- /static/js/ie-emulation-modes-warning.js: -------------------------------------------------------------------------------- 1 | // NOTICE!! DO NOT USE ANY OF THIS JAVASCRIPT 2 | // IT'S JUST JUNK FOR OUR DOCS! 3 | // ++++++++++++++++++++++++++++++++++++++++++ 4 | /*! 5 | * Copyright 2014 Twitter, Inc. 6 | * 7 | * Licensed under the Creative Commons Attribution 3.0 Unported License. For 8 | * details, see http://creativecommons.org/licenses/by/3.0/. 9 | */ 10 | // Intended to prevent false-positive bug reports about Bootstrap not working properly in old versions of IE due to folks testing using IE's unreliable emulation modes. 11 | (function () { 12 | 'use strict'; 13 | 14 | function emulatedIEMajorVersion() { 15 | var groups = /MSIE ([0-9.]+)/.exec(window.navigator.userAgent) 16 | if (groups === null) { 17 | return null 18 | } 19 | var ieVersionNum = parseInt(groups[1], 10) 20 | var ieMajorVersion = Math.floor(ieVersionNum) 21 | return ieMajorVersion 22 | } 23 | 24 | function actualNonEmulatedIEMajorVersion() { 25 | // Detects the actual version of IE in use, even if it's in an older-IE emulation mode. 
26 | // IE JavaScript conditional compilation docs: http://msdn.microsoft.com/en-us/library/ie/121hztk3(v=vs.94).aspx 27 | // @cc_on docs: http://msdn.microsoft.com/en-us/library/ie/8ka90k2e(v=vs.94).aspx 28 | var jscriptVersion = new Function('/*@cc_on return @_jscript_version; @*/')() // jshint ignore:line 29 | if (jscriptVersion === undefined) { 30 | return 11 // IE11+ not in emulation mode 31 | } 32 | if (jscriptVersion < 9) { 33 | return 8 // IE8 (or lower; haven't tested on IE<8) 34 | } 35 | return jscriptVersion // IE9 or IE10 in any mode, or IE11 in non-IE11 mode 36 | } 37 | 38 | var ua = window.navigator.userAgent 39 | if (ua.indexOf('Opera') > -1 || ua.indexOf('Presto') > -1) { 40 | return // Opera, which might pretend to be IE 41 | } 42 | var emulated = emulatedIEMajorVersion() 43 | if (emulated === null) { 44 | return // Not IE 45 | } 46 | var nonEmulated = actualNonEmulatedIEMajorVersion() 47 | 48 | if (emulated !== nonEmulated) { 49 | window.alert('WARNING: You appear to be using IE' + nonEmulated + ' in IE' + emulated + ' emulation mode.\nIE emulation modes can behave significantly differently from ACTUAL older versions of IE.\nPLEASE DON\'T FILE BOOTSTRAP BUGS based on testing in IE emulation modes!') 50 | } 51 | })(); 52 | -------------------------------------------------------------------------------- /static/js/ie10-viewport-bug-workaround.js: -------------------------------------------------------------------------------- 1 | /*! 2 | * IE10 viewport hack for Surface/desktop Windows 8 bug 3 | * Copyright 2014 Twitter, Inc. 4 | * Licensed under the Creative Commons Attribution 3.0 Unported License. For 5 | * details, see http://creativecommons.org/licenses/by/3.0/. 
6 | */ 7 | 8 | // See the Getting Started docs for more information: 9 | // http://getbootstrap.com/getting-started/#support-ie10-width 10 | 11 | (function () { 12 | 'use strict'; 13 | if (navigator.userAgent.match(/IEMobile\/10\.0/)) { 14 | var msViewportStyle = document.createElement('style') 15 | msViewportStyle.appendChild( 16 | document.createTextNode( 17 | '@-ms-viewport{width:auto!important}' 18 | ) 19 | ) 20 | document.querySelector('head').appendChild(msViewportStyle) 21 | } 22 | })(); 23 | -------------------------------------------------------------------------------- /static/js/ie8-responsive-file-warning.js: -------------------------------------------------------------------------------- 1 | // NOTICE!! DO NOT USE ANY OF THIS JAVASCRIPT 2 | // IT'S JUST JUNK FOR OUR DOCS! 3 | // ++++++++++++++++++++++++++++++++++++++++++ 4 | /*! 5 | * Copyright 2011-2014 Twitter, Inc. 6 | * 7 | * Licensed under the Creative Commons Attribution 3.0 Unported License. For 8 | * details, see http://creativecommons.org/licenses/by/3.0/. 9 | */ 10 | // Intended to prevent false-positive bug reports about responsive styling supposedly not working in IE8. 11 | if (window.location.protocol == 'file:') { 12 | window.alert('ERROR: Bootstrap\'s responsive CSS is disabled!\nSee getbootstrap.com/getting-started/#respond-file-proto for details.') 13 | } 14 | -------------------------------------------------------------------------------- /static/js/npm.js: -------------------------------------------------------------------------------- 1 | // This file is autogenerated via the `commonjs` Grunt task. You can require() this file in a CommonJS environment. 
2 | require('../../js/transition.js') 3 | require('../../js/alert.js') 4 | require('../../js/button.js') 5 | require('../../js/carousel.js') 6 | require('../../js/collapse.js') 7 | require('../../js/dropdown.js') 8 | require('../../js/modal.js') 9 | require('../../js/tooltip.js') 10 | require('../../js/popover.js') 11 | require('../../js/scrollspy.js') 12 | require('../../js/tab.js') 13 | require('../../js/affix.js') -------------------------------------------------------------------------------- /static/js/sortable.js: -------------------------------------------------------------------------------- 1 | (function() { 2 | var SELECTOR, clickEvent, numberRegExp, sortable, touchDevice, trimRegExp; 3 | 4 | SELECTOR = 'table[data-sortable]'; 5 | 6 | numberRegExp = /^-?[£$¤]?[\d,.]+%?$/; 7 | 8 | trimRegExp = /^\s+|\s+$/g; 9 | 10 | touchDevice = 'ontouchstart' in document.documentElement; 11 | 12 | clickEvent = touchDevice ? 'touchstart' : 'click'; 13 | 14 | sortable = { 15 | init: function() { 16 | var table, tables, _i, _len, _results; 17 | tables = document.querySelectorAll(SELECTOR); 18 | _results = []; 19 | for (_i = 0, _len = tables.length; _i < _len; _i++) { 20 | table = tables[_i]; 21 | _results.push(sortable.initTable(table)); 22 | } 23 | return _results; 24 | }, 25 | initTable: function(table) { 26 | var i, th, ths, _i, _len; 27 | if (table.tHead.rows.length !== 1) { 28 | return; 29 | } 30 | if (table.getAttribute('data-sortable-initialized') === 'true') { 31 | return; 32 | } 33 | table.setAttribute('data-sortable-initialized', 'true'); 34 | ths = table.querySelectorAll('th'); 35 | for (i = _i = 0, _len = ths.length; _i < _len; i = ++_i) { 36 | th = ths[i]; 37 | if (th.getAttribute('data-sortable') !== 'false') { 38 | sortable.setupClickableTH(table, th, i); 39 | } 40 | } 41 | return table; 42 | }, 43 | setupClickableTH: function(table, th, i) { 44 | var type; 45 | type = sortable.getColumnType(table, i); 46 | return th.addEventListener(clickEvent, function(e) { 
47 | var newSortedDirection, row, rowArray, rowArrayObject, sorted, sortedDirection, tBody, ths, _i, _j, _k, _len, _len1, _len2, _ref, _results; 48 | sorted = this.getAttribute('data-sorted') === 'true'; 49 | sortedDirection = this.getAttribute('data-sorted-direction'); 50 | if (sorted) { 51 | newSortedDirection = sortedDirection === 'ascending' ? 'descending' : 'ascending'; 52 | } else { 53 | newSortedDirection = type.defaultSortDirection; 54 | } 55 | ths = this.parentNode.querySelectorAll('th'); 56 | for (_i = 0, _len = ths.length; _i < _len; _i++) { 57 | th = ths[_i]; 58 | th.setAttribute('data-sorted', 'false'); 59 | th.removeAttribute('data-sorted-direction'); 60 | } 61 | this.setAttribute('data-sorted', 'true'); 62 | this.setAttribute('data-sorted-direction', newSortedDirection); 63 | tBody = table.tBodies[0]; 64 | rowArray = []; 65 | _ref = tBody.rows; 66 | for (_j = 0, _len1 = _ref.length; _j < _len1; _j++) { 67 | row = _ref[_j]; 68 | rowArray.push([sortable.getNodeValue(row.cells[i]), row]); 69 | } 70 | if (sorted) { 71 | rowArray.reverse(); 72 | } else { 73 | rowArray.sort(type.compare); 74 | } 75 | _results = []; 76 | for (_k = 0, _len2 = rowArray.length; _k < _len2; _k++) { 77 | rowArrayObject = rowArray[_k]; 78 | _results.push(tBody.appendChild(rowArrayObject[1])); 79 | } 80 | return _results; 81 | }); 82 | }, 83 | 84 | clickRefresh2: function(table, col, i) { 85 | var type; 86 | type = sortable.types.numeric; 87 | 88 | var newSortedDirection, row, rowArray, rowArrayObject, sorted, sortedDirection, tBody, ths, _i, _j, _k, _len, _len1, _len2, _ref, _results; 89 | 90 | newSortedDirection = 'descending'; 91 | 92 | ths = col.parentNode.querySelectorAll('th'); 93 | for (_i = 0, _len = ths.length; _i < _len; _i++) { 94 | th = ths[_i]; 95 | th.setAttribute('data-sorted', 'false'); 96 | th.removeAttribute('data-sorted-direction'); 97 | } 98 | col.setAttribute('data-sorted', 'true'); 99 | col.setAttribute('data-sorted-direction', newSortedDirection); 100 | 
tBody = table.tBodies[0]; 101 | rowArray = []; 102 | _ref = tBody.rows; 103 | for (_j = 0, _len1 = _ref.length; _j < _len1; _j++) { 104 | row = _ref[_j]; 105 | rowArray.push([sortable.getNodeValue(row.cells[i]), row]); 106 | } 107 | if (sorted) { 108 | rowArray.reverse(); 109 | } else { 110 | rowArray.sort(type.compare); 111 | } 112 | _results = []; 113 | for (_k = 0, _len2 = rowArray.length; _k < _len2; _k++) { 114 | rowArrayObject = rowArray[_k]; 115 | _results.push(tBody.appendChild(rowArrayObject[1])); 116 | } 117 | return _results; 118 | }, 119 | 120 | getColumnType: function(table, i) { 121 | var row, text, _i, _len, _ref; 122 | _ref = table.tBodies[0].rows; 123 | for (_i = 0, _len = _ref.length; _i < _len; _i++) { 124 | row = _ref[_i]; 125 | text = sortable.getNodeValue(row.cells[i]); 126 | if (text !== '' && text.match(numberRegExp)) { 127 | return sortable.types.numeric; 128 | } 129 | } 130 | return sortable.types.alpha; 131 | }, 132 | getNodeValue: function(node) { 133 | if (!node) { 134 | return ''; 135 | } 136 | if (node.getAttribute('data-value') !== null) { 137 | return node.getAttribute('data-value'); 138 | } 139 | if (typeof node.innerText !== 'undefined') { 140 | return node.innerText.replace(trimRegExp, ''); 141 | } 142 | return node.textContent.replace(trimRegExp, ''); 143 | }, 144 | types: { 145 | numeric: { 146 | defaultSortDirection: 'descending', 147 | compare: function(a, b) { 148 | var aa, bb; 149 | aa = parseFloat(a[0].replace(/[^0-9.-]/g, '')); 150 | bb = parseFloat(b[0].replace(/[^0-9.-]/g, '')); 151 | if (isNaN(aa)) { 152 | aa = 0; 153 | } 154 | if (isNaN(bb)) { 155 | bb = 0; 156 | } 157 | return bb - aa; 158 | } 159 | }, 160 | alpha: { 161 | defaultSortDirection: 'ascending', 162 | compare: function(a, b) { 163 | var aa, bb; 164 | aa = a[0].toLowerCase(); 165 | bb = b[0].toLowerCase(); 166 | if (aa === bb) { 167 | return 0; 168 | } 169 | if (aa < bb) { 170 | return -1; 171 | } 172 | return 1; 173 | } 174 | } 175 | } 176 | }; 177 | 178 
| setTimeout(sortable.init, 0); 179 | 180 | window.Sortable = sortable; 181 | 182 | }).call(this); 183 | -------------------------------------------------------------------------------- /static/js/sortable.min.js: -------------------------------------------------------------------------------- 1 | /*! sortable.js 0.5.0 */ 2 | (function(){var a,b,c,d,e,f;a="table[data-sortable]",c=/^-?[£$¤]?[\d,.]+%?$/,f=/^\s+|\s+$/g,e="ontouchstart"in document.documentElement,b=e?"touchstart":"click",d={init:function(){var b,c,e,f,g;for(c=document.querySelectorAll(a),g=[],e=0,f=c.length;f>e;e++)b=c[e],g.push(d.initTable(b));return g},initTable:function(a){var b,c,e,f,g;if(1===a.tHead.rows.length&&"true"!==a.getAttribute("data-sortable-initialized")){for(a.setAttribute("data-sortable-initialized","true"),e=a.querySelectorAll("th"),b=f=0,g=e.length;g>f;b=++f)c=e[b],"false"!==c.getAttribute("data-sortable")&&d.setupClickableTH(a,c,b);return a}},setupClickableTH:function(a,c,e){var f;return f=d.getColumnType(a,e),c.addEventListener(b,function(){var b,g,h,i,j,k,l,m,n,o,p,q,r,s,t,u;for(j="true"===this.getAttribute("data-sorted"),k=this.getAttribute("data-sorted-direction"),b=j?"ascending"===k?"descending":"ascending":f.defaultSortDirection,m=this.parentNode.querySelectorAll("th"),n=0,q=m.length;q>n;n++)c=m[n],c.setAttribute("data-sorted","false"),c.removeAttribute("data-sorted-direction");for(this.setAttribute("data-sorted","true"),this.setAttribute("data-sorted-direction",b),l=a.tBodies[0],h=[],t=l.rows,o=0,r=t.length;r>o;o++)g=t[o],h.push([d.getNodeValue(g.cells[e]),g]);for(j?h.reverse():h.sort(f.compare),u=[],p=0,s=h.length;s>p;p++)i=h[p],u.push(l.appendChild(i[1]));return u})},getColumnType:function(a,b){var e,f,g,h,i;for(i=a.tBodies[0].rows,g=0,h=i.length;h>g;g++)if(e=i[g],f=d.getNodeValue(e.cells[b]),""!==f&&f.match(c))return d.types.numeric;return d.types.alpha},getNodeValue:function(a){return a?null!==a.getAttribute("data-value")?a.getAttribute("data-value"):"undefined"!=typeof 
a.innerText?a.innerText.replace(f,""):a.textContent.replace(f,""):""},types:{numeric:{defaultSortDirection:"descending",compare:function(a,b){var c,d;return c=parseFloat(a[0].replace(/[^0-9.-]/g,"")),d=parseFloat(b[0].replace(/[^0-9.-]/g,"")),isNaN(c)&&(c=0),isNaN(d)&&(d=0),d-c}},alpha:{defaultSortDirection:"ascending",compare:function(a,b){var c,d;return c=a[0].toLowerCase(),d=b[0].toLowerCase(),c===d?0:d>c?-1:1}}}},setTimeout(d.init,0),window.Sortable=d}).call(this); -------------------------------------------------------------------------------- /templates/contact.html: -------------------------------------------------------------------------------- 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | Localebnb - An Airbnb Contexual Recommender 12 | 13 | 14 | 15 | 16 | 17 | 18 | 19 | 20 | 21 | 22 | 23 | 24 | 25 | 29 | 30 | 31 | 46 | 47 | 57 | 58 | 59 | 60 | 61 | 62 | 83 |
84 |
85 |
86 | 87 | 88 |
89 |

Hi! I'm

90 |

G Scott Stukey

91 |
92 |

say hi to me via:

93 | 99 |
100 |
101 |
102 |
103 | 104 | 105 | 106 | 107 | 109 | 110 | 111 | 112 | 113 | 114 | 115 | 116 | 117 | 118 | 119 | 120 | -------------------------------------------------------------------------------- /templates/index.html: -------------------------------------------------------------------------------- 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | Localebnb - An Airbnb Contextual Recommender 12 | 13 | 14 | 15 | 16 | 17 | 18 | 19 | 20 | 21 | 22 | 23 | 24 | 25 | 29 | 30 | 31 | 43 | 44 | 54 | 55 | 56 | 57 | 58 | 79 |
80 |
81 | 82 | 83 |
84 |

Localebnb

85 |

An Airbnb Contextual Recommender to help you find the best 'hood to stay in

86 |
87 |
88 |
89 | 90 |
91 |
92 | 101 | 102 |

Where do you want to go?

103 |
104 | 107 |
108 |
109 | 112 |

113 | {% for city in cities%} 114 |
115 | 118 |
119 |
120 | {% endfor %} 121 | 122 |
123 |
124 | 125 |
126 |

Dates & Guests:

127 | 130 | 133 | 141 | 142 |
143 |
144 | 145 | 146 |
147 |

Select Neighborhood Trait Weights:

148 | 166 |
167 | 168 | 169 |
170 | 171 |
172 | 173 |
174 | 175 |
176 | 177 | 178 |
179 | 180 | 182 | 183 | 184 | 185 | 186 | 187 | 188 | 189 | 190 | 191 | 192 | 193 | 194 | 195 | 196 | -------------------------------------------------------------------------------- /templates/listing.html: -------------------------------------------------------------------------------- 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | Localebnb - An Airbnb Contextual Recommender 12 | 13 | 14 | 15 | 16 | 17 | 18 | 19 | 20 | 21 | 22 | 23 | 24 | 25 | 29 | 30 | 31 | 38 | 48 | 49 | 50 | 51 | 52 | 73 | 74 |
75 |
76 |

Key Words Pulled Out:

77 |
78 |
79 | Artsy Words: 80 |
    {% for word in listing_words_artsy%}
  • {{word}}
  • {% endfor %}

81 | Shopping Words: 82 |
    {% for word in listing_words_shopping%}
  • {{word}}
  • {% endfor %}

83 | Dining Words: 84 |
    {% for word in listing_words_dining%}
  • {{word}}
  • {% endfor %}

85 | Nightlife Words: 86 |
    {% for word in listing_words_nightlife%}
  • {{word}}
  • {% endfor %}

87 |
88 |
89 |

Listing Description:

90 |
91 | {{description_raw_html|safe}} 92 |
93 |
94 | 95 |
96 | 97 | 98 | 100 | 101 | 102 | 103 | 104 | 105 | 106 | 107 | 108 | 109 | 110 | 111 | 112 | 113 | 114 | -------------------------------------------------------------------------------- /templates/search.html: -------------------------------------------------------------------------------- 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 | Localebnb - Search Results for {{city}}, {{state}} 13 | 14 | 15 | 16 | 17 | 18 | 19 | 20 | 21 | 22 | 23 | 24 | 25 | 29 | 30 | 31 | 32 | 33 | 34 | 35 | 36 | 37 | 38 | 64 | 65 | 75 | 76 | 77 | 78 | 79 | 100 | 101 |
102 |
103 | 104 | 105 | 106 | {% for trait in traits%} 107 | 118 | {% endfor %} 119 | 120 | 121 |
{{trait}}: 108 | 117 |
122 |
123 | 124 | 125 | 126 |
127 |
128 |

Search Results for {{city}}, {{state}}:

129 | 130 | 131 | 132 | 133 | 134 | 135 | 136 | 137 | 138 | 139 | 140 | 141 | 142 | 143 | 144 | 145 | 146 | 147 | 148 | {% for listing in sorted_listings%} 149 | 150 | 151 | 152 | 153 | 154 | 155 | 156 | 157 | 158 | 159 | 160 | 161 | 162 | 163 | 164 | {% endfor %} 165 |
RankScorePrev RankPrev ScoreΔListingPriceASDN
{{listing_dict[listing]['position']}}{{"%.2f" % listing_dict[listing]['score']}}{{listing_dict[listing]['default_position']}}{{"%.2f" % listing_dict[listing]['default_score']}}{% if listing_dict[listing]['default_position'] - listing_dict[listing]['position'] >0 %}{{listing_dict[listing]['default_position'] - listing_dict[listing]['position']}}{% endif %}{% if listing_dict[listing]['default_position'] - listing_dict[listing]['position'] <0 %}{{listing_dict[listing]['default_position'] - listing_dict[listing]['position']}}{% endif %}{% if listing_dict[listing]['default_position'] - listing_dict[listing]['position']== 0 %}{% endif %}{{listing_dict[listing]['blurb']}}${{listing_dict[listing]['thumbnail_price']}}{% if listing_dict[listing]['is_artsy'] == 1 %}{% endif %}{% if listing_dict[listing]['is_artsy'] != 1 %}{% endif %}{% if listing_dict[listing]['is_shopping'] == 1 %}{% endif %}{% if listing_dict[listing]['is_shopping'] != 1 %}{% endif %}{% if listing_dict[listing]['is_dining'] == 1 %}{% endif %}{% if listing_dict[listing]['is_dining'] != 1 %}{% endif %}{% if listing_dict[listing]['is_nightlife'] == 1 %}{% endif %}{% if listing_dict[listing]['is_nightlife'] != 1 %}{% endif %}
166 |
167 | 168 |
169 |
170 |
171 |
172 |
173 |
174 |
175 | 176 | 207 | 208 | 271 | 272 |
273 | 274 | 276 | 277 | 278 | 279 | 280 | 281 | 282 | 283 | 284 | 285 | --------------------------------------------------------------------------------
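Note on the rank-change column in templates/search.html: the template computes `default_position - position` inline and tests it in three separate `{% if %}` blocks (greater than zero, less than zero, equal to zero). A minimal sketch of how that value could instead be computed once in Python before rendering is shown below. The function names `rank_delta` and `annotate_listings`, and the added keys `delta` and `delta_direction`, are hypothetical and do not appear in the repository; only the `position` and `default_position` keys come from the template.

```python
def rank_delta(default_position, position):
    """Return (delta, direction) for a listing's change in rank.

    A positive delta means the listing moved up compared with the
    default ordering, matching `default_position - position` in
    templates/search.html.
    """
    delta = default_position - position
    if delta > 0:
        direction = "up"
    elif delta < 0:
        direction = "down"
    else:
        direction = "same"
    return delta, direction


def annotate_listings(listing_dict):
    """Add the hypothetical 'delta' and 'delta_direction' keys in place.

    `listing_dict` is assumed to map a listing id to a dict that holds
    at least 'position' and 'default_position', as the template expects.
    """
    for listing in listing_dict.values():
        delta, direction = rank_delta(
            listing["default_position"], listing["position"]
        )
        listing["delta"] = delta
        listing["delta_direction"] = direction
    return listing_dict


if __name__ == "__main__":
    # Made-up example data, for illustration only.
    example = {
        "listing_a": {"position": 1, "default_position": 3},
        "listing_b": {"position": 2, "default_position": 2},
    }
    for listing_id, info in annotate_listings(example).items():
        print(listing_id, info["delta"], info["delta_direction"])
```

With values prepared this way, the template could branch on a single `delta_direction` field with one `{% if %}` / `{% elif %}` / `{% else %}` chain rather than repeating the subtraction in each condition. This is one possible arrangement, not a description of how the app currently works.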