└── README.md

/README.md:
--------------------------------------------------------------------------------
1 | # Kaggle - Regression
2 |
3 | "Those who cannot remember the past are condemned to repeat it." -- George Santayana
4 |
5 | This is a compiled list of Kaggle competitions and their winning solutions for [regression](https://en.wikipedia.org/wiki/Regression_analysis) problems.
6 |
7 | The purpose of compiling this list is to make these solutions easier to access and, in turn, to learn from the best in data science.
8 |
9 | Literature review is a crucial yet sometimes overlooked part of data science. To avoid reinventing the wheel and to get inspired on how to preprocess, engineer, and model the data, it's worth spending 1/10 to 1/5 of the project time just researching how people deal with similar problems/datasets.
10 |
11 | Time spent on literature review is time well spent.
12 |
13 | This is only one list in the whole compilation. For other lists of competitions and solutions, please refer to:
14 |
15 | * Kaggle - [Classification](https://github.com/ShuaiW/kaggle-classification/)
16 | * Kaggle - [Sequence](https://github.com/ShuaiW/kaggle-sequence)
17 | * Kaggle - [Image](https://github.com/ShuaiW/kaggle-image)
18 | * Kaggle - [Miscellaneous](https://github.com/ShuaiW/kaggle-miscellaneous)
19 |
20 | Hope the compilation can save you effort and offer you insights. Enjoy!
21 |
22 | ======
23 |
24 | ### [Grupo Bimbo Inventory Demand](https://www.kaggle.com/c/grupo-bimbo-inventory-demand)
25 |
26 | Wed 8 Jun 2016 – Tue 30 Aug 2016
27 |
28 | Maximize sales and minimize returns of bakery goods
29 |
30 | ======
31 |
32 | ### [Kobe Bryant Shot Selection](https://www.kaggle.com/c/kobe-bryant-shot-selection)
33 |
34 | Fri 15 Apr 2016 – Mon 13 Jun 2016
35 |
36 | Which shots did Kobe sink?
37 |
38 | ======
39 |
40 | ### [Home Depot Product Search Relevance](https://www.kaggle.com/c/home-depot-product-search-relevance)
41 |
42 | Mon 18 Jan 2016 – Mon 25 Apr 2016
43 |
44 | Predict the relevance of search results on homedepot.com
45 |
46 | ======
47 |
48 | ### [Rossmann Store Sales](https://www.kaggle.com/c/rossmann-store-sales)
49 |
50 | Wed 30 Sep 2015 – Mon 14 Dec 2015
51 |
52 | Forecast sales using store, promotion, and competitor data
53 |
54 | ======
55 |
56 | ### [How Much Did It Rain? II](https://www.kaggle.com/c/how-much-did-it-rain-ii)
57 |
58 | Thu 17 Sep 2015 – Mon 7 Dec 2015
59 |
60 | Predict hourly rainfall using data from polarimetric radars
61 |
62 | ======
63 |
64 | ### [Caterpillar Tube Pricing](https://www.kaggle.com/c/caterpillar-tube-pricing)
65 |
66 | Mon 29 Jun 2015 – Mon 31 Aug 2015
67 |
68 | Model quoted prices for industrial tube assemblies
69 |
70 | ======
71 |
72 | ### [Liberty Mutual Group: Property Inspection Prediction](https://www.kaggle.com/c/liberty-mutual-group-property-inspection-prediction)
73 |
74 | Mon 6 Jul 2015 – Fri 28 Aug 2015
75 |
76 | Quantify property hazards before time of inspection
77 |
78 | ======
79 |
80 | ### [ECML/PKDD 15: Taxi Trip Time Prediction (II)](https://www.kaggle.com/c/pkdd-15-taxi-trip-time-prediction-ii)
81 |
82 | Fri 24 Apr 2015 – Wed 1 Jul 2015
83 |
84 | Predict the total travel time of taxi trips based on their initial partial trajectories
85 |
86 | ======
87 |
88 | ### [Bike Sharing Demand](https://www.kaggle.com/c/bike-sharing-demand)
89 |
90 | Wed 28 May 2014 – Fri 29 May 2015
91 |
92 | Forecast use of a city bikeshare system
93 |
94 | ======
95 |
96 | ### [Walmart Recruiting II: Sales in Stormy Weather](https://www.kaggle.com/c/walmart-recruiting-sales-in-stormy-weather)
97 |
98 | Wed 1 Apr 2015 – Mon 25 May 2015
99 |
100 | Walmart challenges participants to accurately predict the sales of 111 potentially weather-sensitive products (like umbrellas, bread, and milk) around the time of major weather events at 45 of their retail locations.
101 |
102 | ======
103 |
104 | ### [How Much Did It Rain?](https://www.kaggle.com/c/how-much-did-it-rain)
105 |
106 | Fri 9 Jan 2015 – Fri 15 May 2015
107 |
108 | Predict probabilistic distribution of hourly rain given polarimetric radar measurements
109 |
110 | ======
111 |
112 | ### [Restaurant Revenue Prediction](https://www.kaggle.com/c/restaurant-revenue-prediction)
113 |
114 | Mon 23 Mar 2015 – Mon 4 May 2015
115 |
116 | Predict annual restaurant sales based on objective measurements
117 |
118 | ======
119 |
120 | ### [Finding Elo](https://www.kaggle.com/c/finding-elo)
121 |
122 | Mon 20 Oct 2014 – Mon 23 Mar 2015
123 |
124 | Predict a chess player's FIDE Elo rating from one game
125 |
126 | ======
127 |
128 | ### [Africa Soil Property Prediction Challenge](https://www.kaggle.com/c/afsis-soil-properties)
129 |
130 | Wed 27 Aug 2014 – Tue 21 Oct 2014
131 |
132 | Predict physical and chemical properties of soil using spectral measurements
133 |
134 | ======
135 |
136 | ### [Liberty Mutual Group - Fire Peril Loss Cost](https://www.kaggle.com/c/liberty-mutual-fire-peril)
137 |
138 | Tue 8 Jul 2014 – Tue 2 Sep 2014
139 |
140 | Predict expected fire losses for insurance policies
141 |
142 | ======
143 |
144 | ### [Walmart Recruiting - Store Sales Forecasting](https://www.kaggle.com/c/walmart-recruiting-store-sales-forecasting)
145 |
146 | Thu 20 Feb 2014 – Mon 5 May 2014
147 |
148 | In this recruiting competition, job-seekers are provided with historical sales data for 45 Walmart stores located in different regions. Each store contains many departments, and participants must project the sales for each department in each store.
149 |
150 | ======
151 |
152 | ### [PAKDD 2014 - ASUS Malfunctional Components Prediction](https://www.kaggle.com/c/pakdd-cup-2014)
153 |
154 | Sun 26 Jan 2014 – Tue 1 Apr 2014
155 |
156 | Predict malfunctional components of ASUS notebooks
157 |
158 | ======
159 |
160 | ### [Loan Default Prediction - Imperial College London](https://www.kaggle.com/c/loan-default-prediction)
161 |
162 | Fri 17 Jan 2014 – Fri 14 Mar 2014
163 |
164 | Constructing an optimal portfolio of loans
165 |
166 | ======
167 |
168 | ### [See Click Predict Fix](https://www.kaggle.com/c/see-click-predict-fix)
169 |
170 | Sun 29 Sep 2013 – Wed 27 Nov 2013
171 |
172 | Predict which 311 issues are most important to citizens
173 |
174 | ======
175 |
176 | ### [AMS 2013-2014 Solar Energy Prediction Contest](https://www.kaggle.com/c/ams-2014-solar-energy-prediction-contest)
177 |
178 | Mon 8 Jul 2013 – Fri 15 Nov 2013
179 |
180 | Forecast daily solar energy with an ensemble of weather models
181 |
182 | ======
183 |
184 | ### [The Big Data Combine Engineered by BattleFin](https://www.kaggle.com/c/battlefin-s-big-data-combine-forecasting-challenge)
185 |
186 | Fri 16 Aug 2013 – Tue 1 Oct 2013
187 |
188 | Predict short term movements in stock prices using news and sentiment data provided by RavenPack
189 |
190 | ======
191 |
192 | ### [See Click Predict Fix - Hackathon](https://www.kaggle.com/c/the-seeclickfix-311-challenge)
193 |
194 | Sat 28 Sep 2013 – Sun 29 Sep 2013
195 |
196 | Predict which 311 issues are most important to citizens
197 |
198 | ======
199 |
200 | ### [RecSys2013: Yelp Business Rating Prediction](https://www.kaggle.com/c/yelp-recsys-2013)
201 |
202 | Wed 24 Apr 2013 – Sat 31 Aug 2013
203 |
204 | RecSys Challenge 2013: Yelp business rating prediction
205 |
206 | ======
207 |
208 | ### [Yelp Recruiting Competition](https://www.kaggle.com/c/yelp-recruiting)
209 |
210 | Wed 27 Mar 2013 – Sun 30 Jun 2013
211 |
212 | The goal of this competition is to estimate the number of Useful votes a review will receive.
213 |
214 | ======
215 |
216 | ### [dunnhumby & hack/reduce Product Launch Challenge](https://www.kaggle.com/c/hack-reduce-dunnhumby-hackathon)
217 |
218 | Sat 11 May 2013 – Sat 11 May 2013
219 |
220 | The success or failure of a new product launch is often evident within the first few weeks of sales. Can you predict a product's destiny?
221 |
222 | ======
223 |
224 | ### [ICDAR2013 - Handwriting Stroke Recovery from Offline Data](https://www.kaggle.com/c/icdar2013-stroke-recovery-from-offline-data)
225 |
226 | Wed 20 Mar 2013 – Sat 20 Apr 2013
227 |
228 | Predict the trajectory of a handwritten signature
229 |
230 | ======
231 |
232 | ### [Blue Book for Bulldozers](https://www.kaggle.com/c/bluebook-for-bulldozers)
233 |
234 | Fri 25 Jan 2013 – Wed 17 Apr 2013
235 |
236 | Predict the auction sale price for a piece of heavy equipment to create a "blue book" for bulldozers.
237 |
238 | ======
239 |
240 | ### [Job Salary Prediction](https://www.kaggle.com/c/job-salary-prediction)
241 |
242 | Wed 13 Feb 2013 – Wed 3 Apr 2013
243 |
244 | Predict the salary of any UK job ad based on its contents.
245 |
246 | ======
247 |
248 | ### [Observing Dark Worlds](https://www.kaggle.com/c/DarkWorlds)
249 |
250 | Fri 12 Oct 2012 – Sun 16 Dec 2012
251 |
252 | Can you find the Dark Matter that dominates our Universe? Winton Capital offers you the chance to unlock the secrets of dark worlds.
253 |
254 | ======
255 |
256 | ### [U.S. Census Return Rate Challenge](https://www.kaggle.com/c/us-census-challenge)
257 |
258 | Fri 31 Aug 2012 – Sun 11 Nov 2012
259 |
260 | Predict census mail return rates.
261 |
262 | ======
263 |
264 | ### [Global Energy Forecasting Competition 2012 - Wind Forecasting](https://www.kaggle.com/c/GEF2012-wind-forecasting)
265 |
266 | Thu 6 Sep 2012 – Wed 31 Oct 2012
267 |
268 | A wind power forecasting problem: predicting hourly power generation up to 48 hours ahead at 7 wind farms
269 |
270 | ======
271 |
272 | ### [Global Energy Forecasting Competition 2012 - Load Forecasting](https://www.kaggle.com/c/global-energy-forecasting-competition-2012-load-forecasting)
273 |
274 | Sat 1 Sep 2012 – Wed 31 Oct 2012
275 |
276 | A hierarchical load forecasting problem: backcasting and forecasting hourly loads (in kW) for a US utility with 20 zones.
277 |
278 | ======
279 |
280 | ### [Raising Money to Fund an Organizational Mission](https://www.kaggle.com/c/Raising-Money-to-Fund-an-Organizational-Mission)
281 |
282 | Wed 18 Jul 2012 – Tue 18 Sep 2012
283 |
284 | Help worthy organizations more efficiently target and recruit loyal donors to support their causes.
285 |
286 | ======
287 |
288 | ### [Online Product Sales](https://www.kaggle.com/c/online-sales)
289 |
290 | Fri 4 May 2012 – Tue 3 Jul 2012
291 |
292 | Predict the online sales of a consumer product based on a data set of product features.
293 |
294 | ======
295 |
296 | ### [Psychopathy Prediction Based on Twitter Usage](https://www.kaggle.com/c/twitter-psychopathy-prediction)
297 |
298 | Mon 14 May 2012 – Fri 29 Jun 2012
299 |
300 | Identify people who have a high degree of Psychopathy based on Twitter usage.
301 |
302 | ======
303 |
304 | ### [Benchmark Bond Trade Price Challenge](https://www.kaggle.com/c/benchmark-bond-trade-price-challenge)
305 |
306 | Fri 27 Jan 2012 – Mon 30 Apr 2012
307 |
308 | Develop models to accurately predict the trade price of a bond.
309 |
310 | ======
311 |
312 | ### [EMC Data Science Global Hackathon (Air Quality Prediction)](https://www.kaggle.com/c/dsg-hackathon)
313 |
314 | Sat 28 Apr 2012 – Sun 29 Apr 2012
315 |
316 | Build a local early warning system to accurately predict dangerous levels of air pollutants on an hourly basis.
317 |
318 | ======
319 |
320 | ### [Algorithmic Trading Challenge](https://www.kaggle.com/c/AlgorithmicTradingChallenge)
321 |
322 | Fri 11 Nov 2011 – Sun 8 Jan 2012
323 |
324 | Develop new models to accurately predict the market response to large trades.
325 |
326 | ======
327 |
328 | ### [Allstate Claim Prediction Challenge](https://www.kaggle.com/c/ClaimPredictionChallenge)
329 |
330 | Wed 13 Jul 2011 – Wed 12 Oct 2011
331 |
332 | A key part of insurance is charging each customer the appropriate price for the risk they represent.
333 |
334 | ======
335 |
336 | ### [dunnhumby's Shopper Challenge](https://www.kaggle.com/c/dunnhumbychallenge)
337 |
338 | Fri 29 Jul 2011 – Fri 30 Sep 2011
339 |
340 | Going grocery shopping, we all have to do it, some even enjoy it, but can you predict it? dunnhumby is looking to build a model to better predict when supermarket shoppers will next visit the store and how much they will spend.
341 |
342 | ======
343 |
344 | ### [Wikipedia's Participation Challenge](https://www.kaggle.com/c/wikichallenge)
345 |
346 | Tue 28 Jun 2011 – Tue 20 Sep 2011
347 |
348 | This competition challenges data-mining experts to build a predictive model that predicts the number of edits an editor will make five months from the end date of the training dataset.
349 |
350 | ======
351 |
352 |
353 | ### [Mapping Dark Matter](https://www.kaggle.com/c/mdm)
354 |
355 | Mon 23 May 2011 – Thu 18 Aug 2011
356 |
357 | Measure the small distortion in galaxy images caused by dark matter
358 |
359 | ======
360 |
361 | ### [Deloitte/FIDE Chess Rating Challenge](https://www.kaggle.com/c/ChessRatings2)
362 |
363 | Mon 7 Feb 2011 – Wed 4 May 2011
364 |
365 | This contest, sponsored by professional services firm Deloitte, will find the most accurate system to predict chess outcomes, and FIDE will also bring a top finisher to Athens to present their system
366 |
367 | ======
368 |
369 | ### [RTA Freeway Travel Time Prediction](https://www.kaggle.com/c/RTA)
370 |
371 | Tue 23 Nov 2010 – Sun 13 Feb 2011
372 |
373 | This competition requires participants to predict travel time on Sydney's M4 freeway from past travel time observations.
374 |
375 | ======
376 |
377 | ### [Tourism Forecasting Part Two](https://www.kaggle.com/c/tourism2)
378 |
379 | Mon 20 Sep 2010 – Sun 21 Nov 2010
380 |
381 | Part two requires competitors to predict 793 tourism-related time series. The winner of this competition will be invited to contribute a discussion paper to the International Journal of Forecasting.
382 |
383 | ======
384 |
385 | ### [Chess ratings - Elo versus the Rest of the World](https://www.kaggle.com/c/chess)
386 |
387 | Tue 3 Aug 2010 – Wed 17 Nov 2010
388 |
389 | This competition aims to discover whether other approaches can predict the outcome of chess games more accurately than the workhorse Elo rating system.
390 |
391 | ======
392 |
393 | ### [Tourism Forecasting Part One](https://www.kaggle.com/c/tourism1)
394 |
395 | Mon 9 Aug 2010 – Sun 19 Sep 2010
396 |
397 | Part one requires competitors to predict 518 tourism-related time series. The winner of this competition will be invited to contribute a discussion paper to the International Journal of Forecasting.
398 |
399 | ======
400 |
401 | ### [World Cup 2010 - Take on the Quants](https://www.kaggle.com/c/worldcup2010)
402 |
403 | Thu 3 Jun 2010 – Fri 11 Jun 2010
404 |
405 | Quants at Goldman Sachs and JP Morgan have modeled the likely outcomes of the 2010 World Cup. Can you do better?
406 |
407 | ======
408 |
409 | ### [World Cup 2010 - Confidence Challenge](https://www.kaggle.com/c/worldcupconf)
410 |
411 | Thu 3 Jun 2010 – Fri 11 Jun 2010
412 |
413 | The Confidence Challenge requires competitors to assign a level of confidence to their World Cup predictions.
414 |
415 | ======
416 |
417 |
418 |
419 |
420 |
421 |
--------------------------------------------------------------------------------