# Machine Learning Datasets
This repo makes it easier for machine learning practitioners to find great sources of datasets. The goal is to list all sites that share datasets.

Feel free to send a pull request if you have a dataset you'd like to add, or simply let me know by opening an issue.

## Companies
* [Amazon Public Data Sets](http://aws.amazon.com/datasets/)
* [Windows Azure Marketplace](https://datamarket.azure.com/browse/data?price=any)
* [Yahoo Datasets](http://webscope.sandbox.yahoo.com/)
* [Yelp Academic Datasets](https://www.yelp.com/academic_dataset)
* [NYT Linked Open Data](http://data.nytimes.com/)
* [Google Public Data](https://www.google.com/publicdata/directory)

## Educational institutions
* [Deeplearning Datasets](http://deeplearning.net/datasets/)
* [Stanford Large Network Dataset Collection](http://snap.stanford.edu/data/)
* [UCI Machine Learning repository](https://archive.ics.uci.edu/ml/datasets.html)
* [ImageNet](http://www.image-net.org/)
* [Million Song Dataset](http://labrosa.ee.columbia.edu/millionsong/)

## Governments
* [US City Open Data Census](http://us-city.census.okfn.org/)
* [US Dataset](http://www.data.gov/)
* [Open Government Data Platform India](https://data.gov.in/)
* [World Bank Data](http://databank.worldbank.org/data/home.aspx)
* [Humanitarian Data Exchange](https://data.hdx.rwlabs.org/dataset)
* [The Text REtrieval Conference Datasets](http://trec.nist.gov/data.html)
* [Open Data Institute](https://certificates.theodi.org/en/)
* [UK Datasets](https://data.gov.uk/)
* [Los Angeles Open Data](https://data.lacity.org/)
* [Chicago Data Portal](https://data.cityofchicago.org/)
* [Seattle Data Portal](https://data.seattle.gov/browse)

## Other
* [Freebase](http://www.freebase.com/)
* [Reddit datasets subreddit](https://www.reddit.com/r/datasets/)
* [Reddit Top Posts](https://github.com/umbrae/reddit-top-2.5-million)
* [DBpedia](http://wiki.dbpedia.org/)
* [Awesome Public Datasets](https://github.com/caesar0301/awesome-public-datasets)