├── ARTICLES.md ├── BLOGS.md ├── BOOKS.md ├── CHALLENGES.md ├── COURSES.md ├── DATASETS.md ├── LICENSE ├── PODCASTS.md ├── PYTHON.md ├── README.md ├── Repositories.md └── YOUTUBE.md /ARTICLES.md: -------------------------------------------------------------------------------- 1 | # Some Articles of subjects related to Data Science 2 | 3 | ## David Venturi Guide 4 | * [I Dropped Out of School to Create My Own Data Science Master’s — Here’s My Curriculum](https://medium.com/@davidventuri/i-dropped-out-of-school-to-create-my-own-data-science-master-s-here-s-my-curriculum-1b400dcee412) 5 | * [If you want to learn Data Science, take a few of these statistics classes](https://medium.freecodecamp.org/if-you-want-to-learn-data-science-take-a-few-of-these-statistics-classes-9bbabab098b9) 6 | * [If you want to learn Data Science, start with one of these programming classes](https://medium.freecodecamp.org/if-you-want-to-learn-data-science-start-with-one-of-these-programming-classes-fb694ffe780c) 7 | * [Every single Machine Learning course on the internet, ranked by your reviews](https://medium.freecodecamp.org/every-single-machine-learning-course-on-the-internet-ranked-by-your-reviews-3c4a7b8026c0) 8 | 9 | ## Highlight 10 | * [Matheus Facure tutoriais](https://matheusfacure.github.io/tutoriais/) 11 | * [Cientista de Dados - Por onde começar em 8 passos](http://datascienceacademy.com.br/blog/cientista-de-dados-por-onde-comecar-em-8-passos/) - Language: Portuguese 12 | * [Construindo seu portfólio em Data Science](https://medium.com/databootcamp/construindo-seu-portf%C3%B3lio-em-data-science-f208b8edc53b) - Language: Portuguese 13 | * [Como Aprender Data Science de Graça nas Melhores Universidades do Mundo](https://medium.com/data-science-brigade/como-aprender-data-science-de-gra%C3%A7a-nas-melhores-universidades-do-mundo-60a76a3af887) - Language: Portuguese 14 | * [A Tour of Machine Learning 
Algorithms](https://machinelearningmastery.com/a-tour-of-machine-learning-algorithms/) 15 | * [Which are the best known machine learning algorithms? Infographic](http://thinkbigdata.in/best-known-machine-learning-algorithms-infographic/) 16 | * [Top 10 TED Talks for the Data Scientists](https://www.kdnuggets.com/2016/02/top-10-tedtalks-data-scientists.html) 17 | * [10 bibliotecas de Data Science para Python que ninguém te conta](https://paulovasconcellos.com.br/10-bibliotecas-de-data-science-para-python-que-ningu%C3%A9m-te-conta-706ec3c4fcef) - Language: Portuguese 18 | * [Supercharge your Python Plots with Zero Extra Code](https://blog.datasciencedojo.com/python-plots-data-visualization/) 19 | 20 | ## Others 21 | * [Cheat Sheets for AI, Neural Networks, Machine Learning, Deep Learning & Big Data](https://becominghuman.ai/cheat-sheets-for-ai-neural-networks-machine-learning-deep-learning-big-data-678c51b4b463) 22 | * [Lambda e Pandas: Uma história de Amor](http://minerandodados.com.br/index.php/2018/09/04/lambda-e-pandas-uma-historia-de-amor/) 23 | * [How to Data Science without a Degree](https://towardsdatascience.com/how-to-data-science-without-a-degree-79d8388a49ba) 24 | * [Introdução ao aprendizado de máquinas com Python](https://www.infoq.com/br/articles/ml-intro-python) - Language: Portuguese 25 | * [29 Certificações em Big Data e Data Science/](http://datascienceacademy.com.br/blog/29-certificacoes-em-big-data-e-data-science/) - Language: Portuguese 26 | * [A history of Machine Learning](https://cloud.withgoogle.com/build/data-analytics/explore-history-machine-learning/) 27 | * [Essential Math for Data Science — ‘Why’ and ‘How’](https://towardsdatascience.com/essential-math-for-data-science-why-and-how-e88271367fbd) 28 | * [Demystifying Convolutional Neural Networks](https://medium.com/@eternalzer0dayx/demystifying-convolutional-neural-networks-ca17bdc75559) 29 | * [Top 5 Machine Learning Courses for 
2019](https://www.learndatasci.com/best-machine-learning-courses/) 30 | -------------------------------------------------------------------------------- /BLOGS.md: -------------------------------------------------------------------------------- 1 | # Blogs about Python and Data Science 2 | ## Highlight 3 | * [CalmCode](https://calmcode.io/) - Made by people who want to remedy skill anxiety. Short and simple video lessons that start from scratch. Tools and thoughts that might make your professional life more enjoyable. 4 | * [Data Flair](https://data-flair.training/blogs/) 5 | * [ClaoudML](http://www.claoudml.co/) - The best curated list of Data Science materials on the web! 6 | * [Storytelling with Data](http://www.storytellingwithdata.com/) 7 | * [Advice to aspiring data scientists: start a blog](http://varianceexplained.org/r/start-blog/) 8 | * [Learning Math for Machine Learning](https://blog.ycombinator.com/learning-math-for-machine-learning/) 9 | * [Ciência e Dados](http://www.cienciaedados.com) - Language: Portuguese 10 | * [Matheus Facure](https://matheusfacure.github.io/tutoriais/) - Language: Portuguese 11 | * [Chris Albon](https://chrisalbon.com/) 12 | * [OpenAI](https://openai.com/) 13 | * [Fast.ai](https://www.fast.ai/) 14 | ## Other 15 | * [Paulo Vasconcellos](https://paulovasconcellos.com.br/) 16 | * [DataHackers](https://medium.com/data-hackers) 17 | * [leportella](https://leportella.com/) 18 | * [dataquest.io](https://www.dataquest.io) 19 | * [Data Science Central](https://www.datasciencecentral.com/) 20 | * [Stanford Lecture Notes](https://stanford.edu/~shervine/teaching/) 21 | -------------------------------------------------------------------------------- /BOOKS.md: -------------------------------------------------------------------------------- 1 | # Books 2 | 3 | ## Highlights 4 | * [Dive into Deep Learning](https://d2l.ai/) 5 | * [The Best Python Books](https://realpython.com/best-python-books/) 6 | * [Free Programming 
Books](https://github.com/EbookFoundation/free-programming-books) - A lot of books in various languages 7 | * [Free Programming Books - Portuguese](https://github.com/EbookFoundation/free-programming-books/blob/master/free-programming-books-pt_BR.md) - Books on many different topics 8 | * [Free Programming Books - English](https://github.com/EbookFoundation/free-programming-books/blob/master/free-programming-books.md) - Books on many different topics 9 | 10 | ## Other 11 | * [Deep Learning Book MIT](https://www.deeplearningbook.org/) 12 | * [Deep Learning Book](https://deeplearningbook.com.br/capitulos/page/2/): FREE (Portuguese) 13 | * [Problem Solving with Algorithms and Data Structures using Python](http://interactivepython.org/courselib/static/pythonds/index.html): FREE 14 | * [O Guia do Mochileiro para Python!](https://python-guide-pt-br.readthedocs.io/pt_BR/latest/) 15 | * [The Hitchhiker’s Guide to Python!](https://docs.python-guide.org/) 16 | * [Python Data Science Handbook](https://github.com/jakevdp/PythonDataScienceHandbook): FREE 17 | * [Pense em Python Segunda Edição](https://github.com/PenseAllen/PensePython2e): FREE (Portuguese) 18 | * [Test-Driven Development with Python](http://www.obeythetestinggoat.com/pages/book.html#toc): FREE 19 | * [Neural Networks and Deep Learning](http://neuralnetworksanddeeplearning.com/index.html): FREE 20 | * [Dive Into Python 3](http://www.diveintopython3.net/): FREE 21 | * [Data Science para Negócios](https://www.amazon.com.br/gp/product/8576089726?ref=em_1p_1_ti&ref_=pe_1822510_362394480): R$ 87,90 (Portuguese) 22 | * [Data Science do Zero](https://www.amazon.com.br/gp/product/857608998X?ref=em_1p_2_ti&ref_=pe_1822510_362394480): R$ 76,90 (Portuguese) 23 | -------------------------------------------------------------------------------- /CHALLENGES.md: -------------------------------------------------------------------------------- 1 | # Websites with challenges, exercises and projects 2 | 3 | ## 
Competitions or Projects for Organizations: 4 | * [Cognitivo.Ai](https://www.cognitivo.ai/experts/new-expert/) 5 | * [Data Kind](https://www.datakind.org/do-good-with-data) 6 | * [Kaggle](https://www.kaggle.com/competitions) 7 | * [Driven Data](https://www.drivendata.org/competitions/) 8 | * [Physionet](https://physionet.org/challenge/) 9 | * [Crowd Analytix](https://www.crowdanalytix.com/community) 10 | * [Coda Lab](https://competitions.codalab.org/) 11 | * [Data Science Challenge](https://www.datasciencechallenge.org/) 12 | * [KDD](https://www.kdd.org/kdd-cup) 13 | 14 | ## Training your skills 15 | * [70+ Machine Learning Datasets & Project Ideas – Work on real-time Data Science projects](https://data-flair.training/blogs/machine-learning-datasets) 16 | * [Neps Academy](https://neps.academy/login?next=%2F) - Language: Portuguese 17 | * [Code Nation](https://www.codenation.com.br/) - Language: Portuguese 18 | * [Hacker Rank](https://www.hackerrank.com/) 19 | * [Code Signal](https://codesignal.com/developers/) 20 | * [Code Wars](https://www.codewars.com/) 21 | * [Coder Byte](https://www.coderbyte.com/) 22 | -------------------------------------------------------------------------------- /COURSES.md: -------------------------------------------------------------------------------- 1 | # Coursera 2 | ## Applied Data Science with Python Specialization: 3 | 1. [Introduction to Data Science in Python](https://www.coursera.org/learn/python-data-analysis) 4 | 2. [Applied Plotting, Charting & Data Representation in Python](https://www.coursera.org/learn/python-plotting) 5 | 3. [Applied Machine Learning in Python](https://www.coursera.org/learn/python-machine-learning) 6 | 4. [Applied Text Mining in Python](https://www.coursera.org/learn/python-text-mining) 7 | 5. [Applied Social Network Analysis in Python](https://www.coursera.org/learn/python-social-network-analysis) 8 | 9 | ## IBM Data Science Professional Certificate 10 | 1. 
[What is Data Science](https://www.coursera.org/learn/what-is-datascience) 11 | 2. [Open Source tools for Data Science](https://www.coursera.org/learn/open-source-tools-for-data-science) 12 | 3. [Data Science Methodology](https://www.coursera.org/learn/data-science-methodology) 13 | 4. [Python for Data Science](https://www.coursera.org/learn/python-for-applied-data-science) 14 | 5. [Databases and SQL for Data Science](https://www.coursera.org/learn/sql-data-science) 15 | 6. [Data Visualization with Python](https://www.coursera.org/learn/python-for-data-visualization) 16 | 7. [Data Analysis with Python](https://www.coursera.org/learn/data-analysis-with-python) 17 | 8. [Machine Learning with Python](https://www.coursera.org/learn/machine-learning-with-python) 18 | 9. [Applied Data Science Capstone](https://www.coursera.org/learn/applied-data-science-capstone) 19 | 20 | ## Advanced Data Science with IBM 21 | 1. [Fundamentals of Scalable Data Science](https://www.coursera.org/learn/ds) 22 | 2. [Advanced Machine Learning and Signal Processing](https://www.coursera.org/learn/advanced-machine-learning-signal-processing) 23 | 3. [Applied AI with DeepLearning](https://www.coursera.org/learn/ai) 24 | 4. [Advanced Data Science Capstone](https://www.coursera.org/learn/advanced-data-science-capstone) 25 | 26 | ## Genomic Data Science 27 | 1. [Introduction to Genomic Technologies](https://www.coursera.org/learn/introduction-genomics) 28 | 2. [Genomic Data Science with Galaxy](https://www.coursera.org/learn/galaxy-project) 29 | 3. [Python for Genomic Data Science](https://www.coursera.org/learn/python-genomics) 30 | 4. [DNA Sequencing](https://www.coursera.org/learn/dna-sequencing) 31 | 5. [Command Line Tools for Genomic Data Science](https://www.coursera.org/learn/genomic-tools) 32 | 6. [Bioconductor for Genomic Data Science](https://www.coursera.org/learn/bioconductor) 33 | 7. [Statistics for Genomic Data Science](https://www.coursera.org/learn/statistical-genomics) 34 | 8. 
[Genomic Data Science Capstone](https://www.coursera.org/learn/genomic-data-science-project) 35 | 36 | ## Executive Data Science 37 | 1. [Data Science Course](https://www.coursera.org/learn/data-science-course) 38 | 2. [Build Data Science Team](https://www.coursera.org/learn/build-data-science-team) 39 | 3. [Managing Data Analysis](https://www.coursera.org/learn/managing-data-analysis) 40 | 4. [Real Life Data Science](https://www.coursera.org/learn/real-life-data-science) 41 | 5. [Executive Data Science Capstone](https://www.coursera.org/learn/executive-data-science-capstone) 42 | 43 | ## Deep Learning Specialization 44 | 1. [Neural Networks and Deep Learning](https://pt.coursera.org/learn/neural-networks-deep-learning) 45 | 2. [Improving Deep Neural Networks: Hyperparameter tuning, Regularization and Optimization](https://pt.coursera.org/learn/deep-neural-network) 46 | 3. [Structuring Machine Learning Projects](https://pt.coursera.org/learn/machine-learning-projects) 47 | 4. [Convolutional Neural Networks](https://pt.coursera.org/learn/convolutional-neural-networks) 48 | 5. [Sequence Models](https://pt.coursera.org/learn/nlp-sequence-models) 49 | 50 | ## Advanced Machine Learning Specialization 51 | 1. [Introduction to Deep Learning](https://www.coursera.org/learn/intro-to-deep-learning?specialization=aml) 52 | 2. [How to Win a Data Science Competition: Learn from Top Kagglers](https://www.coursera.org/learn/competitive-data-science?specialization=aml) 53 | 3. [Bayesian Methods for Machine Learning](https://www.coursera.org/learn/bayesian-methods-in-machine-learning?specialization=aml) 54 | 4. [Practical Reinforcement Learning](https://www.coursera.org/learn/practical-rl?specialization=aml) 55 | 5. [Deep Learning in Computer Vision](https://www.coursera.org/learn/deep-learning-in-computer-vision) 56 | 6. [Natural Language Processing](https://www.coursera.org/learn/language-processing) 57 | 7. 
[Addressing Large Hadron Collider Challenges by Machine Learning](https://www.coursera.org/learn/hadron-collider-machine-learning) 58 | 59 | ## Statistics with Python 60 | 1. [ 61 | Understanding and Visualizing Data with Python](https://pt.coursera.org/learn/understanding-visualization-data) 62 | 2. [ 63 | Inferential Statistical Analysis with Python](https://pt.coursera.org/learn/inferential-statistical-analysis-python) 64 | 3. [ 65 | Fitting Statistical Models to Data with Python 66 | ](https://pt.coursera.org/learn/fitting-statistical-models-data-python) 67 | 68 | ***Note: You can enroll in each course for free by clicking "participate as listener".*** 69 | 70 | # Cognitiveclass.ai 71 | ## [Data Science Foundations](https://cognitiveclass.ai/learn/data-science/) 72 | 1. [Introduction to Data Science](https://cognitiveclass.ai/courses/data-science-101/) 73 | 2. [Data Science Tools](https://cognitiveclass.ai/courses/data-science-hands-open-source-tools-2/) 74 | 3. [Data Science Methodology](https://cognitiveclass.ai/courses/data-science-methodology-2/) 75 | ## [Applied Data Science with Python](https://cognitiveclass.ai/learn/data-science-with-python/) 76 | 1. [Python for Data Science](https://cognitiveclass.ai/courses/python-for-data-science/) 77 | 2. [Data Analysis with Python](https://cognitiveclass.ai/courses/data-analysis-python/) 78 | 3. [Data Visualization with Python](https://cognitiveclass.ai/courses/data-visualization-with-python/) 79 | ## [Deep Learning](https://cognitiveclass.ai/learn/deep-learning/) 80 | 1. [Deep Learning Fundamentals](https://cognitiveclass.ai/courses/introduction-deep-learning/) 81 | 2. [Deep Learning with TensorFlow](https://cognitiveclass.ai/courses/deep-learning-tensorflow/) 82 | 3. 
[Accelerating Deep Learning with GPU](https://cognitiveclass.ai/courses/accelerating-deep-learning-gpu/) 83 | 84 | # edX 85 | ## [Data Science](https://www.edx.org/micromasters/data-science?utm_source=sailthru&utm_medium=email&utm_campaign=programs_bundle_campaign_sept2018&utm_term=Computer%20Science%20and%20Data%20Science%20Interest) 86 | 1. [Python for Data Science](https://www.edx.org/course/python-for-data-science) 87 | 2. [Probability and Statistics in Data Science using Python](https://www.edx.org/course/probability-and-statistics-in-data-science-using-python) 88 | 3. [Machine Learning Fundamentals](https://www.edx.org/course/machine-learning-fundamentals) 89 | 4. [Big Data Analytics Using Spark](https://www.edx.org/course/big-data-analytics-using-spark) 90 | 91 | ## [MicroMasters Program in Artificial Intelligence](https://www.edx.org/micromasters/columbiax-artificial-intelligence) 92 | 1. [Artificial Intelligence](https://www.edx.org/course/artificial-intelligence-ai) 93 | 2. [Machine Learning](https://www.edx.org/course/machine-learning) 94 | 3. [Robotics]() 95 | 4. [Animation and CGI Motion](https://www.edx.org/course/animation-cgi-motion-1) 96 | 97 | ## [Algorithms and Data Structures](https://www.edx.org/micromasters/ucsandiegox-algorithms-and-data-structures?utm_source=sailthru&utm_medium=email&utm_campaign=programs_bundle_campaign_sept2018&utm_term=Computer%20Science%20and%20Data%20Science%20Interest) 98 | 1. [Algorithmic Design and Techniques](https://www.edx.org/course/algorithmic-design-techniques-uc-san-diegox-algs200x) 99 | 2. [Data Structures Fundamentals](https://www.edx.org/course/data-structures-fundamentals-uc-san-diegox-algs201x) 100 | 3. [Graph Algorithms](https://www.edx.org/course/graph-algorithms-uc-san-diegox-algs202x) 101 | 4. [NP-Complete Problems](https://www.edx.org/course/np-complete-problems-uc-san-diegox-algs203x) 102 | 5. 
[String Processing and Pattern Matching Algorithms](https://www.edx.org/course/string-processing-pattern-matching-uc-san-diegox-algs204x) 103 | 6. [Dynamic Programming: Applications In Machine Learning and Genomics](https://www.edx.org/course/dynamic-programming-applications-machine-uc-san-diegox-algs205x) 104 | 7. [Graph Algorithms in Genome Sequencing](https://www.edx.org/course/graph-algorithms-genome-sequencing-uc-san-diegox-algs206x) 105 | 8. [Algorithms and Data Structures Capstone](https://www.edx.org/course/algorithms-data-structures-capstone-uc-san-diegox-algs207x) 106 | 107 | ## [Data Science for Executives](https://www.edx.org/professional-certificate/data-science-executives?utm_source=sailthru&utm_medium=email&utm_campaign=programs_bundle_campaign_sept2018&utm_term=Computer%20Science%20and%20Data%20Science%20Interest) 108 | 1. [Statistical Thinking for Data Science and Analytics](https://www.edx.org/course/statistical-thinking-for-data-science-and-analytics) 109 | 2. [Machine Learning for Data Science and Analytics](https://www.edx.org/course/machine-learning-for-data-science-and-analytics) 110 | 3. [Enabling Technologies for Data Science and Analytics: The Internet of Things](https://www.edx.org/course/enabling-technologies-for-data-science-and-analytics-the-internet-of-things) 111 | 112 | ## [Statistics and Data Science](https://www.edx.org/micromasters/mitx-statistics-and-data-science#courses) 113 | 1. [Probability - The Science of Uncertainty and Data](https://www.edx.org/course/probability-the-science-of-uncertainty-and-data) 114 | 2. [Data Analysis in Social Science—Assessing Your Knowledge](https://www.edx.org/course/data-analysis-in-social-science-assessing-your-knowledge) 115 | 3. [Fundamentals of Statistics](https://www.edx.org/course/fundamentals-of-statistics) 116 | 4. 
[Machine Learning with Python: from Linear Models to Deep Learning](https://www.edx.org/course/machine-learning-with-python-from-linear-models-to-deep-learning) 117 | 118 | ***Note: You can audit the courses for free if you don't want a certificate.*** 119 | 120 | ## Others 121 | * [Awesome CS Courses](https://github.com/prakhar1989/awesome-courses) - List of awesome university courses for learning Computer Science! 122 | * [CS109 Data Science - Harvard](http://cs109.github.io/2015/index.html) 123 | * [The School AI](https://www.theschool.ai/courses/) 124 | * [AcademIA](https://www.microsoft.com/pt-br/academia) 125 | -------------------------------------------------------------------------------- /DATASETS.md: -------------------------------------------------------------------------------- 1 | # Websites with a huge amount of data to use in your projects 2 | ## Highlight 3 | * [Information is Beautiful](https://informationisbeautiful.net/data/) - A really good website with beautiful graphs and data. 4 | * [Colaboradados](https://colaboradados.github.io/) - Repository with a lot of good datasets from Brazil in Portuguese. 5 | * [Awesome Public Datasets](https://github.com/awesomedata/awesome-public-datasets) - This repository contains a lot of datasets and is well worth a look. 6 | * [Google Data Search](https://toolbox.google.com/datasetsearch) - Google's tool for searching for datasets. 7 | * [Chatito](https://rodrigopivi.github.io/Chatito/) - Helps you generate datasets for natural language understanding models using a simple DSL. 8 | * [Datahub](https://datahub.io/collections) - Collections - high quality data and datasets organized by topic. 9 | * [Common Voice](https://commonvoice.mozilla.org/pt/datasets) - An open source, multi-language dataset of voices that anyone can use to train speech-enabled applications. Each entry in the dataset consists of a unique MP3 and corresponding text file. 
10 | 11 | ## Data from Brazil in Portuguese 12 | * [TCE-MG - Dados de Municípios](https://dadosabertos.tce.mg.gov.br/index.xhtml) 13 | * [Catálogos Dados Brasil](https://github.com/dadosgovbr/catalogos-dados-brasil/blob/master/dados/catalogos.csv) 14 | * [Portal Brasileiro de Dados Abertos](http://dados.gov.br/dataset) 15 | * [Ciência de dados aplicada à Saúde](https://bigdata.icict.fiocruz.br/) 16 | * [Brasil.io](https://brasil.io/datasets) 17 | * [Portal da Transparência](http://www.portaldatransparencia.gov.br/) 18 | * [Awesome Brazil Data (GitHub Repository)](https://github.com/juliohm/awesome-brazil-data) 19 | * [Open Data (GitHub Repository)](https://github.com/datasets-br) 20 | * [IBGE](https://downloads.ibge.gov.br/) 21 | * [Receita Federal](http://idg.receita.fazenda.gov.br/dados) 22 | * [FipeZap](http://fipezap.zapimoveis.com.br/) 23 | * [Banco Central do Brasil](https://www.bcb.gov.br/?serietemp) 24 | * [Dataset com info do Brasileirão entre 2000 e 2017](https://github.com/adaoduque/Brasileirao_Dataset) 25 | 26 | ## Top resources 27 | * [Open Neuro](https://openneuro.org/public/datasets) 28 | * [Reddit](https://www.reddit.com/r/datasets) 29 | * [Kaggle](https://www.kaggle.com/datasets) 30 | * [FiveThirtyEight](https://data.fivethirtyeight.com/) 31 | * [UNICEF](https://data.unicef.org/resources/resource-type/datasets/) 32 | * [UCI Machine Learning Repository](http://mlr.cs.umass.edu/ml/datasets.html) 33 | * [World Bank Open Data](https://data.worldbank.org/) 34 | * [Open Data for All New Yorkers](https://opendata.cityofnewyork.us/) 35 | * [Open Industrial Datasets](https://github.com/AndreaPi/Open-industrial-datasets) 36 | ## Other resources 37 | * [Spotify Datasets](https://research.atspotify.com/datasets/) - Dive into datasets for everything from podcasts to music recommendation 38 | * [Coronavirus](https://docs.google.com/spreadsheets/d/1JALlvOAolTQXad38ffSVHe0-TfjgBYDSpbQqQBIjRyE/edit#gid=0) 39 | * [70+ Machine Learning Datasets & Project Ideas – Work 
on real-time Data Science projects](https://data-flair.training/blogs/machine-learning-datasets) 40 | * [The Big Bad NLP Database](https://quantumstat.com/dataset/dataset.html) - Datasets for Natural Language Processing 41 | * [Football Data](https://datahub.io/collections/football) - A collection of awesome football datasets including national teams, clubs, match schedules, players, stadiums, etc. 42 | * [Disney Dataset](https://www.disneyresearch.com/datasets/) 43 | * [NASA's Open Data Portal](https://data.nasa.gov/) 44 | * [UK Data](https://data.gov.uk/) 45 | * [EU Open Data](http://data.europa.eu/euodp/en/data/?utm_source=datafloq&utm_medium=ref&utm_campaign=datafloq) 46 | * [Amazon Open Data](https://registry.opendata.aws/) 47 | * [Bureau of Labor Statistics](https://www.bls.gov/data/) 48 | * [Bureau of Economic Analysis](http://www.bea.gov/data/gdp) 49 | * [Quandl](https://www.quandl.com/search) 50 | * [Socrata](https://opendata.socrata.com/) 51 | * [Data.World](https://data.world/) 52 | * [Datasets for Machine Learning](https://www.datasetlist.com/) 53 | * [Guide to Football and Soccer Data and Api's](https://www.jokecamp.com/blog/guide-to-football-and-soccer-data-and-apis/) 54 | -------------------------------------------------------------------------------- /LICENSE: -------------------------------------------------------------------------------- 1 | MIT License 2 | 3 | Copyright (c) 2018 Gabriel Aparecido Fonseca 4 | 5 | Permission is hereby granted, free of charge, to any person obtaining a copy 6 | of this software and associated documentation files (the "Software"), to deal 7 | in the Software without restriction, including without limitation the rights 8 | to use, copy, modify, merge, publish, distribute, sublicense, and/or sell 9 | copies of the Software, and to permit persons to whom the Software is 10 | furnished to do so, subject to the following conditions: 11 | 12 | The above copyright notice and this permission notice shall be included in all 
13 | copies or substantial portions of the Software. 14 | 15 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR 16 | IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, 17 | FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE 18 | AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER 19 | LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, 20 | OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE 21 | SOFTWARE. 22 | -------------------------------------------------------------------------------- /PODCASTS.md: -------------------------------------------------------------------------------- 1 | # Podcasts 2 | * [Pizza de Dados](https://www.youtube.com/channel/UCqOX4hl_9DJ5Zmzh8MpgL9A/videos) - Language: Portuguese 3 | * [Podcast Data Science Academy](http://datascienceacademy.com.br/blog/podcast-dsa/) - Language: Portuguese 4 | * [Data Hackers](https://radiopublic.com/data-hackers-GE2pa4) ([listen in Spotify](https://open.spotify.com/show/1oMIHOXsrLFENAeM743g93?si=VbislDR-Tx6Fyw2AKlfDyw)) - Language: Portuguese 5 | * [DataFramed](https://www.datacamp.com/community/podcast) 6 | -------------------------------------------------------------------------------- /PYTHON.md: -------------------------------------------------------------------------------- 1 | # Knowing more about Python 2 | 3 | ## Data Science with Python 4 | * [Python Data Science Tutorials](https://realpython.com/tutorials/data-science/) 5 | * [101 NumPy Exercises for Data Analysis (Python)](https://www.machinelearningplus.com/python/101-numpy-exercises-python/) 6 | * [101 Pandas Exercises for Data Analysis](https://www.machinelearningplus.com/python/101-pandas-exercises-python/) 7 | * [Python for DS 101](https://notebooks.azure.com/gabriel19913/libraries/PythonDS101) 8 | 9 | ## Pandas 10 | * [A Guide to Pandas and Matplotlib for Data 
Exploration](https://towardsdatascience.com/a-guide-to-pandas-and-matplotlib-for-data-exploration-56fad95f951c) 11 | * [10 Minutes to pandas](https://pandas.pydata.org/pandas-docs/stable/10min.html) - Language: Portuguese 12 | * [Dominando o Pandas: A Biblioteca para Análise de Dados preferida entre os Cientistas de Dados (Parte 1)](http://minerandodados.com.br/index.php/2017/09/26/python-para-analise-de-dados/) - Language: Portuguese 13 | * [Dominando o Pandas: A Biblioteca para Análise de Dados preferida entre os Cientistas de Dados (Parte 2)](http://minerandodados.com.br/index.php/2017/11/10/dominando-o-pandas-datascience-dozero/) - Language: Portuguese 14 | * [Intro to pandas data structures](http://www.gregreda.com/2013/10/26/intro-to-pandas-data-structures/) 15 | * [Working with DataFrames](http://www.gregreda.com/2013/10/26/working-with-pandas-dataframes/) 16 | * [Using pandas on the MovieLens dataset](http://www.gregreda.com/2013/10/26/using-pandas-on-the-movielens-dataset/) 17 | 18 | ## Other 19 | * [Improve Your Python: Python Classes and Object Oriented Programming](https://jeffknupp.com/blog/2014/06/18/improve-your-python-python-classes-and-object-oriented-programming/) 20 | * [Python Plotting With Matplotlib (Guide)](https://realpython.com/python-matplotlib-guide/) 21 | * [Top 10 Must Watch Pycon Talks](https://realpython.com/must-watch-pycon-talks/) 22 | * [Python Web Scraping Tutorials](https://realpython.com/tutorials/web-scraping/) 23 | * [Python 3's f-Strings: An Improved String Formatting Syntax (Guide)](https://realpython.com/python-f-strings/) 24 | * [Python Code Quality: Tools & Best Practices](https://realpython.com/python-code-quality/) 25 | * [Python's range() Function (Guide)](https://realpython.com/python-range/) 26 | * [13 Project Ideas for Intermediate Python Developers](https://realpython.com/intermediate-python-project-ideas/) 27 | 28 | -------------------------------------------------------------------------------- /README.md: 
-------------------------------------------------------------------------------- 1 | # DataScienceWithPython 2 | A repository to store everything for learning DataScience using Python 3 | 4 | The links provided are for study purposes only. 5 | **Work in Progress** 6 | *** 7 | A good place to start in data science is [The Open Source Data Science Masters repository](https://github.com/datasciencemasters/go). 8 | *** 9 | ## Table of Contents 10 | * [Articles](ARTICLES.md) 11 | * [Blogs](BLOGS.md) 12 | * [Books](BOOKS.md) 13 | * [Courses](COURSES.md) 14 | * [Datasets websites](DATASETS.md) 15 | * [Podcasts](PODCASTS.md) 16 | * [Python](PYTHON.md) 17 | * [Repositories](Repositories.md) 18 | * [Websites to do exercises](CHALLENGES.md) 19 | * [Youtube channels and playlists](YOUTUBE.md) 20 | -------------------------------------------------------------------------------- /Repositories.md: -------------------------------------------------------------------------------- 1 | # GitHub Repositories 2 | 3 | ## Some ML libraries 4 | * [aiomultiprocess](https://github.com/omnilib/aiomultiprocess) - aiomultiprocess presents a simple interface, while running a full AsyncIO event loop on each child process, enabling levels of concurrency never before seen in a Python application. 5 | * [Amundsen](https://github.com/lyft/amundsen) - Amundsen is a metadata driven application for improving the productivity of data analysts, data scientists and engineers when interacting with data https://lyft.github.io/amundsen/ 6 | * [atheris](https://github.com/google/atheris) - Atheris is a coverage-guided Python fuzzing engine. It supports fuzzing of Python code, but also native extensions written for CPython. Atheris is based off of libFuzzer. 7 | * [atoti](https://github.com/atoti/atoti) - atoti is a free Python BI analytics platform for Quants, Data Analysts, Data Scientists & Business Users to collaborate better, analyze faster and translate their data into business KPIs. 
8 | * [bamboolib](https://github.com/tkrabel/bamboolib) - A GUI for pandas DataFrames 9 | * [baselines](https://github.com/openai/baselines) - OpenAI Baselines is a set of high-quality implementations of reinforcement learning algorithms. 10 | * [BayesianOptimization](https://github.com/fmfn/BayesianOptimization) - Pure Python implementation of Bayesian global optimization with Gaussian processes. 11 | * [beakerx](https://github.com/twosigma/beakerx) - Beaker Extensions for Jupyter Notebook http://BeakerX.com 12 | * [BentoML](https://github.com/bentoml/BentoML) - BentoML is an open-source platform for high-performance ML model serving. 13 | * [CacheSQL](https://github.com/felipeam86/cachesql) - CacheSQL is a simple library for making SQL queries with cache functionality. This library mainly targets data scientists and data analysts who rely on SQLAlchemy to query data from SQL and on pandas to do the heavy lifting in Python. 14 | * [Causal ML](https://github.com/uber/causalml) - Causal ML is a Python package that provides a suite of uplift modeling and causal inference methods using machine learning algorithms based on recent research. It provides a standard interface that allows users to estimate the Conditional Average Treatment Effect (CATE) or Individual Treatment Effect (ITE) from experimental or observational data. 15 | * [causalnex](https://github.com/quantumblacklabs/causalnex) - A Python library that helps data scientists to infer causation rather than observing correlation http://causalnex.readthedocs.io/ 16 | * [Celluloid](https://github.com/jwkvam/celluloid) - This module makes it easy to adapt your existing visualization code to create an animation. 17 | * [Chefboost](https://github.com/serengil/chefboost) - Lightweight Decision Tree Framework supporting regular algorithms: ID3, C4.5, CART, CHAID and Regression Trees; some advanced techniques: Gradient Boosting (GBDT, GBRT, GBM), Random Forest and Adaboost w/categorical features support for Python. 
18 | * [Ciphey](https://github.com/Ciphey/Ciphey) - Ciphey is an automated decryption tool. Input encrypted text, get the decrypted text back. 19 | * [Click](https://github.com/pallets/click/) - Click is a Python package for creating beautiful command line interfaces in a composable way with as little code as necessary. It's the "Command Line Interface Creation Kit". It's highly configurable but comes with sensible defaults out of the box. 20 | * [CLIP](https://github.com/openai/CLIP) - CLIP (Contrastive Language-Image Pre-Training) is a neural network trained on a variety of (image, text) pairs. 21 | * [Code Video Generator](https://github.com/sleuth-io/code-video-generator) - Code Video Generator is a library that uses the Manim animation engine to automatically generate code walkthrough videos. 22 | * [creme](https://github.com/creme-ml/creme) - creme is a Python library for online machine learning. 23 | * [cuML](https://github.com/rapidsai/cuml) - cuML enables data scientists, researchers, and software engineers to run traditional tabular ML tasks on GPUs without going into the details of CUDA programming. In most cases, cuML's Python API matches the API from scikit-learn. 24 | * [cutecharts](https://github.com/cutecharts/cutecharts.py) - There is no doubt that JavaScript has more advantages in interaction as well as visual effect. Besides that, as we all know, Python is an expressive language and is loved by the data science community. Hence I want to combine the strengths of both technologies; as a result of this idea, cutecharts.py was born. 25 | * [D2Go](https://github.com/facebookresearch/d2go) - D2Go is a production ready software system from FacebookResearch, which supports end-to-end model training and deployment for mobile platforms. 26 | * [dataprep](https://github.com/sfu-db/dataprep) - Dataprep lets you prepare your data using a single library with a few lines of code. 
27 | * [datasette](https://github.com/simonw/datasette) - A tool for exploring and publishing data http://datasette.readthedocs.io/ 28 | * [deepchecks](https://github.com/deepchecks/deepchecks) - Test Suites for Validating ML Models & Data. Deepchecks is a Python package for comprehensively validating your machine learning models and data with minimal effort. 29 | * [DeText](https://github.com/linkedin/detext) - DeText is a Deep Text understanding framework for NLP related ranking, classification, and language generation tasks. 30 | * [DoWhy](https://github.com/microsoft/dowhy) - DoWhy is a Python library for causal inference that supports explicit modeling and testing of causal assumptions. 31 | * [D-Tale](https://github.com/man-group/dtale) - D-Tale is the combination of a Flask back-end and a React front-end to bring you an easy way to view & analyze Pandas data structures. 32 | * [EconML](https://github.com/microsoft/EconML) - EconML is a Python package for estimating heterogeneous treatment effects from observational data via machine learning. 33 | * [EfficientWord-Net](https://github.com/Ant-Brain/EfficientWord-Net) - OneShot Learning-based hotword detection. 34 | * [Elara DB](https://github.com/saurabh0719/elara) - Elara DB is an easy to use, lightweight NoSQL database written for Python that can also be used as a fast in-memory cache for JSON-serializable data. Includes various methods to manipulate data structures in-memory, secure database files and export data. 35 | * [Euporie](https://github.com/joouha/euporie) - Euporie is a text-based user interface for running and editing Jupyter notebooks. 36 | * [Evidently](https://github.com/evidentlyai/evidently) - Interactive reports to analyze machine learning models during validation or production monitoring. 
37 | * [Evol](https://github.com/godatadriven/evol) - A Python grammar for evolutionary algorithms and heuristics 38 | * [falcon](https://github.com/falconry/falcon) - The no-nonsense, minimalist web services and app backend framework for Python developers with a focus on reliability and performance at scale https://falcon.readthedocs.io/en/stable/ 39 | * [FastAPI](https://github.com/tiangolo/fastapi) - FastAPI is a modern, fast (high-performance), web framework for building APIs with Python 3.6+ based on standard Python type hints. 40 | * [FastAPI CRUD Router](https://github.com/awtkns/fastapi-crudrouter) - A dynamic FastAPI router that automatically creates CRUD routes for your models. 41 | * [fds](https://github.com/DAGsHub/fds) - Fast Data Science, AKA fds, is a CLI for Data Scientists to version control data and code at once, by conveniently wrapping git and dvc. 42 | * [FiftyOne](https://github.com/voxel51/fiftyone) - The open-source tool for building high-quality datasets and computer vision models. 43 | * [gazpacho](https://github.com/maxhumber/gazpacho) - gazpacho is a simple, fast, and modern web scraping library. 44 | * [ggenerator](https://github.com/Datenworks/ggenerator) - A simple command line tool for fake dataset generation given a specification defined as a JSON DSL https://pypi.org/project/ggenerator/ 45 | * [Google Research Football](https://github.com/google-research/football) - This repository contains an RL environment based on the open-source game Gameplay Football. 46 | * [gpt3-sandbox](https://github.com/shreyashankar/gpt3-sandbox) - The goal of this project is to enable users to create cool web demos using the newly released OpenAI GPT-3 API with just a few lines of Python. 47 | * [Great Expectations](https://github.com/great-expectations/great_expectations) - Great Expectations helps data teams eliminate pipeline debt, through data testing, documentation, and profiling. 
48 | * [guietta](https://github.com/alfiopuglisi/guietta) - A tool for making simple Python GUIs 49 | * [gym](https://github.com/openai/gym) - OpenAI Gym is a toolkit for developing and comparing reinforcement learning algorithms. 50 | * [Hermione](https://github.com/A3Data/hermione) - Hermione is an open source library that helps Data Scientists set up more organized code in a quicker and simpler way. 51 | * [Hoppscotch](https://github.com/hoppscotch/hoppscotch) - A free, fast and beautiful API request builder used by 120k+ developers. https://hoppscotch.io 52 | * [Hyperactive](https://github.com/SimonBlanke/Hyperactive) - A hyperparameter optimization and data collection toolbox for convenient and fast prototyping of machine-learning models. 53 | * [Hyperopt-sklearn](https://github.com/hyperopt/hyperopt-sklearn) - Hyperopt-sklearn is Hyperopt-based model selection among machine learning algorithms in scikit-learn https://hyperopt.github.io/hyperopt-sklearn/ 54 | * [igel](https://github.com/nidhaloff/igel) - A machine learning tool that lets you train, test and use models without writing code. 55 | * [image-to-latex](https://github.com/kingyiusuen/image-to-latex) - Convert images of LaTeX math equations into LaTeX code. 56 | * [jukebox](https://github.com/openai/jukebox/) - Code for "Jukebox: A Generative Model for Music" 57 | * [jupyter-book](https://github.com/executablebooks/jupyter-book) - Build interactive, publication-quality documents from Jupyter Notebooks http://jupyterbook.org 58 | * [kedro](https://github.com/quantumblacklabs/kedro) - Kedro is an open-source Python framework that applies software engineering best-practice to data and machine-learning pipelines. 59 | * [koalas](https://github.com/databricks/koalas) - The Koalas project makes data scientists more productive when interacting with big data, by implementing the pandas DataFrame API on top of Apache Spark. 
60 | * [lightly](https://github.com/lightly-ai/lightly) - Lightly is a computer vision framework for self-supervised learning. 61 | * [LineaPy](https://github.com/LineaLabs/lineapy) - LineaPy is a Python package for capturing, analyzing, and automating data science workflows. At a high level, LineaPy traces the sequence of code execution to form a comprehensive understanding of the code and its context. 62 | * [Lip2Wav](https://github.com/Rudrabha/Lip2Wav) - Generate high quality speech from only lip movements. 63 | * [locust](https://github.com/locustio/locust) - Scalable user load testing tool written in Python [http://locust.io](http://locust.io) 64 | * [lona](https://github.com/fscherf/lona) - Lona is a web application framework, designed to write responsive web apps in full Python. 65 | * [lux](https://github.com/lux-org/lux) - Lux is a Python library that makes data science easier by automating aspects of the data exploration process. Lux facilitates faster experimentation with data, even when the user does not have a clear idea of what they are looking for. 66 | * [manim](https://github.com/3b1b/manim) - Animation engine for explanatory math videos 67 | * [Mava](https://github.com/instadeepai/Mava) - Mava is a library for building multi-agent reinforcement learning (MARL) systems. Mava provides useful components, abstractions, utilities and tools for MARL and allows for simple scaling for multi-process system training and execution while providing a high level of flexibility and composability. 68 | * [mlxtend](https://github.com/rasbt/mlxtend) - A library of extension and helper modules for Python's data analysis and machine learning libraries http://rasbt.github.io/mlxtend/ 69 | * [NannyML](https://github.com/NannyML/nannyml) - NannyML is an open-source Python library that allows you to estimate post-deployment model performance (without access to targets), detect data drift, and intelligently link data drift alerts back to changes in model performance. 
70 | * [neupy](https://github.com/itdxer/neupy) - NeuPy is a Python library for prototyping and building neural networks. 71 | * [NeuralDB](https://github.com/facebookresearch/NeuralDB) - Database Reasoning Over Text project for ACL paper. 72 | * [NeuralProphet](https://github.com/ourownstory/neural_prophet) - A Neural Network based Time-Series model. 73 | * [Newspaper](https://github.com/codelucas/newspaper/) - Newspaper is an amazing Python library for extracting & curating articles. 74 | * [OpenChat](https://github.com/hyunwoongko/openchat) - Open-source chat framework for generative models. 75 | * [optuna](https://github.com/optuna/optuna) - Optuna is an automatic hyperparameter optimization software framework, particularly designed for machine learning. It features an imperative, define-by-run style user API. 76 | * [Opytimizer](https://github.com/gugarosa/opytimizer) - This package provides an easy-to-go implementation of meta-heuristic optimizations. 77 | * [orchest](https://github.com/orchest/orchest) - Orchest is a web based data science tool that works on top of your filesystem allowing you to use your editor of choice. With Orchest you get to focus on visually building and iterating on your pipeline ideas. 78 | * [pandasgui](https://github.com/adamerose/pandasgui) - A GUI for analyzing Pandas DataFrames. 79 | * [panel](https://github.com/holoviz/panel) - Panel provides tools for easily composing widgets, plots, tables, and other viewable objects and controls into custom analysis tools, apps, and dashboards. 80 | * [pingouin](https://github.com/raphaelvallat/pingouin) - Pingouin is an open-source statistical package written in Python 3 and based mostly on Pandas and NumPy. 81 | * [PlotNeuralNet](https://github.com/HarisIqbal88/PlotNeuralNet) - LaTeX code for drawing neural networks for reports and presentations. 
82 | * [PolyFuzz](https://github.com/MaartenGr/PolyFuzz) - PolyFuzz performs fuzzy string matching, string grouping, and contains extensive evaluation functions. PolyFuzz is meant to bring fuzzy string matching techniques together within a single framework. 83 | * [prophet](https://github.com/facebook/prophet) - Prophet is a procedure for forecasting time series data based on an additive model where non-linear trends are fit with yearly, weekly, and daily seasonality, plus holiday effects. It works best with time series that have strong seasonal effects and several seasons of historical data. 84 | * [pybaobabdt](https://gitlab.tue.nl/20040367/pybaobab) - The pybaobabdt package provides a Python implementation for the visualization of decision trees. 85 | * [pycaret](https://github.com/pycaret/pycaret) - An open source, low-code machine learning library in Python https://www.pycaret.org 86 | * [PyCM](https://github.com/sepandhaghighi/pycm) - PyCM is a multi-class confusion matrix library written in Python that supports both input data vectors and direct matrix, and a proper tool for post-classification model evaluation that supports most classes and overall statistics parameters. 87 | * [PyComp](https://github.com/ThiagoPanini/pycomp) - A factory of Python components for automating everything from routine tasks to building Machine Learning models. 88 | * [pydantic](https://github.com/samuelcolvin/pydantic) - Data parsing and validation using Python type hints https://pydantic-docs.helpmanual.io/ 89 | * [pydash](https://github.com/dgilland/pydash) - The kitchen sink of Python utility libraries for doing "stuff" in a functional way. 90 | * [pygod](https://github.com/pygod-team/pygod) - A Python Library for Graph Outlier Detection (Anomaly Detection). 91 | * [PyInfra](https://github.com/Fizzadar/pyinfra) - pyinfra automates infrastructure super fast at massive scale. 
It can be used for ad-hoc command execution, service deployment, configuration management and more https://pyinfra.com 92 | * [pyinstrument](https://github.com/joerick/pyinstrument) - Pyinstrument is a Python profiler. A profiler is a tool to help you 'optimize' your code - make it faster. 93 | * [PyMC3](https://github.com/pymc-devs/pymc3) - PyMC3 is a Python package for Bayesian statistical modeling and Probabilistic Machine Learning focusing on advanced Markov chain Monte Carlo (MCMC) and variational inference (VI) algorithms. Its flexibility and extensibility make it applicable to a large suite of problems. 94 | * [PySyft](https://github.com/OpenMined/PySyft) - A library for encrypted, privacy preserving machine learning https://www.openmined.org/ 95 | * [Qlib](https://github.com/microsoft/qlib) - Qlib is an AI-oriented quantitative investment platform, which aims to realize the potential, empower the research, and create the value of AI technologies in quantitative investment. 96 | * [Quarto](https://github.com/quarto-dev/quarto-cli) - Open-source scientific and technical publishing system built on Pandoc. 97 | * [Quant DSL](https://github.com/johnbywater/quantdsl) - Domain specific language for quantitative analytics in finance and trading. 98 | * [Realtime PyAudio FFT](https://github.com/tr1pzz/Realtime_PyAudio_FFT) - A simple package to do realtime audio analysis in native Python, using PyAudio and Numpy to extract and visualize FFT features from a live audio stream. 99 | * [ReBeL](https://github.com/facebookresearch/rebel) - Implementation of ReBeL, an algorithm that generalizes the paradigm of self-play reinforcement learning and search to imperfect-information games. This repository contains implementation only for Liar's Dice game. 
100 | * [RPA Framework](https://github.com/robocorp/rpaframework) - RPA Framework is a collection of open-source libraries and tools for Robotic Process Automation (RPA), and it is designed to be used with both Robot Framework and Python. 101 | * [Replicate](https://github.com/replicate/replicate) - Version control for machine learning. Replicate is a Python library that uploads files and metadata (like hyperparameters) to Amazon S3 or Google Cloud Storage. You can get the data back out using the command-line interface or a notebook. 102 | * [samila](https://github.com/sepandhaghighi/samila) - Samila is a generative art generator written in Python. Samila lets you create art based on many thousands of points. 103 | * [SDV](https://github.com/sdv-dev/SDV) - The Synthetic Data Vault (SDV) is a Synthetic Data Generation ecosystem of libraries that allows users to easily learn single-table, multi-table and timeseries datasets to later on generate new Synthetic Data that has the same format and statistical properties as the original dataset. 104 | * [shap](https://github.com/slundberg/shap) - SHAP (SHapley Additive exPlanations) is a game theoretic approach to explain the output of any machine learning model. 105 | * [Shapash](https://github.com/MAIF/shapash) - Shapash is a Python library which aims to make machine learning interpretable and understandable by everyone. It provides several types of visualization that display explicit labels that everyone can understand. 106 | * [sidetable](https://github.com/chris1610/sidetable) - sidetable builds simple but useful summary tables of your data https://pbpython.com 107 | * [scikit-survival](https://github.com/sebp/scikit-survival) - scikit-survival is a Python module for survival analysis built on top of scikit-learn. It allows doing survival analysis while utilizing the power of scikit-learn, e.g., for pre-processing or doing cross-validation. 
108 | * [skits](https://github.com/ethanrosenthal/skits) - A library for SciKit-learn-Inspired Time Series models. 109 | * [spleeter](https://github.com/deezer/spleeter) - Deezer source separation library including pretrained models. 110 | * [sqlacodegen](https://github.com/agronholm/sqlacodegen) - This is a tool that reads the structure of an existing database and generates the appropriate SQLAlchemy model code, using the declarative style if possible. 111 | * [sqlmodel](https://github.com/tiangolo/sqlmodel) - SQL databases in Python, designed for simplicity, compatibility, and robustness. 112 | * [stock-pandas](https://github.com/kaelzhang/stock-pandas) - The production-ready subclass of `pandas.DataFrame` to support stock statistics and indicators. 113 | * [Stories](https://github.com/benawad/vscode-stories) - Stories is a simple way of sharing code snippets with other developers. [Download in marketplace](https://marketplace.visualstudio.com/items?itemName=benawad.stories) 114 | * [Streamlit](https://github.com/streamlit/streamlit) - The fastest way to build custom ML tools [https://www.streamlit.io/](https://www.streamlit.io/) 115 | * [superset](https://github.com/apache/superset) - Apache Superset is a Data Visualization and Data Exploration Platform 116 | * [SyntheticControlMethods](https://github.com/OscarEngelbrektson/SyntheticControlMethods) - A Python package for causal inference using Synthetic Controls 117 | * [sysidentpy](https://github.com/wilsonrljr/sysidentpy) - sysidentpy is a Python module for System Identification using NARMAX models built on top of numpy and is distributed under the 3-Clause BSD license. 118 | * [sweetviz](https://github.com/fbdesignpro/sweetviz) - Sweetviz is an open-source Python library that generates beautiful, high-density visualizations to kickstart EDA (Exploratory Data Analysis) with just two lines of code. 
119 | * [Texthero](https://github.com/jbesomi/texthero) - Text preprocessing, representation and visualization from zero to hero https://texthero.org 120 | * [tpot](https://github.com/EpistasisLab/tpot) - A Python Automated Machine Learning tool that optimizes machine learning pipelines using genetic programming http://epistasislab.github.io/tpot/ 121 | * [tsfel](https://github.com/fraunhoferportugal/tsfel) - This repository hosts the TSFEL (Time Series Feature Extraction Library) Python package. TSFEL assists researchers with exploratory feature extraction tasks on time series without requiring significant programming effort. 122 | * [tuplex](https://github.com/tuplex/tuplex) - Tuplex is a parallel big data processing framework that runs data science pipelines written in Python at the speed of compiled code. 123 | * [Unvoiced](https://github.com/grassknoted/Unvoiced) - Application that converts American Sign Language to Speech. 124 | * [Visual Python](https://github.com/visualpython/visualpython) - Visual Python is a GUI-based Python code generator, developed on the Jupyter Notebook environment as an extension. 125 | * [Vowpal Wabbit](https://github.com/VowpalWabbit/vowpal_wabbit) - Vowpal Wabbit is a machine learning system which pushes the frontier of machine learning with techniques such as online, hashing, allreduce, reductions, learning2search, active, and interactive learning. 126 | * [Wav2Lip](https://github.com/Rudrabha/Wav2Lip) - This repository contains the code for "A Lip Sync Expert Is All You Need for Speech to Lip Generation In the Wild", published at ACM Multimedia 2020. 127 | * [xarray](https://github.com/pydata/xarray) - xarray (formerly xray) is an open source project and Python package that makes working with labelled multi-dimensional arrays simple, efficient, and fun! 128 | * [zero](https://github.com/Ananto30/zero#benchmarks) - A high performance and fast Python microservice framework (RPC + PubSub). 
129 | 130 | ## Highlights 131 | * [ML Notebooks](https://github.com/dair-ai/ML-Notebooks) - A series of code examples for all sorts of machine learning tasks and applications. 132 | * [Best-of Machine Learning with Python](https://github.com/ml-tooling/best-of-ml-python) - A ranked list of awesome machine learning Python libraries. Updated weekly. 133 | * [Awesome Dash](https://github.com/ucg8j/awesome-dash) 134 | * [Awesome FastAPI](https://github.com/mjhea0/awesome-fastapi) - A curated list of awesome things related to FastAPI. 135 | * [Awesome Python Data Science](https://github.com/krzjoa/awesome-python-data-science) - Probably the best curated list of data science software in Python. 136 | * [Awesome Machine Learning](https://github.com/italojs/awesome-machine-learning-portugues) - Language: Portuguese 137 | * [Open Source Society University](https://github.com/ossu/data-science) 138 | * [CursoDataScience](https://github.com/araramakerspace/CursoDataScience) - Language: Portuguese 139 | * [For Data Science Beginners](https://github.com/amrrs/For-Data-Science-Beginners) 140 | * [DataSciencePython](https://github.com/ujjwalkarn/DataSciencePython) - Common data analysis and machine learning tasks using Python 141 | * [A to Z Resources for Students](https://github.com/dipakkr/A-to-Z-Resources-for-Students) 142 | * [A gallery of interesting Jupyter Notebooks](https://github.com/jupyter/jupyter/wiki/A-gallery-of-interesting-Jupyter-Notebooks) 143 | * [All Algorithms implemented in Python](https://github.com/TheAlgorithms/Python) 144 | * [Data Science Pizza](https://github.com/leportella/datascience-pizza) 145 | * [Cheat Sheets Data Science](https://github.com/kailashahirwar/cheatsheets-ai) 146 | * [Data Science - Cheat Sheet](https://github.com/abhat222/Data-Science--Cheat-Sheet) 147 | ## Others 148 | * [Natural Language Processing Best Practices & Examples](https://github.com/microsoft/nlp-recipes) 149 | * [Machine Learning Links and Lessons 
Learned](https://github.com/adeshpande3/Machine-Learning-Links-And-Lessons-Learned) 150 | * [Data Science Roadmap](https://github.com/datascience-python/data-science-roadmap) 151 | * [Data Science in a Box](https://github.com/rstudio-education/datascience-box) 152 | * [100 Days of Machine Learning Code](https://github.com/Avik-Jain/100-Days-Of-ML-Code) 153 | * [Awesome Deep Learning Papers](https://github.com/terryum/awesome-deep-learning-papers) 154 | * [Recursos by @MachineLearningBR](https://github.com/MachineLearningBR/recursos) 155 | * [NLP Progress](https://github.com/sebastianruder/NLP-progress) 156 | -------------------------------------------------------------------------------- /YOUTUBE.md: -------------------------------------------------------------------------------- 1 | # Some playlists and channels from YouTube 2 | ## Highlight 3 | * [The Ultimate List of Python YouTube Channels](https://realpython.com/python-youtube-channels/#.W7LVNdikKEQ.facebook) 4 | * [sentdex](https://www.youtube.com/channel/UCfzlCWGWYyIQ0aLC5w48gBQ) has a lot of good videos on varied themes; it is a must-watch channel! 5 | * [Eduardo Mendes](https://www.youtube.com/user/mendesesduardo) is another channel with good videos. He periodically hosts live videos with a guest. You have to understand Portuguese to watch the videos. 
6 | * [Introduction to Computational Thinking and Data Science - MIT Course](https://www.youtube.com/playlist?list=PLUl4u3cNGP619EG1wp0kT-7rDE_Az5TNd) 7 | * [Neural networks](https://www.youtube.com/playlist?list=PLZHQObOWTQDNU6R1_67000Dx_ZCJB-3pi) 8 | ## Other resources 9 | * [Machine Learning Tutorial in Python](https://www.youtube.com/playlist?list=PL9ooVrP1hQOHUfd-g8GUpKI3hHOwM_9Dn) 10 | * [Machine Learning with Python by @sentdex](https://www.youtube.com/playlist?list=PLQVvvaa0QuDfKTOs3Keq_kaG2P55YRn5v) 11 | * [Python Pro - Tech Talks](https://www.youtube.com/playlist?list=PLA05yVJtRWYSQ0loqX4Er6wIwJ_sU8j3S) - Language: Portuguese 12 | * [Cursos de Análise de Dados em Python para iniciantes](https://www.youtube.com/playlist?list=PLqiFjCF_dtcymXtdjwAP4s7tRoW4CYwnH) - Language: Portuguese 13 | * [Curso Deep Learning do Zero](https://www.youtube.com/playlist?list=PLxWEfWCujM7Y3Xf1bAxpICRlw2jt1a4S7) - Language: Portuguese 14 | * [Machine Learning Recipes with Josh Gordon](https://www.youtube.com/playlist?list=PLOU2XLYxmsIIuiBfYad6rFYQU_jL2ryal) 15 | * [Aprenda a Ensinar a Máquina!](https://www.youtube.com/playlist?list=PLjdDBZW3EmXdWKWIUGcP66lqnOapLDJlf) - Language: Portuguese 16 | * [Curso Deep Learning](https://www.youtube.com/playlist?list=PLSZEVLiOtIgF19_cPrvhJC2bWn-dUh1zB) - Language: Portuguese 17 | --------------------------------------------------------------------------------