├── .gitignore ├── CNAME ├── assets └── head.jpg ├── .github ├── FUNDING.yml └── workflows │ └── action.yml ├── OWNERS ├── .travis.yml ├── mlc_config.json ├── LICENSE ├── CODE_OF_CONDUCT.md └── README.md /.gitignore: -------------------------------------------------------------------------------- 1 | .DS_Store 2 | -------------------------------------------------------------------------------- /CNAME: -------------------------------------------------------------------------------- 1 | awesome-datascience.academic.io -------------------------------------------------------------------------------- /assets/head.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/academic/awesome-datascience/HEAD/assets/head.jpg -------------------------------------------------------------------------------- /.github/FUNDING.yml: -------------------------------------------------------------------------------- 1 | # These are supported funding model platforms 2 | github: [hmert] 3 | patreon: hmert 4 | -------------------------------------------------------------------------------- /OWNERS: -------------------------------------------------------------------------------- 1 | --- 2 | reviewers: 3 | - hmert 4 | - fakturk 5 | 6 | approvers: 7 | - hmert 8 | - fakturk 9 | -------------------------------------------------------------------------------- /.travis.yml: -------------------------------------------------------------------------------- 1 | language: ruby 2 | rvm: 3 | - 2.2 4 | before_script: 5 | - gem install awesome_bot 6 | script: 7 | - awesome_bot README.md 8 | -------------------------------------------------------------------------------- /mlc_config.json: -------------------------------------------------------------------------------- 1 | { 2 | 3 | "ignorePatterns": [ 4 | { 5 | "pattern": "^https://www.datacamp.com*" 6 | } 7 | ], 8 | "timeout": "20s", 9 | "retryOn429": true, 10 | "retryCount": 5, 11 | 
"fallbackRetryDelay": "30s", 12 | "aliveStatusCodes": [200, 206] 13 | } 14 | -------------------------------------------------------------------------------- /.github/workflows/action.yml: -------------------------------------------------------------------------------- 1 | name: Check Markdown links 2 | on: 3 | schedule: 4 | - cron: "* */24 * * *" 5 | push: 6 | branches: [ live ] 7 | 8 | jobs: 9 | markdown-link-check: 10 | runs-on: ubuntu-latest 11 | steps: 12 | - uses: actions/checkout@master 13 | with: 14 | fetch-depth: 1 15 | - uses: gaurav-nelson/github-action-markdown-link-check@master 16 | with: 17 | use-quiet-mode: "yes" 18 | use-verbose-mode: "yes" 19 | config-file: './mlc_config.json' 20 | -------------------------------------------------------------------------------- /LICENSE: -------------------------------------------------------------------------------- 1 | The MIT License (MIT) 2 | 3 | Copyright (c) 2014-2025 Academic.IO 4 | 5 | Permission is hereby granted, free of charge, to any person obtaining a copy 6 | of this software and associated documentation files (the "Software"), to deal 7 | in the Software without restriction, including without limitation the rights 8 | to use, copy, modify, merge, publish, distribute, sublicense, and/or sell 9 | copies of the Software, and to permit persons to whom the Software is 10 | furnished to do so, subject to the following conditions: 11 | 12 | The above copyright notice and this permission notice shall be included in all 13 | copies or substantial portions of the Software. 14 | 15 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR 16 | IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, 17 | FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. 
IN NO EVENT SHALL THE 18 | AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER 19 | LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, 20 | OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE 21 | SOFTWARE. -------------------------------------------------------------------------------- /CODE_OF_CONDUCT.md: -------------------------------------------------------------------------------- 1 | # Contributor Covenant Code of Conduct 2 | 3 | ## Our Pledge 4 | 5 | In the interest of fostering an open and welcoming environment, we as 6 | contributors and maintainers pledge to making participation in our project and 7 | our community a harassment-free experience for everyone, regardless of age, body 8 | size, disability, ethnicity, sex characteristics, gender identity and expression, 9 | level of experience, education, socio-economic status, nationality, personal 10 | appearance, race, religion, or sexual identity and orientation. 
11 | 12 | ## Our Standards 13 | 14 | Examples of behavior that contributes to creating a positive environment 15 | include: 16 | 17 | * Using welcoming and inclusive language 18 | * Being respectful of differing viewpoints and experiences 19 | * Gracefully accepting constructive criticism 20 | * Focusing on what is best for the community 21 | * Showing empathy towards other community members 22 | 23 | Examples of unacceptable behavior by participants include: 24 | 25 | * The use of sexualized language or imagery and unwelcome sexual attention or 26 | advances 27 | * Trolling, insulting/derogatory comments, and personal or political attacks 28 | * Public or private harassment 29 | * Publishing others' private information, such as a physical or electronic 30 | address, without explicit permission 31 | * Other conduct which could reasonably be considered inappropriate in a 32 | professional setting 33 | 34 | ## Our Responsibilities 35 | 36 | Project maintainers are responsible for clarifying the standards of acceptable 37 | behavior and are expected to take appropriate and fair corrective action in 38 | response to any instances of unacceptable behavior. 39 | 40 | Project maintainers have the right and responsibility to remove, edit, or 41 | reject comments, commits, code, wiki edits, issues, and other contributions 42 | that are not aligned to this Code of Conduct, or to ban temporarily or 43 | permanently any contributor for other behaviors that they deem inappropriate, 44 | threatening, offensive, or harmful. 45 | 46 | ## Scope 47 | 48 | This Code of Conduct applies both within project spaces and in public spaces 49 | when an individual is representing the project or its community. Examples of 50 | representing a project or community include using an official project e-mail 51 | address, posting via an official social media account, or acting as an appointed 52 | representative at an online or offline event. 
Representation of a project may be 53 | further defined and clarified by project maintainers. 54 | 55 | ## Enforcement 56 | 57 | Instances of abusive, harassing, or otherwise unacceptable behavior may be 58 | reported by contacting the project team at hi@academic.io. All 59 | complaints will be reviewed and investigated and will result in a response that 60 | is deemed necessary and appropriate to the circumstances. The project team is 61 | obligated to maintain confidentiality with regard to the reporter of an incident. 62 | Further details of specific enforcement policies may be posted separately. 63 | 64 | Project maintainers who do not follow or enforce the Code of Conduct in good 65 | faith may face temporary or permanent repercussions as determined by other 66 | members of the project's leadership. 67 | 68 | ## Attribution 69 | 70 | This Code of Conduct is adapted from the [Contributor Covenant][homepage], version 1.4, 71 | available at https://www.contributor-covenant.org/version/1/4/code-of-conduct.html 72 | 73 | [homepage]: https://www.contributor-covenant.org 74 | 75 | For answers to common questions about this code of conduct, see 76 | https://www.contributor-covenant.org/faq 77 | -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 |
2 | Special thanks to Sponsors: 3 |
4 |
5 | 6 | Warp sponsorship 7 | 8 | 9 | ### [Warp, the intelligent terminal for developers](https://go.warp.dev/awesome-datascience) 10 | [Available for MacOS, Linux, & Windows](https://go.warp.dev/awesome-datascience)
11 |
12 |
13 | 14 | Requestly sponsorship 15 | 16 | ### [Requestly - Free & Open-Source alternative to Postman](https://requestly.com/awesomedatascience) 17 | [All-in-one platform to Test, Mock and Intercept APIs](https://requestly.com/awesomedatascience) 18 |
19 |
20 | 21 |
22 | 23 |
24 | 25 | # AWESOME DATA SCIENCE 26 | 27 | [![Awesome](https://cdn.rawgit.com/sindresorhus/awesome/d7305f38d29fed78fa85652e3a63e154dd8e8829/media/badge.svg)](https://github.com/sindresorhus/awesome) 28 | 29 | **An open-source Data Science repository to learn and apply towards solving real world problems.** 30 | 31 | This is a shortcut path to start studying **Data Science**. Just follow the steps to answer the questions, "What is Data Science and what should I study to learn Data Science?" 32 | 33 | 34 |
35 | 36 | 37 | ## Sponsors 38 | 39 | | Sponsor | Pitch | 40 | | --- | --- | 41 | | --- | Be the first to sponsor! `github@academic.io` | 42 | 43 | 44 | 45 | ## Table of Contents 46 | 47 | - [What is Data Science?](#what-is-data-science) 48 | - [Where do I Start?](#where-do-i-start) 49 | - [Training Resources](#training-resources) 50 | - [Tutorials](#tutorials) 51 | - [Free Courses](#free-courses) 52 | - [Massively Open Online Courses](#moocs) 53 | - [Intensive Programs](#intensive-programs) 54 | - [Colleges](#colleges) 55 | - [The Data Science Toolbox](#the-data-science-toolbox) 56 | - [Algorithms](#algorithms) 57 | - [Supervised Learning](#supervised-learning) 58 | - [Unsupervised Learning](#unsupervised-learning) 59 | - [Semi-Supervised Learning](#semi-supervised-learning) 60 | - [Reinforcement Learning](#reinforcement-learning) 61 | - [Data Mining Algorithms](#data-mining-algorithms) 62 | - [Deep Learning Architectures](#deep-learning-architectures) 63 | - [General Machine Learning Packages](#general-machine-learning-packages) 64 | - [Model Evaluation & Monitoring](#model-evaluation--monitoring) 65 | - [Evidently AI](#evidently-ai) 66 | - [Deep Learning Packages](#deep-learning-packages) 67 | - [PyTorch Ecosystem](#pytorch-ecosystem) 68 | - [TensorFlow Ecosystem](#tensorflow-ecosystem) 69 | - [Keras Ecosystem](#keras-ecosystem) 70 | - [Visualization Tools](#visualization-tools) 71 | - [Miscellaneous Tools](#miscellaneous-tools) 72 | - [Literature and Media](#literature-and-media) 73 | - [Books](#books) 74 | - [Book Deals (Affiliated)](#book-deals-affiliated) 75 | - [Journals, Publications, and Magazines](#journals-publications-and-magazines) 76 | - [Newsletters](#newsletters) 77 | - [Bloggers](#bloggers) 78 | - [Presentations](#presentations) 79 | - [Podcasts](#podcasts) 80 | - [YouTube Videos & Channels](#youtube-videos--channels) 81 | - [Socialize](#socialize) 82 | - [Facebook Accounts](#facebook-accounts) 83 | - [Twitter Accounts](#twitter-accounts) 84 | - 
[Telegram Channels](#telegram-channels) 85 | - [Slack Communities](#slack-communities) 86 | - [GitHub Groups](#github-groups) 87 | - [Data Science Competitions](#data-science-competitions) 88 | - [Fun](#fun) 89 | - [Infographics](#infographics) 90 | - [Datasets](#datasets) 91 | - [Comics](#comics) 92 | - [Other Awesome Lists](#other-awesome-lists) 93 | - [Hobby](#hobby) 94 | 95 | ## What is Data Science? 96 | **[`^ back to top ^`](#awesome-data-science)** 97 | 98 | Data Science is one of the hottest topics in computing and on the Internet today. People have been gathering data from applications and systems for years, and now is the time to analyze it. The next steps are producing suggestions from the data and making predictions about the future. [Here](https://www.quora.com/Data-Science/What-is-data-science) you can find the big question about **Data Science** and hundreds of answers from experts. 99 | 100 | 101 | | Link | Preview | 102 | | --- | --- | 103 | | [Data Science For Beginners](https://github.com/microsoft/Data-Science-For-Beginners) | Microsoft are pleased to offer a 10-week, 20-lesson curriculum all about Data Science. | 104 | | [What is Data Science @ O'Reilly](https://www.oreilly.com/ideas/what-is-data-science) | _Data scientists combine entrepreneurship with patience, the willingness to build data products incrementally, the ability to explore, and the ability to iterate over a solution. They are inherently interdisciplinary. They can tackle all aspects of a problem, from initial data collection and data conditioning to drawing conclusions.
They can think outside the box to come up with new ways to view the problem, or to work with very broadly defined problems: “here’s a lot of data, what can you make from it?”_ | 105 | | [What is Data Science @ Quora](https://www.quora.com/Data-Science/What-is-data-science) | Data Science combines several aspects of data work, such as technology, algorithm development, and data inference, to study data, analyse it, and find innovative solutions to difficult problems. Basically, Data Science is all about analysing data and finding creative ways to drive business growth. | 106 | | [The sexiest job of the 21st century](https://hbr.org/2012/10/data-scientist-the-sexiest-job-of-the-21st-century) | _Data scientists today are akin to Wall Street “quants” of the 1980s and 1990s. In those days people with backgrounds in physics and math streamed to investment banks and hedge funds, where they could devise entirely new algorithms and data strategies. Then a variety of universities developed master’s programs in financial engineering, which churned out a second generation of talent that was more accessible to mainstream firms. The pattern was repeated later in the 1990s with search engineers, whose rarefied skills soon came to be taught in computer science programs._ | 107 | | [Wikipedia](https://en.wikipedia.org/wiki/Data_science) | _Data science is an interdisciplinary field that uses scientific methods, processes, algorithms and systems to extract knowledge and insights from structured and unstructured data. Data science is related to data mining, machine learning and big data._ | 108 | | [How to Become a Data Scientist](https://www.mastersindatascience.org/careers/data-scientist/) | _Data scientists are big data wranglers, gathering and analyzing large sets of structured and unstructured data. A data scientist’s role combines computer science, statistics, and mathematics.
They analyze, process, and model data then interpret the results to create actionable plans for companies and other organizations._ | 109 | | [a very short history of #datascience](https://www.forbes.com/sites/gilpress/2013/05/28/a-very-short-history-of-data-science/) | _The story of how data scientists became sexy is mostly the story of the coupling of the mature discipline of statistics with a very young one--computer science. The term “Data Science” has emerged only recently to specifically designate a new profession that is expected to make sense of the vast stores of big data. But making sense of data has a long history and has been discussed by scientists, statisticians, librarians, computer scientists and others for years. The following timeline traces the evolution of the term “Data Science” and its use, attempts to define it, and related terms._ | 110 | |[Software Development Resources for Data Scientists](https://www.rstudio.com/blog/software-development-resources-for-data-scientists/)|_Data scientists concentrate on making sense of data through exploratory analysis, statistics, and models. Software developers apply a separate set of knowledge with different tools. Although their focus may seem unrelated, data science teams can benefit from adopting software development best practices. Version control, automated testing, and other dev skills help create reproducible, production-ready code and tools._| 111 | |[Data Scientist Roadmap](https://www.scaler.com/blog/how-to-become-a-data-scientist/)|_Data science is an excellent career choice in today’s data-driven world where approx 328.77 million terabytes of data are generated daily. 
And this number is only increasing day by day, which in turn increases the demand for skilled data scientists who can utilize this data to drive business growth._| 112 | |[Navigating Your Path to Becoming a Data Scientist](https://www.appliedaicourse.com/blog/how-to-become-a-data-scientist/)|_Data science is one of the most in-demand careers today. With businesses increasingly relying on data to make decisions, the need for skilled data scientists has grown rapidly. Whether it’s tech companies, healthcare organizations, or even government institutions, data scientists play a crucial role in turning raw data into valuable insights. But how do you become a data scientist, especially if you’re just starting out?_| 113 | 114 | ## Where do I Start? 115 | **[`^ back to top ^`](#awesome-data-science)** 116 | 117 | While not strictly necessary, knowing a programming language is a key skill for working effectively as a data scientist. Currently, the most popular language is _Python_, closely followed by _R_. Python is a general-purpose scripting language that sees applications in a wide variety of fields. R is a domain-specific language for statistics, which contains a lot of common statistics tools out of the box. 118 | 119 | [Python](https://python.org/) is by far the most popular language in science, due in no small part to the ease with which it can be used and the vibrant ecosystem of user-generated packages. To install packages, there are two main methods: Pip (invoked as `pip install`), the package manager that comes bundled with Python, and [Anaconda](https://www.anaconda.com) (invoked as `conda install`), a powerful package manager that can install packages for Python and R and can also download executables like Git. 120 | 121 | Unlike R, Python was not built from the ground up with data science in mind, but there are plenty of third-party libraries to make up for this.
A much more exhaustive list of packages can be found later in this document, but these four packages are a good set of choices to start your data science journey with: [Scikit-Learn](https://scikit-learn.org/stable/index.html) is a general-purpose data science package which implements the most popular algorithms - it also includes rich documentation, tutorials, and examples of the models it implements. Even if you prefer to write your own implementations, Scikit-Learn is a valuable reference to the nuts-and-bolts behind many of the common algorithms you'll find. With [Pandas](https://pandas.pydata.org/), you can collect your data into a convenient table format and analyze it. [NumPy](https://numpy.org/) provides very fast tooling for mathematical operations, with a focus on vectors and matrices. [Seaborn](https://seaborn.pydata.org/), itself based on the [Matplotlib](https://matplotlib.org/) package, is a quick way to generate beautiful visualizations of your data, with many good defaults available out of the box, as well as a gallery showing how to produce many common visualizations of your data. 122 | 123 | When embarking on your journey to becoming a data scientist, the choice of language isn't particularly important, and both Python and R have their pros and cons. Pick a language you like, and check out one of the [Free courses](#free-courses) we've listed below! 124 | 125 | ## Agents 126 | 127 | Contributions about "agents" are welcome. 128 | 129 | ### Frameworks 130 | - [ADK-Rust](https://github.com/zavora-ai/adk-rust) - Production-ready AI agent development kit for Rust with model-agnostic design (Gemini, OpenAI, Anthropic), multiple agent types (LLM, Graph, Workflow), MCP support, and built-in telemetry. 131 | 132 | ### Workflow 133 | **[`^ back to top ^`](#awesome-data-science)** 134 | - [sim](https://sim.ai) - Sim Studio's interface is a lightweight, intuitive way to quickly build and deploy LLMs that connect with your favorite tools.
135 | 136 | 137 | ## Training Resources 138 | **[`^ back to top ^`](#awesome-data-science)** 139 | 140 | How do you learn data science? By doing data science, of course! Okay, okay - that might not be particularly helpful when you're first starting out. In this section, we've listed some learning resources, in rough order from least to greatest commitment - [Tutorials](#tutorials), [Massively Open Online Courses (MOOCs)](#moocs), [Intensive Programs](#intensive-programs), and [Colleges](#colleges). 141 | 142 | 143 | ### Tutorials 144 | **[`^ back to top ^`](#awesome-data-science)** 145 | 146 | - [1000 Data Science Projects](https://cloud.blobcity.com/#/ps/explore) you can run on the browser with IPython. 147 | - [#tidytuesday](https://github.com/rfordatascience/tidytuesday) A weekly data project aimed at the R ecosystem. 148 | - [Data science your way](https://github.com/jadianes/data-science-your-way) 149 | - [DataCamp Cheatsheets](https://www.datacamp.com/cheat-sheet) Cheatsheets for data science. 150 | - [PySpark Cheatsheet](https://github.com/kevinschaich/pyspark-cheatsheet) 151 | - [Machine Learning, Data Science and Deep Learning with Python ](https://www.manning.com/livevideo/machine-learning-data-science-and-deep-learning-with-python) 152 | - [Your Guide to Latent Dirichlet Allocation](https://medium.com/@lettier/how-does-lda-work-ill-explain-using-emoji-108abf40fa7d) 153 | - [Tutorials of source code from the book Genetic Algorithms with Python by Clinton Sheppard](https://github.com/handcraftsman/GeneticAlgorithmsWithPython) 154 | - [Tutorials to get started on signal processing for machine learning](https://github.com/jinglescode/python-signal-processing) 155 | - [Realtime deployment](https://www.microprediction.com/python-1) Tutorial on Python time-series model deployment. 
156 | - [Python for Data Science: A Beginner’s Guide](https://learntocodewith.me/posts/python-for-data-science/) 157 | - [Minimum Viable Study Plan for Machine Learning Interviews](https://github.com/khangich/machine-learning-interview) 158 | - [Understand and Know Machine Learning Engineering by Building Solid Projects](http://mlzoomcamp.com/) 159 | - [12 free Data Science projects to practice Python and Pandas](https://www.datawars.io/articles/12-free-data-science-projects-to-practice-python-and-pandas) 160 | - [Best CV/Resume for Data Science Freshers](https://enhancv.com/resume-examples/data-scientist/) 161 | - [Understand Data Science Course in Java](https://www.alter-solutions.com/articles/java-data-science) 162 | - [Data Analytics Interview Questions (Beginner to Advanced)](https://www.appliedaicourse.com/blog/data-analytics-interview-questions/) 163 | - [Top 100+ Data Science Interview Questions and Answers](https://www.appliedaicourse.com/blog/data-science-interview-questions/) 164 | 165 | ### Free Courses 166 | **[`^ back to top ^`](#awesome-data-science)** 167 | 168 | - [Data Scientist with R](https://www.datacamp.com/tracks/data-scientist-with-r) 169 | - [Data Scientist with Python](https://www.datacamp.com/tracks/data-scientist-with-python) 170 | - [Genetic Algorithms OCW Course](https://ocw.mit.edu/courses/electrical-engineering-and-computer-science/6-034-artificial-intelligence-fall-2010/lecture-videos/lecture-1-introduction-and-scope/) 171 | - [AI Expert Roadmap](https://github.com/AMAI-GmbH/AI-Expert-Roadmap) - Roadmap to becoming an Artificial Intelligence Expert 172 | - [Convex Optimization](https://www.edx.org/course/convex-optimization) - Convex Optimization (basics of convex analysis; least-squares, linear and quadratic programs, semidefinite programming, minimax, extremal volume, and other problems; optimality conditions, duality theory...) 
173 | - [Learning from Data](https://home.work.caltech.edu/telecourse.html) - Introduction to machine learning covering basic theory, algorithms and applications 174 | - [Kaggle](https://www.kaggle.com/learn) - Learn about Data Science, Machine Learning, Python etc 175 | - [ML Observability Fundamentals](https://arize.com/ml-observability-fundamentals/) - Learn how to monitor and root-cause production ML issues. 176 | - [Weights & Biases Effective MLOps: Model Development](https://www.wandb.courses/courses/effective-mlops-model-development) - Free Course and Certification for building an end-to-end machine using W&B 177 | - [Python for Data Science by Scaler](https://www.scaler.com/topics/course/python-for-data-science/) - This course is designed to empower beginners with the essential skills to excel in today's data-driven world. The comprehensive curriculum will give you a solid foundation in statistics, programming, data visualization, and machine learning. 178 | - [MLSys-NYU-2022](https://github.com/jacopotagliabue/MLSys-NYU-2022/tree/main) - Slides, scripts and materials for the Machine Learning in Finance course at NYU Tandon, 2022. 179 | - [Hands-on Train and Deploy ML](https://github.com/Paulescu/hands-on-train-and-deploy-ml) - A hands-on course to train and deploy a serverless API that predicts crypto prices. 180 | - [LLMOps: Building Real-World Applications With Large Language Models](https://www.comet.com/site/llm-course/) - Learn to build modern software with LLMs using the newest tools and techniques in the field. 181 | - [Prompt Engineering for Vision Models](https://www.deeplearning.ai/short-courses/prompt-engineering-for-vision-models/) - Learn to prompt cutting-edge computer vision models with natural language, coordinate points, bounding boxes, segmentation masks, and even other images in this free course from DeepLearning.AI. 
182 | - [Data Science Course By IBM](https://skillsbuild.org/students/course-catalog/data-science) - Free resources and learn what data science is and how it’s used in different industries. 183 | 184 | 185 | 186 | ### MOOC's 187 | **[`^ back to top ^`](#awesome-data-science)** 188 | 189 | - [Coursera Introduction to Data Science](https://www.coursera.org/specializations/data-science) 190 | - [Data Science - 9 Steps Courses, A Specialization on Coursera](https://www.coursera.org/specializations/jhu-data-science) 191 | - [Data Mining - 5 Steps Courses, A Specialization on Coursera](https://www.coursera.org/specializations/data-mining) 192 | - [Machine Learning – 5 Steps Courses, A Specialization on Coursera](https://www.coursera.org/specializations/machine-learning) 193 | - [CS 109 Data Science](https://cs109.github.io/2015/) 194 | - [OpenIntro](https://www.openintro.org/) 195 | - [CS 171 Visualization](https://www.cs171.org/#!index.md) 196 | - [Process Mining: Data science in Action](https://www.coursera.org/learn/process-mining) 197 | - [Oxford Deep Learning](https://www.cs.ox.ac.uk/projects/DeepLearn/) 198 | - [Oxford Deep Learning - video](https://www.youtube.com/playlist?list=PLE6Wd9FR--EfW8dtjAuPoTuPcqmOV53Fu) 199 | - [Oxford Machine Learning](https://www.cs.ox.ac.uk/research/ai_ml/index.html) 200 | - [UBC Machine Learning - video](https://www.cs.ubc.ca/~nando/540-2013/lectures.html) 201 | - [Data Science Specialization](https://github.com/DataScienceSpecialization/courses) 202 | - [Coursera Big Data Specialization](https://www.coursera.org/specializations/big-data) 203 | - [Statistical Thinking for Data Science and Analytics by Edx](https://www.edx.org/course/statistical-thinking-for-data-science-and-analytic) 204 | - [Cognitive Class AI by IBM](https://cognitiveclass.ai/) 205 | - [Udacity - Deep Learning](https://www.udacity.com/course/intro-to-tensorflow-for-deep-learning--ud187) 206 | - [Keras in Motion](https://www.manning.com/livevideo/keras-in-motion) 
207 | - [Microsoft Professional Program for Data Science](https://academy.microsoft.com/en-us/professional-program/tracks/data-science/) 208 | - [COMP3222/COMP6246 - Machine Learning Technologies](https://tdgunes.com/COMP6246-2019Fall/) 209 | - [CS 231 - Convolutional Neural Networks for Visual Recognition](https://cs231n.github.io/) 210 | - [Coursera Tensorflow in practice](https://www.coursera.org/professional-certificates/tensorflow-in-practice) 211 | - [Coursera Deep Learning Specialization](https://www.coursera.org/specializations/deep-learning) 212 | - [365 Data Science Course](https://365datascience.com/) 213 | - [Coursera Natural Language Processing Specialization](https://www.coursera.org/specializations/natural-language-processing) 214 | - [Coursera GAN Specialization](https://www.coursera.org/specializations/generative-adversarial-networks-gans) 215 | - [Codecademy's Data Science](https://www.codecademy.com/learn/paths/data-science) 216 | - [Linear Algebra](https://ocw.mit.edu/courses/18-06sc-linear-algebra-fall-2011/) - Linear Algebra course by Gilbert Strang 217 | - [A 2020 Vision of Linear Algebra (G. Strang)](https://ocw.mit.edu/resources/res-18-010-a-2020-vision-of-linear-algebra-spring-2020/) 218 | - [Python for Data Science Foundation Course](https://intellipaat.com/academy/course/python-for-data-science-free-training/) 219 | - [Data Science: Statistics & Machine Learning](https://www.coursera.org/specializations/data-science-statistics-machine-learning) 220 | - [Machine Learning Engineering for Production (MLOps)](https://www.coursera.org/specializations/machine-learning-engineering-for-production-mlops) 221 | - [Recommender Systems Specialization from University of Minnesota](https://www.coursera.org/specializations/recommender-systems) is an intermediate/advanced level specialization focused on Recommender System on the Coursera platform. 
222 | - [Stanford Artificial Intelligence Professional Program](https://online.stanford.edu/programs/artificial-intelligence-professional-program) 223 | - [Data Scientist with Python](https://app.datacamp.com/learn/career-tracks/data-scientist-with-python) 224 | - [Programming with Julia](https://www.udemy.com/course/programming-with-julia/) 225 | - [Scaler Data Science & Machine Learning Program](https://www.scaler.com/data-science-course/) 226 | - [Data Science Skill Tree](https://labex.io/skilltrees/data-science) 227 | - [Data Science for Beginners - Learn with AI tutor](https://codekidz.ai/lesson-intro/data-science-368dbf) 228 | - [Machine Learning for Beginners - Learn with AI tutor](https://codekidz.ai/lesson-intro/machine-lear-36abfb) 229 | - [Introduction to Data Science](https://www.mygreatlearning.com/academy/learn-for-free/courses/introduction-to-data-science) 230 | - [Getting Started with Python for Data Science](https://www.codecademy.com/learn/getting-started-with-python-for-data-science) 231 | - [Google Advanced Data Analytics Certificate](https://grow.google/data-analytics/) - Professional courses in data analysis, statistics, and machine learning fundamentals.
232 | 233 | ### Intensive Programs 234 | **[`^ back to top ^`](#awesome-data-science)** 235 | 236 | - [S2DS](https://www.s2ds.org/) 237 | - [WorldQuant University Applied Data Science Lab](https://www.wqu.edu/adsl) 238 | 239 | 240 | ### Colleges 241 | **[`^ back to top ^`](#awesome-data-science)** 242 | 243 | - [A list of colleges and universities offering degrees in data science.](https://github.com/ryanswanstrom/awesome-datascience-colleges) 244 | - [Data Science Degree @ Berkeley](https://ischoolonline.berkeley.edu/data-science/) 245 | - [Data Science Degree @ UVA](https://datascience.virginia.edu/) 246 | - [Data Science Degree @ Wisconsin](https://datasciencedegree.wisconsin.edu/) 247 | - [BS in Data Science & Applications](https://study.iitm.ac.in/ds/) 248 | - [MS in Computer Information Systems @ Boston University](https://www.bu.edu/online/programs/graduate-programs/computer-information-systems-masters-degree/) 249 | - [MS in Business Analytics @ ASU Online](https://asuonline.asu.edu/online-degree-programs/graduate/master-science-business-analytics/) 250 | - [MS in Applied Data Science @ Syracuse](https://ischool.syr.edu/academics/applied-data-science-masters-degree/) 251 | - [M.S. 
Management & Data Science @ Leuphana](https://www.leuphana.de/en/graduate-school/masters-programmes/management-data-science.html) 252 | - [Master of Data Science @ Melbourne University](https://study.unimelb.edu.au/find/courses/graduate/master-of-data-science/#overview) 253 | - [MSc in Data Science @ The University of Edinburgh](https://www.ed.ac.uk/studying/postgraduate/degrees/index.php?r=site/view&id=902) 254 | - [Master of Management Analytics @ Queen's University](https://smith.queensu.ca/grad_studies/mma/index.php) 255 | - [Master of Data Science @ Illinois Institute of Technology](https://www.iit.edu/academics/programs/data-science-mas) 256 | - [Master of Applied Data Science @ The University of Michigan](https://www.si.umich.edu/programs/master-applied-data-science) 257 | - [Master Data Science and Artificial Intelligence @ Eindhoven University of Technology](https://www.tue.nl/en/education/graduate-school/master-data-science-and-artificial-intelligence/) 258 | - [Master's Degree in Data Science and Computer Engineering @ University of Granada](https://masteres.ugr.es/datcom/) 259 | 260 | ## The Data Science Toolbox 261 | **[`^ back to top ^`](#awesome-data-science)** 262 | 263 | This section is a collection of packages, tools, algorithms, and other useful items in the data science world. 264 | 265 | ### Algorithms 266 | **[`^ back to top ^`](#awesome-data-science)** 267 | 268 | These are some Machine Learning and Data Mining algorithms and models that help you understand your data and derive meaning from it. 269 | 270 | #### Three kinds of Machine Learning Systems 271 | 272 | - Based on training with human supervision 273 | - Based on learning incrementally on the fly 274 | - Based on data points comparison and pattern detection 275 | 276 | ### Comparison 277 | - [datacompy](https://github.com/capitalone/datacompy) - DataComPy is a package to compare two Pandas DataFrames. 
278 | 279 | #### Supervised Learning 280 | 281 | - [Regression](https://en.wikipedia.org/wiki/Regression) 282 | - [Linear Regression](https://en.wikipedia.org/wiki/Linear_regression) 283 | - [Ordinary Least Squares](https://en.wikipedia.org/wiki/Ordinary_least_squares) 284 | - [Logistic Regression](https://en.wikipedia.org/wiki/Logistic_regression) 285 | - [Stepwise Regression](https://en.wikipedia.org/wiki/Stepwise_regression) 286 | - [Multivariate Adaptive Regression Splines](https://en.wikipedia.org/wiki/Multivariate_adaptive_regression_spline) 287 | - [Softmax Regression](https://d2l.ai/chapter_linear-classification/softmax-regression.html) 288 | - [Locally Estimated Scatterplot Smoothing](https://en.wikipedia.org/wiki/Local_regression) 289 | - Classification 290 | - [k-nearest neighbor](https://en.wikipedia.org/wiki/K-nearest_neighbors_algorithm) 291 | - [Support Vector Machines](https://en.wikipedia.org/wiki/Support_vector_machine) 292 | - [Decision Trees](https://en.wikipedia.org/wiki/Decision_tree) 293 | - [ID3 algorithm](https://en.wikipedia.org/wiki/ID3_algorithm) 294 | - [C4.5 algorithm](https://en.wikipedia.org/wiki/C4.5_algorithm) 295 | - [Ensemble Learning](https://scikit-learn.org/stable/modules/ensemble.html) 296 | - [Boosting](https://en.wikipedia.org/wiki/Boosting_(machine_learning)) 297 | - [Stacking](https://machinelearningmastery.com/stacking-ensemble-machine-learning-with-python) 298 | - [Bagging](https://en.wikipedia.org/wiki/Bootstrap_aggregating) 299 | - [Random Forest](https://en.wikipedia.org/wiki/Random_forest) 300 | - [AdaBoost](https://en.wikipedia.org/wiki/AdaBoost) 301 | 302 | #### Unsupervised Learning 303 | - [Clustering](https://scikit-learn.org/stable/modules/clustering.html#clustering) 304 | - [Hierarchical clustering](https://scikit-learn.org/stable/modules/clustering.html#hierarchical-clustering) 305 | - [k-means](https://scikit-learn.org/stable/modules/clustering.html#k-means) 306 | - [Density-based 
clustering](https://scikit-learn.org/stable/modules/clustering.html#dbscan) 307 | - [Fuzzy clustering](https://en.wikipedia.org/wiki/Fuzzy_clustering) 308 | - [Mixture models](https://en.wikipedia.org/wiki/Mixture_model) 309 | - [Dimension Reduction](https://en.wikipedia.org/wiki/Dimensionality_reduction) 310 | - [Principal Component Analysis (PCA)](https://scikit-learn.org/stable/modules/decomposition.html#principal-component-analysis-pca) 311 | - [t-SNE; t-distributed Stochastic Neighbor Embedding](https://scikit-learn.org/stable/modules/manifold.html#t-distributed-stochastic-neighbor-embedding-tsne) 312 | - [Factor Analysis](https://scikit-learn.org/stable/modules/decomposition.html#factor-analysis) 313 | - [Latent Dirichlet Allocation (LDA)](https://scikit-learn.org/stable/modules/decomposition.html#latent-dirichlet-allocation-lda) 314 | - [Neural Networks](https://en.wikipedia.org/wiki/Neural_network) 315 | - [Self-organizing map](https://en.wikipedia.org/wiki/Self-organizing_map) 316 | - [Adaptive resonance theory](https://en.wikipedia.org/wiki/Adaptive_resonance_theory) 317 | - [Hidden Markov Models (HMM)](https://en.wikipedia.org/wiki/Hidden_Markov_model) 318 | 319 | #### Semi-Supervised Learning 320 | 321 | - S3VM 322 | - [Clustering](https://en.wikipedia.org/wiki/Weak_supervision#Cluster_assumption) 323 | - [Generative models](https://en.wikipedia.org/wiki/Weak_supervision#Generative_models) 324 | - [Low-density separation](https://en.wikipedia.org/wiki/Weak_supervision#Low-density_separation) 325 | - [Laplacian regularization](https://en.wikipedia.org/wiki/Weak_supervision#Laplacian_regularization) 326 | - [Heuristic approaches](https://en.wikipedia.org/wiki/Weak_supervision#Heuristic_approaches) 327 | 328 | #### Reinforcement Learning 329 | 330 | - [Q Learning](https://en.wikipedia.org/wiki/Q-learning) 331 | - [SARSA (State-Action-Reward-State-Action) 
algorithm](https://en.wikipedia.org/wiki/State%E2%80%93action%E2%80%93reward%E2%80%93state%E2%80%93action) 332 | - [Temporal difference learning](https://en.wikipedia.org/wiki/Temporal_difference_learning#:~:text=Temporal%20difference%20(TD)%20learning%20refers,estimate%20of%20the%20value%20function.) 333 | 334 | #### Data Mining Algorithms 335 | 336 | - [C4.5](https://en.wikipedia.org/wiki/C4.5_algorithm) 337 | - [k-Means](https://en.wikipedia.org/wiki/K-means_clustering) 338 | - [SVM (Support Vector Machine)](https://en.wikipedia.org/wiki/Support_vector_machine) 339 | - [Apriori](https://en.wikipedia.org/wiki/Apriori_algorithm) 340 | - [EM (Expectation-Maximization)](https://en.wikipedia.org/wiki/Expectation%E2%80%93maximization_algorithm) 341 | - [PageRank](https://en.wikipedia.org/wiki/PageRank) 342 | - [AdaBoost](https://en.wikipedia.org/wiki/AdaBoost) 343 | - [KNN (K-Nearest Neighbors)](https://en.wikipedia.org/wiki/K-nearest_neighbors_algorithm) 344 | - [Naive Bayes](https://en.wikipedia.org/wiki/Naive_Bayes_classifier) 345 | - [CART (Classification and Regression Trees)](https://en.wikipedia.org/wiki/Decision_tree_learning) 346 | #### Modern Data Mining Algorithms 347 | 348 | - [XGBoost (Extreme Gradient Boosting)](https://en.wikipedia.org/wiki/XGBoost) 349 | - [LightGBM (Light Gradient Boosting Machine)](https://en.wikipedia.org/wiki/LightGBM) 350 | - [CatBoost](https://catboost.ai/) 351 | - [HDBSCAN (Hierarchical Density-Based Spatial Clustering of Applications with Noise)](https://en.wikipedia.org/wiki/DBSCAN#HDBSCAN) 352 | - [FP-Growth (Frequent Pattern Growth Algorithm)](https://en.wikipedia.org/wiki/Association_rule_learning#FP-growth_algorithm) 353 | - [Isolation Forest](https://en.wikipedia.org/wiki/Isolation_forest) 354 | - [Deep Embedded Clustering (DEC)](https://arxiv.org/abs/1511.06335) 355 | - [TPU (Top-k Periodic and High-Utility Patterns)](https://arxiv.org/abs/2509.15732) 356 | - [Context-Aware Rule Mining (Transformer-Based 
Framework)](https://arxiv.org/abs/2503.11125) 357 | 358 | 359 | #### Deep Learning architectures 360 | 361 | - [Multilayer Perceptron](https://en.wikipedia.org/wiki/Multilayer_perceptron) 362 | - [Convolutional Neural Network (CNN)](https://en.wikipedia.org/wiki/Convolutional_neural_network) 363 | - [Recurrent Neural Network (RNN)](https://en.wikipedia.org/wiki/Recurrent_neural_network) 364 | - [Boltzmann Machines](https://en.wikipedia.org/wiki/Boltzmann_machine) 365 | - [Autoencoder](https://www.tensorflow.org/tutorials/generative/autoencoder) 366 | - [Generative Adversarial Network (GAN)](https://developers.google.com/machine-learning/gan/gan_structure) 367 | - [Self-Organized Maps](https://en.wikipedia.org/wiki/Self-organizing_map) 368 | - [Transformer](https://www.tensorflow.org/text/tutorials/transformer) 369 | - [Conditional Random Field (CRF)](https://towardsdatascience.com/conditional-random-fields-explained-e5b8256da776) 370 | - [ML System Designs](https://www.evidentlyai.com/ml-system-design) 371 | 372 | ### General Machine Learning Packages 373 | **[`^ back to top ^`](#awesome-data-science)** 374 | 375 | * [scikit-learn](https://scikit-learn.org/) 376 | * [scikit-multilearn](https://github.com/scikit-multilearn/scikit-multilearn) 377 | * [sklearn-expertsys](https://github.com/tmadl/sklearn-expertsys) 378 | * [scikit-feature](https://github.com/jundongl/scikit-feature) 379 | * [scikit-rebate](https://github.com/EpistasisLab/scikit-rebate) 380 | * [seqlearn](https://github.com/larsmans/seqlearn) 381 | * [sklearn-bayes](https://github.com/AmazaspShumik/sklearn-bayes) 382 | * [sklearn-crfsuite](https://github.com/TeamHG-Memex/sklearn-crfsuite) 383 | * [sklearn-deap](https://github.com/rsteca/sklearn-deap) 384 | * [sigopt_sklearn](https://github.com/sigopt/sigopt-sklearn) 385 | * [sklearn-evaluation](https://github.com/edublancas/sklearn-evaluation) 386 | * [scikit-image](https://github.com/scikit-image/scikit-image) 387 | * 
[scikit-opt](https://github.com/guofei9987/scikit-opt) 388 | * [scikit-posthocs](https://github.com/maximtrp/scikit-posthocs) 389 | * [feature-engine](https://feature-engine.trainindata.com/) 390 | * [pystruct](https://github.com/pystruct/pystruct) 391 | * [Shogun](https://www.shogun-toolbox.org/) 392 | * [xLearn](https://github.com/aksnzhy/xlearn) 393 | * [cuML](https://github.com/rapidsai/cuml) 394 | * [causalml](https://github.com/uber/causalml) 395 | * [mlpack](https://github.com/mlpack/mlpack) 396 | * [MLxtend](https://github.com/rasbt/mlxtend) 397 | * [modAL](https://github.com/modAL-python/modAL) 398 | * [Sparkit-learn](https://github.com/lensacom/sparkit-learn) 399 | * [hyperlearn](https://github.com/danielhanchen/hyperlearn) 400 | * [dlib](https://github.com/davisking/dlib) 401 | * [imodels](https://github.com/csinva/imodels) 402 | * [RuleFit](https://github.com/christophM/rulefit) 403 | * [pyGAM](https://github.com/dswah/pyGAM) 404 | * [Deepchecks](https://github.com/deepchecks/deepchecks) 405 | * [scikit-survival](https://scikit-survival.readthedocs.io/en/stable) 406 | * [interpretable](https://pypi.org/project/interpretable) 407 | * [XGBoost](https://github.com/dmlc/xgboost) 408 | * [LightGBM](https://github.com/microsoft/LightGBM) 409 | * [CatBoost](https://github.com/catboost/catboost) 410 | * [PerpetualBooster](https://github.com/perpetual-ml/perpetual) 411 | * [JAX](https://github.com/google/jax) 412 | 413 | 414 | 415 | ### Deep Learning Packages 416 | 417 | #### PyTorch Ecosystem 418 | * [PyTorch](https://github.com/pytorch/pytorch) 419 | * [torchvision](https://github.com/pytorch/vision) 420 | * [torchtext](https://github.com/pytorch/text) 421 | * [torchaudio](https://github.com/pytorch/audio) 422 | * [ignite](https://github.com/pytorch/ignite) 423 | * [PyTorchNet](https://github.com/pytorch/tnt) 424 | * [PyToune](https://github.com/GRAAL-Research/poutyne) 425 | * [skorch](https://github.com/skorch-dev/skorch) 426 | * 
[PyVarInf](https://github.com/ctallec/pyvarinf) 427 | * [pytorch_geometric](https://github.com/pyg-team/pytorch_geometric) 428 | * [GPyTorch](https://github.com/cornellius-gp/gpytorch) 429 | * [pyro](https://github.com/pyro-ppl/pyro) 430 | * [Catalyst](https://github.com/catalyst-team/catalyst) 431 | * [pytorch_tabular](https://github.com/manujosephv/pytorch_tabular) 432 | * [Yolov3](https://github.com/ultralytics/yolov3) 433 | * [Yolov5](https://github.com/ultralytics/yolov5) 434 | * [Yolov8](https://github.com/ultralytics/ultralytics) 435 | 436 | #### TensorFlow Ecosystem 437 | * [TensorFlow](https://github.com/tensorflow/tensorflow) 438 | * [TensorLayer](https://github.com/tensorlayer/TensorLayer) 439 | * [TFLearn](https://github.com/tflearn/tflearn) 440 | * [Sonnet](https://github.com/deepmind/sonnet) 441 | * [tensorpack](https://github.com/tensorpack/tensorpack) 442 | * [TRFL](https://github.com/deepmind/trfl) 443 | * [Polyaxon](https://github.com/polyaxon/polyaxon) 444 | * [NeuPy](https://github.com/itdxer/neupy) 445 | * [tfdeploy](https://github.com/riga/tfdeploy) 446 | * [tensorflow-upstream](https://github.com/ROCmSoftwarePlatform/tensorflow-upstream) 447 | * [TensorFlow Fold](https://github.com/tensorflow/fold) 448 | * [tensorlm](https://github.com/batzner/tensorlm) 449 | * [TensorLight](https://github.com/bsautermeister/tensorlight) 450 | * [Mesh TensorFlow](https://github.com/tensorflow/mesh) 451 | * [Ludwig](https://github.com/ludwig-ai/ludwig) 452 | * [TF-Agents](https://github.com/tensorflow/agents) 453 | * [TensorForce](https://github.com/tensorforce/tensorforce) 454 | 455 | #### Keras Ecosystem 456 | 457 | * [Keras](https://keras.io) 458 | * [keras-contrib](https://github.com/keras-team/keras-contrib) 459 | * [Hyperas](https://github.com/maxpumperla/hyperas) 460 | * [Elephas](https://github.com/maxpumperla/elephas) 461 | * [Hera](https://github.com/keplr-io/hera) 462 | * [Spektral](https://github.com/danielegrattarola/spektral) 463 | * 
[qkeras](https://github.com/google/qkeras) 464 | * [keras-rl](https://github.com/keras-rl/keras-rl) 465 | * [Talos](https://github.com/autonomio/talos) 466 | 467 | #### Visualization Tools 468 | **[`^ back to top ^`](#awesome-data-science)** 469 | 470 | - [altair](https://altair-viz.github.io/) 471 | - [amcharts](https://www.amcharts.com/) 472 | - [anychart](https://www.anychart.com/) 473 | - [bokeh](https://bokeh.org/) 474 | - [Comet](https://www.comet.com/site/products/ml-experiment-tracking/?utm_source=awesome-datascience) 475 | - [slemma](https://slemma.com/) 476 | - [cartodb](https://cartodb.github.io/odyssey.js/) 477 | - [Cube](https://square.github.io/cube/) 478 | - [d3plus](https://d3plus.org/) 479 | - [Data-Driven Documents (D3js)](https://d3js.org/) 480 | - [dygraphs](https://dygraphs.com/) 481 | - [exhibit](https://www.simile-widgets.org/exhibit/) 482 | - [gephi](https://gephi.org/) 483 | - [ggplot2](https://ggplot2.tidyverse.org/) 484 | - [Glue](http://docs.glueviz.org/en/latest/index.html) 485 | - [Google Chart Gallery](https://developers.google.com/chart/interactive/docs/gallery) 486 | - [highcharts](https://www.highcharts.com/) 487 | - [import.io](https://www.import.io/) 488 | - [Matplotlib](https://matplotlib.org/) 489 | - [nvd3](https://nvd3.org/) 490 | - [Netron](https://github.com/lutzroeder/netron) 491 | - [Openrefine](https://openrefine.org/) 492 | - [plot.ly](https://plot.ly/) 493 | - [raw](https://rawgraphs.io) 494 | - [Resseract Lite](https://github.com/abistarun/resseract-lite) 495 | - [Seaborn](https://seaborn.pydata.org/) 496 | - [techanjs](https://techanjs.org/) 497 | - [Timeline](https://timeline.knightlab.com/) 498 | - [variancecharts](https://variancecharts.com/index.html) 499 | - [vida](https://vida.io/) 500 | - [vizzu](https://github.com/vizzuhq/vizzu-lib) 501 | - [Wrangler](http://vis.stanford.edu/wrangler/) 502 | - [r2d3](http://www.r2d3.us/visual-intro-to-machine-learning-part-1/) 503 | - [NetworkX](https://networkx.org/) 504 | - 
[Redash](https://redash.io/) 505 | - [C3](https://c3js.org/) 506 | - [TensorWatch](https://github.com/microsoft/tensorwatch) 507 | - [geomap](https://pypi.org/project/geomap/) 508 | - [Dash](https://plotly.com/dash/) 509 | 510 | ### Miscellaneous Tools 511 | **[`^ back to top ^`](#awesome-data-science)** 512 | 513 | | Link | Description | 514 | | --- | --- | 515 | | [The Data Science Lifecycle Process](https://github.com/dslp/dslp) | The Data Science Lifecycle Process is a process for taking data science teams from Idea to Value repeatedly and sustainably. The process is documented in this repo | 516 | | [Data Science Lifecycle Template Repo](https://github.com/dslp/dslp-repo-template) | Template repository for data science lifecycle project | 517 | | [RexMex](https://github.com/AstraZeneca/rexmex) | A general purpose recommender metrics library for fair evaluation. | 518 | | [ChemicalX](https://github.com/AstraZeneca/chemicalx) | A PyTorch based deep learning library for drug pair scoring. | 519 | | [PyTorch Geometric Temporal](https://github.com/benedekrozemberczki/pytorch_geometric_temporal) | Representation learning on dynamic graphs. | 520 | | [Little Ball of Fur](https://github.com/benedekrozemberczki/littleballoffur) | A graph sampling library for NetworkX with a Scikit-Learn like API. | 521 | | [Karate Club](https://github.com/benedekrozemberczki/karateclub) | An unsupervised machine learning extension library for NetworkX with a Scikit-Learn like API. | 522 | | [ML Workspace](https://github.com/ml-tooling/ml-workspace) | All-in-one web-based IDE for machine learning and data science. The workspace is deployed as a Docker container and is preloaded with a variety of popular data science libraries (e.g., Tensorflow, PyTorch) and dev tools (e.g., Jupyter, VS Code) | 523 | | [Neptune.ai](https://neptune.ai) | Community-friendly platform supporting data scientists in creating and sharing machine learning models. 
Neptune facilitates teamwork, infrastructure management, models comparison and reproducibility. | 524 | | [steppy](https://github.com/minerva-ml/steppy) | Lightweight, Python library for fast and reproducible machine learning experimentation. Introduces very simple interface that enables clean machine learning pipeline design. | 525 | | [steppy-toolkit](https://github.com/minerva-ml/steppy-toolkit) | Curated collection of the neural networks, transformers and models that make your machine learning work faster and more effective. | 526 | | [Datalab from Google](https://cloud.google.com/datalab/docs/) | easily explore, visualize, analyze, and transform data using familiar languages, such as Python and SQL, interactively. | 527 | | [Hortonworks Sandbox](https://www.cloudera.com/downloads/hortonworks-sandbox.html) | is a personal, portable Hadoop environment that comes with a dozen interactive Hadoop tutorials. | 528 | | [R](https://www.r-project.org/) | is a free software environment for statistical computing and graphics. | 529 | | [Tidyverse](https://www.tidyverse.org/) | is an opinionated collection of R packages designed for data science. All packages share an underlying design philosophy, grammar, and data structures. | 530 | | [RStudio](https://www.rstudio.com) | IDE – powerful user interface for R. It’s free and open source, and works on Windows, Mac, and Linux. | 531 | | [Python - Pandas - Anaconda](https://www.anaconda.com) | Completely free enterprise-ready Python distribution for large-scale data processing, predictive analytics, and scientific computing | 532 | | [Pandas GUI](https://github.com/adrotog/PandasGUI) | Pandas GUI | 533 | | [Scikit-Learn](https://scikit-learn.org/stable/) | Machine Learning in Python | 534 | | [NumPy](https://numpy.org/) | NumPy is fundamental for scientific computing with Python. It supports large, multi-dimensional arrays and matrices and includes an assortment of high-level mathematical functions to operate on these arrays. 
| 535 | | [Vaex](https://vaex.io/) | Vaex is a Python library that allows you to visualize large datasets and calculate statistics at high speeds. | 536 | | [SciPy](https://scipy.org/) | SciPy works with NumPy arrays and provides efficient routines for numerical integration and optimization. | 537 | | [Data Science Toolbox](https://www.coursera.org/learn/data-scientists-tools) | Coursera Course | 538 | | [Data Science Toolbox](https://datasciencetoolbox.org/) | Blog | 539 | | [Wolfram Data Science Platform](https://www.wolfram.com/data-science-platform/) | Take numerical, textual, image, GIS or other data and give it the Wolfram treatment, carrying out a full spectrum of data science analysis and visualization and automatically generate rich interactive reports—all powered by the revolutionary knowledge-based Wolfram Language. | 540 | | [Datadog](https://www.datadoghq.com/) | Solutions, code, and devops for high-scale data science. | 541 | | [Variance](https://variancecharts.com/) | Build powerful data visualizations for the web without writing JavaScript | 542 | | [Kite Development Kit](http://kitesdk.org/docs/current/index.html) | The Kite Software Development Kit (Apache License, Version 2.0), or Kite for short, is a set of libraries, tools, examples, and documentation focused on making it easier to build systems on top of the Hadoop ecosystem. | 543 | | [Domino Data Labs](https://www.dominodatalab.com) | Run, scale, share, and deploy your models — without any infrastructure or setup. | 544 | | [Apache Flink](https://flink.apache.org/) | A platform for efficient, distributed, general-purpose data processing. | 545 | | [Apache Hama](https://hama.apache.org/) | Apache Hama is an Apache Top-Level open source project, allowing you to do advanced analytics beyond MapReduce. | 546 | | [Weka](https://ml.cms.waikato.ac.nz/weka/index.html) | Weka is a collection of machine learning algorithms for data mining tasks. 
| 547 | | [Octave](https://www.gnu.org/software/octave/) | GNU Octave is a high-level interpreted language, primarily intended for numerical computations.(Free Matlab) | 548 | | [Apache Spark](https://spark.apache.org/) | Lightning-fast cluster computing | 549 | | [Hydrosphere Mist](https://github.com/Hydrospheredata/mist) | a service for exposing Apache Spark analytics jobs and machine learning models as realtime, batch or reactive web services. | 550 | | [Data Mechanics](https://www.datamechanics.co) | A data science and engineering platform making Apache Spark more developer-friendly and cost-effective. | 551 | | [Caffe](https://caffe.berkeleyvision.org/) | Deep Learning Framework | 552 | | [Torch](http://torch.ch/) | A SCIENTIFIC COMPUTING FRAMEWORK FOR LUAJIT | 553 | | [Nervana's python based Deep Learning Framework](https://github.com/NervanaSystems/neon) | Intel® Nervana™ reference deep learning framework committed to best performance on all hardware. | 554 | | [Skale](https://github.com/skale-me/skale) | High performance distributed data processing in NodeJS | 555 | | [Aerosolve](https://airbnb.io/aerosolve/) | A machine learning package built for humans. | 556 | | [Intel framework](https://github.com/intel/idlf) | Intel® Deep Learning Framework | 557 | | [Datawrapper](https://www.datawrapper.de/) | An open source data visualization platform helping everyone to create simple, correct and embeddable charts. Also at [github.com](https://github.com/datawrapper/datawrapper) | 558 | | [Tensor Flow](https://www.tensorflow.org/) | TensorFlow is an Open Source Software Library for Machine Intelligence | 559 | | [Natural Language Toolkit](https://www.nltk.org/) | An introductory yet powerful toolkit for natural language processing and classification | 560 | | [Annotation Lab](https://www.johnsnowlabs.com/annotation-lab/) | Free End-to-End No-Code platform for text annotation and DL model training/tuning. 
Out-of-the-box support for Named Entity Recognition, Classification, Relation extraction and Assertion Status Spark NLP models. Unlimited support for users, teams, projects, documents. | 561 | | [nlp-toolkit for node.js](https://www.npmjs.com/package/nlp-toolkit) | This module covers some basic NLP principles and implementations. The main focus is performance. When we deal with sample or training data in NLP, we quickly run out of memory. Therefore every implementation in this module is written as a stream to only hold that data in memory that is currently processed at any step. | 562 | | [Julia](https://julialang.org) | high-level, high-performance dynamic programming language for technical computing | 563 | | [IJulia](https://github.com/JuliaLang/IJulia.jl) | a Julia-language backend combined with the Jupyter interactive environment | 564 | | [Apache Zeppelin](https://zeppelin.apache.org/) | Web-based notebook that enables data-driven, interactive data analytics and collaborative documents with SQL, Scala and more | 565 | | [Featuretools](https://github.com/alteryx/featuretools) | An open source framework for automated feature engineering written in Python | 566 | | [Optimus](https://github.com/hi-primus/optimus) | Cleansing, pre-processing, feature engineering, exploratory data analysis and easy ML with PySpark backend. | 567 | | [Albumentations](https://github.com/albumentations-team/albumentations) | A fast and framework-agnostic image augmentation library that implements a diverse set of augmentation techniques. Supports classification, segmentation, and detection out of the box. Was used to win a number of Deep Learning competitions at Kaggle, Topcoder and those that were a part of the CVPR workshops. | 568 | | [DVC](https://github.com/iterative/dvc) | An open-source data science version control system. It helps track, organize and make data science projects reproducible. In its very basic scenario it helps version control and share large data and model files. 
| 569 | | [Lambdo](https://github.com/asavinov/lambdo) | is a workflow engine that significantly simplifies data analysis by combining in one analysis pipeline (i) feature engineering and machine learning (ii) model training and prediction (iii) table population and column evaluation. | 570 | | [Feast](https://github.com/feast-dev/feast) | A feature store for the management, discovery, and access of machine learning features. Feast provides a consistent view of feature data for both model training and model serving. | 571 | | [Polyaxon](https://github.com/polyaxon/polyaxon) | A platform for reproducible and scalable machine learning and deep learning. | 572 | | [UBIAI](https://ubiai.tools) | Easy-to-use text annotation tool for teams with most comprehensive auto-annotation features. Supports NER, relations and document classification as well as OCR annotation for invoice labeling | 573 | | [Trains](https://github.com/allegroai/clearml) | Auto-Magical Experiment Manager, Version Control & DevOps for AI | 574 | | [Hopsworks](https://github.com/logicalclocks/hopsworks) | Open-source data-intensive machine learning platform with a feature store. Ingest and manage features for both online (MySQL Cluster) and offline (Apache Hive) access, train and serve models at scale. | 575 | | [MindsDB](https://github.com/mindsdb/mindsdb) | MindsDB is an Explainable AutoML framework for developers. With MindsDB you can build, train and use state of the art ML models in as simple as one line of code. | 576 | | [Lightwood](https://github.com/mindsdb/lightwood) | A Pytorch based framework that breaks down machine learning problems into smaller blocks that can be glued together seamlessly with an objective to build predictive models with one line of code. 
| 577 | | [AWS Data Wrangler](https://github.com/awslabs/aws-data-wrangler) | An open-source Python package that extends the power of Pandas library to AWS connecting DataFrames and AWS data related services (Amazon Redshift, AWS Glue, Amazon Athena, Amazon EMR, etc). | 578 | | [Amazon Rekognition](https://aws.amazon.com/rekognition/) | AWS Rekognition is a service that lets developers working with Amazon Web Services add image analysis to their applications. Catalog assets, automate workflows, and extract meaning from your media and applications.| 579 | | [Amazon Textract](https://aws.amazon.com/textract/) | Automatically extract printed text, handwriting, and data from any document. | 580 | | [Amazon Lookout for Vision](https://aws.amazon.com/lookout-for-vision/) | Spot product defects using computer vision to automate quality inspection. Identify missing product components, vehicle and structure damage, and irregularities for comprehensive quality control.| 581 | | [Amazon CodeGuru](https://aws.amazon.com/codeguru/) | Automate code reviews and optimize application performance with ML-powered recommendations.| 582 | | [CML](https://github.com/iterative/cml) | An open source toolkit for using continuous integration in data science projects. Automatically train and test models in production-like environments with GitHub Actions & GitLab CI, and autogenerate visual reports on pull/merge requests. 
| 583 | | [Dask](https://dask.org/) | An open source Python library to painlessly transition your analytics code to distributed computing systems (Big Data) | 584 | | [Statsmodels](https://www.statsmodels.org/stable/index.html) | A Python-based inferential statistics, hypothesis testing and regression framework | 585 | | [Gensim](https://radimrehurek.com/gensim/) | An open-source library for topic modeling of natural language text | 586 | | [spaCy](https://spacy.io/) | A performant natural language processing toolkit | 587 | | [Grid Studio](https://github.com/ricklamers/gridstudio) | Grid studio is a web-based spreadsheet application with full integration of the Python programming language. | 588 | |[Python Data Science Handbook](https://github.com/jakevdp/PythonDataScienceHandbook)|Python Data Science Handbook: full text in Jupyter Notebooks| 589 | | [Shapley](https://github.com/benedekrozemberczki/shapley) | A data-driven framework to quantify the value of classifiers in a machine learning ensemble. | 590 | | [DAGsHub](https://dagshub.com) | A platform built on open source tools for data, model and pipeline management. | 591 | | [Deepnote](https://deepnote.com) | A new kind of data science notebook. Jupyter-compatible, with real-time collaboration and running in the cloud. | 592 | | [Valohai](https://valohai.com) | An MLOps platform that handles machine orchestration, automatic reproducibility and deployment. 
| 593 | | [PyMC3](https://docs.pymc.io/) | A Python Library for Probabilistic Programming (Bayesian Inference and Machine Learning) | 594 | | [PyStan](https://pypi.org/project/pystan/) | Python interface to Stan (Bayesian inference and modeling) | 595 | | [hmmlearn](https://pypi.org/project/hmmlearn/) | Unsupervised learning and inference of Hidden Markov Models | 596 | | [Chaos Genius](https://github.com/chaos-genius/chaos_genius/) | ML powered analytics engine for outlier/anomaly detection and root cause analysis | 597 | | [Nimblebox](https://nimblebox.ai/) | A full-stack MLOps platform designed to help data scientists and machine learning practitioners around the world discover, create, and launch multi-cloud apps from their web browser. | 598 | | [Towhee](https://github.com/towhee-io/towhee) | A Python library that helps you encode your unstructured data into embeddings. | 599 | | [LineaPy](https://github.com/LineaLabs/lineapy) | Ever been frustrated with cleaning up long, messy Jupyter notebooks? With LineaPy, an open source Python library, it takes as little as two lines of code to transform messy development code into production pipelines. 
| 600 | | [envd](https://github.com/tensorchord/envd) | 🏕️ machine learning development environment for data science and AI/ML engineering teams | 601 | | [Explore Data Science Libraries](https://kandi.openweaver.com/explore/data-science) | A search engine 🔎 tool to discover & find a curated list of popular & new libraries, top authors, trending project kits, discussions, tutorials & learning resources | 602 | | [MLEM](https://github.com/iterative/mlem) | 🐶 Version and deploy your ML models following GitOps principles | 603 | | [MLflow](https://mlflow.org/) | MLOps framework for managing ML models across their full lifecycle | 604 | | [cleanlab](https://github.com/cleanlab/cleanlab) | Python library for data-centric AI and automatically detecting various issues in ML datasets | 605 | | [AutoGluon](https://github.com/awslabs/autogluon) | AutoML to easily produce accurate predictions for image, text, tabular, time-series, and multi-modal data | 606 | | [Arize AI](https://arize.com/) | Arize AI community tier observability tool for monitoring machine learning models in production and root-causing issues such as data quality and performance drift. | 607 | | [Aureo.io](https://aureo.io) | Aureo.io is a low-code platform that focuses on building artificial intelligence. It provides users with the capability to create pipelines, automations and integrate them with artificial intelligence models – all with their basic data. | 608 | | [ERD Lab](https://www.erdlab.io/) | Free cloud based entity relationship diagram (ERD) tool made for developers. | 609 | | [Arize-Phoenix](https://docs.arize.com/phoenix) | MLOps in a notebook - uncover insights, surface problems, monitor, and fine tune your models. | 610 | | [Comet](https://github.com/comet-ml/comet-examples) | An MLOps platform with experiment tracking, model production management, a model registry, and full data lineage to support your ML workflow from training straight through to production. 
| 611 | | [Opik](https://github.com/comet-ml/opik) | Evaluate, test, and ship LLM applications across your dev and production lifecycles. | 612 | | [Synthical](https://synthical.com) | AI-powered collaborative environment for research. Find relevant papers, create collections to manage bibliography, and summarize content — all in one place | 613 | | [teeplot](https://github.com/mmore500/teeplot) | Workflow tool to automatically organize data visualization output | 614 | | [Streamlit](https://github.com/streamlit/streamlit) | App framework for Machine Learning and Data Science projects | 615 | | [Gradio](https://github.com/gradio-app/gradio) | Create customizable UI components around machine learning models | 616 | | [Weights & Biases](https://github.com/wandb/wandb) | Experiment tracking, dataset versioning, and model management | 617 | | [DVC](https://github.com/iterative/dvc) | Open-source version control system for machine learning projects | 618 | | [Optuna](https://github.com/optuna/optuna) | Automatic hyperparameter optimization software framework | 619 | | [Ray Tune](https://github.com/ray-project/ray) | Scalable hyperparameter tuning library | 620 | | [Apache Airflow](https://github.com/apache/airflow) | Platform to programmatically author, schedule, and monitor workflows | 621 | | [Prefect](https://github.com/PrefectHQ/prefect) | Workflow management system for modern data stacks | 622 | | [Kedro](https://github.com/kedro-org/kedro) | Open-source Python framework for creating reproducible, maintainable data science code | 623 | | [Hamilton](https://github.com/dagworks-inc/hamilton) | Lightweight library to author and manage reliable data transformations | 624 | | [SHAP](https://github.com/slundberg/shap) | Game theoretic approach to explain the output of any machine learning model | 625 | | [InterpretML](https://github.com/interpretml/interpret) | InterpretML implements the Explainable Boosting Machine (EBM), a modern, fully interpretable machine learning 
model based on Generalized Additive Models (GAMs). This open-source package also provides visualization tools for EBMs, other glass-box models, and black-box explanations | 626 | | [LIME](https://github.com/marcotcr/lime) | Explaining the predictions of any machine learning classifier | 627 | | [flyte](https://github.com/flyteorg/flyte) | Workflow automation platform for machine learning | 628 | | [dbt](https://github.com/dbt-labs/dbt-core) | Data build tool | 629 | | [zasper](https://github.com/zasper-io/zasper) | Supercharged IDE for Data Science | 630 | | [skrub](https://github.com/skrub-data/skrub/) | A Python library to ease preprocessing and feature engineering for tabular machine learning | 631 | | [Codeflash](https://www.codeflash.ai/) | Ship Blazing-Fast Python Code — Every Time | 632 | | [Hugging Face](https://huggingface.co/) | Popular open platform for sharing ML models, datasets, and collaborating on NLP and generative AI projects. | 633 | | [Chinese-Elite](https://github.com/anonym-g/Chinese-Elite) | An open-source project that automatically maps relationship networks by parsing public data using LLMs and visualizes it as an interactive graph. | 634 | | [Desbordante](https://github.com/desbordante/desbordante-core/) | An open-source data profiler specifically focused on discovery and validation of complex patterns, such as [numerical association rules](https://colab.research.google.com/github/Desbordante/desbordante-core/blob/main/examples/notebooks/Numerical_Association_Rules.ipynb), [differential dependencies](https://colab.research.google.com/github/Desbordante/desbordante-core/blob/main/examples/notebooks/Differential_Dependencies.ipynb), [denial constraints](https://colab.research.google.com/github/Desbordante/desbordante-core/blob/main/examples/notebooks/Denial_Constraints.ipynb), and more. | 635 | | [RunMat](https://github.com/runmat-org/runmat) | Fast MATLAB-syntax runtime with automatic CPU/GPU execution and fused array kernels. 
| 636 | 637 | 638 | ## Literature and Media 639 | **[`^ back to top ^`](#awesome-data-science)** 640 | 641 | This section includes some additional reading material, channels to watch, and talks to listen to. 642 | 643 | ### Books 644 | **[`^ back to top ^`](#awesome-data-science)** 645 | 646 | - [Data Science From Scratch: First Principles with Python](https://www.amazon.com/Data-Science-Scratch-Principles-Python-dp-1492041130/dp/1492041130/ref=dp_ob_title_bk) 647 | - [Artificial Intelligence with Python - Tutorialspoint](https://www.tutorialspoint.com/artificial_intelligence_with_python/artificial_intelligence_with_python_tutorial.pdf) 648 | - [Machine Learning from Scratch](https://dafriedman97.github.io/mlbook/content/introduction.html) 649 | - [Probabilistic Machine Learning: An Introduction](https://probml.github.io/pml-book/book1.html) 650 | - [How to Lead in Data Science](https://www.manning.com/books/how-to-lead-in-data-science) - Early Access 651 | - [Fighting Churn With Data](https://www.manning.com/books/fighting-churn-with-data) 652 | - [Data Science at Scale with Python and Dask](https://www.manning.com/books/data-science-with-python-and-dask) 653 | - [Python Data Science Handbook](https://jakevdp.github.io/PythonDataScienceHandbook/) 654 | - [The Data Science Handbook: Advice and Insights from 25 Amazing Data Scientists](https://www.thedatasciencehandbook.com/) 655 | - [Think Like a Data Scientist](https://www.manning.com/books/think-like-a-data-scientist) 656 | - [Introducing Data Science](https://www.manning.com/books/introducing-data-science) 657 | - [Practical Data Science with R](https://www.manning.com/books/practical-data-science-with-r) 658 | - [Everyday Data Science](https://www.amazon.com/dp/B08TZ1MT3W/ref=cm_sw_r_cp_apa_fabc_a0ceGbWECF9A8) & [(cheaper PDF version)](https://gum.co/everydaydata) 659 | - [Exploring Data Science](https://www.manning.com/books/exploring-data-science) - free eBook sampler 660 | - [Exploring the Data 
Jungle](https://www.manning.com/books/exploring-the-data-jungle) - free eBook sampler 661 | - [Classic Computer Science Problems in Python](https://www.manning.com/books/classic-computer-science-problems-in-python) 662 | - [Math for Programmers](https://www.manning.com/books/math-for-programmers) - Early access 663 | - [R in Action, Third Edition](https://www.manning.com/books/r-in-action-third-edition) - Early Access 664 | - [Data Science Bookcamp](https://www.manning.com/books/data-science-bookcamp) - Early access 665 | - [Data Science Thinking: The Next Scientific, Technological and Economic Revolution](https://www.springer.com/gp/book/9783319950914) 666 | - [Applied Data Science: Lessons Learned for the Data-Driven Business](https://www.springer.com/gp/book/9783030118204) 667 | - [The Data Science Handbook](https://www.amazon.com/Data-Science-Handbook-Field-Cady/dp/1119092949) 668 | - [Essential Natural Language Processing](https://www.manning.com/books/getting-started-with-natural-language-processing) - Early access 669 | - [Mining Massive Datasets](http://www.mmds.org/) - free e-book complemented by an online course 670 | - [Pandas in Action](https://www.manning.com/books/pandas-in-action) - Early access 671 | - [Genetic Algorithms and Genetic Programming](https://www.taylorfrancis.com/books/9780429141973) 672 | - [Advances in Evolutionary Algorithms](https://www.intechopen.com/books/advances_in_evolutionary_algorithms) - Free Download 673 | - [Genetic Programming: New Approaches and Successful Applications](https://www.intechopen.com/books/genetic-programming-new-approaches-and-successful-applications) - Free Download 674 | - [Evolutionary Algorithms](https://www.intechopen.com/books/evolutionary-algorithms) - Free Download 675 | - [Advances in Genetic Programming, Vol.
3](http://www0.cs.ucl.ac.uk/staff/W.Langdon/aigp3/) - Free Download 676 | - [Genetic Algorithms and Evolutionary Computation](https://www.talkorigins.org/faqs/genalg/genalg.html) - Free Download 677 | - [Convex Optimization](https://web.stanford.edu/~boyd/cvxbook/bv_cvxbook.pdf) - Convex Optimization book by Stephen Boyd - Free Download 678 | - [Data Analysis with Python and PySpark](https://www.manning.com/books/data-analysis-with-python-and-pyspark) - Early Access 679 | - [R for Data Science](https://r4ds.had.co.nz/) 680 | - [Build a Career in Data Science](https://www.manning.com/books/build-a-career-in-data-science) 681 | - [Machine Learning Bookcamp](https://mlbookcamp.com/) - Early access 682 | - [Hands-On Machine Learning with Scikit-Learn, Keras, and TensorFlow, 2nd Edition](https://www.oreilly.com/library/view/hands-on-machine-learning/9781492032632/) 683 | - [Effective Data Science Infrastructure](https://www.manning.com/books/effective-data-science-infrastructure) 684 | - [Practical MLOps: How to Get Ready for Production Models](https://valohai.com/mlops-ebook/) 685 | - [Data Analysis with Python and PySpark](https://www.manning.com/books/data-analysis-with-python-and-pyspark) 686 | - [Regression, a Friendly guide](https://www.manning.com/books/regression-a-friendly-guide) - Early Access 687 | - [Streaming Systems: The What, Where, When, and How of Large-Scale Data Processing](https://www.oreilly.com/library/view/streaming-systems/9781491983867/) 688 | - [Data Science at the Command Line: Facing the Future with Time-Tested Tools](https://www.oreilly.com/library/view/data-science-at/9781491947845/) 689 | - [Machine Learning with Python - Tutorialspoint](https://www.tutorialspoint.com/machine_learning_with_python/machine_learning_with_python_tutorial.pdf) 690 | - [Deep Learning](https://www.deeplearningbook.org/) 691 | - [Designing Cloud Data Platforms](https://www.manning.com/books/designing-cloud-data-platforms) - Early Access 692 | - [An Introduction to 
Statistical Learning with Applications in R](https://www.statlearning.com/) 693 | - [The Elements of Statistical Learning: Data Mining, Inference, and Prediction](https://hastie.su.domains/ElemStatLearn/) 694 | - [Deep Learning with PyTorch](https://www.simonandschuster.com/books/Deep-Learning-with-PyTorch/Eli-Stevens/9781617295263) 695 | - [Neural Networks and Deep Learning](http://neuralnetworksanddeeplearning.com) 696 | - [Deep Learning Cookbook](https://www.oreilly.com/library/view/deep-learning-cookbook/9781491995839/) 697 | - [Introduction to Machine Learning with Python](https://www.oreilly.com/library/view/introduction-to-machine/9781449369880/) 698 | - [Artificial Intelligence: Foundations of Computational Agents, 2nd Edition](https://artint.info/index.html) - Free HTML version 699 | - [The Quest for Artificial Intelligence: A History of Ideas and Achievements](https://ai.stanford.edu/~nilsson/QAI/qai.pdf) - Free Download 700 | - [Graph Algorithms for Data Science](https://www.manning.com/books/graph-algorithms-for-data-science) - Early Access 701 | - [Data Mesh in Action](https://www.manning.com/books/data-mesh-in-action) - Early Access 702 | - [Julia for Data Analysis](https://www.manning.com/books/julia-for-data-analysis) - Early Access 703 | - [Causal Inference for Data Science](https://www.manning.com/books/causal-inference-for-data-science) - Early Access 704 | - [Regular Expression Puzzles and AI Coding Assistants](https://www.manning.com/books/regular-expression-puzzles-and-ai-coding-assistants) by David Mertz 705 | - [Dive into Deep Learning](https://d2l.ai/) 706 | - [Data for All](https://www.manning.com/books/data-for-all) 707 | - [Interpretable Machine Learning: A Guide for Making Black Box Models Explainable](https://christophm.github.io/interpretable-ml-book/) - Free GitHub version 708 | - [Foundations of Data Science](https://www.cs.cornell.edu/jeh/book.pdf) - Free Download 709 | - [Comet for Data Science: Enhance your ability to manage and optimize the
life cycle of your data science project](https://www.amazon.com/Comet-Data-Science-Enhance-optimize/dp/1801814430) 710 | - [Software Engineering for Data Scientists](https://www.manning.com/books/software-engineering-for-data-scientists) - Early Access 711 | - [Julia for Data Science](https://www.manning.com/books/julia-for-data-science) - Early Access 712 | - [An Introduction to Statistical Learning](https://www.statlearning.com/) - Download Page 713 | - [Machine Learning For Absolute Beginners](https://www.amazon.in/Machine-Learning-Absolute-Beginners-Introduction-ebook/dp/B07335JNW1) 714 | - [Unifying Business, Data, and Code: Designing Data Products with JSON Schema](https://learning.oreilly.com/library/view/unifying-business-data/9781098144999/) 715 | - [Grokking Bayes](https://www.manning.com/books/grokking-bayes) 716 | - [Machine Learning Q and AI](https://sebastianraschka.com/books/ml-q-and-ai) 717 | 718 | #### Book Deals (Affiliated) 719 | 720 | - [eBook sale - Save up to 45% on eBooks!](https://www.manning.com/?utm_source=mikrobusiness&utm_medium=affiliate&utm_campaign=ebook_sale_8_8_22) 721 | 722 | - [Causal Machine Learning](https://www.manning.com/books/causal-machine-learning?utm_source=mikrobusiness&utm_medium=affiliate&utm_campaign=book_ness_causal_7_26_22&a_aid=mikrobusiness&a_bid=43a2198b 723 | ) 724 | - [Managing ML Projects](https://www.manning.com/books/managing-machine-learning-projects?utm_source=mikrobusiness&utm_medium=affiliate&utm_campaign=book_thompson_managing_6_14_22) 725 | - [Causal Inference for Data Science](https://www.manning.com/books/causal-inference-for-data-science?utm_source=mikrobusiness&utm_medium=affiliate&utm_campaign=book_ruizdevilla_causal_6_6_22) 726 | - [Data for All](https://www.manning.com/books/data-for-all?utm_source=mikrobusiness&utm_medium=affiliate) 727 | 728 | ### Journals, Publications and Magazines 729 | **[`^ back to top ^`](#awesome-data-science)** 730 | 731 | - [ICML](https://icml.cc/2015/) - 
International Conference on Machine Learning 732 | - [GECCO](https://gecco-2019.sigevo.org/index.html/HomePage) - The Genetic and Evolutionary Computation Conference (GECCO) 733 | - [epjdatascience](https://epjdatascience.springeropen.com/) 734 | - [Journal of Data Science](https://jds-online.org/journal/JDS) - an international journal devoted to applications of statistical methods at large 735 | - [Big Data Research](https://www.journals.elsevier.com/big-data-research) 736 | - [Journal of Big Data](https://journalofbigdata.springeropen.com/) 737 | - [Big Data & Society](https://journals.sagepub.com/home/bds) 738 | - [Data Science Journal](https://www.jstage.jst.go.jp/browse/dsj) 739 | - [datatau.com/news](https://www.datatau.com/news) - Like Hacker News, but for data 740 | - [Data Science Trello Board](https://trello.com/b/rbpEfMld/data-science) 741 | - [Medium Data Science Topic](https://medium.com/tag/data-science) - Data science publications on Medium 742 | - [Towards Data Science Genetic Algorithm Topic](https://towardsdatascience.com/introduction-to-genetic-algorithms-including-example-code-e396e98d8bf3#:~:text=A%20genetic%20algorithm%20is%20a,offspring%20of%20the%20next%20generation.) - Genetic algorithm publications on Towards Data Science 743 | - [Maxim AI](https://getmaxim.ai). Tool for AI Agent Simulation, Evaluation & Observability. 744 | 745 | ### Newsletters 746 | **[`^ back to top ^`](#awesome-data-science)** 747 | 748 | - [DataTalks.Club](https://datatalks.club). A weekly newsletter about data-related things. [Archive](https://us19.campaign-archive.com/home/?u=0d7822ab98152f5afc118c176&id=97178021aa). 749 | - [The Analytics Engineering Roundup](https://roundup.getdbt.com/about). A newsletter about data science. [Archive](https://roundup.getdbt.com/archive). 750 | 751 | 752 | ### Bloggers 753 | **[`^ back to top ^`](#awesome-data-science)** 754 | 755 | - [Wes McKinney](https://wesmckinney.com/archives.html) - Wes McKinney Archives.
756 | - [Matthew Russell](https://miningthesocialweb.com/) - Mining The Social Web. 757 | - [Greg Reda](http://www.gregreda.com/) - Greg Reda Personal Blog 758 | - [Julia Evans](https://jvns.ca/) - Recurse Center alumna 759 | - [Hakan Kardas](https://www.cse.unr.edu/~hkardes/) - Personal Web Page 760 | - [Sean J. Taylor](https://seanjtaylor.com/) - Personal Web Page 761 | - [Drew Conway](http://drewconway.com/) - Personal Web Page 762 | - [Hilary Mason](https://hilarymason.com/) - Personal Web Page 763 | - [Noah Iliinsky](http://complexdiagrams.com/) - Personal Blog 764 | - [Matt Harrison](https://hairysun.com/) - Personal Blog 765 | - [Vamshi Ambati](https://allthingsds.wordpress.com/) - AllThings Data Science 766 | - [Prash Chan](https://www.mdmgeek.com/) - Tech Blog on Master Data Management And Every Buzz Surrounding It 767 | - [Clare Corthell](http://datasciencemasters.org/) - The Open Source Data Science Masters 768 | - [Datawrangling](http://www.datawrangling.org) by Peter Skomoroch - Machine learning, data mining, and more 769 | - [Quora Data Science](https://www.quora.com/topic/Data-Science) - Data Science Questions and Answers from experts 770 | - [Siah](https://openresearch.wordpress.com/) - A PhD student at Berkeley 771 | - [Louis Dorard](https://www.ownml.co/blog/) - A technology guy with a penchant for the web and for data, big and small 772 | - [Machine Learning Mastery](https://machinelearningmastery.com/) - Helping professional programmers confidently apply machine learning algorithms to address complex problems.
773 | - [Daniel Forsyth](https://www.danielforsyth.me/) - Personal Blog 774 | - [Data Science Weekly](https://www.datascienceweekly.org/) - Weekly News Blog 775 | - [Revolution Analytics](https://blog.revolutionanalytics.com/) - Data Science Blog 776 | - [R Bloggers](https://www.r-bloggers.com/) - R Bloggers 777 | - [The Practical Quant](https://practicalquant.blogspot.com/) Big data 778 | - [Yet Another Data Blog](https://yet-another-data-blog.blogspot.com/) Yet Another Data Blog 779 | - [KD Nuggets](https://www.kdnuggets.com/) Data Mining, Analytics, Big Data, Data, Science not a blog a portal 780 | - [Meta Brown](https://www.metabrown.com/blog/) - Personal Blog 781 | - [Data Scientist](https://datascientists.com/) is building the data scientist culture. 782 | - [WhatSTheBigData](https://whatsthebigdata.com/) is some of, all of, or much more than the above and this blog explores its impact on information technology, the business world, government agencies, and our lives. 783 | - [Tevfik Kosar](https://magnus-notitia.blogspot.com/) - Magnus Notitia 784 | - [New Data Scientist](https://newdatascientist.blogspot.com/) How a Social Scientist Jumps into the World of Big Data 785 | - [Harvard Data Science](https://harvarddatascience.com/) - Thoughts on Statistical Computing and Visualization 786 | - [Data Science 101](https://ryanswanstrom.com/datascience101/) - Learning To Be A Data Scientist 787 | - [Kaggle Past Solutions](https://www.chioka.in/kaggle-competition-solutions/) 788 | - [DataScientistJourney](https://datascientistjourney.wordpress.com/category/data-science/) 789 | - [NYC Taxi Visualization Blog](https://chriswhong.github.io/nyctaxi/) 790 | - [Data-Mania](https://www.data-mania.com/) 791 | - [Data-Magnum](https://data-magnum.com/) 792 | - [datascopeanalytics](https://datascopeanalytics.com/blog/) 793 | - [Digital transformation](https://tarrysingh.com/) 794 | - [datascientistjourney](https://datascientistjourney.wordpress.com/category/data-science/) 795 | 
- [Data Mania Blog](https://www.data-mania.com/blog/) - [The File Drawer](https://chris-said.io/) - Chris Said's science blog 796 | - [Emilio Ferrara's web page](http://www.emilio.ferrara.name/) 797 | - [DataNews](https://datanews.tumblr.com/) 798 | - [Reddit TextMining](https://www.reddit.com/r/textdatamining/) 799 | - [Periscopic](https://periscopic.com/#!/news) 800 | - [Hilary Parker](https://hilaryparker.com/) 801 | - [Data Stories](https://datastori.es/) 802 | - [Data Science Lab](https://datasciencelab.wordpress.com/) 803 | - [Meaning of](https://www.kennybastani.com/) 804 | - [Adventures in Data Land](https://blog.smola.org) 805 | - [Dataclysm](https://theblog.okcupid.com/) 806 | - [FlowingData](https://flowingdata.com/) - Visualization and Statistics 807 | - [Calculated Risk](https://www.calculatedriskblog.com/) 808 | - [O'reilly Learning Blog](https://www.oreilly.com/content/topics/oreilly-learning/) 809 | - [Dominodatalab](https://blog.dominodatalab.com/) 810 | - [i am trask](https://iamtrask.github.io/) - A Machine Learning Craftsmanship Blog 811 | - [Vademecum of Practical Data Science](https://datasciencevademecum.wordpress.com/) - Handbook and recipes for data-driven solutions of real-world problems 812 | - [Dataconomy](https://dataconomy.com/) - A blog on the newly emerging data economy 813 | - [Springboard](https://www.springboard.com/blog/) - A blog with resources for data science learners 814 | - [Analytics Vidhya](https://www.analyticsvidhya.com/) - A full-fledged website about data science and analytics study material. 815 | - [Occam's Razor](https://www.kaushik.net/avinash/) - Focused on Web Analytics. 816 | - [Data School](https://www.dataschool.io/) - Data science tutorials for beginners! 817 | - [Colah's Blog](https://colah.github.io) - Blog for understanding Neural Networks! 818 | - [Sebastian's Blog](https://ruder.io/#open) - Blog for NLP and transfer learning! 
819 | - [Distill](https://distill.pub) - Dedicated to clear explanations of machine learning! 820 | - [Chris Albon's Website](https://chrisalbon.com/) - Data Science and AI notes 821 | - [Andrew Carr](https://andrewnc.github.io/blog/blog.html) - Data Science with Esoteric programming languages 822 | - [floydhub](https://blog.floydhub.com/introduction-to-genetic-algorithms/) - Blog for Evolutionary Algorithms 823 | - [Jingles](https://jinglescode.github.io/) - Review and extract key concepts from academic papers 824 | - [nbshare](https://www.nbshare.io/notebooks/data-science/) - Data Science notebooks 825 | - [Loic Tetrel](https://ltetrel.github.io/) - Data science blog 826 | - [Chip Huyen's Blog](https://huyenchip.com/blog/) - ML Engineering, MLOps, and the use of ML in startups 827 | - [Maria Khalusova](https://www.mariakhalusova.com/) - Data science blog 828 | - [Aditi Rastogi](https://medium.com/@aditi2507rastogi) - ML,DL,Data Science blog 829 | - [Santiago Basulto](https://medium.com/@santiagobasulto) - Data Science with Python 830 | - [Akhil Soni](https://medium.com/@akhil0435) - ML, DL and Data Science 831 | - [Akhil Soni](https://akhilworld.hashnode.dev/) - ML, DL and Data Science 832 | - [Applied AI Blogs](https://www.appliedaicourse.com/blog/) - In-depth articles on AI, machine learning, and data science concepts with practical applications. 833 | - [Scaler Blogs](https://www.scaler.com/blog/) - Educational content on software development, AI, and career growth in tech. 
834 | - [MLU-Explain](https://mlu-explain.github.io/) - Developed by Amazon to help people in the ML space learn everything from the basics, with live diagrams 835 | 836 | ### Presentations 837 | **[`^ back to top ^`](#awesome-data-science)** 838 | 839 | - [How to Become a Data Scientist](https://www.slideshare.net/ryanorban/how-to-become-a-data-scientist) 840 | - [Introduction to Data Science](https://www.slideshare.net/NikoVuokko/introduction-to-data-science-25391618) 841 | - [Intro to Data Science for Enterprise Big Data](https://www.slideshare.net/pacoid/intro-to-data-science-for-enterprise-big-data) 842 | - [How to Interview a Data Scientist](https://www.slideshare.net/dtunkelang/how-to-interview-a-data-scientist) 843 | - [How to Share Data with a Statistician](https://github.com/jtleek/datasharing) 844 | - [The Science of a Great Career in Data Science](https://www.slideshare.net/katemats/the-science-of-a-great-career-in-data-science) 845 | - [What Does a Data Scientist Do?](https://www.slideshare.net/datasciencelondon/big-data-sorry-data-science-what-does-a-data-scientist-do) 846 | - [Building Data Start-Ups: Fast, Big, and Focused](https://www.slideshare.net/medriscoll/driscoll-strata-buildingdatastartups25may2011clean) 847 | - [How to win data science competitions with Deep Learning](https://www.slideshare.net/0xdata/how-to-win-data-science-competitions-with-deep-learning) 848 | - [Full-Stack Data Scientist](https://www.slideshare.net/AlexeyGrigorev/fullstack-data-scientist) 849 | 850 | ### Podcasts 851 | **[`^ back to top ^`](#awesome-data-science)** 852 | 853 | - [AI at Home](https://podcasts.apple.com/us/podcast/data-science-at-home/id1069871378) 854 | - [AI Today](https://www.cognilytica.com/aitoday/) 855 | - [Adversarial Learning](https://adversariallearning.com/) 856 | - [Chai Time Data Science](https://www.youtube.com/playlist?list=PLLvvXm0q8zUbiNdoIazGzlENMXvZ9bd3x) 857 | - [Data Engineering Podcast](https://www.dataengineeringpodcast.com/)
858 | - [Data Science at Home](https://datascienceathome.com/) 859 | - [Data Science Mixer](https://community.alteryx.com/t5/Data-Science-Mixer/bg-p/mixer) 860 | - [Data Skeptic](https://dataskeptic.com/) 861 | - [Data Stories](https://datastori.es/) 862 | - [Datacast](https://jameskle.com/writes/category/Datacast) 863 | - [DataFramed](https://www.datacamp.com/community/podcast) 864 | - [DataTalks.Club](https://anchor.fm/datatalksclub) 865 | - [Gradient Descent](https://wandb.ai/fully-connected/gradient-descent) 866 | - [Learning Machines 101](https://www.learningmachines101.com/) 867 | - [Let's Data (Brazil)](https://www.youtube.com/playlist?list=PLn_z5E4dh_Lj5eogejMxfOiNX3nOhmhmM) 868 | - [Linear Digressions](https://lineardigressions.com/) 869 | - [Not So Standard Deviations](https://nssdeviations.com/) 870 | - [O'Reilly Data Show Podcast](https://www.oreilly.com/radar/topics/oreilly-data-show-podcast/) 871 | - [Partially Derivative](http://partiallyderivative.com/) 872 | - [Superdatascience](https://www.superdatascience.com/podcast/) 873 | - [The Data Engineering Show](https://www.dataengineeringshow.com/) 874 | - [The Radical AI Podcast](https://www.radicalai.org/) 875 | - [What's The Point](https://fivethirtyeight.com/tag/whats-the-point/) 876 | - [The Analytics Engineering Podcast](https://roundup.getdbt.com/s/the-analytics-engineering-podcast) 877 | 878 | ### YouTube Videos & Channels 879 | **[`^ back to top ^`](#awesome-data-science)** 880 | 881 | - [What is machine learning?](https://www.youtube.com/watch?v=WXHM_i-fgGo) 882 | - [Andrew Ng: Deep Learning, Self-Taught Learning and Unsupervised Feature Learning](https://www.youtube.com/watch?v=n1ViNeWhC24) 883 | - [Data36 - Data Science for Beginners by Tomi Mester](https://www.youtube.com/c/TomiMesterData36comDataScienceForBeginners) 884 | - [Deep Learning: Intelligence from Big Data](https://www.youtube.com/watch?v=czLI3oLDe8M) 885 | - [Interview with Google's AI and Deep Learning 'Godfather' Geoffrey 
Hinton](https://www.youtube.com/watch?v=1Wp3IIpssEc) 886 | - [Introduction to Deep Learning with Python](https://www.youtube.com/watch?v=S75EdAcXHKk) 887 | - [What is machine learning, and how does it work?](https://www.youtube.com/watch?v=elojMnjn4kk) 888 | - [CampusX](https://www.youtube.com/@campusx-official) 889 | - [Data School](https://www.youtube.com/channel/UCnVzApLJE2ljPZSeQylSEyg) - Data Science Education 890 | - [Neural Nets for Newbies by Melanie Warrick (May 2015)](https://www.youtube.com/watch?v=Cu6A96TUy_o) 891 | - [Neural Networks video series by Hugo Larochelle](https://www.youtube.com/playlist?list=PL6Xpj9I5qXYEcOhn7TqghAJ6NAPrNmUBH) 892 | - [Google DeepMind co-founder Shane Legg - Machine Super Intelligence](https://www.youtube.com/watch?v=evNCyRL3DOU) 893 | - [Data Science Primer](https://www.youtube.com/watch?v=cHzvYxBN9Ls&list=PLPqVjP3T4RIRsjaW07zoGzH-Z4dBACpxY) 894 | - [Data Science with Genetic Algorithms](https://www.youtube.com/watch?v=lpD38NxTOnk) 895 | - [Data Science for Beginners](https://www.youtube.com/playlist?list=PL2zq7klxX5ATMsmyRazei7ZXkP1GHt-vs) 896 | - [DataTalks.Club](https://www.youtube.com/channel/UCDvErgK0j5ur3aLgn6U-LqQ) 897 | - [Mildlyoverfitted - Tutorials on intermediate ML/DL topics](https://www.youtube.com/channel/UCYBSjwkGTK06NnDnFsOcR7g) 898 | - [mlops.community - Interviews of industry experts about production ML](https://www.youtube.com/channel/UCYBSjwkGTK06NnDnFsOcR7g) 899 | - [ML Street Talk - Unabashedly technical and non-commercial, so you will hear no annoying pitches.](https://www.youtube.com/c/machinelearningstreettalk) 900 | - [Neural networks by 3Blue1Brown ](https://www.youtube.com/playlist?list=PLZHQObOWTQDNU6R1_67000Dx_ZCJB-3pi) 901 | - [Neural networks from scratch by Sentdex](https://www.youtube.com/playlist?list=PLQVvvaa0QuDcjD5BAw2DxE6OF2tius3V3) 902 | - [Manning Publications YouTube channel](https://www.youtube.com/c/ManningPublications/featured) 903 | - [Ask Dr Chong: How to Lead in Data Science 
- Part 1](https://youtu.be/JYuQZii5o58) 904 | - [Ask Dr Chong: How to Lead in Data Science - Part 2](https://youtu.be/SzqIXV-O-ko) 905 | - [Ask Dr Chong: How to Lead in Data Science - Part 3](https://youtu.be/Ogwm7k_smTA) 906 | - [Ask Dr Chong: How to Lead in Data Science - Part 4](https://youtu.be/a9usjdzTxTU) 907 | - [Ask Dr Chong: How to Lead in Data Science - Part 5](https://youtu.be/MYdQq-F3Ws0) 908 | - [Ask Dr Chong: How to Lead in Data Science - Part 6](https://youtu.be/LOOt4OVC3hY) 909 | - [Regression Models: Applying simple Poisson regression](https://www.youtube.com/watch?v=9Hk8K8jhiOo) 910 | - [Deep Learning Architectures](https://www.youtube.com/playlist?list=PLv8Cp2NvcY8DpVcsmOT71kymgMmcr59Mf) 911 | - [Time Series Modelling and Analysis](https://www.youtube.com/playlist?list=PL3N9eeOlCrP5cK0QRQxeJd6GrQvhAtpBK) 912 | - [Serrano.Academy](https://www.youtube.com/@SerranoAcademy) 913 | - [End to End Data Science Playlist](https://www.youtube.com/watch?v=S_F_c9e2bz4&list=PLZoTAELRMXVPS-dOaVbAux22vzqdgoGhG) 914 | - [Introduction to Data Science - Linkedin](https://www.linkedin.com/learning/introduction-to-data-science-22668235/beginning-your-data-science-exploration?u=42458916) 915 | 916 | ## Socialize 917 | **[`^ back to top ^`](#awesome-data-science)** 918 | 919 | Below are some Social Media links. Connect with other data scientists! 
920 | 921 | - [Facebook Accounts](#facebook-accounts) 922 | - [Twitter Accounts](#twitter-accounts) 923 | - [Telegram Channels](#telegram-channels) 924 | - [Slack Communities](#slack-communities) 925 | - [GitHub Groups](#github-groups) 926 | - [Data Science Competitions](#data-science-competitions) 927 | 928 | 929 | ### Facebook Accounts 930 | **[`^ back to top ^`](#awesome-data-science)** 931 | 932 | - [Data](https://www.facebook.com/data) 933 | - [Big Data Scientist](https://www.facebook.com/Bigdatascientist) 934 | - [Data Science Day](https://www.facebook.com/datascienceday/) 935 | - [Data Science Academy](https://www.facebook.com/nycdatascience) 936 | - [Facebook Data Science Page](https://www.facebook.com/pages/Data-science/431299473579193?ref=br_rs) 937 | - [Data Science London](https://www.facebook.com/pages/Data-Science-London/226174337471513) 938 | - [Data Science Technology and Corporation](https://www.facebook.com/DataScienceTechnologyCorporation?ref=br_rs) 939 | - [Data Science - Closed Group](https://www.facebook.com/groups/1394010454157077/?ref=br_rs) 940 | - [Center for Data Science](https://www.facebook.com/centerdatasciences?ref=br_rs) 941 | - [Big data hadoop NOSQL Hive Hbase](https://www.facebook.com/groups/bigdatahadoop/) 942 | - [Analytics, Data Mining, Predictive Modeling, Artificial Intelligence](https://www.facebook.com/groups/data.analytics/) 943 | - [Big Data Analytics using R](https://www.facebook.com/groups/434352233255448/) 944 | - [Big Data Analytics with R and Hadoop](https://www.facebook.com/groups/rhadoop/) 945 | - [Big Data Learnings](https://www.facebook.com/groups/bigdatalearnings/) 946 | - [Big Data, Data Science, Data Mining & Statistics](https://www.facebook.com/groups/bigdatastatistics/) 947 | - [BigData/Hadoop Expert](https://www.facebook.com/groups/BigDataExpert/) 948 | - [Data Mining / Machine Learning / AI](https://www.facebook.com/groups/machinelearningforum/) 949 | - [Data Mining/Big Data - Social Network 
Ana](https://www.facebook.com/groups/dataminingsocialnetworks/) 950 | - [Vademecum of Practical Data Science](https://www.facebook.com/datasciencevademecum) 951 | - [Veri Bilimi Istanbul](https://www.facebook.com/groups/veribilimiistanbul/) 952 | - [The Data Science Blog](https://www.facebook.com/theDataScienceBlog/) 953 | 954 | 955 | ### Twitter Accounts 956 | **[`^ back to top ^`](#awesome-data-science)** 957 | 958 | | Twitter | Description | 959 | | --- | --- | 960 | | [Big Data Combine](https://twitter.com/BigDataCombine) | Rapid-fire, live tryouts for data scientists seeking to monetize their models as trading strategies | 961 | | Big Data Mania | Data Viz Wiz, Data Journalist, Growth Hacker, Author of Data Science for Dummies (2015) | 962 | | [Big Data Science](https://twitter.com/analyticbridge) | Big Data, Data Science, Predictive Modeling, Business Analytics, Hadoop, Decision and Operations Research. | 963 | | Charlie Greenbacker | Director of Data Science at @ExploreAltamira | 964 | | [Chris Said](https://twitter.com/Chris_Said) | Data scientist at Twitter | 965 | | [Clare Corthell](https://twitter.com/clarecorthell) | Dev, Design, Data Science @mattermark #hackerei | 966 | | [DADI Charles-Abner](https://twitter.com/DadiCharles) | #datascientist @Ekimetrics. , #machinelearning #dataviz #DynamicCharts #Hadoop #R #Python #NLP #Bitcoin #dataenthousiast | 967 | | [Data Science Central](https://twitter.com/DataScienceCtrl) | Data Science Central is the industry's single resource for Big Data practitioners. | 968 | | [Data Science London](https://twitter.com/ds_ldn) | Data Science. Big Data. Data Hacks. Data Junkies. Data Startups. 
Open Data | 969 | | [Data Science Renee](https://twitter.com/BecomingDataSci) | Documenting my path from SQL Data Analyst pursuing an Engineering Master's Degree to Data Scientist | 970 | | [Data Science Report](https://twitter.com/TedOBrien93) | Mission is to help guide & advance careers in Data Science & Analytics | 971 | | [Data Science Tips](https://twitter.com/datasciencetips) | Tips and Tricks for Data Scientists around the world! #datascience #bigdata | 972 | | [Data Vizzard](https://twitter.com/DataVisualizati) | DataViz, Security, Military | 973 | | [DataScienceX](https://twitter.com/DataScienceX) | | 974 | | deeplearning4j | | 975 | | [DJ Patil](https://twitter.com/dpatil) | White House Data Chief, VP @ RelateIQ. | 976 | | [Domino Data Lab](https://twitter.com/DominoDataLab) | | 977 | | [Drew Conway](https://twitter.com/drewconway) | Data nerd, hacker, student of conflict. | 978 | | Emilio Ferrara | #Networks, #MachineLearning and #DataScience. I work on #Social Media. Postdoc at @IndianaUniv | 979 | | [Erin Bartolo](https://twitter.com/erinbartolo) | Running with #BigData--enjoying a love/hate relationship with its hype. @iSchoolSU #DataScience Program Mgr. | 980 | | [Greg Reda](https://twitter.com/gjreda) | Working @ _GrubHub_ about data and pandas | 981 | | [Gregory Piatetsky](https://twitter.com/kdnuggets) | KDnuggets President, Analytics/Big Data/Data Mining/Data Science expert, KDD & SIGKDD co-founder, was Chief Scientist at 2 startups, part-time philosopher. | 982 | | [Hadley Wickham](https://twitter.com/hadleywickham) | Chief Scientist at RStudio, and an Adjunct Professor of Statistics at the University of Auckland, Stanford University, and Rice University. | 983 | | [Hakan Kardas](https://twitter.com/hakan_kardes) | Data Scientist | 984 | | [Hilary Mason](https://twitter.com/hmason) | Data Scientist in Residence at @accel. 
| 985 | | [Jeff Hammerbacher](https://twitter.com/hackingdata) | ReTweeting about data science | 986 | | [John Myles White](https://twitter.com/johnmyleswhite) | Scientist at Facebook and Julia developer. Author of Machine Learning for Hackers and Bandit Algorithms for Website Optimization. Tweets reflect my views only. | 987 | | [Juan Miguel Lavista](https://twitter.com/BDataScientist) | Principal Data Scientist @ Microsoft Data Science Team | 988 | | [Julia Evans](https://twitter.com/b0rk) | Hacker - Pandas - Data Analyze | 989 | | [Kenneth Cukier](https://twitter.com/kncukier) | The Economist's Data Editor and co-author of Big Data (http://www.big-data-book.com/). | 990 | | Kevin Davenport | Organizer of https://www.meetup.com/San-Diego-Data-Science-R-Users-Group/ | 991 | | [Kevin Markham](https://twitter.com/justmarkham) | Data science instructor, and founder of [Data School](https://www.dataschool.io/) | 992 | | [Kim Rees](https://twitter.com/krees) | Interactive data visualization and tools. Data flaneur. | 993 | | [Kirk Borne](https://twitter.com/KirkDBorne) | DataScientist, PhD Astrophysicist, Top #BigData Influencer. | 994 | | Linda Regber | Data storyteller, visualizations. | 995 | | [Luis Rei](https://twitter.com/lmrei) | PhD Student. Programming, Mobile, Web. Artificial Intelligence, Intelligent Robotics Machine Learning, Data Mining, Natural Language Processing, Data Science. | 996 | | Mark Stevenson | Data Analytics Recruitment Specialist at Salt (@SaltJobs) Analytics - Insight - Big Data - Data science | 997 | | [Matt Harrison](https://twitter.com/__mharrison__) | Opinions of full-stack Python guy, author, instructor, currently playing Data Scientist. Occasional fathering, husbanding, organic gardening. | 998 | | [Matthew Russell](https://twitter.com/ptwobrussell) | Mining the Social Web. 
| 999 | | [Mert Nuhoğlu](https://twitter.com/mertnuhoglu) | Data Scientist at BizQualify, Developer | 1000 | | [Monica Rogati](https://twitter.com/mrogati) | Data @ Jawbone. Turned data into stories & products at LinkedIn. Text mining, applied machine learning, recommender systems. Ex-gamer, ex-machine coder; namer. | 1001 | | [Noah Iliinsky](https://twitter.com/noahi) | Visualization & interaction designer. Practical cyclist. Author of vis books: https://www.oreilly.com/pub/au/4419 | 1002 | | [Paul Miller](https://twitter.com/PaulMiller) | Cloud Computing/ Big Data/ Open Data Analyst & Consultant. Writer, Speaker & Moderator. Gigaom Research Analyst. | 1003 | | [Peter Skomoroch](https://twitter.com/peteskomoroch) | Creating intelligent systems to automate tasks & improve decisions. Entrepreneur, ex-Principal Data Scientist @LinkedIn. Machine Learning, ProductRei, Networks | 1004 | | [Prash Chan](https://twitter.com/MDMGeek) | Solution Architect @ IBM, Master Data Management, Data Quality & Data Governance Blogger. Data Science, Hadoop, Big Data & Cloud. | 1005 | | [Quora Data Science](https://twitter.com/q_datascience) | Quora's data science topic | 1006 | | [R-Bloggers](https://twitter.com/Rbloggers) | Tweet blog posts from the R blogosphere, data science conferences, and (!) open jobs for data scientists. | 1007 | | [Rand Hindi](https://twitter.com/randhindi) | | 1008 | | [Randy Olson](https://twitter.com/randal_olson) | Computer scientist researching artificial intelligence. Data tinkerer. Community leader for @DataIsBeautiful. #OpenScience advocate. | 1009 | | [Recep Erol](https://twitter.com/EROLRecep) | Data Science geek @ UALR | 1010 | | [Ryan Orban](https://twitter.com/ryanorban) | Data scientist, genetic origamist, hardware aficionado | 1011 | | [Sean J. Taylor](https://twitter.com/seanjtaylor) | Social Scientist. Hacker. Facebook Data Science Team. Keywords: Experiments, Causal Inference, Statistics, Machine Learning, Economics. | 1012 | | [Silvia K. 
Spiva](https://twitter.com/silviakspiva) | #DataScience at Cisco | 1013 | | [Harsh B. Gupta](https://twitter.com/harshbg) | Data Scientist at BBVA Compass | 1014 | | [Spencer Nelson](https://twitter.com/spenczar_n) | Data nerd | 1015 | | [Talha Oz](https://twitter.com/tozCSS) | Enjoys ABM, SNA, DM, ML, NLP, HI, Python, Java. Top percentile Kaggler/data scientist | 1016 | | [Tasos Skarlatidis](https://twitter.com/anskarl) | Complex Event Processing, Big Data, Artificial Intelligence and Machine Learning. Passionate about programming and open-source. | 1017 | | [Terry Timko](https://twitter.com/Terry_Timko) | InfoGov; Bigdata; Data as a Service; Data Science; Open, Social & Business Data Convergence | 1018 | | [Tony Baer](https://twitter.com/TonyBaer) | IT analyst with Ovum covering Big Data & data management with some systems engineering thrown in. | 1019 | | [Tony Ojeda](https://twitter.com/tonyojeda3) | Data Scientist , Author , Entrepreneur. Co-founder @DataCommunityDC. Founder @DistrictDataLab. #DataScience #BigData #DataDC | 1020 | | [Vamshi Ambati](https://twitter.com/vambati) | Data Science @ PayPal. #NLP, #machinelearning; PhD, Carnegie Mellon alumni (Blog: https://allthingsds.wordpress.com ) | 1021 | | [Wes McKinney](https://twitter.com/wesmckinn) | Pandas (Python Data Analysis library). | 1022 | | [WileyEd](https://twitter.com/WileyEd) | Senior Manager - @Seagate Big Data Analytics @McKinsey Alum #BigData + #Analytics Evangelist #Hadoop, #Cloud, #Digital, & #R Enthusiast | 1023 | | [WNYC Data News Team](https://twitter.com/datanews) | The data news crew at @WNYC. Practicing data-driven journalism, making it visual, and showing our work. | 1024 | | [Alexey Grigorev](https://twitter.com/Al_Grigor) | Data science author | 1025 | | [İlker Arslan](https://twitter.com/ilkerarslan_35) | Data science author. 
Shares mostly about Julia programming | 1026 | | [INEVITABLE](https://twitter.com/WeAreInevitable) | AI & Data Science Start-up Company based in England, UK | 1027 | 1028 | ### Telegram Channels 1029 | **[`^ back to top ^`](#awesome-data-science)** 1030 | 1031 | - [Open Data Science](https://t.me/opendatascience) – First Telegram Data Science channel. Covering all technical and popular stuff about anything related to Data Science: AI, Big Data, Machine Learning, Statistics, general Math, and applications of the former. 1032 | - [Loss function porn](https://t.me/loss_function_porn) – Beautiful posts on DS/ML themes with video or graphic visualization. 1033 | - [Machinelearning](https://t.me/ai_machinelearning_big_data) – Daily ML news. 1034 | 1035 | 1036 | ### Slack Communities 1037 | **[`^ back to top ^`](#awesome-data-science)** 1038 | 1039 | - [DataTalks.Club](https://datatalks.club) 1040 | 1041 | ### GitHub Groups 1042 | - [Berkeley Institute for Data Science](https://github.com/BIDS) 1043 | 1044 | ### Data Science Competitions 1045 | 1046 | Some data mining competition platforms: 1047 | 1048 | - [Kaggle](https://www.kaggle.com/) 1049 | - [DrivenData](https://www.drivendata.org/) 1050 | - [Analytics Vidhya](https://datahack.analyticsvidhya.com/) 1051 | - [InnoCentive](https://www.innocentive.com/) 1052 | - [Microprediction](https://www.microprediction.com/python-1) 1053 | 1054 | ## Fun 1055 | 1056 | - [Infographics](#infographics) 1057 | - [Datasets](#datasets) 1058 | - [Comics](#comics) 1059 | 1060 | 1061 | ### Infographics 1062 | **[`^ back to top ^`](#awesome-data-science)** 1063 | 1064 | | Preview | Description | 1065 | | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------ | 1066 | | [](https://i.imgur.com/0OoLaa5.png) | [Key differences of a data scientist vs. data engineer](https://searchbusinessanalytics.techtarget.com/feature/Key-differences-of-a-data-scientist-vs-data-engineer) | 1067 | | [](https://s3.amazonaws.com/assets.datacamp.com/blog_assets/DataScienceEightSteps_Full.png) | A visual guide to Becoming a Data Scientist in 8 Steps by [DataCamp](https://www.datacamp.com) [(img)](https://s3.amazonaws.com/assets.datacamp.com/blog_assets/DataScienceEightSteps_Full.png) | 1068 | | [](https://i.imgur.com/FxsL3b8.png) | Mindmap on required skills ([img](https://i.imgur.com/FxsL3b8.png)) | 1069 | | [](https://nirvacana.com/thoughts/wp-content/uploads/2013/07/RoadToDataScientist1.png) | Swami Chandrasekaran made a [Curriculum via Metro map](http://nirvacana.com/thoughts/2013/07/08/becoming-a-data-scientist/). 
| 1070 | | [](https://i.imgur.com/4ZBBvb0.png) | by [@kzawadz](https://twitter.com/kzawadz) via [twitter](https://twitter.com/MktngDistillery/status/538671811991715840) | 1071 | | [](https://i.imgur.com/xLY3XZn.jpg) | By [Data Science Central](https://www.datasciencecentral.com/) | 1072 | | [](https://i.imgur.com/0TydZ4M.png) | Data Science Wars: R vs Python | 1073 | | [](https://i.imgur.com/HnRwlce.png) | How to select statistical or machine learning techniques | 1074 | | [](https://scikit-learn.org/1.5/_downloads/b82bf6cd7438a351f19fac60fbc0d927/ml_map.svg) | [Choosing the Right Estimator](https://scikit-learn.org/1.5/machine_learning_map.html#choosing-the-right-estimator) | 1075 | | [](https://i.imgur.com/uEqMwZa.png) | The Data Science Industry: Who Does What | 1076 | | [](https://i.imgur.com/RsHqY84.png) | Data Science ~~Venn~~ Euler Diagram | 1077 | | [](https://www.springboard.com/blog/wp-content/uploads/2016/03/20160324_springboard_vennDiagram.png) | Different Data Science Skills and Roles from [Springboard](https://www.springboard.com) | 1078 | | [Data Fallacies To Avoid](https://data-literacy.geckoboard.com/poster/) | A simple and friendly way of teaching your non-data scientist/non-statistician colleagues [how to avoid mistakes with data](https://data-literacy.geckoboard.com/poster/). From Geckoboard's [Data Literacy Lessons](https://data-literacy.geckoboard.com/). | 1079 | 1080 | ### Datasets 1081 | **[`^ back to top ^`](#awesome-data-science)** 1082 | 1083 | - [Academic Torrents](https://academictorrents.com/) 1084 | - [ADS-B Exchange](https://www.adsbexchange.com/data-samples/) - Specific datasets for aircraft and Automatic Dependent Surveillance-Broadcast (ADS-B) sources. 1085 | - [hadoopilluminated.com](https://hadoopilluminated.com/hadoop_illuminated/Public_Bigdata_Sets.html) 1086 | - [data.gov](https://catalog.data.gov/dataset) - The home of the U.S. 
Government's open data 1087 | - [United States Census Bureau](https://www.census.gov/) 1088 | - [enigma.com](https://enigma.com/) - Navigate the world of public data - Quickly search and analyze billions of public records published by governments, companies and organizations. 1089 | - [datahub.io](https://datahub.io/) 1090 | - [aws.amazon.com/datasets](https://aws.amazon.com/datasets/) 1091 | - [datacite.org](https://datacite.org/) 1092 | - [The official portal for European data](https://data.europa.eu/en) 1093 | - [NASDAQ:DATA](https://data.nasdaq.com/) - Nasdaq Data Link, a premier source for financial, economic and alternative datasets. 1094 | - [figshare.com](https://figshare.com/) 1095 | - [GeoLite Legacy Downloadable Databases](https://dev.maxmind.com/geoip) 1096 | - [Hugging Face Datasets](https://huggingface.co/datasets) 1097 | - [Quora's Big Datasets Answer](https://www.quora.com/Where-can-I-find-large-datasets-open-to-the-public) 1098 | - [Public Big Data Sets](https://hadoopilluminated.com/hadoop_illuminated/Public_Bigdata_Sets.html) 1099 | - [Kaggle Datasets](https://www.kaggle.com/datasets) 1100 | - [A Deep Catalog of Human Genetic Variation](https://www.internationalgenome.org/data) 1101 | - [A community-curated database of well-known people, places, and things](https://developers.google.com/freebase/) 1102 | - [Google Public Data](https://www.google.com/publicdata/directory) 1103 | - [World Bank Data](https://data.worldbank.org/) 1104 | - [NYC Taxi data](https://chriswhong.github.io/nyctaxi/) 1105 | - [Open Data Philly](https://www.opendataphilly.org/) - Connecting people with data for Philadelphia 1106 | - [grouplens.org](https://grouplens.org/datasets/) - Sample movie (with ratings), book and wiki datasets 1107 | - [UC Irvine Machine Learning Repository](https://archive.ics.uci.edu/ml/) - Data sets suited to machine learning 1108 | - [research-quality data sets](https://web.archive.org/web/20150320022752/https://bitly.com/bundles/hmason/1) by
[Hilary Mason](https://web.archive.org/web/20150501033715/https://bitly.com/u/hmason/bundles) 1109 | - [National Centers for Environmental Information](https://www.ncei.noaa.gov/) 1110 | - [ClimateData.us](https://www.climatedata.us/) (related: [U.S. Climate Resilience Toolkit](https://toolkit.climate.gov/)) 1111 | - [r/datasets](https://www.reddit.com/r/datasets/) 1112 | - [MapLight](https://www.maplight.org/data-series) - Provides a variety of data free of charge for uses that are freely available to the general public. 1113 | - [GHDx](https://ghdx.healthdata.org/) - Institute for Health Metrics and Evaluation - A catalog of health and demographic datasets from around the world, including IHME results 1114 | - [St. Louis Federal Reserve Economic Data - FRED](https://fred.stlouisfed.org/) 1115 | - [New Zealand Institute of Economic Research – Data1850](https://data1850.nz/) 1116 | - [Open Data Sources](https://github.com/datasciencemasters/data) 1117 | - [UNICEF Data](https://data.unicef.org/) 1118 | - [undata](https://data.un.org/) 1119 | - [NASA SocioEconomic Data and Applications Center - SEDAC](https://earthdata.nasa.gov/centers/sedac-daac) 1120 | - [The GDELT Project](https://www.gdeltproject.org/) 1121 | - [Statistics Sweden](https://www.scb.se/en/) 1122 | - [StackExchange Data Explorer](https://data.stackexchange.com) - An open source tool for running arbitrary queries against public data from the Stack Exchange network.
1123 | - [San Francisco Government Open Data](https://datasf.org/opendata/) 1124 | - [IBM Asset Dataset](https://developer.ibm.com/exchanges/data/) 1125 | - [Open Data Index](http://index.okfn.org/) 1126 | - [Public Git Archive](https://github.com/src-d/datasets/tree/master/PublicGitArchive) 1127 | - [GHTorrent](https://ghtorrent.org/) 1128 | - [Microsoft Research Open Data](https://msropendata.com/) 1129 | - [Open Government Data Platform India](https://data.gov.in/) 1130 | - [Google Dataset Search (beta)](https://datasetsearch.research.google.com/) 1131 | - [NAYN.CO Turkish News with categories](https://github.com/naynco/nayn.data) 1132 | - [Covid-19](https://github.com/datasets/covid-19) 1133 | - [Covid-19 Google](https://github.com/google-research/open-covid-19-data) 1134 | - [Enron Email Dataset](https://www.cs.cmu.edu/~./enron/) 1135 | - [5000 Images of Clothes](https://github.com/alexeygrigorev/clothing-dataset) 1136 | - [IBB Open Portal](https://data.ibb.gov.tr/en/) 1137 | - [The Humanitarian Data Exchange](https://data.humdata.org/) 1138 | - [250k+ Job Postings](https://aws.amazon.com/marketplace/pp/prodview-p2554p3tczbes) - An expanding dataset of historical job postings from Luxembourg, from 2020 to today. Free, with 250k+ job postings, hosted on AWS Data Exchange. 1139 | - [FinancialData.Net](https://financialdata.net/documentation) - Financial datasets (stock market data, financial statements, sustainability data, and more). 1140 | - [Google Dataset Search](https://datasetsearch.research.google.com/) – Find datasets across the web.
1141 | 1142 | 1143 | ### Comics 1144 | **[`^ back to top ^`](#awesome-data-science)** 1145 | 1146 | - [Comic compilation](https://medium.com/@nikhil_garg/a-compilation-of-comics-explaining-statistics-data-science-and-machine-learning-eeefbae91277) 1147 | - [Cartoons](https://www.kdnuggets.com/websites/cartoons.html) 1148 | - [Data Science Cartoons](https://www.cartoonstock.com/directory/d/data_science.asp) 1149 | - [Data Science: The XKCD Edition](https://davidlindelof.com/data-science-the-xkcd-edition/) 1150 | 1151 | ## Other Awesome Lists 1152 | 1153 | - Other amazingly awesome lists can be found in the [awesome-awesomeness](https://github.com/bayandin/awesome-awesomeness) 1154 | - [Awesome Machine Learning](https://github.com/josephmisiti/awesome-machine-learning) 1155 | - [lists](https://github.com/jnv/lists) 1156 | - [awesome-dataviz](https://github.com/javierluraschi/awesome-dataviz) 1157 | - [awesome-python](https://github.com/vinta/awesome-python) 1158 | - [Data Science IPython Notebooks.](https://github.com/donnemartin/data-science-ipython-notebooks) 1159 | - [awesome-r](https://github.com/qinwf/awesome-R) 1160 | - [awesome-datasets](https://github.com/awesomedata/awesome-public-datasets) 1161 | - [awesome-Machine Learning & Deep Learning Tutorials](https://github.com/ujjwalkarn/Machine-Learning-Tutorials/blob/master/README.md) 1162 | - [Awesome Data Science Ideas](https://github.com/JosPolfliet/awesome-ai-usecases) 1163 | - [Machine Learning for Software Engineers](https://github.com/ZuzooVn/machine-learning-for-software-engineers) 1164 | - [Community Curated Data Science Resources](https://hackr.io/tutorials/learn-data-science) 1165 | - [Awesome Machine Learning On Source Code](https://github.com/src-d/awesome-machine-learning-on-source-code) 1166 | - [Awesome Community Detection](https://github.com/benedekrozemberczki/awesome-community-detection) 1167 | - [Awesome Graph Classification](https://github.com/benedekrozemberczki/awesome-graph-classification) 
1168 | - [Awesome Decision Tree Papers](https://github.com/benedekrozemberczki/awesome-decision-tree-papers) 1169 | - [Awesome Fraud Detection Papers](https://github.com/benedekrozemberczki/awesome-fraud-detection-papers) 1170 | - [Awesome Gradient Boosting Papers](https://github.com/benedekrozemberczki/awesome-gradient-boosting-papers) 1171 | - [Awesome Computer Vision Models](https://github.com/nerox8664/awesome-computer-vision-models) 1172 | - [Awesome Monte Carlo Tree Search](https://github.com/benedekrozemberczki/awesome-monte-carlo-tree-search-papers) 1173 | - [Glossary of common statistics and ML terms](https://www.analyticsvidhya.com/glossary-of-common-statistics-and-machine-learning-terms/) 1174 | - [100 NLP Papers](https://github.com/mhagiwara/100-nlp-papers) 1175 | - [Awesome Game Datasets](https://github.com/leomaurodesenv/game-datasets#readme) 1176 | - [Data Science Interviews Questions](https://github.com/alexeygrigorev/data-science-interviews) 1177 | - [Awesome Explainable Graph Reasoning](https://github.com/AstraZeneca/awesome-explainable-graph-reasoning) 1178 | - [Top Data Science Interview Questions](https://www.interviewbit.com/data-science-interview-questions/) 1179 | - [Awesome Drug Synergy, Interaction and Polypharmacy Prediction](https://github.com/AstraZeneca/awesome-drug-pair-scoring) 1180 | - [Deep Learning Interview Questions](https://www.adaface.com/blog/deep-learning-interview-questions/) 1181 | - [Top Future Trends in Data Science in 2023](https://medium.com/the-modern-scientist/top-future-trends-in-data-science-in-2023-3e616c8998b8) 1182 | - [How Generative AI Is Changing Creative Work](https://hbr.org/2022/11/how-generative-ai-is-changing-creative-work) 1183 | - [What is generative AI?](https://www.techtarget.com/searchenterpriseai/definition/generative-AI) 1184 | - [Top 100+ Machine Learning Interview Questions (Beginner to Advanced)](https://www.appliedaicourse.com/blog/machine-learning-interview-questions/) 1185 | - [Data Science 
Projects](https://github.com/veb-101/Data-Science-Projects) 1186 | - [Is Data Science a Good Career?](https://www.scaler.com/blog/is-data-science-a-good-career/) 1187 | - [The Future of Data Science: Predictions and Trends](https://www.appliedaicourse.com/blog/future-of-data-science/) 1188 | - [Data Science and Machine Learning: What’s The Difference?](https://www.appliedaicourse.com/blog/data-science-and-machine-learning-whats-the-difference/) 1189 | - [AI in Data Science: Uses, Roles, and Tools](https://www.scaler.com/blog/ai-in-data-science/) 1190 | - [Top 13 Data Science Programming Languages](https://www.appliedaicourse.com/blog/data-science-programming-languages/) 1191 | - [40+ Data Analytics Projects Ideas](https://www.appliedaicourse.com/blog/data-analytics-projects-ideas/) 1192 | - [Best Data Science Courses with Certificates](https://www.appliedaicourse.com/blog/best-data-science-courses/) 1193 | - [Generative AI Models](https://www.appliedaicourse.com/blog/generative-ai-models/) 1194 | - [Awesome Data Analysis](https://github.com/PavelGrigoryevDS/awesome-data-analysis) - A curated list of data analysis tools, libraries and resources. 1195 | 1196 | 1197 | ### Hobby 1198 | - [Awesome Music Production](https://github.com/ad-si/awesome-music-production) 1199 | 1200 | 1201 | 1202 | 1209 | --------------------------------------------------------------------------------