# The Only ML Roadmap That You Will Ever Need!

So, you want to dive deep into **Machine Learning** and do something in the field of **Artificial Intelligence**, but you don't know where to begin. You have come to the right place!
I will put down everything that I know, and the things that I feel you need to know, to get into any kind of data role. This document will also help you explore every broad data role and might help you decide what you should be doing.

## Is ML and Data Science for you?

Before you dive in any deeper, it makes sense to check whether you should even pursue this field. I will jot down a few points, and if they match your personality, you should definitely give it a shot!

- You love mathematics (or at least you don't hate it).
- You like looking at percentages and statistical data, and would love to create such insights if given a chance.
- You have an eye for patterns and notice interesting details that most people miss.
- You don't get scared of large Excel sheets and are okay with cleaning them so that you can work on cool projects with the cleaned data.
- You are a good communicator, or you are comfortable communicating your ideas and findings to people around you (**SUPER IMPORTANT**).
- You are okay with not finding the right answer and have accepted that you may never find one. Your job is to get as close to the correct answer as possible, but it's okay if you don't reach it.
- You like brainstorming and can tolerate working with vague problem statements while you figure out what to do.

It is absolutely fine if not all of the points match the person you are, but if most of them make sense, then you should continue reading.

## The paths to choose from

If you are still reading, I believe you want to give data science and Machine Learning a chance. In this section, I will talk about the paths you can choose from.
Follow the table below to get a basic idea.

| Role | What do they do? |
|----------------|-------------------------------|
| Data Analyst | Analyze the data provided to them to generate insights. Present their findings using visualization tools and presentations. |
| Data Engineer | The data that data analysts work with doesn't come easily. Data engineers are the ones who build the data pipelines that bring the data to the table. |
| Machine Learning Engineer | Data analysts provide insights based on what happened in the past. Machine learning engineers are the ones who build predictive models to forecast the future using present data. |
| MLOps Engineer | Building a model doesn't mean that it will perform well all the time. Every model degrades with time. MLOps engineers are the ones who look after the deployment of the model and maintain its performance. |
| Data Scientist | Someone who can handle most of the roles above has the potential to be a data scientist. They are also more inclined towards mathematical and statistical research to improve model performance. |
| AI Researcher | A data scientist who focuses more on maths and statistics and less on data engineering, but puts a lot of effort into reading about recent advancements in AI and tries to build better and more sophisticated algorithms. They more often than not work in specialized fields like Computer Vision (image-related tasks), Natural Language Processing (text-related tasks), speech recognition, etc. |


## What to learn?

Now that you are aware of the types of roles that exist in the data realm, you might want to know how to get started and what you need to learn for each role.

### Data Analyst

A skilled data analyst has:
- Strong communication skills
- Good presentation skills and the ability to explain visualizations
- Proficiency in SQL
- Proficiency in data visualization tools
- Proficiency in Python (optional but recommended)

So, to be a data analyst, the primary skill you need is **SQL**. There's no substitute for SQL, so prepare it well. There are multiple resources to learn SQL.
There is this amazing playlist from [campusX](https://youtu.be/nopIGY1zJE0?si=__o2fofre1WbdU0W) to learn SQL. You should also check out [Joey Blue](https://www.youtube.com/playlist?list=PLD20298E653A970F8), who has done exceptional work as well.
Practice questions from [Data Lemur](https://datalemur.com/) after you learn SQL, and you are good to go.
You also need to learn a data visualization tool like Power BI or Tableau. I personally prefer Power BI. There's an amazing course by Codebasics to learn Power BI. The link is attached [here](https://www.youtube.com/watch?v=hhZ62IlTxYs&list=PLeo1K3hjS3uva8pk1FI3iK9kCOKQdz1I9).
In case you want to learn Python, you need to learn the fundamentals and a few libraries like NumPy, Pandas and Matplotlib/Seaborn (I prefer Matplotlib). freeCodeCamp is a YouTube channel that provides an enormous amount of knowledge for free!
There's a 12-hour-long [video](https://youtu.be/LHBE6Q9XlzI?si=sjkup2PNM2T8cPb8) that teaches you how to install Python even if you have never done any programming, and it also covers the libraries mentioned above.

Finally, I would also ask you to do something that most people ignore. LEARN **ADVANCED EXCEL**. You might feel that you know how to use Excel; I promise you that you don't.
There are so many ways that Excel can simplify your life; it's just amazing.
Check out this [playlist](https://youtu.be/Vl0H-qTclOg?si=GwkN7sF971wvdJb3) from freeCodeCamp, where they teach you Excel from the basics and progressively move on to the advanced concepts.

If YouTube is not for you and you need structured online courses, I have a few suggestions as well.

- [Google Data Analytics Professional Certificate](https://www.coursera.org/professional-certificates/google-data-analytics)
- [Google Business Intelligence Professional Certificate](https://www.coursera.org/professional-certificates/google-business-intelligence)
- [Google Advanced Data Analytics](https://www.coursera.org/professional-certificates/google-advanced-data-analytics)

Try to do them in the order listed. Coursera offers financial aid, so you can do all three of the courses above for FREE.


### Data Engineer

I will be honest here: the purpose of this roadmap is to make you aware of all the roles present in the data industry. I am not pursuing Data Engineering, so I am not the right person to advise you about it. Yet there are things that I can tell you about data engineering.
It is mostly for people who are very efficient programmers. In fact, I would say data engineering is one of the most programming-heavy roles of all those mentioned above. So, if you are someone who would rather write code than talk to people, data engineering is something that you might like.

A few tools that data engineers use are:

- SQL
- Python
- Scala
- Hadoop/Spark/Kafka for handling big data
- Apache NiFi and Apache Airflow for building ETL pipelines
- Snowflake/Google BigQuery, etc. for data warehousing
- Familiarity with cloud platforms like GCP/AWS
- Familiarity with containerization technologies like Docker and Kubernetes
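
To make the term "ETL pipeline" from the list above a bit more concrete, here is a minimal sketch that uses only Python's standard library. It is not how Airflow or NiFi work internally; it only illustrates the extract, transform and load steps that such tools schedule and monitor. The `orders` table, its columns and the sample rows are made up for illustration.

```python
import csv
import io
import sqlite3

# Hypothetical raw data; a real pipeline would read from files, APIs or queues.
RAW_CSV = """order_id,customer,amount
1,alice,10.50
2,bob,
3,alice,4.25
"""


def extract(text):
    """Extract: parse the raw CSV text into a list of dictionaries."""
    return list(csv.DictReader(io.StringIO(text)))


def transform(rows):
    """Transform: drop rows with a missing amount and convert the types."""
    cleaned = []
    for row in rows:
        if not row["amount"]:
            continue  # skip incomplete records
        cleaned.append((int(row["order_id"]), row["customer"], float(row["amount"])))
    return cleaned


def load(rows, conn):
    """Load: write the cleaned rows into a SQL table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS orders (order_id INTEGER, customer TEXT, amount REAL)"
    )
    conn.executemany("INSERT INTO orders VALUES (?, ?, ?)", rows)
    conn.commit()


def run_pipeline(text=RAW_CSV):
    """Run the three steps against an in-memory SQLite database."""
    conn = sqlite3.connect(":memory:")
    load(transform(extract(text)), conn)
    return conn


if __name__ == "__main__":
    conn = run_pipeline()
    query = "SELECT customer, SUM(amount) FROM orders GROUP BY customer ORDER BY customer"
    for customer, total in conn.execute(query):
        print(customer, total)
```

Keeping each step in its own function is a common way to structure this kind of code, because each step can then be tested or replaced independently.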

I am sorry, but since I don't have much experience with data engineering technologies, I can't help much here. You can still search for resources in this field that you find useful. I have a better option, though, in the **Opportunities** section down below.

### Machine Learning Engineer

If you made it here, I hope you like maths. Data Analytics was communication-heavy and Data Engineering was programming-heavy. The roles that you will see from here on are all mathematics- and statistics-heavy. This shouldn't scare you, but you need to learn a bit of Linear Algebra, Calculus and Probability to do well in these fields and to properly understand how things work. Let's dive deep into it.

If you are a Machine Learning Engineer, your data will probably be prepared by a team of data engineers. Depending on the size of the company, it will either be analyzed by data analysts or, in a small company, you will have to analyze the data yourself before working on your machine learning models. So, to be well prepared, you need to already have the necessary data analyst skills (you can leave out the dashboarding aspects, but they are good to know to keep your options open).
The extra skills that a machine learning engineer has, beyond data analysis skills, are:

- Solid foundations in linear algebra, probability, statistics and calculus.
- Knowledge of some other Python libraries like scikit-learn, Beautiful Soup/spaCy, etc.
- Knowledge of frameworks like TensorFlow/PyTorch.
- Understanding of standard Machine Learning algorithms like Linear Regression, Logistic Regression, K-Nearest Neighbors, Support Vector Machines, Decision Trees, Random Forests, Gaussian Naive Bayes, etc.
- Understanding of concepts like gradient descent, multi-layer perceptrons, feed-forward neural networks, backpropagation, etc. to build the base for understanding deep learning.
- Knowing how to deploy your projects using Streamlit (if possible, learn Flask too).
- Learning about specialized fields like Computer Vision and Natural Language Processing (optional).

There are a lot of free courses on Machine Learning available on the internet. I will point you to the ones that I feel are the most important and that have actually helped me learn.

- [Machine Learning Specialization by Andrew Ng](https://www.coursera.org/specializations/machine-learning-introduction)
- [Deep Learning Specialization by Andrew Ng](https://www.coursera.org/specializations/deep-learning)
- [Data Science Playlist by Codebasics](https://www.youtube.com/watch?v=JL_grPUnXzY&list=PLeo1K3hjS3us_ELKYSj_Fth2tIEkdKXvV)

If you like reading books, you can try:

- Python Machine Learning: Machine Learning and Deep Learning with Python, scikit-learn, and TensorFlow 2 by Sebastian Raschka and Vahid Mirjalili
- Hands-On Machine Learning with Scikit-Learn and TensorFlow: Concepts, Tools, and Techniques to Build Intelligent Systems

For Mathematics and Statistics, you can read:

- Practical Statistics for Data Scientists (O'Reilly)
- Mathematics of Machine Learning by Martin Lotz
- CS229 Lecture Notes by Andrew Ng and Tengyu Ma (this is much more than notes, trust me).

For the advanced readers out there:

- Approaching (Almost) Any Machine Learning Problem by Abhishek Thakur
- The Black Swan: The Impact of the Highly Improbable (you can easily go ahead without reading this, but it is one of the greatest books that I have ever read).

For the ones who want to dive deep into Computer Vision and NLP:

- For NLP, you should follow [Pradip Nichite](https://www.youtube.com/@FutureSmartAI). He is one of my personal favourites.
- For Computer Vision, you should follow [Nicholas Renotte](https://www.youtube.com/@NicholasRenotte). He has done work in NLP as well, but his tutorials on Motion Sensing are amazing.
- For an all-rounder, you can follow [Rob Mulla](https://www.youtube.com/@robmulla).

### MLOps Engineer

This is another field that I know very little about, so I won't say much about it. Yet, in layman's terms:
MLOps Engineer = DevOps Engineer who understands ML

You can follow [Ayush Singh](https://www.youtube.com/@AyushSinghSh) (I know he's amazing; just don't feel that you haven't done anything in your life. You weren't aware. Now you are). Also, be aware that this role is very programming-heavy too, but requires less communication than the other roles. You can look at the [MLOps roadmap](https://marvelousmlops.substack.com/p/mlops-roadmap-2024) by Marvelous MLOps, which I feel is a good place to start.
That's all that I can help you with here. You will find something interesting in the **Opportunities** section below.

### Data Scientist

If you're reading this, know that this role prefers candidates with a master's degree. There is a chance that you can get hired as a data scientist right after your undergraduate degree, but it is extremely slim. Read further only if you wish to go for a master's (if not a PhD).
To become a data scientist, you need to understand the image below.

![Data Science Venn Diagram](https://imgs.search.brave.com/TrMGzN_tHULVIgK8g4XvUXdfo4_--sJ1g8kB7_O3rgs/rs:fit:500:0:0/g:ce/aHR0cHM6Ly93d3cu/a2RudWdnZXRzLmNv/bS93cC1jb250ZW50/L3VwbG9hZHMvZGF0/YS1zY2llbmNlLXZl/bm4tZGlhZ3JhbS1h/c2lsdmVyLTYyOC5q/cGc)

To be a data scientist you need to:
- Know Python and machine learning.
- Understand statistics and mathematics.
- Have domain-specific knowledge.
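
The statistics item in the list above often shows up in practice as a significance test. As a rough sketch, here is a permutation test for the difference in means between two groups, written with only the standard library. The function name, its parameters and the sample numbers are assumptions made up for illustration; in real work you would more likely reach for a statistics library.

```python
import random


def permutation_test(group_a, group_b, n_permutations=10_000, seed=0):
    """Approximate two-sided p-value for the difference in means.

    Repeatedly shuffles the pooled values and counts how often a random
    split produces a difference at least as large as the observed one.
    """
    rng = random.Random(seed)  # seeded so that the result is reproducible
    n_a = len(group_a)
    observed = abs(sum(group_a) / n_a - sum(group_b) / len(group_b))
    pooled = list(group_a) + list(group_b)
    extreme = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)
        mean_a = sum(pooled[:n_a]) / n_a
        mean_b = sum(pooled[n_a:]) / (len(pooled) - n_a)
        if abs(mean_a - mean_b) >= observed:
            extreme += 1
    return extreme / n_permutations


if __name__ == "__main__":
    # Made-up numbers, e.g. minutes spent on two versions of a page.
    old_page = [12, 15, 11, 14, 13, 12, 16, 14]
    new_page = [15, 17, 16, 18, 14, 19, 17, 16]
    print("approximate p-value:", permutation_test(old_page, new_page))
```

A small p-value suggests that the observed difference would be unusual if both groups came from the same distribution. It does not by itself prove that the decision behind the change was the right one.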

Data Scientists use data to help companies make decisions that have a direct impact. So being comfortable with hypothesis testing, and with performing statistical tests to check whether your decisions are correct, is important.

As mentioned above, this isn't an entry-level role and requires expertise. So the best you can do is build expertise in the mathematical and statistical concepts that are important to data science. The books listed above in the Machine Learning Engineer section will help you with that.

### AI Researcher

I consider this to be the top role in the field of AI and machine learning. The ones who are in this role definitely have a master's and, in most cases, a PhD. You need publications in top conferences and a lot of interest in research to actually get to this position. Data scientists use algorithms to solve problems. AI Researchers are the ones who study the underlying mathematics of these algorithms and try to improve them. This is very technical and is only for those who want to study and dissect the field of Artificial Intelligence. This is where I want to reach. I hope some of you do as well. It is very academically heavy and is easily the most mathematical. Yet, if you are very interested in solving mathematical problems and proving theorems, this could be for you.

## What Next?

Thank you for reading this far!
I hope you now have an idea of all the major roles that exist and a rough idea of what you want to pursue. Now, to get a job in the data industry, you need to stand out from the rest. For that, you need to have a portfolio website.

A portfolio website contains all the projects that you have built and helps the recruiter figure out what you are good at in a few seconds.
The resources mentioned above already include a lot of guided projects that you can mention in your resume, but beyond that you can also do a few projects on the side to help you out.

## Opportunities

I talked about a portfolio website, but I did not mention how you can do projects of your own. For that, there's a platform that comes to your rescue. It's called [Kaggle](https://www.kaggle.com/). Here you will find thousands of datasets to practice your data analysis/machine learning skills on, and there are also experienced people sharing their code, which can help you learn the best practices (what more do you need?).

### Hackathons

Showing that you can build an end-to-end solution in a short time is very attractive to potential employers. There are multiple websites where you can find Data Science/Machine Learning hackathons. I will name a few:

- Kaggle
- [MachineHack](https://machinehack.com/)
- [Unstop](https://unstop.com/)

Keep checking these websites regularly, as new opportunities can come up at any time. Stay aware. Knowing about opportunities is a gift that people take for granted.

### Special mentions

There are a few opportunities that I would like to talk about which do not fit the above criteria but could be helpful.

- [AWS AI & ML Scholarship](https://aws.amazon.com/machine-learning/scholarship/): This is a 20-hour-long course that you need to take, after which you have to take an exam and score 80% to be eligible to apply. The top 2000 students who apply get a 4-month-long nanodegree course about Python programming and AI on Udacity by AWS (worth $4000) for FREE. The top 500 who do well in that course get another advanced nanodegree program for free and receive guidance from AWS Mentors for a year.
The last date to complete your application after completing all the tests and course work is **30th September, 2023**. **BE QUICK!**
- [Hamoye](https://hamoye.com/): This one is for the **Data Engineering Enthusiasts** and **MLOps Enthusiasts**. This is why I asked you to scroll to the Opportunities section.
Hamoye is an excellent platform that offers students the chance to study in a competitive environment. They offer you four tracks to choose from:
  - Data Science
  - Data Engineering
  - Cloud Engineering and DevOps
  - Data Storytelling

  This is a 4-month-long internship in which you submit weekly quizzes and assignments and are ranked against every other student who applied. The best performers tend to get internship opportunities (and even if you don't, you will have learnt a lot, made projects to showcase and earned a leaderboard position to flex about).
The last date to apply to the fall cohort 2023 is **8th September, 2023.** Be **QUICKER!!!**

- [Omdena](https://omdena.com/): Once you have built some projects and have enough experience to contribute to projects that actually make an impact, Omdena can help you with that. They offer you a chance to work with people from all around the world for 6-8 weeks on a real-world problem. The project actually gets implemented in the real world and is a very solid addition to your resume. Projects on Omdena are open all year round, and you can apply whenever you feel ready.

## Conclusion

**Whoa**, this was long. Thanks for reading this far. I admit that I am not an experienced professional and I am still learning, but this is everything that I have used (or am still using) to improve myself. I believe all these resources could help you as well. If there's something that you don't agree with, or you want to contribute something that I missed, feel free to raise a PR.
I hope this helps some of you!

~Akash