├── colab.png ├── medium.png ├── setup.py ├── metrics.md ├── resources.md └── README.md /colab.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/moshesipper/Applied-Machine-Learning-Course/HEAD/colab.png -------------------------------------------------------------------------------- /medium.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/moshesipper/Applied-Machine-Learning-Course/HEAD/medium.png -------------------------------------------------------------------------------- /setup.py: -------------------------------------------------------------------------------- 1 | from setuptools import setup, find_packages 2 | 3 | setup( 4 | name='Applied-Machine-Learning-Course', 5 | version='1.0.0', 6 | url='https://github.com/moshesipper/Applied-Machine-Learning-Course', 7 | author='Moshe Sipper', 8 | author_email='sipper@gmail.com', 9 | description='This course covers the applied side of algorithmics in machine learning and deep learning, focusing on hands-on coding experience in Python.', 10 | packages=find_packages(), 11 | ) -------------------------------------------------------------------------------- /metrics.md: -------------------------------------------------------------------------------- 1 | # Machine Learning Metrics: Classification & Regression 2 | 3 | | Type | Metric | Formula / Description | Best Value | When to Use / Notes | 4 | |----------------|------------------|-----------------------------------------------|------------|--------------------------------------------------------------| 5 | | Classification | Accuracy | (TP + TN) / (TP + TN + FP + FN) | 1 | Overall performance; misleading on imbalanced datasets | 6 | | Classification | Balanced Accuracy| (Sensitivity + Specificity) / 2 | 1 | More robust for imbalanced classes | 7 | | Classification | Precision | TP / (TP + FP) | 1 | When false positives are costly 
(e.g., spam filtering) | 8 | | Classification | Recall (Sensitivity)| TP / (TP + FN) | 1 | When false negatives are costly (e.g., medical diagnosis) | 9 | | Classification | Specificity | TN / (TN + FP) | 1 | True negative rate; complements recall | 10 | | Classification | F1 Score | 2 * (Precision * Recall) / (Precision + Recall)| 1 | Balances precision and recall; good for imbalanced datasets | 11 | | Classification | ROC-AUC | Area under ROC curve | 1 | Threshold-independent measure of classification performance | 12 | | Classification | Log Loss | Penalizes confident wrong predictions | 0 | For probabilistic classifiers | 13 | | Classification | Confusion Matrix | Table showing counts of TP, FP, FN, TN | N/A | Detailed breakdown of errors per class | 14 | | Regression | MAE (Mean Absolute Error)| Mean of absolute differences between predicted and actual values | 0 | Simple, interpretable average error | 15 | | Regression | MSE (Mean Squared Error)| Mean of squared differences between predicted and actual values | 0 | Penalizes larger errors more than MAE | 16 | | Regression | RMSE (Root Mean Squared Error)| Square root of MSE | 0 | Same units as target; sensitive to outliers | 17 | | Regression | R² Score | 1 - (Sum of squared residuals / Total sum of squares)| 1 | Proportion of variance explained | 18 | | Regression | Adjusted R² | R² adjusted for number of predictors | 1 | More accurate for comparing models with different features | 19 | 20 | --- 21 | 22 | ## Quick Guide 23 | 24 | | If you care about... 
| Use this metric | 25 | |-----------------------------------|-----------------------------------------| 26 | | Binary classification | Accuracy, F1, ROC-AUC | 27 | | Imbalanced classification | Balanced Accuracy, F1, Precision, Recall| 28 | | Minimizing false positives | Precision | 29 | | Minimizing false negatives | Recall (Sensitivity) | 30 | | Probabilistic output quality | Log Loss, ROC-AUC | 31 | | Regression average error | MAE | 32 | | Penalizing large errors | MSE, RMSE | 33 | | Variance explained by model | R² Score | 34 | | Comparing models with different numbers of predictors | Adjusted R² | 35 | -------------------------------------------------------------------------------- /resources.md: -------------------------------------------------------------------------------- 1 | **Additional Resources** 2 | 3 | Cheat Sheets 4 | 5 | * [Machine Learning Glossary](https://ml-cheatsheet.readthedocs.io/en/latest/index.html) 6 | * [Some Pros and Cons of Basic ML Algorithms, in 2 Minutes](https://medium.com/ai-mind-labs/some-pros-and-cons-of-basic-ml-algorithms-in-2-minutes-1cf7f327147f) 7 | * [Cheat Sheets for Machine Learning and Data Science](https://sites.google.com/view/datascience-cheat-sheets) 8 | * [The Illustrated Machine Learning Website](https://illustrated-machine-learning.github.io/) 9 | 10 | 11 | Vids 12 | 13 | * [John Koza Genetic Programming](https://www.youtube.com/channel/UC9MEHhji3ODbE_e66EgFkew) (YouTube) 14 | * [Guy Katabi - Evolutionary Algorithms](https://www.youtube.com/watch?v=XPx-a6MVne8&ab_channel=guykatabi) (YouTube) \[Guy is a graduate of my course: _Evolutionary Algorithms and Artificial Life_\] 15 | * [StatQuest with Josh Starmer](https://www.youtube.com/user/joshstarmer) 16 | * [ML YouTube Courses](https://github.com/dair-ai/ML-YouTube-Courses) 17 | * [Machine Learning Essentials for Biomedical Data Science: Introduction and ML Basics](https://www.youtube.com/watch?v=Qcgav8NmPxY&list=PLafPhSv1OSDfEqFsBnurxzJbcwZSJA8X4) 18 | * [Artificial Intelligence Under Fire: 
Attacking and Defending Deep Neural Networks](https://www.youtube.com/watch?v=pz4PC-mKCDY) 19 | 20 | 21 | Basic Reads 22 | 23 | * [Genetic and Evolutionary Algorithms and Programming](https://drive.google.com/file/d/0B6G3tbmMcpR4WVBTeDhKa3NtQjg/view?usp=sharing) 24 | * [Choosing Representation, Mutation, and Crossover in Genetic Algorithms 25 | ](https://ieeexplore.ieee.org/document/9942691/interactive) 26 | * [Introduction to Evolutionary Computing](http://www.evolutionarycomputation.org/) (course/book slides) 27 | * [26 Top Machine Learning Interview Questions and Answers: Theory Edition](https://www.blog.confetti.ai/post/26-top-machine-learning-interview-questions-and-answers-theory) 28 | * [10 Popular Machine Learning Algorithms In A Nutshell](https://www.theinsaneapp.com/2021/11/machine-learning-algorithms-for-beginners.html) 29 | * [Machine learning preparatory week @PSL](https://data-psl.github.io/lectures2020/) 30 | * [Neural Networks and Deep Learning](https://www.coursera.org/learn/neural-networks-deep-learning/home/welcome) (coursera) 31 | * [Tinker With a Neural Network in Your Browser](https://playground.tensorflow.org/) 32 | * [Common Machine Learning Algorithms for Beginners](https://www.dezyre.com/article/common-machine-learning-algorithms-for-beginners/202) 33 | 34 | 35 | 36 | Advanced Reads 37 | 38 | * [What can LLMs never do?](https://www.strangeloopcanon.com/p/what-can-llms-never-do) 39 | * [Foundational Challenges in Assuring Alignment and Safety of Large Language Models 40 | ](https://arxiv.org/abs/2404.09932) 41 | * [“Explainability” Is a Poor Band-Aid for Biased AI in Medicine](https://fperrywilson.medium.com/explainability-is-a-poor-band-aid-for-biased-ai-in-medicine-3db62a338857) 42 | * [Some Techniques To Make Your PyTorch Models Train (Much) Faster](https://sebastianraschka.com/blog/2023/pytorch-faster.html) 43 | * [GPT in 60 Lines of NumPy](https://jaykmody.com/blog/gpt-from-scratch/) 44 | * 
[ROC-AUC](https://www.analyticsvidhya.com/blog/2020/06/auc-roc-curve-machine-learning/) 45 | * [Why video games are essential for inventing artificial intelligence](https://togelius.blogspot.co.il/2016/01/why-video-games-are-essential-for.html) 46 | * [Agents](https://huyenchip.com//2025/01/07/agents.html) 47 | 48 | 49 | 50 | Books (🡇 means free to download) 51 | 52 | * M. Sipper, _[Evolved to Win](https://www.moshesipper.com/evolved-to-win.html)_, Lulu, 2011 🡇 53 | * M. Sipper, _[Machine Nature: The Coming Age of Bio-Inspired Computing](https://www.moshesipper.com/machine-nature-the-coming-age-of-bio-inspired-computing.html)_, McGraw-Hill, New York, 2002 54 | * J. Starmer, _[The StatQuest Illustrated Guide to Machine Learning](https://statquest.org/statquest-store/)_, 2022 55 | * A. Narayanan & S. Kapoor, [AI Snake Oil: What Artificial Intelligence Can Do, What It Can’t, and How to Tell the Difference](https://press.princeton.edu/books/hardcover/9780691249131/ai-snake-oil), Princeton University Press, 2024 56 | * L. Vanneschi & S. Silva, [Lectures on Intelligent Systems](https://link.springer.com/content/pdf/10.1007/978-3-031-17922-8.pdf), Springer, 2023 57 | * Simon J.D. Prince, [Understanding Deep Learning](https://udlbook.github.io/udlbook/), MIT Press, 2023 🡇 58 | * G. James, D. Witten, T. Hastie, R. Tibshirani, [An Introduction to Statistical Learning](https://www.statlearning.com/), 2nd edition, 2021 🡇 59 | * A.E. Eiben and J.E. Smith, [_Introduction to Evolutionary Computing_](http://www.cs.vu.nl/~gusz/ecbook/ecbook.html), Springer, 1st edition, 2003, Corr. 2nd printing, 2007 60 | * R. Poli, B. Langdon, & N. McPhee, [_A Field Guide to Genetic Programming_](http://www.gp-field-guide.org.uk/), 2008 🡇 61 | * J. Koza, [_Genetic Programming: On the Programming of Computers by Means of Natural Selection_](http://www.genetic-programming.org/gpbook1toc.html), MIT Press, Cambridge, MA, 1992. 62 | * S. 
Luke, [_Essentials of Metaheuristics_](http://cs.gmu.edu/~sean/book/metaheuristics/), 2013 🡇 63 | * A. Geron, [Hands On Machine Learning with Scikit Learn and TensorFlow](https://github.com/yanshengjia/ml-road/blob/master/resources/Hands%20On%20Machine%20Learning%20with%20Scikit%20Learn%20and%20TensorFlow.pdf), 2017 🡇 64 | * J. VanderPlas, [Python Data Science Handbook](https://jakevdp.github.io/PythonDataScienceHandbook/) 65 | * K. Reitz, [The Hitchhiker’s Guide to Python](https://docs.python-guide.org/) 66 | * M. Nielsen, [Neural Networks and Deep Learning](http://neuralnetworksanddeeplearning.com/index.html) 67 | * Z. Michalewicz & D.B. Fogel, [_How to Solve It: Modern Heuristics_](https://www.springer.com/computer/theoretical+computer+science/foundations+of+computations/book/978-3-540-22494-5), 2nd ed. Revised and Extended, 2004 68 | * Z. Michalewicz. [_Genetic Algorithms + Data Structures = Evolution Programs_](http://www.springeronline.com/sgw/cda/frontpage/0,10735,5-40109-22-1430991-0,00.html). Springer-Verlag, Berlin, 3rd edition, 1996 69 | * D. Floreano & C. Mattiussi, [_Bio-Inspired Artificial Intelligence: Theories, Methods, and Technologies_](http://baibook.epfl.ch/), MIT Press, 2008 70 | * A. Tettamanzi & M. Tomassini, [_Soft Computing: Integrating Evolutionary, Neural, and Fuzzy Systems_](https://www.springer.com/computer/theoretical+computer+science/book/978-3-540-42204-4), Springer-Verlag, Heidelberg, 2001 71 | * M. Mohri, A. Rostamizadeh, and A. 
Talwalkar, [Foundations of Machine Learning](https://www.dropbox.com/s/4fij1xrclwjdu5y/Foundations%20of%20Machine%20Learning%2C%20Mohri%202012.pdf?dl=0), MIT Press, 2012 🡇 72 | 73 | 74 | Software 75 | 76 | * [EC-KitY: Evolutionary Computation Tool Kit in Python with Seamless Machine Learning Integration](https://www.eckity.org/) 77 | * [gplearn: Genetic Programming in Python, with a scikit-learn inspired and compatible API](https://gplearn.readthedocs.io/en/stable/#) 78 | * [LEAP: Library for Evolutionary Algorithms in Python](https://github.com/AureumChaos/LEAP) 79 | * [DEAP: Distributed Evolutionary Algorithms in Python](https://deap.readthedocs.io/en/master/) 80 | * [Swarm Intelligence in Python (Genetic Algorithm, Particle Swarm Optimization, Simulated Annealing, Ant Colony Algorithm, Immune Algorithm, Artificial Fish Swarm Algorithm in Python)](https://github.com/guofei9987/scikit-opt) 81 | * [Scikit-learn: Machine Learning in Python](https://scikit-learn.org/stable/index.html) 82 | * [Mlxtend (machine learning extensions)](https://rasbt.github.io/mlxtend/) 83 | * [PyTorch (deep networks)](https://pytorch.org/) 84 | * [Best-of Machine Learning with Python](https://github.com/ml-tooling/best-of-ml-python) 85 | * [Fundamental concepts of PyTorch through self-contained examples](https://github.com/jcjohnson/pytorch-examples) 86 | * [Faster Python calculations with Numba](https://pythonspeed.com/articles/numba-faster-python) 87 | 88 | 89 | Datasets 90 | 91 | * [Tabular & cleaned (PMLB)](https://github.com/EpistasisLab/pmlb) 92 | * [By domain](https://www.datasetlist.com/) 93 | * [By application](https://github.com/awesomedata/awesome-public-datasets) 94 | * [Search engine](https://datasetsearch.research.google.com/) 95 | * [Kaggle competitions](https://www.kaggle.com/datasets) 96 | * [OpenML](https://www.openml.org/) 97 | * [UCI Machine Learning Repository](https://archive.ics.uci.edu/ml/datasets.php) 98 | * [Image 
Databases](https://homepages.inf.ed.ac.uk/rbf/CVonline/Imagedbase.htm) 99 | * [AWS Open Data Registry](https://registry.opendata.aws/) 100 | * [Wikipedia ML Datasets](https://en.wikipedia.org/wiki/List_of_datasets_for_machine-learning_research) 101 | * [The Big Bad NLP Database](https://datasets.quantumstat.com/) 102 | * [Datasets for Machine Learning and Deep Learning](https://sebastianraschka.com/blog/2021/ml-dl-datasets.html) 103 | * [Browse State-of-the-Art](https://paperswithcode.com/sota) 104 | -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 | # Applied Machine Learning 2 | 3 | 4 | 5 | **This course covers the applied side of algorithmics in machine learning, with some deep learning and evolutionary algorithms thrown in as well.** 6 | 7 | Prerequisites: Design of Algorithms, Algebra 2, Calculus 2, Probability and Statistics 8 | 9 | **[Moshe Sipper’s Cat-a-log of Writings](https://medium.com/@sipper/moshe-sippers-writings-174ed2e861df)** 10 | 11 | **[Some Pros and Cons of Basic ML Algorithms, in 2 Minutes](https://medium.com/ai-mind-labs/some-pros-and-cons-of-basic-ml-algorithms-in-2-minutes-1cf7f327147f)** 12 | 13 | [Additional Resources](resources.md) (Cheat Sheets, Vids, Reads, Books, Software, Datasets) 14 | 15 | 16 | *** 17 | 18 | **Syllabus** 19 | 20 | ❖ Math ❖ Python ❖ Artificial Intelligence ❖ Data Science ❖ Machine Learning Intro ❖ Scikit-learn ❖ ML Models ❖ Decision Trees ❖ Random Forest ❖ Linear Regression ❖ Logistic Regression ❖ Linear Models ❖ Regularization: Ridge & Lasso ❖ AdaBoost ❖ Gradient Boosting ❖ AddGBoost ❖ Ensembles ❖ XGBoost ❖ Comparing ML algorithms ❖ Gradient Descent ❖ SVM ❖ Bayesian ❖ Metrics ❖ Data Leakage ❖ Dimensionality Reduction ❖ Clustering ❖ Hyperparameters ❖ Some Topics in Probability ❖ Feature Importances ❖ Semi-Supervised Learning ❖ Neural Networks ❖ Deep Learning ❖ DL and AI ❖ Evolutionary 
Algorithms: Basics ❖ Evolutionary Algorithms: Advanced ❖ Large Language Models 21 | 22 | 23 | 24 | *** 25 | 26 | **Topics** (in order of instruction) 27 | 28 | (![](colab.png): my Colab notebooks, ![](medium.png): my Medium articles) 29 | 30 | - Math 31 | - Mathematics for ML: [ppt](https://www.webpages.uidaho.edu/vakanski/Courses/Adversarial_Machine_Learning/Spring_2024/Lecture_3_Mathematics_for_Machine_Learning.pptx), [pdf](https://www.webpages.uidaho.edu/vakanski/Courses/Adversarial_Machine_Learning/Spring_2024/Lecture_3_Mathematics_for_Machine_Learning.pdf), [alternate](https://www.cs.ucf.edu/~lboloni/Teaching/CAP4611_Fall2023/slides/01-Prelim-02-MathReview.pdf) 32 | 33 | - Python 34 | - [Python Tutorial With Google Colab](https://colab.research.google.com/github/cs231n/cs231n.github.io/blob/master/python-colab.ipynb) 35 | - [Learn Python Programming](https://pythonbasics.org/) 36 | - [Python](https://www.python.org/downloads/windows/) 37 | - [PyCharm](https://www.jetbrains.com/pycharm/download/#section=windows) 38 | - [Pandas](https://pythonbasics.org/what-is-pandas/) 39 | - [NumPy](https://numpy.org/devdocs/user/absolute_beginners.html) 40 | - [NumPy](https://www.w3schools.com/python/numpy/default.asp) 41 | - [Numba](https://numba.pydata.org/) 42 | - [np.dot](https://numpy.org/doc/stable/reference/generated/numpy.dot.html) vs [loop example](https://colab.research.google.com/drive/1wAfDDyqYkj1izQvn7bDF9tJA4xYlDWzp?usp=sharing) ![](colab.png) 43 | - [The history of Python](https://youtu.be/GfH4QL4VqJ0?si=r12zYzrowJvdP8_V) (try `import this`) 44 | 45 | 46 | - Artificial Intelligence 47 | - [Computing Machinery and Intelligence](https://www.cs.mcgill.ca/~dprecup/courses/AI/Materials/turing1950.pdf) 48 | - [Four Takeaways on the Race to Amass Data for A.I.](https://www.nytimes.com/2024/04/06/technology/ai-data-tech-takeaways.html) 49 | 50 | 51 | - Data Science 52 | - [Data Science 
Landscape](https://github.com/dataprofessor/infographic/blob/master/04-Data-Science-Landscape.JPG) 53 | - [Building the Machine Learning Model](https://github.com/dataprofessor/infographic/blob/master/01-Building-the-Machine-Learning-Model.JPG) 54 | - [21 Most Important (and Must-know) Mathematical Equations in Data Science](https://www.blog.dailydoseofds.com/p/21-most-important-and-must-know-mathematical) 55 | - [spurious correlations](https://tylervigen.com/spurious-correlations) 56 | - [Data Science Has Become a Pseudo-Science](https://www.reddit.com/r/datascience/comments/1lluwlv/data_science_has_become_a_pseudoscience/) 57 | - [Data Visualisation Gallery](https://nrennie.rbind.io/viz-gallery/) 58 | 59 | 60 | - Machine Learning Intro 61 | - [Machine Learning: history, applications, recent successes](https://data-psl.github.io/lectures2020/slides/01_machine_learning_successes) 62 | - [How to avoid machine learning pitfalls](https://arxiv.org/abs/2108.02497) 63 | - [Top 10 machine learning algorithms with their use-cases](https://medium.com/@avikumart_/top-10-machine-learning-algorithms-with-their-use-cases-f1ea2c1dfd6b) 64 | - [Introduction to machine learning](https://data-psl.github.io/lectures2020/slides/02_intro_to_machine_learning) 65 | - [Train/Val/Test](https://glassboxmedicine.com/2019/09/15/best-use-of-train-val-test-splits-with-tips-for-medical-data/) 66 | - [Simple weather example](https://colab.research.google.com/drive/1XShD6G7sPGLXKtto4GBZPLWJoPcJEJBk?usp=sharing) ![](colab.png) 67 | - [Cross-validation](https://scikit-learn.org/stable/modules/cross_validation.html) 68 | - [kfold](https://colab.research.google.com/drive/1Hj17jfBbl0tYBVn6ze0YQ7xxTS5Dr1-D?usp=sharing) ![](colab.png) 69 | - [Overfitting](https://miro.medium.com/v2/resize:fit:640/format:webp/1*aorb7r6PyHhQEZwwT1-_sw.png) 70 | 71 | 72 | - Scikit-learn 73 | - [Machine learning with scikit-learn](https://data-psl.github.io/lectures2020/slides/04_scikit_learn/#1) 74 | - [A Minimal Example 
of Machine Learning (with scikit-learn)](https://medium.com/@sipper/a-minimal-example-of-machine-learning-with-scikit-learn-4e98d5dcc6e7) ![](medium.png) 75 | - [19 Most Elegant Sklearn Tricks I Found After 3 Years of Use](https://pub.towardsai.net/19-most-elegant-sklearn-tricks-i-found-after-3-years-of-use-5bda439fa506) 76 | - [Toy datasets (sklearn)](https://scikit-learn.org/stable/datasets/toy_dataset.html) 77 | 78 | 79 | - ML Models 80 | - [Machine learning models](https://data-psl.github.io/lectures2020/slides/03_machine_learning_models/) 81 | 82 | 83 | - Decision Trees 84 | - [Decision trees](https://youtu.be/_L39rN6gz7Y) 85 | - [Decision trees](https://colab.research.google.com/drive/1wyD94nW0HFvdhCkYLLmkxdVulhZTDD-x?usp=sharing) ![](colab.png) 86 | 87 | 88 | - Random Forest 89 | - [Random Forests](https://youtu.be/J4Wdy0Wc_xQ) 90 | - [Random Forests](https://colab.research.google.com/drive/1yd-QebBMSJOlVP9Pot7DHyJtclPDvHqF?usp=sharing) ![](colab.png) 91 | 92 | 93 | 94 | - Linear Regression 95 | - [Linear Regression](https://youtu.be/PaFPbb66DxQ) 96 | - [LinearReg](https://colab.research.google.com/drive/1fCQjAiEce6hU0osLzCWo73hC9iYva3QK?usp=sharing) ![](colab.png) 97 | - [A Quick Summary of Linear Regression](https://medium.com/analytics-vidhya/a-quick-summary-of-linear-regression-42d1dab85e3e) 98 | 99 | 100 | - Logistic Regression 101 | - [Logistic Regression](https://youtu.be/yIYKR4sgzI8) 102 | - [Logistic Regression](https://www.analyticsvidhya.com/blog/2021/07/an-introduction-to-logistic-regression/) 103 | - [Cross-Entropy Loss](https://towardsdatascience.com/cross-entropy-loss-function-f38c4ec8643e) 104 | - [Shannon Entropy calculator](https://planetcalc.com/2476/) 105 | - [Linear Regression vs Logistic Regression in a Nutshell](https://pub.towardsai.net/linear-regression-vs-logistic-regression-in-a-nutshell-cf708cfe8f92) ![](medium.png) 106 | - [LinVsLog](https://colab.research.google.com/drive/1kfMFdrVpL9NczZdKDfA_zGT0NMr3PMYS?usp=sharing) 
![](colab.png) 107 | - [PolynomialFeatures](https://colab.research.google.com/drive/1zjuhudzOZRCbovLwWSxYLxsJ67V7A5Dt?usp=sharing) ![](colab.png) 108 | 109 | - Linear Models 110 | - [Optimization of linear models](https://data-psl.github.io/lectures2020/slides/05_optimization_linear_models/) 111 | 112 | 113 | - Regularization: Ridge & Lasso 114 | - [Ridge vs. Lasso](https://www.statology.org/when-to-use-ridge-lasso-regression/) 115 | - [notes on regularization in ML](https://www.linkedin.com/feed/update/urn:li:activity:7053809365169971201/) 116 | 117 | 118 | - AdaBoost 119 | - [Adaptive Boosting](https://youtu.be/LsK-xG1cLYA) 120 | 121 | 122 | 123 | - Gradient Boosting 124 | - [Gradient Boosting](https://youtu.be/3CC4N4z3GJc) 125 | - [Why tree gradients give you a boost](https://youtu.be/o6seqpMJSTI?si=i-jfTRuypFI2cUEB) 126 | 127 | 128 | - AddGBoost 129 | - [AddGBoost](https://www.sciencedirect.com/science/article/pii/S2666827021001225) 130 | - [Strong(er) Gradient Boosting](https://medium.com/@sipper/strong-er-gradient-boosting-6eb617566328) ![](medium.png) 131 | 132 | 133 | - Ensembles 134 | - [Two’s Company, Three’s an Ensemble](https://pub.towardsai.net/twos-company-three-s-an-ensemble-99bd06608560) ![](medium.png) 135 | 136 | 137 | 138 | - XGBoost 139 | - [XGBoost](https://youtu.be/OtD8wVaFm6E) 140 | 141 | 142 | - Comparing ML algorithms 143 | - [Comparing supervised learning algorithms](https://www.dataschool.io/comparing-supervised-learning-algorithms/) 144 | - [How to find the best performing Machine Learning algorithm](https://medium.com/analytics-vidhya/how-to-find-the-best-performing-machine-learning-algorithm-dc4eb4ff34b6) 145 | - [`load_boston` removal](https://scikit-learn.org/1.0/modules/generated/sklearn.datasets.load_boston.html) 146 | - [Choosing the right estimator](https://scikit-learn.org/stable/tutorial/machine_learning_map/index.html) 147 | - [Questionable practices in machine learning](https://arxiv.org/abs/2407.12220) 148 | 149 | 150 | - 
Gradient Descent 151 | - [Gradient Descent](https://youtu.be/sDv4f4s2SB8) 152 | - [Least Squares](https://www.mathsisfun.com/data/least-squares-regression.html) 153 | - [Least Squares](https://textbooks.math.gatech.edu/ila/least-squares.html) 154 | - [Stochastic Gradient Descent](https://youtu.be/vMh0zPT0tLI) 155 | - [Stochastic Gradient Descent Algorithm With Python and NumPy](https://realpython.com/gradient-descent-algorithm-python/) 156 | - [Gradient Descent With Momentum](https://towardsdatascience.com/gradient-descent-with-momentum-59420f626c8f) 157 | 158 | 159 | - SVM 160 | - [Support Vector Machine](https://youtu.be/efR1C6CvhmE) 161 | - [Plot different SVM classifiers in the iris dataset](https://scikit-learn.org/stable/auto_examples/svm/plot_iris_svc.html#sphx-glr-auto-examples-svm-plot-iris-svc-py) 162 | - [SVM’s Kernel Trick in a minute](https://medium.com/@asusrishabh/svms-kernel-trick-in-a-minute-bd0554b31ec0) ![](medium.png) 163 | 164 | 165 | - Bayesian 166 | - [Multinomial Naive Bayes](https://youtu.be/O2L2Uv9pdDA) 167 | 168 | 169 | - Metrics 170 | - [sklearn.metrics](https://scikit-learn.org/stable/api/sklearn.metrics.html) 171 | - [Confusion Matrix](https://youtu.be/Kdsp6soqA7o) 172 | - [Confusion Matrix & Metrics in a minute](https://medium.com/@asusrishabh/confusion-matrix-metrics-in-a-minute-e1596872e90b) 173 | - [Sensitivity and Specificity](https://youtu.be/vP06aMoz4v8) 174 | - [Sensitivity and specificity](https://en.wikipedia.org/wiki/Sensitivity_and_specificity) 175 | - [ROC and AUC](https://youtu.be/4jRBRDbJemM) 176 | - [balanced accuracy](https://scikit-learn.org/stable/modules/generated/sklearn.metrics.balanced_accuracy_score.html#balanced-accuracy-score) 177 | - [balanced accuracy](https://www.statology.org/balanced-accuracy/) 178 | - [various metrics from CM](https://en.wikipedia.org/wiki/Confusion_matrix) 179 | - [Summary of metrics](metrics.md) 180 | 181 | 182 | - Data Leakage 183 | - [Data Leakage Basics, with Examples in 
Scikit-Learn](https://pub.aimind.so/data-leakage-basics-with-examples-in-scikit-learn-9c946a6f75b2) ![](medium.png) 184 | - [Could machine learning fuel a reproducibility crisis in science?](https://www.nature.com/articles/d41586-022-02035-w) 185 | 186 | 187 | - Dimensionality Reduction 188 | - [11 Different Uses of Dimensionality Reduction](https://towardsdatascience.com/11-different-uses-of-dimensionality-reduction-4325d62b4fa6) 189 | - [PCA](https://youtu.be/FgakZw6K1QQ) 190 | - [PCA](https://www.sartorius.com/en/knowledge/science-snippets/what-is-principal-component-analysis-pca-and-how-it-is-used-507186) 191 | - [PCA vs LR](https://shephexd.github.io/machine%20learning/2018/07/15/Machine_learning(9)-PCA.html) 192 | - [pca](https://colab.research.google.com/drive/1h6xLxKyEltPwsck-mJ5nQPFkMGYI8VOs?usp=sharing) ![](colab.png) 193 | - [t-SNE](https://youtu.be/NEaUSP4YerM) 194 | - [tsne](https://colab.research.google.com/drive/1vnA5iwWrjDY4AhHL_E86VLq59FwJG2s9?usp=sharing) ![](colab.png) 195 | 196 | 197 | - Clustering 198 | - [Clustering](https://www.geeksforgeeks.org/clustering-in-machine-learning/) 199 | - [Hierarchical clustering](https://youtu.be/7xHsRkOdVwo) 200 | - [K-means clustering](https://youtu.be/4b5d3muPQmA) 201 | - [kmeans](https://colab.research.google.com/drive/1aoiM8cnS_DdNOP2njEsjWcyf6-zHMrJ1?usp=sharing) ![](colab.png) 202 | - [From Data to Clusters: When is Your Clustering Good Enough?](https://medium.com/data-science/from-data-to-clusters-when-is-your-clustering-good-enough-5895440a978a) 203 | 204 | 205 | - Hyperparameters 206 | - [Model Parameter vs. 
Hyperparameter](https://www.youtube.com/watch?v=Qcgav8NmPxY&t=1224s) 207 | - [Hyperparameter tuning](https://medium.com/data-science/hyperparameter-tuning-explained-d0ebb2ba1d35) 208 | - [Hyperparameter tuning](https://medium.com/data-science/hyperparameter-tuning-a-practical-guide-and-template-b3bf0504f095) 209 | - [Optuna](https://optuna.org/) 210 | - [optuna](https://colab.research.google.com/drive/1FbG9yaUNn8EqL1NgLBBoRIx9E5EPBuIQ?usp=sharing) ![](colab.png) 211 | - [Evaluating Hyperparameters in Machine Learning](https://medium.com/@sipper/evaluating-hyperparameters-in-machine-learning-25b7fa09362d) ![](medium.png) 212 | 213 | 214 | - Some Topics in Probability 215 | - [p-values](https://youtu.be/vemZtEM63GY) 216 | - [how to calculate p-values](https://youtu.be/JQc3yx0-Q9E) 217 | - [p-hacking](https://youtu.be/HDCOUXE3HMM) 218 | - [Probability is not Likelihood](https://www.youtube.com/watch?v=pYxNSUDSFH4) 219 | - [t-test](https://youtu.be/0Pd3dc1GcHc) 220 | - [t-test](https://youtu.be/pTmLQvMM-1M) 221 | - [t-test vs p-value](https://askanydifference.com/difference-between-t-test-and-p-value/) 222 | - [scipy ttest](https://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.ttest_ind.html) 223 | - [chi-square](https://youtu.be/ZjdBM7NO7bY) 224 | - [17 Statistical Hypothesis Tests in Python](https://machinelearningmastery.com/statistical-hypothesis-tests-in-python-cheat-sheet/) 225 | - [permutation test](https://youtu.be/GmvpsJHGCxQ) ([AddGBoost](https://www.sciencedirect.com/science/article/pii/S2666827021001225) + [code](https://github.com/moshesipper/AddGBoost)) 226 | - [The Permutation Test: A Data Scientist’s BFF](https://www.cantorsparadise.com/the-permutation-test-a-data-scientists-bff-a8ca64a5adb4) ![](medium.png) 227 | 228 | 229 | - Feature Importances 230 | - [Decision Tree](https://scikit-learn.org/stable/modules/generated/sklearn.tree.DecisionTreeClassifier.html#sklearn.tree.DecisionTreeClassifier.feature_importances_) 231 | - [Feature 
importances with a forest of trees](https://scikit-learn.org/stable/auto_examples/ensemble/plot_forest_importances.html) 232 | - [Explainable AI explained! | #4 SHAP](https://youtu.be/9haIOplEIGM?si=kxF4tpGBwziQFy28) 233 | - [SHAP Values Explained](https://bix-tech.com/how-i-wish-someone-would-explain-shap-values-to-me/) 234 | 235 | 236 | - Semi-Supervised Learning 237 | - [Boosting vs. semi-supervised learning](https://youtu.be/2eU_8ExBzDw?si=pS9qK5YvO3tIFyTz) 238 | - [LabelPropagation](https://scikit-learn.org/stable/modules/generated/sklearn.semi_supervised.LabelPropagation.html) 239 | - [Semi-Supervised Learning: How To Overcome the Lack of Labels](https://dzone.com/articles/semi-supervised-learning-overcome-lack-of-labels) 240 | 241 | 242 | - Neural Networks 243 | - [Neural networks](https://youtube.com/playlist?list=PLblh5JKOoLUIxGDQs4LFFD--41Vzf-ME1) 244 | - [Neural networks](https://www.youtube.com/watch?v=aircAruvnKk&list=PLZHQObOWTQDNU6R1_67000Dx_ZCJB-3pi&pp=iAQB) 245 | - [Tinker With a Neural Network in Your Browser](https://playground.tensorflow.org/) 246 | 247 | 248 | - Deep Learning 249 | - [Neural Networks with À La Carte Selection of Activation Functions](https://arxiv.org/abs/2206.12166) 250 | - [PyTorch](https://pytorch.org/tutorials/beginner/pytorch_with_examples.html) 251 | - [Double Descent](https://mlu-explain.github.io/double-descent/) 252 | - [Overparameterization, Backpropagation, Alimentation: Them and Us](https://pub.aimind.so/overparameterization-backpropagation-alimentation-them-and-us-bb326edb60c2) ![](medium.png) 253 | - [No, Kernels & Filters Are Not The Same](https://medium.com/data-science/no-kernels-filters-are-not-the-same-b230ec192ac9) 254 | - [conv demo](https://deeplizard.com/resource/pavq7noze2) 255 | - [convolution](https://www.linkedin.com/posts/pascalbornet_artificialintelligence-ugcPost-6925288775740776448-0S-K/) 256 | - [A simple image convolution](https://youtube.com/shorts/4xWpQe3G9qI?si=RUwQHsAK4oP2bvc1) 257 | - 
[Implementing Image Processing Kernels from scratch using Convolution in Python](https://medium.com/@sabribarac/implementing-image-processing-kernels-from-scratch-using-convolution-in-python-4e966e9aafaf) 258 | - [Introduction to image generation (diffusion)](https://www.youtube.com/watch?v=kzxz8CO_oG4) 259 | - [Loss is Boss](https://levelup.gitconnected.com/loss-is-boss-01ec08dea9e0) ![](medium.png) and other articles in the [DL section](https://medium.com/@sipper/moshe-sippers-writings-174ed2e861df) ![](medium.png) 260 | - [Neural Networks from Scratch](https://github.com/DorsaRoh/Machine-Learning) 261 | 262 | 263 | - DL and AI 264 | - [Growth of AI computing](https://twitter.com/pmddomingos/status/1535112033137401857) 265 | - [AI move from Academia](https://twitter.com/GaryMarcus/status/1536150812795121664) 266 | - [Artificial General Intelligence Is Not as Imminent as You Might Think](https://www.scientificamerican.com/article/artificial-general-intelligence-is-not-as-imminent-as-you-might-think1/) 267 | 268 | 269 | 270 | - Evolutionary Algorithms: Basics 271 | - [Evolutionary Computation](http://www.evolutionarycomputation.org/slides/) 272 | - [How to Build a Genetic Algorithm from Scratch in Python with Just 33 Lines of Code](https://levelup.gitconnected.com/tiny-genetic-algorithm-33-line-version-and-3-line-version-38a851141512) ![](medium.png) 273 | - [Evolutionary Algorithms, Genetic Programming, and Learning](https://medium.com/@sipper/evolutionary-algorithms-genetic-programming-and-learning-dfde441ad0b9) ![](medium.png) 274 | - [Tiny GA](https://github.com/moshesipper/tiny_ga) 275 | - [EC-KitY](https://www.eckity.org/) 276 | - [Genetic Programming slides](http://www.genetic-programming.com/c2003lecture1modified.ppt) 277 | - [Tiny GP](https://github.com/moshesipper/tiny_gp) 278 | - [GP Tutorial](http://www.genetic-programming.com/gecco2003tutorial.pdf) 279 | - [GP vids](https://www.youtube.com/channel/UC9MEHhji3ODbE_e66EgFkew) 280 | - [GP 
slides](https://coinse.github.io/assets/files/teaching/cs454/cs454-slide09.pdf) 281 | 282 | 283 | - Evolutionary Algorithms: Advanced 284 | - [Multi-Objective Optimization](https://engineering.purdue.edu/~sudhoff/ee630/Lecture09.pdf) 285 | - [Schema theorem](https://engineering.purdue.edu/~sudhoff/ee630/Lecture03.pdf) 286 | - [Linear GP](http://www.am.chalmers.se/~wolff/AI2/Lect05LGP.pdf) 287 | - [Cartesian GP](http://cs.ijs.si/ppsn2014/files/slides/ppsn2014-tutorial3-miller.pdf) 288 | - [Grammatical Evolution](https://web.archive.org/web/20110721124315/http:/www.grammaticalevolution.org/tutorial.pdf) 289 | - [Coevolutionary Computation](https://medium.com/the-generator/coevolutionary-computation-fb719304d12e) ![](medium.png) 290 | - [New Pathways in Coevolutionary Computation](https://arxiv.org/abs/2401.10515) 291 | - [Novelty search](https://www.cs.ucf.edu/~gitars/cap6671-2010/Presentations/lehman_alife08.pdf) 292 | - [Humies](https://www.human-competitive.org/) 293 | - [Evolutionary Art](https://medium.com/the-generator/evolutionary-art-00460707d529) ![](medium.png) 294 | - [Building Activation Functions for Deep Networks](https://medium.com/@sipper/building-activation-functions-for-deep-networks-82c2a9c9cc1f) ![](medium.png) 295 | - [Evolutionary Adversarial Attacks on Deep Networks](https://medium.com/@sipper/evolutionary-adversarial-attacks-on-deep-networks-ff622b8e15e5) ![](medium.png) 296 | 297 | 298 | - Large Language Models 299 | - [Introduction to large language models](https://www.youtube.com/watch?v=zizonToFXDs) 300 | - [A Tiny Large Language Model (LLM), Coded, and Hallucinating](https://medium.com/@sipper/a-tiny-large-language-model-llm-coded-and-hallucinating-9a427b04eb1a) ![](medium.png) 301 | - [Large Language Models from scratch](https://youtu.be/lnA9DMvHtfI?si=mjdh3ts-qrblJdjI) 302 | - [Large Language Models: Part 2](https://youtu.be/YDiSFS-yHwk?si=7iD0M7QwGg0nifuq) 303 | - [Scikit-LLM](https://github.com/iryna-kondr/scikit-llm) 304 | - [Coding 
a ChatGPT Like Transformer From Scratch in PyTorch](https://youtu.be/C9QSpl5nmrY?si=4ysM62zpwGpSYiMD) 305 | - [Word Embeddings: Encoding Lexical Semantics](https://pytorch.org/tutorials/beginner/nlp/word_embeddings_tutorial.html?highlight=embeddings) 306 | - [The Magic Behind Embedding Models](https://medium.com/data-science/the-magic-behind-embedding-models-c3af62f71fb) 307 | - [What are the query, key, and value vectors?](https://rahulrajpvr7d.medium.com/what-are-the-query-key-and-value-vectors-5656b8ca5fa0) 308 | - [Unpacking the Query, Key, and Value of Transformers: An Analogy to Database Operations](https://www.linkedin.com/pulse/unpacking-query-key-value-transformers-analogy-database-mohamed-nabil/) 309 | --------------------------------------------------------------------------------
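The formulas summarized in metrics.md can be sketched in plain Python with no external dependencies. The function names `classification_metrics` and `regression_metrics` and the example counts below are illustrative assumptions, not part of the course code; in practice `sklearn.metrics` (linked under Metrics) provides tested implementations. The sketch makes no attempt to guard against zero denominators (e.g., when there are no predicted positives).

```python
import math


def classification_metrics(tp, fp, fn, tn):
    """Binary-classification metrics from confusion-matrix counts.

    Hypothetical helper following the formulas in metrics.md;
    it does not handle zero denominators.
    """
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)        # a.k.a. sensitivity
    specificity = tn / (tn + fp)   # true negative rate
    return {
        "accuracy": (tp + tn) / (tp + tn + fp + fn),
        "balanced_accuracy": (recall + specificity) / 2,
        "precision": precision,
        "recall": recall,
        "specificity": specificity,
        "f1": 2 * precision * recall / (precision + recall),
    }


def regression_metrics(y_true, y_pred):
    """MAE, MSE, RMSE, and R^2 for two equal-length sequences of numbers."""
    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]
    mse = sum(e * e for e in errors) / n
    mean_true = sum(y_true) / n
    ss_res = sum(e * e for e in errors)                  # sum of squared residuals
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)   # total sum of squares
    return {
        "mae": sum(abs(e) for e in errors) / n,
        "mse": mse,
        "rmse": math.sqrt(mse),
        "r2": 1 - ss_res / ss_tot,
    }


if __name__ == "__main__":
    # Made-up imbalanced example: 10 positives among 100 samples.
    for name, value in classification_metrics(tp=1, fp=1, fn=9, tn=89).items():
        print(f"{name}: {value:.3f}")
    # Made-up regression example with one large error.
    for name, value in regression_metrics([1, 2, 3, 4], [1, 2, 3, 6]).items():
        print(f"{name}: {value:.3f}")
```

In the imbalanced example, accuracy looks high while recall and balanced accuracy expose the missed positives, which is the reason the table flags accuracy as misleading on imbalanced datasets.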