This is a compilation of materials on machine and deep learning, collected with the aim of developing a training course for life-science and chemistry researchers.

- [Articles, reviews and tutorials](#articles-reviews-and-tutorials)
- [Books](#books)
- [Courses](#courses)
- [Resources](#resources)
- [References](#references)


## Articles, reviews and tutorials

Year | First author | Title / Link | Journal
-----|--------------|--------------|--------
2021 | Jimenez-Luna | [Artificial intelligence in drug discovery: Recent advances and future perspectives](https://www.tandfonline.com/doi/abs/10.1080/17460441.2021.1909567) | Expert Opinion on Drug Discovery
2021 | Bender | [Artificial intelligence in drug discovery: what is realistic, what are illusions? Part 1: Ways to make an impact, and why we are not there yet](https://www.sciencedirect.com/science/article/pii/S1359644620305274) | Drug Discovery Today
2020 | von Lilienfeld | [Retrospective on a decade of machine learning for chemical discovery](https://www.nature.com/articles/s41467-020-18556-9) | Nature Communications
2020 | Lopez | [Enhancing scientific discoveries in molecular biology with deep generative models](https://www.embopress.org/doi/full/10.15252/msb.20199198) | Molecular Systems Biology
2020 | Cao | [Ensemble deep learning in bioinformatics](https://www.nature.com/articles/s42256-020-0217-y) | Nature Machine Intelligence
2020 | Kopp | [Deep learning for genomics using Janggu](https://www.nature.com/articles/s41467-020-17155-y) | Nature Communications
2020 | Adam | [Machine learning approaches to drug response prediction: challenges and recent progress](https://www.nature.com/articles/s41698-020-0122-1) | npj Precision Oncology
2020 | Schreiber | [Avocado: a multi-scale deep tensor factorization method learns a latent representation of the human epigenome](https://genomebiology.biomedcentral.com/articles/10.1186/s13059-020-01977-6) | Genome Biology
2020 | Schreiber | [Completing the ENCODE3 compendium yields accurate imputations across a variety of assays and human biosamples](https://genomebiology.biomedcentral.com/articles/10.1186/s13059-020-01978-5) | Genome Biology
2020 | Brown | [Artificial intelligence in chemistry and drug design](https://link.springer.com/article/10.1007%2Fs10822-020-00317-x) | Journal of Computer-Aided Molecular Design
2020 | van der Schaar | [How artificial intelligence and machine learning can help healthcare systems respond to COVID-19](http://www.vanderschaar-lab.com/NewWebsite/covid-19/post1/paper.pdf) | Group website
2020 | Neves | [Deep Learning-driven research for drug discovery: Tackling Malaria](https://journals.plos.org/ploscompbiol/article?id=10.1371/journal.pcbi.1007025) | PLOS Computational Biology
2020 | Stokes | [A Deep Learning Approach to Antibiotic Discovery](https://www.sciencedirect.com/science/article/pii/S0092867420301021) | Cell
2020 | Xu | [A comprehensive review of computational prediction of genome-wide features](https://academic.oup.com/bib/article-abstract/21/1/120/5177808) | Briefings in Bioinformatics
2020 | Walters | [Assessing the impact of generative AI on medicinal chemistry](https://www.nature.com/articles/s41587-020-0418-2) | Nature Biotechnology
--- | --- | --- | ---
2019 | Schneider | [Rethinking drug design in the artificial intelligence era](https://www.nature.com/articles/s41573-019-0050-3) | Nature Reviews Drug Discovery
2019 | Dias | [Artificial intelligence in clinical and genomic diagnostics](https://genomemedicine.biomedcentral.com/articles/10.1186/s13073-019-0689-8) | Genome Medicine
2019 | Filipp | [Opportunities for artificial intelligence in advancing precision medicine](https://arxiv.org/abs/1911.07125) | arXiv
2019 | Yang | [Machine-learning-guided directed evolution for protein engineering](https://www.nature.com/articles/s41592-019-0496-6) | Nature Methods
2019 | Kopp | [Janggu - Deep learning for genomics](https://www.biorxiv.org/content/10.1101/700450v2) | bioRxiv
2019 | Zhavoronkov | [Deep Aging Clocks: The Emergence of AI-Based Biomarkers of Aging and Longevity](https://www.cell.com/trends/pharmacological-sciences/fulltext/S0165-6147(19)30114-2) | Trends in Pharmacological Sciences
2019 | Mater | [Deep Learning in Chemistry](https://pubs.acs.org/doi/full/10.1021/acs.jcim.9b00266) | J. Chem. Inf. Model.
2019 | Eraslan | [Deep learning: new computational modelling techniques for genomics](https://www.nature.com/articles/s41576-019-0122-6) | Nature Reviews Genetics
2019 | Avsec | [The Kipoi repository accelerates community exchange and reuse of predictive models for genomics](https://www.nature.com/articles/s41587-019-0140-0)
2019 | Xu | [Machine learning and complex biological data](https://genomebiology.biomedcentral.com/articles/10.1186/s13059-019-1689-0)
2019 | Vamathevan | [Applications of machine learning in drug discovery and development](https://www.nature.com/articles/s41573-019-0024-5)
2019 | Preuer | [Interpretable Deep Learning in Drug Discovery](https://arxiv.org/abs/1903.02788)
2019 | Polykovskiy | [Molecular Sets (MOSES): A Benchmarking Platform for Molecular Generation Models](https://arxiv.org/abs/1811.12823)
2019 | Schneider | [Mind and machine in drug design](https://www.nature.com/articles/s42256-019-0030-7)
2019 | Elton | [Deep learning for molecular generation and optimization-a review of the state of the art](https://arxiv.org/abs/1903.04388)
2019 | Topol | [High-performance medicine: the convergence of human and artificial intelligence](https://www.nature.com/articles/s41591-018-0300-7)
2019 | Gromski | [How to explore chemical space using algorithms and automation](https://www.nature.com/articles/s41570-018-0066-y)
2019 | Li | [Deep learning in bioinformatics: introduction, application, and perspective in big data era](https://www.biorxiv.org/content/10.1101/563601v1)
2019 | Wu | [Machine-Learning-Assisted Directed Protein Evolution with Combinatorial Libraries](https://arxiv.org/abs/1902.07231)
2019 | Haghighatlari | [Advances of Machine Learning in Molecular Modeling and Simulation](https://arxiv.org/abs/1902.00140)
2019 | PLOS | [Collection in Machine Learning in Health and Biomedicine](https://collections.plos.org/mlforhealth)
2019 | Jaganathan | [Predicting Splicing from Primary Sequence with Deep Learning](https://www.sciencedirect.com/science/article/pii/S0092867418316295)
2019 | He | [The practical implementation of artificial intelligence technologies in medicine](https://www.nature.com/articles/s41591-018-0307-0)
2019 | Kriegeskorte | [Neural network models and deep learning - a primer for biologists](https://arxiv.org/abs/1902.04704)
2019 | Esteva | [A guide to deep learning in healthcare](https://www.nature.com/articles/s41591-018-0316-z)
--- | --- | --- | ---
2018 | Yu | [Visible Machine Learning for Biomedicine](https://www.sciencedirect.com/science/article/pii/S0092867418307190) | Cell
2018 | Brown | [GuacaMol: Benchmarking Models for de Novo Molecular Design](https://pubs.acs.org/doi/10.1021/acs.jcim.8b00839) | J. Chem. Inf. Model.
2018 | Sellwood | [Artificial intelligence in drug discovery](https://www.future-science.com/doi/10.4155/fmc-2018-0212)
2018 | Yu | [Artificial intelligence in healthcare](https://www.nature.com/articles/s41551-018-0305-z)
2018 | Pérez | [Simulations meet machine learning in structural biology](https://www.sciencedirect.com/science/article/pii/S0959440X17301069)
2018 | Zou | [A primer on deep learning in genomics](https://www.nature.com/articles/s41588-018-0295-5)
2018 | Ching | [Opportunities and obstacles for deep learning in biology and medicine](https://royalsocietypublishing.org/doi/full/10.1098/rsif.2017.0387)
2018 | Greene | [Opportunities and obstacles for deep learning in biology and medicine](https://github.com/greenelab/deep-review)
2018 | Wainberg | [Deep learning in biomedicine](https://www.nature.com/articles/nbt.4233)
2018 | Zitnik | [Machine Learning for Integrating Data in Biology and Medicine: Principles, Practice, and Opportunities](https://arxiv.org/abs/1807.00123)
2018 | Telenti | [Deep learning of genomic variation and regulatory network data](https://academic.oup.com/hmg/article/27/R1/R63/4966854)
2018 | Yue | [Deep Learning for Genomics: A Concise Overview](https://arxiv.org/abs/1802.00810)
2018 | Camacho | [Next-Generation Machine Learning for Biological Networks](https://www.sciencedirect.com/science/article/pii/S0092867418305920)
2018 | Jung | [Machine Learning: Basic Principles](https://arxiv.org/abs/1805.05052)
2018 | Chen | [The rise of deep learning in drug discovery](https://www.sciencedirect.com/science/article/pii/S1359644617303598). A summary of the latest applications of deep learning to bioactivity and reaction predictions, and image analysis
2018 | Lo | [Machine learning in chemoinformatics and drug discovery](https://www.sciencedirect.com/science/article/pii/S1359644617304695)
2018 | Segler | [Planning chemical syntheses with deep neural networks and symbolic AI](https://www.nature.com/articles/nature25978)
2018 | Sundaram | [Predicting the clinical impact of human mutation with deep neural networks](https://www.nature.com/articles/s41588-018-0167-z)
2018 | Zhou | [Deep learning sequence-based ab initio prediction of variant effects on expression and disease risk](https://www.nature.com/articles/s41588-018-0160-6)
2018 | Teschendorff | [Avoiding common pitfalls in machine learning omic data science](https://www.nature.com/articles/s41563-018-0241-z)
2018 | Colwell | [Statistical and machine learning approaches to predicting protein–ligand interactions](https://www.sciencedirect.com/science/article/pii/S0959440X17301525?via%3Dihub)
2018 | Yang | [Machine learning in protein engineering](https://arxiv.org/abs/1811.10775)
2018 | Coley | [Machine Learning in Computer-Aided Synthesis Planning](https://pubs.acs.org/doi/10.1021/acs.accounts.8b00087)
2018 | Goh | [Deep Learning for Computational Chemistry](https://arxiv.org/abs/1701.04503)
2018 | Wu | [MoleculeNet: a benchmark for molecular machine learning](http://pubs.rsc.org/en/content/articlehtml/2017/sc/c7sc02664a)
2018 | Salim | [Synthetic Patient Generation: A Deep Learning Approach Using Variational Autoencoders](https://arxiv.org/abs/1808.06444)
2018 | Butler | [Machine learning for molecular and materials science](https://www.nature.com/articles/s41586-018-0337-2)
--- | --- | --- | ---
2017 | Altae-Tran | [Low Data Drug Discovery with One-Shot Learning](https://pubs.acs.org/doi/10.1021/acscentsci.6b00367). Learning with little data in drug discovery
2017 | Ramsundar | [Is Multitask Deep Learning Practical for Pharma?](https://pubs.acs.org/doi/abs/10.1021/acs.jcim.7b00146)
--- | --- | --- | ---
2016 | Angermueller | [Deep learning for computational biology](https://onlinelibrary.wiley.com/doi/abs/10.15252/msb.20156651)
2016 | Mamoshina | [Applications of Deep Learning in Biomedicine](https://pubs.acs.org/doi/10.1021/acs.molpharmaceut.5b00982)
--- | --- | --- | ---
2015 | Park | [Deep learning for regulatory genomics](https://www.nature.com/articles/nbt.3313)
2015 | LeCun | [Deep learning](https://www.nature.com/articles/nature14539)

**Other:**

- de la Vega de León (r), [Effect of missing data on multitask prediction methods](https://jcheminf.biomedcentral.com/articles/10.1186/s13321-018-0281-z)
- Liu, [Deep EHR: Chronic Disease Prediction Using Medical Notes](https://arxiv.org/abs/1808.04928)
- Radovic, [Machine learning at the energy and intensity frontiers of particle physics](https://www.nature.com/articles/s41586-018-0361-2)
- Ryan, [Crystal Structure Prediction via Deep Learning](https://pubs.acs.org/doi/10.1021/jacs.8b03913)
- Sanchez-Lengeling and Aspuru-Guzik, [Inverse molecular design using machine learning: Generative models for matter engineering](http://science.sciencemag.org/content/361/6400/360)
- Way, [Discovering pathway and cell-type signatures in transcriptomic compendia with machine learning](https://peerj.com/preprints/27229/)
- Chen, [XGBoost: A Scalable Tree Boosting System](https://dl.acm.org/citation.cfm?id=2939785)
- Vinyals, [Matching Networks for One Shot Learning](http://papers.nips.cc/paper/6385-matching-networks-for-one-shot-learning): learning with little data
- Simm, [Macau: Scalable Bayesian Multi-relational Factorization with Side Information using MCMC](https://arxiv.org/abs/1509.04610)


## Books

Authors | Title / Link
--------|-------------
Nielsen | [Neural Networks and Deep Learning](http://neuralnetworksanddeeplearning.com/index.html)
James, Witten, Hastie and Tibshirani | [An Introduction to Statistical Learning with Applications in R](http://www-bcf.usc.edu/~gareth/ISL/)
Hastie, Tibshirani and Friedman | [The Elements of Statistical Learning: Data Mining, Inference, and Prediction](https://web.stanford.edu/~hastie/ElemStatLearn/)
Goodfellow, Bengio and Courville | [Deep Learning](http://www.deeplearningbook.org/)


## Courses

Provider | Title / Link
---------|-------------
Parr and Howard | [How to explain gradient boosting](https://explained.ai/gradient-boosting/)
Machine learning for kids | [projects](https://machinelearningforkids.co.uk/#!/welcome)
Data flair | Machine learning [tutorials](https://data-flair.training/blogs/machine-learning-tutorials-home/) and Artificial Intelligence [tutorials](https://data-flair.training/blogs/ai-tutorials-home/)
MIT | [Introduction to Deep Learning](http://introtodeeplearning.com/)
fast.ai | [Deep learning and machine learning courses](https://www.fast.ai/)
Coursera | [Deep Learning Specialization](https://www.coursera.org/specializations/deep-learning/)
University of Cambridge | An Introduction to Machine Learning: [github](https://github.com/bioinformatics-training/intro-machine-learning-2018) and [book](https://bioinformatics-training.github.io/intro-machine-learning-2017/)
Data School | In-depth introduction to machine learning in 15 hours of [expert videos](https://www.dataschool.io/15-hours-of-expert-machine-learning-videos/)
Shirin Glander | [Introduction to Machine Learning with R](https://shirinsplayground.netlify.com/2018/06/intro_to_ml_workshop_heidelberg/)


## Resources

Implementations:

Language | Library | Description
---------|---------|------------
Python | [scikit-learn](http://scikit-learn.org) | machine learning
Python | [PyTorch](https://pytorch.org/) | deep learning, Facebook
Python | [Keras](https://keras.io/) | deep learning
Python | [TensorFlow](https://www.tensorflow.org/) | deep learning, Google
Python | [Theano](http://deeplearning.net/software/theano/) | deep learning
Python | [DragoNN](https://kundajelab.github.io/dragonn/) | deep learning for genomics
Python | [Macau](https://github.com/jaak-s/macau) | Bayesian factorization
R | [h2o](https://cran.r-project.org/web/packages/h2o/index.html) | machine and deep learning
R | [mlr](https://cran.r-project.org/web/packages/mlr/index.html) | machine learning
R | [caret](https://cran.r-project.org/web/packages/caret/index.html) | machine learning
R | [randomForest](https://cran.r-project.org/web/packages/randomForest/index.html) | Breiman and Cutler's Random Forests for Classification and Regression
Java | [WEKA](https://www.cs.waikato.ac.nz/~ml/weka/) | machine learning

Repositories:

Title / Link | Description
-------------|------------
[Kipoi](http://kipoi.org/about/) | machine and deep learning models in genomics

Blogs:

- [Primer for Learning Google Colab](https://medium.com/dair-ai/primer-for-learning-google-colab-bb4cabca5dd6)
- [Scikit-Learn: A silver bullet for basic machine learning](https://medium.com/analytics-vidhya/scikit-learn-a-silver-bullet-for-basic-machine-learning-13c7d8b248ee)
- [How to build your own Neural Network from scratch in Python](https://towardsdatascience.com/how-to-build-your-own-neural-network-from-scratch-in-python-68998a08e4f6)
- [Every single Machine Learning course on the internet, ranked by your reviews](https://medium.freecodecamp.org/every-single-machine-learning-course-on-the-internet-ranked-by-your-reviews-3c4a7b8026c0)

Cheat sheets:

- [101 Machine Learning Algorithms for Data Science with Cheat Sheets](https://www.r-bloggers.com/101-machine-learning-algorithms-for-data-science-with-cheat-sheets/)
- [Essential cheat sheets for deep learning and machine learning researchers](https://github.com/kailashahirwar/cheatsheets-ai)
- RStudio [Machine Learning Modelling](https://github.com/rstudio/cheatsheets/raw/master/Machine%20Learning%20Modelling%20in%20R.pdf)
- RStudio [Deep Learning with Keras](https://github.com/rstudio/cheatsheets/raw/master/keras.pdf)
- RStudio [caret package](https://github.com/rstudio/cheatsheets/raw/master/caret.pdf)
- RStudio [mlr package](https://github.com/rstudio/cheatsheets/raw/master/mlr.pdf)
- RStudio [h2o environment](https://github.com/rstudio/cheatsheets/raw/master/h2o.pdf)

CRAN Task Views:

- [Machine Learning & Statistical Learning](https://cran.r-project.org/web/views/MachineLearning.html)


## References

- Pimentel's [collection](https://github.com/pimentel/deep_learning_papers) of deep learning papers