├── LICENSE.md ├── README.md ├── images ├── GNN │ ├── calculate_text.png │ ├── final_output_text.png │ ├── final_structure.png │ ├── initialize_calculate_text.png │ ├── initialize_text.png │ ├── node_a.jpg │ ├── node_a.png │ ├── node_a_text.png │ ├── node_b.png │ ├── node_b_text.png │ ├── node_c.png │ ├── node_c_text.png │ ├── starting_structure.png │ ├── structure_text.png │ └── test └── cell_motherboard.png └── notebooks ├── WS01_NeuralNetworksWithNumpy.ipynb ├── WS02_NeuralNetworksWithPyTorch.ipynb ├── WS03_ConvolutionalNeuralNetworks.ipynb ├── WS04_LMsForShakespeareAndProteins.ipynb ├── WS05_LanguageModelEmbeddingsTransferLearningForDownstreamTask.ipynb ├── WS06_IntroductionToAF.ipynb ├── WS07_GNNsForProteins.ipynb ├── WS08_DenoisingDiffusionProbabilisticModels.ipynb ├── WS09_PuttingItAllTogether_DesigningProteins.ipynb └── WS10_RFDiffusion_AllAtom.ipynb /LICENSE.md: -------------------------------------------------------------------------------- 1 | MIT License 2 | 3 | Copyright (c) 2025 Gray Lab 4 | 5 | Permission is hereby granted, free of charge, to any person obtaining a copy 6 | of this software and associated documentation files (the "Software"), to deal 7 | in the Software without restriction, including without limitation the rights 8 | to use, copy, modify, merge, publish, distribute, sublicense, and/or sell 9 | copies of the Software, and to permit persons to whom the Software is 10 | furnished to do so, subject to the following conditions: 11 | 12 | The above copyright notice and this permission notice shall be included in all 13 | copies or substantial portions of the Software. 14 | 15 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR 16 | IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, 17 | FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. 
IN NO EVENT SHALL THE 18 | AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER 19 | LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, 20 | OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE 21 | SOFTWARE. 22 | -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 | # DL4Proteins Workshops (Beta release) 2 | 3 | ![Cell Motherboard Wallpaper](images/cell_motherboard.png) 4 | 5 | **Welcome to DL4Proteins!** 6 | 7 | The goal of the DL4Proteins notebook series is to democratize deep learning for protein design and prediction at a transformative moment in science. The 2024 Nobel Prize in Chemistry was awarded to David Baker, Demis Hassabis, and John Jumper for breakthroughs in computational protein design and structure prediction, and this resource provides an accessible, hands-on introduction to the tools and methodologies that shaped this revolution. By blending foundational machine learning principles with state-of-the-art approaches such as AlphaFold, RFDiffusion, and ProteinMPNN, DL4Proteins equips researchers, educators, and students with the knowledge to contribute to the future of protein engineering. These open-source notebooks bridge the gap between cutting-edge research and classroom learning, fostering a new generation of innovators in synthetic biology and therapeutics. 8 | 9 | The Jupyter notebooks below introduce the fundamental machine learning concepts and models currently used in the protein design space. The notebooks can be run in Google Colaboratory. 10 | 11 | **For figures and questions to render correctly, please set Colab notebooks to light mode.**
12 | 13 | ### Table of contents 14 | ### [Chapter 1: Neural Networks with NumPy](https://colab.research.google.com/github/Graylab/DL4Proteins-notebooks/blob/main/notebooks/WS01_NeuralNetworksWithNumpy.ipynb) 15 | ### [Chapter 2: Neural Networks with PyTorch](https://colab.research.google.com/github/Graylab/DL4Proteins-notebooks/blob/main/notebooks/WS02_NeuralNetworksWithPyTorch.ipynb) 16 | ### [Chapter 3: Convolutional Neural Networks](https://colab.research.google.com/github/Graylab/DL4Proteins-notebooks/blob/main/notebooks/WS03_ConvolutionalNeuralNetworks.ipynb) 17 | ### [Chapter 4: Language Models for Shakespeare and Proteins](https://colab.research.google.com/github/Graylab/DL4Proteins-notebooks/blob/main/notebooks/WS04_LMsForShakespeareAndProteins.ipynb) 18 | ### [Chapter 5: Language model embeddings transfer learning for downstream task](https://colab.research.google.com/github/Graylab/DL4Proteins-notebooks/blob/main/notebooks/WS05_LanguageModelEmbeddingsTransferLearningForDownstreamTask.ipynb) 19 | ### [Chapter 6: Introduction to AlphaFold](https://colab.research.google.com/github/Graylab/DL4Proteins-notebooks/blob/main/notebooks/WS06_IntroductionToAF.ipynb) 20 | ### [Chapter 7: Graph Neural Networks for Proteins](https://colab.research.google.com/github/Graylab/DL4Proteins-notebooks/blob/main/notebooks/WS07_GNNsForProteins.ipynb) 21 | ### [Chapter 8: Denoising Diffusion Probabilistic Models](https://colab.research.google.com/github/Graylab/DL4Proteins-notebooks/blob/main/notebooks/WS08_DenoisingDiffusionProbabilisticModels.ipynb) 22 | ### [Chapter 9: Putting it All Together - From RFDiffusion to ProteinMPNN to Alphafold](https://colab.research.google.com/github/Graylab/DL4Proteins-notebooks/blob/main/notebooks/WS09_PuttingItAllTogether_DesigningProteins.ipynb) 23 | ### [Chapter 10: Introduction to RFDiffusion - All Atom](https://colab.research.google.com/github/Graylab/DL4Proteins-notebooks/blob/main/notebooks/WS10_RFDiffusion_AllAtom.ipynb) 24 | 25 | If 
you encounter any issues, please report them in the [Issues tab](https://github.com/Graylab/DL4Proteins-notebooks/issues). This is a living repository - we are actively incorporating feedback! 26 | 27 | Authors: Michael F. Chungyoun, Sreevarsha Puvada, Gabriel Au, Courtney Thomas, Britnie J. Carpentier, Jeffrey J. Gray 28 | 29 | Acknowledgments: Sergey Lyskov, Sergey Ovchinnikov, Johns Hopkins students of the 2023 540.614/414 Protein Structure Prediction course, and the Johns Hopkins Center for Teaching Excellence and Innovation - Instructional Enhancement Grant. 30 | 31 | Citations and Additional Resources: Each notebook in this repository draws inspiration and methodologies from a range of resources, including online tools, educational resources, publications, and open-source repositories. Key resources include YouTube series by Harrison Kinsley, Andrej Karpathy, and Petar Veličković. These are cited within their respective notebooks, and we encourage users to explore these foundational works for deeper insights.
32 | -------------------------------------------------------------------------------- /images/GNN/calculate_text.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/Graylab/DL4Proteins-notebooks/1a6c6780aa24db3a02e650ff43e9acb0bcd834db/images/GNN/calculate_text.png -------------------------------------------------------------------------------- /images/GNN/final_output_text.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/Graylab/DL4Proteins-notebooks/1a6c6780aa24db3a02e650ff43e9acb0bcd834db/images/GNN/final_output_text.png -------------------------------------------------------------------------------- /images/GNN/final_structure.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/Graylab/DL4Proteins-notebooks/1a6c6780aa24db3a02e650ff43e9acb0bcd834db/images/GNN/final_structure.png -------------------------------------------------------------------------------- /images/GNN/initialize_calculate_text.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/Graylab/DL4Proteins-notebooks/1a6c6780aa24db3a02e650ff43e9acb0bcd834db/images/GNN/initialize_calculate_text.png -------------------------------------------------------------------------------- /images/GNN/initialize_text.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/Graylab/DL4Proteins-notebooks/1a6c6780aa24db3a02e650ff43e9acb0bcd834db/images/GNN/initialize_text.png -------------------------------------------------------------------------------- /images/GNN/node_a.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/Graylab/DL4Proteins-notebooks/1a6c6780aa24db3a02e650ff43e9acb0bcd834db/images/GNN/node_a.jpg 
-------------------------------------------------------------------------------- /images/GNN/node_a.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/Graylab/DL4Proteins-notebooks/1a6c6780aa24db3a02e650ff43e9acb0bcd834db/images/GNN/node_a.png -------------------------------------------------------------------------------- /images/GNN/node_a_text.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/Graylab/DL4Proteins-notebooks/1a6c6780aa24db3a02e650ff43e9acb0bcd834db/images/GNN/node_a_text.png -------------------------------------------------------------------------------- /images/GNN/node_b.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/Graylab/DL4Proteins-notebooks/1a6c6780aa24db3a02e650ff43e9acb0bcd834db/images/GNN/node_b.png -------------------------------------------------------------------------------- /images/GNN/node_b_text.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/Graylab/DL4Proteins-notebooks/1a6c6780aa24db3a02e650ff43e9acb0bcd834db/images/GNN/node_b_text.png -------------------------------------------------------------------------------- /images/GNN/node_c.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/Graylab/DL4Proteins-notebooks/1a6c6780aa24db3a02e650ff43e9acb0bcd834db/images/GNN/node_c.png -------------------------------------------------------------------------------- /images/GNN/node_c_text.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/Graylab/DL4Proteins-notebooks/1a6c6780aa24db3a02e650ff43e9acb0bcd834db/images/GNN/node_c_text.png -------------------------------------------------------------------------------- 
/images/GNN/starting_structure.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/Graylab/DL4Proteins-notebooks/1a6c6780aa24db3a02e650ff43e9acb0bcd834db/images/GNN/starting_structure.png -------------------------------------------------------------------------------- /images/GNN/structure_text.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/Graylab/DL4Proteins-notebooks/1a6c6780aa24db3a02e650ff43e9acb0bcd834db/images/GNN/structure_text.png -------------------------------------------------------------------------------- /images/GNN/test: -------------------------------------------------------------------------------- 1 | 2 | -------------------------------------------------------------------------------- /images/cell_motherboard.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/Graylab/DL4Proteins-notebooks/1a6c6780aa24db3a02e650ff43e9acb0bcd834db/images/cell_motherboard.png --------------------------------------------------------------------------------