├── .travis.yml ├── CONTRIBUTING.md └── README.md /.travis.yml: -------------------------------------------------------------------------------- 1 | language: ruby 2 | rvm: 2.5 3 | before_script: gem install awesome_bot 4 | script: awesome_bot --allow-redirect --allow-dupe README.md 5 | -------------------------------------------------------------------------------- /CONTRIBUTING.md: -------------------------------------------------------------------------------- 1 | # Contributing 2 | 3 | Your contributions are always welcome! 4 | 5 | ## Guidelines 6 | 7 | * Add section if needed. 8 | * Add section description. 9 | * Add section title to Table of contents. 10 | * If the program has special functionality mention it in bullet points beneath it. 11 | * Search previous suggestions before making a new one, as yours may be a duplicate. 12 | * Add your links: `* [project-name](http://example.com/) - A short description ends with a dot.` 13 | * Check your spelling and grammar. 14 | * Make sure your text editor is set to remove trailing whitespace. 15 | * Send a Pull Request. 16 | -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 | # Awesome Cheminformatics [![Awesome](https://awesome.re/badge.svg)](https://awesome.re) 2 | 3 | > Cheminformatics (also known as chemoinformatics, chemioinformatics and chemical informatics) is the use of computer and informational techniques applied to a range of problems in the field of chemistry.— [Wikipedia](https://en.wikipedia.org/wiki/Cheminformatics) 4 | 5 | A curated list of awesome Cheminformatics software, resources, and libraries. Mostly command line based, and free or open-source. Please feel free to [contribute](CONTRIBUTING.md) ! 
6 | 7 | ## Contents 8 | 9 | * [Applications](#applications) 10 | * [Visualization](#app-visualization) 11 | * [Command Line Tools](#app-cmd) 12 | * [Docking](#app-docking) 13 | * [Virtual Machine](#app-virtual) 14 | * [Libraries](#libraries) 15 | * [General Purpose](#lib-general) 16 | * [Visualization](#lib-visualization) 17 | * [Format Checking](#lib-format) 18 | * [Docking](#lib-dock) 19 | * [Molecular Descriptors](#lib-des) 20 | * [Machine Learning](#lib-ml) 21 | * [Web APIs](#lib-web) 22 | * [Databases](#lib-db) 23 | * [Others](#lib-others) 24 | * [Journals](#journals) 25 | * [Resources](#resources) 26 | * [Courses](#courses) 27 | * [Blogs](#blogs) 28 | * [Books](#books) 29 | * [See Also](#see-also) 30 | 31 | ## Applications 32 | 33 | 34 | ### Visualization 35 | 36 | * [PyMOL](https://sourceforge.net/projects/pymol/) - Python-enhanced molecular graphics tool. 37 | * [Jmol](http://jmol.sourceforge.net/) - Browser-based HTML5 viewer and stand-alone Java viewer for chemical structures in 3D. 38 | * [VMD](http://www.ks.uiuc.edu/Research/vmd/) - Molecular visualization program for displaying, animating, and analyzing large biomolecular systems using 3-D graphics and built-in scripting. 39 | * [Chimera](https://www.cgl.ucsf.edu/chimera/) - Highly extensible program for interactive molecular visualization and analysis. [Source](https://www.cgl.ucsf.edu/chimera/docs/sourcecode.html) is available. 40 | * [ChimeraX](https://www.cgl.ucsf.edu/chimerax/) - The next-generation molecular visualization program, following UCSF Chimera. Source is available [here](https://www.cgl.ucsf.edu/chimerax/docs/devel/conventions.html). 41 | * [DataWarrior](http://www.openmolecules.org/datawarrior/index.html) - A program for data visualization and analysis which combines dynamic graphical views and interactive row filtering with chemical intelligence.
42 | 43 | 44 | ### Command Line Tools 45 | 46 | * [Open Babel](http://openbabel.org/wiki/Main_Page) - Chemical toolbox designed to speak the many languages of chemical data. 47 | * [MayaChemTools](http://www.mayachemtools.org/index.html) - Collection of Perl and Python scripts, modules, and classes that support day-to-day computational discovery needs. 48 | * [Packmol](http://m3g.iqm.unicamp.br/packmol/home.shtml) - Initial configurations for molecular dynamics simulations by packing optimization. 49 | * [BCL::Commons](http://meilerlab.org/index.php/bclcommons/show/b_apps_id/1) 50 | 51 | 52 | ### Docking 53 | 54 | * [AutoDock Vina](http://vina.scripps.edu/) - Molecular docking and virtual screening. 55 | * [smina](https://sourceforge.net/projects/smina/) - Customized [AutoDock Vina](http://vina.scripps.edu/) to better support scoring function development and high-performance energy minimization. 56 | 57 | 58 | ### Virtual Machine 59 | 60 | * [myChEMBL](http://chembl.blogspot.com/2015/07/mychembl-20-has-landed.html) - A version of ChEMBL built using Open Source software (Ubuntu, PostgreSQL, RDKit) 61 | * [3D e-Chem Virtual Machine](https://github.com/3D-e-Chem/3D-e-Chem-VM) - Virtual machine with all software and sample data to run 3D-e-Chem Knime workflows 62 | 63 | ## Libraries 64 | 65 | 66 | ### General Purpose 67 | 68 | * [RDKit](http://www.rdkit.org/) - Collection of cheminformatics and machine-learning software written in C++ and Python. 69 | * [Indigo](https://github.com/epam/Indigo) - Universal molecular toolkit for molecular fingerprinting, substructure search, and molecular visualization, written in C++ with Java, C#, and Python wrappers. 70 | * [CDK (Chemistry Development Kit)](https://sourceforge.net/projects/cdk/) - Algorithms for structural chemo- and bioinformatics, implemented in Java.
71 | * [ChemmineR](https://www.bioconductor.org/packages/release/bioc/vignettes/ChemmineR/inst/doc/ChemmineR.html) - Cheminformatics package for analyzing drug-like small molecule data in R. 72 | * [ChemPy](https://github.com/bjodah/chempy) - A Python package useful for chemistry (mainly physical/inorganic/analytical chemistry) 73 | * [MolecularGraph.jl](https://github.com/mojaie/MolecularGraph.jl) - A graph-based molecule modeling and chemoinformatics analysis toolkit fully implemented in Julia 74 | * [datamol](https://github.com/datamol-org/datamol) - Molecular Manipulation Made Easy. A light wrapper built on top of RDKit. 75 | * [CGRtools](https://github.com/cimm-kzn/CGRtools) - Toolkit for processing molecules, reactions and condensed graphs of reactions. Can be used for chemical standardization, MCS search, and tautomer generation, with backward compatibility to RDKit and NetworkX. 76 | 77 | 78 | ### Format Checking 79 | 80 | * [ChEMBL_Structure_Pipeline (formerly standardiser)](https://github.com/chembl/ChEMBL_Structure_Pipeline) - Tool designed to provide a simple way of standardising molecules as a prelude to e.g. molecular modelling exercises. 81 | * [MolVS](https://github.com/mcs07/MolVS) - Molecule validation and standardization based on [RDKit](http://www.rdkit.org/). 82 | * [rd_filters](https://github.com/PatWalters/rd_filters) - A script to run structural alerts using the RDKit and ChEMBL 83 | * [pdb-tools](https://github.com/haddocking/pdb-tools) - A swiss army knife for manipulating and editing PDB files. 84 | 85 | 86 | ### Visualization 87 | 88 | * [Kekule.js](http://partridgejiang.github.io/Kekule.js/) - Front-end JavaScript library for providing the ability to represent, draw, edit, compare and search molecule structures on web browsers. 89 | * [3Dmol.js](https://github.com/3dmol/3Dmol.js) - An object-oriented, webGL based JavaScript library for online molecular visualization.
90 | * [JChemPaint](https://github.com/JChemPaint/jchempaint) - Chemical 2D structure editor application/applet based on the [Chemistry Development Kit](https://sourceforge.net/projects/cdk/). 91 | * [rdeditor](https://github.com/EBjerrum/rdeditor) - Simple RDKit molecule editor GUI using PySide. 92 | * [nglviewer](http://nglviewer.org/nglview/latest/) - Interactive molecular graphics for Jupyter notebooks. 93 | * [RDKit.js](https://www.npmjs.com/package/@rdkit/rdkit) - Official JavaScript distribution of cheminformatics functionality from the RDKit - a C++ library for cheminformatics. 94 | 95 | 96 | ### Molecular Descriptors 97 | 98 | * [mordred](https://github.com/mordred-descriptor/mordred) - Molecular descriptor calculator based on [RDKit](http://www.rdkit.org/). 99 | * [DescriptaStorus](https://github.com/bp-kelley/descriptastorus) - Descriptor computation (chemistry) and (optional) storage for machine learning. 100 | * [mol2vec](https://github.com/samoturk/mol2vec) - Vector representations of molecular substructures. 101 | * [Align-it](http://silicos-it.be.s3-website-eu-west-1.amazonaws.com/software/align-it/1.0.4/align-it.html#alignit-generating-pharmacophore-points) - Align molecules according to their pharmacophores. 102 | * [Rcpi](https://nanx.me/Rcpi/index.html) - R/Bioconductor package to generate various descriptors of proteins, compounds and their interactions. 103 | 104 | 105 | ### Machine Learning 106 | 107 | * [DeepChem](https://github.com/deepchem/deepchem) - Deep learning library for Chemistry based on TensorFlow 108 | * [Chemprop](https://github.com/chemprop/chemprop) - Directed message passing neural networks for property prediction of molecules and reactions with uncertainty and interpretation. 109 | * [ChemML](https://github.com/hachmannlab/chemml) - ChemML is a machine learning and informatics program suite for the analysis, mining, and modeling of chemical and materials data.
(based on TensorFlow) 110 | * [olorenchemengine](https://github.com/Oloren-AI/olorenchemengine) - Molecular property prediction with unified API for diverse models and representations, 111 | with integrated uncertainty quantification, interpretability, and hyperparameter/architecture tuning. 112 | * [OpenChem](https://github.com/Mariewelt/OpenChem) - OpenChem is a deep learning toolkit for Computational Chemistry with PyTorch backend. 113 | * [DGL-LifeSci](https://github.com/awslabs/dgl-lifesci) - DGL-LifeSci is a [DGL](https://www.dgl.ai/)-based package for various applications in life science with graph neural networks. 114 | * [chainer-chemistry](https://github.com/pfnet-research/chainer-chemistry) - A Library for Deep Learning in Biology and Chemistry. 115 | * [pytorch-geometric](https://pytorch-geometric.readthedocs.io/en/latest/) - A PyTorch library that provides implementations of many graph convolution algorithms. 116 | * [chemmodlab](https://github.com/jrash/ChemModLab) - A Cheminformatics Modeling Laboratory for Fitting and Assessing Machine Learning Models in R. 117 | * [Summit](https://github.com/sustainable-processes/summit) - A Python package for optimizing chemical reactions using machine learning (contains 10 algorithms + several benchmarks). 118 | 119 | 120 | ### Web APIs 121 | 122 | * [webchem](https://github.com/ropensci/webchem) - Chemical Information from the Web. 123 | * [PubChemPy](http://pubchempy.readthedocs.io) - Python wrapper for the PubChem PUG REST API. 124 | * [ChemSpiPy](http://chemspipy.readthedocs.org) - Python wrapper for the ChemSpider API. 125 | * [CIRpy](http://cirpy.readthedocs.org/) - Python wrapper for the [NCI Chemical Identifier Resolver (CIR)](https://cactus.nci.nih.gov/chemical/structure). 126 | * [Beaker](https://github.com/chembl/chembl_beaker) - [RDKit](http://www.rdkit.org/) and [OSRA](https://cactus.nci.nih.gov/osra/) in the [Bottle](http://bottlepy.org/docs/dev/) on [Tornado](http://www.tornadoweb.org/en/stable/).
127 | * [chemminetools](https://github.com/girke-lab/chemminetools) - Open source web framework for small molecule analysis based on Django. 128 | * [ambit](http://ambit.sourceforge.net/) - Offers chemoinformatics functionality via REST web services. 129 | 130 | 131 | ### Databases 132 | 133 | * [razi](https://github.com/rvianello/razi) - Cheminformatic extension for the SQLAlchemy database. 134 | * [Chemical Translation Service](https://bitbucket.org/fiehnlab/fiehnlab-cts/src/master/) - Source code of the [Chemical Translation Service](https://cts.fiehnlab.ucdavis.edu/) web service. 135 | 136 | 137 | ### Docking 138 | * [Rosetta](https://www.rosettacommons.org/docs/latest/Home) - A comprehensive software suite for modeling macromolecular structures. Used largely for protein-protein docking. 139 | * [DOCKSTRING](https://github.com/dockstring/dockstring) - Automates and standardizes ligand preparation for AutoDock Vina. 140 | 141 | 142 | ### Molecular Dynamics 143 | 144 | * [Gromacs](http://www.gromacs.org/) - Molecular dynamics package mainly designed for simulations of proteins, lipids and nucleic acids. 145 | * [OpenMM](http://openmm.org/) - High performance toolkit for molecular simulation including extensive language bindings for Python, C, C++, and even Fortran. 146 | * [NAMD](https://www.ks.uiuc.edu/Research/namd/) - A parallel molecular dynamics code designed for high-performance simulation of large biomolecular systems. 147 | * [MDTraj](https://github.com/mdtraj/mdtraj) - Analysis of molecular dynamics trajectories. 148 | * [cclib](https://github.com/cclib/cclib) - Parsers and algorithms for computational chemistry logfiles.
149 | * [ProDy](https://github.com/prody/ProDy) - A Python package for protein dynamics analysis 150 | 151 | 152 | ### Others 153 | 154 | * [eiR](https://github.com/girke-lab/eiR) - Accelerated similarity searching of small molecules 155 | * [OPSIN](https://github.com/dan2097/opsin) - Open Parser for Systematic IUPAC nomenclature 156 | * [Cookiecutter for Computational Molecular Sciences](https://github.com/MolSSI/cookiecutter-cms) - Python-centric Cookiecutter for Molecular Computational Chemistry Packages by [MolSSI](https://molssi.org/) 157 | * [Auto-QChem](https://github.com/PrincetonUniversity/auto-qchem) - An automated workflow for the generation and storage of DFT calculations for organic molecules. 158 | * [Gypsum-DL](https://git.durrantlab.pitt.edu/jdurrant/gypsum_dl) - A program for converting 2D SMILES strings to 3D models. 159 | * [RDchiral](https://github.com/connorcoley/rdchiral) - Wrapper for RDKit's RunReactants to improve stereochemistry handling 160 | * [confgen](https://github.com/Et9797/confgen-webapp) - Webapp for generating conformers 161 | 162 | 163 | ## Journals 164 | 165 | * [Journal of Cheminformatics](https://jcheminf.biomedcentral.com/) 166 | * [Journal of Chemical Information and Modeling (ACS Publications)](https://pubs.acs.org/journal/jcisd8) 167 | 168 | ## Resources 169 | 170 | ### Courses 171 | 172 | * [Learncheminformatics.com](http://learncheminformatics.com/) - "Cheminformatics: Navigating the world of chemical data" course at Indiana University. 173 | * [Python for chemoinformatics](https://github.com/Mishima-syk/py4chemoinformatics) 174 | * [TeachOpenCADD](https://github.com/volkamerlab/TeachOpenCADD) - A teaching platform for computer-aided drug design (CADD) using open source packages and data.
175 | * [Cheminformatics OLCC](https://chem.libretexts.org/Courses/Intercollegiate_Courses/Cheminformatics_OLCC_(2019)) - Cheminformatics course of the Collaborative Intercollegiate Online Chemistry Course (OLCC) at the University of Arkansas at Little Rock, by Robert Belford. 176 | * [BigChem](http://bigchem.eu/alllectures) - All lectures of [BigChem](http://bigchem.eu/) (A Horizon 2020 MSC ITN EID project, which provides innovative education in large chemical data analysis.) 177 | * [Molecular modeling course](https://dasher.wustl.edu/chem478/) - by Dr. [Jay Ponder](https://dasher.wustl.edu/), a professor from WashU St. Louis. 178 | * [Simulation in Chemistry and Biochemistry](https://dasher.wustl.edu/chem430/) - by Dr. [Jay Ponder](https://dasher.wustl.edu/), a professor from WashU St. Louis. 179 | 180 | ### Blogs 181 | 182 | * [Open Source Molecular Modeling](https://opensourcemolecularmodeling.github.io/README.html) - Updateable catalog of open source molecular modeling software. 183 | * [PubChem Blog](https://pubchemblog.ncbi.nlm.nih.gov/) - News, updates and tutorials about [PubChem](https://pubchem.ncbi.nlm.nih.gov/). 184 | * [The ChEMBL-og blog](http://chembl.blogspot.tw/) - Stories and news from Computational Chemical Biology Group at [EMBL-EBI](https://www.ebi.ac.uk/). 185 | * [ChEMBL blog](http://chembl.github.io/) - ChEMBL on GitHub. 186 | * [SteinBlog](http://www.steinbeck-molecular.de/steinblog/) - Blog of [Christoph Steinbeck](http://www.steinbeck-molecular.de/steinblog/index.php/about/), who is the head of cheminformatics and metabolism at the EMBL-EBI. 187 | * [Practical Cheminformatics](http://practicalcheminformatics.blogspot.com/) - Blog with in-depth examples of practical application of cheminformatics.
188 | * [So much to do, so little time - Trying to squeeze sense out of chemical data](http://blog.rguha.net/) - Blog of [Rajarshi Guha](http://blog.rguha.net/?page_id=8), who is a research scientist at NIH Center for Advancing Translational Science. 189 | * Some old blogs [1](https://rguha.wordpress.com/) [2](http://www.rguha.net/index.html). 190 | * [Noel O'Blog](http://baoilleach.blogspot.tw/) - Blog of [Noel O'Boyle](https://www.redbrick.dcu.ie/~noel/), who is a Senior Software Engineer at NextMove Software. 191 | * [chem-bla-ics](http://chem-bla-ics.blogspot.tw/) - Blog of [Egon Willighagen](http://egonw.github.io/), who is an assistant professor at Maastricht University. 192 | 195 | * [steeveslab-blog](http://asteeves.github.io/) - Some examples using [RDKit](http://www.rdkit.org/). 196 | * [Macs in Chemistry](http://www.macinchem.org/) - Provides a resource for chemists using Apple Macintosh computers. 197 | * [DrugDiscovery.NET](http://www.drugdiscovery.net/) - Blog of [Andreas Bender](http://www.andreasbender.de/), who is a Reader for Molecular Informatics at University of Cambridge. 198 | * [Is life worth living?](https://iwatobipen.wordpress.com/) - Some examples for cheminformatics libraries. 199 | * [Cheminformatics 2.0](https://cheminf20.org/) - Blog of [Alex M. Clark](https://twitter.com/aclarkxyz), a research scientist at Collaborative Drug Discovery. 200 | * [Depth-First](https://depth-first.com/) - Blog of [Richard L. Apodaca](https://depth-first.com/about/), a chemist living in La Jolla, California. 201 | * [Cheminformania](https://www.cheminformania.com) - Blog of [Esben Jannik Bjerrum, Ph.D.](https://www.cheminformania.com/about/esben-jannik-bjerrum/), who is a Principal Scientist and a Machine Learning and AI specialist at AstraZeneca.
202 | 203 | ### Books 204 | 205 | * [Computational Approaches in Cheminformatics and Bioinformatics](https://books.google.com/books/about/Computational_Approaches_in_Cheminformat.html?id=bLqV4rYQoYsC) - Includes insights from public (NIH), academic, and industrial sources at the same time. 206 | * [Chemoinformatics for Drug Discovery](https://onlinelibrary.wiley.com/doi/book/10.1002/9781118742785) - Materials about how to use Chemoinformatics strategies to improve drug discovery results. 207 | * [Molecular Descriptors for Chemoinformatics](https://onlinelibrary.wiley.com/doi/book/10.1002/9783527628766) - More than 3300 descriptors and related terms for chemoinformatic analysis of chemical compound properties. 208 | 209 | 210 | ## See Also 211 | 212 | * [deeplearning-biology](https://github.com/hussius/deeplearning-biology#chemoinformatics-and-drug-discovery-) - Chemoinformatics and drug discovery section in deeplearning-biology repo. 213 | * [awesome-python-chemistry](https://github.com/lmmentel/awesome-python-chemistry) - Another list focused on Python resources related to chemistry. 214 | * [awesome-small-molecule-ml](https://github.com/benb111/awesome-small-molecule-ml) - A list of papers, data sets, and other resources for machine learning for small-molecule drug discovery. 215 | * [awesome-molecular-docking](https://github.com/yangnianzu0515/awesome-molecular-docking) - A curated list of molecular docking software, datasets, and other closely related resources. 216 | * [MolSSI Molecular Software Database](https://molssi.org/software-search/) 217 | * [Pages created by Tobias Kind, PhD](https://fiehnlab.ucdavis.edu/staff/kind/metabolomics) 218 | 219 | ## License 220 | 221 | [![CC0](http://mirrors.creativecommons.org/presskit/buttons/88x31/svg/cc-zero.svg)](https://creativecommons.org/publicdomain/zero/1.0/) 222 | --------------------------------------------------------------------------------