├── .github
│   └── workflows
│       └── check-links.yml
├── README.md
└── nott300.png

/.github/workflows/check-links.yml:
--------------------------------------------------------------------------------
name: CI

on: [push, pull_request]

jobs:
  check-links:
    runs-on: ubuntu-latest
    timeout-minutes: 10
    steps:
      - uses: actions/checkout@v1
      - name: Link Checker
        id: lc
        uses: peter-evans/link-checker@v1
      - name: Fail if there were link errors
        run: exit ${{ steps.lc.outputs.exit_code }}

--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
# porous materials AI gym

open data sets for machine learning pertaining to porous materials.

- MOF = metal-organic framework
- COF = covalent organic framework

## crystal structures

### experimental

- MOFs: CoRE MOFs ([Paper](https://doi.org/10.1021/acs.jced.9b00835), [Database](https://zenodo.org/record/3677685)), CSD MOF Subset ([Paper](https://pubs.acs.org/doi/abs/10.1021/acs.chemmater.7b00441), [Database](https://sites.google.com/view/csdmofsubset/home)), CSD MOF Collection ([Paper](https://doi.org/10.1016/j.matt.2021.03.006), [Database](https://www.ccdc.cam.ac.uk/Community/csd-community/csd-mof-collection/))
- COFs: CURATED COFs ([Paper](https://pubs.acs.org/doi/10.1021/acscentsci.9b00619), [Database](https://github.com/danieleongari/CURATED-COFs)), CoRE COFs ([Paper](https://doi.org/10.1016/j.ces.2017.05.004), [Database](https://github.com/core-cof/CoRE-COF-Database))
- zeolites: IZA database ([Database](http://www.iza-structure.org/databases/))

### hypothetical

- MOFs: B&W ([Paper](https://www.nature.com/articles/s41586-019-1798-7), [Database](https://doi.org/10.24435/materialscloud:2018.0016/v3)), ToBaCCo ([Paper](https://pubs.acs.org/doi/abs/10.1021/acs.cgd.7b00848), [Database](https://mof.tech.northwestern.edu/databases), [Code](https://github.com/tobacco-mofs/tobacco_3.0)), hMOFs ([Paper](https://www.nature.com/articles/nchem.1192), [Database](https://mof.tech.northwestern.edu/databases)), Anderson et al. ([Paper](https://chemrxiv.org/articles/preprint/Deep_Learning_Combined_with_IAST_to_Screen_Thermodynamically_Feasible_MOFs_for_Adsorption-Based_Separation_of_Multiple_Binary_Mixtures/14122901/1), [Database](https://osf.io/7dgvy/)), MOF-5 analogues ([Paper](https://doi.org/10.1021/jp401920y), [Database](http://www.nanoporousmaterials.org/databases/)), PORMAKE ([Paper](https://doi.org/10.1021/acsami.1c02471), [Code](https://github.com/Sangwon91/PORMAKE))
- COFs: Mercado et al. ([Paper](https://doi.org/10.1021/acs.chemmater.8b01425), [Database](https://archive.materialscloud.org/record/2018.0003/v1)), Haranczyk's 3D COF database ([Paper](https://pubs.acs.org/doi/10.1021/jp507152j), [Database](http://www.nanoporousmaterials.org/databases/))

## labeled porous materials for supervised learning

| material class | target y | features x provided? | Reference | size of data set |
| -- | -- | -- | -- | -- |
| MOFs (hypothetical) | CO2, N2 adsorption (sim) | yes | [Paper](https://www.nature.com/articles/s41586-019-1798-7), [Database](https://doi.org/10.24435/materialscloud:2018.0016/v3) | ca. 325,000 |
| MOFs (experimental and hypothetical) | Band gaps, density of states, charge densities (sim) | yes | [Paper](https://doi.org/10.1016/j.matt.2021.02.015), [Database](https://github.com/arosen93/QMOF) | ca. 18,000 |
| MOFs (experimental) | Color (exp) | yes | [Paper](https://doi.org/10.1039/D0SC05337F), [Database](https://doi.org/10.24435/materialscloud:cc-j6) | ? |
| COFs (hypothetical) | CH4 deliverable capacity (sim) | yes, hand-crafted features provided. | [Paper](https://doi.org/10.1021/acs.chemmater.8b01425), [Database](https://archive.materialscloud.org/2018.0003/v3) | ca. 70,000 |
| COFs (experimental) | CH4, H2, O2, Xe, Kr, H2S adsorption (sim) | ? | [Paper](https://doi.org/10.1021/acscentsci.0c00988) | ca. 500 |
| MOFs (hypothetical) | H2 adsorption (sim) | yes | [Paper](https://www.sciencedirect.com/science/article/pii/S2666389921001240#bib70), [Database](https://datahub.hymarc.org/dataset/computational-prediction-of-hydrogen-storage-capacities-in-mofs) | ca. 100,000 |
| MOFs (experimental) | thermal stability, solvent removal stability | yes (RAC & geometric features) | [Paper](https://www.nature.com/articles/s41597-022-01181-0), [Database](https://zenodo.org/record/5737968#.YjNo6lRlAuU) | ca. 2,000-3,000 (extracted from the experimental literature) |
| MOFs (experimental) | CO2, H2O DFT-calculated adsorption energy | no | [Paper](https://arxiv.org/pdf/2311.00341.pdf), [Database](https://open-dac.github.io/) | ca. 8,400 MOFs, but 38M DFT calculations |

## labeled nodes for supervised learning

| material class | target y | Reference | size of data set (# materials) |
| -- | -- | -- | -- |
| MOFs (experimental) | DDEC6 charges on atoms (sim) | [Paper](https://doi.org/10.1021/acs.chemmater.5b03836), [Database](https://zenodo.org/record/3986573#.XzfKiJMzY8N) | ca. 3,000 |
| MOFs (experimental and hypothetical) | DDEC6/CM5/Bader charges on atoms (sim) | [Paper](https://doi.org/10.1016/j.matt.2021.02.015), [Database](https://github.com/arosen93/QMOF) | ca. 18,000 (DDEC6/CM5), ca. 5,000 (Bader) |
| MOFs (experimental and hypothetical) | Effective bond orders on atoms (sim) | [Paper](https://doi.org/10.1016/j.matt.2021.02.015), [Database](https://github.com/arosen93/QMOF) | ca. 18,000 |
| MOFs (experimental) | Formal oxidation states on atoms (exp) | [Paper](https://chemrxiv.org/articles/preprint/Using_Collective_Knowledge_to_Assign_Oxidation_States/11604129/1), [Database](https://doi.org/10.24435/materialscloud:dq-ey) | ca. 49,000 |

# other data sets, pertaining to materials, for machine learning

see `matminer` [here](https://hackingmaterials.lbl.gov/matminer/dataset_summary.html).
pointed out by [Jack Evans](https://twitter.com/jackevansADL/status/1439730395570851841).

# construct your own crystal structures!
here is a list of open-source codes for building your own crystal structure models.

| name of code | link to code | link to associated paper |
| -- | -- | -- |
| `tobacco` | [link](https://github.com/tobacco-mofs/tobacco_3.0) | [link](https://pubs.acs.org/doi/abs/10.1021/acs.cgd.7b00848) |
| `pormake` | [link](https://github.com/Sangwon91/PORMAKE) | [link](https://pubs.acs.org/doi/full/10.1021/acsami.1c02471) |
| `ToBasCCo` | [link](https://github.com/peteboyd/tobascco) | [link](https://www.nature.com/articles/s41586-019-1798-7?proof=tNature) |
| `Zeo++` | [link](http://www.zeoplusplus.org/) | [link](https://pubs.acs.org/doi/abs/10.1021/cg500158c) |
| `stk` | [link](https://github.com/lukasturcani/stk) | [link](https://aip.scitation.org/doi/10.1063/5.0049708) |
| `PoreMatMod.jl` (only modifies) | [link](https://github.com/SimonEnsemble/PoreMatMod.jl) | [link](https://pubs.acs.org/doi/10.1021/acs.jcim.1c01219) |
| `pyCOFBuilder` | [link](https://github.com/lipelopesoliveira/pyCOFBuilder) | [link](https://arxiv.org/abs/2310.14822) |

--------------------------------------------------------------------------------
/nott300.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/SimonEnsemble/porous-material-AI-gym/b193bfc3682411e252b3ce71effaaf2a677ad290/nott300.png
--------------------------------------------------------------------------------