├── .gitattributes
├── .gitignore
├── LICENSE
├── README.md
├── benchmarks.md
├── data_sources.md
├── database_tools
    ├── README.md
    ├── ase_formatter.py
    ├── check_dist.py
    ├── cifs_to_xyz.py
    ├── deduplicate.py
    ├── false_terminal_oxo_checker.py
    ├── lone_atom_check.py
    ├── make_primitive.py
    └── xyz_to_cifs.py
├── logo.png
├── logo_white.png
├── machine_learning
    ├── README.md
    ├── cgcnn
    │   ├── LICENSE
    │   ├── README.md
    │   ├── cgcnn
    │   │   ├── __init__.py
    │   │   ├── data.py
    │   │   └── model.py
    │   ├── data
    │   │   └── example
    │   │   │   ├── 1.cif
    │   │   │   ├── 2.cif
    │   │   │   ├── 3.cif
    │   │   │   ├── atom_init.json
    │   │   │   └── id_prop.csv
    │   ├── main.py
    │   ├── predict.py
    │   ├── run_predict.sh
    │   └── run_train.sh
    ├── he_stoichiometric_45
    │   ├── README.md
    │   ├── stoich45_feature_generator.py
    │   ├── stoich45_krr.py
    │   ├── stoich45_learning_curves.py
    │   └── tabulated_data
    │   │   ├── atomic_number.csv
    │   │   ├── boiling_point.csv
    │   │   ├── density.csv
    │   │   ├── electron_affinity.csv
    │   │   ├── electronegativity.csv
    │   │   ├── group.csv
    │   │   ├── ionization_energy.csv
    │   │   ├── melting_point.csv
    │   │   └── period.csv
    ├── meredig_stoichiometric_120
    │   ├── README.md
    │   ├── stoich120_feature_generator.py
    │   ├── stoich120_krr.py
    │   └── stoich120_learning_curves.py
    ├── orbital_field_matrix
    │   ├── README.md
    │   ├── ofm_feature_generator.py
    │   ├── ofm_krr.py
    │   └── ofm_learning_curves.py
    ├── sine_matrix
    │   ├── README.md
    │   ├── sine_matrix_feature_generator.py
    │   ├── sine_matrix_krr.py
    │   └── sine_matrix_learning_curves.py
    ├── soap_kernel
    │   ├── README.md
    │   ├── soap_avg_kernel_generator.py
    │   ├── soap_krr.py
    │   ├── soap_learning_curves.py
    │   └── soap_matrix_generator.py
    └── umap
    │   ├── README.md
    │   ├── umap_reduction.py
    │   └── umap_reduction_dataset_overlap.py
├── mofid_search.md
├── other
    ├── dft_workflow
    │   ├── README.md
    │   ├── mof_screen
    │   │   ├── LICENSE
    │   │   ├── README.md
    │   │   ├── pymofscreen
    │   │   │   ├── __init__.py
    │   │   │   ├── calc_swaps.py
    │   │   │   ├── cif_handler.py
    │   │   │   ├── compute_environ.py
    │   │   │   ├── default_calculators.py
    │   │   │   ├── error_handler.py
    │   │   │   ├── janitor.py
    │   │   │   ├── kpts_handler.py
    │   │   │   ├── magmom_handler.py
    │   │   │   ├── metal_types.py
    │   │   │   ├── runner.py
    │   │   │   ├── screen.py
    │   │   │   ├── screen_phases.py
    │   │   │   ├── vtst_handler.py
    │   │   │   └── writers.py
    │   │   ├── requirements.txt
    │   │   ├── setup.py
    │   │   └── toc.png
    │   ├── mofpath
    │   │   └── example.cif
    │   └── runner
    │   │   ├── opt.py
    │   │   └── sub_slurm.job
    └── example_dos
    │   ├── GUTYAW
    │       ├── CONTCAR
    │       ├── DOSCAR.gz
    │       ├── EIGENVAL
    │       ├── dos.py
    │       └── vasprun.xml.gz
    │   ├── LOJLAZ-HS
    │       ├── CONTCAR
    │       ├── DOSCAR.gz
    │       ├── EIGENVAL
    │       ├── dos.py
    │       └── vasprun.xml.gz
    │   ├── LOJLAZ-LS
    │       ├── CONTCAR
    │       ├── DOSCAR.gz
    │       ├── EIGENVAL
    │       ├── dos.py
    │       └── vasprun.xml.gz
    │   ├── RAXNEK
    │       ├── CONTCAR
    │       ├── DOSCAR.gz
    │       ├── EIGENVAL
    │       ├── dos.py
    │       └── vasprun.xml.gz
    │   ├── README.md
    │   └── WAQMEJ
    │       ├── CONTCAR
    │       ├── DOSCAR.gz
    │       ├── EIGENVAL
    │       ├── dos.py
    │       └── vasprun.xml.gz
└── updates.md
/.gitattributes:
--------------------------------------------------------------------------------
1 | # Auto detect text files and perform LF normalization
2 | * text=auto
3 |
--------------------------------------------------------------------------------
/.gitignore:
--------------------------------------------------------------------------------
1 | *.pyc
2 | POTCAR
--------------------------------------------------------------------------------
/LICENSE:
--------------------------------------------------------------------------------
1 | MIT License
2 |
3 | Copyright (c) 2020 Andrew S. Rosen
4 |
5 | Permission is hereby granted, free of charge, to any person obtaining a copy
6 | of this software and associated documentation files (the "Software"), to deal
7 | in the Software without restriction, including without limitation the rights
8 | to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
9 | copies of the Software, and to permit persons to whom the Software is
10 | furnished to do so, subject to the following conditions:
11 |
12 | The above copyright notice and this permission notice shall be included in all
13 | copies or substantial portions of the Software.
14 |
15 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
16 | IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
17 | FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
18 | AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
19 | LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
20 | OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
21 | SOFTWARE.
22 |
--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
1 | # QMOF Database
2 |
3 |
4 |
5 | ## Overview
6 |
7 | The Quantum MOF (QMOF) Database is a publicly available dataset of quantum-chemical properties for 20,000+ metal–organic frameworks (MOFs) and coordination polymers derived from high-throughput periodic density functional theory (DFT) calculations.
8 |
9 | ## Explore and Download the QMOF Database
10 |
11 | Much of the data underlying the QMOF Database can be downloaded from [Figshare](https://doi.org/10.6084/m9.figshare.13147324). For additional documentation and supplemental data, refer to the following:
12 |
13 | Downloading the QMOF Database
14 |
15 |
16 | ## Updates
17 |
18 | For a list of version-specific updates, see [updates.md](https://github.com/arosen93/QMOF/blob/main/updates.md).
19 |
20 | ## FAQs
21 |
22 | - Q: Are trajectories available with the QMOF Database? A: No.
23 |
24 | ## Citation
25 |
26 | If you use the QMOF Database, please refer to the following publications. Both should be cited if you are using the dataset with 20k+ structures.
27 |
28 | - A.S. Rosen, S.M. Iyer, D. Ray, Z. Yao, A. Aspuru-Guzik, L. Gagliardi, J.M. Notestein, R.Q. Snurr. "Machine Learning the Quantum-Chemical Properties of Metal–Organic Frameworks for Accelerated Materials Discovery", *Matter*, **4**, 1578-1597 (2021). DOI: [10.1016/j.matt.2021.02.015](https://doi.org/10.1016/j.matt.2021.02.015).
29 | - A.S. Rosen, V. Fung, P. Huck, C.T. O'Donnell, M.K. Horton, D.G. Truhlar, K.A. Persson, J.M. Notestein, R.Q. Snurr. "High-Throughput Predictions of Metal–Organic Framework Electronic Properties: Theoretical Challenges, Graph Neural Networks, and Data Exploration," *npj Comput. Mat.,* **8**, 112 (2022). DOI: [10.1038/s41524-022-00796-6](https://doi.org/10.1038/s41524-022-00796-6).
30 |
31 | ## Licensing
32 |
33 | The data underlying the QMOF Database is made publicly available under a [CC BY 4.0 license](https://creativecommons.org/licenses/by/4.0/). This means you can copy it, share it, adapt it, and do whatever you like with it provided that you give [appropriate credit](https://wiki.creativecommons.org/wiki/License_Versions#Detailed_attribution_comparison_chart) and [indicate any changes](https://wiki.creativecommons.org/wiki/License_Versions#Modifications_and_adaptations_must_be_marked_as_such).
34 |
--------------------------------------------------------------------------------
/benchmarks.md:
--------------------------------------------------------------------------------
1 | # ML Model Results
2 |
3 | ## Overview
4 | When trained on 80% of [QMOF Database v13](https://figshare.com/articles/dataset/QMOF_Database/13147324/13), the testing set ML results for band gap prediction are as follows:
5 |
6 | | Method | MAE (eV) |
7 | | ----------- | ----------- |
8 | | Mean of Dataset | 0.940 |
9 | | Sine Matrix Eigenspectrum | 0.539 ± 0.004 |
10 | | Stoichiometric-45 | 0.459 ± 0.005 |
11 | | Stoichiometric-120 | 0.454 ± 0.006 |
12 | | Orbital Field Matrix | 0.410 ± 0.004 |
13 | | Average SOAP kernel | 0.344 ± 0.004 |
14 | | CGCNN | 0.270 ± 0.011 |
15 |
16 | You can download the corresponding features for QMOF Database v13 [here](https://nuwildcat.sharepoint.com/:f:/s/TGS-QMOF/Es-51y1ZLmlDmoYOemYqArsBMtyAmG5qAs6UBFHh3C968g?e=0PIJsg).
17 |
--------------------------------------------------------------------------------
/data_sources.md:
--------------------------------------------------------------------------------
1 | # Data Sources
2 | Here, we describe the data sources for the crystal structures in the QMOF Database.
3 |
4 | ## Synthesized MOFs
5 | ### CSD
6 | Most of the properties are obtained for DFT-optimized structures based on initial geometries from the [MOF Subset on the CSD](https://pubs.acs.org/doi/full/10.1021/acs.chemmater.7b00441) with their free solvents removed. Note that given the loose definition of a MOF in the CSD MOF Subset, some of these materials may be better classified as coordination polymers. You may wish to filter the materials by void fraction or pore size if you are specifically interested in porous materials.
7 |
8 | All materials taken from the CSD have `_FSR` at the end of their name. The `source` flag is set to `'CSD'`.
9 |
10 | Note: Only DFT-optimized structures are made available (no trajectories or initial structures).
If you wish to obtain the initial structures, you must have an active CSD license and use the CCDC's ConQuest program.
11 |
12 | ### CoRE 2019
13 | Since v8 of the QMOF Database, some properties have been obtained from DFT-optimized structures based on the 2019 CoRE MOF Database uploaded on [Zenodo](https://doi.org/10.5281/zenodo.3677685).
14 | This was done in one of two ways. In the first approach uploaded with v8, a curated subset of CoRE MOFs discussed in [Kancharlapalli et al.](https://pubs.acs.org/doi/abs/10.1021/acs.jctc.0c01229) was adopted to reduce the likelihood of obtaining erroneous crystal structures. In the second approach uploaded with v9, a curated subset of CoRE MOFs presented in [Chen and Manz](https://pubs.rsc.org/en/content/articlehtml/2020/ra/d0ra02498h) was adopted, after removing entries with CSD-identified charge-balancing ions, to reduce the likelihood of obtaining erroneous crystal structures. In both cases, only the FSR (free solvent removed) version of the CoRE MOF Database was used.
15 |
16 | All materials taken directly from the CoRE MOF Database have `core` at the start of their name. The `source` flag is set to `'CoRE'`.
17 |
18 | ### Pyrene MOFs
19 | A few pyrene-containing MOFs were taken from [prior work](https://pubs.rsc.org/en/content/articlehtml/2021/cs/d0cs00424c) using the data uploaded to the [Materials Cloud](https://doi.org/10.24435/materialscloud:z5-ct).
20 |
21 | All materials taken from this dataset have `pyrene` in their name. The `source` flag is set to `'Pyrene'`.
22 |
23 | ## Hypothetical MOFs
24 | Since v6 of the QMOF Database, hypothetical MOFs from various sources have been included.
25 |
26 | ### ToBaCCo
27 | Several of the MOFs in the database were constructed using the [ToBaCCo code](https://github.com/tobacco-mofs/tobacco_3.0) described in a series of prior studies [here](https://pubs.acs.org/doi/abs/10.1021/acs.cgd.7b00848) and [here](https://pubs.rsc.org/en/content/articlehtml/2019/ce/c8ce01637b).
These are described below.
28 |
29 | #### Anderson and Gómez-Gualdrón
30 | A database of ToBaCCo-constructed Zr-MOFs was adopted from [prior work](https://aip.scitation.org/doi/full/10.1063/5.0048736) by Anderson and Gómez-Gualdrón. See [here](https://osf.io/7dgvy/) for the dataset. We also exchanged the Zr species for Hf to include hypothetical Hf-MOFs as well.
31 |
32 | All materials taken from this dataset have `tobacco` and `_SR_` in their names. The `source` flag is set to `'Anderson'`.
33 |
34 | #### Colón, Gómez-Gualdrón, and Snurr
35 | A database of Cu triangle MOFs was adopted from [prior work](https://pubs.acs.org/doi/abs/10.1021/acs.cgd.7b00848) by Colón, Gómez-Gualdrón, and Snurr. These structures had H atoms in the center of the Cu triangles, which were removed before relaxing their structures with DFT. See [here](https://github.com/snurr-group/tobacco_mofs_mc_0_node) for the dataset.
36 |
37 | All materials taken from this dataset have `tobacco` in their names. The `source` flag is set to `'ToBaCCo'`.
38 |
39 | ### Boyd and Woo
40 | Several of the MOFs in the database were constructed using the [TOBASCCO code](https://github.com/peteboyd/tobascco) as described in prior work by [Boyd and Woo](https://pubs.rsc.org/en/content/articlehtml/2016/ce/c6ce00407e). These were obtained from [prior work](https://www.nature.com/articles/s41586-019-1798-7) by Boyd et al. using the Materials Cloud dataset [here](https://doi.org/10.24435/materialscloud:2018.0016/v3). We made modifications to several of these MOFs prior to structure relaxation to diversify our dataset. For instance, we occasionally exchanged the metals in the inorganic node, and we constructed Al rod MOFs by exchanging the metals in the pre-existing V rod MOFs and protonating the bridging oxo ligands.
41 |
42 | All materials taken from this dataset have `boydwoo` in their names. The `source` flag is set to `'BoydWoo'`.
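The name-to-`source` conventions listed in this file can be expressed as a small lookup. The following is only a minimal sketch, not part of the QMOF tooling: the helper name `infer_source`, the order in which the rules are checked, and the example names are all assumptions made for illustration. Only the substrings and `source` values come from this file.

```python
from typing import Optional


def infer_source(name: str) -> Optional[str]:
    """Guess the `source` flag from a structure name (hypothetical helper).

    The rule order is an assumption: more specific patterns are tested
    before more general ones (e.g. `tobacco` + `_SR_` before `tobacco`).
    """
    if name.startswith("core"):
        return "CoRE"
    if "pyrene" in name:
        return "Pyrene"
    if "tobacco" in name and "_SR_" in name:
        return "Anderson"
    if "tobacco" in name:
        return "ToBaCCo"
    if "boydwoo" in name:
        return "BoydWoo"
    if "gmof" in name:
        return "GMOF"
    if "MOF-5" in name:
        return "Haranczyk_MOF5"
    if "mof74" in name:
        return "Haranczyk_MOF74"
    if name.endswith("_FSR"):
        return "CSD"
    return None  # no documented convention matched


# Made-up names, used purely to exercise the rules
for example in ["EXAMPLE_FSR", "core_EXAMPLE_FSR", "tobacco_SR_example", "unknown"]:
    print(example, "->", infer_source(example))
```

If the `source` flag is available in the data you downloaded, prefer it over name-based guessing; the sketch is mainly useful as a sanity check.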
43 |
44 | ### GMOFs
45 | Hypothetical MOFs were also taken from the genomic MOF (GMOF) database [here](https://figshare.com/s/ec378d7315581e48f1e4). All materials taken from this dataset have `gmof` in their names. The `source` flag is set to `'GMOF'`.
46 |
47 | ### Haranczyk Datasets
48 | Several of the MOFs in the database were obtained from prior work by Maciej Haranczyk and coworkers.
49 |
50 | #### MOF-5
51 | Hypothetical MOF-5 analogues were obtained from [prior work](https://pubs.acs.org/doi/abs/10.1021/jp401920y) by Haranczyk and colleagues. See [here](http://nanoporousmaterials.org/databases/) for the dataset.
52 |
53 | All materials taken from this dataset have `MOF-5` in their names. The `source` flag is set to `'Haranczyk_MOF5'`.
54 |
55 | #### MOF-74
56 | Hypothetical MOF-74 analogues were obtained from [prior work](https://pubs.rsc.org/en/content/articlehtml/2016/sc/c6sc01477a) by Haranczyk and colleagues.
57 |
58 | All materials taken from this dataset have `mof74` in their names. The `source` flag is set to `'Haranczyk_MOF74'`.
59 |
--------------------------------------------------------------------------------
/database_tools/README.md:
--------------------------------------------------------------------------------
1 | These tools can be used to filter a set of CIFs obtained from the [MOF subset of the Cambridge Structural Database](https://sites.google.com/view/csdmofsubset/home). A brief description of each file is shown below:
2 |
3 | `ase_formatter.py`: This script reads in a list of CIFs with ASE and writes them back out again. This seems a bit silly, but it's because the default formatting of the CIFs obtained from the CSD with ConQuest is not immediately suitable for use with a variety of Python packages like Pymatgen. I recommend running this script first.
4 |
5 | `make_primitive.py`: This script converts a list of CIFs to their Niggli-reduced primitive cells. I recommend running this second.
6 |
7 | `check_dist.py`: This script will check for small interatomic distances.
8 |
9 | `lone_atom_check.py`: This script will check for lone atoms in the framework, as determined using Pymatgen's `CrystalNN` tool.
10 |
11 | `deduplicate.py`: This script will de-duplicate a list of CIFs by using Pymatgen's `StructureMatcher` utility.
12 |
13 | `false_terminal_oxo_checker.py`: This script will check for missing H atoms on "terminal" metal-oxo species that should be terminal OH groups or terminal H2O groups.
14 |
15 | `cifs_to_xyz.py`: This script converts a folder of CIFs to an ASE-formatted appended XYZ file and refcodes `.csv` file.
16 |
17 | `xyz_to_cifs.py`: This script converts an ASE-formatted appended XYZ file to a folder of CIFs.
18 |
--------------------------------------------------------------------------------
/database_tools/ase_formatter.py:
--------------------------------------------------------------------------------
1 | from ase.io import read, write
2 | import os
3 |
4 | cif_path = r'/path/to/CIFs'
5 | cifs = os.listdir(cif_path)
6 | cifs.sort()
7 |
8 | for cif in cifs:
9 |     mof = read(os.path.join(cif_path, cif))
10 |     write(os.path.join(cif_path, cif), mof)
11 |
--------------------------------------------------------------------------------
/database_tools/check_dist.py:
--------------------------------------------------------------------------------
1 | from ase.io import read
2 | import os
3 | import numpy as np
4 |
5 | cutoff = 0.75 # interatomic distance threshold
6 | folder = r'/path/to/CIFs'
7 | bad_list = []
8 | for cif in os.listdir(folder):
9 |     mof = read(os.path.join(folder, cif))
10 |     d = mof.get_all_distances()
11 |     upper_diag = d[np.triu_indices_from(d, k=1)]
12 |     for entry in upper_diag:
13 |         if entry < cutoff:
14 |             print('Interatomic distance issue:' + cif.split('.')[0])
15 |             bad_list.append(cif)
16 |             break
17 |
18 | with open('bad_cifs_distance_check.txt','w') as w:
19 |     for bad_cif in bad_list:
20 |         w.write(bad_cif+'\n')
21 |
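The per-structure test in `check_dist.py` comes down to asking whether any pair of distinct atoms is closer than `cutoff`. Below is a minimal, dependency-free sketch of that test on a precomputed distance matrix. The helper name `has_short_contact` and the toy matrix are hypothetical and are not part of the repository.

```python
from itertools import combinations


def has_short_contact(dist, cutoff=0.75):
    """Return True if any pair of distinct atoms is closer than `cutoff`.

    `dist` is a square, symmetric matrix of pairwise distances (nested
    lists or a NumPy array); only the i < j pairs are inspected, which
    mirrors the upper-triangle check in check_dist.py.
    """
    n = len(dist)
    return any(dist[i][j] < cutoff for i, j in combinations(range(n), 2))


# Toy 3-atom distance matrix: atoms 0 and 2 are 0.6 apart
toy = [
    [0.0, 1.5, 0.6],
    [1.5, 0.0, 2.0],
    [0.6, 2.0, 0.0],
]
print(has_short_contact(toy))
```

One thing that may be worth considering for periodic structures: ASE's `Atoms.get_all_distances` accepts a `mic` argument that defaults to `False`, so passing `mic=True` would also flag short contacts across cell boundaries.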
--------------------------------------------------------------------------------
/database_tools/cifs_to_xyz.py:
--------------------------------------------------------------------------------
1 | from ase.io import read, write
2 | import os
3 |
4 | cif_path = r'/path/to/CIFs'
5 | cifs = os.listdir(cif_path)
6 | cifs.sort()
7 |
8 | refcodes = []
9 | mofs = []
10 | for cif in cifs:
11 |     refcodes.append(cif.split('.cif')[0])
12 |     mofs.append(read(os.path.join(cif_path, cif)))
13 | write('mofs.xyz', mofs)
14 |
15 | with open('refcodes.csv','w') as w:
16 |     for refcode in refcodes:
17 |         if refcode == refcodes[-1]:
18 |             w.write(refcode)
19 |         else:
20 |             w.write(refcode+',')
21 |
--------------------------------------------------------------------------------
/database_tools/deduplicate.py:
--------------------------------------------------------------------------------
1 | from pymatgen.core import Structure
2 | from pymatgen.analysis import structure_matcher
3 | import os
4 |
5 | folder = r'/folder/of/cifs' #folder of CIFs to de-duplicate
6 | new_folder = r'/new/folder/to/save/cifs' #folder to save only unique CIFs
7 |
8 | mofs = [] #initialize list to store Pymatgen structures
9 | entries = os.listdir(folder) #get all CIFs
10 | entries.sort() #alphabetical sort
11 |
12 | #for every CIF, store Pymatgen Structure in list
13 | for entry in entries:
14 |
15 |     if '.cif' not in entry:
16 |         continue
17 |
18 |     #read CIF
19 |     mof_temp = Structure.from_file(os.path.join(folder,entry),primitive=False)
20 |
21 |     #tag Pymatgen structure with its name
22 |     mof_temp.name = entry
23 |     mofs.append(mof_temp)
24 |
25 | #Initialize StructureMatcher
26 | sm = structure_matcher.StructureMatcher(primitive_cell=True)
27 |
28 | #Group structures
29 | groups = sm.group_structures(mofs)
30 | print(str(len(groups))+' unique out of '+str(len(entries))+' total')
31 |
32 | #Write out set of only unique CIFs
33 | if not os.path.exists(new_folder):
34 |     os.mkdir(new_folder)
35 | for group in groups:
36 |     mof_temp = group[0]
37 |     mof_temp.to(filename=os.path.join(new_folder,mof_temp.name))
38 |
--------------------------------------------------------------------------------
/database_tools/false_terminal_oxo_checker.py:
--------------------------------------------------------------------------------
1 | from ase import neighborlist
2 | from ase.io import read
3 | import os
4 | import numpy as np
5 | import warnings
6 |
7 | # Metals that should not have terminal oxo ligands
8 | metals = ['Li','Na','K','Rb','Cs','Fr',
9 |     'Be','Mg','Ca','Sr','Ba','Ra',
10 |     'Sc','Y','La','Ac',
11 |     'Ti','Zr','Hf',
12 |     'Mn',
13 |     'Fe',
14 |     'Co',
15 |     'Ni',
16 |     'Cu','Ag',
17 |     'Zn','Cd',
18 |     'Al','Ga','In','Tl']
19 |
20 | # Path to CIFs
21 | p = r'/path/to/CIFs'
22 |
23 | # Get CIFs from folder
24 | cifs = os.listdir(p)
25 | cifs = [cif for cif in cifs if '.cif' in cif]
26 | cifs.sort()
27 |
28 | # Check every CIF
29 | bad_list = []
30 | for cif in cifs:
31 |
32 |     bad = False
33 |
34 |     # Read in CIF, ignoring ASE warnings
35 |     with warnings.catch_warnings():
36 |         warnings.simplefilter('ignore')
37 |         structure = read(os.path.join(p,cif))
38 |
39 |     # Get list of atomic symbols
40 |     syms = np.array(structure.get_chemical_symbols())
41 |
42 |     # Is one of the specified metals in this MOF
43 |     if not any(item in syms for item in metals):
44 |         continue
45 |
46 |     # Initialize neighbor list
47 |     cutoff = neighborlist.natural_cutoffs(structure)
48 |     nl = neighborlist.NeighborList(cutoff,self_interaction=False,bothways=True)
49 |     nl.update(structure)
50 |
51 |     # For every site, check if it is a terminal metal-oxo
52 |     for i, sym in enumerate(syms):
53 |
54 |         # Confirm site is in pre-specified metal list
55 |         if sym not in metals:
56 |             continue
57 |
58 |         # Get neighbors to metal
59 |         bonded_atom_indices = nl.get_neighbors(i)[0]
60 |         if bonded_atom_indices is None:
61 |             continue
62 |         bonded_atom_symbols = syms[bonded_atom_indices]
63 |
64 |         # For every neighbor, check if it's a terminal oxo
65 |         for j, bonded_atom_symbol in enumerate(bonded_atom_symbols):
66 |
67 |             # Confirm neighbor is an O atom
68 |             if bonded_atom_symbol != 'O':
69 |                 continue
70 |
71 |             # Check if the O atom is only bound to the metal
72 |             cn = len(nl.get_neighbors(bonded_atom_indices[j])[0])
73 |             if cn == 1:
74 |                 bad = True
75 |                 print('Missing H on terminal oxo: ' + cif)
76 |                 bad_list.append(cif)
77 |
78 |             if bad:
79 |                 break
80 |         if bad:
81 |             break
82 |
83 | with open('bad_cifs_oxo_check.txt','w') as w:
84 |     for bad_cif in bad_list:
85 |         w.write(bad_cif + '\n')
86 |
--------------------------------------------------------------------------------
/database_tools/lone_atom_check.py:
--------------------------------------------------------------------------------
1 | from pymatgen.analysis.graphs import StructureGraph
2 | from pymatgen.analysis import local_env
3 | from pymatgen.core import Structure
4 | import os
5 |
6 | folder = r'/path/to/CIFs'
7 | cifs = os.listdir(folder)
8 | cifs.sort()
9 | bad_list = []
10 | for cif in cifs:
11 |     mof = Structure.from_file(os.path.join(folder, cif))
12 |     nn = local_env.CrystalNN()
13 |     graph = StructureGraph.with_local_env_strategy(mof, nn)
14 |     for j in range(len(mof)):
15 |         nbr = graph.get_connected_sites(j)
16 |         if not nbr:
17 |             print('Lone atom issue:' + cif+'\n')
18 |             bad_list.append(cif)
19 |             break
20 |
21 | with open('bad_cifs_lone_atom_check.txt','w') as w:
22 |     for bad_cif in bad_list:
23 |         w.write(bad_cif+'\n')
24 |
--------------------------------------------------------------------------------
/database_tools/make_primitive.py:
--------------------------------------------------------------------------------
1 | from pymatgen.core import Structure
2 | import os
3 |
4 | folder = r'/path/to/CIFs'
5 | for entry in os.listdir(folder):
6 |     structure = Structure.from_file(os.path.join(folder,entry),primitive=True)
7 |     structure.to(filename=os.path.join(folder,entry))
8 |
--------------------------------------------------------------------------------
/database_tools/xyz_to_cifs.py:
--------------------------------------------------------------------------------
1 | from ase.io import read, write
2 | import numpy as np
3 | import os
4 |
5 | # Converts an appended .xyz to a folder of CIFs
6 |
7 | # Relevant filenames
8 | refcode_path = r'/path/to/refcodes.csv' # path to refcodes
9 | xyz_path = r'/path/to/geometries.xyz' # path to XYZ of all structures
10 | new_folder = r'cifs' # path to new folder to store CIFs
11 |
12 | # ----------------------
13 | refs = np.genfromtxt(refcode_path,delimiter=',',dtype=str)
14 | mofs = read(xyz_path,index=':')
15 |
16 | if not os.path.exists(new_folder):
17 |     os.mkdir(new_folder)
18 | for i, mof in enumerate(mofs):
19 |     write(os.path.join(new_folder,refs[i]+'.cif'),mof)
20 |
--------------------------------------------------------------------------------
/logo.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/Andrew-S-Rosen/QMOF/d21e83843dc2b002836c130c3fad84063bf04af6/logo.png
--------------------------------------------------------------------------------
/logo_white.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/Andrew-S-Rosen/QMOF/d21e83843dc2b002836c130c3fad84063bf04af6/logo_white.png
--------------------------------------------------------------------------------
/machine_learning/README.md:
--------------------------------------------------------------------------------
1 | In the following folders, you can train various machine learning models and carry out a UMAP dimensionality reduction. These scripts take in the following formatted data:
2 | - A list of appended XYZs (constructed using ASE) for the structures under investigation.
3 | - A .csv of refcodes that correspond to the above structures.
4 | - A .csv of property (in this case, band gap) data with refcodes in the first column and band gap data in a column named 'BG_PBE'.
5 |
6 | The scripts in this folder make use of [DScribe](https://github.com/SINGROUP/dscribe), [matminer](https://github.com/hackingmaterials/matminer), [UMAP](https://github.com/lmcinnes/umap), [ASE](https://gitlab.com/ase/ase), [Pymatgen](https://pymatgen.org/), [scikit-learn](https://github.com/scikit-learn/scikit-learn), [PyTorch](https://github.com/pytorch/pytorch), and their respective dependencies.
--------------------------------------------------------------------------------
/machine_learning/cgcnn/LICENSE:
--------------------------------------------------------------------------------
1 | MIT License
2 |
3 | Copyright (c) 2018 Tian Xie
4 |
5 | Permission is hereby granted, free of charge, to any person obtaining a copy
6 | of this software and associated documentation files (the "Software"), to deal
7 | in the Software without restriction, including without limitation the rights
8 | to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
9 | copies of the Software, and to permit persons to whom the Software is
10 | furnished to do so, subject to the following conditions:
11 |
12 | The above copyright notice and this permission notice shall be included in all
13 | copies or substantial portions of the Software.
14 |
15 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
16 | IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
17 | FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
18 | AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
19 | LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
20 | OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
21 | SOFTWARE.
22 |
--------------------------------------------------------------------------------
/machine_learning/cgcnn/README.md:
--------------------------------------------------------------------------------
1 | Please see https://github.com/arosen93/cgcnn for more details on how to run CGCNN. The `run_train.sh` and `run_predict.sh` files are sample scripts to train a model and make predictions using a trained model, respectively.
2 |
3 | In general, we recommend using [MatDeepLearn](https://github.com/vxfung/MatDeepLearn) if you are interested in training graph neural networks given its superior features.
--------------------------------------------------------------------------------
/machine_learning/cgcnn/cgcnn/__init__.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/Andrew-S-Rosen/QMOF/d21e83843dc2b002836c130c3fad84063bf04af6/machine_learning/cgcnn/cgcnn/__init__.py
--------------------------------------------------------------------------------
/machine_learning/cgcnn/cgcnn/model.py:
--------------------------------------------------------------------------------
1 | import torch
2 | import torch.nn as nn
3 |
4 |
5 | class ConvLayer(nn.Module):
6 |     """
7 |     Convolutional operation on graphs
8 |     """
9 |
10 |     def __init__(self, atom_fea_len, nbr_fea_len):
11 |         """
12 |         Initialize ConvLayer.
13 |
14 |         Parameters
15 |         ----------
16 |
17 |         atom_fea_len: int
18 |           Number of atom hidden features.
19 |         nbr_fea_len: int
20 |           Number of bond features.
21 |         """
22 |         super(ConvLayer, self).__init__()
23 |         self.atom_fea_len = atom_fea_len
24 |         self.nbr_fea_len = nbr_fea_len
25 |         self.fc_full = nn.Linear(2*self.atom_fea_len+self.nbr_fea_len,
26 |                                  2*self.atom_fea_len)
27 |         self.sigmoid = nn.Sigmoid()
28 |         self.softplus1 = nn.Softplus()
29 |         self.bn1 = nn.BatchNorm1d(2*self.atom_fea_len)
30 |         self.bn2 = nn.BatchNorm1d(self.atom_fea_len)
31 |         self.softplus2 = nn.Softplus()
32 |
33 |     def forward(self, atom_in_fea, nbr_fea, nbr_fea_idx):
34 |         """
35 |         Forward pass
36 |
37 |         N: Total number of atoms in the batch
38 |         M: Max number of neighbors
39 |
40 |         Parameters
41 |         ----------
42 |
43 |         atom_in_fea: Variable(torch.Tensor) shape (N, atom_fea_len)
44 |           Atom hidden features before convolution
45 |         nbr_fea: Variable(torch.Tensor) shape (N, M, nbr_fea_len)
46 |           Bond features of each atom's M neighbors
47 |         nbr_fea_idx: torch.LongTensor shape (N, M)
48 |           Indices of M neighbors of each atom
49 |
50 |         Returns
51 |         -------
52 |
53 |         atom_out_fea: nn.Variable shape (N, atom_fea_len)
54 |           Atom hidden features after convolution
55 |
56 |         """
57 |
58 |         N, M = nbr_fea_idx.shape
59 |         # convolution
60 |         atom_nbr_fea = atom_in_fea[nbr_fea_idx, :]
61 |         total_nbr_fea = torch.cat(
62 |             [atom_in_fea.unsqueeze(1).expand(N, M, self.atom_fea_len),
63 |              atom_nbr_fea, nbr_fea], dim=2)
64 |         total_gated_fea = self.fc_full(total_nbr_fea)
65 |         total_gated_fea = self.bn1(total_gated_fea.view(
66 |             -1, self.atom_fea_len*2)).view(N, M, self.atom_fea_len*2)
67 |         nbr_filter, nbr_core = total_gated_fea.chunk(2, dim=2)
68 |         nbr_filter = self.sigmoid(nbr_filter)
69 |         nbr_core = self.softplus1(nbr_core)
70 |         nbr_sumed = torch.sum(nbr_filter * nbr_core, dim=1)
71 |         nbr_sumed = self.bn2(nbr_sumed)
72 |         out = self.softplus2(atom_in_fea + nbr_sumed)
73 |         return out
74 |
75 |
76 | class CrystalGraphConvNet(nn.Module):
77 |     """
78 |     Create a crystal graph convolutional neural network for predicting total
79 |     material properties.
80 | """ 81 | 82 | def __init__(self, orig_atom_fea_len, nbr_fea_len, 83 | atom_fea_len=64, n_conv=3, h_fea_len=128, n_h=1, 84 | classification=False): 85 | """ 86 | Initialize CrystalGraphConvNet. 87 | 88 | Parameters 89 | ---------- 90 | 91 | orig_atom_fea_len: int 92 | Number of atom features in the input. 93 | nbr_fea_len: int 94 | Number of bond features. 95 | atom_fea_len: int 96 | Number of hidden atom features in the convolutional layers 97 | n_conv: int 98 | Number of convolutional layers 99 | h_fea_len: int 100 | Number of hidden features after pooling 101 | n_h: int 102 | Number of hidden layers after pooling 103 | """ 104 | super(CrystalGraphConvNet, self).__init__() 105 | self.classification = classification 106 | self.embedding = nn.Linear(orig_atom_fea_len, atom_fea_len) 107 | self.convs = nn.ModuleList([ConvLayer(atom_fea_len=atom_fea_len, 108 | nbr_fea_len=nbr_fea_len) 109 | for _ in range(n_conv)]) 110 | self.conv_to_fc = nn.Linear(atom_fea_len, h_fea_len) 111 | self.conv_to_fc_softplus = nn.Softplus() 112 | if n_h > 1: 113 | self.fcs = nn.ModuleList([nn.Linear(h_fea_len, h_fea_len) 114 | for _ in range(n_h-1)]) 115 | self.softpluses = nn.ModuleList([nn.Softplus() 116 | for _ in range(n_h-1)]) 117 | if self.classification: 118 | self.fc_out = nn.Linear(h_fea_len, 2) 119 | else: 120 | self.fc_out = nn.Linear(h_fea_len, 1) 121 | if self.classification: 122 | self.logsoftmax = nn.LogSoftmax(dim=1) 123 | self.dropout = nn.Dropout() 124 | 125 | def forward(self, atom_fea, nbr_fea, nbr_fea_idx, crystal_atom_idx): 126 | """ 127 | Forward pass 128 | 129 | N: Total number of atoms in the batch 130 | M: Max number of neighbors 131 | N0: Total number of crystals in the batch 132 | 133 | Parameters 134 | ---------- 135 | 136 | atom_fea: Variable(torch.Tensor) shape (N, orig_atom_fea_len) 137 | Atom features from atom type 138 | nbr_fea: Variable(torch.Tensor) shape (N, M, nbr_fea_len) 139 | Bond features of each atom's M neighbors 140 | nbr_fea_idx: 
torch.LongTensor shape (N, M) 141 | Indices of M neighbors of each atom 142 | crystal_atom_idx: list of torch.LongTensor of length N0 143 | Mapping from the crystal idx to atom idx 144 | 145 | Returns 146 | ------- 147 | 148 | prediction: nn.Variable shape (N0, 1) for regression, (N0, 2) for classification 149 | Predicted property of each crystal (log-probabilities if classification) 150 | 151 | """ 152 | atom_fea = self.embedding(atom_fea) 153 | for conv_func in self.convs: 154 | atom_fea = conv_func(atom_fea, nbr_fea, nbr_fea_idx) 155 | crys_fea = self.pooling(atom_fea, crystal_atom_idx) 156 | crys_fea = self.conv_to_fc(self.conv_to_fc_softplus(crys_fea)) 157 | crys_fea = self.conv_to_fc_softplus(crys_fea) 158 | if self.classification: 159 | crys_fea = self.dropout(crys_fea) 160 | if hasattr(self, 'fcs') and hasattr(self, 'softpluses'): 161 | for fc, softplus in zip(self.fcs, self.softpluses): 162 | crys_fea = softplus(fc(crys_fea)) 163 | out = self.fc_out(crys_fea) 164 | if self.classification: 165 | out = self.logsoftmax(out) 166 | return out 167 | 168 | def pooling(self, atom_fea, crystal_atom_idx): 169 | """ 170 | Pool atom features into crystal features by averaging over each crystal's atoms 171 | 172 | N: Total number of atoms in the batch 173 | N0: Total number of crystals in the batch 174 | 175 | Parameters 176 | ---------- 177 | 178 | atom_fea: Variable(torch.Tensor) shape (N, atom_fea_len) 179 | Atom feature vectors of the batch 180 | crystal_atom_idx: list of torch.LongTensor of length N0 181 | Mapping from the crystal idx to atom idx 182 | """ 183 | assert sum([len(idx_map) for idx_map in crystal_atom_idx]) ==\ 184 | atom_fea.data.shape[0] 185 | summed_fea = [torch.mean(atom_fea[idx_map], dim=0, keepdim=True) 186 | for idx_map in crystal_atom_idx] 187 | return torch.cat(summed_fea, dim=0) 188 | -------------------------------------------------------------------------------- /machine_learning/cgcnn/data/example/3.cif: -------------------------------------------------------------------------------- 1 | data_image0 2 | _cell_length_a 26.19 3 | _cell_length_b 26.19 
4 | _cell_length_c 6.935 5 | _cell_angle_alpha 90 6 | _cell_angle_beta 90 7 | _cell_angle_gamma 120 8 | 9 | _symmetry_space_group_name_H-M "P 1" 10 | _symmetry_int_tables_number 1 11 | 12 | loop_ 13 | _symmetry_equiv_pos_as_xyz 14 | 'x, y, z' 15 | 16 | loop_ 17 | _atom_site_label 18 | _atom_site_occupancy 19 | _atom_site_fract_x 20 | _atom_site_fract_y 21 | _atom_site_fract_z 22 | _atom_site_thermal_displace_type 23 | _atom_site_B_iso_or_equiv 24 | _atom_site_type_symbol 25 | Zn1 1.0000 0.64159 0.68514 0.47189 Biso 1.000 Zn 26 | Zn2 1.0000 0.31486 0.95645 0.47189 Biso 1.000 Zn 27 | Zn3 1.0000 0.04355 0.35841 0.47189 Biso 1.000 Zn 28 | Zn4 1.0000 0.35841 0.31486 0.52811 Biso 1.000 Zn 29 | Zn5 1.0000 0.68514 0.04355 0.52811 Biso 1.000 Zn 30 | Zn6 1.0000 0.95645 0.64159 0.52811 Biso 1.000 Zn 31 | Zn7 1.0000 0.30826 0.01847 0.80522 Biso 1.000 Zn 32 | Zn8 1.0000 0.98153 0.28978 0.80522 Biso 1.000 Zn 33 | Zn9 1.0000 0.71022 0.69174 0.80522 Biso 1.000 Zn 34 | Zn10 1.0000 0.02508 0.64819 0.86144 Biso 1.000 Zn 35 | Zn11 1.0000 0.35181 0.37688 0.86144 Biso 1.000 Zn 36 | Zn12 1.0000 0.62312 0.97492 0.86144 Biso 1.000 Zn 37 | Zn13 1.0000 0.97492 0.35181 0.13856 Biso 1.000 Zn 38 | Zn14 1.0000 0.64819 0.62312 0.13856 Biso 1.000 Zn 39 | Zn15 1.0000 0.37688 0.02508 0.13856 Biso 1.000 Zn 40 | Zn16 1.0000 0.69174 0.98153 0.19478 Biso 1.000 Zn 41 | Zn17 1.0000 0.01847 0.71022 0.19478 Biso 1.000 Zn 42 | Zn18 1.0000 0.28978 0.30826 0.19478 Biso 1.000 Zn 43 | O1 1.0000 0.96943 0.29776 0.36038 Biso 1.000 O 44 | O2 1.0000 0.70224 0.67167 0.36038 Biso 1.000 O 45 | O3 1.0000 0.32833 0.03057 0.36038 Biso 1.000 O 46 | O4 1.0000 0.03057 0.70224 0.63962 Biso 1.000 O 47 | O5 1.0000 0.29776 0.32833 0.63962 Biso 1.000 O 48 | O6 1.0000 0.67167 0.96943 0.63962 Biso 1.000 O 49 | O7 1.0000 0.63610 0.63109 0.69371 Biso 1.000 O 50 | O8 1.0000 0.36891 0.00500 0.69371 Biso 1.000 O 51 | O9 1.0000 0.99500 0.36390 0.69371 Biso 1.000 O 52 | O10 1.0000 0.69724 0.03557 0.97295 Biso 1.000 O 53 | O11 1.0000 
0.96443 0.66166 0.97295 Biso 1.000 O 54 | O12 1.0000 0.33834 0.30276 0.97295 Biso 1.000 O 55 | O13 1.0000 0.30276 0.96443 0.02705 Biso 1.000 O 56 | O14 1.0000 0.03557 0.33834 0.02705 Biso 1.000 O 57 | O15 1.0000 0.66166 0.69724 0.02705 Biso 1.000 O 58 | O16 1.0000 0.36390 0.36891 0.30629 Biso 1.000 O 59 | O17 1.0000 0.63109 0.99500 0.30629 Biso 1.000 O 60 | O18 1.0000 0.00500 0.63610 0.30629 Biso 1.000 O 61 | O19 1.0000 0.92763 0.23048 0.61020 Biso 1.000 O 62 | O20 1.0000 0.76952 0.69715 0.61020 Biso 1.000 O 63 | O21 1.0000 0.30285 0.07237 0.61020 Biso 1.000 O 64 | O22 1.0000 0.07237 0.76952 0.38980 Biso 1.000 O 65 | O23 1.0000 0.23048 0.30285 0.38980 Biso 1.000 O 66 | O24 1.0000 0.69715 0.92763 0.38980 Biso 1.000 O 67 | O25 1.0000 0.59430 0.56381 0.94353 Biso 1.000 O 68 | O26 1.0000 0.43619 0.03048 0.94353 Biso 1.000 O 69 | O27 1.0000 0.96952 0.40570 0.94353 Biso 1.000 O 70 | O28 1.0000 0.73904 0.10285 0.72313 Biso 1.000 O 71 | O29 1.0000 0.89715 0.63618 0.72313 Biso 1.000 O 72 | O30 1.0000 0.36382 0.26096 0.72313 Biso 1.000 O 73 | O31 1.0000 0.26096 0.89715 0.27687 Biso 1.000 O 74 | O32 1.0000 0.10285 0.36382 0.27687 Biso 1.000 O 75 | O33 1.0000 0.63618 0.73904 0.27687 Biso 1.000 O 76 | O34 1.0000 0.40570 0.43619 0.05647 Biso 1.000 O 77 | O35 1.0000 0.56381 0.96952 0.05647 Biso 1.000 O 78 | O36 1.0000 0.03048 0.59430 0.05647 Biso 1.000 O 79 | O37 1.0000 0.25494 0.94205 0.67383 Biso 1.000 O 80 | O38 1.0000 0.05795 0.31289 0.67383 Biso 1.000 O 81 | O39 1.0000 0.68711 0.74506 0.67383 Biso 1.000 O 82 | O40 1.0000 0.74506 0.05795 0.32617 Biso 1.000 O 83 | O41 1.0000 0.94205 0.68711 0.32617 Biso 1.000 O 84 | O42 1.0000 0.31289 0.25494 0.32617 Biso 1.000 O 85 | O43 1.0000 0.92161 0.27538 0.00716 Biso 1.000 O 86 | O44 1.0000 0.72462 0.64622 0.00716 Biso 1.000 O 87 | O45 1.0000 0.35378 0.07839 0.00716 Biso 1.000 O 88 | O46 1.0000 0.41173 0.39128 0.65950 Biso 1.000 O 89 | O47 1.0000 0.60872 0.02044 0.65950 Biso 1.000 O 90 | O48 1.0000 0.97956 0.58827 0.65950 Biso 1.000 O 
91 | O49 1.0000 0.58827 0.60872 0.34050 Biso 1.000 O 92 | O50 1.0000 0.39128 0.97956 0.34050 Biso 1.000 O 93 | O51 1.0000 0.02044 0.41173 0.34050 Biso 1.000 O 94 | O52 1.0000 0.07839 0.72462 0.99284 Biso 1.000 O 95 | O53 1.0000 0.27538 0.35378 0.99284 Biso 1.000 O 96 | O54 1.0000 0.64622 0.92161 0.99284 Biso 1.000 O 97 | C1 1.0000 0.92667 0.24580 0.42703 Biso 1.000 C 98 | C2 1.0000 0.75420 0.68087 0.42703 Biso 1.000 C 99 | C3 1.0000 0.31913 0.07333 0.42703 Biso 1.000 C 100 | C4 1.0000 0.07333 0.75420 0.57297 Biso 1.000 C 101 | C5 1.0000 0.24580 0.31913 0.57297 Biso 1.000 C 102 | C6 1.0000 0.68087 0.92667 0.57297 Biso 1.000 C 103 | C7 1.0000 0.59334 0.57913 0.76036 Biso 1.000 C 104 | C8 1.0000 0.42087 0.01420 0.76036 Biso 1.000 C 105 | C9 1.0000 0.98580 0.40666 0.76036 Biso 1.000 C 106 | C10 1.0000 0.74000 0.08753 0.90630 Biso 1.000 C 107 | C11 1.0000 0.91247 0.65246 0.90630 Biso 1.000 C 108 | C12 1.0000 0.34754 0.26000 0.90630 Biso 1.000 C 109 | C13 1.0000 0.26000 0.91247 0.09370 Biso 1.000 C 110 | C14 1.0000 0.08753 0.34754 0.09370 Biso 1.000 C 111 | C15 1.0000 0.65246 0.74000 0.09370 Biso 1.000 C 112 | C16 1.0000 0.40666 0.42087 0.23964 Biso 1.000 C 113 | C17 1.0000 0.57913 0.98580 0.23964 Biso 1.000 C 114 | C18 1.0000 0.01420 0.59334 0.23964 Biso 1.000 C 115 | C19 1.0000 0.87918 0.20711 0.29083 Biso 1.000 C 116 | C20 1.0000 0.79289 0.67207 0.29083 Biso 1.000 C 117 | C21 1.0000 0.32793 0.12082 0.29083 Biso 1.000 C 118 | C22 1.0000 0.12082 0.79289 0.70917 Biso 1.000 C 119 | C23 1.0000 0.20711 0.32793 0.70917 Biso 1.000 C 120 | C24 1.0000 0.67207 0.87918 0.70917 Biso 1.000 C 121 | C25 1.0000 0.54585 0.54044 0.62416 Biso 1.000 C 122 | C26 1.0000 0.45956 0.00540 0.62416 Biso 1.000 C 123 | C27 1.0000 0.99460 0.45415 0.62416 Biso 1.000 C 124 | C28 1.0000 0.78749 0.12622 0.04250 Biso 1.000 C 125 | C29 1.0000 0.87378 0.66126 0.04250 Biso 1.000 C 126 | C30 1.0000 0.33874 0.21251 0.04250 Biso 1.000 C 127 | C31 1.0000 0.21251 0.87378 0.95750 Biso 1.000 C 128 | C32 1.0000 
0.12622 0.33874 0.95750 Biso 1.000 C 129 | C33 1.0000 0.66126 0.78749 0.95750 Biso 1.000 C 130 | C34 1.0000 0.45415 0.45956 0.37584 Biso 1.000 C 131 | C35 1.0000 0.54044 0.99460 0.37584 Biso 1.000 C 132 | C36 1.0000 0.00540 0.54585 0.37584 Biso 1.000 C 133 | C37 1.0000 0.21254 0.88926 0.75490 Biso 1.000 C 134 | C38 1.0000 0.11074 0.32328 0.75490 Biso 1.000 C 135 | C39 1.0000 0.67672 0.78746 0.75490 Biso 1.000 C 136 | C40 1.0000 0.78746 0.11074 0.24510 Biso 1.000 C 137 | C41 1.0000 0.88926 0.67672 0.24510 Biso 1.000 C 138 | C42 1.0000 0.32328 0.21254 0.24510 Biso 1.000 C 139 | C43 1.0000 0.87921 0.22259 0.08823 Biso 1.000 C 140 | C44 1.0000 0.77741 0.65661 0.08823 Biso 1.000 C 141 | C45 1.0000 0.34339 0.12079 0.08823 Biso 1.000 C 142 | C46 1.0000 0.45413 0.44407 0.57843 Biso 1.000 C 143 | C47 1.0000 0.55593 0.01005 0.57843 Biso 1.000 C 144 | C48 1.0000 0.98995 0.54587 0.57843 Biso 1.000 C 145 | C49 1.0000 0.54587 0.55593 0.42157 Biso 1.000 C 146 | C50 1.0000 0.44407 0.98995 0.42157 Biso 1.000 C 147 | C51 1.0000 0.01005 0.45413 0.42157 Biso 1.000 C 148 | C52 1.0000 0.12079 0.77741 0.91177 Biso 1.000 C 149 | C53 1.0000 0.22259 0.34339 0.91177 Biso 1.000 C 150 | C54 1.0000 0.65661 0.87921 0.91177 Biso 1.000 C 151 | C55 1.0000 0.16630 0.84759 0.63558 Biso 1.000 C 152 | C56 1.0000 0.15241 0.31871 0.63558 Biso 1.000 C 153 | C57 1.0000 0.68129 0.83370 0.63558 Biso 1.000 C 154 | C58 1.0000 0.83370 0.15241 0.36442 Biso 1.000 C 155 | C59 1.0000 0.84759 0.68129 0.36442 Biso 1.000 C 156 | C60 1.0000 0.31871 0.16630 0.36442 Biso 1.000 C 157 | C61 1.0000 0.83297 0.18092 0.96891 Biso 1.000 C 158 | C62 1.0000 0.81908 0.65204 0.96891 Biso 1.000 C 159 | C63 1.0000 0.34796 0.16703 0.96891 Biso 1.000 C 160 | C64 1.0000 0.50037 0.48574 0.69775 Biso 1.000 C 161 | C65 1.0000 0.51426 0.01462 0.69775 Biso 1.000 C 162 | C66 1.0000 0.98538 0.49963 0.69775 Biso 1.000 C 163 | C67 1.0000 0.49963 0.51426 0.30225 Biso 1.000 C 164 | C68 1.0000 0.48574 0.98538 0.30225 Biso 1.000 C 165 | C69 1.0000 
0.01462 0.50037 0.30225 Biso 1.000 C 166 | C70 1.0000 0.16703 0.81908 0.03109 Biso 1.000 C 167 | C71 1.0000 0.18092 0.34796 0.03109 Biso 1.000 C 168 | C72 1.0000 0.65204 0.83297 0.03109 Biso 1.000 C 169 | H1 1.0000 0.16529 0.85803 0.47918 Biso 1.000 H 170 | H2 1.0000 0.14197 0.30726 0.47918 Biso 1.000 H 171 | H3 1.0000 0.69274 0.83471 0.47918 Biso 1.000 H 172 | H4 1.0000 0.83471 0.14197 0.52082 Biso 1.000 H 173 | H5 1.0000 0.85803 0.69274 0.52082 Biso 1.000 H 174 | H6 1.0000 0.30726 0.16529 0.52082 Biso 1.000 H 175 | H7 1.0000 0.83196 0.19136 0.81251 Biso 1.000 H 176 | H8 1.0000 0.80864 0.64059 0.81251 Biso 1.000 H 177 | H9 1.0000 0.35941 0.16804 0.81251 Biso 1.000 H 178 | H10 1.0000 0.50138 0.47530 0.85415 Biso 1.000 H 179 | H11 1.0000 0.52470 0.02607 0.85415 Biso 1.000 H 180 | H12 1.0000 0.97393 0.49862 0.85415 Biso 1.000 H 181 | H13 1.0000 0.49862 0.52470 0.14585 Biso 1.000 H 182 | H14 1.0000 0.47530 0.97393 0.14585 Biso 1.000 H 183 | H15 1.0000 0.02607 0.50138 0.14585 Biso 1.000 H 184 | H16 1.0000 0.16804 0.80864 0.18749 Biso 1.000 H 185 | H17 1.0000 0.19136 0.35941 0.18749 Biso 1.000 H 186 | H18 1.0000 0.64059 0.83196 0.18749 Biso 1.000 H 187 | -------------------------------------------------------------------------------- /machine_learning/cgcnn/data/example/id_prop.csv: -------------------------------------------------------------------------------- 1 | 1,1.0 2 | 2,2.0 3 | 3,3.0 4 | -------------------------------------------------------------------------------- /machine_learning/cgcnn/run_predict.sh: -------------------------------------------------------------------------------- 1 | python predict.py pre-trained/model_best.pth.tar data/QMOF --print-freq 1 > cgcnn_predict.out 2 | -------------------------------------------------------------------------------- /machine_learning/cgcnn/run_train.sh: -------------------------------------------------------------------------------- 1 | python main.py --batch-size 16 --n-conv 5 --n-h 1 --train-ratio 0.8 
--val-ratio 0.1 --test-ratio 0.1 --workers 4 --epochs 300 --print-freq 1 data/QMOF > cgcnn_train.out 2 | -------------------------------------------------------------------------------- /machine_learning/he_stoichiometric_45/README.md: -------------------------------------------------------------------------------- 1 | `stoich45_feature_generator.py`: Generates Stoichiometric-45 encodings. Inputs: `xyz_path` (.xyz file of concatenated XYZs for each structure), `refcodes_path` (.csv file of corresponding refcodes). 2 | 3 | `stoich45_krr.py`: Trains a KRR model given Stoichiometric-45 fingerprints and property data. Inputs: `fingerprint_path` (path to X fingerprints obtained from `stoich45_feature_generator.py`), `y_path` (path to .csv of property data to train/test on). 4 | 5 | `stoich45_learning_curves.py`: Same as `stoich45_krr.py` but loops over increasing training set sizes. -------------------------------------------------------------------------------- /machine_learning/he_stoichiometric_45/stoich45_feature_generator.py: -------------------------------------------------------------------------------- 1 | import numpy as np 2 | import pandas as pd 3 | from ase.io import read 4 | import os 5 | from scipy.stats.mstats import gmean 6 | 7 | tabulated_data_path = 'tabulated_data' 8 | xyz_path = os.path.join('..','qmof-geometries.xyz') # list of appended XYZs (length N) 9 | refcodes_path = os.path.join('..','qmof-refcodes.csv') # list of refcodes (length N) 10 | 11 | def get_stats(atomic_nums, tabulated_data): 12 | vals = np.empty(len(atomic_nums))*np.nan 13 | for i in range(len(tabulated_data)): 14 | vals[(atomic_nums == (i+1))] = tabulated_data[i] 15 | stdmean = np.mean(vals) 16 | geomean = gmean(np.absolute(vals)) 17 | stdev = np.std(vals) 18 | maxval = np.max(vals) 19 | minval = np.min(vals) 20 | 21 | return stdmean, geomean, stdev, maxval, minval 22 | 23 | refcodes = np.genfromtxt(refcodes_path, delimiter=',', dtype=str) 24 | group_data = 
np.genfromtxt(os.path.join( 25 | tabulated_data_path, 'group.csv'), delimiter=',') 26 | period_data = np.genfromtxt(os.path.join( 27 | tabulated_data_path, 'period.csv'), delimiter=',') 28 | electroneg_data = np.genfromtxt( 29 | os.path.join(tabulated_data_path, 'electronegativity.csv'), delimiter=',') 30 | electron_affin_data = np.genfromtxt( 31 | os.path.join(tabulated_data_path, 'electron_affinity.csv'), delimiter=',') 32 | melting_data = np.genfromtxt(os.path.join( 33 | tabulated_data_path, 'melting_point.csv'), delimiter=',')+273.15 34 | boiling_data = np.genfromtxt(os.path.join( 35 | tabulated_data_path, 'boiling_point.csv'), delimiter=',')+273.15 36 | density_data = np.genfromtxt(os.path.join( 37 | tabulated_data_path, 'density.csv'), delimiter=',') 38 | ionization_data = np.genfromtxt( 39 | os.path.join(tabulated_data_path, 'ionization_energy.csv'), delimiter=',') 40 | 41 | mofs = read(xyz_path, index=':') 42 | data = np.empty((45, len(mofs)))*np.nan 43 | for i, mof in enumerate(mofs): 44 | print('Generating fingerprint: '+str(i)) 45 | 46 | atomic_nums = mof.get_atomic_numbers() 47 | data_vector = np.empty(45)*np.nan 48 | 49 | data_vector[0] = np.mean(atomic_nums) 50 | data_vector[1] = gmean(atomic_nums) 51 | data_vector[2] = np.std(atomic_nums) 52 | data_vector[3] = np.amax(atomic_nums) 53 | data_vector[4] = np.amin(atomic_nums) 54 | 55 | data_vector[5:10] = get_stats(atomic_nums, group_data) 56 | data_vector[10:15] = get_stats(atomic_nums, period_data) 57 | data_vector[15:20] = get_stats(atomic_nums, electroneg_data) 58 | data_vector[20:25] = get_stats(atomic_nums, electron_affin_data) 59 | data_vector[25:30] = get_stats(atomic_nums, melting_data) 60 | data_vector[30:35] = get_stats(atomic_nums, boiling_data) 61 | data_vector[35:40] = get_stats(atomic_nums, density_data) 62 | data_vector[40:45] = get_stats(atomic_nums, ionization_data) 63 | 64 | data[:, i] = data_vector 65 | 66 | df = pd.DataFrame(np.transpose(data), index=refcodes) 67 | 68 | colnames = 
['atomic_num_mean', 'atomic_num_geometric_mean', 'atomic_num_standard_deviation', 'atomic_num_max', 'atomic_num_min', 69 | 'group_num_mean', 'group_num_geometric_mean', 'group_num_standard_deviation', 'group_num_max', 'group_num_min', 70 | 'period_num_mean', 'period_num_geometric_mean', 'period_num_standard_deviation', 'period_num_max', 'period_num_min', 71 | 'electronegativity_mean', 'electronegativity_geometric_mean', 'electronegativity_standard_deviation', 'electronegativity_max', 'electronegativity_min', 72 | 'electron_affinity_mean', 'electron_affinity_geometric_mean', 'electron_affinity_standard_deviation', 'electron_affinity_max', 'electron_affinity_min', 73 | 'melting_mean', 'melting_geometric_mean', 'melting_standard_deviation', 'melting_max', 'melting_min', 74 | 'boiling_mean', 'boiling_geometric_mean', 'boiling_standard_deviation', 'boiling_max', 'boiling_min', 75 | 'density_mean', 'density_geometric_mean', 'density_standard_deviation', 'density_max', 'density_min', 76 | 'ionization_energy_mean', 'ionization_energy_geometric_mean', 'ionization_energy_standard_deviation', 'ionization_energy_max', 'ionization_energy_min'] 77 | 78 | df.columns = colnames 79 | df.index.name = 'MOF' 80 | df.to_csv('stoich45_fingerprints.csv', index=True) 81 | -------------------------------------------------------------------------------- /machine_learning/he_stoichiometric_45/stoich45_krr.py: -------------------------------------------------------------------------------- 1 | import pandas as pd 2 | from sklearn.kernel_ridge import KernelRidge 3 | from sklearn.model_selection import train_test_split 4 | from sklearn.preprocessing import MinMaxScaler 5 | from sklearn.metrics import mean_absolute_error, r2_score 6 | from scipy.stats import spearmanr 7 | import numpy as np 8 | import os 9 | 10 | # Settings 11 | alpha = 0.1 12 | gamma = 0.1 13 | kernel = 'laplacian' # kernel function 14 | test_size = 0.2 # fraction held-out for testing 15 | seed = 42 # random seed 16 | 
fingerprint_path = 'stoich45_fingerprints.csv' # fingerprints (length N) 17 | y_path = os.path.join('..','qmof-bandgaps.csv') # band gaps (length N) 18 | 19 | #--------------------------------------- 20 | #Read in data 21 | df_features = pd.read_csv(fingerprint_path, index_col=0) 22 | df_BG = pd.read_csv(y_path, index_col=0)['BG_PBE'] 23 | df = pd.concat([df_features, df_BG], axis=1, sort=True) 24 | df = df.dropna() 25 | refcodes = df.index 26 | 27 | # Make a training and testing set 28 | train_set, test_set = train_test_split( 29 | df, test_size=test_size, shuffle=True, random_state=seed) 30 | X_train = train_set.loc[:, (df.columns != 'BG_PBE')] 31 | X_test = test_set.loc[:, (df.columns != 'BG_PBE')] 32 | 33 | refcodes_train = X_train.index 34 | refcodes_test = X_test.index 35 | 36 | scaler = MinMaxScaler() 37 | scaler.fit(X_train) 38 | X_train = scaler.transform(X_train) 39 | X_test = scaler.transform(X_test) 40 | 41 | y_train = train_set.loc[:, df.columns == 'BG_PBE'].to_numpy() 42 | y_test = test_set.loc[:, df.columns == 'BG_PBE'].to_numpy() 43 | 44 | # Train and evaluate KRR model 45 | krr = KernelRidge(alpha=alpha, gamma=gamma, kernel=kernel) 46 | krr.fit(X_train, y_train) 47 | y_train_pred = krr.predict(X_train) 48 | y_test_pred = krr.predict(X_test) 49 | 50 | # Save results 51 | df_train = pd.DataFrame(np.concatenate((y_train, y_train_pred), axis=1), columns=[ 52 | 'DFT', 'ML'], index=refcodes_train) 53 | df_train.to_csv('train_results.csv', header=True, index=True) 54 | 55 | df_test = pd.DataFrame(np.concatenate((y_test, y_test_pred), axis=1), columns=[ 56 | 'DFT', 'ML'], index=refcodes_test) 57 | df_test.to_csv('test_results.csv', header=True, index=True) 58 | 59 | print('Train size: ', len(y_train)) 60 | print('Test size: ', len(y_test)) 61 | print('Train/test MAE: ', mean_absolute_error(y_train, y_train_pred), 62 | mean_absolute_error(y_test, y_test_pred)) 63 | print('Train/test r^2: ', r2_score(y_train, y_train_pred), 64 | r2_score(y_test, 
y_test_pred)) 65 | print('Train/test rho: ', spearmanr(y_train, y_train_pred) 66 | [0], spearmanr(y_test, y_test_pred)[0]) 67 | -------------------------------------------------------------------------------- /machine_learning/he_stoichiometric_45/stoich45_learning_curves.py: -------------------------------------------------------------------------------- 1 | import pandas as pd 2 | from sklearn.kernel_ridge import KernelRidge 3 | from sklearn.model_selection import train_test_split 4 | from sklearn.preprocessing import MinMaxScaler 5 | from sklearn.metrics import mean_absolute_error, r2_score 6 | from scipy.stats import spearmanr 7 | import numpy as np 8 | import os 9 | 10 | # Settings 11 | alpha = 0.1 12 | gamma = 0.1 13 | kernel = 'laplacian' # kernel function 14 | test_size = 0.2 # fraction held-out for testing 15 | seeds = [42, 125, 267, 541, 582] # random seeds 16 | train_sizes = [2**7, 2**8, 2**9, 2**10, 2**11, 2**12, 2**13, -1] # train sizes 17 | fingerprint_path = 'stoich45_fingerprints.csv' # fingerprints (length N) 18 | y_path = os.path.join('..','qmof-bandgaps.csv') # band gaps (length N) 19 | 20 | #--------------------------------------- 21 | #Read in data 22 | df_features = pd.read_csv(fingerprint_path, index_col=0) 23 | df_BG = pd.read_csv(y_path, index_col=0)['BG_PBE'] 24 | df = pd.concat([df_features, df_BG], axis=1, sort=True) 25 | df = df.dropna() 26 | refcodes = df.index 27 | 28 | # Make a training and testing set 29 | mae = [] 30 | r2 = [] 31 | rho = [] 32 | mae_std = [] 33 | r2_std = [] 34 | rho_std = [] 35 | for train_size in train_sizes: 36 | mae_test_seeds = [] 37 | r2_test_seeds = [] 38 | rho_test_seeds = [] 39 | for seed in seeds: 40 | train_set, test_set = train_test_split( 41 | df, test_size=test_size, shuffle=True, random_state=seed) 42 | if train_size != -1: 43 | train_set = train_set[0:train_size] 44 | X_train = train_set.loc[:, (df.columns != 'BG_PBE')] 45 | X_test = test_set.loc[:, (df.columns != 'BG_PBE')] 46 | refcodes_train = 
X_train.index 47 | refcodes_test = X_test.index 48 | 49 | scaler = MinMaxScaler() 50 | scaler.fit(X_train) 51 | X_train = scaler.transform(X_train) 52 | X_test = scaler.transform(X_test) 53 | 54 | y_train = train_set.loc[:, df.columns == 'BG_PBE'].to_numpy() 55 | y_test = test_set.loc[:, df.columns == 'BG_PBE'].to_numpy() 56 | 57 | # Train and evaluate KRR model 58 | krr = KernelRidge(alpha=alpha, gamma=gamma, kernel=kernel) 59 | krr.fit(X_train, y_train) 60 | y_train_pred = krr.predict(X_train) 61 | y_test_pred = krr.predict(X_test) 62 | 63 | mae_test_seeds.append(mean_absolute_error(y_test, y_test_pred)) 64 | r2_test_seeds.append(r2_score(y_test, y_test_pred)) 65 | rho_test_seeds.append(spearmanr(y_test, y_test_pred)[0]) 66 | 67 | mae.append(np.average(mae_test_seeds)) 68 | r2.append(np.average(r2_test_seeds)) 69 | rho.append(np.average(rho_test_seeds)) 70 | mae_std.append(np.std(mae_test_seeds)) 71 | r2_std.append(np.std(r2_test_seeds)) 72 | rho_std.append(np.std(rho_test_seeds)) 73 | 74 | print('Training size: ', train_size) 75 | print('Avg. testing MAE: ', np.round(np.average(mae_test_seeds), 3)) 76 | print('Avg. testing r^2: ', np.round(np.average(r2_test_seeds), 3)) 77 | print('Avg. 
testing rho: ', np.round(np.average(rho_test_seeds), 3)) 78 | 79 | np.savetxt('learning_curve_avg.csv',np.vstack([mae,r2,rho]),delimiter=',') 80 | np.savetxt('learning_curve_std.csv',np.vstack([mae_std,r2_std,rho_std]),delimiter=',') -------------------------------------------------------------------------------- /machine_learning/he_stoichiometric_45/tabulated_data/atomic_number.csv: -------------------------------------------------------------------------------- 1 | 1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31,32,33,34,35,36,37,38,39,40,41,42,43,44,45,46,47,48,49,50,51,52,53,54,55,56,57,58,59,60,61,62,63,64,65,66,67,68,69,70,71,72,73,74,75,76,77,78,79,80,81,82,83,84,85,86,87,88,89,90,91,92,93,94,95,96,97,98,99,100,101,102,103,104,105,106,107,108,109,110,111,112,113,114,115,116,117,118 -------------------------------------------------------------------------------- /machine_learning/he_stoichiometric_45/tabulated_data/boiling_point.csv: -------------------------------------------------------------------------------- 1 | -252.87,-268.93,1342.,2470.,4000.,4027.,-195.79,-182.9,-188.12,-246.08,883.,1090.,2519.,2900.,280.5,444.72,-34.04,-185.8,759.,1484.,2830.,3287.,3407.,2671.,2061.,2861.,2927.,2913.,2562.,907.,2204.,2820.,614.,685.,59.,-153.22,688.,1382.,3345.,4409.,4744.,4639.,4265.,4150.,3695.,2963.,2162.,767.,2072.,2602.,1587.,988.,184.3,-108.,671.,1870.,3464.,3360.,3290.,3100.,3000.,1803.,1527.,3250.,3230.,2567.,2700.,2868.,1950.,1196.,3402.,4603.,5458.,5555.,5596.,5012.,4428.,3825.,2856.,356.73,1473.,1749.,1564.,962.,NaN,-61.7,NaN,1737.,3200.,4820.,4000.,3927.,4000.,3230.,2011.,3110.,NaN,NaN,NaN,NaN,NaN,NaN,NaN,NaN,NaN,NaN,NaN,NaN,NaN,NaN,NaN,NaN,NaN,NaN,NaN,NaN,NaN,NaN -------------------------------------------------------------------------------- /machine_learning/he_stoichiometric_45/tabulated_data/density.csv: -------------------------------------------------------------------------------- 1 | 
0.0899,0.1785,535.,1848.,2460.,2260.,1.251,1.429,1.696,0.9,968.,1738.,2700.,2330.,1823.,1960.,3.214,1.784,856.,1550.,2985.,4507.,6110.,7190.,7470.,7874.,8900.,8908.,8960.,7140.,5904.,5323.,5727.,4819.,3120.,3.75,1532.,2630.,4472.,6511.,8570.,10280.,11500.,12370.,12450.,12023.,10490.,8650.,7310.,7310.,6697.,6240.,4940.,5.9,1879.,3510.,6146.,6689.,6640.,7010.,7264.,7353.,5244.,7901.,8219.,8551.,8795.,9066.,9320.,6570.,9841.,13310.,16650.,19250.,21020.,22590.,22560.,21450.,19300.,13534.,11850.,11340.,9780.,9196.,NaN,9.73,NaN,5000.,10070.,11724.,15370.,19050.,20450.,19816.,13670.,13510.,14780.,15100.,NaN,NaN,NaN,NaN,NaN,NaN,NaN,NaN,NaN,NaN,NaN,NaN,NaN,NaN,NaN,NaN,NaN,NaN,NaN,NaN -------------------------------------------------------------------------------- /machine_learning/he_stoichiometric_45/tabulated_data/electron_affinity.csv: -------------------------------------------------------------------------------- 1 | 72.8,0.,59.6,0.,26.7,153.9,7.,141.,328.,0.,52.8,0.,42.5,133.6,71.,200.,349.,0.,48.4,2.37,18.1,7.6,50.6,64.3,0.,15.7,63.7,112.,118.4,0.,28.9,119.,78.,195.,324.6,0.,46.9,5.03,29.6,41.1,86.1,71.9,53.,101.3,109.7,53.7,125.6,0.,28.9,107.3,103.2,190.2,295.2,0.,45.5,13.95,48.,50.,50.,50.,50.,50.,50.,50.,50.,50.,50.,50.,50.,50.,50.,0.,31.,78.6,14.5,106.1,151.,205.3,222.8,0.,19.2,35.1,91.2,183.3,270.1,0.,NaN,NaN,NaN,NaN,NaN,NaN,NaN,NaN,NaN,NaN,NaN,NaN,NaN,NaN,NaN,NaN,NaN,NaN,NaN,NaN,NaN,NaN,NaN,NaN,NaN,NaN,NaN,NaN,NaN,NaN,NaN,NaN -------------------------------------------------------------------------------- /machine_learning/he_stoichiometric_45/tabulated_data/electronegativity.csv: -------------------------------------------------------------------------------- 1 | 
2.2,NaN,0.98,1.57,2.04,2.55,3.04,3.44,3.98,NaN,0.93,1.31,1.61,1.9,2.19,2.58,3.16,NaN,0.82,1.,1.36,1.54,1.63,1.66,1.55,1.83,1.88,1.91,1.9,1.65,1.81,2.01,2.18,2.55,2.96,3.,0.82,0.95,1.22,1.33,1.6,2.16,1.9,2.2,2.28,2.2,1.93,1.69,1.78,1.96,2.05,2.1,2.66,2.6,0.79,0.89,1.1,1.12,1.13,1.14,NaN,1.17,NaN,1.2,NaN,1.22,1.23,1.24,1.25,NaN,1.27,1.3,1.5,2.36,1.9,2.2,2.2,2.28,2.54,2.,1.62,2.33,2.02,2.,2.2,NaN,0.7,0.9,1.1,1.3,1.5,1.38,1.36,1.28,1.3,1.3,1.3,1.3,1.3,1.3,1.3,1.3,NaN,NaN,NaN,NaN,NaN,NaN,NaN,NaN,NaN,NaN,NaN,NaN,NaN,NaN,NaN,NaN -------------------------------------------------------------------------------- /machine_learning/he_stoichiometric_45/tabulated_data/group.csv: -------------------------------------------------------------------------------- 1 | 1,18,1,2,13,14,15,16,17,18,1,2,13,14,15,16,17,18,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,1,2,19,19,19,19,19,19,19,19,19,19,19,19,19,19,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,1,2,19,19,19,19,19,19,19,19,19,19,19,19,19,19,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18 -------------------------------------------------------------------------------- /machine_learning/he_stoichiometric_45/tabulated_data/ionization_energy.csv: -------------------------------------------------------------------------------- 1 | 1312.,2372.3,520.2,899.5,800.6,1086.5,1402.3,1313.9,1681.,2080.7,495.8,737.7,577.5,786.5,1011.8,999.6,1251.2,1520.6,418.8,589.8,633.1,658.8,650.9,652.9,717.3,762.5,760.4,737.1,745.5,906.4,578.8,762.,947.,941.,1139.9,1350.8,403.,549.5,600.,640.1,652.1,684.3,702.,710.2,719.7,804.4,731.,867.8,558.3,708.6,834.,869.3,1008.4,1170.4,375.7,502.9,538.1,534.4,527.,533.1,540.,544.5,547.1,593.4,565.8,573.,581.,589.3,596.7,603.4,523.5,658.5,761.,770.,760.,840.,880.,870.,890.1,1007.1,589.4,715.6,703.,812.1,920.,1037.,380.,509.3,499.,587.,568.,597.6,604.5,584.7,578.,581.,601.,608.,619.,627.,635.,642.,NaN,NaN,NaN,NaN,NaN,NaN,NaN,NaN,NaN,NaN,NaN,NaN,NaN,NaN,NaN,NaN 
-------------------------------------------------------------------------------- /machine_learning/he_stoichiometric_45/tabulated_data/melting_point.csv: -------------------------------------------------------------------------------- 1 | -259.14,NaN,180.54,1287.,2075.,3550.,-210.1,-218.3,-219.6,-248.59,97.72,650.,660.32,1414.,44.2,115.21,-101.5,-189.3,63.38,842.,1541.,1668.,1910.,1907.,1246.,1538.,1495.,1455.,1084.62,419.53,29.76,938.3,817.,221.,-7.3,-157.36,39.31,777.,1526.,1855.,2477.,2623.,2157.,2334.,1964.,1554.9,961.78,321.07,156.6,231.93,630.63,449.51,113.7,-111.8,28.44,727.,919.,798.,931.,1021.,1100.,1072.,822.,1313.,1356.,1412.,1474.,1497.,1545.,819.,1663.,2233.,3017.,3422.,3186.,3033.,2466.,1768.3,1064.18,-38.83,304.,327.46,271.3,254.,302.,-71.,NaN,700.,1050.,1750.,1572.,1135.,644.,640.,1176.,1345.,1050.,900.,860.,1527.,828.,828.,1627.,NaN,NaN,NaN,NaN,NaN,NaN,NaN,NaN,NaN,NaN,NaN,NaN,NaN,NaN,NaN -------------------------------------------------------------------------------- /machine_learning/he_stoichiometric_45/tabulated_data/period.csv: -------------------------------------------------------------------------------- 1 | 1,1,2,2,2,2,2,2,2,2,3,3,3,3,3,3,3,3,4,4,4,4,4,4,4,4,4,4,4,4,4,4,4,4,4,4,5,5,5,5,5,5,5,5,5,5,5,5,5,5,5,5,5,5,6,6,6,6,6,6,6,6,6,6,6,6,6,6,6,6,6,6,6,6,6,6,6,6,6,6,6,6,6,6,6,6,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7,7 -------------------------------------------------------------------------------- /machine_learning/meredig_stoichiometric_120/README.md: -------------------------------------------------------------------------------- 1 | `stoich120_feature_generator.py`: Generates Stoichiometric-120 encodings. Inputs: `xyz_path` (.xyz file of concatenated XYZs for each structure), `refcodes_path` (.csv file of corresponding refcodes). 2 | 3 | `stoich120_krr.py`: Trains a KRR model given Stoichiometric-120 fingerprints and property data. 
Inputs: `fingerprint_path` (path to X fingerprints obtained from `stoich120_feature_generator.py`), `y_path` (path to .csv of property data to train/test on). 4 | 5 | `stoich120_learning_curves.py`: Same as `stoich120_krr.py` but loops over increasing training set sizes. -------------------------------------------------------------------------------- /machine_learning/meredig_stoichiometric_120/stoich120_feature_generator.py: -------------------------------------------------------------------------------- 1 | from matminer.featurizers.composition import Meredig 2 | from ase.io import read 3 | from pymatgen.io import ase as pm_ase 4 | import numpy as np 5 | import pandas as pd 6 | import os 7 | 8 | # Settings 9 | xyz_path = os.path.join('..','qmof-geometries.xyz') # list of appended XYZs (length N) 10 | refcodes_path = os.path.join('..','qmof-refcodes.csv') # list of refcodes (length N) 11 | 12 | #--------------------------------------- 13 | # Read in structures 14 | ase_mofs = read(xyz_path, index=':') 15 | refcodes = np.genfromtxt(refcodes_path, delimiter=',', dtype=str) 16 | adaptor = pm_ase.AseAtomsAdaptor() 17 | pm_mofs = [adaptor.get_structure(ase_mof) for ase_mof in ase_mofs] 18 | 19 | # Initialize feature object 20 | featurizer = Meredig() 21 | features = featurizer.feature_labels() 22 | df = pd.DataFrame(columns=features) 23 | 24 | # Get features 25 | for i, pm_mof in enumerate(pm_mofs): 26 | print('Generating fingerprint: '+str(i)) 27 | fingerprint = featurizer.featurize(pm_mof.composition) 28 | refcode = refcodes[i] 29 | df.loc[refcode, :] = fingerprint 30 | 31 | # Export features 32 | df.index.name = 'MOF' 33 | df.to_csv('stoich120_fingerprints.csv', index=True) 34 | -------------------------------------------------------------------------------- /machine_learning/meredig_stoichiometric_120/stoich120_krr.py: -------------------------------------------------------------------------------- 1 | import pandas as pd 2 | from sklearn.kernel_ridge import 
KernelRidge 3 | from sklearn.model_selection import train_test_split 4 | from sklearn.preprocessing import MinMaxScaler 5 | from sklearn.metrics import mean_absolute_error, r2_score 6 | from scipy.stats import spearmanr 7 | import numpy as np 8 | import os 9 | 10 | # Settings 11 | alpha = 0.1 12 | gamma = 0.1 13 | kernel = 'laplacian' # kernel function 14 | test_size = 0.2 # fraction held-out for testing 15 | seed = 42 # random seed 16 | fingerprint_path = 'stoich120_fingerprints.csv' # fingerprints (length N) 17 | y_path = os.path.join('..','qmof-bandgaps.csv') # band gaps (length N) 18 | 19 | #--------------------------------------- 20 | #Read in data 21 | df_features = pd.read_csv(fingerprint_path, index_col=0) 22 | df_BG = pd.read_csv(y_path, index_col=0)['BG_PBE'] 23 | df = pd.concat([df_features, df_BG], axis=1, sort=True) 24 | df = df.dropna() 25 | refcodes = df.index 26 | 27 | # Make a training and testing set 28 | train_set, test_set = train_test_split( 29 | df, test_size=test_size, shuffle=True, random_state=seed) 30 | X_train = train_set.loc[:, (df.columns != 'BG_PBE')] 31 | X_test = test_set.loc[:, (df.columns != 'BG_PBE')] 32 | 33 | refcodes_train = X_train.index 34 | refcodes_test = X_test.index 35 | 36 | scaler = MinMaxScaler() 37 | scaler.fit(X_train) 38 | X_train = scaler.transform(X_train) 39 | X_test = scaler.transform(X_test) 40 | 41 | y_train = train_set.loc[:, df.columns == 'BG_PBE'].to_numpy() 42 | y_test = test_set.loc[:, df.columns == 'BG_PBE'].to_numpy() 43 | 44 | # Train and evaluate KRR model 45 | krr = KernelRidge(alpha=alpha, gamma=gamma, kernel=kernel) 46 | krr.fit(X_train, y_train) 47 | y_train_pred = krr.predict(X_train) 48 | y_test_pred = krr.predict(X_test) 49 | 50 | # Save results 51 | df_train = pd.DataFrame(np.concatenate((y_train, y_train_pred), axis=1), columns=[ 52 | 'DFT', 'ML'], index=refcodes_train) 53 | df_train.to_csv('train_results.csv', header=True, index=True) 54 | 55 | df_test = pd.DataFrame(np.concatenate((y_test, 
y_test_pred), axis=1), columns=[ 56 | 'DFT', 'ML'], index=refcodes_test) 57 | df_test.to_csv('test_results.csv', header=True, index=True) 58 | 59 | print('Train size: ', len(y_train)) 60 | print('Test size: ', len(y_test)) 61 | print('Train/test MAE: ', mean_absolute_error(y_train, y_train_pred), 62 | mean_absolute_error(y_test, y_test_pred)) 63 | print('Train/test r^2: ', r2_score(y_train, y_train_pred), 64 | r2_score(y_test, y_test_pred)) 65 | print('Train/test rho: ', spearmanr(y_train, y_train_pred) 66 | [0], spearmanr(y_test, y_test_pred)[0]) 67 | -------------------------------------------------------------------------------- /machine_learning/meredig_stoichiometric_120/stoich120_learning_curves.py: -------------------------------------------------------------------------------- 1 | import pandas as pd 2 | from sklearn.kernel_ridge import KernelRidge 3 | from sklearn.model_selection import train_test_split 4 | from sklearn.preprocessing import MinMaxScaler 5 | from sklearn.metrics import mean_absolute_error, r2_score 6 | from scipy.stats import spearmanr 7 | import numpy as np 8 | import os 9 | 10 | # Settings 11 | alpha = 0.1 12 | gamma = 0.1 13 | kernel = 'laplacian' # kernel function 14 | test_size = 0.2 # fraction held-out for testing 15 | seeds = [42, 125, 267, 541, 582] # random seeds 16 | train_sizes = [2**7, 2**8, 2**9, 2**10, 2**11, 2**12, 2**13, -1] # train sizes 17 | fingerprint_path = 'stoich120_fingerprints.csv' # fingerprints (length N) 18 | y_path = os.path.join('..','qmof-bandgaps.csv') # band gaps (length N) 19 | 20 | #--------------------------------------- 21 | #Read in data 22 | df_features = pd.read_csv(fingerprint_path, index_col=0) 23 | df_BG = pd.read_csv(y_path, index_col=0)['BG_PBE'] 24 | df = pd.concat([df_features, df_BG], axis=1, sort=True) 25 | df = df.dropna() 26 | refcodes = df.index 27 | 28 | # Make a training and testing set 29 | mae = [] 30 | r2 = [] 31 | rho = [] 32 | mae_std = [] 33 | r2_std = [] 34 | rho_std = [] 35 | for 
train_size in train_sizes: 36 | mae_test_seeds = [] 37 | r2_test_seeds = [] 38 | rho_test_seeds = [] 39 | for seed in seeds: 40 | train_set, test_set = train_test_split( 41 | df, test_size=test_size, shuffle=True, random_state=seed) 42 | if train_size != -1: 43 | train_set = train_set[0:train_size] 44 | X_train = train_set.loc[:, (df.columns != 'BG_PBE')] 45 | X_test = test_set.loc[:, (df.columns != 'BG_PBE')] 46 | refcodes_train = X_train.index 47 | refcodes_test = X_test.index 48 | 49 | scaler = MinMaxScaler() 50 | scaler.fit(X_train) 51 | X_train = scaler.transform(X_train) 52 | X_test = scaler.transform(X_test) 53 | 54 | y_train = train_set.loc[:, df.columns == 'BG_PBE'].to_numpy() 55 | y_test = test_set.loc[:, df.columns == 'BG_PBE'].to_numpy() 56 | 57 | # Train and evaluate KRR model 58 | krr = KernelRidge(alpha=alpha, gamma=gamma, kernel=kernel) 59 | krr.fit(X_train, y_train) 60 | y_train_pred = krr.predict(X_train) 61 | y_test_pred = krr.predict(X_test) 62 | 63 | mae_test_seeds.append(mean_absolute_error(y_test, y_test_pred)) 64 | r2_test_seeds.append(r2_score(y_test, y_test_pred)) 65 | rho_test_seeds.append(spearmanr(y_test, y_test_pred)[0]) 66 | 67 | mae.append(np.average(mae_test_seeds)) 68 | r2.append(np.average(r2_test_seeds)) 69 | rho.append(np.average(rho_test_seeds)) 70 | mae_std.append(np.std(mae_test_seeds)) 71 | r2_std.append(np.std(r2_test_seeds)) 72 | rho_std.append(np.std(rho_test_seeds)) 73 | 74 | print('Training size: ', train_size) 75 | print('Avg. testing MAE: ', np.round(np.average(mae_test_seeds), 3)) 76 | print('Avg. testing r^2: ', np.round(np.average(r2_test_seeds), 3)) 77 | print('Avg. 
testing rho: ', np.round(np.average(rho_test_seeds), 3)) 78 | 79 | np.savetxt('learning_curve_avg.csv',np.vstack([mae,r2,rho]),delimiter=',') 80 | np.savetxt('learning_curve_std.csv',np.vstack([mae_std,r2_std,rho_std]),delimiter=',') -------------------------------------------------------------------------------- /machine_learning/orbital_field_matrix/README.md: -------------------------------------------------------------------------------- 1 | `ofm_feature_generator.py`: Generates (flattened) orbital field matrices. Inputs: `xyz_path` (.xyz file of concatenated XYZs for each structure), `refcodes_path` (.csv file of corresponding refcodes). 2 | 3 | `ofm_krr.py`: Trains a KRR model given orbital field matrix fingerprints and property data. Inputs: `fingerprint_path` (path to X fingerprints obtained from `ofm_feature_generator.py`), `y_path` (path to .csv of property data to train/test on). 4 | 5 | `ofm_learning_curves.py`: Same as `ofm_krr.py` but loops over increasing training set sizes. 
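The learning-curve scripts all follow the same pattern: for each training-set size, repeat a shuffled train/test split over several seeds, truncate the training set, fit a KRR model, and record the mean and standard deviation of the test error. The following is a minimal sketch of that loop, not the repository's implementation. It assumes synthetic stand-in data instead of the QMOF fingerprints, and it replaces scikit-learn's `KernelRidge` and `MinMaxScaler` with a plain NumPy closed-form equivalent so that it runs with NumPy alone. The helper names `laplacian_kernel`, `krr_fit_predict`, and `learning_curve` are hypothetical.

```python
import numpy as np


def laplacian_kernel(A, B, gamma):
    """exp(-gamma * L1 distance) between every row of A and every row of B."""
    l1 = np.abs(A[:, None, :] - B[None, :, :]).sum(axis=2)
    return np.exp(-gamma * l1)


def krr_fit_predict(X_train, y_train, X_test, alpha=0.1, gamma=0.1):
    """Closed-form kernel ridge regression: solve (K + alpha*I) c = y."""
    K = laplacian_kernel(X_train, X_train, gamma)
    coef = np.linalg.solve(K + alpha * np.eye(len(X_train)), y_train)
    return laplacian_kernel(X_test, X_train, gamma) @ coef


def learning_curve(X, y, train_sizes, seeds, test_size=0.2):
    """Mean and std of the test MAE over seeds, for each training-set size."""
    n_test = int(np.ceil(test_size * len(X)))
    avg, std = [], []
    for train_size in train_sizes:
        maes = []
        for seed in seeds:
            # Shuffled split; the test set is the same for every train_size
            order = np.random.default_rng(seed).permutation(len(X))
            test_idx, train_idx = order[:n_test], order[n_test:]
            if train_size != -1:  # -1 means "use every training sample"
                train_idx = train_idx[:train_size]
            # Min-max scale using statistics from the training subset only
            lo = X[train_idx].min(axis=0)
            span = X[train_idx].max(axis=0) - lo
            span[span == 0] = 1.0
            X_tr = (X[train_idx] - lo) / span
            X_te = (X[test_idx] - lo) / span
            y_pred = krr_fit_predict(X_tr, y[train_idx], X_te)
            maes.append(np.mean(np.abs(y[test_idx] - y_pred)))
        avg.append(np.mean(maes))
        std.append(np.std(maes))
    return np.array(avg), np.array(std)


# Synthetic stand-in data (hypothetical; not the QMOF fingerprints)
rng = np.random.default_rng(0)
X = rng.uniform(size=(300, 5))
y = np.sin(3 * X[:, 0]) + X[:, 1] ** 2
train_sizes = [2**4, 2**5, 2**6, 2**7, -1]
mae_avg, mae_std = learning_curve(X, y, train_sizes, seeds=[42, 125, 267])
for size, m, s in zip(train_sizes, mae_avg, mae_std):
    print(size, round(float(m), 3), round(float(s), 3))
```

Because the split depends only on the seed, every training-set size is scored on the same held-out test set for that seed, so the curve reflects the amount of training data rather than changes in the test set.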
-------------------------------------------------------------------------------- /machine_learning/orbital_field_matrix/ofm_feature_generator.py: -------------------------------------------------------------------------------- 1 | from matminer.featurizers.structure import OrbitalFieldMatrix 2 | from ase.io import read 3 | from pymatgen.io import ase as pm_ase 4 | import numpy as np 5 | import pandas as pd 6 | import os 7 | 8 | # Settings 9 | xyz_path = os.path.join('..','qmof-geometries.xyz') # appended list of XYZs (length N) 10 | refcodes_path = os.path.join('..','qmof-refcodes.csv') # refcode for each structure (length N) 11 | 12 | #--------------------------------------- 13 | # Read in structures 14 | ase_mofs = read(xyz_path, index=':') 15 | refcodes = np.genfromtxt(refcodes_path, delimiter=',', dtype=str) 16 | adaptor = pm_ase.AseAtomsAdaptor() 17 | pm_mofs = [adaptor.get_structure(ase_mof) for ase_mof in ase_mofs] 18 | 19 | # Initialize feature object 20 | featurizer = OrbitalFieldMatrix(period_tag=True) 21 | features = featurizer.feature_labels() 22 | df = pd.DataFrame(columns=features) 23 | 24 | # Get features 25 | for i, pm_mof in enumerate(pm_mofs): 26 | print('Generating fingerprint: '+str(i)) 27 | fingerprint = featurizer.featurize(pm_mof) 28 | refcode = refcodes[i] 29 | df.loc[refcode, :] = fingerprint 30 | 31 | # Export features 32 | df.index.name = 'MOF' 33 | df.to_csv('ofm_fingerprints.csv', index=True) 34 | -------------------------------------------------------------------------------- /machine_learning/orbital_field_matrix/ofm_krr.py: -------------------------------------------------------------------------------- 1 | import pandas as pd 2 | from sklearn.kernel_ridge import KernelRidge 3 | from sklearn.model_selection import train_test_split 4 | from sklearn.preprocessing import MinMaxScaler 5 | from sklearn.metrics import mean_absolute_error, r2_score 6 | from scipy.stats import spearmanr 7 | import numpy as np 8 | import os 9 | 10 | # 
Settings 11 | alpha = 0.1 12 | gamma = 0.1 13 | kernel = 'laplacian' # kernel function 14 | test_size = 0.2 # fraction held-out for testing 15 | seed = 42 # random seed 16 | fingerprint_path = 'ofm_fingerprints.csv' # path to fingerprints (length N) 17 | y_path = os.path.join('..','qmof-bandgaps.csv') # path to band gap data (length N) 18 | 19 | #--------------------------------------- 20 | #Read in data 21 | df_features = pd.read_csv(fingerprint_path, index_col=0) 22 | df_BG = pd.read_csv(y_path, index_col=0)['BG_PBE'] 23 | df = pd.concat([df_features, df_BG], axis=1, sort=True) 24 | df = df.dropna() 25 | refcodes = df.index 26 | 27 | # Make a training and testing set 28 | train_set, test_set = train_test_split( 29 | df, test_size=test_size, shuffle=True, random_state=seed) 30 | X_train = train_set.loc[:, (df.columns != 'BG_PBE')] 31 | X_test = test_set.loc[:, (df.columns != 'BG_PBE')] 32 | 33 | refcodes_train = X_train.index 34 | refcodes_test = X_test.index 35 | 36 | scaler = MinMaxScaler() 37 | scaler.fit(X_train) 38 | X_train = scaler.transform(X_train) 39 | X_test = scaler.transform(X_test) 40 | 41 | y_train = train_set.loc[:, df.columns == 'BG_PBE'].to_numpy() 42 | y_test = test_set.loc[:, df.columns == 'BG_PBE'].to_numpy() 43 | 44 | # Train and evaluate KRR model 45 | krr = KernelRidge(alpha=alpha, gamma=gamma, kernel=kernel) 46 | krr.fit(X_train, y_train) 47 | y_train_pred = krr.predict(X_train) 48 | y_test_pred = krr.predict(X_test) 49 | 50 | # Save results 51 | df_train = pd.DataFrame(np.concatenate((y_train, y_train_pred), axis=1), columns=[ 52 | 'DFT', 'ML'], index=refcodes_train) 53 | df_train.to_csv('train_results.csv', header=True, index=True) 54 | 55 | df_test = pd.DataFrame(np.concatenate((y_test, y_test_pred), axis=1), columns=[ 56 | 'DFT', 'ML'], index=refcodes_test) 57 | df_test.to_csv('test_results.csv', header=True, index=True) 58 | 59 | print('Train size: ', len(y_train)) 60 | print('Test size: ', len(y_test)) 61 | print('Train/test MAE: ', 
mean_absolute_error(y_train, y_train_pred), 62 | mean_absolute_error(y_test, y_test_pred)) 63 | print('Train/test r^2: ', r2_score(y_train, y_train_pred), 64 | r2_score(y_test, y_test_pred)) 65 | print('Train/test rho: ', spearmanr(y_train, y_train_pred) 66 | [0], spearmanr(y_test, y_test_pred)[0]) 67 | -------------------------------------------------------------------------------- /machine_learning/orbital_field_matrix/ofm_learning_curves.py: -------------------------------------------------------------------------------- 1 | import pandas as pd 2 | from sklearn.kernel_ridge import KernelRidge 3 | from sklearn.model_selection import train_test_split 4 | from sklearn.preprocessing import MinMaxScaler 5 | from sklearn.metrics import mean_absolute_error, r2_score 6 | from scipy.stats import spearmanr 7 | import numpy as np 8 | import os 9 | 10 | # Settings 11 | alpha = 0.1 12 | gamma = 0.1 13 | kernel = 'laplacian' # kernel function 14 | test_size = 0.2 # fraction held-out for testing 15 | seeds = [42, 125, 267, 541, 582] # random seeds 16 | train_sizes = [2**7, 2**8, 2**9, 2**10, 2**11, 2**12, 2**13, -1] # train sizes 17 | fingerprint_path = 'ofm_fingerprints.csv' # path to fingerprints (length N) 18 | y_path = os.path.join('..','qmof-bandgaps.csv') # path to band gap data (length N) 19 | 20 | #--------------------------------------- 21 | #Read in data 22 | df_features = pd.read_csv(fingerprint_path, index_col=0) 23 | df_BG = pd.read_csv(y_path, index_col=0)['BG_PBE'] 24 | df = pd.concat([df_features, df_BG], axis=1, sort=True) 25 | df = df.dropna() 26 | refcodes = df.index 27 | 28 | mae = [] 29 | r2 = [] 30 | rho = [] 31 | mae_std = [] 32 | r2_std = [] 33 | rho_std = [] 34 | for train_size in train_sizes: 35 | mae_test_seeds = [] 36 | r2_test_seeds = [] 37 | rho_test_seeds = [] 38 | for seed in seeds: 39 | 40 | # Make a training and testing set 41 | train_set, test_set = train_test_split( 42 | df, test_size=test_size, shuffle=True, random_state=seed) 43 | if 
train_size != -1: 44 | train_set = train_set[0:train_size] 45 | X_train = train_set.loc[:, (df.columns != 'BG_PBE')] 46 | X_test = test_set.loc[:, (df.columns != 'BG_PBE')] 47 | refcodes_train = X_train.index 48 | refcodes_test = X_test.index 49 | 50 | scaler = MinMaxScaler() 51 | scaler.fit(X_train) 52 | X_train = scaler.transform(X_train) 53 | X_test = scaler.transform(X_test) 54 | 55 | y_train = train_set.loc[:, df.columns == 'BG_PBE'].to_numpy() 56 | y_test = test_set.loc[:, df.columns == 'BG_PBE'].to_numpy() 57 | 58 | # Train and evaluate KRR model 59 | krr = KernelRidge(alpha=alpha, gamma=gamma, kernel=kernel) 60 | krr.fit(X_train, y_train) 61 | y_train_pred = krr.predict(X_train) 62 | y_test_pred = krr.predict(X_test) 63 | 64 | mae_test_seeds.append(mean_absolute_error(y_test, y_test_pred)) 65 | r2_test_seeds.append(r2_score(y_test, y_test_pred)) 66 | rho_test_seeds.append(spearmanr(y_test, y_test_pred)[0]) 67 | 68 | mae.append(np.average(mae_test_seeds)) 69 | r2.append(np.average(r2_test_seeds)) 70 | rho.append(np.average(rho_test_seeds)) 71 | mae_std.append(np.std(mae_test_seeds)) 72 | r2_std.append(np.std(r2_test_seeds)) 73 | rho_std.append(np.std(rho_test_seeds)) 74 | 75 | print('Training size: ', train_size) 76 | print('Avg. testing MAE: ', np.round(np.average(mae_test_seeds), 3)) 77 | print('Avg. testing r^2: ', np.round(np.average(r2_test_seeds), 3)) 78 | print('Avg. testing rho: ', np.round(np.average(rho_test_seeds), 3)) 79 | 80 | np.savetxt('learning_curve_avg.csv',np.vstack([mae,r2,rho]),delimiter=',') 81 | np.savetxt('learning_curve_std.csv',np.vstack([mae_std,r2_std,rho_std]),delimiter=',') 82 | -------------------------------------------------------------------------------- /machine_learning/sine_matrix/README.md: -------------------------------------------------------------------------------- 1 | `sine_matrix_feature_generator.py`: Generates sine Coulomb matrix eigenspectrum. 
Inputs: `xyz_path` (.xyz file of concatenated XYZs for each structure), `refcodes_path` (.csv file of corresponding refcodes). 2 | 3 | `sine_matrix_krr.py`: Trains a KRR model given sine Coulomb matrix-based fingerprints and property data. Inputs: `fingerprint_path` (path to X fingerprints obtained from `sine_matrix_feature_generator.py`), `y_path` (path to .csv of property data to train/test on). 4 | 5 | `sine_matrix_learning_curves.py`: Same as `sine_matrix_krr.py` but loops over increasing training set sizes. -------------------------------------------------------------------------------- /machine_learning/sine_matrix/sine_matrix_feature_generator.py: -------------------------------------------------------------------------------- 1 | from matminer.featurizers.structure import SineCoulombMatrix 2 | from ase.io import read 3 | from pymatgen.io import ase as pm_ase 4 | import numpy as np 5 | import pandas as pd 6 | import os 7 | 8 | # Settings 9 | xyz_path = os.path.join('..','qmof-geometries.xyz') # appended list of XYZs (length N) 10 | refcodes_path = os.path.join('..','qmof-refcodes.csv') # refcode for each structure (length N) 11 | max_atoms = np.inf # optional upper limit on the number of atoms per structure 12 | 13 | #--------------------------------------- 14 | # Read in structures 15 | ase_mofs = read(xyz_path, index=':') 16 | refcodes = [refcode for refcode, ase_mof in zip(np.genfromtxt(refcodes_path, delimiter=',', dtype=str), ase_mofs) if len(ase_mof) <= max_atoms] # apply the same max_atoms filter as pm_mofs so refcodes[i] stays aligned 17 | adaptor = pm_ase.AseAtomsAdaptor() 18 | pm_mofs = [adaptor.get_structure(ase_mof) for ase_mof in ase_mofs if len(ase_mof) <= max_atoms] 19 | 20 | # Initialize feature object 21 | featurizer = SineCoulombMatrix() 22 | featurizer.fit(pm_mofs) 23 | features = featurizer.feature_labels() 24 | df = pd.DataFrame(columns=features) 25 | 26 | # Get features 27 | for i, pm_mof in enumerate(pm_mofs): 28 | print('Generating fingerprint: '+str(i)) 29 | fingerprint = featurizer.featurize(pm_mof) 30 | refcode = refcodes[i] 31 | df.loc[refcode, :] = fingerprint 32 | 33 | # 
Export features 34 | df.index.name = 'MOF' 35 | df.to_csv('sine_matrix_fingerprints.csv', index=True) 36 | -------------------------------------------------------------------------------- /machine_learning/sine_matrix/sine_matrix_krr.py: -------------------------------------------------------------------------------- 1 | import pandas as pd 2 | from sklearn.kernel_ridge import KernelRidge 3 | from sklearn.model_selection import train_test_split 4 | from sklearn.preprocessing import MinMaxScaler 5 | from sklearn.metrics import mean_absolute_error, r2_score 6 | from scipy.stats import spearmanr 7 | import numpy as np 8 | import os 9 | 10 | # Settings 11 | alpha = 0.01 12 | gamma = 0.1 13 | kernel = 'laplacian' # kernel function 14 | test_size = 0.2 # fraction held-out for testing 15 | seed = 42 # random seed 16 | fingerprint_path = 'sine_matrix_fingerprints.csv' # path to fingerprints (length N) 17 | y_path = os.path.join('..','qmof-bandgaps.csv') # path to band gap data (length N) 18 | 19 | #--------------------------------------- 20 | #Read in data 21 | df_features = pd.read_csv(fingerprint_path, index_col=0) 22 | df_BG = pd.read_csv(y_path, index_col=0)['BG_PBE'] 23 | df = pd.concat([df_features, df_BG], axis=1, sort=True) 24 | df = df.dropna() 25 | refcodes = df.index 26 | 27 | # Make a training and testing set 28 | train_set, test_set = train_test_split( 29 | df, test_size=test_size, shuffle=True, random_state=seed) 30 | X_train = train_set.loc[:, (df.columns != 'BG_PBE')] 31 | X_test = test_set.loc[:, (df.columns != 'BG_PBE')] 32 | 33 | refcodes_train = X_train.index 34 | refcodes_test = X_test.index 35 | 36 | scaler = MinMaxScaler() 37 | scaler.fit(X_train) 38 | X_train = scaler.transform(X_train) 39 | X_test = scaler.transform(X_test) 40 | 41 | y_train = train_set.loc[:, df.columns == 'BG_PBE'].to_numpy() 42 | y_test = test_set.loc[:, df.columns == 'BG_PBE'].to_numpy() 43 | 44 | # Train and evaluate KRR model 45 | krr = KernelRidge(alpha=alpha, gamma=gamma, 
kernel=kernel) 46 | krr.fit(X_train, y_train) 47 | y_train_pred = krr.predict(X_train) 48 | y_test_pred = krr.predict(X_test) 49 | 50 | # Save results 51 | df_train = pd.DataFrame(np.concatenate((y_train, y_train_pred), axis=1), columns=[ 52 | 'DFT', 'ML'], index=refcodes_train) 53 | df_train.to_csv('train_results.csv', header=True, index=True) 54 | 55 | df_test = pd.DataFrame(np.concatenate((y_test, y_test_pred), axis=1), columns=[ 56 | 'DFT', 'ML'], index=refcodes_test) 57 | df_test.to_csv('test_results.csv', header=True, index=True) 58 | 59 | print('Train size: ', len(y_train)) 60 | print('Test size: ', len(y_test)) 61 | print('Train/test MAE: ', mean_absolute_error(y_train, y_train_pred), 62 | mean_absolute_error(y_test, y_test_pred)) 63 | print('Train/test r^2: ', r2_score(y_train, y_train_pred), 64 | r2_score(y_test, y_test_pred)) 65 | print('Train/test rho: ', spearmanr(y_train, y_train_pred) 66 | [0], spearmanr(y_test, y_test_pred)[0]) 67 | -------------------------------------------------------------------------------- /machine_learning/sine_matrix/sine_matrix_learning_curves.py: -------------------------------------------------------------------------------- 1 | import pandas as pd 2 | from sklearn.kernel_ridge import KernelRidge 3 | from sklearn.model_selection import train_test_split 4 | from sklearn.preprocessing import MinMaxScaler 5 | from sklearn.metrics import mean_absolute_error, r2_score 6 | from scipy.stats import spearmanr 7 | import numpy as np 8 | import os 9 | 10 | # Settings 11 | alpha = 0.01 12 | gamma = 0.1 13 | kernel = 'laplacian' # kernel function 14 | test_size = 0.2 # fraction held-out for testing 15 | seeds = [42, 125, 267, 541, 582] # random seeds 16 | train_sizes = [2**7, 2**8, 2**9, 2**10, 2**11, 2**12, 2**13, -1] # train sizes 17 | fingerprint_path = 'sine_matrix_fingerprints.csv' # path to fingerprints (length N) 18 | y_path = os.path.join('..','qmof-bandgaps.csv') # path to band gap data (length N) 19 | 20 | 
#--------------------------------------- 21 | #Read in data 22 | df_features = pd.read_csv(fingerprint_path, index_col=0) 23 | df_BG = pd.read_csv(y_path, index_col=0)['BG_PBE'] 24 | df = pd.concat([df_features, df_BG], axis=1, sort=True) 25 | df = df.dropna() 26 | refcodes = df.index 27 | 28 | mae = [] 29 | r2 = [] 30 | rho = [] 31 | mae_std = [] 32 | r2_std = [] 33 | rho_std = [] 34 | for train_size in train_sizes: 35 | mae_test_seeds = [] 36 | r2_test_seeds = [] 37 | rho_test_seeds = [] 38 | for seed in seeds: 39 | 40 | # Make a training and testing set 41 | train_set, test_set = train_test_split( 42 | df, test_size=test_size, shuffle=True, random_state=seed) 43 | if train_size != -1: 44 | train_set = train_set[0:int(round(train_size*0.8))] 45 | X_train = train_set.loc[:, (df.columns != 'BG_PBE')] 46 | X_test = test_set.loc[:, (df.columns != 'BG_PBE')] 47 | refcodes_train = X_train.index 48 | refcodes_test = X_test.index 49 | 50 | scaler = MinMaxScaler() 51 | scaler.fit(X_train) 52 | X_train = scaler.transform(X_train) 53 | X_test = scaler.transform(X_test) 54 | 55 | y_train = train_set.loc[:, df.columns == 'BG_PBE'].to_numpy() 56 | y_test = test_set.loc[:, df.columns == 'BG_PBE'].to_numpy() 57 | 58 | # Train and evaluate KRR model 59 | krr = KernelRidge(alpha=alpha, gamma=gamma, kernel=kernel) 60 | krr.fit(X_train, y_train) 61 | y_train_pred = krr.predict(X_train) 62 | y_test_pred = krr.predict(X_test) 63 | 64 | mae_test_seeds.append(mean_absolute_error(y_test, y_test_pred)) 65 | r2_test_seeds.append(r2_score(y_test, y_test_pred)) 66 | rho_test_seeds.append(spearmanr(y_test, y_test_pred)[0]) 67 | 68 | mae.append(np.average(mae_test_seeds)) 69 | r2.append(np.average(r2_test_seeds)) 70 | rho.append(np.average(rho_test_seeds)) 71 | mae_std.append(np.std(mae_test_seeds)) 72 | r2_std.append(np.std(r2_test_seeds)) 73 | rho_std.append(np.std(rho_test_seeds)) 74 | 75 | print('Training size: ', train_size) 76 | print('Avg. 
testing MAE: ', np.round(np.average(mae_test_seeds), 3)) 77 | print('Avg. testing r^2: ', np.round(np.average(r2_test_seeds), 3)) 78 | print('Avg. testing rho: ', np.round(np.average(rho_test_seeds), 3)) 79 | 80 | np.savetxt('learning_curve_avg.csv',np.vstack([mae,r2,rho]),delimiter=',') 81 | np.savetxt('learning_curve_std.csv',np.vstack([mae_std,r2_std,rho_std]),delimiter=',') -------------------------------------------------------------------------------- /machine_learning/soap_kernel/README.md: -------------------------------------------------------------------------------- 1 | `soap_matrix_generator.py`: Generates SOAP fingerprints for a set of structures. Inputs: `xyz_path` (.xyz file of concatenated XYZs for each structure), `refcodes_path` (.csv file of corresponding refcodes). 2 | 3 | `soap_avg_kernel_generator.py`: Generates an averaged SOAP similarity kernel from a folder of SOAP fingerprints. Inputs: `refcodes_path` (.csv of IDs corresponding to N structures), `comparison_refcodes_path` (.csv of IDs corresponding to M structures to compare to), `soaps_path` (path to folder containing SOAP fingerprints generated from `soap_matrix_generator.py`). 4 | 5 | `soap_krr.py`: Trains a KRR model given a square SOAP similarity kernel and property data. Inputs: `kernel_path` (path to SOAP similarity kernel obtained from `soap_avg_kernel_generator.py`), `y_path` (path to .csv of property data to train/test on). 6 | 7 | `soap_learning_curves.py`: Same as `soap_krr.py` but loops over increasing training set sizes. -------------------------------------------------------------------------------- /machine_learning/soap_kernel/soap_avg_kernel_generator.py: -------------------------------------------------------------------------------- 1 | import os 2 | import numpy as np 3 | from sparse import load_npz 4 | 5 | # This will make an average kernel matrix of (M x N) dimensions. 6 | # If refcodes_path == comparison_refcodes_path, then M = N. 
7 | # This code must be run after `soap_matrix_generator.py` 8 | # Note: This can be memory-intensive. 9 | 10 | # Settings 11 | basepath = os.getcwd() # Base path where results will be stored 12 | refcodes_path = os.path.join('..','qmof-refcodes.csv') # IDs corresponding to N structures 13 | comparison_refcodes_path = refcodes_path # IDs corresponding to M structures 14 | soaps_path = os.path.join(basepath, 'soap_matrices') # Path where SOAP matrices are stored 15 | 16 | #--------------------------------------- 17 | # Read in refcodes 18 | refcodes = np.genfromtxt(refcodes_path, delimiter=',', dtype=str).tolist() 19 | 20 | if refcodes_path == comparison_refcodes_path: 21 | comparison_refcodes = refcodes 22 | else: 23 | comparison_refcodes = np.genfromtxt( 24 | comparison_refcodes_path, delimiter=',', dtype=str).tolist() 25 | M = len(refcodes) 26 | N = len(comparison_refcodes) 27 | 28 | # Get number of features 29 | kernel_name = 'avg_soap_kernel.csv' 30 | example_soap = load_npz(os.path.join(soaps_path, os.listdir(soaps_path)[0])) 31 | N_features = np.shape(example_soap)[1] 32 | 33 | # Prepare M average SOAPs 34 | print('Initializing M matrix') 35 | avg_soaps_M = np.zeros((M, N_features), dtype=np.float32) 36 | for i in range(M): 37 | print(refcodes[i]) 38 | p = os.path.join(soaps_path, 'soap_'+str(refcodes[i])+'.npz') 39 | soap_temp = load_npz(p).todense() 40 | avg_soaps_M[i, :] = soap_temp.mean(axis=0) 41 | 42 | # Prepare N average SOAPs 43 | if M != N or refcodes_path != comparison_refcodes_path: 44 | print('Initializing N matrix') 45 | avg_soaps_N = np.zeros((N, N_features), dtype=np.float32) 46 | for i in range(N): 47 | print(comparison_refcodes[i]) 48 | p = os.path.join(soaps_path, 'soap_' + 49 | str(comparison_refcodes[i])+'.npz') 50 | soap_temp = load_npz(p).todense() 51 | avg_soaps_N[i, :] = soap_temp.mean(axis=0) 52 | 53 | # Compute average kernel matrix 54 | print('Computing K') 55 | if M == N and refcodes_path == comparison_refcodes_path: 56 | K = 
avg_soaps_M.dot(avg_soaps_M.T) 57 | norm = np.sqrt(np.einsum('ii,jj->ij', K, K)) 58 | else: 59 | K = avg_soaps_M.dot(avg_soaps_N.T) 60 | norm = np.sqrt(np.einsum( 61 | 'ii,jj->ij', avg_soaps_M.dot(avg_soaps_M.T), avg_soaps_N.dot(avg_soaps_N.T))) 62 | K = K/norm 63 | 64 | np.savetxt(os.path.join(basepath, kernel_name), K.T, delimiter=',') 65 | -------------------------------------------------------------------------------- /machine_learning/soap_kernel/soap_krr.py: -------------------------------------------------------------------------------- 1 | import numpy as np 2 | import pandas as pd 3 | from sklearn.kernel_ridge import KernelRidge 4 | from sklearn.model_selection import ShuffleSplit 5 | from sklearn.metrics import mean_absolute_error, r2_score 6 | from scipy.stats import spearmanr 7 | import os 8 | 9 | # Settings 10 | xi = 2 11 | alpha = 1E-3 12 | test_size = 0.2 # fraction held-out for testing 13 | seed = 42 # random seed 14 | kernel_path = 'avg_soap_kernel.csv' # path to NxN kernel 15 | refcodes_path = os.path.join('..','qmof-refcodes.csv') # path to refcodes (length N) 16 | y_path = os.path.join('..','qmof-bandgaps.csv') # path to band gaps (length N) 17 | 18 | #--------------------------------------- 19 | # Read in data 20 | K = pd.read_csv(kernel_path, header=None, delimiter=',').to_numpy() 21 | K = K**xi 22 | y = pd.read_csv(y_path, index_col=0)['BG_PBE'].values 23 | refcodes = np.genfromtxt(refcodes_path, delimiter=',', dtype=str) 24 | 25 | # Make a training and testing set 26 | splitter = ShuffleSplit(n_splits=1, test_size=test_size, random_state=seed) 27 | train_indices, test_indices = next(splitter.split(y)) 28 | y_train = y[train_indices] 29 | y_test = y[test_indices] 30 | K_train = K[train_indices, :][:, train_indices] 31 | K_test = K[test_indices, :][:, train_indices] 32 | refcodes_train = refcodes[train_indices] 33 | refcodes_test = refcodes[test_indices] 34 | del K, y, refcodes 35 | 36 | # Train and evaluate KRR model 37 | krr = 
KernelRidge(alpha=alpha, kernel='precomputed') 38 | krr.fit(K_train, y_train) 39 | y_train_pred = krr.predict(K_train) 40 | y_test_pred = krr.predict(K_test) 41 | 42 | # Save results 43 | df_train = pd.DataFrame(np.vstack((y_train,y_train_pred)).T, columns=[ 44 | 'DFT', 'ML'], index=refcodes_train) 45 | df_train.index.name = 'MOF' 46 | df_train.to_csv('train_results.csv', header=True, index=True) 47 | 48 | df_test = pd.DataFrame(np.vstack((y_test,y_test_pred)).T, columns=[ 49 | 'DFT', 'ML'], index=refcodes_test) 50 | df_test.index.name = 'MOF' 51 | df_test.to_csv('test_results.csv', header=True, index=True) 52 | 53 | print('Train size: ', len(train_indices)) 54 | print('Test size: ', len(test_indices)) 55 | print('Train/test MAE: ', mean_absolute_error(y_train, y_train_pred), mean_absolute_error(y_test, y_test_pred)) 56 | print('Train/test r^2: ', r2_score(y_train, y_train_pred), r2_score(y_test, y_test_pred)) 57 | print('Train/test rho: ', spearmanr(y_train, y_train_pred)[0], spearmanr(y_test, y_test_pred)[0]) -------------------------------------------------------------------------------- /machine_learning/soap_kernel/soap_learning_curves.py: -------------------------------------------------------------------------------- 1 | import pandas as pd 2 | from sklearn.kernel_ridge import KernelRidge 3 | from sklearn.model_selection import ShuffleSplit 4 | from sklearn.metrics import mean_absolute_error, r2_score 5 | from scipy.stats import spearmanr 6 | import numpy as np 7 | import os 8 | 9 | # Settings 10 | xi = 2 11 | alpha = 1E-3 12 | gamma = 0.1 13 | test_size = 0.2 # fraction held-out for testing 14 | seeds = [42, 125, 267, 541, 582] # random seeds 15 | train_sizes = [2**7, 2**8, 2**9, 2**10, 2**11, 2**12, 2**13, -1] # train sizes 16 | kernel_path = 'avg_soap_kernel.csv' # path to NxN kernel 17 | refcodes_path = os.path.join('..','qmof-refcodes.csv') # path to refcodes (length N) 18 | y_path = os.path.join('..','qmof-bandgaps.csv') # path to band gaps (length N) 
19 | 20 | #--------------------------------------- 21 | # Read in data 22 | K = pd.read_csv(kernel_path, header=None, delimiter=',').to_numpy() 23 | K = K**xi 24 | y = pd.read_csv(y_path, index_col=0)['BG_PBE'].values 25 | refcodes = np.genfromtxt(refcodes_path, delimiter=',', dtype=str) 26 | 27 | mae = [] 28 | r2 = [] 29 | rho = [] 30 | mae_std = [] 31 | r2_std = [] 32 | rho_std = [] 33 | for train_size in train_sizes: 34 | mae_test_seeds = [] 35 | r2_test_seeds = [] 36 | rho_test_seeds = [] 37 | for seed in seeds: 38 | 39 | # Make a training and testing set 40 | splitter = ShuffleSplit(n_splits=1, test_size=test_size, random_state=seed) 41 | train_indices, test_indices = next(splitter.split(y)) 42 | if train_size != -1: 43 | train_indices = train_indices[0:train_size] 44 | y_train = y[train_indices] 45 | y_test = y[test_indices] 46 | K_train = K[train_indices, :][:, train_indices] 47 | K_test = K[test_indices, :][:, train_indices] 48 | refcodes_train = refcodes[train_indices] 49 | refcodes_test = refcodes[test_indices] 50 | 51 | # Train and evaluate KRR model 52 | krr = KernelRidge(alpha=alpha, kernel='precomputed') 53 | krr.fit(K_train, y_train) 54 | y_train_pred = krr.predict(K_train) 55 | y_test_pred = krr.predict(K_test) 56 | 57 | mae_test_seeds.append(mean_absolute_error(y_test, y_test_pred)) 58 | r2_test_seeds.append(r2_score(y_test, y_test_pred)) 59 | rho_test_seeds.append(spearmanr(y_test, y_test_pred)[0]) 60 | 61 | mae.append(np.average(mae_test_seeds)) 62 | r2.append(np.average(r2_test_seeds)) 63 | rho.append(np.average(rho_test_seeds)) 64 | mae_std.append(np.std(mae_test_seeds)) 65 | r2_std.append(np.std(r2_test_seeds)) 66 | rho_std.append(np.std(rho_test_seeds)) 67 | 68 | print('Training size: ', train_size) 69 | print('Avg. testing MAE: ', np.round(np.average(mae_test_seeds), 3)) 70 | print('Avg. testing r^2: ', np.round(np.average(r2_test_seeds), 3)) 71 | print('Avg. 
testing rho: ', np.round(np.average(rho_test_seeds), 3)) 72 | 73 | np.savetxt('learning_curve_avg.csv',np.vstack([mae,r2,rho]),delimiter=',') 74 | np.savetxt('learning_curve_std.csv',np.vstack([mae_std,r2_std,rho_std]),delimiter=',') -------------------------------------------------------------------------------- /machine_learning/soap_kernel/soap_matrix_generator.py: -------------------------------------------------------------------------------- 1 | from dscribe.descriptors import SOAP 2 | from ase.io import read 3 | import numpy as np 4 | from sparse import save_npz 5 | import os 6 | 7 | # Settings 8 | basepath = os.getcwd() # base path where avg SOAP matrices will be stored 9 | soap_params = {'rcut': 4.0, 'sigma': 0.1, 'nmax': 9, 'lmax': 9, 10 | 'rbf': 'gto', 'average': 'off', 'crossover': True} 11 | xyz_path = os.path.join('..','qmof-geometries.xyz') # appended XYZ of structures (length N) 12 | refcodes_path = os.path.join('..','qmof-refcodes.csv') # refcode for each structure (length N) 13 | 14 | #--------------------------------------- 15 | # Make folder if not present 16 | if not os.path.exists(os.path.join(basepath, 'soap_matrices')): 17 | os.mkdir(os.path.join(basepath, 'soap_matrices')) 18 | 19 | # Read in structures 20 | structures = read(xyz_path, index=':') 21 | 22 | # Read in refcodes 23 | refcodes = np.genfromtxt(refcodes_path, delimiter=',', dtype=str).tolist() 24 | if len(refcodes) != len(structures): 25 | raise ValueError('Mismatch in refcodes and num. 
structures') 26 | 27 | # Get unique species 28 | species = [] 29 | for structure in structures: 30 | syms = np.unique(structure.get_chemical_symbols()) 31 | species.extend([sym for sym in syms if sym not in species]) 32 | species.sort() 33 | 34 | # Initialize SOAP 35 | soap = SOAP( 36 | species=species, 37 | periodic=True, 38 | sigma=soap_params['sigma'], 39 | rcut=soap_params['rcut'], 40 | nmax=soap_params['nmax'], 41 | lmax=soap_params['lmax'], 42 | rbf=soap_params['rbf'], 43 | average=soap_params['average'], 44 | crossover=soap_params['crossover'], 45 | sparse=True 46 | ) 47 | 48 | # Make SOAP fingerprints 49 | for i, structure in enumerate(structures): 50 | refcode = refcodes[i] 51 | soap_filename = os.path.join( 52 | basepath, 'soap_matrices', 'soap_'+refcode+'.npz') 53 | if os.path.exists(soap_filename): 54 | continue 55 | soap_matrix = soap.create(structure) 56 | save_npz(soap_filename, soap_matrix) 57 | -------------------------------------------------------------------------------- /machine_learning/umap/README.md: -------------------------------------------------------------------------------- 1 | This folder contains scripts to carry out the UMAP analyses. 2 | 3 | Pre-requisite installation instructions: `pip install umap-learn[plot]` 4 | 5 | - `umap_reduction.py`: UMAP with DFT-optimized band gap data overlaid. 6 | 7 | - `umap_reduction_dataset_overlap.py`: UMAP comparing the overlap between two datasets. 
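As a minimal sketch of the bookkeeping behind the dataset-overlap comparison (not the script itself), the snippet below stacks two encoding tables and builds a label array plus a boolean mask marking which rows came from the second dataset. The helper name `label_datasets` and the toy feature values are assumptions made purely for illustration; the stacked `X` would then be handed to `umap.UMAP(...).fit(X)` as in the scripts.

```python
import numpy as np
import pandas as pd


def label_datasets(X1, X2):
    """Stack two encoding tables and flag the rows that came from the second one."""
    X = pd.concat([X1, X2], axis=0)
    labels = np.array(['Dataset-1'] * len(X1) + ['Dataset-2'] * len(X2))
    mask = np.zeros(len(X), dtype=bool)
    mask[len(X1):] = True  # rows after the first block belong to dataset 2
    return X, labels, mask


# Toy encodings (hypothetical values), indexed by refcode as in the scripts
X1 = pd.DataFrame({'f1': [0.1, 0.2, 0.3], 'f2': [1.0, 2.0, 3.0]},
                  index=['A', 'B', 'C'])
X2 = pd.DataFrame({'f1': [0.4, 0.5], 'f2': [4.0, 5.0]},
                  index=['D', 'E'])
X, labels, mask = label_datasets(X1, X2)
```

Plotting `points[~mask]` and `points[mask]` separately then draws the two datasets as distinct layers, so the second dataset is rendered on top of the first.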
-------------------------------------------------------------------------------- /machine_learning/umap/umap_reduction.py: -------------------------------------------------------------------------------- 1 | import pandas as pd 2 | import umap 3 | import umap.plot 4 | import numpy as np 5 | import matplotlib.pyplot as plt 6 | from bokeh.plotting import output_file, save 7 | 8 | seed = 42 # random seed 9 | x = r'/path/to/encodings.csv' # .csv of X encodings 10 | bandgaps_path = r'/path/to/qmof-bandgaps.csv' # .csv of y properties 11 | 12 | # --------------------------------------- 13 | # Band gaps and refcodes 14 | df = pd.read_csv(bandgaps_path, delimiter=',', header=0, index_col=0) 15 | bg = df['BG_PBE'].values 16 | 17 | # Encoding 18 | X = pd.read_csv(x, delimiter=',', header=0, index_col=0).dropna() 19 | refcodes = X.index.values 20 | 21 | # Discretize band gaps for colorbar 22 | bg_class = np.empty(len(refcodes), dtype=object) 23 | bg = np.empty(len(refcodes)) 24 | for i, ref in enumerate(refcodes): 25 | b = df.loc[ref]['BG_PBE'] 26 | bg[i] = b 27 | if b < 0.5: 28 | bg_class[i] = '[0 eV, 0.5 eV)' 29 | elif b < 1: 30 | bg_class[i] = '[0.5 eV, 1 eV)' 31 | elif b < 2: 32 | bg_class[i] = '[1 eV, 2 eV)' 33 | elif b < 3: 34 | bg_class[i] = '[2 eV, 3 eV)' 35 | elif b < 4: 36 | bg_class[i] = '[3 eV, 4 eV)' 37 | else: 38 | bg_class[i] = '[4 eV, 6.5 eV)' 39 | 40 | bg_class = np.array(bg_class) 41 | 42 | # Perform dimensionality reduction 43 | fit = umap.UMAP(n_neighbors=50, min_dist=0.4, random_state=seed) 44 | u = fit.fit(X) 45 | 46 | # Make static plot 47 | plt.rcParams["figure.dpi"] = 1000 48 | p = umap.plot.points(u, labels=bg_class, color_key_cmap='Spectral', 49 | width=8500, height=8500) 50 | p.texts[0].set_visible(False) 51 | plt.savefig('umap.png', transparent=False) 52 | 53 | # Make interactive plot 54 | hover_data = pd.DataFrame({'Refcode': refcodes, 'E_g': bg}) 55 | p_int = umap.plot.interactive( 56 | u, labels=bg_class, color_key_cmap='Spectral', 
hover_data=hover_data, point_size=2) 57 | output_file('umap.html') 58 | save(p_int) 59 | -------------------------------------------------------------------------------- /machine_learning/umap/umap_reduction_dataset_overlap.py: -------------------------------------------------------------------------------- 1 | import pandas as pd 2 | import umap 3 | import umap.plot 4 | import numpy as np 5 | import matplotlib.pyplot as plt 6 | from matplotlib.patches import Patch 7 | 8 | seed = 42 # random seed 9 | x1 = r'/path/to/encodings1.csv' # path to X of dataset 1 10 | x2 = r'/path/to/encodings2.csv' # path to X of dataset 2 11 | 12 | #--------------------------------------- 13 | # Encoding 14 | X1 = pd.read_csv(x1, delimiter=',', header=0, index_col=0).dropna() 15 | refcodes1 = X1.index.values 16 | X2 = pd.read_csv(x2, delimiter=',', header=0, index_col=0).dropna() 17 | refcodes2 = X2.index.values 18 | X = pd.concat([X1,X2],axis=0) 19 | refcodes = X.index.values 20 | 21 | # Perform dimensionality reduction 22 | fit = umap.UMAP(n_neighbors=50, min_dist=0.4, random_state=seed) 23 | u = fit.fit(X) 24 | 25 | classifier = np.array(['Dataset-1']*len(refcodes1)+['Dataset-2']*len(refcodes2)) 26 | mask = np.zeros(len(refcodes),dtype=bool) 27 | mask[:len(refcodes1)] = False # rows from dataset 1 28 | mask[len(refcodes1):] = True # rows from dataset 2 29 | 30 | # Perform dimensionality reduction 31 | fit = umap.UMAP(n_neighbors=50, min_dist=0.4, random_state=seed) 32 | u = fit.fit(X) 33 | 34 | # Make static plot 35 | points = u.embedding_ 36 | labels = classifier 37 | width = 8500 38 | height = 8500 39 | point_size = 100.0 / np.sqrt(points.shape[0]) 40 | dpi = 1000 41 | plt.rcParams["figure.dpi"] = dpi 42 | color_key_cmap = 'Paired_r' 43 | fig = plt.figure(figsize=(width / dpi, height / dpi)) 44 | ax = fig.add_subplot(111) 45 | color_key = plt.get_cmap(color_key_cmap)(np.linspace(0, 1, np.unique(labels).shape[0])) 46 | unique_labels = np.unique(labels) 47 | num_labels = unique_labels.shape[0] 48 | legend_elements 
= [ 49 | Patch(facecolor=color_key[i], label=unique_labels[i]) 50 | for i, k in enumerate(unique_labels) 51 | ] 52 | new_color_key = {k: color_key[i] for i, k in enumerate(unique_labels)} 53 | colors = pd.Series(labels).map(new_color_key) 54 | ax.scatter(points[:, 0][~mask], points[:, 1][~mask], s=point_size, c=colors[~mask],alpha=0.5) 55 | ax.scatter(points[:, 0][mask], points[:, 1][mask], s=point_size, c=colors[mask],alpha=0.5) 56 | ax.legend(handles=legend_elements) 57 | ax.axes.xaxis.set_visible(False) 58 | ax.axes.yaxis.set_visible(False) 59 | plt.savefig('umap_overlap.png', transparent=False) 60 | -------------------------------------------------------------------------------- /mofid_search.md: -------------------------------------------------------------------------------- 1 | # Searching Tips 2 | 3 | ## MOFid 4 | Consider this scenario: you want to find MOF-5 (also known as IRMOF-1) in the QMOF Database (or any other MOF database for that matter!) but don't know which entry it corresponds to. One way to find the MOF that you're looking for is to take advantage of a tool called [MOFid](https://github.com/snurr-group/mofid). The MOFid code can generate a unique identifier for a given MOF structure. MOFid information is made publicly available with each structure in the QMOF Database, most structures on [MOFDB](https://mof.tech.northwestern.edu/), and the [2019 CoRE MOF Database](https://zenodo.org/record/3677685#.XzqXbZMzY8M) (see the supporting information [here](https://pubs.acs.org/doi/abs/10.1021/acs.cgd.9b01050)). I outline a typical procedure below: 5 | 6 | 1. Download the CIF of the desired MOF from the internet (e.g. from the original source publication). An example CIF for MOF-5 is [here](https://github.com/iRASPA/RASPA2/blob/master/structures/mofs/cif/IRMOF-1.cif). 7 | 2. Calculate its unique MOFid/MOFkey using the [ID Tool](https://snurr-group.github.io/web-mofid/) by simply uploading the structure and hitting submit. 
Please read the tips on the MOFid webpage carefully. 8 | 3. Copy down the MOFid and/or MOFkey information. The MOFkey for MOF-5 is `Zn.KKEYFWRCBNTPAC.MOFkey-v1.pcu`. 9 | 4. Look for the MOFid/MOFkey in the database of your choosing. For the QMOF Database, we provide the MOFids/MOFkeys for every structure, so you can search the `qmof.json` file provided with the QMOF Database for any entries with the obtained MOFkey (or MOFid). The hits returned for MOF-5 are: LISBIZ_FSR, MIBQAR_FSR, SAHYIK_FSR, SAHYOQ_FSR, XOKHAH_FSR, and so on (this is a very popular MOF). If you are using a MOF database that doesn't have MOFids/MOFkeys pre-computed, you can calculate them in a high-throughput mode using the MOFid [Python interface](https://github.com/snurr-group/mofid). 10 | 5. If there are multiple options, choose whichever suits your needs. In the case of the QMOF Database, I would generally recommend the structure with the lowest energy (per atom), if there are any appreciable differences. 11 | -------------------------------------------------------------------------------- /other/dft_workflow/README.md: -------------------------------------------------------------------------------- 1 | # Important Note 2 | If you are just looking to use PyMOFScreen in your own work, please refer to the parent GitHub page for the [PyMOFScreen package](https://github.com/arosen93/mof_screen). The information below is solely meant to reproduce the QMOF Database workflow exactly. 3 | 4 | # Running the QMOF Database DFT Workflow 5 | This directory contains an example input file for running the automated DFT screening procedure used in constructing the QMOF Database. 6 | 7 | All that needs to be run on the compute cluster is the `opt.py` file in the `runner` folder, which will call PyMOFScreen to screen MOF CIFs found in the `mofpath` directory. First, configure PyMOFScreen to work with your VASP executables and compute cluster. 
Refer to the [PyMOFScreen GitHub page](https://github.com/arosen93/mof_screen) for full instructions. 8 | 9 | Briefly, you will need to... 10 | 1. Modify the `mof_screen/pymofscreen/compute_environ.py` file for your compute cluster's scheduling system and how to call the VASP executables. 11 | 2. Install PyMOFScreen by going into the `mof_screen` folder and running `pip install .`. 12 | 3. Submit a compute job to run the `opt.py` script in the `runner` folder with Python. The only other component of the submission script that is needed is a line reading `export VASP_SCRIPT=run_vasp.py` (the `run_vasp.py` file will be generated automatically). An example submission script for use with a Slurm scheduler can be found in `sub_slurm.job`. -------------------------------------------------------------------------------- /other/dft_workflow/mof_screen/LICENSE: -------------------------------------------------------------------------------- 1 | MIT License 2 | 3 | Copyright (c) 2018 Andrew S. Rosen 4 | 5 | Permission is hereby granted, free of charge, to any person obtaining a copy 6 | of this software and associated documentation files (the "Software"), to deal 7 | in the Software without restriction, including without limitation the rights 8 | to use, copy, modify, merge, publish, distribute, sublicense, and/or sell 9 | copies of the Software, and to permit persons to whom the Software is 10 | furnished to do so, subject to the following conditions: 11 | 12 | The above copyright notice and this permission notice shall be included in all 13 | copies or substantial portions of the Software. 14 | 15 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR 16 | IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, 17 | FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. 
IN NO EVENT SHALL THE 18 | AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER 19 | LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, 20 | OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE 21 | SOFTWARE. 22 | -------------------------------------------------------------------------------- /other/dft_workflow/mof_screen/pymofscreen/__init__.py: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/Andrew-S-Rosen/QMOF/d21e83843dc2b002836c130c3fad84063bf04af6/other/dft_workflow/mof_screen/pymofscreen/__init__.py -------------------------------------------------------------------------------- /other/dft_workflow/mof_screen/pymofscreen/calc_swaps.py: -------------------------------------------------------------------------------- 1 | from pymofscreen.default_calculators import defaults 2 | 3 | def update_calc(calc,calc_swaps): 4 | """ 5 | Update a calculator based on pre-defined "swaps" 6 | Args: 7 | calc (dictionary): ASE Vasp calculators dictionary 8 | 9 | calc_swaps (list): list of pre-existing calc swaps 10 | 11 | Returns: 12 | calc (dictionary): updated ASE Vasp calculator 13 | 14 | calc_swaps (list): updated list of swaps 15 | """ 16 | for swap in calc_swaps: 17 | 18 | swap = swap.replace(' ','') # str.replace returns a new string 19 | 20 | if swap == 'large_supercell': 21 | calc.special_params['lreal'] = 'Auto' 22 | 23 | elif swap == 'zbrent': 24 | calc.int_params['ibrion'] = 3 25 | calc.exp_params['ediff'] = 1e-6 26 | calc.int_params['nelmin'] = 8 27 | calc.int_params['iopt'] = 7 28 | calc.float_params['potim'] = 0 29 | 30 | elif swap == 'dentet' or swap == 'grad_not_orth': 31 | calc.int_params['ismear'] = 0 32 | calc.string_params['algo'] = 'Normal' 33 | 34 | elif swap == 'edddav': 35 | calc.string_params['algo'] = 'All' 36 | 37 | elif swap == 'inv_rot_mat': 38 | calc.exp_params['symprec'] = 1e-8 39 | 40 | elif (swap == 'subspacematrix' or swap == 'real_optlay' 41 | or 
swap == 'rspher' or swap == 'nicht_konv'): 42 | calc.special_params['lreal'] = False 43 | calc.string_params['prec'] = 'Accurate' 44 | 45 | elif swap == 'tetirr' or swap == 'incorrect_shift': 46 | calc.input_params['gamma'] = True 47 | 48 | elif swap == 'pricel' or swap == 'sgrcon' or swap == 'ibzkpt': 49 | calc.exp_params['symprec'] = 1e-8 50 | calc.int_params['isym'] = 0 51 | 52 | elif swap == 'amin': 53 | calc.float_params['amin'] = 0.01 54 | 55 | elif swap == 'pssyevx' or swap == 'eddrmm': 56 | calc.string_params['algo'] = 'Normal' 57 | 58 | elif swap == 'zheev': 59 | calc.string_params['algo'] = 'Exact' 60 | 61 | elif swap == 'elf_kpar': 62 | calc.int_params['kpar'] = 1 63 | 64 | elif swap == 'rhosyg': 65 | calc.exp_params['symprec'] = 1e-4 66 | calc.int_params['isym'] = 0 67 | 68 | elif swap == 'posmap': 69 | calc.exp_params['symprec'] = 1e-6 70 | 71 | elif 'pwave' in swap: 72 | calc.int_params['istart'] = 0 73 | 74 | elif 'sigma=' in swap: 75 | calc.float_params['sigma'] = float(swap.split('=')[-1]) 76 | 77 | elif 'nbands=' in swap: 78 | calc.int_params['nbands'] = int(swap.split('=')[-1]) 79 | 80 | elif 'potim=' in swap: 81 | calc.float_params['potim'] = float(swap.split('=')[-1]) 82 | 83 | elif 'nsw=' in swap: 84 | calc.int_params['nsw'] = int(swap.split('=')[-1]) 85 | 86 | elif 'nelm=' in swap: 87 | calc.int_params['nelm'] = int(swap.split('=')[-1]) 88 | 89 | elif 'ibrion=' in swap: 90 | calc.int_params['ibrion'] = int(swap.split('=')[-1]) 91 | 92 | elif 'istart=' in swap: 93 | calc.int_params['istart'] = int(swap.split('=')[-1]) 94 | 95 | elif 'algo=' in swap: 96 | calc.string_params['algo'] = swap.split('=')[-1] 97 | 98 | elif 'iopt=' in swap: 99 | calc.int_params['iopt'] = int(swap.split('=')[-1]) 100 | 101 | elif 'isif=' in swap: 102 | calc.int_params['isif'] = int(swap.split('=')[-1]) 103 | 104 | elif 'lreal=' in swap: 105 | swap_val = swap.split('=')[1].lower() 106 | if swap_val == 'false': 107 | calc.special_params['lreal'] = False 108 | elif 
swap_val == 'auto': 109 | calc.special_params['lreal'] = 'Auto' 110 | elif swap_val == 'true': 111 | calc.special_params['lreal'] = True 112 | 113 | elif 'fire' in swap: 114 | calc.int_params['ibrion'] = 3 115 | calc.int_params['iopt'] = 7 116 | calc.float_params['potim'] = 0 117 | 118 | elif 'lwave=' in swap: 119 | swap_val = swap.split('=')[-1].lower() 120 | if swap_val == 'true': 121 | calc.bool_params['lwave'] = True 122 | elif swap_val == 'false': 123 | calc.bool_params['lwave'] = False 124 | 125 | elif 'nelect=' in swap: 126 | swap_val = swap.split('=')[-1].lower() 127 | calc.float_params['nelect'] = float(swap_val) 128 | 129 | elif swap == 'brions' or swap == 'too_few_bands': 130 | pass 131 | 132 | else: 133 | raise ValueError('Unknown calc swap') 134 | 135 | return calc, calc_swaps 136 | 137 | def check_nprocs(n_atoms,nprocs,ppn): 138 | """ 139 | Reduce processors if the structure is too small 140 | Args: 141 | n_atoms (int): number of atoms in structure 142 | 143 | nprocs (int): total number of processors 144 | 145 | ppn (int): processors per node 146 | 147 | Returns: 148 | nprocs (int): updated total number of processors 149 | """ 150 | 151 | lower = False 152 | while n_atoms < nprocs/2: 153 | if nprocs == ppn: 154 | lower = True 155 | break 156 | nprocs -= ppn 157 | if lower: 158 | while n_atoms < nprocs/2: 159 | if nprocs <= 2: 160 | break 161 | nprocs -= 2 162 | defaults['ncore'] = nprocs 163 | 164 | return nprocs -------------------------------------------------------------------------------- /other/dft_workflow/mof_screen/pymofscreen/cif_handler.py: -------------------------------------------------------------------------------- 1 | import os 2 | from ase.io import read, write 3 | import numpy as np 4 | from pymofscreen.writers import pprint 5 | try: 6 | import pymatgen as pm 7 | from pymatgen.io.cif import CifParser 8 | has_pm = True 9 | except ImportError: 10 | has_pm = False # pymatgen is optional; Niggli reduction is skipped without it 11 | import warnings 12 | 13 | def 
get_cif_files(mofpath,skip_mofs=None): 14 | """ 15 | Get the list of CIF files 16 | Args: 17 | mofpath (string): directory to CIF files 18 | 19 | skip_mofs (list): list of MOFs to ignore 20 | 21 | Returns: 22 | sorted_cifs (list): alphabetized list of CIF files 23 | """ 24 | cif_files = [] 25 | if skip_mofs is None: 26 | skip_mofs = [] 27 | for filename in os.listdir(mofpath): 28 | filename = filename.strip() 29 | if '.cif' in filename or 'POSCAR_' in filename: 30 | if '.cif' in filename: 31 | refcode = filename.split('.cif')[0] 32 | elif 'POSCAR_' in filename: 33 | refcode = filename.split('POSCAR_')[1] 34 | if refcode not in skip_mofs: 35 | cif_files.append(filename) 36 | else: 37 | pprint('Skipping '+refcode) 38 | 39 | sorted_cifs = sorted(cif_files) 40 | 41 | return sorted_cifs 42 | 43 | def cif_to_mof(filepath,niggli): 44 | """ 45 | Convert file to ASE Atoms object 46 | Args: 47 | filepath (string): full path to structure file 48 | 49 | niggli (bool): if Niggli-reduction should be performed 50 | 51 | Returns: 52 | mof (ASE Atoms object): structure read from the file 53 | """ 54 | 55 | tol = 0.8 56 | if niggli and has_pm: 57 | if '.cif' in os.path.basename(filepath): 58 | parser = CifParser(filepath) 59 | pm_mof = parser.get_structures(primitive=True)[0] 60 | else: 61 | pm_mof = pm.Structure.from_file(filepath,primitive=True) 62 | pm_mof.to(filename='POSCAR') 63 | mof = read('POSCAR') 64 | write('POSCAR',mof) 65 | elif niggli and not has_pm: 66 | warnings.warn('Pymatgen not installed. 
Niggli set to False',Warning) 67 | else: 68 | mof = read(filepath) 69 | write('POSCAR',mof) 70 | 71 | mof = read('POSCAR') 72 | d = mof.get_all_distances() 73 | min_val = np.min(d[d>0]) 74 | if min_val < tol: 75 | pprint('WARNING: Atoms overlap by '+str(min_val)) 76 | 77 | return mof 78 | -------------------------------------------------------------------------------- /other/dft_workflow/mof_screen/pymofscreen/compute_environ.py: -------------------------------------------------------------------------------- 1 | def get_nprocs(submit_script): 2 | """ 3 | Get the number of processors from submit script 4 | Args: 5 | submit_script (string): name of submission script 6 | 7 | Returns: 8 | nprocs (int): number of total processors 9 | 10 | ppn (int): number of processors per node 11 | """ 12 | 13 | #Setup for MOAB 14 | # with open(submit_script,'r') as rf: 15 | # for line in rf: 16 | # if 'nodes' in line or 'ppn' in line: 17 | # line = line.strip().replace(' ','') 18 | # nodes = int(line.split('nodes=')[1].split(':ppn=')[0]) 19 | # ppn = int(line.split('nodes=')[1].split(':ppn=')[1]) 20 | # nprocs = nodes*ppn 21 | 22 | #Setup for SLURM 23 | with open(submit_script,'r') as rf: 24 | for line in rf: 25 | if '-N' in line: 26 | line = line.strip().replace(' ','') 27 | nodes = int(line.split('-N')[1]) 28 | if '--ntasks-per-node' in line: 29 | line = line.strip().replace(' ','') 30 | ppn = int(line.split('=')[1]) 31 | nprocs = nodes*ppn 32 | 33 | #Setup for MOAB at Thunder 34 | # with open(submit_script,'r') as rf: 35 | # for line in rf: 36 | # if 'select' in line: 37 | # line = line.strip().replace(' ','') 38 | # nodes = int(line.split('=')[1].split(':')[0]) 39 | # ppn = int(line.split('=')[2]) 40 | # nprocs = nodes*ppn 41 | 42 | return nprocs, ppn 43 | 44 | def choose_vasp_version(gpt_version,nprocs): 45 | """ 46 | Choose the appropriate VASP version (std or gam) 47 | Args: 48 | gpt_version (bool): True if gamma-point only or False 49 | if standard version 50 | 51 | nprocs 
(int): total number of processors 52 | """ 53 | 54 | runvasp_file = open('run_vasp.py','w') 55 | 56 | #Setup for A.S. Rosen on Quest 57 | #parallel_cmd = 'mpirun -n'+' '+str(nprocs)+' ' 58 | #vasp_path = '/home/asr731/software/vasp_builds/bin/' 59 | #vasp_ex = [vasp_path+'vasp_std',vasp_path+'vasp_gam'] 60 | #module_cmd = 'module load mpi/openmpi-1.8.3-intel2013.2' 61 | 62 | #Setup for Cori/KNL 63 | # parallel_cmd = 'srun -n'+' '+str(nprocs)+' ' 64 | # vasp_path = '' 65 | # vasp_ex = [vasp_path+'vasp_std',vasp_path+'vasp_gam'] 66 | # module_cmd = 'module load vasp-tpc/5.4.1-knl' 67 | 68 | #Setup for Cori/SKX 69 | # parallel_cmd = 'srun -n'+' '+str(nprocs)+' ' 70 | # vasp_path = '' 71 | # vasp_ex = [vasp_path+'vasp_std',vasp_path+'vasp_gam'] 72 | # module_cmd = 'module load vasp-tpc/5.4.1-hsw' 73 | 74 | #Setup for Stampede2 75 | parallel_cmd = 'ibrun -n'+' '+str(nprocs)+' ' 76 | vasp_path = '' 77 | vasp_ex = [vasp_path+'vasp_std_vtst',vasp_path+'vasp_gam_vtst'] 78 | module_cmd = 'module purge all && module load intel/18.0.2 && module load impi/18.0.2 && module load vasp/5.4.4' 79 | 80 | #Setup for Mustang 81 | # parallel_cmd = 'export VASP_NPROCS='+str(nprocs)+' && ' 82 | # vasp_path = '' 83 | # vasp_ex = [vasp_path+'vasp-vtst_3.2',vasp_path+'vasp_real-vtst_3.2'] 84 | # module_cmd = 'module load VASP/5.4.4' 85 | 86 | #Setting up run_vasp.py 87 | vasp_cmd = parallel_cmd+vasp_ex[0] 88 | gamvasp_cmd = parallel_cmd+vasp_ex[1] 89 | if gpt_version: 90 | runvasp_file.write("import os\nexitcode = os.system(" 91 | +"'"+module_cmd+' && '+gamvasp_cmd+"'"+')') 92 | else: 93 | runvasp_file.write("import os\nexitcode = os.system(" 94 | +"'"+module_cmd+' && '+vasp_cmd+"'"+')') 95 | 96 | runvasp_file.close() 97 | -------------------------------------------------------------------------------- /other/dft_workflow/mof_screen/pymofscreen/default_calculators.py: -------------------------------------------------------------------------------- 1 | from ase.calculators.vasp import Vasp 2 
| 3 | #default parameters for calculators 4 | defaults = { 5 | 'xc': 'PBE', 6 | 'ivdw': 12, 7 | 'encut': 520, 8 | 'prec': 'Accurate', 9 | 'algo': 'All', 10 | 'ediff': 1e-4, 11 | 'ediffxl': 1e-6, 12 | 'nelm': 150, 13 | 'nelmxl': 150, 14 | 'nelmin': 3, 15 | 'lreal': False, 16 | 'ismear': 0, 17 | 'sigma': 0.01, 18 | 'nsw': 500, 19 | 'ediffg': -0.03, 20 | 'lorbit': 11, 21 | 'isym': 0, 22 | 'symprec': 1e-8, 23 | 'setups': {'base':'recommended','Li':'','W':'_sv','Eu':'_3','Yb':'_3'}, 24 | 'ldau_luj': None, 25 | 'lasph': False, 26 | 'nupdown': -1 27 | } 28 | 29 | def calcs(calc_name): 30 | """ 31 | Define the default calculators for relaxations 32 | Note: it should not include the kpts, gamma, or images keywords! 33 | Args: 34 | calc_name (string): name of calculator 35 | 36 | Returns: 37 | calc (dict): ASE Vasp calculator dictionary 38 | """ 39 | if calc_name == 'scf_test': 40 | calc = Vasp( 41 | xc=defaults['xc'], 42 | setups=defaults['setups'], 43 | encut=defaults['encut'], 44 | ivdw=defaults['ivdw'], 45 | prec=defaults['prec'], 46 | algo=defaults['algo'], 47 | ediff=defaults['ediffxl'], 48 | nelm=defaults['nelmxl'], 49 | lreal=False, 50 | ismear=defaults['ismear'], 51 | sigma=defaults['sigma'], 52 | lcharg=True, 53 | laechg=True, 54 | lwave=True, 55 | nsw=0, 56 | lorbit=defaults['lorbit'], 57 | isym=defaults['isym'], 58 | symprec=defaults['symprec'], 59 | addgrid=False, 60 | ldau_luj=defaults['ldau_luj'], 61 | lasph=defaults['lasph'], 62 | nupdown=defaults['nupdown'] 63 | ) 64 | elif calc_name == 'ase_bfgs': 65 | calc = Vasp( 66 | xc=defaults['xc'], 67 | setups=defaults['setups'], 68 | ivdw=defaults['ivdw'], 69 | prec=defaults['prec'], 70 | algo=defaults['algo'], 71 | ediff=defaults['ediff'], 72 | nelm=defaults['nelm']*1.5, 73 | nelmin=defaults['nelmin'], 74 | lreal=defaults['lreal'], 75 | ismear=defaults['ismear'], 76 | sigma=defaults['sigma'], 77 | lcharg=False, 78 | lwave=True, 79 | lorbit=defaults['lorbit'], 80 | isym=defaults['isym'], 81 | 
symprec=defaults['symprec'], 82 | ldau_luj=defaults['ldau_luj'], 83 | lasph=defaults['lasph'], 84 | nupdown=defaults['nupdown'] 85 | ) 86 | elif calc_name == 'isif2_lowacc': 87 | calc = Vasp( 88 | xc=defaults['xc'], 89 | setups=defaults['setups'], 90 | ivdw=defaults['ivdw'], 91 | prec=defaults['prec'], 92 | algo=defaults['algo'], 93 | ediff=defaults['ediff'], 94 | nelm=defaults['nelm'], 95 | nelmin=defaults['nelmin'], 96 | lreal=defaults['lreal'], 97 | ismear=defaults['ismear'], 98 | sigma=defaults['sigma'], 99 | lcharg=False, 100 | lwave=True, 101 | ibrion=2, 102 | isif=2, 103 | nsw=250, 104 | ediffg=-0.05, 105 | lorbit=defaults['lorbit'], 106 | isym=defaults['isym'], 107 | symprec=defaults['symprec'], 108 | ldau_luj=defaults['ldau_luj'], 109 | lasph=defaults['lasph'], 110 | nupdown=defaults['nupdown'] 111 | ) 112 | elif calc_name == 'isif2_medacc': 113 | calc = Vasp( 114 | xc=defaults['xc'], 115 | setups=defaults['setups'], 116 | ivdw=defaults['ivdw'], 117 | prec=defaults['prec'], 118 | algo=defaults['algo'], 119 | ediff=defaults['ediff'], 120 | nelm=defaults['nelm'], 121 | nelmin=8, 122 | lreal=defaults['lreal'], 123 | ismear=defaults['ismear'], 124 | sigma=defaults['sigma'], 125 | lcharg=False, 126 | lwave=True, 127 | ibrion=3, 128 | iopt=7, 129 | potim=0, 130 | isif=2, 131 | nsw=defaults['nsw'], 132 | ediffg=-0.05, 133 | lorbit=defaults['lorbit'], 134 | isym=defaults['isym'], 135 | symprec=defaults['symprec'], 136 | ldau_luj=defaults['ldau_luj'], 137 | lasph=defaults['lasph'], 138 | nupdown=defaults['nupdown'] 139 | ) 140 | elif calc_name == 'isif2_highacc': 141 | calc = Vasp( 142 | xc=defaults['xc'], 143 | setups=defaults['setups'], 144 | encut=defaults['encut'], 145 | ivdw=defaults['ivdw'], 146 | prec=defaults['prec'], 147 | algo=defaults['algo'], 148 | ediff=1e-6, 149 | nelm=defaults['nelm'], 150 | nelmin=8, 151 | lreal=defaults['lreal'], 152 | ismear=defaults['ismear'], 153 | sigma=defaults['sigma'], 154 | lcharg=False, 155 | lwave=True, 156 | ibrion=3, 
157 | iopt=7, 158 | potim=0, 159 | isif=2, 160 | nsw=defaults['nsw'], 161 | ediffg=defaults['ediffg'], 162 | lorbit=defaults['lorbit'], 163 | isym=defaults['isym'], 164 | symprec=defaults['symprec'], 165 | ldau_luj=defaults['ldau_luj'], 166 | lasph=defaults['lasph'], 167 | nupdown=defaults['nupdown'] 168 | ) 169 | elif calc_name == 'isif3_lowacc': 170 | calc = Vasp( 171 | xc=defaults['xc'], 172 | setups=defaults['setups'], 173 | encut=defaults['encut'], 174 | ivdw=defaults['ivdw'], 175 | prec=defaults['prec'], 176 | algo=defaults['algo'], 177 | ediff=defaults['ediffxl'], 178 | nelm=defaults['nelm'], 179 | nelmin=defaults['nelmin'], 180 | lreal=defaults['lreal'], 181 | ismear=defaults['ismear'], 182 | sigma=defaults['sigma'], 183 | lcharg=False, 184 | lwave=True, 185 | ibrion=2, 186 | isif=3, 187 | nsw=30, 188 | ediffg=defaults['ediffg'], 189 | lorbit=defaults['lorbit'], 190 | isym=defaults['isym'], 191 | symprec=defaults['symprec'], 192 | ldau_luj=defaults['ldau_luj'], 193 | lasph=defaults['lasph'], 194 | nupdown=defaults['nupdown'] 195 | ) 196 | elif calc_name == 'isif3_highacc': 197 | calc = Vasp( 198 | xc=defaults['xc'], 199 | setups=defaults['setups'], 200 | encut=defaults['encut'], 201 | ivdw=defaults['ivdw'], 202 | prec=defaults['prec'], 203 | algo=defaults['algo'], 204 | ediff=defaults['ediffxl'], 205 | nelm=defaults['nelm'], 206 | nelmin=defaults['nelmin'], 207 | lreal=defaults['lreal'], 208 | ismear=defaults['ismear'], 209 | sigma=defaults['sigma'], 210 | lcharg=False, 211 | lwave=True, 212 | ibrion=2, 213 | isif=3, 214 | nsw=30, 215 | ediffg=defaults['ediffg'], 216 | lorbit=defaults['lorbit'], 217 | isym=defaults['isym'], 218 | symprec=defaults['symprec'], 219 | ldau_luj=defaults['ldau_luj'], 220 | lasph=defaults['lasph'], 221 | nupdown=defaults['nupdown'] 222 | ) 223 | elif calc_name == 'final_spe': 224 | calc = Vasp( 225 | xc=defaults['xc'], 226 | setups=defaults['setups'], 227 | encut=defaults['encut'], 228 | ivdw=defaults['ivdw'], 229 | 
prec=defaults['prec'], 230 | algo=defaults['algo'], 231 | ediff=defaults['ediffxl'], 232 | nelm=defaults['nelmxl'], 233 | lreal=False, 234 | ismear=defaults['ismear'], 235 | sigma=defaults['sigma'], 236 | lcharg=True, 237 | laechg=True, 238 | lwave=True, 239 | nsw=0, 240 | lorbit=defaults['lorbit'], 241 | isym=defaults['isym'], 242 | symprec=defaults['symprec'], 243 | addgrid=False, 244 | ldau_luj=defaults['ldau_luj'], 245 | lasph=defaults['lasph'], 246 | nupdown=defaults['nupdown'] 247 | ) 248 | elif calc_name == 'cineb_lowacc': 249 | calc = Vasp( 250 | xc=defaults['xc'], 251 | setups=defaults['setups'], 252 | ivdw=defaults['ivdw'], 253 | prec=defaults['prec'], 254 | algo=defaults['algo'], 255 | ediff=1e-6, 256 | nelm=100, 257 | nelmin=defaults['nelmin'], 258 | lreal=defaults['lreal'], 259 | ismear=defaults['ismear'], 260 | sigma=defaults['sigma'], 261 | lcharg=False, 262 | lwave=True, 263 | ibrion=3, 264 | potim=0, 265 | iopt=1, 266 | nsw=defaults['nsw'], 267 | ediffg=-0.1, 268 | lclimb=True, 269 | lorbit=defaults['lorbit'], 270 | isym=defaults['isym'], 271 | symprec=defaults['symprec'], 272 | ichain=0, 273 | ldau_luj=defaults['ldau_luj'], 274 | lasph=defaults['lasph'], 275 | nupdown=defaults['nupdown'] 276 | ) 277 | elif calc_name == 'dimer_lowacc': 278 | calc = Vasp( 279 | xc=defaults['xc'], 280 | setups=defaults['setups'], 281 | ivdw=defaults['ivdw'], 282 | prec=defaults['prec'], 283 | algo=defaults['algo'], 284 | ediff=1e-8, 285 | nelm=defaults['nelm'], 286 | nelmin=defaults['nelmin'], 287 | lreal=defaults['lreal'], 288 | ismear=defaults['ismear'], 289 | sigma=defaults['sigma'], 290 | lcharg=False, 291 | lwave=True, 292 | ibrion=3, 293 | potim=0, 294 | iopt=7, 295 | nsw=defaults['nsw']*4, 296 | ediffg=-0.075, 297 | lorbit=defaults['lorbit'], 298 | isym=defaults['isym'], 299 | symprec=defaults['symprec'], 300 | ichain=2, 301 | ldau_luj=defaults['ldau_luj'], 302 | lasph=defaults['lasph'], 303 | nupdown=defaults['nupdown'] 304 | ) 305 | elif calc_name == 
'dimer_medacc': 306 | calc = Vasp( 307 | xc=defaults['xc'], 308 | setups=defaults['setups'], 309 | ivdw=defaults['ivdw'], 310 | prec=defaults['prec'], 311 | algo=defaults['algo'], 312 | ediff=1e-8, 313 | nelm=defaults['nelm'], 314 | nelmin=defaults['nelmin'], 315 | lreal=defaults['lreal'], 316 | ismear=defaults['ismear'], 317 | sigma=defaults['sigma'], 318 | lcharg=False, 319 | lwave=True, 320 | ibrion=3, 321 | potim=0, 322 | iopt=7, 323 | nsw=defaults['nsw']*2, 324 | ediffg=defaults['ediffg'], 325 | lorbit=defaults['lorbit'], 326 | isym=defaults['isym'], 327 | symprec=defaults['symprec'], 328 | ichain=2, 329 | ldau_luj=defaults['ldau_luj'], 330 | lasph=defaults['lasph'], 331 | nupdown=defaults['nupdown'] 332 | ) 333 | elif calc_name == 'dimer_highacc': 334 | calc = Vasp( 335 | xc=defaults['xc'], 336 | encut=defaults['encut'], 337 | setups=defaults['setups'], 338 | ivdw=defaults['ivdw'], 339 | prec=defaults['prec'], 340 | algo=defaults['algo'], 341 | ediff=1e-8, 342 | nelm=defaults['nelm'], 343 | nelmin=defaults['nelmin'], 344 | lreal=defaults['lreal'], 345 | ismear=defaults['ismear'], 346 | sigma=defaults['sigma'], 347 | lcharg=False, 348 | lwave=True, 349 | ibrion=3, 350 | potim=0, 351 | iopt=7, 352 | nsw=defaults['nsw']*2, 353 | ediffg=defaults['ediffg_dimerhigh'], 354 | lorbit=defaults['lorbit'], 355 | isym=defaults['isym'], 356 | symprec=defaults['symprec'], 357 | ichain=2, 358 | ldau_luj=defaults['ldau_luj'], 359 | lasph=defaults['lasph'], 360 | nupdown=defaults['nupdown'] 361 | ) 362 | else: 363 | raise ValueError('Out of range for calculators') 364 | 365 | return calc -------------------------------------------------------------------------------- /other/dft_workflow/mof_screen/pymofscreen/error_handler.py: -------------------------------------------------------------------------------- 1 | from ase.io import read 2 | from pymofscreen.magmom_handler import get_incar_magmoms, continue_failed_magmoms 3 | from pymofscreen.calc_swaps import update_calc 4 | 5 | 
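# Illustrative sketch only, not part of pymofscreen: the substring checks in
# check_line_for_error can also be written as a lookup table, which keeps the
# pattern/code pairs in one place. ERROR_PATTERNS and find_error_codes are
# hypothetical names, and only a handful of the patterns from this module are
# listed; the first-match-per-line behaviour of the if/elif chain is assumed
# to be the intended one.

```python
# Hypothetical table of (substrings, error code); the strings are taken from
# check_line_for_error, but the table is deliberately incomplete.
ERROR_PATTERNS = [
    (('TOO FEW BANDS',), 'too_few_bands'),
    (('BRIONS problems: POTIM should be increased',), 'brions'),
    (('ZBRENT: fatal internal in', 'ZBRENT: fatal error in bracketing'), 'zbrent'),
    (('internal error in subroutine PRICEL',), 'pricel'),
]

def find_error_codes(lines):
    """Return the sorted, de-duplicated error codes found in an iterable of lines."""
    found = set()
    for line in lines:
        for substrings, code in ERROR_PATTERNS:
            if any(s in line for s in substrings):
                found.add(code)
                break  # mirror the if/elif chain: first match per line wins
    return sorted(found)
```

# Returning a sorted list rather than list(set(...)) makes the order of the
# codes reproducible between runs, which may help when comparing logs.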
def get_error_msgs(outcarfile,refcode,stdout_file): 6 | """ 7 | Parse error messages from VASP 8 | Args: 9 | outcarfile (string): path to OUTCAR file 10 | 11 | refcode (string): name of MOF 12 | 13 | stdout_file (string): path to stdout file 14 | 15 | Returns: 16 | errormsg (list of strings): error messages in OUTCAR/stdout 17 | """ 18 | 19 | errormsg = [] 20 | start = False 21 | with open(outcarfile,'r') as rf: 22 | for line in rf: 23 | errormsg = check_line_for_error(line,errormsg) 24 | with open(stdout_file,'r') as rf: 25 | for line in rf: 26 | if 'STARTING '+refcode in line: 27 | start = True 28 | if start: 29 | errormsg = check_line_for_error(line,errormsg) 30 | errormsg = list(set(errormsg)) 31 | 32 | return errormsg 33 | 34 | def get_warning_msgs(outcarfile): 35 | """ 36 | Parse warning messages from VASP 37 | Args: 38 | outcarfile (string): path to OUTCAR file 39 | 40 | Returns: 41 | warningmsg (list of strings): warning messages in OUTCAR 42 | """ 43 | 44 | warningmsg = [] 45 | with open(outcarfile,'r') as rf: 46 | for line in rf: 47 | if 'You have a (more or less)' in line: 48 | warningmsg.append('large_supercell') 49 | warningmsg = list(set(warningmsg)) 50 | 51 | return warningmsg 52 | 53 | def check_line_for_error(line,errormsg): 54 | """ 55 | Parse given line for VASP error code 56 | Args: 57 | line (string): line from OUTCAR or stdout 58 | 59 | Returns: 60 | errormsg (list of strings): VASP error codes found so far 61 | """ 62 | 63 | if 'inverse of rotation matrix was not found (increase SYMPREC)' in line: 64 | errormsg.append('inv_rot_mat') 65 | elif 'WARNING: Sub-Space-Matrix is not hermitian in DAV' in line: 66 | errormsg.append('subspacematrix') 67 | elif 'Routine TETIRR needs special values' in line: 68 | errormsg.append('tetirr') 69 | elif 'Could not get correct shifts' in line: 70 | errormsg.append('incorrect_shift') 71 | elif ('REAL_OPTLAY: internal error' in line 72 | or 'REAL_OPT: internal ERROR' in line): 73 | errormsg.append('real_optlay') 74 | elif 'ERROR RSPHER' in 
line: 75 | errormsg.append('rspher') 76 | elif 'DENTET' in line: 77 | errormsg.append('dentet') 78 | elif 'TOO FEW BANDS' in line: 79 | errormsg.append('too_few_bands') 80 | elif 'BRIONS problems: POTIM should be increased' in line: 81 | errormsg.append('brions') 82 | elif 'internal error in subroutine PRICEL' in line: 83 | errormsg.append('pricel') 84 | elif 'One of the lattice vectors is very long (>50 A), but AMIN' in line: 85 | errormsg.append('amin') 86 | elif ('ZBRENT: fatal internal in' in line 87 | or 'ZBRENT: fatal error in bracketing' in line): 88 | errormsg.append('zbrent') 89 | elif 'ERROR in subspace rotation PSSYEVX' in line: 90 | errormsg.append('pssyevx') 91 | elif 'WARNING in EDDRMM: call to ZHEGV failed' in line: 92 | errormsg.append('eddrmm') 93 | elif 'Error EDDDAV: Call to ZHEGV failed' in line: 94 | errormsg.append('edddav') 95 | elif 'EDWAV: internal error, the gradient is not orthogonal' in line: 96 | errormsg.append('grad_not_orth') 97 | elif 'ERROR: SBESSELITER : nicht konvergent' in line: 98 | errormsg.append('nicht_konv') 99 | elif 'ERROR EDDIAG: Call to routine ZHEEV failed!' 
in line: 100 | errormsg.append('zheev') 101 | elif 'ELF: KPAR>1 not implemented' in line: 102 | errormsg.append('elf_kpar') 103 | elif 'RHOSYG internal error' in line: 104 | errormsg.append('rhosyg') 105 | elif 'POSMAP internal error: symmetry equivalent atom not found' in line: 106 | errormsg.append('posmap') 107 | elif 'internal error in subroutine IBZKPT' in line: 108 | errormsg.append('ibzkpt') 109 | elif 'internal error in subroutine SGRCON' in line: 110 | errormsg.append('sgrcon') 111 | elif 'plane wave coefficients changed' in line: 112 | errormsg.append('pwave') 113 | return errormsg 114 | 115 | def update_calc_after_errors(calc,calc_swaps,errormsg): 116 | """ 117 | Update an ASE Vasp calculator object based on error messages 118 | Args: 119 | calc (dict): ASE Vasp calculator dictionary 120 | 121 | calc_swaps (list of strings): list of calc swaps 122 | 123 | errormsg (list of strings): list of error messages 124 | 125 | Returns: 126 | calc (dict): updated ASE Vasp calculator 127 | 128 | calc_swaps (list of strings): updated list of calc swaps 129 | """ 130 | 131 | for msg in errormsg: 132 | if msg not in calc_swaps: 133 | calc_swaps.append(msg) 134 | 135 | calc, calc_swaps = update_calc(calc,calc_swaps) 136 | 137 | for swap in calc_swaps: 138 | 139 | if swap == 'too_few_bands': 140 | with open('OUTCAR','r') as outcarfile: 141 | for line in outcarfile: 142 | if 'NBANDS' in line: 143 | try: 144 | d = line.split("=") 145 | nbands = int(d[-1].strip()) 146 | break 147 | except (IndexError, ValueError): 148 | pass 149 | nbands = int(1.1*nbands) 150 | calc_swaps.append('nbands='+str(nbands)) 151 | calc.int_params['nbands'] = nbands 152 | calc_swaps.remove('too_few_bands') 153 | 154 | elif swap == 'brions': 155 | with open('OUTCAR','r') as outcarfile: 156 | for line in outcarfile: 157 | if 'POTIM' in line: 158 | try: 159 | potim = float(line.split('=')[-1].split('time-step')[0]) 160 | break 161 | except (IndexError, ValueError): 162 | pass 163 | 
calc_swaps.append('potim='+str(potim)) 164 | calc.float_params['potim'] = potim 165 | calc_swaps.remove('brions') 166 | 167 | return calc, calc_swaps 168 | 169 | def reset_mof(): 170 | """ 171 | Reset the ASE atoms object to the POSCAR/INCAR settings 172 | Returns: 173 | mof (ASE Atoms object): reset ASE Atoms object 174 | """ 175 | 176 | mof = read('POSCAR') 177 | mof.set_initial_magnetic_moments(get_incar_magmoms('INCAR','POSCAR')) 178 | 179 | return mof 180 | 181 | def continue_mof(): 182 | """ 183 | Update ASE Atoms object after failed job 184 | Returns: 185 | mof (ASE Atoms object): updated ASE Atoms object 186 | """ 187 | 188 | try: 189 | mof = read('CONTCAR') 190 | mof = continue_failed_magmoms(mof) 191 | except: 192 | mof = reset_mof() 193 | 194 | return mof 195 | 196 | def get_niter(outcarfile): 197 | """ 198 | Get the number of ionic steps that were run 199 | Args: 200 | outcarfile (string): full path to OUTCAR file 201 | 202 | Returns: 203 | niter (int): number of ionic iterations 204 | """ 205 | 206 | with open(outcarfile,'r') as rf: 207 | for line in rf: 208 | if '- Iteration' in line: 209 | niter = line.split('(')[0].split('n')[-1].strip() 210 | niter = int(niter) 211 | return niter -------------------------------------------------------------------------------- /other/dft_workflow/mof_screen/pymofscreen/janitor.py: -------------------------------------------------------------------------------- 1 | import os 2 | from shutil import copyfile, rmtree 3 | 4 | def clean_files(remove_files): 5 | """ 6 | Remove specified files 7 | Args: 8 | remove_files (list of strings): filenames to remove 9 | """ 10 | 11 | for file in remove_files: 12 | if os.path.isfile(file): 13 | os.remove(file) 14 | 15 | def vtst_cleanup(): 16 | """ 17 | Remove VTST-generated files/folders 18 | """ 19 | 20 | if os.path.exists('neb'): 21 | rmtree('neb') 22 | if os.path.exists('dim'): 23 | rmtree('dim') 24 | if os.path.exists('neb.tar.gz'): 25 | os.remove('neb.tar.gz') 26 | 
clean_files(['MODECAR']) 27 | 28 | def prep_paths(basepath): 29 | """ 30 | Prepare the necessary file paths 31 | Args: 32 | basepath (string): full path to base 33 | """ 34 | 35 | error_path = os.path.join(basepath,'errors') 36 | results_path = os.path.join(basepath,'results') 37 | working_path = os.path.join(basepath,'working') 38 | screen_results_path = os.path.join(basepath,'results','screen_results.dat') 39 | log_file = 'screening.log' 40 | if not os.path.exists(error_path): 41 | os.makedirs(error_path) 42 | if not os.path.exists(results_path): 43 | os.makedirs(results_path) 44 | if not os.path.exists(working_path): 45 | os.makedirs(working_path) 46 | if os.path.isfile(screen_results_path): 47 | open(screen_results_path,'w').close() 48 | if os.path.isfile(log_file): 49 | open(log_file,'w').close() 50 | clean_files(['run_vasp.py','neb.tar.gz','AECCAR0','AECCAR0.gz','AECCAR1','AECCAR2','AECCAR2.gz','CENTCAR','CHG','ase-sort.dat','DIMCAR','DOSCAR','EIGENVAL','IBZKPT','OSZICAR','PCDAT','PROCAR','REPORT','vasprun.xml','XDATCAR','WAVECAR','WAVECAR.gz','CHGCAR','CHGCAR.gz','STOPCAR']) 51 | vtst_cleanup() 52 | 53 | def manage_restart_files(file_path,dimer=False,neb=False,wavechg=True): 54 | """ 55 | Copy restart files to current directory 56 | Args: 57 | file_path (string): path restart files 58 | """ 59 | 60 | gzip_list = ['AECCAR0','AECCAR2','CHGCAR','DOSCAR','WAVECAR'] 61 | if wavechg: 62 | files = ['WAVECAR','CHGCAR'] 63 | else: 64 | files = [] 65 | if dimer and neb: 66 | raise ValueError('Cannot be both NEB and dimer') 67 | if dimer: 68 | files += ['NEWMODECAR','CENTCAR'] 69 | if neb: 70 | files = ['neb.tar.gz'] 71 | for file in files: 72 | full_path = os.path.join(file_path,file) 73 | if file == 'NEWMODECAR': 74 | if os.path.isfile('MODECAR'): 75 | os.remove('MODECAR') 76 | copyfile(full_path,'MODECAR') 77 | else: 78 | if not os.path.isfile(file) or os.stat(file).st_size == 0: 79 | if file in gzip_list and not os.path.isfile(full_path): 80 | file += '.gz' 81 | 
full_path += '.gz' 82 | if os.path.isfile(full_path) and os.stat(full_path).st_size > 0: 83 | copyfile(full_path,file) 84 | if '.tar.gz' in file: 85 | os.system('tar -zxf '+file) 86 | elif '.gz' in file: 87 | os.system('gunzip '+file) -------------------------------------------------------------------------------- /other/dft_workflow/mof_screen/pymofscreen/kpts_handler.py: -------------------------------------------------------------------------------- 1 | import numpy as np 2 | import os 3 | try: 4 | import pymatgen as pm 5 | from pymatgen.io.cif import CifParser 6 | from pymatgen.io.vasp.inputs import Kpoints 7 | has_pm = True 8 | except: 9 | has_pm = False 10 | 11 | def get_kpts(screener,cif_file,level): 12 | """ 13 | Obtain the number of kpoints 14 | Args: 15 | screener (class): pymofscreen.screener class 16 | 17 | cif_file (string): name of CIF file 18 | 19 | level (string): accuracy level 20 | 21 | Returns: 22 | kpts (list of ints): kpoint grid 23 | 24 | gamma (bool): True for gamma-centered 25 | """ 26 | niggli = screener.niggli 27 | mofpath = screener.mofpath 28 | kpts_path = screener.kpts_path 29 | kppas = screener.kppas 30 | kpts = None 31 | if not mofpath: 32 | mofpath = '' 33 | 34 | if kpts_path == 'Auto' and has_pm: 35 | 36 | if level == 'low': 37 | kppa = kppas[0] 38 | elif level == 'high': 39 | kppa = kppas[1] 40 | else: 41 | raise ValueError('kpoints accuracy level not defined') 42 | filepath = os.path.join(mofpath,cif_file) 43 | if '.cif' in cif_file: 44 | parser = CifParser(filepath) 45 | pm_mof = parser.get_structures(primitive=niggli)[0] 46 | else: 47 | pm_mof = pm.Structure.from_file(filepath,primitive=niggli) 48 | pm_kpts = Kpoints.automatic_density(pm_mof,kppa) 49 | kpts = pm_kpts.kpts[0] 50 | 51 | if pm_kpts.style.name == 'Gamma': 52 | gamma = True 53 | else: 54 | gamma = None 55 | elif kpts_path == 'Auto' and not has_pm: 56 | raise ValueError('Pymatgen not installed. 
Please provide a kpts file.') 57 | else: 58 | old_cif_name = cif_file.split('.cif')[0].split('_')[0] 59 | infile = open(kpts_path,'r') 60 | lines = infile.read().splitlines() 61 | infile.close() 62 | 63 | for i in range(len(lines)): 64 | if old_cif_name in lines[i]: 65 | if level == 'low': 66 | kpts = lines[i+1] 67 | gamma = lines[i+2] 68 | elif level == 'high': 69 | kpts = lines[i+3] 70 | gamma = lines[i+4] 71 | else: 72 | raise ValueError('Incompatible KPPA with prior runs') 73 | break 74 | kpts = np.squeeze(np.asarray(np.matrix(kpts))).tolist() 75 | if not kpts or len(kpts) != 3: 76 | raise ValueError('Error parsing k-points for '+cif_file) 77 | 78 | if gamma == 'True': 79 | gamma = True 80 | elif gamma == 'False': 81 | gamma = False 82 | else: 83 | raise ValueError('Error parsing gamma for '+cif_file) 84 | 85 | return kpts, gamma -------------------------------------------------------------------------------- /other/dft_workflow/mof_screen/pymofscreen/magmom_handler.py: -------------------------------------------------------------------------------- 1 | import numpy as np 2 | import os 3 | from ase.io import read 4 | from copy import copy, deepcopy 5 | from pymofscreen.metal_types import mag_list, dblock_metals, fblock_metals 6 | from pymofscreen.writers import pprint 7 | 8 | def get_incar_magmoms(incarpath,poscarpath): 9 | """ 10 | Read in the magnetic moments in the INCAR 11 | Args: 12 | incarpath (string): path to INCAR 13 | 14 | poscarpath (string): path to POSCAR 15 | 16 | Returns: 17 | mof_mag_list (list of floats): magnetic moments 18 | """ 19 | mof_mag_list = [] 20 | init_mof = read(poscarpath) 21 | with open(incarpath,'r') as incarfile: 22 | for line in incarfile: 23 | line = line.strip() 24 | if 'MAGMOM' in line: 25 | mag_line = line.split('= ')[1:][0].split(' ') 26 | for val in mag_line: 27 | mag = float(val.split('*')[1]) 28 | num = int(val.split('*')[0]) 29 | mof_mag_list.extend([mag]*num) 30 | if not bool(mof_mag_list): 31 | mof_mag_list = 
np.zeros(len(init_mof)) 32 | if len(mof_mag_list) != len(init_mof): 33 | raise ValueError('Error reading INCAR magnetic moments') 34 | 35 | return mof_mag_list 36 | 37 | def get_mag_indices(mof): 38 | """ 39 | Get the indices of metals that could be magnetic 40 | Args: 41 | mof (ASE Atoms object): MOF structure 42 | 43 | Returns: 44 | mag_indices (list of ints): indices of those metals 45 | """ 46 | mag_indices = [atom.index for atom in mof if atom.number in mag_list] 47 | 48 | return mag_indices 49 | 50 | def set_initial_magmoms(mof,spin_level): 51 | """ 52 | Set the initial magnetic moments for each atom 53 | Args: 54 | mof (ASE Atoms object): MOF structure 55 | 56 | spin_level (string or list of floats): default spin setting, or explicit magmoms 57 | 58 | Returns: 59 | mof (ASE Atoms object): MOF structure with initial magmoms 60 | """ 61 | 62 | if isinstance(spin_level,list): 63 | if len(spin_level) != len(mof): 64 | raise ValueError('Magmom list is incompatible with number of atoms') 65 | mof.set_initial_magnetic_moments(spin_level) 66 | 67 | elif isinstance(spin_level,str): 68 | spin_level = spin_level.lower() 69 | mag_indices = get_mag_indices(mof) 70 | mof.set_initial_magnetic_moments(np.zeros(len(mof))) 71 | 72 | if 'afm' not in spin_level: 73 | for mag_idx in mag_indices: 74 | mag_number = mof[mag_idx].number 75 | if spin_level == 'high': 76 | if mag_number in dblock_metals: 77 | mof[mag_idx].magmom = 5.0 78 | elif mag_number in fblock_metals: 79 | mof[mag_idx].magmom = 7.0 80 | else: 81 | raise ValueError('Metal not properly classified') 82 | elif spin_level == 'low': 83 | mof[mag_idx].magmom = 0.1 84 | else: 85 | raise ValueError('Undefined spin level') 86 | elif spin_level == 'afm_high': 87 | AFM_cutoff = 5 88 | Mi = mag_indices[0] 89 | M_distances = mof.get_distances(Mi,mag_indices,mic=True,vector=False).tolist() 90 | mag_indices = [mag_indices[i] for i in np.argsort(M_distances).tolist()] 91 | mags = copy(mag_indices) 92 | sign = 1 93 | del mags[0] 94 | for i, mag_idx in 
enumerate(mag_indices): 95 | if i != 0: 96 | M_distances = mof.get_distances(Mi,mags,mic=True,vector=False).tolist() 97 | min_idx = np.argmin(M_distances) 98 | Mj = mags[min_idx] 99 | d = M_distances[min_idx] 100 | if d <= AFM_cutoff: 101 | sign = -sign 102 | else: 103 | sign = 1 104 | Mi = Mj 105 | mags.remove(Mj) 106 | del M_distances[min_idx] 107 | mag_number = mof[mag_idx].number 108 | if mag_number in dblock_metals: 109 | mof[mag_idx].magmom = sign*5.0 110 | elif mag_number in fblock_metals: 111 | mof[mag_idx].magmom = sign*7.0 112 | else: 113 | raise ValueError('Metal not properly classified') 114 | else: 115 | raise ValueError('Undefined AFM spin level') 116 | 117 | else: 118 | raise TypeError('spin_level has wrong type') 119 | 120 | return mof 121 | 122 | def continue_magmoms(mof,incarpath): 123 | """ 124 | Continue magmoms from prior run 125 | Args: 126 | mof (ASE Atoms object): MOF structure 127 | 128 | incarpath (string): path to INCAR 129 | 130 | Returns: 131 | mof (ASE Atoms object): MOF structure with initial magmoms 132 | """ 133 | with open(incarpath,'r') as incarfile: 134 | for line in incarfile: 135 | line = line.strip() 136 | if 'ISPIN = 2' in line: 137 | mof_magmoms = mof.get_magnetic_moments() 138 | mof.set_initial_magnetic_moments(mof_magmoms) 139 | 140 | return mof 141 | 142 | def get_abs_magmoms(mof,incarpath): 143 | """ 144 | Get absolute magmoms, indices, and ispin value from INCAR 145 | Args: 146 | mof (ASE Atoms object): MOF structure 147 | 148 | incarpath (string): path to INCAR 149 | 150 | Returns: 151 | abs_magmoms (list of floats): absolute values of magmoms 152 | 153 | mag_indices (list of ints): ASE indices 154 | 155 | ispin (bool): True if ispin = 2 in INCAR 156 | """ 157 | mag_indices = get_mag_indices(mof) 158 | ispin = False 159 | with open(incarpath,'r') as incarfile: 160 | for line in incarfile: 161 | line = line.strip() 162 | if 'ISPIN = 2' in line: 163 | ispin = True 164 | mof_magmoms = mof.get_magnetic_moments() 165 | 
abs_magmoms = np.abs(mof_magmoms[mag_indices]) 166 | if not ispin: 167 | abs_magmoms = np.zeros(len(mag_indices)) 168 | 169 | return abs_magmoms, mag_indices, ispin 170 | 171 | def continue_failed_magmoms(mof): 172 | """ 173 | If job failed, try to read magmoms from OUTCAR 174 | Args: 175 | mof (ASE Atoms object): MOF structure 176 | 177 | Returns: 178 | mof (ASE Atoms object): MOF structure with old magmoms 179 | """ 180 | self_resort = [] 181 | file = open('ase-sort.dat', 'r') 182 | lines = file.readlines() 183 | file.close() 184 | for line in lines: 185 | data = line.split() 186 | self_resort.append(int(data[1])) 187 | magnetic_moments = np.zeros(len(mof)) 188 | n = 0 189 | lines = open('OUTCAR', 'r').readlines() 190 | for line in lines: 191 | if line.rfind('magnetization (x)') > -1: 192 | for m in range(len(mof)): 193 | magnetic_moments[m] = float(lines[n + m + 4].split()[4]) 194 | n += 1 195 | sorted_magmoms = np.array(magnetic_moments)[self_resort] 196 | ispin = False 197 | with open('INCAR','r') as incarfile: 198 | for line in incarfile: 199 | line = line.strip() 200 | if 'ISPIN = 2' in line: 201 | ispin = True 202 | if ispin and all(sorted_magmoms == 0.0): 203 | raise ValueError('Error reading magmoms from failed OUTCAR') 204 | mof.set_initial_magnetic_moments(sorted_magmoms) 205 | 206 | return mof 207 | 208 | def check_if_new_spin(screener,mof,refcode,acc_level,current_spin): 209 | """ 210 | Check if new spin converged to old spin 211 | Args: 212 | screener (class): pymofscreen.screen.screener class 213 | 214 | mof (ASE Atoms object): MOF structure 215 | 216 | refcode (string): name of MOF 217 | 218 | acc_level (string): current accuracy level 219 | 220 | current_spin (string): current spin level 221 | 222 | Returns: 223 | True or False depending on if new spin converged to old spin 224 | """ 225 | basepath = screener.basepath 226 | spin_labels = screener.spin_labels 227 | results_partial_path = os.path.join(basepath,'results',refcode,acc_level) 228 | 
success_path = os.path.join(results_partial_path,current_spin) 229 | incarpath = os.path.join(success_path,'INCAR') 230 | mof = deepcopy(mof) 231 | mof = continue_magmoms(mof,incarpath) 232 | 233 | for prior_spin in spin_labels: 234 | if prior_spin == current_spin: 235 | break 236 | old_mof_path = os.path.join(results_partial_path,prior_spin,'OUTCAR') 237 | old_incar_path = os.path.join(results_partial_path,prior_spin,'INCAR') 238 | mag_indices = get_mag_indices(mof) 239 | old_mof = read(old_mof_path) 240 | old_abs_magmoms, old_mag_indices, old_ispin = get_abs_magmoms(old_mof,old_incar_path) 241 | mof_mag = mof.get_initial_magnetic_moments()[mag_indices] 242 | if old_ispin: 243 | old_mof_mag = old_mof.get_magnetic_moments()[mag_indices] 244 | else: 245 | old_mof_mag = [0]*len(mag_indices) 246 | mag_tol = 0.1 247 | if np.sum(np.abs(mof_mag - old_mof_mag) >= mag_tol) == 0: 248 | pprint('Skipping rest because '+current_spin+' converged to '+prior_spin) 249 | return False 250 | return True 251 | 252 | def check_if_skip_low_spin(screener,mof,refcode,spin_label): 253 | """ 254 | Check if low spin job should be skipped 255 | Args: 256 | screener (class): pymofscreen.screen.screener class 257 | 258 | mof (ASE Atoms object): MOF structure 259 | 260 | refcode (string): name of MOF 261 | 262 | spin_label (string): current spin label 263 | 264 | Returns: 265 | skip_low_spin (bool): True if low spin should be skipped 266 | """ 267 | acc_levels = screener.acc_levels 268 | acc_level = acc_levels[-1] 269 | basepath = screener.basepath 270 | success_path = os.path.join(basepath,'results',refcode,acc_level,spin_label) 271 | incarpath = os.path.join(success_path,'INCAR') 272 | skip_low_spin = False 273 | 274 | abs_magmoms, mag_indices, ispin = get_abs_magmoms(mof,incarpath) 275 | if not mag_indices: 276 | skip_low_spin = True 277 | else: 278 | if np.sum(abs_magmoms < 0.1) == len(abs_magmoms): 279 | skip_low_spin = True 280 | 281 | return skip_low_spin 
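# Illustrative sketch only, not part of pymofscreen: get_incar_magmoms expands
# MAGMOM tokens of the form count*value. A slightly more tolerant parser might
# look like the following. parse_magmom_line and its natoms argument are
# hypothetical names, and treating a bare value (no '*') as a single-atom
# entry is an assumption about the INCAR files being read.

```python
def parse_magmom_line(line, natoms=None):
    """Expand an INCAR MAGMOM line such as 'MAGMOM = 2*5.0 3*0.0' into a flat list."""
    tokens = line.split('=', 1)[1].split()
    magmoms = []
    for token in tokens:
        if '*' in token:
            # count*value shorthand, e.g. '2*5.0' -> [5.0, 5.0]
            count, mag = token.split('*')
            magmoms.extend([float(mag)] * int(count))
        else:
            # Assumption: a bare value applies to exactly one atom
            magmoms.append(float(token))
    # Optional consistency check against the number of atoms in the structure
    if natoms is not None and len(magmoms) != natoms:
        raise ValueError('Error reading INCAR magnetic moments')
    return magmoms
```

# Splitting on whitespace rather than on a single space means repeated spaces
# between tokens do not produce empty strings.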
-------------------------------------------------------------------------------- /other/dft_workflow/mof_screen/pymofscreen/metal_types.py: -------------------------------------------------------------------------------- 1 | import numpy as np 2 | 3 | #Refer to magmom_handler.py for usage of variables defined here 4 | dblock_metals = np.concatenate((np.arange(21,30,1),np.arange(39,48,1), 5 | np.arange(71,80,1),np.arange(103,112,1)),axis=0).tolist() 6 | fblock_metals = np.concatenate((np.arange(57,71,1),np.arange(89,103,1)), 7 | axis=0).tolist() 8 | mag_list = dblock_metals+fblock_metals 9 | nomag_list = [val for val in np.arange(1,119,1) if val not in dblock_metals+fblock_metals] 10 | -------------------------------------------------------------------------------- /other/dft_workflow/mof_screen/pymofscreen/runner.py: -------------------------------------------------------------------------------- 1 | import os 2 | from copy import deepcopy 3 | import numpy as np 4 | from ase.io import read 5 | from ase.optimize import BFGSLineSearch 6 | from pymofscreen.calc_swaps import update_calc, check_nprocs 7 | from pymofscreen.error_handler import get_niter, get_error_msgs, update_calc_after_errors, continue_mof 8 | from pymofscreen.compute_environ import choose_vasp_version 9 | from pymofscreen.janitor import clean_files 10 | from pymofscreen.magmom_handler import continue_magmoms 11 | 12 | def mof_run(workflow,mof,calc,kpts,images=None,force_nupdown=False): 13 | """ 14 | Run an atoms.get_potential_energy() calculation 15 | Args: 16 | workflow (class): pymofscreen.screen_phases.workflow class 17 | 18 | mof (ASE Atoms object): ASE Atoms object for MOF 19 | 20 | calc (dict): ASE Vasp calculator 21 | 22 | kpts (list of ints): k-point grid 23 | 24 | images (int): number of NEB images 25 | 26 | force_nupdown (bool): force NUPDOWN to nearest int 27 | 28 | Returns: 29 | mof (ASE Atoms object): updated ASE Atoms object 30 | 31 | calc_swaps (list of strings): calc swaps 32 | """ 33 
| 34 | nprocs = workflow.nprocs 35 | ppn = workflow.ppn 36 | calc_swaps = workflow.calc_swaps 37 | refcode = workflow.refcode 38 | stdout_file = workflow.stdout_file 39 | calc_swaps = workflow.calc_swaps 40 | gamma = workflow.kpts_dict['gamma'] 41 | 42 | if force_nupdown: 43 | init_mags = mof.get_initial_magnetic_moments() 44 | summed_mags = np.sum(np.abs(init_mags)) 45 | nupdown = int(np.round(summed_mags,0)) 46 | calc.int_params['nupdown'] = nupdown 47 | elif workflow.nupdown is not None: 48 | calc.int_params['nupdown'] = workflow.nupdown 49 | 50 | if sum(kpts) == 3: 51 | gpt_version = True 52 | else: 53 | gpt_version = False 54 | if images is not None: 55 | neb = True 56 | calc.int_params['images'] = images 57 | else: 58 | neb = False 59 | if not neb: 60 | try: 61 | nprocs = check_nprocs(len(mof),nprocs,ppn) 62 | except: 63 | pass 64 | choose_vasp_version(gpt_version,nprocs) 65 | calc.input_params['kpts'] = kpts 66 | calc.input_params['gamma'] = gamma 67 | if calc.int_params['ncore'] is None and calc.int_params['npar'] is None: 68 | calc.int_params['ncore'] = int(ppn/2.0) 69 | calc, calc_swaps = update_calc(calc,calc_swaps) 70 | mof.set_calculator(calc) 71 | success = False 72 | 73 | try: 74 | mof.get_potential_energy() 75 | niter = get_niter('OUTCAR') 76 | if niter < mof.calc.int_params['nsw'] and mof.calc.converged != True: 77 | raise SystemError('VASP stopped but did not die') 78 | success = True 79 | except: 80 | 81 | if not os.path.isfile('STOPCAR') and not neb: 82 | 83 | old_error_len = 0 84 | restart_files = ['WAVECAR','CHGCAR'] 85 | 86 | while True: 87 | 88 | errormsg = get_error_msgs('OUTCAR',refcode,stdout_file) 89 | print(errormsg) 90 | calc, calc_swaps = update_calc_after_errors(calc,calc_swaps, 91 | errormsg) 92 | error_len = len(errormsg) 93 | if error_len == old_error_len: 94 | break 95 | 96 | clean_files(restart_files) 97 | mof = continue_mof() 98 | choose_vasp_version(gpt_version,nprocs) 99 | mof.set_calculator(calc) 100 | 101 | try: 102 | 
mof.get_potential_energy() 103 | niter = get_niter('OUTCAR') 104 | if (niter < mof.calc.int_params['nsw'] and 105 | mof.calc.converged != True): 106 | raise SystemError('VASP stopped but did not die') 107 | success = True 108 | except: 109 | pass 110 | 111 | old_error_len = error_len 112 | 113 | if not success: 114 | mof = None 115 | 116 | return mof, calc_swaps 117 | 118 | def mof_bfgs_run(workflow,mof,calc,kpts,steps=100,fmax=0.05,force_nupdown=False): 119 | """ 120 | Run ASE BFGSLineSearch calculation 121 | Args: 122 | workflow (class): pymofscreen.screen_phases.workflow class 123 | 124 | mof (ASE Atoms object): ASE Atoms object for MOF 125 | 126 | calc (dict): ASE Vasp calculator 127 | 128 | kpts (list of ints): k-point grid 129 | 130 | steps (int): maximum number of steps 131 | 132 | fmax (float): force tolerance 133 | 134 | force_nupdown (bool): force NUPDOWN to nearest int 135 | 136 | Returns: 137 | mof (ASE Atoms object): updated ASE Atoms object 138 | 139 | dyn (class): ASE dynamics class 140 | 141 | calc_swaps (list of strings): calc swaps 142 | """ 143 | 144 | nprocs = workflow.nprocs 145 | ppn = workflow.ppn 146 | calc_swaps = workflow.calc_swaps 147 | refcode = workflow.refcode 148 | stdout_file = workflow.stdout_file 149 | calc_swaps = workflow.calc_swaps 150 | gamma = workflow.kpts_dict['gamma'] 151 | 152 | if force_nupdown: 153 | init_mags = mof.get_initial_magnetic_moments() 154 | summed_mags = np.sum(np.abs(init_mags)) 155 | nupdown = int(np.round(summed_mags,0)) 156 | calc.int_params['nupdown'] = nupdown 157 | elif workflow.nupdown is not None: 158 | calc.int_params['nupdown'] = workflow.nupdown 159 | 160 | if sum(kpts) == 3: 161 | gpt_version = True 162 | else: 163 | gpt_version = False 164 | 165 | nprocs = check_nprocs(len(mof),nprocs,ppn) 166 | choose_vasp_version(gpt_version,nprocs) 167 | calc.input_params['kpts'] = kpts 168 | calc.input_params['gamma'] = gamma 169 | if calc.int_params['ncore'] is None and calc.int_params['npar'] is None: 170 | 
calc.int_params['ncore'] = int(ppn/2.0) 171 | calc, calc_swaps = update_calc(calc,calc_swaps) 172 | mof.set_calculator(calc) 173 | dyn = BFGSLineSearch(mof,trajectory='opt.traj') 174 | success = False 175 | 176 | try: 177 | dyn.run(fmax=fmax,steps=steps) 178 | success = True 179 | except: 180 | 181 | if not os.path.isfile('STOPCAR'): 182 | 183 | old_error_len = 0 184 | restart_files = ['WAVECAR','CHGCAR'] 185 | 186 | while True: 187 | 188 | errormsg = get_error_msgs('OUTCAR',refcode,stdout_file) 189 | print(errormsg) 190 | calc, calc_swaps = update_calc_after_errors(calc,calc_swaps, 191 | errormsg) 192 | error_len = len(errormsg) 193 | if error_len == old_error_len: 194 | break 195 | 196 | clean_files(restart_files) 197 | mof = continue_mof() 198 | mof.set_calculator(calc) 199 | dyn = BFGSLineSearch(mof,trajectory='opt.traj') 200 | 201 | try: 202 | dyn.run(fmax=fmax,steps=steps) 203 | success = True 204 | except: 205 | pass 206 | 207 | old_error_len = error_len 208 | 209 | if not success: 210 | mof = None 211 | 212 | return mof, dyn, calc_swaps 213 | 214 | def prep_next_run(workflow): 215 | """ 216 | Prepare for the next run 217 | Args: 218 | workflow (class): pymofscreen.screen_phases.workflow class 219 | 220 | Returns: 221 | mof (ASE Atoms object): updated ASE Atoms object 222 | """ 223 | 224 | acc_levels = workflow.acc_levels 225 | acc_level = acc_levels[workflow.run_i] 226 | refcode = workflow.refcode 227 | spin_label = workflow.spin_label 228 | basepath = workflow.basepath 229 | success_path = os.path.join(basepath,'results',refcode,acc_level,spin_label) 230 | outcarpath = os.path.join(success_path,'OUTCAR') 231 | errorpath = os.path.join(basepath,'errors',refcode,acc_level,spin_label) 232 | 233 | if os.path.exists(errorpath): 234 | mof = None 235 | else: 236 | mof = read(outcarpath) 237 | workflow.run_i += 1 238 | 239 | return mof 240 | 241 | def prep_new_run(workflow): 242 | acc_levels = workflow.acc_levels 243 | acc_level = acc_levels[workflow.run_i-1] 244 
| refcode = workflow.refcode 245 | spin_label = workflow.spin_label 246 | basepath = workflow.basepath 247 | success_path = os.path.join(basepath,'results',refcode,acc_level,spin_label) 248 | outcarpath = os.path.join(success_path,'OUTCAR') 249 | incarpath = os.path.join(success_path,'INCAR') 250 | 251 | mof = read(outcarpath) 252 | mof_initialized = deepcopy(mof) 253 | mof_initialized = continue_magmoms(mof_initialized,incarpath) 254 | 255 | return mof_initialized -------------------------------------------------------------------------------- /other/dft_workflow/mof_screen/pymofscreen/vtst_handler.py: -------------------------------------------------------------------------------- 1 | import os 2 | import numpy as np 3 | import time 4 | from shutil import copyfile, rmtree, move 5 | from ase.io import read, write 6 | from ase.neb import NEB 7 | from pymofscreen.janitor import clean_files 8 | 9 | def nebmake(initial_atoms,final_atoms,n_images): 10 | """ 11 | Make interpolated images for NEB 12 | Args: 13 | initial_atoms (ASE Atoms object): initial MOF structure 14 | 15 | final_atoms (ASE Atoms object): final MOF structure 16 | 17 | n_images (int): number of NEB images 18 | """ 19 | pwd = os.getcwd() 20 | neb_path = os.path.join(pwd,'neb') 21 | if os.path.exists(neb_path): 22 | rmtree(neb_path) 23 | os.makedirs(neb_path) 24 | os.chdir(neb_path) 25 | 26 | images = [initial_atoms] 27 | for i in range(n_images): 28 | images.append(initial_atoms.copy()) 29 | images.append(final_atoms) 30 | 31 | neb = NEB(images) 32 | neb.interpolate('idpp',mic=True) 33 | for i, neb_image in enumerate(neb.images): 34 | if i < 10: 35 | ii = '0'+str(i) 36 | else: 37 | ii = str(i) 38 | os.mkdir(os.path.join(neb_path,ii)) 39 | write(os.path.join(neb_path,ii,'POSCAR'),neb_image,format='vasp') 40 | write_dummy_outcar(os.path.join(neb_path,'00','OUTCAR'),initial_atoms.get_potential_energy()) 41 | write_dummy_outcar(os.path.join(neb_path,ii,'OUTCAR'),final_atoms.get_potential_energy()) 42 | 43 | 
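# Illustrative sketch only, not part of pymofscreen: nebmake writes each image
# to a two-digit directory ('00', '01', ...), with the two endpoints included
# in the count. The same naming can be expressed with a format specifier.
# image_dirname and image_dirnames are hypothetical helper names.

```python
def image_dirname(i):
    """Return the two-digit directory name for NEB image i, e.g. 3 -> '03'."""
    return '{:02d}'.format(i)

def image_dirnames(n_images):
    """Directory names for both endpoints plus n_images intermediate images."""
    return [image_dirname(i) for i in range(n_images + 2)]
```

# For n_images intermediate images the last directory index is n_images+1,
# which is the directory that receives the final-state dummy OUTCAR.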
def write_dummy_outcar(name,E): 44 | """ 45 | Construct a dummy OUTCAR for images 0 and n 46 | Args: 47 | name (string): name of file to write 48 | 49 | E (float): energy to write out in dummy OUTCAR 50 | """ 51 | with open(name,'w') as wf: 52 | wf.write(' energy without entropy= energy(sigma->0) = '+str(E)+'\n') 53 | 54 | def neb2dim(): 55 | """ 56 | Construct initial dimer job from NEB 57 | """ 58 | pwd = os.getcwd() 59 | neb_path = os.path.join(pwd,'neb') 60 | os.chdir(neb_path) 61 | os.system('vfin.pl neb_fin') 62 | time.sleep(5) 63 | neb_fin_path = os.path.join(neb_path,'neb_fin') 64 | os.chdir(neb_fin_path) 65 | os.system('nebresults.pl') 66 | copyfile(os.path.join(neb_fin_path,'exts.dat'),os.path.join(neb_path,'exts.dat')) 67 | os.chdir(neb_path) 68 | if os.stat(os.path.join(neb_path,'exts.dat')).st_size == 0: 69 | raise ValueError('Error with exts.dat file') 70 | os.system('neb2dim.pl') 71 | old_dim_path = os.path.join(neb_path,'dim') 72 | new_dim_path = os.path.join(pwd,'dim') 73 | move(old_dim_path,new_dim_path) 74 | os.chdir(new_dim_path) 75 | mof = read('POSCAR') 76 | 77 | max_F = 0 78 | high_i = 0 79 | if os.stat(os.path.join(neb_fin_path,'nebef.dat')).st_size == 0: 80 | raise ValueError('nebef.dat not written') 81 | with open(os.path.join(neb_fin_path,'nebef.dat'),'r') as rf: 82 | for i, line in enumerate(rf): 83 | line = line.strip() 84 | max_F_temp = np.fromstring(line,dtype=float,sep=' ')[1] 85 | if max_F_temp > max_F: 86 | max_F = max_F_temp 87 | high_i = i 88 | try: 89 | if high_i < 10: 90 | str_high_i = '0'+str(high_i) 91 | else: 92 | str_high_i = str(high_i) 93 | move(os.path.join(neb_fin_path,str_high_i,'WAVECAR.gz'),os.path.join(new_dim_path,'WAVECAR.gz')) 94 | os.system('gunzip WAVECAR.gz') 95 | except: 96 | pass 97 | 98 | return mof 99 | 100 | def dimmins(dis): 101 | """ 102 | Run dimmins.pl 103 | Args: 104 | dis (float): displacement vector 105 | """ 106 | os.system('vfin.pl dim_fin') 107 | rmtree('dim_fin') 108 | os.system('dimmins.pl 
POSCAR MODECAR '+str(dis)) 109 | 110 | def nebef(ediffg): 111 | """ 112 | Run nebef.pl 113 | Args: 114 | ediffg (float): specified EDIFFG value in VASP 115 | 116 | Returns: 117 | neb_conv (bool): True if NEB converged within EDIFFG 118 | """ 119 | ediffg = abs(ediffg) 120 | clean_files(['POSCAR']) 121 | open('nebef.dat','w').close() 122 | os.system('nebef.pl > nebef.dat') 123 | max_F = 0 124 | if os.stat('nebef.dat').st_size == 0: 125 | raise ValueError('nebef.dat not written') 126 | with open('nebef.dat','r') as rf: 127 | for line in rf: 128 | line = line.strip() 129 | max_F_temp = np.fromstring(line,dtype=float,sep=' ')[1] 130 | if max_F_temp > max_F: 131 | max_F = max_F_temp 132 | if max_F == 0.0: 133 | neb_conv = False 134 | elif max_F <= ediffg: 135 | neb_conv = True 136 | else: 137 | neb_conv = False 138 | 139 | return neb_conv -------------------------------------------------------------------------------- /other/dft_workflow/mof_screen/pymofscreen/writers.py: -------------------------------------------------------------------------------- 1 | import os 2 | from shutil import copyfile, move 3 | 4 | def pprint(printstr): 5 | """ 6 | Prints to stdout and appends to screening.log 7 | Args: 8 | printstr (string): string to print to stdout 9 | """ 10 | print(printstr) 11 | with open('screening.log','a') as txtfile: 12 | txtfile.write(printstr+'\n') 13 | 14 | def write_success(workflow,neb=False): 15 | """ 16 | Write out the successful job files 17 | Args: 18 | workflow (class): pymofscreen.screen_phases.workflow class 19 | """ 20 | spin_label = workflow.spin_label 21 | acc_level = workflow.acc_levels[workflow.run_i] 22 | pprint('SUCCESS: '+spin_label+', '+acc_level) 23 | refcode = workflow.refcode 24 | basepath = workflow.basepath 25 | vasp_files = workflow.vasp_files 26 | gzip_list = ['AECCAR0','AECCAR2','CHGCAR','DOSCAR','WAVECAR','PROCAR'] 27 | if not neb: 28 | success_path = os.path.join(basepath,'results',refcode,acc_level,spin_label) 29 | elif neb: 30 | success_path = 
os.path.join(basepath,'results',refcode,acc_level) 31 | if not os.path.exists(success_path): 32 | os.makedirs(success_path) 33 | if not neb: 34 | if acc_level in ['scf_test','final_spe']: 35 | files_to_copy = vasp_files+['DOSCAR','EIGENVAL','IBZKPT','PROCAR','AECCAR0','AECCAR2','OSZICAR'] 36 | elif 'dimer' in acc_level: 37 | files_to_copy = vasp_files+['DIMCAR','MODECAR','NEWMODECAR','CENTCAR'] 38 | else: 39 | files_to_copy = vasp_files 40 | for file in files_to_copy: 41 | if os.path.isfile(file) and os.stat(file).st_size > 0: 42 | write_to_path = os.path.join(success_path,file) 43 | if file in gzip_list: 44 | os.system('gzip < '+file+' > '+file+'.gz') 45 | move(file+'.gz',write_to_path+'.gz') 46 | else: 47 | copyfile(file,write_to_path) 48 | elif neb: 49 | tar_file = 'neb.tar.gz' 50 | os.system('tar -zcvf '+tar_file+' neb') 51 | if os.path.isfile(tar_file) and os.stat(tar_file).st_size > 0: 52 | write_to_path = os.path.join(success_path,tar_file) 53 | copyfile(tar_file,write_to_path) 54 | os.remove('neb.tar.gz') 55 | 56 | def write_errors(workflow,mof,neb=False): 57 | """ 58 | Write out the unsuccessful job files 59 | Args: 60 | workflow (class): pymofscreen.screen_phases.workflow class 61 | 62 | mof (ASE Atoms object): ASE Atoms object 63 | """ 64 | spin_label = workflow.spin_label 65 | acc_level = workflow.acc_levels[workflow.run_i] 66 | pprint('ERROR: '+spin_label+', '+acc_level+' failed') 67 | gzip_list = ['AECCAR0','AECCAR2','CHGCAR','DOSCAR','WAVECAR'] 68 | if acc_level != 'scf_test' and 'neb' not in acc_level: 69 | if mof is None: 70 | pprint('^ VASP crashed') 71 | else: 72 | pprint('Calculation not converged') 73 | refcode = workflow.refcode 74 | basepath = workflow.basepath 75 | vasp_files = workflow.vasp_files 76 | if not neb: 77 | error_path = os.path.join(basepath,'errors',refcode,acc_level,spin_label) 78 | elif neb: 79 | error_path = os.path.join(basepath,'errors',refcode,acc_level,spin_label) 80 | if not os.path.exists(error_path): 81 | 
os.makedirs(error_path) 82 | if not neb: 83 | if 'dimer' in acc_level: 84 | files_to_copy = vasp_files+['DIMCAR','MODECAR','NEWMODECAR','CENTCAR'] 85 | else: 86 | files_to_copy = vasp_files 87 | for file in files_to_copy: 88 | if os.path.isfile(file) and os.stat(file).st_size > 0: 89 | write_to_path = os.path.join(error_path,file) 90 | if file in gzip_list: 91 | os.system('gzip < '+file+' > '+file+'.gz') 92 | move(file+'.gz',write_to_path+'.gz') 93 | else: 94 | copyfile(file,write_to_path) 95 | elif neb: 96 | tar_file = 'neb.tar.gz' 97 | os.system('tar -zcvf '+tar_file+' neb') 98 | if os.path.isfile(tar_file) and os.stat(tar_file).st_size > 0: 99 | write_to_path = os.path.join(error_path,tar_file) 100 | copyfile(tar_file,write_to_path) 101 | -------------------------------------------------------------------------------- /other/dft_workflow/mof_screen/requirements.txt: -------------------------------------------------------------------------------- 1 | -e git+https://github.com/arosen93/rASE#egg=rASE 2 | Pymatgen>=2018.11.30,<2022.0.0 3 | -------------------------------------------------------------------------------- /other/dft_workflow/mof_screen/setup.py: -------------------------------------------------------------------------------- 1 | from setuptools import setup, find_packages 2 | 3 | setup(name='PyMOFScreen', 4 | description='Python code to do high-throughput DFT of MOFs with VASP', 5 | author='Andrew S. 
Rosen', 6 | author_email='rosen@u.northwestern.edu', 7 | url='https://github.com/arosen93/mof_screen', 8 | python_requires='>=3.6.0', 9 | version='1.1', 10 | packages=find_packages(), 11 | license='MIT' 12 | ) 13 | -------------------------------------------------------------------------------- /other/dft_workflow/mof_screen/toc.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/Andrew-S-Rosen/QMOF/d21e83843dc2b002836c130c3fad84063bf04af6/other/dft_workflow/mof_screen/toc.png -------------------------------------------------------------------------------- /other/dft_workflow/mofpath/example.cif: -------------------------------------------------------------------------------- 1 | data_image0 2 | _cell_length_a 9.98589 3 | _cell_length_b 9.98589 4 | _cell_length_c 15.0927 5 | _cell_angle_alpha 89.9998 6 | _cell_angle_beta 90.0002 7 | _cell_angle_gamma 104.944 8 | 9 | _symmetry_space_group_name_H-M "P 1" 10 | _symmetry_int_tables_number 1 11 | 12 | loop_ 13 | _symmetry_equiv_pos_as_xyz 14 | 'x, y, z' 15 | 16 | loop_ 17 | _atom_site_label 18 | _atom_site_occupancy 19 | _atom_site_fract_x 20 | _atom_site_fract_y 21 | _atom_site_fract_z 22 | _atom_site_thermal_displace_type 23 | _atom_site_B_iso_or_equiv 24 | _atom_site_type_symbol 25 | Fe1 1.0000 0.25000 0.75000 0.61548 Biso 1.000 Fe 26 | Fe2 1.0000 0.75000 0.25000 0.38452 Biso 1.000 Fe 27 | H1 1.0000 0.07966 0.57966 0.75389 Biso 1.000 H 28 | H2 1.0000 0.42034 0.92035 0.75389 Biso 1.000 H 29 | H3 1.0000 0.92034 0.42034 0.24611 Biso 1.000 H 30 | H4 1.0000 0.57965 0.07965 0.24611 Biso 1.000 H 31 | H5 1.0000 0.07176 0.57176 0.91819 Biso 1.000 H 32 | H6 1.0000 0.42824 0.92824 0.91819 Biso 1.000 H 33 | H7 1.0000 0.92824 0.42824 0.08181 Biso 1.000 H 34 | H8 1.0000 0.57176 0.07176 0.08181 Biso 1.000 H 35 | H9 1.0000 0.07227 0.57227 0.31323 Biso 1.000 H 36 | H10 1.0000 0.42773 0.92773 0.31323 Biso 1.000 H 37 | H11 1.0000 0.92773 0.42773 0.68677 Biso 1.000 H 38 
| H12 1.0000 0.57227 0.07227 0.68677 Biso 1.000 H 39 | H13 1.0000 0.07990 0.57990 0.47730 Biso 1.000 H 40 | H14 1.0000 0.42010 0.92010 0.47729 Biso 1.000 H 41 | H15 1.0000 0.92010 0.42010 0.52271 Biso 1.000 H 42 | H16 1.0000 0.57990 0.07990 0.52271 Biso 1.000 H 43 | Au1 1.0000 0.25000 0.25000 0.60636 Biso 1.000 Au 44 | Au2 1.0000 0.25000 0.25000 0.39364 Biso 1.000 Au 45 | Au3 1.0000 0.75000 0.75000 0.39363 Biso 1.000 Au 46 | Au4 1.0000 0.75000 0.75000 0.60636 Biso 1.000 Au 47 | C1 1.0000 0.23317 0.44142 0.61071 Biso 1.000 C 48 | C2 1.0000 0.26683 0.05858 0.61072 Biso 1.000 C 49 | C3 1.0000 0.05857 0.26683 0.38928 Biso 1.000 C 50 | C4 1.0000 0.44142 0.23317 0.38928 Biso 1.000 C 51 | C5 1.0000 0.76684 0.55858 0.38928 Biso 1.000 C 52 | C6 1.0000 0.73317 0.94142 0.38928 Biso 1.000 C 53 | C7 1.0000 0.94142 0.73317 0.61071 Biso 1.000 C 54 | C8 1.0000 0.55857 0.76683 0.61072 Biso 1.000 C 55 | C9 1.0000 0.15478 0.65478 0.79272 Biso 1.000 C 56 | C10 1.0000 0.34522 0.84522 0.79272 Biso 1.000 C 57 | C11 1.0000 0.84522 0.34522 0.20728 Biso 1.000 C 58 | C12 1.0000 0.65478 0.15478 0.20728 Biso 1.000 C 59 | C13 1.0000 0.15104 0.65104 0.88463 Biso 1.000 C 60 | C14 1.0000 0.34896 0.84896 0.88463 Biso 1.000 C 61 | C15 1.0000 0.84896 0.34895 0.11537 Biso 1.000 C 62 | C16 1.0000 0.65104 0.15104 0.11537 Biso 1.000 C 63 | C17 1.0000 0.25000 0.75000 0.93240 Biso 1.000 C 64 | C18 1.0000 0.75000 0.25000 0.06760 Biso 1.000 C 65 | C19 1.0000 0.25000 0.75000 0.02997 Biso 1.000 C 66 | C20 1.0000 0.75000 0.25000 0.97004 Biso 1.000 C 67 | C21 1.0000 0.25000 0.75000 0.20132 Biso 1.000 C 68 | C22 1.0000 0.75000 0.25000 0.79868 Biso 1.000 C 69 | C23 1.0000 0.25000 0.75000 0.29879 Biso 1.000 C 70 | C24 1.0000 0.75000 0.25000 0.70121 Biso 1.000 C 71 | C25 1.0000 0.15132 0.65132 0.34659 Biso 1.000 C 72 | C26 1.0000 0.34868 0.84868 0.34659 Biso 1.000 C 73 | C27 1.0000 0.84868 0.34868 0.65341 Biso 1.000 C 74 | C28 1.0000 0.65132 0.15132 0.65341 Biso 1.000 C 75 | C29 1.0000 0.15484 0.65484 0.43836 Biso 
1.000 C 76 | C30 1.0000 0.34516 0.84517 0.43836 Biso 1.000 C 77 | C31 1.0000 0.84516 0.34517 0.56164 Biso 1.000 C 78 | C32 1.0000 0.65483 0.15483 0.56164 Biso 1.000 C 79 | N1 1.0000 0.22927 0.55753 0.61423 Biso 1.000 N 80 | N2 1.0000 0.27073 0.94247 0.61423 Biso 1.000 N 81 | N3 1.0000 0.94247 0.27073 0.38577 Biso 1.000 N 82 | N4 1.0000 0.55753 0.22927 0.38577 Biso 1.000 N 83 | N5 1.0000 0.77073 0.44247 0.38577 Biso 1.000 N 84 | N6 1.0000 0.72927 0.05753 0.38577 Biso 1.000 N 85 | N7 1.0000 0.05753 0.72927 0.61423 Biso 1.000 N 86 | N8 1.0000 0.44247 0.77073 0.61423 Biso 1.000 N 87 | N9 1.0000 0.25000 0.75000 0.74661 Biso 1.000 N 88 | N10 1.0000 0.75000 0.25000 0.25340 Biso 1.000 N 89 | N11 1.0000 0.15169 0.65168 0.07204 Biso 1.000 N 90 | N12 1.0000 0.34832 0.84832 0.07204 Biso 1.000 N 91 | N13 1.0000 0.84832 0.34832 0.92796 Biso 1.000 N 92 | N14 1.0000 0.65168 0.15168 0.92796 Biso 1.000 N 93 | N15 1.0000 0.15169 0.65169 0.15933 Biso 1.000 N 94 | N16 1.0000 0.34831 0.84831 0.15933 Biso 1.000 N 95 | N17 1.0000 0.84831 0.34831 0.84068 Biso 1.000 N 96 | N18 1.0000 0.65169 0.15169 0.84068 Biso 1.000 N 97 | N19 1.0000 0.25000 0.75000 0.48424 Biso 1.000 N 98 | N20 1.0000 0.75000 0.25000 0.51576 Biso 1.000 N 99 | -------------------------------------------------------------------------------- /other/dft_workflow/runner/opt.py: -------------------------------------------------------------------------------- 1 | from pymofscreen.screen import screener 2 | from pymofscreen.default_calculators import defaults 3 | import os 4 | 5 | # Specifies paths 6 | basepath = os.path.join(os.getcwd(), '..') # path to store results 7 | mofpath = os.path.join(basepath, 'mofpath') # path where MOF CIFs are 8 | submit_script = 'sub_slurm.job' # path to job submission script 9 | 10 | # Modifies the defaults in `default_calculators.py` 11 | defaults['algo'] = 'Fast' 12 | 13 | # Get CIF files 14 | cif_files = os.listdir(mofpath) 15 | cif_files.sort() 16 | s = screener(basepath, mofpath, 
submit_script=submit_script, kppas=[100, 1000]) 17 | 18 | # Run DFT calculations 19 | for cif_file in cif_files: 20 | mof = s.run_screen(cif_file, 'volume', spin_levels=['high'], acc_levels=[ 21 | 'scf_test', 'isif2_lowacc', 'isif3_lowacc', 'isif3_highacc', 'final_spe']) 22 | -------------------------------------------------------------------------------- /other/dft_workflow/runner/sub_slurm.job: -------------------------------------------------------------------------------- 1 | #!/bin/bash 2 | #SBATCH --job-name="opt.py" 3 | #SBATCH -N 1 4 | #SBATCH --ntasks-per-node=48 5 | #SBATCH -t 04:00:00 6 | 7 | export VASP_SCRIPT=run_vasp.py 8 | python opt.py > opt.out 9 | -------------------------------------------------------------------------------- /other/example_dos/GUTYAW/CONTCAR: -------------------------------------------------------------------------------- 1 | Sr H C S O 2 | 1.00000000000000 3 | 4.8822619228908923 -0.0192408672100046 0.0139268912532824 4 | 1.9300331126912076 4.4846646841373614 0.0139140778187617 5 | 0.7618217242902056 0.4993259944361932 14.8673519817358333 6 | Sr H C S O 7 | 2 8 4 4 12 8 | Direct 9 | 0.0280331751028839 0.9719599267621035 0.7499994060465554 10 | 0.9719619113463622 0.0280414598470671 0.2500001025077054 11 | 0.6829985373166281 0.6813056397846182 0.4662684184971582 12 | 0.3186609119704613 0.3169928166589173 0.0337354538079992 13 | 0.3170011916748336 0.3186468756337248 0.5337367647432885 14 | 0.6812850676210971 0.6829970435376325 0.9662626242891861 15 | 0.8977394533591152 0.8300635335349682 0.5239741295254419 16 | 0.1699122848724883 0.1022696425544751 0.9760186584933663 17 | 0.1022552523449605 0.1699258642775092 0.4760123099875955 18 | 0.8300665683800901 0.8977234146273076 0.0239684683506098 19 | 0.8592846229543980 0.6286738422132956 0.5141970979377675 20 | 0.3713032026280274 0.1407106660474966 0.9858047336228069 21 | 0.1407046294277592 0.3713172815955659 0.4858050430392566 22 | 0.6286581975789005 0.8592873720344372 0.0141929539838657 
23 | 0.7130097925859218 0.5469163999733269 0.6178476473159833 24 | 0.4530853288265746 0.2870593243657495 0.8821561492084058 25 | 0.2870766663825677 0.4531022866849526 0.3821581601761608 26 | 0.5469018756042061 0.7129907549624974 0.1178421130427552 27 | 0.9112350549663546 0.5464544355601575 0.6881630246986887 28 | 0.4535044337563505 0.0888097093179923 0.8118310973920870 29 | 0.0888186812531870 0.4535171373869460 0.3118295098614396 30 | 0.5464658711113302 0.9112203436874324 0.1881612173665204 31 | 0.7132641120365619 0.2503239729981956 0.6140937555277759 32 | 0.7496538290478298 0.2867673922237870 0.8859092924918244 33 | 0.2867698831415737 0.7496730390753257 0.3859132769834517 34 | 0.2503202139616150 0.7132595227511871 0.1140920181233014 35 | 0.4184549537527857 0.7764182132627795 0.6283336676699918 36 | 0.2235340640016616 0.5816101503410280 0.8716769209395494 37 | 0.5816243886829326 0.2235511685553604 0.3716797027759782 38 | 0.7764258443105589 0.4184307697441483 0.1283262815934734 39 | 40 | 0.00000000E+00 0.00000000E+00 0.00000000E+00 41 | 0.00000000E+00 0.00000000E+00 0.00000000E+00 42 | 0.00000000E+00 0.00000000E+00 0.00000000E+00 43 | 0.00000000E+00 0.00000000E+00 0.00000000E+00 44 | 0.00000000E+00 0.00000000E+00 0.00000000E+00 45 | 0.00000000E+00 0.00000000E+00 0.00000000E+00 46 | 0.00000000E+00 0.00000000E+00 0.00000000E+00 47 | 0.00000000E+00 0.00000000E+00 0.00000000E+00 48 | 0.00000000E+00 0.00000000E+00 0.00000000E+00 49 | 0.00000000E+00 0.00000000E+00 0.00000000E+00 50 | 0.00000000E+00 0.00000000E+00 0.00000000E+00 51 | 0.00000000E+00 0.00000000E+00 0.00000000E+00 52 | 0.00000000E+00 0.00000000E+00 0.00000000E+00 53 | 0.00000000E+00 0.00000000E+00 0.00000000E+00 54 | 0.00000000E+00 0.00000000E+00 0.00000000E+00 55 | 0.00000000E+00 0.00000000E+00 0.00000000E+00 56 | 0.00000000E+00 0.00000000E+00 0.00000000E+00 57 | 0.00000000E+00 0.00000000E+00 0.00000000E+00 58 | 0.00000000E+00 0.00000000E+00 0.00000000E+00 59 | 0.00000000E+00 0.00000000E+00 0.00000000E+00 60 
| 0.00000000E+00 0.00000000E+00 0.00000000E+00 61 | 0.00000000E+00 0.00000000E+00 0.00000000E+00 62 | 0.00000000E+00 0.00000000E+00 0.00000000E+00 63 | 0.00000000E+00 0.00000000E+00 0.00000000E+00 64 | 0.00000000E+00 0.00000000E+00 0.00000000E+00 65 | 0.00000000E+00 0.00000000E+00 0.00000000E+00 66 | 0.00000000E+00 0.00000000E+00 0.00000000E+00 67 | 0.00000000E+00 0.00000000E+00 0.00000000E+00 68 | 0.00000000E+00 0.00000000E+00 0.00000000E+00 69 | 0.00000000E+00 0.00000000E+00 0.00000000E+00 70 | -------------------------------------------------------------------------------- /other/example_dos/GUTYAW/DOSCAR.gz: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/Andrew-S-Rosen/QMOF/d21e83843dc2b002836c130c3fad84063bf04af6/other/example_dos/GUTYAW/DOSCAR.gz -------------------------------------------------------------------------------- /other/example_dos/GUTYAW/dos.py: -------------------------------------------------------------------------------- 1 | from ase.io import read 2 | from pychemia.code.vasp import VaspXML 3 | import matplotlib.pyplot as plt 4 | from matplotlib import rc 5 | 6 | plt.rcParams['xtick.labelsize']=14 7 | plt.rcParams['ytick.labelsize']=14 8 | rc('text', usetex=True) 9 | 10 | mof = read('CONTCAR') 11 | Sr = [atom.index for atom in mof if atom.symbol == 'Sr'] 12 | O = [atom.index for atom in mof if atom.symbol == 'O'] 13 | C = [atom.index for atom in mof if atom.symbol == 'C'] 14 | H = [atom.index for atom in mof if atom.symbol == 'H'] 15 | S = [atom.index for atom in mof if atom.symbol == 'S'] 16 | VaspXML = VaspXML('vasprun.xml') 17 | 18 | colors = {'total':"#4d3c40", 19 | 'M':"#7248b7", 20 | 'C':'#9c9ec4', 21 | 'H':"#a4cb4f", 22 | 'O':"#c4529c", 23 | 'S':"#77b593", 24 | 'N':"#c25243"} 25 | 26 | s = [0] 27 | p = [1, 2, 3] 28 | d = [4, 5, 6, 7, 8] 29 | fig, ax = plt.subplots(3, figsize=(4, 6)) 30 | 31 | dos = VaspXML.dos_total 32 | x = dos.energies 33 | spins = [1] 34 | 
spinlabels = [r'$\uparrow$', r'$\downarrow$'] 35 | for i, spin in enumerate(spins): 36 | y = dos.dos[:, i+1]*spin 37 | if i == 0: 38 | label = 'Total' 39 | else: 40 | label = None 41 | ax[0].plot(x, y, '-', alpha=0.8, label=label, color=colors['total']) 42 | ax[0].axvline(x=0, color='k', linestyle='--', alpha=0.5) 43 | ax[0].axhline(y=0, color='k', linestyle='-') 44 | ax[0].legend(loc='upper left') 45 | ax[0].set_xlim([-12, 12]) 46 | ax[0].set_ylim([0,80]) 47 | ax[0].xaxis.get_ticklocs(minor=True) 48 | ax[0].yaxis.get_ticklocs(minor=True) 49 | ax[0].minorticks_on() 50 | pdos = VaspXML.dos_parametric(atoms=Sr, spin=[0]) 51 | for i, spin in enumerate(spins): 52 | y = pdos.dos[:, i+1]*spin 53 | if i == 0: 54 | label = 'Sr' 55 | else: 56 | label = None 57 | ax[1].plot(x, y, '-', alpha=0.8, label=label, color=colors['M']) 58 | ax[1].axvline(x=0, color='k', linestyle='--', alpha=0.5) 59 | ax[1].axhline(y=0, color='k', linestyle='-') 60 | ax[1].legend(loc='upper left') 61 | ax[1].set_xlim([-12, 12]) 62 | ax[1].set_ylim([0,10]) 63 | ax[1].xaxis.get_ticklocs(minor=True) 64 | ax[1].yaxis.get_ticklocs(minor=True) 65 | ax[1].minorticks_on() 66 | atoms = [C,H,O,S] 67 | atom_names = ['C','H','O','S'] 68 | for j, atom in enumerate(atoms): 69 | pdos = VaspXML.dos_parametric(atoms=atom, spin=[0]) 70 | for i, spin in enumerate(spins): 71 | y = pdos.dos[:, i+1]*spin 72 | if i == 0: 73 | label=atom_names[j] 74 | else: 75 | label = None 76 | ax[2].plot(x, y, '-', alpha=0.8, label=label, color=colors[atom_names[j]]) 77 | ax[2].axvline(x=0, color='k', linestyle='--', alpha=0.5) 78 | ax[2].axhline(y=0, color='k', linestyle='-') 79 | ax[2].legend(loc='upper left',ncol=2,columnspacing=1) 80 | 81 | ax[2].set_xlim([-12, 12]) 82 | ax[2].set_ylim([0,50]) 83 | ax[2].xaxis.get_ticklocs(minor=True) 84 | ax[2].yaxis.get_ticklocs(minor=True) 85 | ax[2].minorticks_on() 86 | 87 | ax[2].set_xlabel(r'$E-E_{\mathrm{f}}$ (eV)') 88 | ax[1].set_ylabel(r'Density of states (states/eV)') 89 | 
plt.tight_layout() 90 | plt.savefig('dos.png',dpi=500) 91 | -------------------------------------------------------------------------------- /other/example_dos/GUTYAW/vasprun.xml.gz: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/Andrew-S-Rosen/QMOF/d21e83843dc2b002836c130c3fad84063bf04af6/other/example_dos/GUTYAW/vasprun.xml.gz -------------------------------------------------------------------------------- /other/example_dos/LOJLAZ-HS/CONTCAR: -------------------------------------------------------------------------------- 1 | Fe H Au C N 2 | 1.00000000000000 3 | 3.0327972540000001 -9.9445705110000002 -0.0005783880000000 4 | 10.3892096320000000 -0.3660969750000000 0.0054680890000000 5 | 0.0082982530000000 0.0033554310000000 15.4604443380000003 6 | Fe H Au C N 7 | 2 16 4 32 20 8 | Direct 9 | 0.2500005134758041 0.2500121547414693 0.6158456575003868 10 | 0.7500988428836095 0.7499872483357777 0.3841223913953158 11 | 0.4199879323256539 0.0800049298121905 0.7656749821095303 12 | 0.0799072543095605 0.4200949596734773 0.7655704352349915 13 | 0.5799610724931483 0.9194384067714623 0.2346638810741908 14 | 0.9203115082600419 0.5798570513673553 0.2344254478029697 15 | 0.4283133396000451 0.0716841963957080 0.9255365138865841 16 | 0.0717738388079923 0.4282342086725990 0.9254397595284161 17 | 0.5716930864008916 0.9284635751842245 0.0744806367646405 18 | 0.9284999731165513 0.5715045610880978 0.0743699191498948 19 | 0.4279381550116526 0.0719287655174554 0.3081751143265308 20 | 0.0720055153104369 0.4281100806601330 0.3080765726374537 21 | 0.5718258318237233 0.9281692623662749 0.6920910674453324 22 | 0.9279826158398734 0.5718793748449968 0.6919723271080471 23 | 0.4200808414992707 0.0798290203144916 0.4675141476861953 24 | 0.0799962283804305 0.4201078931798534 0.4674101806859099 25 | 0.5796899659466632 0.9201354066857093 0.5324293224349219 26 | 0.9200338133593675 0.5805964042120877 0.5321869930660910 27 | 
0.7499897708349934 0.2499818137820711 0.6061052746876996 28 | 0.7500180201436351 0.2500293924277912 0.3940149768839234 29 | 0.2499140409207712 0.7501113184726051 0.3937754141302108 30 | 0.2500019484048650 0.7499960590521297 0.6059705237513455 31 | 0.5698062604227871 0.2261933524725848 0.6101262368337004 32 | 0.9301740494018205 0.2737684438561416 0.6101215983881545 33 | 0.7261179744253710 0.0702186291026266 0.3898824564436367 34 | 0.7738845075712319 0.4298835309602183 0.3898495303457352 35 | 0.4300028802276259 0.7739852126665028 0.3896987923995070 36 | 0.0699184128660448 0.7259834751243019 0.3900962071893872 37 | 0.2737922182808035 0.9302218906563056 0.6101869788346619 38 | 0.2261878690295305 0.5697569504690634 0.6102072577207736 39 | 0.3446728111719679 0.1553217369210529 0.8034413351889285 40 | 0.1552554556762047 0.3447360522122906 0.8033698436434591 41 | 0.6552280147819260 0.8444987859461861 0.1965901571165460 42 | 0.8449653898880172 0.6549113775174860 0.1963900608393061 43 | 0.3491691918636732 0.1508288186127373 0.8927684682348627 44 | 0.1508429375289637 0.3491619044907424 0.8926857272435385 45 | 0.6509858616189490 0.8494334344466807 0.1072331239999684 46 | 0.8493653682231468 0.6506624281034874 0.1070922249369843 47 | 0.2500232486206713 0.2499776538521488 0.9387432465129848 48 | 0.7500340556567693 0.7500715536152427 0.0612377194333789 49 | 0.2500038198772430 0.2499958067654902 0.0337731302201334 50 | 0.7499577616803492 0.7500043150735465 0.9662401901029156 51 | 0.2499970886329734 0.2500057806152611 0.1996868572444015 52 | 0.7500488053479728 0.7499986024394758 0.8003296987854256 53 | 0.2499873292080821 0.2500204474296268 0.2945528185383353 54 | 0.7499709646815802 0.7499324990030871 0.7054708673378727 55 | 0.3490192570621176 0.1510152442675192 0.3405928050082991 56 | 0.1509950221558185 0.3489839076722205 0.3405218468205575 57 | 0.6508178306494870 0.8491452282976866 0.6596190252097500 58 | 0.8488280866518565 0.6507621185078278 0.6594689197733956 59 | 
0.3448040537786667 0.1552151715448957 0.4297600091495468 60 | 0.1552480203389024 0.3447363752203856 0.4297001172263961 61 | 0.6549679927536260 0.8451279082269991 0.5704653371010266 62 | 0.8448182548082954 0.6554786750981663 0.5702587987938941 63 | 0.4629750844113687 0.2144341007464305 0.6145402439135026 64 | 0.0370275418112627 0.2855225496341731 0.6145088683008666 65 | 0.7159242617042807 0.9627134890677311 0.3861790611209912 66 | 0.7840558229740751 0.5372685312267365 0.3846898269097920 67 | 0.5375075229522466 0.7843383137623192 0.3860446650634586 68 | 0.9625931136940338 0.7155112193908479 0.3849238833744622 69 | 0.2855426983209313 0.0370737698671704 0.6145536456830030 70 | 0.2144485172739010 0.4628929027471216 0.6145536756344256 71 | 0.2499338656290888 0.2500499309332369 0.7592089209636725 72 | 0.7499494653714791 0.7492869754293636 0.2405155060623727 73 | 0.3477757598476856 0.1522556740957910 0.0747765443524813 74 | 0.1522117722682950 0.3477755944045455 0.0745939990046551 75 | 0.6522362978723635 0.8477662980413427 0.9252257743278705 76 | 0.8477477238091424 0.6521991544303418 0.9254109216591360 77 | 0.3478568843564460 0.1522158393822437 0.1588685818332962 78 | 0.1521747685951453 0.3477760176375142 0.1586873114515512 79 | 0.6522172108999911 0.8478419161862547 0.8411447223149366 80 | 0.8478019052824948 0.6521965939659751 0.8413305468495835 81 | 0.2500525757190104 0.2499449181441378 0.4737242828364359 82 | 0.7500462410484587 0.7507272354898333 0.5265197167974733 83 | 84 | 0.00000000E+00 0.00000000E+00 0.00000000E+00 85 | 0.00000000E+00 0.00000000E+00 0.00000000E+00 86 | 0.00000000E+00 0.00000000E+00 0.00000000E+00 87 | 0.00000000E+00 0.00000000E+00 0.00000000E+00 88 | 0.00000000E+00 0.00000000E+00 0.00000000E+00 89 | 0.00000000E+00 0.00000000E+00 0.00000000E+00 90 | 0.00000000E+00 0.00000000E+00 0.00000000E+00 91 | 0.00000000E+00 0.00000000E+00 0.00000000E+00 92 | 0.00000000E+00 0.00000000E+00 0.00000000E+00 93 | 0.00000000E+00 0.00000000E+00 0.00000000E+00 94 | 
0.00000000E+00 0.00000000E+00 0.00000000E+00 95 | 0.00000000E+00 0.00000000E+00 0.00000000E+00 96 | 0.00000000E+00 0.00000000E+00 0.00000000E+00 97 | 0.00000000E+00 0.00000000E+00 0.00000000E+00 98 | 0.00000000E+00 0.00000000E+00 0.00000000E+00 99 | 0.00000000E+00 0.00000000E+00 0.00000000E+00 100 | 0.00000000E+00 0.00000000E+00 0.00000000E+00 101 | 0.00000000E+00 0.00000000E+00 0.00000000E+00 102 | 0.00000000E+00 0.00000000E+00 0.00000000E+00 103 | 0.00000000E+00 0.00000000E+00 0.00000000E+00 104 | 0.00000000E+00 0.00000000E+00 0.00000000E+00 105 | 0.00000000E+00 0.00000000E+00 0.00000000E+00 106 | 0.00000000E+00 0.00000000E+00 0.00000000E+00 107 | 0.00000000E+00 0.00000000E+00 0.00000000E+00 108 | 0.00000000E+00 0.00000000E+00 0.00000000E+00 109 | 0.00000000E+00 0.00000000E+00 0.00000000E+00 110 | 0.00000000E+00 0.00000000E+00 0.00000000E+00 111 | 0.00000000E+00 0.00000000E+00 0.00000000E+00 112 | 0.00000000E+00 0.00000000E+00 0.00000000E+00 113 | 0.00000000E+00 0.00000000E+00 0.00000000E+00 114 | 0.00000000E+00 0.00000000E+00 0.00000000E+00 115 | 0.00000000E+00 0.00000000E+00 0.00000000E+00 116 | 0.00000000E+00 0.00000000E+00 0.00000000E+00 117 | 0.00000000E+00 0.00000000E+00 0.00000000E+00 118 | 0.00000000E+00 0.00000000E+00 0.00000000E+00 119 | 0.00000000E+00 0.00000000E+00 0.00000000E+00 120 | 0.00000000E+00 0.00000000E+00 0.00000000E+00 121 | 0.00000000E+00 0.00000000E+00 0.00000000E+00 122 | 0.00000000E+00 0.00000000E+00 0.00000000E+00 123 | 0.00000000E+00 0.00000000E+00 0.00000000E+00 124 | 0.00000000E+00 0.00000000E+00 0.00000000E+00 125 | 0.00000000E+00 0.00000000E+00 0.00000000E+00 126 | 0.00000000E+00 0.00000000E+00 0.00000000E+00 127 | 0.00000000E+00 0.00000000E+00 0.00000000E+00 128 | 0.00000000E+00 0.00000000E+00 0.00000000E+00 129 | 0.00000000E+00 0.00000000E+00 0.00000000E+00 130 | 0.00000000E+00 0.00000000E+00 0.00000000E+00 131 | 0.00000000E+00 0.00000000E+00 0.00000000E+00 132 | 0.00000000E+00 0.00000000E+00 0.00000000E+00 133 | 0.00000000E+00 
0.00000000E+00 0.00000000E+00 134 | 0.00000000E+00 0.00000000E+00 0.00000000E+00 135 | 0.00000000E+00 0.00000000E+00 0.00000000E+00 136 | 0.00000000E+00 0.00000000E+00 0.00000000E+00 137 | 0.00000000E+00 0.00000000E+00 0.00000000E+00 138 | 0.00000000E+00 0.00000000E+00 0.00000000E+00 139 | 0.00000000E+00 0.00000000E+00 0.00000000E+00 140 | 0.00000000E+00 0.00000000E+00 0.00000000E+00 141 | 0.00000000E+00 0.00000000E+00 0.00000000E+00 142 | 0.00000000E+00 0.00000000E+00 0.00000000E+00 143 | 0.00000000E+00 0.00000000E+00 0.00000000E+00 144 | 0.00000000E+00 0.00000000E+00 0.00000000E+00 145 | 0.00000000E+00 0.00000000E+00 0.00000000E+00 146 | 0.00000000E+00 0.00000000E+00 0.00000000E+00 147 | 0.00000000E+00 0.00000000E+00 0.00000000E+00 148 | 0.00000000E+00 0.00000000E+00 0.00000000E+00 149 | 0.00000000E+00 0.00000000E+00 0.00000000E+00 150 | 0.00000000E+00 0.00000000E+00 0.00000000E+00 151 | 0.00000000E+00 0.00000000E+00 0.00000000E+00 152 | 0.00000000E+00 0.00000000E+00 0.00000000E+00 153 | 0.00000000E+00 0.00000000E+00 0.00000000E+00 154 | 0.00000000E+00 0.00000000E+00 0.00000000E+00 155 | 0.00000000E+00 0.00000000E+00 0.00000000E+00 156 | 0.00000000E+00 0.00000000E+00 0.00000000E+00 157 | 0.00000000E+00 0.00000000E+00 0.00000000E+00 158 | -------------------------------------------------------------------------------- /other/example_dos/LOJLAZ-HS/DOSCAR.gz: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/Andrew-S-Rosen/QMOF/d21e83843dc2b002836c130c3fad84063bf04af6/other/example_dos/LOJLAZ-HS/DOSCAR.gz -------------------------------------------------------------------------------- /other/example_dos/LOJLAZ-HS/dos.py: -------------------------------------------------------------------------------- 1 | from ase.io import read 2 | from pychemia.code.vasp import VaspXML 3 | import matplotlib.pyplot as plt 4 | from matplotlib import rc 5 | 6 | plt.rcParams['xtick.labelsize']=14 7 | 
plt.rcParams['ytick.labelsize']=14 8 | rc('text', usetex=True) 9 | 10 | mof = read('CONTCAR') 11 | Fe = [atom.index for atom in mof if atom.symbol == 'Fe'] 12 | Au = [atom.index for atom in mof if atom.symbol == 'Au'] 13 | C = [atom.index for atom in mof if atom.symbol == 'C'] 14 | N = [atom.index for atom in mof if atom.symbol == 'N'] 15 | H = [atom.index for atom in mof if atom.symbol == 'H'] 16 | VaspXML = VaspXML('vasprun.xml') 17 | 18 | colors = {'total':"#4d3c40", 19 | 'M':"#7248b7", 20 | 'C':'#9c9ec4', 21 | 'H':"#a4cb4f", 22 | 'O':"#c4529c", 23 | 'S':"#77b593", 24 | 'N':"#c25243", 25 | 'Au':'#e2b541'} 26 | s = [0] 27 | p = [1, 2, 3] 28 | d = [4, 5, 6, 7, 8] 29 | elim = [-5, 5] 30 | fig, ax = plt.subplots(3, figsize=(5, 6)) 31 | 32 | dos = VaspXML.dos_total 33 | x = dos.energies 34 | spins = [1, -1] 35 | spinlabels = [r'$\uparrow$', r'$\downarrow$'] 36 | for i, spin in enumerate(spins): 37 | y = dos.dos[:, i+1]*spin 38 | if i == 0: 39 | label = 'Total' 40 | else: 41 | label = None 42 | ax[0].plot(x, y, '-', alpha=0.8, label=label, color=colors['total']) 43 | ax[0].axvline(x=0, color='k', linestyle='--',alpha=0.5) 44 | ax[0].axhline(y=0, color='k', linestyle='-') 45 | ax[0].legend(loc='upper right') 46 | ax[0].set_xlim(elim) 47 | ax[0].set_ylim([-100,100]) 48 | pdos = VaspXML.dos_parametric(atoms=Fe, spin=[0,1]) 49 | for i, spin in enumerate(spins): 50 | y = pdos.dos[:, i+1]*spin 51 | if i == 0: 52 | label = 'Fe' 53 | else: 54 | label = None 55 | ax[1].plot(x, y, '-', alpha=0.8, label=label, color=colors['M']) 56 | ax[1].axvline(x=0, color='k', linestyle='--',alpha=0.5) 57 | ax[1].axhline(y=0, color='k', linestyle='-') 58 | ax[1].legend(loc='upper right') 59 | ax[1].set_xlim(elim) 60 | ax[1].set_ylim([-30,30]) 61 | atoms = [Au,C,H,N] 62 | atom_names = ['Au','C','H','N'] 63 | for j, atom in enumerate(atoms): 64 | pdos = VaspXML.dos_parametric(atoms=atom, spin=[0,1]) 65 | for i, spin in enumerate(spins): 66 | y = pdos.dos[:, i+1]*spin 67 | if i == 0: 68 | 
label=atom_names[j] 69 | else: 70 | label = None 71 | ax[2].plot(x, y, '-', alpha=0.8, label=label, color=colors[atom_names[j]]) 72 | ax[2].axvline(x=0, color='k', linestyle='--',alpha=0.5) 73 | ax[2].axhline(y=0, color='k', linestyle='-') 74 | ax[2].legend(loc='upper right',ncol=2,columnspacing=1) 75 | ax[2].set_xlim(elim) 76 | ax[2].set_ylim([-75,75]) 77 | ax[0].xaxis.get_ticklocs(minor=True) 78 | ax[0].yaxis.get_ticklocs(minor=True) 79 | ax[0].minorticks_on() 80 | ax[1].xaxis.get_ticklocs(minor=True) 81 | ax[1].yaxis.get_ticklocs(minor=True) 82 | ax[1].minorticks_on() 83 | ax[2].xaxis.get_ticklocs(minor=True) 84 | ax[2].yaxis.get_ticklocs(minor=True) 85 | ax[2].minorticks_on() 86 | 87 | ax[2].set_xlabel(r'$E-E_{\mathrm{f}}$ (eV)') 88 | ax[1].set_ylabel(r'Density of states (states/eV)') 89 | plt.tight_layout() 90 | plt.savefig('dos.png',dpi=500) -------------------------------------------------------------------------------- /other/example_dos/LOJLAZ-HS/vasprun.xml.gz: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/Andrew-S-Rosen/QMOF/d21e83843dc2b002836c130c3fad84063bf04af6/other/example_dos/LOJLAZ-HS/vasprun.xml.gz -------------------------------------------------------------------------------- /other/example_dos/LOJLAZ-LS/CONTCAR: -------------------------------------------------------------------------------- 1 | Fe H Au C N 2 | 1.00000000000000 3 | 2.6436193588683037 -9.7227671993750153 -0.0000254084015313 4 | 10.0756457482676058 -0.0469517512440262 -0.0000354645240717 5 | -0.0000008222571310 -0.0000152465015463 15.0585980206052081 6 | Fe H Au C N 7 | 2 16 4 32 20 8 | Direct 9 | 0.2499983465380282 0.2499992359236103 0.6154222403591660 10 | 0.7500009641805931 0.7499986006135870 0.3845789139022600 11 | 0.4183168822232233 0.0816836020318306 0.7554269391140878 12 | 0.0816811106234923 0.4183161828895408 0.7554273098949693 13 | 0.5816857283997905 0.9183170667837715 0.2445758102183788 14 | 
0.9183172582522090 0.5816804187461884 0.2445745825613272 15 | 0.4264412279233554 0.0735608303152020 0.9191431915412949 16 | 0.0735586626790763 0.4264412643327589 0.9191435620707864 17 | 0.5735618565728089 0.9264422094674458 0.0808599050329448 18 | 0.9264410596440058 0.5735578180460763 0.0808586391043562 19 | 0.4260483241058139 0.0739526566041633 0.3126375894841971 20 | 0.0739506929670597 0.4260468556107639 0.3126377184417919 21 | 0.5739514402798989 0.9260479233259176 0.6873621416662488 22 | 0.9260488742718138 0.5739522584937049 0.6873625272874335 23 | 0.4182414267853005 0.0817579387143255 0.4760221613369708 24 | 0.0817551979447941 0.4182409194839138 0.4760220642998618 25 | 0.5817573079853631 0.9182407180982466 0.5239778415272554 26 | 0.9182432303352073 0.5817573346119573 0.5239780845172106 27 | 0.7499987988102816 0.2500005074636178 0.6064061528778453 28 | 0.7499971556036655 0.2499990277977560 0.3935914554821593 29 | 0.2500002628863598 0.7500000723891418 0.3935888756726129 30 | 0.2499983836854867 0.7499997909582490 0.6064046800810061 31 | 0.5598876957648784 0.2320891526944280 0.6107078785976157 32 | 0.9401029020709757 0.2679103892163823 0.6107077755356798 33 | 0.7320876488699000 0.0599036628387353 0.3892923414451417 34 | 0.7679094709803209 0.4400997742725323 0.3892905639191042 35 | 0.4400966002197180 0.7679107189245897 0.3892894984690471 36 | 0.0599067021144890 0.7320884634822420 0.3892900292212431 37 | 0.2679101277152540 0.9401017243963068 0.6107060786998986 38 | 0.2320872306958890 0.5598936108761379 0.6107071128047608 39 | 0.3438781815898082 0.1561225565882225 0.7941014469826442 40 | 0.1561209823859926 0.3438782887542260 0.7941016678328339 41 | 0.6561237920300087 0.8438776310328393 0.2059001036915475 42 | 0.8438788043293073 0.6561198858511332 0.2058994603831792 43 | 0.3479855324724568 0.1520163047809717 0.8857563777694750 44 | 0.1520147712073410 0.3479858968346718 0.8857566431129911 45 | 0.6520169939002756 0.8479851198868644 0.1142456229992845 46 | 
0.8479855614088052 0.6520137932305587 0.1142449192795567 47 | 0.2500007142361156 0.2500015995277565 0.9331151811939762 48 | 0.7500010288099830 0.7499994577645452 0.0668867881338073 49 | 0.2500011391632526 0.2500020843112694 0.0306865676810020 50 | 0.7500005377501680 0.7499997300474774 0.9693151529668711 51 | 0.2500002926617952 0.2500007359214678 0.2009930793244763 52 | 0.7500002229307867 0.7500002779420214 0.7990078243358809 53 | 0.2499993606085056 0.2499996249872964 0.2984281237511723 54 | 0.7500002840259157 0.7500002706997009 0.7015725244013069 55 | 0.3477719069232563 0.1522274750794210 0.3458326747521028 56 | 0.1522261212600000 0.3477715374604600 0.3458327949483007 57 | 0.6522275004077471 0.8477721736877086 0.6541679077054710 58 | 0.8477730743208411 0.6522277636964162 0.6541680775908674 59 | 0.3438867070242395 0.1561120533642821 0.4373353924804135 60 | 0.1561104167088132 0.3438867330476967 0.4373354337491193 61 | 0.6561120949041310 0.8438868425986215 0.5626658317153499 62 | 0.8438885393639950 0.6561117418194513 0.5626659381618637 63 | 0.4466932014634466 0.2275014643542264 0.6142950531395073 64 | 0.0533058217180979 0.2724977107830853 0.6142950384131538 65 | 0.7275010878741597 0.9466900402642509 0.3857060684485987 66 | 0.7724989687620791 0.5533052681346931 0.3857049565163280 67 | 0.5533091635727772 0.7724977269432856 0.3857047531927122 68 | 0.9466919725412737 0.7274988903703701 0.3857054050078332 69 | 0.2724981024374316 0.0533070956713786 0.6142944387254161 70 | 0.2275000412205586 0.4466928616427808 0.6142949465413707 71 | 0.2499994491462942 0.2500001576438748 0.7484258306647433 72 | 0.7500012016077022 0.7499982350808594 0.2515752168185728 73 | 0.3467945240510559 0.1532112302868711 0.0726603065572533 74 | 0.1532082608250960 0.3467931412118404 0.0726607191854569 75 | 0.6532083570923959 0.8467915389964134 0.9273408515366270 76 | 0.8467923653446334 0.6532079861696758 0.9273402622017670 77 | 0.3468150044997387 0.1531897540000529 0.1590361659093418 78 | 
0.1531869828591184 0.3468132282764955 0.1590365047554485 79 | 0.6531873981892033 0.8468126585337075 0.8409653626125859 80 | 0.8468130126529445 0.6531873431759223 0.8409647821533497 81 | 0.2499985058887475 0.2499993211141032 0.4828735668087916 82 | 0.7500004112520458 0.7499988773310804 0.5171278475736472 83 | 84 | 0.00000000E+00 0.00000000E+00 0.00000000E+00 85 | 0.00000000E+00 0.00000000E+00 0.00000000E+00 86 | 0.00000000E+00 0.00000000E+00 0.00000000E+00 87 | 0.00000000E+00 0.00000000E+00 0.00000000E+00 88 | 0.00000000E+00 0.00000000E+00 0.00000000E+00 89 | 0.00000000E+00 0.00000000E+00 0.00000000E+00 90 | 0.00000000E+00 0.00000000E+00 0.00000000E+00 91 | 0.00000000E+00 0.00000000E+00 0.00000000E+00 92 | 0.00000000E+00 0.00000000E+00 0.00000000E+00 93 | 0.00000000E+00 0.00000000E+00 0.00000000E+00 94 | 0.00000000E+00 0.00000000E+00 0.00000000E+00 95 | 0.00000000E+00 0.00000000E+00 0.00000000E+00 96 | 0.00000000E+00 0.00000000E+00 0.00000000E+00 97 | 0.00000000E+00 0.00000000E+00 0.00000000E+00 98 | 0.00000000E+00 0.00000000E+00 0.00000000E+00 99 | 0.00000000E+00 0.00000000E+00 0.00000000E+00 100 | 0.00000000E+00 0.00000000E+00 0.00000000E+00 101 | 0.00000000E+00 0.00000000E+00 0.00000000E+00 102 | 0.00000000E+00 0.00000000E+00 0.00000000E+00 103 | 0.00000000E+00 0.00000000E+00 0.00000000E+00 104 | 0.00000000E+00 0.00000000E+00 0.00000000E+00 105 | 0.00000000E+00 0.00000000E+00 0.00000000E+00 106 | 0.00000000E+00 0.00000000E+00 0.00000000E+00 107 | 0.00000000E+00 0.00000000E+00 0.00000000E+00 108 | 0.00000000E+00 0.00000000E+00 0.00000000E+00 109 | 0.00000000E+00 0.00000000E+00 0.00000000E+00 110 | 0.00000000E+00 0.00000000E+00 0.00000000E+00 111 | 0.00000000E+00 0.00000000E+00 0.00000000E+00 112 | 0.00000000E+00 0.00000000E+00 0.00000000E+00 113 | 0.00000000E+00 0.00000000E+00 0.00000000E+00 114 | 0.00000000E+00 0.00000000E+00 0.00000000E+00 115 | 0.00000000E+00 0.00000000E+00 0.00000000E+00 116 | 0.00000000E+00 0.00000000E+00 0.00000000E+00 117 | 0.00000000E+00 
0.00000000E+00 0.00000000E+00 118 | 0.00000000E+00 0.00000000E+00 0.00000000E+00 119 | 0.00000000E+00 0.00000000E+00 0.00000000E+00 120 | 0.00000000E+00 0.00000000E+00 0.00000000E+00 121 | 0.00000000E+00 0.00000000E+00 0.00000000E+00 122 | 0.00000000E+00 0.00000000E+00 0.00000000E+00 123 | 0.00000000E+00 0.00000000E+00 0.00000000E+00 124 | 0.00000000E+00 0.00000000E+00 0.00000000E+00 125 | 0.00000000E+00 0.00000000E+00 0.00000000E+00 126 | 0.00000000E+00 0.00000000E+00 0.00000000E+00 127 | 0.00000000E+00 0.00000000E+00 0.00000000E+00 128 | 0.00000000E+00 0.00000000E+00 0.00000000E+00 129 | 0.00000000E+00 0.00000000E+00 0.00000000E+00 130 | 0.00000000E+00 0.00000000E+00 0.00000000E+00 131 | 0.00000000E+00 0.00000000E+00 0.00000000E+00 132 | 0.00000000E+00 0.00000000E+00 0.00000000E+00 133 | 0.00000000E+00 0.00000000E+00 0.00000000E+00 134 | 0.00000000E+00 0.00000000E+00 0.00000000E+00 135 | 0.00000000E+00 0.00000000E+00 0.00000000E+00 136 | 0.00000000E+00 0.00000000E+00 0.00000000E+00 137 | 0.00000000E+00 0.00000000E+00 0.00000000E+00 138 | 0.00000000E+00 0.00000000E+00 0.00000000E+00 139 | 0.00000000E+00 0.00000000E+00 0.00000000E+00 140 | 0.00000000E+00 0.00000000E+00 0.00000000E+00 141 | 0.00000000E+00 0.00000000E+00 0.00000000E+00 142 | 0.00000000E+00 0.00000000E+00 0.00000000E+00 143 | 0.00000000E+00 0.00000000E+00 0.00000000E+00 144 | 0.00000000E+00 0.00000000E+00 0.00000000E+00 145 | 0.00000000E+00 0.00000000E+00 0.00000000E+00 146 | 0.00000000E+00 0.00000000E+00 0.00000000E+00 147 | 0.00000000E+00 0.00000000E+00 0.00000000E+00 148 | 0.00000000E+00 0.00000000E+00 0.00000000E+00 149 | 0.00000000E+00 0.00000000E+00 0.00000000E+00 150 | 0.00000000E+00 0.00000000E+00 0.00000000E+00 151 | 0.00000000E+00 0.00000000E+00 0.00000000E+00 152 | 0.00000000E+00 0.00000000E+00 0.00000000E+00 153 | 0.00000000E+00 0.00000000E+00 0.00000000E+00 154 | 0.00000000E+00 0.00000000E+00 0.00000000E+00 155 | 0.00000000E+00 0.00000000E+00 0.00000000E+00 156 | 0.00000000E+00 
0.00000000E+00 0.00000000E+00 157 | 0.00000000E+00 0.00000000E+00 0.00000000E+00 158 | -------------------------------------------------------------------------------- /other/example_dos/LOJLAZ-LS/DOSCAR.gz: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/Andrew-S-Rosen/QMOF/d21e83843dc2b002836c130c3fad84063bf04af6/other/example_dos/LOJLAZ-LS/DOSCAR.gz -------------------------------------------------------------------------------- /other/example_dos/LOJLAZ-LS/dos.py: -------------------------------------------------------------------------------- 1 | from ase.io import read 2 | from pychemia.code.vasp import VaspXML 3 | import matplotlib.pyplot as plt 4 | from matplotlib import rc 5 | 6 | plt.rcParams['xtick.labelsize']=14 7 | plt.rcParams['ytick.labelsize']=14 8 | rc('text', usetex=True) 9 | 10 | mof = read('CONTCAR') 11 | Fe = [atom.index for atom in mof if atom.symbol == 'Fe'] 12 | Au = [atom.index for atom in mof if atom.symbol == 'Au'] 13 | C = [atom.index for atom in mof if atom.symbol == 'C'] 14 | N = [atom.index for atom in mof if atom.symbol == 'N'] 15 | H = [atom.index for atom in mof if atom.symbol == 'H'] 16 | VaspXML = VaspXML('vasprun.xml') 17 | 18 | colors = {'total':"#4d3c40", 19 | 'M':"#7248b7", 20 | 'C':'#9c9ec4', 21 | 'H':"#a4cb4f", 22 | 'O':"#c4529c", 23 | 'S':"#77b593", 24 | 'N':"#c25243", 25 | 'Au':'#e2b541'} 26 | s = [0] 27 | p = [1, 2, 3] 28 | d = [4, 5, 6, 7, 8] 29 | elim = [-5, 5] 30 | fig, ax = plt.subplots(3, figsize=(5, 6)) 31 | 32 | dos = VaspXML.dos_total 33 | x = dos.energies 34 | spins = [1, -1] 35 | spinlabels = [r'$\uparrow$', r'$\downarrow$'] 36 | for i, spin in enumerate(spins): 37 | y = dos.dos[:, i+1]*spin 38 | if i == 0: 39 | label = 'Total' 40 | else: 41 | label = None 42 | ax[0].plot(x, y, '-', alpha=0.8, label=label, color=colors['total']) 43 | ax[0].axvline(x=0, color='k', linestyle='--',alpha=0.5) 44 | ax[0].axhline(y=0, color='k', linestyle='-') 45 | 
ax[0].legend(loc='upper right') 46 | ax[0].set_xlim(elim) 47 | ax[0].set_ylim([-160,160]) 48 | pdos = VaspXML.dos_parametric(atoms=Fe, spin=[0,1]) 49 | for i, spin in enumerate(spins): 50 | y = pdos.dos[:, i+1]*spin 51 | if i == 0: 52 | label = 'Fe' 53 | else: 54 | label = None 55 | ax[1].plot(x, y, '-', alpha=0.8, label=label, color=colors['M']) 56 | ax[1].axvline(x=0, color='k', linestyle='--',alpha=0.5) 57 | ax[1].axhline(y=0, color='k', linestyle='-') 58 | ax[1].legend(loc='upper right') 59 | ax[1].set_xlim(elim) 60 | ax[1].set_ylim([-50,50]) 61 | atoms = [Au,C,H,N] 62 | atom_names = ['Au','C','H','N'] 63 | for j, atom in enumerate(atoms): 64 | pdos = VaspXML.dos_parametric(atoms=atom, spin=[0,1]) 65 | for i, spin in enumerate(spins): 66 | y = pdos.dos[:, i+1]*spin 67 | if i == 0: 68 | label=atom_names[j] 69 | else: 70 | label = None 71 | ax[2].plot(x, y, '-', alpha=0.8, label=label, color=colors[atom_names[j]]) 72 | ax[2].axvline(x=0, color='k', linestyle='--',alpha=0.5) 73 | ax[2].axhline(y=0, color='k', linestyle='-') 74 | ax[2].legend(loc='upper right',ncol=2,columnspacing=1) 75 | ax[2].set_xlim(elim) 76 | ax[2].set_ylim([-140,140]) 77 | ax[0].xaxis.get_ticklocs(minor=True) 78 | ax[0].yaxis.get_ticklocs(minor=True) 79 | ax[0].minorticks_on() 80 | ax[1].xaxis.get_ticklocs(minor=True) 81 | ax[1].yaxis.get_ticklocs(minor=True) 82 | ax[1].minorticks_on() 83 | ax[2].xaxis.get_ticklocs(minor=True) 84 | ax[2].yaxis.get_ticklocs(minor=True) 85 | ax[2].minorticks_on() 86 | 87 | ax[2].set_xlabel(r'$E-E_{\mathrm{f}}$ (eV)') 88 | ax[1].set_ylabel(r'Density of states (states/eV)') 89 | plt.tight_layout() 90 | plt.savefig('dos.png',dpi=500) -------------------------------------------------------------------------------- /other/example_dos/LOJLAZ-LS/vasprun.xml.gz: -------------------------------------------------------------------------------- 
https://raw.githubusercontent.com/Andrew-S-Rosen/QMOF/d21e83843dc2b002836c130c3fad84063bf04af6/other/example_dos/LOJLAZ-LS/vasprun.xml.gz -------------------------------------------------------------------------------- /other/example_dos/RAXNEK/CONTCAR: -------------------------------------------------------------------------------- 1 | Fe H C N O 2 | 1.00000000000000 3 | -7.8755780604055303 -0.0221342010076184 1.3620089270563844 4 | -0.9731614594919148 10.1289927623027776 -5.4824503166599499 5 | 0.8971180228303540 -10.2886243202697631 -6.9893096995931483 6 | Fe H C N O 7 | 2 28 32 4 12 8 | Direct 9 | 0.0000014308340042 0.4999998223777737 0.5000011400986040 10 | 0.4999998404187664 0.9999988857804283 0.4999991478248660 11 | 0.1643135152417301 0.2880769924854505 0.3888904028532068 12 | 0.3356616280853544 0.8989538210509380 0.6110299128949350 13 | 0.8356881714031914 0.7119214092727546 0.6111103901050257 14 | 0.6643383640953857 0.1010433899589955 0.3889691713506735 15 | 0.9828371867169139 0.2318792581206424 0.3924597673728343 16 | 0.5171853766141226 0.8392817395040879 0.6075278685017196 17 | 0.0171647318285650 0.7681200394310679 0.6075426279922169 18 | 0.4828148966897743 0.1607153568805799 0.3924698285466164 19 | 0.2328832996939525 0.6836844306485759 0.7440522911764020 20 | 0.2680478268233060 0.9394827408736717 0.2554077326958719 21 | 0.7671190951591313 0.3163148518271726 0.2559487975297117 22 | 0.7319522612644533 0.0605141512772178 0.7445898482644395 23 | 0.4203622231562605 0.6745315660164195 0.9170270900752158 24 | 0.0806318184574621 0.7573221708510331 0.0823890304356638 25 | 0.5796407182911736 0.3254688222877320 0.0829746588807652 26 | 0.9193671364579359 0.2426753902453456 0.9176091535922808 27 | 0.2174190913752199 0.2741251802945044 0.7662728118734279 28 | 0.2821881138693314 0.5080433282329722 0.2339730989739195 29 | 0.7825842414185047 0.7258754368727836 0.2337307377555220 30 | 0.7178088618821121 0.4919540924231498 0.7660254775721214 31 | 0.0331231894690376 
0.3036054058298490 0.6026548685312250 32 | 0.4665120681648602 0.7011291787392508 0.3975502147312895 33 | 0.9668798515437587 0.6963939792435738 0.3973476987088418 34 | 0.5334856418568208 0.2988677222314848 0.6024480618279853 35 | 0.4284504201654187 0.3572820303955453 0.9457995304283031 36 | 0.0717504100466400 0.4115115479504041 0.0541707589332603 37 | 0.5715528022809053 0.6427192360702989 0.0542034083000758 38 | 0.9282467544840856 0.5884858721775714 0.9458281727369382 39 | 0.2334573270424585 0.5949355084022869 0.7531083144096087 40 | 0.2671477494222074 0.8417533535023267 0.2465277621047832 41 | 0.7665451303355724 0.4050643444546296 0.2468934498507593 42 | 0.7328512118015666 0.1582438422991714 0.7534701702867679 43 | 0.3381009481354482 0.5886661320007249 0.8493106941273538 44 | 0.1625180200461287 0.7392670776029107 0.1503198303405426 45 | 0.6619016462394285 0.4113344744572132 0.1506914099045105 46 | 0.8374801727280854 0.2607301912751865 0.8496785196074725 47 | 0.3368699696424571 0.4716800002273303 0.8569226129406928 48 | 0.1633370957417313 0.6147686429642505 0.1429450632640084 49 | 0.6631327638112765 0.5283209626081060 0.1430799525408020 50 | 0.8366603597189837 0.3852288225629792 0.8570533961362656 51 | 0.2250337522965964 0.3673460667180279 0.7656527212569486 52 | 0.2748624139653515 0.6018007357413282 0.2344048538567591 53 | 0.7749692030893769 0.6326545002912596 0.2343502505322377 54 | 0.7251346230342151 0.3981968749900560 0.7655937704628002 55 | 0.1227731386898725 0.3825536508573535 0.6732492916898138 56 | 0.3771342128859061 0.7094074096038270 0.3267924732311229 57 | 0.8772295841220199 0.6174461879773006 0.3267532823184069 58 | 0.6228637414720950 0.2905897935851769 0.6732059306594138 59 | 0.3806642049569788 0.5349403778346300 0.4970528314952674 60 | 0.1193666118510777 0.0377850773956681 0.5028180118122947 61 | 0.6193376987207699 0.4650580956841139 0.5029460191747575 62 | 0.8806331704302224 0.9622137313430486 0.4971820991361540 63 | 0.4439017018556157 
0.4048911112681211 0.4568501665105345 64 | 0.0560613501568383 0.9479800419284246 0.5430994438896235 65 | 0.5561001788221063 0.5951069216430369 0.5431475034817410 66 | 0.9439378965810903 0.0520192106364448 0.4569013596520151 67 | 0.4437897942451272 0.4543154662385263 0.9534161855616574 68 | 0.0563212731739284 0.5009129229782019 0.0465418625354701 69 | 0.5562133182367930 0.5456858257515975 0.0465865952131708 70 | 0.9436761511365006 0.4990843431280396 0.9534568306068536 71 | 0.1277207490011989 0.4935615380502796 0.6653256255945621 72 | 0.3725134212781640 0.8282612992016780 0.3345297010457244 73 | 0.8722818995836477 0.5064381378510987 0.3346765408119410 74 | 0.6274854576427487 0.1717359892649526 0.6654685245752887 75 | 0.0424036974124178 0.3039177547527032 0.3879641658732780 76 | 0.4575702757890596 0.9157397316708824 0.6119396624923183 77 | 0.9575982801775638 0.6960813350426989 0.6120383050455729 78 | 0.5424290358364630 0.0842570490201524 0.3880579633534254 79 | 0.2380303744679608 0.5781484762355547 0.4936282040047502 80 | 0.2620305346081011 0.0842990764176648 0.5060939899186536 81 | 0.7619724849830547 0.4218510783807261 0.5063728427104195 82 | 0.7379692548644101 0.9156988445495671 0.4939049977062382 83 | 0.3749690173179943 0.2902155911127693 0.4046261680790693 84 | 0.1249614372681336 0.8854553130420868 0.5952573913816863 85 | 0.6250333741943663 0.7097822404388197 0.5953702699592043 86 | 0.8750382512715404 0.1145443970796052 0.4047440379061698 87 | 88 | 0.00000000E+00 0.00000000E+00 0.00000000E+00 89 | 0.00000000E+00 0.00000000E+00 0.00000000E+00 90 | 0.00000000E+00 0.00000000E+00 0.00000000E+00 91 | 0.00000000E+00 0.00000000E+00 0.00000000E+00 92 | 0.00000000E+00 0.00000000E+00 0.00000000E+00 93 | 0.00000000E+00 0.00000000E+00 0.00000000E+00 94 | 0.00000000E+00 0.00000000E+00 0.00000000E+00 95 | 0.00000000E+00 0.00000000E+00 0.00000000E+00 96 | 0.00000000E+00 0.00000000E+00 0.00000000E+00 97 | 0.00000000E+00 0.00000000E+00 0.00000000E+00 98 | 0.00000000E+00 
0.00000000E+00 0.00000000E+00 99 | 0.00000000E+00 0.00000000E+00 0.00000000E+00 100 | 0.00000000E+00 0.00000000E+00 0.00000000E+00 101 | 0.00000000E+00 0.00000000E+00 0.00000000E+00 102 | 0.00000000E+00 0.00000000E+00 0.00000000E+00 103 | 0.00000000E+00 0.00000000E+00 0.00000000E+00 104 | 0.00000000E+00 0.00000000E+00 0.00000000E+00 105 | 0.00000000E+00 0.00000000E+00 0.00000000E+00 106 | 0.00000000E+00 0.00000000E+00 0.00000000E+00 107 | 0.00000000E+00 0.00000000E+00 0.00000000E+00 108 | 0.00000000E+00 0.00000000E+00 0.00000000E+00 109 | 0.00000000E+00 0.00000000E+00 0.00000000E+00 110 | 0.00000000E+00 0.00000000E+00 0.00000000E+00 111 | 0.00000000E+00 0.00000000E+00 0.00000000E+00 112 | 0.00000000E+00 0.00000000E+00 0.00000000E+00 113 | 0.00000000E+00 0.00000000E+00 0.00000000E+00 114 | 0.00000000E+00 0.00000000E+00 0.00000000E+00 115 | 0.00000000E+00 0.00000000E+00 0.00000000E+00 116 | 0.00000000E+00 0.00000000E+00 0.00000000E+00 117 | 0.00000000E+00 0.00000000E+00 0.00000000E+00 118 | 0.00000000E+00 0.00000000E+00 0.00000000E+00 119 | 0.00000000E+00 0.00000000E+00 0.00000000E+00 120 | 0.00000000E+00 0.00000000E+00 0.00000000E+00 121 | 0.00000000E+00 0.00000000E+00 0.00000000E+00 122 | 0.00000000E+00 0.00000000E+00 0.00000000E+00 123 | 0.00000000E+00 0.00000000E+00 0.00000000E+00 124 | 0.00000000E+00 0.00000000E+00 0.00000000E+00 125 | 0.00000000E+00 0.00000000E+00 0.00000000E+00 126 | 0.00000000E+00 0.00000000E+00 0.00000000E+00 127 | 0.00000000E+00 0.00000000E+00 0.00000000E+00 128 | 0.00000000E+00 0.00000000E+00 0.00000000E+00 129 | 0.00000000E+00 0.00000000E+00 0.00000000E+00 130 | 0.00000000E+00 0.00000000E+00 0.00000000E+00 131 | 0.00000000E+00 0.00000000E+00 0.00000000E+00 132 | 0.00000000E+00 0.00000000E+00 0.00000000E+00 133 | 0.00000000E+00 0.00000000E+00 0.00000000E+00 134 | 0.00000000E+00 0.00000000E+00 0.00000000E+00 135 | 0.00000000E+00 0.00000000E+00 0.00000000E+00 136 | 0.00000000E+00 0.00000000E+00 0.00000000E+00 137 | 0.00000000E+00 
0.00000000E+00 0.00000000E+00 138 | 0.00000000E+00 0.00000000E+00 0.00000000E+00 139 | 0.00000000E+00 0.00000000E+00 0.00000000E+00 140 | 0.00000000E+00 0.00000000E+00 0.00000000E+00 141 | 0.00000000E+00 0.00000000E+00 0.00000000E+00 142 | 0.00000000E+00 0.00000000E+00 0.00000000E+00 143 | 0.00000000E+00 0.00000000E+00 0.00000000E+00 144 | 0.00000000E+00 0.00000000E+00 0.00000000E+00 145 | 0.00000000E+00 0.00000000E+00 0.00000000E+00 146 | 0.00000000E+00 0.00000000E+00 0.00000000E+00 147 | 0.00000000E+00 0.00000000E+00 0.00000000E+00 148 | 0.00000000E+00 0.00000000E+00 0.00000000E+00 149 | 0.00000000E+00 0.00000000E+00 0.00000000E+00 150 | 0.00000000E+00 0.00000000E+00 0.00000000E+00 151 | 0.00000000E+00 0.00000000E+00 0.00000000E+00 152 | 0.00000000E+00 0.00000000E+00 0.00000000E+00 153 | 0.00000000E+00 0.00000000E+00 0.00000000E+00 154 | 0.00000000E+00 0.00000000E+00 0.00000000E+00 155 | 0.00000000E+00 0.00000000E+00 0.00000000E+00 156 | 0.00000000E+00 0.00000000E+00 0.00000000E+00 157 | 0.00000000E+00 0.00000000E+00 0.00000000E+00 158 | 0.00000000E+00 0.00000000E+00 0.00000000E+00 159 | 0.00000000E+00 0.00000000E+00 0.00000000E+00 160 | 0.00000000E+00 0.00000000E+00 0.00000000E+00 161 | 0.00000000E+00 0.00000000E+00 0.00000000E+00 162 | 0.00000000E+00 0.00000000E+00 0.00000000E+00 163 | 0.00000000E+00 0.00000000E+00 0.00000000E+00 164 | 0.00000000E+00 0.00000000E+00 0.00000000E+00 165 | 0.00000000E+00 0.00000000E+00 0.00000000E+00 166 | -------------------------------------------------------------------------------- /other/example_dos/RAXNEK/DOSCAR.gz: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/Andrew-S-Rosen/QMOF/d21e83843dc2b002836c130c3fad84063bf04af6/other/example_dos/RAXNEK/DOSCAR.gz -------------------------------------------------------------------------------- /other/example_dos/RAXNEK/dos.py: -------------------------------------------------------------------------------- 1 | from 
ase.io import read 2 | from pychemia.code.vasp import VaspXML 3 | import matplotlib.pyplot as plt 4 | from matplotlib import rc 5 | 6 | plt.rcParams['xtick.labelsize']=14 7 | plt.rcParams['ytick.labelsize']=14 8 | rc('text', usetex=True) 9 | 10 | mof = read('CONTCAR') 11 | Fe = [atom.index for atom in mof if atom.symbol == 'Fe'] 12 | O = [atom.index for atom in mof if atom.symbol == 'O'] 13 | C = [atom.index for atom in mof if atom.symbol == 'C'] 14 | N = [atom.index for atom in mof if atom.symbol == 'N'] 15 | H = [atom.index for atom in mof if atom.symbol == 'H'] 16 | VaspXML = VaspXML('vasprun.xml') 17 | 18 | colors = {'total':"#4d3c40", 19 | 'M':"#7248b7", 20 | 'C':'#9c9ec4', 21 | 'H':"#a4cb4f", 22 | 'O':"#c4529c", 23 | 'S':"#77b593", 24 | 'N':"#c25243"} 25 | s = [0] 26 | p = [1, 2, 3] 27 | d = [4, 5, 6, 7, 8] 28 | elim = [-5, 5] 29 | fig, ax = plt.subplots(3, figsize=(5, 6)) 30 | 31 | dos = VaspXML.dos_total 32 | x = dos.energies 33 | spins = [1, -1] 34 | spinlabels = [r'$\uparrow$', r'$\downarrow$'] 35 | for i, spin in enumerate(spins): 36 | y = dos.dos[:, i+1]*spin 37 | if i == 0: 38 | label = 'Total' 39 | else: 40 | label = None 41 | ax[0].plot(x, y, '-', alpha=0.8, label=label, color=colors['total']) 42 | ax[0].axvline(x=0, color='k', linestyle='--',alpha=0.5) 43 | ax[0].axhline(y=0, color='k', linestyle='-') 44 | ax[0].legend(loc='upper left') 45 | ax[0].set_xlim(elim) 46 | ax[0].set_ylim([-90,90]) 47 | 48 | pdos = VaspXML.dos_parametric(atoms=Fe, spin=[0,1]) 49 | for i, spin in enumerate(spins): 50 | y = pdos.dos[:, i+1]*spin 51 | if i == 0: 52 | label = 'Fe' 53 | else: 54 | label = None 55 | ax[1].plot(x, y, '-', alpha=0.8, label=label, color=colors['M']) 56 | ax[1].axvline(x=0, color='k', linestyle='--',alpha=0.5) 57 | ax[1].axhline(y=0, color='k', linestyle='-') 58 | ax[1].legend(loc='upper left') 59 | ax[1].set_xlim(elim) 60 | ax[1].set_ylim([-30,30]) 61 | atoms = [C,H,N,O] 62 | atom_names = ['C','H','N','O'] 63 | for j, atom in enumerate(atoms): 64 | 
pdos = VaspXML.dos_parametric(atoms=atom, spin=[0,1]) 65 | for i, spin in enumerate(spins): 66 | y = pdos.dos[:, i+1]*spin 67 | if i == 0: 68 | label=atom_names[j] 69 | else: 70 | label = None 71 | ax[2].plot(x, y, '-', alpha=0.8, label=label, color=colors[atom_names[j]]) 72 | ax[2].axvline(x=0, color='k', linestyle='--',alpha=0.5) 73 | ax[2].axhline(y=0, color='k', linestyle='-') 74 | ax[2].legend(loc='upper left',ncol=2,columnspacing=1) 75 | ax[2].set_xlim(elim) 76 | ax[2].set_ylim([-75,75]) 77 | ax[0].xaxis.get_ticklocs(minor=True) 78 | ax[0].yaxis.get_ticklocs(minor=True) 79 | ax[0].minorticks_on() 80 | ax[1].xaxis.get_ticklocs(minor=True) 81 | ax[1].yaxis.get_ticklocs(minor=True) 82 | ax[1].minorticks_on() 83 | ax[2].xaxis.get_ticklocs(minor=True) 84 | ax[2].yaxis.get_ticklocs(minor=True) 85 | ax[2].minorticks_on() 86 | ax[2].set_xlabel(r'$E-E_{\mathrm{f}}$ (eV)') 87 | ax[1].set_ylabel(r'Density of states (states/eV)') 88 | plt.tight_layout() 89 | plt.savefig('dos.png',dpi=500) 90 | -------------------------------------------------------------------------------- /other/example_dos/RAXNEK/vasprun.xml.gz: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/Andrew-S-Rosen/QMOF/d21e83843dc2b002836c130c3fad84063bf04af6/other/example_dos/RAXNEK/vasprun.xml.gz -------------------------------------------------------------------------------- /other/example_dos/README.md: -------------------------------------------------------------------------------- 1 | The folders contain DOS data and HSE06-D3(BJ) structures for selected materials highlighted in the [QMOF database paper](https://doi.org/10.1016/j.matt.2021.02.015). To recreate the DOS plots, first use `gunzip` to decompress the `DOSCAR.gz` and `vasprun.xml.gz` files in a given folder, then run `dos.py` from within that folder. The DOS plotting scripts rely on [PyChemia](https://github.com/MaterialsDiscovery/PyChemia).
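If `gunzip` is not available (for example on Windows), the decompression step can be done with the Python standard library instead. The following is a minimal sketch, not part of the repository: the helper names `gunzip_file` and `gunzip_folder` are hypothetical, and it assumes the `.gz` files sit directly inside the example folder.

```python
import gzip
import shutil
from pathlib import Path


def gunzip_file(gz_path, keep_original=True):
    """Decompress one .gz file next to itself and return the new path.

    Hypothetical helper: DOSCAR.gz -> DOSCAR, vasprun.xml.gz -> vasprun.xml.
    """
    gz_path = Path(gz_path)
    out_path = gz_path.with_suffix("")  # drops only the trailing ".gz"
    with gzip.open(gz_path, "rb") as src, open(out_path, "wb") as dst:
        shutil.copyfileobj(src, dst)  # stream the data instead of loading it all
    if not keep_original:
        gz_path.unlink()
    return out_path


def gunzip_folder(folder="."):
    """Decompress every *.gz file found directly inside `folder`."""
    return [gunzip_file(p) for p in sorted(Path(folder).glob("*.gz"))]
```

For instance, calling `gunzip_folder("other/example_dos/RAXNEK")` and then running `dos.py` from inside that folder should leave the `vasprun.xml` that the script reads alongside the existing `CONTCAR`.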
-------------------------------------------------------------------------------- /other/example_dos/WAQMEJ/DOSCAR.gz: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/Andrew-S-Rosen/QMOF/d21e83843dc2b002836c130c3fad84063bf04af6/other/example_dos/WAQMEJ/DOSCAR.gz -------------------------------------------------------------------------------- /other/example_dos/WAQMEJ/dos.py: -------------------------------------------------------------------------------- 1 | from ase.io import read 2 | from pychemia.code.vasp import VaspXML 3 | import matplotlib.pyplot as plt 4 | from matplotlib import rc 5 | 6 | plt.rcParams['xtick.labelsize']=14 7 | plt.rcParams['ytick.labelsize']=14 8 | rc('text', usetex=True) 9 | mof = read('CONTCAR') 10 | Rh = [atom.index for atom in mof if atom.symbol == 'Rh'] 11 | O = [atom.index for atom in mof if atom.symbol == 'O'] 12 | C = [atom.index for atom in mof if atom.symbol == 'C'] 13 | N = [atom.index for atom in mof if atom.symbol == 'N'] 14 | H = [atom.index for atom in mof if atom.symbol == 'H'] 15 | S = [atom.index for atom in mof if atom.symbol == 'S'] 16 | VaspXML = VaspXML('vasprun.xml') 17 | 18 | colors = {'total':"#4d3c40", 19 | 'M':"#7248b7", 20 | 'C':'#9c9ec4', 21 | 'H':"#a4cb4f", 22 | 'O':"#c4529c", 23 | 'S':"#77b593", 24 | 'N':"#c25243"} 25 | s = [0] 26 | p = [1, 2, 3] 27 | d = [4, 5, 6, 7, 8] 28 | elim = [-5, 5] 29 | fig, ax = plt.subplots(3, figsize=(5, 6)) 30 | 31 | dos = VaspXML.dos_total 32 | x = dos.energies 33 | spins = [1, -1] 34 | spinlabels = [r'$\uparrow$', r'$\downarrow$'] 35 | for i, spin in enumerate(spins): 36 | y = dos.dos[:, i+1]*spin 37 | if i == 0: 38 | label = 'Total' 39 | else: 40 | label = None 41 | ax[0].plot(x, y, '-', alpha=0.8, label=label, color=colors['total']) 42 | ax[0].axvline(x=0, color='k', linestyle='--',alpha=0.5) 43 | ax[0].axhline(y=0, color='k', linestyle='-') 44 | ax[0].legend(loc='upper right') 45 | ax[0].set_xlim(elim) 46 | 
ax[0].set_ylim([-100, 100]) 47 | 48 | pdos = VaspXML.dos_parametric(atoms=Rh, spin=[0,1]) 49 | for i, spin in enumerate(spins): 50 | y = pdos.dos[:, i+1]*spin 51 | if i == 0: 52 | label = 'Rh' 53 | else: 54 | label = None 55 | ax[1].plot(x, y, '-', alpha=0.8, label=label, color=colors['M']) 56 | ax[1].axvline(x=0, color='k', linestyle='--',alpha=0.5) 57 | ax[1].axhline(y=0, color='k', linestyle='-') 58 | ax[1].legend(loc='upper right') 59 | ax[1].set_xlim(elim) 60 | ax[1].set_ylim([-75,75]) 61 | atoms = [C,H,N,O,S] 62 | atom_names = ['C','H','N','O','S'] 63 | for j, atom in enumerate(atoms): 64 | pdos = VaspXML.dos_parametric(atoms=atom, spin=[0,1]) 65 | for i, spin in enumerate(spins): 66 | y = pdos.dos[:, i+1]*spin 67 | if i == 0: 68 | label=atom_names[j] 69 | else: 70 | label = None 71 | ax[2].plot(x, y, '-', alpha=0.8, label=label, color=colors[atom_names[j]]) 72 | ax[2].axvline(x=0, color='k', linestyle='--',alpha=0.5) 73 | ax[2].axhline(y=0, color='k', linestyle='-') 74 | ax[2].legend(loc='upper right',ncol=3,columnspacing=1) 75 | ax[2].set_xlim(elim) 76 | ax[2].set_ylim([-50,50]) 77 | ax[0].xaxis.get_ticklocs(minor=True) 78 | ax[0].yaxis.get_ticklocs(minor=True) 79 | ax[0].minorticks_on() 80 | ax[1].xaxis.get_ticklocs(minor=True) 81 | ax[1].yaxis.get_ticklocs(minor=True) 82 | ax[1].minorticks_on() 83 | ax[2].xaxis.get_ticklocs(minor=True) 84 | ax[2].yaxis.get_ticklocs(minor=True) 85 | ax[2].minorticks_on() 86 | 87 | ax[2].set_xlabel(r'$E-E_{\mathrm{f}}$ (eV)') 88 | ax[1].set_ylabel(r'Density of states (states/eV)') 89 | plt.tight_layout() 90 | plt.savefig('dos.png',dpi=500) -------------------------------------------------------------------------------- /other/example_dos/WAQMEJ/vasprun.xml.gz: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/Andrew-S-Rosen/QMOF/d21e83843dc2b002836c130c3fad84063bf04af6/other/example_dos/WAQMEJ/vasprun.xml.gz 
-------------------------------------------------------------------------------- /updates.md: -------------------------------------------------------------------------------- 1 | This file documents all updates to the QMOF database on Figshare. 2 | 3 | # Changelog 4 | - [v16](https://figshare.com/articles/dataset/QMOF_Database/13147324/16): Removed qmof-42b55bb 5 | - [v15](https://figshare.com/articles/dataset/QMOF_Database/13147324/15): Removed qmof-7d69ad1 6 | - [v14](https://figshare.com/articles/dataset/QMOF_Database/13147324/14): New single-point calculations at the HLE17, HSE06* (i.e. 10% HF ex.), and HSE06 (25% HF ex.) levels of theory. 12/09/21. 7 | - [v13](https://figshare.com/articles/dataset/QMOF_Database/13147324/13): Locked-in version to permanently match the files [uploaded to NOMAD](https://nomad-lab.eu/prod/rae/gui/dataset/id/O-FUAo0mThSUeXg70cMN3Q?results=entries). 09/15/21. 8 | - [v12](https://figshare.com/articles/dataset/QMOF_Database/13147324/12): Several MOFs taken from the [Genomic MOF Database](https://figshare.com/s/ec378d7315581e48f1e4) with over/underbonded atoms were removed, as the original authors of the Genomic MOF Database uploaded a fairly large fraction of structures with missing H atoms. Supplemental results from new non-self-consistent (NSCF) calculations with a higher k-point density are now provided (note: this was reverted in v13, as it was discovered that LMAXMIX was [not set high enough](https://github.com/materialsproject/pymatgen/commit/932d9da8c19d30040990bd273550e996c7d7519d), affecting the NSCF results for a subset of structures). Removed raw VASP files from Figshare to instead host them on [NOMAD](https://nomad-lab.eu/prod/rae/gui/dataset/id/O-FUAo0mThSUeXg70cMN3Q). Gave each MOF a unique hash-based identifier, which will match the identifiers on the forthcoming Materials Project MOF Explorer app. 09/14/21.
- [v11](https://figshare.com/articles/dataset/QMOF_Database/13147324/11): Same changes as in v12, but the bandgaps.csv file was not made backwards-compatible here. 09/13/21.
- [v10](https://figshare.com/articles/dataset/QMOF_Database/13147324/10): Removed irrelevant data from the JSON, reducing the file size. 09/01/21.
- [v9](https://figshare.com/articles/dataset/QMOF_Database/13147324/9): Added ~3000 new DFT-optimized MOFs and properties from the CoRE MOF Database (based on the clean subset identified by [Chen and Manz](https://doi.org/10.1039/D0RA02498H)), the [Genomic MOF Database](https://figshare.com/s/ec378d7315581e48f1e4), and the [CSD MOF Subset](https://doi.org/10.1021/acs.chemmater.7b00441). Deprecated 13 structures. Added spacegroup info. Added "synthesized?" flag. Added missing PLDs and LCDs. Fixed 186 structures that had EDIFF = 1e-4 instead of EDIFF = 1e-6 in the INCAR. Removed structures that were duplicates according to Pymatgen's StructureMatcher to avoid confusion. The user no longer needs to run the StructureMatcher as a result. 09/01/21.
- [v8](https://figshare.com/articles/dataset/QMOF_Database/13147324/8): Added 1243 new DFT-optimized MOFs. 623 were taken from the [Boyd & Woo dataset](https://doi.org/10.24435/materialscloud:2018.0016/v3), 485 were taken directly from the [2019 CoRE MOF FSR Database](https://doi.org/10.5281/zenodo.3677685), 92 were Cu triangle MOFs taken from [ToBaCCo](https://pubs.acs.org/doi/abs/10.1021/acs.cgd.7b00848), and 44 were Hf MOFs obtained by exchanging the Zr metals in the ToBaCCo MOFs from [Anderson and coworkers](https://osf.io/7dgvy/). For the CoRE MOFs, only those found in [this pre-curated list](https://doi.org/10.1021/acs.jctc.0c01229) were included to maximize structural fidelity. For the hypothetical MOFs, some new ones were introduced using the Boyd & Woo structures as a starting point (e.g. by exchanging metal cations). 3 MOFs were deprecated. Added MOFids, DOIs, spin-dependent CBM/VBM, and initial CIFs for the hypothetical MOFs. 07/12/21.
- [v7](https://figshare.com/articles/dataset/QMOF_Database/13147324/7): Deprecated 12 MOFs. Added more properties to the JSON file and made it easier to parse. 06/08/21.
- [v6](https://figshare.com/articles/dataset/QMOF_Database/13147324/6): Added 2620 DFT-optimized MOFs. 1217 were taken from the CSD using the usual protocol. 1188 were hypothetical MOFs obtained from the [Boyd & Woo dataset](https://doi.org/10.24435/materialscloud:2018.0016/v3). 148 were hypothetical MOF-74 and MOF-5 analogues obtained from Haranczyk's [nanoporousmaterials.org](http://nanoporousmaterials.org/databases/). 48 were hypothetical Zr MOFs made with [ToBaCCo](https://github.com/tobacco-mofs/tobacco_3.0) and obtained from [Anderson and coworkers](https://osf.io/7dgvy/). 19 were experimental pyrene MOFs from [Smit and coworkers](https://doi.org/10.24435/materialscloud:z5-ct). The maximum number of atoms per unit cell was raised to 500. 05/07/21.
- [v5](https://figshare.com/articles/dataset/QMOF_Database/13147324/5): Release corresponding to the published [*Matter* paper](https://www.cell.com/matter/fulltext/S2590-2385(21)00070-9). No changes to the database compared to v3. Fixes a bug in `get_subset_data.py` that caused the updated `.json` file to be written out incorrectly. 02/12/21.
- [v4](https://figshare.com/articles/dataset/QMOF_Database/13147324/4): Includes a few minor typo fixes and a better `.xlsx` reader. 01/12/21.
- [v3](https://figshare.com/articles/dataset/QMOF_Database/13147324/3): Added CM5 partial charges for every structure and 3000+ Bader charges (and spin densities). Fixed some minor bugs in the unrelaxed properties for a few MOF structures, deprecated a few structures, and flagged more duplicates. Continued restructuring of the main QMOF database for increased usability. 12/23/20.
- [v2](https://figshare.com/articles/dataset/QMOF_Database/13147324/2): ~1500 new structures with pore-limiting diameter greater than 2.4 Å, computed using Zeo++ prior to structure relaxation, were added to the QMOF database along with their DFT-computed properties. The cap on the maximum number of atoms per primitive cell was raised from 150 to 300. 12/05/20.
- [v1](https://figshare.com/articles/dataset/QMOF_Database/13147324/1): Initial release corresponding to the QMOF database [pre-print](https://dx.doi.org/10.26434/chemrxiv.13147616). 10/28/20.

--------------------------------------------------------------------------------