├── LICENSE
├── README.md
└── mace_osaka24
    ├── mace-osaka24-large.sh
    ├── mace-osaka24-medium.sh
    └── mace-osaka24-small.sh

/LICENSE:
--------------------------------------------------------------------------------
MIT License

Copyright (c) 2024 qiqb-osaka

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.

--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
# MACE-Osaka24 models
This repository provides the models and training scripts for the MACE-Osaka24 models: multi-domain universal machine learning interatomic potentials (MLIPs) capable of accurately describing both crystalline and molecular domains.

The MACE-Osaka24 model is a universal MLIP trained on a combined dataset of crystals and molecules. The dataset was generated using a dataset integration technique called "Total Energy Alignment", which combines first-principles calculations performed under various conditions.

Its architecture is based on the first-generation MACE model. To use the models, please install the [MACE code](https://github.com/ACEsuit/mace).

## Models

The first-generation models are available from the [MACE-Osaka24 v0.0.1 release](https://github.com/qiqb-osaka/mace-osaka24/releases/tag/v0.0.1).

If you use the models, in addition to citing the original MACE papers, please cite:

```bib
@misc{shiota2024taming,
  title={Taming Multi-Domain, -Fidelity Data: Towards Foundation Models for Atomistic Scale Simulations},
  author={Tomoya Shiota and Kenji Ishihara and Tuan Minh Do and Toshio Mori and Wataru Mizukami},
  year={2024},
  eprint={2412.13088},
  archivePrefix={arXiv},
  primaryClass={physics.chem-ph}
}
```

## Training scripts

We provide training scripts for the models in this repository. The latest training command line is found in [`mace_osaka24/mace-osaka24-large.sh`](mace_osaka24/mace-osaka24-large.sh).

## Training data

The integrated inorganic–organic domain dataset used to train the models is available at [figshare](https://figshare.com/articles/dataset/Inorganic_organic_domain_dataset/28023149). It combines the inorganic MPtrj dataset with the organic SPICE, QMugs, water clusters, and Tripeptides (OFF23) datasets. If you use any of these datasets, please cite the following papers.

```bib
@article{deng2023chgnet,
  title={CHGNet: Pretrained universal neural network potential for charge-informed atomistic modeling},
  author={Bowen Deng and Peichen Zhong and KyuJung Jun and Janosh Riebesell and Kevin Han and Christopher J. Bartel and Gerbrand Ceder},
  year={2023},
  eprint={2302.14231},
  archivePrefix={arXiv},
  primaryClass={cond-mat.mtrl-sci}
}

@misc{kovacs2023maceoff23,
  title={MACE-OFF23: Transferable Machine Learning Force Fields for Organic Molecules},
  author={Dávid Péter Kovács and J. Harry Moore and Nicholas J. Browning and Ilyes Batatia and Joshua T. Horton and Venkat Kapil and William C. Witt and Ioan-Bogdan Magdău and Daniel J. Cole and Gábor Csányi},
  year={2023},
  eprint={2312.15211},
  archivePrefix={arXiv},
}

@article{eastman2023spice,
  title={SPICE, a dataset of drug-like molecules and peptides for training machine learning potentials},
  author={Eastman, Peter and Behara, Pavan Kumar and Dotson, David L and Galvelis, Raimondas and Herr, John E and Horton, Josh T and Mao, Yuezhi and Chodera, John D and Pritchard, Benjamin P and Wang, Yuanqing and others},
  journal={Scientific Data},
  volume={10},
  number={1},
  pages={11},
  year={2023},
  publisher={Nature Publishing Group UK London}
}

@article{donchev2021quantum,
  title={Quantum chemical benchmark databases of gold-standard dimer interaction energies},
  author={Donchev, Alexander G and Taube, Andrew G and Decolvenaere, Elizabeth and Hargus, Cory and McGibbon, Robert T and Law, Ka-Hei and Gregersen, Brent A and Li, Je-Luen and Palmo, Kim and Siva, Karthik and others},
  journal={Scientific Data},
  volume={8},
  number={1},
  pages={55},
  year={2021},
  publisher={Nature Publishing Group UK London}
}

@article{isert2022qmugs,
  title={QMugs, quantum mechanical properties of drug-like molecules},
  author={Isert, Clemens and Atz, Kenneth and Jim{\'e}nez-Luna, Jos{\'e} and Schneider, Gisbert},
  journal={Scientific Data},
  volume={9},
  number={1},
  pages={273},
  year={2022},
  publisher={Nature Publishing Group UK London}
}

@misc{shiota2024taming,
  title={Taming Multi-Domain, -Fidelity Data: Towards Foundation Models for Atomistic Scale Simulations},
  author={Tomoya Shiota and Kenji Ishihara and Tuan Minh Do and Toshio Mori and Wataru Mizukami},
  year={2024},
  eprint={2412.13088},
  archivePrefix={arXiv},
  primaryClass={physics.chem-ph}
}
```

## Example

This example computes single-point energies of a silicon crystal and an acetic acid molecule using the universal multi-domain MLIP MACE-Osaka24 and the Atomic Simulation Environment (ASE).

```python
from ase.build import bulk, molecule
from mace.calculators import MACECalculator

# Path to the downloaded model file (adjust to your local copy).
model_path = '/path-to-mace-osaka24/mace-osaka24-large.model'

# Load the model once; the same calculator is attached to both systems.
calculator = MACECalculator(model_path=model_path, device='cpu')

# Crystalline domain: diamond-structure silicon.
si = bulk('Si', 'diamond', a=5.43)
si.calc = calculator

energy_si = si.get_potential_energy()
print("Single-point energy of diamond Si:", energy_si)

# Molecular domain: acetic acid from ASE's built-in molecule database.
acid = molecule('CH3COOH')
acid.calc = calculator

energy_acid = acid.get_potential_energy()
print("Single-point energy of acetic acid:", energy_acid)
```

## Contributors
This project was developed by:

- Tomoya Shiota (@TShiotaSS)
- Kenji Ishihara (@kenji-ishihara-os)
- Toshio Mori (@forest1040)
- Wataru Mizukami (@wmizukami)

--------------------------------------------------------------------------------
/mace_osaka24/mace-osaka24-large.sh:
--------------------------------------------------------------------------------
python3 /opt/src/new_mace_0729/mace/mace/cli/run_train.py \
    --name="${MODEL_NAME}" \
    --train_file="/dataset/train" \
    --valid_file="/dataset/val" \
    --test_file="/dataset/test" \
    --statistics_file="/dataset/statistics.json" \
    --loss='universal' \
    --energy_weight=1 \
    --forces_weight=10 \
    --compute_stress=True \
    --stress_weight=100 \
    --stress_key='stress' \
    --eval_interval=1 \
    --error_table='PerAtomMAE' \
    --model="ScaleShiftMACE" \
    --interaction_first="RealAgnosticResidualInteractionBlock" \
    --interaction="RealAgnosticResidualInteractionBlock" \
    --num_interactions=2 \
    --correlation=3 \
    --max_ell=3 \
    --r_max=4.5 \
    --max_L=2 \
    --num_channels=128 \
    --num_radial_basis=10 \
    --MLP_irreps="16x0e" \
    --scaling='rms_forces_scaling' \
    --num_workers=64 \
    --lr=0.005 \
    --weight_decay=1e-8 \
    --ema \
    --ema_decay=0.995 \
    --scheduler_patience=5 \
    --batch_size=16 \
    --valid_batch_size=32 \
    --max_num_epochs=200 \
    --patience=50 \
    --amsgrad \
    --device=cuda \
    --seed=1 \
    --clip_grad=100 \
    --keep_checkpoints \
    --save_cpu \
    --restart_latest \
    --log_dir="/mnt/logs/${MODEL_NAME}" \
    --model_dir="/mnt/models/${MODEL_NAME}" \
    --checkpoints_dir="/mnt/checkpoints/${MODEL_NAME}" \
    --results_dir="/mnt/results/${MODEL_NAME}" \
    --distributed >> ${RESULT_FILE}

--------------------------------------------------------------------------------
/mace_osaka24/mace-osaka24-medium.sh:
--------------------------------------------------------------------------------
python3 /opt/src/new_mace_0729/mace/mace/cli/run_train.py \
    --name="${MODEL_NAME}" \
    --train_file="/dataset/train" \
    --valid_file="/dataset/val" \
    --test_file="/dataset/test" \
    --statistics_file="/dataset/statistics.json" \
    --loss='universal' \
    --energy_weight=1 \
    --forces_weight=10 \
    --compute_stress=True \
    --stress_weight=100 \
    --stress_key='stress' \
    --eval_interval=1 \
    --error_table='PerAtomMAE' \
    --model="ScaleShiftMACE" \
    --interaction_first="RealAgnosticResidualInteractionBlock" \
    --interaction="RealAgnosticResidualInteractionBlock" \
    --num_interactions=2 \
    --correlation=3 \
    --max_ell=3 \
    --r_max=4.5 \
    --max_L=1 \
    --num_channels=128 \
    --num_radial_basis=10 \
    --MLP_irreps="16x0e" \
    --scaling='rms_forces_scaling' \
    --num_workers=64 \
    --lr=0.005 \
    --weight_decay=1e-8 \
    --ema \
    --ema_decay=0.995 \
    --scheduler_patience=5 \
    --batch_size=16 \
    --valid_batch_size=32 \
    --max_num_epochs=200 \
    --patience=50 \
    --amsgrad \
    --device=cuda \
    --seed=1 \
    --clip_grad=100 \
    --keep_checkpoints \
    --save_cpu \
    --restart_latest \
    --log_dir="/mnt/logs/${MODEL_NAME}" \
    --model_dir="/mnt/models/${MODEL_NAME}" \
    --checkpoints_dir="/mnt/checkpoints/${MODEL_NAME}" \
    --results_dir="/mnt/results/${MODEL_NAME}" \
    --distributed >> ${RESULT_FILE}

--------------------------------------------------------------------------------
/mace_osaka24/mace-osaka24-small.sh:
--------------------------------------------------------------------------------
python3 /opt/src/new_mace_0729/mace/mace/cli/run_train.py \
    --name="${MODEL_NAME}" \
    --train_file="/dataset/train" \
    --valid_file="/dataset/val" \
    --test_file="/dataset/test" \
    --statistics_file="/dataset/statistics.json" \
    --loss='universal' \
    --energy_weight=1 \
    --forces_weight=10 \
    --compute_stress=True \
    --stress_weight=100 \
    --stress_key='stress' \
    --eval_interval=1 \
    --error_table='PerAtomMAE' \
    --model="ScaleShiftMACE" \
    --interaction_first="RealAgnosticResidualInteractionBlock" \
    --interaction="RealAgnosticResidualInteractionBlock" \
    --num_interactions=2 \
    --correlation=3 \
    --max_ell=3 \
    --r_max=4.5 \
    --max_L=0 \
    --num_channels=128 \
    --num_radial_basis=10 \
    --MLP_irreps="16x0e" \
    --scaling='rms_forces_scaling' \
    --num_workers=64 \
    --lr=0.005 \
    --weight_decay=1e-8 \
    --ema \
    --ema_decay=0.995 \
    --scheduler_patience=5 \
    --batch_size=16 \
    --valid_batch_size=32 \
    --max_num_epochs=200 \
    --patience=50 \
    --amsgrad \
    --device=cuda \
    --seed=1 \
    --clip_grad=100 \
    --keep_checkpoints \
    --save_cpu \
    --restart_latest \
    --log_dir="/mnt/logs/${MODEL_NAME}" \
    --model_dir="/mnt/models/${MODEL_NAME}" \
    --checkpoints_dir="/mnt/checkpoints/${MODEL_NAME}" \
    --results_dir="/mnt/results/${MODEL_NAME}" \
    --distributed >> ${RESULT_FILE}

--------------------------------------------------------------------------------
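
The three training scripts in `mace_osaka24/` are identical except for `--max_L`, which is 0, 1, and 2 for the small, medium, and large models respectively. The following is a minimal sketch of that relationship, not part of the repository: the helper `build_train_command`, the tables `MAX_L_BY_SIZE` and `SHARED_FLAGS`, and the example model names are hypothetical, the relative `run_train.py` path stands in for the absolute path used in the scripts, and only a subset of the shared flags is listed.

```python
import shlex

# --max_L value used by each script in mace_osaka24/ (read off the scripts).
MAX_L_BY_SIZE = {"small": 0, "medium": 1, "large": 2}

# A subset of the flags shared by all three scripts; the scripts hold the full list.
SHARED_FLAGS = [
    "--model=ScaleShiftMACE",
    "--num_interactions=2",
    "--correlation=3",
    "--max_ell=3",
    "--r_max=4.5",
    "--num_channels=128",
]


def build_train_command(size: str, model_name: str) -> list[str]:
    """Return an argv list for the given model size (hypothetical helper)."""
    if size not in MAX_L_BY_SIZE:
        raise ValueError(f"unknown model size: {size!r}")
    return [
        "python3",
        "run_train.py",  # placeholder; the scripts use an absolute path
        f"--name={model_name}",
        *SHARED_FLAGS,
        f"--max_L={MAX_L_BY_SIZE[size]}",  # the only flag that differs by size
    ]


if __name__ == "__main__":
    # Print one shell-quoted command line per model size.
    for size in MAX_L_BY_SIZE:
        print(shlex.join(build_train_command(size, f"mace-osaka24-{size}")))
```

Keeping the size-dependent value in a single table makes it obvious that the small, medium, and large models share every other hyperparameter shown in the scripts.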