├── .gitignore ├── LICENSE ├── README.md ├── pyproject.toml ├── sample_data ├── 202207_smip_monset.csv └── SMiPoly_001_generatedPolymer.csv.zip ├── sample_script └── sample_smip_demo3.ipynb ├── setup.cfg ├── setup.py ├── src └── smipoly │ ├── __init__.py │ ├── _version.py │ ├── rules │ ├── excl_lst.json │ ├── mon_dic.json │ ├── mon_dic_inv.json │ ├── mon_lst.json │ ├── mon_vals.json │ ├── ps_class.json │ ├── ps_gen.pkl │ └── ps_rxn.pkl │ └── smip │ ├── __init__.py │ ├── funclib.py │ ├── monc.py │ └── polg.py └── utilities ├── 1_MonomerDefiner.ipynb ├── 2_Ps_rxnL.ipynb ├── 3_Ps_GenL.ipynb └── rules ├── excl_lst.json ├── mon_dic.json ├── mon_dic_inv.json ├── mon_lst.json ├── mon_vals.json ├── ps_class.json ├── ps_gen.pkl └── ps_rxn.pkl /.gitignore: -------------------------------------------------------------------------------- 1 | #ignore temp files 2 | src/smipoly/__pycache__/ 3 | src/smipoly/smip/__pycache__/ 4 | sample_script/.ipynb_checkpoints/ 5 | utilities/.ipynb_checkpoints/ 6 | __pycache__/ 7 | build/ 8 | *.egg-info/ 9 | dist/ 10 | -------------------------------------------------------------------------------- /LICENSE: -------------------------------------------------------------------------------- 1 | BSD 3-Clause License 2 | 3 | Copyright (c) 2022, PEJpOhn 4 | All rights reserved. 5 | 6 | Redistribution and use in source and binary forms, with or without 7 | modification, are permitted provided that the following conditions are met: 8 | 9 | 1. Redistributions of source code must retain the above copyright notice, this 10 | list of conditions and the following disclaimer. 11 | 12 | 2. Redistributions in binary form must reproduce the above copyright notice, 13 | this list of conditions and the following disclaimer in the documentation 14 | and/or other materials provided with the distribution. 15 | 16 | 3. 
Neither the name of the copyright holder nor the names of its 17 | contributors may be used to endorse or promote products derived from 18 | this software without specific prior written permission. 19 | 20 | THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" 21 | AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE 22 | IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE 23 | DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE 24 | FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL 25 | DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR 26 | SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER 27 | CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, 28 | OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE 29 | OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. 30 | -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 | # SMiPoly 2 | 3 | ![license](https://anaconda.org/conda-forge/smipoly/badges/license.svg) 4 | |PePy.tech (pip)|conda-forge:| 5 | |:---:|:---:| 6 | |[![Downloads](https://static.pepy.tech/badge/smipoly)](https://pepy.tech/project/smipoly)|![conda](https://anaconda.org/conda-forge/smipoly/badges/downloads.svg)| 7 | 8 | ## 1. About SMiPoly 9 | "SMiPoly (**S**mall **M**olecules **i**nto **Poly**mers)" is a rule-based virtual library generator for the discovery of functional polymers. It consists of two submodules, "monc.py" and "polg.py". 10 | "monc.py" is a monomer classifier that works on a list of small molecules, and "polg.py" is a polymer repeating unit generator that works on the classified monomer list. 
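The two submodules are meant to be chained: classify first, then generate. A minimal sketch of that workflow follows, assuming the package is importable as `smipoly.smip` (matching the `src/smipoly/smip/` layout), that `monc.moncls` and `polg.biplym` take the arguments shown in section 4 and each return a DataFrame, and that the generator is named `biplym` as in the signature line (the section 4-2-1 heading spells it "bipolym", so check the installed version). `KNOWN_TARGETS`, `check_targets` and `build_polymer_library` are hypothetical helper names introduced here for illustration only.

```python
"""Sketch of the two-stage SMiPoly workflow: classify monomers, then generate polymers."""

# Polymer classes listed in this README as valid entries for `targ`.
KNOWN_TARGETS = {
    "all", "polyolefin", "polyester", "polyether", "polyamide",
    "polyimide", "polyurethane", "polyoxazolidone",
}


def check_targets(targets):
    """Return the targets as a list, rejecting names the README does not list."""
    unknown = sorted(set(targets) - KNOWN_TARGETS)
    if unknown:
        raise ValueError(f"unknown polymer class(es): {unknown}")
    return list(targets)


def build_polymer_library(smiles, targets=("all",), smiles_column="SMILES"):
    """Classify the given SMILES strings, then generate polymer repeating units.

    Assumes pandas, RDKit and smipoly are installed; they are imported here
    so that the helpers above stay usable without them.
    """
    import pandas as pd
    from smipoly.smip import monc, polg  # assumed import path (src/smipoly/smip/)

    targ = check_targets(targets)
    molecules = pd.DataFrame({smiles_column: list(smiles)})
    # Stage 1: monc extracts and classifies monomers from the small molecules.
    monomers = monc.moncls(df=molecules, smiColn=smiles_column, dsp_rsl=False)
    # Stage 2: polg generates repeating units from the classified monomers.
    return polg.biplym(df=monomers, targ=targ, Pmode="a", dsp_rsl=False)
```

For example, `build_polymer_library(["OCCO", "OC(=O)CCCCC(=O)O"], targets=["polyester"])` would pass ethylene glycol and adipic acid through both stages; what comes back depends on the installed rule set.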
11 | 12 | **How To Cite (publications)** 13 | SMiPoly: Generation of a Synthesizable Polymer Virtual Library Using Rule-Based Polymerization Reactions 14 | Mitsuru Ohno, Yoshihiro Hayashi, Qi Zhang, Yu Kaneko, and Ryo Yoshida 15 | *Journal of Chemical Information and Modeling* **2023** *63* (17), 5539-5548 16 | DOI: 10.1021/acs.jcim.3c00329 17 | https://doi.org/10.1021/acs.jcim.3c00329 18 | (version 0.0.1 was used) 19 | 20 | ## 2. Current version and requirements 21 | current version = 1.0.1 22 | requirements 23 | - python 3.9, 3.10, 3.11, 3.12 24 | - rdkit >= 2023.9.1 25 | - numpy >= 1.26.0 26 | - pandas >= 2.1.0 27 | 28 | ## 3. Installation and usage 29 | ### 3-1. Installation 30 | SMiPoly can be installed with pip or conda. 31 | ### 3-1-1. Install with pip 32 | Create a new virtual environment and activate it. 33 | To install this package, run the following command. 34 | 35 | ```sh 36 | $pip install smipoly 37 | ``` 38 | ### 3-1-2. Install with conda 39 | Add the channel "conda-forge" if it has not been enabled. 40 | 41 | ```sh 42 | $conda config --add channels conda-forge 43 | ``` 44 | 45 | Create a new environment. 46 | ```sh 47 | $conda create -n "YOUR_NEW_ENVIRONMENT_NAME" python 48 | # or 49 | $conda create -n "YOUR_NEW_ENVIRONMENT_NAME" python="required version (ex. 3.10)" 50 | ``` 51 | Then activate it. 52 | ```sh 53 | $conda activate "YOUR_NEW_ENVIRONMENT_NAME" 54 | ``` 55 | And install SMiPoly. 56 | ```sh 57 | $conda install smipoly 58 | ``` 59 | 60 | Or, after creating and activating a new environment, 61 | ```sh 62 | $conda install conda-forge::smipoly 63 | ``` 64 | 65 | ### 3-2. Quick start 66 | Download 'sample_data/202207_smip_monset.csv' and 'sample_script/sample_smip_demo3.ipynb' from [SMiPoly repository](https://github.com/PEJpOhno/SMiPoly) to the same directory on your computer. 67 | Then run sample_smip_demo3.ipynb. To run this demo script, Jupyter Notebook is required. 68 | 69 | ## 4. Module contents 70 | ### 4-1. monc.py 71 | The functions of monc.py are as follows. 
72 | - extract monomers from a list of small molecules. 73 | - classify extracted monomers into each monomer class. 74 | 75 | The chemical structure of the small molecule compounds should be expressed in simplified molecular input line entry system (SMILES) and given as a pandas DataFrame. 76 | 77 | **Functions** 78 | smip.**monc.moncls**(*df, smiColn, minFG = 2, maxFG = 4, dsp_rsl=False*) 79 | smip.**monc.olecls**(*df, smiColn, minFG = 1, maxFG = 4, dsp_rsl=False*) 80 | 81 | ARGUMENTS: 82 | 83 | - df: name of the object DataFrame 84 | - smiColn: The column label of the SMILES column, given as a *str*. 85 | - minFG: minimum number of the polymerizable functional groups in the monomer for successive polymerization (default for moncls, 2: 2 or more; for olecls, 1: 1 or more) 86 | - maxFG: maximum number of the polymerizable functional groups in the monomer for successive polymerization (default 4: 4 or less) 87 | - dsp_rsl: display classified result (default False) 88 | 89 | **Defined monomer class** 90 | By the function "moncls" 91 | - vinylidene 92 | - cyclic olefin 93 | - epoxide and diepoxide 94 | - lactone 95 | - lactam 96 | - hydroxy carboxylic acid 97 | - amino acid 98 | - cyclic carboxylic acid anhydride and bis(cyclic carboxylic acid anhydride) 99 | - hindered phenol 100 | - dicarboxylic acid and acid halide 101 | - diol 102 | - diamine and primary diamine 103 | - diisocyanate 104 | - bis(halo aryl)sulfone 105 | - bis(fluoro aryl)ketone 106 | 107 | By the function "olecls" 108 | (The following classes of compounds also belong to the class "vinylidene" and / or "cyclic olefin".) 109 | - acryl 110 | - beta-electron withdrawing group substituted olefin 111 | - styryl 112 | - allyl 113 | - haloCH 114 | - vinyl ester 115 | - maleic imide derivatives 116 | - conjugated dienes 117 | - vinyl ether 118 | - beta-disubstituted aliphatic olefin 119 | - alicyclic olefin 120 | - aliphatic olefin 121 | 122 | ### 4-2. 
polg.py 123 | The library "polg.py" has two functions, "bipolym" and "ole_copolym". 124 | 125 | #### 4-2-1. Function "bipolym" 126 | The function "bipolym" gives all synthesizable polymer repeating units starting from the classified monomer list generated by "monc.moncls". 127 | For chain polymerization (polyolefins and some polyethers), it gives homopolymers and binary copolymers. For successive (or step) polymerization, it gives homopolymers only. 128 | 129 | smip.**polg.biplym**(*df, targ = \['all'\], Pmode = 'a', dsp_rsl=False*) 130 | 131 | ARGUMENTS: 132 | - df: name of the DataFrame of classified monomers generated by *monc.moncls*. 133 | - targ: targeted polymer class. When present, it can be a list of *str*. The selectable elements are 'polyolefin', 'polyester', 'polyether', 'polyamide', 'polyimide', 'polyurethane', 'polyoxazolidone' and 'all' (default = ['all']) 134 | - Pmode: generate all isomers of the polymer repeating unit ('a') or only a representative polymer repeating unit ('r'). (default = 'a') 135 | **Future warning:** Mode 'r' will be deprecated and merged into mode 'a'. Use mode 'a' instead. 136 | - dsp_rsl: display the DataFrame of the generated polymers. (default False) 137 | 138 | **Defined polymer class** 139 | - polyolefin, polycyclic olefin and their binary copolymers 140 | - polyester (from lactone, hydroxy carboxylic acid, dicarboxylic acid + diol, diol + CO and cyclic carboxylic acid anhydride + epoxide) 141 | - polyether (from epoxide, hindered phenol, bis(halo aryl)sulfone + diol and bis(fluoro aryl)ketone + diol) 142 | - polyamide (from lactam, amino acid and dicarboxylic acid + diamine) 143 | - polyimide (bis(cyclic carboxylic acid anhydride) + primary diamine) 144 | - polyurethane (diisocyanate + diol) 145 | - polyoxazolidone (diepoxide + diisocyanate) 146 | 147 | #### 4-2-2. 
Function "ole_copolym" 148 | The function "ole_copolym" gives olefinic (co)polymer repeating units starting from the classified monomer list generated by "monc.olecls". The combination of olefinic monomer classes and the number of components are given as the arguments. 149 | 150 | smip.**polg.ole_copolym**(df, targ = *\[ \]*, ncomp = *1*, dsp_rsl = None, drop_dupl = None) 151 | 152 | ARGUMENTS: 153 | - df: name of the DataFrame of classified monomers generated by *monc.olecls*. 154 | - targ: list of monomer class(es) to copolymerize. 155 | - ncomp: number of components, given as an *int*. (default 1) 156 | - dsp_rsl: display the DataFrame of the generated polymers. (default False) 157 | - drop_dupl: drop duplicated copolymers (default False) 158 | 159 | 160 | ### 4-3. Sample data 161 | The sample dataset './sample_data/202207_smip_monset.csv' includes 1,083 common monomers collected from published documents such as scientific articles and catalogues. 162 | 163 | ### 4-4. Utilities 164 | By using the files in the './utilities' directory, one can modify or add the definitions of monomers, the rules of polymerization reactions and the polymer classes. 165 | To apply the new rule(s), replace the old './smipoly/rules' directory with the new one. The files must be run in the order of the number at the head of each filename. 166 | 167 | - 1_MonomerDefiner.ipynb: definitions of monomers 168 | - 2_Ps_rxnL.ipynb: rules of polymerization reactions 169 | - 3_Ps_GenL.ipynb: definitions of polymer classes with combinations of starting monomer(s) and polymerization reaction 170 | 171 | ## 5. Copyright and license 172 | Copyright (c) 2022 Mitsuru Ohno 173 | Released under the BSD 3-Clause license, which can be found in the LICENSE file. 174 | 175 | ## 6. Related projects 176 | RadonPy (Fully automated calculation for a comprehensive set of polymer properties) 177 | https://github.com/RadonPy/RadonPy 178 | 179 | ## 7. 
Directory configuration 180 | 181 | ```sh 182 | SMiPoly 183 | ├── src 184 | │ └── smipoly 185 | │ ├── __init__.py 186 | │ ├── _version.py 187 | │ ├── smip 188 | │ │ ├── __init__.py 189 | │ │ ├── funclib.py 190 | │ │ ├── monc.py 191 | │ │ └── polg.py 192 | │ └── rules 193 | │ ├── excl_lst.json 194 | │ ├── mon_dic_inv.json 195 | │ ├── mon_dic.json 196 | │ ├── mon_lst.json 197 | │ ├── mon_vals.json 198 | │ ├── ps_class.json 199 | │ ├── ps_gen.pkl 200 | │ └── ps_rxn.pkl 201 | ├── LICENSE 202 | ├── pyproject.toml 203 | ├── setup.py 204 | ├── setup.cfg 205 | ├── README.md 206 | ├── sample_data 207 | │ └── 202207_smip_monset.csv 208 | ├── sample_script 209 | │ └── sample_smip_demo3.ipynb 210 | └── utilities 211 | ├── 1_MonomerDefiner.ipynb 212 | ├── 2_Ps_rxnL.ipynb 213 | ├── 3_Ps_GenL.ipynb 214 | └── rules/ 215 | ``` 216 | 217 | ## Reference 218 | https://future-chem.com/rdkit-chemical-rxn/ 219 | https://www.daylight.com/dayhtml_tutorials/languages/smarts/smarts_examples.html 220 | https://www.daylight.com/dayhtml/doc/theory/theory.smarts.html 221 | -------------------------------------------------------------------------------- /pyproject.toml: -------------------------------------------------------------------------------- 1 | #!/usr/bin/env python 2 | # coding: utf-8 3 | 4 | [build-system] 5 | requires = ["setuptools>=45", "wheel", "setuptools_scm[toml]>=6.0"] 6 | build-backend = "setuptools.build_meta" 7 | -------------------------------------------------------------------------------- /sample_data/SMiPoly_001_generatedPolymer.csv.zip: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/PEJpOhno/SMiPoly/cceef4dfbabde2d5b68d55d153b2e152d799b971/sample_data/SMiPoly_001_generatedPolymer.csv.zip -------------------------------------------------------------------------------- /setup.cfg: -------------------------------------------------------------------------------- 1 | #!/usr/bin/env python 2 | # coding: utf-8 
3 | 4 | [metadata] 5 | name = smipoly 6 | version = attr: smipoly._version.__version__ 7 | url = https://github.com/PEJpOhno/SMiPoly 8 | author = Mitsuru Ohno 9 | author_email = pejpohn@gmail.com 10 | description = rule-based virtual polymer library generator 11 | long_description = file: README.md, 12 | long_description_content_type = text/markdown 13 | keywords = rdkit, polymer, 14 | license = BSD 3-Clause License 15 | license_files = file: LICENSE, 16 | classifiers = 17 | Development Status :: 4 - Beta 18 | License :: OSI Approved :: BSD License 19 | Programming Language :: Python :: 3 20 | Topic :: Scientific/Engineering :: Chemistry 21 | 22 | [options] 23 | zip_safe = False 24 | include_package_data = True 25 | packages = find_namespace: 26 | package_dir = 27 | = src 28 | python_requires = <3.13,>=3.9 29 | install_requires = 30 | rdkit >= 2023.9.1 31 | numpy >= 1.26.0 32 | pandas >= 2.1.0 33 | 34 | [options.packages.find] 35 | where = src 36 | 37 | [options.package_data] 38 | smipoly = 39 | rules/*.json 40 | rules/*.pkl 41 | -------------------------------------------------------------------------------- /setup.py: -------------------------------------------------------------------------------- 1 | #!/usr/bin/env python 2 | # coding: utf-8 3 | 4 | from setuptools import setup 5 | setup() 6 | -------------------------------------------------------------------------------- /src/smipoly/__init__.py: -------------------------------------------------------------------------------- 1 | # Copyright (c) 2021 Mitsuru Ohno 2 | # Use of this source code is governed by a BSD-3-style 3 | # license that can be found in the LICENSE file. 4 | 5 | # 08/21/2022, M. 
Ohno 6 | # smipoly __init__ 7 | 8 | from ._version import __version__ 9 | -------------------------------------------------------------------------------- /src/smipoly/_version.py: -------------------------------------------------------------------------------- 1 | #!/usr/bin/env python 2 | # coding: utf-8 3 | 4 | __version__ = "1.0.1" 5 | -------------------------------------------------------------------------------- /src/smipoly/rules/excl_lst.json: -------------------------------------------------------------------------------- 1 | {"0": [], "1": ["[CX3H1]=[O]", "[$(*-[NX2-]-[NX2+]#[NX1]),$(*-[NX2]=[NX2+]=[NX1-])]", "[CX3;!R](=[OX1])[OX2;!R][CX3;!R](=[OX1])", "[CX3;!R](=[OX1])[NX3;!R][CX3;!R](=[OX1])"], "2": ["[N&X3;H2,H1;!$(N[C,S]=*)]", "[CX3H1]=[O]", "[$(*-[NX2-]-[NX2+]#[NX1]),$(*-[NX2]=[NX2+]=[NX1-])]", "[CX3;!R](=[OX1])[OX2;!R][CX3;!R](=[OX1])", "[CX3;!R](=[OX1])[NX3;!R][CX3;!R](=[OX1])"], "51": ["[N&X3;H2,H1;!$(N[C,S]=*)]", "[CX3H1]=[O]", "[$(*-[NX2-]-[NX2+]#[NX1]),$(*-[NX2]=[NX2+]=[NX1-])]", "[CX3;!R](=[OX1])[OX2;!R][CX3;!R](=[OX1])", "[CX3;!R](=[OX1])[NX3;!R][CX3;!R](=[OX1])"], "3": ["[CX3;R]=[CX3;R]-[CX3;R]=[CX3;R]", "[CX3H1]=[O]", "[$(*-[NX2-]-[NX2+]#[NX1]),$(*-[NX2]=[NX2+]=[NX1-])]", "[CX3;!R](=[OX1])[OX2;!R][CX3;!R](=[OX1])", "[CX3;!R](=[OX1])[NX3;!R][CX3;!R](=[OX1])"], "4": ["[OX2;R]1[CX3;R](=[OX1])[C,c;R][C,c;R][C,c;R]1", "[c][OX2;R5][CX3;R5](=[OX1])", "[OX2;R5][CX3;R5](=[OX1])[c]", "[C,c][C;R](=[OX1])[O;R][C;R](=[OX1])[C,c]", "[CX3H1]=[O]", "[$(*-[NX2-]-[NX2+]#[NX1]),$(*-[NX2]=[NX2+]=[NX1-])]", "[CX3;!R](=[OX1])[OX2;!R][CX3;!R](=[OX1])", "[CX3;!R](=[OX1])[NX3;!R][CX3;!R](=[OX1])"], "5": ["[NX3;R]1[CX3;R](=[OX1])[C,c;R][C,c;R][C,c;R]1", "[C,c;R5][NX3;R5][CX3;R5](=[OX1])[C,c;R5]", "[C;R](=[OX1])[N;R][C;R](=[OX1])", "[C;R][N;R][C;R](=[OX1])[N;R][CX3H1]=[O]", "[$(*-[NX2-]-[NX2+]#[NX1]),$(*-[NX2]=[NX2+]=[NX1-])]", "[CX3;!R](=[OX1])[OX2;!R][CX3;!R](=[OX1])", "[CX3;!R](=[OX1])[NX3;!R][CX3;!R](=[OX1])"], "6": ["[CX3H1]=[O]", 
"[$(*-[NX2-]-[NX2+]#[NX1]),$(*-[NX2]=[NX2+]=[NX1-])]", "[CX3;!R](=[OX1])[OX2;!R][CX3;!R](=[OX1])", "[CX3;!R](=[OX1])[NX3;!R][CX3;!R](=[OX1])"], "7": ["[CX3H1]=[O]", "[$(*-[NX2-]-[NX2+]#[NX1]),$(*-[NX2]=[NX2+]=[NX1-])]", "[CX3;!R](=[OX1])[OX2;!R][CX3;!R](=[OX1])", "[CX3;!R](=[OX1])[NX3;!R][CX3;!R](=[OX1])"], "8": ["[N&X3;H2,H1,H0;!$(N[C,S]=*)]", "[F,Cl,Br,I]", "[CX3H1]=[O]", "[$(*-[NX2-]-[NX2+]#[NX1]),$(*-[NX2]=[NX2+]=[NX1-])]", "[CX3;!R](=[OX1])[OX2;!R][CX3;!R](=[OX1])", "[CX3;!R](=[OX1])[NX3;!R][CX3;!R](=[OX1])"], "9": ["[NX2;R][C;R;X3](=[O])[O;R][C;R;X3](=[O])[C;R]", "[NX3;R][C;R;X3](=[O])[O;R][C;R;X3](=[O])[C;R]", "[CX3H1]=[O]", "[$(*-[NX2-]-[NX2+]#[NX1]),$(*-[NX2]=[NX2+]=[NX1-])]", "[CX3;!R](=[OX1])[OX2;!R][CX3;!R](=[OX1])", "[CX3;!R](=[OX1])[NX3;!R][CX3;!R](=[OX1])"], "10": ["[CX3H1]=[O]", "[$(*-[NX2-]-[NX2+]#[NX1]),$(*-[NX2]=[NX2+]=[NX1-])]", "[CX3;!R](=[OX1])[OX2;!R][CX3;!R](=[OX1])", "[CX3;!R](=[OX1])[NX3;!R][CX3;!R](=[OX1])"], "11": ["[CX3H1]=[O]", "[$(*-[NX2-]-[NX2+]#[NX1]),$(*-[NX2]=[NX2+]=[NX1-])]", "[CX3;!R](=[OX1])[OX2;!R][CX3;!R](=[OX1])", "[CX3;!R](=[OX1])[NX3;!R][CX3;!R](=[OX1])"], "52": ["[CX4H2][OH1]", "[CX4H1][OH1]", "[CX4H0][OH1]", "[CX4H0][C](=[O])[OH1]", "[CX4H0][C](=[O])[Cl,Br][N&X3;H2,H1;!$(N[C,S]=*)]", "[CX3H1]=[O]", "[$(*-[NX2-]-[NX2+]#[NX1]),$(*-[NX2]=[NX2+]=[NX1-])]", "[CX3;!R](=[OX1])[OX2;!R][CX3;!R](=[OX1])", "[CX3;!R](=[OX1])[NX3;!R][CX3;!R](=[OX1])"], "53": ["[CX3H0](=[O])[OH1]", "[CX4H0;!R]([C])([C])[O,S;H1]", "[CX4H1;R]([OH1])[CX4H1;R]([OH1])[O;R]", "[N&X3;H2,H1;!$(N[C,S]=*)]", "[CX3H1]=[O]", "[$(*-[NX2-]-[NX2+]#[NX1]),$(*-[NX2]=[NX2+]=[NX1-])]", "[CX3;!R](=[OX1])[OX2;!R][CX3;!R](=[OX1])", "[CX3;!R](=[OX1])[NX3;!R][CX3;!R](=[OX1])"], "54": ["[CX3H0](=[O])[OH1]", "[CX4H0]([C])([C])[OH1]", "[CX4H1;R]([OH1])[CX4H1;R]([OH1])[O;R]", "[C][OX2H]", "[C][SX2H]", "[CX3H1]=[O]", "[$(*-[NX2-]-[NX2+]#[NX1]),$(*-[NX2]=[NX2+]=[NX1-])]", "[CX3;!R](=[OX1])[OX2;!R][CX3;!R](=[OX1])", "[CX3;!R](=[OX1])[NX3;!R][CX3;!R](=[OX1])"], "55": ["[CX3H1]=[O]", 
"[$(*-[NX2-]-[NX2+]#[NX1]),$(*-[NX2]=[NX2+]=[NX1-])]", "[CX3;!R](=[OX1])[OX2;!R][CX3;!R](=[OX1])", "[CX3;!R](=[OX1])[NX3;!R][CX3;!R](=[OX1])"], "56": ["[NX2;R][C;R;X3](=[O])[O;R][C;R;X3](=[O])[C;R]", "[NX3;R][C;R;X3](=[O])[O;R][C;R;X3](=[O])[C;R]", "[CX3H1]=[O]", "[$(*-[NX2-]-[NX2+]#[NX1]),$(*-[NX2]=[NX2+]=[NX1-])]", "[CX3;!R](=[OX1])[OX2;!R][CX3;!R](=[OX1])", "[CX3;!R](=[OX1])[NX3;!R][CX3;!R](=[OX1])"], "57": ["[CX3H0](=[O])[OH1]", "[CX3H0](=[O])[NX3]", "[CX4][OH1]", "[c][OH1]", "[CX3H1]=[O]", "[$(*-[NX2-]-[NX2+]#[NX1]),$(*-[NX2]=[NX2+]=[NX1-])]", "[CX3;!R](=[OX1])[OX2;!R][CX3;!R](=[OX1])", "[CX3;!R](=[OX1])[NX3;!R][CX3;!R](=[OX1])"], "12": ["[CX3](=[OX1])[OX2]", "[CX3H0](=[O])[NX3]", "[CX4][OH1]", "[c][OH1]", "[CX4][N&X3;H2;!$(N[C,S]=*)]", "[c][N&X3;H2;!$(N[C,S]=*)]", "[NX1]#[CX2]", "[CX4]([C,c])([C,c])[C,c]", "[CX3H1]=[O]", "[$(*-[NX2-]-[NX2+]#[NX1]),$(*-[NX2]=[NX2+]=[NX1-])]", "[CX3;!R](=[OX1])[OX2;!R][CX3;!R](=[OX1])", "[CX3;!R](=[OX1])[NX3;!R][CX3;!R](=[OX1])"], "13": ["[CX3](=[OX1])[OX2]", "[CX3H0](=[O])[NX3]", "[CX4][OH1]", "[c][OH1]", "[CX4][N&X3;H2;!$(N[C,S]=*)]", "[c][N&X3;H2;!$(N[C,S]=*)]", "[NX1]#[CX2]", "[CX4]([C,c])([C,c])[C,c]", "[CX3H1]=[O]", "[$(*-[NX2-]-[NX2+]#[NX1]),$(*-[NX2]=[NX2+]=[NX1-])]", "[CX3;!R](=[OX1])[OX2;!R][CX3;!R](=[OX1])", "[CX3;!R](=[OX1])[NX3;!R][CX3;!R](=[OX1])"], "58": ["[CX3](=[OX1])[OX2]", "[CX4H0;!R]([C])([C])[O,S;H1]", "[CX4H1;R]([OH1])[CX4H1;R]([OH1])[O;R]", "[CX4H1][SX2;H1]", "[CX4H2][SX2;H1]", "[c][SX2;H1]", "[N&X3;H2,H1;!$(N[C,S]=*)]", "[CX3H0](=[O])[NX3]", "[NX1]#[CX2]", "[CX4]1[OX2][CX4]1", "[CX3H1]=[O]", "[$(*-[NX2-]-[NX2+]#[NX1]),$(*-[NX2]=[NX2+]=[NX1-])]", "[CX3;!R](=[OX1])[OX2;!R][CX3;!R](=[OX1])", "[CX3;!R](=[OX1])[NX3;!R][CX3;!R](=[OX1])"], "1001": ["[CX4]-[I,Br]", "[CX3H1]=[O]", "[$(*-[NX2-]-[NX2+]#[NX1]),$(*-[NX2]=[NX2+]=[NX1-])]", "[CX3;!R](=[OX1])[OX2;!R][CX3;!R](=[OX1])", "[CX3;!R](=[OX1])[NX3;!R][CX3;!R](=[OX1])", "[CX4]-[NX2]=[NX2]-[CX4]", "[OX2]-[OX2]", "[OX2H][cX3]:[c]", 
"[CX3]1(=[OX1])[CX3]=[CX3][CX3](=[OX1])[CX3]=[CX2]1"], "1002": ["[CX4]-[I,Br]", "[CX3H1]=[O]", "[$(*-[NX2-]-[NX2+]#[NX1]),$(*-[NX2]=[NX2+]=[NX1-])]", "[CX3;!R](=[OX1])[OX2;!R][CX3;!R](=[OX1])", "[CX3;!R](=[OX1])[NX3;!R][CX3;!R](=[OX1])", "[CX4]-[NX2]=[NX2]-[CX4]", "[OX2]-[OX2]", "[OX2H][cX3]:[c]", "[CX3]1(=[OX1])[CX3]=[CX3][CX3](=[OX1])[CX3]=[CX2]1"], "1003": ["[CX4]-[I,Br]", "[CX3H1]=[O]", "[$(*-[NX2-]-[NX2+]#[NX1]),$(*-[NX2]=[NX2+]=[NX1-])]", "[CX3;!R](=[OX1])[OX2;!R][CX3;!R](=[OX1])", "[CX3;!R](=[OX1])[NX3;!R][CX3;!R](=[OX1])", "[CX4]-[NX2]=[NX2]-[CX4]", "[OX2]-[OX2]", "[OX2H][cX3]:[c]", "[CX3]1(=[OX1])[CX3]=[CX3][CX3](=[OX1])[CX3]=[CX2]1"], "1004": ["[CX4]-[I,Br]", "[CX3H1]=[O]", "[$(*-[NX2-]-[NX2+]#[NX1]),$(*-[NX2]=[NX2+]=[NX1-])]", "[CX3;!R](=[OX1])[OX2;!R][CX3;!R](=[OX1])", "[CX3;!R](=[OX1])[NX3;!R][CX3;!R](=[OX1])", "[CX4]-[NX2]=[NX2]-[CX4]", "[OX2]-[OX2]", "[OX2H][cX3]:[c]", "[CX3]1(=[OX1])[CX3]=[CX3][CX3](=[OX1])[CX3]=[CX2]1"], "1005": ["[A+1]", "[A+2]", "[A+3]", "[A+4]", "[a+1]", "[a+2]", "[a+3]", "[a+4]", "[A-1]", "[A-2]", "[A-3]", "[A-4]", "[a-1]", "[a-2]", "[a-3]", "[a-4]", "[CX3]=[CX3]-[CX3]=[CX3]", "[CX2]#[CX2]", "[Li,Na,K,Rb,Cs]", "[Be,Mg,Ca,Sr,Ba]", "[Zn,Cd,Hg]", "[B,b]", "[c]", "[N,n]", "[O,o]", "[Al]", "[P,p]", "[S,s]", "[Ga]", "[Ge]", "[As,as]", "[Se,se]", "[In]", "[Sn,sn]", "[Sb,sb]", "[Te,te]", "[Tl]", "[Pb,pb]", "[Bi]", "[Br,I]", "[si]", "[Si][F,Cl]", "[Si;H4,H3,H2,H1]"], "1006": ["[CX4]-[I,Br]", "[CX3H1]=[O]", "[$(*-[NX2-]-[NX2+]#[NX1]),$(*-[NX2]=[NX2+]=[NX1-])]", "[CX3;!R](=[OX1])[OX2;!R][CX3;!R](=[OX1])", "[CX3;!R](=[OX1])[NX3;!R][CX3;!R](=[OX1])", "[CX4]-[NX2]=[NX2]-[CX4]", "[OX2]-[OX2]", "[OX2H][cX3]:[c]", "[CX3]1(=[OX1])[CX3]=[CX3][CX3](=[OX1])[CX3]=[CX2]1"], "1007": ["[CX4]-[I,Br]", "[CX3H1]=[O]", "[$(*-[NX2-]-[NX2+]#[NX1]),$(*-[NX2]=[NX2+]=[NX1-])]", "[CX3;!R](=[OX1])[OX2;!R][CX3;!R](=[OX1])", "[CX3;!R](=[OX1])[NX3;!R][CX3;!R](=[OX1])"], "1020": ["[CX3]=[CX3](-[CX3;!R]=[CX3;!R])=[CX3]", "[CX3]=[CX3]-[CX3]=[CX3]-[CX3]=[CX3]", 
"[CX4]-[I,Br]", "[CX3H1]=[O]", "[$(*-[NX2-]-[NX2+]#[NX1]),$(*-[NX2]=[NX2+]=[NX1-])]", "[CX3;!R](=[OX1])[OX2;!R][CX3;!R](=[OX1])", "[CX3;!R](=[OX1])[NX3;!R][CX3;!R](=[OX1])", "[CX4]-[NX2]=[NX2]-[CX4]", "[OX2]-[OX2]", "[OX2H][cX3]:[c]", "[CX3]1(=[OX1])[CX3]=[CX3][CX3](=[OX1])[CX3]=[CX2]1"], "1030": ["[NX3;H3,H2,H1;!$(NC=O)]", "[n]", "[CX4;H2,H1][OX2;R][CX4]", "[CX4;H2,H1][SX2;R][CX4]", "[OX2][CX4,c]([C,c])([C,c])[CX4,c]", "[CX3H1]=[O]", "[$(*-[NX2-]-[NX2+]#[NX1]),$(*-[NX2]=[NX2+]=[NX1-])]", "[CX3;!R](=[OX1])[OX2;!R][CX3;!R](=[OX1])", "[CX3;!R](=[OX1])[NX3;!R][CX3;!R](=[OX1])"], "1031": ["[A+1]", "[A+2]", "[A+3]", "[A+4]", "[a+1]", "[a+2]", "[a+3]", "[a+4]", "[A-1]", "[A-2]", "[A-3]", "[A-4]", "[a-1]", "[a-2]", "[a-3]", "[a-4]", "[CX3]=[CX3]-[CX3]=[CX3]", "[CX2]#[CX2]", "[Li,Na,K,Rb,Cs]", "[Be,Mg,Ca,Sr,Ba]", "[Zn,Cd,Hg]", "[B,b]", "[c]", "[N,n]", "[O,o]", "[Al]", "[P,p]", "[S,s]", "[Ga]", "[Ge]", "[As,as]", "[Se,se]", "[In]", "[Sn,sn]", "[Sb,sb]", "[Te,te]", "[Tl]", "[Pb,pb]", "[Bi]", "[F,Cl,Br,I]", "[si]", "[Si][F,Cl]", "[Si;H4,H3,H2,H1]"], "1050": ["[A+1]", "[A+2]", "[A+3]", "[A+4]", "[a+1]", "[a+2]", "[a+3]", "[a+4]", "[A-1]", "[A-2]", "[A-3]", "[A-4]", "[a-1]", "[a-2]", "[a-3]", "[a-4]", "[CX3]=[CX3]-[CX3]=[CX3]", "[CX2]#[CX2]", "[Li,Na,K,Rb,Cs]", "[Be,Mg,Ca,Sr,Ba]", "[Zn,Cd,Hg]", "[B,b]", "[c]", "[N,n]", "[O,o]", "[Al]", "[P,p]", "[S,s]", "[Ga]", "[Ge]", "[As,as]", "[Se,se]", "[In]", "[Sn,sn]", "[Sb,sb]", "[Te,te]", "[Tl]", "[Pb,pb]", "[Bi]", "[Cl,Br,I]", "[si]", "[Si][F,Cl]", "[Si;H4,H3,H2,H1]"], "1052": ["[A+1]", "[A+2]", "[A+3]", "[A+4]", "[a+1]", "[a+2]", "[a+3]", "[a+4]", "[A-1]", "[A-2]", "[A-3]", "[A-4]", "[a-1]", "[a-2]", "[a-3]", "[a-4]", "[CX3]=[CX3]-[CX3]=[CX3]", "[CX2]#[CX2]", "[Li,Na,K,Rb,Cs]", "[Be,Mg,Ca,Sr,Ba]", "[Zn,Cd,Hg]", "[B,b]", "[c]", "[N,n]", "[O,o]", "[Al]", "[P,p]", "[S,s]", "[Ga]", "[Ge]", "[As,as]", "[Se,se]", "[In]", "[Sn,sn]", "[Sb,sb]", "[Te,te]", "[Tl]", "[Pb,pb]", "[Bi]", "[Cl,Br,I]", "[si]", "[Si][F,Cl]", "[Si;H4,H3,H2,H1]"]} 
-------------------------------------------------------------------------------- /src/smipoly/rules/mon_dic.json: -------------------------------------------------------------------------------- 1 | {"vinyl": 1, "epo": 2, "diepo": 51, "cOle": 3, "lactone": 4, "lactam": 5, "hydCOOH": 6, "aminCOOH": 7, "hindPhenol": 8, "cAnhyd": 9, "CO": 10, "HCHO": 11, "sfonediX": 12, "BzodiF": 13, "diCOOH": 52, "diol": 53, "diamin": 54, "diNCO": 55, "dicAnhyd": 56, "pridiamin": 57, "diol_b": 58, "acryl": 1001, "bEWole": 1002, "styryl": 1003, "allyl": 1004, "haloCH": 1005, "vinylester": 1006, "malei": 1007, "conjdiene": 1020, "vinylether": 1030, "tertcatCH": 1031, "cycCH": 1050, "aliphCH": 1052} -------------------------------------------------------------------------------- /src/smipoly/rules/mon_dic_inv.json: -------------------------------------------------------------------------------- 1 | {"1": "vinyl", "2": "epo", "51": "diepo", "3": "cOle", "4": "lactone", "5": "lactam", "6": "hydCOOH", "7": "aminCOOH", "8": "hindPhenol", "9": "cAnhyd", "10": "CO", "11": "HCHO", "12": "sfonediX", "13": "BzodiF", "52": "diCOOH", "53": "diol", "54": "diamin", "55": "diNCO", "56": "dicAnhyd", "57": "pridiamin", "58": "diol_b", "1001": "acryl", "1002": "bEWole", "1003": "styryl", "1004": "allyl", "1005": "haloCH", "1006": "vinylester", "1007": "malei", "1020": "conjdiene", "1030": "vinylether", "1031": "tertcatCH", "1050": "cycCH", "1052": "aliphCH"} -------------------------------------------------------------------------------- /src/smipoly/rules/mon_lst.json: -------------------------------------------------------------------------------- 1 | {"0": [], "1": ["[CX3H2]=[CX3]", "[CX3](F)(F)=[CX3]", "[CX3;H1](F)=[CX3]", "[CX3](Cl)(Cl)=[CX3]", "[CX3;H1](Cl)=[CX3]", "[CX3](Cl)(F)=[CX3]"], "2": ["[CX4H2]1[O][CX4]1", "[c,CX4;R]-[CX4H1]1[O][CX4]1-[c,CX4;R]", "[CX4H1]1([F,Cl])[O][CX4]1", "[CX4]1([F,Cl])([F,Cl])[O][CX4]1", "[c,CX4;R]-[CX4]1([F,Cl])[O][CX4]1-[c,CX4;R]"], "51": ["[CX4H2]1[O][CX4]1", 
"[c,CX4;R]-[CX4H1]1[O][CX4]1-[c,CX4;R]", "[CX4H1]1([F,Cl])[O][CX4]1", "[CX4]1([F,Cl])([F,Cl])[O][CX4]1", "[c,CX4;R]-[CX4]1([F,Cl])[O][CX4]1-[c,CX4;R]"], "3": ["[CX3;H1;R]=[CX3;H1;R]", "[CX3;H1;R]=[CX3;H0;R]"], "4": ["[C;R][OX2;R][CX3;R](=[OX1])[C;R]", "[c][OX2;R][CX3;R](=[OX1])[C;R]", "[OX2;R][CX3;R](=[OX1])[C;R][c]"], "5": ["[C;R][NX3;H1;R][CX3;R](=[OX1])[C;R][C;R]", "[c][NX3;H1;R][CX3;R](=[OX1])[C;R]", "[NX3;H1;R][CX3;R](=[OX1])[C;R][c]"], "6": ["[O&X2;H1;!$(OC=*)][C].[CX3](=[O])[OX2H1]", "[O&X2;H1;!$(OC=*)][c].[CX3](=[O])[OX2H1]"], "7": ["[N&X3;H2,H1;!$(N[C,S]=*)][C].[CX3](=[O])[OX2H1]", "[N&X3;H2,H1;!$(N[C,S]=*)][c].[CX3](=[O])[OX2H1]"], "8": ["[c]1([OX2H1])[c]([C])[c][cX3H1][c][c]1([C])"], "9": ["[C;R][C;R;X3](=[O])[O;R][C;R;X3](=[O])[C;R]", "[c][CX3,c;R](=[O])[O,o;R][CX3,c;R](=[O])[c]"], "10": ["[C-]#[O+]"], "11": ["[CX3;H2]=[OX1]"], "52": ["[CX4H2][C](=[O])[OH1]", "[CX4H1][C](=[O])[OH1]", "[c][C](=O)[OH1]", "[CX4H2][C](=[O])[Cl,Br]", "[CX4H1][C](=[O])[Cl,Br]", "[c][C](=O)[Cl,Br]"], "53": ["[CX4H1][OX2,SX2;H1]", "[CX4H2][OX2,SX2;H1]", "[c][OX2,SX2;H1]", "[CX4;H2,H1,c]([OX2,SX2;H1])[OX2,SX2;H1]"], "54": ["[C][N&X3;H2;!$(N[C,S]=*)]", "[c][N&X3;H2;!$(N[C,S]=*)]", "[C,c][N&X3;H1;!$(N[C,S]=*)][C,c]", "[N&X3;H2;!$(N[C,S]=*)][C][N&X3;H2;!$(N[C,S]=*)]"], "55": ["[C]-[NX2]=[CX2]=[O,S;X1]", "[c]-[NX2]=[CX2]=[O,S;X1]"], "56": ["[C;R][C;R;X3](=[OX1])[OX2;R][C;R;X3](=[OX1])[C;R]", "[c][CX3,c;R](=[OX1])[OX2,o;R][CX3,c;R](=[OX1])[c]"], "57": ["[C][N&X3;H2;!$(N[C,S]=*)]", "[c][N&X3;H2;!$(N[C,S]=*)]"], "12": ["[c]1[c][c]([F,Cl,Br,I])[c][c][c]1[SX4](=[OX1])(=[OX1])[c]2[c][c][c]([F,Cl,Br,I])[c][c]2"], "13": ["[c]1[c][c]([F])[c][c][c]1[CX3](=[OX1])[c]2[c][c][c]([F])[c][c]2"], "58": ["[CX4H1][OX2;H1]", "[CX4H2][OX2;H1]", "[c][OX2;H1]"], "1001": ["[CX3H2;!R:1]=[CX3H1;!R:2][CX3;!R](=[OX1])[OX2,SX2,NX3:3]", "[CX3H2;!R:1]=[CX3;!R:2]([CX4;H3])[CX3;!R](=[OX1])[OX2,SX2,NX3:3]", "[CX3H2;!R:1]=[CX3;!R:2]([F,Cl,Br:4])[CX3;!R](=[OX1])[OX2,SX2,NX3:3]", 
"[CX3H2;!R:1]=[CX3;!R:2]([CX4;H2][F,Cl:4])[CX3;!R](=[OX1])[OX2,SX2,NX3:3]", "[CX3H2;!R:1]=[CX3;!R:2]([CX4;H1]([F,Cl:4])[F,Cl:5])[CX3;!R](=[OX1])[OX2,SX2,NX3:3]", "[CX3H2;!R:1]=[CX3;!R:2]([CX4;H0]([F])([F])[F])[CX3;!R](=[OX1])[OX2,SX2,NX3:3]", "[CX3H2;!R:1]=[CX3;!R:2]([CX4;H2][c:4])[CX3;!R](=[OX1])[OX2,SX2,NX3:3]", "[CX3H2;!R:1]=[CX3;!R:2]([CX4;H1]([CX4;H3])[CX4;H3])[CX3;!R](=[OX1])[OX2,SX2,NX3:3]", "[CX3H2;!R:1]=[CX3;!R:2]([CX4;H1]([CX4;H3])[CX4;H2:4][CX3;!R](=[OX1])[OX2,SX2,NX3:4])[CX3;!R](=[OX1])[OX2,SX2,NX3:3]", "[CX3H2;!R:1]=[CX3;!R:2]([CX2]#[NX1])[CX3;!R](=[OX1])[OX2,SX2,NX3:3]", "[CX3H2;!R:1]=[CX3;!R:2]([NX2]=[CX2]=[OX1])[CX3;!R](=[OX1])[OX2,SX2,NX3:3]", "[CX3H2;!R:1]=[CX3;!R:2]([SX4](=[OX1])(=[OX1])[OX2:4])[CX3;!R](=[OX1])[OX2,SX2,NX3:3]", "[CX3H2;!R:1]=[CX3;!R:2]([PX4](=[OX1])(=[OX1])[OX2:4])[CX3;!R](=[OX1])[OX2,SX2,NX3:3]", "[CX3;H0;!R:1]([F])([F])=[CX3H1;!R:2][CX3;!R](=[OX1])[OX2,SX2,NX3:3]", "[CX3;H0;!R:1]([F])([F])=[CX3;!R:2]([CX4;H3])[CX3;!R](=[OX1])[OX2,SX2,NX3:3]", "[CX3;H0;!R:1]([F])([F])=[CX3;!R:2]([F,Cl,Br:4])[CX3;!R](=[OX1])[OX2,SX2,NX3:3]", "[CX3;H0;!R:1]([F])([F])=[CX3;!R:2]([CX4;H2][F,Cl:4])[CX3;!R](=[OX1])[OX2,SX2,NX3:3]", "[CX3;H0;!R:1]([F])([F])=[CX3;!R:2]([CX4;H1]([F,Cl:4])[F,Cl:5])[CX3;!R](=[OX1])[OX2,SX2,NX3:3]", "[CX3;H0;!R:1]([F])([F])=[CX3;!R:2]([CX4;H0]([F])([F])[F])[CX3;!R](=[OX1])[OX2,SX2,NX3:3]", "[CX3;H0;!R:1]([F])([F])=[CX3;!R:2]([CX4;H2][c:4])[CX3;!R](=[OX1])[OX2,SX2,NX3:3]", "[CX3;H0;!R:1]([F])([F])=[CX3;!R:2]([CX4;H1]([CX4;H3])[CX4;H3])[CX3;!R](=[OX1])[OX2,SX2,NX3:3]", "[CX3;H0;!R:1]([F])([F])=[CX3;!R:2]([CX4;H1]([CX4;H3])[CX4;H2:4][CX3;!R](=[OX1])[OX2,SX2,NX3:4])[CX3;!R](=[OX1])[OX2,SX2,NX3:3]", "[CX3;H0;!R:1]([F])([F])=[CX3;!R:2]([CX2]#[NX1])[CX3;!R](=[OX1])[OX2,SX2,NX3:3]", "[CX3;H0;!R:1]([F])([F])=[CX3;!R:2]([NX2]=[CX2]=[OX1])[CX3;!R](=[OX1])[OX2,SX2,NX3:3]", "[CX3;H0;!R:1]([F])([F])=[CX3;!R:2]([SX4](=[OX1])(=[OX1])[OX2:4])[CX3;!R](=[OX1])[OX2,SX2,NX3:3]", 
"[CX3;H0;!R:1]([F])([F])=[CX3;!R:2]([PX4](=[OX1])(=[OX1])[OX2:4])[CX3;!R](=[OX1])[OX2,SX2,NX3:3]", "[CX3;H1;!R:1]([F])=[CX3H1;!R:2][CX3;!R](=[OX1])[OX2,SX2,NX3:3]", "[CX3;H1;!R:1]([F])=[CX3;!R:2]([CX4;H3])[CX3;!R](=[OX1])[OX2,SX2,NX3:3]", "[CX3;H1;!R:1]([F])=[CX3;!R:2]([F,Cl,Br:4])[CX3;!R](=[OX1])[OX2,SX2,NX3:3]", "[CX3;H1;!R:1]([F])=[CX3;!R:2]([CX4;H2][F,Cl:4])[CX3;!R](=[OX1])[OX2,SX2,NX3:3]", "[CX3;H1;!R:1]([F])=[CX3;!R:2]([CX4;H1]([F,Cl:4])[F,Cl:5])[CX3;!R](=[OX1])[OX2,SX2,NX3:3]", "[CX3;H1;!R:1]([F])=[CX3;!R:2]([CX4;H0]([F])([F])[F])[CX3;!R](=[OX1])[OX2,SX2,NX3:3]", "[CX3;H1;!R:1]([F])=[CX3;!R:2]([CX4;H2][c:4])[CX3;!R](=[OX1])[OX2,SX2,NX3:3]", "[CX3;H1;!R:1]([F])=[CX3;!R:2]([CX4;H1]([CX4;H3])[CX4;H3])[CX3;!R](=[OX1])[OX2,SX2,NX3:3]", "[CX3;H1;!R:1]([F])=[CX3;!R:2]([CX4;H1]([CX4;H3])[CX4;H2:4][CX3;!R](=[OX1])[OX2,SX2,NX3:4])[CX3;!R](=[OX1])[OX2,SX2,NX3:3]", "[CX3;H1;!R:1]([F])=[CX3;!R:2]([CX2]#[NX1])[CX3;!R](=[OX1])[OX2,SX2,NX3:3]", "[CX3;H1;!R:1]([F])=[CX3;!R:2]([NX2]=[CX2]=[OX1])[CX3;!R](=[OX1])[OX2,SX2,NX3:3]", "[CX3;H1;!R:1]([F])=[CX3;!R:2]([SX4](=[OX1])(=[OX1])[OX2:4])[CX3;!R](=[OX1])[OX2,SX2,NX3:3]", "[CX3;H1;!R:1]([F])=[CX3;!R:2]([PX4](=[OX1])(=[OX1])[OX2:4])[CX3;!R](=[OX1])[OX2,SX2,NX3:3]"], "1002": ["[CX3H2;!R:1]=[CX3H1;!R:2][CX2]#[NX1]", "[CX3H2;!R:1]=[CX3H1;!R:2][NX2]=[CX2]=[OX1]", "[CX3H2;!R:1]=[CX3H1;!R:2][SX4](=[OX1])(=[OX1])[OX2:3]", "[CX3H2;!R:1]=[CX3H1;!R:2][PX4](=[OX1])(=[OX1])[OX2:3]", "[CX3H2;!R:1]=[CX3;!R:2]([CX4;H3])[CX2]#[NX1]", "[CX3H2;!R:1]=[CX3;!R:2]([CX4;H3])[NX2]=[CX2]=[OX1]", "[CX3H2;!R:1]=[CX3;!R:2]([CX4;H3])[SX4](=[OX1])(=[OX1])[OX2:3]", "[CX3H2;!R:1]=[CX3;!R:2]([CX4;H3])[PX4](=[OX1])(=[OX1])[OX2:3]", "[CX3H2;!R:1]=[CX3;!R:2]([F,Cl,Br:4])[CX2]#[NX1]", "[CX3H2;!R:1]=[CX3;!R:2]([F,Cl,Br:4])[NX2]=[CX2]=[OX1]", "[CX3H2;!R:1]=[CX3;!R:2]([F,Cl,Br:4])[SX4](=[OX1])(=[OX1])[OX2:3]", "[CX3H2;!R:1]=[CX3;!R:2]([F,Cl,Br:4])[PX4](=[OX1])(=[OX1])[OX2:3]", "[CX3H2;!R:1]=[CX3;!R:2]([CX4;H2][F,Cl:4])[CX2]#[NX1]", 
"[CX3H2;!R:1]=[CX3;!R:2]([CX4;H2][F,Cl:4])[NX2]=[CX2]=[OX1]", "[CX3H2;!R:1]=[CX3;!R:2]([CX4;H2][F,Cl:4])[SX4](=[OX1])(=[OX1])[OX2:3]", "[CX3H2;!R:1]=[CX3;!R:2]([CX4;H2][F,Cl:4])[PX4](=[OX1])(=[OX1])[OX2:3]", "[CX3H2;!R:1]=[CX3;!R:2]([CX4;H1]([F,Cl:4])[F,Cl:5])[CX2]#[NX1]", "[CX3H2;!R:1]=[CX3;!R:2]([CX4;H1]([F,Cl:4])[F,Cl:5])[NX2]=[CX2]=[OX1]", "[CX3H2;!R:1]=[CX3;!R:2]([CX4;H1]([F,Cl:4])[F,Cl:5])[SX4](=[OX1])(=[OX1])[OX2:3]", "[CX3H2;!R:1]=[CX3;!R:2]([CX4;H1]([F,Cl:4])[F,Cl:5])[PX4](=[OX1])(=[OX1])[OX2:3]", "[CX3H2;!R:1]=[CX3;!R:2]([CX4;H0]([F])([F])[F])[CX2]#[NX1]", "[CX3H2;!R:1]=[CX3;!R:2]([CX4;H0]([F])([F])[F])[NX2]=[CX2]=[OX1]", "[CX3H2;!R:1]=[CX3;!R:2]([CX4;H0]([F])([F])[F])[SX4](=[OX1])(=[OX1])[OX2:3]", "[CX3H2;!R:1]=[CX3;!R:2]([CX4;H0]([F])([F])[F])[PX4](=[OX1])(=[OX1])[OX2:3]", "[CX3H2;!R:1]=[CX3;!R:2]([CX4;H2][c:4])[CX2]#[NX1]", "[CX3H2;!R:1]=[CX3;!R:2]([CX4;H2][c:4])[NX2]=[CX2]=[OX1]", "[CX3H2;!R:1]=[CX3;!R:2]([CX4;H2][c:4])[SX4](=[OX1])(=[OX1])[OX2:3]", "[CX3H2;!R:1]=[CX3;!R:2]([CX4;H2][c:4])[PX4](=[OX1])(=[OX1])[OX2:3]", "[CX3H2;!R:1]=[CX3;!R:2]([CX4;H1]([CX4;H3])[CX4;H3])[CX2]#[NX1]", "[CX3H2;!R:1]=[CX3;!R:2]([CX4;H1]([CX4;H3])[CX4;H3])[NX2]=[CX2]=[OX1]", "[CX3H2;!R:1]=[CX3;!R:2]([CX4;H1]([CX4;H3])[CX4;H3])[SX4](=[OX1])(=[OX1])[OX2:3]", "[CX3H2;!R:1]=[CX3;!R:2]([CX4;H1]([CX4;H3])[CX4;H3])[PX4](=[OX1])(=[OX1])[OX2:3]", "[CX3H2;!R:1]=[CX3;!R:2]([CX4;H1]([CX4;H3])[CX4;H2:4][CX3;!R](=[OX1])[OX2,SX2,NX3:4])[CX2]#[NX1]", "[CX3H2;!R:1]=[CX3;!R:2]([CX4;H1]([CX4;H3])[CX4;H2:4][CX3;!R](=[OX1])[OX2,SX2,NX3:4])[NX2]=[CX2]=[OX1]", "[CX3H2;!R:1]=[CX3;!R:2]([CX4;H1]([CX4;H3])[CX4;H2:4][CX3;!R](=[OX1])[OX2,SX2,NX3:4])[SX4](=[OX1])(=[OX1])[OX2:3]", "[CX3H2;!R:1]=[CX3;!R:2]([CX4;H1]([CX4;H3])[CX4;H2:4][CX3;!R](=[OX1])[OX2,SX2,NX3:4])[PX4](=[OX1])(=[OX1])[OX2:3]", "[CX3H2;!R:1]=[CX3;!R:2]([CX2]#[NX1])[CX2]#[NX1]", "[CX3H2;!R:1]=[CX3;!R:2]([CX2]#[NX1])[NX2]=[CX2]=[OX1]", "[CX3H2;!R:1]=[CX3;!R:2]([CX2]#[NX1])[SX4](=[OX1])(=[OX1])[OX2:3]", 
"[CX3H2;!R:1]=[CX3;!R:2]([CX2]#[NX1])[PX4](=[OX1])(=[OX1])[OX2:3]", "[CX3H2;!R:1]=[CX3;!R:2]([NX2]=[CX2]=[OX1])[CX2]#[NX1]", "[CX3H2;!R:1]=[CX3;!R:2]([NX2]=[CX2]=[OX1])[NX2]=[CX2]=[OX1]", "[CX3H2;!R:1]=[CX3;!R:2]([NX2]=[CX2]=[OX1])[SX4](=[OX1])(=[OX1])[OX2:3]", "[CX3H2;!R:1]=[CX3;!R:2]([NX2]=[CX2]=[OX1])[PX4](=[OX1])(=[OX1])[OX2:3]", "[CX3H2;!R:1]=[CX3;!R:2]([SX4](=[OX1])(=[OX1])[OX2:4])[CX2]#[NX1]", "[CX3H2;!R:1]=[CX3;!R:2]([SX4](=[OX1])(=[OX1])[OX2:4])[NX2]=[CX2]=[OX1]", "[CX3H2;!R:1]=[CX3;!R:2]([SX4](=[OX1])(=[OX1])[OX2:4])[SX4](=[OX1])(=[OX1])[OX2:3]", "[CX3H2;!R:1]=[CX3;!R:2]([SX4](=[OX1])(=[OX1])[OX2:4])[PX4](=[OX1])(=[OX1])[OX2:3]", "[CX3H2;!R:1]=[CX3;!R:2]([PX4](=[OX1])(=[OX1])[OX2:4])[CX2]#[NX1]", "[CX3H2;!R:1]=[CX3;!R:2]([PX4](=[OX1])(=[OX1])[OX2:4])[NX2]=[CX2]=[OX1]", "[CX3H2;!R:1]=[CX3;!R:2]([PX4](=[OX1])(=[OX1])[OX2:4])[SX4](=[OX1])(=[OX1])[OX2:3]", "[CX3H2;!R:1]=[CX3;!R:2]([PX4](=[OX1])(=[OX1])[OX2:4])[PX4](=[OX1])(=[OX1])[OX2:3]", "[CX3;H0;!R:1]([F])([F])=[CX3H1;!R:2][CX2]#[NX1]", "[CX3;H0;!R:1]([F])([F])=[CX3H1;!R:2][NX2]=[CX2]=[OX1]", "[CX3;H0;!R:1]([F])([F])=[CX3H1;!R:2][SX4](=[OX1])(=[OX1])[OX2:3]", "[CX3;H0;!R:1]([F])([F])=[CX3H1;!R:2][PX4](=[OX1])(=[OX1])[OX2:3]", "[CX3;H0;!R:1]([F])([F])=[CX3;!R:2]([CX4;H3])[CX2]#[NX1]", "[CX3;H0;!R:1]([F])([F])=[CX3;!R:2]([CX4;H3])[NX2]=[CX2]=[OX1]", "[CX3;H0;!R:1]([F])([F])=[CX3;!R:2]([CX4;H3])[SX4](=[OX1])(=[OX1])[OX2:3]", "[CX3;H0;!R:1]([F])([F])=[CX3;!R:2]([CX4;H3])[PX4](=[OX1])(=[OX1])[OX2:3]", "[CX3;H0;!R:1]([F])([F])=[CX3;!R:2]([F,Cl,Br:4])[CX2]#[NX1]", "[CX3;H0;!R:1]([F])([F])=[CX3;!R:2]([F,Cl,Br:4])[NX2]=[CX2]=[OX1]", "[CX3;H0;!R:1]([F])([F])=[CX3;!R:2]([F,Cl,Br:4])[SX4](=[OX1])(=[OX1])[OX2:3]", "[CX3;H0;!R:1]([F])([F])=[CX3;!R:2]([F,Cl,Br:4])[PX4](=[OX1])(=[OX1])[OX2:3]", "[CX3;H0;!R:1]([F])([F])=[CX3;!R:2]([CX4;H2][F,Cl:4])[CX2]#[NX1]", "[CX3;H0;!R:1]([F])([F])=[CX3;!R:2]([CX4;H2][F,Cl:4])[NX2]=[CX2]=[OX1]", "[CX3;H0;!R:1]([F])([F])=[CX3;!R:2]([CX4;H2][F,Cl:4])[SX4](=[OX1])(=[OX1])[OX2:3]", 
"[CX3;H0;!R:1]([F])([F])=[CX3;!R:2]([CX4;H2][F,Cl:4])[PX4](=[OX1])(=[OX1])[OX2:3]", "[CX3;H0;!R:1]([F])([F])=[CX3;!R:2]([CX4;H1]([F,Cl:4])[F,Cl:5])[CX2]#[NX1]", "[CX3;H0;!R:1]([F])([F])=[CX3;!R:2]([CX4;H1]([F,Cl:4])[F,Cl:5])[NX2]=[CX2]=[OX1]", "[CX3;H0;!R:1]([F])([F])=[CX3;!R:2]([CX4;H1]([F,Cl:4])[F,Cl:5])[SX4](=[OX1])(=[OX1])[OX2:3]", "[CX3;H0;!R:1]([F])([F])=[CX3;!R:2]([CX4;H1]([F,Cl:4])[F,Cl:5])[PX4](=[OX1])(=[OX1])[OX2:3]", "[CX3;H0;!R:1]([F])([F])=[CX3;!R:2]([CX4;H0]([F])([F])[F])[CX2]#[NX1]", "[CX3;H0;!R:1]([F])([F])=[CX3;!R:2]([CX4;H0]([F])([F])[F])[NX2]=[CX2]=[OX1]", "[CX3;H0;!R:1]([F])([F])=[CX3;!R:2]([CX4;H0]([F])([F])[F])[SX4](=[OX1])(=[OX1])[OX2:3]", "[CX3;H0;!R:1]([F])([F])=[CX3;!R:2]([CX4;H0]([F])([F])[F])[PX4](=[OX1])(=[OX1])[OX2:3]", "[CX3;H0;!R:1]([F])([F])=[CX3;!R:2]([CX4;H2][c:4])[CX2]#[NX1]", "[CX3;H0;!R:1]([F])([F])=[CX3;!R:2]([CX4;H2][c:4])[NX2]=[CX2]=[OX1]", "[CX3;H0;!R:1]([F])([F])=[CX3;!R:2]([CX4;H2][c:4])[SX4](=[OX1])(=[OX1])[OX2:3]", "[CX3;H0;!R:1]([F])([F])=[CX3;!R:2]([CX4;H2][c:4])[PX4](=[OX1])(=[OX1])[OX2:3]", "[CX3;H0;!R:1]([F])([F])=[CX3;!R:2]([CX4;H1]([CX4;H3])[CX4;H3])[CX2]#[NX1]", "[CX3;H0;!R:1]([F])([F])=[CX3;!R:2]([CX4;H1]([CX4;H3])[CX4;H3])[NX2]=[CX2]=[OX1]", "[CX3;H0;!R:1]([F])([F])=[CX3;!R:2]([CX4;H1]([CX4;H3])[CX4;H3])[SX4](=[OX1])(=[OX1])[OX2:3]", "[CX3;H0;!R:1]([F])([F])=[CX3;!R:2]([CX4;H1]([CX4;H3])[CX4;H3])[PX4](=[OX1])(=[OX1])[OX2:3]", "[CX3;H0;!R:1]([F])([F])=[CX3;!R:2]([CX4;H1]([CX4;H3])[CX4;H2:4][CX3;!R](=[OX1])[OX2,SX2,NX3:4])[CX2]#[NX1]", "[CX3;H0;!R:1]([F])([F])=[CX3;!R:2]([CX4;H1]([CX4;H3])[CX4;H2:4][CX3;!R](=[OX1])[OX2,SX2,NX3:4])[NX2]=[CX2]=[OX1]", "[CX3;H0;!R:1]([F])([F])=[CX3;!R:2]([CX4;H1]([CX4;H3])[CX4;H2:4][CX3;!R](=[OX1])[OX2,SX2,NX3:4])[SX4](=[OX1])(=[OX1])[OX2:3]", "[CX3;H0;!R:1]([F])([F])=[CX3;!R:2]([CX4;H1]([CX4;H3])[CX4;H2:4][CX3;!R](=[OX1])[OX2,SX2,NX3:4])[PX4](=[OX1])(=[OX1])[OX2:3]", "[CX3;H0;!R:1]([F])([F])=[CX3;!R:2]([CX2]#[NX1])[CX2]#[NX1]", 
"[CX3;H0;!R:1]([F])([F])=[CX3;!R:2]([CX2]#[NX1])[NX2]=[CX2]=[OX1]", "[CX3;H0;!R:1]([F])([F])=[CX3;!R:2]([CX2]#[NX1])[SX4](=[OX1])(=[OX1])[OX2:3]", "[CX3;H0;!R:1]([F])([F])=[CX3;!R:2]([CX2]#[NX1])[PX4](=[OX1])(=[OX1])[OX2:3]", "[CX3;H0;!R:1]([F])([F])=[CX3;!R:2]([NX2]=[CX2]=[OX1])[CX2]#[NX1]", "[CX3;H0;!R:1]([F])([F])=[CX3;!R:2]([NX2]=[CX2]=[OX1])[NX2]=[CX2]=[OX1]", "[CX3;H0;!R:1]([F])([F])=[CX3;!R:2]([NX2]=[CX2]=[OX1])[SX4](=[OX1])(=[OX1])[OX2:3]", "[CX3;H0;!R:1]([F])([F])=[CX3;!R:2]([NX2]=[CX2]=[OX1])[PX4](=[OX1])(=[OX1])[OX2:3]", "[CX3;H0;!R:1]([F])([F])=[CX3;!R:2]([SX4](=[OX1])(=[OX1])[OX2:4])[CX2]#[NX1]", "[CX3;H0;!R:1]([F])([F])=[CX3;!R:2]([SX4](=[OX1])(=[OX1])[OX2:4])[NX2]=[CX2]=[OX1]", "[CX3;H0;!R:1]([F])([F])=[CX3;!R:2]([SX4](=[OX1])(=[OX1])[OX2:4])[SX4](=[OX1])(=[OX1])[OX2:3]", "[CX3;H0;!R:1]([F])([F])=[CX3;!R:2]([SX4](=[OX1])(=[OX1])[OX2:4])[PX4](=[OX1])(=[OX1])[OX2:3]", "[CX3;H0;!R:1]([F])([F])=[CX3;!R:2]([PX4](=[OX1])(=[OX1])[OX2:4])[CX2]#[NX1]", "[CX3;H0;!R:1]([F])([F])=[CX3;!R:2]([PX4](=[OX1])(=[OX1])[OX2:4])[NX2]=[CX2]=[OX1]", "[CX3;H0;!R:1]([F])([F])=[CX3;!R:2]([PX4](=[OX1])(=[OX1])[OX2:4])[SX4](=[OX1])(=[OX1])[OX2:3]", "[CX3;H0;!R:1]([F])([F])=[CX3;!R:2]([PX4](=[OX1])(=[OX1])[OX2:4])[PX4](=[OX1])(=[OX1])[OX2:3]", "[CX3;H1;!R:1]([F])=[CX3H1;!R:2][CX2]#[NX1]", "[CX3;H1;!R:1]([F])=[CX3H1;!R:2][NX2]=[CX2]=[OX1]", "[CX3;H1;!R:1]([F])=[CX3H1;!R:2][SX4](=[OX1])(=[OX1])[OX2:3]", "[CX3;H1;!R:1]([F])=[CX3H1;!R:2][PX4](=[OX1])(=[OX1])[OX2:3]", "[CX3;H1;!R:1]([F])=[CX3;!R:2]([CX4;H3])[CX2]#[NX1]", "[CX3;H1;!R:1]([F])=[CX3;!R:2]([CX4;H3])[NX2]=[CX2]=[OX1]", "[CX3;H1;!R:1]([F])=[CX3;!R:2]([CX4;H3])[SX4](=[OX1])(=[OX1])[OX2:3]", "[CX3;H1;!R:1]([F])=[CX3;!R:2]([CX4;H3])[PX4](=[OX1])(=[OX1])[OX2:3]", "[CX3;H1;!R:1]([F])=[CX3;!R:2]([F,Cl,Br:4])[CX2]#[NX1]", "[CX3;H1;!R:1]([F])=[CX3;!R:2]([F,Cl,Br:4])[NX2]=[CX2]=[OX1]", "[CX3;H1;!R:1]([F])=[CX3;!R:2]([F,Cl,Br:4])[SX4](=[OX1])(=[OX1])[OX2:3]", "[CX3;H1;!R:1]([F])=[CX3;!R:2]([F,Cl,Br:4])[PX4](=[OX1])(=[OX1])[OX2:3]", 
"[CX3;H1;!R:1]([F])=[CX3;!R:2]([CX4;H2][F,Cl:4])[CX2]#[NX1]", "[CX3;H1;!R:1]([F])=[CX3;!R:2]([CX4;H2][F,Cl:4])[NX2]=[CX2]=[OX1]", "[CX3;H1;!R:1]([F])=[CX3;!R:2]([CX4;H2][F,Cl:4])[SX4](=[OX1])(=[OX1])[OX2:3]", "[CX3;H1;!R:1]([F])=[CX3;!R:2]([CX4;H2][F,Cl:4])[PX4](=[OX1])(=[OX1])[OX2:3]", "[CX3;H1;!R:1]([F])=[CX3;!R:2]([CX4;H1]([F,Cl:4])[F,Cl:5])[CX2]#[NX1]", "[CX3;H1;!R:1]([F])=[CX3;!R:2]([CX4;H1]([F,Cl:4])[F,Cl:5])[NX2]=[CX2]=[OX1]", "[CX3;H1;!R:1]([F])=[CX3;!R:2]([CX4;H1]([F,Cl:4])[F,Cl:5])[SX4](=[OX1])(=[OX1])[OX2:3]", "[CX3;H1;!R:1]([F])=[CX3;!R:2]([CX4;H1]([F,Cl:4])[F,Cl:5])[PX4](=[OX1])(=[OX1])[OX2:3]", "[CX3;H1;!R:1]([F])=[CX3;!R:2]([CX4;H0]([F])([F])[F])[CX2]#[NX1]", "[CX3;H1;!R:1]([F])=[CX3;!R:2]([CX4;H0]([F])([F])[F])[NX2]=[CX2]=[OX1]", "[CX3;H1;!R:1]([F])=[CX3;!R:2]([CX4;H0]([F])([F])[F])[SX4](=[OX1])(=[OX1])[OX2:3]", "[CX3;H1;!R:1]([F])=[CX3;!R:2]([CX4;H0]([F])([F])[F])[PX4](=[OX1])(=[OX1])[OX2:3]", "[CX3;H1;!R:1]([F])=[CX3;!R:2]([CX4;H2][c:4])[CX2]#[NX1]", "[CX3;H1;!R:1]([F])=[CX3;!R:2]([CX4;H2][c:4])[NX2]=[CX2]=[OX1]", "[CX3;H1;!R:1]([F])=[CX3;!R:2]([CX4;H2][c:4])[SX4](=[OX1])(=[OX1])[OX2:3]", "[CX3;H1;!R:1]([F])=[CX3;!R:2]([CX4;H2][c:4])[PX4](=[OX1])(=[OX1])[OX2:3]", "[CX3;H1;!R:1]([F])=[CX3;!R:2]([CX4;H1]([CX4;H3])[CX4;H3])[CX2]#[NX1]", "[CX3;H1;!R:1]([F])=[CX3;!R:2]([CX4;H1]([CX4;H3])[CX4;H3])[NX2]=[CX2]=[OX1]", "[CX3;H1;!R:1]([F])=[CX3;!R:2]([CX4;H1]([CX4;H3])[CX4;H3])[SX4](=[OX1])(=[OX1])[OX2:3]", "[CX3;H1;!R:1]([F])=[CX3;!R:2]([CX4;H1]([CX4;H3])[CX4;H3])[PX4](=[OX1])(=[OX1])[OX2:3]", "[CX3;H1;!R:1]([F])=[CX3;!R:2]([CX4;H1]([CX4;H3])[CX4;H2:4][CX3;!R](=[OX1])[OX2,SX2,NX3:4])[CX2]#[NX1]", "[CX3;H1;!R:1]([F])=[CX3;!R:2]([CX4;H1]([CX4;H3])[CX4;H2:4][CX3;!R](=[OX1])[OX2,SX2,NX3:4])[NX2]=[CX2]=[OX1]", "[CX3;H1;!R:1]([F])=[CX3;!R:2]([CX4;H1]([CX4;H3])[CX4;H2:4][CX3;!R](=[OX1])[OX2,SX2,NX3:4])[SX4](=[OX1])(=[OX1])[OX2:3]", "[CX3;H1;!R:1]([F])=[CX3;!R:2]([CX4;H1]([CX4;H3])[CX4;H2:4][CX3;!R](=[OX1])[OX2,SX2,NX3:4])[PX4](=[OX1])(=[OX1])[OX2:3]", 
"[CX3;H1;!R:1]([F])=[CX3;!R:2]([CX2]#[NX1])[CX2]#[NX1]", "[CX3;H1;!R:1]([F])=[CX3;!R:2]([CX2]#[NX1])[NX2]=[CX2]=[OX1]", "[CX3;H1;!R:1]([F])=[CX3;!R:2]([CX2]#[NX1])[SX4](=[OX1])(=[OX1])[OX2:3]", "[CX3;H1;!R:1]([F])=[CX3;!R:2]([CX2]#[NX1])[PX4](=[OX1])(=[OX1])[OX2:3]", "[CX3;H1;!R:1]([F])=[CX3;!R:2]([NX2]=[CX2]=[OX1])[CX2]#[NX1]", "[CX3;H1;!R:1]([F])=[CX3;!R:2]([NX2]=[CX2]=[OX1])[NX2]=[CX2]=[OX1]", "[CX3;H1;!R:1]([F])=[CX3;!R:2]([NX2]=[CX2]=[OX1])[SX4](=[OX1])(=[OX1])[OX2:3]", "[CX3;H1;!R:1]([F])=[CX3;!R:2]([NX2]=[CX2]=[OX1])[PX4](=[OX1])(=[OX1])[OX2:3]", "[CX3;H1;!R:1]([F])=[CX3;!R:2]([SX4](=[OX1])(=[OX1])[OX2:4])[CX2]#[NX1]", "[CX3;H1;!R:1]([F])=[CX3;!R:2]([SX4](=[OX1])(=[OX1])[OX2:4])[NX2]=[CX2]=[OX1]", "[CX3;H1;!R:1]([F])=[CX3;!R:2]([SX4](=[OX1])(=[OX1])[OX2:4])[SX4](=[OX1])(=[OX1])[OX2:3]", "[CX3;H1;!R:1]([F])=[CX3;!R:2]([SX4](=[OX1])(=[OX1])[OX2:4])[PX4](=[OX1])(=[OX1])[OX2:3]", "[CX3;H1;!R:1]([F])=[CX3;!R:2]([PX4](=[OX1])(=[OX1])[OX2:4])[CX2]#[NX1]", "[CX3;H1;!R:1]([F])=[CX3;!R:2]([PX4](=[OX1])(=[OX1])[OX2:4])[NX2]=[CX2]=[OX1]", "[CX3;H1;!R:1]([F])=[CX3;!R:2]([PX4](=[OX1])(=[OX1])[OX2:4])[SX4](=[OX1])(=[OX1])[OX2:3]", "[CX3;H1;!R:1]([F])=[CX3;!R:2]([PX4](=[OX1])(=[OX1])[OX2:4])[PX4](=[OX1])(=[OX1])[OX2:3]"], "1003": ["[CX3H2;!R:1]=[CX3H1;!R:2][c:3]", "[CX3H2;!R:1]=[CX3;!R:2]([CX4;H3])[c:3]", "[CX3H2;!R:1]=[CX3;!R:2]([F])[c:3]", "[CX3H2;!R:1]=[CX3;!R:2]([CX4;H2][F])[c:3]", "[CX3H2;!R:1]=[CX3;!R:2]([CX4;H1]([F])[F])[c:3]", "[CX3H2;!R:1]=[CX3;!R:2]([CX4;H0]([F])([F])[F])[c:3]", "[CX3H2;!R:1]=[CX3;!R:2]([OX2][CX4,SiX4:4])[c:3]", "[CX3;H0;!R:1]([F])([F])=[CX3H1;!R:2][c:3]", "[CX3;H0;!R:1]([F])([F])=[CX3;!R:2]([CX4;H3])[c:3]", "[CX3;H0;!R:1]([F])([F])=[CX3;!R:2]([F])[c:3]", "[CX3;H0;!R:1]([F])([F])=[CX3;!R:2]([CX4;H2][F])[c:3]", "[CX3;H0;!R:1]([F])([F])=[CX3;!R:2]([CX4;H1]([F])[F])[c:3]", "[CX3;H0;!R:1]([F])([F])=[CX3;!R:2]([CX4;H0]([F])([F])[F])[c:3]", "[CX3;H0;!R:1]([F])([F])=[CX3;!R:2]([OX2][CX4,SiX4:4])[c:3]", "[CX3;H1;!R:1]([F])=[CX3H1;!R:2][c:3]", 
"[CX3;H1;!R:1]([F])=[CX3;!R:2]([CX4;H3])[c:3]", "[CX3;H1;!R:1]([F])=[CX3;!R:2]([F])[c:3]", "[CX3;H1;!R:1]([F])=[CX3;!R:2]([CX4;H2][F])[c:3]", "[CX3;H1;!R:1]([F])=[CX3;!R:2]([CX4;H1]([F])[F])[c:3]", "[CX3;H1;!R:1]([F])=[CX3;!R:2]([CX4;H0]([F])([F])[F])[c:3]", "[CX3;H1;!R:1]([F])=[CX3;!R:2]([OX2][CX4,SiX4:4])[c:3]", "[CX3H2;!R:1]=[CX3H1;!R:2][n:3]", "[CX3H2;!R:1]=[CX3;!R:2]([CX4;H3])[n:3]", "[CX3H2;!R:1]=[CX3;!R:2]([F])[n:3]", "[CX3H2;!R:1]=[CX3;!R:2]([CX4;H2][F])[n:3]", "[CX3H2;!R:1]=[CX3;!R:2]([CX4;H1]([F])[F])[n:3]", "[CX3H2;!R:1]=[CX3;!R:2]([CX4;H0]([F])([F])[F])[n:3]", "[CX3H2;!R:1]=[CX3;!R:2]([OX2][CX4,SiX4:4])[n:3]", "[CX3;H0;!R:1]([F])([F])=[CX3H1;!R:2][n:3]", "[CX3;H0;!R:1]([F])([F])=[CX3;!R:2]([CX4;H3])[n:3]", "[CX3;H0;!R:1]([F])([F])=[CX3;!R:2]([F])[n:3]", "[CX3;H0;!R:1]([F])([F])=[CX3;!R:2]([CX4;H2][F])[n:3]", "[CX3;H0;!R:1]([F])([F])=[CX3;!R:2]([CX4;H1]([F])[F])[n:3]", "[CX3;H0;!R:1]([F])([F])=[CX3;!R:2]([CX4;H0]([F])([F])[F])[n:3]", "[CX3;H0;!R:1]([F])([F])=[CX3;!R:2]([OX2][CX4,SiX4:4])[n:3]", "[CX3;H1;!R:1]([F])=[CX3H1;!R:2][n:3]", "[CX3;H1;!R:1]([F])=[CX3;!R:2]([CX4;H3])[n:3]", "[CX3;H1;!R:1]([F])=[CX3;!R:2]([F])[n:3]", "[CX3;H1;!R:1]([F])=[CX3;!R:2]([CX4;H2][F])[n:3]", "[CX3;H1;!R:1]([F])=[CX3;!R:2]([CX4;H1]([F])[F])[n:3]", "[CX3;H1;!R:1]([F])=[CX3;!R:2]([CX4;H0]([F])([F])[F])[n:3]", "[CX3;H1;!R:1]([F])=[CX3;!R:2]([OX2][CX4,SiX4:4])[n:3]", "[CX3;H1;r5:1]=[CX3;H1;r5:2][c:3]"], "1004": ["[CX3H2;!R:1]=[CX3H1;!R:2][CX4H2][OX2,SX2:3]", "[CX3H2;!R:1]=[CX3H1;!R:2][CX4H2][NX3:3]", "[CX3H2;!R:1]=[CX3H1;!R:2][CX4H2][c:3]", "[CX3H2;!R:1]=[CX3H1;!R:2][CX4H2][n:3]", "[CX3H2;!R:1]=[CX3H1;!R:2][CX4H2][SiX4]([C,c:3])([C,c:4])[C,c:5]", "[CX3H2;!R:1]=[CX3H1;!R:2][CX4H2][CX2]#[N]"], "1005": ["[CX3;H2;!R:1]=[CX3;H1;!R:2][F,Cl:3]", "[CX3;H2;!R:1]=[CX3;H0;!R:2]([F,Cl;3])[F,Cl:4]", "[CX3;H2;!R:1]=[CX3;H0;!R:2]([F,Cl;3])[CX4:4]", "[CX3;H2;!R:1]=[CX3;H1;!R:2][CX4:3]([F,Cl:4])[F,Cl:5]", "[F,Cl:4][CX3;H1;!R:1]=[CX3;H1;!R:2][F,Cl:3]", 
"[F,Cl:5][CX3;H1;!R:1]=[CX3;H0;!R:2]([F,Cl;3])[F,Cl:4]", "[F,Cl:5][CX3;H1;!R:1]=[CX3;H0;!R:2]([F,Cl;3])[CX4:4]", "[F,Cl:4][CX3;H0;!R:1]([F,Cl:5])=[CX3;H1;!R:2][F,Cl:3]", "[F,Cl:5][CX3;H0;!R:1]([F,Cl:6])=[CX3;H0;!R:2]([F,Cl;3])[F,Cl:4]", "[F,Cl:5][CX3;H0;!R:1]([F,Cl:6])=[CX3;H0;!R:2]([F,Cl;3])[CX4:4]"], "1006": ["[CX3H2:1]=[CX3H1:2][OX2][CX3:3](=[OX1])", "[CX3H1:1]([F])=[CX3H1:2][OX2][CX3:3](=[OX1])", "[CX3H0:1]([F])([F])=[CX3H1:2][OX2][CX3:3](=[OX1])", "[CX3H2:1]=[CX3H1:2][NX3:3][CX3:4](=[OX1])", "[CX3H1:1]([F])=[CX3H1:2][NX3:3][CX3:4](=[OX1])", "[CX3H0:1]([F])([F])=[CX3H1:2][NX3:3][CX3:4](=[OX1])"], "1007": ["[CX3H1;R:1]1=[CX3H1;R:2][CX3;R](=[OX1])[NX3:3][CX3;R]1(=[OX1])", "[CX3H1;R:1]1([F])=[CX3H1;R:2]([F])[CX3;R](=[OX1])[NX3:3][CX3;R]1(=[OX1])"], "1020": ["[CX3;H2:1]=[CX3;!R:2]-[CX3;!R:3]=[CX3;H2:4]", "[CX3:1]([F])([F])=[CX3!R:2]-[CX3;!R:3]=[CX3H2:4]", "[CX3;H1:1]([F])=[CX3;!R:2]-[CX3;!R:3]=[CX3;H2:4]", "[CX3;H0:1]([F])([F])=[CX3!R:2]-[CX3;!R:3]=[CX3;H1:4]([F])", "[CX3;H1:1]([F])=[CX3;!R:2]-[CX3;!R:3]=[CX3;H1:4]([F])", "[CX3;H0:1]([F])([F])=[CX3;!R:2]-[CX3;!R:3]=[CX3;H0:4]([F])([F])"], "1030": ["[CX3H2:1]=[CX3:2][OX2][CX4:3]", "[CX3H2:1]=[CX3:2][OX2;!$(O*=O)][CX3:3]", "[CX3H2:1]=[CX3:2][OX2][c:3]", "[CX3H1:1]([F])=[CX3:2][OX2][CX4:3]", "[CX3H1:1]([F])=[CX3:2][OX2;!$(O*=O)][CX3:3]", "[CX3H1:1]([F])=[CX3:2][OX2][c:3]", "[CX3H0:1]([F])([F])=[CX3:2][OX2][CX4:3]", "[CX3H0:1]([F])([F])=[CX3:2][OX2;!$(O*=O)][CX3:3]", "[CX3H0:1]([F])([F])=[CX3:2][OX2][c:3]"], "1031": ["[CX3;H2;!R:1]=[CX3;H0;!R:2]([CX4H3])[CX4H2:3]", "[CX3;H2;!R:1]=[CX3;H0;!R:2]([CX4H3])[CX4H3]"], "1050": ["[CX3;H1;R:1]=[CX3;H1;R:2]"], "1052": ["[CX3;H2;!R:1]=[CX3;H2;!R:2]", "[CX3;H2;!R:1]=[CX3;H1;!R:2]"], "200": "[CX3]=[CX3]", "201": "[CX4;R]1[OX2;R][CX4;R]1", "202": "[CX3](=[O])[OX2H1,F,Cl,Br,I]", "203": "[C,c][OX2,SX2;H1;!$([O,S]C=*)]", "204": "[C,c][NX3;H2;!$(N[C,S]=*)]", "205": "[NX2]=[CX2]=[OX1,SX1]", "206": "[C,c][CX3,c;R](=[OX1])[OX2,o;R][CX3,c;R](=[OX1])[C,c]"} 
-------------------------------------------------------------------------------- /src/smipoly/rules/mon_vals.json: -------------------------------------------------------------------------------- 1 | [[1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13], [51, 52, 53, 54, 55, 56, 57, 58], [200, 201, 202, 203, 204, 205, 206], [1001, 1002, 1003, 1004, 1005, 1006, 1007, 1020, 1030, 1031, 1050, 1052]] -------------------------------------------------------------------------------- /src/smipoly/rules/ps_class.json: -------------------------------------------------------------------------------- 1 | {"polyolefin": 11, "polyester": 6, "polyether": 12, "polyamide": 2, "polyimide": 8, "polyurethane": 19, "polyoxazolidone": 23} -------------------------------------------------------------------------------- /src/smipoly/rules/ps_gen.pkl: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/PEJpOhno/SMiPoly/cceef4dfbabde2d5b68d55d153b2e152d799b971/src/smipoly/rules/ps_gen.pkl -------------------------------------------------------------------------------- /src/smipoly/rules/ps_rxn.pkl: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/PEJpOhno/SMiPoly/cceef4dfbabde2d5b68d55d153b2e152d799b971/src/smipoly/rules/ps_rxn.pkl -------------------------------------------------------------------------------- /src/smipoly/smip/__init__.py: -------------------------------------------------------------------------------- 1 | # smipoly/smip/__init__ 2 | 3 | from . 
import funclib 4 | from .._version import __version__ 5 | -------------------------------------------------------------------------------- /src/smipoly/smip/funclib.py: -------------------------------------------------------------------------------- 1 | #!/usr/bin/env python 2 | # coding: utf-8 3 | 4 | # Copyright (c) 2021 Mitsuru Ohno 5 | # Use of this source code is governed by a BSD-3-style 6 | # license that can be found in the LICENSE file. 7 | 8 | # 08/02/2021, M. Ohno 9 | # functions for MonomerClassifier and PolymerGenerator. 10 | 11 | #import re 12 | #import itertools 13 | import numpy as np 14 | import pandas as pd 15 | from rdkit import rdBase, Chem 16 | from rdkit.Chem import AllChem 17 | 18 | def genmol(s): 19 | try: 20 | m = Chem.MolFromSmiles(s) 21 | except: 22 | m = np.nan 23 | return m 24 | 25 | 26 | def genc_smi(m): 27 | try: 28 | cS = Chem.MolToSmiles(m) 29 | except: 30 | cS = np.nan 31 | return cS 32 | 33 | #count the number of targeted functional groups 34 | def count_fg(m, patt): 35 | numFG = 0 36 | matchs = m.GetSubstructMatches(patt) 37 | if len(matchs)>=2: 38 | not_match = [] 39 | for i in range(0, len(matchs)-1): 40 | if len(set(matchs[i])^set(matchs[i+1]))!=2: 41 | pass 42 | else: 43 | not_match.append(i+1) 44 | numFG = len([matchs[i] for i in range(0, len(matchs)) if i not in not_match]) 45 | else: 46 | numFG = len(matchs) 47 | return numFG 48 | 49 | #classify candidate compounds for mono-FG monomer 50 | def monomer_sel_mfg(m, mons, excls): 51 | if pd.notna(m): 52 | chk_c = 0 53 | fchk_c = 0 54 | chk = [] 55 | if len(mons)!=0: 56 | for mon in mons: 57 | patt = Chem.MolFromSmarts(mon) 58 | if m.HasSubstructMatch(patt): 59 | chk_c = len(m.GetSubstructMatches(patt)) 60 | fchk_c = fchk_c+chk_c 61 | chk_excl=[] 62 | for excl in excls: 63 | excl_patt=Chem.MolFromSmarts(excl) 64 | if m.HasSubstructMatch(excl_patt): 65 | chk_excl.append(False) 66 | else: 67 | chk_excl.append(True) 68 | if False in chk_excl: 69 | chk.append(False) 70 | else: 
71 | chk.append(True) 72 | else: 73 | chk.append(False) 74 | if True in chk: 75 | fchk = True 76 | else: 77 | fchk = False 78 | else: 79 | fchk = False 80 | else: 81 | fchk = False 82 | return [fchk, fchk_c] 83 | 84 | 85 | #classify candidate compounds for poly-FG monomer 86 | #count objective FGs 87 | def monomer_sel_pfg(m, mons, excls, minFG, maxFG): 88 | if pd.notna(m): 89 | chk_c = 0 90 | fchk_c = 0 91 | if len(mons)!=0: 92 | for mon in mons: 93 | patt = Chem.MolFromSmarts(mon) 94 | chk_c = count_fg(m, patt) 95 | fchk_c = fchk_c + chk_c 96 | if minFG <= fchk_c <= maxFG: 97 | chk=[] 98 | for excl in excls: 99 | excl_patt=Chem.MolFromSmarts(excl) 100 | if m.HasSubstructMatch(excl_patt): 101 | chk.append(False) 102 | else: 103 | chk.append(True) 104 | if False in chk: 105 | fchk = False 106 | else: 107 | fchk=True 108 | else: 109 | fchk = False 110 | else: 111 | fchk = (False) 112 | else: 113 | fchk = (False) 114 | return [fchk, fchk_c] 115 | 116 | 117 | #define sequential polymerization for chain polymerization except polyolefine 118 | def seq_chain(prod_P, targ_mon1, Ps_rxnL, mon_dic, monL): 119 | if Chem.MolToSmiles(prod_P) != '': 120 | if targ_mon1 not in ['vinyl', 'cOle']: 121 | seqFG2=Chem.MolFromSmarts(monL[[202][0]]) 122 | seqFG3=Chem.MolFromSmarts(monL[[203][0]]) 123 | seqFG4=Chem.MolFromSmarts(monL[[204][0]]) 124 | while prod_P.HasSubstructMatch(seqFG2): 125 | prods = Ps_rxnL[202].RunReactants([prod_P]) 126 | prod_P = prods[0][0] 127 | Chem.SanitizeMol(prod_P) 128 | while prod_P.HasSubstructMatch(seqFG3): 129 | prods = Ps_rxnL[203].RunReactants([prod_P]) 130 | prod_P = prods[0][0] 131 | Chem.SanitizeMol(prod_P) 132 | while prod_P.HasSubstructMatch(seqFG4): 133 | prods = Ps_rxnL[204].RunReactants([prod_P]) 134 | prod_P = prods[0][0] 135 | Chem.SanitizeMol(prod_P) 136 | else: 137 | prod_P=prod_P 138 | return prod_P 139 | 140 | 141 | #define sequential polymerization for successive polymerization 142 | def seq_successive(prod_P, targ_rxn, monL, Ps_rxnL, 
P_class): 143 | if Chem.MolToSmiles(prod_P) != '': 144 | seqFG0=Chem.MolFromSmarts(monL[[200][0]]) 145 | seqFG1=Chem.MolFromSmarts(monL[[201][0]]) 146 | seqFG2=Chem.MolFromSmarts(monL[[202][0]]) 147 | seqFG3=Chem.MolFromSmarts(monL[[203][0]]) 148 | seqFG4=Chem.MolFromSmarts(monL[[204][0]]) 149 | seqFG5=Chem.MolFromSmarts(monL[[205][0]]) 150 | seqFG6=Chem.MolFromSmarts(monL[[206][0]]) 151 | if P_class not in ['polyolefin', 'polyoxazolidone', ]: 152 | while prod_P.HasSubstructMatch(seqFG1): 153 | prods = Ps_rxnL[201].RunReactants([prod_P]) 154 | prod_P = prods[0][0] 155 | Chem.SanitizeMol(prod_P) 156 | while prod_P.HasSubstructMatch(seqFG2): 157 | prods = Ps_rxnL[202].RunReactants([prod_P]) 158 | prod_P = prods[0][0] 159 | Chem.SanitizeMol(prod_P) 160 | while prod_P.HasSubstructMatch(seqFG3): 161 | prods = Ps_rxnL[203].RunReactants([prod_P]) 162 | prod_P = prods[0][0] 163 | Chem.SanitizeMol(prod_P) 164 | while prod_P.HasSubstructMatch(seqFG4): 165 | prods = Ps_rxnL[204].RunReactants([prod_P]) 166 | prod_P = prods[0][0] 167 | Chem.SanitizeMol(prod_P) 168 | while prod_P.HasSubstructMatch(seqFG5): 169 | prods = Ps_rxnL[205].RunReactants([prod_P]) 170 | prod_P = prods[0][0] 171 | Chem.SanitizeMol(prod_P) 172 | while prod_P.HasSubstructMatch(seqFG6): 173 | if P_class =='polyimide': 174 | prods = Ps_rxnL[207].RunReactants([prod_P]) 175 | elif P_class =='polyester': 176 | prods = Ps_rxnL[206].RunReactants([prod_P]) 177 | prod_P = prods[0][0] 178 | Chem.SanitizeMol(prod_P) 179 | elif P_class in ['polyoxazolidone', ]: 180 | while prod_P.HasSubstructMatch(seqFG1): 181 | prods = Ps_rxnL[201].RunReactants([prod_P]) 182 | prod_P = prods[0][0] 183 | Chem.SanitizeMol(prod_P) 184 | while prod_P.HasSubstructMatch(seqFG5): 185 | prods = Ps_rxnL[208].RunReactants([prod_P]) 186 | prod_P = prods[0][0] 187 | Chem.SanitizeMol(prod_P) 188 | elif P_class in ['polyolefin', ]: 189 | while prod_P.HasSubstructMatch(seqFG0): 190 | prods = Ps_rxnL[200].RunReactants([prod_P]) 191 | prod_P = 
prods[0][0] 192 | Chem.SanitizeMol(prod_P) 193 | else: 194 | prod_P=prod_P 195 | return prod_P 196 | 197 | 198 | #homopolymerization 199 | def homopolymR(mon1,mons,excls, targ_mon1, Ps_rxnL, mon_dic, monL): 200 | prod_P=mon1 201 | while monomer_sel_mfg(prod_P, mons, excls)[0]== True: #if the generated polymer can polymerize further, react again 202 | prods = Ps_rxnL[mon_dic[targ_mon1]].RunReactants([prod_P]) 203 | try: 204 | prod_P = prods[0][0] 205 | Chem.SanitizeMol(prod_P) 206 | prod_P = seq_chain(prod_P, targ_mon1=targ_mon1, Ps_rxnL=Ps_rxnL, mon_dic=mon_dic, monL=monL) 207 | except: 208 | pass 209 | return [genc_smi(prod_P)] #20230904 revised: return SMILES instead of a Mol object 210 | 211 | 212 | #binary polymerization 213 | def bipolymR(reactant, targ_rxn, monL, Ps_rxnL, P_class): 214 | prod_P = Chem.MolFromSmiles('') 215 | prods = targ_rxn.RunReactants(reactant) 216 | try: 217 | prod_P = prods[0][0] 218 | Chem.SanitizeMol(prod_P) 219 | prod_P = seq_successive(prod_P, targ_rxn=targ_rxn, monL=monL, Ps_rxnL=Ps_rxnL, P_class=P_class) 220 | except: 221 | pass 222 | return [genc_smi(prod_P)] #20230904 revised: return SMILES instead of a Mol object 223 | 224 | 225 | #homopolymerization 226 | def homopolymA(mon1,mons,excls, targ_mon1, Ps_rxnL, mon_dic, monL): 227 | prod_P=mon1 228 | while monomer_sel_mfg(prod_P, mons, excls)[0]== True: #if the generated polymer can polymerize further, react again 229 | prods = Ps_rxnL[mon_dic[targ_mon1]].RunReactants([prod_P]) 230 | prod_Ps = [] 231 | for prod_P in prods: 232 | try: 233 | Chem.SanitizeMol(prod_P[0]) 234 | prod_P = prod_P[0] 235 | prod_P = seq_chain(prod_P, targ_mon1=targ_mon1, Ps_rxnL=Ps_rxnL, mon_dic=mon_dic, monL=monL) 236 | prod_Ps.append(prod_P) 237 | except: 238 | pass 239 | return [genc_smi(m) for m in prod_Ps] 240 | 241 | 242 | #binary polymerization 243 | def bipolymA(reactant, targ_rxn, monL, Ps_rxnL, P_class): 244 | prod_P = Chem.MolFromSmiles('') 245 | prods = targ_rxn.RunReactants(reactant) 246 | prod_Ps = [] 247 | for prod_P in prods: 248 | try: 249 | Chem.SanitizeMol(prod_P[0]) 250 | prod_P = 
prod_P[0] 251 | prod_P = seq_successive(prod_P, targ_rxn=targ_rxn, monL=monL, Ps_rxnL=Ps_rxnL, P_class=P_class) 252 | prod_Ps.append(prod_P) 253 | except: 254 | pass 255 | return [genc_smi(m) for m in prod_Ps] 256 | 257 | 258 | # Copyright (c) 2024 Mitsuru Ohno 259 | # Use of this source code is governed by a BSD-3-style 260 | # (license that can be found in the LICENSE file. ) 261 | 262 | # 08/15/2024, M. Ohno 263 | # generate CRU of olefinic polymers 264 | # NEED def ole_rxnsmarts_gen(reactant) 265 | 266 | def ole_cru_gen(m, mon): 267 | def ole_rxnsmarts_gen(reactant): 268 | prod = '' 269 | prod1 = '' 270 | prod2 = '' 271 | prod3 = '' 272 | prod4 = '' 273 | inv_reactant = reactant[::-1] 274 | C1_i = inv_reactant.find(':1]'[::-1]) 275 | C1_j = inv_reactant.find('[CX3'[::-1], C1_i) 276 | C2_i = inv_reactant.find(':2]'[::-1]) 277 | C2_j = inv_reactant.find('=[CX3'[::-1], C2_i) 278 | prod1 = inv_reactant[:C2_i]+'(-*)'[::-1] 279 | prod2 = inv_reactant[C2_i:C2_j]+'-[CX4'[::-1] 280 | prod3 = inv_reactant[C2_j+5:C1_i]+'(-*)'[::-1] 281 | prod4 = inv_reactant[C1_i:C1_j]+'[CX4'[::-1]+inv_reactant[C1_j+4:] 282 | prod = prod4[::-1] + prod3[::-1] + prod2[::-1] + prod1[::-1] 283 | rxn_smarts = reactant + '>>' + prod 284 | return rxn_smarts 285 | 286 | #m = genmol(smi) 287 | reactant = [m, ] 288 | patt = Chem.MolFromSmarts(mon) 289 | targ_rxn = AllChem.ReactionFromSmarts(ole_rxnsmarts_gen(mon)) 290 | targ_rxn.Initialize() #need initialization 291 | while m.HasSubstructMatch(patt): 292 | #while targ_rxn.IsMoleculeReactant(m): 293 | prod_P = Chem.MolFromSmiles('') 294 | prods = targ_rxn.RunReactants(reactant) 295 | prod_Ps = [] 296 | for prod_P in prods: 297 | try: 298 | Chem.SanitizeMol(prod_P[0]) 299 | prod_P = prod_P[0] 300 | prod_Ps.append(prod_P) 301 | except: 302 | pass 303 | m = prod_Ps[0] 304 | reactant = [m, ] 305 | return [m, [genc_smi(m) for m in prod_Ps]] 306 | 307 | def diene_14(x,rxn): 308 | def diene_12to14(smi, rxn): 309 | #rxn == Ps_rxnL[209] 310 | # Replace all 
asterisks with [3H] in the SMILES string: 311 | repl_smi = smi.replace("*", "[3H]") 312 | new_m = Chem.MolFromSmiles(repl_smi) 313 | Chem.SanitizeMol(new_m) 314 | prods = rxn.RunReactants([new_m]) 315 | for m in prods: 316 | Chem.SanitizeMol(m[0]) 317 | new_smi = Chem.MolToSmiles(prods[0][0]).replace("[3H]", "*") 318 | return new_smi 319 | if 'conjdiene' in x and x['conjdiene'][0]: 320 | x['conjdiene'][2] = diene_12to14(x['conjdiene'][2], rxn) 321 | return x 322 | 323 | # Copyright (c) 2024 Mitsuru Ohno 324 | # Use of this source code is governed by a BSD-3-style 325 | # (license that can be found in the LICENSE file. ) 326 | 327 | # 08/15/2024, M. Ohno 328 | # classify olefinic monomers and generate CRU 329 | 330 | def ole_sel_cru(m, mons, excls, minFG, maxFG): 331 | judge = monomer_sel_pfg(m, mons, excls, minFG, maxFG) 332 | if judge[0] == True: 333 | for mon in mons: 334 | patt = Chem.MolFromSmarts(mon) 335 | if m.HasSubstructMatch(patt): 336 | CRU = ole_cru_gen(m, mon) 337 | m = CRU[0] 338 | else: 339 | pass 340 | else: 341 | m = np.nan 342 | smi_p = genc_smi(m) 343 | judge.append(smi_p) 344 | return judge 345 | 346 | def update_nested_dict(row, dict_col, new_val, updated_k): 347 | row[dict_col][updated_k] = row[new_val] 348 | return row 349 | 350 | 351 | 352 | def coord_polym(smi, targ_rxn): 353 | prods = targ_rxn.RunReactants([genmol(smi),]) 354 | prod_Ps = [] 355 | for prod_P in prods: 356 | try: 357 | Chem.SanitizeMol(prod_P[0]) 358 | prod_P = prod_P[0] 359 | prod_Ps.append(prod_P) 360 | except: 361 | pass 362 | rsl = set([genc_smi(m) for m in prod_Ps]) 363 | return list(rsl) 364 | 365 | # end 366 | -------------------------------------------------------------------------------- /src/smipoly/smip/monc.py: -------------------------------------------------------------------------------- 1 | #!/usr/bin/env python 2 | # coding: utf-8 3 | 4 | # Copyright (c) 2021 Mitsuru Ohno 5 | # Use of this source code is governed by a BSD-3-style 6 | # license that can be 
found in the LICENSE file. 7 | 8 | # 07/27/2021, M. Ohno 9 | # classify a list of compounds given as SMILES by monomer type 10 | # monomer categorization system for a compound list in SMILES 11 | # 12 | # Reference: 13 | # https://future-chem.com/rdkit-chemical-rxn/ 14 | # https://www.daylight.com/dayhtml_tutorials/languages/smarts/smarts_examples.html 15 | # https://www.daylight.com/dayhtml/doc/theory/theory.smarts.html 16 | 17 | import os 18 | from pathlib import Path 19 | import numpy as np 20 | import pandas as pd 21 | import json 22 | import pickle 23 | from rdkit import rdBase, Chem 24 | from rdkit.Chem import AllChem, Draw 25 | from .funclib import genmol, genc_smi, monomer_sel_mfg, monomer_sel_pfg, ole_sel_cru, diene_14, update_nested_dict 26 | 27 | db_file = os.path.join(str(Path(__file__).resolve().parent.parent), 'rules') 28 | with open(os.path.join(db_file, 'mon_vals.json'), 'r') as f: 29 | mon_vals = json.load(f) 30 | with open(os.path.join(db_file, 'mon_dic.json'), 'r') as f: 31 | mon_dic = json.load(f) 32 | with open(os.path.join(db_file, 'mon_dic_inv.json'), 'r') as f: 33 | mon_dic_inv = json.load(f) 34 | with open(os.path.join(db_file, 'mon_lst.json'), 'r') as f: 35 | monL = json.load(f) 36 | with open(os.path.join(db_file, 'excl_lst.json'), 'r') as f: 37 | exclL = json.load(f) 38 | with open(os.path.join(db_file, 'ps_rxn.pkl'), 'rb') as f: 39 | Ps_rxnL = pickle.load(f) 40 | 41 | monLg = {int(k): v for k, v in monL.items()} 42 | exclLg = {int(k): v for k, v in exclL.items()} 43 | mon_dic_inv = {int(k): v for k, v in mon_dic_inv.items()} 44 | 45 | 46 | def moncls(df, smiColn, minFG=None, maxFG=None, dsp_rsl=None): 47 | # #By default, the number of FGs of the same class in one molecule is limited to 2 to 4 for poly-functionalized monomers. 
48 | # (dataframe, smiColn, minFG = 2, maxFG = 4) 49 | if minFG is None: 50 | minFG = 2 51 | if maxFG is None: 52 | maxFG = 4 53 | if dsp_rsl is None: 54 | dsp_rsl = False 55 | 56 | monL = {k: v for k, v in monLg.items() if k in mon_vals[0]+mon_vals[1]+mon_vals[2]} 57 | exclL = {k: v for k, v in exclLg.items() if k in mon_vals[0]+mon_vals[1]+mon_vals[2]} 58 | 59 | #read source file 60 | DF01 = df 61 | smiColn = smiColn 62 | 63 | #append CO and HCHO for carbonate 64 | DF02 = DF01.copy() 65 | if smiColn in DF02.columns.to_list(): 66 | #DF02 = DF02[['CID', smiColn]] #remove this row if not required 67 | DFadd = pd.DataFrame([['[C-]#[O+]'], ['C=O'], ], columns=[smiColn]) 68 | DF02 = pd.concat([DF02, DFadd], ignore_index=True) 69 | else: 70 | print("invalid SMILES column name") 71 | 72 | 73 | #drop NA of smiles, and add chemical structure 74 | DF02['ROMol'] = DF02[smiColn].apply(genmol) 75 | DF02['smip_cand_mons'] = DF02['ROMol'].apply(genc_smi) 76 | 77 | #classification for mono-functionalized monomer 78 | #count functional groups, remove excluded compounds, and judge whether each compound is a targeted monomer. 
79 | for i in mon_vals[0]: 80 | mons=() 81 | excls=() 82 | mons = monL[i] 83 | excls = list(exclL[i]) 84 | DF02[mon_dic_inv[i]] = [e[0] for e in DF02['ROMol'].apply(monomer_sel_mfg, mons=mons, excls=excls)] 85 | if dsp_rsl==True: 86 | print(i) 87 | print(mon_dic_inv[i], ' = ', len(DF02[DF02[mon_dic_inv[i]]==True]), ' / ', len(DF02)) 88 | else: 89 | pass 90 | 91 | 92 | #classification for poly-functionalized monomer 93 | for i in mon_vals[1]: 94 | mons=() 95 | excls=() 96 | mons=monL[i] 97 | excls=exclL[i] 98 | DF02[mon_dic_inv[i]] = [e[0] for e in DF02['ROMol'].apply(monomer_sel_pfg, mons=mons, excls=excls, minFG=minFG, maxFG=maxFG)] 99 | if dsp_rsl==True: 100 | print(i) 101 | print(mon_dic_inv[i], ' = ', len(DF02[DF02[mon_dic_inv[i]]==True]), ' / ', len(DF02)) 102 | else: 103 | pass 104 | 105 | DF02 = DF02.drop('ROMol', axis=1) #2024/01 modified 106 | return DF02 107 | 108 | 109 | #classification for olefinic monomer **UNDER CONSTRUCTION** 110 | #count functional groups, remove excluded compounds and judge whether each is a targeted monomer. 
111 | # monc.py revision 112 | def olecls(df, smiColn, minFG=None, maxFG=None, dsp_rsl=None): 113 | if minFG is None: 114 | minFG = 1 115 | if maxFG is None: 116 | maxFG = 4 117 | if dsp_rsl is None: 118 | dsp_rsl = False 119 | 120 | monL = {k: v for k, v in monLg.items() if k in mon_vals[3]} 121 | exclL = {k: v for k, v in exclLg.items() if k in mon_vals[3]} 122 | 123 | template_ole_keys = [mon_dic_inv[i] for i in mon_vals[3]] 124 | print(template_ole_keys) 125 | 126 | #read source file 127 | DF02 = df.copy() #work on a copy so the caller's DataFrame is not modified 128 | smiColn = smiColn 129 | #drop NA of smiles, and add chemical structure 130 | DF02['ROMol'] = DF02[smiColn].apply(genmol) 131 | DF02['smip_cand_mons'] = DF02['ROMol'].apply(genc_smi) 132 | #create null column for olefin classification 133 | DF02['ole_cls'] = [{k:v for k, v in zip(template_ole_keys, [np.nan for x in range(len(template_ole_keys))])} for y in range(len(DF02))] 134 | 135 | for i in mon_vals[3]: 136 | mons=() 137 | excls=() 138 | mons = monL[i] 139 | excls = list(exclL[i]) 140 | DF02['temp'] = ['' for e in range(len(DF02))] 141 | #DF02[mon_dic_inv[i]] = DF02['ROMol'].apply(ole_sel_cru, mons=mons, excls=excls, minFG=minFG, maxFG=maxFG) #ver0.1 style 142 | DF02['temp'] = DF02['ROMol'].apply(ole_sel_cru, mons=mons, excls=excls, minFG=minFG, maxFG=maxFG) 143 | DF02 = DF02.apply(update_nested_dict, axis=1, args=('ole_cls', 'temp', mon_dic_inv[i])) 144 | 145 | if dsp_rsl==True: 146 | print(mon_dic_inv[i], ' = ', 147 | #list(DF02[mon_dic_inv[i]].apply(lambda x : x[0]==True)).count(True), #ver0.1 style 148 | list(DF02['ole_cls'].apply(lambda x : x[mon_dic_inv[i]][0]==True)).count(True), 149 | ' / ', len(DF02)) 150 | 151 | else: 152 | pass 153 | DF02['ole_cls'] = DF02['ole_cls'].apply(diene_14, rxn=Ps_rxnL[209]) #refine conjugated diene CRU 154 | DF02 = DF02.drop('ROMol', axis=1) 155 | DF02 = DF02.drop('temp', axis=1) 156 | return DF02 157 | 158 | #end 159 | -------------------------------------------------------------------------------- /src/smipoly/smip/polg.py: 
-------------------------------------------------------------------------------- 1 | #!/usr/bin/env python 2 | # coding: utf-8 3 | 4 | # Copyright (c) 2021 Mitsuru Ohno 5 | # Use of this source code is governed by a BSD-3-style 6 | # license that can be found in the LICENSE file. 7 | 8 | # polymer generator from classified monomers. 9 | 10 | import os 11 | from pathlib import Path 12 | import itertools 13 | import numpy as np 14 | import pandas as pd 15 | import json 16 | import pickle 17 | from rdkit import rdBase, Chem 18 | from rdkit.Chem import AllChem 19 | #from .funclib import monomer_sel_mfg, monomer_sel_pfg, seq_chain, seq_successive 20 | from .funclib import genmol, coord_polym 21 | import warnings #for warning 22 | 23 | 24 | db_file = os.path.join(str(Path(__file__).resolve().parent.parent), 'rules') 25 | with open(os.path.join(db_file, 'mon_vals.json'), 'r') as f: 26 | mon_vals = json.load(f) 27 | with open(os.path.join(db_file, 'mon_dic.json'), 'r') as f: 28 | mon_dic = json.load(f) 29 | with open(os.path.join(db_file, 'mon_dic_inv.json'), 'r') as f: 30 | mon_dic_inv = json.load(f) 31 | with open(os.path.join(db_file, 'mon_lst.json'), 'r') as f: 32 | monL = json.load(f) 33 | with open(os.path.join(db_file, 'excl_lst.json'), 'r') as f: 34 | exclL = json.load(f) 35 | with open(os.path.join(db_file, 'ps_rxn.pkl'), 'rb') as f: 36 | Ps_rxnL = pickle.load(f) 37 | with open(os.path.join(db_file, 'ps_class.json'), 'r') as f: 38 | Ps_classL = json.load(f) 39 | with open(os.path.join(db_file, 'ps_gen.pkl'), 'rb') as f: 40 | Ps_GenL = pickle.load(f) 41 | 42 | monLg = {int(k): v for k, v in monL.items()} 43 | exclLg = {int(k): v for k, v in exclL.items()} 44 | mon_dic_inv = {int(k): v for k, v in mon_dic_inv.items()} 45 | 46 | def biplym(df, targ = None, Pmode = None, dsp_rsl = None): 47 | if targ is None: 48 | targ = ['all', ] 49 | if Pmode is None: 50 | Pmode = 'a' 51 | if dsp_rsl is None: 52 | dsp_rsl = False 53 | 54 | #FOR FUTURE WORKS!! 
temporarily reduced the dictionary on 04/21/2004 55 | monL = {k: v for k, v in monLg.items() if k in mon_vals[0]+mon_vals[1]+mon_vals[2]} 56 | exclL = {k: v for k, v in exclLg.items() if k in mon_vals[0]+mon_vals[1]+mon_vals[2]} 57 | 58 | #set the generated polymer class 59 | targL = [] 60 | if targ==['all', ]: 61 | targL = Ps_classL.keys() 62 | else: 63 | targL.extend(targ) 64 | for x in targL: 65 | if x not in Ps_classL.keys(): 66 | print('oops! no such polymer class!') 67 | return 68 | else: 69 | pass 70 | 71 | #treat source DataFrame 72 | 73 | if 'ROMol' in df.columns: #2024/01 modified 74 | DF =df.drop('ROMol', axis=1).dropna(subset=['smip_cand_mons']) 75 | else: 76 | DF =df.dropna(subset=['smip_cand_mons']) 77 | 78 | DF_L = ['smip_cand_mons', ] 79 | for col_nam in DF.columns.values: 80 | if col_nam in list(mon_dic): 81 | DF_L.append(col_nam) 82 | else: 83 | pass 84 | DF = DF[DF_L].copy() 85 | DF_L = DF_L[1:] 86 | for col_nam in DF_L: 87 | DF[col_nam] = DF[col_nam].replace('False', '') 88 | DF[col_nam] = DF[col_nam].astype('bool') 89 | 90 | #select mode 91 | if Pmode == 'r': 92 | warnings.warn("Mode 'r' will be deprecated and merged into mode 'a'. Use mode 'a' instead. 
") 93 | from .funclib import bipolymR, homopolymR 94 | elif Pmode == 'a': 95 | from .funclib import bipolymA, homopolymA 96 | else: 97 | raise Exception('invalid mode!') 98 | 99 | DF_Pgen = pd.DataFrame(columns=['mon1', 'mon2', 'polym', 'polymer_class', 'Ps_rxnL']) #20240826added 100 | 101 | #generate polymer 102 | for P_class in targL: 103 | for P_set in Ps_GenL[str(P_class)]: 104 | targ_mon1 = '' 105 | targ_mon2 = '' 106 | targ_mon1 = P_set[0] 107 | targ_mon2 = P_set[1] 108 | temp1 = [] 109 | temp2 = [] 110 | 111 | #20240826 added 112 | Ps_rxnL_key = [] 113 | Ps_rxnL_key = [k for k,v in Ps_rxnL.items() if AllChem.ReactionToSmarts(v)==AllChem.ReactionToSmarts(P_set[2])] 114 | 115 | DF10 = DF[DF[targ_mon1]] 116 | temp1 = list(DF10['smip_cand_mons']) 117 | if len(temp1) != 0: 118 | if targ_mon2 != 'none': 119 | DF20 = DF[DF[targ_mon2]] 120 | temp2 = list(DF20['smip_cand_mons']) 121 | del DF10, DF20 122 | if len(temp2) != 0: 123 | combs=[[m1, m2] for m1 in temp1 for m2 in temp2] 124 | temp11=[] 125 | temp21=[] 126 | for comb in combs: 127 | if comb[0]!=comb[1]: 128 | temp11.append(comb[0]) 129 | temp21.append(comb[1]) 130 | DF_temp = pd.DataFrame() 131 | DF_temp = pd.DataFrame(data={'mon1':temp11, 'mon2':temp21}, columns=['mon1', 'mon2']) 132 | DF_temp['polymer_class'] = str(P_class) 133 | DF_temp['Ps_rxnL'] = int(Ps_rxnL_key[0]) #20240826added 134 | targ_rxn=P_set[2] 135 | if Pmode == 'r': 136 | DF_temp['polym'] = DF_temp.apply(lambda x: [genmol(x['mon1']), genmol(x['mon2'])], axis=1).apply(bipolymR, targ_rxn=targ_rxn, monL=monL, Ps_rxnL=Ps_rxnL, P_class=P_class) 137 | DF_Pgen = pd.concat([DF_Pgen, DF_temp], ignore_index=True, copy=False) 138 | elif Pmode == 'a': 139 | DF_temp['polym'] = DF_temp.apply(lambda x: [genmol(x['mon1']), genmol(x['mon2'])], axis=1).apply(bipolymA, targ_rxn=targ_rxn, monL=monL, Ps_rxnL=Ps_rxnL, P_class=P_class) 140 | DF_Pgen = pd.concat([DF_Pgen, DF_temp], ignore_index=True, copy=False) 141 | else: 142 | temp2 = ['' for i in 
range(len(temp1))] 143 | DF_temp = pd.DataFrame() 144 | DF_temp = pd.DataFrame(data={'mon1':temp1, 'mon2':temp2}, columns=['mon1', 'mon2']) 145 | DF_temp['polymer_class'] = str(P_class) 146 | DF_temp['Ps_rxnL'] = int(Ps_rxnL_key[0]) #20240826added 147 | mons=monL[mon_dic[targ_mon1]] 148 | excls=exclL[mon_dic[targ_mon1]] 149 | if Pmode == 'r': 150 | DF_temp['polym'] = DF_temp.apply(lambda x: genmol(x['mon1']), axis=1).apply(homopolymR, mons=mons, excls=excls, targ_mon1=targ_mon1, Ps_rxnL=Ps_rxnL, mon_dic=mon_dic, monL=monL) 151 | DF_Pgen = pd.concat([DF_Pgen, DF_temp], ignore_index=True, copy=False) 152 | elif Pmode == 'a': 153 | DF_temp['polym'] = DF_temp.apply(lambda x: genmol(x['mon1']), axis=1).apply(homopolymA, mons=mons, excls=excls, targ_mon1=targ_mon1, Ps_rxnL=Ps_rxnL, mon_dic=mon_dic, monL=monL) 154 | DF_Pgen = pd.concat([DF_Pgen, DF_temp], ignore_index=True, copy=False) 155 | 156 | num_polym_react = len(DF_Pgen) 157 | DF_gendP = DF_Pgen.explode('polym') 158 | DF_gendP = DF_gendP.reset_index(drop=True) 159 | DF_gendP = DF_gendP.dropna(subset=['polym']) 160 | 161 | #adjust DataFrame 162 | DF_gendP.replace({'polym':{'':np.nan}}, inplace=True) 163 | 164 | #drop duplicated polymerization reactions 165 | DF_gendP=DF_gendP.dropna(subset=['polym']) 166 | DF_gendP=DF_gendP[DF_gendP['mon1']!=DF_gendP['mon2']] 167 | DF_gendP['reactset']=np.sort(DF_gendP.loc[:,['mon1', 'mon2']].values).tolist() 168 | DF_gendP['reactset']=DF_gendP['reactset'].apply(set).apply(tuple) 169 | DF_gendP = DF_gendP.drop_duplicates(subset=['reactset', 'polym']) 170 | DF_gendP = DF_gendP.reset_index(drop=True) 171 | if dsp_rsl == True: 172 | if Pmode == 'a': 173 | print('run at advanced mode') 174 | elif Pmode == 'r': 175 | print('run at rapid mode') 176 | else: 177 | print('invalid mode') 178 | print('number of polymerization reactions = ', num_polym_react) 179 | print('number of generated polymers = ', len(DF_gendP)) 180 | else: 181 | pass 182 | return DF_gendP 183 | 184 | #set the olefin 
class(es) of the copolymer 185 | def ole_copolym(df, targ = None, ncomp = None, dsp_rsl = None, drop_dupl = None): 186 | if targ is None or len(targ)==0: 187 | print('Please define the olefin class(es)') 188 | return 189 | if ncomp is None: 190 | ncomp = 1 191 | if dsp_rsl is None: 192 | dsp_rsl = False 193 | if drop_dupl is None: 194 | drop_dupl = True 195 | 196 | #explanation 197 | template_ole_keys = [mon_dic_inv[i] for i in mon_vals[3]] 198 | print('\n', 'valid arguments \n', template_ole_keys , 199 | ',\n', Ps_GenL['rec:coord'], '\n') 200 | 201 | #confirm the list of copolymerization unit(s) 202 | if type(targ) != list: 203 | print('targ should be given as a list') 204 | return 205 | for x in targ: 206 | if x not in [mon_dic_inv[k] for k in mon_vals[3]]+list(Ps_GenL['rec:coord']): 207 | print('oops! no such olefin class!') 208 | return 209 | else: 210 | pass 211 | if len(targ)>ncomp: 212 | print('reconfirm ncomp') 213 | return 214 | if any(e in targ for e in Ps_GenL['rec:coord']): 215 | if len(targ) > 1: 216 | print('ROMP, ROMPH, and COC should each be used alone. 
') 217 | return 218 | if ncomp > 2: 219 | print('Recommended: ncomp=1 for ROMP, ROMPH and 2 for COC') 220 | 221 | #reconstruct the list of classified olefin monomers for co-polymerization 222 | cand_cru = [] 223 | comb_cru = [] 224 | copoly_cru = [] 225 | 226 | ole_clsL = df['ole_cls'].to_list() 227 | cand_monsL = df['smip_cand_mons'].to_list() 228 | ole_targL = [e for e in list(zip(ole_clsL, cand_monsL)) if True in [d[0] for d in e[0].values()]] #[({olefin class: [match flag, number of functional groups, CRU], }, starting monomer), ()] 229 | ole_targL2 = [] 230 | if all([item in template_ole_keys for item in targ]): 231 | for ole in [mon_dic_inv[k] for k in mon_vals[3]]: 232 | for m in ole_targL: 233 | if m[0][ole][0]==True: 234 | ole_targL2.append([ole, m[0][ole][2], m[1]]) 235 | 236 | #Extract CRUs corresponding to defined components 237 | cand_cru = list(itertools.chain.from_iterable([[m for m in ole_targL2 if t==m[0]] for t in targ])) 238 | 239 | elif targ==['ROMP',] or targ==['ROMPH',]: 240 | ole = 'cycCH' 241 | for m in ole_targL: 242 | if m[0][ole][0]==True: 243 | ole_targL2.append([ole, m[0][ole][2], m[1]]) 244 | for e in ole_targL2: 245 | e[1] = coord_polym(e[2], Ps_rxnL[1050]) 246 | #explode the list of generated CRU 247 | cand_cru = [[e[0], sub_e, e[2]] for e in ole_targL2 for sub_e in e[1]] 248 | if targ==['ROMPH']: 249 | for e in cand_cru: 250 | e[1] = e[1].replace("=", "") 251 | 252 | elif targ==['COC',]: 253 | targ.append('') #dummy element to count components 254 | if ncomp<2: 255 | print('reconfirm ncomp') 256 | return 257 | else: 258 | coc_mons = ['cycCH', 'aliphCH'] 259 | ole_targL2cyc = [] 260 | ole_targL2chain = [] 261 | for ole in coc_mons: 262 | if ole=='cycCH': 263 | for m in ole_targL: 264 | if m[0][ole][0]==True: 265 | if ole=='cycCH': 266 | ole_targL2cyc.append([ole, m[0][ole][2], m[1]]) 267 | for e in ole_targL2cyc: 268 | e[1] = coord_polym(e[2], Ps_rxnL[1051]) 269 | elif ole=='aliphCH': 270 | for m in ole_targL: 271 | if m[0][ole][0]==True: 272 | if ole=='cycCH': 273 | 
ole_targL2chain.append([ole, m[0][ole][2], m[1]]) 274 | for e in ole_targL2: 275 | e[1] = coord_polym(e[2], Ps_rxnL[1052]) 276 | ole_targL2chain.append([ole, m[0][ole][2], m[1]]) 277 | for e in ole_targL2chain: 278 | e[1] = coord_polym(e[2], Ps_rxnL[1052]) 279 | ole_targL2 = list(itertools.chain(ole_targL2cyc, ole_targL2chain)) 280 | #explode the list of generated CRU 281 | cand_cru = [[e[0], sub_e, e[2]] for e in ole_targL2 for sub_e in e[1]] 282 | 283 | else: 284 | print('reconfirm olefin class(es)') 285 | return 286 | 287 | 288 | #generate copolymers 289 | comb_cru = [e for e in itertools.combinations(cand_cru, ncomp)] 290 | 291 | #Exclude combinations that do not contain every defined olefin class as a component 292 | for e in comb_cru: 293 | if len(set([l[0] for l in e]))>=len(targ): 294 | copoly_cru.append(([l[0] for l in e], [l[1] for l in e], [l[2] for l in e])) 295 | else: 296 | pass 297 | 298 | #Export as Pandas DataFrame 299 | DF_Pgen = pd.DataFrame(columns=['mon1', 'mon2', 'polym', 'polymer_class', 'Ps_rxnL', 'reactset']) 300 | DF_Pgen[['polymer_class', 'polym', 'reactset']] = pd.DataFrame(copoly_cru) 301 | DF_Pgen = DF_Pgen.fillna({'mon1':'', 'mon2':''}) 302 | 303 | #Type of initiator 304 | l_initiator = ['rec:radi', 'rec:cati', 'rec:ani'] 305 | rec_initiator = [] 306 | if any(e in targ for e in Ps_GenL['rec:coord']): 307 | rec_initiator = targ 308 | for k in l_initiator: 309 | if all([e in Ps_GenL[k] for e in targ]): 310 | rec_initiator.append(k) 311 | else: 312 | pass 313 | if len(rec_initiator)==0: 314 | rec_initiator=np.nan 315 | DF_Pgen['Ps_rxnL'] = DF_Pgen.apply(lambda x: rec_initiator, axis=1) 316 | 317 | #Drop duplicated copolymers. This takes a long time when the DataFrame (DF_Pgen) is large. 
318 | if drop_dupl == True: 319 | DF_Pgen['temp_dro_dupl'] = DF_Pgen.apply(lambda x:"".join(sorted(x['polym'])), axis=1) 320 | DF_gendP = DF_Pgen.drop_duplicates(subset='temp_dro_dupl') 321 | DF_gendP = DF_gendP.drop('temp_dro_dupl', axis=1) 322 | DF_gendP = DF_gendP.reset_index(drop=True) 323 | else: 324 | DF_gendP = DF_Pgen 325 | 326 | if dsp_rsl==True: 327 | print('Number of generated (co)polymer, ', ncomp, ' component(s) system : ', format(len(DF_gendP), ',')) 328 | else: 329 | pass 330 | 331 | return DF_gendP 332 | 333 | # #end 334 | -------------------------------------------------------------------------------- /utilities/1_MonomerDefiner.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "cells": [ 3 | { 4 | "cell_type": "markdown", 5 | "metadata": { 6 | "id": "pJJYMFna3hnM" 7 | }, 8 | "source": [ 9 | "Copyright (c) 2021 Mitsuru Ohno \n", 10 | "Use of this source code is governed by a BSD-3-style \n", 11 | "license that can be found in the LICENSE file. \n", 12 | " \n", 13 | "07/27/2021, M. Ohno \n", 14 | "tool for defining monomers \n", 15 | " \n", 16 | "Reference: \n", 17 | "https://www.daylight.com/dayhtml_tutorials/languages/smarts/smarts_examples.html \n", 18 | "https://www.daylight.com/dayhtml/doc/theory/theory.smarts.html " 19 | ] 20 | }, 21 | { 22 | "cell_type": "code", 23 | "execution_count": null, 24 | "metadata": { 25 | "id": "Br62wmV_3hnQ" 26 | }, 27 | "outputs": [], 28 | "source": [ 29 | "import json" 30 | ] 31 | }, 32 | { 33 | "cell_type": "markdown", 34 | "metadata": { 35 | "id": "MTXMHr6U3hnR" 36 | }, 37 | "source": [ 38 | "define functional group (FG) for each monomer \n", 39 | "objective functional group (FG):xxx_m \n", 40 | "incompatible FG:xxx_excl; CHO, N3, non-cyclic anhydride and non-cyclic imide are defined as inappropriate FGs for materials \n", 41 | "The number of same-class FGs per molecule is limited to 2 to 4 for poly-functionalized monomers. 
" 42 | ] 43 | }, 44 | { 45 | "cell_type": "markdown", 46 | "metadata": { 47 | "id": "_g1IlAB-3hnS" 48 | }, 49 | "source": [ 50 | "defined monomers, \n", 51 | " - vinylidene (terminal olefin) including acrylate; vinyl \n", 52 | " - epoxide (mono and poly); epo \n", 53 | " - epoxide (poly); diepo \n", 54 | " - cyclic olefin; cOle \n", 55 | " - lactone except gamma-butyrolactone; lactone\n", 56 | " - lactam; lactam \n", 57 | " - hydroxy carboxylic acid; hydCOOH \n", 58 | " - amino acid; aminCOOH\n", 59 | " - hindered phenol and thiophenol; hindPhenol \n", 60 | " - poly carboxylic acid and acid halide; diCOOH \n", 61 | " - polyol and thiol; diol \n", 62 | " - polyamine; diamin \n", 63 | " - polyisocyanate; diNCO \n", 64 | " - polycarboxylic acid anhydride; dicAnhyd " 65 | ] 66 | }, 67 | { 68 | "cell_type": "markdown", 69 | "metadata": { 70 | "id": "tCaXt7Y-3hnS" 71 | }, 72 | "source": [ 73 | "Detailed definition of olefin monomers,\n", 74 | " - acryl \n", 75 | " - beta-electron-withdrawing group substituted olefin \n", 76 | " - styryl \n", 77 | " - allyl \n", 78 | " - halogenated hydrocarbon \n", 79 | " - vinyl ester \n", 80 | " - maleimide derivatives \n", 81 | " - conjugated dienes \n", 82 | " - vinyl ether \n", 83 | " - beta-disubstituted aliphatic vinylidene olefin (tertcatCH) \n", 84 | " - aliphatic cyclic and chain olefins \n", 85 | " \n", 86 | "reference \n", 87 | "https://doi.org/10.11364/networkpolymer.30.234 \n", 88 | "https://doi.org/10.5059/yukigoseikyokaishi.22.20 " 89 | ] 90 | }, 91 | { 92 | "cell_type": "markdown", 93 | "metadata": { 94 | "id": "BloWlPrW3hnS" 95 | }, 96 | "source": [ 97 | "modify these dictionaries if you add or delete monomer species" 98 | ] 99 | }, 100 | { 101 | "cell_type": "markdown", 102 | "metadata": { 103 | "id": "7Nojbkts3hnT" 104 | }, 105 | "source": [ 106 | "for single monomer system; value 1- 49 \n", 107 | "for binary monomer system; value 51 - 99 \n", 108 | "for sequential polymerization of residual functional group (FG); 
201 - \n", 109 | "for detailed definition of olefin monomers; 1001 - \n", 110 | "1051 is an unused number for COC " 111 | ] 112 | }, 113 | { 114 | "cell_type": "code", 115 | "execution_count": null, 116 | "metadata": { 117 | "id": "bhn8-Kdj3hnT" 118 | }, 119 | "outputs": [], 120 | "source": [ 121 | "#1- : for addition, RO polymerization, self-condensation etc.\n", 122 | "#51- : for poly condensation etc.\n", 123 | "mon_dic = {\"vinyl\":1, \"epo\":2, \"diepo\":51, \"cOle\":3, \"lactone\":4, \"lactam\":5, \"hydCOOH\":6, \"aminCOOH\":7,\n", 124 | " \"hindPhenol\":8, \"cAnhyd\":9, \"CO\":10, \"HCHO\":11, \"sfonediX\":12, \"BzodiF\":13,\n", 125 | " \"diCOOH\":52, \"diol\":53, \"diamin\":54, \"diNCO\":55, \"dicAnhyd\":56, \"pridiamin\":57, \"diol_b\":58,\n", 126 | " \"acryl\":1001, \"bEWole\":1002, \"styryl\":1003, \"allyl\":1004, \"haloCH\":1005, \"vinylester\":1006,\n", 127 | " \"malei\":1007, \"conjdiene\":1020, \"vinylether\":1030, \"tertcatCH\":1031, \"cycCH\":1050, \"aliphCH\":1052, }\n", 128 | "mon_dic_inv = {v: k for k, v in mon_dic.items()}" 129 | ] 130 | }, 131 | { 132 | "cell_type": "code", 133 | "execution_count": null, 134 | "metadata": { 135 | "id": "XTeRgHJj3hnT" 136 | }, 137 | "outputs": [], 138 | "source": [ 139 | "#revise the following tuple if the definition of monomer species is modified\n", 140 | "#tuple ((value of mon_dic for addition, RO polymerization, self-condensation etc.),\n", 141 | "#(value of mon_dic for poly condensation etc. 
))\n", 142 | "mon_vals = ((1,2,3,4,5,6,7,8,9,10,11,12,13), (51,52,53,54,55,56,57,58), (200, 201, 202, 203, 204, 205, 206),\n", 143 | " (1001, 1002, 1003, 1004, 1005, 1006, 1007,1020, 1030, 1031, 1050, 1052))" 144 | ] 145 | }, 146 | { 147 | "cell_type": "code", 148 | "execution_count": null, 149 | "metadata": { 150 | "id": "tNVhGNc83hnU" 151 | }, 152 | "outputs": [], 153 | "source": [ 154 | "monL={}\n", 155 | "exclL={}" 156 | ] 157 | }, 158 | { 159 | "cell_type": "code", 160 | "execution_count": null, 161 | "metadata": { 162 | "id": "ESNcwLMD3hnU" 163 | }, 164 | "outputs": [], 165 | "source": [ 166 | "#modify this list if you add or delete monomer species\n", 167 | "#each list must have more than two elements.\n", 168 | "monL[0] = ()\n", 169 | "exclL[0] = ()" 170 | ] 171 | }, 172 | { 173 | "cell_type": "code", 174 | "execution_count": null, 175 | "metadata": { 176 | "id": "vAspxklI3hnU" 177 | }, 178 | "outputs": [], 179 | "source": [ 180 | "#definition of vinylidene monomer\n", 181 | "#objective FG: open chain terminal olefin including acrylate, F and / or Cl substituted olefin\n", 182 | "#excluded FG: -\n", 183 | "n=mon_dic['vinyl']\n", 184 | "monL[n]=('[CX3H2]=[CX3]', '[CX3](F)(F)=[CX3]', '[CX3;H1](F)=[CX3]',\n", 185 | " '[CX3](Cl)(Cl)=[CX3]', '[CX3;H1](Cl)=[CX3]', '[CX3](Cl)(F)=[CX3]')\n", 186 | "exclL[n]=('[CX3H1]=[O]', '[$(*-[NX2-]-[NX2+]#[NX1]),$(*-[NX2]=[NX2+]=[NX1-])]',\n", 187 | " '[CX3;!R](=[OX1])[OX2;!R][CX3;!R](=[OX1])', '[CX3;!R](=[OX1])[NX3;!R][CX3;!R](=[OX1])')" 188 | ] 189 | }, 190 | { 191 | "cell_type": "code", 192 | "execution_count": null, 193 | "metadata": { 194 | "id": "31KAS1C33hnU" 195 | }, 196 | "outputs": [], 197 | "source": [ 198 | "#definition of epoxide monomer (which has at least one epoxide)\n", 199 | "#objective FG: open chain terminal epoxide, alicyclic epoxide\n", 200 | "#excluded FG: prim- and sec-amine\n", 201 | "n=mon_dic['epo']\n", 202 | "monL[n]=('[CX4H2]1[O][CX4]1', '[c,CX4;R]-[CX4H1]1[O][CX4]1-[c,CX4;R]',\n", 203 | " 
'[CX4H1]1([F,Cl])[O][CX4]1', '[CX4]1([F,Cl])([F,Cl])[O][CX4]1', '[c,CX4;R]-[CX4]1([F,Cl])[O][CX4]1-[c,CX4;R]') #added c\n", 204 | "exclL[n]=('[N&X3;H2,H1;!$(N[C,S]=*)]',\n", 205 | " '[CX3H1]=[O]', '[$(*-[NX2-]-[NX2+]#[NX1]),$(*-[NX2]=[NX2+]=[NX1-])]',\n", 206 | " '[CX3;!R](=[OX1])[OX2;!R][CX3;!R](=[OX1])', '[CX3;!R](=[OX1])[NX3;!R][CX3;!R](=[OX1])')" 207 | ] 208 | }, 209 | { 210 | "cell_type": "code", 211 | "execution_count": null, 212 | "metadata": { 213 | "id": "UtCxDlek3hnV" 214 | }, 215 | "outputs": [], 216 | "source": [ 217 | "#definition of polyepoxide monomer\n", 218 | "#objective FG: open chain terminal epoxide, alicyclic epoxide\n", 219 | "#excluded FG: prim- and sec-amine\n", 220 | "n=mon_dic['diepo']\n", 221 | "monL[n]=('[CX4H2]1[O][CX4]1', '[c,CX4;R]-[CX4H1]1[O][CX4]1-[c,CX4;R]',\n", 222 | " '[CX4H1]1([F,Cl])[O][CX4]1', '[CX4]1([F,Cl])([F,Cl])[O][CX4]1', '[c,CX4;R]-[CX4]1([F,Cl])[O][CX4]1-[c,CX4;R]') #added c\n", 223 | "exclL[n]=('[N&X3;H2,H1;!$(N[C,S]=*)]', '[CX3H1]=[O]', '[$(*-[NX2-]-[NX2+]#[NX1]),$(*-[NX2]=[NX2+]=[NX1-])]',\n", 224 | " '[CX3;!R](=[OX1])[OX2;!R][CX3;!R](=[OX1])', '[CX3;!R](=[OX1])[NX3;!R][CX3;!R](=[OX1])')" 225 | ] 226 | }, 227 | { 228 | "cell_type": "code", 229 | "execution_count": null, 230 | "metadata": { 231 | "id": "fzvEYoA03hnV" 232 | }, 233 | "outputs": [], 234 | "source": [ 235 | "#definition of cycloolefin monomer\n", 236 | "#objective FG:\n", 237 | "#excluded FG: cyclic diene\n", 238 | "n=mon_dic['cOle']\n", 239 | "monL[n]=('[CX3;H1;R]=[CX3;H1;R]', '[CX3;H1;R]=[CX3;H0;R]')\n", 240 | "exclL[n]=('[CX3;R]=[CX3;R]-[CX3;R]=[CX3;R]', '[CX3H1]=[O]', '[$(*-[NX2-]-[NX2+]#[NX1]),$(*-[NX2]=[NX2+]=[NX1-])]',\n", 241 | " '[CX3;!R](=[OX1])[OX2;!R][CX3;!R](=[OX1])', '[CX3;!R](=[OX1])[NX3;!R][CX3;!R](=[OX1])')" 242 | ] 243 | }, 244 | { 245 | "cell_type": "code", 246 | "execution_count": null, 247 | "metadata": { 248 | "id": "tKglyB5m3hnV" 249 | }, 250 | "outputs": [], 251 | "source": [ 252 | "#definition of lactone monomer\n", 253 | 
"#objective FG: aliphatic and aromatic, dioxodioxane is included\n", 254 | "#excluded FG: gamma-lactone, cyclic acid anhydride\n", 255 | "n=mon_dic['lactone']\n", 256 | "monL[n]=('[C;R][OX2;R][CX3;R](=[OX1])[C;R]', '[c][OX2;R][CX3;R](=[OX1])[C;R]',\n", 257 | " '[OX2;R][CX3;R](=[OX1])[C;R][c]')\n", 258 | "exclL[n]=('[OX2;R]1[CX3;R](=[OX1])[C,c;R][C,c;R][C,c;R]1', '[c][OX2;R5][CX3;R5](=[OX1])',\n", 259 | " '[OX2;R5][CX3;R5](=[OX1])[c]', '[C,c][C;R](=[OX1])[O;R][C;R](=[OX1])[C,c]',\n", 260 | " '[CX3H1]=[O]', '[$(*-[NX2-]-[NX2+]#[NX1]),$(*-[NX2]=[NX2+]=[NX1-])]',\n", 261 | " '[CX3;!R](=[OX1])[OX2;!R][CX3;!R](=[OX1])', '[CX3;!R](=[OX1])[NX3;!R][CX3;!R](=[OX1])')" 262 | ] 263 | }, 264 | { 265 | "cell_type": "code", 266 | "execution_count": null, 267 | "metadata": { 268 | "id": "6lcRDpWY3hnW" 269 | }, 270 | "outputs": [], 271 | "source": [ 272 | "#definition of lactam monomer\n", 273 | "#objective FG: aliphatic and aromatic, N non-substituted\n", 274 | "#excluded FG: 5-membered ring, imide, isocyanurate\n", 275 | "n=mon_dic['lactam']\n", 276 | "monL[n]=('[C;R][NX3;H1;R][CX3;R](=[OX1])[C;R][C;R]', '[c][NX3;H1;R][CX3;R](=[OX1])[C;R]',\n", 277 | " '[NX3;H1;R][CX3;R](=[OX1])[C;R][c]')\n", 278 | "exclL[n]=('[NX3;R]1[CX3;R](=[OX1])[C,c;R][C,c;R][C,c;R]1', '[C,c;R5][NX3;R5][CX3;R5](=[OX1])[C,c;R5]',\n", 279 | " '[C;R](=[OX1])[N;R][C;R](=[OX1])', '[C;R][N;R][C;R](=[OX1])[N;R]',\n", 280 | " '[CX3H1]=[O]', '[$(*-[NX2-]-[NX2+]#[NX1]),$(*-[NX2]=[NX2+]=[NX1-])]',\n", 281 | " '[CX3;!R](=[OX1])[OX2;!R][CX3;!R](=[OX1])', '[CX3;!R](=[OX1])[NX3;!R][CX3;!R](=[OX1])')" 282 | ] 283 | }, 284 | { 285 | "cell_type": "code", 286 | "execution_count": null, 287 | "metadata": { 288 | "id": "gxItHngB3hnW" 289 | }, 290 | "outputs": [], 291 | "source": [ 292 | "#definition of hydroxy carboxylic acid monomer\n", 293 | "#objective FG:\n", 294 | "#excluded FG:\n", 295 | "n=mon_dic['hydCOOH']\n", 296 | "monL[n]=('[O&X2;H1;!$(OC=*)][C].[CX3](=[O])[OX2H1]', '[O&X2;H1;!$(OC=*)][c].[CX3](=[O])[OX2H1]')\n", 297 
| "exclL[n]=('[CX3H1]=[O]', '[$(*-[NX2-]-[NX2+]#[NX1]),$(*-[NX2]=[NX2+]=[NX1-])]',\n", 298 | " '[CX3;!R](=[OX1])[OX2;!R][CX3;!R](=[OX1])', '[CX3;!R](=[OX1])[NX3;!R][CX3;!R](=[OX1])')" 299 | ] 300 | }, 301 | { 302 | "cell_type": "code", 303 | "execution_count": null, 304 | "metadata": { 305 | "id": "IaFDj3Mj3hnW" 306 | }, 307 | "outputs": [], 308 | "source": [ 309 | "#definition of amino acid monomer\n", 310 | "#objective FG:\n", 311 | "#excluded FG:\n", 312 | "n=mon_dic['aminCOOH']\n", 313 | "monL[n]=(('[N&X3;H2,H1;!$(N[C,S]=*)][C].[CX3](=[O])[OX2H1]', '[N&X3;H2,H1;!$(N[C,S]=*)][c].[CX3](=[O])[OX2H1]'))\n", 314 | "exclL[n]=('[CX3H1]=[O]', '[$(*-[NX2-]-[NX2+]#[NX1]),$(*-[NX2]=[NX2+]=[NX1-])]',\n", 315 | " '[CX3;!R](=[OX1])[OX2;!R][CX3;!R](=[OX1])', '[CX3;!R](=[OX1])[NX3;!R][CX3;!R](=[OX1])')" 316 | ] 317 | }, 318 | { 319 | "cell_type": "code", 320 | "execution_count": null, 321 | "metadata": { 322 | "id": "8KEX4DE53hnW" 323 | }, 324 | "outputs": [], 325 | "source": [ 326 | "#definition of hindered phenol monomer\n", 327 | "#objective FG: o-disubstituted and p-unsubstituted phenol\n", 328 | "#excluded FG: amine, halogenated compound\n", 329 | "n=mon_dic['hindPhenol']\n", 330 | "monL[n]=('[c]1([OX2H1])[c]([C])[c][cX3H1][c][c]1([C])', )\n", 331 | "exclL[n]=('[N&X3;H2,H1,H0;!$(N[C,S]=*)]', '[F,Cl,Br,I]',\n", 332 | " '[CX3H1]=[O]', '[$(*-[NX2-]-[NX2+]#[NX1]),$(*-[NX2]=[NX2+]=[NX1-])]',\n", 333 | " '[CX3;!R](=[OX1])[OX2;!R][CX3;!R](=[OX1])', '[CX3;!R](=[OX1])[NX3;!R][CX3;!R](=[OX1])')" 334 | ] 335 | }, 336 | { 337 | "cell_type": "code", 338 | "execution_count": null, 339 | "metadata": { 340 | "id": "1RRd7AjE3hnX" 341 | }, 342 | "outputs": [], 343 | "source": [ 344 | "#definition of cyclic carboxylic acid anhydride monomer\n", 345 | "#objective FG: cyclic carboxylic acid anhydride\n", 346 | "#excluded FG: oxazoline\n", 347 | "n=mon_dic['cAnhyd']\n", 348 | "monL[n]=('[C;R][C;R;X3](=[O])[O;R][C;R;X3](=[O])[C;R]', '[c][CX3,c;R](=[O])[O,o;R][CX3,c;R](=[O])[c]')\n", 349 | 
"exclL[n]=('[NX2;R][C;R;X3](=[O])[O;R][C;R;X3](=[O])[C;R]', '[NX3;R][C;R;X3](=[O])[O;R][C;R;X3](=[O])[C;R]',\n", 350 | " '[CX3H1]=[O]', '[$(*-[NX2-]-[NX2+]#[NX1]),$(*-[NX2]=[NX2+]=[NX1-])]',\n", 351 | " '[CX3;!R](=[OX1])[OX2;!R][CX3;!R](=[OX1])', '[CX3;!R](=[OX1])[NX3;!R][CX3;!R](=[OX1])')" 352 | ] 353 | }, 354 | { 355 | "cell_type": "code", 356 | "execution_count": null, 357 | "metadata": { 358 | "id": "KtjCYzF_3hnX" 359 | }, 360 | "outputs": [], 361 | "source": [ 362 | "#definition of carbon monoxide for carbonate\n", 363 | "#objective FG: CO\n", 364 | "#excluded FG: -\n", 365 | "n=mon_dic['CO']\n", 366 | "monL[n]=('[C-]#[O+]',)\n", 367 | "exclL[n]=('[CX3H1]=[O]', '[$(*-[NX2-]-[NX2+]#[NX1]),$(*-[NX2]=[NX2+]=[NX1-])]',\n", 368 | " '[CX3;!R](=[OX1])[OX2;!R][CX3;!R](=[OX1])', '[CX3;!R](=[OX1])[NX3;!R][CX3;!R](=[OX1])')" 369 | ] 370 | }, 371 | { 372 | "cell_type": "code", 373 | "execution_count": null, 374 | "metadata": { 375 | "id": "Yx4vNrvM3hnX" 376 | }, 377 | "outputs": [], 378 | "source": [ 379 | "#definition of formaldehyde for phenol / melamine resin\n", 380 | "#objective FG: HCHO\n", 381 | "#excluded FG: -\n", 382 | "n=mon_dic['HCHO']\n", 383 | "monL[n]=('[CX3;H2]=[OX1]',)\n", 384 | "exclL[n]=('[CX3H1]=[O]', '[$(*-[NX2-]-[NX2+]#[NX1]),$(*-[NX2]=[NX2+]=[NX1-])]',\n", 385 | " '[CX3;!R](=[OX1])[OX2;!R][CX3;!R](=[OX1])', '[CX3;!R](=[OX1])[NX3;!R][CX3;!R](=[OX1])')" 386 | ] 387 | }, 388 | { 389 | "cell_type": "code", 390 | "execution_count": null, 391 | "metadata": { 392 | "id": "7bw0O12C3hnX" 393 | }, 394 | "outputs": [], 395 | "source": [ 396 | "#definition of polycarboxylic acid monomer including acid chloride\n", 397 | "#objective FG: aliphatic prim- and sec-carboxylic acid, aromatic carboxylic acid\n", 398 | "#excluded FG: alcohol, amine and tert-carboxylic acid\n", 399 | "n=mon_dic['diCOOH']\n", 400 | "monL[n]=('[CX4H2][C](=[O])[OH1]', '[CX4H1][C](=[O])[OH1]', '[c][C](=O)[OH1]',\n", 401 | " '[CX4H2][C](=[O])[Cl,Br]', '[CX4H1][C](=[O])[Cl,Br]', 
'[c][C](=O)[Cl,Br]')\n", 402 | "exclL[n]=('[CX4H2][OH1]', '[CX4H1][OH1]', '[CX4H0][OH1]', '[CX4H0][C](=[O])[OH1]', '[CX4H0][C](=[O])[Cl,Br]',\n", 403 | " '[N&X3;H2,H1;!$(N[C,S]=*)]',\n", 404 | " '[CX3H1]=[O]', '[$(*-[NX2-]-[NX2+]#[NX1]),$(*-[NX2]=[NX2+]=[NX1-])]',\n", 405 | " '[CX3;!R](=[OX1])[OX2;!R][CX3;!R](=[OX1])', '[CX3;!R](=[OX1])[NX3;!R][CX3;!R](=[OX1])')" 406 | ] 407 | }, 408 | { 409 | "cell_type": "code", 410 | "execution_count": null, 411 | "metadata": { 412 | "id": "NK1ptnlB3hnY" 413 | }, 414 | "outputs": [], 415 | "source": [ 416 | "#definition of polyol monomer including thiol\n", 417 | "#objective FG: phenol, aliphatic prim- and sec-alcohol\n", 418 | "#excluded FG: carboxylic acid, non-cyclic tert-alcohol, sugar, amine\n", 419 | "n=mon_dic['diol']\n", 420 | "monL[n]=('[CX4H1][OX2,SX2;H1]', '[CX4H2][OX2,SX2;H1]', '[c][OX2,SX2;H1]', '[CX4;H2,H1,c]([OX2,SX2;H1])[OX2,SX2;H1]') #added 2024/01\n", 421 | "exclL[n]=('[CX3H0](=[O])[OH1]', '[CX4H0;!R]([C])([C])[O,S;H1]', '[CX4H1;R]([OH1])[CX4H1;R]([OH1])[O;R]',\n", 422 | " '[N&X3;H2,H1;!$(N[C,S]=*)]',\n", 423 | " '[CX3H1]=[O]', '[$(*-[NX2-]-[NX2+]#[NX1]),$(*-[NX2]=[NX2+]=[NX1-])]',\n", 424 | " '[CX3;!R](=[OX1])[OX2;!R][CX3;!R](=[OX1])', '[CX3;!R](=[OX1])[NX3;!R][CX3;!R](=[OX1])')" 425 | ] 426 | }, 427 | { 428 | "cell_type": "code", 429 | "execution_count": null, 430 | "metadata": { 431 | "id": "jdyJFz8w3hnY" 432 | }, 433 | "outputs": [], 434 | "source": [ 435 | "#definition of polyamine monomer\n", 436 | "#objective FG: prim- and sec-amine (aliphatic and aromatic)\n", 437 | "#excluded FG: carboxylic acid, alcohol, amide\n", 438 | "n=mon_dic['diamin']\n", 439 | "monL[n]=('[C][N&X3;H2;!$(N[C,S]=*)]', '[c][N&X3;H2;!$(N[C,S]=*)]', '[C,c][N&X3;H1;!$(N[C,S]=*)][C,c]',\n", 440 | " '[N&X3;H2;!$(N[C,S]=*)][C][N&X3;H2;!$(N[C,S]=*)]') #added 2024/01\n", 441 | "exclL[n]=('[CX3H0](=[O])[OH1]', '[CX4H0]([C])([C])[OH1]', '[CX4H1;R]([OH1])[CX4H1;R]([OH1])[O;R]',\n", 442 | " '[C][OX2H]', '[C][SX2H]',\n", 443 | " '[CX3H1]=[O]', 
'[$(*-[NX2-]-[NX2+]#[NX1]),$(*-[NX2]=[NX2+]=[NX1-])]',\n", 444 | " '[CX3;!R](=[OX1])[OX2;!R][CX3;!R](=[OX1])', '[CX3;!R](=[OX1])[NX3;!R][CX3;!R](=[OX1])')" 445 | ] 446 | }, 447 | { 448 | "cell_type": "code", 449 | "execution_count": null, 450 | "metadata": { 451 | "id": "E53Vx_-v3hnY" 452 | }, 453 | "outputs": [], 454 | "source": [ 455 | "#definition of polyisocyanate monomer\n", 456 | "#objective FG: aliphatic and aromatic isocyanate and isothiocyanate\n", 457 | "#excluded FG: -\n", 458 | "n=mon_dic['diNCO']\n", 459 | "monL[n]=('[C]-[NX2]=[CX2]=[O,S;X1]', '[c]-[NX2]=[CX2]=[O,S;X1]')\n", 460 | "exclL[n]=('[CX3H1]=[O]', '[$(*-[NX2-]-[NX2+]#[NX1]),$(*-[NX2]=[NX2+]=[NX1-])]',\n", 461 | " '[CX3;!R](=[OX1])[OX2;!R][CX3;!R](=[OX1])', '[CX3;!R](=[OX1])[NX3;!R][CX3;!R](=[OX1])')" 462 | ] 463 | }, 464 | { 465 | "cell_type": "code", 466 | "execution_count": null, 467 | "metadata": { 468 | "id": "0qzXWNYk3hnY" 469 | }, 470 | "outputs": [], 471 | "source": [ 472 | "#definition of poly (cyclic carboxylic acid anhydride) monomer\n", 473 | "#objective FG: poly (cyclic carboxylic acid anhydride)\n", 474 | "#excluded FG: oxazoline\n", 475 | "n=mon_dic['dicAnhyd']\n", 476 | "monL[n]=('[C;R][C;R;X3](=[OX1])[OX2;R][C;R;X3](=[OX1])[C;R]', '[c][CX3,c;R](=[OX1])[OX2,o;R][CX3,c;R](=[OX1])[c]')\n", 477 | "exclL[n]=('[NX2;R][C;R;X3](=[O])[O;R][C;R;X3](=[O])[C;R]', '[NX3;R][C;R;X3](=[O])[O;R][C;R;X3](=[O])[C;R]',\n", 478 | " '[CX3H1]=[O]', '[$(*-[NX2-]-[NX2+]#[NX1]),$(*-[NX2]=[NX2+]=[NX1-])]',\n", 479 | " '[CX3;!R](=[OX1])[OX2;!R][CX3;!R](=[OX1])', '[CX3;!R](=[OX1])[NX3;!R][CX3;!R](=[OX1])')" 480 | ] 481 | }, 482 | { 483 | "cell_type": "code", 484 | "execution_count": null, 485 | "metadata": { 486 | "id": "vKGQwEtV3hnZ" 487 | }, 488 | "outputs": [], 489 | "source": [ 490 | "#definition of primary polyamine monomer\n", 491 | "#objective FG: prim-amine (aliphatic and aromatic)\n", 492 | "#excluded FG: carboxylic acid, alcohol, amide\n", 493 | "n=mon_dic['pridiamin']\n", 494 | 
"monL[n]=('[C][N&X3;H2;!$(N[C,S]=*)]', '[c][N&X3;H2;!$(N[C,S]=*)]')\n", 495 | "exclL[n]=('[CX3H0](=[O])[OH1]', '[CX3H0](=[O])[NX3]', '[CX4][OH1]', '[c][OH1]',\n", 496 | " '[CX3H1]=[O]', '[$(*-[NX2-]-[NX2+]#[NX1]),$(*-[NX2]=[NX2+]=[NX1-])]',\n", 497 | " '[CX3;!R](=[OX1])[OX2;!R][CX3;!R](=[OX1])', '[CX3;!R](=[OX1])[NX3;!R][CX3;!R](=[OX1])')" 498 | ] 499 | }, 500 | { 501 | "cell_type": "code", 502 | "execution_count": null, 503 | "metadata": { 504 | "id": "LlfPoU-A3hnZ" 505 | }, 506 | "outputs": [], 507 | "source": [ 508 | "#definition of diArsulfone monomer\n", 509 | "#objective FG: p-halogenated aryl sulfone\n", 510 | "#excluded FG: carbonyl with oxygen (acid, ester etc), amide, alcohol, prim- and sec-amine, nitrile, tert-alkyl\n", 511 | "n=mon_dic['sfonediX']\n", 512 | "monL[n]=('[c]1[c][c]([F,Cl,Br,I])[c][c][c]1[SX4](=[OX1])(=[OX1])[c]2[c][c][c]([F,Cl,Br,I])[c][c]2', )\n", 513 | "exclL[n]=('[CX3](=[OX1])[OX2]', '[CX3H0](=[O])[NX3]', '[CX4][OH1]', '[c][OH1]',\n", 514 | " '[CX4][N&X3;H2;!$(N[C,S]=*)]', '[c][N&X3;H2;!$(N[C,S]=*)]', '[NX1]#[CX2]', '[CX4]([C,c])([C,c])[C,c]',\n", 515 | " '[CX3H1]=[O]', '[$(*-[NX2-]-[NX2+]#[NX1]),$(*-[NX2]=[NX2+]=[NX1-])]',\n", 516 | " '[CX3;!R](=[OX1])[OX2;!R][CX3;!R](=[OX1])', '[CX3;!R](=[OX1])[NX3;!R][CX3;!R](=[OX1])')" 517 | ] 518 | }, 519 | { 520 | "cell_type": "code", 521 | "execution_count": null, 522 | "metadata": { 523 | "id": "GvHFZTes3hnZ" 524 | }, 525 | "outputs": [], 526 | "source": [ 527 | "#definition of benzophenone-p-diF monomer\n", 528 | "#objective FG: p-fluorinated benzophenone\n", 529 | "#excluded FG: carbonyl with oxygen (acid, ester etc), amide, alcohol, prim- and sec-amine, nitrile, tert-alkyl\n", 530 | "n=mon_dic['BzodiF']\n", 531 | "monL[n]=('[c]1[c][c]([F])[c][c][c]1[CX3](=[OX1])[c]2[c][c][c]([F])[c][c]2', )\n", 532 | "exclL[n]=('[CX3](=[OX1])[OX2]', '[CX3H0](=[O])[NX3]', '[CX4][OH1]', '[c][OH1]',\n", 533 | " '[CX4][N&X3;H2;!$(N[C,S]=*)]', '[c][N&X3;H2;!$(N[C,S]=*)]', '[NX1]#[CX2]', 
'[CX4]([C,c])([C,c])[C,c]',\n", 534 | " '[CX3H1]=[O]', '[$(*-[NX2-]-[NX2+]#[NX1]),$(*-[NX2]=[NX2+]=[NX1-])]',\n", 535 | " '[CX3;!R](=[OX1])[OX2;!R][CX3;!R](=[OX1])', '[CX3;!R](=[OX1])[NX3;!R][CX3;!R](=[OX1])')" 536 | ] 537 | }, 538 | { 539 | "cell_type": "code", 540 | "execution_count": null, 541 | "metadata": { 542 | "id": "fv7KJoyw3hna" 543 | }, 544 | "outputs": [], 545 | "source": [ 546 | "#definition of polyol monomer (thiols excluded) for alkaline condensation\n", 547 | "#objective FG: phenol, aliphatic prim- and sec-alcohol\n", 548 | "#excluded FG: carbonyl with oxygen (acid, ester etc), non-cyclic tert-alcohol, thiol, sugar, amine, nitrile, epoxide\n", 549 | "n=mon_dic['diol_b']\n", 550 | "monL[n]=('[CX4H1][OX2;H1]', '[CX4H2][OX2;H1]', '[c][OX2;H1]')\n", 551 | "exclL[n]=('[CX3](=[OX1])[OX2]', '[CX4H0;!R]([C])([C])[O,S;H1]', '[CX4H1;R]([OH1])[CX4H1;R]([OH1])[O;R]',\n", 552 | " '[CX4H1][SX2;H1]', '[CX4H2][SX2;H1]', '[c][SX2;H1]',\n", 553 | " '[N&X3;H2,H1;!$(N[C,S]=*)]', '[CX3H0](=[O])[NX3]', '[NX1]#[CX2]', '[CX4]1[OX2][CX4]1',\n", 554 | " '[CX3H1]=[O]', '[$(*-[NX2-]-[NX2+]#[NX1]),$(*-[NX2]=[NX2+]=[NX1-])]',\n", 555 | " '[CX3;!R](=[OX1])[OX2;!R][CX3;!R](=[OX1])', '[CX3;!R](=[OX1])[NX3;!R][CX3;!R](=[OX1])')" 556 | ] 557 | }, 558 | { 559 | "cell_type": "code", 560 | "execution_count": null, 561 | "metadata": { 562 | "id": "MYOYXKUM3hna" 563 | }, 564 | "outputs": [], 565 | "source": [ 566 | "#definition of acryl monomer\n", 567 | "#objective FG: (meth)acrylates, including F-substituted olefins\n", 568 | "#excluded FG: -\n", 569 | "n=mon_dic['acryl']\n", 570 | "monL[n]=('[CX3H2;!R:1]=[CX3H1;!R:2][CX3;!R](=[OX1])[OX2,SX2,NX3:3]',\n", 571 | " '[CX3H2;!R:1]=[CX3;!R:2]([CX4;H3])[CX3;!R](=[OX1])[OX2,SX2,NX3:3]',\n", 572 | " '[CX3H2;!R:1]=[CX3;!R:2]([F,Cl,Br:4])[CX3;!R](=[OX1])[OX2,SX2,NX3:3]',\n", 573 | " '[CX3H2;!R:1]=[CX3;!R:2]([CX4;H2][F,Cl:4])[CX3;!R](=[OX1])[OX2,SX2,NX3:3]',\n", 574 | " 
'[CX3H2;!R:1]=[CX3;!R:2]([CX4;H1]([F,Cl:4])[F,Cl:5])[CX3;!R](=[OX1])[OX2,SX2,NX3:3]',\n", 575 | " '[CX3H2;!R:1]=[CX3;!R:2]([CX4;H0]([F])([F])[F])[CX3;!R](=[OX1])[OX2,SX2,NX3:3]',\n", 576 | " '[CX3H2;!R:1]=[CX3;!R:2]([CX4;H2][c:4])[CX3;!R](=[OX1])[OX2,SX2,NX3:3]',\n", 577 | " '[CX3H2;!R:1]=[CX3;!R:2]([CX4;H1]([CX4;H3])[CX4;H3])[CX3;!R](=[OX1])[OX2,SX2,NX3:3]',\n", 578 | " '[CX3H2;!R:1]=[CX3;!R:2]([CX4;H1]([CX4;H3])[CX4;H2:4][CX3;!R](=[OX1])[OX2,SX2,NX3:4])[CX3;!R](=[OX1])[OX2,SX2,NX3:3]',\n", 579 | " '[CX3H2;!R:1]=[CX3;!R:2]([CX2]#[NX1])[CX3;!R](=[OX1])[OX2,SX2,NX3:3]',\n", 580 | " '[CX3H2;!R:1]=[CX3;!R:2]([NX2]=[CX2]=[OX1])[CX3;!R](=[OX1])[OX2,SX2,NX3:3]',\n", 581 | " '[CX3H2;!R:1]=[CX3;!R:2]([SX4](=[OX1])(=[OX1])[OX2:4])[CX3;!R](=[OX1])[OX2,SX2,NX3:3]',\n", 582 | " '[CX3H2;!R:1]=[CX3;!R:2]([PX4](=[OX1])(=[OX1])[OX2:4])[CX3;!R](=[OX1])[OX2,SX2,NX3:3]',\n", 583 | " '[CX3;H0;!R:1]([F])([F])=[CX3H1;!R:2][CX3;!R](=[OX1])[OX2,SX2,NX3:3]',\n", 584 | " '[CX3;H0;!R:1]([F])([F])=[CX3;!R:2]([CX4;H3])[CX3;!R](=[OX1])[OX2,SX2,NX3:3]',\n", 585 | " '[CX3;H0;!R:1]([F])([F])=[CX3;!R:2]([F,Cl,Br:4])[CX3;!R](=[OX1])[OX2,SX2,NX3:3]',\n", 586 | " '[CX3;H0;!R:1]([F])([F])=[CX3;!R:2]([CX4;H2][F,Cl:4])[CX3;!R](=[OX1])[OX2,SX2,NX3:3]',\n", 587 | " '[CX3;H0;!R:1]([F])([F])=[CX3;!R:2]([CX4;H1]([F,Cl:4])[F,Cl:5])[CX3;!R](=[OX1])[OX2,SX2,NX3:3]',\n", 588 | " '[CX3;H0;!R:1]([F])([F])=[CX3;!R:2]([CX4;H0]([F])([F])[F])[CX3;!R](=[OX1])[OX2,SX2,NX3:3]',\n", 589 | " '[CX3;H0;!R:1]([F])([F])=[CX3;!R:2]([CX4;H2][c:4])[CX3;!R](=[OX1])[OX2,SX2,NX3:3]',\n", 590 | " '[CX3;H0;!R:1]([F])([F])=[CX3;!R:2]([CX4;H1]([CX4;H3])[CX4;H3])[CX3;!R](=[OX1])[OX2,SX2,NX3:3]',\n", 591 | " '[CX3;H0;!R:1]([F])([F])=[CX3;!R:2]([CX4;H1]([CX4;H3])[CX4;H2:4][CX3;!R](=[OX1])[OX2,SX2,NX3:4])[CX3;!R](=[OX1])[OX2,SX2,NX3:3]',\n", 592 | " '[CX3;H0;!R:1]([F])([F])=[CX3;!R:2]([CX2]#[NX1])[CX3;!R](=[OX1])[OX2,SX2,NX3:3]',\n", 593 | " '[CX3;H0;!R:1]([F])([F])=[CX3;!R:2]([NX2]=[CX2]=[OX1])[CX3;!R](=[OX1])[OX2,SX2,NX3:3]',\n", 594 | " 
'[CX3;H0;!R:1]([F])([F])=[CX3;!R:2]([SX4](=[OX1])(=[OX1])[OX2:4])[CX3;!R](=[OX1])[OX2,SX2,NX3:3]',\n", 595 | " '[CX3;H0;!R:1]([F])([F])=[CX3;!R:2]([PX4](=[OX1])(=[OX1])[OX2:4])[CX3;!R](=[OX1])[OX2,SX2,NX3:3]',\n", 596 | " '[CX3;H1;!R:1]([F])=[CX3H1;!R:2][CX3;!R](=[OX1])[OX2,SX2,NX3:3]',\n", 597 | " '[CX3;H1;!R:1]([F])=[CX3;!R:2]([CX4;H3])[CX3;!R](=[OX1])[OX2,SX2,NX3:3]',\n", 598 | " '[CX3;H1;!R:1]([F])=[CX3;!R:2]([F,Cl,Br:4])[CX3;!R](=[OX1])[OX2,SX2,NX3:3]',\n", 599 | " '[CX3;H1;!R:1]([F])=[CX3;!R:2]([CX4;H2][F,Cl:4])[CX3;!R](=[OX1])[OX2,SX2,NX3:3]',\n", 600 | " '[CX3;H1;!R:1]([F])=[CX3;!R:2]([CX4;H1]([F,Cl:4])[F,Cl:5])[CX3;!R](=[OX1])[OX2,SX2,NX3:3]',\n", 601 | " '[CX3;H1;!R:1]([F])=[CX3;!R:2]([CX4;H0]([F])([F])[F])[CX3;!R](=[OX1])[OX2,SX2,NX3:3]',\n", 602 | " '[CX3;H1;!R:1]([F])=[CX3;!R:2]([CX4;H2][c:4])[CX3;!R](=[OX1])[OX2,SX2,NX3:3]',\n", 603 | " '[CX3;H1;!R:1]([F])=[CX3;!R:2]([CX4;H1]([CX4;H3])[CX4;H3])[CX3;!R](=[OX1])[OX2,SX2,NX3:3]',\n", 604 | " '[CX3;H1;!R:1]([F])=[CX3;!R:2]([CX4;H1]([CX4;H3])[CX4;H2:4][CX3;!R](=[OX1])[OX2,SX2,NX3:4])[CX3;!R](=[OX1])[OX2,SX2,NX3:3]',\n", 605 | " '[CX3;H1;!R:1]([F])=[CX3;!R:2]([CX2]#[NX1])[CX3;!R](=[OX1])[OX2,SX2,NX3:3]',\n", 606 | " '[CX3;H1;!R:1]([F])=[CX3;!R:2]([NX2]=[CX2]=[OX1])[CX3;!R](=[OX1])[OX2,SX2,NX3:3]',\n", 607 | " '[CX3;H1;!R:1]([F])=[CX3;!R:2]([SX4](=[OX1])(=[OX1])[OX2:4])[CX3;!R](=[OX1])[OX2,SX2,NX3:3]',\n", 608 | " '[CX3;H1;!R:1]([F])=[CX3;!R:2]([PX4](=[OX1])(=[OX1])[OX2:4])[CX3;!R](=[OX1])[OX2,SX2,NX3:3]' )\n", 609 | "exclL[n]=('[CX4]-[I,Br]', '[CX3H1]=[O]', '[$(*-[NX2-]-[NX2+]#[NX1]),$(*-[NX2]=[NX2+]=[NX1-])]',\n", 610 | " '[CX3;!R](=[OX1])[OX2;!R][CX3;!R](=[OX1])', '[CX3;!R](=[OX1])[NX3;!R][CX3;!R](=[OX1])',\n", 611 | " '[CX4]-[NX2]=[NX2]-[CX4]', '[OX2]-[OX2]', '[OX2H][cX3]:[c]', '[CX3]1(=[OX1])[CX3]=[CX3][CX3](=[OX1])[CX3]=[CX2]1')" 612 | ] 613 | }, 614 | { 615 | "cell_type": "code", 616 | "execution_count": null, 617 | "metadata": { 618 | "id": "mGLfl8Kk3hnb" 619 | }, 620 | "outputs": [], 621 | "source": 
[ 622 | "#definition of beta-electron withdrawing group substituted olefin monomer\n", 623 | "#objective FG: (meth)acryl nitriles, sulfonates, phosphates, isocyanates\n", 624 | "#excluded FG: -\n", 625 | "n=mon_dic['bEWole']\n", 626 | "monL[n]=('[CX3H2;!R:1]=[CX3H1;!R:2][CX2]#[NX1]',\n", 627 | " '[CX3H2;!R:1]=[CX3H1;!R:2][NX2]=[CX2]=[OX1]',\n", 628 | " '[CX3H2;!R:1]=[CX3H1;!R:2][SX4](=[OX1])(=[OX1])[OX2:3]',\n", 629 | " '[CX3H2;!R:1]=[CX3H1;!R:2][PX4](=[OX1])(=[OX1])[OX2:3]',\n", 630 | " '[CX3H2;!R:1]=[CX3;!R:2]([CX4;H3])[CX2]#[NX1]',\n", 631 | " '[CX3H2;!R:1]=[CX3;!R:2]([CX4;H3])[NX2]=[CX2]=[OX1]',\n", 632 | " '[CX3H2;!R:1]=[CX3;!R:2]([CX4;H3])[SX4](=[OX1])(=[OX1])[OX2:3]',\n", 633 | " '[CX3H2;!R:1]=[CX3;!R:2]([CX4;H3])[PX4](=[OX1])(=[OX1])[OX2:3]',\n", 634 | " '[CX3H2;!R:1]=[CX3;!R:2]([F,Cl,Br:4])[CX2]#[NX1]',\n", 635 | " '[CX3H2;!R:1]=[CX3;!R:2]([F,Cl,Br:4])[NX2]=[CX2]=[OX1]',\n", 636 | " '[CX3H2;!R:1]=[CX3;!R:2]([F,Cl,Br:4])[SX4](=[OX1])(=[OX1])[OX2:3]',\n", 637 | " '[CX3H2;!R:1]=[CX3;!R:2]([F,Cl,Br:4])[PX4](=[OX1])(=[OX1])[OX2:3]',\n", 638 | " '[CX3H2;!R:1]=[CX3;!R:2]([CX4;H2][F,Cl:4])[CX2]#[NX1]',\n", 639 | " '[CX3H2;!R:1]=[CX3;!R:2]([CX4;H2][F,Cl:4])[NX2]=[CX2]=[OX1]',\n", 640 | " '[CX3H2;!R:1]=[CX3;!R:2]([CX4;H2][F,Cl:4])[SX4](=[OX1])(=[OX1])[OX2:3]',\n", 641 | " '[CX3H2;!R:1]=[CX3;!R:2]([CX4;H2][F,Cl:4])[PX4](=[OX1])(=[OX1])[OX2:3]',\n", 642 | " '[CX3H2;!R:1]=[CX3;!R:2]([CX4;H1]([F,Cl:4])[F,Cl:5])[CX2]#[NX1]',\n", 643 | " '[CX3H2;!R:1]=[CX3;!R:2]([CX4;H1]([F,Cl:4])[F,Cl:5])[NX2]=[CX2]=[OX1]',\n", 644 | " '[CX3H2;!R:1]=[CX3;!R:2]([CX4;H1]([F,Cl:4])[F,Cl:5])[SX4](=[OX1])(=[OX1])[OX2:3]',\n", 645 | " '[CX3H2;!R:1]=[CX3;!R:2]([CX4;H1]([F,Cl:4])[F,Cl:5])[PX4](=[OX1])(=[OX1])[OX2:3]',\n", 646 | " '[CX3H2;!R:1]=[CX3;!R:2]([CX4;H0]([F])([F])[F])[CX2]#[NX1]',\n", 647 | " '[CX3H2;!R:1]=[CX3;!R:2]([CX4;H0]([F])([F])[F])[NX2]=[CX2]=[OX1]',\n", 648 | " '[CX3H2;!R:1]=[CX3;!R:2]([CX4;H0]([F])([F])[F])[SX4](=[OX1])(=[OX1])[OX2:3]',\n", 649 | " 
'[CX3H2;!R:1]=[CX3;!R:2]([CX4;H0]([F])([F])[F])[PX4](=[OX1])(=[OX1])[OX2:3]',\n", 650 | " '[CX3H2;!R:1]=[CX3;!R:2]([CX4;H2][c:4])[CX2]#[NX1]',\n", 651 | " '[CX3H2;!R:1]=[CX3;!R:2]([CX4;H2][c:4])[NX2]=[CX2]=[OX1]',\n", 652 | " '[CX3H2;!R:1]=[CX3;!R:2]([CX4;H2][c:4])[SX4](=[OX1])(=[OX1])[OX2:3]',\n", 653 | " '[CX3H2;!R:1]=[CX3;!R:2]([CX4;H2][c:4])[PX4](=[OX1])(=[OX1])[OX2:3]',\n", 654 | " '[CX3H2;!R:1]=[CX3;!R:2]([CX4;H1]([CX4;H3])[CX4;H3])[CX2]#[NX1]',\n", 655 | " '[CX3H2;!R:1]=[CX3;!R:2]([CX4;H1]([CX4;H3])[CX4;H3])[NX2]=[CX2]=[OX1]',\n", 656 | " '[CX3H2;!R:1]=[CX3;!R:2]([CX4;H1]([CX4;H3])[CX4;H3])[SX4](=[OX1])(=[OX1])[OX2:3]',\n", 657 | " '[CX3H2;!R:1]=[CX3;!R:2]([CX4;H1]([CX4;H3])[CX4;H3])[PX4](=[OX1])(=[OX1])[OX2:3]',\n", 658 | " '[CX3H2;!R:1]=[CX3;!R:2]([CX4;H1]([CX4;H3])[CX4;H2:4][CX3;!R](=[OX1])[OX2,SX2,NX3:4])[CX2]#[NX1]',\n", 659 | " '[CX3H2;!R:1]=[CX3;!R:2]([CX4;H1]([CX4;H3])[CX4;H2:4][CX3;!R](=[OX1])[OX2,SX2,NX3:4])[NX2]=[CX2]=[OX1]',\n", 660 | " '[CX3H2;!R:1]=[CX3;!R:2]([CX4;H1]([CX4;H3])[CX4;H2:4][CX3;!R](=[OX1])[OX2,SX2,NX3:4])[SX4](=[OX1])(=[OX1])[OX2:3]',\n", 661 | " '[CX3H2;!R:1]=[CX3;!R:2]([CX4;H1]([CX4;H3])[CX4;H2:4][CX3;!R](=[OX1])[OX2,SX2,NX3:4])[PX4](=[OX1])(=[OX1])[OX2:3]',\n", 662 | " '[CX3H2;!R:1]=[CX3;!R:2]([CX2]#[NX1])[CX2]#[NX1]',\n", 663 | " '[CX3H2;!R:1]=[CX3;!R:2]([CX2]#[NX1])[NX2]=[CX2]=[OX1]',\n", 664 | " '[CX3H2;!R:1]=[CX3;!R:2]([CX2]#[NX1])[SX4](=[OX1])(=[OX1])[OX2:3]',\n", 665 | " '[CX3H2;!R:1]=[CX3;!R:2]([CX2]#[NX1])[PX4](=[OX1])(=[OX1])[OX2:3]',\n", 666 | " '[CX3H2;!R:1]=[CX3;!R:2]([NX2]=[CX2]=[OX1])[CX2]#[NX1]',\n", 667 | " '[CX3H2;!R:1]=[CX3;!R:2]([NX2]=[CX2]=[OX1])[NX2]=[CX2]=[OX1]',\n", 668 | " '[CX3H2;!R:1]=[CX3;!R:2]([NX2]=[CX2]=[OX1])[SX4](=[OX1])(=[OX1])[OX2:3]',\n", 669 | " '[CX3H2;!R:1]=[CX3;!R:2]([NX2]=[CX2]=[OX1])[PX4](=[OX1])(=[OX1])[OX2:3]',\n", 670 | " '[CX3H2;!R:1]=[CX3;!R:2]([SX4](=[OX1])(=[OX1])[OX2:4])[CX2]#[NX1]',\n", 671 | " '[CX3H2;!R:1]=[CX3;!R:2]([SX4](=[OX1])(=[OX1])[OX2:4])[NX2]=[CX2]=[OX1]',\n", 672 | " 
'[CX3H2;!R:1]=[CX3;!R:2]([SX4](=[OX1])(=[OX1])[OX2:4])[SX4](=[OX1])(=[OX1])[OX2:3]',\n", 673 | " '[CX3H2;!R:1]=[CX3;!R:2]([SX4](=[OX1])(=[OX1])[OX2:4])[PX4](=[OX1])(=[OX1])[OX2:3]',\n", 674 | " '[CX3H2;!R:1]=[CX3;!R:2]([PX4](=[OX1])(=[OX1])[OX2:4])[CX2]#[NX1]',\n", 675 | " '[CX3H2;!R:1]=[CX3;!R:2]([PX4](=[OX1])(=[OX1])[OX2:4])[NX2]=[CX2]=[OX1]',\n", 676 | " '[CX3H2;!R:1]=[CX3;!R:2]([PX4](=[OX1])(=[OX1])[OX2:4])[SX4](=[OX1])(=[OX1])[OX2:3]',\n", 677 | " '[CX3H2;!R:1]=[CX3;!R:2]([PX4](=[OX1])(=[OX1])[OX2:4])[PX4](=[OX1])(=[OX1])[OX2:3]',\n", 678 | " '[CX3;H0;!R:1]([F])([F])=[CX3H1;!R:2][CX2]#[NX1]',\n", 679 | " '[CX3;H0;!R:1]([F])([F])=[CX3H1;!R:2][NX2]=[CX2]=[OX1]',\n", 680 | " '[CX3;H0;!R:1]([F])([F])=[CX3H1;!R:2][SX4](=[OX1])(=[OX1])[OX2:3]',\n", 681 | " '[CX3;H0;!R:1]([F])([F])=[CX3H1;!R:2][PX4](=[OX1])(=[OX1])[OX2:3]',\n", 682 | " '[CX3;H0;!R:1]([F])([F])=[CX3;!R:2]([CX4;H3])[CX2]#[NX1]',\n", 683 | " '[CX3;H0;!R:1]([F])([F])=[CX3;!R:2]([CX4;H3])[NX2]=[CX2]=[OX1]',\n", 684 | " '[CX3;H0;!R:1]([F])([F])=[CX3;!R:2]([CX4;H3])[SX4](=[OX1])(=[OX1])[OX2:3]',\n", 685 | " '[CX3;H0;!R:1]([F])([F])=[CX3;!R:2]([CX4;H3])[PX4](=[OX1])(=[OX1])[OX2:3]',\n", 686 | " '[CX3;H0;!R:1]([F])([F])=[CX3;!R:2]([F,Cl,Br:4])[CX2]#[NX1]',\n", 687 | " '[CX3;H0;!R:1]([F])([F])=[CX3;!R:2]([F,Cl,Br:4])[NX2]=[CX2]=[OX1]',\n", 688 | " '[CX3;H0;!R:1]([F])([F])=[CX3;!R:2]([F,Cl,Br:4])[SX4](=[OX1])(=[OX1])[OX2:3]',\n", 689 | " '[CX3;H0;!R:1]([F])([F])=[CX3;!R:2]([F,Cl,Br:4])[PX4](=[OX1])(=[OX1])[OX2:3]',\n", 690 | " '[CX3;H0;!R:1]([F])([F])=[CX3;!R:2]([CX4;H2][F,Cl:4])[CX2]#[NX1]',\n", 691 | " '[CX3;H0;!R:1]([F])([F])=[CX3;!R:2]([CX4;H2][F,Cl:4])[NX2]=[CX2]=[OX1]',\n", 692 | " '[CX3;H0;!R:1]([F])([F])=[CX3;!R:2]([CX4;H2][F,Cl:4])[SX4](=[OX1])(=[OX1])[OX2:3]',\n", 693 | " '[CX3;H0;!R:1]([F])([F])=[CX3;!R:2]([CX4;H2][F,Cl:4])[PX4](=[OX1])(=[OX1])[OX2:3]',\n", 694 | " '[CX3;H0;!R:1]([F])([F])=[CX3;!R:2]([CX4;H1]([F,Cl:4])[F,Cl:5])[CX2]#[NX1]',\n", 695 | " 
'[CX3;H0;!R:1]([F])([F])=[CX3;!R:2]([CX4;H1]([F,Cl:4])[F,Cl:5])[NX2]=[CX2]=[OX1]',\n", 696 | " '[CX3;H0;!R:1]([F])([F])=[CX3;!R:2]([CX4;H1]([F,Cl:4])[F,Cl:5])[SX4](=[OX1])(=[OX1])[OX2:3]',\n", 697 | " '[CX3;H0;!R:1]([F])([F])=[CX3;!R:2]([CX4;H1]([F,Cl:4])[F,Cl:5])[PX4](=[OX1])(=[OX1])[OX2:3]',\n", 698 | " '[CX3;H0;!R:1]([F])([F])=[CX3;!R:2]([CX4;H0]([F])([F])[F])[CX2]#[NX1]',\n", 699 | " '[CX3;H0;!R:1]([F])([F])=[CX3;!R:2]([CX4;H0]([F])([F])[F])[NX2]=[CX2]=[OX1]',\n", 700 | " '[CX3;H0;!R:1]([F])([F])=[CX3;!R:2]([CX4;H0]([F])([F])[F])[SX4](=[OX1])(=[OX1])[OX2:3]',\n", 701 | " '[CX3;H0;!R:1]([F])([F])=[CX3;!R:2]([CX4;H0]([F])([F])[F])[PX4](=[OX1])(=[OX1])[OX2:3]',\n", 702 | " '[CX3;H0;!R:1]([F])([F])=[CX3;!R:2]([CX4;H2][c:4])[CX2]#[NX1]',\n", 703 | " '[CX3;H0;!R:1]([F])([F])=[CX3;!R:2]([CX4;H2][c:4])[NX2]=[CX2]=[OX1]',\n", 704 | " '[CX3;H0;!R:1]([F])([F])=[CX3;!R:2]([CX4;H2][c:4])[SX4](=[OX1])(=[OX1])[OX2:3]',\n", 705 | " '[CX3;H0;!R:1]([F])([F])=[CX3;!R:2]([CX4;H2][c:4])[PX4](=[OX1])(=[OX1])[OX2:3]',\n", 706 | " '[CX3;H0;!R:1]([F])([F])=[CX3;!R:2]([CX4;H1]([CX4;H3])[CX4;H3])[CX2]#[NX1]',\n", 707 | " '[CX3;H0;!R:1]([F])([F])=[CX3;!R:2]([CX4;H1]([CX4;H3])[CX4;H3])[NX2]=[CX2]=[OX1]',\n", 708 | " '[CX3;H0;!R:1]([F])([F])=[CX3;!R:2]([CX4;H1]([CX4;H3])[CX4;H3])[SX4](=[OX1])(=[OX1])[OX2:3]',\n", 709 | " '[CX3;H0;!R:1]([F])([F])=[CX3;!R:2]([CX4;H1]([CX4;H3])[CX4;H3])[PX4](=[OX1])(=[OX1])[OX2:3]',\n", 710 | " '[CX3;H0;!R:1]([F])([F])=[CX3;!R:2]([CX4;H1]([CX4;H3])[CX4;H2:4][CX3;!R](=[OX1])[OX2,SX2,NX3:4])[CX2]#[NX1]',\n", 711 | " '[CX3;H0;!R:1]([F])([F])=[CX3;!R:2]([CX4;H1]([CX4;H3])[CX4;H2:4][CX3;!R](=[OX1])[OX2,SX2,NX3:4])[NX2]=[CX2]=[OX1]',\n", 712 | " '[CX3;H0;!R:1]([F])([F])=[CX3;!R:2]([CX4;H1]([CX4;H3])[CX4;H2:4][CX3;!R](=[OX1])[OX2,SX2,NX3:4])[SX4](=[OX1])(=[OX1])[OX2:3]',\n", 713 | " '[CX3;H0;!R:1]([F])([F])=[CX3;!R:2]([CX4;H1]([CX4;H3])[CX4;H2:4][CX3;!R](=[OX1])[OX2,SX2,NX3:4])[PX4](=[OX1])(=[OX1])[OX2:3]',\n", 714 | " 
'[CX3;H0;!R:1]([F])([F])=[CX3;!R:2]([CX2]#[NX1])[CX2]#[NX1]',\n", 715 | " '[CX3;H0;!R:1]([F])([F])=[CX3;!R:2]([CX2]#[NX1])[NX2]=[CX2]=[OX1]',\n", 716 | " '[CX3;H0;!R:1]([F])([F])=[CX3;!R:2]([CX2]#[NX1])[SX4](=[OX1])(=[OX1])[OX2:3]',\n", 717 | " '[CX3;H0;!R:1]([F])([F])=[CX3;!R:2]([CX2]#[NX1])[PX4](=[OX1])(=[OX1])[OX2:3]',\n", 718 | " '[CX3;H0;!R:1]([F])([F])=[CX3;!R:2]([NX2]=[CX2]=[OX1])[CX2]#[NX1]',\n", 719 | " '[CX3;H0;!R:1]([F])([F])=[CX3;!R:2]([NX2]=[CX2]=[OX1])[NX2]=[CX2]=[OX1]',\n", 720 | " '[CX3;H0;!R:1]([F])([F])=[CX3;!R:2]([NX2]=[CX2]=[OX1])[SX4](=[OX1])(=[OX1])[OX2:3]',\n", 721 | " '[CX3;H0;!R:1]([F])([F])=[CX3;!R:2]([NX2]=[CX2]=[OX1])[PX4](=[OX1])(=[OX1])[OX2:3]',\n", 722 | " '[CX3;H0;!R:1]([F])([F])=[CX3;!R:2]([SX4](=[OX1])(=[OX1])[OX2:4])[CX2]#[NX1]',\n", 723 | " '[CX3;H0;!R:1]([F])([F])=[CX3;!R:2]([SX4](=[OX1])(=[OX1])[OX2:4])[NX2]=[CX2]=[OX1]',\n", 724 | " '[CX3;H0;!R:1]([F])([F])=[CX3;!R:2]([SX4](=[OX1])(=[OX1])[OX2:4])[SX4](=[OX1])(=[OX1])[OX2:3]',\n", 725 | " '[CX3;H0;!R:1]([F])([F])=[CX3;!R:2]([SX4](=[OX1])(=[OX1])[OX2:4])[PX4](=[OX1])(=[OX1])[OX2:3]',\n", 726 | " '[CX3;H0;!R:1]([F])([F])=[CX3;!R:2]([PX4](=[OX1])(=[OX1])[OX2:4])[CX2]#[NX1]',\n", 727 | " '[CX3;H0;!R:1]([F])([F])=[CX3;!R:2]([PX4](=[OX1])(=[OX1])[OX2:4])[NX2]=[CX2]=[OX1]',\n", 728 | " '[CX3;H0;!R:1]([F])([F])=[CX3;!R:2]([PX4](=[OX1])(=[OX1])[OX2:4])[SX4](=[OX1])(=[OX1])[OX2:3]',\n", 729 | " '[CX3;H0;!R:1]([F])([F])=[CX3;!R:2]([PX4](=[OX1])(=[OX1])[OX2:4])[PX4](=[OX1])(=[OX1])[OX2:3]',\n", 730 | " '[CX3;H1;!R:1]([F])=[CX3H1;!R:2][CX2]#[NX1]',\n", 731 | " '[CX3;H1;!R:1]([F])=[CX3H1;!R:2][NX2]=[CX2]=[OX1]',\n", 732 | " '[CX3;H1;!R:1]([F])=[CX3H1;!R:2][SX4](=[OX1])(=[OX1])[OX2:3]',\n", 733 | " '[CX3;H1;!R:1]([F])=[CX3H1;!R:2][PX4](=[OX1])(=[OX1])[OX2:3]',\n", 734 | " '[CX3;H1;!R:1]([F])=[CX3;!R:2]([CX4;H3])[CX2]#[NX1]',\n", 735 | " '[CX3;H1;!R:1]([F])=[CX3;!R:2]([CX4;H3])[NX2]=[CX2]=[OX1]',\n", 736 | " '[CX3;H1;!R:1]([F])=[CX3;!R:2]([CX4;H3])[SX4](=[OX1])(=[OX1])[OX2:3]',\n", 737 | " 
'[CX3;H1;!R:1]([F])=[CX3;!R:2]([CX4;H3])[PX4](=[OX1])(=[OX1])[OX2:3]',\n", 738 | " '[CX3;H1;!R:1]([F])=[CX3;!R:2]([F,Cl,Br:4])[CX2]#[NX1]',\n", 739 | " '[CX3;H1;!R:1]([F])=[CX3;!R:2]([F,Cl,Br:4])[NX2]=[CX2]=[OX1]',\n", 740 | " '[CX3;H1;!R:1]([F])=[CX3;!R:2]([F,Cl,Br:4])[SX4](=[OX1])(=[OX1])[OX2:3]',\n", 741 | " '[CX3;H1;!R:1]([F])=[CX3;!R:2]([F,Cl,Br:4])[PX4](=[OX1])(=[OX1])[OX2:3]',\n", 742 | " '[CX3;H1;!R:1]([F])=[CX3;!R:2]([CX4;H2][F,Cl:4])[CX2]#[NX1]',\n", 743 | " '[CX3;H1;!R:1]([F])=[CX3;!R:2]([CX4;H2][F,Cl:4])[NX2]=[CX2]=[OX1]',\n", 744 | " '[CX3;H1;!R:1]([F])=[CX3;!R:2]([CX4;H2][F,Cl:4])[SX4](=[OX1])(=[OX1])[OX2:3]',\n", 745 | " '[CX3;H1;!R:1]([F])=[CX3;!R:2]([CX4;H2][F,Cl:4])[PX4](=[OX1])(=[OX1])[OX2:3]',\n", 746 | " '[CX3;H1;!R:1]([F])=[CX3;!R:2]([CX4;H1]([F,Cl:4])[F,Cl:5])[CX2]#[NX1]',\n", 747 | " '[CX3;H1;!R:1]([F])=[CX3;!R:2]([CX4;H1]([F,Cl:4])[F,Cl:5])[NX2]=[CX2]=[OX1]',\n", 748 | " '[CX3;H1;!R:1]([F])=[CX3;!R:2]([CX4;H1]([F,Cl:4])[F,Cl:5])[SX4](=[OX1])(=[OX1])[OX2:3]',\n", 749 | " '[CX3;H1;!R:1]([F])=[CX3;!R:2]([CX4;H1]([F,Cl:4])[F,Cl:5])[PX4](=[OX1])(=[OX1])[OX2:3]',\n", 750 | " '[CX3;H1;!R:1]([F])=[CX3;!R:2]([CX4;H0]([F])([F])[F])[CX2]#[NX1]',\n", 751 | " '[CX3;H1;!R:1]([F])=[CX3;!R:2]([CX4;H0]([F])([F])[F])[NX2]=[CX2]=[OX1]',\n", 752 | " '[CX3;H1;!R:1]([F])=[CX3;!R:2]([CX4;H0]([F])([F])[F])[SX4](=[OX1])(=[OX1])[OX2:3]',\n", 753 | " '[CX3;H1;!R:1]([F])=[CX3;!R:2]([CX4;H0]([F])([F])[F])[PX4](=[OX1])(=[OX1])[OX2:3]',\n", 754 | " '[CX3;H1;!R:1]([F])=[CX3;!R:2]([CX4;H2][c:4])[CX2]#[NX1]',\n", 755 | " '[CX3;H1;!R:1]([F])=[CX3;!R:2]([CX4;H2][c:4])[NX2]=[CX2]=[OX1]',\n", 756 | " '[CX3;H1;!R:1]([F])=[CX3;!R:2]([CX4;H2][c:4])[SX4](=[OX1])(=[OX1])[OX2:3]',\n", 757 | " '[CX3;H1;!R:1]([F])=[CX3;!R:2]([CX4;H2][c:4])[PX4](=[OX1])(=[OX1])[OX2:3]',\n", 758 | " '[CX3;H1;!R:1]([F])=[CX3;!R:2]([CX4;H1]([CX4;H3])[CX4;H3])[CX2]#[NX1]',\n", 759 | " '[CX3;H1;!R:1]([F])=[CX3;!R:2]([CX4;H1]([CX4;H3])[CX4;H3])[NX2]=[CX2]=[OX1]',\n", 760 | " 
'[CX3;H1;!R:1]([F])=[CX3;!R:2]([CX4;H1]([CX4;H3])[CX4;H3])[SX4](=[OX1])(=[OX1])[OX2:3]',\n", 761 | " '[CX3;H1;!R:1]([F])=[CX3;!R:2]([CX4;H1]([CX4;H3])[CX4;H3])[PX4](=[OX1])(=[OX1])[OX2:3]',\n", 762 | " '[CX3;H1;!R:1]([F])=[CX3;!R:2]([CX4;H1]([CX4;H3])[CX4;H2:4][CX3;!R](=[OX1])[OX2,SX2,NX3:4])[CX2]#[NX1]',\n", 763 | " '[CX3;H1;!R:1]([F])=[CX3;!R:2]([CX4;H1]([CX4;H3])[CX4;H2:4][CX3;!R](=[OX1])[OX2,SX2,NX3:4])[NX2]=[CX2]=[OX1]',\n", 764 | " '[CX3;H1;!R:1]([F])=[CX3;!R:2]([CX4;H1]([CX4;H3])[CX4;H2:4][CX3;!R](=[OX1])[OX2,SX2,NX3:4])[SX4](=[OX1])(=[OX1])[OX2:3]',\n", 765 | " '[CX3;H1;!R:1]([F])=[CX3;!R:2]([CX4;H1]([CX4;H3])[CX4;H2:4][CX3;!R](=[OX1])[OX2,SX2,NX3:4])[PX4](=[OX1])(=[OX1])[OX2:3]',\n", 766 | " '[CX3;H1;!R:1]([F])=[CX3;!R:2]([CX2]#[NX1])[CX2]#[NX1]',\n", 767 | " '[CX3;H1;!R:1]([F])=[CX3;!R:2]([CX2]#[NX1])[NX2]=[CX2]=[OX1]',\n", 768 | " '[CX3;H1;!R:1]([F])=[CX3;!R:2]([CX2]#[NX1])[SX4](=[OX1])(=[OX1])[OX2:3]',\n", 769 | " '[CX3;H1;!R:1]([F])=[CX3;!R:2]([CX2]#[NX1])[PX4](=[OX1])(=[OX1])[OX2:3]',\n", 770 | " '[CX3;H1;!R:1]([F])=[CX3;!R:2]([NX2]=[CX2]=[OX1])[CX2]#[NX1]',\n", 771 | " '[CX3;H1;!R:1]([F])=[CX3;!R:2]([NX2]=[CX2]=[OX1])[NX2]=[CX2]=[OX1]',\n", 772 | " '[CX3;H1;!R:1]([F])=[CX3;!R:2]([NX2]=[CX2]=[OX1])[SX4](=[OX1])(=[OX1])[OX2:3]',\n", 773 | " '[CX3;H1;!R:1]([F])=[CX3;!R:2]([NX2]=[CX2]=[OX1])[PX4](=[OX1])(=[OX1])[OX2:3]',\n", 774 | " '[CX3;H1;!R:1]([F])=[CX3;!R:2]([SX4](=[OX1])(=[OX1])[OX2:4])[CX2]#[NX1]',\n", 775 | " '[CX3;H1;!R:1]([F])=[CX3;!R:2]([SX4](=[OX1])(=[OX1])[OX2:4])[NX2]=[CX2]=[OX1]',\n", 776 | " '[CX3;H1;!R:1]([F])=[CX3;!R:2]([SX4](=[OX1])(=[OX1])[OX2:4])[SX4](=[OX1])(=[OX1])[OX2:3]',\n", 777 | " '[CX3;H1;!R:1]([F])=[CX3;!R:2]([SX4](=[OX1])(=[OX1])[OX2:4])[PX4](=[OX1])(=[OX1])[OX2:3]',\n", 778 | " '[CX3;H1;!R:1]([F])=[CX3;!R:2]([PX4](=[OX1])(=[OX1])[OX2:4])[CX2]#[NX1]',\n", 779 | " '[CX3;H1;!R:1]([F])=[CX3;!R:2]([PX4](=[OX1])(=[OX1])[OX2:4])[NX2]=[CX2]=[OX1]',\n", 780 | " 
'[CX3;H1;!R:1]([F])=[CX3;!R:2]([PX4](=[OX1])(=[OX1])[OX2:4])[SX4](=[OX1])(=[OX1])[OX2:3]',\n", 781 | " '[CX3;H1;!R:1]([F])=[CX3;!R:2]([PX4](=[OX1])(=[OX1])[OX2:4])[PX4](=[OX1])(=[OX1])[OX2:3]')\n", 782 | "exclL[n]=('[CX4]-[I,Br]', '[CX3H1]=[O]', '[$(*-[NX2-]-[NX2+]#[NX1]),$(*-[NX2]=[NX2+]=[NX1-])]',\n", 783 | " '[CX3;!R](=[OX1])[OX2;!R][CX3;!R](=[OX1])', '[CX3;!R](=[OX1])[NX3;!R][CX3;!R](=[OX1])',\n", 784 | " '[CX4]-[NX2]=[NX2]-[CX4]', '[OX2]-[OX2]', '[OX2H][cX3]:[c]', '[CX3]1(=[OX1])[CX3]=[CX3][CX3](=[OX1])[CX3]=[CX2]1')" 785 | ] 786 | }, 787 | { 788 | "cell_type": "code", 789 | "execution_count": null, 790 | "metadata": { 791 | "id": "06iTEd3P3hnc" 792 | }, 793 | "outputs": [], 794 | "source": [ 795 | "#definition of styryl monomer\n", 796 | "#objective FG: open chain terminal olefin neighboring aryl group\n", 797 | "#excluded FG: -\n", 798 | "n=mon_dic['styryl']\n", 799 | "monL[n]=('[CX3H2;!R:1]=[CX3H1;!R:2][c:3]',\n", 800 | " '[CX3H2;!R:1]=[CX3;!R:2]([CX4;H3])[c:3]',\n", 801 | " '[CX3H2;!R:1]=[CX3;!R:2]([F])[c:3]',\n", 802 | " '[CX3H2;!R:1]=[CX3;!R:2]([CX4;H2][F])[c:3]',\n", 803 | " '[CX3H2;!R:1]=[CX3;!R:2]([CX4;H1]([F])[F])[c:3]',\n", 804 | " '[CX3H2;!R:1]=[CX3;!R:2]([CX4;H0]([F])([F])[F])[c:3]',\n", 805 | " '[CX3H2;!R:1]=[CX3;!R:2]([OX2][CX4,SiX4:4])[c:3]',\n", 806 | " '[CX3;H0;!R:1]([F])([F])=[CX3H1;!R:2][c:3]',\n", 807 | " '[CX3;H0;!R:1]([F])([F])=[CX3;!R:2]([CX4;H3])[c:3]',\n", 808 | " '[CX3;H0;!R:1]([F])([F])=[CX3;!R:2]([F])[c:3]',\n", 809 | " '[CX3;H0;!R:1]([F])([F])=[CX3;!R:2]([CX4;H2][F])[c:3]',\n", 810 | " '[CX3;H0;!R:1]([F])([F])=[CX3;!R:2]([CX4;H1]([F])[F])[c:3]',\n", 811 | " '[CX3;H0;!R:1]([F])([F])=[CX3;!R:2]([CX4;H0]([F])([F])[F])[c:3]',\n", 812 | " '[CX3;H0;!R:1]([F])([F])=[CX3;!R:2]([OX2][CX4,SiX4:4])[c:3]',\n", 813 | " '[CX3;H1;!R:1]([F])=[CX3H1;!R:2][c:3]',\n", 814 | " '[CX3;H1;!R:1]([F])=[CX3;!R:2]([CX4;H3])[c:3]',\n", 815 | " '[CX3;H1;!R:1]([F])=[CX3;!R:2]([F])[c:3]',\n", 816 | " '[CX3;H1;!R:1]([F])=[CX3;!R:2]([CX4;H2][F])[c:3]',\n", 817 | 
" '[CX3;H1;!R:1]([F])=[CX3;!R:2]([CX4;H1]([F])[F])[c:3]',\n", 818 | " '[CX3;H1;!R:1]([F])=[CX3;!R:2]([CX4;H0]([F])([F])[F])[c:3]',\n", 819 | " '[CX3;H1;!R:1]([F])=[CX3;!R:2]([OX2][CX4,SiX4:4])[c:3]',\n", 820 | " '[CX3H2;!R:1]=[CX3H1;!R:2][n:3]',\n", 821 | " '[CX3H2;!R:1]=[CX3;!R:2]([CX4;H3])[n:3]',\n", 822 | " '[CX3H2;!R:1]=[CX3;!R:2]([F])[n:3]',\n", 823 | " '[CX3H2;!R:1]=[CX3;!R:2]([CX4;H2][F])[n:3]',\n", 824 | " '[CX3H2;!R:1]=[CX3;!R:2]([CX4;H1]([F])[F])[n:3]',\n", 825 | " '[CX3H2;!R:1]=[CX3;!R:2]([CX4;H0]([F])([F])[F])[n:3]',\n", 826 | " '[CX3H2;!R:1]=[CX3;!R:2]([OX2][CX4,SiX4:4])[n:3]',\n", 827 | " '[CX3;H0;!R:1]([F])([F])=[CX3H1;!R:2][n:3]',\n", 828 | " '[CX3;H0;!R:1]([F])([F])=[CX3;!R:2]([CX4;H3])[n:3]',\n", 829 | " '[CX3;H0;!R:1]([F])([F])=[CX3;!R:2]([F])[n:3]',\n", 830 | " '[CX3;H0;!R:1]([F])([F])=[CX3;!R:2]([CX4;H2][F])[n:3]',\n", 831 | " '[CX3;H0;!R:1]([F])([F])=[CX3;!R:2]([CX4;H1]([F])[F])[n:3]',\n", 832 | " '[CX3;H0;!R:1]([F])([F])=[CX3;!R:2]([CX4;H0]([F])([F])[F])[n:3]',\n", 833 | " '[CX3;H0;!R:1]([F])([F])=[CX3;!R:2]([OX2][CX4,SiX4:4])[n:3]',\n", 834 | " '[CX3;H1;!R:1]([F])=[CX3H1;!R:2][n:3]',\n", 835 | " '[CX3;H1;!R:1]([F])=[CX3;!R:2]([CX4;H3])[n:3]',\n", 836 | " '[CX3;H1;!R:1]([F])=[CX3;!R:2]([F])[n:3]',\n", 837 | " '[CX3;H1;!R:1]([F])=[CX3;!R:2]([CX4;H2][F])[n:3]',\n", 838 | " '[CX3;H1;!R:1]([F])=[CX3;!R:2]([CX4;H1]([F])[F])[n:3]',\n", 839 | " '[CX3;H1;!R:1]([F])=[CX3;!R:2]([CX4;H0]([F])([F])[F])[n:3]',\n", 840 | " '[CX3;H1;!R:1]([F])=[CX3;!R:2]([OX2][CX4,SiX4:4])[n:3]',\n", 841 | " '[CX3;H1;r5:1]=[CX3;H1;r5:2][c:3]')\n", 842 | "exclL[n]=('[CX4]-[I,Br]', '[CX3H1]=[O]', '[$(*-[NX2-]-[NX2+]#[NX1]),$(*-[NX2]=[NX2+]=[NX1-])]',\n", 843 | " '[CX3;!R](=[OX1])[OX2;!R][CX3;!R](=[OX1])', '[CX3;!R](=[OX1])[NX3;!R][CX3;!R](=[OX1])',\n", 844 | " '[CX4]-[NX2]=[NX2]-[CX4]', '[OX2]-[OX2]', '[OX2H][cX3]:[c]', '[CX3]1(=[OX1])[CX3]=[CX3][CX3](=[OX1])[CX3]=[CX2]1')" 845 | ] 846 | }, 847 | { 848 | "cell_type": "code", 849 | "execution_count": null, 850 | "metadata": { 
851 | "id": "iePBztwA3hnd" 852 | }, 853 | "outputs": [], 854 | "source": [ 855 | "#definition of allyl monomer\n", 856 | "#objective FG: open chain allylic ether, ester, amine, silane, aromatics.\n", 857 | "#excluded FG: -\n", 858 | "n=mon_dic['allyl']\n", 859 | "monL[n]=('[CX3H2;!R:1]=[CX3H1;!R:2][CX4H2][OX2,SX2:3]', '[CX3H2;!R:1]=[CX3H1;!R:2][CX4H2][NX3:3]',\n", 860 | " '[CX3H2;!R:1]=[CX3H1;!R:2][CX4H2][c:3]', '[CX3H2;!R:1]=[CX3H1;!R:2][CX4H2][n:3]',\n", 861 | " '[CX3H2;!R:1]=[CX3H1;!R:2][CX4H2][SiX4]([C,c:3])([C,c:4])[C,c:5]', '[CX3H2;!R:1]=[CX3H1;!R:2][CX4H2][CX2]#[N]')\n", 862 | "exclL[n]=('[CX4]-[I,Br]', '[CX3H1]=[O]', '[$(*-[NX2-]-[NX2+]#[NX1]),$(*-[NX2]=[NX2+]=[NX1-])]',\n", 863 | " '[CX3;!R](=[OX1])[OX2;!R][CX3;!R](=[OX1])', '[CX3;!R](=[OX1])[NX3;!R][CX3;!R](=[OX1])',\n", 864 | " '[CX4]-[NX2]=[NX2]-[CX4]', '[OX2]-[OX2]', '[OX2H][cX3]:[c]', '[CX3]1(=[OX1])[CX3]=[CX3][CX3](=[OX1])[CX3]=[CX2]1')" 865 | ] 866 | }, 867 | { 868 | "cell_type": "code", 869 | "execution_count": null, 870 | "metadata": { 871 | "id": "ZZBkpW804wvF" 872 | }, 873 | "outputs": [], 874 | "source": [ 875 | "#definition of halogenated olefin\n", 876 | "#objective FG: open chain F, Cl substituted hydrocarbon.\n", 877 | "#excluded FG: -\n", 878 | "n=mon_dic['haloCH']\n", 879 | "monL[n]=('[CX3;H2;!R:1]=[CX3;H1;!R:2][F,Cl:3]', '[CX3;H2;!R:1]=[CX3;H0;!R:2]([F,Cl:3])[F,Cl:4]',\n", 880 | " '[CX3;H2;!R:1]=[CX3;H0;!R:2]([F,Cl:3])[CX4:4]', '[CX3;H2;!R:1]=[CX3;H1;!R:2][CX4:3]([F,Cl:4])[F,Cl:5]',\n", 881 | " '[F,Cl:4][CX3;H1;!R:1]=[CX3;H1;!R:2][F,Cl:3]', '[F,Cl:5][CX3;H1;!R:1]=[CX3;H0;!R:2]([F,Cl:3])[F,Cl:4]',\n", 882 | " '[F,Cl:5][CX3;H1;!R:1]=[CX3;H0;!R:2]([F,Cl:3])[CX4:4]',\n", 883 | " '[F,Cl:4][CX3;H0;!R:1]([F,Cl:5])=[CX3;H1;!R:2][F,Cl:3]', '[F,Cl:5][CX3;H0;!R:1]([F,Cl:6])=[CX3;H0;!R:2]([F,Cl:3])[F,Cl:4]',\n", 884 | " '[F,Cl:5][CX3;H0;!R:1]([F,Cl:6])=[CX3;H0;!R:2]([F,Cl:3])[CX4:4]')\n", 885 | "exclL[n]=('[A+1]', '[A+2]', '[A+3]', '[A+4]', '[a+1]', '[a+2]', '[a+3]', '[a+4]',\n", 886 | " '[A-1]', 
'[A-2]', '[A-3]', '[A-4]', '[a-1]', '[a-2]', '[a-3]', '[a-4]',\n", 887 | " '[CX3]=[CX3]-[CX3]=[CX3]', '[CX2]#[CX2]',\n", 888 | " '[Li,Na,K,Rb,Cs]', '[Be,Mg,Ca,Sr,Ba]', '[Zn,Cd,Hg]',\n", 889 | " '[B,b]', '[c]', '[N,n]', '[O,o]', '[Al]', '[P,p]', '[S,s]',\n", 890 | " '[Ga]', '[Ge]', '[As,as]', '[Se,se]',\n", 891 | " '[In]', '[Sn,sn]', '[Sb,sb]', '[Te,te]', '[Tl]', '[Pb,pb]', '[Bi]',\n", 892 | " '[Br,I]', '[si]', '[Si][F,Cl]', '[Si;H4,H3,H2,H1]')" 893 | ] 894 | }, 895 | { 896 | "cell_type": "code", 897 | "execution_count": null, 898 | "metadata": { 899 | "id": "bP922F6t3hne" 900 | }, 901 | "outputs": [], 902 | "source": [ 903 | "#definition of vinyl ester and amide\n", 904 | "#objective FG: open chain vinyl ester and amide, including F-substituted olefins\n", 905 | "#excluded FG: -\n", 906 | "n=mon_dic['vinylester']\n", 907 | "monL[n]=('[CX3H2:1]=[CX3H1:2][OX2][CX3:3](=[OX1])', '[CX3H1:1]([F])=[CX3H1:2][OX2][CX3:3](=[OX1])', '[CX3H0:1]([F])([F])=[CX3H1:2][OX2][CX3:3](=[OX1])',\n", 908 | " '[CX3H2:1]=[CX3H1:2][NX3:3][CX3:4](=[OX1])', '[CX3H1:1]([F])=[CX3H1:2][NX3:3][CX3:4](=[OX1])',\n", 909 | " '[CX3H0:1]([F])([F])=[CX3H1:2][NX3:3][CX3:4](=[OX1])')\n", 910 | "exclL[n]=('[CX4]-[I,Br]', '[CX3H1]=[O]', '[$(*-[NX2-]-[NX2+]#[NX1]),$(*-[NX2]=[NX2+]=[NX1-])]',\n", 911 | " '[CX3;!R](=[OX1])[OX2;!R][CX3;!R](=[OX1])', '[CX3;!R](=[OX1])[NX3;!R][CX3;!R](=[OX1])',\n", 912 | " '[CX4]-[NX2]=[NX2]-[CX4]', '[OX2]-[OX2]', '[OX2H][cX3]:[c]', '[CX3]1(=[OX1])[CX3]=[CX3][CX3](=[OX1])[CX3]=[CX2]1')" 913 | ] 914 | }, 915 | { 916 | "cell_type": "code", 917 | "execution_count": null, 918 | "metadata": { 919 | "id": "L4cRdxvs3hnf" 920 | }, 921 | "outputs": [], 922 | "source": [ 923 | "#definition of maleimide derivatives\n", 924 | "#https://doi.org/10.1295/kobunshi.14.217\n", 925 | "#objective FG: maleimide derivatives\n", 926 | "#excluded FG: -\n", 927 | "n=mon_dic['malei']\n", 928 | "monL[n]=('[CX3H1;R:1]1=[CX3H1;R:2][CX3;R](=[OX1])[NX3:3][CX3;R]1(=[OX1])', 
'[CX3H1;R:1]1([F])=[CX3H1;R:2]([F])[CX3;R](=[OX1])[NX3:3][CX3;R]1(=[OX1])')\n", 929 | "exclL[n]=('[CX4]-[I,Br]', '[CX3H1]=[O]', '[$(*-[NX2-]-[NX2+]#[NX1]),$(*-[NX2]=[NX2+]=[NX1-])]',\n", 930 | " '[CX3;!R](=[OX1])[OX2;!R][CX3;!R](=[OX1])', '[CX3;!R](=[OX1])[NX3;!R][CX3;!R](=[OX1])')" 931 | ] 932 | }, 933 | { 934 | "cell_type": "code", 935 | "execution_count": null, 936 | "metadata": { 937 | "id": "lxC0qMWb3hnd" 938 | }, 939 | "outputs": [], 940 | "source": [ 941 | "#definition of diene monomer\n", 942 | "#objective FG: open chain conjugated olefine, RECONFIRMATION REQUIRED\n", 943 | "#excluded FG: -\n", 944 | "n=mon_dic['conjdiene']\n", 945 | "monL[n]= ('[CX3;H2:1]=[CX3;!R:2]-[CX3;!R:3]=[CX3;H2:4]', '[CX3:1]([F])([F])=[CX3!R:2]-[CX3;!R:3]=[CX3H2:4]',\n", 946 | " '[CX3;H1:1]([F])=[CX3;!R:2]-[CX3;!R:3]=[CX3;H2:4]', '[CX3;H0:1]([F])([F])=[CX3!R:2]-[CX3;!R:3]=[CX3;H1:4]([F])',\n", 947 | " '[CX3;H1:1]([F])=[CX3;!R:2]-[CX3;!R:3]=[CX3;H1:4]([F])', '[CX3;H0:1]([F])([F])=[CX3;!R:2]-[CX3;!R:3]=[CX3;H0:4]([F])([F])')\n", 948 | "exclL[n]=('[CX3]=[CX3](-[CX3;!R]=[CX3;!R])=[CX3]' , '[CX3]=[CX3]-[CX3]=[CX3]-[CX3]=[CX3]', \n", 949 | " '[CX4]-[I,Br]', '[CX3H1]=[O]', '[$(*-[NX2-]-[NX2+]#[NX1]),$(*-[NX2]=[NX2+]=[NX1-])]',\n", 950 | " '[CX3;!R](=[OX1])[OX2;!R][CX3;!R](=[OX1])', '[CX3;!R](=[OX1])[NX3;!R][CX3;!R](=[OX1])',\n", 951 | " '[CX4]-[NX2]=[NX2]-[CX4]', '[OX2]-[OX2]', '[OX2H][cX3]:[c]', '[CX3]1(=[OX1])[CX3]=[CX3][CX3](=[OX1])[CX3]=[CX2]1')" 952 | ] 953 | }, 954 | { 955 | "cell_type": "code", 956 | "execution_count": null, 957 | "metadata": { 958 | "id": "-ypYyNBZ3hne" 959 | }, 960 | "outputs": [], 961 | "source": [ 962 | "#definition of vinyl ether\n", 963 | "#objective FG: open chain terminal olefin including acrylate, F-substituted olefin\n", 964 | "#distinction from vinyl ester\n", 965 | "#excluded FG: basic FG such as amine\n", 966 | "n=mon_dic['vinylether']\n", 967 | "monL[n]=('[CX3H2:1]=[CX3:2][OX2][CX4:3]', '[CX3H2:1]=[CX3:2][OX2;!$(O*=O)][CX3:3]', 
'[CX3H2:1]=[CX3:2][OX2][c:3]',\n", 968 | " '[CX3H1:1]([F])=[CX3:2][OX2][CX4:3]', '[CX3H1:1]([F])=[CX3:2][OX2;!$(O*=O)][CX3:3]', '[CX3H1:1]([F])=[CX3:2][OX2][c:3]',\n", 969 | " '[CX3H0:1]([F])([F])=[CX3:2][OX2][CX4:3]', '[CX3H0:1]([F])([F])=[CX3:2][OX2;!$(O*=O)][CX3:3]', '[CX3H0:1]([F])([F])=[CX3:2][OX2][c:3]')\n", 970 | "exclL[n]=('[NX3;H3,H2,H1;!$(NC=O)]', '[n]',\n", 971 | " '[CX4;H2,H1][OX2;R][CX4]', '[CX4;H2,H1][SX2;R][CX4]', '[OX2][CX4,c]([C,c])([C,c])[CX4,c]',\n", 972 | " '[CX3H1]=[O]', '[$(*-[NX2-]-[NX2+]#[NX1]),$(*-[NX2]=[NX2+]=[NX1-])]',\n", 973 | " '[CX3;!R](=[OX1])[OX2;!R][CX3;!R](=[OX1])', '[CX3;!R](=[OX1])[NX3;!R][CX3;!R](=[OX1])')" 974 | ] 975 | }, 976 | { 977 | "cell_type": "code", 978 | "execution_count": null, 979 | "metadata": { 980 | "id": "mik6yIz4qY0u" 981 | }, 982 | "outputs": [], 983 | "source": [ 984 | "#definition of aliphatic precursor of tert-cation olefine for cationic polymerization\n", 985 | "#objective FG: beta-disubstituted aliphatic CH olefine\n", 986 | "#excluded FG: -\n", 987 | "n=mon_dic['tertcatCH']\n", 988 | "monL[n]=('[CX3;H2;!R:1]=[CX3;H0;!R:2]([CX4H3])[CX4H2:3]', '[CX3;H2;!R:1]=[CX3;H0;!R:2]([CX4H3])[CX4H3]')\n", 989 | "exclL[n]=('[A+1]', '[A+2]', '[A+3]', '[A+4]', '[a+1]', '[a+2]', '[a+3]', '[a+4]',\n", 990 | " '[A-1]', '[A-2]', '[A-3]', '[A-4]', '[a-1]', '[a-2]', '[a-3]', '[a-4]',\n", 991 | " '[CX3]=[CX3]-[CX3]=[CX3]', '[CX2]#[CX2]',\n", 992 | " '[Li,Na,K,Rb,Cs]', '[Be,Mg,Ca,Sr,Ba]', '[Zn,Cd,Hg]',\n", 993 | " '[B,b]', '[c]', '[N,n]', '[O,o]', '[Al]', '[P,p]', '[S,s]',\n", 994 | " '[Ga]', '[Ge]', '[As,as]', '[Se,se]',\n", 995 | " '[In]', '[Sn,sn]', '[Sb,sb]', '[Te,te]', '[Tl]', '[Pb,pb]', '[Bi]',\n", 996 | " '[F,Cl,Br,I]', '[si]', '[Si][F,Cl]', '[Si;H4,H3,H2,H1]')" 997 | ] 998 | }, 999 | { 1000 | "cell_type": "code", 1001 | "execution_count": null, 1002 | "metadata": { 1003 | "id": "i-NIiHcZ9ZVR" 1004 | }, 1005 | "outputs": [], 1006 | "source": [ 1007 | "#Future work: non-classical carbocation\n", 1008 | "#bicyclic cf
beta-pinene '[CX3;H2;!R:1]=[CX3;H0;r4,r5,r6:2]([CX4;H2;r4,r5,r6:3])[CX4;H2;r4,r5,r6,R6:4]'" 1009 | ] 1010 | }, 1011 | { 1012 | "cell_type": "code", 1013 | "execution_count": null, 1014 | "metadata": { 1015 | "id": "6uo7sd0L3hnf" 1016 | }, 1017 | "outputs": [], 1018 | "source": [ 1019 | "#definition of cyclic CH olefine for coordination polymerization\n", 1020 | "#objective FG: cyclic CH olefine\n", 1021 | "#excluded FG: -\n", 1022 | "n=mon_dic['cycCH']\n", 1023 | "monL[n]=('[CX3;H1;R:1]=[CX3;H1;R:2]',)\n", 1024 | "exclL[n]=('[A+1]', '[A+2]', '[A+3]', '[A+4]', '[a+1]', '[a+2]', '[a+3]', '[a+4]',\n", 1025 | " '[A-1]', '[A-2]', '[A-3]', '[A-4]', '[a-1]', '[a-2]', '[a-3]', '[a-4]',\n", 1026 | " '[CX3]=[CX3]-[CX3]=[CX3]', '[CX2]#[CX2]',\n", 1027 | " '[Li,Na,K,Rb,Cs]', '[Be,Mg,Ca,Sr,Ba]', '[Zn,Cd,Hg]',\n", 1028 | " '[B,b]', '[c]', '[N,n]', '[O,o]', '[Al]', '[P,p]', '[S,s]',\n", 1029 | " '[Ga]', '[Ge]', '[As,as]', '[Se,se]',\n", 1030 | " '[In]', '[Sn,sn]', '[Sb,sb]', '[Te,te]', '[Tl]', '[Pb,pb]', '[Bi]',\n", 1031 | " '[Cl,Br,I]', '[si]', '[Si][F,Cl]', '[Si;H4,H3,H2,H1]')" 1032 | ] 1033 | }, 1034 | { 1035 | "cell_type": "code", 1036 | "execution_count": null, 1037 | "metadata": { 1038 | "id": "QoS9GaCH3hnf" 1039 | }, 1040 | "outputs": [], 1041 | "source": [ 1042 | "#definition of aliphatic CH olefine for coordination polymerization\n", 1043 | "#objective FG: open chain aliphatic CH olefine\n", 1044 | "#excluded FG: -\n", 1045 | "n=mon_dic['aliphCH']\n", 1046 | "monL[n]=('[CX3;H2;!R:1]=[CX3;H2;!R:2]', '[CX3;H2;!R:1]=[CX3;H1;!R:2]', )\n", 1047 | "exclL[n]=('[A+1]', '[A+2]', '[A+3]', '[A+4]', '[a+1]', '[a+2]', '[a+3]', '[a+4]',\n", 1048 | " '[A-1]', '[A-2]', '[A-3]', '[A-4]', '[a-1]', '[a-2]', '[a-3]', '[a-4]',\n", 1049 | " '[CX3]=[CX3]-[CX3]=[CX3]', '[CX2]#[CX2]',\n", 1050 | " '[Li,Na,K,Rb,Cs]', '[Be,Mg,Ca,Sr,Ba]', '[Zn,Cd,Hg]',\n", 1051 | " '[B,b]', '[c]', '[N,n]', '[O,o]', '[Al]', '[P,p]', '[S,s]',\n", 1052 | " '[Ga]', '[Ge]', '[As,as]', '[Se,se]',\n", 1053 | " '[In]',
'[Sn,sn]', '[Sb,sb]', '[Te,te]', '[Tl]', '[Pb,pb]', '[Bi]',\n", 1054 | " '[Cl,Br,I]', '[si]', '[Si][F,Cl]', '[Si;H4,H3,H2,H1]')" 1055 | ] 1056 | }, 1057 | { 1058 | "cell_type": "raw", 1059 | "metadata": { 1060 | "id": "ZoGc4va93hng", 1061 | "vscode": { 1062 | "languageId": "raw" 1063 | } 1064 | }, 1065 | "source": [ 1066 | "#!!under construction!!\n", 1067 | "#definition of Diels-Aldered maleic bicyclic olefins\n", 1068 | "#objective FG: bicyclic compounds\n", 1069 | "#excluded FG: -\n", 1070 | "n=mon_dic['bicyc']\n", 1071 | "monL[n]=('[CX3H1;R:1]1=[CX3H1;R:2][CX4;R:3][CX4H1;R:4]2[CX3;R](=[OX1])[NX3:5][CX3;R](=[OX1])[CX4H1;R:6]2[CX4;R:7]1',\n", 1072 | " '[CX3H1;R:1]1([F])=[CX3H1;R:2]([F])[CX4;R:3][CX4H1;R:4]2[CX3;R](=[OX1])[NX3:5][CX3;R](=[OX1])[CX4H1;R:6]2[CX4;R:7]1',\n", 1073 | " '[CX3H1;R:1]1([F])=[CX3H1;R:2]([F])[CX4;R:3][CX4;R:4]2([F])[CX3;R](=[OX1])[NX3:5][CX3;R](=[OX1])[CX4;R:6]2([F])[CX4;R:7]1')\n", 1074 | "exclL[n]=('[CX3H1]=[O]', '[$(*-[NX2-]-[NX2+]#[NX1]),$(*-[NX2]=[NX2+]=[NX1-])]',\n", 1075 | " '[CX3;!R](=[OX1])[OX2;!R][CX3;!R](=[OX1])', '[CX3;!R](=[OX1])[NX3;!R][CX3;!R](=[OX1])')" 1076 | ] 1077 | }, 1078 | { 1079 | "cell_type": "code", 1080 | "execution_count": null, 1081 | "metadata": { 1082 | "id": "bsrpWUNO3hng" 1083 | }, 1084 | "outputs": [], 1085 | "source": [ 1086 | "#for sequential polymerization of residual FGs\n", 1087 | "monL[200] = ('[CX3]=[CX3]')\n", 1088 | "monL[201] = ('[CX4;R]1[OX2;R][CX4;R]1')\n", 1089 | "monL[202] = ('[CX3](=[O])[OX2H1,F,Cl,Br,I]')\n", 1090 | "monL[203] = ('[C,c][OX2,SX2;H1;!$([O,S]C=*)]')\n", 1091 | "monL[204] = ('[C,c][NX3;H2;!$(N[C,S]=*)]')\n", 1092 | "monL[205] = ('[NX2]=[CX2]=[OX1,SX1]')\n", 1093 | "monL[206] = ('[C,c][CX3,c;R](=[OX1])[OX2,o;R][CX3,c;R](=[OX1])[C,c]')" 1094 | ] 1095 | }, 1096 | { 1097 | "cell_type": "markdown", 1098 | "metadata": { 1099 | "id": "g2pfzIYm3hng" 1100 | }, 1101 | "source": [ 1102 | "export lists, dictionaries and defs as JSON" 1103 | ] 1104 | }, 1105 | { 1106 | "cell_type": "code",
1107 | "execution_count": null, 1108 | "metadata": { 1109 | "id": "WXTAm7Qx3hng" 1110 | }, 1111 | "outputs": [], 1112 | "source": [ 1113 | "with open(\"./rules/mon_vals.json\",\"w\") as f:\n", 1114 | " json.dump(mon_vals, f)" 1115 | ] 1116 | }, 1117 | { 1118 | "cell_type": "code", 1119 | "execution_count": null, 1120 | "metadata": { 1121 | "id": "-YwXLc7u3hng" 1122 | }, 1123 | "outputs": [], 1124 | "source": [ 1125 | "with open(\"./rules/mon_dic.json\",\"w\") as f:\n", 1126 | " json.dump(mon_dic, f)" 1127 | ] 1128 | }, 1129 | { 1130 | "cell_type": "code", 1131 | "execution_count": null, 1132 | "metadata": { 1133 | "id": "grv6EZGG3hnh" 1134 | }, 1135 | "outputs": [], 1136 | "source": [ 1137 | "with open(\"./rules/mon_dic_inv.json\",\"w\") as f:\n", 1138 | " json.dump(mon_dic_inv, f)" 1139 | ] 1140 | }, 1141 | { 1142 | "cell_type": "code", 1143 | "execution_count": null, 1144 | "metadata": { 1145 | "id": "K-rsVqtj3hnh" 1146 | }, 1147 | "outputs": [], 1148 | "source": [ 1149 | "with open(\"./rules/mon_lst.json\",\"w\") as f:\n", 1150 | " json.dump(monL, f)" 1151 | ] 1152 | }, 1153 | { 1154 | "cell_type": "code", 1155 | "execution_count": null, 1156 | "metadata": { 1157 | "id": "hK6R5Bcn3hnh" 1158 | }, 1159 | "outputs": [], 1160 | "source": [ 1161 | "with open(\"./rules/excl_lst.json\",\"w\") as f:\n", 1162 | " json.dump(exclL, f)" 1163 | ] 1164 | }, 1165 | { 1166 | "cell_type": "markdown", 1167 | "metadata": { 1168 | "id": "RYZ7jkdF3hnh" 1169 | }, 1170 | "source": [ 1171 | "end" 1172 | ] 1173 | } 1174 | ], 1175 | "metadata": { 1176 | "colab": { 1177 | "provenance": [] 1178 | }, 1179 | "kernelspec": { 1180 | "display_name": "Python 3 (ipykernel)", 1181 | "language": "python", 1182 | "name": "python3" 1183 | }, 1184 | "language_info": { 1185 | "codemirror_mode": { 1186 | "name": "ipython", 1187 | "version": 3 1188 | }, 1189 | "file_extension": ".py", 1190 | "mimetype": "text/x-python", 1191 | "name": "python", 1192 | "nbconvert_exporter": "python", 1193 | 
"pygments_lexer": "ipython3", 1194 | "version": "3.9.21" 1195 | } 1196 | }, 1197 | "nbformat": 4, 1198 | "nbformat_minor": 4 1199 | } 1200 | -------------------------------------------------------------------------------- /utilities/2_Ps_rxnL.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "cells": [ 3 | { 4 | "cell_type": "markdown", 5 | "id": "5957bfb5", 6 | "metadata": {}, 7 | "source": [ 8 | "Copyright (c) 2021 Mitsuru Ohno \n", 9 | "Use of this source code is governed by a BSD-3-style \n", 10 | "license that can be found in the LICENSE file. \n", 11 | " \n", 12 | "08/09/2021, M. Ohno \n", 13 | "tool for defining the list of polymerization reactions. \n", 14 | "\n", 15 | "Reference: \n", 16 | "https://future-chem.com/rdkit-chemical-rxn/\n", 17 | "https://www.daylight.com/dayhtml_tutorials/languages/smarts/smarts_examples.html\n", 18 | "https://www.daylight.com/dayhtml/doc/theory/theory.smarts.html \n", 19 | "https://www.daylight.com/dayhtml/doc/theory/index.pdf " 20 | ] 21 | }, 22 | { 23 | "cell_type": "code", 24 | "execution_count": null, 25 | "id": "b8e9c5ca", 26 | "metadata": {}, 27 | "outputs": [], 28 | "source": [ 29 | "import json\n", 30 | "import pickle\n", 31 | "from rdkit.Chem import AllChem" 32 | ] 33 | }, 34 | { 35 | "cell_type": "code", 36 | "execution_count": null, 37 | "id": "cdc2aa37", 38 | "metadata": {}, 39 | "outputs": [], 40 | "source": [ 41 | "with open('./rules/mon_dic.json', 'r') as f:\n", 42 | " mon_dic = json.load(f)" 43 | ] 44 | }, 45 | { 46 | "cell_type": "markdown", 47 | "id": "b8c17330", 48 | "metadata": {}, 49 | "source": [ 50 | "for single monomer system; value 1- 49 \n", 51 | "for binary monomer system; value 101 - 199 \n", 52 | "for sequential reaction; value 200 - \n", 53 | "for classification of the olefinic monomers 1001 - " 54 | ] 55 | }, 56 | { 57 | "cell_type": "code", 58 | "execution_count": null, 59 | "id": "bdbcc0c7", 60 | "metadata": {}, 61 | "outputs": [], 62 | "source": [ 
63
| "Ps_rxnL = {}" 64 | ] 65 | }, 66 | { 67 | "cell_type": "code", 68 | "execution_count": null, 69 | "id": "62a6aafb", 70 | "metadata": {}, 71 | "outputs": [], 72 | "source": [ 73 | "#vinyl homopolymerization \n", 74 | "n= mon_dic['vinyl']\n", 75 | "vinyl_homo = '[CX3;H2,H1,H0:1]=[C;H2,H1,H0:2]>>*-[C:1][C:2]-*'\n", 76 | "Ps_rxnL[n] = AllChem.ReactionFromSmarts(vinyl_homo)" 77 | ] 78 | }, 79 | { 80 | "cell_type": "code", 81 | "execution_count": null, 82 | "id": "0dd02cbf", 83 | "metadata": {}, 84 | "outputs": [], 85 | "source": [ 86 | "#epoxide homopolymerization (ROP) \n", 87 | "n= mon_dic['epo']\n", 88 | "epo_homo = '[CX4;H2,H1,H0;R:1]1[O;R][C;R:2]1>>*-[CX4:1][CX4:2][O]-*'\n", 89 | "Ps_rxnL[n] = AllChem.ReactionFromSmarts(epo_homo)" 90 | ] 91 | }, 92 | { 93 | "cell_type": "code", 94 | "execution_count": null, 95 | "id": "86703b67", 96 | "metadata": {}, 97 | "outputs": [], 98 | "source": [ 99 | "#cyclic olefin homopolymerization \n", 100 | "n= mon_dic['cOle']\n", 101 | "cOle_homo = '[CX3;R:1]=[CX3;R:2]>>*-[CX4;R:1][CX4;R:2]-*'\n", 102 | "Ps_rxnL[n] = AllChem.ReactionFromSmarts(cOle_homo)" 103 | ] 104 | }, 105 | { 106 | "cell_type": "code", 107 | "execution_count": null, 108 | "id": "11d6daf0", 109 | "metadata": {}, 110 | "outputs": [], 111 | "source": [ 112 | "#lactone ROP \n", 113 | "n= mon_dic['lactone']\n", 114 | "lactone_homo = '[CX3;R:1](=[OX1])[OX2;R:2]>>(*-[CX3:1](=[OX1]).[OX2:2]-*)'\n", 115 | "Ps_rxnL[n] = AllChem.ReactionFromSmarts(lactone_homo)" 116 | ] 117 | }, 118 | { 119 | "cell_type": "code", 120 | "execution_count": null, 121 | "id": "a6c73d55", 122 | "metadata": {}, 123 | "outputs": [], 124 | "source": [ 125 | "#lactam ROP \n", 126 | "n= mon_dic['lactam']\n", 127 | "lactam_homo = '[CX3;R:1](=[OX1])[NX3;R:2]>>(*-[CX3:1](=[OX1]).[NX3:2]-*)'\n", 128 | "Ps_rxnL[n] = AllChem.ReactionFromSmarts(lactam_homo)" 129 | ] 130 | }, 131 | { 132 | "cell_type": "code", 133 | "execution_count": null, 134 | "id": "7c857bfb", 135 | "metadata": {}, 136 | "outputs": [], 137 |
"source": [ 138 | "#hydroxy carboxylic acid self condensation \n", 139 | "n= mon_dic['hydCOOH']\n", 140 | "hydCOOH_homo = '([OX2H1;!$(OC=*):1].[CX3:2](=[O])[OX2H1])>>(*-[OX2:1].[CX3:2](=[O])-*)'\n", 141 | "Ps_rxnL[n] = AllChem.ReactionFromSmarts(hydCOOH_homo)" 142 | ] 143 | }, 144 | { 145 | "cell_type": "code", 146 | "execution_count": null, 147 | "id": "7866f94a", 148 | "metadata": {}, 149 | "outputs": [], 150 | "source": [ 151 | "#amino carboxylic acid self condensation \n", 152 | "n= mon_dic['aminCOOH']\n", 153 | "aminCOOH_homo = '([NX3;H2,H1;!$(NC=*):1].[CX3:2](=[O])[OX2H1])>>(*-[NX3:1].[CX3:2](=[O])-*)'\n", 154 | "Ps_rxnL[n] = AllChem.ReactionFromSmarts(aminCOOH_homo)" 155 | ] 156 | }, 157 | { 158 | "cell_type": "code", 159 | "execution_count": null, 160 | "id": "d9d453d8", 161 | "metadata": {}, 162 | "outputs": [], 163 | "source": [ 164 | "#hindered phenol oxidative polymerization \n", 165 | "n= mon_dic['hindPhenol']\n", 166 | "hindPhenol_homo = '[c]1([OH1:1])[c:2][c:3][c;H1:4][c:5][c:6]1>>[c]1([OX2:1]-[*])[c:2][c:3][c:4](-*)[c:5][c:6]1'\n", 167 | "Ps_rxnL[n] = AllChem.ReactionFromSmarts(hindPhenol_homo)" 168 | ] 169 | }, 170 | { 171 | "cell_type": "code", 172 | "execution_count": null, 173 | "id": "1f2f75ee", 174 | "metadata": {}, 175 | "outputs": [], 176 | "source": [ 177 | "#polyolefine:co-vinyl \n", 178 | "vinyl_cross = '[CX3;H2,H1,H0;!R:1]=[CX3;H2,H1,H0;!R:2].[CX3;H2,H1,H0;!R:3]=[CX3;H2,H1,H0;!R:4]>>*-[CX4:1][CX4:2][CX4:3][CX4:4]-*'\n", 179 | "Ps_rxnL[101] = AllChem.ReactionFromSmarts(vinyl_cross)" 180 | ] 181 | }, 182 | { 183 | "cell_type": "code", 184 | "execution_count": null, 185 | "id": "272d2bcb", 186 | "metadata": {}, 187 | "outputs": [], 188 | "source": [ 189 | "#polyolefine:vinyl-cOle \n", 190 | "VcO = '[CX3;H2,H1,H0;!R:1]=[CX3;H2,H1,H0;!R:2].[CX3;H1,H0;R:3]=[CX3;H1,H0;R:4]>>*-[CX4:1][CX4:2][CX4:3][CX4:4]-*'\n", 191 | "Ps_rxnL[102] = AllChem.ReactionFromSmarts(VcO)" 192 | ] 193 | }, 194 | { 195 | "cell_type": "code", 196 | "execution_count":
null, 197 | "id": "32892dd5", 198 | "metadata": {}, 199 | "outputs": [], 200 | "source": [ 201 | "#polyolefine:co-cOle \n", 202 | "cO_cross = '[CX3;H1,H0;R:1]=[CX3;H1,H0;R:2].[CX3;H1,H0;R:3]=[CX3;R:4]>>*-[CX4:1][CX4:2][CX4:3][CX4:4]-*'\n", 203 | "Ps_rxnL[103] = AllChem.ReactionFromSmarts(cO_cross)" 204 | ] 205 | }, 206 | { 207 | "cell_type": "code", 208 | "execution_count": null, 209 | "id": "2b5c7bac", 210 | "metadata": {}, 211 | "outputs": [], 212 | "source": [ 213 | "#polyester:diCOOH+diOH\n", 214 | "dehydest = '([CX3:1](=[O])[OX2H1,Cl,Br].[CX3:2](=[O])[OX2H1,Cl,Br]).([O,S;X2;H1;!$([O,S]C=*):3].[O,S;X2;H1;!$([O,S]C=*):4])>>(*-[CX3:1](=[O]).[CX3:2](=[O])-[O,S;X2;!$([O,S]C=*):3].[O,S;X2;!$([O,S]C=*):4]-*)'\n", 215 | "Ps_rxnL[104] = AllChem.ReactionFromSmarts(dehydest)" 216 | ] 217 | }, 218 | { 219 | "cell_type": "code", 220 | "execution_count": null, 221 | "id": "15c51288", 222 | "metadata": {}, 223 | "outputs": [], 224 | "source": [ 225 | "#polyester:co-hydroxy carboxylic acid condensation \n", 226 | "cohydCOOH = '([OX2H1;!$(OC=*):1].[CX3:2](=[O])[OX2H1]).([OX2H1;!$(OC=*):3].[CX3:4](=[O])[OX2H1])>>(*-[OX2:1].[CX3:2](=[O])[OX2:3].[CX3:4](=[O])-*)'\n", 227 | "Ps_rxnL[105] = AllChem.ReactionFromSmarts(cohydCOOH)" 228 | ] 229 | }, 230 | { 231 | "cell_type": "code", 232 | "execution_count": null, 233 | "id": "3be0e2ef", 234 | "metadata": {}, 235 | "outputs": [], 236 | "source": [ 237 | "#polycarbonate\n", 238 | "PC = '([O,S;X2;H1;!$([O,S]C=*):1].[O,S;X2;H1;!$([O,S]C=*):2]).[C-]#[O+]>>(*-[O,S;X2;!$([O,S]C=*):1].[O,S;X2;!$([O,S]C=*):2][CX3](=[O])-*)'\n", 239 | "Ps_rxnL[106] = AllChem.ReactionFromSmarts(PC)" 240 | ] 241 | }, 242 | { 243 | "cell_type": "code", 244 | "execution_count": null, 245 | "id": "706e2df7", 246 | "metadata": {}, 247 | "outputs": [], 248 | "source": [ 249 | "#polyester:cyclic anhydride+epo\n", 250 | "anhydepo = 
'[C,c;R:1][CX3,c;R](=[OX1])[OX2,o;R][CX3,c;R](=[OX1])[C,c;R:2].[CX4;R:3]1[OX2;R:4][CX4;R:5]1>>([C,c:1][CX3](=[OX1])(-*).[C,c:2][CX3](=[OX1])[OX2][CX4:3][CX4:5][OX2:4]-*)'\n", 251 | "Ps_rxnL[112] = AllChem.ReactionFromSmarts(anhydepo)" 252 | ] 253 | }, 254 | { 255 | "cell_type": "code", 256 | "execution_count": null, 257 | "id": "f46643e6", 258 | "metadata": {}, 259 | "outputs": [], 260 | "source": [ 261 | "#polyamide:diCOOH+diamin\n", 262 | "dehydamid = '([CX3:1](=[O])[OX2H1,Cl,Br].[CX3:2](=[O])[OX2H1,Cl,Br]).([N&X3;H2,H1;!$(NC=*):3].[N&X3;H2,H1;!$(NC=*):4])>>(*-[CX3:1](=[O]).[CX3:2](=[O])-[NX3;!$(NC=*):3].[NX3;!$(NC=*):4]-*)'\n", 263 | "Ps_rxnL[108] = AllChem.ReactionFromSmarts(dehydamid)" 264 | ] 265 | }, 266 | { 267 | "cell_type": "code", 268 | "execution_count": null, 269 | "id": "d5b103c7", 270 | "metadata": {}, 271 | "outputs": [], 272 | "source": [ 273 | "#polyamide:co-amino acid condensation \n", 274 | "coaminCOOH = '([N&X3;H2,H1;!$(NC=*):1].[CX3:2](=[O])[OX2H1]).([N&X3;H2,H1;!$(NC=*):3].[CX3:4](=[O])[OX2H1])>>(*-[NX3;!$(NC=*):1].[CX3:2](=[O])[NX3;!$(NC=*):3].[CX3:4](=[O])-*)'\n", 275 | "Ps_rxnL[109] = AllChem.ReactionFromSmarts(coaminCOOH)" 276 | ] 277 | }, 278 | { 279 | "cell_type": "code", 280 | "execution_count": null, 281 | "id": "4d7cd49e", 282 | "metadata": {}, 283 | "outputs": [], 284 | "source": [ 285 | "#polyimide:cyclic anhydride+primary diamine\n", 286 | "PI = '([CX3,c;R:1](=[OX1])[OX2,o;R][CX3,c;R:2](=[OX1]).[CX3,c;R:3](=[OX1])[OX2,o;R][CX3,c;R:4](=[OX1])).([C,c:5][NX3;H2;!$(N[C,S]=*)].[C,c:6][NX3;H2;!$(N[C,S]=*)])>>([CX3,c;R:1](=[OX1])[NX3;R]([C,c:5].[C,c:6]-*)[CX3,c;R:2](=[OX1]).[CX3,c;R:3](=[OX1])[NX3;R](-*)[CX3;R:4](=[OX1]))'\n", 287 | "Ps_rxnL[110] = AllChem.ReactionFromSmarts(PI)" 288 | ] 289 | }, 290 | { 291 | "cell_type": "code", 292 | "execution_count": null, 293 | "id": "c1bda9c4", 294 | "metadata": {}, 295 | "outputs": [], 296 | "source": [ 297 | "#polyurethane:diisocyanate+diOH\n", 298 | "PU = 
'([NX2:1]=[CX2]=[OX1,SX1:2].[NX2:3]=[CX2:4]=[OX1,SX1:5]).([OX2,SX2;H1;!$([O,S]C=*):6].[OX2,SX2;H1;!$([O,S]C=*):7])>>(*-[CX3](=[OX1,SX1:2])[NX3:1].[NX3:3][CX3:4](=[OX1,SX1:5])[OX2,SX2;!$([O,S]C=*):6].[OX2,SX2;!$([O,S]C=*):7]-*)'\n", 299 | "Ps_rxnL[111] = AllChem.ReactionFromSmarts(PU)" 300 | ] 301 | }, 302 | { 303 | "cell_type": "code", 304 | "execution_count": null, 305 | "id": "a9473c8b", 306 | "metadata": {}, 307 | "outputs": [], 308 | "source": [ 309 | "#poly-oxazolidone; diepo+diNCO\n", 310 | "pox = '([CX4;H2,H1,H0;R:1]1[OX2;R:2][CX4;H1,H0;R:3]1.[CX4;H2,H1,H0;R:4]2[OX2;R:5][CX4;H1,H0;R:6]2).([OX1,SX1:7]=[CX2:8]=[NX2:9].[OX1,SX1:10]=[CX2:11]=[NX2:12][C,c:13])>>([CX4;R:6]1[OX2;R:5][CX2;R:8](=[OX1,SX1:7])[NX3;R:9][CX4;R:4]1.[CX4;R:3]1[OX2;R:2][CX2;R:11](=[OX1,SX1:10])[NX3;R:12](-*)[CX4;R:1]1.[C,c:13](-*))'\n", 311 | "Ps_rxnL[113] = AllChem.ReactionFromSmarts(pox)" 312 | ] 313 | }, 314 | { 315 | "cell_type": "code", 316 | "execution_count": null, 317 | "id": "83654ea1", 318 | "metadata": {}, 319 | "outputs": [], 320 | "source": [ 321 | "#polysulfone; suldiX+diol\n", 322 | "PSU = '[c:1]1[c:2][c:3]([F,Cl,Br,I])[c:4][c:5][c:6]1[SX4](=[OX1])(=[OX1])[c:7]2[c:8][c:9][c:10]([F,Cl,Br,I])[c:11][c:12]2.([OX2;H1;!$([O,S]C=*):13].[OX2;H1;!$([O,S]C=*):14])>>[c:1]1[c:2][c:3](-[*])[c:4][c:5][c:6]1[SX4](=[OX1])(=[OX1])[c:7]2[c:8][c:9][c:10]([OX2;!$([O,S]C=*):13].[OX2;!$([O,S]C=*):14]-[*])[c:11][c:12]2'\n", 323 | "Ps_rxnL[114] = AllChem.ReactionFromSmarts(PSU)" 324 | ] 325 | }, 326 | { 327 | "cell_type": "code", 328 | "execution_count": null, 329 | "id": "7f972e17", 330 | "metadata": {}, 331 | "outputs": [], 332 | "source": [ 333 | "#polyetherketone; BzodiF+diol\n", 334 | "PEK = '[c:1]1[c:2][c:3]([F])[c:4][c:5][c:6]1[CX3](=[OX1])[c:7]2[c:8][c:9][c:10]([F])[c:11][c:12]2.([OX2;H1;!$([O,S]C=*):13].[OX2;H1;!$([O,S]C=*):14])>>[c:1]1[c:2][c:3](-[*])[c:4][c:5][c:6]1[CX3](=[OX1])[c:7]2[c:8][c:9][c:10]([OX2;!$([O,S]C=*):13].[OX2;!$([O,S]C=*):14]-[*])[c:11][c:12]2'\n", 335 | "Ps_rxnL[115] =
AllChem.ReactionFromSmarts(PEK)" 336 | ] 337 | }, 338 | { 339 | "cell_type": "code", 340 | "execution_count": null, 341 | "id": "ffb2cc17-7933-4968-9783-16875d195005", 342 | "metadata": {}, 343 | "outputs": [], 344 | "source": [ 345 | "#ROMP\n", 346 | "n= mon_dic['cycCH']\n", 347 | "ROMP = '[CX3;H1;R:1]=[CX3;H1;R:2][CX4;R:3]>>([CX3;H1:1]=[CX3;H1:2]-*.[CX4:3]-*)'\n", 348 | "Ps_rxnL[n] = AllChem.ReactionFromSmarts(ROMP)" 349 | ] 350 | }, 351 | { 352 | "cell_type": "code", 353 | "execution_count": null, 354 | "id": "ca2db5cd-ffc8-496e-92cf-34985085cca5", 355 | "metadata": {}, 356 | "outputs": [], 357 | "source": [ 358 | "#COC for cyclic olefin \n", 359 | "COC_cyc = '[CX3;H1;R:1]=[CX3;H1;R:2]>>[CX4;R:1](-*)-[CX4;R:2]-*'\n", 360 | "Ps_rxnL[1051] = AllChem.ReactionFromSmarts(COC_cyc)" 361 | ] 362 | }, 363 | { 364 | "cell_type": "code", 365 | "execution_count": null, 366 | "id": "e267c338-52bb-42c8-812b-438a091657bc", 367 | "metadata": {}, 368 | "outputs": [], 369 | "source": [ 370 | "#COC for chain olefin \n", 371 | "#'[CX3;H2;!R:1]=[CX3;H2;!R:2]', '[CX3;H2;!R:1]=[CX3;H1;!R:2]',\n", 372 | "n= mon_dic['aliphCH']\n", 373 | "COC_chain = '[CX3;H2;!R:1]=[CX3;H2,H1;!R:2]>>[CX4;!R:1](-*)-[CX4;!R:2]-*'\n", 374 | "Ps_rxnL[n] = AllChem.ReactionFromSmarts(COC_chain)" 375 | ] 376 | }, 377 | { 378 | "cell_type": "code", 379 | "execution_count": null, 380 | "id": "74a5fbb2", 381 | "metadata": {}, 382 | "outputs": [], 383 | "source": [ 384 | "#sequential reaction for olefin group \n", 385 | "seqole = '[CX3:1]=[CX3:2]>>*-[CX4:1][CX4:2]-*'\n", 386 | "Ps_rxnL[200] = AllChem.ReactionFromSmarts(seqole)" 387 | ] 388 | }, 389 | { 390 | "cell_type": "code", 391 | "execution_count": null, 392 | "id": "4e030c89", 393 | "metadata": {}, 394 | "outputs": [], 395 | "source": [ 396 | "#sequential reaction for epoxide group \n", 397 | "seqepo = '[CX4;R:1]1[OX2;R][CX4;R:2]1>>[CX4:1](-*)[CX4:2][OX2]-*'\n", 398 | "Ps_rxnL[201] = AllChem.ReactionFromSmarts(seqepo)" 399 | ] 400 | }, 401 | { 402 |
"cell_type": "code", 403 | "execution_count": null, 404 | "id": "b4dae4a4", 405 | "metadata": {}, 406 | "outputs": [], 407 | "source": [ 408 | "#sequential reaction for COOH group \n", 409 | "seqCOOH = '[CX3:1](=[O])[OX2H1,F,Cl,Br,I]>>[CX3:1](=[O])-[*]'\n", 410 | "Ps_rxnL[202] = AllChem.ReactionFromSmarts(seqCOOH)" 411 | ] 412 | }, 413 | { 414 | "cell_type": "code", 415 | "execution_count": null, 416 | "id": "ea9ca8dd", 417 | "metadata": {}, 418 | "outputs": [], 419 | "source": [ 420 | "#sequential reaction for hydroxyl group \n", 421 | "seqOH = '[C,c:1][OX2,SX2;H1;!$([O,S]C=*):2]>>[C,c:1][OX2,SX2;!$([O,S]C=*):2]-[*]'\n", 422 | "Ps_rxnL[203] = AllChem.ReactionFromSmarts(seqOH)" 423 | ] 424 | }, 425 | { 426 | "cell_type": "code", 427 | "execution_count": null, 428 | "id": "96af814e", 429 | "metadata": {}, 430 | "outputs": [], 431 | "source": [ 432 | "#sequential reaction for primary amine \n", 433 | "seqamin = '[C,c:1][NX3;H2;!$(NC=*):2]>>[C,c:1][NX3;!$(NC=*):2]-[*]'\n", 434 | "Ps_rxnL[204] = AllChem.ReactionFromSmarts(seqamin)" 435 | ] 436 | }, 437 | { 438 | "cell_type": "code", 439 | "execution_count": null, 440 | "id": "f41e2f8c", 441 | "metadata": {}, 442 | "outputs": [], 443 | "source": [ 444 | "#sequential reaction for isocyanate group \n", 445 | "seqNCO = '[NX2:1]=[CX2:2]=[OX1,SX1:3]>>[NX3H1:1][CX3:2](=[OX1,SX1:3])-*'\n", 446 | "Ps_rxnL[205] = AllChem.ReactionFromSmarts(seqNCO)" 447 | ] 448 | }, 449 | { 450 | "cell_type": "code", 451 | "execution_count": null, 452 | "id": "11c9e756", 453 | "metadata": {}, 454 | "outputs": [], 455 | "source": [ 456 | "#sequential reaction for cyclic anhydride group \n", 457 | "seqcAnhyd = '[C,c:1][CX3,c;R:2](=[OX1])[OX2,o;R][CX3,c;R:3](=[O])[C,c:4]>>([C,c:1][CX3:2](=[OX1])(-*).[C,c:4][CX3:3](=[OX1])[OX2]-*)'\n", 458 | "Ps_rxnL[206] = AllChem.ReactionFromSmarts(seqcAnhyd)" 459 | ] 460 | }, 461 | { 462 | "cell_type": "code", 463 | "execution_count": null, 464 | "id": "68e27e9e", 465 | "metadata": {}, 466 | "outputs": [],
467 | "source": [ 468 | "#sequential reaction for cyclic anhydride group 2 (for imide) \n", 469 | "seqcAnhyd2 = '[CX3,c;R:1](=[OX1])[OX2,o;R][CX3,c;R:2](=[OX1])>>([CX3:1](=[OX1])[NX3](-*)[CX3:2](=[OX1]))'\n", 470 | "Ps_rxnL[207] = AllChem.ReactionFromSmarts(seqcAnhyd2)" 471 | ] 472 | }, 473 | { 474 | "cell_type": "code", 475 | "execution_count": null, 476 | "id": "a338bea4", 477 | "metadata": {}, 478 | "outputs": [], 479 | "source": [ 480 | "#sequential reaction for isocyanate group 2 (for oxazolidone) \n", 481 | "seqNCO2 = '[NX2:1]=[CX2:2]=[OX1,SX1:3]>>[NX3:1](-*)[CX3:2](=[OX1,SX1:3])-*'\n", 482 | "Ps_rxnL[208] = AllChem.ReactionFromSmarts(seqNCO2)" 483 | ] 484 | }, 485 | { 486 | "cell_type": "code", 487 | "execution_count": null, 488 | "id": "5bd3f8a4-efa6-435a-b28a-cb1672196561", 489 | "metadata": {}, 490 | "outputs": [], 491 | "source": [ 492 | "#sequential reaction converting a 1,2-added diene to a 1,4-addition\n", 493 | "seqdiene_12to14 = '[CX3:1]=[CX3;!R:2]-[CX4;!R:3](-[3H])[CX4:4](-[3H])>>[3H]-[CX4:1]-[CX3;!R:2]=[CX3;!R:3][CX4:4]-[3H]'\n", 494 | "Ps_rxnL[209] = AllChem.ReactionFromSmarts(seqdiene_12to14)" 495 | ] 496 | }, 497 | { 498 | "cell_type": "code", 499 | "execution_count": null, 500 | "id": "a4e5e1e3-70ac-4a27-ac55-44e1fc3f40a6", 501 | "metadata": {}, 502 | "outputs": [], 503 | "source": [ 504 | "#sequential reaction for ROMPH \n", 505 | "seqROMPH = '[CX3:1]=[CX3:2]>>[CX4:1]-[CX4:2]'\n", 506 | "Ps_rxnL[210] = AllChem.ReactionFromSmarts(seqROMPH)" 507 | ] 508 | }, 509 | { 510 | "cell_type": "code", 511 | "execution_count": null, 512 | "id": "7ec50865", 513 | "metadata": {}, 514 | "outputs": [], 515 | "source": [ 516 | "with open(\"./rules/ps_rxn.pkl\",\"wb\") as f:\n", 517 | " pickle.dump(Ps_rxnL, f)" 518 | ] 519 | }, 520 | { 521 | "cell_type": "markdown", 522 | "id": "ee5b1b08", 523 | "metadata": {}, 524 | "source": [ 525 | "#end" 526 | ] 527 | } 528 | ], 529 | "metadata": { 530 | "kernelspec": { 531 | "display_name": "Python 3 (ipykernel)", 532 | "language":
"python", 533 | "name": "python3" 534 | }, 535 | "language_info": { 536 | "codemirror_mode": { 537 | "name": "ipython", 538 | "version": 3 539 | }, 540 | "file_extension": ".py", 541 | "mimetype": "text/x-python", 542 | "name": "python", 543 | "nbconvert_exporter": "python", 544 | "pygments_lexer": "ipython3", 545 | "version": "3.9.21" 546 | } 547 | }, 548 | "nbformat": 4, 549 | "nbformat_minor": 5 550 | } 551 | -------------------------------------------------------------------------------- /utilities/3_Ps_GenL.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "cells": [ 3 | { 4 | "cell_type": "markdown", 5 | "id": "8dd34601", 6 | "metadata": {}, 7 | "source": [ 8 | "Copyright (c) 2021 Mitsuru Ohno \n", 9 | "Use of this source code is governed by a BSD-3-style \n", 10 | "license that can be found in the LICENSE file. \n", 11 | " \n", 12 | "08/11/2021, M. Ohno \n", 13 | "polymer and polymerization list. \n", 14 | "\n", 15 | "Reference: \n", 16 | "https://www.daylight.com/dayhtml_tutorials/languages/smarts/smarts_examples.html\n", 17 | "https://www.daylight.com/dayhtml/doc/theory/theory.smarts.html " 18 | ] 19 | }, 20 | { 21 | "cell_type": "code", 22 | "execution_count": null, 23 | "id": "df5482df", 24 | "metadata": {}, 25 | "outputs": [], 26 | "source": [ 27 | "import json\n", 28 | "import pickle" 29 | ] 30 | }, 31 | { 32 | "cell_type": "code", 33 | "execution_count": null, 34 | "id": "a4fda4cd", 35 | "metadata": {}, 36 | "outputs": [], 37 | "source": [ 38 | "with open('./rules/mon_dic.json', 'r') as f:\n", 39 | " mon_dic = json.load(f)" 40 | ] 41 | }, 42 | { 43 | "cell_type": "code", 44 | "execution_count": null, 45 | "id": "26ad4d4a", 46 | "metadata": {}, 47 | "outputs": [], 48 | "source": [ 49 | "with open('./rules/ps_rxn.pkl', 'rb') as f:\n", 50 | " Ps_rxnL = pickle.load(f)" 51 | ] 52 | }, 53 | { 54 | "cell_type": "code", 55 | "execution_count": null, 56 | "id": "6a015641", 57 | "metadata": {}, 58 | "outputs": [],
59 | "source": [ 60 | "#values in dictionary based on the class of PoLyInfo\n", 61 | "Ps_classL = {'polyolefin':11, 'polyester':6, 'polyether':12, 'polyamide':2, 'polyimide':8, 'polyurethane':19, \n", 62 | " 'polyoxazolidone':23, }" 63 | ] 64 | }, 65 | { 66 | "cell_type": "code", 67 | "execution_count": null, 68 | "id": "867ff451", 69 | "metadata": {}, 70 | "outputs": [], 71 | "source": [ 72 | "Ps_GenL = {}" 73 | ] 74 | }, 75 | { 76 | "cell_type": "code", 77 | "execution_count": null, 78 | "id": "15db3d51", 79 | "metadata": {}, 80 | "outputs": [], 81 | "source": [ 82 | "Ps_GenL['polyolefin'] = (('vinyl', 'none', Ps_rxnL[mon_dic['vinyl']]), ('cOle', 'none', Ps_rxnL[mon_dic['cOle']]), \n", 83 | " ('vinyl', 'vinyl', Ps_rxnL[101]), ('vinyl', 'cOle', Ps_rxnL[102]), )" 84 | ] 85 | }, 86 | { 87 | "cell_type": "code", 88 | "execution_count": null, 89 | "id": "67ebd1b9", 90 | "metadata": {}, 91 | "outputs": [], 92 | "source": [ 93 | "Ps_GenL['polyester'] = (('hydCOOH', 'none', Ps_rxnL[mon_dic['hydCOOH']]), ('lactone', 'none', Ps_rxnL[mon_dic['lactone']]), \n", 94 | " ('hydCOOH', 'hydCOOH', Ps_rxnL[105]), ('diol', 'CO', Ps_rxnL[106]), \n", 95 | " ('diCOOH', 'diol', Ps_rxnL[104]), ('cAnhyd', 'epo', Ps_rxnL[112]), )" 96 | ] 97 | }, 98 | { 99 | "cell_type": "code", 100 | "execution_count": null, 101 | "id": "7160c30f", 102 | "metadata": {}, 103 | "outputs": [], 104 | "source": [ 105 | "Ps_GenL['polyether'] = (('epo', 'none', Ps_rxnL[mon_dic['epo']]), ('hindPhenol', 'none', Ps_rxnL[mon_dic['hindPhenol']]), \n", 106 | " ('sfonediX', 'diol_b', Ps_rxnL[114]), ('BzodiF', 'diol_b', Ps_rxnL[115]), )" 107 | ] 108 | }, 109 | { 110 | "cell_type": "code", 111 | "execution_count": null, 112 | "id": "7d174411", 113 | "metadata": {}, 114 | "outputs": [], 115 | "source": [ 116 | "Ps_GenL['polyamide'] = (('lactam', 'none', Ps_rxnL[mon_dic['lactam']]), ('aminCOOH', 'none', Ps_rxnL[mon_dic['aminCOOH']]), \n", 117 | " ('diCOOH', 'diamin', Ps_rxnL[108]), ('aminCOOH', 'aminCOOH', Ps_rxnL[109]),)" 
118 | ] 119 | }, 120 | { 121 | "cell_type": "code", 122 | "execution_count": null, 123 | "id": "2b53bc9e", 124 | "metadata": {}, 125 | "outputs": [], 126 | "source": [ 127 | "Ps_GenL['polyimide'] = (('dicAnhyd', 'diamin', Ps_rxnL[110]), )" 128 | ] 129 | }, 130 | { 131 | "cell_type": "code", 132 | "execution_count": null, 133 | "id": "9d8f93c1", 134 | "metadata": {}, 135 | "outputs": [], 136 | "source": [ 137 | "Ps_GenL['polyurethane'] = (('diNCO', 'diol', Ps_rxnL[111]), )" 138 | ] 139 | }, 140 | { 141 | "cell_type": "code", 142 | "execution_count": null, 143 | "id": "c8660bc3", 144 | "metadata": {}, 145 | "outputs": [], 146 | "source": [ 147 | "Ps_GenL['polyoxazolidone'] = (('diepo', 'diNCO', Ps_rxnL[113]), )" 148 | ] 149 | }, 150 | { 151 | "cell_type": "code", 152 | "execution_count": null, 153 | "id": "8eb72b46-5fe6-4a03-aeeb-b9f93f3a3243", 154 | "metadata": {}, 155 | "outputs": [], 156 | "source": [ 157 | "Ps_GenL['ROMP'] = (('cycCH', 'none', Ps_rxnL[1050]), )" 158 | ] 159 | }, 160 | { 161 | "cell_type": "code", 162 | "execution_count": null, 163 | "id": "a4bfa3bf-1900-402b-acdb-e0e05c29c33f", 164 | "metadata": {}, 165 | "outputs": [], 166 | "source": [ 167 | "Ps_GenL['COC'] = (('cycCH', 'aliphCH', 'none'), )" 168 | ] 169 | }, 170 | { 171 | "cell_type": "code", 172 | "execution_count": null, 173 | "id": "ef4b3b68-b49f-4400-b96b-f827219492e4", 174 | "metadata": {}, 175 | "outputs": [], 176 | "source": [ 177 | "Ps_GenL['rec:radi'] = ('acryl', 'bEWole', 'styryl', 'allyl', 'haloCH', 'vinylester', 'malei', 'conjdiene', )\n", 178 | "Ps_GenL['rec:cati'] = ('styryl', 'vinylether', 'tertcatCH', )\n", 179 | "Ps_GenL['rec:ani'] = ('acryl', 'styryl', 'conjdiene')\n", 180 | "Ps_GenL['rec:coord'] = ('ROMP', 'ROMPH', 'COC', )" 181 | ] 182 | }, 183 | { 184 | "cell_type": "code", 185 | "execution_count": null, 186 | "id": "d1eec9a5", 187 | "metadata": {}, 188 | "outputs": [], 189 | "source": [ 190 | "with open(\"./rules/ps_class.json\",\"w\") as f:\n", 191 | " 
json.dump(Ps_classL, f)" 192 | ] 193 | }, 194 | { 195 | "cell_type": "code", 196 | "execution_count": null, 197 | "id": "734df890", 198 | "metadata": {}, 199 | "outputs": [], 200 | "source": [ 201 | "with open(\"./rules/ps_gen.pkl\",\"wb\") as f:\n", 202 | " pickle.dump(Ps_GenL, f)" 203 | ] 204 | }, 205 | { 206 | "cell_type": "code", 207 | "execution_count": null, 208 | "id": "9a2d7d17", 209 | "metadata": {}, 210 | "outputs": [], 211 | "source": [] 212 | } 213 | ], 214 | "metadata": { 215 | "kernelspec": { 216 | "display_name": "Python 3 (ipykernel)", 217 | "language": "python", 218 | "name": "python3" 219 | }, 220 | "language_info": { 221 | "codemirror_mode": { 222 | "name": "ipython", 223 | "version": 3 224 | }, 225 | "file_extension": ".py", 226 | "mimetype": "text/x-python", 227 | "name": "python", 228 | "nbconvert_exporter": "python", 229 | "pygments_lexer": "ipython3", 230 | "version": "3.9.21" 231 | } 232 | }, 233 | "nbformat": 4, 234 | "nbformat_minor": 5 235 | } 236 | -------------------------------------------------------------------------------- /utilities/rules/excl_lst.json: -------------------------------------------------------------------------------- 1 | {"0": [], "1": ["[CX3H1]=[O]", "[$(*-[NX2-]-[NX2+]#[NX1]),$(*-[NX2]=[NX2+]=[NX1-])]", "[CX3;!R](=[OX1])[OX2;!R][CX3;!R](=[OX1])", "[CX3;!R](=[OX1])[NX3;!R][CX3;!R](=[OX1])"], "2": ["[N&X3;H2,H1;!$(N[C,S]=*)]", "[CX3H1]=[O]", "[$(*-[NX2-]-[NX2+]#[NX1]),$(*-[NX2]=[NX2+]=[NX1-])]", "[CX3;!R](=[OX1])[OX2;!R][CX3;!R](=[OX1])", "[CX3;!R](=[OX1])[NX3;!R][CX3;!R](=[OX1])"], "51": ["[N&X3;H2,H1;!$(N[C,S]=*)]", "[CX3H1]=[O]", "[$(*-[NX2-]-[NX2+]#[NX1]),$(*-[NX2]=[NX2+]=[NX1-])]", "[CX3;!R](=[OX1])[OX2;!R][CX3;!R](=[OX1])", "[CX3;!R](=[OX1])[NX3;!R][CX3;!R](=[OX1])"], "3": ["[CX3;R]=[CX3;R]-[CX3;R]=[CX3;R]", "[CX3H1]=[O]", "[$(*-[NX2-]-[NX2+]#[NX1]),$(*-[NX2]=[NX2+]=[NX1-])]", "[CX3;!R](=[OX1])[OX2;!R][CX3;!R](=[OX1])", "[CX3;!R](=[OX1])[NX3;!R][CX3;!R](=[OX1])"], "4": 
["[OX2;R]1[CX3;R](=[OX1])[C,c;R][C,c;R][C,c;R]1", "[c][OX2;R5][CX3;R5](=[OX1])", "[OX2;R5][CX3;R5](=[OX1])[c]", "[C,c][C;R](=[OX1])[O;R][C;R](=[OX1])[C,c]", "[CX3H1]=[O]", "[$(*-[NX2-]-[NX2+]#[NX1]),$(*-[NX2]=[NX2+]=[NX1-])]", "[CX3;!R](=[OX1])[OX2;!R][CX3;!R](=[OX1])", "[CX3;!R](=[OX1])[NX3;!R][CX3;!R](=[OX1])"], "5": ["[NX3;R]1[CX3;R](=[OX1])[C,c;R][C,c;R][C,c;R]1", "[C,c;R5][NX3;R5][CX3;R5](=[OX1])[C,c;R5]", "[C;R](=[OX1])[N;R][C;R](=[OX1])", "[C;R][N;R][C;R](=[OX1])[N;R][CX3H1]=[O]", "[$(*-[NX2-]-[NX2+]#[NX1]),$(*-[NX2]=[NX2+]=[NX1-])]", "[CX3;!R](=[OX1])[OX2;!R][CX3;!R](=[OX1])", "[CX3;!R](=[OX1])[NX3;!R][CX3;!R](=[OX1])"], "6": ["[CX3H1]=[O]", "[$(*-[NX2-]-[NX2+]#[NX1]),$(*-[NX2]=[NX2+]=[NX1-])]", "[CX3;!R](=[OX1])[OX2;!R][CX3;!R](=[OX1])", "[CX3;!R](=[OX1])[NX3;!R][CX3;!R](=[OX1])"], "7": ["[CX3H1]=[O]", "[$(*-[NX2-]-[NX2+]#[NX1]),$(*-[NX2]=[NX2+]=[NX1-])]", "[CX3;!R](=[OX1])[OX2;!R][CX3;!R](=[OX1])", "[CX3;!R](=[OX1])[NX3;!R][CX3;!R](=[OX1])"], "8": ["[N&X3;H2,H1,H0;!$(N[C,S]=*)]", "[F,Cl,Br,I]", "[CX3H1]=[O]", "[$(*-[NX2-]-[NX2+]#[NX1]),$(*-[NX2]=[NX2+]=[NX1-])]", "[CX3;!R](=[OX1])[OX2;!R][CX3;!R](=[OX1])", "[CX3;!R](=[OX1])[NX3;!R][CX3;!R](=[OX1])"], "9": ["[NX2;R][C;R;X3](=[O])[O;R][C;R;X3](=[O])[C;R]", "[NX3;R][C;R;X3](=[O])[O;R][C;R;X3](=[O])[C;R]", "[CX3H1]=[O]", "[$(*-[NX2-]-[NX2+]#[NX1]),$(*-[NX2]=[NX2+]=[NX1-])]", "[CX3;!R](=[OX1])[OX2;!R][CX3;!R](=[OX1])", "[CX3;!R](=[OX1])[NX3;!R][CX3;!R](=[OX1])"], "10": ["[CX3H1]=[O]", "[$(*-[NX2-]-[NX2+]#[NX1]),$(*-[NX2]=[NX2+]=[NX1-])]", "[CX3;!R](=[OX1])[OX2;!R][CX3;!R](=[OX1])", "[CX3;!R](=[OX1])[NX3;!R][CX3;!R](=[OX1])"], "11": ["[CX3H1]=[O]", "[$(*-[NX2-]-[NX2+]#[NX1]),$(*-[NX2]=[NX2+]=[NX1-])]", "[CX3;!R](=[OX1])[OX2;!R][CX3;!R](=[OX1])", "[CX3;!R](=[OX1])[NX3;!R][CX3;!R](=[OX1])"], "52": ["[CX4H2][OH1]", "[CX4H1][OH1]", "[CX4H0][OH1]", "[CX4H0][C](=[O])[OH1]", "[CX4H0][C](=[O])[Cl,Br][N&X3;H2,H1;!$(N[C,S]=*)]", "[CX3H1]=[O]", "[$(*-[NX2-]-[NX2+]#[NX1]),$(*-[NX2]=[NX2+]=[NX1-])]", 
"[CX3;!R](=[OX1])[OX2;!R][CX3;!R](=[OX1])", "[CX3;!R](=[OX1])[NX3;!R][CX3;!R](=[OX1])"], "53": ["[CX3H0](=[O])[OH1]", "[CX4H0;!R]([C])([C])[O,S;H1]", "[CX4H1;R]([OH1])[CX4H1;R]([OH1])[O;R]", "[N&X3;H2,H1;!$(N[C,S]=*)]", "[CX3H1]=[O]", "[$(*-[NX2-]-[NX2+]#[NX1]),$(*-[NX2]=[NX2+]=[NX1-])]", "[CX3;!R](=[OX1])[OX2;!R][CX3;!R](=[OX1])", "[CX3;!R](=[OX1])[NX3;!R][CX3;!R](=[OX1])"], "54": ["[CX3H0](=[O])[OH1]", "[CX4H0]([C])([C])[OH1]", "[CX4H1;R]([OH1])[CX4H1;R]([OH1])[O;R]", "[C][OX2H]", "[C][SX2H]", "[CX3H1]=[O]", "[$(*-[NX2-]-[NX2+]#[NX1]),$(*-[NX2]=[NX2+]=[NX1-])]", "[CX3;!R](=[OX1])[OX2;!R][CX3;!R](=[OX1])", "[CX3;!R](=[OX1])[NX3;!R][CX3;!R](=[OX1])"], "55": ["[CX3H1]=[O]", "[$(*-[NX2-]-[NX2+]#[NX1]),$(*-[NX2]=[NX2+]=[NX1-])]", "[CX3;!R](=[OX1])[OX2;!R][CX3;!R](=[OX1])", "[CX3;!R](=[OX1])[NX3;!R][CX3;!R](=[OX1])"], "56": ["[NX2;R][C;R;X3](=[O])[O;R][C;R;X3](=[O])[C;R]", "[NX3;R][C;R;X3](=[O])[O;R][C;R;X3](=[O])[C;R]", "[CX3H1]=[O]", "[$(*-[NX2-]-[NX2+]#[NX1]),$(*-[NX2]=[NX2+]=[NX1-])]", "[CX3;!R](=[OX1])[OX2;!R][CX3;!R](=[OX1])", "[CX3;!R](=[OX1])[NX3;!R][CX3;!R](=[OX1])"], "57": ["[CX3H0](=[O])[OH1]", "[CX3H0](=[O])[NX3]", "[CX4][OH1]", "[c][OH1]", "[CX3H1]=[O]", "[$(*-[NX2-]-[NX2+]#[NX1]),$(*-[NX2]=[NX2+]=[NX1-])]", "[CX3;!R](=[OX1])[OX2;!R][CX3;!R](=[OX1])", "[CX3;!R](=[OX1])[NX3;!R][CX3;!R](=[OX1])"], "12": ["[CX3](=[OX1])[OX2]", "[CX3H0](=[O])[NX3]", "[CX4][OH1]", "[c][OH1]", "[CX4][N&X3;H2;!$(N[C,S]=*)]", "[c][N&X3;H2;!$(N[C,S]=*)]", "[NX1]#[CX2]", "[CX4]([C,c])([C,c])[C,c]", "[CX3H1]=[O]", "[$(*-[NX2-]-[NX2+]#[NX1]),$(*-[NX2]=[NX2+]=[NX1-])]", "[CX3;!R](=[OX1])[OX2;!R][CX3;!R](=[OX1])", "[CX3;!R](=[OX1])[NX3;!R][CX3;!R](=[OX1])"], "13": ["[CX3](=[OX1])[OX2]", "[CX3H0](=[O])[NX3]", "[CX4][OH1]", "[c][OH1]", "[CX4][N&X3;H2;!$(N[C,S]=*)]", "[c][N&X3;H2;!$(N[C,S]=*)]", "[NX1]#[CX2]", "[CX4]([C,c])([C,c])[C,c]", "[CX3H1]=[O]", "[$(*-[NX2-]-[NX2+]#[NX1]),$(*-[NX2]=[NX2+]=[NX1-])]", "[CX3;!R](=[OX1])[OX2;!R][CX3;!R](=[OX1])", 
"[CX3;!R](=[OX1])[NX3;!R][CX3;!R](=[OX1])"], "58": ["[CX3](=[OX1])[OX2]", "[CX4H0;!R]([C])([C])[O,S;H1]", "[CX4H1;R]([OH1])[CX4H1;R]([OH1])[O;R]", "[CX4H1][SX2;H1]", "[CX4H2][SX2;H1]", "[c][SX2;H1]", "[N&X3;H2,H1;!$(N[C,S]=*)]", "[CX3H0](=[O])[NX3]", "[NX1]#[CX2]", "[CX4]1[OX2][CX4]1", "[CX3H1]=[O]", "[$(*-[NX2-]-[NX2+]#[NX1]),$(*-[NX2]=[NX2+]=[NX1-])]", "[CX3;!R](=[OX1])[OX2;!R][CX3;!R](=[OX1])", "[CX3;!R](=[OX1])[NX3;!R][CX3;!R](=[OX1])"], "1001": ["[CX4]-[I,Br]", "[CX3H1]=[O]", "[$(*-[NX2-]-[NX2+]#[NX1]),$(*-[NX2]=[NX2+]=[NX1-])]", "[CX3;!R](=[OX1])[OX2;!R][CX3;!R](=[OX1])", "[CX3;!R](=[OX1])[NX3;!R][CX3;!R](=[OX1])", "[CX4]-[NX2]=[NX2]-[CX4]", "[OX2]-[OX2]", "[OX2H][cX3]:[c]", "[CX3]1(=[OX1])[CX3]=[CX3][CX3](=[OX1])[CX3]=[CX2]1"], "1002": ["[CX4]-[I,Br]", "[CX3H1]=[O]", "[$(*-[NX2-]-[NX2+]#[NX1]),$(*-[NX2]=[NX2+]=[NX1-])]", "[CX3;!R](=[OX1])[OX2;!R][CX3;!R](=[OX1])", "[CX3;!R](=[OX1])[NX3;!R][CX3;!R](=[OX1])", "[CX4]-[NX2]=[NX2]-[CX4]", "[OX2]-[OX2]", "[OX2H][cX3]:[c]", "[CX3]1(=[OX1])[CX3]=[CX3][CX3](=[OX1])[CX3]=[CX2]1"], "1003": ["[CX4]-[I,Br]", "[CX3H1]=[O]", "[$(*-[NX2-]-[NX2+]#[NX1]),$(*-[NX2]=[NX2+]=[NX1-])]", "[CX3;!R](=[OX1])[OX2;!R][CX3;!R](=[OX1])", "[CX3;!R](=[OX1])[NX3;!R][CX3;!R](=[OX1])", "[CX4]-[NX2]=[NX2]-[CX4]", "[OX2]-[OX2]", "[OX2H][cX3]:[c]", "[CX3]1(=[OX1])[CX3]=[CX3][CX3](=[OX1])[CX3]=[CX2]1"], "1004": ["[CX4]-[I,Br]", "[CX3H1]=[O]", "[$(*-[NX2-]-[NX2+]#[NX1]),$(*-[NX2]=[NX2+]=[NX1-])]", "[CX3;!R](=[OX1])[OX2;!R][CX3;!R](=[OX1])", "[CX3;!R](=[OX1])[NX3;!R][CX3;!R](=[OX1])", "[CX4]-[NX2]=[NX2]-[CX4]", "[OX2]-[OX2]", "[OX2H][cX3]:[c]", "[CX3]1(=[OX1])[CX3]=[CX3][CX3](=[OX1])[CX3]=[CX2]1"], "1005": ["[A+1]", "[A+2]", "[A+3]", "[A+4]", "[a+1]", "[a+2]", "[a+3]", "[a+4]", "[A-1]", "[A-2]", "[A-3]", "[A-4]", "[a-1]", "[a-2]", "[a-3]", "[a-4]", "[CX3]=[CX3]-[CX3]=[CX3]", "[CX2]#[CX2]", "[Li,Na,K,Rb,Cs]", "[Be,Mg,Ca,Sr,Ba]", "[Zn,Cd,Hg]", "[B,b]", "[c]", "[N,n]", "[O,o]", "[Al]", "[P,p]", "[S,s]", "[Ga]", "[Ge]", "[As,as]", "[Se,se]", "[In]", 
"[Sn,sn]", "[Sb,sb]", "[Te,te]", "[Tl]", "[Pb,pb]", "[Bi]", "[Br,I]", "[si]", "[Si][F,Cl]", "[Si;H4,H3,H2,H1]"], "1006": ["[CX4]-[I,Br]", "[CX3H1]=[O]", "[$(*-[NX2-]-[NX2+]#[NX1]),$(*-[NX2]=[NX2+]=[NX1-])]", "[CX3;!R](=[OX1])[OX2;!R][CX3;!R](=[OX1])", "[CX3;!R](=[OX1])[NX3;!R][CX3;!R](=[OX1])", "[CX4]-[NX2]=[NX2]-[CX4]", "[OX2]-[OX2]", "[OX2H][cX3]:[c]", "[CX3]1(=[OX1])[CX3]=[CX3][CX3](=[OX1])[CX3]=[CX2]1"], "1007": ["[CX4]-[I,Br]", "[CX3H1]=[O]", "[$(*-[NX2-]-[NX2+]#[NX1]),$(*-[NX2]=[NX2+]=[NX1-])]", "[CX3;!R](=[OX1])[OX2;!R][CX3;!R](=[OX1])", "[CX3;!R](=[OX1])[NX3;!R][CX3;!R](=[OX1])"], "1020": ["[CX3]=[CX3](-[CX3;!R]=[CX3;!R])=[CX3]", "[CX3]=[CX3]-[CX3]=[CX3]-[CX3]=[CX3]", "[CX4]-[I,Br]", "[CX3H1]=[O]", "[$(*-[NX2-]-[NX2+]#[NX1]),$(*-[NX2]=[NX2+]=[NX1-])]", "[CX3;!R](=[OX1])[OX2;!R][CX3;!R](=[OX1])", "[CX3;!R](=[OX1])[NX3;!R][CX3;!R](=[OX1])", "[CX4]-[NX2]=[NX2]-[CX4]", "[OX2]-[OX2]", "[OX2H][cX3]:[c]", "[CX3]1(=[OX1])[CX3]=[CX3][CX3](=[OX1])[CX3]=[CX2]1"], "1030": ["[NX3;H3,H2,H1;!$(NC=O)]", "[n]", "[CX4;H2,H1][OX2;R][CX4]", "[CX4;H2,H1][SX2;R][CX4]", "[OX2][CX4,c]([C,c])([C,c])[CX4,c]", "[CX3H1]=[O]", "[$(*-[NX2-]-[NX2+]#[NX1]),$(*-[NX2]=[NX2+]=[NX1-])]", "[CX3;!R](=[OX1])[OX2;!R][CX3;!R](=[OX1])", "[CX3;!R](=[OX1])[NX3;!R][CX3;!R](=[OX1])"], "1031": ["[A+1]", "[A+2]", "[A+3]", "[A+4]", "[a+1]", "[a+2]", "[a+3]", "[a+4]", "[A-1]", "[A-2]", "[A-3]", "[A-4]", "[a-1]", "[a-2]", "[a-3]", "[a-4]", "[CX3]=[CX3]-[CX3]=[CX3]", "[CX2]#[CX2]", "[Li,Na,K,Rb,Cs]", "[Be,Mg,Ca,Sr,Ba]", "[Zn,Cd,Hg]", "[B,b]", "[c]", "[N,n]", "[O,o]", "[Al]", "[P,p]", "[S,s]", "[Ga]", "[Ge]", "[As,as]", "[Se,se]", "[In]", "[Sn,sn]", "[Sb,sb]", "[Te,te]", "[Tl]", "[Pb,pb]", "[Bi]", "[F,Cl,Br,I]", "[si]", "[Si][F,Cl]", "[Si;H4,H3,H2,H1]"], "1050": ["[A+1]", "[A+2]", "[A+3]", "[A+4]", "[a+1]", "[a+2]", "[a+3]", "[a+4]", "[A-1]", "[A-2]", "[A-3]", "[A-4]", "[a-1]", "[a-2]", "[a-3]", "[a-4]", "[CX3]=[CX3]-[CX3]=[CX3]", "[CX2]#[CX2]", "[Li,Na,K,Rb,Cs]", "[Be,Mg,Ca,Sr,Ba]", "[Zn,Cd,Hg]", "[B,b]", 
"[c]", "[N,n]", "[O,o]", "[Al]", "[P,p]", "[S,s]", "[Ga]", "[Ge]", "[As,as]", "[Se,se]", "[In]", "[Sn,sn]", "[Sb,sb]", "[Te,te]", "[Tl]", "[Pb,pb]", "[Bi]", "[Cl,Br,I]", "[si]", "[Si][F,Cl]", "[Si;H4,H3,H2,H1]"], "1052": ["[A+1]", "[A+2]", "[A+3]", "[A+4]", "[a+1]", "[a+2]", "[a+3]", "[a+4]", "[A-1]", "[A-2]", "[A-3]", "[A-4]", "[a-1]", "[a-2]", "[a-3]", "[a-4]", "[CX3]=[CX3]-[CX3]=[CX3]", "[CX2]#[CX2]", "[Li,Na,K,Rb,Cs]", "[Be,Mg,Ca,Sr,Ba]", "[Zn,Cd,Hg]", "[B,b]", "[c]", "[N,n]", "[O,o]", "[Al]", "[P,p]", "[S,s]", "[Ga]", "[Ge]", "[As,as]", "[Se,se]", "[In]", "[Sn,sn]", "[Sb,sb]", "[Te,te]", "[Tl]", "[Pb,pb]", "[Bi]", "[Cl,Br,I]", "[si]", "[Si][F,Cl]", "[Si;H4,H3,H2,H1]"]} -------------------------------------------------------------------------------- /utilities/rules/mon_dic.json: -------------------------------------------------------------------------------- 1 | {"vinyl": 1, "epo": 2, "diepo": 51, "cOle": 3, "lactone": 4, "lactam": 5, "hydCOOH": 6, "aminCOOH": 7, "hindPhenol": 8, "cAnhyd": 9, "CO": 10, "HCHO": 11, "sfonediX": 12, "BzodiF": 13, "diCOOH": 52, "diol": 53, "diamin": 54, "diNCO": 55, "dicAnhyd": 56, "pridiamin": 57, "diol_b": 58, "acryl": 1001, "bEWole": 1002, "styryl": 1003, "allyl": 1004, "haloCH": 1005, "vinylester": 1006, "malei": 1007, "conjdiene": 1020, "vinylether": 1030, "tertcatCH": 1031, "cycCH": 1050, "aliphCH": 1052} -------------------------------------------------------------------------------- /utilities/rules/mon_dic_inv.json: -------------------------------------------------------------------------------- 1 | {"1": "vinyl", "2": "epo", "51": "diepo", "3": "cOle", "4": "lactone", "5": "lactam", "6": "hydCOOH", "7": "aminCOOH", "8": "hindPhenol", "9": "cAnhyd", "10": "CO", "11": "HCHO", "12": "sfonediX", "13": "BzodiF", "52": "diCOOH", "53": "diol", "54": "diamin", "55": "diNCO", "56": "dicAnhyd", "57": "pridiamin", "58": "diol_b", "1001": "acryl", "1002": "bEWole", "1003": "styryl", "1004": "allyl", "1005": "haloCH", "1006": 
"vinylester", "1007": "malei", "1020": "conjdiene", "1030": "vinylether", "1031": "tertcatCH", "1050": "cycCH", "1052": "aliphCH"} -------------------------------------------------------------------------------- /utilities/rules/mon_lst.json: -------------------------------------------------------------------------------- 1 | {"0": [], "1": ["[CX3H2]=[CX3]", "[CX3](F)(F)=[CX3]", "[CX3;H1](F)=[CX3]", "[CX3](Cl)(Cl)=[CX3]", "[CX3;H1](Cl)=[CX3]", "[CX3](Cl)(F)=[CX3]"], "2": ["[CX4H2]1[O][CX4]1", "[c,CX4;R]-[CX4H1]1[O][CX4]1-[c,CX4;R]", "[CX4H1]1([F,Cl])[O][CX4]1", "[CX4]1([F,Cl])([F,Cl])[O][CX4]1", "[c,CX4;R]-[CX4]1([F,Cl])[O][CX4]1-[c,CX4;R]"], "51": ["[CX4H2]1[O][CX4]1", "[c,CX4;R]-[CX4H1]1[O][CX4]1-[c,CX4;R]", "[CX4H1]1([F,Cl])[O][CX4]1", "[CX4]1([F,Cl])([F,Cl])[O][CX4]1", "[c,CX4;R]-[CX4]1([F,Cl])[O][CX4]1-[c,CX4;R]"], "3": ["[CX3;H1;R]=[CX3;H1;R]", "[CX3;H1;R]=[CX3;H0;R]"], "4": ["[C;R][OX2;R][CX3;R](=[OX1])[C;R]", "[c][OX2;R][CX3;R](=[OX1])[C;R]", "[OX2;R][CX3;R](=[OX1])[C;R][c]"], "5": ["[C;R][NX3;H1;R][CX3;R](=[OX1])[C;R][C;R]", "[c][NX3;H1;R][CX3;R](=[OX1])[C;R]", "[NX3;H1;R][CX3;R](=[OX1])[C;R][c]"], "6": ["[O&X2;H1;!$(OC=*)][C].[CX3](=[O])[OX2H1]", "[O&X2;H1;!$(OC=*)][c].[CX3](=[O])[OX2H1]"], "7": ["[N&X3;H2,H1;!$(N[C,S]=*)][C].[CX3](=[O])[OX2H1]", "[N&X3;H2,H1;!$(N[C,S]=*)][c].[CX3](=[O])[OX2H1]"], "8": ["[c]1([OX2H1])[c]([C])[c][cX3H1][c][c]1([C])"], "9": ["[C;R][C;R;X3](=[O])[O;R][C;R;X3](=[O])[C;R]", "[c][CX3,c;R](=[O])[O,o;R][CX3,c;R](=[O])[c]"], "10": ["[C-]#[O+]"], "11": ["[CX3;H2]=[OX1]"], "52": ["[CX4H2][C](=[O])[OH1]", "[CX4H1][C](=[O])[OH1]", "[c][C](=O)[OH1]", "[CX4H2][C](=[O])[Cl,Br]", "[CX4H1][C](=[O])[Cl,Br]", "[c][C](=O)[Cl,Br]"], "53": ["[CX4H1][OX2,SX2;H1]", "[CX4H2][OX2,SX2;H1]", "[c][OX2,SX2;H1]", "[CX4;H2,H1,c]([OX2,SX2;H1])[OX2,SX2;H1]"], "54": ["[C][N&X3;H2;!$(N[C,S]=*)]", "[c][N&X3;H2;!$(N[C,S]=*)]", "[C,c][N&X3;H1;!$(N[C,S]=*)][C,c]", "[N&X3;H2;!$(N[C,S]=*)][C][N&X3;H2;!$(N[C,S]=*)]"], "55": ["[C]-[NX2]=[CX2]=[O,S;X1]", 
"[c]-[NX2]=[CX2]=[O,S;X1]"], "56": ["[C;R][C;R;X3](=[OX1])[OX2;R][C;R;X3](=[OX1])[C;R]", "[c][CX3,c;R](=[OX1])[OX2,o;R][CX3,c;R](=[OX1])[c]"], "57": ["[C][N&X3;H2;!$(N[C,S]=*)]", "[c][N&X3;H2;!$(N[C,S]=*)]"], "12": ["[c]1[c][c]([F,Cl,Br,I])[c][c][c]1[SX4](=[OX1])(=[OX1])[c]2[c][c][c]([F,Cl,Br,I])[c][c]2"], "13": ["[c]1[c][c]([F])[c][c][c]1[CX3](=[OX1])[c]2[c][c][c]([F])[c][c]2"], "58": ["[CX4H1][OX2;H1]", "[CX4H2][OX2;H1]", "[c][OX2;H1]"], "1001": ["[CX3H2;!R:1]=[CX3H1;!R:2][CX3;!R](=[OX1])[OX2,SX2,NX3:3]", "[CX3H2;!R:1]=[CX3;!R:2]([CX4;H3])[CX3;!R](=[OX1])[OX2,SX2,NX3:3]", "[CX3H2;!R:1]=[CX3;!R:2]([F,Cl,Br:4])[CX3;!R](=[OX1])[OX2,SX2,NX3:3]", "[CX3H2;!R:1]=[CX3;!R:2]([CX4;H2][F,Cl:4])[CX3;!R](=[OX1])[OX2,SX2,NX3:3]", "[CX3H2;!R:1]=[CX3;!R:2]([CX4;H1]([F,Cl:4])[F,Cl:5])[CX3;!R](=[OX1])[OX2,SX2,NX3:3]", "[CX3H2;!R:1]=[CX3;!R:2]([CX4;H0]([F])([F])[F])[CX3;!R](=[OX1])[OX2,SX2,NX3:3]", "[CX3H2;!R:1]=[CX3;!R:2]([CX4;H2][c:4])[CX3;!R](=[OX1])[OX2,SX2,NX3:3]", "[CX3H2;!R:1]=[CX3;!R:2]([CX4;H1]([CX4;H3])[CX4;H3])[CX3;!R](=[OX1])[OX2,SX2,NX3:3]", "[CX3H2;!R:1]=[CX3;!R:2]([CX4;H1]([CX4;H3])[CX4;H2:4][CX3;!R](=[OX1])[OX2,SX2,NX3:4])[CX3;!R](=[OX1])[OX2,SX2,NX3:3]", "[CX3H2;!R:1]=[CX3;!R:2]([CX2]#[NX1])[CX3;!R](=[OX1])[OX2,SX2,NX3:3]", "[CX3H2;!R:1]=[CX3;!R:2]([NX2]=[CX2]=[OX1])[CX3;!R](=[OX1])[OX2,SX2,NX3:3]", "[CX3H2;!R:1]=[CX3;!R:2]([SX4](=[OX1])(=[OX1])[OX2:4])[CX3;!R](=[OX1])[OX2,SX2,NX3:3]", "[CX3H2;!R:1]=[CX3;!R:2]([PX4](=[OX1])(=[OX1])[OX2:4])[CX3;!R](=[OX1])[OX2,SX2,NX3:3]", "[CX3;H0;!R:1]([F])([F])=[CX3H1;!R:2][CX3;!R](=[OX1])[OX2,SX2,NX3:3]", "[CX3;H0;!R:1]([F])([F])=[CX3;!R:2]([CX4;H3])[CX3;!R](=[OX1])[OX2,SX2,NX3:3]", "[CX3;H0;!R:1]([F])([F])=[CX3;!R:2]([F,Cl,Br:4])[CX3;!R](=[OX1])[OX2,SX2,NX3:3]", "[CX3;H0;!R:1]([F])([F])=[CX3;!R:2]([CX4;H2][F,Cl:4])[CX3;!R](=[OX1])[OX2,SX2,NX3:3]", "[CX3;H0;!R:1]([F])([F])=[CX3;!R:2]([CX4;H1]([F,Cl:4])[F,Cl:5])[CX3;!R](=[OX1])[OX2,SX2,NX3:3]", 
"[CX3;H0;!R:1]([F])([F])=[CX3;!R:2]([CX4;H0]([F])([F])[F])[CX3;!R](=[OX1])[OX2,SX2,NX3:3]", "[CX3;H0;!R:1]([F])([F])=[CX3;!R:2]([CX4;H2][c:4])[CX3;!R](=[OX1])[OX2,SX2,NX3:3]", "[CX3;H0;!R:1]([F])([F])=[CX3;!R:2]([CX4;H1]([CX4;H3])[CX4;H3])[CX3;!R](=[OX1])[OX2,SX2,NX3:3]", "[CX3;H0;!R:1]([F])([F])=[CX3;!R:2]([CX4;H1]([CX4;H3])[CX4;H2:4][CX3;!R](=[OX1])[OX2,SX2,NX3:4])[CX3;!R](=[OX1])[OX2,SX2,NX3:3]", "[CX3;H0;!R:1]([F])([F])=[CX3;!R:2]([CX2]#[NX1])[CX3;!R](=[OX1])[OX2,SX2,NX3:3]", "[CX3;H0;!R:1]([F])([F])=[CX3;!R:2]([NX2]=[CX2]=[OX1])[CX3;!R](=[OX1])[OX2,SX2,NX3:3]", "[CX3;H0;!R:1]([F])([F])=[CX3;!R:2]([SX4](=[OX1])(=[OX1])[OX2:4])[CX3;!R](=[OX1])[OX2,SX2,NX3:3]", "[CX3;H0;!R:1]([F])([F])=[CX3;!R:2]([PX4](=[OX1])(=[OX1])[OX2:4])[CX3;!R](=[OX1])[OX2,SX2,NX3:3]", "[CX3;H1;!R:1]([F])=[CX3H1;!R:2][CX3;!R](=[OX1])[OX2,SX2,NX3:3]", "[CX3;H1;!R:1]([F])=[CX3;!R:2]([CX4;H3])[CX3;!R](=[OX1])[OX2,SX2,NX3:3]", "[CX3;H1;!R:1]([F])=[CX3;!R:2]([F,Cl,Br:4])[CX3;!R](=[OX1])[OX2,SX2,NX3:3]", "[CX3;H1;!R:1]([F])=[CX3;!R:2]([CX4;H2][F,Cl:4])[CX3;!R](=[OX1])[OX2,SX2,NX3:3]", "[CX3;H1;!R:1]([F])=[CX3;!R:2]([CX4;H1]([F,Cl:4])[F,Cl:5])[CX3;!R](=[OX1])[OX2,SX2,NX3:3]", "[CX3;H1;!R:1]([F])=[CX3;!R:2]([CX4;H0]([F])([F])[F])[CX3;!R](=[OX1])[OX2,SX2,NX3:3]", "[CX3;H1;!R:1]([F])=[CX3;!R:2]([CX4;H2][c:4])[CX3;!R](=[OX1])[OX2,SX2,NX3:3]", "[CX3;H1;!R:1]([F])=[CX3;!R:2]([CX4;H1]([CX4;H3])[CX4;H3])[CX3;!R](=[OX1])[OX2,SX2,NX3:3]", "[CX3;H1;!R:1]([F])=[CX3;!R:2]([CX4;H1]([CX4;H3])[CX4;H2:4][CX3;!R](=[OX1])[OX2,SX2,NX3:4])[CX3;!R](=[OX1])[OX2,SX2,NX3:3]", "[CX3;H1;!R:1]([F])=[CX3;!R:2]([CX2]#[NX1])[CX3;!R](=[OX1])[OX2,SX2,NX3:3]", "[CX3;H1;!R:1]([F])=[CX3;!R:2]([NX2]=[CX2]=[OX1])[CX3;!R](=[OX1])[OX2,SX2,NX3:3]", "[CX3;H1;!R:1]([F])=[CX3;!R:2]([SX4](=[OX1])(=[OX1])[OX2:4])[CX3;!R](=[OX1])[OX2,SX2,NX3:3]", "[CX3;H1;!R:1]([F])=[CX3;!R:2]([PX4](=[OX1])(=[OX1])[OX2:4])[CX3;!R](=[OX1])[OX2,SX2,NX3:3]"], "1002": ["[CX3H2;!R:1]=[CX3H1;!R:2][CX2]#[NX1]", "[CX3H2;!R:1]=[CX3H1;!R:2][NX2]=[CX2]=[OX1]", 
"[CX3H2;!R:1]=[CX3H1;!R:2][SX4](=[OX1])(=[OX1])[OX2:3]", "[CX3H2;!R:1]=[CX3H1;!R:2][PX4](=[OX1])(=[OX1])[OX2:3]", "[CX3H2;!R:1]=[CX3;!R:2]([CX4;H3])[CX2]#[NX1]", "[CX3H2;!R:1]=[CX3;!R:2]([CX4;H3])[NX2]=[CX2]=[OX1]", "[CX3H2;!R:1]=[CX3;!R:2]([CX4;H3])[SX4](=[OX1])(=[OX1])[OX2:3]", "[CX3H2;!R:1]=[CX3;!R:2]([CX4;H3])[PX4](=[OX1])(=[OX1])[OX2:3]", "[CX3H2;!R:1]=[CX3;!R:2]([F,Cl,Br:4])[CX2]#[NX1]", "[CX3H2;!R:1]=[CX3;!R:2]([F,Cl,Br:4])[NX2]=[CX2]=[OX1]", "[CX3H2;!R:1]=[CX3;!R:2]([F,Cl,Br:4])[SX4](=[OX1])(=[OX1])[OX2:3]", "[CX3H2;!R:1]=[CX3;!R:2]([F,Cl,Br:4])[PX4](=[OX1])(=[OX1])[OX2:3]", "[CX3H2;!R:1]=[CX3;!R:2]([CX4;H2][F,Cl:4])[CX2]#[NX1]", "[CX3H2;!R:1]=[CX3;!R:2]([CX4;H2][F,Cl:4])[NX2]=[CX2]=[OX1]", "[CX3H2;!R:1]=[CX3;!R:2]([CX4;H2][F,Cl:4])[SX4](=[OX1])(=[OX1])[OX2:3]", "[CX3H2;!R:1]=[CX3;!R:2]([CX4;H2][F,Cl:4])[PX4](=[OX1])(=[OX1])[OX2:3]", "[CX3H2;!R:1]=[CX3;!R:2]([CX4;H1]([F,Cl:4])[F,Cl:5])[CX2]#[NX1]", "[CX3H2;!R:1]=[CX3;!R:2]([CX4;H1]([F,Cl:4])[F,Cl:5])[NX2]=[CX2]=[OX1]", "[CX3H2;!R:1]=[CX3;!R:2]([CX4;H1]([F,Cl:4])[F,Cl:5])[SX4](=[OX1])(=[OX1])[OX2:3]", "[CX3H2;!R:1]=[CX3;!R:2]([CX4;H1]([F,Cl:4])[F,Cl:5])[PX4](=[OX1])(=[OX1])[OX2:3]", "[CX3H2;!R:1]=[CX3;!R:2]([CX4;H0]([F])([F])[F])[CX2]#[NX1]", "[CX3H2;!R:1]=[CX3;!R:2]([CX4;H0]([F])([F])[F])[NX2]=[CX2]=[OX1]", "[CX3H2;!R:1]=[CX3;!R:2]([CX4;H0]([F])([F])[F])[SX4](=[OX1])(=[OX1])[OX2:3]", "[CX3H2;!R:1]=[CX3;!R:2]([CX4;H0]([F])([F])[F])[PX4](=[OX1])(=[OX1])[OX2:3]", "[CX3H2;!R:1]=[CX3;!R:2]([CX4;H2][c:4])[CX2]#[NX1]", "[CX3H2;!R:1]=[CX3;!R:2]([CX4;H2][c:4])[NX2]=[CX2]=[OX1]", "[CX3H2;!R:1]=[CX3;!R:2]([CX4;H2][c:4])[SX4](=[OX1])(=[OX1])[OX2:3]", "[CX3H2;!R:1]=[CX3;!R:2]([CX4;H2][c:4])[PX4](=[OX1])(=[OX1])[OX2:3]", "[CX3H2;!R:1]=[CX3;!R:2]([CX4;H1]([CX4;H3])[CX4;H3])[CX2]#[NX1]", "[CX3H2;!R:1]=[CX3;!R:2]([CX4;H1]([CX4;H3])[CX4;H3])[NX2]=[CX2]=[OX1]", "[CX3H2;!R:1]=[CX3;!R:2]([CX4;H1]([CX4;H3])[CX4;H3])[SX4](=[OX1])(=[OX1])[OX2:3]", "[CX3H2;!R:1]=[CX3;!R:2]([CX4;H1]([CX4;H3])[CX4;H3])[PX4](=[OX1])(=[OX1])[OX2:3]", 
"[CX3H2;!R:1]=[CX3;!R:2]([CX4;H1]([CX4;H3])[CX4;H2:4][CX3;!R](=[OX1])[OX2,SX2,NX3:4])[CX2]#[NX1]", "[CX3H2;!R:1]=[CX3;!R:2]([CX4;H1]([CX4;H3])[CX4;H2:4][CX3;!R](=[OX1])[OX2,SX2,NX3:4])[NX2]=[CX2]=[OX1]", "[CX3H2;!R:1]=[CX3;!R:2]([CX4;H1]([CX4;H3])[CX4;H2:4][CX3;!R](=[OX1])[OX2,SX2,NX3:4])[SX4](=[OX1])(=[OX1])[OX2:3]", "[CX3H2;!R:1]=[CX3;!R:2]([CX4;H1]([CX4;H3])[CX4;H2:4][CX3;!R](=[OX1])[OX2,SX2,NX3:4])[PX4](=[OX1])(=[OX1])[OX2:3]", "[CX3H2;!R:1]=[CX3;!R:2]([CX2]#[NX1])[CX2]#[NX1]", "[CX3H2;!R:1]=[CX3;!R:2]([CX2]#[NX1])[NX2]=[CX2]=[OX1]", "[CX3H2;!R:1]=[CX3;!R:2]([CX2]#[NX1])[SX4](=[OX1])(=[OX1])[OX2:3]", "[CX3H2;!R:1]=[CX3;!R:2]([CX2]#[NX1])[PX4](=[OX1])(=[OX1])[OX2:3]", "[CX3H2;!R:1]=[CX3;!R:2]([NX2]=[CX2]=[OX1])[CX2]#[NX1]", "[CX3H2;!R:1]=[CX3;!R:2]([NX2]=[CX2]=[OX1])[NX2]=[CX2]=[OX1]", "[CX3H2;!R:1]=[CX3;!R:2]([NX2]=[CX2]=[OX1])[SX4](=[OX1])(=[OX1])[OX2:3]", "[CX3H2;!R:1]=[CX3;!R:2]([NX2]=[CX2]=[OX1])[PX4](=[OX1])(=[OX1])[OX2:3]", "[CX3H2;!R:1]=[CX3;!R:2]([SX4](=[OX1])(=[OX1])[OX2:4])[CX2]#[NX1]", "[CX3H2;!R:1]=[CX3;!R:2]([SX4](=[OX1])(=[OX1])[OX2:4])[NX2]=[CX2]=[OX1]", "[CX3H2;!R:1]=[CX3;!R:2]([SX4](=[OX1])(=[OX1])[OX2:4])[SX4](=[OX1])(=[OX1])[OX2:3]", "[CX3H2;!R:1]=[CX3;!R:2]([SX4](=[OX1])(=[OX1])[OX2:4])[PX4](=[OX1])(=[OX1])[OX2:3]", "[CX3H2;!R:1]=[CX3;!R:2]([PX4](=[OX1])(=[OX1])[OX2:4])[CX2]#[NX1]", "[CX3H2;!R:1]=[CX3;!R:2]([PX4](=[OX1])(=[OX1])[OX2:4])[NX2]=[CX2]=[OX1]", "[CX3H2;!R:1]=[CX3;!R:2]([PX4](=[OX1])(=[OX1])[OX2:4])[SX4](=[OX1])(=[OX1])[OX2:3]", "[CX3H2;!R:1]=[CX3;!R:2]([PX4](=[OX1])(=[OX1])[OX2:4])[PX4](=[OX1])(=[OX1])[OX2:3]", "[CX3;H0;!R:1]([F])([F])=[CX3H1;!R:2][CX2]#[NX1]", "[CX3;H0;!R:1]([F])([F])=[CX3H1;!R:2][NX2]=[CX2]=[OX1]", "[CX3;H0;!R:1]([F])([F])=[CX3H1;!R:2][SX4](=[OX1])(=[OX1])[OX2:3]", "[CX3;H0;!R:1]([F])([F])=[CX3H1;!R:2][PX4](=[OX1])(=[OX1])[OX2:3]", "[CX3;H0;!R:1]([F])([F])=[CX3;!R:2]([CX4;H3])[CX2]#[NX1]", "[CX3;H0;!R:1]([F])([F])=[CX3;!R:2]([CX4;H3])[NX2]=[CX2]=[OX1]", 
"[CX3;H0;!R:1]([F])([F])=[CX3;!R:2]([CX4;H3])[SX4](=[OX1])(=[OX1])[OX2:3]", "[CX3;H0;!R:1]([F])([F])=[CX3;!R:2]([CX4;H3])[PX4](=[OX1])(=[OX1])[OX2:3]", "[CX3;H0;!R:1]([F])([F])=[CX3;!R:2]([F,Cl,Br:4])[CX2]#[NX1]", "[CX3;H0;!R:1]([F])([F])=[CX3;!R:2]([F,Cl,Br:4])[NX2]=[CX2]=[OX1]", "[CX3;H0;!R:1]([F])([F])=[CX3;!R:2]([F,Cl,Br:4])[SX4](=[OX1])(=[OX1])[OX2:3]", "[CX3;H0;!R:1]([F])([F])=[CX3;!R:2]([F,Cl,Br:4])[PX4](=[OX1])(=[OX1])[OX2:3]", "[CX3;H0;!R:1]([F])([F])=[CX3;!R:2]([CX4;H2][F,Cl:4])[CX2]#[NX1]", "[CX3;H0;!R:1]([F])([F])=[CX3;!R:2]([CX4;H2][F,Cl:4])[NX2]=[CX2]=[OX1]", "[CX3;H0;!R:1]([F])([F])=[CX3;!R:2]([CX4;H2][F,Cl:4])[SX4](=[OX1])(=[OX1])[OX2:3]", "[CX3;H0;!R:1]([F])([F])=[CX3;!R:2]([CX4;H2][F,Cl:4])[PX4](=[OX1])(=[OX1])[OX2:3]", "[CX3;H0;!R:1]([F])([F])=[CX3;!R:2]([CX4;H1]([F,Cl:4])[F,Cl:5])[CX2]#[NX1]", "[CX3;H0;!R:1]([F])([F])=[CX3;!R:2]([CX4;H1]([F,Cl:4])[F,Cl:5])[NX2]=[CX2]=[OX1]", "[CX3;H0;!R:1]([F])([F])=[CX3;!R:2]([CX4;H1]([F,Cl:4])[F,Cl:5])[SX4](=[OX1])(=[OX1])[OX2:3]", "[CX3;H0;!R:1]([F])([F])=[CX3;!R:2]([CX4;H1]([F,Cl:4])[F,Cl:5])[PX4](=[OX1])(=[OX1])[OX2:3]", "[CX3;H0;!R:1]([F])([F])=[CX3;!R:2]([CX4;H0]([F])([F])[F])[CX2]#[NX1]", "[CX3;H0;!R:1]([F])([F])=[CX3;!R:2]([CX4;H0]([F])([F])[F])[NX2]=[CX2]=[OX1]", "[CX3;H0;!R:1]([F])([F])=[CX3;!R:2]([CX4;H0]([F])([F])[F])[SX4](=[OX1])(=[OX1])[OX2:3]", "[CX3;H0;!R:1]([F])([F])=[CX3;!R:2]([CX4;H0]([F])([F])[F])[PX4](=[OX1])(=[OX1])[OX2:3]", "[CX3;H0;!R:1]([F])([F])=[CX3;!R:2]([CX4;H2][c:4])[CX2]#[NX1]", "[CX3;H0;!R:1]([F])([F])=[CX3;!R:2]([CX4;H2][c:4])[NX2]=[CX2]=[OX1]", "[CX3;H0;!R:1]([F])([F])=[CX3;!R:2]([CX4;H2][c:4])[SX4](=[OX1])(=[OX1])[OX2:3]", "[CX3;H0;!R:1]([F])([F])=[CX3;!R:2]([CX4;H2][c:4])[PX4](=[OX1])(=[OX1])[OX2:3]", "[CX3;H0;!R:1]([F])([F])=[CX3;!R:2]([CX4;H1]([CX4;H3])[CX4;H3])[CX2]#[NX1]", "[CX3;H0;!R:1]([F])([F])=[CX3;!R:2]([CX4;H1]([CX4;H3])[CX4;H3])[NX2]=[CX2]=[OX1]", "[CX3;H0;!R:1]([F])([F])=[CX3;!R:2]([CX4;H1]([CX4;H3])[CX4;H3])[SX4](=[OX1])(=[OX1])[OX2:3]", 
"[CX3;H0;!R:1]([F])([F])=[CX3;!R:2]([CX4;H1]([CX4;H3])[CX4;H3])[PX4](=[OX1])(=[OX1])[OX2:3]", "[CX3;H0;!R:1]([F])([F])=[CX3;!R:2]([CX4;H1]([CX4;H3])[CX4;H2:4][CX3;!R](=[OX1])[OX2,SX2,NX3:4])[CX2]#[NX1]", "[CX3;H0;!R:1]([F])([F])=[CX3;!R:2]([CX4;H1]([CX4;H3])[CX4;H2:4][CX3;!R](=[OX1])[OX2,SX2,NX3:4])[NX2]=[CX2]=[OX1]", "[CX3;H0;!R:1]([F])([F])=[CX3;!R:2]([CX4;H1]([CX4;H3])[CX4;H2:4][CX3;!R](=[OX1])[OX2,SX2,NX3:4])[SX4](=[OX1])(=[OX1])[OX2:3]", "[CX3;H0;!R:1]([F])([F])=[CX3;!R:2]([CX4;H1]([CX4;H3])[CX4;H2:4][CX3;!R](=[OX1])[OX2,SX2,NX3:4])[PX4](=[OX1])(=[OX1])[OX2:3]", "[CX3;H0;!R:1]([F])([F])=[CX3;!R:2]([CX2]#[NX1])[CX2]#[NX1]", "[CX3;H0;!R:1]([F])([F])=[CX3;!R:2]([CX2]#[NX1])[NX2]=[CX2]=[OX1]", "[CX3;H0;!R:1]([F])([F])=[CX3;!R:2]([CX2]#[NX1])[SX4](=[OX1])(=[OX1])[OX2:3]", "[CX3;H0;!R:1]([F])([F])=[CX3;!R:2]([CX2]#[NX1])[PX4](=[OX1])(=[OX1])[OX2:3]", "[CX3;H0;!R:1]([F])([F])=[CX3;!R:2]([NX2]=[CX2]=[OX1])[CX2]#[NX1]", "[CX3;H0;!R:1]([F])([F])=[CX3;!R:2]([NX2]=[CX2]=[OX1])[NX2]=[CX2]=[OX1]", "[CX3;H0;!R:1]([F])([F])=[CX3;!R:2]([NX2]=[CX2]=[OX1])[SX4](=[OX1])(=[OX1])[OX2:3]", "[CX3;H0;!R:1]([F])([F])=[CX3;!R:2]([NX2]=[CX2]=[OX1])[PX4](=[OX1])(=[OX1])[OX2:3]", "[CX3;H0;!R:1]([F])([F])=[CX3;!R:2]([SX4](=[OX1])(=[OX1])[OX2:4])[CX2]#[NX1]", "[CX3;H0;!R:1]([F])([F])=[CX3;!R:2]([SX4](=[OX1])(=[OX1])[OX2:4])[NX2]=[CX2]=[OX1]", "[CX3;H0;!R:1]([F])([F])=[CX3;!R:2]([SX4](=[OX1])(=[OX1])[OX2:4])[SX4](=[OX1])(=[OX1])[OX2:3]", "[CX3;H0;!R:1]([F])([F])=[CX3;!R:2]([SX4](=[OX1])(=[OX1])[OX2:4])[PX4](=[OX1])(=[OX1])[OX2:3]", "[CX3;H0;!R:1]([F])([F])=[CX3;!R:2]([PX4](=[OX1])(=[OX1])[OX2:4])[CX2]#[NX1]", "[CX3;H0;!R:1]([F])([F])=[CX3;!R:2]([PX4](=[OX1])(=[OX1])[OX2:4])[NX2]=[CX2]=[OX1]", "[CX3;H0;!R:1]([F])([F])=[CX3;!R:2]([PX4](=[OX1])(=[OX1])[OX2:4])[SX4](=[OX1])(=[OX1])[OX2:3]", "[CX3;H0;!R:1]([F])([F])=[CX3;!R:2]([PX4](=[OX1])(=[OX1])[OX2:4])[PX4](=[OX1])(=[OX1])[OX2:3]", "[CX3;H1;!R:1]([F])=[CX3H1;!R:2][CX2]#[NX1]", "[CX3;H1;!R:1]([F])=[CX3H1;!R:2][NX2]=[CX2]=[OX1]", 
"[CX3;H1;!R:1]([F])=[CX3H1;!R:2][SX4](=[OX1])(=[OX1])[OX2:3]", "[CX3;H1;!R:1]([F])=[CX3H1;!R:2][PX4](=[OX1])(=[OX1])[OX2:3]", "[CX3;H1;!R:1]([F])=[CX3;!R:2]([CX4;H3])[CX2]#[NX1]", "[CX3;H1;!R:1]([F])=[CX3;!R:2]([CX4;H3])[NX2]=[CX2]=[OX1]", "[CX3;H1;!R:1]([F])=[CX3;!R:2]([CX4;H3])[SX4](=[OX1])(=[OX1])[OX2:3]", "[CX3;H1;!R:1]([F])=[CX3;!R:2]([CX4;H3])[PX4](=[OX1])(=[OX1])[OX2:3]", "[CX3;H1;!R:1]([F])=[CX3;!R:2]([F,Cl,Br:4])[CX2]#[NX1]", "[CX3;H1;!R:1]([F])=[CX3;!R:2]([F,Cl,Br:4])[NX2]=[CX2]=[OX1]", "[CX3;H1;!R:1]([F])=[CX3;!R:2]([F,Cl,Br:4])[SX4](=[OX1])(=[OX1])[OX2:3]", "[CX3;H1;!R:1]([F])=[CX3;!R:2]([F,Cl,Br:4])[PX4](=[OX1])(=[OX1])[OX2:3]", "[CX3;H1;!R:1]([F])=[CX3;!R:2]([CX4;H2][F,Cl:4])[CX2]#[NX1]", "[CX3;H1;!R:1]([F])=[CX3;!R:2]([CX4;H2][F,Cl:4])[NX2]=[CX2]=[OX1]", "[CX3;H1;!R:1]([F])=[CX3;!R:2]([CX4;H2][F,Cl:4])[SX4](=[OX1])(=[OX1])[OX2:3]", "[CX3;H1;!R:1]([F])=[CX3;!R:2]([CX4;H2][F,Cl:4])[PX4](=[OX1])(=[OX1])[OX2:3]", "[CX3;H1;!R:1]([F])=[CX3;!R:2]([CX4;H1]([F,Cl:4])[F,Cl:5])[CX2]#[NX1]", "[CX3;H1;!R:1]([F])=[CX3;!R:2]([CX4;H1]([F,Cl:4])[F,Cl:5])[NX2]=[CX2]=[OX1]", "[CX3;H1;!R:1]([F])=[CX3;!R:2]([CX4;H1]([F,Cl:4])[F,Cl:5])[SX4](=[OX1])(=[OX1])[OX2:3]", "[CX3;H1;!R:1]([F])=[CX3;!R:2]([CX4;H1]([F,Cl:4])[F,Cl:5])[PX4](=[OX1])(=[OX1])[OX2:3]", "[CX3;H1;!R:1]([F])=[CX3;!R:2]([CX4;H0]([F])([F])[F])[CX2]#[NX1]", "[CX3;H1;!R:1]([F])=[CX3;!R:2]([CX4;H0]([F])([F])[F])[NX2]=[CX2]=[OX1]", "[CX3;H1;!R:1]([F])=[CX3;!R:2]([CX4;H0]([F])([F])[F])[SX4](=[OX1])(=[OX1])[OX2:3]", "[CX3;H1;!R:1]([F])=[CX3;!R:2]([CX4;H0]([F])([F])[F])[PX4](=[OX1])(=[OX1])[OX2:3]", "[CX3;H1;!R:1]([F])=[CX3;!R:2]([CX4;H2][c:4])[CX2]#[NX1]", "[CX3;H1;!R:1]([F])=[CX3;!R:2]([CX4;H2][c:4])[NX2]=[CX2]=[OX1]", "[CX3;H1;!R:1]([F])=[CX3;!R:2]([CX4;H2][c:4])[SX4](=[OX1])(=[OX1])[OX2:3]", "[CX3;H1;!R:1]([F])=[CX3;!R:2]([CX4;H2][c:4])[PX4](=[OX1])(=[OX1])[OX2:3]", "[CX3;H1;!R:1]([F])=[CX3;!R:2]([CX4;H1]([CX4;H3])[CX4;H3])[CX2]#[NX1]", "[CX3;H1;!R:1]([F])=[CX3;!R:2]([CX4;H1]([CX4;H3])[CX4;H3])[NX2]=[CX2]=[OX1]", 
"[CX3;H1;!R:1]([F])=[CX3;!R:2]([CX4;H1]([CX4;H3])[CX4;H3])[SX4](=[OX1])(=[OX1])[OX2:3]", "[CX3;H1;!R:1]([F])=[CX3;!R:2]([CX4;H1]([CX4;H3])[CX4;H3])[PX4](=[OX1])(=[OX1])[OX2:3]", "[CX3;H1;!R:1]([F])=[CX3;!R:2]([CX4;H1]([CX4;H3])[CX4;H2:4][CX3;!R](=[OX1])[OX2,SX2,NX3:4])[CX2]#[NX1]", "[CX3;H1;!R:1]([F])=[CX3;!R:2]([CX4;H1]([CX4;H3])[CX4;H2:4][CX3;!R](=[OX1])[OX2,SX2,NX3:4])[NX2]=[CX2]=[OX1]", "[CX3;H1;!R:1]([F])=[CX3;!R:2]([CX4;H1]([CX4;H3])[CX4;H2:4][CX3;!R](=[OX1])[OX2,SX2,NX3:4])[SX4](=[OX1])(=[OX1])[OX2:3]", "[CX3;H1;!R:1]([F])=[CX3;!R:2]([CX4;H1]([CX4;H3])[CX4;H2:4][CX3;!R](=[OX1])[OX2,SX2,NX3:4])[PX4](=[OX1])(=[OX1])[OX2:3]", "[CX3;H1;!R:1]([F])=[CX3;!R:2]([CX2]#[NX1])[CX2]#[NX1]", "[CX3;H1;!R:1]([F])=[CX3;!R:2]([CX2]#[NX1])[NX2]=[CX2]=[OX1]", "[CX3;H1;!R:1]([F])=[CX3;!R:2]([CX2]#[NX1])[SX4](=[OX1])(=[OX1])[OX2:3]", "[CX3;H1;!R:1]([F])=[CX3;!R:2]([CX2]#[NX1])[PX4](=[OX1])(=[OX1])[OX2:3]", "[CX3;H1;!R:1]([F])=[CX3;!R:2]([NX2]=[CX2]=[OX1])[CX2]#[NX1]", "[CX3;H1;!R:1]([F])=[CX3;!R:2]([NX2]=[CX2]=[OX1])[NX2]=[CX2]=[OX1]", "[CX3;H1;!R:1]([F])=[CX3;!R:2]([NX2]=[CX2]=[OX1])[SX4](=[OX1])(=[OX1])[OX2:3]", "[CX3;H1;!R:1]([F])=[CX3;!R:2]([NX2]=[CX2]=[OX1])[PX4](=[OX1])(=[OX1])[OX2:3]", "[CX3;H1;!R:1]([F])=[CX3;!R:2]([SX4](=[OX1])(=[OX1])[OX2:4])[CX2]#[NX1]", "[CX3;H1;!R:1]([F])=[CX3;!R:2]([SX4](=[OX1])(=[OX1])[OX2:4])[NX2]=[CX2]=[OX1]", "[CX3;H1;!R:1]([F])=[CX3;!R:2]([SX4](=[OX1])(=[OX1])[OX2:4])[SX4](=[OX1])(=[OX1])[OX2:3]", "[CX3;H1;!R:1]([F])=[CX3;!R:2]([SX4](=[OX1])(=[OX1])[OX2:4])[PX4](=[OX1])(=[OX1])[OX2:3]", "[CX3;H1;!R:1]([F])=[CX3;!R:2]([PX4](=[OX1])(=[OX1])[OX2:4])[CX2]#[NX1]", "[CX3;H1;!R:1]([F])=[CX3;!R:2]([PX4](=[OX1])(=[OX1])[OX2:4])[NX2]=[CX2]=[OX1]", "[CX3;H1;!R:1]([F])=[CX3;!R:2]([PX4](=[OX1])(=[OX1])[OX2:4])[SX4](=[OX1])(=[OX1])[OX2:3]", "[CX3;H1;!R:1]([F])=[CX3;!R:2]([PX4](=[OX1])(=[OX1])[OX2:4])[PX4](=[OX1])(=[OX1])[OX2:3]"], "1003": ["[CX3H2;!R:1]=[CX3H1;!R:2][c:3]", "[CX3H2;!R:1]=[CX3;!R:2]([CX4;H3])[c:3]", "[CX3H2;!R:1]=[CX3;!R:2]([F])[c:3]", 
"[CX3H2;!R:1]=[CX3;!R:2]([CX4;H2][F])[c:3]", "[CX3H2;!R:1]=[CX3;!R:2]([CX4;H1]([F])[F])[c:3]", "[CX3H2;!R:1]=[CX3;!R:2]([CX4;H0]([F])([F])[F])[c:3]", "[CX3H2;!R:1]=[CX3;!R:2]([OX2][CX4,SiX4:4])[c:3]", "[CX3;H0;!R:1]([F])([F])=[CX3H1;!R:2][c:3]", "[CX3;H0;!R:1]([F])([F])=[CX3;!R:2]([CX4;H3])[c:3]", "[CX3;H0;!R:1]([F])([F])=[CX3;!R:2]([F])[c:3]", "[CX3;H0;!R:1]([F])([F])=[CX3;!R:2]([CX4;H2][F])[c:3]", "[CX3;H0;!R:1]([F])([F])=[CX3;!R:2]([CX4;H1]([F])[F])[c:3]", "[CX3;H0;!R:1]([F])([F])=[CX3;!R:2]([CX4;H0]([F])([F])[F])[c:3]", "[CX3;H0;!R:1]([F])([F])=[CX3;!R:2]([OX2][CX4,SiX4:4])[c:3]", "[CX3;H1;!R:1]([F])=[CX3H1;!R:2][c:3]", "[CX3;H1;!R:1]([F])=[CX3;!R:2]([CX4;H3])[c:3]", "[CX3;H1;!R:1]([F])=[CX3;!R:2]([F])[c:3]", "[CX3;H1;!R:1]([F])=[CX3;!R:2]([CX4;H2][F])[c:3]", "[CX3;H1;!R:1]([F])=[CX3;!R:2]([CX4;H1]([F])[F])[c:3]", "[CX3;H1;!R:1]([F])=[CX3;!R:2]([CX4;H0]([F])([F])[F])[c:3]", "[CX3;H1;!R:1]([F])=[CX3;!R:2]([OX2][CX4,SiX4:4])[c:3]", "[CX3H2;!R:1]=[CX3H1;!R:2][n:3]", "[CX3H2;!R:1]=[CX3;!R:2]([CX4;H3])[n:3]", "[CX3H2;!R:1]=[CX3;!R:2]([F])[n:3]", "[CX3H2;!R:1]=[CX3;!R:2]([CX4;H2][F])[n:3]", "[CX3H2;!R:1]=[CX3;!R:2]([CX4;H1]([F])[F])[n:3]", "[CX3H2;!R:1]=[CX3;!R:2]([CX4;H0]([F])([F])[F])[n:3]", "[CX3H2;!R:1]=[CX3;!R:2]([OX2][CX4,SiX4:4])[n:3]", "[CX3;H0;!R:1]([F])([F])=[CX3H1;!R:2][n:3]", "[CX3;H0;!R:1]([F])([F])=[CX3;!R:2]([CX4;H3])[n:3]", "[CX3;H0;!R:1]([F])([F])=[CX3;!R:2]([F])[n:3]", "[CX3;H0;!R:1]([F])([F])=[CX3;!R:2]([CX4;H2][F])[n:3]", "[CX3;H0;!R:1]([F])([F])=[CX3;!R:2]([CX4;H1]([F])[F])[n:3]", "[CX3;H0;!R:1]([F])([F])=[CX3;!R:2]([CX4;H0]([F])([F])[F])[n:3]", "[CX3;H0;!R:1]([F])([F])=[CX3;!R:2]([OX2][CX4,SiX4:4])[n:3]", "[CX3;H1;!R:1]([F])=[CX3H1;!R:2][n:3]", "[CX3;H1;!R:1]([F])=[CX3;!R:2]([CX4;H3])[n:3]", "[CX3;H1;!R:1]([F])=[CX3;!R:2]([F])[n:3]", "[CX3;H1;!R:1]([F])=[CX3;!R:2]([CX4;H2][F])[n:3]", "[CX3;H1;!R:1]([F])=[CX3;!R:2]([CX4;H1]([F])[F])[n:3]", "[CX3;H1;!R:1]([F])=[CX3;!R:2]([CX4;H0]([F])([F])[F])[n:3]", 
"[CX3;H1;!R:1]([F])=[CX3;!R:2]([OX2][CX4,SiX4:4])[n:3]", "[CX3;H1;r5:1]=[CX3;H1;r5:2][c:3]"], "1004": ["[CX3H2;!R:1]=[CX3H1;!R:2][CX4H2][OX2,SX2:3]", "[CX3H2;!R:1]=[CX3H1;!R:2][CX4H2][NX3:3]", "[CX3H2;!R:1]=[CX3H1;!R:2][CX4H2][c:3]", "[CX3H2;!R:1]=[CX3H1;!R:2][CX4H2][n:3]", "[CX3H2;!R:1]=[CX3H1;!R:2][CX4H2][SiX4]([C,c:3])([C,c:4])[C,c:5]", "[CX3H2;!R:1]=[CX3H1;!R:2][CX4H2][CX2]#[N]"], "1005": ["[CX3;H2;!R:1]=[CX3;H1;!R:2][F,Cl:3]", "[CX3;H2;!R:1]=[CX3;H0;!R:2]([F,Cl;3])[F,Cl:4]", "[CX3;H2;!R:1]=[CX3;H0;!R:2]([F,Cl;3])[CX4:4]", "[CX3;H2;!R:1]=[CX3;H1;!R:2][CX4:3]([F,Cl:4])[F,Cl:5]", "[F,Cl:4][CX3;H1;!R:1]=[CX3;H1;!R:2][F,Cl:3]", "[F,Cl:5][CX3;H1;!R:1]=[CX3;H0;!R:2]([F,Cl;3])[F,Cl:4]", "[F,Cl:5][CX3;H1;!R:1]=[CX3;H0;!R:2]([F,Cl;3])[CX4:4]", "[F,Cl:4][CX3;H0;!R:1]([F,Cl:5])=[CX3;H1;!R:2][F,Cl:3]", "[F,Cl:5][CX3;H0;!R:1]([F,Cl:6])=[CX3;H0;!R:2]([F,Cl;3])[F,Cl:4]", "[F,Cl:5][CX3;H0;!R:1]([F,Cl:6])=[CX3;H0;!R:2]([F,Cl;3])[CX4:4]"], "1006": ["[CX3H2:1]=[CX3H1:2][OX2][CX3:3](=[OX1])", "[CX3H1:1]([F])=[CX3H1:2][OX2][CX3:3](=[OX1])", "[CX3H0:1]([F])([F])=[CX3H1:2][OX2][CX3:3](=[OX1])", "[CX3H2:1]=[CX3H1:2][NX3:3][CX3:4](=[OX1])", "[CX3H1:1]([F])=[CX3H1:2][NX3:3][CX3:4](=[OX1])", "[CX3H0:1]([F])([F])=[CX3H1:2][NX3:3][CX3:4](=[OX1])"], "1007": ["[CX3H1;R:1]1=[CX3H1;R:2][CX3;R](=[OX1])[NX3:3][CX3;R]1(=[OX1])", "[CX3H1;R:1]1([F])=[CX3H1;R:2]([F])[CX3;R](=[OX1])[NX3:3][CX3;R]1(=[OX1])"], "1020": ["[CX3;H2:1]=[CX3;!R:2]-[CX3;!R:3]=[CX3;H2:4]", "[CX3:1]([F])([F])=[CX3!R:2]-[CX3;!R:3]=[CX3H2:4]", "[CX3;H1:1]([F])=[CX3;!R:2]-[CX3;!R:3]=[CX3;H2:4]", "[CX3;H0:1]([F])([F])=[CX3!R:2]-[CX3;!R:3]=[CX3;H1:4]([F])", "[CX3;H1:1]([F])=[CX3;!R:2]-[CX3;!R:3]=[CX3;H1:4]([F])", "[CX3;H0:1]([F])([F])=[CX3;!R:2]-[CX3;!R:3]=[CX3;H0:4]([F])([F])"], "1030": ["[CX3H2:1]=[CX3:2][OX2][CX4:3]", "[CX3H2:1]=[CX3:2][OX2;!$(O*=O)][CX3:3]", "[CX3H2:1]=[CX3:2][OX2][c:3]", "[CX3H1:1]([F])=[CX3:2][OX2][CX4:3]", "[CX3H1:1]([F])=[CX3:2][OX2;!$(O*=O)][CX3:3]", "[CX3H1:1]([F])=[CX3:2][OX2][c:3]", 
"[CX3H0:1]([F])([F])=[CX3:2][OX2][CX4:3]", "[CX3H0:1]([F])([F])=[CX3:2][OX2;!$(O*=O)][CX3:3]", "[CX3H0:1]([F])([F])=[CX3:2][OX2][c:3]"], "1031": ["[CX3;H2;!R:1]=[CX3;H0;!R:2]([CX4H3])[CX4H2:3]", "[CX3;H2;!R:1]=[CX3;H0;!R:2]([CX4H3])[CX4H3]"], "1050": ["[CX3;H1;R:1]=[CX3;H1;R:2]"], "1052": ["[CX3;H2;!R:1]=[CX3;H2;!R:2]", "[CX3;H2;!R:1]=[CX3;H1;!R:2]"], "200": "[CX3]=[CX3]", "201": "[CX4;R]1[OX2;R][CX4;R]1", "202": "[CX3](=[O])[OX2H1,F,Cl,Br,I]", "203": "[C,c][OX2,SX2;H1;!$([O,S]C=*)]", "204": "[C,c][NX3;H2;!$(N[C,S]=*)]", "205": "[NX2]=[CX2]=[OX1,SX1]", "206": "[C,c][CX3,c;R](=[OX1])[OX2,o;R][CX3,c;R](=[OX1])[C,c]"} -------------------------------------------------------------------------------- /utilities/rules/mon_vals.json: -------------------------------------------------------------------------------- 1 | [[1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13], [51, 52, 53, 54, 55, 56, 57, 58], [200, 201, 202, 203, 204, 205, 206], [1001, 1002, 1003, 1004, 1005, 1006, 1007, 1020, 1030, 1031, 1050, 1052]] -------------------------------------------------------------------------------- /utilities/rules/ps_class.json: -------------------------------------------------------------------------------- 1 | {"polyolefin": 11, "polyester": 6, "polyether": 12, "polyamide": 2, "polyimide": 8, "polyurethane": 19, "polyoxazolidone": 23} -------------------------------------------------------------------------------- /utilities/rules/ps_gen.pkl: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/PEJpOhno/SMiPoly/cceef4dfbabde2d5b68d55d153b2e152d799b971/utilities/rules/ps_gen.pkl -------------------------------------------------------------------------------- /utilities/rules/ps_rxn.pkl: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/PEJpOhno/SMiPoly/cceef4dfbabde2d5b68d55d153b2e152d799b971/utilities/rules/ps_rxn.pkl 
--------------------------------------------------------------------------------
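As a hedged illustration of how the rule files above relate to one another, the sketch below inlines a few entries copied from `mon_dic.json` and `mon_dic_inv.json`, plus the full content of `mon_vals.json`. It checks that the name-to-ID and ID-to-name maps are mutual inverses, and looks up which `mon_vals` group an ID falls into. The helper names `invert` and `group_of` are hypothetical and not part of SMiPoly. The real files are assumed to load the same way through `json.load`.

```python
import json

# Excerpts copied from the rule files shown above (not the complete dictionaries).
MON_DIC_JSON = '{"vinyl": 1, "epo": 2, "diepo": 51, "diol": 53, "acryl": 1001, "cycCH": 1050}'
MON_DIC_INV_JSON = (
    '{"1": "vinyl", "2": "epo", "51": "diepo", "53": "diol", '
    '"1001": "acryl", "1050": "cycCH"}'
)
# Full content of mon_vals.json: four lists of monomer-class IDs.
MON_VALS_JSON = (
    "[[1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13], "
    "[51, 52, 53, 54, 55, 56, 57, 58], "
    "[200, 201, 202, 203, 204, 205, 206], "
    "[1001, 1002, 1003, 1004, 1005, 1006, 1007, 1020, 1030, 1031, 1050, 1052]]"
)

mon_dic = json.loads(MON_DIC_JSON)
mon_dic_inv = json.loads(MON_DIC_INV_JSON)
mon_vals = json.loads(MON_VALS_JSON)


def invert(name_to_id):
    """Hypothetical helper: build the ID -> name map.

    JSON object keys are always strings, so the integer IDs are
    converted with str() to match the layout of mon_dic_inv.json.
    """
    return {str(mon_id): name for name, mon_id in name_to_id.items()}


def group_of(mon_id, groups):
    """Hypothetical helper: index of the mon_vals list containing mon_id, or None."""
    for index, members in enumerate(groups):
        if mon_id in members:
            return index
    return None


# The two dictionaries should describe the same mapping in opposite directions.
is_consistent = invert(mon_dic) == mon_dic_inv
```

Keeping the inverse map in a separate file avoids rebuilding it on every load, at the cost of having to regenerate both files together. A check like `is_consistent` is one way to catch the two drifting apart after the rule notebooks are rerun.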