├── knime.png ├── LICENSE ├── README.md └── parallel_smina.ipynb /knime.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/tubiana/screening_protocol/HEAD/knime.png -------------------------------------------------------------------------------- /LICENSE: -------------------------------------------------------------------------------- 1 | MIT License 2 | 3 | Copyright (c) 2023 Thibault Tubiana 4 | 5 | Permission is hereby granted, free of charge, to any person obtaining a copy 6 | of this software and associated documentation files (the "Software"), to deal 7 | in the Software without restriction, including without limitation the rights 8 | to use, copy, modify, merge, publish, distribute, sublicense, and/or sell 9 | copies of the Software, and to permit persons to whom the Software is 10 | furnished to do so, subject to the following conditions: 11 | 12 | The above copyright notice and this permission notice shall be included in all 13 | copies or substantial portions of the Software. 14 | 15 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR 16 | IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, 17 | FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE 18 | AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER 19 | LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, 20 | OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE 21 | SOFTWARE. 22 | -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 | #

SCREENING TUTORIALS

2 | 3 | 4 | ```diff 5 | - NOTE: THIS IS A WORK IN PROGRESS. IF YOU SEE THINGS THAT COULD BE IMPROVED, OPEN A TICKET OR CONTACT ME 6 | ``` 7 | **Author**: Thibault Tubiana 8 | **date**: 02/2023 9 | 10 | 11 | The purpose of this page is to describe my screening protocol. Please contact me if you find any issues with it or if you want to give feedback. 12 | 13 | # Compile the software 14 | 15 | ## A. AutoDock-GPU 16 | Several programs can be used for screening. I have been advised that AutoDock-GPU is a reasonable choice. 17 | After a while, I managed to compile it with the right arguments. 18 | 19 | 1. First, make sure you have CUDA installed. You can check it with the `which nvcc` command. Example: 20 | ```bash 21 | ➜ which nvcc 22 | /usr/local/cuda/bin/nvcc 23 | ``` 24 | So my CUDA installation path is **`/usr/local/cuda/`** 25 | 26 | 2. Check the compute capability of your GPU: https://developer.nvidia.com/cuda-gpus#compute. For example, for my RTX A6000, the compute capability is **8.6**. For the `TARGETS` argument of `make` I will then use **`86`** (if it were 8.0, it would be `80`, and so on). 27 | 28 | ```bash 29 | cd ~/softwares #or change it if you want it elsewhere 30 | git clone https://github.com/ccsb-scripps/AutoDock-GPU 31 | cd AutoDock-GPU 32 | 33 | #Specify the CUDA library and include paths (REPLACE WITH YOUR OWN PATHS) 34 | export GPU_LIBRARY_PATH=/usr/local/cuda/lib64 35 | export GPU_INCLUDE_PATH=/usr/local/cuda/include 36 | 37 | # 38 | make DEVICE=CUDA NUMWI=64 TARGETS="86" #86 because compute capability = 8.6 39 | 40 | #The executables are now in ~/softwares/AutoDock-GPU/bin 41 | ``` 42 | If you want to use MULTIPLE GPUs for screening, you can compile with 43 | ``` 44 | make DEVICE=CUDA NUMWI=64 TARGETS="86" OVERLAP=ON #86 because compute capability = 8.6 45 | ``` 46 | 47 | 48 | ## B. AutoDockTools 49 | If you want to use a graphical interface to generate the grid and prepare some files, AutoDockTools is a bit old but still very useful! 
You can easily install it with `conda` in a **separate environment** (very important, since it needs Python 2.7, which is no longer supported). 50 | 51 | ```bash 52 | conda create -n mgltools -c bioconda mgltools -y 53 | ``` 54 | To run AutoDockTools: 55 | ```bash 56 | conda activate mgltools 57 | adt 58 | ``` 59 | 60 | ## C. AutoDock classic 61 | The old version of AutoDock is still useful because it provides the `autogrid4` program, which is needed to generate important input files for AutoDock-GPU. 62 | 63 | Luckily, there is nothing to compile. Just download the pre-compiled software from https://autodock.scripps.edu/download-autodock4/ : https://autodock.scripps.edu/wp-content/uploads/sites/56/2021/10/autodocksuite-4.2.6-x86_64Linux2.tar 64 | 65 | Once done, extract it where you want to install it (for example `~/softwares/AutoDock/`) 66 | 67 | You can also install it into the mgltools environment: 68 | `conda install -n mgltools -c bioconda autodock autogrid` 69 | 70 | 71 | # Screening! 72 | ## 1. Ligand preparation 73 | 74 | First, download the ligands from the supplementary materials (Excel files). 75 | Then, use KNIME to prepare the ligands (see protocol): 76 | 1. Read the Excel file and keep only the relevant columns + the compound name 77 | 2. Cast it to a readable molecule format 78 | 3. De-salt it (some compounds contain salts, like "Cl.CCCCCC", meaning that there is a free Cl in solution) 79 | 4. Rename the column, otherwise it won't work... 80 | 5. Convert it with RDKit to a molecule (RDKit is a very popular open-source cheminformatics package) 81 | 6. Generate a 3D conformation of the molecule 82 | 7. Optimize the geometry (very coarse "minimisation") with the MMFF94 force field 83 | 8. Save it to SDF. 84 | 85 | Example : 86 | ![KNIME](knime.png) 87 | 88 | 89 | ## 2. Conversion of the molecules 90 | 91 | AutoDock only takes the pdbqt format as input. You have to convert the SDF files into pdbqt. 
92 | There are two possibilities: 93 | 94 | 95 | 96 | ### Using MEEKO (recommended) 97 | Meeko is a tool that generates PDBQT files from SDF and can also generate an SDF from PDBQT. 98 | 99 | ```bash 100 | conda create -n rdkit -c conda-forge numpy scipy rdkit 101 | conda activate rdkit && pip install prody meeko 102 | ``` 103 | 104 | Then convert to PDBQT: 105 | ```bash 106 | mk_prepare_ligand.py -i MOLECULES.sdf --rigid_macrocycles --multimol_outdir OUTPUTFOLDER 107 | ``` 108 | 109 | **NB: Keep `--rigid_macrocycles` for AutoDock-GPU, since AutoDock-GPU doesn't handle flexible macrocycles (but the newest version of AutoDock Vina does!)** 110 | 111 | ### (OPTIONAL) Using BABEL 112 | 113 | Alternatively, you can use `openbabel` for the conversion. Open Babel can convert almost everything, and can even generate 3D coordinates, filter molecules, compute some properties, etc. 114 | During the conversion, I would add the Gasteiger charges as well. 115 | 116 | ```bash 117 | obabel -isdf molecules_1-10.sdf -opdbqt -O molecules_1-10.pdbqt --partialcharge gasteiger 118 | ``` 119 | (If you don't have openbabel installed, you can install it with `conda install -c bioconda -c conda-forge openbabel` or `sudo apt install openbabel`) 120 | 121 | 122 | 123 | 124 | ## 3. Prepare the protein. 125 | 1. The protein needs to be prepared as well. The best way is to keep only the chain we want and strip the waters and other molecules. 126 | You can do that in PyMOL with commands like: 127 | ```pymol 128 | load protein.pdb 129 | remove resname HOH 130 | remove not chain A 131 | save protein_chainA.pdb 132 | ``` 133 | 2. Now we have to add the hydrogens in a "good way". I recommend using propka, or directly the pdb2pqr web server (https://server.poissonboltzmann.org/pdb2pqr). 134 | You can use the default options for now (I like to tick the option to keep the chain names) 135 | 136 | 3. Once done, download the pqr file. 137 | 138 | 139 | ## 4. 
Pocket identification (optional) 140 | To identify the pocket, the best option (I think) is to use fpocket (https://bioserv.rpbs.univ-paris-diderot.fr/services/fpocket/ or https://durrantlab.pitt.edu/fpocketweb/). 141 | By default, you can submit a pdb file and look at the possible pockets. There is a druggability score from 0 to 1 that can be used to find potentially druggable pockets. 142 | 143 | 144 | ## 5. Prepare the grid and the protein for the docking. 145 | The next step is to prepare the docking grid. There are two options: 146 | - Set a grid around the hypothetical binding pocket 147 | - Set a grid around the whole protein. 148 | 149 | In either case, you can prepare it with AutoDockTools (inside the `mgltools` environment: activate it with `conda activate mgltools` and then start it with the `adt` command). 150 | 151 | 1. In the `Grid` menu go to `Macromolecule > Open Macromolecule` and load your pqr file (in the file filter, choose `all files` if it is in pqr format instead of pdbqt) 152 | 2. Usually, AutoDock works well with Gasteiger charges. Say "yes" when adt asks if you want to add Gasteiger charges. 153 | 3. Let AutoDockTools prepare the protein (i.e. calculate the Gasteiger charges + merge the non-polar hydrogens) and **save the protein in pdbqt format** 154 | 4. Set the map types: 155 | Map types are important because the calculations are based on potential maps. You have to create a map for each atom type! 156 | Usually, 157 | - `Grid > Set Map Type > Directly` 158 | - Write these atom types: `A Br C HD N NA OA SA Cl S F I P`. Some atom types might be missing (if you want all atom types, check the file `input/7cpa/derived/AD4.1_bound.dat` in the AutoDock-GPU repository) 159 | 5. Set the grid box 160 | - `Grid > Grid box` 161 | - Change the values to adjust the docking box. 162 | - **If you dock on the whole protein**: change the spacing to 0.5 or 1 Å depending on the size of the protein. 
163 | - **Save it before closing it**: `File > Close Saving Current` 164 | 6. Save the grid file: `Grid > Output > Save GPF` 165 | 7. Prepare all the MAP and FLD files for the docking with AutoGrid: `~/softwares/AutoDock/autogrid4 -p FILE.gpf` 166 | 167 | 168 | 169 | ## 6. SCREEEENING!! 170 | Now that the protein is prepared, and the molecules as well, it is time to screeeen! 171 | The following screening is done with AutoDock-GPU. The idea is to "screen" very quickly, and then re-dock the best compounds with flexible residues [NOTE FOR TT -> TODO]. 172 | 173 | **NOTE**: If you open the .fld file and the .gpf file, you will notice that they contain the name of the prepared protein. It means that when you run AutoDock, all files have to be located at the right place!!! 174 | 175 | 1. Check that all files are there and that the paths in the FLD file point to the right files 176 | 2. Generate a file list with **ALL** the ligands you want to screen. You can do that in Python, for example. It has to follow a specific format: 177 | ``` 178 | PATH_TO_FLD_FILE 179 | ligand1.pdbqt 180 | NAME 1 181 | ligand2.pdbqt 182 | NAME 2 183 | ``` 184 | Example: 185 | ``` 186 | receptor_prepared.maps.fld 187 | ./molecules/TCMDC-140876.pdbqt 188 | TCMDC-140876 189 | ./molecules/TCMDC-124866.pdbqt 190 | TCMDC-124866 191 | ``` 192 | You can use this Python code, for example: 193 | ```python 194 | import os 195 | import glob 196 | from pathlib import Path 197 | 198 | molecules = glob.glob("molecules/*.pdbqt") 199 | with open("molecules_list.list","w") as output: 200 | output.write("WRITE_YOUR_FLD_FILE_HERE\n") 201 | for mol in molecules: 202 | name = Path(mol).stem 203 | output.write(f"./{mol}\n") 204 | output.write(f"{name}\n") 205 | 206 | ``` 207 | 3. Now run the screening!!! 208 | `time ~/softwares/AutoDock-GPU/bin/autodock_gpu_64wi --filelist molecules_list.list -nrun 300 -D all > screening.log` (the binary name ends with the `NUMWI` value used at compile time, here 64) 209 | Please note that redirecting the output of autodock_gpu to a log file is important because it lets you track the best energies directly. 
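
Step 1 of the checklist above (checking that the paths in the FLD file point to existing files) can be sketched in Python. This is only a minimal sketch and not part of the original protocol: it assumes that the `.maps.fld` file written by `autogrid4` references its files through `file=<name>` tokens, and that those names are relative to the folder containing the `.fld` file. The function name `missing_fld_files` is hypothetical.

```python
import re
from pathlib import Path

# Assumption: referenced files appear as `file=<name>` tokens in the .maps.fld file.
FILE_TOKEN = re.compile(r"file=(\S+)")


def missing_fld_files(fld_path):
    """Return the names referenced in an FLD file that do not exist on disk."""
    fld_path = Path(fld_path)
    missing = []
    for line in fld_path.read_text().splitlines():
        if line.lstrip().startswith("#"):
            continue  # comment lines are ignored
        for name in FILE_TOKEN.findall(line):
            # Assumption: names are relative to the folder holding the FLD file.
            if not (fld_path.parent / name).exists():
                missing.append(name)
    return missing
```

Calling `missing_fld_files("receptor_prepared.maps.fld")` before launching the screening should return an empty list; any name it returns is a file that has to be moved back next to the FLD file.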
210 | 211 | # NOT SORTED STUFF 212 | 213 | 214 | 215 | ## Flexible residues... (only during testing, to be removed) 216 | Flexible residues : 217 | - GLU69; ASN73; LYS75; HIS79; MET80; ASP84; PHE87; TYR88 218 | 219 | BOX: 220 | - Number of points: 221 | - X: 42 222 | - Y: 40 223 | - Z: 74 224 | - Center grid 225 | - X: 6.665 226 | - Y: 27.642 227 | - Z: -3.689 228 | - offset 229 | - X: 6.667 230 | - Y: 27.639 231 | - Z: -3.667 232 | 233 | ## Get all the different atom types in all pdbqt files 234 | I use this code to read all the ligand files, get the `ATOM` lines and check the `ATOMTYPE` column. 235 | This is needed to know which atom types are present, in order to prepare the grid for the screening. 236 | 237 | I leave the code here, but I think this definition is enough for most ligands: `A Br C HD N NA OA SA Cl S F I P`. 238 | 239 | CHECK ALL ATOM TYPES: `./input/7cpa/derived/AD4.1_bound.dat` 240 | If you want to update this list with your own ligands, use the code below. 241 | 242 | ```python 243 | import glob 244 | from tqdm import tqdm 245 | import numpy as np 246 | 247 | filelist = glob.glob("./*.pdbqt") 248 | 249 | atom = [] #atom types found in the ligand files 250 | errors = [] #files containing CG0/CG1/G0/G1 atom types 251 | for file in tqdm(filelist): 252 | with open(file,"r") as filin: 253 | lines = filin.readlines() 254 | for line in lines: 255 | try: 256 | if line.startswith("ATOM"): 257 | atomType = line[77:80].strip() 258 | if atomType in ["CG0","CG1","G0","G1"]: 259 | errors.append(file) 260 | atom.append(atomType) 261 | except Exception: 262 | pass 263 | 264 | print(np.unique(np.asarray(atom))) 265 | print(np.unique(np.asarray(errors))) 266 | 267 | ``` 268 | 269 | # DLG TO PDBQT 270 | To convert DLG to PDBQT in one command, you can use 271 | ```bash 272 | grep ^DOCKED results.dlg | cut -c 9- > docked.pdbqt 273 | ``` 274 | or (better) 275 | ```bash 276 | obabel -ipdbqt TCMDC-124963.dlg -opdbqt -O test.pdbqt -ad 277 | ``` 278 | NOTE: `-ad` passes the `d` option to the `pdbqt` READING format. 
(check https://openbabel.org/docs/current/FileFormats/AutoDock_PDQBT_format.html#read-options and https://openbabel.org/docs/current/FileFormats/Overview.html for how to pass input/output options). 279 | 280 | 281 | 282 | ## Get the best energy from logfile 283 | ```python 284 | import os 285 | import re 286 | reg = re.compile(r"(B|b)est energy +(-?\d+\.\d+) kcal\/mol") #the sign is optional so positive energies are not skipped 287 | 288 | os.chdir("/home/thibault/work/projects/other/Sylvie/new_test") 289 | logfile = "screening.log" 290 | 291 | with open(logfile,"r") as infile: 292 | ligands = [] 293 | energies = [] 294 | for line in infile.readlines(): 295 | if line.strip().startswith("Ligand file"): 296 | ligandname = line.split("/")[-1].split(".")[0] #Get the ligand name 297 | ligands.append(ligandname) 298 | match = reg.findall(line) 299 | if len(match) > 0: 300 | energy = float(match[0][1]) 301 | energies.append(energy) 302 | 303 | if len(energies) == len(ligands): 304 | import pandas as pd 305 | 306 | d = { 307 | "Molecule":ligands, 308 | "Best Energy (kcal/mol)":energies 309 | } 310 | results = pd.DataFrame.from_dict(d) 311 | results = results.sort_values("Best Energy (kcal/mol)") 312 | results.to_csv("results_from_logfile.csv", sep=";") 313 | 314 | else: 315 | print("Mismatch between the number of ligands and the number of energies") 316 | 317 | ``` 318 | 319 | 320 | ## Extract the energy from XML files 321 | ```python 322 | import pandas as pd 323 | import glob 324 | from pathlib import Path 325 | #Don't forget to install lxml 326 | 327 | filelist = glob.glob("./test062223/*.xml") 328 | molecules = [] 329 | energiesMin = [] 330 | energiesAve = [] 331 | for file in filelist: 332 | name = Path(file).stem 333 | molecules.append(name) 334 | results = pd.read_xml(file, xpath="//runs/run",namespaces={"autodock-gpu":"runs"}) 335 | energiesMin.append(results.free_NRG_binding.min()) 336 | energiesAve.append(results.free_NRG_binding.mean()) 337 | 338 | results = pd.DataFrame.from_dict( 339 | { 340 | "molecule":molecules, 341 | "energyMin":energiesMin, 342 | 
"energyAve":energiesAve, 343 | } 344 | ) 345 | results.to_csv("results_from_xml.csv", sep=";") 346 | 347 | 348 | ``` 349 | -------------------------------------------------------------------------------- /parallel_smina.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "cells": [ 3 | { 4 | "cell_type": "markdown", 5 | "metadata": {}, 6 | "source": [ 7 | "# 1. Install dependancies." 8 | ] 9 | }, 10 | { 11 | "cell_type": "code", 12 | "execution_count": 1, 13 | "metadata": {}, 14 | "outputs": [ 15 | { 16 | "name": "stdout", 17 | "output_type": "stream", 18 | "text": [ 19 | "Requirement already satisfied: pandas in /home/thibault/miniconda3/envs/rdkit/lib/python3.11/site-packages (1.5.3)\n", 20 | "Requirement already satisfied: pandarallel in /home/thibault/miniconda3/envs/rdkit/lib/python3.11/site-packages (1.6.4)\n", 21 | "Requirement already satisfied: tqdm in /home/thibault/miniconda3/envs/rdkit/lib/python3.11/site-packages (4.64.1)\n", 22 | "Requirement already satisfied: ipynb in /home/thibault/miniconda3/envs/rdkit/lib/python3.11/site-packages (0.5.1)\n", 23 | "Requirement already satisfied: ipywidgets==7.7.2 in /home/thibault/miniconda3/envs/rdkit/lib/python3.11/site-packages (7.7.2)\n", 24 | "Requirement already satisfied: ipykernel>=4.5.1 in /home/thibault/miniconda3/envs/rdkit/lib/python3.11/site-packages (from ipywidgets==7.7.2) (6.20.2)\n", 25 | "Requirement already satisfied: ipython-genutils~=0.2.0 in /home/thibault/miniconda3/envs/rdkit/lib/python3.11/site-packages (from ipywidgets==7.7.2) (0.2.0)\n", 26 | "Requirement already satisfied: traitlets>=4.3.1 in /home/thibault/miniconda3/envs/rdkit/lib/python3.11/site-packages (from ipywidgets==7.7.2) (5.9.0)\n", 27 | "Requirement already satisfied: widgetsnbextension~=3.6.0 in /home/thibault/miniconda3/envs/rdkit/lib/python3.11/site-packages (from ipywidgets==7.7.2) (3.6.2)\n", 28 | "Requirement already satisfied: ipython>=4.0.0 in 
/home/thibault/miniconda3/envs/rdkit/lib/python3.11/site-packages (from ipywidgets==7.7.2) (8.9.0)\n", 29 | "Requirement already satisfied: jupyterlab-widgets<3,>=1.0.0 in /home/thibault/miniconda3/envs/rdkit/lib/python3.11/site-packages (from ipywidgets==7.7.2) (1.1.2)\n", 30 | "Requirement already satisfied: python-dateutil>=2.8.1 in /home/thibault/miniconda3/envs/rdkit/lib/python3.11/site-packages (from pandas) (2.8.2)\n", 31 | "Requirement already satisfied: pytz>=2020.1 in /home/thibault/miniconda3/envs/rdkit/lib/python3.11/site-packages (from pandas) (2022.7.1)\n", 32 | "Requirement already satisfied: numpy>=1.21.0 in /home/thibault/miniconda3/envs/rdkit/lib/python3.11/site-packages (from pandas) (1.23.5)\n", 33 | "Requirement already satisfied: dill>=0.3.1 in /home/thibault/miniconda3/envs/rdkit/lib/python3.11/site-packages (from pandarallel) (0.3.6)\n", 34 | "Requirement already satisfied: psutil in /home/thibault/miniconda3/envs/rdkit/lib/python3.11/site-packages (from pandarallel) (5.9.4)\n", 35 | "Requirement already satisfied: comm>=0.1.1 in /home/thibault/miniconda3/envs/rdkit/lib/python3.11/site-packages (from ipykernel>=4.5.1->ipywidgets==7.7.2) (0.1.2)\n", 36 | "Requirement already satisfied: debugpy>=1.0 in /home/thibault/miniconda3/envs/rdkit/lib/python3.11/site-packages (from ipykernel>=4.5.1->ipywidgets==7.7.2) (1.6.6)\n", 37 | "Requirement already satisfied: jupyter-client>=6.1.12 in /home/thibault/miniconda3/envs/rdkit/lib/python3.11/site-packages (from ipykernel>=4.5.1->ipywidgets==7.7.2) (8.0.2)\n", 38 | "Requirement already satisfied: matplotlib-inline>=0.1 in /home/thibault/miniconda3/envs/rdkit/lib/python3.11/site-packages (from ipykernel>=4.5.1->ipywidgets==7.7.2) (0.1.6)\n", 39 | "Requirement already satisfied: nest-asyncio in /home/thibault/miniconda3/envs/rdkit/lib/python3.11/site-packages (from ipykernel>=4.5.1->ipywidgets==7.7.2) (1.5.6)\n", 40 | "Requirement already satisfied: packaging in 
/home/thibault/miniconda3/envs/rdkit/lib/python3.11/site-packages (from ipykernel>=4.5.1->ipywidgets==7.7.2) (23.0)\n", 41 | "Requirement already satisfied: pyzmq>=17 in /home/thibault/miniconda3/envs/rdkit/lib/python3.11/site-packages (from ipykernel>=4.5.1->ipywidgets==7.7.2) (25.0.0)\n", 42 | "Requirement already satisfied: tornado>=6.1 in /home/thibault/miniconda3/envs/rdkit/lib/python3.11/site-packages (from ipykernel>=4.5.1->ipywidgets==7.7.2) (6.2)\n", 43 | "Requirement already satisfied: backcall in /home/thibault/miniconda3/envs/rdkit/lib/python3.11/site-packages (from ipython>=4.0.0->ipywidgets==7.7.2) (0.2.0)\n", 44 | "Requirement already satisfied: decorator in /home/thibault/miniconda3/envs/rdkit/lib/python3.11/site-packages (from ipython>=4.0.0->ipywidgets==7.7.2) (5.1.1)\n", 45 | "Requirement already satisfied: jedi>=0.16 in /home/thibault/miniconda3/envs/rdkit/lib/python3.11/site-packages (from ipython>=4.0.0->ipywidgets==7.7.2) (0.18.2)\n", 46 | "Requirement already satisfied: pickleshare in /home/thibault/miniconda3/envs/rdkit/lib/python3.11/site-packages (from ipython>=4.0.0->ipywidgets==7.7.2) (0.7.5)\n", 47 | "Requirement already satisfied: prompt-toolkit<3.1.0,>=3.0.30 in /home/thibault/miniconda3/envs/rdkit/lib/python3.11/site-packages (from ipython>=4.0.0->ipywidgets==7.7.2) (3.0.36)\n", 48 | "Requirement already satisfied: pygments>=2.4.0 in /home/thibault/miniconda3/envs/rdkit/lib/python3.11/site-packages (from ipython>=4.0.0->ipywidgets==7.7.2) (2.14.0)\n", 49 | "Requirement already satisfied: stack-data in /home/thibault/miniconda3/envs/rdkit/lib/python3.11/site-packages (from ipython>=4.0.0->ipywidgets==7.7.2) (0.6.2)\n", 50 | "Requirement already satisfied: pexpect>4.3 in /home/thibault/miniconda3/envs/rdkit/lib/python3.11/site-packages (from ipython>=4.0.0->ipywidgets==7.7.2) (4.8.0)\n", 51 | "Requirement already satisfied: six>=1.5 in /home/thibault/miniconda3/envs/rdkit/lib/python3.11/site-packages (from 
python-dateutil>=2.8.1->pandas) (1.16.0)\n", 52 | "Requirement already satisfied: notebook>=4.4.1 in /home/thibault/miniconda3/envs/rdkit/lib/python3.11/site-packages (from widgetsnbextension~=3.6.0->ipywidgets==7.7.2) (6.5.2)\n", 53 | "Requirement already satisfied: parso<0.9.0,>=0.8.0 in /home/thibault/miniconda3/envs/rdkit/lib/python3.11/site-packages (from jedi>=0.16->ipython>=4.0.0->ipywidgets==7.7.2) (0.8.3)\n", 54 | "Requirement already satisfied: jupyter-core!=5.0.*,>=4.12 in /home/thibault/miniconda3/envs/rdkit/lib/python3.11/site-packages (from jupyter-client>=6.1.12->ipykernel>=4.5.1->ipywidgets==7.7.2) (5.2.0)\n", 55 | "Requirement already satisfied: jinja2 in /home/thibault/miniconda3/envs/rdkit/lib/python3.11/site-packages (from notebook>=4.4.1->widgetsnbextension~=3.6.0->ipywidgets==7.7.2) (3.1.2)\n", 56 | "Requirement already satisfied: argon2-cffi in /home/thibault/miniconda3/envs/rdkit/lib/python3.11/site-packages (from notebook>=4.4.1->widgetsnbextension~=3.6.0->ipywidgets==7.7.2) (21.3.0)\n", 57 | "Requirement already satisfied: nbformat in /home/thibault/miniconda3/envs/rdkit/lib/python3.11/site-packages (from notebook>=4.4.1->widgetsnbextension~=3.6.0->ipywidgets==7.7.2) (5.7.3)\n", 58 | "Requirement already satisfied: nbconvert>=5 in /home/thibault/miniconda3/envs/rdkit/lib/python3.11/site-packages (from notebook>=4.4.1->widgetsnbextension~=3.6.0->ipywidgets==7.7.2) (7.2.9)\n", 59 | "Requirement already satisfied: Send2Trash>=1.8.0 in /home/thibault/miniconda3/envs/rdkit/lib/python3.11/site-packages (from notebook>=4.4.1->widgetsnbextension~=3.6.0->ipywidgets==7.7.2) (1.8.0)\n", 60 | "Requirement already satisfied: terminado>=0.8.3 in /home/thibault/miniconda3/envs/rdkit/lib/python3.11/site-packages (from notebook>=4.4.1->widgetsnbextension~=3.6.0->ipywidgets==7.7.2) (0.17.1)\n", 61 | "Requirement already satisfied: prometheus-client in /home/thibault/miniconda3/envs/rdkit/lib/python3.11/site-packages (from 
notebook>=4.4.1->widgetsnbextension~=3.6.0->ipywidgets==7.7.2) (0.16.0)\n", 62 | "Requirement already satisfied: nbclassic>=0.4.7 in /home/thibault/miniconda3/envs/rdkit/lib/python3.11/site-packages (from notebook>=4.4.1->widgetsnbextension~=3.6.0->ipywidgets==7.7.2) (0.5.2)\n", 63 | "Requirement already satisfied: ptyprocess>=0.5 in /home/thibault/miniconda3/envs/rdkit/lib/python3.11/site-packages (from pexpect>4.3->ipython>=4.0.0->ipywidgets==7.7.2) (0.7.0)\n", 64 | "Requirement already satisfied: wcwidth in /home/thibault/miniconda3/envs/rdkit/lib/python3.11/site-packages (from prompt-toolkit<3.1.0,>=3.0.30->ipython>=4.0.0->ipywidgets==7.7.2) (0.2.6)\n", 65 | "Requirement already satisfied: executing>=1.2.0 in /home/thibault/miniconda3/envs/rdkit/lib/python3.11/site-packages (from stack-data->ipython>=4.0.0->ipywidgets==7.7.2) (1.2.0)\n", 66 | "Requirement already satisfied: asttokens>=2.1.0 in /home/thibault/miniconda3/envs/rdkit/lib/python3.11/site-packages (from stack-data->ipython>=4.0.0->ipywidgets==7.7.2) (2.2.1)\n", 67 | "Requirement already satisfied: pure-eval in /home/thibault/miniconda3/envs/rdkit/lib/python3.11/site-packages (from stack-data->ipython>=4.0.0->ipywidgets==7.7.2) (0.2.2)\n", 68 | "Requirement already satisfied: platformdirs>=2.5 in /home/thibault/miniconda3/envs/rdkit/lib/python3.11/site-packages (from jupyter-core!=5.0.*,>=4.12->jupyter-client>=6.1.12->ipykernel>=4.5.1->ipywidgets==7.7.2) (2.6.2)\n", 69 | "Requirement already satisfied: jupyter-server>=1.8 in /home/thibault/miniconda3/envs/rdkit/lib/python3.11/site-packages (from nbclassic>=0.4.7->notebook>=4.4.1->widgetsnbextension~=3.6.0->ipywidgets==7.7.2) (2.3.0)\n", 70 | "Requirement already satisfied: notebook-shim>=0.1.0 in /home/thibault/miniconda3/envs/rdkit/lib/python3.11/site-packages (from nbclassic>=0.4.7->notebook>=4.4.1->widgetsnbextension~=3.6.0->ipywidgets==7.7.2) (0.2.2)\n", 71 | "Requirement already satisfied: beautifulsoup4 in 
/home/thibault/miniconda3/envs/rdkit/lib/python3.11/site-packages (from nbconvert>=5->notebook>=4.4.1->widgetsnbextension~=3.6.0->ipywidgets==7.7.2) (4.11.2)\n", 72 | "Requirement already satisfied: bleach in /home/thibault/miniconda3/envs/rdkit/lib/python3.11/site-packages (from nbconvert>=5->notebook>=4.4.1->widgetsnbextension~=3.6.0->ipywidgets==7.7.2) (6.0.0)\n", 73 | "Requirement already satisfied: defusedxml in /home/thibault/miniconda3/envs/rdkit/lib/python3.11/site-packages (from nbconvert>=5->notebook>=4.4.1->widgetsnbextension~=3.6.0->ipywidgets==7.7.2) (0.7.1)\n", 74 | "Requirement already satisfied: jupyterlab-pygments in /home/thibault/miniconda3/envs/rdkit/lib/python3.11/site-packages (from nbconvert>=5->notebook>=4.4.1->widgetsnbextension~=3.6.0->ipywidgets==7.7.2) (0.2.2)\n", 75 | "Requirement already satisfied: markupsafe>=2.0 in /home/thibault/miniconda3/envs/rdkit/lib/python3.11/site-packages (from nbconvert>=5->notebook>=4.4.1->widgetsnbextension~=3.6.0->ipywidgets==7.7.2) (2.1.2)\n", 76 | "Requirement already satisfied: mistune<3,>=2.0.3 in /home/thibault/miniconda3/envs/rdkit/lib/python3.11/site-packages (from nbconvert>=5->notebook>=4.4.1->widgetsnbextension~=3.6.0->ipywidgets==7.7.2) (2.0.5)\n", 77 | "Requirement already satisfied: nbclient>=0.5.0 in /home/thibault/miniconda3/envs/rdkit/lib/python3.11/site-packages (from nbconvert>=5->notebook>=4.4.1->widgetsnbextension~=3.6.0->ipywidgets==7.7.2) (0.7.2)\n", 78 | "Requirement already satisfied: pandocfilters>=1.4.1 in /home/thibault/miniconda3/envs/rdkit/lib/python3.11/site-packages (from nbconvert>=5->notebook>=4.4.1->widgetsnbextension~=3.6.0->ipywidgets==7.7.2) (1.5.0)\n", 79 | "Requirement already satisfied: tinycss2 in /home/thibault/miniconda3/envs/rdkit/lib/python3.11/site-packages (from nbconvert>=5->notebook>=4.4.1->widgetsnbextension~=3.6.0->ipywidgets==7.7.2) (1.2.1)\n", 80 | "Requirement already satisfied: fastjsonschema in 
/home/thibault/miniconda3/envs/rdkit/lib/python3.11/site-packages (from nbformat->notebook>=4.4.1->widgetsnbextension~=3.6.0->ipywidgets==7.7.2) (2.16.2)\n", 81 | "Requirement already satisfied: jsonschema>=2.6 in /home/thibault/miniconda3/envs/rdkit/lib/python3.11/site-packages (from nbformat->notebook>=4.4.1->widgetsnbextension~=3.6.0->ipywidgets==7.7.2) (4.17.3)\n", 82 | "Requirement already satisfied: argon2-cffi-bindings in /home/thibault/miniconda3/envs/rdkit/lib/python3.11/site-packages (from argon2-cffi->notebook>=4.4.1->widgetsnbextension~=3.6.0->ipywidgets==7.7.2) (21.2.0)\n", 83 | "Requirement already satisfied: attrs>=17.4.0 in /home/thibault/miniconda3/envs/rdkit/lib/python3.11/site-packages (from jsonschema>=2.6->nbformat->notebook>=4.4.1->widgetsnbextension~=3.6.0->ipywidgets==7.7.2) (22.2.0)\n", 84 | "Requirement already satisfied: pyrsistent!=0.17.0,!=0.17.1,!=0.17.2,>=0.14.0 in /home/thibault/miniconda3/envs/rdkit/lib/python3.11/site-packages (from jsonschema>=2.6->nbformat->notebook>=4.4.1->widgetsnbextension~=3.6.0->ipywidgets==7.7.2) (0.19.3)\n", 85 | "Requirement already satisfied: anyio>=3.1.0 in /home/thibault/miniconda3/envs/rdkit/lib/python3.11/site-packages (from jupyter-server>=1.8->nbclassic>=0.4.7->notebook>=4.4.1->widgetsnbextension~=3.6.0->ipywidgets==7.7.2) (3.6.2)\n", 86 | "Requirement already satisfied: jupyter-events>=0.4.0 in /home/thibault/miniconda3/envs/rdkit/lib/python3.11/site-packages (from jupyter-server>=1.8->nbclassic>=0.4.7->notebook>=4.4.1->widgetsnbextension~=3.6.0->ipywidgets==7.7.2) (0.6.3)\n", 87 | "Requirement already satisfied: jupyter-server-terminals in /home/thibault/miniconda3/envs/rdkit/lib/python3.11/site-packages (from jupyter-server>=1.8->nbclassic>=0.4.7->notebook>=4.4.1->widgetsnbextension~=3.6.0->ipywidgets==7.7.2) (0.4.4)\n", 88 | "Requirement already satisfied: websocket-client in /home/thibault/miniconda3/envs/rdkit/lib/python3.11/site-packages (from 
jupyter-server>=1.8->nbclassic>=0.4.7->notebook>=4.4.1->widgetsnbextension~=3.6.0->ipywidgets==7.7.2) (1.5.1)\n", 89 | "Requirement already satisfied: cffi>=1.0.1 in /home/thibault/miniconda3/envs/rdkit/lib/python3.11/site-packages (from argon2-cffi-bindings->argon2-cffi->notebook>=4.4.1->widgetsnbextension~=3.6.0->ipywidgets==7.7.2) (1.15.1)\n", 90 | "Requirement already satisfied: soupsieve>1.2 in /home/thibault/miniconda3/envs/rdkit/lib/python3.11/site-packages (from beautifulsoup4->nbconvert>=5->notebook>=4.4.1->widgetsnbextension~=3.6.0->ipywidgets==7.7.2) (2.4)\n", 91 | "Requirement already satisfied: webencodings in /home/thibault/miniconda3/envs/rdkit/lib/python3.11/site-packages (from bleach->nbconvert>=5->notebook>=4.4.1->widgetsnbextension~=3.6.0->ipywidgets==7.7.2) (0.5.1)\n", 92 | "Requirement already satisfied: idna>=2.8 in /home/thibault/miniconda3/envs/rdkit/lib/python3.11/site-packages (from anyio>=3.1.0->jupyter-server>=1.8->nbclassic>=0.4.7->notebook>=4.4.1->widgetsnbextension~=3.6.0->ipywidgets==7.7.2) (3.4)\n", 93 | "Requirement already satisfied: sniffio>=1.1 in /home/thibault/miniconda3/envs/rdkit/lib/python3.11/site-packages (from anyio>=3.1.0->jupyter-server>=1.8->nbclassic>=0.4.7->notebook>=4.4.1->widgetsnbextension~=3.6.0->ipywidgets==7.7.2) (1.3.0)\n", 94 | "Requirement already satisfied: pycparser in /home/thibault/miniconda3/envs/rdkit/lib/python3.11/site-packages (from cffi>=1.0.1->argon2-cffi-bindings->argon2-cffi->notebook>=4.4.1->widgetsnbextension~=3.6.0->ipywidgets==7.7.2) (2.21)\n", 95 | "Requirement already satisfied: python-json-logger>=2.0.4 in /home/thibault/miniconda3/envs/rdkit/lib/python3.11/site-packages (from jupyter-events>=0.4.0->jupyter-server>=1.8->nbclassic>=0.4.7->notebook>=4.4.1->widgetsnbextension~=3.6.0->ipywidgets==7.7.2) (2.0.7)\n", 96 | "Requirement already satisfied: pyyaml>=5.3 in /home/thibault/miniconda3/envs/rdkit/lib/python3.11/site-packages (from 
jupyter-events>=0.4.0->jupyter-server>=1.8->nbclassic>=0.4.7->notebook>=4.4.1->widgetsnbextension~=3.6.0->ipywidgets==7.7.2) (6.0)\n", 97 | "Requirement already satisfied: rfc3339-validator in /home/thibault/miniconda3/envs/rdkit/lib/python3.11/site-packages (from jupyter-events>=0.4.0->jupyter-server>=1.8->nbclassic>=0.4.7->notebook>=4.4.1->widgetsnbextension~=3.6.0->ipywidgets==7.7.2) (0.1.4)\n", 98 | "Requirement already satisfied: rfc3986-validator>=0.1.1 in /home/thibault/miniconda3/envs/rdkit/lib/python3.11/site-packages (from jupyter-events>=0.4.0->jupyter-server>=1.8->nbclassic>=0.4.7->notebook>=4.4.1->widgetsnbextension~=3.6.0->ipywidgets==7.7.2) (0.1.1)\n", 99 | "Requirement already satisfied: fqdn in /home/thibault/miniconda3/envs/rdkit/lib/python3.11/site-packages (from jsonschema>=2.6->nbformat->notebook>=4.4.1->widgetsnbextension~=3.6.0->ipywidgets==7.7.2) (1.5.1)\n", 100 | "Requirement already satisfied: isoduration in /home/thibault/miniconda3/envs/rdkit/lib/python3.11/site-packages (from jsonschema>=2.6->nbformat->notebook>=4.4.1->widgetsnbextension~=3.6.0->ipywidgets==7.7.2) (20.11.0)\n", 101 | "Requirement already satisfied: jsonpointer>1.13 in /home/thibault/miniconda3/envs/rdkit/lib/python3.11/site-packages (from jsonschema>=2.6->nbformat->notebook>=4.4.1->widgetsnbextension~=3.6.0->ipywidgets==7.7.2) (2.3)\n", 102 | "Requirement already satisfied: uri-template in /home/thibault/miniconda3/envs/rdkit/lib/python3.11/site-packages (from jsonschema>=2.6->nbformat->notebook>=4.4.1->widgetsnbextension~=3.6.0->ipywidgets==7.7.2) (1.2.0)\n", 103 | "Requirement already satisfied: webcolors>=1.11 in /home/thibault/miniconda3/envs/rdkit/lib/python3.11/site-packages (from jsonschema>=2.6->nbformat->notebook>=4.4.1->widgetsnbextension~=3.6.0->ipywidgets==7.7.2) (1.12)\n", 104 | "Requirement already satisfied: arrow>=0.15.0 in /home/thibault/miniconda3/envs/rdkit/lib/python3.11/site-packages (from 
isoduration->jsonschema>=2.6->nbformat->notebook>=4.4.1->widgetsnbextension~=3.6.0->ipywidgets==7.7.2) (1.2.3)\n"
105 | ]
106 | }
107 | ],
108 | "source": [
109 | "!pip install pandas pandarallel tqdm ipynb ipywidgets==7.7.2"
110 | ]
111 | },
112 | {
113 | "cell_type": "markdown",
114 | "metadata": {},
115 | "source": [
116 | "# 2. Import and working directory"
117 | ]
118 | },
119 | {
120 | "cell_type": "code",
121 | "execution_count": 3,
122 | "metadata": {},
123 | "outputs": [
124 | {
125 | "name": "stdout",
126 | "output_type": "stream",
127 | "text": [
128 | "INFO: Pandarallel will run on 6 workers.\n",
129 | "INFO: Pandarallel will use Memory file system to transfer data between the main process and workers.\n"
130 | ]
131 | }
132 | ],
133 | "source": [
134 | "import pandas as pd\n",
135 | "import tqdm.notebook as tqdm\n",
136 | "from tqdm import tqdm as tq\n",
137 | "import glob\n",
138 | "import pathlib\n",
139 | "import os\n",
140 | "import re\n",
141 | "import numpy as np\n",
142 | "import subprocess\n",
143 | "tq.pandas()\n",
144 | "\n",
145 | "from pandarallel import pandarallel\n",
146 | "\n",
147 | "pandarallel.initialize(progress_bar=True, nb_workers=6) #6 processes, each using 8 cores, which leaves 48 of the 96 cores free\n",
148 | "\n",
149 | "#Regex for best mode energy\n",
150 | "regex_best_energy = re.compile(r\"^1 +(-?[0-9]+\\.[0-9]+) +0\\.000 +0\\.000 +\")\n",
151 | "\n"
152 | ]
153 | },
154 | {
155 | "cell_type": "markdown",
156 | "metadata": {},
157 | "source": [
158 | "Go to your working directory"
159 | ]
160 | },
161 | {
162 | "cell_type": "code",
163 | "execution_count": 4,
164 | "metadata": {},
165 | "outputs": [],
166 | "source": [
167 | "os.chdir(\"/home/thibault/work/projects/other/Sylvie/SMINA\")"
168 | ]
169 | },
170 | {
171 | "cell_type": "markdown",
172 | "metadata": {},
173 | "source": [
174 | "# 3. 
Searching for pdbqt files\n",
175 | "\n",
176 | "This searches for all `pdbqt` files in the `ligands` folder and puts them in a pandas DataFrame (used for quick parallelisation)"
177 | ]
178 | },
179 | {
180 | "cell_type": "code",
181 | "execution_count": 5,
182 | "metadata": {},
183 | "outputs": [],
184 | "source": [
185 | "ligand_list = glob.glob(\"ligands/*.pdbqt\")\n",
186 | "df = pd.DataFrame(ligand_list, columns=[\"LigandFile\"])"
187 | ]
188 | },
189 | {
190 | "cell_type": "markdown",
191 | "metadata": {},
192 | "source": [
193 | "# Run processes"
194 | ]
195 | },
196 | {
197 | "cell_type": "code",
198 | "execution_count": 9,
199 | "metadata": {},
200 | "outputs": [
201 | {
202 | "data": {
203 | "application/vnd.jupyter.widget-view+json": {
204 | "model_id": "909184bd4fbd4deebff83ca9159496f1",
205 | "version_major": 2,
206 | "version_minor": 0
207 | },
208 | "text/plain": [
209 | "VBox(children=(HBox(children=(IntProgress(value=0, description='0.00%', max=2), Label(value='0 / 2'))), HBox(c…"
210 | ]
211 | },
212 | "metadata": {},
213 | "output_type": "display_data"
214 | }
215 | ],
216 | "source": [
217 | "def run_smina(row):\n",
218 | "    file=row.iloc[0]\n",
219 | "    basename = pathlib.Path(file).stem\n",
220 | "    outfolder = f\"all_poses/{basename}\"\n",
221 | "    if not os.path.exists(outfolder):\n",
222 | "        os.makedirs(outfolder)\n",
223 | "\n",
224 | "    outputposes = f\"{outfolder}/{basename}.pdbqt\"\n",
225 | "    outputlog = f\"{outfolder}/{basename}.log\"\n",
226 | "\n",
227 | "    #default success = False. Only change it when it worked.\n",
228 | "    success = False\n",
229 | "    best_energy = np.nan\n",
230 | "    #Check if the logfile already exists\n",
231 | "    #if os.path.exists(outputlog):\n",
232 | "    if False:\n",
233 | "        with open(outputlog, \"r\") as log:\n",
234 | "            stdout = log.readlines()\n",
235 | "            success = True\n",
236 | "    else:\n",
237 | "        results = subprocess.run([\n",
238 | "            \"./smina.static\", \n",
239 | "            \"--config\",\"ACDC_B.inp\",\n",
240 | "            \"--ligand\", file, \n",
241 | "            \"--out\", outputposes, \n",
242 | "            \"--log\",outputlog,\n",
243 | "            \"--cpu\",\"8\", \n",
244 | "            \"--scoring\",\"vinardo\"],\n",
245 | "            capture_output=True)\n",
246 | "        if results.returncode == 0:\n",
247 | "            success = True\n",
248 | "        if success == True:\n",
249 | "            stdout = results.stdout.decode(\"utf-8\").split(\"\\n\")\n",
250 | "        \n",
251 | "\n",
252 | "    if success == True:\n",
253 | "        #Get the best energy\n",
254 | "        for line in stdout:\n",
255 | "            match = regex_best_energy.findall(line)\n",
256 | "            if match:\n",
257 | "                best_energy=float(match[0])\n",
258 | "                break\n",
259 | "            else:\n",
260 | "                best_energy = np.nan\n",
261 | "\n",
262 | "\n",
263 | "    return pd.Series(\n",
264 | "        {\n",
265 | "            \"name\":basename,\n",
266 | "            \"filename\":file,\n",
267 | "            \"success\":success,\n",
268 | "            \"BestEnergy\":best_energy\n",
269 | "        }\n",
270 | "    )\n",
271 | "results = df.parallel_apply(lambda x: run_smina(x), axis=1)\n",
272 | "\n",
273 | "results.to_csv(\"results.csv\",sep=\";\")"
274 | ]
275 | },
276 | {
277 | "cell_type": "code",
278 | "execution_count": 36,
279 | "metadata": {},
280 | "outputs": [
281 | {
282 | "data": {
283 | "text/html": [
284 | "
"
445 | ],
446 | "text/plain": [
447 | "               name                    filename  success  BestEnergy\n",
448 | "827    TCMDC-138610  ligands/TCMDC-138610.pdbqt     True       -11.2\n",
449 | "1224   TCMDC-137673  ligands/TCMDC-137673.pdbqt     True       -11.5\n",
450 | "2419   TCMDC-139868  ligands/TCMDC-139868.pdbqt     True       -11.5\n",
451 | "3011   TCMDC-125666  ligands/TCMDC-125666.pdbqt     True       -11.8\n",
452 | "3026   TCMDC-139069  ligands/TCMDC-139069.pdbqt     True       -11.1\n",
453 | "3620   TCMDC-124223  ligands/TCMDC-124223.pdbqt     True       -11.6\n",
454 | "3945   TCMDC-138256  ligands/TCMDC-138256.pdbqt     True       -11.1\n",
455 | "4734   TCMDC-139888  ligands/TCMDC-139888.pdbqt     True       -11.5\n",
456 | "4848   TCMDC-140353  ligands/TCMDC-140353.pdbqt     True       -11.1\n",
457 | "5002   TCMDC-125639  ligands/TCMDC-125639.pdbqt     True       -11.4\n",
458 | "5156   TCMDC-140235  ligands/TCMDC-140235.pdbqt     True       -12.1\n",
459 | "5374   TCMDC-140834  ligands/TCMDC-140834.pdbqt     True       -11.6\n",
460 | "6012   TCMDC-124284  ligands/TCMDC-124284.pdbqt     True       -11.1\n",
461 | "6026   TCMDC-131906  ligands/TCMDC-131906.pdbqt     True       -11.4\n",
462 | "6196   TCMDC-138607  ligands/TCMDC-138607.pdbqt     True       -11.5\n",
463 | "6657   TCMDC-137354  ligands/TCMDC-137354.pdbqt     True       -11.1\n",
464 | "8892   TCMDC-134680  ligands/TCMDC-134680.pdbqt     True       -11.5\n",
465 | "9983   TCMDC-133253  ligands/TCMDC-133253.pdbqt     True       -11.2\n",
466 | "11750  TCMDC-132199  ligands/TCMDC-132199.pdbqt     True       -11.2"
467 | ]
468 | },
469 | "execution_count": 36,
470 | "metadata": {},
471 | "output_type": "execute_result"
472 | }
473 | ],
474 | "source": [
475 | "results.query(\"BestEnergy < -11\")"
476 | ]
477 | },
478 | {
479 | "cell_type": "markdown",
480 | "metadata": {},
481 | "source": [
482 | "# Reformatting the pdbqt to be \"VINA LIKE\" and extracting the best pose"
483 | ]
484 | },
485 | {
486 | "cell_type": "markdown",
487 | "metadata": {},
488 | "source": [
489 | "Vina pdbqt files contain a line with the result in the format `REMARK VINA RESULT: X X X`. \n",
490 | "This code adds it by parsing the log files."
491 | ]
492 | },
493 | {
494 | "cell_type": "code",
495 | "execution_count": 70,
496 | "metadata": {},
497 | "outputs": [
498 | {
499 | "name": "stderr",
500 | "output_type": "stream",
501 | "text": [
502 | "100%|██████████| 13530/13530 [00:06<00:00, 1953.88it/s]\n"
503 | ]
504 | }
505 | ],
506 | "source": [
507 | "\n",
508 | "def reformat_and_extract_best_pose(row):\n",
509 | "    file=row.iloc[0]\n",
510 | "    basename = pathlib.Path(file).stem\n",
511 | "    outfolder = f\"all_poses/{basename}\"\n",
512 | "    outputposes = f\"{outfolder}/{basename}.pdbqt\"\n",
513 | "    outputlog = f\"{outfolder}/{basename}.log\"\n",
514 | "    regex_results = re.compile(r\"^([0-9]+) +(-?[0-9]+\\.[0-9]) +(-?[0-9]+\\.[0-9]+) +(-?[0-9]+\\.[0-9]+)\")\n",
515 | "\n",
516 | "    #default success = False. Only change it when it worked.\n",
517 | "    success = False\n",
518 | "    best_energy = np.nan\n",
519 | "    #Check if the logfile already exists\n",
520 | "    if os.path.exists(outputlog):\n",
521 | "        with open(outputlog, \"r\") as log:\n",
522 | "            stdout = log.readlines()\n",
523 | "            success = True\n",
524 | "    \n",
525 | "\n",
526 | "    if success == True:\n",
527 | "        #Get the energy and RMSD bounds of every mode\n",
528 | "        scores = {}\n",
529 | "        for line in stdout:\n",
530 | "            match = regex_results.findall(line)\n",
531 | "            if len(match)>0:\n",
532 | "                model=match[0][0]\n",
533 | "                energy=float(match[0][1])\n",
534 | "                rmsdLB=float(match[0][2])\n",
535 | "                rmsdUB=float(match[0][3])\n",
536 | "                scores[model]=(energy, rmsdLB, rmsdUB)\n",
537 | "\n",
538 | "    if success:\n",
539 | "        newpdbqt = []\n",
540 | "        best_model = []\n",
541 | "        with open(outputposes,\"r\") as pdbqt:\n",
542 | "            lines = pdbqt.readlines()\n",
543 | "        model=0\n",
544 | "        vinaLineAdded = False\n",
545 | "        for line in lines:\n",
546 | "            newpdbqt.append(line)\n",
547 | "            if line.startswith(\"MODEL\"):\n",
548 | "                model = line.strip().split(\" \")[-1]\n",
549 | "                energy = float(scores[model][0])\n",
550 | "                rmsdLB = float(scores[model][1])\n",
551 | "                rmsdUB = float(scores[model][2])\n",
552 | "                VINALINE = f\"REMARK VINA RESULT: {energy:>10.1f} {rmsdLB:>10.3f} {rmsdUB:>10.3f}\\n\"\n",
553 | "                newpdbqt.append(VINALINE)\n",
554 | "            \n",
555 | "            if model == \"1\":\n",
556 | "                best_model.append(line)\n",
557 | "                if vinaLineAdded == False:\n",
558 | "                    best_model.append(VINALINE)\n",
559 | "                    vinaLineAdded = True\n",
560 | "        \n",
561 | "\n",
562 | "        with open(f\"{outfolder}/{basename}_vinaFormat.pdbqt\",'w') as vinaout:\n",
563 | "            for line in newpdbqt:\n",
564 | "                vinaout.write(line)\n",
565 | "\n",
566 | "        with open(f\"{outfolder}/{basename}_bestpose.pdbqt\",'w') as bestout:\n",
567 | "            for line in best_model:\n",
568 | "                bestout.write(line)\n",
569 | "\n",
570 | "_ = df.progress_apply(lambda x: reformat_and_extract_best_pose(x), axis=1)"
571 | ]
572 | },
573 | {
574 | "cell_type": "code",
575 | "execution_count": null,
576 | "metadata": {},
577 | "outputs": [],
578 | "source": []
579 | }
580 | ],
581 | "metadata": {
582 | "kernelspec": {
583 | "display_name": "Python 3.11.0 ('rdkit')",
584 | "language": "python",
585 | "name": "python3"
586 | },
587 | "language_info": {
588 | "codemirror_mode": {
589 | "name": "ipython",
590 | "version": 3
591 | },
592 | "file_extension": ".py",
593 | "mimetype": "text/x-python",
594 | "name": "python",
595 | "nbconvert_exporter": "python",
596 | "pygments_lexer": "ipython3",
597 | "version": "3.11.0"
598 | },
599 | "orig_nbformat": 4,
600 | "vscode": {
601 | "interpreter": {
602 | "hash": "87f1d35410e59831f3176a5f724881fcc4c468da76b136c6c7be7d82e22461b6"
603 | }
604 | }
605 | },
606 | "nbformat": 4,
607 | "nbformat_minor": 2
608 | }
609 |
--------------------------------------------------------------------------------
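The log-parsing and `REMARK VINA RESULT` steps of the notebook can be tried in isolation, without running smina. The sketch below reuses the notebook's `regex_results` pattern and f-string format; the helper names `parse_scores` and `vina_remark` and the `sample_log` lines are hypothetical, invented for illustration, and assume the log's result rows start at the first column, as the notebook's regexes expect.

```python
import re

# Same pattern as regex_results in the notebook:
# mode number, affinity, RMSD lower bound, RMSD upper bound.
REGEX_RESULTS = re.compile(
    r"^([0-9]+) +(-?[0-9]+\.[0-9]) +(-?[0-9]+\.[0-9]+) +(-?[0-9]+\.[0-9]+)"
)


def parse_scores(log_lines):
    """Return {mode: (energy, rmsd_lb, rmsd_ub)} for every result row found."""
    scores = {}
    for line in log_lines:
        match = REGEX_RESULTS.match(line)
        if match:
            model, energy, rmsd_lb, rmsd_ub = match.groups()
            scores[model] = (float(energy), float(rmsd_lb), float(rmsd_ub))
    return scores


def vina_remark(energy, rmsd_lb, rmsd_ub):
    """Build the remark line with the same format string as the notebook."""
    return f"REMARK VINA RESULT: {energy:>10.1f} {rmsd_lb:>10.3f} {rmsd_ub:>10.3f}\n"


# Hypothetical log excerpt, made up for this example (not real docking output).
sample_log = [
    "mode |   affinity | dist from best mode",
    "-----+------------+----------+----------",
    "1       -11.2      0.000      0.000",
    "2       -10.8      1.532      2.417",
]

scores = parse_scores(sample_log)
print(vina_remark(*scores["1"]), end="")
```

Keys of `scores` are kept as strings because the notebook looks them up with the text taken from each `MODEL` line of the pose file.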