├── LICENSE
├── README.md
├── bindcraft.py
├── bindcraft.slurm
├── example
│   └── PDL1.pdb
├── functions
│   ├── DAlphaBall.gcc
│   ├── __init__.py
│   ├── biopython_utils.py
│   ├── colabdesign_utils.py
│   ├── dssp
│   ├── generic_utils.py
│   └── pyrosetta_utils.py
├── install_bindcraft.sh
├── notebooks
│   └── BindCraft.ipynb
├── pipeline.png
├── settings_advanced
│   ├── betasheet_4stage_multimer.json
│   ├── betasheet_4stage_multimer_flexible.json
│   ├── betasheet_4stage_multimer_flexible_hardtarget.json
│   ├── betasheet_4stage_multimer_hardtarget.json
│   ├── betasheet_4stage_multimer_mpnn.json
│   ├── betasheet_4stage_multimer_mpnn_flexible.json
│   ├── betasheet_4stage_multimer_mpnn_flexible_hardtarget.json
│   ├── betasheet_4stage_multimer_mpnn_hardtarget.json
│   ├── default_4stage_multimer.json
│   ├── default_4stage_multimer_flexible.json
│   ├── default_4stage_multimer_flexible_hardtarget.json
│   ├── default_4stage_multimer_hardtarget.json
│   ├── default_4stage_multimer_mpnn.json
│   ├── default_4stage_multimer_mpnn_flexible.json
│   ├── default_4stage_multimer_mpnn_flexible_hardtarget.json
│   ├── default_4stage_multimer_mpnn_hardtarget.json
│   ├── peptide_3stage_multimer.json
│   ├── peptide_3stage_multimer_flexible.json
│   ├── peptide_3stage_multimer_mpnn.json
│   └── peptide_3stage_multimer_mpnn_flexible.json
├── settings_filters
│   ├── default_filters.json
│   ├── no_filters.json
│   ├── peptide_filters.json
│   ├── peptide_relaxed_filters.json
│   └── relaxed_filters.json
└── settings_target
    └── PDL1.json
/LICENSE:
--------------------------------------------------------------------------------
1 | MIT License
2 |
3 | Copyright (c) 2024 Martin Pacesa
4 |
5 | Permission is hereby granted, free of charge, to any person obtaining a copy
6 | of this software and associated documentation files (the "Software"), to deal
7 | in the Software without restriction, including without limitation the rights
8 | to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
9 | copies of the Software, and to permit persons to whom the Software is
10 | furnished to do so, subject to the following conditions:
11 |
12 | The above copyright notice and this permission notice shall be included in all
13 | copies or substantial portions of the Software.
14 |
15 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
16 | IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
17 | FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
18 | AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
19 | LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
20 | OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
21 | SOFTWARE.
22 |
--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
1 | # BindCraft
2 | 
3 |
4 | Simple binder design pipeline using AlphaFold2 backpropagation, MPNN, and PyRosetta. Select your target and let the script do the rest of the work and finish once you have enough designs to order!
5 |
6 | [Preprint link for BindCraft](https://www.biorxiv.org/content/10.1101/2024.09.30.615802)
7 |
8 | ## Installation
9 | First you need to clone this repository. Replace **[install_folder]** with the path where you want to install it.
10 |
11 | `git clone https://github.com/martinpacesa/BindCraft [install_folder]`
12 |
13 | Then navigate into your install folder using *cd* and run the installation code. BindCraft requires a CUDA-compatible Nvidia graphics card to run. In the *cuda* setting, please specify the CUDA version compatible with your graphics card, for example '11.8'. If unsure, leave it blank, but the installation might then select the wrong version, which will lead to errors. In *pkg_manager*, specify whether you are using 'mamba' or 'conda'; if left blank, it will use 'conda' by default.
14 |
15 | Note: This install script will install PyRosetta, which requires a license for commercial purposes. The code requires about 2 MB of storage space, while the AlphaFold2 weights take up about 5.3 GB.
16 |
17 | `bash install_bindcraft.sh --cuda '12.4' --pkg_manager 'conda'`
18 |
19 | ## Google Colab
20 |
21 |
22 |
23 | We prepared a convenient Google Colab notebook to test the BindCraft code functionalities. However, as the pipeline requires a significant amount of GPU memory to run for larger target+binder complexes, we highly recommend running it using a local installation and at least 32 GB of GPU memory.
24 |
25 | **Always try to trim the input target PDB to the smallest size possible! It will significantly speed up the binder generation and minimise the GPU memory requirements.**
26 |
27 | **Be ready to run at least a few hundred trajectories to see some accepted binders, for difficult targets it might even be a few thousand.**
28 |
29 |
30 | ## Running the script locally and explanation of settings
31 | To run the script locally, first configure your target .json file in the *settings_target* folder. The JSON file contains the following settings:
32 |
33 | ```
34 | design_path -> path where to save designs and statistics
35 | binder_name -> what to prefix your designed binder files with
36 | starting_pdb -> the path to the PDB of your target protein
37 | chains -> which chains to target in your protein, rest will be ignored
38 | target_hotspot_residues -> which position to target for binder design, for example `1,2-10` or chain specific `A1-10,B1-20` or entire chains `A`, set to null if you want AF2 to select binding site; better to select multiple target residues or a small patch to reduce search space for binder
39 | lengths -> range of binder lengths to design
40 | number_of_final_designs -> how many designs that pass all filters to aim for, script will stop if this many are reached
41 | ```
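
As a rough illustration, a target settings file with the keys listed above could be generated as follows. All values here (paths, hotspot, lengths, number of designs) are placeholders chosen for this sketch rather than recommended settings; the bundled *settings_target/PDL1.json* remains the authoritative example.

```python
import json

# Illustrative values only: every path and number below is a placeholder.
target_settings = {
    "design_path": "./designs/PDL1/",      # where designs and statistics are saved
    "binder_name": "PDL1",                 # prefix for designed binder files
    "starting_pdb": "./example/PDL1.pdb",  # PDB of the target protein
    "chains": "A",                         # chain(s) to target, the rest is ignored
    "target_hotspot_residues": "A56",      # use None (JSON null) to let AF2 pick the site
    "lengths": [65, 150],                  # range of binder lengths to design
    "number_of_final_designs": 100,        # stop once this many designs pass all filters
}

# Serialise to JSON text; write this to a file in settings_target/ to use it.
settings_json = json.dumps(target_settings, indent=4)
print(settings_json)
```

Writing the file from a script like this is optional; editing the JSON by hand works just as well.
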
42 | Then run the binder design script:
43 |
44 | `sbatch ./bindcraft.slurm --settings './settings_target/PDL1.json' --filters './settings_filters/default_filters.json' --advanced './settings_advanced/default_4stage_multimer.json'`
45 |
46 | The *settings* flag should point to your target .json which you set above. The *filters* flag points to the json where the design filters are specified (default is ./settings_filters/default_filters.json). The *advanced* flag points to your advanced settings (default is ./settings_advanced/default_4stage_multimer.json). If you leave out the filters and advanced settings flags, the script automatically uses these defaults.
47 |
48 | Alternatively, if your machine does not support SLURM, you can run the code directly by activating the environment in conda and running the python code:
49 |
50 | ```
51 | conda activate BindCraft
52 | cd /path/to/bindcraft/folder/
53 | python -u ./bindcraft.py --settings './settings_target/PDL1.json' --filters './settings_filters/default_filters.json' --advanced './settings_advanced/default_4stage_multimer.json'
54 | ```
55 |
56 | **We recommend generating at least 100 final designs passing all filters, then ordering the top 5-20 for experimental characterisation.** If high-affinity binders are required, it is better to screen more, as the ipTM metric used for ranking is not a good predictor of affinity, but has been shown to be a good binary predictor of binding.
57 |
58 | Below are explanations for individual filters and advanced settings.
59 |
60 | ## Advanced settings
61 | Here are the advanced settings controlling the design process:
62 |
63 | ```
64 | omit_AAs -> which amino acids to exclude from design (note: they can still occur if no other options are possible in the position)
65 | force_reject_AA -> whether to force reject design if it contains any amino acids specified in omit_AAs
66 | design_algorithm -> which design algorithm to use for the trajectory, the currently implemented algorithms are below
67 | use_multimer_design -> whether to use AF2-ptm or AF2-multimer for binder design; the other model will be used for validation then
68 | num_recycles_design -> how many recycles of AF2 for design
69 | num_recycles_validation -> how many recycles of AF2 to use for structure prediction and validation
70 | sample_models = True -> whether to randomly sample parameters from AF2 models, recommended to avoid overfitting
71 | rm_template_seq_design -> remove target template sequence for design (increases target flexibility)
72 | rm_template_seq_predict -> remove target template sequence for reprediction (increases target flexibility)
73 | rm_template_sc_design -> remove sidechains from target template for design
74 | rm_template_sc_predict -> remove sidechains from target template for reprediction
75 | predict_initial_guess -> Introduce bias by providing binder atom positions as a starting point for prediction. Recommended if designs fail after MPNN optimization.
76 | predict_bigbang -> Introduce atom position bias into the structure module for atom initialisation. Recommended if target and design are large (more than 600 amino acids).
77 |
78 | # Design iterations
79 | soft_iterations -> number of soft iterations (all amino acids considered at all positions)
80 | temporary_iterations -> number of temporary iterations (softmax, most probable amino acids considered at all positions)
81 | hard_iterations -> number of hard iterations (one hot encoding, single amino acids considered at all positions)
82 | greedy_iterations -> number of iterations to sample random mutations from PSSM that reduce loss
83 | greedy_percentage -> What percentage of protein length to mutate during each greedy iteration
84 |
85 | # Design weights, higher value puts more weight on optimising the parameter.
86 | weights_plddt -> Design weight - pLDDT of designed chain
87 | weights_pae_intra -> Design weight - PAE within designed chain
88 | weights_pae_inter -> Design weight - PAE between chains
89 | weights_con_intra -> Design weight - maximise number of contacts within designed chain
90 | weights_con_inter -> Design weight - maximise number of contacts between chains
91 | intra_contact_distance -> Cbeta-Cbeta cutoff distance for contacts within the binder
92 | inter_contact_distance -> Cbeta-Cbeta cutoff distance for contacts between binder and target
93 | intra_contact_number -> how many contacts each contact residue should make within a chain, excluding immediate neighbours
94 | inter_contact_number -> how many contacts each contact residue should make between chains
95 | weights_helicity -> Design weight - helix propensity of the design, Default 0, negative values bias towards beta sheets
96 | random_helicity -> whether to randomly sample helicity weights for trajectories, from -1 to 1
97 |
98 | # Additional losses
99 | use_i_ptm_loss -> Use i_ptm loss to optimise for interface pTM score?
100 | weights_iptm -> Design weight - i_ptm between chains
101 | use_rg_loss -> use radius of gyration loss?
102 | weights_rg -> Design weight - radius of gyration weight for binder
103 | use_termini_distance_loss -> Try to minimise distance between N- and C-terminus of binder? Helpful for grafting
104 | weights_termini_loss -> Design weight - N- and C-terminus distance minimisation weight of binder
105 |
106 | # MPNN settings
107 | mpnn_fix_interface -> whether to fix the interface designed in the starting trajectory
108 | num_seqs -> number of MPNN generated sequences to sample and predict per binder
109 | max_mpnn_sequences -> how many maximum MPNN sequences per trajectory to save if several pass filters
110 | max_tm-score_filter -> filter out final lower ranking designs by this TM score cut off relative to all passing designs
111 | max_seq-similarity_filter -> filter out final lower ranking designs by this sequence similarity cut off relative to all passing designs
112 | sampling_temp = 0.1 -> sampling temperature for amino acids, T=0.0 means taking argmax, T>>1.0 means sampling randomly.
113 |
114 | # MPNN settings - advanced
115 | sample_seq_parallel -> how many sequences to sample in parallel, reduce if running out of memory
116 | backbone_noise -> backbone noise during sampling, 0.00-0.02 are good values
117 | model_path -> path to the MPNN model weights
118 | mpnn_weights -> whether to use "original" mpnn weights or "soluble" weights
119 | save_mpnn_fasta -> whether to save MPNN sequences as fasta files, normally not needed as the sequence is also in the CSV file
120 |
121 | # AF2 design settings - advanced
122 | num_recycles_design -> how many recycles of AF2 for design
123 | num_recycles_validation -> how many recycles of AF2 to use for structure prediction and validation
124 | optimise_beta -> optimise predictions if beta sheeted trajectory detected?
125 | optimise_beta_extra_soft -> how many extra soft iterations to add if beta sheets detected
126 | optimise_beta_extra_temp -> how many extra temporary iterations to add if beta sheets detected
127 | optimise_beta_recycles_design -> how many recycles to do during design if beta sheets detected
128 | optimise_beta_recycles_valid -> how many recycles to do during reprediction if beta sheets detected
129 |
130 | # Optimise script
131 | remove_unrelaxed_trajectory -> remove the PDB files of unrelaxed designed trajectories, relaxed PDBs are retained
132 | remove_unrelaxed_complex -> remove the PDB files of unrelaxed predicted MPNN-optimised complexes, relaxed PDBs are retained
133 | remove_binder_monomer -> remove the PDB files of predicted binder monomers after scoring to save space
134 | zip_animations -> at the end, zip Animations trajectory folder to save space
135 | zip_plots -> at the end, zip Plots trajectory folder to save space
136 | save_trajectory_pickle -> save pickle file of the generated trajectory, careful, takes up a lot of storage space!
137 | max_trajectories -> how many maximum trajectories to generate, for benchmarking
138 | acceptance_rate -> what fraction of trajectories should yield designs passing the filters, if the proportion of successful designs is less than this fraction then the script will stop and you should adjust your design weights
139 | start_monitoring -> after what number of trajectories should we start monitoring acceptance_rate, do not set too low, could terminate prematurely
140 |
141 | # debug settings
142 | enable_mpnn = True -> whether to enable MPNN design
143 | enable_rejection_check -> enable rejection rate check
144 | ```
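
The interplay of *acceptance_rate* and *start_monitoring* described above can be sketched as follows. This is a minimal illustration, not the pipeline's own check: the function name `should_stop` and the exact comparisons are assumptions made for this example.

```python
def should_stop(n_trajectories, n_accepted, acceptance_rate, start_monitoring):
    """Hypothetical helper: True if too few trajectories yield passing designs."""
    # Before start_monitoring trajectories have been run, the observed rate is
    # too noisy to act on, so never stop early.
    if n_trajectories == 0 or n_trajectories < start_monitoring:
        return False
    # Stop when the observed fraction of accepted designs is below the target.
    return (n_accepted / n_trajectories) < acceptance_rate


print(should_stop(100, 0, 0.01, 600))   # monitoring has not started yet
print(should_stop(600, 3, 0.01, 600))   # 3/600 = 0.005, below 0.01
print(should_stop(600, 12, 0.01, 600))  # 12/600 = 0.02, above 0.01
```

Setting *start_monitoring* too low makes the first branch short-lived, which is why a run could terminate prematurely on an unlucky start.
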
145 |
146 | ## Filters
147 | These are the features by which your designs are filtered. If you don't want to use a feature, set its threshold to *null*. The *higher* option indicates whether values higher than the threshold should be kept (true) or lower (false). Features starting with N_ correspond to statistics for each individual AlphaFold model; Averages are across all predicted models.
148 | ```
149 | MPNN_score -> MPNN sequence score, generally not recommended as it depends on protein
150 | MPNN_seq_recovery -> MPNN sequence recovery of original trajectory
151 | pLDDT -> pLDDT confidence score of AF2 complex prediction, normalised to 0-1
152 | pTM -> pTM confidence score of AF2 complex prediction, normalised to 0-1
153 | i_pTM -> interface pTM confidence score of AF2 complex prediction, normalised to 0-1
154 | pAE -> predicted alignment error of AF2 complex prediction, normalised compared to AF2 by n/31 to 0-1
155 | i_pAE -> predicted interface alignment error of AF2 complex prediction, normalised compared to AF2 by n/31 to 0-1
156 | i_pLDDT -> interface pLDDT confidence score of AF2 complex prediction, normalised to 0-1
157 | ss_pLDDT -> secondary structure pLDDT confidence score of AF2 complex prediction, normalised to 0-1
158 | Unrelaxed_Clashes -> number of interface clashes before relaxation
159 | Relaxed_Clashes -> number of interface clashes after relaxation
160 | Binder_Energy_Score -> Rosetta energy score for binder alone
161 | Surface_Hydrophobicity -> surface hydrophobicity fraction for binder
162 | ShapeComplementarity -> interface shape complementarity
163 | PackStat -> interface packstat rosetta score
164 | dG -> interface rosetta dG energy
165 | dSASA -> interface delta SASA (size)
166 | dG/dSASA -> interface energy divided by interface size
167 | Interface_SASA_% -> Fraction of binder surface covered by the interface
168 | Interface_Hydrophobicity -> Interface hydrophobicity fraction of binder interface
169 | n_InterfaceResidues -> number of interface residues
170 | n_InterfaceHbonds -> number of hydrogen bonds at the interface
171 | InterfaceHbondsPercentage -> number of hydrogen bonds compared to interface size
172 | n_InterfaceUnsatHbonds -> number of unsatisfied buried hydrogen bonds at the interface
173 | InterfaceUnsatHbondsPercentage -> number of unsatisfied buried hydrogen bonds compared to interface size
174 | Interface_Helix% -> proportion of alpha helices at the interface
175 | Interface_BetaSheet% -> proportion of beta sheets at the interface
176 | Interface_Loop% -> proportion of loops at the interface
177 | Binder_Helix% -> proportion of alpha helices in the binder structure
178 | Binder_BetaSheet% -> proportion of beta sheets in the binder structure
179 | Binder_Loop% -> proportion of loops in the binder structure
180 | InterfaceAAs -> number of amino acids of each type at the interface
181 | HotspotRMSD -> unaligned RMSD of binder compared to original trajectory, in other words how far is binder in the repredicted complex from the original binding site
182 | Target_RMSD -> RMSD of target predicted in context of the designed binder compared to input PDB
183 | Binder_pLDDT -> pLDDT confidence score of binder predicted alone
184 | Binder_pTM -> pTM confidence score of binder predicted alone
185 | Binder_pAE -> predicted alignment error of binder predicted alone
186 | Binder_RMSD -> RMSD of binder predicted alone compared to original trajectory
187 | ```
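
The threshold semantics described above (*null* disables a filter, *higher* selects the direction) can be sketched like this. The function name, the dictionary layout, the inclusive comparisons and the example numbers are all assumptions for illustration; check the files in *settings_filters* for the real structure and values.

```python
def passes_filters(metrics, filters):
    """Hypothetical helper: return (passed, names_of_failed_filters)."""
    failed = []
    for name, rule in filters.items():
        threshold = rule.get("threshold")
        if threshold is None:
            continue  # null threshold: this feature is not used for filtering
        value = metrics.get(name)
        if value is None:
            failed.append(name)  # assumption: a missing metric counts as a failure
        elif rule["higher"] and value < threshold:
            failed.append(name)  # higher=true keeps values at or above the threshold
        elif not rule["higher"] and value > threshold:
            failed.append(name)  # higher=false keeps values at or below the threshold
    return (not failed), failed


# Made-up thresholds and metrics, for illustration only.
filters = {
    "pLDDT": {"threshold": 0.8, "higher": True},
    "i_pAE": {"threshold": 0.35, "higher": False},
    "MPNN_score": {"threshold": None, "higher": False},
}
metrics = {"pLDDT": 0.85, "i_pAE": 0.40, "MPNN_score": 1.2}

passed, failed = passes_filters(metrics, filters)
print(passed, failed)
```

In this toy case the design fails only on `i_pAE`, because 0.40 exceeds the 0.35 cutoff while `MPNN_score` is ignored.
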
188 |
189 | ## Implemented design algorithms
190 |
191 | - 2stage - design with logits->pssm_semigreedy (faster)
192 | - 3stage - design with logits->softmax(logits)->one-hot (standard)
193 | - 4stage - design with logits->softmax(logits)->one-hot->pssm_semigreedy (default, extensive)
194 | - greedy - design with random mutations that decrease loss (less memory intensive, slower, less efficient)
195 | - mcmc - design with random mutations that decrease loss, similar to Wicky et al. (less memory intensive, slower, less efficient)
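
To make the stage names above more concrete, the sketch below shows the three sequence representations on a toy example: raw logits, softmax(logits), and a one-hot encoding. It only illustrates the representations; it is not ColabDesign's implementation, and the four-letter alphabet and helper names are assumptions for brevity.

```python
import math


def softmax(row):
    """Convert one position's logits into a probability distribution."""
    m = max(row)  # subtract the maximum for numerical stability
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def one_hot(row):
    """Keep only the most probable letter at one position."""
    best = max(range(len(row)), key=row.__getitem__)
    return [1.0 if i == best else 0.0 for i in range(len(row))]


# Toy binder: 2 positions, 4-letter alphabet instead of 20 amino acids.
logits = [[2.0, 0.5, -1.0, 0.0],
          [0.1, 0.2, 3.0, -0.5]]

soft = [softmax(r) for r in logits]  # "softmax" stage: probabilities per position
hard = [one_hot(r) for r in soft]    # "one-hot" stage: a single letter per position

print(soft)
print(hard)
```
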
196 |
197 |
198 | ## Known limitations
199 |
200 | - Settings might not work for all targets! Number of iterations, design weights, and/or filters might have to be adjusted. Target site selection is also important, but AF2 is very good at detecting good binding sites if no hotspot is specified.
201 | - AF2 is worse at predicting/designing hydrophilic interfaces than hydrophobic ones.
202 | - Sometimes the trajectories can end up being deformed or 'squashed'. This is normal for AF2 multimer design, as it is very sensitive to the sequence input, and it cannot be avoided without model retraining. However, these trajectories are quickly detected and discarded.
203 |
204 |
205 | ## Credits
206 | Thanks to Lennart Nickel, Yehlin Cho, Casper Goverde, and Sergey Ovchinnikov for help with coding and discussing ideas. This repository uses code from:
207 |
208 | - Sergey Ovchinnikov's ColabDesign (https://github.com/sokrypton/ColabDesign)
209 | - Justas Dauparas's ProteinMPNN (https://github.com/dauparas/ProteinMPNN)
210 | - PyRosetta (https://github.com/RosettaCommons/PyRosetta.notebooks)
211 |
212 |
--------------------------------------------------------------------------------
/bindcraft.py:
--------------------------------------------------------------------------------
1 | ####################################
2 | ###################### BindCraft Run
3 | ####################################
4 | ### Import dependencies
5 | from functions import *
6 |
7 | # Check if JAX-capable GPU is available, otherwise exit
8 | check_jax_gpu()
9 |
10 | ######################################
11 | ### parse input paths
12 | parser = argparse.ArgumentParser(description='Script to run BindCraft binder design.')
13 |
14 | parser.add_argument('--settings', '-s', type=str, required=True,
15 |                     help='Path to the basic settings.json file. Required.')
16 | parser.add_argument('--filters', '-f', type=str, default='./settings_filters/default_filters.json',
17 |                     help='Path to the filters.json file used to filter design. If not provided, default will be used.')
18 | parser.add_argument('--advanced', '-a', type=str, default='./settings_advanced/default_4stage_multimer.json',
19 |                     help='Path to the advanced.json file with additional design settings. If not provided, default will be used.')
20 |
21 | args = parser.parse_args()
22 |
23 | # perform checks of input setting files
24 | settings_path, filters_path, advanced_path = perform_input_check(args)
25 |
26 | ### load settings from JSON
27 | target_settings, advanced_settings, filters = load_json_settings(settings_path, filters_path, advanced_path)
28 |
29 | settings_file = os.path.basename(settings_path).split('.')[0]
30 | filters_file = os.path.basename(filters_path).split('.')[0]
31 | advanced_file = os.path.basename(advanced_path).split('.')[0]
32 |
33 | ### load AF2 model settings
34 | design_models, prediction_models, multimer_validation = load_af2_models(advanced_settings["use_multimer_design"])
35 |
36 | ### perform checks on advanced_settings
37 | bindcraft_folder = os.path.dirname(os.path.realpath(__file__))
38 | advanced_settings = perform_advanced_settings_check(advanced_settings, bindcraft_folder)
39 |
40 | ### generate directories, design path names can be found within the function
41 | design_paths = generate_directories(target_settings["design_path"])
42 |
43 | ### generate dataframes
44 | trajectory_labels, design_labels, final_labels = generate_dataframe_labels()
45 |
46 | trajectory_csv = os.path.join(target_settings["design_path"], 'trajectory_stats.csv')
47 | mpnn_csv = os.path.join(target_settings["design_path"], 'mpnn_design_stats.csv')
48 | final_csv = os.path.join(target_settings["design_path"], 'final_design_stats.csv')
49 | failure_csv = os.path.join(target_settings["design_path"], 'failure_csv.csv')
50 |
51 | create_dataframe(trajectory_csv, trajectory_labels)
52 | create_dataframe(mpnn_csv, design_labels)
53 | create_dataframe(final_csv, final_labels)
54 | generate_filter_pass_csv(failure_csv, args.filters)
55 |
56 | ####################################
57 | ####################################
58 | ####################################
59 | ### initialise PyRosetta
60 | pr.init(f'-ignore_unrecognized_res -ignore_zero_occupancy -mute all -holes:dalphaball {advanced_settings["dalphaball_path"]} -corrections::beta_nov16 true -relax:default_repeats 1')
61 | print(f"Running binder design for target {settings_file}")
62 | print(f"Design settings used: {advanced_file}")
63 | print(f"Filtering designs based on {filters_file}")
64 |
65 | ####################################
66 | # initialise counters
67 | script_start_time = time.time()
68 | trajectory_n = 1
69 | accepted_designs = 0
70 |
71 | ### start design loop
72 | while True:
73 |     ### check if we have the target number of binders
74 |     final_designs_reached = check_accepted_designs(design_paths, mpnn_csv, final_labels, final_csv, advanced_settings, target_settings, design_labels)
75 |
76 |     if final_designs_reached:
77 |         # stop design loop execution
78 |         break
79 |
80 |     ### check if we reached maximum allowed trajectories
81 |     max_trajectories_reached = check_n_trajectories(design_paths, advanced_settings)
82 |
83 |     if max_trajectories_reached:
84 |         break
85 |
86 |     ### Initialise design
87 |     # measure time to generate design
88 |     trajectory_start_time = time.time()
89 |
90 |     # generate random seed to vary designs
91 |     seed = int(np.random.randint(0, high=999999, size=1, dtype=int)[0])
92 |
93 |     # sample binder design length randomly from defined distribution
94 |     samples = np.arange(min(target_settings["lengths"]), max(target_settings["lengths"]) + 1)
95 |     length = np.random.choice(samples)
96 |
97 |     # load desired helicity value to sample different secondary structure contents
98 |     helicity_value = load_helicity(advanced_settings)
99 |
100 |     # generate design name and check if same trajectory was already run
101 |     design_name = target_settings["binder_name"] + "_l" + str(length) + "_s"+ str(seed)
102 |     trajectory_dirs = ["Trajectory", "Trajectory/Relaxed", "Trajectory/LowConfidence", "Trajectory/Clashing"]
103 |     trajectory_exists = any(os.path.exists(os.path.join(design_paths[trajectory_dir], design_name + ".pdb")) for trajectory_dir in trajectory_dirs)
104 |
105 |     if not trajectory_exists:
106 |         print("Starting trajectory: "+design_name)
107 |
108 |         ### Begin binder hallucination
109 |         trajectory = binder_hallucination(design_name, target_settings["starting_pdb"], target_settings["chains"],
110 |                                           target_settings["target_hotspot_residues"], length, seed, helicity_value,
111 |                                           design_models, advanced_settings, design_paths, failure_csv)
112 |         trajectory_metrics = copy_dict(trajectory._tmp["best"]["aux"]["log"]) # contains plddt, ptm, i_ptm, pae, i_pae
113 |         trajectory_pdb = os.path.join(design_paths["Trajectory"], design_name + ".pdb")
114 |
115 |         # round the metrics to two decimal places
116 |         trajectory_metrics = {k: round(v, 2) if isinstance(v, float) else v for k, v in trajectory_metrics.items()}
117 |
118 |         # time trajectory
119 |         trajectory_time = time.time() - trajectory_start_time
120 |         trajectory_time_text = f"{'%d hours, %d minutes, %d seconds' % (int(trajectory_time // 3600), int((trajectory_time % 3600) // 60), int(trajectory_time % 60))}"
121 |         print("Starting trajectory took: "+trajectory_time_text)
122 |         print("")
123 |
124 |         # Proceed if there is no trajectory termination signal
125 |         if trajectory.aux["log"]["terminate"] == "":
126 |             # Relax binder to calculate statistics
127 |             trajectory_relaxed = os.path.join(design_paths["Trajectory/Relaxed"], design_name + ".pdb")
128 |             pr_relax(trajectory_pdb, trajectory_relaxed)
129 |
130 |             # define binder chain, placeholder in case multi-chain parsing in ColabDesign gets changed
131 |             binder_chain = "B"
132 |
133 |             # Calculate clashes before and after relaxation
134 |             num_clashes_trajectory = calculate_clash_score(trajectory_pdb)
135 |             num_clashes_relaxed = calculate_clash_score(trajectory_relaxed)
136 |
137 |             # secondary structure content of starting trajectory binder and interface
138 |             trajectory_alpha, trajectory_beta, trajectory_loops, trajectory_alpha_interface, trajectory_beta_interface, trajectory_loops_interface, trajectory_i_plddt, trajectory_ss_plddt = calc_ss_percentage(trajectory_pdb, advanced_settings, binder_chain)
139 |
140 |             # analyze interface scores for relaxed af2 trajectory
141 |             trajectory_interface_scores, trajectory_interface_AA, trajectory_interface_residues = score_interface(trajectory_relaxed, binder_chain)
142 |
143 |             # starting binder sequence
144 |             trajectory_sequence = trajectory.get_seq(get_best=True)[0]
145 |
146 |             # analyze sequence
147 |             traj_seq_notes = validate_design_sequence(trajectory_sequence, num_clashes_relaxed, advanced_settings)
148 |
149 |             # target structure RMSD compared to input PDB
150 |             trajectory_target_rmsd = target_pdb_rmsd(trajectory_pdb, target_settings["starting_pdb"], target_settings["chains"])
151 |
152 |             # save trajectory statistics into CSV
153 |             trajectory_data = [design_name, advanced_settings["design_algorithm"], length, seed, helicity_value, target_settings["target_hotspot_residues"], trajectory_sequence, trajectory_interface_residues,
154 |                                trajectory_metrics['plddt'], trajectory_metrics['ptm'], trajectory_metrics['i_ptm'], trajectory_metrics['pae'], trajectory_metrics['i_pae'],
155 |                                trajectory_i_plddt, trajectory_ss_plddt, num_clashes_trajectory, num_clashes_relaxed, trajectory_interface_scores['binder_score'],
156 |                                trajectory_interface_scores['surface_hydrophobicity'], trajectory_interface_scores['interface_sc'], trajectory_interface_scores['interface_packstat'],
157 |                                trajectory_interface_scores['interface_dG'], trajectory_interface_scores['interface_dSASA'], trajectory_interface_scores['interface_dG_SASA_ratio'],
158 |                                trajectory_interface_scores['interface_fraction'], trajectory_interface_scores['interface_hydrophobicity'], trajectory_interface_scores['interface_nres'], trajectory_interface_scores['interface_interface_hbonds'],
159 |                                trajectory_interface_scores['interface_hbond_percentage'], trajectory_interface_scores['interface_delta_unsat_hbonds'], trajectory_interface_scores['interface_delta_unsat_hbonds_percentage'],
160 |                                trajectory_alpha_interface, trajectory_beta_interface, trajectory_loops_interface, trajectory_alpha, trajectory_beta, trajectory_loops, trajectory_interface_AA, trajectory_target_rmsd,
161 |                                trajectory_time_text, traj_seq_notes, settings_file, filters_file, advanced_file]
162 |             insert_data(trajectory_csv, trajectory_data)
163 |
164 |             if advanced_settings["enable_mpnn"]:
165 |                 # initialise MPNN counters
166 |                 mpnn_n = 1
167 |                 accepted_mpnn = 0
168 |                 mpnn_dict = {}
169 |                 design_start_time = time.time()
170 |
171 |                 ### MPNN redesign of starting binder
172 |                 mpnn_trajectories = mpnn_gen_sequence(trajectory_pdb, binder_chain, trajectory_interface_residues, advanced_settings)
173 |                 existing_mpnn_sequences = set(pd.read_csv(mpnn_csv, usecols=['Sequence'])['Sequence'].values)
174 |
175 |                 # create set of MPNN sequences with allowed amino acid composition
176 |                 restricted_AAs = set(aa.strip().upper() for aa in advanced_settings["omit_AAs"].split(',')) if advanced_settings["force_reject_AA"] else set()
177 |
178 |                 mpnn_sequences = sorted({
179 |                     mpnn_trajectories['seq'][n][-length:]: {
180 |                         'seq': mpnn_trajectories['seq'][n][-length:],
181 |                         'score': mpnn_trajectories['score'][n],
182 |                         'seqid': mpnn_trajectories['seqid'][n]
183 |                     } for n in range(advanced_settings["num_seqs"])
184 |                     if (not restricted_AAs or not any(aa in mpnn_trajectories['seq'][n][-length:].upper() for aa in restricted_AAs))
185 |                     and mpnn_trajectories['seq'][n][-length:] not in existing_mpnn_sequences
186 |                 }.values(), key=lambda x: x['score'])
187 |
188 |                 del existing_mpnn_sequences
189 |
190 |                 # check whether any sequences are left after amino acid rejection and duplication check, and if yes proceed with prediction
191 |                 if mpnn_sequences:
192 |                     # add optimisation for increasing recycles if trajectory is beta sheeted
193 |                     if advanced_settings["optimise_beta"] and float(trajectory_beta) > 15:
194 |                         advanced_settings["num_recycles_validation"] = advanced_settings["optimise_beta_recycles_valid"]
195 |
196 |                     ### Compile prediction models once for faster prediction of MPNN sequences
197 |                     clear_mem()
198 |                     # compile complex prediction model
199 |                     complex_prediction_model = mk_afdesign_model(protocol="binder", num_recycles=advanced_settings["num_recycles_validation"], data_dir=advanced_settings["af_params_dir"],
200 |                                                                  use_multimer=multimer_validation, use_initial_guess=advanced_settings["predict_initial_guess"], use_initial_atom_pos=advanced_settings["predict_bigbang"])
201 |                     if advanced_settings["predict_initial_guess"] or advanced_settings["predict_bigbang"]:
202 |                         complex_prediction_model.prep_inputs(pdb_filename=trajectory_pdb, chain='A', binder_chain='B', binder_len=length, use_binder_template=True, rm_target_seq=advanced_settings["rm_template_seq_predict"],
203 |                                                              rm_target_sc=advanced_settings["rm_template_sc_predict"], rm_template_ic=True)
204 |                     else:
205 |                         complex_prediction_model.prep_inputs(pdb_filename=target_settings["starting_pdb"], chain=target_settings["chains"], binder_len=length, rm_target_seq=advanced_settings["rm_template_seq_predict"],
206 |                                                              rm_target_sc=advanced_settings["rm_template_sc_predict"])
207 |
208 |                     # compile binder monomer prediction model
209 |                     binder_prediction_model = mk_afdesign_model(protocol="hallucination", use_templates=False, initial_guess=False,
210 |                                                                 use_initial_atom_pos=False, num_recycles=advanced_settings["num_recycles_validation"],
211 |                                                                 data_dir=advanced_settings["af_params_dir"], use_multimer=multimer_validation)
212 |                     binder_prediction_model.prep_inputs(length=length)
213 |
214 |                     # iterate over designed sequences
215 |                     for mpnn_sequence in mpnn_sequences:
216 |                         mpnn_time = time.time()
217 |
218 |                         # generate mpnn design name numbering
219 |                         mpnn_design_name = design_name + "_mpnn" + str(mpnn_n)
220 |                         mpnn_score = round(mpnn_sequence['score'],2)
221 |                         mpnn_seqid = round(mpnn_sequence['seqid'],2)
222 |
223 |                         # add design to dictionary
224 |                         mpnn_dict[mpnn_design_name] = {'seq': mpnn_sequence['seq'], 'score': mpnn_score, 'seqid': mpnn_seqid}
225 |
226 |                         # save fasta sequence
227 |                         if advanced_settings["save_mpnn_fasta"] is True:
228 |                             save_fasta(mpnn_design_name, mpnn_sequence['seq'], design_paths)
229 |
230 |                         ### Predict mpnn redesigned binder complex using masked templates
231 |                         mpnn_complex_statistics, pass_af2_filters = predict_binder_complex(complex_prediction_model,
232 |                                                                                            mpnn_sequence['seq'], mpnn_design_name,
233 |                                                                                            target_settings["starting_pdb"], target_settings["chains"],
234 |                                                                                            length, trajectory_pdb, prediction_models, advanced_settings,
235 |                                                                                            filters, design_paths, failure_csv)
236 |
237 |                         # if AF2 filters are not passed then skip the scoring
238 |                         if not pass_af2_filters:
239 |                             print(f"Base AF2 filters not passed for {mpnn_design_name}, skipping interface scoring")
240 |                             mpnn_n += 1
241 |                             continue
242 |
243 |                         # calculate statistics for each model individually
244 |                         for model_num in prediction_models:
245 |                             mpnn_design_pdb = os.path.join(design_paths["MPNN"], f"{mpnn_design_name}_model{model_num+1}.pdb")
246 |                             mpnn_design_relaxed = os.path.join(design_paths["MPNN/Relaxed"], f"{mpnn_design_name}_model{model_num+1}.pdb")
247 |
248 |                             if os.path.exists(mpnn_design_pdb):
249 |                                 # Calculate clashes before and after relaxation
250 |                                 num_clashes_mpnn = calculate_clash_score(mpnn_design_pdb)
251 |                                 num_clashes_mpnn_relaxed = calculate_clash_score(mpnn_design_relaxed)
252 |
253 |                                 # analyze interface scores for relaxed af2 trajectory
254 |                                 mpnn_interface_scores, mpnn_interface_AA, mpnn_interface_residues = score_interface(mpnn_design_relaxed, binder_chain)
255 |
256 | # secondary structure content of starting trajectory binder
257 | mpnn_alpha, mpnn_beta, mpnn_loops, mpnn_alpha_interface, mpnn_beta_interface, mpnn_loops_interface, mpnn_i_plddt, mpnn_ss_plddt = calc_ss_percentage(mpnn_design_pdb, advanced_settings, binder_chain)
258 |
259 | # calculate unaligned RMSD to determine whether the binder is in the designed binding site
260 | rmsd_site = unaligned_rmsd(trajectory_pdb, mpnn_design_pdb, binder_chain, binder_chain)
261 |
262 | # calculate RMSD of target compared to input PDB
263 | target_rmsd = target_pdb_rmsd(mpnn_design_pdb, target_settings["starting_pdb"], target_settings["chains"])
264 |
265 | # add the additional statistics to the mpnn_complex_statistics dictionary
266 | mpnn_complex_statistics[model_num+1].update({
267 | 'i_pLDDT': mpnn_i_plddt,
268 | 'ss_pLDDT': mpnn_ss_plddt,
269 | 'Unrelaxed_Clashes': num_clashes_mpnn,
270 | 'Relaxed_Clashes': num_clashes_mpnn_relaxed,
271 | 'Binder_Energy_Score': mpnn_interface_scores['binder_score'],
272 | 'Surface_Hydrophobicity': mpnn_interface_scores['surface_hydrophobicity'],
273 | 'ShapeComplementarity': mpnn_interface_scores['interface_sc'],
274 | 'PackStat': mpnn_interface_scores['interface_packstat'],
275 | 'dG': mpnn_interface_scores['interface_dG'],
276 | 'dSASA': mpnn_interface_scores['interface_dSASA'],
277 | 'dG/dSASA': mpnn_interface_scores['interface_dG_SASA_ratio'],
278 | 'Interface_SASA_%': mpnn_interface_scores['interface_fraction'],
279 | 'Interface_Hydrophobicity': mpnn_interface_scores['interface_hydrophobicity'],
280 | 'n_InterfaceResidues': mpnn_interface_scores['interface_nres'],
281 | 'n_InterfaceHbonds': mpnn_interface_scores['interface_interface_hbonds'],
282 | 'InterfaceHbondsPercentage': mpnn_interface_scores['interface_hbond_percentage'],
283 | 'n_InterfaceUnsatHbonds': mpnn_interface_scores['interface_delta_unsat_hbonds'],
284 | 'InterfaceUnsatHbondsPercentage': mpnn_interface_scores['interface_delta_unsat_hbonds_percentage'],
285 | 'InterfaceAAs': mpnn_interface_AA,
286 | 'Interface_Helix%': mpnn_alpha_interface,
287 | 'Interface_BetaSheet%': mpnn_beta_interface,
288 | 'Interface_Loop%': mpnn_loops_interface,
289 | 'Binder_Helix%': mpnn_alpha,
290 | 'Binder_BetaSheet%': mpnn_beta,
291 | 'Binder_Loop%': mpnn_loops,
292 | 'Hotspot_RMSD': rmsd_site,
293 | 'Target_RMSD': target_rmsd
294 | })
295 |
296 | # save space by removing unrelaxed predicted mpnn complex pdb?
297 | if advanced_settings["remove_unrelaxed_complex"]:
298 | os.remove(mpnn_design_pdb)
299 |
300 | # calculate complex averages
301 | mpnn_complex_averages = calculate_averages(mpnn_complex_statistics, handle_aa=True)
302 |
303 | ### Predict binder alone in single sequence mode
304 | binder_statistics = predict_binder_alone(binder_prediction_model, mpnn_sequence['seq'], mpnn_design_name, length,
305 | trajectory_pdb, binder_chain, prediction_models, advanced_settings, design_paths)
306 |
307 | # extract RMSDs of binder to the original trajectory
308 | for model_num in prediction_models:
309 | mpnn_binder_pdb = os.path.join(design_paths["MPNN/Binder"], f"{mpnn_design_name}_model{model_num+1}.pdb")
310 |
311 | if os.path.exists(mpnn_binder_pdb):
312 | rmsd_binder = unaligned_rmsd(trajectory_pdb, mpnn_binder_pdb, binder_chain, "A")
313 |
314 | # append to statistics
315 | binder_statistics[model_num+1].update({
316 | 'Binder_RMSD': rmsd_binder
317 | })
318 |
319 | # save space by removing binder monomer models?
320 | if advanced_settings["remove_binder_monomer"]:
321 | os.remove(mpnn_binder_pdb)
322 |
323 | # calculate binder averages
324 | binder_averages = calculate_averages(binder_statistics)
325 |
326 | # analyze sequence to make sure there are no cysteines and that it contains residues that absorb UV for detection
327 | seq_notes = validate_design_sequence(mpnn_sequence['seq'], mpnn_complex_averages.get('Relaxed_Clashes', None), advanced_settings)
328 |
329 | # measure time to generate design
330 | mpnn_end_time = time.time() - mpnn_time
331 | elapsed_mpnn_text = f"{'%d hours, %d minutes, %d seconds' % (int(mpnn_end_time // 3600), int((mpnn_end_time % 3600) // 60), int(mpnn_end_time % 60))}"
332 |
333 |
334 | # Insert statistics about MPNN design into CSV, will return None if corresponding model does not exist
335 | model_numbers = range(1, 6)
336 | statistics_labels = ['pLDDT', 'pTM', 'i_pTM', 'pAE', 'i_pAE', 'i_pLDDT', 'ss_pLDDT', 'Unrelaxed_Clashes', 'Relaxed_Clashes', 'Binder_Energy_Score', 'Surface_Hydrophobicity',
337 | 'ShapeComplementarity', 'PackStat', 'dG', 'dSASA', 'dG/dSASA', 'Interface_SASA_%', 'Interface_Hydrophobicity', 'n_InterfaceResidues', 'n_InterfaceHbonds', 'InterfaceHbondsPercentage',
338 | 'n_InterfaceUnsatHbonds', 'InterfaceUnsatHbondsPercentage', 'Interface_Helix%', 'Interface_BetaSheet%', 'Interface_Loop%', 'Binder_Helix%',
339 | 'Binder_BetaSheet%', 'Binder_Loop%', 'InterfaceAAs', 'Hotspot_RMSD', 'Target_RMSD']
340 |
341 | # Initialize mpnn_data with the non-statistical data
342 | mpnn_data = [mpnn_design_name, advanced_settings["design_algorithm"], length, seed, helicity_value, target_settings["target_hotspot_residues"], mpnn_sequence['seq'], mpnn_interface_residues, mpnn_score, mpnn_seqid]
343 |
344 | # Add the statistical data for mpnn_complex
345 | for label in statistics_labels:
346 | mpnn_data.append(mpnn_complex_averages.get(label, None))
347 | for model in model_numbers:
348 | mpnn_data.append(mpnn_complex_statistics.get(model, {}).get(label, None))
349 |
350 | # Add the statistical data for binder
351 | for label in ['pLDDT', 'pTM', 'pAE', 'Binder_RMSD']: # These are the labels for binder alone
352 | mpnn_data.append(binder_averages.get(label, None))
353 | for model in model_numbers:
354 | mpnn_data.append(binder_statistics.get(model, {}).get(label, None))
355 |
356 | # Add the remaining non-statistical data
357 | mpnn_data.extend([elapsed_mpnn_text, seq_notes, settings_file, filters_file, advanced_file])
358 |
359 | # insert data into csv
360 | insert_data(mpnn_csv, mpnn_data)
361 |
362 | # find best model number by pLDDT; per-model pLDDT values for models 1-5 sit at indices 11-15 of mpnn_data
363 | plddt_values = {i: mpnn_data[i] for i in range(11, 16) if mpnn_data[i] is not None}
364 |
365 | # Find the key with the highest value
366 | highest_plddt_key = int(max(plddt_values, key=plddt_values.get))
367 |
368 | # Output the number part of the key
369 | best_model_number = highest_plddt_key - 10
370 | best_model_pdb = os.path.join(design_paths["MPNN/Relaxed"], f"{mpnn_design_name}_model{best_model_number}.pdb")
371 |
372 | # run design data against filter thresholds
373 | filter_conditions = check_filters(mpnn_data, design_labels, filters)
374 | if filter_conditions == True:
375 | print(mpnn_design_name+" passed all filters")
376 | accepted_mpnn += 1
377 | accepted_designs += 1
378 |
379 | # copy designs to accepted folder
380 | shutil.copy(best_model_pdb, design_paths["Accepted"])
381 |
382 | # insert data into final csv
383 | final_data = [''] + mpnn_data
384 | insert_data(final_csv, final_data)
385 |
386 | # copy animation from accepted trajectory
387 | if advanced_settings["save_design_animations"]:
388 | accepted_animation = os.path.join(design_paths["Accepted/Animation"], f"{design_name}.html")
389 | if not os.path.exists(accepted_animation):
390 | shutil.copy(os.path.join(design_paths["Trajectory/Animation"], f"{design_name}.html"), accepted_animation)
391 |
392 | # copy plots of accepted trajectory
393 | plot_files = os.listdir(design_paths["Trajectory/Plots"])
394 | plots_to_copy = [f for f in plot_files if f.startswith(design_name) and f.endswith('.png')]
395 | for accepted_plot in plots_to_copy:
396 | source_plot = os.path.join(design_paths["Trajectory/Plots"], accepted_plot)
397 | target_plot = os.path.join(design_paths["Accepted/Plots"], accepted_plot)
398 | if not os.path.exists(target_plot):
399 | shutil.copy(source_plot, target_plot)
400 |
401 | else:
402 | print(f"Unmet filter conditions for {mpnn_design_name}")
403 | failure_df = pd.read_csv(failure_csv)
404 | special_prefixes = ('Average_', '1_', '2_', '3_', '4_', '5_')
405 | incremented_columns = set()
406 |
407 | for column in filter_conditions:
408 | base_column = column
409 | for prefix in special_prefixes:
410 | if column.startswith(prefix):
411 | base_column = column.split('_', 1)[1]
412 |
413 | if base_column not in incremented_columns:
414 | failure_df[base_column] = failure_df[base_column] + 1
415 | incremented_columns.add(base_column)
416 |
417 | failure_df.to_csv(failure_csv, index=False)
418 | shutil.copy(best_model_pdb, design_paths["Rejected"])
419 |
420 | # increase MPNN design number
421 | mpnn_n += 1
422 |
423 | # if enough mpnn sequences of the same trajectory pass filters then stop
424 | if accepted_mpnn >= advanced_settings["max_mpnn_sequences"]:
425 | break
426 |
427 | if accepted_mpnn >= 1:
428 | print("Found "+str(accepted_mpnn)+" MPNN designs passing filters")
429 | print("")
430 | else:
431 | print("No accepted MPNN designs found for this trajectory.")
432 | print("")
433 |
434 | else:
435 | print('Duplicate MPNN designs sampled with different trajectory, skipping current trajectory optimisation')
436 | print("")
437 |
438 | # save space by removing unrelaxed design trajectory PDB
439 | if advanced_settings["remove_unrelaxed_trajectory"]:
440 | os.remove(trajectory_pdb)
441 |
442 | # measure time it took to generate designs for one trajectory
443 | design_time = time.time() - design_start_time
444 | design_time_text = f"{'%d hours, %d minutes, %d seconds' % (int(design_time // 3600), int((design_time % 3600) // 60), int(design_time % 60))}"
445 | print("Design and validation of trajectory "+design_name+" took: "+design_time_text)
446 |
447 | # analyse the rejection rate of trajectories to see if we need to readjust the design weights
448 | if trajectory_n >= advanced_settings["start_monitoring"] and advanced_settings["enable_rejection_check"]:
449 | acceptance = accepted_designs / trajectory_n
450 | if acceptance < advanced_settings["acceptance_rate"]:
451 | print("The ratio of successful designs is lower than defined acceptance rate! Consider changing your design settings!")
452 | print("Script execution stopping...")
453 | break
454 |
455 | # increase trajectory number
456 | trajectory_n += 1
457 | gc.collect()
458 |
459 | ### Script finished
460 | elapsed_time = time.time() - script_start_time
461 | elapsed_text = f"{'%d hours, %d minutes, %d seconds' % (int(elapsed_time // 3600), int((elapsed_time % 3600) // 60), int(elapsed_time % 60))}"
462 | print("Finished all designs. Script execution for "+str(trajectory_n)+" trajectories took: "+elapsed_text)
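The best-model lookup above relies on index arithmetic (per-model pLDDT starting at index 11, model number obtained by subtracting 10), which follows from how `mpnn_data` is laid out: ten leading metadata fields, then for each statistic one average followed by five per-model values. A minimal sketch of that layout, assuming the field order shown above; `stat_index` and `best_model_by_plddt` are hypothetical helper names for illustration, not part of BindCraft:

```python
# Assumed layout of the flat row: metadata fields first, then for every
# statistic label one average followed by one value per model.
N_METADATA = 10  # name, algorithm, length, seed, helicity, hotspots, sequence, interface residues, score, seqid
N_MODELS = 5     # matches model_numbers = range(1, 6)


def stat_index(label_position, model=None):
    """Index of a statistic in the flat row.

    label_position is the 0-based position of the label in statistics_labels
    (pLDDT is 0). model=None selects the average, 1..5 a single model.
    """
    base = N_METADATA + label_position * (N_MODELS + 1)
    return base if model is None else base + model


def best_model_by_plddt(row):
    """Return the 1-based model number with the highest pLDDT, or None if no model has a value."""
    scores = {}
    for model in range(1, N_MODELS + 1):
        value = row[stat_index(0, model)]
        if value is not None:
            scores[model] = value
    return max(scores, key=scores.get) if scores else None
```

Deriving the index from the layout avoids hard-coded offsets and makes the empty case explicit instead of letting `max` raise on an empty dictionary.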
--------------------------------------------------------------------------------
/bindcraft.slurm:
--------------------------------------------------------------------------------
1 | #!/bin/bash
2 | #SBATCH --nodes 1
3 | #SBATCH --ntasks 1
4 | #SBATCH --cpus-per-task 1
5 | #SBATCH --partition=gpu
6 | #SBATCH --qos=gpu
7 | #SBATCH --gres=gpu:1
8 | #SBATCH --mem 42gb
9 | #SBATCH --time 72:00:00
10 | #SBATCH --output=bindcraft_%A.log
11 |
12 | # Initialise environment and modules
13 | CONDA_BASE=$(conda info --base)
14 | source "${CONDA_BASE}/bin/activate" "${CONDA_BASE}/envs/BindCraft"
15 | export LD_LIBRARY_PATH=${CONDA_BASE}/lib
16 |
17 | # alternatively you can source the environment directly
18 | #source /path/to/mambaforge/bin/activate /path/to/mambaforge/envs/BindCraft
19 |
20 | # Get the directory where the bindcraft script is located
21 | SCRIPT_DIR=$(dirname "$0")
22 |
23 | # Parsing command line options
24 | SETTINGS=""
25 | FILTERS=""
26 | ADVANCED=""
27 | TEMP=$(getopt -o s:f:a: --long settings:,filters:,advanced: -n 'bindcraft.slurm' -- "$@")
28 | eval set -- "$TEMP"
29 |
30 | while true ; do
31 | case "$1" in
32 | -s|--settings) SETTINGS="$2" ; shift 2 ;;
33 | -f|--filters) FILTERS="$2" ; shift 2 ;;
34 | -a|--advanced) ADVANCED="$2" ; shift 2 ;;
35 | --) shift ; break ;;
36 | *) echo "Invalid Option" ; exit 1 ;;
37 | esac
38 | done
39 |
40 | # Ensure that SETTINGS is not empty
41 | if [ -z "$SETTINGS" ]; then
42 | echo "Error: The -s or --settings option is required."
43 | exit 1
44 | fi
45 |
46 | echo "Running the BindCraft pipeline"
47 | python -u "${SCRIPT_DIR}/bindcraft.py" --settings "${SETTINGS}" --filters "${FILTERS}" --advanced "${ADVANCED}"
48 |
--------------------------------------------------------------------------------
/functions/DAlphaBall.gcc:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/martinpacesa/BindCraft/477755f2cdd4077840ce51051749ddcf63a26862/functions/DAlphaBall.gcc
--------------------------------------------------------------------------------
/functions/__init__.py:
--------------------------------------------------------------------------------
1 | import os, re, shutil, time, json, gc
2 | import argparse
3 | import pickle
4 | import warnings
5 | import zipfile
6 | import numpy as np
7 | import pandas as pd
8 | import math, random
9 | import matplotlib.pyplot as plt
10 |
11 | from .pyrosetta_utils import *
12 | from .colabdesign_utils import *
13 | from .biopython_utils import *
14 | from .generic_utils import *
15 |
16 | # suppress warnings
17 | #os.environ["SLURM_STEP_NODELIST"] = os.environ["SLURM_NODELIST"]
18 | warnings.simplefilter(action='ignore', category=FutureWarning)
19 | warnings.simplefilter(action='ignore', category=DeprecationWarning)
20 | warnings.simplefilter(action='ignore', category=BiopythonWarning)
--------------------------------------------------------------------------------
/functions/biopython_utils.py:
--------------------------------------------------------------------------------
1 | ####################################
2 | ################ BioPython functions
3 | ####################################
4 | ### Import dependencies
5 | import os
6 | import math
7 | import numpy as np
8 | from collections import defaultdict
9 | from scipy.spatial import cKDTree
10 | from Bio import BiopythonWarning
11 | from Bio.PDB import PDBParser, DSSP, Selection, Polypeptide, PDBIO, Select, Chain, Superimposer
12 | from Bio.SeqUtils.ProtParam import ProteinAnalysis
13 | from Bio.PDB.Selection import unfold_entities
14 | from Bio.PDB.Polypeptide import is_aa
15 |
16 | # analyze sequence composition of design
17 | def validate_design_sequence(sequence, num_clashes, advanced_settings):
18 | note_array = []
19 |
20 | # Check if protein contains clashes after relaxation
21 | if num_clashes is not None and num_clashes > 0:
22 | note_array.append('Relaxed structure contains clashes.')
23 |
24 | # Check if the sequence contains disallowed amino acids
25 | if advanced_settings["omit_AAs"]:
26 | restricted_AAs = advanced_settings["omit_AAs"].split(',')
27 | for restricted_AA in restricted_AAs:
28 | if restricted_AA in sequence:
29 | note_array.append('Contains: '+restricted_AA+'!')
30 |
31 | # Analyze the protein
32 | analysis = ProteinAnalysis(sequence)
33 |
34 | # Calculate the reduced extinction coefficient per 1% solution
35 | extinction_coefficient_reduced = analysis.molar_extinction_coefficient()[0]
36 | molecular_weight = round(analysis.molecular_weight() / 1000, 2)
37 | extinction_coefficient_reduced_1 = round(extinction_coefficient_reduced / molecular_weight * 0.01, 2)
38 |
39 | # Check if the absorption is high enough
40 | if extinction_coefficient_reduced_1 <= 2:
41 | note_array.append(f'Absorption value is {extinction_coefficient_reduced_1}, consider adding tryptophan to design.')
42 |
43 | # Join the notes into a single string
44 | notes = ' '.join(note_array)
45 |
46 | return notes
47 |
48 | # temporary function, calculate RMSD of input PDB and trajectory target
49 | def target_pdb_rmsd(trajectory_pdb, starting_pdb, chain_ids_string):
50 | # Parse the PDB files
51 | parser = PDBParser(QUIET=True)
52 | structure_trajectory = parser.get_structure('trajectory', trajectory_pdb)
53 | structure_starting = parser.get_structure('starting', starting_pdb)
54 |
55 | # Extract chain A from trajectory_pdb
56 | chain_trajectory = structure_trajectory[0]['A']
57 |
58 | # Extract the specified chains from starting_pdb
59 | chain_ids = chain_ids_string.split(',')
60 | residues_starting = []
61 | for chain_id in chain_ids:
62 | chain_id = chain_id.strip()
63 | chain = structure_starting[0][chain_id]
64 | for residue in chain:
65 | if is_aa(residue, standard=True):
66 | residues_starting.append(residue)
67 |
68 | # Extract residues from chain A in trajectory_pdb
69 | residues_trajectory = [residue for residue in chain_trajectory if is_aa(residue, standard=True)]
70 |
71 | # Ensure that both structures have the same number of residues
72 | min_length = min(len(residues_starting), len(residues_trajectory))
73 | residues_starting = residues_starting[:min_length]
74 | residues_trajectory = residues_trajectory[:min_length]
75 |
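76 | # Collect CA atoms only from residue pairs where both residues have one, so the two lists stay the same length
77 | atoms_starting = [res_s['CA'] for res_s, res_t in zip(residues_starting, residues_trajectory) if 'CA' in res_s and 'CA' in res_t]
78 | atoms_trajectory = [res_t['CA'] for res_s, res_t in zip(residues_starting, residues_trajectory) if 'CA' in res_s and 'CA' in res_t]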
79 |
80 | # Calculate RMSD using structural alignment
81 | sup = Superimposer()
82 | sup.set_atoms(atoms_starting, atoms_trajectory)
83 | rmsd = sup.rms
84 |
85 | return round(rmsd, 2)
86 |
87 | # detect C alpha clashes for deformed trajectories
88 | def calculate_clash_score(pdb_file, threshold=2.4, only_ca=False):
89 | parser = PDBParser(QUIET=True)
90 | structure = parser.get_structure('protein', pdb_file)
91 |
92 | atoms = []
93 | atom_info = [] # Detailed atom info for debugging and processing
94 |
95 | for model in structure:
96 | for chain in model:
97 | for residue in chain:
98 | for atom in residue:
99 | if atom.element == 'H': # Skip hydrogen atoms
100 | continue
101 | if only_ca and atom.get_name() != 'CA':
102 | continue
103 | atoms.append(atom.coord)
104 | atom_info.append((chain.id, residue.id[1], atom.get_name(), atom.coord))
105 |
106 | tree = cKDTree(atoms)
107 | pairs = tree.query_pairs(threshold)
108 |
109 | valid_pairs = set()
110 | for (i, j) in pairs:
111 | chain_i, res_i, name_i, coord_i = atom_info[i]
112 | chain_j, res_j, name_j, coord_j = atom_info[j]
113 |
114 | # Exclude clashes within the same residue
115 | if chain_i == chain_j and res_i == res_j:
116 | continue
117 |
118 | # Exclude directly sequential residues in the same chain for all atoms
119 | if chain_i == chain_j and abs(res_i - res_j) == 1:
120 | continue
121 |
122 | # If calculating sidechain clashes, only consider clashes between different chains
123 | if not only_ca and chain_i == chain_j:
124 | continue
125 |
126 | valid_pairs.add((i, j))
127 |
128 | return len(valid_pairs)
129 |
130 | three_to_one_map = {
131 | 'ALA': 'A', 'CYS': 'C', 'ASP': 'D', 'GLU': 'E', 'PHE': 'F',
132 | 'GLY': 'G', 'HIS': 'H', 'ILE': 'I', 'LYS': 'K', 'LEU': 'L',
133 | 'MET': 'M', 'ASN': 'N', 'PRO': 'P', 'GLN': 'Q', 'ARG': 'R',
134 | 'SER': 'S', 'THR': 'T', 'VAL': 'V', 'TRP': 'W', 'TYR': 'Y'
135 | }
136 |
137 | # identify interacting residues at the binder interface
138 | def hotspot_residues(trajectory_pdb, binder_chain="B", atom_distance_cutoff=4.0):
139 | # Parse the PDB file
140 | parser = PDBParser(QUIET=True)
141 | structure = parser.get_structure("complex", trajectory_pdb)
142 |
143 | # Get the specified chain
144 | binder_atoms = Selection.unfold_entities(structure[0][binder_chain], 'A')
145 | binder_coords = np.array([atom.coord for atom in binder_atoms])
146 |
147 | # Get atoms and coords for the target chain
148 | target_atoms = Selection.unfold_entities(structure[0]['A'], 'A')
149 | target_coords = np.array([atom.coord for atom in target_atoms])
150 |
151 | # Build KD trees for both chains
152 | binder_tree = cKDTree(binder_coords)
153 | target_tree = cKDTree(target_coords)
154 |
155 | # Prepare to collect interacting residues
156 | interacting_residues = {}
157 |
158 | # Query the tree for pairs of atoms within the distance cutoff
159 | pairs = binder_tree.query_ball_tree(target_tree, atom_distance_cutoff)
160 |
161 | # Process each binder atom's interactions
162 | for binder_idx, close_indices in enumerate(pairs):
163 | binder_residue = binder_atoms[binder_idx].get_parent()
164 | binder_resname = binder_residue.get_resname()
165 |
166 | # Convert three-letter code to single-letter code using the manual dictionary
167 | if binder_resname in three_to_one_map:
168 | aa_single_letter = three_to_one_map[binder_resname]
169 | # record the binder residue once if any target atom lies within the cutoff
170 | if close_indices:
171 | interacting_residues[binder_residue.id[1]] = aa_single_letter
172 |
173 | return interacting_residues
174 |
175 | # calculate secondary structure percentage of design
176 | def calc_ss_percentage(pdb_file, advanced_settings, chain_id="B", atom_distance_cutoff=4.0):
177 | # Parse the structure
178 | parser = PDBParser(QUIET=True)
179 | structure = parser.get_structure('protein', pdb_file)
180 | model = structure[0] # Consider only the first model in the structure
181 |
182 | # Calculate DSSP for the model
183 | dssp = DSSP(model, pdb_file, dssp=advanced_settings["dssp_path"])
184 |
185 | # Prepare to count residues
186 | ss_counts = defaultdict(int)
187 | ss_interface_counts = defaultdict(int)
188 | plddts_interface = []
189 | plddts_ss = []
190 |
191 | # Get chain and interacting residues once
192 | chain = model[chain_id]
193 | interacting_residues = set(hotspot_residues(pdb_file, chain_id, atom_distance_cutoff).keys())
194 |
195 | for residue in chain:
196 | residue_id = residue.id[1]
197 | if (chain_id, residue_id) in dssp:
198 | ss = dssp[(chain_id, residue_id)][2] # Get the secondary structure
199 | ss_type = 'loop'
200 | if ss in ['H', 'G', 'I']:
201 | ss_type = 'helix'
202 | elif ss == 'E':
203 | ss_type = 'sheet'
204 |
205 | ss_counts[ss_type] += 1
206 |
207 | if ss_type != 'loop':
208 | # calculate secondary structure normalised pLDDT
209 | avg_plddt_ss = sum(atom.bfactor for atom in residue) / len(residue)
210 | plddts_ss.append(avg_plddt_ss)
211 |
212 | if residue_id in interacting_residues:
213 | ss_interface_counts[ss_type] += 1
214 |
215 | # calculate interface pLDDT
216 | avg_plddt_residue = sum(atom.bfactor for atom in residue) / len(residue)
217 | plddts_interface.append(avg_plddt_residue)
218 |
219 | # Calculate percentages
220 | total_residues = sum(ss_counts.values())
221 | total_interface_residues = sum(ss_interface_counts.values())
222 |
223 | percentages = calculate_percentages(total_residues, ss_counts['helix'], ss_counts['sheet'])
224 | interface_percentages = calculate_percentages(total_interface_residues, ss_interface_counts['helix'], ss_interface_counts['sheet'])
225 |
226 | i_plddt = round(sum(plddts_interface) / len(plddts_interface) / 100, 2) if plddts_interface else 0
227 | ss_plddt = round(sum(plddts_ss) / len(plddts_ss) / 100, 2) if plddts_ss else 0
228 |
229 | return (*percentages, *interface_percentages, i_plddt, ss_plddt)
230 |
231 | def calculate_percentages(total, helix, sheet):
232 | helix_percentage = round((helix / total) * 100,2) if total > 0 else 0
233 | sheet_percentage = round((sheet / total) * 100,2) if total > 0 else 0
234 | loop_percentage = round(((total - helix - sheet) / total) * 100,2) if total > 0 else 0
235 |
236 | return helix_percentage, sheet_percentage, loop_percentage
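The exclusion rules in `calculate_clash_score` can be illustrated without Biopython or SciPy. A minimal sketch, assuming atoms are supplied as `(chain, residue_number, atom_name, (x, y, z))` tuples with hydrogens already removed; `count_clashes` is a hypothetical name, and the brute-force pair loop stands in for the `cKDTree.query_pairs` call used above, so it is only practical for small inputs:

```python
from itertools import combinations
from math import dist


def count_clashes(atoms, threshold=2.4, only_ca=False):
    """Count atom pairs closer than `threshold`, applying the same exclusions as above."""
    if only_ca:
        atoms = [atom for atom in atoms if atom[2] == "CA"]

    clashes = 0
    for (chain_i, res_i, _, xyz_i), (chain_j, res_j, _, xyz_j) in combinations(atoms, 2):
        if dist(xyz_i, xyz_j) > threshold:
            continue
        # Exclude pairs within the same residue
        if chain_i == chain_j and res_i == res_j:
            continue
        # Exclude directly sequential residues in the same chain
        if chain_i == chain_j and abs(res_i - res_j) == 1:
            continue
        # In all-atom mode, only pairs between different chains count
        if not only_ca and chain_i == chain_j:
            continue
        clashes += 1
    return clashes
```

In all-atom mode the function therefore reports inter-chain clashes only, while `only_ca=True` also counts close C-alpha pairs within a chain, provided they are not sequence neighbours.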
--------------------------------------------------------------------------------
/functions/colabdesign_utils.py:
--------------------------------------------------------------------------------
1 | ####################################
2 | ############## ColabDesign functions
3 | ####################################
4 | ### Import dependencies
5 | import os, re, shutil, math, pickle
6 | import matplotlib.pyplot as plt
7 | import numpy as np
8 | import jax
9 | import jax.numpy as jnp
10 | from scipy.special import softmax
11 | from colabdesign import mk_afdesign_model, clear_mem
12 | from colabdesign.mpnn import mk_mpnn_model
13 | from colabdesign.af.alphafold.common import residue_constants
14 | from colabdesign.af.loss import get_ptm, mask_loss, get_dgram_bins, _get_con_loss
15 | from colabdesign.shared.utils import copy_dict
16 | from .biopython_utils import hotspot_residues, calculate_clash_score, calc_ss_percentage, calculate_percentages
17 | from .pyrosetta_utils import pr_relax, align_pdbs
18 | from .generic_utils import update_failures
19 |
20 | # hallucinate a binder
21 | def binder_hallucination(design_name, starting_pdb, chain, target_hotspot_residues, length, seed, helicity_value, design_models, advanced_settings, design_paths, failure_csv):
22 | model_pdb_path = os.path.join(design_paths["Trajectory"], design_name+".pdb")
23 |
24 | # clear GPU memory for new trajectory
25 | clear_mem()
26 |
27 | # initialise binder hallucination model
28 | af_model = mk_afdesign_model(protocol="binder", debug=False, data_dir=advanced_settings["af_params_dir"],
29 | use_multimer=advanced_settings["use_multimer_design"], num_recycles=advanced_settings["num_recycles_design"],
30 | best_metric='loss')
31 |
32 | # sanity check for hotspots
33 | if target_hotspot_residues == "":
34 | target_hotspot_residues = None
35 |
36 | af_model.prep_inputs(pdb_filename=starting_pdb, chain=chain, binder_len=length, hotspot=target_hotspot_residues, seed=seed, rm_aa=advanced_settings["omit_AAs"],
37 | rm_target_seq=advanced_settings["rm_template_seq_design"], rm_target_sc=advanced_settings["rm_template_sc_design"])
38 |
39 | ### Update weights based on specified settings
40 | af_model.opt["weights"].update({"pae":advanced_settings["weights_pae_intra"],
41 | "plddt":advanced_settings["weights_plddt"],
42 | "i_pae":advanced_settings["weights_pae_inter"],
43 | "con":advanced_settings["weights_con_intra"],
44 | "i_con":advanced_settings["weights_con_inter"],
45 | })
46 |
47 | # redefine the intramolecular (con) and intermolecular (i_con) contact definitions
48 | af_model.opt["con"].update({"num":advanced_settings["intra_contact_number"],"cutoff":advanced_settings["intra_contact_distance"],"binary":False,"seqsep":9})
49 | af_model.opt["i_con"].update({"num":advanced_settings["inter_contact_number"],"cutoff":advanced_settings["inter_contact_distance"],"binary":False})
50 |
51 |
52 | ### additional loss functions
53 | if advanced_settings["use_rg_loss"]:
54 | # radius of gyration loss
55 | add_rg_loss(af_model, advanced_settings["weights_rg"])
56 |
57 | if advanced_settings["use_i_ptm_loss"]:
58 | # interface pTM loss
59 | add_i_ptm_loss(af_model, advanced_settings["weights_iptm"])
60 |
61 | if advanced_settings["use_termini_distance_loss"]:
62 | # termini distance loss
63 | add_termini_distance_loss(af_model, advanced_settings["weights_termini_loss"])
64 |
65 | # add the helicity loss
66 | add_helix_loss(af_model, helicity_value)
67 |
68 | # calculate the number of mutations to do based on the length of the protein
69 | greedy_tries = math.ceil(length * (advanced_settings["greedy_percentage"] / 100))
70 |
71 | ### start design algorithm based on selection
72 | if advanced_settings["design_algorithm"] == '2stage':
73 | # uses gradient descent to get a PSSM profile and then uses the PSSM to bias the sampling of random mutations to decrease loss
74 | af_model.design_pssm_semigreedy(soft_iters=advanced_settings["soft_iterations"], hard_iters=advanced_settings["greedy_iterations"], tries=greedy_tries, models=design_models,
75 | num_models=1, sample_models=advanced_settings["sample_models"], ramp_models=False, save_best=True)
76 |
77 | elif advanced_settings["design_algorithm"] == '3stage':
78 | # 3 stage design using logits, softmax, and one hot encoding
79 | af_model.design_3stage(soft_iters=advanced_settings["soft_iterations"], temp_iters=advanced_settings["temporary_iterations"], hard_iters=advanced_settings["hard_iterations"],
80 | num_models=1, models=design_models, sample_models=advanced_settings["sample_models"], save_best=True)
81 |
82 | elif advanced_settings["design_algorithm"] == 'greedy':
83 | # design by using random mutations that decrease loss
84 | af_model.design_semigreedy(advanced_settings["greedy_iterations"], tries=greedy_tries, num_models=1, models=design_models,
85 | sample_models=advanced_settings["sample_models"], save_best=True)
86 |
87 | elif advanced_settings["design_algorithm"] == 'mcmc':
88 | # design by MCMC sampling of random mutations, starting at temperature T_init and decaying with the given half-life
89 | half_life = round(advanced_settings["greedy_iterations"] / 5, 0)
90 | t_mcmc = 0.01
91 | af_model._design_mcmc(advanced_settings["greedy_iterations"], half_life=half_life, T_init=t_mcmc, mutation_rate=greedy_tries, num_models=1, models=design_models,
92 | sample_models=advanced_settings["sample_models"], save_best=True)
93 |
94 | elif advanced_settings["design_algorithm"] == '4stage':
95 | # initial logits to prescreen trajectory
96 | print("Stage 1: Test Logits")
97 | af_model.design_logits(iters=50, e_soft=0.9, models=design_models, num_models=1, sample_models=advanced_settings["sample_models"], save_best=True)
98 |
99 | # determine pLDDT of best iteration according to lowest 'loss' value
100 | initial_plddt = get_best_plddt(af_model, length)
101 |
102 | # if best iteration has high enough confidence then continue
103 | if initial_plddt > 0.65:
104 | print("Initial trajectory pLDDT good, continuing: "+str(initial_plddt))
105 | if advanced_settings["optimise_beta"]:
106 | # temporarily dump model to assess secondary structure
107 | af_model.save_pdb(model_pdb_path)
108 | _, beta, *_ = calc_ss_percentage(model_pdb_path, advanced_settings, 'B')
109 | os.remove(model_pdb_path)
110 |
111 | # if beta sheeted trajectory is detected then choose to optimise
112 | if float(beta) > 15:
113 | advanced_settings["soft_iterations"] = advanced_settings["soft_iterations"] + advanced_settings["optimise_beta_extra_soft"]
114 | advanced_settings["temporary_iterations"] = advanced_settings["temporary_iterations"] + advanced_settings["optimise_beta_extra_temp"]
115 | af_model.set_opt(num_recycles=advanced_settings["optimise_beta_recycles_design"])
116 | print("Beta sheeted trajectory detected, optimising settings")
117 |
118 | # how many logit iterations left
119 | logits_iter = advanced_settings["soft_iterations"] - 50
120 | if logits_iter > 0:
121 | print("Stage 1: Additional Logits Optimisation")
122 | af_model.clear_best()
123 | af_model.design_logits(iters=logits_iter, e_soft=1, models=design_models, num_models=1, sample_models=advanced_settings["sample_models"],
124 | ramp_recycles=False, save_best=True)
125 | af_model._tmp["seq_logits"] = af_model.aux["seq"]["logits"]
126 | logit_plddt = get_best_plddt(af_model, length)
127 | print("Optimised logit trajectory pLDDT: "+str(logit_plddt))
128 | else:
129 | logit_plddt = initial_plddt
130 |
131 | # perform softmax trajectory design
132 | if advanced_settings["temporary_iterations"] > 0:
133 | print("Stage 2: Softmax Optimisation")
134 | af_model.clear_best()
135 | af_model.design_soft(advanced_settings["temporary_iterations"], e_temp=1e-2, models=design_models, num_models=1,
136 | sample_models=advanced_settings["sample_models"], ramp_recycles=False, save_best=True)
137 | softmax_plddt = get_best_plddt(af_model, length)
138 | else:
139 | softmax_plddt = logit_plddt
140 |
141 | # perform one hot encoding
142 | if softmax_plddt > 0.65:
143 | print("Softmax trajectory pLDDT good, continuing: "+str(softmax_plddt))
144 | if advanced_settings["hard_iterations"] > 0:
145 | af_model.clear_best()
146 | print("Stage 3: One-hot Optimisation")
147 | af_model.design_hard(advanced_settings["hard_iterations"], temp=1e-2, models=design_models, num_models=1,
148 | sample_models=advanced_settings["sample_models"], dropout=False, ramp_recycles=False, save_best=True)
149 | onehot_plddt = get_best_plddt(af_model, length)
150 |
151 | if onehot_plddt > 0.65:
152 | # perform greedy mutation optimisation
153 | print("One-hot trajectory pLDDT good, continuing: "+str(onehot_plddt))
154 | if advanced_settings["greedy_iterations"] > 0:
155 | print("Stage 4: PSSM Semigreedy Optimisation")
156 | af_model.design_pssm_semigreedy(soft_iters=0, hard_iters=advanced_settings["greedy_iterations"], tries=greedy_tries, models=design_models,
157 | num_models=1, sample_models=advanced_settings["sample_models"], ramp_models=False, save_best=True)
158 |
159 | else:
160 | update_failures(failure_csv, 'Trajectory_one-hot_pLDDT')
161 | print("One-hot trajectory pLDDT too low to continue: "+str(onehot_plddt))
162 |
163 | else:
164 | update_failures(failure_csv, 'Trajectory_softmax_pLDDT')
165 | print("Softmax trajectory pLDDT too low to continue: "+str(softmax_plddt))
166 |
167 | else:
168 | update_failures(failure_csv, 'Trajectory_logits_pLDDT')
169 | print("Initial trajectory pLDDT too low to continue: "+str(initial_plddt))
170 |
171 | else:
172 | print("ERROR: No valid design model selected")
173 | exit()
174 | return
175 |
176 | ### save trajectory PDB
177 | final_plddt = get_best_plddt(af_model, length)
178 | af_model.save_pdb(model_pdb_path)
179 | af_model.aux["log"]["terminate"] = ""
180 |
181 | # let's check whether the trajectory is worth optimising by checking confidence, clashes, and contacts
182 | # check clashes
183 | #clash_interface = calculate_clash_score(model_pdb_path, 2.4)
184 | ca_clashes = calculate_clash_score(model_pdb_path, 2.5, only_ca=True)
185 |
186 | #if clash_interface > 25 or ca_clashes > 0:
187 | if ca_clashes > 0:
188 | af_model.aux["log"]["terminate"] = "Clashing"
189 | update_failures(failure_csv, 'Trajectory_Clashes')
190 | print("Severe clashes detected, skipping analysis and MPNN optimisation")
191 | print("")
192 | else:
193 | # check if low quality prediction
194 | if final_plddt < 0.7:
195 | af_model.aux["log"]["terminate"] = "LowConfidence"
196 | update_failures(failure_csv, 'Trajectory_final_pLDDT')
197 | print("Trajectory starting confidence low, skipping analysis and MPNN optimisation")
198 | print("")
199 | else:
200 | # does it have enough contacts to consider?
201 | binder_contacts = hotspot_residues(model_pdb_path)
202 |             binder_contacts_n = len(binder_contacts)
203 |
204 |             # if fewer than 3 contacts then the protein is floating above the target and is not a binder
205 | if binder_contacts_n < 3:
206 | af_model.aux["log"]["terminate"] = "LowConfidence"
207 | update_failures(failure_csv, 'Trajectory_Contacts')
208 | print("Too few contacts at the interface, skipping analysis and MPNN optimisation")
209 | print("")
210 | else:
211 | # phew, trajectory is okay! We can continue
212 | af_model.aux["log"]["terminate"] = ""
213 | print("Trajectory successful, final pLDDT: "+str(final_plddt))
214 |
215 | # move low quality prediction:
216 | if af_model.aux["log"]["terminate"] != "":
217 | shutil.move(model_pdb_path, design_paths[f"Trajectory/{af_model.aux['log']['terminate']}"])
218 |
219 | ### get the sampled sequence for plotting
220 | af_model.get_seqs()
221 | if advanced_settings["save_design_trajectory_plots"]:
222 | plot_trajectory(af_model, design_name, design_paths)
223 |
224 | ### save the hallucination trajectory animation
225 | if advanced_settings["save_design_animations"]:
226 | plots = af_model.animate(dpi=150)
227 | with open(os.path.join(design_paths["Trajectory/Animation"], design_name+".html"), 'w') as f:
228 | f.write(plots)
229 | plt.close('all')
230 |
231 | if advanced_settings["save_trajectory_pickle"]:
232 | with open(os.path.join(design_paths["Trajectory/Pickle"], design_name+".pickle"), 'wb') as handle:
233 | pickle.dump(af_model.aux['all'], handle, protocol=pickle.HIGHEST_PROTOCOL)
234 |
235 | return af_model
236 |
237 | # run prediction for binder with masked template target
238 | def predict_binder_complex(prediction_model, binder_sequence, mpnn_design_name, target_pdb, chain, length, trajectory_pdb, prediction_models, advanced_settings, filters, design_paths, failure_csv, seed=None):
239 | prediction_stats = {}
240 |
241 | # clean sequence
242 | binder_sequence = re.sub("[^A-Z]", "", binder_sequence.upper())
243 |
244 | # reset filtering conditionals
245 | pass_af2_filters = True
246 | filter_failures = {}
247 |
248 | # start prediction per AF2 model, 2 are used by default due to masked templates
249 | for model_num in prediction_models:
250 | # check to make sure prediction does not exist already
251 | complex_pdb = os.path.join(design_paths["MPNN"], f"{mpnn_design_name}_model{model_num+1}.pdb")
252 | if not os.path.exists(complex_pdb):
253 | # predict model
254 | prediction_model.predict(seq=binder_sequence, models=[model_num], num_recycles=advanced_settings["num_recycles_validation"], verbose=False)
255 | prediction_model.save_pdb(complex_pdb)
256 | prediction_metrics = copy_dict(prediction_model.aux["log"]) # contains plddt, ptm, i_ptm, pae, i_pae
257 |
258 | # extract the statistics for the model
259 | stats = {
260 | 'pLDDT': round(prediction_metrics['plddt'], 2),
261 | 'pTM': round(prediction_metrics['ptm'], 2),
262 | 'i_pTM': round(prediction_metrics['i_ptm'], 2),
263 | 'pAE': round(prediction_metrics['pae'], 2),
264 | 'i_pAE': round(prediction_metrics['i_pae'], 2)
265 | }
266 | prediction_stats[model_num+1] = stats
267 |
268 | # List of filter conditions and corresponding keys
269 | filter_conditions = [
270 | (f"{model_num+1}_pLDDT", 'plddt', '>='),
271 | (f"{model_num+1}_pTM", 'ptm', '>='),
272 | (f"{model_num+1}_i_pTM", 'i_ptm', '>='),
273 | (f"{model_num+1}_pAE", 'pae', '<='),
274 | (f"{model_num+1}_i_pAE", 'i_pae', '<='),
275 | ]
276 |
277 | # perform initial AF2 values filtering to determine whether to skip relaxation and interface scoring
278 | for filter_name, metric_key, comparison in filter_conditions:
279 | threshold = filters.get(filter_name, {}).get("threshold")
280 | if threshold is not None:
281 | if comparison == '>=' and prediction_metrics[metric_key] < threshold:
282 | pass_af2_filters = False
283 | filter_failures[filter_name] = filter_failures.get(filter_name, 0) + 1
284 | elif comparison == '<=' and prediction_metrics[metric_key] > threshold:
285 | pass_af2_filters = False
286 | filter_failures[filter_name] = filter_failures.get(filter_name, 0) + 1
287 |
288 | if not pass_af2_filters:
289 | break
290 |
291 | # Update the CSV file with the failure counts
292 | if filter_failures:
293 | update_failures(failure_csv, filter_failures)
294 |
295 |     # if AF2 filters passed, continue with relaxation; otherwise remove the predicted models
296 | for model_num in prediction_models:
297 | complex_pdb = os.path.join(design_paths["MPNN"], f"{mpnn_design_name}_model{model_num+1}.pdb")
298 | if pass_af2_filters:
299 | mpnn_relaxed = os.path.join(design_paths["MPNN/Relaxed"], f"{mpnn_design_name}_model{model_num+1}.pdb")
300 | pr_relax(complex_pdb, mpnn_relaxed)
301 | else:
302 | if os.path.exists(complex_pdb):
303 | os.remove(complex_pdb)
304 |
305 | return prediction_stats, pass_af2_filters
306 |
307 | # run prediction for binder alone
308 | def predict_binder_alone(prediction_model, binder_sequence, mpnn_design_name, length, trajectory_pdb, binder_chain, prediction_models, advanced_settings, design_paths, seed=None):
309 | binder_stats = {}
310 |
311 | # prepare sequence for prediction
312 | binder_sequence = re.sub("[^A-Z]", "", binder_sequence.upper())
313 | prediction_model.set_seq(binder_sequence)
314 |
315 | # predict each model separately
316 | for model_num in prediction_models:
317 | # check to make sure prediction does not exist already
318 | binder_alone_pdb = os.path.join(design_paths["MPNN/Binder"], f"{mpnn_design_name}_model{model_num+1}.pdb")
319 | if not os.path.exists(binder_alone_pdb):
320 | # predict model
321 | prediction_model.predict(models=[model_num], num_recycles=advanced_settings["num_recycles_validation"], verbose=False)
322 | prediction_model.save_pdb(binder_alone_pdb)
323 | prediction_metrics = copy_dict(prediction_model.aux["log"]) # contains plddt, ptm, pae
324 |
325 | # align binder model to trajectory binder
326 | align_pdbs(trajectory_pdb, binder_alone_pdb, binder_chain, "A")
327 |
328 | # extract the statistics for the model
329 | stats = {
330 | 'pLDDT': round(prediction_metrics['plddt'], 2),
331 | 'pTM': round(prediction_metrics['ptm'], 2),
332 | 'pAE': round(prediction_metrics['pae'], 2)
333 | }
334 | binder_stats[model_num+1] = stats
335 |
336 | return binder_stats
337 |
338 | # run MPNN to generate sequences for binders
339 | def mpnn_gen_sequence(trajectory_pdb, binder_chain, trajectory_interface_residues, advanced_settings):
340 | # clear GPU memory
341 | clear_mem()
342 |
343 | # initialise MPNN model
344 | mpnn_model = mk_mpnn_model(backbone_noise=advanced_settings["backbone_noise"], model_name=advanced_settings["model_path"], weights=advanced_settings["mpnn_weights"])
345 |
346 |     # check whether to keep the interface generated by the trajectory or to redesign it with MPNN
347 | design_chains = 'A,' + binder_chain
348 |
349 | if advanced_settings["mpnn_fix_interface"]:
350 | fixed_positions = 'A,' + trajectory_interface_residues
351 | fixed_positions = fixed_positions.rstrip(",")
352 | print("Fixing interface residues: "+trajectory_interface_residues)
353 | else:
354 | fixed_positions = 'A'
355 |
356 | # prepare inputs for MPNN
357 | mpnn_model.prep_inputs(pdb_filename=trajectory_pdb, chain=design_chains, fix_pos=fixed_positions, rm_aa=advanced_settings["omit_AAs"])
358 |
359 | # sample MPNN sequences in parallel
360 | mpnn_sequences = mpnn_model.sample(temperature=advanced_settings["sampling_temp"], num=1, batch=advanced_settings["num_seqs"])
361 |
362 | return mpnn_sequences
363 |
364 | # Get pLDDT of best model
365 | def get_best_plddt(af_model, length):
366 | return round(np.mean(af_model._tmp["best"]["aux"]["plddt"][-length:]),2)
367 |
368 | # Define radius of gyration loss for colabdesign
369 | def add_rg_loss(self, weight=0.1):
370 | '''add radius of gyration loss'''
371 | def loss_fn(inputs, outputs):
372 | xyz = outputs["structure_module"]
373 | ca = xyz["final_atom_positions"][:,residue_constants.atom_order["CA"]]
374 | ca = ca[-self._binder_len:]
375 | rg = jnp.sqrt(jnp.square(ca - ca.mean(0)).sum(-1).mean() + 1e-8)
376 | rg_th = 2.38 * ca.shape[0] ** 0.365
377 |
378 | rg = jax.nn.elu(rg - rg_th)
379 | return {"rg":rg}
380 |
381 | self._callbacks["model"]["loss"].append(loss_fn)
382 | self.opt["weights"]["rg"] = weight
383 |
384 | # Define interface pTM loss for colabdesign
385 | def add_i_ptm_loss(self, weight=0.1):
386 | def loss_iptm(inputs, outputs):
387 | p = 1 - get_ptm(inputs, outputs, interface=True)
388 | i_ptm = mask_loss(p)
389 | return {"i_ptm": i_ptm}
390 |
391 | self._callbacks["model"]["loss"].append(loss_iptm)
392 | self.opt["weights"]["i_ptm"] = weight
393 |
394 | # add helicity loss
395 | def add_helix_loss(self, weight=0):
396 | def binder_helicity(inputs, outputs):
397 | if "offset" in inputs:
398 | offset = inputs["offset"]
399 | else:
400 | idx = inputs["residue_index"].flatten()
401 | offset = idx[:,None] - idx[None,:]
402 |
403 | # define distogram
404 | dgram = outputs["distogram"]["logits"]
405 | dgram_bins = get_dgram_bins(outputs)
406 | mask_2d = np.outer(np.append(np.zeros(self._target_len), np.ones(self._binder_len)), np.append(np.zeros(self._target_len), np.ones(self._binder_len)))
407 |
408 | x = _get_con_loss(dgram, dgram_bins, cutoff=6.0, binary=True)
409 | if offset is None:
410 | if mask_2d is None:
411 | helix_loss = jnp.diagonal(x,3).mean()
412 | else:
413 |                 helix_loss = jnp.diagonal(x * mask_2d,3).sum() / (jnp.diagonal(mask_2d,3).sum() + 1e-8)
414 | else:
415 | mask = offset == 3
416 | if mask_2d is not None:
417 | mask = jnp.where(mask_2d,mask,0)
418 | helix_loss = jnp.where(mask,x,0.0).sum() / (mask.sum() + 1e-8)
419 |
420 | return {"helix":helix_loss}
421 | self._callbacks["model"]["loss"].append(binder_helicity)
422 | self.opt["weights"]["helix"] = weight
423 |
424 | # add N- and C-terminus distance loss
425 | def add_termini_distance_loss(self, weight=0.1, threshold_distance=7.0):
426 | '''Add loss penalizing the distance between N and C termini'''
427 | def loss_fn(inputs, outputs):
428 | xyz = outputs["structure_module"]
429 | ca = xyz["final_atom_positions"][:, residue_constants.atom_order["CA"]]
430 | ca = ca[-self._binder_len:] # Considering only the last _binder_len residues
431 |
432 | # Extract N-terminus (first CA atom) and C-terminus (last CA atom)
433 | n_terminus = ca[0]
434 | c_terminus = ca[-1]
435 |
436 | # Compute the distance between N and C termini
437 | termini_distance = jnp.linalg.norm(n_terminus - c_terminus)
438 |
439 | # Compute the deviation from the threshold distance using ELU activation
440 | deviation = jax.nn.elu(termini_distance - threshold_distance)
441 |
442 | # Ensure the loss is never lower than 0
443 | termini_distance_loss = jax.nn.relu(deviation)
444 | return {"NC": termini_distance_loss}
445 |
446 | # Append the loss function to the model callbacks
447 | self._callbacks["model"]["loss"].append(loss_fn)
448 | self.opt["weights"]["NC"] = weight
449 |
450 | # plot design trajectory losses
451 | def plot_trajectory(af_model, design_name, design_paths):
452 | metrics_to_plot = ['loss', 'plddt', 'ptm', 'i_ptm', 'con', 'i_con', 'pae', 'i_pae', 'rg', 'mpnn']
453 | colors = ['b', 'g', 'r', 'c', 'm', 'y', 'k']
454 |
455 | for index, metric in enumerate(metrics_to_plot):
456 | if metric in af_model.aux["log"]:
457 | # Create a new figure for each metric
458 | plt.figure()
459 |
460 | loss = af_model.get_loss(metric)
461 | # Create an x axis for iterations
462 | iterations = range(1, len(loss) + 1)
463 |
464 | plt.plot(iterations, loss, label=f'{metric}', color=colors[index % len(colors)])
465 |
466 | # Add labels and a legend
467 | plt.xlabel('Iterations')
468 | plt.ylabel(metric)
469 | plt.title(design_name)
470 | plt.legend()
471 | plt.grid(True)
472 |
473 | # Save the plot
474 | plt.savefig(os.path.join(design_paths["Trajectory/Plots"], design_name+"_"+metric+".png"), dpi=150)
475 |
476 | # Close the figure
477 | plt.close()
478 |
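The radius of gyration and termini distance losses above can be illustrated outside of JAX. The following is a minimal standard-library sketch, not the code path used by the pipeline: it reuses only the constants visible in `add_rg_loss` (2.38, 0.365) and `add_termini_distance_loss` (default threshold 7.0), while the helper names `elu`, `rg_threshold`, `radius_of_gyration`, `rg_penalty`, `termini_penalty` and the toy coordinates are hypothetical and introduced here for illustration.

```python
import math

def elu(x, alpha=1.0):
    """Scalar exponential linear unit (same shape as jax.nn.elu for one value)."""
    return x if x > 0 else alpha * (math.exp(x) - 1.0)

def rg_threshold(n_residues):
    """Length-dependent radius of gyration threshold, as written in add_rg_loss."""
    return 2.38 * n_residues ** 0.365

def radius_of_gyration(ca_coords):
    """Root-mean-square distance of the CA atoms from their centroid."""
    n = len(ca_coords)
    centroid = [sum(c[i] for c in ca_coords) / n for i in range(3)]
    mean_sq = sum(
        sum((c[i] - centroid[i]) ** 2 for i in range(3)) for c in ca_coords
    ) / n
    return math.sqrt(mean_sq + 1e-8)

def rg_penalty(ca_coords):
    """ELU of the excess radius of gyration over the threshold."""
    return elu(radius_of_gyration(ca_coords) - rg_threshold(len(ca_coords)))

def termini_penalty(ca_coords, threshold_distance=7.0):
    """ELU of the excess N-to-C distance, clamped at zero like the ReLU above."""
    distance = math.dist(ca_coords[0], ca_coords[-1])
    return max(0.0, elu(distance - threshold_distance))

# Hypothetical toy input: a fully extended 50-residue chain with 3.8 A CA spacing.
extended = [(3.8 * i, 0.0, 0.0) for i in range(50)]
print("rg threshold:", round(rg_threshold(50), 2))
print("rg penalty:", round(rg_penalty(extended), 2))
print("termini penalty:", round(termini_penalty(extended), 2))
```

An extended chain is penalised by both terms, whereas a compact binder whose termini sit within the threshold distance contributes zero termini loss and a small negative radius of gyration term, since ELU is bounded below by -1.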
--------------------------------------------------------------------------------
/functions/dssp:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/martinpacesa/BindCraft/477755f2cdd4077840ce51051749ddcf63a26862/functions/dssp
--------------------------------------------------------------------------------
/functions/generic_utils.py:
--------------------------------------------------------------------------------
1 | ####################################
2 | ################## General functions
3 | ####################################
4 | ### Import dependencies
5 | import os
6 | import json
7 | import jax
8 | import shutil
9 | import zipfile
10 | import random
11 | import math
12 | import pandas as pd
13 | import numpy as np
14 |
15 | # Define labels for dataframes
16 | def generate_dataframe_labels():
17 | # labels for trajectory
18 | trajectory_labels = ['Design', 'Protocol', 'Length', 'Seed', 'Helicity', 'Target_Hotspot', 'Sequence', 'InterfaceResidues', 'pLDDT', 'pTM', 'i_pTM', 'pAE', 'i_pAE', 'i_pLDDT', 'ss_pLDDT', 'Unrelaxed_Clashes',
19 | 'Relaxed_Clashes', 'Binder_Energy_Score', 'Surface_Hydrophobicity', 'ShapeComplementarity', 'PackStat', 'dG', 'dSASA', 'dG/dSASA', 'Interface_SASA_%', 'Interface_Hydrophobicity', 'n_InterfaceResidues',
20 | 'n_InterfaceHbonds', 'InterfaceHbondsPercentage', 'n_InterfaceUnsatHbonds', 'InterfaceUnsatHbondsPercentage', 'Interface_Helix%', 'Interface_BetaSheet%', 'Interface_Loop%',
21 | 'Binder_Helix%', 'Binder_BetaSheet%', 'Binder_Loop%', 'InterfaceAAs', 'Target_RMSD', 'TrajectoryTime', 'Notes', 'TargetSettings', 'Filters', 'AdvancedSettings']
22 |
23 | # labels for mpnn designs
24 | core_labels = ['pLDDT', 'pTM', 'i_pTM', 'pAE', 'i_pAE', 'i_pLDDT', 'ss_pLDDT', 'Unrelaxed_Clashes', 'Relaxed_Clashes', 'Binder_Energy_Score', 'Surface_Hydrophobicity',
25 | 'ShapeComplementarity', 'PackStat', 'dG', 'dSASA', 'dG/dSASA', 'Interface_SASA_%', 'Interface_Hydrophobicity', 'n_InterfaceResidues', 'n_InterfaceHbonds', 'InterfaceHbondsPercentage',
26 | 'n_InterfaceUnsatHbonds', 'InterfaceUnsatHbondsPercentage', 'Interface_Helix%', 'Interface_BetaSheet%', 'Interface_Loop%', 'Binder_Helix%',
27 | 'Binder_BetaSheet%', 'Binder_Loop%', 'InterfaceAAs', 'Hotspot_RMSD', 'Target_RMSD', 'Binder_pLDDT', 'Binder_pTM', 'Binder_pAE', 'Binder_RMSD']
28 |
29 | design_labels = ['Design', 'Protocol', 'Length', 'Seed', 'Helicity', 'Target_Hotspot', 'Sequence', 'InterfaceResidues', 'MPNN_score', 'MPNN_seq_recovery']
30 |
31 | for label in core_labels:
32 | design_labels += ['Average_' + label] + [f'{i}_{label}' for i in range(1, 6)]
33 |
34 | design_labels += ['DesignTime', 'Notes', 'TargetSettings', 'Filters', 'AdvancedSettings']
35 |
36 | final_labels = ['Rank'] + design_labels
37 |
38 | return trajectory_labels, design_labels, final_labels
39 |
40 | # Create base directories of the project
41 | def generate_directories(design_path):
42 | design_path_names = ["Accepted", "Accepted/Ranked", "Accepted/Animation", "Accepted/Plots", "Accepted/Pickle", "Trajectory",
43 | "Trajectory/Relaxed", "Trajectory/Plots", "Trajectory/Clashing", "Trajectory/LowConfidence", "Trajectory/Animation",
44 | "Trajectory/Pickle", "MPNN", "MPNN/Binder", "MPNN/Sequences", "MPNN/Relaxed", "Rejected"]
45 | design_paths = {}
46 |
47 | # make directories and set design_paths[FOLDER_NAME] variable
48 | for name in design_path_names:
49 | path = os.path.join(design_path, name)
50 | os.makedirs(path, exist_ok=True)
51 | design_paths[name] = path
52 |
53 | return design_paths
54 |
55 | # generate CSV file for tracking designs not passing filters
56 | def generate_filter_pass_csv(failure_csv, filter_json):
57 | if not os.path.exists(failure_csv):
58 | with open(filter_json, 'r') as file:
59 | data = json.load(file)
60 |
61 | # Create a list of modified keys
62 | names = ['Trajectory_logits_pLDDT', 'Trajectory_softmax_pLDDT', 'Trajectory_one-hot_pLDDT', 'Trajectory_final_pLDDT', 'Trajectory_Contacts', 'Trajectory_Clashes', 'Trajectory_WrongHotspot']
63 | special_prefixes = ('Average_', '1_', '2_', '3_', '4_', '5_')
64 | tracked_filters = set()
65 |
66 | for key in data.keys():
67 | processed_name = key # Use the full key by default
68 |
69 | # Check if the key starts with any special prefixes
70 | for prefix in special_prefixes:
71 | if key.startswith(prefix):
72 | # Strip the prefix and use the remaining part
73 | processed_name = key.split('_', 1)[1]
74 | break
75 |
76 | # Handle 'InterfaceAAs' with appending amino acids
77 | if 'InterfaceAAs' in processed_name:
78 | # Generate 20 variations of 'InterfaceAAs' with amino acids appended
79 | amino_acids = 'ACDEFGHIKLMNPQRSTVWY'
80 | for aa in amino_acids:
81 | variant_name = f"InterfaceAAs_{aa}"
82 | if variant_name not in tracked_filters:
83 | names.append(variant_name)
84 | tracked_filters.add(variant_name)
85 | elif processed_name not in tracked_filters:
86 | # Add processed name if it hasn't been added before
87 | names.append(processed_name)
88 | tracked_filters.add(processed_name)
89 |
90 | # make dataframe with 0s
91 | df = pd.DataFrame(columns=names)
92 | df.loc[0] = [0] * len(names)
93 |
94 | df.to_csv(failure_csv, index=False)
95 |
96 | # update failure rates from trajectories and early predictions
97 | def update_failures(failure_csv, failure_column_or_dict):
98 | failure_df = pd.read_csv(failure_csv)
99 |
100 | def strip_model_prefix(name):
101 | # Strips the model-specific prefix if it exists
102 | parts = name.split('_')
103 | if parts[0].isdigit():
104 | return '_'.join(parts[1:])
105 | return name
106 |
107 | # update dictionary coming from complex prediction
108 | if isinstance(failure_column_or_dict, dict):
109 | # Update using a dictionary of failures
110 | for filter_name, count in failure_column_or_dict.items():
111 | stripped_name = strip_model_prefix(filter_name)
112 | if stripped_name in failure_df.columns:
113 | failure_df[stripped_name] += count
114 | else:
115 | failure_df[stripped_name] = count
116 | else:
117 | # Update a single column from trajectory generation
118 | failure_column = strip_model_prefix(failure_column_or_dict)
119 | if failure_column in failure_df.columns:
120 | failure_df[failure_column] += 1
121 | else:
122 | failure_df[failure_column] = 1
123 |
124 | failure_df.to_csv(failure_csv, index=False)
125 |
126 | # Check whether the maximum number of trajectories has been generated
127 | def check_n_trajectories(design_paths, advanced_settings):
128 | n_trajectories = [f for f in os.listdir(design_paths["Trajectory/Relaxed"]) if f.endswith('.pdb')]
129 |
130 | if advanced_settings["max_trajectories"] is not False and len(n_trajectories) >= advanced_settings["max_trajectories"]:
131 | print(f"Target number of {str(len(n_trajectories))} trajectories reached, stopping execution...")
132 | return True
133 | else:
134 | return False
135 |
136 | # Check if we have the required number of accepted designs, then rank them and copy them into the Ranked folder
137 | def check_accepted_designs(design_paths, mpnn_csv, final_labels, final_csv, advanced_settings, target_settings, design_labels):
138 | accepted_binders = [f for f in os.listdir(design_paths["Accepted"]) if f.endswith('.pdb')]
139 |
140 | if len(accepted_binders) >= target_settings["number_of_final_designs"]:
141 | print(f"Target number {str(len(accepted_binders))} of designs reached! Reranking...")
142 |
143 | # clear the Ranked folder in case we added new designs in the meantime so we rerank them all
144 | for f in os.listdir(design_paths["Accepted/Ranked"]):
145 | os.remove(os.path.join(design_paths["Accepted/Ranked"], f))
146 |
147 | # load dataframe of designed binders
148 | design_df = pd.read_csv(mpnn_csv)
149 | design_df = design_df.sort_values('Average_i_pTM', ascending=False)
150 |
151 | # create final csv dataframe to copy matched rows, initialize with the column labels
152 | final_df = pd.DataFrame(columns=final_labels)
153 |
154 | # check the ranking of the designs and copy them with new ranked IDs to the folder
155 | rank = 1
156 | for _, row in design_df.iterrows():
157 | for binder in accepted_binders:
158 |                 binder_name, model = binder.rsplit('_model', 1)  # local name, so target_settings["binder_name"] is not overwritten
159 |                 if binder_name == row['Design']:
160 |                     # rank and copy into ranked folder
161 |                     row_data = {'Rank': rank, **{label: row[label] for label in design_labels}}
162 |                     final_df = pd.concat([final_df, pd.DataFrame([row_data])], ignore_index=True)
163 |                     old_path = os.path.join(design_paths["Accepted"], binder)
164 |                     new_path = os.path.join(design_paths["Accepted/Ranked"], f"{rank}_{binder_name}_model{model.rsplit('.', 1)[0]}.pdb")
165 | shutil.copyfile(old_path, new_path)
166 |
167 | rank += 1
168 | break
169 |
170 | # save the final_df to final_csv
171 | final_df.to_csv(final_csv, index=False)
172 |
173 | # zip large folders to save space
174 | if advanced_settings["zip_animations"]:
175 | zip_and_empty_folder(design_paths["Trajectory/Animation"], '.html')
176 |
177 | if advanced_settings["zip_plots"]:
178 | zip_and_empty_folder(design_paths["Trajectory/Plots"], '.png')
179 |
180 | return True
181 |
182 | else:
183 | return False
184 |
185 | # Load required helicity value
186 | def load_helicity(advanced_settings):
187 | if advanced_settings["random_helicity"] is True:
188 | # will sample a random bias towards helicity
189 | helicity_value = round(np.random.uniform(-3, 1),2)
190 | elif advanced_settings["weights_helicity"] != 0:
191 | # using a preset helicity bias
192 | helicity_value = advanced_settings["weights_helicity"]
193 | else:
194 | # no bias towards helicity
195 | helicity_value = 0
196 | return helicity_value
197 |
198 | # Report JAX-capable devices
199 | def check_jax_gpu():
200 | devices = jax.devices()
201 |
202 | has_gpu = any(device.platform == 'gpu' for device in devices)
203 |
204 | if not has_gpu:
205 | print("No GPU device found, terminating.")
206 | exit()
207 | else:
208 | print("Available GPUs:")
209 | for i, device in enumerate(devices):
210 | print(f"{device.device_kind}{i + 1}: {device.platform}")
211 |
212 | # check all input files being passed
213 | def perform_input_check(args):
214 | # Get the directory of the current script
215 | binder_script_path = os.path.dirname(os.path.abspath(__file__))
216 |
217 | # Ensure settings file is provided
218 | if not args.settings:
219 | print("Error: --settings is required.")
220 | exit()
221 |
222 | # Set default filters.json path if not provided
223 | if not args.filters:
224 | args.filters = os.path.join(binder_script_path, 'settings_filters', 'default_filters.json')
225 |
226 |     # Set default advanced settings JSON path if not provided
227 | if not args.advanced:
228 | args.advanced = os.path.join(binder_script_path, 'settings_advanced', 'default_4stage_multimer.json')
229 |
230 | return args.settings, args.filters, args.advanced
231 |
232 | # check specific advanced settings
233 | def perform_advanced_settings_check(advanced_settings, bindcraft_folder):
234 | # set paths to model weights and executables
235 | if bindcraft_folder == "colab":
236 | advanced_settings["af_params_dir"] = '/content/bindcraft/params/'
237 | advanced_settings["dssp_path"] = '/content/bindcraft/functions/dssp'
238 | advanced_settings["dalphaball_path"] = '/content/bindcraft/functions/DAlphaBall.gcc'
239 | else:
240 | # Set paths individually if they are not already set
241 | if not advanced_settings["af_params_dir"]:
242 | advanced_settings["af_params_dir"] = bindcraft_folder
243 | if not advanced_settings["dssp_path"]:
244 | advanced_settings["dssp_path"] = os.path.join(bindcraft_folder, 'functions', 'dssp')
245 | if not advanced_settings["dalphaball_path"]:
246 | advanced_settings["dalphaball_path"] = os.path.join(bindcraft_folder, 'functions', 'DAlphaBall.gcc')
247 |
248 | # check formatting of omit_AAs setting
249 |     omit_aas = advanced_settings["omit_AAs"]
250 |     if omit_aas in [None, False, '']:
251 |         advanced_settings["omit_AAs"] = None
252 |     elif isinstance(omit_aas, str):
253 |         advanced_settings["omit_AAs"] = omit_aas.strip()
254 |
255 | return advanced_settings
256 |
257 | # Load settings from JSONs
258 | def load_json_settings(settings_json, filters_json, advanced_json):
259 | # load settings from json files
260 | with open(settings_json, 'r') as file:
261 | target_settings = json.load(file)
262 |
263 | with open(advanced_json, 'r') as file:
264 | advanced_settings = json.load(file)
265 |
266 | with open(filters_json, 'r') as file:
267 | filters = json.load(file)
268 |
269 | return target_settings, advanced_settings, filters
270 |
271 | # AF2 model settings, make sure non-overlapping models with template option are being used for design and re-prediction
272 | def load_af2_models(af_multimer_setting):
273 | if af_multimer_setting:
274 | design_models = [0,1,2,3,4]
275 | prediction_models = [0,1]
276 | multimer_validation = False
277 | else:
278 | design_models = [0,1]
279 | prediction_models = [0,1,2,3,4]
280 | multimer_validation = True
281 |
282 | return design_models, prediction_models, multimer_validation
283 |
284 | # create csv for insertion of data
285 | def create_dataframe(csv_file, columns):
286 | if not os.path.exists(csv_file):
287 | df = pd.DataFrame(columns=columns)
288 | df.to_csv(csv_file, index=False)
289 |
290 | # insert row of statistics into csv
291 | def insert_data(csv_file, data_array):
292 | df = pd.DataFrame([data_array])
293 | df.to_csv(csv_file, mode='a', header=False, index=False)
294 |
295 | # save generated sequence
296 | def save_fasta(design_name, sequence, design_paths):
297 | fasta_path = os.path.join(design_paths["MPNN/Sequences"], design_name+".fasta")
298 | with open(fasta_path,"w") as fasta:
299 | line = f'>{design_name}\n{sequence}'
300 | fasta.write(line+"\n")
301 |
302 | # clean unnecessary rosetta information from PDB
303 | def clean_pdb(pdb_file):
304 | # Read the pdb file and filter relevant lines
305 | with open(pdb_file, 'r') as f_in:
306 | relevant_lines = [line for line in f_in if line.startswith(('ATOM', 'HETATM', 'MODEL', 'TER', 'END', 'LINK'))]
307 |
308 | # Write the cleaned lines back to the original pdb file
309 | with open(pdb_file, 'w') as f_out:
310 | f_out.writelines(relevant_lines)
311 |
312 | def zip_and_empty_folder(folder_path, extension):
313 | folder_basename = os.path.basename(folder_path)
314 | zip_filename = os.path.join(os.path.dirname(folder_path), folder_basename + '.zip')
315 |
316 | # Open the zip file in 'a' mode to append if it exists, otherwise create a new one
317 | with zipfile.ZipFile(zip_filename, 'a', zipfile.ZIP_DEFLATED) as zipf:
318 | for file in os.listdir(folder_path):
319 | if file.endswith(extension):
320 | # Create an absolute path
321 | file_path = os.path.join(folder_path, file)
322 |                 # Add file to the zip archive; in append mode an existing entry with the same name is not replaced, a duplicate entry is added
323 | zipf.write(file_path, arcname=file)
324 | # Remove the file after adding it to the zip
325 | os.remove(file_path)
326 | print(f"Files in folder '{folder_path}' have been zipped and removed.")
327 |
328 | # calculate averages for statistics
329 | def calculate_averages(statistics, handle_aa=False):
330 | # Initialize a dictionary to hold the sums of each statistic
331 | sums = {}
332 | # Initialize a dictionary to hold the sums of each amino acid count
333 | aa_sums = {}
334 |
335 | # Iterate over the model numbers
336 | for model_num in range(1, 6): # assumes models are numbered 1 through 5
337 | # Check if the model's data exists
338 | if model_num in statistics:
339 | # Get the model's statistics
340 | model_stats = statistics[model_num]
341 | # For each statistic, add its value to the sum
342 | for stat, value in model_stats.items():
343 | # If this is the first time we've seen this statistic, initialize its sum to 0
344 | if stat not in sums:
345 | sums[stat] = 0
346 |
347 | if value is None:
348 | value = 0
349 |
350 |                 # If the statistic is InterfaceAAs and we're supposed to handle it separately, sum the per-amino-acid counts
351 | if handle_aa and stat == 'InterfaceAAs':
352 | for aa, count in value.items():
353 | # If this is the first time we've seen this amino acid, initialize its sum to 0
354 | if aa not in aa_sums:
355 | aa_sums[aa] = 0
356 | aa_sums[aa] += count
357 | else:
358 | sums[stat] += value
359 |
360 | # Now that we have the sums, we can calculate the averages
361 | averages = {stat: round(total / len(statistics), 2) for stat, total in sums.items()}
362 |
363 | # If we're handling aa counts, calculate their averages
364 | if handle_aa:
365 | aa_averages = {aa: round(total / len(statistics),2) for aa, total in aa_sums.items()}
366 | averages['InterfaceAAs'] = aa_averages
367 |
368 | return averages
369 |
370 | # filter designs based on feature thresholds
371 | def check_filters(mpnn_data, design_labels, filters):
372 | # check mpnn_data against labels
373 | mpnn_dict = {label: value for label, value in zip(design_labels, mpnn_data)}
374 |
375 | unmet_conditions = []
376 |
377 | # check filters against thresholds
378 | for label, conditions in filters.items():
379 | # special conditions for interface amino acid counts
380 | if label in ('Average_InterfaceAAs', '1_InterfaceAAs', '2_InterfaceAAs', '3_InterfaceAAs', '4_InterfaceAAs', '5_InterfaceAAs'):
381 | for aa, aa_conditions in conditions.items():
382 | if mpnn_dict.get(label) is None:
383 | continue
384 | value = mpnn_dict.get(label).get(aa)
385 | if value is None or aa_conditions["threshold"] is None:
386 | continue
387 | if aa_conditions["higher"]:
388 | if value < aa_conditions["threshold"]:
389 | unmet_conditions.append(f"{label}_{aa}")
390 | else:
391 | if value > aa_conditions["threshold"]:
392 | unmet_conditions.append(f"{label}_{aa}")
393 | else:
394 | # if no threshold, then skip
395 | value = mpnn_dict.get(label)
396 | if value is None or conditions["threshold"] is None:
397 | continue
398 | if conditions["higher"]:
399 | if value < conditions["threshold"]:
400 | unmet_conditions.append(label)
401 | else:
402 | if value > conditions["threshold"]:
403 | unmet_conditions.append(label)
404 |
405 | # if all filters are passed then return True
406 | if len(unmet_conditions) == 0:
407 | return True
408 | # if some filters were unmet, return the list of their labels
409 | else:
410 | return unmet_conditions
411 |
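The threshold logic in `check_filters` can be illustrated with a minimal sketch. It assumes the same `{"threshold": ..., "higher": ...}` layout used above. The helper `unmet_filters` is a hypothetical, condensed restatement of the scalar branch only (it skips the per-amino-acid `InterfaceAAs` branch), and the labels and thresholds are made-up illustration values, not the shipped defaults.

```python
def unmet_filters(values, filters):
    """Return the labels whose value violates its threshold (scalar filters only)."""
    unmet = []
    for label, cond in filters.items():
        value = values.get(label)
        # a missing value or a null threshold means the filter is skipped
        if value is None or cond["threshold"] is None:
            continue
        if cond["higher"] and value < cond["threshold"]:
            unmet.append(label)  # value was required to be at least the threshold
        elif not cond["higher"] and value > cond["threshold"]:
            unmet.append(label)  # value was required to be at most the threshold
    return unmet


# hypothetical labels and thresholds, for illustration only
example_filters = {
    "Average_pLDDT": {"threshold": 0.8, "higher": True},
    "Average_i_pAE": {"threshold": 0.35, "higher": False},
    "Average_dG": {"threshold": None, "higher": False},
}
example_values = {"Average_pLDDT": 0.75, "Average_i_pAE": 0.2, "Average_dG": -40.0}

failed = unmet_filters(example_values, example_filters)
print(failed)
```

Unlike this sketch, `check_filters` returns `True` when nothing fails and the list of unmet labels otherwise, so callers should compare the result against `True` explicitly rather than rely on truthiness.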
--------------------------------------------------------------------------------
/functions/pyrosetta_utils.py:
--------------------------------------------------------------------------------
1 | ####################################
2 | ################ PyRosetta functions
3 | ####################################
4 | ### Import dependencies
5 | import os
6 | import pyrosetta as pr
7 | from pyrosetta.rosetta.core.kinematics import MoveMap
8 | from pyrosetta.rosetta.core.select.residue_selector import ChainSelector
9 | from pyrosetta.rosetta.protocols.simple_moves import AlignChainMover
10 | from pyrosetta.rosetta.protocols.analysis import InterfaceAnalyzerMover
11 | from pyrosetta.rosetta.protocols.relax import FastRelax
12 | from pyrosetta.rosetta.core.simple_metrics.metrics import RMSDMetric
13 | from pyrosetta.rosetta.core.select import get_residues_from_subset
14 | from pyrosetta.rosetta.core.io import pose_from_pose
15 | from pyrosetta.rosetta.protocols.rosetta_scripts import XmlObjects
16 | from .generic_utils import clean_pdb
17 | from .biopython_utils import hotspot_residues
18 |
19 | # Rosetta interface scores
20 | def score_interface(pdb_file, binder_chain="B"):
21 | # load pose
22 | pose = pr.pose_from_pdb(pdb_file)
23 |
24 | # analyze interface statistics
25 | iam = InterfaceAnalyzerMover()
26 | iam.set_interface("A_B")
27 | scorefxn = pr.get_fa_scorefxn()
28 | iam.set_scorefunction(scorefxn)
29 | iam.set_compute_packstat(True)
30 | iam.set_compute_interface_energy(True)
31 | iam.set_calc_dSASA(True)
32 | iam.set_calc_hbond_sasaE(True)
33 | iam.set_compute_interface_sc(True)
34 | iam.set_pack_separated(True)
35 | iam.apply(pose)
36 |
37 | # Initialize dictionary with all amino acids
38 | interface_AA = {aa: 0 for aa in 'ACDEFGHIKLMNPQRSTVWY'}
39 |
40 | # Initialize list to store PDB residue IDs at the interface
41 | interface_residues_set = hotspot_residues(pdb_file, binder_chain)
42 | interface_residues_pdb_ids = []
43 |
44 | # Iterate over the interface residues
45 | for pdb_res_num, aa_type in interface_residues_set.items():
46 | # Increase the count for this amino acid type
47 | interface_AA[aa_type] += 1
48 |
49 | # Append the binder_chain and the PDB residue number to the list
50 | interface_residues_pdb_ids.append(f"{binder_chain}{pdb_res_num}")
51 |
52 | # count interface residues
53 | interface_nres = len(interface_residues_pdb_ids)
54 |
55 | # Convert the list into a comma-separated string
56 | interface_residues_pdb_ids_str = ','.join(interface_residues_pdb_ids)
57 |
58 | # Calculate the percentage of hydrophobic residues at the interface of the binder
59 | hydrophobic_aa = set('ACFILMPVWY')
60 | hydrophobic_count = sum(interface_AA[aa] for aa in hydrophobic_aa)
61 | if interface_nres != 0:
62 | interface_hydrophobicity = (hydrophobic_count / interface_nres) * 100
63 | else:
64 | interface_hydrophobicity = 0
65 |
66 | # retrieve statistics
67 | interfacescore = iam.get_all_data()
68 | interface_sc = interfacescore.sc_value # shape complementarity
69 | interface_interface_hbonds = interfacescore.interface_hbonds # number of interface H-bonds
70 | interface_dG = iam.get_interface_dG() # interface dG
71 | interface_dSASA = iam.get_interface_delta_sasa() # interface dSASA (interface surface area)
72 | interface_packstat = iam.get_interface_packstat() # interface pack stat score
73 | interface_dG_SASA_ratio = interfacescore.dG_dSASA_ratio * 100 # ratio of dG/dSASA (normalised energy for interface area size)
74 | buns_filter = XmlObjects.static_get_filter('')
75 | interface_delta_unsat_hbonds = buns_filter.report_sm(pose)
76 |
77 | if interface_nres != 0:
78 | interface_hbond_percentage = (interface_interface_hbonds / interface_nres) * 100 # Hbonds per interface size percentage
79 | interface_bunsch_percentage = (interface_delta_unsat_hbonds / interface_nres) * 100 # Buried unsaturated H-bonds per interface size percentage
80 | else:
81 | interface_hbond_percentage = None
82 | interface_bunsch_percentage = None
83 |
84 | # calculate binder energy score
85 | chain_design = ChainSelector(binder_chain)
86 | tem = pr.rosetta.core.simple_metrics.metrics.TotalEnergyMetric()
87 | tem.set_scorefunction(scorefxn)
88 | tem.set_residue_selector(chain_design)
89 | binder_score = tem.calculate(pose)
90 |
91 | # calculate binder SASA fraction
92 | bsasa = pr.rosetta.core.simple_metrics.metrics.SasaMetric()
93 | bsasa.set_residue_selector(chain_design)
94 | binder_sasa = bsasa.calculate(pose)
95 |
96 | if binder_sasa > 0:
97 | interface_binder_fraction = (interface_dSASA / binder_sasa) * 100
98 | else:
99 | interface_binder_fraction = 0
100 |
101 | # calculate surface hydrophobicity
102 | binder_pose = {pose.pdb_info().chain(pose.conformation().chain_begin(i)): p for i, p in zip(range(1, pose.num_chains()+1), pose.split_by_chain())}[binder_chain]
103 |
104 | layer_sel = pr.rosetta.core.select.residue_selector.LayerSelector()
105 | layer_sel.set_layers(pick_core = False, pick_boundary = False, pick_surface = True)
106 | surface_res = layer_sel.apply(binder_pose)
107 |
108 | exp_apol_count = 0
109 | total_count = 0
110 |
111 | # count apolar and aromatic residues at the surface
112 | for i in range(1, len(surface_res) + 1):
113 | if surface_res[i] == True:
114 | res = binder_pose.residue(i)
115 |
116 | # count apolar and aromatic residues as hydrophobic
117 | if res.is_apolar() == True or res.name() == 'PHE' or res.name() == 'TRP' or res.name() == 'TYR':
118 | exp_apol_count += 1
119 | total_count += 1
120 |
121 | surface_hydrophobicity = exp_apol_count / total_count if total_count > 0 else 0 # avoid division by zero when no surface residues are selected
122 |
123 | # output interface score array and amino acid counts at the interface
124 | interface_scores = {
125 | 'binder_score': binder_score,
126 | 'surface_hydrophobicity': surface_hydrophobicity,
127 | 'interface_sc': interface_sc,
128 | 'interface_packstat': interface_packstat,
129 | 'interface_dG': interface_dG,
130 | 'interface_dSASA': interface_dSASA,
131 | 'interface_dG_SASA_ratio': interface_dG_SASA_ratio,
132 | 'interface_fraction': interface_binder_fraction,
133 | 'interface_hydrophobicity': interface_hydrophobicity,
134 | 'interface_nres': interface_nres,
135 | 'interface_interface_hbonds': interface_interface_hbonds,
136 | 'interface_hbond_percentage': interface_hbond_percentage,
137 | 'interface_delta_unsat_hbonds': interface_delta_unsat_hbonds,
138 | 'interface_delta_unsat_hbonds_percentage': interface_bunsch_percentage
139 | }
140 |
141 | # round to two decimal places
142 | interface_scores = {k: round(v, 2) if isinstance(v, float) else v for k, v in interface_scores.items()}
143 |
144 | return interface_scores, interface_AA, interface_residues_pdb_ids_str
145 |
146 | # align pdbs to have same orientation
147 | def align_pdbs(reference_pdb, align_pdb, reference_chain_id, align_chain_id):
148 | # initiate poses
149 | reference_pose = pr.pose_from_pdb(reference_pdb)
150 | align_pose = pr.pose_from_pdb(align_pdb)
151 |
152 | align = AlignChainMover()
153 | align.pose(reference_pose)
154 |
155 | # If the chain IDs contain commas, split them and only take the first value
156 | reference_chain_id = reference_chain_id.split(',')[0]
157 | align_chain_id = align_chain_id.split(',')[0]
158 |
159 | # Get the chain number corresponding to the chain ID in the poses
160 | reference_chain = pr.rosetta.core.pose.get_chain_id_from_chain(reference_chain_id, reference_pose)
161 | align_chain = pr.rosetta.core.pose.get_chain_id_from_chain(align_chain_id, align_pose)
162 |
163 | align.source_chain(align_chain)
164 | align.target_chain(reference_chain)
165 | align.apply(align_pose)
166 |
167 | # Overwrite aligned pdb
168 | align_pose.dump_pdb(align_pdb)
169 | clean_pdb(align_pdb)
170 |
171 | # calculate the rmsd without alignment
172 | def unaligned_rmsd(reference_pdb, align_pdb, reference_chain_id, align_chain_id):
173 | reference_pose = pr.pose_from_pdb(reference_pdb)
174 | align_pose = pr.pose_from_pdb(align_pdb)
175 |
176 | # Define chain selectors for the reference and align chains
177 | reference_chain_selector = ChainSelector(reference_chain_id)
178 | align_chain_selector = ChainSelector(align_chain_id)
179 |
180 | # Apply selectors to get residue subsets
181 | reference_chain_subset = reference_chain_selector.apply(reference_pose)
182 | align_chain_subset = align_chain_selector.apply(align_pose)
183 |
184 | # Convert subsets to residue index vectors
185 | reference_residue_indices = get_residues_from_subset(reference_chain_subset)
186 | align_residue_indices = get_residues_from_subset(align_chain_subset)
187 |
188 | # Create empty subposes
189 | reference_chain_pose = pr.Pose()
190 | align_chain_pose = pr.Pose()
191 |
192 | # Fill subposes
193 | pose_from_pose(reference_chain_pose, reference_pose, reference_residue_indices)
194 | pose_from_pose(align_chain_pose, align_pose, align_residue_indices)
195 |
196 | # Calculate RMSD using the RMSDMetric
197 | rmsd_metric = RMSDMetric()
198 | rmsd_metric.set_comparison_pose(reference_chain_pose)
199 | rmsd = rmsd_metric.calculate(align_chain_pose)
200 |
201 | return round(rmsd, 2)
202 |
203 | # Relax designed structure
204 | def pr_relax(pdb_file, relaxed_pdb_path):
205 | if not os.path.exists(relaxed_pdb_path):
206 | # Generate pose
207 | pose = pr.pose_from_pdb(pdb_file)
208 | start_pose = pose.clone()
209 |
210 | ### Generate movemaps
211 | mmf = MoveMap()
212 | mmf.set_chi(True) # enable sidechain movement
213 | mmf.set_bb(True) # enable backbone movement, can be disabled to increase speed by 30% but makes metrics look worse on average
214 | mmf.set_jump(False) # disable whole chain movement
215 |
216 | # Run FastRelax
217 | fastrelax = FastRelax()
218 | scorefxn = pr.get_fa_scorefxn()
219 | fastrelax.set_scorefxn(scorefxn)
220 | fastrelax.set_movemap(mmf) # set MoveMap
221 | fastrelax.max_iter(200) # default iterations is 2500
222 | fastrelax.min_type("lbfgs_armijo_nonmonotone")
223 | fastrelax.constrain_relax_to_start_coords(True)
224 | fastrelax.apply(pose)
225 |
226 | # Align relaxed structure to original trajectory
227 | align = AlignChainMover()
228 | align.source_chain(0)
229 | align.target_chain(0)
230 | align.pose(start_pose)
231 | align.apply(pose)
232 |
233 | # Copy B factors from start_pose to pose
234 | for resid in range(1, pose.total_residue() + 1):
235 | if pose.residue(resid).is_protein():
236 | # Get the B factor of the first heavy atom in the residue
237 | bfactor = start_pose.pdb_info().bfactor(resid, 1)
238 | for atom_id in range(1, pose.residue(resid).natoms() + 1):
239 | pose.pdb_info().bfactor(resid, atom_id, bfactor)
240 |
241 | # output relaxed and aligned PDB
242 | pose.dump_pdb(relaxed_pdb_path)
243 | clean_pdb(relaxed_pdb_path)
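
The interface hydrophobicity step in `score_interface` does not depend on PyRosetta, so it can be sketched in isolation. The sketch below assumes, as the loop in `score_interface` implies, that `hotspot_residues` returns a dictionary mapping binder residue numbers to one-letter amino acid codes. The function name `interface_hydrophobicity` and the example residues are hypothetical.

```python
def interface_hydrophobicity(interface_residues):
    """Percentage of interface residues that fall in the hydrophobic set."""
    # count every interface residue by amino acid type
    counts = {aa: 0 for aa in "ACDEFGHIKLMNPQRSTVWY"}
    for aa in interface_residues.values():
        counts[aa] += 1

    n_interface = len(interface_residues)
    # same hydrophobic set as used in score_interface
    hydrophobic = sum(counts[aa] for aa in "ACFILMPVWY")

    # an empty interface is reported as 0 rather than dividing by zero
    return (hydrophobic / n_interface) * 100 if n_interface else 0


# hypothetical interface: residue number -> one-letter code
example_interface = {12: "L", 15: "F", 16: "K", 19: "E"}
pct = interface_hydrophobicity(example_interface)
print(pct)
```

Two of the four example residues (L and F) are in the hydrophobic set, so the sketch reports half of the interface as hydrophobic.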
--------------------------------------------------------------------------------
/install_bindcraft.sh:
--------------------------------------------------------------------------------
1 | #!/bin/bash
2 | ################## BindCraft installation script
3 | ################## specify conda/mamba folder, and installation folder for git repositories, and whether to use mamba or conda
4 | # Default value for pkg_manager
5 | pkg_manager='conda'
6 | cuda=''
7 |
8 | # Define the short and long options
9 | OPTIONS=p:c:
10 | LONGOPTIONS=pkg_manager:,cuda:
11 |
12 | # Parse the command-line options
13 | PARSED=$(getopt --options=$OPTIONS --longoptions=$LONGOPTIONS --name "$0" -- "$@")
14 | eval set -- "$PARSED"
15 |
16 | # Process the command-line options
17 | while true; do
18 | case "$1" in
19 | -p|--pkg_manager)
20 | pkg_manager="$2"
21 | shift 2
22 | ;;
23 | -c|--cuda)
24 | cuda="$2"
25 | shift 2
26 | ;;
27 | --)
28 | shift
29 | break
30 | ;;
31 | *)
32 | echo -e "Invalid option $1" >&2
33 | exit 1
34 | ;;
35 | esac
36 | done
37 |
38 | # Report the parsed options
39 | echo -e "Package manager: $pkg_manager"
40 | echo -e "CUDA: $cuda"
41 |
42 | ############################################################################################################
43 | ############################################################################################################
44 | ################## initialisation
45 | SECONDS=0
46 |
47 | # set paths needed for installation and check for conda installation
48 | install_dir=$(pwd)
49 | CONDA_BASE=$(conda info --base 2>/dev/null) || { echo -e "Error: conda is not installed or cannot be initialised."; exit 1; }
50 | echo -e "Conda is installed at: $CONDA_BASE"
51 |
52 | ### BindCraft install begin, create base environment
53 | echo -e "Installing BindCraft environment\n"
54 | $pkg_manager create --name BindCraft python=3.10 -y || { echo -e "Error: Failed to create BindCraft conda environment"; exit 1; }
55 | conda env list | grep -w 'BindCraft' >/dev/null 2>&1 || { echo -e "Error: Conda environment 'BindCraft' does not exist after creation."; exit 1; }
56 |
57 | # Load newly created BindCraft environment
58 | echo -e "Loading BindCraft environment\n"
59 | source ${CONDA_BASE}/bin/activate ${CONDA_BASE}/envs/BindCraft || { echo -e "Error: Failed to activate the BindCraft environment."; exit 1; }
60 | [ "$CONDA_DEFAULT_ENV" = "BindCraft" ] || { echo -e "Error: The BindCraft environment is not active."; exit 1; }
61 | echo -e "BindCraft environment activated at ${CONDA_BASE}/envs/BindCraft"
62 |
63 | # install required conda packages
64 | echo -e "Installing conda requirements\n"
65 | if [ -n "$cuda" ]; then
66 | CONDA_OVERRIDE_CUDA="$cuda" $pkg_manager install pip pandas matplotlib numpy"<2.0.0" biopython scipy pdbfixer seaborn libgfortran5 tqdm jupyter ffmpeg pyrosetta fsspec py3dmol chex dm-haiku flax"<0.10.0" dm-tree joblib ml-collections immutabledict optax jaxlib=*=*cuda* jax cuda-nvcc cudnn -c conda-forge -c nvidia --channel https://conda.graylab.jhu.edu -y || { echo -e "Error: Failed to install conda packages."; exit 1; }
67 | else
68 | $pkg_manager install pip pandas matplotlib numpy"<2.0.0" biopython scipy pdbfixer seaborn libgfortran5 tqdm jupyter ffmpeg pyrosetta fsspec py3dmol chex dm-haiku flax"<0.10.0" dm-tree joblib ml-collections immutabledict optax jaxlib jax cuda-nvcc cudnn -c conda-forge -c nvidia --channel https://conda.graylab.jhu.edu -y || { echo -e "Error: Failed to install conda packages."; exit 1; }
69 | fi
70 |
71 | # make sure all required packages were installed
72 | required_packages=(pip pandas libgfortran5 matplotlib numpy biopython scipy pdbfixer seaborn tqdm jupyter ffmpeg pyrosetta fsspec py3dmol chex dm-haiku dm-tree joblib ml-collections immutabledict optax jaxlib jax cuda-nvcc cudnn)
73 | missing_packages=()
74 |
75 | # Check each package
76 | for pkg in "${required_packages[@]}"; do
77 | conda list "$pkg" | grep -w "$pkg" >/dev/null 2>&1 || missing_packages+=("$pkg")
78 | done
79 |
80 | # If any packages are missing, output error and exit
81 | if [ ${#missing_packages[@]} -ne 0 ]; then
82 | echo -e "Error: The following packages are missing from the environment:"
83 | for pkg in "${missing_packages[@]}"; do
84 | echo -e " - $pkg"
85 | done
86 | exit 1
87 | fi
88 |
89 | # install ColabDesign
90 | echo -e "Installing ColabDesign\n"
91 | pip3 install git+https://github.com/sokrypton/ColabDesign.git --no-deps || { echo -e "Error: Failed to install ColabDesign"; exit 1; }
92 | python -c "import colabdesign" >/dev/null 2>&1 || { echo -e "Error: colabdesign module not found after installation"; exit 1; }
93 |
94 | # AlphaFold2 weights
95 | echo -e "Downloading AlphaFold2 model weights \n"
96 | params_dir="${install_dir}/params"
97 | params_file="${params_dir}/alphafold_params_2022-12-06.tar"
98 |
99 | # download AF2 weights
100 | mkdir -p "${params_dir}" || { echo -e "Error: Failed to create weights directory"; exit 1; }
101 | wget -O "${params_file}" "https://storage.googleapis.com/alphafold/alphafold_params_2022-12-06.tar" || { echo -e "Error: Failed to download AlphaFold2 weights"; exit 1; }
102 | [ -s "${params_file}" ] || { echo -e "Error: Could not locate downloaded AlphaFold2 weights"; exit 1; }
103 |
104 | # extract AF2 weights
105 | tar tf "${params_file}" >/dev/null 2>&1 || { echo -e "Error: Corrupt AlphaFold2 weights download"; exit 1; }
106 | tar -xvf "${params_file}" -C "${params_dir}" || { echo -e "Error: Failed to extract AlphaFold2 weights"; exit 1; }
107 | [ -f "${params_dir}/params_model_5_ptm.npz" ] || { echo -e "Error: Could not locate extracted AlphaFold2 weights"; exit 1; }
108 | rm "${params_file}" || { echo -e "Warning: Failed to remove AlphaFold2 weights archive"; }
109 |
110 | # chmod executables
111 | echo -e "Changing permissions for executables\n"
112 | chmod +x "${install_dir}/functions/dssp" || { echo -e "Error: Failed to chmod dssp"; exit 1; }
113 | chmod +x "${install_dir}/functions/DAlphaBall.gcc" || { echo -e "Error: Failed to chmod DAlphaBall.gcc"; exit 1; }
114 |
115 | # finish
116 | conda deactivate
117 | echo -e "BindCraft environment set up\n"
118 |
119 | ############################################################################################################
120 | ############################################################################################################
121 | ################## cleanup
122 | echo -e "Cleaning up ${pkg_manager} temporary files to save space\n"
123 | $pkg_manager clean -a -y
124 | echo -e "$pkg_manager cleaned up\n"
125 |
126 | ################## finish script
127 | t=$SECONDS
128 | echo -e "Successfully finished BindCraft installation!\n"
129 | echo -e "Activate environment using command: \"$pkg_manager activate BindCraft\""
130 | echo -e "\n"
131 | echo -e "Installation took $(($t / 3600)) hours, $((($t / 60) % 60)) minutes and $(($t % 60)) seconds."
--------------------------------------------------------------------------------
/pipeline.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/martinpacesa/BindCraft/477755f2cdd4077840ce51051749ddcf63a26862/pipeline.png
--------------------------------------------------------------------------------
/settings_advanced/betasheet_4stage_multimer.json:
--------------------------------------------------------------------------------
1 | {
2 | "omit_AAs": "C",
3 | "force_reject_AA": false,
4 | "use_multimer_design": true,
5 | "design_algorithm": "4stage",
6 | "sample_models": true,
7 | "rm_template_seq_design": false,
8 | "rm_template_seq_predict": false,
9 | "rm_template_sc_design": false,
10 | "rm_template_sc_predict": false,
11 | "predict_initial_guess": false,
12 | "predict_bigbang": false,
13 | "soft_iterations": 75,
14 | "temporary_iterations": 45,
15 | "hard_iterations": 5,
16 | "greedy_iterations": 15,
17 | "greedy_percentage": 5,
18 | "save_design_animations": true,
19 | "save_design_trajectory_plots": true,
20 | "weights_plddt": 0.15,
21 | "weights_pae_intra": 0.4,
22 | "weights_pae_inter": 0.1,
23 | "weights_con_intra": 0.4,
24 | "weights_con_inter": 0.5,
25 | "intra_contact_distance": 14.0,
26 | "inter_contact_distance": 20.0,
27 | "intra_contact_number": 2,
28 | "inter_contact_number": 2,
29 | "weights_helicity": -2.0,
30 | "random_helicity": false,
31 | "use_i_ptm_loss": true,
32 | "weights_iptm": 0.05,
33 | "use_rg_loss": true,
34 | "weights_rg": 0.3,
35 | "use_termini_distance_loss": false,
36 | "weights_termini_loss": 0.1,
37 | "enable_mpnn": true,
38 | "mpnn_fix_interface": true,
39 | "num_seqs": 20,
40 | "max_mpnn_sequences": 2,
41 | "sampling_temp": 0.1,
42 | "backbone_noise": 0.00,
43 | "model_path": "v_48_020",
44 | "mpnn_weights": "soluble",
45 | "save_mpnn_fasta": false,
46 | "num_recycles_design": 1,
47 | "num_recycles_validation": 3,
48 | "optimise_beta": true,
49 | "optimise_beta_extra_soft": 0,
50 | "optimise_beta_extra_temp": 0,
51 | "optimise_beta_recycles_design": 3,
52 | "optimise_beta_recycles_valid": 3,
53 | "remove_unrelaxed_trajectory": true,
54 | "remove_unrelaxed_complex": true,
55 | "remove_binder_monomer": true,
56 | "zip_animations": true,
57 | "zip_plots": true,
58 | "save_trajectory_pickle": false,
59 | "max_trajectories": false,
60 | "enable_rejection_check": true,
61 | "acceptance_rate": 0.01,
62 | "start_monitoring": 600,
63 | "af_params_dir": "",
64 | "dssp_path": "",
65 | "dalphaball_path": ""
66 | }
67 |
--------------------------------------------------------------------------------
/settings_advanced/betasheet_4stage_multimer_flexible.json:
--------------------------------------------------------------------------------
1 | {
2 | "omit_AAs": "C",
3 | "force_reject_AA": false,
4 | "use_multimer_design": true,
5 | "design_algorithm": "4stage",
6 | "sample_models": true,
7 | "rm_template_seq_design": true,
8 | "rm_template_seq_predict": true,
9 | "rm_template_sc_design": false,
10 | "rm_template_sc_predict": false,
11 | "predict_initial_guess": false,
12 | "predict_bigbang": false,
13 | "soft_iterations": 75,
14 | "temporary_iterations": 45,
15 | "hard_iterations": 5,
16 | "greedy_iterations": 15,
17 | "greedy_percentage": 5,
18 | "save_design_animations": true,
19 | "save_design_trajectory_plots": true,
20 | "weights_plddt": 0.15,
21 | "weights_pae_intra": 0.4,
22 | "weights_pae_inter": 0.1,
23 | "weights_con_intra": 0.4,
24 | "weights_con_inter": 0.5,
25 | "intra_contact_distance": 14.0,
26 | "inter_contact_distance": 20.0,
27 | "intra_contact_number": 2,
28 | "inter_contact_number": 2,
29 | "weights_helicity": -2.0,
30 | "random_helicity": false,
31 | "use_i_ptm_loss": true,
32 | "weights_iptm": 0.05,
33 | "use_rg_loss": true,
34 | "weights_rg": 0.3,
35 | "use_termini_distance_loss": false,
36 | "weights_termini_loss": 0.1,
37 | "enable_mpnn": true,
38 | "mpnn_fix_interface": true,
39 | "num_seqs": 20,
40 | "max_mpnn_sequences": 2,
41 | "sampling_temp": 0.1,
42 | "backbone_noise": 0.00,
43 | "model_path": "v_48_020",
44 | "mpnn_weights": "soluble",
45 | "save_mpnn_fasta": false,
46 | "num_recycles_design": 1,
47 | "num_recycles_validation": 3,
48 | "optimise_beta": true,
49 | "optimise_beta_extra_soft": 0,
50 | "optimise_beta_extra_temp": 0,
51 | "optimise_beta_recycles_design": 3,
52 | "optimise_beta_recycles_valid": 3,
53 | "remove_unrelaxed_trajectory": true,
54 | "remove_unrelaxed_complex": true,
55 | "remove_binder_monomer": true,
56 | "zip_animations": true,
57 | "zip_plots": true,
58 | "save_trajectory_pickle": false,
59 | "max_trajectories": false,
60 | "enable_rejection_check": true,
61 | "acceptance_rate": 0.01,
62 | "start_monitoring": 600,
63 | "af_params_dir": "",
64 | "dssp_path": "",
65 | "dalphaball_path": ""
66 | }
67 |
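
The advanced settings files share the same keys and differ in only a few values, so a small diff helper can show what a variant changes. The sketch below is a hypothetical helper, not part of the repository. To stay self-contained it inlines three keys copied from `betasheet_4stage_multimer.json` and `betasheet_4stage_multimer_flexible.json` rather than reading the files from disk.

```python
import json

# three keys copied from the two settings files above; the real files hold many more
base = json.loads(
    '{"rm_template_seq_design": false, "rm_template_seq_predict": false, "predict_initial_guess": false}'
)
flexible = json.loads(
    '{"rm_template_seq_design": true, "rm_template_seq_predict": true, "predict_initial_guess": false}'
)


def diff_settings(a, b):
    """Map each shared key whose value differs to an (a_value, b_value) pair."""
    return {key: (a[key], b[key]) for key in a if key in b and a[key] != b[key]}


changed = diff_settings(base, flexible)
for key, (old, new) in changed.items():
    print(f"{key}: {old} -> {new}")
```

For these two files the sketch reports only the two `rm_template_seq_*` flags as changed. With the full files, each dictionary could instead be loaded with `json.load` on an open file handle.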
--------------------------------------------------------------------------------
/settings_advanced/betasheet_4stage_multimer_flexible_hardtarget.json:
--------------------------------------------------------------------------------
1 | {
2 | "omit_AAs": "C",
3 | "force_reject_AA": false,
4 | "use_multimer_design": true,
5 | "design_algorithm": "4stage",
6 | "sample_models": true,
7 | "rm_template_seq_design": true,
8 | "rm_template_seq_predict": true,
9 | "rm_template_sc_design": false,
10 | "rm_template_sc_predict": false,
11 | "predict_initial_guess": true,
12 | "predict_bigbang": false,
13 | "soft_iterations": 75,
14 | "temporary_iterations": 45,
15 | "hard_iterations": 5,
16 | "greedy_iterations": 15,
17 | "greedy_percentage": 5,
18 | "save_design_animations": true,
19 | "save_design_trajectory_plots": true,
20 | "weights_plddt": 0.15,
21 | "weights_pae_intra": 0.4,
22 | "weights_pae_inter": 0.1,
23 | "weights_con_intra": 0.4,
24 | "weights_con_inter": 0.5,
25 | "intra_contact_distance": 14.0,
26 | "inter_contact_distance": 20.0,
27 | "intra_contact_number": 2,
28 | "inter_contact_number": 2,
29 | "weights_helicity": -2.0,
30 | "random_helicity": false,
31 | "use_i_ptm_loss": true,
32 | "weights_iptm": 0.05,
33 | "use_rg_loss": true,
34 | "weights_rg": 0.3,
35 | "use_termini_distance_loss": false,
36 | "weights_termini_loss": 0.1,
37 | "enable_mpnn": true,
38 | "mpnn_fix_interface": true,
39 | "num_seqs": 20,
40 | "max_mpnn_sequences": 2,
41 | "sampling_temp": 0.1,
42 | "backbone_noise": 0.00,
43 | "model_path": "v_48_020",
44 | "mpnn_weights": "soluble",
45 | "save_mpnn_fasta": false,
46 | "num_recycles_design": 1,
47 | "num_recycles_validation": 3,
48 | "optimise_beta": true,
49 | "optimise_beta_extra_soft": 0,
50 | "optimise_beta_extra_temp": 0,
51 | "optimise_beta_recycles_design": 3,
52 | "optimise_beta_recycles_valid": 3,
53 | "remove_unrelaxed_trajectory": true,
54 | "remove_unrelaxed_complex": true,
55 | "remove_binder_monomer": true,
56 | "zip_animations": true,
57 | "zip_plots": true,
58 | "save_trajectory_pickle": false,
59 | "max_trajectories": false,
60 | "enable_rejection_check": true,
61 | "acceptance_rate": 0.01,
62 | "start_monitoring": 600,
63 | "af_params_dir": "",
64 | "dssp_path": "",
65 | "dalphaball_path": ""
66 | }
67 |
--------------------------------------------------------------------------------
/settings_advanced/betasheet_4stage_multimer_hardtarget.json:
--------------------------------------------------------------------------------
1 | {
2 | "omit_AAs": "C",
3 | "force_reject_AA": false,
4 | "use_multimer_design": true,
5 | "design_algorithm": "4stage",
6 | "sample_models": true,
7 | "rm_template_seq_design": false,
8 | "rm_template_seq_predict": false,
9 | "rm_template_sc_design": false,
10 | "rm_template_sc_predict": false,
11 | "predict_initial_guess": true,
12 | "predict_bigbang": false,
13 | "soft_iterations": 75,
14 | "temporary_iterations": 45,
15 | "hard_iterations": 5,
16 | "greedy_iterations": 15,
17 | "greedy_percentage": 5,
18 | "save_design_animations": true,
19 | "save_design_trajectory_plots": true,
20 | "weights_plddt": 0.15,
21 | "weights_pae_intra": 0.4,
22 | "weights_pae_inter": 0.1,
23 | "weights_con_intra": 0.4,
24 | "weights_con_inter": 0.5,
25 | "intra_contact_distance": 14.0,
26 | "inter_contact_distance": 20.0,
27 | "intra_contact_number": 2,
28 | "inter_contact_number": 2,
29 | "weights_helicity": -2.0,
30 | "random_helicity": false,
31 | "use_i_ptm_loss": true,
32 | "weights_iptm": 0.05,
33 | "use_rg_loss": true,
34 | "weights_rg": 0.3,
35 | "use_termini_distance_loss": false,
36 | "weights_termini_loss": 0.1,
37 | "enable_mpnn": true,
38 | "mpnn_fix_interface": true,
39 | "num_seqs": 20,
40 | "max_mpnn_sequences": 2,
41 | "sampling_temp": 0.1,
42 | "backbone_noise": 0.00,
43 | "model_path": "v_48_020",
44 | "mpnn_weights": "soluble",
45 | "save_mpnn_fasta": false,
46 | "num_recycles_design": 1,
47 | "num_recycles_validation": 3,
48 | "optimise_beta": true,
49 | "optimise_beta_extra_soft": 0,
50 | "optimise_beta_extra_temp": 0,
51 | "optimise_beta_recycles_design": 3,
52 | "optimise_beta_recycles_valid": 3,
53 | "remove_unrelaxed_trajectory": true,
54 | "remove_unrelaxed_complex": true,
55 | "remove_binder_monomer": true,
56 | "zip_animations": true,
57 | "zip_plots": true,
58 | "save_trajectory_pickle": false,
59 | "max_trajectories": false,
60 | "enable_rejection_check": true,
61 | "acceptance_rate": 0.01,
62 | "start_monitoring": 600,
63 | "af_params_dir": "",
64 | "dssp_path": "",
65 | "dalphaball_path": ""
66 | }
67 |
--------------------------------------------------------------------------------
/settings_advanced/betasheet_4stage_multimer_mpnn.json:
--------------------------------------------------------------------------------
1 | {
2 | "omit_AAs": "C",
3 | "force_reject_AA": false,
4 | "use_multimer_design": true,
5 | "design_algorithm": "4stage",
6 | "sample_models": true,
7 | "rm_template_seq_design": false,
8 | "rm_template_seq_predict": false,
9 | "rm_template_sc_design": false,
10 | "rm_template_sc_predict": false,
11 | "predict_initial_guess": false,
12 | "predict_bigbang": false,
13 | "soft_iterations": 75,
14 | "temporary_iterations": 45,
15 | "hard_iterations": 5,
16 | "greedy_iterations": 15,
17 | "greedy_percentage": 5,
18 | "save_design_animations": true,
19 | "save_design_trajectory_plots": true,
20 | "weights_plddt": 0.15,
21 | "weights_pae_intra": 0.4,
22 | "weights_pae_inter": 0.1,
23 | "weights_con_intra": 0.4,
24 | "weights_con_inter": 0.5,
25 | "intra_contact_distance": 14.0,
26 | "inter_contact_distance": 20.0,
27 | "intra_contact_number": 2,
28 | "inter_contact_number": 2,
29 | "weights_helicity": -2.0,
30 | "random_helicity": false,
31 | "use_i_ptm_loss": true,
32 | "weights_iptm": 0.05,
33 | "use_rg_loss": true,
34 | "weights_rg": 0.3,
35 | "use_termini_distance_loss": false,
36 | "weights_termini_loss": 0.1,
37 | "enable_mpnn": true,
38 | "mpnn_fix_interface": false,
39 | "num_seqs": 20,
40 | "max_mpnn_sequences": 2,
41 | "sampling_temp": 0.1,
42 | "backbone_noise": 0.00,
43 | "model_path": "v_48_020",
44 | "mpnn_weights": "soluble",
45 | "save_mpnn_fasta": false,
46 | "num_recycles_design": 1,
47 | "num_recycles_validation": 3,
48 | "optimise_beta": true,
49 | "optimise_beta_extra_soft": 0,
50 | "optimise_beta_extra_temp": 0,
51 | "optimise_beta_recycles_design": 3,
52 | "optimise_beta_recycles_valid": 3,
53 | "remove_unrelaxed_trajectory": true,
54 | "remove_unrelaxed_complex": true,
55 | "remove_binder_monomer": true,
56 | "zip_animations": true,
57 | "zip_plots": true,
58 | "save_trajectory_pickle": false,
59 | "max_trajectories": false,
60 | "enable_rejection_check": true,
61 | "acceptance_rate": 0.01,
62 | "start_monitoring": 600,
63 | "af_params_dir": "",
64 | "dssp_path": "",
65 | "dalphaball_path": ""
66 | }
67 |
--------------------------------------------------------------------------------
/settings_advanced/betasheet_4stage_multimer_mpnn_flexible.json:
--------------------------------------------------------------------------------
1 | {
2 | "omit_AAs": "C",
3 | "force_reject_AA": false,
4 | "use_multimer_design": true,
5 | "design_algorithm": "4stage",
6 | "sample_models": true,
7 | "rm_template_seq_design": true,
8 | "rm_template_seq_predict": true,
9 | "rm_template_sc_design": false,
10 | "rm_template_sc_predict": false,
11 | "predict_initial_guess": false,
12 | "predict_bigbang": false,
13 | "soft_iterations": 75,
14 | "temporary_iterations": 45,
15 | "hard_iterations": 5,
16 | "greedy_iterations": 15,
17 | "greedy_percentage": 5,
18 | "save_design_animations": true,
19 | "save_design_trajectory_plots": true,
20 | "weights_plddt": 0.15,
21 | "weights_pae_intra": 0.4,
22 | "weights_pae_inter": 0.1,
23 | "weights_con_intra": 0.4,
24 | "weights_con_inter": 0.5,
25 | "intra_contact_distance": 14.0,
26 | "inter_contact_distance": 20.0,
27 | "intra_contact_number": 2,
28 | "inter_contact_number": 2,
29 | "weights_helicity": -2.0,
30 | "random_helicity": false,
31 | "use_i_ptm_loss": true,
32 | "weights_iptm": 0.05,
33 | "use_rg_loss": true,
34 | "weights_rg": 0.3,
35 | "use_termini_distance_loss": false,
36 | "weights_termini_loss": 0.1,
37 | "enable_mpnn": true,
38 | "mpnn_fix_interface": false,
39 | "num_seqs": 20,
40 | "max_mpnn_sequences": 2,
41 | "sampling_temp": 0.1,
42 | "backbone_noise": 0.00,
43 | "model_path": "v_48_020",
44 | "mpnn_weights": "soluble",
45 | "save_mpnn_fasta": false,
46 | "num_recycles_design": 1,
47 | "num_recycles_validation": 3,
48 | "optimise_beta": true,
49 | "optimise_beta_extra_soft": 0,
50 | "optimise_beta_extra_temp": 0,
51 | "optimise_beta_recycles_design": 3,
52 | "optimise_beta_recycles_valid": 3,
53 | "remove_unrelaxed_trajectory": true,
54 | "remove_unrelaxed_complex": true,
55 | "remove_binder_monomer": true,
56 | "zip_animations": true,
57 | "zip_plots": true,
58 | "save_trajectory_pickle": false,
59 | "max_trajectories": false,
60 | "enable_rejection_check": true,
61 | "acceptance_rate": 0.01,
62 | "start_monitoring": 600,
63 | "af_params_dir": "",
64 | "dssp_path": "",
65 | "dalphaball_path": ""
66 | }
67 |
--------------------------------------------------------------------------------
/settings_advanced/betasheet_4stage_multimer_mpnn_flexible_hardtarget.json:
--------------------------------------------------------------------------------
1 | {
2 | "omit_AAs": "C",
3 | "force_reject_AA": false,
4 | "use_multimer_design": true,
5 | "design_algorithm": "4stage",
6 | "sample_models": true,
7 | "rm_template_seq_design": true,
8 | "rm_template_seq_predict": true,
9 | "rm_template_sc_design": false,
10 | "rm_template_sc_predict": false,
11 | "predict_initial_guess": true,
12 | "predict_bigbang": false,
13 | "soft_iterations": 75,
14 | "temporary_iterations": 45,
15 | "hard_iterations": 5,
16 | "greedy_iterations": 15,
17 | "greedy_percentage": 5,
18 | "save_design_animations": true,
19 | "save_design_trajectory_plots": true,
20 | "weights_plddt": 0.15,
21 | "weights_pae_intra": 0.4,
22 | "weights_pae_inter": 0.1,
23 | "weights_con_intra": 0.4,
24 | "weights_con_inter": 0.5,
25 | "intra_contact_distance": 14.0,
26 | "inter_contact_distance": 20.0,
27 | "intra_contact_number": 2,
28 | "inter_contact_number": 2,
29 | "weights_helicity": -2.0,
30 | "random_helicity": false,
31 | "use_i_ptm_loss": true,
32 | "weights_iptm": 0.05,
33 | "use_rg_loss": true,
34 | "weights_rg": 0.3,
35 | "use_termini_distance_loss": false,
36 | "weights_termini_loss": 0.1,
37 | "enable_mpnn": true,
38 | "mpnn_fix_interface": false,
39 | "num_seqs": 20,
40 | "max_mpnn_sequences": 2,
41 | "sampling_temp": 0.1,
42 | "backbone_noise": 0.00,
43 | "model_path": "v_48_020",
44 | "mpnn_weights": "soluble",
45 | "save_mpnn_fasta": false,
46 | "num_recycles_design": 1,
47 | "num_recycles_validation": 3,
48 | "optimise_beta": true,
49 | "optimise_beta_extra_soft": 0,
50 | "optimise_beta_extra_temp": 0,
51 | "optimise_beta_recycles_design": 3,
52 | "optimise_beta_recycles_valid": 3,
53 | "remove_unrelaxed_trajectory": true,
54 | "remove_unrelaxed_complex": true,
55 | "remove_binder_monomer": true,
56 | "zip_animations": true,
57 | "zip_plots": true,
58 | "save_trajectory_pickle": false,
59 | "max_trajectories": false,
60 | "enable_rejection_check": true,
61 | "acceptance_rate": 0.01,
62 | "start_monitoring": 600,
63 | "af_params_dir": "",
64 | "dssp_path": "",
65 | "dalphaball_path": ""
66 | }
67 |
--------------------------------------------------------------------------------
/settings_advanced/betasheet_4stage_multimer_mpnn_hardtarget.json:
--------------------------------------------------------------------------------
1 | {
2 | "omit_AAs": "C",
3 | "force_reject_AA": false,
4 | "use_multimer_design": true,
5 | "design_algorithm": "4stage",
6 | "sample_models": true,
7 | "rm_template_seq_design": false,
8 | "rm_template_seq_predict": false,
9 | "rm_template_sc_design": false,
10 | "rm_template_sc_predict": false,
11 | "predict_initial_guess": true,
12 | "predict_bigbang": false,
13 | "soft_iterations": 75,
14 | "temporary_iterations": 45,
15 | "hard_iterations": 5,
16 | "greedy_iterations": 15,
17 | "greedy_percentage": 5,
18 | "save_design_animations": true,
19 | "save_design_trajectory_plots": true,
20 | "weights_plddt": 0.15,
21 | "weights_pae_intra": 0.4,
22 | "weights_pae_inter": 0.1,
23 | "weights_con_intra": 0.4,
24 | "weights_con_inter": 0.5,
25 | "intra_contact_distance": 14.0,
26 | "inter_contact_distance": 20.0,
27 | "intra_contact_number": 2,
28 | "inter_contact_number": 2,
29 | "weights_helicity": -2.0,
30 | "random_helicity": false,
31 | "use_i_ptm_loss": true,
32 | "weights_iptm": 0.05,
33 | "use_rg_loss": true,
34 | "weights_rg": 0.3,
35 | "use_termini_distance_loss": false,
36 | "weights_termini_loss": 0.1,
37 | "enable_mpnn": true,
38 | "mpnn_fix_interface": false,
39 | "num_seqs": 20,
40 | "max_mpnn_sequences": 2,
41 | "sampling_temp": 0.1,
42 | "backbone_noise": 0.00,
43 | "model_path": "v_48_020",
44 | "mpnn_weights": "soluble",
45 | "save_mpnn_fasta": false,
46 | "num_recycles_design": 1,
47 | "num_recycles_validation": 3,
48 | "optimise_beta": true,
49 | "optimise_beta_extra_soft": 0,
50 | "optimise_beta_extra_temp": 0,
51 | "optimise_beta_recycles_design": 3,
52 | "optimise_beta_recycles_valid": 3,
53 | "remove_unrelaxed_trajectory": true,
54 | "remove_unrelaxed_complex": true,
55 | "remove_binder_monomer": true,
56 | "zip_animations": true,
57 | "zip_plots": true,
58 | "save_trajectory_pickle": false,
59 | "max_trajectories": false,
60 | "enable_rejection_check": true,
61 | "acceptance_rate": 0.01,
62 | "start_monitoring": 600,
63 | "af_params_dir": "",
64 | "dssp_path": "",
65 | "dalphaball_path": ""
66 | }
67 |
--------------------------------------------------------------------------------
/settings_advanced/default_4stage_multimer.json:
--------------------------------------------------------------------------------
1 | {
2 | "omit_AAs": "C",
3 | "force_reject_AA": false,
4 | "use_multimer_design": true,
5 | "design_algorithm": "4stage",
6 | "sample_models": true,
7 | "rm_template_seq_design": false,
8 | "rm_template_seq_predict": false,
9 | "rm_template_sc_design": false,
10 | "rm_template_sc_predict": false,
11 | "predict_initial_guess": false,
12 | "predict_bigbang": false,
13 | "soft_iterations": 75,
14 | "temporary_iterations": 45,
15 | "hard_iterations": 5,
16 | "greedy_iterations": 15,
17 | "greedy_percentage": 5,
18 | "save_design_animations": true,
19 | "save_design_trajectory_plots": true,
20 | "weights_plddt": 0.1,
21 | "weights_pae_intra": 0.4,
22 | "weights_pae_inter": 0.1,
23 | "weights_con_intra": 1.0,
24 | "weights_con_inter": 1.0,
25 | "intra_contact_distance": 14.0,
26 | "inter_contact_distance": 20.0,
27 | "intra_contact_number": 2,
28 | "inter_contact_number": 2,
29 | "weights_helicity": -0.3,
30 | "random_helicity": false,
31 | "use_i_ptm_loss": true,
32 | "weights_iptm": 0.05,
33 | "use_rg_loss": true,
34 | "weights_rg": 0.3,
35 | "use_termini_distance_loss": false,
36 | "weights_termini_loss": 0.1,
37 | "enable_mpnn": true,
38 | "mpnn_fix_interface": true,
39 | "num_seqs": 20,
40 | "max_mpnn_sequences": 2,
41 | "sampling_temp": 0.1,
42 | "backbone_noise": 0.00,
43 | "model_path": "v_48_020",
44 | "mpnn_weights": "soluble",
45 | "save_mpnn_fasta": false,
46 | "num_recycles_design": 1,
47 | "num_recycles_validation": 3,
48 | "optimise_beta": true,
49 | "optimise_beta_extra_soft": 0,
50 | "optimise_beta_extra_temp": 0,
51 | "optimise_beta_recycles_design": 3,
52 | "optimise_beta_recycles_valid": 3,
53 | "remove_unrelaxed_trajectory": true,
54 | "remove_unrelaxed_complex": true,
55 | "remove_binder_monomer": true,
56 | "zip_animations": true,
57 | "zip_plots": true,
58 | "save_trajectory_pickle": false,
59 | "max_trajectories": false,
60 | "enable_rejection_check": true,
61 | "acceptance_rate": 0.01,
62 | "start_monitoring": 600,
63 | "af_params_dir": "",
64 | "dssp_path": "",
65 | "dalphaball_path": ""
66 | }
67 |
--------------------------------------------------------------------------------
/settings_advanced/default_4stage_multimer_flexible.json:
--------------------------------------------------------------------------------
1 | {
2 | "omit_AAs": "C",
3 | "force_reject_AA": false,
4 | "use_multimer_design": true,
5 | "design_algorithm": "4stage",
6 | "sample_models": true,
7 | "rm_template_seq_design": true,
8 | "rm_template_seq_predict": true,
9 | "rm_template_sc_design": false,
10 | "rm_template_sc_predict": false,
11 | "predict_initial_guess": false,
12 | "predict_bigbang": false,
13 | "soft_iterations": 75,
14 | "temporary_iterations": 45,
15 | "hard_iterations": 5,
16 | "greedy_iterations": 15,
17 | "greedy_percentage": 5,
18 | "save_design_animations": true,
19 | "save_design_trajectory_plots": true,
20 | "weights_plddt": 0.1,
21 | "weights_pae_intra": 0.4,
22 | "weights_pae_inter": 0.1,
23 | "weights_con_intra": 1.0,
24 | "weights_con_inter": 1.0,
25 | "intra_contact_distance": 14.0,
26 | "inter_contact_distance": 20.0,
27 | "intra_contact_number": 2,
28 | "inter_contact_number": 2,
29 | "weights_helicity": -0.3,
30 | "random_helicity": false,
31 | "use_i_ptm_loss": true,
32 | "weights_iptm": 0.05,
33 | "use_rg_loss": true,
34 | "weights_rg": 0.3,
35 | "use_termini_distance_loss": false,
36 | "weights_termini_loss": 0.1,
37 | "enable_mpnn": true,
38 | "mpnn_fix_interface": true,
39 | "num_seqs": 20,
40 | "max_mpnn_sequences": 2,
41 | "sampling_temp": 0.1,
42 | "backbone_noise": 0.00,
43 | "model_path": "v_48_020",
44 | "mpnn_weights": "soluble",
45 | "save_mpnn_fasta": false,
46 | "num_recycles_design": 1,
47 | "num_recycles_validation": 3,
48 | "optimise_beta": true,
49 | "optimise_beta_extra_soft": 0,
50 | "optimise_beta_extra_temp": 0,
51 | "optimise_beta_recycles_design": 3,
52 | "optimise_beta_recycles_valid": 3,
53 | "remove_unrelaxed_trajectory": true,
54 | "remove_unrelaxed_complex": true,
55 | "remove_binder_monomer": true,
56 | "zip_animations": true,
57 | "zip_plots": true,
58 | "save_trajectory_pickle": false,
59 | "max_trajectories": false,
60 | "enable_rejection_check": true,
61 | "acceptance_rate": 0.01,
62 | "start_monitoring": 300,
63 | "af_params_dir": "",
64 | "dssp_path": "",
65 | "dalphaball_path": ""
66 | }
67 |
--------------------------------------------------------------------------------
/settings_advanced/default_4stage_multimer_flexible_hardtarget.json:
--------------------------------------------------------------------------------
1 | {
2 | "omit_AAs": "C",
3 | "force_reject_AA": false,
4 | "use_multimer_design": true,
5 | "design_algorithm": "4stage",
6 | "sample_models": true,
7 | "rm_template_seq_design": true,
8 | "rm_template_seq_predict": true,
9 | "rm_template_sc_design": false,
10 | "rm_template_sc_predict": false,
11 | "predict_initial_guess": true,
12 | "predict_bigbang": false,
13 | "soft_iterations": 75,
14 | "temporary_iterations": 45,
15 | "hard_iterations": 5,
16 | "greedy_iterations": 15,
17 | "greedy_percentage": 5,
18 | "save_design_animations": true,
19 | "save_design_trajectory_plots": true,
20 | "weights_plddt": 0.1,
21 | "weights_pae_intra": 0.4,
22 | "weights_pae_inter": 0.1,
23 | "weights_con_intra": 1.0,
24 | "weights_con_inter": 1.0,
25 | "intra_contact_distance": 14.0,
26 | "inter_contact_distance": 20.0,
27 | "intra_contact_number": 2,
28 | "inter_contact_number": 2,
29 | "weights_helicity": -0.3,
30 | "random_helicity": false,
31 | "use_i_ptm_loss": true,
32 | "weights_iptm": 0.05,
33 | "use_rg_loss": true,
34 | "weights_rg": 0.3,
35 | "use_termini_distance_loss": false,
36 | "weights_termini_loss": 0.1,
37 | "enable_mpnn": true,
38 | "mpnn_fix_interface": true,
39 | "num_seqs": 20,
40 | "max_mpnn_sequences": 2,
41 | "sampling_temp": 0.1,
42 | "backbone_noise": 0.00,
43 | "model_path": "v_48_020",
44 | "mpnn_weights": "soluble",
45 | "save_mpnn_fasta": false,
46 | "num_recycles_design": 1,
47 | "num_recycles_validation": 3,
48 | "optimise_beta": true,
49 | "optimise_beta_extra_soft": 0,
50 | "optimise_beta_extra_temp": 0,
51 | "optimise_beta_recycles_design": 3,
52 | "optimise_beta_recycles_valid": 3,
53 | "remove_unrelaxed_trajectory": true,
54 | "remove_unrelaxed_complex": true,
55 | "remove_binder_monomer": true,
56 | "zip_animations": true,
57 | "zip_plots": true,
58 | "save_trajectory_pickle": false,
59 | "max_trajectories": false,
60 | "enable_rejection_check": true,
61 | "acceptance_rate": 0.01,
62 | "start_monitoring": 300,
63 | "af_params_dir": "",
64 | "dssp_path": "",
65 | "dalphaball_path": ""
66 | }
67 |
--------------------------------------------------------------------------------
/settings_advanced/default_4stage_multimer_hardtarget.json:
--------------------------------------------------------------------------------
1 | {
2 | "omit_AAs": "C",
3 | "force_reject_AA": false,
4 | "use_multimer_design": true,
5 | "design_algorithm": "4stage",
6 | "sample_models": true,
7 | "rm_template_seq_design": false,
8 | "rm_template_seq_predict": false,
9 | "rm_template_sc_design": false,
10 | "rm_template_sc_predict": false,
11 | "predict_initial_guess": true,
12 | "predict_bigbang": false,
13 | "soft_iterations": 75,
14 | "temporary_iterations": 45,
15 | "hard_iterations": 5,
16 | "greedy_iterations": 15,
17 | "greedy_percentage": 5,
18 | "save_design_animations": true,
19 | "save_design_trajectory_plots": true,
20 | "weights_plddt": 0.1,
21 | "weights_pae_intra": 0.4,
22 | "weights_pae_inter": 0.1,
23 | "weights_con_intra": 1.0,
24 | "weights_con_inter": 1.0,
25 | "intra_contact_distance": 14.0,
26 | "inter_contact_distance": 20.0,
27 | "intra_contact_number": 2,
28 | "inter_contact_number": 2,
29 | "weights_helicity": -0.3,
30 | "random_helicity": false,
31 | "use_i_ptm_loss": true,
32 | "weights_iptm": 0.05,
33 | "use_rg_loss": true,
34 | "weights_rg": 0.3,
35 | "use_termini_distance_loss": false,
36 | "weights_termini_loss": 0.1,
37 | "enable_mpnn": true,
38 | "mpnn_fix_interface": true,
39 | "num_seqs": 20,
40 | "max_mpnn_sequences": 2,
41 | "sampling_temp": 0.1,
42 | "backbone_noise": 0.00,
43 | "model_path": "v_48_020",
44 | "mpnn_weights": "soluble",
45 | "save_mpnn_fasta": false,
46 | "num_recycles_design": 1,
47 | "num_recycles_validation": 3,
48 | "optimise_beta": true,
49 | "optimise_beta_extra_soft": 0,
50 | "optimise_beta_extra_temp": 0,
51 | "optimise_beta_recycles_design": 3,
52 | "optimise_beta_recycles_valid": 3,
53 | "remove_unrelaxed_trajectory": true,
54 | "remove_unrelaxed_complex": true,
55 | "remove_binder_monomer": true,
56 | "zip_animations": true,
57 | "zip_plots": true,
58 | "save_trajectory_pickle": false,
59 | "max_trajectories": false,
60 | "enable_rejection_check": true,
61 | "acceptance_rate": 0.01,
62 | "start_monitoring": 600,
63 | "af_params_dir": "",
64 | "dssp_path": "",
65 | "dalphaball_path": ""
66 | }
67 |
--------------------------------------------------------------------------------
/settings_advanced/default_4stage_multimer_mpnn.json:
--------------------------------------------------------------------------------
1 | {
2 | "omit_AAs": "C",
3 | "force_reject_AA": false,
4 | "use_multimer_design": true,
5 | "design_algorithm": "4stage",
6 | "sample_models": true,
7 | "rm_template_seq_design": false,
8 | "rm_template_seq_predict": false,
9 | "rm_template_sc_design": false,
10 | "rm_template_sc_predict": false,
11 | "predict_initial_guess": false,
12 | "predict_bigbang": false,
13 | "soft_iterations": 75,
14 | "temporary_iterations": 45,
15 | "hard_iterations": 5,
16 | "greedy_iterations": 15,
17 | "greedy_percentage": 5,
18 | "save_design_animations": true,
19 | "save_design_trajectory_plots": true,
20 | "weights_plddt": 0.1,
21 | "weights_pae_intra": 0.4,
22 | "weights_pae_inter": 0.1,
23 | "weights_con_intra": 1.0,
24 | "weights_con_inter": 1.0,
25 | "intra_contact_distance": 14.0,
26 | "inter_contact_distance": 20.0,
27 | "intra_contact_number": 2,
28 | "inter_contact_number": 2,
29 | "weights_helicity": -0.3,
30 | "random_helicity": false,
31 | "use_i_ptm_loss": true,
32 | "weights_iptm": 0.05,
33 | "use_rg_loss": true,
34 | "weights_rg": 0.3,
35 | "use_termini_distance_loss": false,
36 | "weights_termini_loss": 0.1,
37 | "enable_mpnn": true,
38 | "mpnn_fix_interface": false,
39 | "num_seqs": 20,
40 | "max_mpnn_sequences": 2,
41 | "sampling_temp": 0.1,
42 | "backbone_noise": 0.00,
43 | "model_path": "v_48_020",
44 | "mpnn_weights": "soluble",
45 | "save_mpnn_fasta": false,
46 | "num_recycles_design": 1,
47 | "num_recycles_validation": 3,
48 | "optimise_beta": true,
49 | "optimise_beta_extra_soft": 0,
50 | "optimise_beta_extra_temp": 0,
51 | "optimise_beta_recycles_design": 3,
52 | "optimise_beta_recycles_valid": 3,
53 | "remove_unrelaxed_trajectory": true,
54 | "remove_unrelaxed_complex": true,
55 | "remove_binder_monomer": true,
56 | "zip_animations": true,
57 | "zip_plots": true,
58 | "save_trajectory_pickle": false,
59 | "max_trajectories": false,
60 | "enable_rejection_check": true,
61 | "acceptance_rate": 0.01,
62 | "start_monitoring": 300,
63 | "af_params_dir": "",
64 | "dssp_path": "",
65 | "dalphaball_path": ""
66 | }
67 |
--------------------------------------------------------------------------------
/settings_advanced/default_4stage_multimer_mpnn_flexible.json:
--------------------------------------------------------------------------------
1 | {
2 | "omit_AAs": "C",
3 | "force_reject_AA": false,
4 | "use_multimer_design": true,
5 | "design_algorithm": "4stage",
6 | "sample_models": true,
7 | "rm_template_seq_design": true,
8 | "rm_template_seq_predict": true,
9 | "rm_template_sc_design": false,
10 | "rm_template_sc_predict": false,
11 | "predict_initial_guess": false,
12 | "predict_bigbang": false,
13 | "soft_iterations": 75,
14 | "temporary_iterations": 45,
15 | "hard_iterations": 5,
16 | "greedy_iterations": 15,
17 | "greedy_percentage": 5,
18 | "save_design_animations": true,
19 | "save_design_trajectory_plots": true,
20 | "weights_plddt": 0.1,
21 | "weights_pae_intra": 0.4,
22 | "weights_pae_inter": 0.1,
23 | "weights_con_intra": 1.0,
24 | "weights_con_inter": 1.0,
25 | "intra_contact_distance": 14.0,
26 | "inter_contact_distance": 20.0,
27 | "intra_contact_number": 2,
28 | "inter_contact_number": 2,
29 | "weights_helicity": -0.3,
30 | "random_helicity": false,
31 | "use_i_ptm_loss": true,
32 | "weights_iptm": 0.05,
33 | "use_rg_loss": true,
34 | "weights_rg": 0.3,
35 | "use_termini_distance_loss": false,
36 | "weights_termini_loss": 0.1,
37 | "enable_mpnn": true,
38 | "mpnn_fix_interface": false,
39 | "num_seqs": 20,
40 | "max_mpnn_sequences": 2,
41 | "sampling_temp": 0.1,
42 | "backbone_noise": 0.00,
43 | "model_path": "v_48_020",
44 | "mpnn_weights": "soluble",
45 | "save_mpnn_fasta": false,
46 | "num_recycles_design": 1,
47 | "num_recycles_validation": 3,
48 | "optimise_beta": true,
49 | "optimise_beta_extra_soft": 0,
50 | "optimise_beta_extra_temp": 0,
51 | "optimise_beta_recycles_design": 3,
52 | "optimise_beta_recycles_valid": 3,
53 | "remove_unrelaxed_trajectory": true,
54 | "remove_unrelaxed_complex": true,
55 | "remove_binder_monomer": true,
56 | "zip_animations": true,
57 | "zip_plots": true,
58 | "save_trajectory_pickle": false,
59 | "max_trajectories": false,
60 | "enable_rejection_check": true,
61 | "acceptance_rate": 0.01,
62 | "start_monitoring": 300,
63 | "af_params_dir": "",
64 | "dssp_path": "",
65 | "dalphaball_path": ""
66 | }
67 |
--------------------------------------------------------------------------------
/settings_advanced/default_4stage_multimer_mpnn_flexible_hardtarget.json:
--------------------------------------------------------------------------------
1 | {
2 | "omit_AAs": "C",
3 | "force_reject_AA": false,
4 | "use_multimer_design": true,
5 | "design_algorithm": "4stage",
6 | "sample_models": true,
7 | "rm_template_seq_design": true,
8 | "rm_template_seq_predict": true,
9 | "rm_template_sc_design": false,
10 | "rm_template_sc_predict": false,
11 | "predict_initial_guess": true,
12 | "predict_bigbang": false,
13 | "soft_iterations": 75,
14 | "temporary_iterations": 45,
15 | "hard_iterations": 5,
16 | "greedy_iterations": 15,
17 | "greedy_percentage": 5,
18 | "save_design_animations": true,
19 | "save_design_trajectory_plots": true,
20 | "weights_plddt": 0.1,
21 | "weights_pae_intra": 0.4,
22 | "weights_pae_inter": 0.1,
23 | "weights_con_intra": 1.0,
24 | "weights_con_inter": 1.0,
25 | "intra_contact_distance": 14.0,
26 | "inter_contact_distance": 20.0,
27 | "intra_contact_number": 2,
28 | "inter_contact_number": 2,
29 | "weights_helicity": -0.3,
30 | "random_helicity": false,
31 | "use_i_ptm_loss": true,
32 | "weights_iptm": 0.05,
33 | "use_rg_loss": true,
34 | "weights_rg": 0.3,
35 | "use_termini_distance_loss": false,
36 | "weights_termini_loss": 0.1,
37 | "enable_mpnn": true,
38 | "mpnn_fix_interface": false,
39 | "num_seqs": 20,
40 | "max_mpnn_sequences": 2,
41 | "sampling_temp": 0.1,
42 | "backbone_noise": 0.00,
43 | "model_path": "v_48_020",
44 | "mpnn_weights": "soluble",
45 | "save_mpnn_fasta": false,
46 | "num_recycles_design": 1,
47 | "num_recycles_validation": 3,
48 | "optimise_beta": true,
49 | "optimise_beta_extra_soft": 0,
50 | "optimise_beta_extra_temp": 0,
51 | "optimise_beta_recycles_design": 3,
52 | "optimise_beta_recycles_valid": 3,
53 | "remove_unrelaxed_trajectory": true,
54 | "remove_unrelaxed_complex": true,
55 | "remove_binder_monomer": true,
56 | "zip_animations": true,
57 | "zip_plots": true,
58 | "save_trajectory_pickle": false,
59 | "max_trajectories": false,
60 | "enable_rejection_check": true,
61 | "acceptance_rate": 0.01,
62 | "start_monitoring": 300,
63 | "af_params_dir": "",
64 | "dssp_path": "",
65 | "dalphaball_path": ""
66 | }
67 |
--------------------------------------------------------------------------------
/settings_advanced/default_4stage_multimer_mpnn_hardtarget.json:
--------------------------------------------------------------------------------
1 | {
2 | "omit_AAs": "C",
3 | "force_reject_AA": false,
4 | "use_multimer_design": true,
5 | "design_algorithm": "4stage",
6 | "sample_models": true,
7 | "rm_template_seq_design": false,
8 | "rm_template_seq_predict": false,
9 | "rm_template_sc_design": false,
10 | "rm_template_sc_predict": false,
11 | "predict_initial_guess": true,
12 | "predict_bigbang": false,
13 | "soft_iterations": 75,
14 | "temporary_iterations": 45,
15 | "hard_iterations": 5,
16 | "greedy_iterations": 15,
17 | "greedy_percentage": 5,
18 | "save_design_animations": true,
19 | "save_design_trajectory_plots": true,
20 | "weights_plddt": 0.1,
21 | "weights_pae_intra": 0.4,
22 | "weights_pae_inter": 0.1,
23 | "weights_con_intra": 1.0,
24 | "weights_con_inter": 1.0,
25 | "intra_contact_distance": 14.0,
26 | "inter_contact_distance": 20.0,
27 | "intra_contact_number": 2,
28 | "inter_contact_number": 2,
29 | "weights_helicity": -0.3,
30 | "random_helicity": false,
31 | "use_i_ptm_loss": true,
32 | "weights_iptm": 0.05,
33 | "use_rg_loss": true,
34 | "weights_rg": 0.3,
35 | "use_termini_distance_loss": false,
36 | "weights_termini_loss": 0.1,
37 | "enable_mpnn": true,
38 | "mpnn_fix_interface": false,
39 | "num_seqs": 20,
40 | "max_mpnn_sequences": 2,
41 | "sampling_temp": 0.1,
42 | "backbone_noise": 0.00,
43 | "model_path": "v_48_020",
44 | "mpnn_weights": "soluble",
45 | "save_mpnn_fasta": false,
46 | "num_recycles_design": 1,
47 | "num_recycles_validation": 3,
48 | "optimise_beta": true,
49 | "optimise_beta_extra_soft": 0,
50 | "optimise_beta_extra_temp": 0,
51 | "optimise_beta_recycles_design": 3,
52 | "optimise_beta_recycles_valid": 3,
53 | "remove_unrelaxed_trajectory": true,
54 | "remove_unrelaxed_complex": true,
55 | "remove_binder_monomer": true,
56 | "zip_animations": true,
57 | "zip_plots": true,
58 | "save_trajectory_pickle": false,
59 | "max_trajectories": false,
60 | "enable_rejection_check": true,
61 | "acceptance_rate": 0.01,
62 | "start_monitoring": 300,
63 | "af_params_dir": "",
64 | "dssp_path": "",
65 | "dalphaball_path": ""
66 | }
67 |
--------------------------------------------------------------------------------
/settings_advanced/peptide_3stage_multimer.json:
--------------------------------------------------------------------------------
1 | {
2 | "omit_AAs": "C",
3 | "force_reject_AA": true,
4 | "use_multimer_design": true,
5 | "design_algorithm": "3stage",
6 | "sample_models": true,
7 | "rm_template_seq_design": false,
8 | "rm_template_seq_predict": false,
9 | "rm_template_sc_design": false,
10 | "rm_template_sc_predict": false,
11 | "predict_initial_guess": true,
12 | "predict_bigbang": false,
13 | "soft_iterations": 75,
14 | "temporary_iterations": 45,
15 | "hard_iterations": 5,
16 | "greedy_iterations": 15,
17 | "greedy_percentage": 5,
18 | "save_design_animations": true,
19 | "save_design_trajectory_plots": true,
20 | "weights_plddt": 0.1,
21 | "weights_pae_intra": 0.4,
22 | "weights_pae_inter": 0.2,
23 | "weights_con_intra": 0.5,
24 | "weights_con_inter": 0.5,
25 | "intra_contact_distance": 14.0,
26 | "inter_contact_distance": 20.0,
27 | "intra_contact_number": 2,
28 | "inter_contact_number": 2,
29 | "weights_helicity": 0.95,
30 | "random_helicity": false,
31 | "use_i_ptm_loss": true,
32 | "weights_iptm": 0.05,
33 | "use_rg_loss": false,
34 | "weights_rg": 0.3,
35 | "use_termini_distance_loss": false,
36 | "weights_termini_loss": 0.1,
37 | "enable_mpnn": true,
38 | "mpnn_fix_interface": true,
39 | "num_seqs": 10,
40 | "max_mpnn_sequences": 2,
41 | "sampling_temp": 0.1,
42 | "backbone_noise": 0.00,
43 | "model_path": "v_48_020",
44 | "mpnn_weights": "soluble",
45 | "save_mpnn_fasta": false,
46 | "num_recycles_design": 1,
47 | "num_recycles_validation": 3,
48 | "optimise_beta": false,
49 | "optimise_beta_extra_soft": 0,
50 | "optimise_beta_extra_temp": 0,
51 | "optimise_beta_recycles_design": 3,
52 | "optimise_beta_recycles_valid": 3,
53 | "remove_unrelaxed_trajectory": true,
54 | "remove_unrelaxed_complex": true,
55 | "remove_binder_monomer": true,
56 | "zip_animations": true,
57 | "zip_plots": true,
58 | "save_trajectory_pickle": false,
59 | "max_trajectories": false,
60 | "enable_rejection_check": true,
61 | "acceptance_rate": 0.1,
62 | "start_monitoring": 1000,
63 | "af_params_dir": "",
64 | "dssp_path": "",
65 | "dalphaball_path": ""
66 | }
67 |
--------------------------------------------------------------------------------
/settings_advanced/peptide_3stage_multimer_flexible.json:
--------------------------------------------------------------------------------
1 | {
2 | "omit_AAs": "C",
3 | "force_reject_AA": true,
4 | "use_multimer_design": true,
5 | "design_algorithm": "3stage",
6 | "sample_models": true,
7 | "rm_template_seq_design": true,
8 | "rm_template_seq_predict": true,
9 | "rm_template_sc_design": false,
10 | "rm_template_sc_predict": false,
11 | "predict_initial_guess": true,
12 | "predict_bigbang": false,
13 | "soft_iterations": 75,
14 | "temporary_iterations": 45,
15 | "hard_iterations": 5,
16 | "greedy_iterations": 15,
17 | "greedy_percentage": 5,
18 | "save_design_animations": true,
19 | "save_design_trajectory_plots": true,
20 | "weights_plddt": 0.1,
21 | "weights_pae_intra": 0.4,
22 | "weights_pae_inter": 0.2,
23 | "weights_con_intra": 0.5,
24 | "weights_con_inter": 0.5,
25 | "intra_contact_distance": 14.0,
26 | "inter_contact_distance": 20.0,
27 | "intra_contact_number": 2,
28 | "inter_contact_number": 2,
29 | "weights_helicity": 0.95,
30 | "random_helicity": false,
31 | "use_i_ptm_loss": true,
32 | "weights_iptm": 0.05,
33 | "use_rg_loss": false,
34 | "weights_rg": 0.3,
35 | "use_termini_distance_loss": false,
36 | "weights_termini_loss": 0.1,
37 | "enable_mpnn": true,
38 | "mpnn_fix_interface": true,
39 | "num_seqs": 10,
40 | "max_mpnn_sequences": 2,
41 | "sampling_temp": 0.1,
42 | "backbone_noise": 0.00,
43 | "model_path": "v_48_020",
44 | "mpnn_weights": "soluble",
45 | "save_mpnn_fasta": false,
46 | "num_recycles_design": 1,
47 | "num_recycles_validation": 3,
48 | "optimise_beta": false,
49 | "optimise_beta_extra_soft": 0,
50 | "optimise_beta_extra_temp": 0,
51 | "optimise_beta_recycles_design": 3,
52 | "optimise_beta_recycles_valid": 3,
53 | "remove_unrelaxed_trajectory": true,
54 | "remove_unrelaxed_complex": true,
55 | "remove_binder_monomer": true,
56 | "zip_animations": true,
57 | "zip_plots": true,
58 | "save_trajectory_pickle": false,
59 | "max_trajectories": false,
60 | "enable_rejection_check": true,
61 | "acceptance_rate": 0.1,
62 | "start_monitoring": 1000,
63 | "af_params_dir": "",
64 | "dssp_path": "",
65 | "dalphaball_path": ""
66 | }
67 |
--------------------------------------------------------------------------------
/settings_advanced/peptide_3stage_multimer_mpnn.json:
--------------------------------------------------------------------------------
1 | {
2 | "omit_AAs": "C",
3 | "force_reject_AA": true,
4 | "use_multimer_design": true,
5 | "design_algorithm": "3stage",
6 | "sample_models": true,
7 | "rm_template_seq_design": false,
8 | "rm_template_seq_predict": false,
9 | "rm_template_sc_design": false,
10 | "rm_template_sc_predict": false,
11 | "predict_initial_guess": true,
12 | "predict_bigbang": false,
13 | "soft_iterations": 75,
14 | "temporary_iterations": 45,
15 | "hard_iterations": 5,
16 | "greedy_iterations": 15,
17 | "greedy_percentage": 5,
18 | "save_design_animations": true,
19 | "save_design_trajectory_plots": true,
20 | "weights_plddt": 0.1,
21 | "weights_pae_intra": 0.4,
22 | "weights_pae_inter": 0.2,
23 | "weights_con_intra": 0.5,
24 | "weights_con_inter": 0.5,
25 | "intra_contact_distance": 14.0,
26 | "inter_contact_distance": 20.0,
27 | "intra_contact_number": 2,
28 | "inter_contact_number": 2,
29 | "weights_helicity": 0.95,
30 | "random_helicity": false,
31 | "use_i_ptm_loss": true,
32 | "weights_iptm": 0.05,
33 | "use_rg_loss": false,
34 | "weights_rg": 0.3,
35 | "use_termini_distance_loss": false,
36 | "weights_termini_loss": 0.1,
37 | "enable_mpnn": true,
38 | "mpnn_fix_interface": false,
39 | "num_seqs": 50,
40 | "max_mpnn_sequences": 2,
41 | "sampling_temp": 0.1,
42 | "backbone_noise": 0.00,
43 | "model_path": "v_48_020",
44 | "mpnn_weights": "soluble",
45 | "save_mpnn_fasta": false,
46 | "num_recycles_design": 1,
47 | "num_recycles_validation": 3,
48 | "optimise_beta": false,
49 | "optimise_beta_extra_soft": 0,
50 | "optimise_beta_extra_temp": 0,
51 | "optimise_beta_recycles_design": 3,
52 | "optimise_beta_recycles_valid": 3,
53 | "remove_unrelaxed_trajectory": true,
54 | "remove_unrelaxed_complex": true,
55 | "remove_binder_monomer": true,
56 | "zip_animations": true,
57 | "zip_plots": true,
58 | "save_trajectory_pickle": false,
59 | "max_trajectories": false,
60 | "enable_rejection_check": true,
61 | "acceptance_rate": 0.1,
62 | "start_monitoring": 1000,
63 | "af_params_dir": "",
64 | "dssp_path": "",
65 | "dalphaball_path": ""
66 | }
67 |
--------------------------------------------------------------------------------
/settings_advanced/peptide_3stage_multimer_mpnn_flexible.json:
--------------------------------------------------------------------------------
1 | {
2 | "omit_AAs": "C",
3 | "force_reject_AA": true,
4 | "use_multimer_design": true,
5 | "design_algorithm": "3stage",
6 | "sample_models": true,
7 | "rm_template_seq_design": true,
8 | "rm_template_seq_predict": true,
9 | "rm_template_sc_design": false,
10 | "rm_template_sc_predict": false,
11 | "predict_initial_guess": true,
12 | "predict_bigbang": false,
13 | "soft_iterations": 75,
14 | "temporary_iterations": 45,
15 | "hard_iterations": 5,
16 | "greedy_iterations": 15,
17 | "greedy_percentage": 5,
18 | "save_design_animations": true,
19 | "save_design_trajectory_plots": true,
20 | "weights_plddt": 0.1,
21 | "weights_pae_intra": 0.4,
22 | "weights_pae_inter": 0.2,
23 | "weights_con_intra": 0.5,
24 | "weights_con_inter": 0.5,
25 | "intra_contact_distance": 14.0,
26 | "inter_contact_distance": 20.0,
27 | "intra_contact_number": 2,
28 | "inter_contact_number": 2,
29 | "weights_helicity": 0.95,
30 | "random_helicity": false,
31 | "use_i_ptm_loss": true,
32 | "weights_iptm": 0.05,
33 | "use_rg_loss": false,
34 | "weights_rg": 0.3,
35 | "use_termini_distance_loss": false,
36 | "weights_termini_loss": 0.1,
37 | "enable_mpnn": true,
38 | "mpnn_fix_interface": false,
39 | "num_seqs": 50,
40 | "max_mpnn_sequences": 2,
41 | "sampling_temp": 0.1,
42 | "backbone_noise": 0.00,
43 | "model_path": "v_48_020",
44 | "mpnn_weights": "soluble",
45 | "save_mpnn_fasta": false,
46 | "num_recycles_design": 1,
47 | "num_recycles_validation": 3,
48 | "optimise_beta": false,
49 | "optimise_beta_extra_soft": 0,
50 | "optimise_beta_extra_temp": 0,
51 | "optimise_beta_recycles_design": 3,
52 | "optimise_beta_recycles_valid": 3,
53 | "remove_unrelaxed_trajectory": true,
54 | "remove_unrelaxed_complex": true,
55 | "remove_binder_monomer": true,
56 | "zip_animations": true,
57 | "zip_plots": true,
58 | "save_trajectory_pickle": false,
59 | "max_trajectories": false,
60 | "enable_rejection_check": true,
61 | "acceptance_rate": 0.1,
62 | "start_monitoring": 1000,
63 | "af_params_dir": "",
64 | "dssp_path": "",
65 | "dalphaball_path": ""
66 | }
67 |
--------------------------------------------------------------------------------
/settings_filters/peptide_filters.json:
--------------------------------------------------------------------------------
1 | {
2 | "MPNN_score": {
3 | "threshold": null,
4 | "higher": false
5 | },
6 | "MPNN_seq_recovery": {
7 | "threshold": null,
8 | "higher": false
9 | },
10 | "Average_pLDDT": {
11 | "threshold": 0.8,
12 | "higher": true
13 | },
14 | "1_pLDDT": {
15 | "threshold": 0.8,
16 | "higher": true
17 | },
18 | "2_pLDDT": {
19 | "threshold": 0.8,
20 | "higher": true
21 | },
22 | "3_pLDDT": {
23 | "threshold": null,
24 | "higher": true
25 | },
26 | "4_pLDDT": {
27 | "threshold": null,
28 | "higher": true
29 | },
30 | "5_pLDDT": {
31 | "threshold": null,
32 | "higher": true
33 | },
34 | "Average_pTM": {
35 | "threshold": 0.55,
36 | "higher": true
37 | },
38 | "1_pTM": {
39 | "threshold": 0.55,
40 | "higher": true
41 | },
42 | "2_pTM": {
43 | "threshold": 0.55,
44 | "higher": true
45 | },
46 | "3_pTM": {
47 | "threshold": null,
48 | "higher": true
49 | },
50 | "4_pTM": {
51 | "threshold": null,
52 | "higher": true
53 | },
54 | "5_pTM": {
55 | "threshold": null,
56 | "higher": true
57 | },
58 | "Average_i_pTM": {
59 | "threshold": 0.4,
60 | "higher": true
61 | },
62 | "1_i_pTM": {
63 | "threshold": 0.4,
64 | "higher": true
65 | },
66 | "2_i_pTM": {
67 | "threshold": 0.4,
68 | "higher": true
69 | },
70 | "3_i_pTM": {
71 | "threshold": null,
72 | "higher": true
73 | },
74 | "4_i_pTM": {
75 | "threshold": null,
76 | "higher": true
77 | },
78 | "5_i_pTM": {
79 | "threshold": null,
80 | "higher": true
81 | },
82 | "Average_pAE": {
83 | "threshold": null,
84 | "higher": false
85 | },
86 | "1_pAE": {
87 | "threshold": null,
88 | "higher": false
89 | },
90 | "2_pAE": {
91 | "threshold": null,
92 | "higher": false
93 | },
94 | "3_pAE": {
95 | "threshold": null,
96 | "higher": false
97 | },
98 | "4_pAE": {
99 | "threshold": null,
100 | "higher": false
101 | },
102 | "5_pAE": {
103 | "threshold": null,
104 | "higher": false
105 | },
106 | "Average_i_pAE": {
107 | "threshold": 0.3,
108 | "higher": false
109 | },
110 | "1_i_pAE": {
111 | "threshold": 0.3,
112 | "higher": false
113 | },
114 | "2_i_pAE": {
115 | "threshold": 0.3,
116 | "higher": false
117 | },
118 | "3_i_pAE": {
119 | "threshold": null,
120 | "higher": false
121 | },
122 | "4_i_pAE": {
123 | "threshold": null,
124 | "higher": false
125 | },
126 | "5_i_pAE": {
127 | "threshold": null,
128 | "higher": false
129 | },
130 | "Average_i_pLDDT": {
131 | "threshold": null,
132 | "higher": true
133 | },
134 | "1_i_pLDDT": {
135 | "threshold": null,
136 | "higher": true
137 | },
138 | "2_i_pLDDT": {
139 | "threshold": null,
140 | "higher": true
141 | },
142 | "3_i_pLDDT": {
143 | "threshold": null,
144 | "higher": true
145 | },
146 | "4_i_pLDDT": {
147 | "threshold": null,
148 | "higher": true
149 | },
150 | "5_i_pLDDT": {
151 | "threshold": null,
152 | "higher": true
153 | },
154 | "Average_ss_pLDDT": {
155 | "threshold": null,
156 | "higher": true
157 | },
158 | "1_ss_pLDDT": {
159 | "threshold": null,
160 | "higher": true
161 | },
162 | "2_ss_pLDDT": {
163 | "threshold": null,
164 | "higher": true
165 | },
166 | "3_ss_pLDDT": {
167 | "threshold": null,
168 | "higher": true
169 | },
170 | "4_ss_pLDDT": {
171 | "threshold": null,
172 | "higher": true
173 | },
174 | "5_ss_pLDDT": {
175 | "threshold": null,
176 | "higher": true
177 | },
178 | "Average_Unrelaxed_Clashes": {
179 | "threshold": null,
180 | "higher": false
181 | },
182 | "1_Unrelaxed_Clashes": {
183 | "threshold": null,
184 | "higher": false
185 | },
186 | "2_Unrelaxed_Clashes": {
187 | "threshold": null,
188 | "higher": false
189 | },
190 | "3_Unrelaxed_Clashes": {
191 | "threshold": null,
192 | "higher": false
193 | },
194 | "4_Unrelaxed_Clashes": {
195 | "threshold": null,
196 | "higher": false
197 | },
198 | "5_Unrelaxed_Clashes": {
199 | "threshold": null,
200 | "higher": false
201 | },
202 | "Average_Relaxed_Clashes": {
203 | "threshold": null,
204 | "higher": false
205 | },
206 | "1_Relaxed_Clashes": {
207 | "threshold": null,
208 | "higher": false
209 | },
210 | "2_Relaxed_Clashes": {
211 | "threshold": null,
212 | "higher": false
213 | },
214 | "3_Relaxed_Clashes": {
215 | "threshold": null,
216 | "higher": false
217 | },
218 | "4_Relaxed_Clashes": {
219 | "threshold": null,
220 | "higher": false
221 | },
222 | "5_Relaxed_Clashes": {
223 | "threshold": null,
224 | "higher": false
225 | },
226 | "Average_Binder_Energy_Score": {
227 | "threshold": 0,
228 | "higher": false
229 | },
230 | "1_Binder_Energy_Score": {
231 | "threshold": 0,
232 | "higher": false
233 | },
234 | "2_Binder_Energy_Score": {
235 | "threshold": 0,
236 | "higher": false
237 | },
238 | "3_Binder_Energy_Score": {
239 | "threshold": null,
240 | "higher": false
241 | },
242 | "4_Binder_Energy_Score": {
243 | "threshold": null,
244 | "higher": false
245 | },
246 | "5_Binder_Energy_Score": {
247 | "threshold": null,
248 | "higher": false
249 | },
250 | "Average_Surface_Hydrophobicity": {
251 | "threshold": 0.5,
252 | "higher": false
253 | },
254 | "1_Surface_Hydrophobicity": {
255 | "threshold": 0.5,
256 | "higher": false
257 | },
258 | "2_Surface_Hydrophobicity": {
259 | "threshold": 0.5,
260 | "higher": false
261 | },
262 | "3_Surface_Hydrophobicity": {
263 | "threshold": null,
264 | "higher": false
265 | },
266 | "4_Surface_Hydrophobicity": {
267 | "threshold": null,
268 | "higher": false
269 | },
270 | "5_Surface_Hydrophobicity": {
271 | "threshold": null,
272 | "higher": false
273 | },
274 | "Average_ShapeComplementarity": {
275 | "threshold": 0.55,
276 | "higher": true
277 | },
278 | "1_ShapeComplementarity": {
279 | "threshold": 0.5,
280 | "higher": true
281 | },
282 | "2_ShapeComplementarity": {
283 | "threshold": 0.5,
284 | "higher": true
285 | },
286 | "3_ShapeComplementarity": {
287 | "threshold": null,
288 | "higher": true
289 | },
290 | "4_ShapeComplementarity": {
291 | "threshold": null,
292 | "higher": true
293 | },
294 | "5_ShapeComplementarity": {
295 | "threshold": null,
296 | "higher": true
297 | },
298 | "Average_PackStat": {
299 | "threshold": null,
300 | "higher": true
301 | },
302 | "1_PackStat": {
303 | "threshold": null,
304 | "higher": true
305 | },
306 | "2_PackStat": {
307 | "threshold": null,
308 | "higher": true
309 | },
310 | "3_PackStat": {
311 | "threshold": null,
312 | "higher": true
313 | },
314 | "4_PackStat": {
315 | "threshold": null,
316 | "higher": true
317 | },
318 | "5_PackStat": {
319 | "threshold": null,
320 | "higher": true
321 | },
322 | "Average_dG": {
323 | "threshold": 0,
324 | "higher": false
325 | },
326 | "1_dG": {
327 | "threshold": 0,
328 | "higher": false
329 | },
330 | "2_dG": {
331 | "threshold": 0,
332 | "higher": false
333 | },
334 | "3_dG": {
335 | "threshold": null,
336 | "higher": false
337 | },
338 | "4_dG": {
339 | "threshold": null,
340 | "higher": false
341 | },
342 | "5_dG": {
343 | "threshold": null,
344 | "higher": false
345 | },
346 | "Average_dSASA": {
347 | "threshold": 1,
348 | "higher": true
349 | },
350 | "1_dSASA": {
351 | "threshold": 1,
352 | "higher": true
353 | },
354 | "2_dSASA": {
355 | "threshold": 1,
356 | "higher": true
357 | },
358 | "3_dSASA": {
359 | "threshold": null,
360 | "higher": true
361 | },
362 | "4_dSASA": {
363 | "threshold": null,
364 | "higher": true
365 | },
366 | "5_dSASA": {
367 | "threshold": null,
368 | "higher": true
369 | },
370 | "Average_dG/dSASA": {
371 | "threshold": null,
372 | "higher": false
373 | },
374 | "1_dG/dSASA": {
375 | "threshold": null,
376 | "higher": false
377 | },
378 | "2_dG/dSASA": {
379 | "threshold": null,
380 | "higher": false
381 | },
382 | "3_dG/dSASA": {
383 | "threshold": null,
384 | "higher": false
385 | },
386 | "4_dG/dSASA": {
387 | "threshold": null,
388 | "higher": false
389 | },
390 | "5_dG/dSASA": {
391 | "threshold": null,
392 | "higher": false
393 | },
394 | "Average_Interface_SASA_%": {
395 | "threshold": null,
396 | "higher": true
397 | },
398 | "1_Interface_SASA_%": {
399 | "threshold": null,
400 | "higher": true
401 | },
402 | "2_Interface_SASA_%": {
403 | "threshold": null,
404 | "higher": true
405 | },
406 | "3_Interface_SASA_%": {
407 | "threshold": null,
408 | "higher": true
409 | },
410 | "4_Interface_SASA_%": {
411 | "threshold": null,
412 | "higher": true
413 | },
414 | "5_Interface_SASA_%": {
415 | "threshold": null,
416 | "higher": true
417 | },
418 | "Average_Interface_Hydrophobicity": {
419 | "threshold": null,
420 | "higher": true
421 | },
422 | "1_Interface_Hydrophobicity": {
423 | "threshold": null,
424 | "higher": true
425 | },
426 | "2_Interface_Hydrophobicity": {
427 | "threshold": null,
428 | "higher": true
429 | },
430 | "3_Interface_Hydrophobicity": {
431 | "threshold": null,
432 | "higher": true
433 | },
434 | "4_Interface_Hydrophobicity": {
435 | "threshold": null,
436 | "higher": true
437 | },
438 | "5_Interface_Hydrophobicity": {
439 | "threshold": null,
440 | "higher": true
441 | },
442 | "Average_n_InterfaceResidues": {
443 | "threshold": 4,
444 | "higher": true
445 | },
446 | "1_n_InterfaceResidues": {
447 | "threshold": 4,
448 | "higher": true
449 | },
450 | "2_n_InterfaceResidues": {
451 | "threshold": 4,
452 | "higher": true
453 | },
454 | "3_n_InterfaceResidues": {
455 | "threshold": null,
456 | "higher": true
457 | },
458 | "4_n_InterfaceResidues": {
459 | "threshold": null,
460 | "higher": true
461 | },
462 | "5_n_InterfaceResidues": {
463 | "threshold": null,
464 | "higher": true
465 | },
466 | "Average_n_InterfaceHbonds": {
467 | "threshold": 1,
468 | "higher": true
469 | },
470 | "1_n_InterfaceHbonds": {
471 | "threshold": 1,
472 | "higher": true
473 | },
474 | "2_n_InterfaceHbonds": {
475 | "threshold": 1,
476 | "higher": true
477 | },
478 | "3_n_InterfaceHbonds": {
479 | "threshold": null,
480 | "higher": true
481 | },
482 | "4_n_InterfaceHbonds": {
483 | "threshold": null,
484 | "higher": true
485 | },
486 | "5_n_InterfaceHbonds": {
487 | "threshold": null,
488 | "higher": true
489 | },
490 | "Average_InterfaceHbondsPercentage": {
491 | "threshold": null,
492 | "higher": true
493 | },
494 | "1_InterfaceHbondsPercentage": {
495 | "threshold": null,
496 | "higher": true
497 | },
498 | "2_InterfaceHbondsPercentage": {
499 | "threshold": null,
500 | "higher": true
501 | },
502 | "3_InterfaceHbondsPercentage": {
503 | "threshold": null,
504 | "higher": true
505 | },
506 | "4_InterfaceHbondsPercentage": {
507 | "threshold": null,
508 | "higher": true
509 | },
510 | "5_InterfaceHbondsPercentage": {
511 | "threshold": null,
512 | "higher": true
513 | },
514 | "Average_n_InterfaceUnsatHbonds": {
515 | "threshold": 3,
516 | "higher": false
517 | },
518 | "1_n_InterfaceUnsatHbonds": {
519 | "threshold": 3,
520 | "higher": false
521 | },
522 | "2_n_InterfaceUnsatHbonds": {
523 | "threshold": 3,
524 | "higher": false
525 | },
526 | "3_n_InterfaceUnsatHbonds": {
527 | "threshold": null,
528 | "higher": false
529 | },
530 | "4_n_InterfaceUnsatHbonds": {
531 | "threshold": null,
532 | "higher": false
533 | },
534 | "5_n_InterfaceUnsatHbonds": {
535 | "threshold": null,
536 | "higher": false
537 | },
538 | "Average_InterfaceUnsatHbondsPercentage": {
539 | "threshold": null,
540 | "higher": false
541 | },
542 | "1_InterfaceUnsatHbondsPercentage": {
543 | "threshold": null,
544 | "higher": false
545 | },
546 | "2_InterfaceUnsatHbondsPercentage": {
547 | "threshold": null,
548 | "higher": false
549 | },
550 | "3_InterfaceUnsatHbondsPercentage": {
551 | "threshold": null,
552 | "higher": false
553 | },
554 | "4_InterfaceUnsatHbondsPercentage": {
555 | "threshold": null,
556 | "higher": false
557 | },
558 | "5_InterfaceUnsatHbondsPercentage": {
559 | "threshold": null,
560 | "higher": false
561 | },
562 | "Average_Interface_Helix%": {
563 | "threshold": null,
564 | "higher": true
565 | },
566 | "1_Interface_Helix%": {
567 | "threshold": null,
568 | "higher": true
569 | },
570 | "2_Interface_Helix%": {
571 | "threshold": null,
572 | "higher": true
573 | },
574 | "3_Interface_Helix%": {
575 | "threshold": null,
576 | "higher": true
577 | },
578 | "4_Interface_Helix%": {
579 | "threshold": null,
580 | "higher": true
581 | },
582 | "5_Interface_Helix%": {
583 | "threshold": null,
584 | "higher": true
585 | },
586 | "Average_Interface_BetaSheet%": {
587 | "threshold": null,
588 | "higher": true
589 | },
590 | "1_Interface_BetaSheet%": {
591 | "threshold": null,
592 | "higher": true
593 | },
594 | "2_Interface_BetaSheet%": {
595 | "threshold": null,
596 | "higher": true
597 | },
598 | "3_Interface_BetaSheet%": {
599 | "threshold": null,
600 | "higher": true
601 | },
602 | "4_Interface_BetaSheet%": {
603 | "threshold": null,
604 | "higher": true
605 | },
606 | "5_Interface_BetaSheet%": {
607 | "threshold": null,
608 | "higher": true
609 | },
610 | "Average_Interface_Loop%": {
611 | "threshold": null,
612 | "higher": false
613 | },
614 | "1_Interface_Loop%": {
615 | "threshold": null,
616 | "higher": false
617 | },
618 | "2_Interface_Loop%": {
619 | "threshold": null,
620 | "higher": false
621 | },
622 | "3_Interface_Loop%": {
623 | "threshold": null,
624 | "higher": false
625 | },
626 | "4_Interface_Loop%": {
627 | "threshold": null,
628 | "higher": false
629 | },
630 | "5_Interface_Loop%": {
631 | "threshold": null,
632 | "higher": false
633 | },
634 | "Average_Binder_Helix%": {
635 | "threshold": null,
636 | "higher": true
637 | },
638 | "1_Binder_Helix%": {
639 | "threshold": null,
640 | "higher": true
641 | },
642 | "2_Binder_Helix%": {
643 | "threshold": null,
644 | "higher": true
645 | },
646 | "3_Binder_Helix%": {
647 | "threshold": null,
648 | "higher": true
649 | },
650 | "4_Binder_Helix%": {
651 | "threshold": null,
652 | "higher": true
653 | },
654 | "5_Binder_Helix%": {
655 | "threshold": null,
656 | "higher": true
657 | },
658 | "Average_Binder_BetaSheet%": {
659 | "threshold": null,
660 | "higher": true
661 | },
662 | "1_Binder_BetaSheet%": {
663 | "threshold": null,
664 | "higher": true
665 | },
666 | "2_Binder_BetaSheet%": {
667 | "threshold": null,
668 | "higher": true
669 | },
670 | "3_Binder_BetaSheet%": {
671 | "threshold": null,
672 | "higher": true
673 | },
674 | "4_Binder_BetaSheet%": {
675 | "threshold": null,
676 | "higher": true
677 | },
678 | "5_Binder_BetaSheet%": {
679 | "threshold": null,
680 | "higher": true
681 | },
682 | "Average_Binder_Loop%": {
683 | "threshold": 90,
684 | "higher": false
685 | },
686 | "1_Binder_Loop%": {
687 | "threshold": 90,
688 | "higher": false
689 | },
690 | "2_Binder_Loop%": {
691 | "threshold": 90,
692 | "higher": false
693 | },
694 | "3_Binder_Loop%": {
695 | "threshold": null,
696 | "higher": false
697 | },
698 | "4_Binder_Loop%": {
699 | "threshold": null,
700 | "higher": false
701 | },
702 | "5_Binder_Loop%": {
703 | "threshold": null,
704 | "higher": false
705 | },
706 | "Average_InterfaceAAs": {
707 | "A": {
708 | "threshold": null,
709 | "higher": false
710 | },
711 | "C": {
712 | "threshold": null,
713 | "higher": false
714 | },
715 | "D": {
716 | "threshold": null,
717 | "higher": false
718 | },
719 | "E": {
720 | "threshold": null,
721 | "higher": false
722 | },
723 | "F": {
724 | "threshold": null,
725 | "higher": false
726 | },
727 | "G": {
728 | "threshold": null,
729 | "higher": false
730 | },
731 | "H": {
732 | "threshold": null,
733 | "higher": false
734 | },
735 | "I": {
736 | "threshold": null,
737 | "higher": false
738 | },
739 | "K": {
740 | "threshold": 3,
741 | "higher": false
742 | },
743 | "L": {
744 | "threshold": null,
745 | "higher": false
746 | },
747 | "M": {
748 | "threshold": 3,
749 | "higher": false
750 | },
751 | "N": {
752 | "threshold": null,
753 | "higher": false
754 | },
755 | "P": {
756 | "threshold": null,
757 | "higher": false
758 | },
759 | "Q": {
760 | "threshold": null,
761 | "higher": false
762 | },
763 | "R": {
764 | "threshold": null,
765 | "higher": false
766 | },
767 | "S": {
768 | "threshold": null,
769 | "higher": false
770 | },
771 | "T": {
772 | "threshold": null,
773 | "higher": false
774 | },
775 | "V": {
776 | "threshold": null,
777 | "higher": false
778 | },
779 | "W": {
780 | "threshold": null,
781 | "higher": false
782 | },
783 | "Y": {
784 | "threshold": null,
785 | "higher": false
786 | }
787 | },
788 | "1_InterfaceAAs": {
789 | "A": {
790 | "threshold": null,
791 | "higher": false
792 | },
793 | "C": {
794 | "threshold": null,
795 | "higher": false
796 | },
797 | "D": {
798 | "threshold": null,
799 | "higher": false
800 | },
801 | "E": {
802 | "threshold": null,
803 | "higher": false
804 | },
805 | "F": {
806 | "threshold": null,
807 | "higher": false
808 | },
809 | "G": {
810 | "threshold": null,
811 | "higher": false
812 | },
813 | "H": {
814 | "threshold": null,
815 | "higher": false
816 | },
817 | "I": {
818 | "threshold": null,
819 | "higher": false
820 | },
821 | "K": {
822 | "threshold": null,
823 | "higher": false
824 | },
825 | "L": {
826 | "threshold": null,
827 | "higher": false
828 | },
829 | "M": {
830 | "threshold": null,
831 | "higher": false
832 | },
833 | "N": {
834 | "threshold": null,
835 | "higher": false
836 | },
837 | "P": {
838 | "threshold": null,
839 | "higher": false
840 | },
841 | "Q": {
842 | "threshold": null,
843 | "higher": false
844 | },
845 | "R": {
846 | "threshold": null,
847 | "higher": false
848 | },
849 | "S": {
850 | "threshold": null,
851 | "higher": false
852 | },
853 | "T": {
854 | "threshold": null,
855 | "higher": false
856 | },
857 | "V": {
858 | "threshold": null,
859 | "higher": false
860 | },
861 | "W": {
862 | "threshold": null,
863 | "higher": false
864 | },
865 | "Y": {
866 | "threshold": null,
867 | "higher": false
868 | }
869 | },
870 | "2_InterfaceAAs": {
871 | "A": {
872 | "threshold": null,
873 | "higher": false
874 | },
875 | "C": {
876 | "threshold": null,
877 | "higher": false
878 | },
879 | "D": {
880 | "threshold": null,
881 | "higher": false
882 | },
883 | "E": {
884 | "threshold": null,
885 | "higher": false
886 | },
887 | "F": {
888 | "threshold": null,
889 | "higher": false
890 | },
891 | "G": {
892 | "threshold": null,
893 | "higher": false
894 | },
895 | "H": {
896 | "threshold": null,
897 | "higher": false
898 | },
899 | "I": {
900 | "threshold": null,
901 | "higher": false
902 | },
903 | "K": {
904 | "threshold": null,
905 | "higher": false
906 | },
907 | "L": {
908 | "threshold": null,
909 | "higher": false
910 | },
911 | "M": {
912 | "threshold": null,
913 | "higher": false
914 | },
915 | "N": {
916 | "threshold": null,
917 | "higher": false
918 | },
919 | "P": {
920 | "threshold": null,
921 | "higher": false
922 | },
923 | "Q": {
924 | "threshold": null,
925 | "higher": false
926 | },
927 | "R": {
928 | "threshold": null,
929 | "higher": false
930 | },
931 | "S": {
932 | "threshold": null,
933 | "higher": false
934 | },
935 | "T": {
936 | "threshold": null,
937 | "higher": false
938 | },
939 | "V": {
940 | "threshold": null,
941 | "higher": false
942 | },
943 | "W": {
944 | "threshold": null,
945 | "higher": false
946 | },
947 | "Y": {
948 | "threshold": null,
949 | "higher": false
950 | }
951 | },
952 | "3_InterfaceAAs": {
953 | "A": {
954 | "threshold": null,
955 | "higher": false
956 | },
957 | "C": {
958 | "threshold": null,
959 | "higher": false
960 | },
961 | "D": {
962 | "threshold": null,
963 | "higher": false
964 | },
965 | "E": {
966 | "threshold": null,
967 | "higher": false
968 | },
969 | "F": {
970 | "threshold": null,
971 | "higher": false
972 | },
973 | "G": {
974 | "threshold": null,
975 | "higher": false
976 | },
977 | "H": {
978 | "threshold": null,
979 | "higher": false
980 | },
981 | "I": {
982 | "threshold": null,
983 | "higher": false
984 | },
985 | "K": {
986 | "threshold": null,
987 | "higher": false
988 | },
989 | "L": {
990 | "threshold": null,
991 | "higher": false
992 | },
993 | "M": {
994 | "threshold": null,
995 | "higher": false
996 | },
997 | "N": {
998 | "threshold": null,
999 | "higher": false
1000 | },
1001 | "P": {
1002 | "threshold": null,
1003 | "higher": false
1004 | },
1005 | "Q": {
1006 | "threshold": null,
1007 | "higher": false
1008 | },
1009 | "R": {
1010 | "threshold": null,
1011 | "higher": false
1012 | },
1013 | "S": {
1014 | "threshold": null,
1015 | "higher": false
1016 | },
1017 | "T": {
1018 | "threshold": null,
1019 | "higher": false
1020 | },
1021 | "V": {
1022 | "threshold": null,
1023 | "higher": false
1024 | },
1025 | "W": {
1026 | "threshold": null,
1027 | "higher": false
1028 | },
1029 | "Y": {
1030 | "threshold": null,
1031 | "higher": false
1032 | }
1033 | },
1034 | "4_InterfaceAAs": {
1035 | "A": {
1036 | "threshold": null,
1037 | "higher": false
1038 | },
1039 | "C": {
1040 | "threshold": null,
1041 | "higher": false
1042 | },
1043 | "D": {
1044 | "threshold": null,
1045 | "higher": false
1046 | },
1047 | "E": {
1048 | "threshold": null,
1049 | "higher": false
1050 | },
1051 | "F": {
1052 | "threshold": null,
1053 | "higher": false
1054 | },
1055 | "G": {
1056 | "threshold": null,
1057 | "higher": false
1058 | },
1059 | "H": {
1060 | "threshold": null,
1061 | "higher": false
1062 | },
1063 | "I": {
1064 | "threshold": null,
1065 | "higher": false
1066 | },
1067 | "K": {
1068 | "threshold": null,
1069 | "higher": false
1070 | },
1071 | "L": {
1072 | "threshold": null,
1073 | "higher": false
1074 | },
1075 | "M": {
1076 | "threshold": null,
1077 | "higher": false
1078 | },
1079 | "N": {
1080 | "threshold": null,
1081 | "higher": false
1082 | },
1083 | "P": {
1084 | "threshold": null,
1085 | "higher": false
1086 | },
1087 | "Q": {
1088 | "threshold": null,
1089 | "higher": false
1090 | },
1091 | "R": {
1092 | "threshold": null,
1093 | "higher": false
1094 | },
1095 | "S": {
1096 | "threshold": null,
1097 | "higher": false
1098 | },
1099 | "T": {
1100 | "threshold": null,
1101 | "higher": false
1102 | },
1103 | "V": {
1104 | "threshold": null,
1105 | "higher": false
1106 | },
1107 | "W": {
1108 | "threshold": null,
1109 | "higher": false
1110 | },
1111 | "Y": {
1112 | "threshold": null,
1113 | "higher": false
1114 | }
1115 | },
1116 | "5_InterfaceAAs": {
1117 | "A": {
1118 | "threshold": null,
1119 | "higher": false
1120 | },
1121 | "C": {
1122 | "threshold": null,
1123 | "higher": false
1124 | },
1125 | "D": {
1126 | "threshold": null,
1127 | "higher": false
1128 | },
1129 | "E": {
1130 | "threshold": null,
1131 | "higher": false
1132 | },
1133 | "F": {
1134 | "threshold": null,
1135 | "higher": false
1136 | },
1137 | "G": {
1138 | "threshold": null,
1139 | "higher": false
1140 | },
1141 | "H": {
1142 | "threshold": null,
1143 | "higher": false
1144 | },
1145 | "I": {
1146 | "threshold": null,
1147 | "higher": false
1148 | },
1149 | "K": {
1150 | "threshold": null,
1151 | "higher": false
1152 | },
1153 | "L": {
1154 | "threshold": null,
1155 | "higher": false
1156 | },
1157 | "M": {
1158 | "threshold": null,
1159 | "higher": false
1160 | },
1161 | "N": {
1162 | "threshold": null,
1163 | "higher": false
1164 | },
1165 | "P": {
1166 | "threshold": null,
1167 | "higher": false
1168 | },
1169 | "Q": {
1170 | "threshold": null,
1171 | "higher": false
1172 | },
1173 | "R": {
1174 | "threshold": null,
1175 | "higher": false
1176 | },
1177 | "S": {
1178 | "threshold": null,
1179 | "higher": false
1180 | },
1181 | "T": {
1182 | "threshold": null,
1183 | "higher": false
1184 | },
1185 | "V": {
1186 | "threshold": null,
1187 | "higher": false
1188 | },
1189 | "W": {
1190 | "threshold": null,
1191 | "higher": false
1192 | },
1193 | "Y": {
1194 | "threshold": null,
1195 | "higher": false
1196 | }
1197 | },
1198 | "Average_Hotspot_RMSD": {
1199 | "threshold": 3,
1200 | "higher": false
1201 | },
1202 | "1_Hotspot_RMSD": {
1203 | "threshold": 3,
1204 | "higher": false
1205 | },
1206 | "2_Hotspot_RMSD": {
1207 | "threshold": 3,
1208 | "higher": false
1209 | },
1210 | "3_Hotspot_RMSD": {
1211 | "threshold": null,
1212 | "higher": false
1213 | },
1214 | "4_Hotspot_RMSD": {
1215 | "threshold": null,
1216 | "higher": false
1217 | },
1218 | "5_Hotspot_RMSD": {
1219 | "threshold": null,
1220 | "higher": false
1221 | },
1222 | "Average_Target_RMSD": {
1223 | "threshold": null,
1224 | "higher": false
1225 | },
1226 | "1_Target_RMSD": {
1227 | "threshold": null,
1228 | "higher": false
1229 | },
1230 | "2_Target_RMSD": {
1231 | "threshold": null,
1232 | "higher": false
1233 | },
1234 | "3_Target_RMSD": {
1235 | "threshold": null,
1236 | "higher": false
1237 | },
1238 | "4_Target_RMSD": {
1239 | "threshold": null,
1240 | "higher": false
1241 | },
1242 | "5_Target_RMSD": {
1243 | "threshold": null,
1244 | "higher": false
1245 | },
1246 | "Average_Binder_pLDDT": {
1247 | "threshold": 0.8,
1248 | "higher": true
1249 | },
1250 | "1_Binder_pLDDT": {
1251 | "threshold": 0.8,
1252 | "higher": true
1253 | },
1254 | "2_Binder_pLDDT": {
1255 | "threshold": 0.8,
1256 | "higher": true
1257 | },
1258 | "3_Binder_pLDDT": {
1259 | "threshold": 0.8,
1260 | "higher": true
1261 | },
1262 | "4_Binder_pLDDT": {
1263 | "threshold": 0.8,
1264 | "higher": true
1265 | },
1266 | "5_Binder_pLDDT": {
1267 | "threshold": 0.8,
1268 | "higher": true
1269 | },
1270 | "Average_Binder_pTM": {
1271 | "threshold": null,
1272 | "higher": true
1273 | },
1274 | "1_Binder_pTM": {
1275 | "threshold": null,
1276 | "higher": true
1277 | },
1278 | "2_Binder_pTM": {
1279 | "threshold": null,
1280 | "higher": true
1281 | },
1282 | "3_Binder_pTM": {
1283 | "threshold": null,
1284 | "higher": true
1285 | },
1286 | "4_Binder_pTM": {
1287 | "threshold": null,
1288 | "higher": true
1289 | },
1290 | "5_Binder_pTM": {
1291 | "threshold": null,
1292 | "higher": true
1293 | },
1294 | "Average_Binder_pAE": {
1295 | "threshold": null,
1296 | "higher": false
1297 | },
1298 | "1_Binder_pAE": {
1299 | "threshold": null,
1300 | "higher": false
1301 | },
1302 | "2_Binder_pAE": {
1303 | "threshold": null,
1304 | "higher": false
1305 | },
1306 | "3_Binder_pAE": {
1307 | "threshold": null,
1308 | "higher": false
1309 | },
1310 | "4_Binder_pAE": {
1311 | "threshold": null,
1312 | "higher": false
1313 | },
1314 | "5_Binder_pAE": {
1315 | "threshold": null,
1316 | "higher": false
1317 | },
1318 | "Average_Binder_RMSD": {
1319 | "threshold": 2.5,
1320 | "higher": false
1321 | },
1322 | "1_Binder_RMSD": {
1323 | "threshold": 2.5,
1324 | "higher": false
1325 | },
1326 | "2_Binder_RMSD": {
1327 | "threshold": 2.5,
1328 | "higher": false
1329 | },
1330 | "3_Binder_RMSD": {
1331 | "threshold": 2.5,
1332 | "higher": false
1333 | },
1334 | "4_Binder_RMSD": {
1335 | "threshold": 2.5,
1336 | "higher": false
1337 | },
1338 | "5_Binder_RMSD": {
1339 | "threshold": 2.5,
1340 | "higher": false
1341 | }
1342 | }
--------------------------------------------------------------------------------
/settings_target/PDL1.json:
--------------------------------------------------------------------------------
1 | {
2 | "design_path": "/content/drive/My Drive/BindCraft/PDL1/",
3 | "binder_name": "PDL1",
4 | "starting_pdb": "/content/bindcraft/example/PDL1.pdb",
5 | "chains": "A",
6 | "target_hotspot_residues": "56",
7 | "lengths": [65, 150],
8 | "number_of_final_designs": 100
9 | }
--------------------------------------------------------------------------------