├── example_images
    ├── VirtualIce_example1.png
    ├── VirtualIce_example2.png
    ├── VirtualIce_example3.png
    └── VirtualIce_example4.png
├── ice_images
    └── download_EMPIAR-12287_ice_images_directory_here.txt
├── LICENSE
└── README.md


/example_images/VirtualIce_example1.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/alexjnoble/VirtualIce/HEAD/example_images/VirtualIce_example1.png
--------------------------------------------------------------------------------
/example_images/VirtualIce_example2.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/alexjnoble/VirtualIce/HEAD/example_images/VirtualIce_example2.png
--------------------------------------------------------------------------------
/example_images/VirtualIce_example3.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/alexjnoble/VirtualIce/HEAD/example_images/VirtualIce_example3.png
--------------------------------------------------------------------------------
/example_images/VirtualIce_example4.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/alexjnoble/VirtualIce/HEAD/example_images/VirtualIce_example4.png
--------------------------------------------------------------------------------
/ice_images/download_EMPIAR-12287_ice_images_directory_here.txt:
--------------------------------------------------------------------------------
To use VirtualIce, download the contents of the ice_images/ directory from the EMPIAR-12287 entry (https://doi.org/10.6019/EMPIAR-12287) and place them in this directory. The directory should consist of vitrified buffer micrographs (MRC format), JSON files created by labeling obscuring objects with AnyLabeling and with the same base filenames as the micrographs, and a single text file named good_images_with_defocus.txt that lists all .mrc micrographs in the first column and, in a space-delimited second column, each micrograph's average defocus value in microns.

--------------------------------------------------------------------------------
/LICENSE:
--------------------------------------------------------------------------------
MIT License

Copyright (c) 2023 alexjnoble

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.

--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
[![DOI](https://zenodo.org/badge/DOI/10.5281/zenodo.13852111.svg)](https://doi.org/10.5281/zenodo.13852111)
# VirtualIce: Half-synthetic CryoEM Micrograph Generator

VirtualIce is a feature-rich half-synthetic cryoEM micrograph generator that uses buffer cryoEM micrographs with junk and carbon masked out as real background. It projects PDB, EMDB, or local structures onto buffer cryoEM micrographs, simulating realistic imaging conditions by adding noise, dose damage, and applying CTF to particles. It outputs particle coordinates after masking out junk and, if requested, the particles themselves.

## Release Notes

### v2.0.0 - September 27, 2024

#### Features

- Various bug fixes and version release.

### v2.0.0beta - September 19, 2024

- Multiple structures per micrograph can now be requested (structure sets).
  - Use the same `--structures` flag followed by either a single structure or multiple. Supports any number of structure sets, like this:
    - `virtualice.py -s 1TIM [1PMA, 50882] [my_structure1.mrc, 3DRE, 6TIM]`
    - The above command will make one set of micrographs with only PDB 1TIM, another set with PDB 1PMA and EMD-50882, and another set with a local file (my_structure1.mrc), PDB 3DRE, and PDB 6TIM.
  - Preferred orientation, particle distributions, and overlapping & aggregated particles are fully supported.
  - Filtering of edge, overlapping, and obscured particles is fully supported.
  - Coordinate files are saved independently in .star, .mod, and/or .coord files (one per structure in a structure set).
- This update is significant because it allows for ground-truth datasets of heterogeneous proteins - e.g. continuous or discrete conformations, compositional heterogeneity, or completely different proteins.
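
The grouping rule described above can be illustrated with a short Python sketch. This is not VirtualIce's own argument parser: `parse_structure_sets` is a hypothetical helper, and it assumes that the shell passes the brackets through literally and that members of a set are separated by commas and/or spaces.

```python
import re


def parse_structure_sets(tokens):
    """Group command-line tokens into structure sets (illustrative only).

    A bare token forms a one-structure set; tokens enclosed in [ ... ] form
    one multi-structure set. Hypothetical helper, not part of VirtualIce.
    """
    text = " ".join(tokens)
    sets = []
    # Match either a bracketed group or a single bare token.
    for group, single in re.findall(r"\[([^\]]*)\]|([^\s\[\],]+)", text):
        if single:
            sets.append([single])
        else:
            members = [m for m in re.split(r"[,\s]+", group) if m]
            if members:  # ignore empty brackets
                sets.append(members)
    return sets


# Tokens as a shell would split the example command's -s values.
argv = ["1TIM", "[1PMA,", "50882]", "[my_structure1.mrc,", "3DRE,", "6TIM]"]
print(parse_structure_sets(argv))
# [['1TIM'], ['1PMA', '50882'], ['my_structure1.mrc', '3DRE', '6TIM']]
```

Each inner list corresponds to one set of micrographs, matching the three sets described for the example command.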

### v1.0.1 - September 19, 2024

- Last release of VirtualIce for single-structure micrographs. Contains minor printout updates compared to v1.0.0.

### v1.0.0 - September 10, 2024

- Generates half-synthetic cryoEM micrographs and particles from buffer images and PDB IDs, EMDB IDs, or local files.
- Creates coordinate files (.star, .mod, .coord), not including particles obscured by junk/substrate or too close to the edge.
- Adds Poisson noise and dose-dependent damage to simulated frames and Gaussian noise to particles.
- Applies the Contrast Transfer Function (CTF) to simulate microscope optics.
- Control over overlapping particles and particle aggregation.
- Outputs micrographs in MRC, PNG, and JPEG formats, and optionally cropped particles as MRCs.
- Multi-core and GPU processing.
- Extensive customization options including particle distribution, ice thickness, microscope parameters, and downsampling.

## Requirements and Installation

VirtualIce requires Python 3, EMAN2, IMOD, and several Python packages; the packages can be installed using pip:

```bash
pip install cupy gpustat mrcfile numpy opencv-python pandas scipy SimpleITK
```

To use VirtualIce, clone the GitHub repository, make virtualice.py executable (`chmod +x virtualice.py`), download the ice_images/ directory from [EMPIAR-12287](https://www.ebi.ac.uk/empiar/EMPIAR-12287/) to the VirtualIce/ice_images/ directory, and ensure virtualice.py is on your `PATH`.

## Examples


### Example 1 (top-left: PDB 1TIM, 53 kDa, top-right: [1DAT, 442 kDa, 7ZP8, 1385 kDa, and 2HCO, 65 kDa], bottom-left: 1RUZ, 165 kDa, bottom-right: 1PMA, 686 kDa):
![Example 1](example_images/VirtualIce_example1.png)

### Example 2 (T20S Proteasome; PDB 1PMA, 686 kDa):
![Example 2](example_images/VirtualIce_example2.png)

### Example 3 (Triose Phosphate Isomerase; PDB 1TIM, 53 kDa):
![Example 3](example_images/VirtualIce_example3.png)

### Example 4 (TRPV5; EMD 0594, 306 kDa):
![Example 4](example_images/VirtualIce_example4.png)

## User Guide

The script can be run from the command line and takes a number of arguments.

### Basic example usage:

```
virtualice.py -s 1TIM -n 10
```

Generates `-n` _10_ random micrographs of PDB `-s` _1TIM_.

Arguments:

- `-s`, `--structures`: Specify PDB ID(s), EMDB ID(s), local files, and/or 'r' for random PDB/EMDB structures.
- `-n`, `--num_images`: Number of micrographs to generate.
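
Before the first run, it can help to confirm that ice_images/ matches the layout described in ice_images/download_EMPIAR-12287_ice_images_directory_here.txt: .mrc micrographs, AnyLabeling .json files with the same base filenames, and good_images_with_defocus.txt with a micrograph name and a defocus value (in microns) on each line. The following is a minimal sketch of such a check and is not part of VirtualIce; `check_ice_images` is a hypothetical helper, and it assumes that filenames contain no spaces and that every listed micrograph has a matching JSON file.

```python
from pathlib import Path


def check_ice_images(ice_dir):
    """Return a list of problems found in an ice_images/ directory.

    Hypothetical helper for illustration; not part of VirtualIce.
    """
    ice_dir = Path(ice_dir)
    list_file = ice_dir / "good_images_with_defocus.txt"
    if not list_file.is_file():
        return [f"missing {list_file.name}"]

    problems = []
    for line_no, line in enumerate(list_file.read_text().splitlines(), start=1):
        if not line.strip():
            continue  # skip blank lines
        fields = line.split()
        if len(fields) != 2:
            problems.append(f"line {line_no}: expected 2 columns, got {len(fields)}")
            continue
        name, defocus = fields
        try:
            float(defocus)  # second column: average defocus in microns
        except ValueError:
            problems.append(f"line {line_no}: defocus {defocus!r} is not a number")
        mrc = ice_dir / name
        if not mrc.is_file():
            problems.append(f"line {line_no}: micrograph {name} not found")
        # Assumption: each micrograph has a JSON file with the same base name.
        if not mrc.with_suffix(".json").is_file():
            problems.append(f"line {line_no}: no JSON file for {name}")
    return problems


if __name__ == "__main__":
    for problem in check_ice_images("ice_images"):
        print(problem)
```

An empty result means that every listed micrograph was found alongside a JSON file and a numeric defocus value; it says nothing about whether the files themselves are valid.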

### Basic example usage:

```
virtualice.py -s [1TIM, 11638] 1PMA -n 50
```

Generates `-n` _50_ random micrographs for the structure set consisting of `-s` PDB _1TIM_ and EMDB-_11638_ (multi-structure micrographs), and `-n` _50_ random micrographs of `-s` PDB _1PMA_ (single-structure micrographs).

### Advanced example usage:

```
virtualice.py -s 1TIM r my_structure.mrc 11638 -n 3 -I -P -J -Q 90 -b 4 -D n -ps 2
```

Generates `-n` _3_ random micrographs of PDB `-s` _1TIM_, a random EMDB/PDB structure, a local structure called _my_structure.mrc_, and EMD-_11638_. Outputs an `-I` IMOD .mod coordinate file, `-P` PNG, and `-J` JPEG (quality `-Q` _90_) for each micrograph, and bins `-b` all images by _4_. Uses a `-D` non-random distribution of particles and `-ps` parallelizes micrograph generation across _2_ CPUs.

Arguments:

- `-I`, `--imod_coordinate_file`: Also output one IMOD .mod coordinate file per micrograph.
- `-P`, `--png`: Output in PNG format.
- `-J`, `--jpeg`: Output in JPEG format.
- `-Q`, `--jpeg-quality`: JPEG image quality.
- `-b`, `--binning`: Bin micrographs by downsampling.
- `-D`, `--distribution`: Distribution type for generating particle locations.
- `-ps`, `--parallellize_structures`: Parallel processes for micrograph generation across the structures requested.


### Advanced example usage:

```
virtualice.py -s 1PMA -n 5 -om preferred -pw 0.9 -pa [*,90,0] [90 180 *] -aa l h r -ne --use_cpu -V 2 -3
```

Generates `-n` _5_ random micrographs of PDB `-s` _1PMA_ (proteasome) with `-om` _preferred_ orientation for `-pw` 90% (0.9) of particles. The preferred orientations are defined by random selections of `-pa` _[*,90,0]_ (free to rotate about the first Z axis, then rotate 90 degrees in Y, do not rotate in Z) and _[90 180 *]_ (rotate 90 degrees about the first Z axis, then rotate 180 degrees in Y, then free to rotate about the resulting Z). The `-aa` aggregation amount is chosen from _l_ow and _h_igh values _r_andomly for each of the 5 micrographs. `-ne` Edge particles are not included. `--use_cpu` Only CPUs are used (no GPUs). Terminal `-V` verbosity is set to _2_ (verbose). The resulting micrographs are opened with `-3` 3dmod after generation.

Arguments:

- `-om`, `--orientation_mode`: Orientation mode for projections.
- `-pw`, `--preferred_weight`: Weight of the preferred orientations in the range [0, 1].
- `-pa`, `--preferred_angles`: List of sets of three Euler angles (in degrees) for preferred orientations.
- `-aa`, `--aggregation_amount`: Amount of particle aggregation.
- `-ne`, `--no_edge_particles`: Prevent particles from being placed up to the edge of the micrograph.
- `--use_cpu`: Use CPU for processing instead of GPU.
- `-V`, `--verbosity`: Set verbosity level.
- `-3`, `--view_in_3dmod`: View generated micrographs in 3dmod at the end of the run.

Additional arguments exist for fine-tuning the generation process including ice thickness, junk filtering, and CTF parameters.

## Ethical Use Agreement

VirtualIce is under the MIT License, allowing broad usage freedom, but it should be used responsibly and ethically according to these guidelines:

### Intended Use

VirtualIce is designed for educational and research purposes, specifically to aid in the development, testing, and validation of cryoEM image analysis algorithms by generating half-synthetic cryoEM micrographs and particles.

### Ethical Considerations

- **Transparency**: Any data generated using VirtualIce should be clearly marked as synthetic when published or shared to distinguish it from real experimental data.
- **No Misrepresentation**: Users should not present synthetic data generated by VirtualIce as real data from physical experiments in any publications or presentations unless explicitly stated.
- **Research Integrity**: Users are encouraged to uphold the highest standards of scientific integrity in their work, ensuring that the use of synthetic data does not mislead, deceive, or otherwise harm the scientific community or the public.

## Issues and Support

If you encounter any problems or have any questions about the script, please [Submit an Issue](https://github.com/alexjnoble/VirtualIce/issues).

## Contributions

Contributions are welcome! Please open a [Pull Request](https://github.com/alexjnoble/VirtualIce/pulls) or [Issue](https://github.com/alexjnoble/VirtualIce/issues).

## Reference

For more details about the VirtualIce algorithm and its applications, see and reference the associated manuscript: [https://doi.org/10.1101/2024.09.28.615520](https://doi.org/10.1101/2024.09.28.615520)

## Author

This script was written by Alex J. Noble with assistance from OpenAI's GPT, Anthropic's Claude, and Google's Gemini models, 2023-2024 at SEMC.

## License

This project is licensed under the MIT License - see the [LICENSE](LICENSE) file for details.
The Ethical Use Agreement is compatible with the MIT License, providing guidelines and recommendations for responsible use without legally restricting software use.

--------------------------------------------------------------------------------