├── LICENSE
├── README.md
├── requirements.txt
├── tcia_dicom_to_nifti.py
├── tcia_dicom_to_nifti_generic.py
├── tcia_nifti_to_hdf5.py
└── tcia_nifti_to_mha.py


/LICENSE:
--------------------------------------------------------------------------------
MIT License

Copyright (c) 2022 midas.lab

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.

--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
# TCIA_processing: tcia_dicom_to_nifti.py

Script for converting TCIA DICOM data to NIfTI format (dataset: FDG-PET-CT-Lesions, doi: ).

## Requirements

To run the script you will need a number of Python packages. Use the terminal and run sequentially:

```bash
pip3 install numpy
pip3 install dicom2nifti
pip3 install nibabel
pip3 install pydicom
pip3 install tqdm
pip3 install nilearn
```
If you use a Colab or Jupyter notebook and cannot use the terminal, you can perform these installations by adding a "!" in front of the commands, e.g.
```python
!pip3 install numpy
...
```
## Data structure
DICOM data downloaded from TCIA will have the following format:

Directory structure of the original DICOM data within the folder /PATH/TO/DICOM/FDG-PET-CT-Lesions/ :

[image: directory tree of the original DICOM data]


## Usage

To run this script, use the terminal and navigate to the path where the script is stored, then run:

```bash
python3 tcia_dicom_to_nifti.py /PATH/TO/DICOM/FDG-PET-CT-Lesions/ /PATH/TO/NIFTI/FDG-PET-CT-Lesions/
```
where

```/PATH/TO/DICOM/FDG-PET-CT-Lesions/```
is the directory of the DICOM data downloaded from TCIA (see above: data structure) and
```/PATH/TO/NIFTI/FDG-PET-CT-Lesions/```
is the path you want to store the NIfTI files in.

You can ignore the nilearn warning:

```.../nilearn/image/resampling.py:527: UserWarning: Casting data from int16 to float32 warnings.warn("Casting data from %s to %s" % (data.dtype.name, aux))```

or, after making sure everything works, suppress warnings by running the script as:

```bash
python3 -W ignore tcia_dicom_to_nifti.py /PATH/TO/DICOM/FDG-PET-CT-Lesions/ /PATH/TO/NIFTI/FDG-PET-CT-Lesions/
```

## Output
The resulting NIfTI directory will have the following structure:

[image: directory tree of the resulting NIfTI data]

## Execution time
Running the script can take multiple hours.
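
Because a run can take hours, it may be worth checking afterwards that every study was converted completely. The following is a minimal sketch, assuming the output layout written by tcia_dicom_to_nifti.py (one folder per patient, one subfolder per study, each holding `CT.nii.gz`, `CTres.nii.gz`, `PET.nii.gz`, `SUV.nii.gz` and `SEG.nii.gz`). The helper name `find_incomplete_studies` is hypothetical and not part of this repository; it uses only the Python standard library.

```python
import pathlib as plb

# File names that tcia_dicom_to_nifti.py writes into every study directory
EXPECTED_FILES = ("CT.nii.gz", "CTres.nii.gz", "PET.nii.gz", "SUV.nii.gz", "SEG.nii.gz")


def find_incomplete_studies(nifti_root):
    # Hypothetical helper (not part of this repository).
    # Returns a dict that maps each study directory to the list of
    # expected files that are missing from it.
    incomplete = {}
    for patient_dir in sorted(plb.Path(nifti_root).iterdir()):
        if not patient_dir.is_dir():
            continue
        for study_dir in sorted(patient_dir.iterdir()):
            if not study_dir.is_dir():
                continue
            missing = [name for name in EXPECTED_FILES
                       if not (study_dir / name).is_file()]
            if missing:
                incomplete[study_dir] = missing
    return incomplete
```

Calling `find_incomplete_studies("/PATH/TO/NIFTI/FDG-PET-CT-Lesions/")` returns an empty dictionary when all studies are complete. The check only looks at file names; it does not open the files or validate their contents.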

# TCIA_processing: tcia_nifti_to_mha.py

Script for converting TCIA NIfTI data (created using tcia_dicom_to_nifti.py, see above) to MHA files.

## Requirements

To run the script you will need a number of Python packages. Use the terminal and run sequentially:

```bash
pip3 install SimpleITK
pip3 install tqdm
```
If you use a Colab or Jupyter notebook and cannot use the terminal, you can perform these installations by adding a "!" in front of the commands, e.g.
```python
!pip3 install SimpleITK
...
```
## Usage

To run this script, use the terminal and navigate to the path where the script is stored, then run:

```bash
python3 tcia_nifti_to_mha.py /PATH/TO/NIFTI/FDG-PET-CT-Lesions/ /PATH/TO/MHA/FDG-PET-CT-Lesions/
```
where

```/PATH/TO/NIFTI/FDG-PET-CT-Lesions/```
is the directory of the NIfTI data generated using tcia_dicom_to_nifti.py (see above) and
```/PATH/TO/MHA/FDG-PET-CT-Lesions/```
is the path you want to store the MHA files in.

You can ignore the SimpleITK warning:

```WARNING: In /tmp/SimpleITK-build/ITK/Modules/IO/Meta/src/itkMetaImageIO.cxx, line 669 MetaImageIO (0x2d9b300): Unsupported or empty metaData item intent_name of type Ssfound, won't be written to image file```

or, after making sure everything works, suppress warnings by running the script as:

```bash
python3 -W ignore tcia_nifti_to_mha.py /PATH/TO/NIFTI/FDG-PET-CT-Lesions/ /PATH/TO/MHA/FDG-PET-CT-Lesions/
```

# TCIA_processing: tcia_nifti_to_hdf5.py

Script for converting TCIA NIfTI data (created using tcia_dicom_to_nifti.py, see above) to a single HDF5 file.

## Requirements

To run the script you will need a number of Python packages. Use the terminal and run sequentially:

```bash
pip3 install numpy
pip3 install h5py
pip3 install tqdm
pip3 install nibabel
```
If you use a Colab or Jupyter notebook and cannot use the terminal, you can perform these installations by adding a "!" in front of the commands, e.g.
```python
!pip3 install numpy
...
```
## Usage

To run this script, use the terminal and navigate to the path where the script is stored, then run:

```bash
python3 tcia_nifti_to_hdf5.py /PATH/TO/NIFTI/FDG-PET-CT-Lesions/ /PATH/TO/HDF5/FDG-PET-CT-Lesions.hdf5
```
where

```/PATH/TO/NIFTI/FDG-PET-CT-Lesions/```
is the directory of the NIfTI data generated using tcia_dicom_to_nifti.py (see above) and
```/PATH/TO/HDF5/FDG-PET-CT-Lesions.hdf5```
is the path and filename of the HDF5 file to be created.

## Package Versions
All scripts were tested under Python 3.9 with the following package versions:

dicom2nifti==2.3.3

nibabel==3.2.2

pydicom==2.3.0

h5py==3.6.0

tqdm==4.64.0

SimpleITK==2.1.1.2

nilearn==0.9.1

numpy==1.22.3

## License
[MIT](https://choosealicense.com/licenses/mit/)

--------------------------------------------------------------------------------
/requirements.txt:
--------------------------------------------------------------------------------
numpy
dicom2nifti
nibabel
pydicom
h5py
tqdm
SimpleITK
nilearn

# All scripts were tested under python 3.9 with the following package versions:

#dicom2nifti==2.3.3
#nibabel==3.2.2
#pydicom==2.3.0
#h5py==3.6.0
#tqdm==4.64.0
#SimpleITK==2.1.1.2
#nilearn==0.9.1
#numpy==1.22.3

--------------------------------------------------------------------------------
/tcia_dicom_to_nifti.py:
--------------------------------------------------------------------------------
# data preparation (conversion of DICOM PET/CT studies to nifti format for running automated lesion segmentation)

# run script from command line as follows:
# python tcia_dicom_to_nifti.py /PATH/TO/DICOM/FDG-PET-CT-Lesions/ /PATH/TO/NIFTI/FDG-PET-CT-Lesions/

# you can ignore the nilearn warning:
# .../nilearn/image/resampling.py:527: UserWarning: Casting data from int16 to float32 warnings.warn("Casting data from %s to %s" % (data.dtype.name, aux))
# or run as python -W ignore tcia_dicom_to_nifti.py /PATH/TO/DICOM/FDG-PET-CT-Lesions/ /PATH/TO/NIFTI/FDG-PET-CT-Lesions/

import pathlib as plb
import tempfile
import os
import dicom2nifti
import nibabel as nib
import numpy as np
import pydicom
import sys
import shutil
import nilearn.image
from tqdm import tqdm


def find_studies(path_to_data):
    # find all studies (expects the layout <root>/<patient>/<study>)
    dicom_root = plb.Path(path_to_data)
    patient_dirs = list(dicom_root.glob('*'))

    study_dirs = []

    for patient_dir in patient_dirs:
        sub_dirs = list(patient_dir.glob('*'))
        study_dirs.extend(sub_dirs)

    return study_dirs


def identify_modalities(study_dir):
    # identify the CT, PET and mask subfolders of a study
    # returns a dictionary that maps each DICOM modality ('CT', 'PT', 'SEG') to its series directory;
    # the key 'ID' holds the StudyInstanceUID
    study_dir = plb.Path(study_dir)
    sub_dirs = list(study_dir.glob('*'))

    modalities = {}

    for series_dir in sub_dirs:
        first_file = next(series_dir.glob('*.dcm'))
        ds = pydicom.dcmread(str(first_file))
        modality = ds.Modality
        modalities[modality] = series_dir

    modalities["ID"] = ds.StudyInstanceUID
    return modalities


def dcm2nii_CT(CT_dcm_path, nii_out_path):
    # conversion of CT DICOM (in the CT_dcm_path) to nifti and save in nii_out_path
    with tempfile.TemporaryDirectory() as tmp:  # convert CT
        tmp = plb.Path(str(tmp))
        # convert dicom directory to nifti
        # (store results in temp directory)
        dicom2nifti.convert_directory(CT_dcm_path, str(tmp),
                                      compression=True, reorient=True)
        nii = next(tmp.glob('*nii.gz'))
        # copy niftis to output folder with consistent naming
        shutil.copy(nii, nii_out_path/'CT.nii.gz')


def dcm2nii_PET(PET_dcm_path, nii_out_path):
    # conversion of PET DICOM (in the PET_dcm_path) to nifti (and SUV nifti) and save in nii_out_path
    first_pt_dcm = next(PET_dcm_path.glob('*.dcm'))
    suv_corr_factor = calculate_suv_factor(first_pt_dcm)

    with tempfile.TemporaryDirectory() as tmp:  # convert PET
        tmp = plb.Path(str(tmp))
        # convert dicom directory to nifti
        # (store results in temp directory)
        dicom2nifti.convert_directory(PET_dcm_path, str(tmp),
                                      compression=True, reorient=True)
        nii = next(tmp.glob('*nii.gz'))
        # copy nifti to output folder with consistent naming
        shutil.copy(nii, nii_out_path/'PET.nii.gz')

    # convert pet images to quantitative suv images and save nifti file
    suv_pet_nii = convert_pet(nib.load(nii_out_path/'PET.nii.gz'), suv_factor=suv_corr_factor)
    nib.save(suv_pet_nii, nii_out_path/'SUV.nii.gz')


def conv_time(time_str):
    # converts a DICOM time string (HHMMSS.FFFFFF) to seconds since midnight
    return (float(time_str[:2]) * 3600 + float(time_str[2:4]) * 60 + float(time_str[4:13]))


def calculate_suv_factor(dcm_path):
    # reads a PET dicom file and calculates the SUV conversion factor
    ds = pydicom.dcmread(str(dcm_path))
    total_dose = ds.RadiopharmaceuticalInformationSequence[0].RadionuclideTotalDose
    start_time = ds.RadiopharmaceuticalInformationSequence[0].RadiopharmaceuticalStartTime
    half_life = ds.RadiopharmaceuticalInformationSequence[0].RadionuclideHalfLife
    acq_time = ds.AcquisitionTime
    weight = ds.PatientWeight
    # seconds between tracer injection and image acquisition
    time_diff = conv_time(acq_time) - conv_time(start_time)
    # injected dose, decay-corrected to the acquisition time
    act_dose = total_dose * 0.5 ** (time_diff / half_life)
    # factor 1000 converts the patient weight from kg to g
    suv_factor = 1000 * weight / act_dose
    return suv_factor


def convert_pet(pet, suv_factor):
    # function for conversion of PET values to SUV (should work on Siemens PET/CT)
    affine = pet.affine
    pet_data = pet.get_fdata()
    pet_suv_data = (pet_data*suv_factor).astype(np.float32)
    pet_suv = nib.Nifti1Image(pet_suv_data, affine)
    return pet_suv


def dcm2nii_mask(mask_dcm_path, nii_out_path):
    # conversion of the mask dicom file to nifti (not directly possible with dicom2nifti)
    mask_dcm = list(mask_dcm_path.glob('*.dcm'))[0]
    mask = pydicom.dcmread(str(mask_dcm))
    mask_array = mask.pixel_array

    # get mask array to correct orientation (this procedure is dataset specific)
    mask_array = np.transpose(mask_array, (2, 1, 0))
    mask_orientation = mask[0x5200, 0x9229][0].PlaneOrientationSequence[0].ImageOrientationPatient
    if mask_orientation[4] == 1:
        mask_array = np.flip(mask_array, 1)

    # get affine matrix from the corresponding pet
    pet = nib.load(str(nii_out_path/'PET.nii.gz'))
    pet_affine = pet.affine

    # save mask as nifti file
    mask_out = nib.Nifti1Image(mask_array, pet_affine)
    nib.save(mask_out, nii_out_path/'SEG.nii.gz')


def resample_ct(nii_out_path):
    # resample CT to PET and mask resolution
    ct = nib.load(nii_out_path/'CT.nii.gz')
    pet = nib.load(nii_out_path/'PET.nii.gz')
    CTres = nilearn.image.resample_to_img(ct, pet, fill_value=-1024)
    nib.save(CTres, nii_out_path/'CTres.nii.gz')


def tcia_to_nifti(tcia_path, nii_out_path, modality='CT'):
    # conversion for a single series
    # creates a nifti file for one patient/study
    # tcia_path: path to the DICOM series directory of one modality for a specific study of one patient
    # nii_out_path: path to a directory where nifti file for one patient, study and modality will be stored
    # modality: modality to be converted, CT, PET or mask ('CT', 'PT', 'SEG'); 'PET' is accepted as well
    tcia_path = plb.Path(tcia_path)
    nii_out_path = plb.Path(nii_out_path)
    os.makedirs(nii_out_path, exist_ok=True)
    if modality == 'CT':
        dcm2nii_CT(tcia_path, nii_out_path)
        # resampling needs PET.nii.gz to be present in nii_out_path already
        resample_ct(nii_out_path)
    elif modality in ('PT', 'PET'):
        dcm2nii_PET(tcia_path, nii_out_path)
    elif modality == 'SEG':
        dcm2nii_mask(tcia_path, nii_out_path)


def tcia_to_nifti_study(study_path, nii_out_root):
    # conversion for a single study
    # creates NIfTI files for one patient
    # study_path: path to a study directory containing all DICOM files for a specific study of one patient
    # nii_out_root: root output directory; the nifti files are stored in nii_out_root/<patient>/<study>/
    study_path = plb.Path(study_path)
    modalities = identify_modalities(study_path)
    nii_out_path = plb.Path(nii_out_root) / study_path.parent.name
    nii_out_path = nii_out_path/study_path.name
    os.makedirs(nii_out_path, exist_ok=True)

    ct_dir = modalities["CT"]
    dcm2nii_CT(ct_dir, nii_out_path)

    pet_dir = modalities["PT"]
    dcm2nii_PET(pet_dir, nii_out_path)

    seg_dir = modalities["SEG"]
    dcm2nii_mask(seg_dir, nii_out_path)

    resample_ct(nii_out_path)


def convert_tcia_to_nifti(study_dirs, nii_out_root):
    # batch conversion of all patients
    for study_dir in tqdm(study_dirs):

        patient = study_dir.parent.name
        print("The following patient directory is being processed: ", patient)

        tcia_to_nifti_study(study_dir, nii_out_root)


if __name__ == "__main__":
    path_to_data = plb.Path(sys.argv[1])  # path to downloaded TCIA DICOM database, e.g. '.../FDG-PET-CT-Lesions/'
    nii_out_root = plb.Path(sys.argv[2])  # path to the to be created NIfTI files, e.g. '...tcia_nifti/FDG-PET-CT-Lesions/'

    study_dirs = find_studies(path_to_data)
    convert_tcia_to_nifti(study_dirs, nii_out_root)

--------------------------------------------------------------------------------
/tcia_dicom_to_nifti_generic.py:
--------------------------------------------------------------------------------
# data preparation (conversion of DICOM series to nifti format)

# run script from command line as follows:
# python tcia_dicom_to_nifti_generic.py /PATH/TO/DICOM/TCIA_dataset_name/ /PATH/TO/NIFTI/TCIA_dataset_name/
# if not existing the output folder(s) (/PATH/TO/NIFTI/TCIA_dataset_name/) will be generated

import pathlib as plb
import tempfile
import os
import dicom2nifti
import nibabel as nib
import numpy as np
import pydicom
import sys
import shutil
from tqdm import tqdm


def find_studies(path_to_data):
    # find all studies (expects the layout <root>/<patient>/<study>)
    dicom_root = plb.Path(path_to_data)
    patient_dirs = list(dicom_root.glob('*'))

    study_dirs = []

    for patient_dir in patient_dirs:
        sub_dirs = list(patient_dir.glob('*'))
        study_dirs.extend(sub_dirs)

    return study_dirs


def get_series(study_dir):
    # returns paths of series directories
    study_dir = plb.Path(study_dir)
    series_dirs = list(study_dir.glob('*'))

    return series_dirs


def dcm2nii(dcm_path, nii_out_path):
    # conversion of DICOM to nifti and save in nii_out_path
    dicom2nifti.convert_directory(str(dcm_path), str(nii_out_path),
                                  compression=True, reorient=True)


def convert_tcia_to_nifti(study_dirs, nii_out_root):
    # batch conversion of all patients
    for study_dir in tqdm(study_dirs):

        patient = study_dir.parent.name
        print("The following patient directory is being processed: ", patient)

        series_dirs = get_series(study_dir)

        nii_out_path = plb.Path(nii_out_root/study_dir.name)
        os.makedirs(nii_out_path, exist_ok=True)

        for series in series_dirs:
            try:
                dcm2nii(series, nii_out_path)
            except Exception as error:
                # report the failing series together with the error message and continue
                print('An error occurred, data may be (partially) not converted: ' + str(series))
                print(error)


if __name__ == "__main__":
    path_to_data = plb.Path(sys.argv[1])  # path to downloaded TCIA DICOM database, e.g. '...TCIA/manifest-1647440690095/FDG-PET-CT-Lesions/'
    nii_out_root = plb.Path(sys.argv[2])  # path to the to be created NIfTI files, e.g. '...tcia_nifti/FDG-PET-CT-Lesions/'

    study_dirs = find_studies(path_to_data)
    convert_tcia_to_nifti(study_dirs, nii_out_root)

--------------------------------------------------------------------------------
/tcia_nifti_to_hdf5.py:
--------------------------------------------------------------------------------
# data preparation (conversion of NIfTI PET/CT studies to HDF5 format for running automated lesion segmentation)

# run script from command line as follows:
# python tcia_nifti_to_hdf5.py /PATH/TO/NIFTI/FDG-PET-CT-Lesions/ /PATH/TO/HDF5/FDG-PET-CT-Lesions.hdf5

import h5py
from tqdm import tqdm
import pathlib as plb
import sys
import os
import nibabel as nib
import numpy as np

# HDF5 dataset names and the NIfTI files (created by tcia_dicom_to_nifti.py) they are read from
VOLUMES = {'suv': 'SUV.nii.gz', 'ctres': 'CTres.nii.gz', 'ct': 'CT.nii.gz',
           'pet': 'PET.nii.gz', 'seg': 'SEG.nii.gz'}


def find_studies(path_to_data):
    # find all studies (expects the layout <root>/<patient>/<study>)
    nifti_root = plb.Path(path_to_data)
    patient_dirs = list(nifti_root.glob('*'))

    study_dirs = []

    for patient_dir in patient_dirs:
        sub_dirs = list(patient_dir.glob('*'))
        study_dirs.extend(sub_dirs)

    return study_dirs


def nifti_to_hdf5(nii_file, path_to_h5_file):
    # conversion for a single file
    # creates an hdf5 file that holds one nifti volume
    # nii_file: path to a single nifti file
    # path_to_h5_file: path to the hdf5 file to be created
    data = nib.load(nii_file)
    with h5py.File(path_to_h5_file, 'w') as h5_file:
        # create_dataset needs a dataset name; 'data' is a placeholder chosen here
        h5_file.create_dataset('data', data=data.get_fdata(), compression="gzip")


def write_study(h5_file, study_path):
    # stores all volumes of one study as gzip-compressed datasets under <patient>/<study>/
    # (missing intermediate groups are created automatically)
    study_path = plb.Path(study_path)
    patient = study_path.parent.name
    study = study_path.name

    group = h5_file.require_group(patient + '/' + study)
    for name, file_name in VOLUMES.items():
        volume = nib.load(str(study_path / file_name)).get_fdata()
        group.create_dataset(name, data=volume, compression="gzip")


def nifti_to_hdf5_study(study_path, path_to_h5_file):
    # conversion for a single study
    # creates an hdf5 file for one patient
    # study_path: path to a study directory containing all nifti files for a specific study of one patient
    # path_to_h5_file: path to a single hdf5 file for one patient and study
    with h5py.File(path_to_h5_file, 'w') as h5_file:
        write_study(h5_file, study_path)


def convert_nifti_to_hdf5(study_dirs, path_to_h5_data):
    # batch conversion of all patients
    # creates a single hdf5 file for all patients
    # study_dirs: NIfTI study directories for all patients
    # path_to_h5_data: path to a single hdf5 file for all patients
    with h5py.File(path_to_h5_data, 'w') as h5_file:
        for study_dir in tqdm(study_dirs):
            write_study(h5_file, study_dir)


if __name__ == "__main__":
    path_to_data = sys.argv[1]  # path to converted NIfTI files (see tcia_dicom_to_nifti.py) from downloaded TCIA DICOM database, e.g. '...tcia_nifti/FDG-PET-CT-Lesions/'
    path_to_h5_data = sys.argv[2]  # path to the to be saved HDF5 file, e.g. '...hdf5/FDG-PET-CT-Lesions.hdf5'
    study_dirs = find_studies(path_to_data)
    convert_nifti_to_hdf5(study_dirs, path_to_h5_data)

--------------------------------------------------------------------------------
/tcia_nifti_to_mha.py:
--------------------------------------------------------------------------------
# converts the entire dataset from the .nii.gz format to the .mha format
# (the .mha format is required by grand-challenge.org as input and output data of algorithms)

# run script from command line as follows:
# python tcia_nifti_to_mha.py /PATH/TO/NIFTI/FDG-PET-CT-Lesions/ /PATH/TO/MHA/FDG-PET-CT-Lesions/

import SimpleITK as sitk
import pathlib as plb
from tqdm import tqdm
import os
import sys


def find_studies(path_to_data):  # returns a list of unique study paths within the dataset
    nifti_root = plb.Path(path_to_data)
    patient_dirs = list(nifti_root.glob('*'))

    study_dirs = []

    for patient_dir in patient_dirs:
        sub_dirs = list(patient_dir.glob('*'))
        study_dirs.extend(sub_dirs)

    return study_dirs


def nii_to_mha(nii_path, mha_out_path):  # converts a .nii.gz file to .mha and saves to a specified path
    img = sitk.ReadImage(nii_path)
    sitk.WriteImage(img, mha_out_path, True)  # True: write the file with compression


def convert_to_mha(study_dirs, path_to_mha_data):  # main function converting the entire dataset from .nii.gz to .mha

    for study_dir in tqdm(study_dirs):

        patient = study_dir.parent.name
        study = study_dir.name

        # all volumes of a study are written to <path_to_mha_data>/<patient>/<study>/
        mha_dir = os.path.join(path_to_mha_data, patient, study)
        os.makedirs(mha_dir, exist_ok=True)

        for name in ('SUV', 'CTres', 'CT', 'PET', 'SEG'):
            nii_to_mha(str(study_dir/(name + '.nii.gz')), os.path.join(mha_dir, name + '.mha'))


if __name__ == "__main__":

    path_to_nii_data = sys.argv[1]  # path to nifti data, e.g. .../nifti/FDG-PET-CT-Lesions/
    path_to_mha_data = sys.argv[2]  # output path for mha data, e.g. .../mha/FDG-PET-CT-Lesions/ (will be created if it does not exist)
    study_dirs = find_studies(path_to_nii_data)

    convert_to_mha(study_dirs, path_to_mha_data)

--------------------------------------------------------------------------------