├── LICENSE
├── README.md
├── requirements.txt
├── tcia_dicom_to_nifti.py
├── tcia_dicom_to_nifti_generic.py
├── tcia_nifti_to_hdf5.py
└── tcia_nifti_to_mha.py
/LICENSE:
--------------------------------------------------------------------------------
1 | MIT License
2 |
3 | Copyright (c) 2022 midas.lab
4 |
5 | Permission is hereby granted, free of charge, to any person obtaining a copy
6 | of this software and associated documentation files (the "Software"), to deal
7 | in the Software without restriction, including without limitation the rights
8 | to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
9 | copies of the Software, and to permit persons to whom the Software is
10 | furnished to do so, subject to the following conditions:
11 |
12 | The above copyright notice and this permission notice shall be included in all
13 | copies or substantial portions of the Software.
14 |
15 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
16 | IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
17 | FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
18 | AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
19 | LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
20 | OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
21 | SOFTWARE.
22 |
--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
1 | # TCIA_processing: tcia_dicom_to_nifti.py
2 |
3 | Script for converting TCIA DICOM data to NIfTI format (dataset: FDG-PET-CT-Lesions, doi: ).
4 |
5 | ## Requirements
6 |
7 | To run the script you will need a number of python packages. Use the terminal and run sequentially:
8 |
9 | ```bash
10 | pip3 install numpy
11 | pip3 install dicom2nifti
12 | pip3 install nibabel
13 | pip3 install pydicom
14 | pip3 install tqdm
15 | pip3 install nilearn
16 | ```
17 | If you use a Colab or Jupyter notebook and cannot use the terminal, you can run these installations by prefixing the commands with "!", e.g.
18 | ```python
19 | !pip3 install numpy
20 | ...
21 | ```
22 | ## Data structure
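23 | DICOM data downloaded from TCIA is organized by patient, study and series.
24 |
25 | Within the folder /PATH/TO/DICOM/FDG-PET-CT-Lesions/ the script expects one directory per patient, each containing one directory per study, which in turn contains one directory of `.dcm` files per series (CT, PET and segmentation mask).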
26 |
27 |
28 |
29 |
30 | ## Usage
31 |
32 | In order to run this script use the terminal and navigate to the path where the script is stored, then run:
33 |
34 | ```bash
35 | python3 tcia_dicom_to_nifti.py /PATH/TO/DICOM/FDG-PET-CT-Lesions/ /PATH/TO/NIFTI/FDG-PET-CT-Lesions/
36 |
37 | ```
38 | where
39 |
40 | ```/PATH/TO/DICOM/FDG-PET-CT-Lesions/```
41 | is the directory of the DICOM data downloaded from TCIA (see above: data structure) and
42 | ```/PATH/TO/NIFTI/FDG-PET-CT-Lesions/```
43 | is the path you want to store the NIfTI files in.
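The script walks a two-level layout below the input directory: one folder per patient, each containing one folder per study (this mirrors `find_studies` in `tcia_dicom_to_nifti.py`). The following is a minimal sketch of that discovery step. The patient and study names are made up for illustration, and the `is_dir()` filter and sorting are additions that the original script does not perform.

```python
import pathlib
import tempfile


def find_studies(path_to_data):
    """Return all <patient>/<study> directories below path_to_data."""
    root = pathlib.Path(path_to_data)
    study_dirs = []
    for patient_dir in sorted(root.glob('*')):
        if not patient_dir.is_dir():
            continue  # skip stray files (an addition, not in the original script)
        study_dirs.extend(sorted(p for p in patient_dir.glob('*') if p.is_dir()))
    return study_dirs


def demo():
    # Hypothetical patient/study names, used only to illustrate the layout.
    with tempfile.TemporaryDirectory() as tmp:
        root = pathlib.Path(tmp)
        (root / 'patient_A' / 'study_1').mkdir(parents=True)
        (root / 'patient_A' / 'study_2').mkdir(parents=True)
        (root / 'patient_B' / 'study_1').mkdir(parents=True)
        return [f'{p.parent.name}/{p.name}' for p in find_studies(root)]


print(demo())
```

Each discovered study directory is then converted on its own, so the output tree repeats the same `<patient>/<study>` nesting.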
44 |
45 | You can ignore the nilearn warning:
46 |
47 | ```.../nilearn/image/resampling.py:527: UserWarning: Casting data from int16 to float32 warnings.warn("Casting data from %s to %s" % (data.dtype.name, aux))```
48 |
49 | or suppress warnings by running the script as (after making sure everything works):
50 |
51 | ```bash
52 | python3 -W ignore tcia_dicom_to_nifti.py /PATH/TO/DICOM/FDG-PET-CT-Lesions/ /PATH/TO/NIFTI/FDG-PET-CT-Lesions/
53 | ```
54 |
55 | ## Output
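56 | The resulting NIfTI directory mirrors the DICOM layout: for every study, `/PATH/TO/NIFTI/FDG-PET-CT-Lesions/<patient>/<study>/` contains `CT.nii.gz`, `PET.nii.gz`, `SUV.nii.gz` (PET converted to SUV), `SEG.nii.gz` (lesion mask) and `CTres.nii.gz` (CT resampled to the PET grid).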
57 |
58 |
59 |
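Among the output files, `SUV.nii.gz` is the PET volume multiplied by a scalar factor that `tcia_dicom_to_nifti.py` derives from DICOM tags (injected dose, half-life, injection and acquisition time, patient weight). The sketch below reproduces that arithmetic as it appears in `calculate_suv_factor`. The input values are hypothetical, and the half-life is deliberately set to 3600 s so that exactly one half-life elapses, which is not the real value for any tracer.

```python
def conv_time(time_str):
    """Convert a DICOM time string (HHMMSS.ffffff) to seconds since midnight."""
    return float(time_str[:2]) * 3600 + float(time_str[2:4]) * 60 + float(time_str[4:13])


def suv_factor(total_dose, half_life, start_time, acq_time, weight):
    """Factor as computed in the script: 1000 * weight / decay-corrected dose."""
    elapsed = conv_time(acq_time) - conv_time(start_time)
    decayed_dose = total_dose * 0.5 ** (elapsed / half_life)
    return 1000 * weight / decayed_dose


# Hypothetical values: dose 3.0e8 injected at 10:00:00, scan at 11:00:00,
# weight 75, and an illustrative half-life of 3600 s (one half-life elapsed).
factor = suv_factor(3.0e8, 3600.0, '100000.000000', '110000.000000', 75.0)
print(factor)
```

Note that the time difference is taken within a single day; a scan that crosses midnight would need extra handling that neither this sketch nor the script provides.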
60 | ## Execution time
61 | Running the script can take multiple hours.
62 |
63 | # TCIA_processing: tcia_nifti_to_mha.py
64 |
65 | Script for converting TCIA NIfTI data (created using tcia_dicom_to_nifti.py, see above) to MHA files.
66 |
67 | ## Requirements
68 |
69 | To run the script you will need a number of python packages. Use the terminal and run sequentially:
70 |
71 | ```bash
72 | pip3 install SimpleITK
73 | pip3 install tqdm
74 | ```
75 | If you use a Colab or Jupyter notebook and cannot use the terminal, you can run these installations by prefixing the commands with "!", e.g.
76 | ```python
77 | !pip3 install SimpleITK
78 | ...
79 | ```
80 | ## Usage
81 |
82 | In order to run this script use the terminal and navigate to the path where the script is stored, then run:
83 |
84 | ```bash
85 | python3 tcia_nifti_to_mha.py /PATH/TO/NIFTI/FDG-PET-CT-Lesions/ /PATH/TO/MHA/FDG-PET-CT-Lesions/
86 | ```
87 | where
88 |
89 | ```/PATH/TO/NIFTI/FDG-PET-CT-Lesions/```
90 | is the directory of the NIfTI data generated using tcia_dicom_to_nifti.py (see above) and
91 | ```/PATH/TO/MHA/FDG-PET-CT-Lesions/```
92 | is the path you want to store the MHA files in.
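The MHA output keeps the `<patient>/<study>` nesting of the NIfTI input and converts the five files of each study one by one. As a rough sketch of that path mapping (modelled on `convert_to_mha`; the helper name `mha_targets` and the patient/study names are hypothetical):

```python
import pathlib

# File stems the script expects in every study directory.
MODALITIES = ('SUV', 'CTres', 'CT', 'PET', 'SEG')


def mha_targets(study_dir, mha_root):
    """Map each expected NIfTI file of one study to its .mha output path."""
    study_dir = pathlib.Path(study_dir)
    out_dir = pathlib.Path(mha_root) / study_dir.parent.name / study_dir.name
    return {study_dir / f'{name}.nii.gz': out_dir / f'{name}.mha' for name in MODALITIES}


# Hypothetical patient/study names.
targets = mha_targets('/PATH/TO/NIFTI/patient_A/study_1', '/PATH/TO/MHA')
for src, dst in targets.items():
    print(src, '->', dst)
```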
93 |
94 | You can ignore the SimpleITK warning:
95 |
96 | ```.WARNING: In /tmp/SimpleITK-build/ITK/Modules/IO/Meta/src/itkMetaImageIO.cxx, line 669 MetaImageIO (0x2d9b300): Unsupported or empty metaData item intent_name of type Ssfound, won't be written to image file```
97 |
98 | or suppress warnings by running the script as (after making sure everything works):
99 |
100 | ```bash
101 | python3 -W ignore tcia_nifti_to_mha.py /PATH/TO/NIFTI/FDG-PET-CT-Lesions/ /PATH/TO/MHA/FDG-PET-CT-Lesions/
102 | ```
103 |
104 | # TCIA_processing: tcia_nifti_to_hdf5.py
105 |
106 | Script for converting TCIA NIfTI data (created using tcia_dicom_to_nifti.py, see above) to a single HDF5 file.
107 |
108 | ## Requirements
109 |
110 | To run the script you will need a number of python packages. Use the terminal and run sequentially:
111 |
112 | ```bash
113 | pip3 install numpy
114 | pip3 install h5py
115 | pip3 install tqdm
116 | pip3 install nibabel
117 | ```
118 | If you use a Colab or Jupyter notebook and cannot use the terminal, you can run these installations by prefixing the commands with "!", e.g.
119 | ```python
120 | !pip3 install numpy
121 | ...
122 | ```
123 | ## Usage
124 |
125 | In order to run this script use the terminal and navigate to the path where the script is stored, then run:
126 |
127 | ```bash
128 | python3 tcia_nifti_to_hdf5.py /PATH/TO/NIFTI/FDG-PET-CT-Lesions/ /PATH/TO/HDF5/FDG-PET-CT-Lesions.hdf5
129 |
130 | ```
131 | where
132 |
133 | ```/PATH/TO/NIFTI/FDG-PET-CT-Lesions/```
134 | is the directory of the NIfTI data generated using tcia_dicom_to_nifti.py (see above) and
135 | ```/PATH/TO/HDF5/FDG-PET-CT-Lesions.hdf5```
136 | is the path and filename of the hdf5 file to be created.
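Inside the HDF5 file, `tcia_nifti_to_hdf5.py` stores every study under a `<patient>/<study>` group with five datasets named `suv`, `ctres`, `ct`, `pet` and `seg`. The sketch below only builds those dataset paths; the helper name `dataset_keys` and the patient/study names are hypothetical, and the commented-out read assumes `h5py` is installed.

```python
def dataset_keys(patient, study):
    """Return the HDF5 dataset paths written for one study."""
    return [f'{patient}/{study}/{name}' for name in ('suv', 'ctres', 'ct', 'pet', 'seg')]


# Hypothetical patient/study names.
keys = dataset_keys('patient_A', 'study_1')
print(keys)

# With h5py available, one volume could then be read as, for example:
#   with h5py.File('/PATH/TO/HDF5/FDG-PET-CT-Lesions.hdf5', 'r') as f:
#       suv = f[keys[0]][()]
```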
137 |
138 | ## Package Versions
139 | All scripts were tested under python 3.9 with the following package versions:
140 |
141 | dicom2nifti==2.3.3
142 |
143 | nibabel==3.2.2
144 |
145 | pydicom==2.3.0
146 |
147 | h5py==3.6.0
148 |
149 | tqdm==4.64.0
150 |
151 | SimpleITK==2.1.1.2
152 |
153 | nilearn==0.9.1
154 |
155 | numpy==1.22.3
156 |
157 | ## License
158 | [MIT](https://choosealicense.com/licenses/mit/)
159 |
--------------------------------------------------------------------------------
/requirements.txt:
--------------------------------------------------------------------------------
1 | numpy
2 | dicom2nifti
3 | nibabel
4 | pydicom
5 | h5py
6 | tqdm
7 | SimpleITK
8 | nilearn
9 |
10 | # All scripts were tested under python 3.9 with the following package versions:
11 |
12 | #dicom2nifti==2.3.3
13 | #nibabel==3.2.2
14 | #pydicom==2.3.0
15 | #h5py==3.6.0
16 | #tqdm==4.64.0
17 | #SimpleITK==2.1.1.2
18 | #nilearn==0.9.1
19 | #numpy==1.22.3
20 |
--------------------------------------------------------------------------------
/tcia_dicom_to_nifti.py:
--------------------------------------------------------------------------------
1 | # data preparation (conversion of DICOM PET/CT studies to nifti format for running automated lesion segmentation)
2 |
3 | # run script from command line as follows:
4 | # python tcia_dicom_to_nifti.py /PATH/TO/DICOM/FDG-PET-CT-Lesions/ /PATH/TO/NIFTI/FDG-PET-CT-Lesions/
5 |
6 | # you can ignore the nilearn warning:
7 | # .../nilearn/image/resampling.py:527: UserWarning: Casting data from int16 to float32 warnings.warn("Casting data from %s to %s" % (data.dtype.name, aux))
8 | # or run as python -W ignore tcia_dicom_to_nifti.py /PATH/TO/DICOM/FDG-PET-CT-Lesions/ /PATH/TO/NIFTI/FDG-PET-CT-Lesions/
9 |
10 | import pathlib as plb
11 | import tempfile
12 | import os
13 | import dicom2nifti
14 | import nibabel as nib
15 | import numpy as np
16 | import pydicom
17 | import sys
18 | import shutil
19 | import nilearn.image
20 | from tqdm import tqdm
21 |
22 |
23 | def find_studies(path_to_data):
24 |     # find all studies
25 |     dicom_root = plb.Path(path_to_data)
26 |     patient_dirs = list(dicom_root.glob('*'))
27 |
28 |     study_dirs = []
29 |
30 |     for dir in patient_dirs:
31 |         sub_dirs = list(dir.glob('*'))
32 |         #print(sub_dirs)
33 |         study_dirs.extend(sub_dirs)
34 |
35 |     #dicom_dirs = dicom_dirs.append(dir.glob('*'))
36 |     return study_dirs
37 |
38 |
39 | def identify_modalities(study_dir):
40 |     # identify CT, PET and mask subfolders and return a dictionary of modalities and corresponding paths; the study ID is stored under the key "ID"
41 |     study_dir = plb.Path(study_dir)
42 |     sub_dirs = list(study_dir.glob('*'))
43 |
44 |     modalities = {}
45 |
46 |     for dir in sub_dirs:
47 |         first_file = next(dir.glob('*.dcm'))
48 |         ds = pydicom.dcmread(str(first_file))
49 |         #print(ds)
50 |         modality = ds.Modality
51 |         modalities[modality] = dir
52 |
53 |     modalities["ID"] = ds.StudyInstanceUID
54 |     return modalities
55 |
56 |
57 | def dcm2nii_CT(CT_dcm_path, nii_out_path):
58 |     # conversion of CT DICOM (in the CT_dcm_path) to nifti and save in nii_out_path
59 |     with tempfile.TemporaryDirectory() as tmp: #convert CT
60 |         tmp = plb.Path(str(tmp))
61 |         # convert dicom directory to nifti
62 |         # (store results in temp directory)
63 |         dicom2nifti.convert_directory(CT_dcm_path, str(tmp),
64 |                                       compression=True, reorient=True)
65 |         nii = next(tmp.glob('*nii.gz'))
66 |         # copy niftis to output folder with consistent naming
67 |         shutil.copy(nii, nii_out_path/'CT.nii.gz')
68 |
69 |
70 | def dcm2nii_PET(PET_dcm_path, nii_out_path):
71 |     # conversion of PET DICOM (in the PET_dcm_path) to nifti (and SUV nifti) and save in nii_out_path
72 |     first_pt_dcm = next(PET_dcm_path.glob('*.dcm'))
73 |     suv_corr_factor = calculate_suv_factor(first_pt_dcm)
74 |
75 |     with tempfile.TemporaryDirectory() as tmp: #convert PET
76 |         tmp = plb.Path(str(tmp))
77 |         # convert dicom directory to nifti
78 |         # (store results in temp directory)
79 |         dicom2nifti.convert_directory(PET_dcm_path, str(tmp),
80 |                                       compression=True, reorient=True)
81 |         nii = next(tmp.glob('*nii.gz'))
82 |         # copy nifti to output folder with consistent naming
83 |         shutil.copy(nii, nii_out_path/'PET.nii.gz')
84 |
85 |     # convert pet images to quantitative suv images and save nifti file
86 |     suv_pet_nii = convert_pet(nib.load(nii_out_path/'PET.nii.gz'), suv_factor=suv_corr_factor)
87 |     nib.save(suv_pet_nii, nii_out_path/'SUV.nii.gz')
88 |
89 |
90 | def conv_time(time_str):
91 |     # converts a DICOM time string (HHMMSS.ffffff) to seconds
92 |     return (float(time_str[:2]) * 3600 + float(time_str[2:4]) * 60 + float(time_str[4:13]))
93 |
94 |
95 | def calculate_suv_factor(dcm_path):
96 |     # reads a PET dicom file and calculates the SUV conversion factor
97 |     ds = pydicom.dcmread(str(dcm_path))
98 |     total_dose = ds.RadiopharmaceuticalInformationSequence[0].RadionuclideTotalDose
99 |     start_time = ds.RadiopharmaceuticalInformationSequence[0].RadiopharmaceuticalStartTime
100 |     half_life = ds.RadiopharmaceuticalInformationSequence[0].RadionuclideHalfLife
101 |     acq_time = ds.AcquisitionTime
102 |     weight = ds.PatientWeight
103 |     time_diff = conv_time(acq_time) - conv_time(start_time)
104 |     act_dose = total_dose * 0.5 ** (time_diff / half_life)
105 |     suv_factor = 1000 * weight / act_dose
106 |     return suv_factor
107 |
108 |
109 | def convert_pet(pet, suv_factor):
110 |     # function for conversion of PET values to SUV (should work on Siemens PET/CT)
111 |     affine = pet.affine
112 |     pet_data = pet.get_fdata()
113 |     pet_suv_data = (pet_data*suv_factor).astype(np.float32)
114 |     pet_suv = nib.Nifti1Image(pet_suv_data, affine)
115 |     return pet_suv
116 |
117 |
118 | def dcm2nii_mask(mask_dcm_path, nii_out_path):
119 |     # conversion of the mask dicom file to nifti (not directly possible with dicom2nifti)
120 |     mask_dcm = list(mask_dcm_path.glob('*.dcm'))[0]
121 |     mask = pydicom.dcmread(str(mask_dcm))  # dcmread replaces the deprecated read_file alias
122 |     mask_array = mask.pixel_array
123 |
124 |     # get mask array to correct orientation (this procedure is dataset specific)
125 |     mask_array = np.transpose(mask_array,(2,1,0) )
126 |     mask_orientation = mask[0x5200, 0x9229][0].PlaneOrientationSequence[0].ImageOrientationPatient
127 |     if mask_orientation[4] == 1:
128 |         mask_array = np.flip(mask_array, 1 )
129 |
130 |     # get affine matrix from the corresponding pet
131 |     pet = nib.load(str(nii_out_path/'PET.nii.gz'))
132 |     pet_affine = pet.affine
133 |
134 |     # save mask as nifti file
135 |     mask_out = nib.Nifti1Image(mask_array, pet_affine)
136 |     nib.save(mask_out, nii_out_path/'SEG.nii.gz')
137 |
138 |
139 | def resample_ct(nii_out_path):
140 |     # resample CT to PET and mask resolution
141 |     ct = nib.load(nii_out_path/'CT.nii.gz')
142 |     pet = nib.load(nii_out_path/'PET.nii.gz')
143 |     CTres = nilearn.image.resample_to_img(ct, pet, fill_value=-1024)
144 |     nib.save(CTres, nii_out_path/'CTres.nii.gz')
145 |
146 |
147 | def tcia_to_nifti(tcia_path, nii_out_path, modality='CT'):
148 |     # conversion for a single series
149 |     # creates a nifti file for one patient/study
150 |     # tcia_path: path to a DICOM directory for a specific study of one patient
151 |     # nii_out_path: path to a directory where nifti file for one patient, study and modality will be stored
152 |     # modality: modality to be converted CT, PET or mask ('CT', 'PT', 'SEG')
153 |     os.makedirs(nii_out_path, exist_ok=True)
154 |     if modality == 'CT':
155 |         dcm2nii_CT(tcia_path, nii_out_path)
156 |         resample_ct(nii_out_path)  # requires PET.nii.gz to exist in nii_out_path already
157 |     elif modality == 'PT':  # DICOM modality code for PET, as documented above
158 |         dcm2nii_PET(tcia_path, nii_out_path)
159 |     elif modality == 'SEG':
160 |         dcm2nii_mask(tcia_path, nii_out_path)
161 |
162 |
163 | def tcia_to_nifti_study(study_path, nii_out_path):
164 |     # conversion for a single study
165 |     # creates NIfTI files for one patient
166 |     # study_path: path to a study directory containing all DICOM files for a specific study of one patient
167 |     # nii_out_path: root output directory; the nifti files are stored in nii_out_path/<patient>/<study>
168 |     study_path = plb.Path(study_path)
169 |     modalities = identify_modalities(study_path)
170 |     nii_out_path = plb.Path(nii_out_path) / study_path.parent.name  # use the argument, not the global nii_out_root
171 |     nii_out_path = nii_out_path/study_path.name
172 |     os.makedirs(nii_out_path, exist_ok=True)
173 |
174 |     ct_dir = modalities["CT"]
175 |     dcm2nii_CT(ct_dir, nii_out_path)
176 |
177 |     pet_dir = modalities["PT"]
178 |     dcm2nii_PET(pet_dir, nii_out_path)
179 |
180 |     seg_dir = modalities["SEG"]
181 |     dcm2nii_mask(seg_dir, nii_out_path)
182 |
183 |     resample_ct(nii_out_path)
184 |
185 |
186 | def convert_tcia_to_nifti(study_dirs,nii_out_root):
187 |     # batch conversion of all patients
188 |     for study_dir in tqdm(study_dirs):
189 |
190 |         patient = study_dir.parent.name
191 |         print("The following patient directory is being processed: ", patient)
192 |
193 |         modalities = identify_modalities(study_dir)
194 |         nii_out_path = plb.Path(nii_out_root/study_dir.parent.name)
195 |         nii_out_path = nii_out_path/study_dir.name
196 |         os.makedirs(nii_out_path, exist_ok=True)
197 |
198 |         ct_dir = modalities["CT"]
199 |         dcm2nii_CT(ct_dir, nii_out_path)
200 |
201 |         pet_dir = modalities["PT"]
202 |         dcm2nii_PET(pet_dir, nii_out_path)
203 |
204 |         seg_dir = modalities["SEG"]
205 |         dcm2nii_mask(seg_dir, nii_out_path)
206 |
207 |         resample_ct(nii_out_path)
208 |
209 |
210 | if __name__ == "__main__":
211 |     path_to_data = plb.Path(sys.argv[1])  # path to downloaded TCIA DICOM database, e.g. '.../FDG-PET-CT-Lesions/'
212 |     nii_out_root = plb.Path(sys.argv[2])  # path where the NIfTI files will be created, e.g. '...tcia_nifti/FDG-PET-CT-Lesions/'
213 |
214 |     study_dirs = find_studies(path_to_data)
215 |     convert_tcia_to_nifti(study_dirs, nii_out_root)
216 |
--------------------------------------------------------------------------------
/tcia_dicom_to_nifti_generic.py:
--------------------------------------------------------------------------------
1 | # data preparation (conversion of DICOM series to nifti format)
2 |
3 | # run script from command line as follows:
4 | # python tcia_dicom_to_nifti_generic.py /PATH/TO/DICOM/TCIA_dataset_name/ /PATH/TO/NIFTI/TCIA_dataset_name/
5 | # if the output folder(s) (/PATH/TO/NIFTI/TCIA_dataset_name/) do not exist, they will be created
6 |
7 | import pathlib as plb
8 | import tempfile
9 | import os
10 | import dicom2nifti
11 | import nibabel as nib
12 | import numpy as np
13 | import pydicom
14 | import sys
15 | import shutil
16 | from tqdm import tqdm
17 |
18 |
19 | def find_studies(path_to_data):
20 |     # find all studies
21 |     dicom_root = plb.Path(path_to_data)
22 |     patient_dirs = list(dicom_root.glob('*'))
23 |
24 |     study_dirs = []
25 |
26 |     for dir in patient_dirs:
27 |         sub_dirs = list(dir.glob('*'))
28 |         #print(sub_dirs)
29 |         study_dirs.extend(sub_dirs)
30 |
31 |     #dicom_dirs = dicom_dirs.append(dir.glob('*'))
32 |     return study_dirs
33 |
34 |
35 | def get_series(study_dir):
36 |     # returns paths of series directories
37 |     study_dir = plb.Path(study_dir)
38 |     series_dirs = list(study_dir.glob('*'))
39 |
40 |     return series_dirs
41 |
42 | def dcm2nii(dcm_path, nii_out_path):
43 |     # conversion of DICOM to nifti and save in nii_out_path
44 |
45 |     dicom2nifti.convert_directory(str(dcm_path), str(nii_out_path),
46 |                                   compression=True, reorient=True)
47 |
48 | def convert_tcia_to_nifti(study_dirs,nii_out_root):
49 |     # batch conversion of all patients
50 |     for study_dir in tqdm(study_dirs):
51 |
52 |         patient = study_dir.parent.name
53 |         print("The following patient directory is being processed: ", patient)
54 |
55 |         series_dirs = get_series(study_dir)
56 |
57 |         nii_out_path = plb.Path(nii_out_root/study_dir.name)
58 |         os.makedirs(nii_out_path, exist_ok=True)
59 |
60 |         for series in series_dirs:
61 |             try:
62 |                 dcm2nii(series, nii_out_path)
63 |             except Exception as err:
64 |                 # report which series failed and why, then continue with the next one
65 |                 print('An error occurred, data may be (partially) not converted: ' + str(series) + ' (' + str(err) + ')')
66 |
67 |
68 | if __name__ == "__main__":
69 |     path_to_data = plb.Path(sys.argv[1])  # path to downloaded TCIA DICOM database, e.g. '...TCIA/manifest-1647440690095/FDG-PET-CT-Lesions/'
70 |     nii_out_root = plb.Path(sys.argv[2])  # path where the NIfTI files will be created, e.g. '...tcia_nifti/FDG-PET-CT-Lesions/'
71 |
72 |     study_dirs = find_studies(path_to_data)
73 |     convert_tcia_to_nifti(study_dirs, nii_out_root)
74 |
--------------------------------------------------------------------------------
/tcia_nifti_to_hdf5.py:
--------------------------------------------------------------------------------
1 | # data preparation (conversion of NIfTI PET/CT studies to HDF5 format for running automated lesion segmentation)
2 |
3 | # run script from command line as follows:
4 | # python tcia_nifti_to_hdf5.py /PATH/TO/NIFTI/FDG-PET-CT-Lesions/ /PATH/TO/HDF5/FDG-PET-CT-Lesions.hdf5
5 |
6 | import h5py
7 | from tqdm import tqdm
8 | import pathlib as plb
9 | import sys
10 | import os
11 | import nibabel as nib
12 | import numpy as np
13 |
14 | def find_studies(path_to_data):
15 |     # find all studies
16 |     dicom_root = plb.Path(path_to_data)
17 |     patient_dirs = list(dicom_root.glob('*'))
18 |
19 |     study_dirs = []
20 |
21 |     for dir in patient_dirs:
22 |         sub_dirs = list(dir.glob('*'))
23 |         #print(sub_dirs)
24 |         study_dirs.extend(sub_dirs)
25 |
26 |     #dicom_dirs = dicom_dirs.append(dir.glob('*'))
27 |     return study_dirs
28 |
29 |
30 | def nifti_to_hdf5(nii_file, path_to_h5_file):
31 |     # conversion for a single file
32 |     # creates an hdf5 file for one nifti file
33 |     # nii_file: path to a single nifti file
34 |     # path_to_h5_file: path to the hdf5 file to be created
35 |     data = nib.load(nii_file)
36 |     with h5py.File(path_to_h5_file, 'w') as h5_file:
37 |         h5_file.create_dataset('data', data=data.get_fdata(), compression="gzip")  # create_dataset needs a name; 'data' is an assumed placeholder
38 |
39 |
40 | def nifti_to_hdf5_study(study_path, path_to_h5_file):
41 |     # conversion for a single study
42 |     # creates an hdf5 file for one patient
43 |     # study_path: path to a study directory containing all nifti files for a specific study of one patient
44 |     # path_to_h5_file: path to a single hdf5 file for one patient and study
45 |
46 |     study_path = plb.Path(study_path)
47 |     patient = study_path.parent.name
48 |     study = study_path.name
49 |
50 |     suv = nib.load(str(study_path / 'SUV.nii.gz'))
51 |     ctres = nib.load(str(study_path / 'CTres.nii.gz'))
52 |     ct = nib.load(str(study_path / 'CT.nii.gz'))
53 |     pet = nib.load(str(study_path / 'PET.nii.gz'))
54 |     seg = nib.load(str(study_path / 'SEG.nii.gz'))
55 |
56 |     suv = suv.get_fdata()
57 |     ctres = ctres.get_fdata()
58 |     ct = ct.get_fdata()
59 |     pet = pet.get_fdata()
60 |     seg = seg.get_fdata()
61 |
62 |     with h5py.File(path_to_h5_file, 'w') as h5_file:
63 |         try:
64 |             h5_file.create_group(patient + '/' + study)
65 |             h5_file.create_dataset(patient + '/' + study + '/suv', data=suv, compression="gzip")
66 |             h5_file.create_dataset(patient + '/' + study + '/ctres', data=ctres, compression="gzip")
67 |             h5_file.create_dataset(patient + '/' + study + '/ct', data=ct, compression="gzip")
68 |             h5_file.create_dataset(patient + '/' + study + '/pet', data=pet, compression="gzip")
69 |             h5_file.create_dataset(patient + '/' + study + '/seg', data=seg, compression="gzip")
70 |         except:
71 |             h5_pat = h5_file.create_group(patient)
72 |             h5_pat.create_group(study)
73 |             h5_file.create_dataset(patient + '/' + study + '/suv', data=suv, compression="gzip")
74 |             h5_file.create_dataset(patient + '/' + study + '/ctres', data=ctres, compression="gzip")
75 |             h5_file.create_dataset(patient + '/' + study + '/ct', data=ct, compression="gzip")
76 |             h5_file.create_dataset(patient + '/' + study + '/pet', data=pet, compression="gzip")
77 |             h5_file.create_dataset(patient + '/' + study + '/seg', data=seg, compression="gzip")
78 |
79 |
80 | def convert_nifti_to_hdf5(study_dirs, path_to_h5_data):
81 |     # batch conversion of all patients
82 |     # creates a single hdf5 file for all patients
83 |     # study_dirs: NIfTI study directories for all patients
84 |     # path_to_h5_data: path to a single hdf5 file for all patients
85 |
86 |     h5_file = h5py.File(path_to_h5_data, 'w')
87 |
88 |     for pat_dir in tqdm(study_dirs):
89 |
90 |         patient = pat_dir.parent.name
91 |         study = pat_dir.name
92 |
93 |         suv = nib.load(str(pat_dir/'SUV.nii.gz'))
94 |         ctres = nib.load(str(pat_dir/'CTres.nii.gz'))
95 |         ct = nib.load(str(pat_dir/'CT.nii.gz'))
96 |         pet = nib.load(str(pat_dir/'PET.nii.gz'))
97 |         seg = nib.load(str(pat_dir/'SEG.nii.gz'))
98 |
99 |         suv = suv.get_fdata()
100 |         ctres = ctres.get_fdata()
101 |         ct = ct.get_fdata()
102 |         pet = pet.get_fdata()
103 |         seg = seg.get_fdata()
104 |
105 |         try:
106 |             h5_file.create_group(patient+'/'+study)
107 |             h5_file.create_dataset(patient+'/'+study+'/suv', data=suv, compression="gzip")
108 |             h5_file.create_dataset(patient+'/'+study+'/ctres', data=ctres, compression="gzip")
109 |             h5_file.create_dataset(patient+'/'+study+'/ct', data=ct, compression="gzip")
110 |             h5_file.create_dataset(patient+'/'+study+'/pet', data=pet, compression="gzip")
111 |             h5_file.create_dataset(patient+'/'+study+'/seg', data=seg, compression="gzip")
112 |
113 |         except:
114 |             h5_pat = h5_file.create_group(patient)
115 |             h5_pat.create_group(study)
116 |             h5_file.create_dataset(patient+'/'+study+'/suv', data=suv, compression="gzip")
117 |             h5_file.create_dataset(patient+'/'+study+'/ctres', data=ctres, compression="gzip")
118 |             h5_file.create_dataset(patient+'/'+study+'/ct', data=ct, compression="gzip")
119 |             h5_file.create_dataset(patient+'/'+study+'/pet', data=pet, compression="gzip")
120 |             h5_file.create_dataset(patient+'/'+study+'/seg', data=seg, compression="gzip")
121 |
122 |     h5_file.close()
123 |
124 |
125 | if __name__ == "__main__":
126 |     path_to_data = sys.argv[1]  # path to converted NIfTI files (see tcia_dicom_to_nifti.py) from downloaded TCIA DICOM database, e.g. '...tcia_nifti/FDG-PET-CT-Lesions/'
127 |     path_to_h5_data = sys.argv[2]  # path to the HDF5 file to be saved, e.g. '...hdf5/FDG-PET-CT-Lesions.hdf5'
128 |     study_dirs = find_studies(path_to_data)
129 |     convert_nifti_to_hdf5(study_dirs, path_to_h5_data)
130 |
131 |
132 |
--------------------------------------------------------------------------------
/tcia_nifti_to_mha.py:
--------------------------------------------------------------------------------
1 | # converts the entire dataset from the .nii.gz format to the .mha format
2 | # (the .mha format is required by grand-challenge.org as input and output data of algorithms)
3 |
4 | # run script from command line as follows:
5 | # python tcia_nifti_to_mha.py /PATH/TO/NIFTI/FDG-PET-CT-Lesions/ /PATH/TO/MHA/FDG-PET-CT-Lesions/
6 |
7 | import SimpleITK as sitk
8 | import pathlib as plb
9 | from tqdm import tqdm
10 | import os
11 | import sys
12 |
13 | def find_studies(path_to_data): # returns a list of unique study paths within the dataset
14 |     dicom_root = plb.Path(path_to_data)
15 |     patient_dirs = list(dicom_root.glob('*'))
16 |
17 |     study_dirs = []
18 |
19 |     for dir in patient_dirs:
20 |         sub_dirs = list(dir.glob('*'))
21 |         #print(sub_dirs)
22 |         study_dirs.extend(sub_dirs)
23 |
24 |     #dicom_dirs = dicom_dirs.append(dir.glob('*'))
25 |     return study_dirs
26 |
27 | def nii_to_mha(nii_path, mha_out_path): # converts a .nii.gz file to .mha and saves to a specified path
28 |     img = sitk.ReadImage(nii_path)
29 |     sitk.WriteImage(img, mha_out_path, True)  # True enables compression
30 |
31 |
32 | def convert_to_mha(study_dirs,path_to_mha_data): # main function converting the entire dataset from .nii.gz to .mha
33 |
34 |     for study_dir in tqdm(study_dirs):
35 |
36 |         patient = study_dir.parent.name
37 |         study = study_dir.name
38 |
39 |         suv_nii = str(study_dir/'SUV.nii.gz')
40 |         ctres_nii = str(study_dir/'CTres.nii.gz')
41 |         ct_nii = str(study_dir/'CT.nii.gz')
42 |         pet_nii = str(study_dir/'PET.nii.gz')
43 |         seg_nii = str(study_dir/'SEG.nii.gz')
44 |
45 |         suv_mha_dir = os.path.join(path_to_mha_data, patient, study)
46 |         ctres_mha_dir = os.path.join(path_to_mha_data, patient, study)
47 |         ct_mha_dir = os.path.join(path_to_mha_data, patient, study)
48 |         pet_mha_dir = os.path.join(path_to_mha_data, patient, study)
49 |         seg_mha_dir = os.path.join(path_to_mha_data, patient, study)
50 |
51 |         os.makedirs(suv_mha_dir , exist_ok=True)
52 |         os.makedirs(ctres_mha_dir, exist_ok=True)
53 |         os.makedirs(ct_mha_dir , exist_ok=True)
54 |         os.makedirs(pet_mha_dir , exist_ok=True)
55 |         os.makedirs(seg_mha_dir , exist_ok=True)
56 |
57 |         nii_to_mha(suv_nii, os.path.join(suv_mha_dir,'SUV.mha'))
58 |         nii_to_mha(ctres_nii, os.path.join(ctres_mha_dir,'CTres.mha'))
59 |         nii_to_mha(ct_nii, os.path.join(ct_mha_dir,'CT.mha'))
60 |         nii_to_mha(pet_nii, os.path.join(pet_mha_dir,'PET.mha'))
61 |         nii_to_mha(seg_nii, os.path.join(seg_mha_dir,'SEG.mha') )
62 |
63 |
64 | if __name__ == "__main__":
65 |
66 |     path_to_nii_data = sys.argv[1]  # path to nifti data, e.g. .../nifti/FDG-PET-CT-Lesions/
67 |     path_to_mha_data = sys.argv[2]  # output path for mha data, e.g. .../mha/FDG-PET-CT-Lesions/ (will be created if it does not exist)
68 |     study_dirs = find_studies(path_to_nii_data)
69 |
70 |     convert_to_mha(study_dirs,path_to_mha_data)
71 |
--------------------------------------------------------------------------------