└── README.md


/README.md:
--------------------------------------------------------------------------------
  1 | # Medical-image-processing-
  2 | Code for processing medical images in various formats.
  3 | 
  4 | 
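  5 | 1. Introduction
  6 | 
  7 | -  Common medical image formats: DICOM (Digital Imaging and Communications in Medicine), NIfTI (Neuroimaging Informatics Technology Initiative), PAR/REC (Philips MRI scanner format), ANALYZE (Mayo Medical Imaging), NRRD (Nearly Raw Raster Data), and MINC.
  8 | 
  9 | - Required packages: `pydicom`, `SimpleITK`, `PIL` (Pillow), `nibabel`, `nrrd` (pynrrd)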
 10 | 
 11 |   ```text
 12 |   conda install pydicom --channel conda-forge
 13 |   ```
 14 | 
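 15 | - Software for viewing medical images: ITK-SNAP, 3D Slicer
 16 | 
 17 | 2. DICOM image processing
 18 | 
 19 |    pydicom reads DICOM files into Python data structures, and a modified dataset can be written back out as a DICOM file.
 20 | 
 21 |    **Read and display a DICOM file**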
 22 | 
 23 |    ```python
 24 |    import pydicom
 25 |    import matplotlib.pyplot as plt
 26 |    ds = pydicom.dcmread(file)  # file: path to a DICOM file
 27 |    plt.figure(figsize=(10, 10))
 28 |    plt.imshow(ds.pixel_array, cmap=plt.cm.bone)
 29 |    plt.show()
 30 |    ```
 31 | 
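 32 |    **Complete CT preprocessing example**
 33 | 
 34 |    For CT, each acquisition of a patient is usually stored as one folder containing a series of DICOM files, each of which is one slice. Because patients and scans differ, the spacing between slices varies and the slice order may vary as well, so the data must be preprocessed before training.
 35 | 
 36 |    CT measurements are expressed in Hounsfield units (HU). The pixel values stored in a DICOM file are not necessarily in HU, so they are converted using each slice's `RescaleSlope` and `RescaleIntercept`.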
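As a small illustration of the rescale step, the sketch below applies the linear mapping HU = slope * stored value + intercept to a toy array. The function name `to_hounsfield` and the sample values are assumptions made for this example; in a real file the slope and intercept come from the `RescaleSlope` and `RescaleIntercept` attributes.

```python
import numpy as np

def to_hounsfield(raw, slope, intercept):
    """Apply the linear DICOM rescale: HU = slope * raw + intercept."""
    return (slope * raw.astype(np.float64) + intercept).astype(np.int16)

# Hypothetical stored values and rescale parameters, for illustration only
raw = np.array([[0, 1024], [2048, 3071]], dtype=np.uint16)
hu = to_hounsfield(raw, slope=1.0, intercept=-1024.0)
print(hu)
```

Casting to float before multiplying avoids integer overflow and truncation when the slope is not 1.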
 37 | 
 38 |    The following example uses **pydicom** to process a folder that contains data for multiple patients.
 39 | 
 40 |    - Import the libraries and list all patients
 41 | 
 42 |    ```python
 43 |    import numpy as np # linear algebra
 44 |    import pandas as pd # data processing, CSV file I/O (e.g. pd.read_csv)
 45 |    import pydicom as dicom  # the legacy `dicom` package was renamed to pydicom
 46 |    import os
 47 |    import scipy.ndimage
 48 |    import matplotlib.pyplot as plt
 49 |    
 50 |    from skimage import measure, morphology
 51 |    from mpl_toolkits.mplot3d.art3d import Poly3DCollection
 52 |    
 53 |    # Root directory containing one sub-directory per patient
 54 |    INPUT_FOLDER = '../input/sample_images/'
 55 |    patients = os.listdir(INPUT_FOLDER)
 56 |    patients.sort()
 57 |    ```
 58 | 
 59 |    - Scan one patient's directory, load all slices, sort them along the z axis, and compute the slice thickness
 60 | 
 61 |    ```python
 62 |    def load_scan(path):
 63 |        slices = [dicom.dcmread(os.path.join(path, s)) for s in os.listdir(path)]
 64 |        slices.sort(key = lambda x: float(x.ImagePositionPatient[2]))
 65 |        try:
 66 |            slice_thickness = np.abs(slices[0].ImagePositionPatient[2] - slices[1].ImagePositionPatient[2])
 67 |        except:
 68 |            slice_thickness = np.abs(slices[0].SliceLocation - slices[1].SliceLocation)
 69 |        for s in slices:
 70 |            s.SliceThickness = slice_thickness
 71 |        return slices
 72 |    ```
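The sorting and thickness logic can be tried without any DICOM files. The sketch below uses `SimpleNamespace` objects as hypothetical stand-ins for loaded slices; only the z component of `ImagePositionPatient` is filled in, and the positions are invented.

```python
from types import SimpleNamespace

# Hypothetical stand-ins for pydicom datasets: only the z position matters here
slices = [
    SimpleNamespace(ImagePositionPatient=[0.0, 0.0, z]) for z in (5.0, 0.0, 2.5)
]

# Sort along z, then take the distance between the first two slices
slices.sort(key=lambda s: float(s.ImagePositionPatient[2]))
z_positions = [s.ImagePositionPatient[2] for s in slices]
slice_thickness = abs(z_positions[1] - z_positions[0])
print(z_positions, slice_thickness)
```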
 73 | 
 74 |    - Extract patient information
 75 | 
 76 |    ```python
 77 |    def loadFileInformation(filename):
 78 |        information = {}
 79 |        ds = dicom.dcmread(filename)
 80 |        information['PatientID'] = ds.PatientID
 81 |        information['PatientName'] = ds.PatientName
 82 |        information['PatientBirthDate'] = ds.PatientBirthDate
 83 |        information['PatientSex'] = ds.PatientSex
 84 |        information['StudyID'] = ds.StudyID
 85 |        information['StudyDate'] = ds.StudyDate
 86 |        information['StudyTime'] = ds.StudyTime
 87 |        information['InstitutionName'] = ds.InstitutionName
 88 |        information['Manufacturer'] = ds.Manufacturer
 89 |        information['NumberOfFrames'] = ds.NumberOfFrames    
 90 |        return information
 91 |    ```
 92 | 
 93 |    - Set pixels outside the scan bounds to 0, then rescale to HU
 94 | 
 95 |    ```python
 96 |    def get_pixels_hu(slices):
 97 |        image = np.stack([s.pixel_array for s in slices])
 98 |        # Convert to int16; this is safe because all values should be < 32k
 99 |        image = image.astype(np.int16)
100 |        # Set out-of-scan pixels (stored as -2000) to 0
101 |        image[image == -2000] = 0
102 |        # Convert to Hounsfield units (HU)
103 |        for slice_number in range(len(slices)):
104 |            intercept = slices[slice_number].RescaleIntercept
105 |            slope = slices[slice_number].RescaleSlope
106 |            if slope != 1:
107 |                image[slice_number] = slope * image[slice_number].astype(np.float64)
108 |                image[slice_number] = image[slice_number].astype(np.int16)
109 |            image[slice_number] += np.int16(intercept)
110 |        return np.array(image, dtype=np.int16)
111 |    ```
112 | 
113 |    - View one patient's images
114 | 
115 |    ```python
116 |    first_patient = load_scan(INPUT_FOLDER + patients[0])
117 |    first_patient_pixels = get_pixels_hu(first_patient)
118 |    plt.hist(first_patient_pixels.flatten(), bins=80, color='c')
119 |    plt.xlabel("Hounsfield Units (HU)")
120 |    plt.ylabel("Frequency")
121 |    plt.show()
122 |    # Show a slice from the middle of the volume
123 |    plt.imshow(first_patient_pixels[80], cmap=plt.cm.gray)
124 |    plt.show()
125 |    ```
126 | 
127 |    - Resample the CT to [1 1 1] (slice spacing can differ between scans)
128 | 
129 |    ```python
130 |    def resample(image, scan, new_spacing=[1,1,1]):
131 |        # Determine current pixel spacing
132 |        spacing = np.array([float(scan[0].SliceThickness)] + [float(v) for v in scan[0].PixelSpacing], dtype=np.float32)
133 |        resize_factor = spacing / new_spacing
134 |        new_real_shape = image.shape * resize_factor
135 |        new_shape = np.round(new_real_shape)
136 |        real_resize_factor = new_shape / image.shape
137 |        new_spacing = spacing / real_resize_factor
138 |        image = scipy.ndimage.zoom(image, real_resize_factor, mode='nearest')
139 |        return image, new_spacing
140 |    pix_resampled, spacing = resample(first_patient_pixels, first_patient, [1,1,1])
141 |    print("Shape before resampling\t", first_patient_pixels.shape)
142 |    print("Shape after resampling\t", pix_resampled.shape)
143 |    ```
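To make the arithmetic in `resample` concrete, the following sketch runs only the shape and spacing calculation on made-up numbers. The volume shape and spacing values are assumptions chosen for illustration; no image data is interpolated.

```python
import numpy as np

# Hypothetical volume: 100 slices of 512x512, spacing (z, y, x) in mm
shape = np.array([100, 512, 512])
spacing = np.array([2.5, 0.7, 0.7])
target_spacing = np.array([1.0, 1.0, 1.0])

resize_factor = spacing / target_spacing
new_shape = np.round(shape * resize_factor)      # rounded voxel counts
real_resize_factor = new_shape / shape           # factor actually applied
achieved_spacing = spacing / real_resize_factor  # close to, but not exactly, 1 mm
print(new_shape, achieved_spacing)
```

Because the new shape has to be rounded to whole voxels, the spacing actually achieved differs slightly from the requested one, which is why `resample` returns the recomputed spacing.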
144 | 
145 |    - Plot a 3D image with matplotlib
146 | 
147 |    ```python
148 |    def plot_3d(image, threshold=-300):
149 |        # Position the scan upright, 
150 |        # so the head of the patient would be at the top facing the camera
151 |        p = image.transpose(2,1,0)
152 |        verts, faces, _, _ = measure.marching_cubes(p, threshold)  # recent scikit-image also returns normals and values
153 |        fig = plt.figure(figsize=(10, 10))
154 |        ax = fig.add_subplot(111, projection='3d')
155 |        # Fancy indexing: `verts[faces]` to generate a collection of triangles
156 |        mesh = Poly3DCollection(verts[faces], alpha=0.70)
157 |        face_color = [0.45, 0.45, 0.75]
158 |        mesh.set_facecolor(face_color)
159 |        ax.add_collection3d(mesh)
160 |        ax.set_xlim(0, p.shape[0])
161 |        ax.set_ylim(0, p.shape[1])
162 |        ax.set_zlim(0, p.shape[2])
163 |        plt.show()
164 |    
165 |    ```
166 | 
167 |    - Normalize the data within the HU range of interest
168 | 
169 |    ```python
170 |    MIN_BOUND = -1000.0
171 |    MAX_BOUND = 400.0
172 |    def normalize(image):
173 |        image = (image - MIN_BOUND) / (MAX_BOUND - MIN_BOUND)
174 |        image[image>1] = 1.
175 |        image[image<0] = 0.
176 |        return image
177 |    
178 |    ```
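A quick worked example of this normalization on a handful of HU values is sketched below. It uses `np.clip` instead of boolean masks, which is an equivalent formulation rather than the code above; the function name `normalize_clip` and the sample values are assumptions for illustration.

```python
import numpy as np

MIN_BOUND = -1000.0  # HU of air, lower end of the window used above
MAX_BOUND = 400.0    # upper end of the window used above

def normalize_clip(image):
    """Scale HU values to [0, 1], clipping anything outside the window."""
    scaled = (image.astype(np.float32) - MIN_BOUND) / (MAX_BOUND - MIN_BOUND)
    return np.clip(scaled, 0.0, 1.0)

hu = np.array([-2000, -1000, -300, 400, 1500], dtype=np.int16)
out = normalize_clip(hu)
print(out)
```

Values below -1000 HU map to 0, values above 400 HU map to 1, and -300 HU lands at the midpoint of the window.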
179 | 
180 |    - Write the data to a common video format
181 | 
182 |    ```python
183 |    import cv2
184 |    def writeVideo(img_array, filename):  # img_array: uint8, shape (frames, height, width)
185 |        frame_num, height, width = img_array.shape
186 |        filename_output = filename.split('.')[0] + '.avi'
187 |        video = cv2.VideoWriter(filename_output, -1, 16, (width, height), isColor=False)  # -1: codec chosen by the platform
188 |        for img in img_array:
189 |            video.write(img)
190 |        video.release()
191 |    ```
192 | 
193 |    Processing DICOM files with **SimpleITK**
194 | 
195 |    ```python
196 |    import SimpleITK as sitk
197 |    import numpy as np
198 |    # Read a DICOM series (case_path: directory containing the series)
199 |    reader = sitk.ImageSeriesReader()
200 |    dicom_names = reader.GetGDCMSeriesFileNames(case_path)
201 |    reader.SetFileNames(dicom_names)
202 |    image = reader.Execute()
203 |    image_array = sitk.GetArrayFromImage(image) # z, y, x: number of slices, height, width
204 |    origin = image.GetOrigin() # x, y, z
205 |    spacing = image.GetSpacing() # x, y, z
206 |    # Resample to a spacing of 1 x 1 x 1
207 |    resample = sitk.ResampleImageFilter()
208 |    resample.SetOutputDirection(image.GetDirection())
209 |    resample.SetOutputOrigin(image.GetOrigin())
210 |    newspacing = [1, 1, 1]
211 |    resample.SetOutputSpacing(newspacing)
212 |    resample.SetSize([int(round(n * s / t)) for n, s, t in zip(image.GetSize(), spacing, newspacing)])  # the output size defaults to 0 if not set
213 |    newimage = resample.Execute(image)
214 |
215 |    # Read a single image (slice_path: path to one DICOM file)
216 |    image = sitk.ReadImage(slice_path)
217 |    image_array = sitk.GetArrayFromImage(image) # z, y, x
218 |    ```
219 | 
220 | 3. NIfTI image processing
221 | 
222 |    For files in nii.gz format, using **SimpleITK**
223 | 
224 |    ```python
225 |    import SimpleITK as sitk
226 |    import skimage.io as io
227 |     
228 |    def read_img(path):
229 |        img = sitk.ReadImage(path)  # e.g. path = 'F:/my_data/t1ce.nii.gz'
230 |        data = sitk.GetArrayFromImage(img)  # channel first: (z, y, x)
231 |        sitk.WriteImage(img, '***.nii.gz')  # save as NIfTI; WriteImage takes a SimpleITK image, not an array
232 |        return data
233 |    
234 |    ```
235 | 
236 |    Using **nibabel**
237 | 
238 |    ```python
239 |    import nibabel as nib
240 |    import matplotlib.pyplot as plt
241 |     
242 |    def read_data(path):
243 |        img = nib.load(path)
244 |        img_array = img.get_fdata()  # channel last; array holding the image data
245 |        affine_array = img.affine.copy()  # affine array: positions the image data in the reference space
246 |        img_head = img.header.copy()  # image metadata, e.g. acquisition device name, voxel size, number of slices
247 |        # Other ways to query information
248 |        img.shape  # dimensions
249 |        img.get_data_dtype()  # data type
250 |        img_head.get_data_dtype()  # data type recorded in the header
251 |        img_head.get_data_shape()  # dimensions recorded in the header
252 |        img_head.get_zooms()  # voxel size
253 |        return img_array, affine_array, img_head
254 |
255 |    def save_data(img_array, affine_array, img_head):  # save the processed NIfTI file
256 |        new_nii = nib.Nifti1Image(img_array, affine_array, img_head)
257 |        nib.save(new_nii, 'new_test.nii.gz')
258 |    
259 |    ```
260 | 
261 | 4. PAR/REC processing
262 | 
263 |    Convert to NIfTI
264 | 
265 |    ```python
266 |    from glob import glob  # os and nib are imported in the earlier examples
267 |    fns = glob(os.path.join(img_dir, '*.par'))  # or '*.PAR'
268 |    for fn in fns:
269 |      print(f'Converting image:{fn}')
270 |      img = nib.load(fn)
271 |      base = os.path.splitext(os.path.basename(fn))[0]  # file name without directory or extension
272 |      out_fn = os.path.join(out_dir, base + '.nii.gz')
273 |      nifti = nib.Nifti1Image(img.dataobj, img.affine, header=img.header)
274 |      nifti.set_data_dtype('<f4')
275 |      nifti.to_filename(out_fn)
276 |    ```
277 | 
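278 | 5. ANALYZE format processing
279 | 
280 |    The ANALYZE format stores each data set as two files: a data file with the extension .img that holds the binary image data, and a header file with the extension .hdr that holds the image metadata.
281 | 
282 |    nibabel can read it directly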
283 | 
284 |    ```python
285 |    data = nib.load(add).get_fdata()  # add: path to the ANALYZE file
286 |    
287 |    ```
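Since every ANALYZE data set consists of a header and a data file with the same base name, a small helper can derive both names from either one. The sketch below is a hypothetical convenience function (`analyze_pair` is not part of nibabel) and it only manipulates file names.

```python
import os

def analyze_pair(path):
    """Return the (.hdr, .img) file names that make up one ANALYZE data set."""
    base, _ = os.path.splitext(path)
    return base + '.hdr', base + '.img'

print(analyze_pair('scans/subject01.img'))
```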
288 | 
289 | 6. NRRD format processing
290 | 
291 |    ```python
292 |    import nrrd
293 |    nrrd_data, nrrd_options = nrrd.read(nrrd_filename)
294 |    # nrrd_data: multi-dimensional array holding the image
295 |    # nrrd_options: header information for the image
296 |    >>
297 |    {u'dimension': 3, # number of dimensions
298 |     u'encoding': 'raw', # encoding
299 |     u'endian': 'little', # byte order
300 |     u'keyvaluepairs': {},
301 |     u'kinds': ['domain', 'domain', 'domain'], # kind of each of the three dimensions
302 |     u'sizes': [30, 30, 30], # size of each of the three dimensions
303 |     u'space': 'left-posterior-superior', # spatial orientation
304 |     u'space directions': [['1', '0', '0'], ['0', '1', '0'], ['0', '0', '1']],
305 |     u'space origin': ['0', '0', '0'],
306 |     u'type': 'short'}
307 |    
308 |    ```
309 | 
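310 | 7. MINC format processing
311 | 
312 |    MINC stands for Medical Imaging NetCDF Toolkit.
313 | 
314 | 8. Format conversion
315 | 
316 |    **DICOM to NIfTI**
317 | 
318 |    dcm2nii (https://www.nitrc.org/projects/dcm2nii/) is a tool for converting DICOM to NIfTI, and nibabel is a Python library for reading and writing NIfTI files. To convert DICOM to NIfTI, you can use an automatic conversion tool such as dcm2nii. The Python 2 library dcmstack stacks a series of DICOM files into a multi-dimensional array. These arrays can be written out as NIfTI files with an optional header extension that contains all the metadata of the original DICOM files. For Python 3 there is a newer library for this conversion, dicom2nifti. The nipy project is also worth a look.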
319 | 
320 |    **DICOM to MINC**
321 | 
322 |    The MINC team at the Brain Imaging Centre (BIC) developed a tool for converting DICOM to MINC. It is written in C and is available on GitHub: https://github.com/BIC-MNI/minc-tools/tree/master/conversion/dcm2mnc
323 | 
324 |    **NIfTI or ANALYZE to MINC**
325 | 
326 |    The MINC team at the Brain Imaging Centre (BIC) also developed a tool for converting NIfTI or ANALYZE images to MINC. The program is called nii2mnc. It and many other conversion tools can be found at: https://github.com/BIC-MNI/minc-tools/tree/master/conversion
327 | 
328 | References:
329 | 
330 | PyDicom, a powerful Python tool for medical image processing: https://zhuanlan.zhihu.com/p/59413289
331 | 
332 | https://www.jiqizhixin.com/articles/2017-07-31
333 | 


--------------------------------------------------------------------------------