├── kaggle_UBC-OCEAN-MIL
    ├── WSI_crop.png
    ├── mask.png
    ├── oring.png
    ├── paper1.png
    ├── paper2.png
    ├── requirements.txt
    └── self-trained-se-efficientnetb.ipynb
├── readme.md
└── readme_cn.md

/kaggle_UBC-OCEAN-MIL/WSI_crop.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/Metavers1/kaggle_UBC-OCEAN-MIL/6be6b600e6400007d66a81440b2390f685d7b3fe/kaggle_UBC-OCEAN-MIL/WSI_crop.png
--------------------------------------------------------------------------------
/kaggle_UBC-OCEAN-MIL/mask.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/Metavers1/kaggle_UBC-OCEAN-MIL/6be6b600e6400007d66a81440b2390f685d7b3fe/kaggle_UBC-OCEAN-MIL/mask.png
--------------------------------------------------------------------------------
/kaggle_UBC-OCEAN-MIL/oring.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/Metavers1/kaggle_UBC-OCEAN-MIL/6be6b600e6400007d66a81440b2390f685d7b3fe/kaggle_UBC-OCEAN-MIL/oring.png
--------------------------------------------------------------------------------
/kaggle_UBC-OCEAN-MIL/paper1.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/Metavers1/kaggle_UBC-OCEAN-MIL/6be6b600e6400007d66a81440b2390f685d7b3fe/kaggle_UBC-OCEAN-MIL/paper1.png
--------------------------------------------------------------------------------
/kaggle_UBC-OCEAN-MIL/paper2.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/Metavers1/kaggle_UBC-OCEAN-MIL/6be6b600e6400007d66a81440b2390f685d7b3fe/kaggle_UBC-OCEAN-MIL/paper2.png
--------------------------------------------------------------------------------
/kaggle_UBC-OCEAN-MIL/requirements.txt:
--------------------------------------------------------------------------------
albumentations==1.0.3
colorama==0.4.4
joblib==1.0.1
matplotlib==3.4.2
numpy==1.21.0
opencv-python==4.5.3.56
pandas==1.3.0
Pillow==8.2.0
scipy==1.7.0
scikit-learn==0.24.2
timm==0.4.9
torch==1.9.0
torchvision==0.10.0
tqdm==4.61.2
transformers==4.8.2

--------------------------------------------------------------------------------
/readme.md:
--------------------------------------------------------------------------------
# Multiple Instance Learning (MIL) Code for Ovarian Cancer Histopathology Image Classification from kaggle_UBC_ocean

**Choose Language:** [![Chinese](https://img.shields.io/badge/Language-中文-blue)](readme_cn.md)

Competition Link: [kaggle_UBC_ocean](https://www.kaggle.com/competitions/UBC-OCEAN)

This project provides a method for classifying high-probability cancerous regions in high-resolution whole-slide images (WSIs, up to 60000px).
The 236 mask images provided by the competition organizers are scaled down to thumbnail size, and a `U_net` segmentation network is then trained on the WSI thumbnails in `train_thumbnails`. The segmentation model results are as follows:

- For the image `train_images/22489.png`:

![Original Image](kaggle_UBC-OCEAN-MIL/oring.png)

After segmentation with `U_net`, the mask is as follows:

![Segmentation Mask](kaggle_UBC-OCEAN-MIL/mask.png)

From this mask, the smallest squares that enclose the segmented regions are selected. Their coordinates are then scaled from thumbnail space back to the original resolution, and the corresponding positions are cropped from the original WSI image, as shown below:

![Cropped Image](kaggle_UBC-OCEAN-MIL/WSI_crop.png)

The blue boxes are the smallest squares that enclose the cancerous regions. Within each blue box, random `1000*1000px` crops are taken, shown as the green squares.
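
The box selection and cropping steps above could be sketched roughly as follows. This is a minimal pure-Python illustration under stated assumptions, not the notebook's code: the function names (`component_boxes`, `to_square`, `scale_to_wsi`, `random_crops`), the 4-connectivity rule, and the toy mask and image sizes are all hypothetical. In practice the mask would come from the `U_net` output and the windows would be read from the full-resolution WSI.

```python
import random
from collections import deque


def component_boxes(mask):
    """Bounding boxes (x0, y0, x1, y1), x1/y1 exclusive, of 4-connected foreground regions."""
    height, width = len(mask), len(mask[0])
    seen = [[False] * width for _ in range(height)]
    boxes = []
    for y in range(height):
        for x in range(width):
            if not mask[y][x] or seen[y][x]:
                continue
            seen[y][x] = True
            queue = deque([(x, y)])
            x0, y0, x1, y1 = x, y, x + 1, y + 1
            while queue:  # breadth-first flood fill over one region
                cx, cy = queue.popleft()
                x0, y0 = min(x0, cx), min(y0, cy)
                x1, y1 = max(x1, cx + 1), max(y1, cy + 1)
                for nx, ny in ((cx + 1, cy), (cx - 1, cy), (cx, cy + 1), (cx, cy - 1)):
                    if 0 <= nx < width and 0 <= ny < height and mask[ny][nx] and not seen[ny][nx]:
                        seen[ny][nx] = True
                        queue.append((nx, ny))
            boxes.append((x0, y0, x1, y1))
    return boxes


def to_square(box):
    """Grow the shorter side so the box becomes the smallest enclosing square (same centre)."""
    x0, y0, x1, y1 = box
    side = max(x1 - x0, y1 - y0)
    cx, cy = (x0 + x1) / 2, (y0 + y1) / 2
    return (cx - side / 2, cy - side / 2, cx + side / 2, cy + side / 2)


def scale_to_wsi(square, thumb_size, wsi_size):
    """Map a thumbnail-space square to WSI pixels, clipped to the image bounds."""
    sx = wsi_size[0] / thumb_size[0]
    sy = wsi_size[1] / thumb_size[1]
    x0, y0, x1, y1 = square
    return (max(0, int(x0 * sx)), max(0, int(y0 * sy)),
            min(wsi_size[0], int(x1 * sx)), min(wsi_size[1], int(y1 * sy)))


def random_crops(box, crop=1000, n=4, rng=random):
    """Sample n windows (x, y, w, h); they lie inside the box when it is at least crop px wide and tall."""
    x0, y0, x1, y1 = box
    windows = []
    for _ in range(n):
        x = rng.randint(x0, max(x0, x1 - crop))
        y = rng.randint(y0, max(y0, y1 - crop))
        windows.append((x, y, crop, crop))
    return windows


# Toy example: a 6x4 "thumbnail" mask standing in for a 6000x4000 WSI.
mask = [
    [0, 0, 0, 0, 0, 0],
    [0, 1, 1, 1, 0, 0],
    [0, 1, 1, 1, 0, 0],
    [0, 0, 0, 0, 0, 0],
]
boxes = component_boxes(mask)
squares = [to_square(b) for b in boxes]
wsi_boxes = [scale_to_wsi(s, thumb_size=(6, 4), wsi_size=(6000, 4000)) for s in squares]
windows = random_crops(wsi_boxes[0], crop=1000, n=4, rng=random.Random(0))
```

Note that clipping at the image border can leave a box that is no longer exactly square, and boxes smaller than the crop size would need padding or resizing, which this sketch does not handle.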
The cropped images are then sent to a CNN for multi-instance classification of high-probability cancerous regions.

The approach follows the `TransMIL` paper: features are first extracted with a `cnn`, the feature map of each image is converted into a single `token`, and the tokens are sent to the `encoder` of a `transformer` for multi-instance learning.

The workflow is as follows:

![Workflow](kaggle_UBC-OCEAN-MIL/paper1.png)

- This diagram is from the TransMIL paper: [arxiv.org/abs/2106.00908](https://arxiv.org/abs/2106.00908)

Compared with `TransMIL`, this approach adds an image segmentation stage, which reduces the computation required and improves classification accuracy.

The overall workflow is as follows:

![Diagram](kaggle_UBC-OCEAN-MIL/paper2.png)

After the high-probability disease regions are obtained, a CNN extracts features from them. The features are flattened and sent as a sequence to the transformer for multi-instance learning. The code targets the `kaggle` competition environment and is provided in `.ipynb` format.
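
The CNN-plus-transformer stage described above can be sketched in PyTorch as follows. This is only an illustration under assumptions, not the notebook's model: `TinyBackbone` is a stand-in for the real CNN, a plain `nn.TransformerEncoder` replaces TransMIL's own attention modules, and the class names, dimensions, and `num_classes=5` are placeholders.

```python
import torch
from torch import nn


class TinyBackbone(nn.Module):
    """Stand-in CNN (assumption); the notebook's actual backbone is not reproduced here."""

    def __init__(self, dim=64):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 32, kernel_size=3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, dim, kernel_size=3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),  # collapse each feature map to a single vector
        )

    def forward(self, x):
        return self.features(x).flatten(1)  # (num_images, dim)


class CropBagClassifier(nn.Module):
    """One token per crop, a learnable class token, and a transformer encoder over the bag."""

    def __init__(self, num_classes=5, dim=64, heads=4, layers=2):
        super().__init__()
        self.backbone = TinyBackbone(dim)
        self.cls_token = nn.Parameter(torch.zeros(1, 1, dim))
        layer = nn.TransformerEncoderLayer(
            d_model=dim, nhead=heads, dim_feedforward=2 * dim, batch_first=True
        )
        self.encoder = nn.TransformerEncoder(layer, num_layers=layers)
        self.head = nn.Linear(dim, num_classes)

    def forward(self, bag):
        # bag: (batch, num_crops, 3, H, W), i.e. all crops taken from one slide form one bag
        batch, num_crops = bag.shape[:2]
        tokens = self.backbone(bag.flatten(0, 1)).view(batch, num_crops, -1)
        tokens = torch.cat([self.cls_token.expand(batch, -1, -1), tokens], dim=1)
        encoded = self.encoder(tokens)
        return self.head(encoded[:, 0])  # slide-level logits read from the class token


model = CropBagClassifier().eval()
with torch.no_grad():
    logits = model(torch.randn(2, 6, 3, 64, 64))  # 2 slides, 6 crops each
print(logits.shape)  # torch.Size([2, 5])
```

Reading the prediction from a class token keeps the output independent of the number and order of crops per slide, which is why bags of different sizes can share one classifier.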
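
The environment can be configured as follows:
```bash
pip install -r requirements.txt
```

--------------------------------------------------------------------------------
/readme_cn.md:
--------------------------------------------------------------------------------
# MIL Multiple Instance Learning Code for Ovarian Cancer Histopathology Image Classification from kaggle_UBC_ocean
**Language:** [English](readme.md)

Competition link: [kaggle_UBC_ocean](https://www.kaggle.com/competitions/UBC-OCEAN)

This project provides a method for classifying high-probability cancerous regions in WSI images with resolutions as large as 60000px.
The 236 masks provided by the organizers are scaled down to thumbnail size, and a `U_net` segmentation network is then trained on the WSI thumbnails in `train_thumbnails`. The resulting segmentation model performs as follows:

- For the image `train_images/22489.png`:

![Original Image](kaggle_UBC-OCEAN-MIL/oring.png)

After segmentation with `U_net`, the following mask is obtained:

![Segmentation Mask](kaggle_UBC-OCEAN-MIL/mask.png)

From this mask, the smallest squares that enclose the segmented regions are selected. Their coordinates are then scaled and transformed, and the corresponding positions are cropped from the original WSI image, as shown below:

![Cropped Image](kaggle_UBC-OCEAN-MIL/WSI_crop.png)

The blue lines are the smallest squares that enclose the segmented cancerous regions. Within each blue square, random `1000*1000px` crops are taken, shown as the green boxes. The cropped images are sent to a CNN for multi-instance classification of high-probability lesion regions.

The approach follows the `TransMIL` paper: features are first extracted with a `cnn`, the feature map of each image is converted into a single `token`, and the tokens are sent to the `encoder` of a `transformer` for multi-instance learning.

The workflow is as follows:

![Workflow](kaggle_UBC-OCEAN-MIL/paper1.png)

- Paper link: [arxiv.org/abs/2106.00908](https://arxiv.org/abs/2106.00908)

Compared with `TransMIL`, this approach adds an image segmentation stage, which reduces the computation required and improves classification accuracy.

The overall workflow is as follows:

![Diagram](kaggle_UBC-OCEAN-MIL/paper2.png)

After the high-probability disease regions are obtained, a CNN extracts features from them. The features are flattened and sent as a sequence to the transformer for multi-instance learning.
The code targets the `kaggle` competition environment and is provided in `.ipynb` format.

The environment can be configured as follows:
```bash
pip install -r requirements.txt
```

--------------------------------------------------------------------------------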