├── Images
├── images
    ├── summary.jpg
    ├── summary_2.png
    └── dataset-examples.jpg
├── .gitignore
└── README.md

/Images:
--------------------------------------------------------------------------------
1 |
2 |

--------------------------------------------------------------------------------
/images/summary.jpg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/dashishi/LDPolypVideo-Benchmark/HEAD/images/summary.jpg

--------------------------------------------------------------------------------
/images/summary_2.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/dashishi/LDPolypVideo-Benchmark/HEAD/images/summary_2.png

--------------------------------------------------------------------------------
/images/dataset-examples.jpg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/dashishi/LDPolypVideo-Benchmark/HEAD/images/dataset-examples.jpg

--------------------------------------------------------------------------------
/.gitignore:
--------------------------------------------------------------------------------
1 | # Byte-compiled / optimized / DLL files
2 | __pycache__/
3 | *.py[cod]
4 | *$py.class
5 |
6 | # C extensions
7 | *.so
8 |
9 | # Distribution / packaging
10 | .Python
11 | build/
12 | develop-eggs/
13 | dist/
14 | downloads/
15 | eggs/
16 | .eggs/
17 | lib/
18 | lib64/
19 | parts/
20 | sdist/
21 | var/
22 | wheels/
23 | pip-wheel-metadata/
24 | share/python-wheels/
25 | *.egg-info/
26 | .installed.cfg
27 | *.egg
28 | MANIFEST
29 |
30 | # PyInstaller
31 | # Usually these files are written by a python script from a template
32 | # before PyInstaller builds the exe, so as to inject date/other infos into it.
33 | *.manifest
34 | *.spec
35 |
36 | # Installer logs
37 | pip-log.txt
38 | pip-delete-this-directory.txt
39 |
40 | # Unit test / coverage reports
41 | htmlcov/
42 | .tox/
43 | .nox/
44 | .coverage
45 | .coverage.*
46 | .cache
47 | nosetests.xml
48 | coverage.xml
49 | *.cover
50 | *.py,cover
51 | .hypothesis/
52 | .pytest_cache/
53 |
54 | # Translations
55 | *.mo
56 | *.pot
57 |
58 | # Django stuff:
59 | *.log
60 | local_settings.py
61 | db.sqlite3
62 | db.sqlite3-journal
63 |
64 | # Flask stuff:
65 | instance/
66 | .webassets-cache
67 |
68 | # Scrapy stuff:
69 | .scrapy
70 |
71 | # Sphinx documentation
72 | docs/_build/
73 |
74 | # PyBuilder
75 | target/
76 |
77 | # Jupyter Notebook
78 | .ipynb_checkpoints
79 |
80 | # IPython
81 | profile_default/
82 | ipython_config.py
83 |
84 | # pyenv
85 | .python-version
86 |
87 | # pipenv
88 | # According to pypa/pipenv#598, it is recommended to include Pipfile.lock in version control.
89 | # However, in case of collaboration, if having platform-specific dependencies or dependencies
90 | # having no cross-platform support, pipenv may install dependencies that don't work, or not
91 | # install all needed dependencies.
92 | #Pipfile.lock
93 |
94 | # PEP 582; used by e.g. github.com/David-OConnor/pyflow
95 | __pypackages__/
96 |
97 | # Celery stuff
98 | celerybeat-schedule
99 | celerybeat.pid
100 |
101 | # SageMath parsed files
102 | *.sage.py
103 |
104 | # Environments
105 | .env
106 | .venv
107 | env/
108 | venv/
109 | ENV/
110 | env.bak/
111 | venv.bak/
112 |
113 | # Spyder project settings
114 | .spyderproject
115 | .spyproject
116 |
117 | # Rope project settings
118 | .ropeproject
119 |
120 | # mkdocs documentation
121 | /site
122 |
123 | # mypy
124 | .mypy_cache/
125 | .dmypy.json
126 | dmypy.json
127 |
128 | # Pyre type checker
129 | .pyre/
130 |

--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
1 | # LDPolypVideo-Benchmark
2 | A public endoscopic video dataset for polyp detection
3 | ### Abstract
4 | Computer-Aided Diagnosis (CAD) systems for polyp detection provide essential support for colorectal cancer screening and prevention.
5 | Recently, deep learning technology has made breakthrough progress in medical image computation and computer-aided diagnosis. However, the deficiency of training data seriously impedes the development of polyp detection techniques.
6 | Existing fully-annotated databases, including CVC-ClinicDB, ETIS-Larib, CVC-Colon dataset, Kvasir-Seg, and CVC-ClinicVideoDB, are very limited in polyp size and shape diversity, which is far from the significant complexity in the actual clinical situation.
7 | In this paper, we propose LDPolypVideo, a large-scale colonoscopy video database that contains a variety of polyps and more complex bowel environments.
8 | Our database contains 160 colonoscopy videos and 40,266 frames in total with polyp annotations, which are four times the size of the largest existing colonoscopy video database CVC-ClinicVideoDB.
9 | In order to improve the efficiency of polyp annotation, we design an intelligent annotation tool based on object tracking.
10 | Extensive experiments have been conducted to evaluate state-of-the-art object detection approaches on our LDPolypVideo dataset. The average drops in Recall and Precision of four SOTA approaches on this dataset are 26% and 15%, respectively.
11 | This large performance drop demonstrates both the significant challenges posed by our large-scale and diverse polyp video dataset and its value in facilitating research on polyp detection.
12 |
13 | ### Overview of dataset
14 | To increase the size and diversity of colonoscopy data for training and evaluating polyp detection approaches, we present a large-scale and diverse colonoscopy video dataset named LDPolypVideo.
15 | It consists of 160 videos with 40,266 frames, nearly four times the size of the largest existing fully-annotated dataset.
16 | Of these, 33,884 frames contain at least one polyp, and there are 200 labeled polyps in total, more than 11 times the number of polyps in CVC-ClinicVideoDB.
17 | The polyps present more diverse morphologies.
18 | Fig. 1 shows a set of example images and the annotated masks for polyps.
19 | We also provide 103 videos, comprising 861,400 frames, without full annotations. Each video has a label indicating whether it contains polyps.
20 | These videos enrich the data diversity and will support unsupervised and semi-supervised methods.
21 | Based on our LDPolypVideo dataset, we evaluate a number of state-of-the-art approaches for polyp detection to analyze their strengths and weaknesses, demonstrating the challenges of colonoscopy polyp detection in clinical examination.
22 |
23 | ![data example](https://github.com/dashishi/LDPolypVideo-Benchmark/blob/main/images/dataset-examples.jpg)
24 |
25 | ![summary](https://github.com/dashishi/LDPolypVideo-Benchmark/blob/main/images/summary.jpg)
26 |
27 | ![summary_2](https://github.com/dashishi/LDPolypVideo-Benchmark/blob/main/images/summary_2.png)
28 | ### Citation
29 | > Yiting Ma, Xuejin Chen, Kai Cheng, Yang Li, and Bin Sun. "LDPolypVideo Benchmark: A Large-scale Colonoscopy Video Dataset of Diverse Polyps", Medical Image Computing and Computer Assisted Intervention Society, 2021
30 |
31 | ### Download
32 | [Click here to download the whole dataset from Baidu Cloud](https://pan.baidu.com/s/1ETWQeYsveUF2tz0huaqlmg?pwd=ustc)
33 |
34 | Extraction code: ustc
35 |
36 | [Click here to download the dataset from Google Drive](https://drive.google.com/drive/folders/13KwU_uZcxsl6dL-mqcs39Yb0gjU9vn3G?usp=sharing) (Part-1)
37 |
38 | [Click here to download the dataset from Google Drive](https://drive.google.com/file/d/1pxFYO-nRd5uqdYsjs7NRkwXQMMurbWsZ/view?usp=sharing) (Part-2)
39 |
40 |
41 |

--------------------------------------------------------------------------------