├── .github
    └── workflows
    │   └── action.yml
├── .mlc_config.json
├── README.md
└── logo.png


/.github/workflows/action.yml:
--------------------------------------------------------------------------------
 1 | name: Check Markdown links
 2 | 
 3 | on:
 4 |   push:
 5 |     branches:
 6 |       - master
 7 |   pull_request:
 8 |     branches:
 9 |       - master
10 | 
11 | jobs:
12 |   markdown-link-check:
13 |     runs-on: ubuntu-latest
14 |     steps:
15 |     - uses: actions/checkout@main
16 |     - uses: gaurav-nelson/github-action-markdown-link-check@v1
17 |       with:
18 |         use-quiet-mode: 'no'
19 |         use-verbose-mode: 'yes'
20 |         config-file: '.mlc_config.json'
21 |         file-path: './README.md'
22 | 


--------------------------------------------------------------------------------
/.mlc_config.json:
--------------------------------------------------------------------------------
 1 | {
 2 |     "aliveStatusCodes": [
 3 |         0,
 4 |         200,
 5 |         403,
 6 |         429,
 7 |         500,
 8 |         503,
 9 |         999
10 |     ]
11 | }


--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
  1 | <div align="center">
  2 |   <p>
  3 |     <a href="https://www.satellite-image-deep-learning.com/">
  4 |         <img src="logo.png" width="700">
  5 |     </a>
  6 | </p>
  7 |   <h2>Datasets for deep learning applied to satellite and aerial imagery.</h2>
  8 | 
  9 | # 👉 [satellite-image-deep-learning.com](https://www.satellite-image-deep-learning.com/) 👈
 10 | 
 11 | </div>
 12 | 
 13 | **How to use this repository:** if you know exactly what you are looking for (e.g. you have the paper name) you can `Control+F` to search for it in this page (or search in the raw markdown).
 14 | 
 15 | # Lists of datasets
 16 | <!-- markdown-link-check-disable -->
 17 | * [Earth Observation Database](https://eod-grss-ieee.com/)
 18 | <!-- markdown-link-check-enable -->
 19 | * [awesome-satellite-imagery-datasets](https://github.com/chrieke/awesome-satellite-imagery-datasets)
 20 | * [Awesome_Satellite_Benchmark_Datasets](https://github.com/Seyed-Ali-Ahmadi/Awesome_Satellite_Benchmark_Datasets)
 21 | * [awesome-remote-sensing-change-detection](https://github.com/wenhwu/awesome-remote-sensing-change-detection) -> dedicated to change detection
 22 | * [Callisto-Dataset-Collection](https://github.com/Agri-Hub/Callisto-Dataset-Collection) -> datasets that use Copernicus/sentinel data
 23 | * [geospatial-data-catalogs](https://github.com/giswqs/geospatial-data-catalogs) -> A list of open geospatial datasets available on AWS, Earth Engine, Planetary Computer, and STAC Index
 24 | * [BED4RS](https://captain-whu.github.io/BED4RS/)
 25 | * [Satellite-Image-Time-Series-Datasets](https://github.com/corentin-dfg/Satellite-Image-Time-Series-Datasets)
 26 | 
 27 | # Remote sensing dataset hubs
 28 | * [Radiant MLHub](https://mlhub.earth/) -> both datasets and models
 29 | * [Registry of Open Data on AWS](https://registry.opendata.aws)
 30 | * [Microsoft Planetary Computer data catalog](https://planetarycomputer.microsoft.com/catalog)
 31 | * [Google Earth Engine Data Catalog](https://developers.google.com/earth-engine/datasets)
 32 | 
 33 | ## Sentinel
 34 | As part of the [EU Copernicus program](https://en.wikipedia.org/wiki/Copernicus_Programme), multiple Sentinel satellites are capturing imagery -> see [wikipedia](https://en.wikipedia.org/wiki/Copernicus_Programme#Sentinel_missions)
 35 | * [awesome-sentinel](https://github.com/Fernerkundung/awesome-sentinel) -> a curated list of awesome tools, tutorials and APIs related to data from the Copernicus Sentinel Satellites.
 36 | * [Sentinel-2 Cloud-Optimized GeoTIFFs](https://registry.opendata.aws/sentinel-2-l2a-cogs/) and [Sentinel-2 L2A 120m Mosaic](https://registry.opendata.aws/sentinel-s2-l2a-mosaic-120/)
 37 | * [Open access data on GCP](https://console.cloud.google.com/storage/browser/gcp-public-data-sentinel-2?prefix=tiles%2F31%2FT%2FCJ%2F)
 38 | * Paid access to Sentinel & Landsat data via [sentinel-hub](https://www.sentinel-hub.com/) and [python-api](https://github.com/sentinel-hub/sentinelhub-py)
 39 | * [Example loading sentinel data in a notebook](https://github.com/binder-examples/getting-data/blob/master/Sentinel2.ipynb)
 40 | * [Jupyter Notebooks for working with Sentinel-5P Level 2 data stored on S3](https://github.com/Sentinel-5P/data-on-s3). The data can be browsed [here](https://meeo-s5p.s3.amazonaws.com/index.html#/?t=catalogs)
 41 | * [Sentinel NetCDF data](https://github.com/acgeospatial/Sentinel-5P/blob/master/Sentinel_5P.ipynb)
 42 | * [Analyzing Sentinel-2 satellite data in Python with Keras](https://github.com/jensleitloff/CNN-Sentinel)
 43 | * [Xarray backend to Copernicus Sentinel-1 satellite data products](https://github.com/bopen/xarray-sentinel)
 44 | * [SEN2VENµS](https://zenodo.org/record/6514159#.YoRxM5PMK3I) -> a dataset for the training of Sentinel-2 super-resolution algorithms
 45 | * [SEN12MS](https://github.com/zhu-xlab/SEN12MS) -> A Curated Dataset of Georeferenced Multi-spectral Sentinel-1/2 Imagery for Deep Learning and Data Fusion. Checkout [SEN12MS toolbox](https://github.com/schmitt-muc/SEN12MS) and many referenced uses on [paperswithcode.com](https://paperswithcode.com/dataset/sen12ms)
 46 | * [Sen4AgriNet](https://github.com/Orion-AI-Lab/S4A) -> A Sentinel-2 multi-year, multi-country benchmark dataset for crop classification and segmentation with deep learning, with and [models](https://github.com/Orion-AI-Lab/S4A-Models)
 47 | * [earthspy](https://github.com/AdrienWehrle/earthspy) -> Monitor and study any place on Earth and in Near Real-Time (NRT) using the Sentinel Hub services developed by the EO research team at Sinergise
 48 | * [Space2Ground](https://github.com/Agri-Hub/Space2Ground) -> dataset with Space (Sentinel-1/2) and Ground (street-level images) components, annotated with crop-type labels for agriculture monitoring.
 49 | * [sentinel2tools](https://github.com/QuantuMobileSoftware/sentinel2tools) -> downloading & basic processing of Sentinel 2 imagesry. Read [Sentinel2tools: simple lib for downloading Sentinel-2 satellite images](https://medium.com/geekculture/sentinel2tools-simple-lib-for-downloading-sentinel-2-satellite-images-f8a6be3ee894)
 50 | * [open-sentinel-map](https://github.com/VisionSystemsInc/open-sentinel-map) -> The OpenSentinelMap dataset contains Sentinel-2 imagery and per-pixel semantic label masks derived from OpenStreetMap
 51 | * [MSCDUnet](https://github.com/Lihy256/MSCDUnet) -> change detection datasets containing VHR, multispectral (Sentinel-2) and SAR (Sentinel-1)
 52 | * [OMBRIA](https://github.com/geodrak/OMBRIA) -> Sentinel-1 & 2 dataset for adressing the flood mapping problem
 53 | * [Canadian-cropland-dataset](https://github.com/bioinfoUQAM/Canadian-cropland-dataset) -> a novel patch-based dataset compiled using optical satellite images of Canadian agricultural croplands retrieved from Sentinel-2
 54 | * [Sentinel-2 Cloud Cover Segmentation Dataset](https://mlhub.earth/data/ref_cloud_cover_detection_challenge_v1) on Radiant mlhub
 55 | * [The Azavea Cloud Dataset](https://www.azavea.com/blog/2021/08/02/the-azavea-cloud-dataset/) which is used to train this [cloud-model](https://github.com/azavea/cloud-model)
 56 | * [fMoW-Sentinel](https://purl.stanford.edu/vg497cb6002) -> The Functional Map of the World - Sentinel-2 corresponding images (fMoW-Sentinel) dataset consists of image time series collected by the Sentinel-2 satellite, corresponding to locations from the Functional Map of the World (fMoW) dataset across several different times. Used in [SatMAE](https://github.com/sustainlab-group/SatMAE)
 57 | * [Earth Surface Water Dataset](https://zenodo.org/record/5205674#.Y4iEFezP1hE) -> a dataset for deep learning of surface water features on Sentinel-2 satellite images. See [this ref using it in torchgeo](https://towardsdatascience.com/artificial-intelligence-for-geospatial-analysis-with-pytorchs-torchgeo-part-1-52d17e409f09)
 58 | * [Ship-S2-AIS dataset](https://zenodo.org/record/7229756#.Y5GsgOzP1hE) -> 13k tiles extracted from 29 free Sentinel-2 products. 2k images showing ships in Denmark sovereign waters: one may detect cargos, fishing, or container ships
 59 | * [Amazon Rainforest dataset for semantic segmentation](https://zenodo.org/record/3233081#.Y6LPLOzP1hE) -> Sentinel 2 images
 60 | * [Mining and clandestine airstrips datasets](https://github.com/earthrise-media/mining-detector)
 61 | * [Satellite Burned Area Dataset](https://zenodo.org/record/6597139#.Y9ufiezP1hE) -> segmentation dataset containing several satellite acquisitions related to past forest wildfires. It contains 73 acquisitions from Sentinel-2 and Sentinel-1 (Copernicus).
 62 | * [mmflood](https://github.com/edornd/mmflood) -> Flood delineation from Sentinel-1 SAR imagery, with [paper](https://ieeexplore.ieee.org/abstract/document/9882096)
 63 | * [MATTER](https://github.com/periakiva/MATTER) -> a Sentinel 2 dataset for Self-Supervised Training
 64 | * [Industrial Smoke Plumes](https://zenodo.org/record/4250706)
 65 | * [MARIDA: Marine Debris Archive](https://github.com/marine-debris/marine-debris.github.io)
 66 | * [S2GLC](https://s2glc.cbk.waw.pl/) -> High resolution Land Cover Map of Europe
 67 | * [Generating Imperviousness Maps from Multispectral Sentinel-2 Satellite Imagery](https://zenodo.org/record/7058860#.ZDrAeuzMLdo)
 68 | * [Sentinel-2 Water Edges Dataset (SWED)](https://openmldata.ukho.gov.uk/)
 69 | * [Sentinel-1 for Science Amazonas](https://sen4ama.gisat.cz/) -> forest lost time series dataset
 70 | * [Sentinel2 Munich480](https://www.kaggle.com/datasets/artelabsuper/sentinel2-munich480) -> dataset for crop mapping by exploiting the time series of Sentinel-2 satellite
 71 | * [Meadows vs Orchards](https://www.kaggle.com/datasets/baptistel/meadows-vs-orchards) -> a pixel time series dataset
 72 | * [SEN12_GUM](https://zenodo.org/record/6914898) -> SEN12 Global Urban Mapping Dataset
 73 | * [Sentinel-1&2 Image Pairs (SAR & Optical)](https://www.kaggle.com/datasets/requiemonk/sentinel12-image-pairs-segregated-by-terrain)
 74 | * [Sentinel-2 Image Time Series for Crop Mapping](https://www.kaggle.com/datasets/ignazio/sentinel2-crop-mapping) -> data for the Lombardy region in Italy
 75 | * [Deforestation in Ukraine from Sentinel2 data](https://www.kaggle.com/datasets/isaienkov/deforestation-in-ukraine)
 76 | * [Multitask Learning for Estimating Power Plant Greenhouse Gas Emissions from Satellite Imagery](https://zenodo.org/record/5644746)
 77 | * [METER-ML: A Multi-sensor Earth Observation Benchmark for Automated Methane Source Mapping](https://stanfordmlgroup.github.io/projects/meter-ml/) -> data [on Zenodo](https://zenodo.org/record/6911013)
 78 | * [satellite-change-events](https://www.cs.cornell.edu/projects/satellite-change-events/) -> CaiRoad & CalFire change detection Sentinel 2 datasets
 79 | * [OMS2CD](https://github.com/Dibz15/OpenMineChangeDetection) -> hand-labelled images for change-detection in open-pit mining areas
 80 | * [coal power plants’ emissions](https://transitionzero.medium.com/estimating-coal-power-plant-operation-from-satellite-images-with-computer-vision-b966af56919e) -> a dataset of coal power plants’ emissions, including images, metadata and labels.
 81 | * [RapidAI4EO](https://rapidai4eo.radiant.earth/) -> dense time series satellite imagery sampled at 500,000 locations across Europe, comprising S2 & Planet imagery, with CORINE Land Cover multiclass labels for 2018
 82 | * [Sentinel 2 super-resolved data cubes - 92 scenes over 2 regions in Switzerland spanning 5 years](https://ieee-dataport.org/documents/sentinel-2-super-resolved-data-cubes-92-scenes-over-2-regions-switzerland-spanning-5-years)
 83 | * [MS-HS-BCD-dataset](https://github.com/arcgislearner/MS-HS-BCD-dataset) -> multisource change detection dataset used in paper: Building Change Detection with Deep Learning by Fusing Spectral and Texture Features of Multisource Remote Sensing Images: A GF-1 and Sentinel 2B Data Case
 84 | * [MSOSCD](https://github.com/Lihy256/MSCDUnet) -> change detection datasets containing VHR, multispectral (Sentinel-2) and SAR (Sentinel-1)
 85 | * [Sentinel-2 dataset for ship detection](https://zenodo.org/records/3923841), also edited and redistributed as [VDS2RAW](https://zenodo.org/records/7982468#.ZIiLxS8QOo4)
 86 | * [MineSegSAT](https://github.com/macdonaldezra/MineSegSAT) -> dataset for paper: AN AUTOMATED SYSTEM TO EVALUATE MINING DISTURBED AREA EXTENTS FROM SENTINEL-2 IMAGERY
 87 | * [CropNet: An Open Large-Scale Dataset with Multiple Modalities for Climate Change-aware Crop Yield Predictions](https://anonymous.4open.science/r/CropNet/README.md) -> terabyte-sized, publicly available, and multi-modal dataset for climate change-aware crop yield predictions
 88 | * [Tiny CropNet dataset](https://github.com/fudong03/MMST-ViT)
 89 | * [CaBuAr](https://github.com/DarthReca/CaBuAr) -> California Burned Areas dataset for delineation
 90 | * [sen12mscr](https://patricktum.github.io/cloud_removal/sen12mscr/) -> Multimodal Cloud Removal
 91 | * [Greenearthnet](https://github.com/vitusbenson/greenearthnet/tree/main) -> dataset specifically designed for high-resolution vegetation forecasting
 92 | * [MultiSenGE](https://zenodo.org/records/6375466) -> large-scale multimodal and multitemporal benchmark dataset
 93 | * [Floating-Marine-Debris-Data](https://github.com/miguelmendesduarte/Floating-Marine-Debris-Data) -> floating marine debris, with annotations for six debris classes, including plastic, driftwood, seaweed, pumice, sea snot, and sea foam.
 94 | * [Sen2Fire](https://zenodo.org/records/10881058) -> A Challenging Benchmark Dataset for Wildfire Detection using Sentinel Data
 95 | * [L1BSR](https://zenodo.org/records/7826696) -> 3740 pairs of overlapping image crops extracted from two L1B products
 96 | * [GloSoFarID](https://github.com/yzyly1992/GloSoFarID) -> Global multispectral dataset for Solar Farm IDentification
 97 | * [SICKLE](https://github.com/Depanshu-Sani/SICKLE) -> A Multi-Sensor Satellite Imagery Dataset Annotated with Multiple Key Cropping Parameters. Multi-resolution time-series images from Landsat-8, Sentinel-1, and Sentinel-2
 98 | * [MARIDA](https://marine-debris.github.io/index.html) -> Marine Debris detection from Sentinel-2
 99 | * [MADOS](https://github.com/gkakogeorgiou/mados) -> Marine Debris and Oil Spill from Sentinel-2
100 | * [Sentinel-1 and Sentinel-2 Vessel Detection](https://github.com/allenai/vessel-detection-sentinels)
101 | * [TreeSatAI](https://zenodo.org/records/6780578) -> Sentinel-1, Sentinel-2
102 | * [Sentinel-2 dataset for ship detection and characterization](https://zenodo.org/records/10418786) -> RGB
103 | * [S2-SHIPS](https://github.com/alina2204/contrastive_SSL_ship_detection) -> all 12 channels
104 | * [ChatEarthNet](https://github.com/zhu-xlab/ChatEarthNet) -> A Global-Scale Image-Text Dataset Empowering Vision-Language Geo-Foundation Models, utilizes Sentinel-2 data with captions generated by ChatGPT
105 | * [UKFields](https://github.com/Spiruel/UKFields) -> over 2.3 million automatically delineated field boundaries spanning England, Wales, Scotland, and Northern Ireland
106 | * [ShipWakes](https://zenodo.org/records/7947694) -> Keypoints Method for Recognition of Ship Wake Components in Sentinel-2 Images by Deep Learning
107 | * [TimeSen2Crop](https://zenodo.org/records/4715631) -> a Million Labeled Samples Dataset of Sentinel 2 Image Time Series for Crop Type Classification
108 | * [AgriSen-COG](https://github.com/tselea/agrisen-cog) -> a Multicountry, Multitemporal Large-Scale Sentinel-2 Benchmark Dataset for Crop Mapping: includes an anomaly detection preprocessing step
109 | * [MagicBathyNet](https://www.magicbathy.eu/magicbathynet.html) -> a new multimodal benchmark dataset made up of image patches of Sentinel-2, SPOT-6 and aerial imagery, bathymetry in raster format and seabed classes annotations
110 | * [AI2-S2-NAIP](https://huggingface.co/datasets/allenai/s2-naip) -> aligned NAIP, Sentinel-2, Sentinel-1, and Landsat images spanning the entire continental US
111 | * [MuS2: A Benchmark for Sentinel-2 Multi-Image Super-Resolution](https://dataverse.harvard.edu/dataset.xhtml?persistentId=doi%3A10.7910%2FDVN%2F1JMRAT)
112 | * [Sen4Map](https://datapub.fz-juelich.de/sen4map/) -> Sentinel-2 time series images, covering over 335,125 geo-tagged locations across the European Union. These geo-tagged locations are associated with detailed landcover and land-use information
113 | * [CloudSEN12Plus](https://huggingface.co/datasets/isp-uv-es/CloudSEN12Plus) -> the largest cloud detection dataset to date for Sentinel-2
114 | * [mayrajeo S2 ship detection](https://github.com/mayrajeo/ship-detection) -> labels for Detecting marine vessels from Sentinel-2 imagery with YOLOv8
115 | * [Fields of The World](https://fieldsofthe.world/) -> instance segmentation of agricultural field boundaries
116 | * [ai4boundaries](https://github.com/waldnerf/ai4boundaries) -> field boundaries with Sentinel-2 and aerial photography
117 | * [California Wildfire GeoImaging Dataset - CWGID](https://arxiv.org/abs/2409.16380) -> Development and Application of a Sentinel-2 Satellite Imagery Dataset for Deep-Learning Driven Forest Wildfire Detection
118 | * [POPCORN: High-resolution Population Maps Derived from Sentinel-1 and Sentinel-2](https://popcorn-population.github.io/)
119 | * [substation-seg](https://github.com/Lindsay-Lab/substation-seg) -> segmenting substations dataset
120 | * [PhilEO-downstream](https://huggingface.co/datasets/PhilEO-community/PhilEO-downstream) -> a 400GB Sentinel-2 dataset for building density estimation, road segmentation, and land cover classification.
121 | * [PhilEO-pretrain](https://huggingface.co/datasets/PhilEO-community/PhilEO-pretrain) -> a 500GB global dataset of Sentinel-2 images for model pre-training.
122 | * [KappaSet: Sentinel-2 KappaZeta Cloud and Cloud Shadow Masks](https://zenodo.org/records/7100327)
123 | * [AllClear](https://allclear.cs.cornell.edu/) A Comprehensive Dataset and Benchmark for Cloud Removal in Satellite Imagery
124 | * [Sentinel-2 reference cloud masks generated by an active learning method](https://zenodo.org/records/1460961)
125 | * [Cloud gap-filling with deep learning for improved grassland monitoring](https://zenodo.org/records/11651601)
126 | 
127 | ## Landsat
128 | Long running US program -> see [Wikipedia](https://en.wikipedia.org/wiki/Landsat_program)
129 | * 8 bands, 15 to 60 meters, 185km swath, the temporal resolution is 16 days
130 | * [Landsat 4, 5, 7, and 8 imagery on Google](https://cloud.google.com/storage/docs/public-datasets/landsat), see [the GCP bucket here](https://console.cloud.google.com/storage/browser/gcp-public-data-landsat/), with Landsat 8 imagery in COG format analysed in [this notebook](https://github.com/pangeo-data/pangeo-example-notebooks/blob/master/landsat8-cog-ndvi.ipynb)
131 | * [Landsat 8 imagery on AWS](https://registry.opendata.aws/landsat-8/), with many tutorials and tools listed
132 | * https://github.com/kylebarron/landsat-mosaic-latest -> Auto-updating cloudless Landsat 8 mosaic from AWS SNS notifications
133 | * [Visualise landsat imagery using Datashader](https://examples.pyviz.org/landsat/landsat.html#landsat-gallery-landsat)
134 | * [Landsat-mosaic-tiler](https://github.com/kylebarron/landsat-mosaic-tiler) -> This repo hosts all the code for landsatlive.live website and APIs.
135 | * [LandsatSCD](https://github.com/ggsDing/SCanNet/tree/main) -> a change detection dataset, it consists of 8468 pairs of images, each having the spatial resolution of 416 × 416
136 | * [The Landsat Irish Coastal Segmentation Dataset](https://zenodo.org/records/8414665)
137 | 
138 | ## VENμS
139 | Vegetation and Environment monitoring on a New Micro-Satellite ([VENμS](https://en.wikipedia.org/wiki/VEN%CE%BCS))
140 | * [VENUS L2A Cloud-Optimized GeoTIFFs](https://registry.opendata.aws/venus-l2a-cogs/)
141 | * [VENuS cloud mask training dataset](https://zenodo.org/records/7040177)
142 | * [Sen2Venµs](https://zenodo.org/records/6514159) -> a dataset for the training of Sentinel-2 super-resolution algorithms
143 | * [sen2venus-pytorch-dataset](https://github.com/piclem/sen2venus-pytorch-dataset) -> torch dataloader and other utilities
144 | 
145 | ## Maxar
146 | Satellites owned by Maxar (formerly DigitalGlobe) include [GeoEye-1](https://en.wikipedia.org/wiki/GeoEye-1), [WorldView-2](https://en.wikipedia.org/wiki/WorldView-2), [3](https://en.wikipedia.org/wiki/WorldView-3) & [4](https://en.wikipedia.org/wiki/WorldView-4)
147 | * [Maxar Open Data Program](https://github.com/opengeos/maxar-open-data) provides pre and post-event high-resolution satellite imagery in support of emergency planning, response, damage assessment, and recovery
148 | * [WorldView-2 European Cities](https://earth.esa.int/eogateway/catalog/worldview-2-european-cities) -> dataset covering the most populated areas in Europe at 40 cm resolution
149 | 
150 | ## Planet
151 | Also see Spacenet-7 and the Kaggle ship and plane classifications datasets later in this page
152 | * [Planet’s high-resolution, analysis-ready mosaics of the world’s tropics](https://www.planet.com/nicfi/), supported through Norway’s International Climate & Forests Initiative. [BBC coverage](https://www.bbc.co.uk/news/science-environment-54651453)
153 | * Planet have made imagery available via kaggle competitions
154 | * [Alberta Wells Dataset](https://zenodo.org/records/13743323) -> Pinpointing Oil and Gas Wells from Satellite Imagery
155 | * [ARGO ship classification dataset](https://zenodo.org/records/6058710) -> 1750 labelled images from PlanetScope-4-Band satelites. Created [here](https://github.com/elizamanelli/ship_dataset/tree/main)
156 | * [Marine Debris Dataset for Object Detection in Planetscope Imagery](https://cmr.earthdata.nasa.gov/search/concepts/C2781412735-MLHUB.html)
157 | 
158 | ## UC Merced
159 | Land use classification dataset with 21 classes and 100 RGB TIFF images for each class. Each image measures 256x256 pixels with a pixel resolution of 1 foot
160 | * http://weegee.vision.ucmerced.edu/datasets/landuse.html
161 | * Also [available as a multi-label dataset](https://towardsdatascience.com/multi-label-land-cover-classification-with-deep-learning-d39ce2944a3d)
162 | * Read [Vision Transformers for Remote Sensing Image Classification](https://www.mdpi.com/2072-4292/13/3/516/htm) where a Vision Transformer classifier achieves 98.49% classification accuracy on Merced
163 | 
164 | ## EuroSAT
165 | Land use classification dataset of Sentinel-2 satellite images covering 13 spectral bands and consisting of 10 classes with 27000 labeled and geo-referenced samples. Available in RGB and 13 band versions
166 | * [EuroSAT: Land Use and Land Cover Classification with Sentinel-2](https://github.com/phelber/EuroSAT) -> publication where a CNN achieves a classification accuracy 98.57%
167 | * Repos using fastai [here](https://github.com/shakasom/Deep-Learning-for-Satellite-Imagery) and [here](https://www.luigiselmi.eu/eo/lulc-classification-deeplearning.html)
168 | * [evolved_channel_selection](http://matpalm.com/blog/evolved_channel_selection/) -> explores the trade off between mixed resolutions and whether to use a channel at all, with [repo](https://github.com/matpalm/evolved_channel_selection)
169 | * RGB version available as [dataset in pytorch](https://pytorch.org/vision/stable/generated/torchvision.datasets.EuroSAT.html#torchvision.datasets.EuroSAT) with the 13 band version [in torchgeo](https://torchgeo.readthedocs.io/en/latest/api/datasets.html#eurosat). Checkout the tutorial on [data augmentation with this dataset](https://torchgeo.readthedocs.io/en/latest/tutorials/transforms.html)
170 | * [EuroSAT-SAR](https://huggingface.co/datasets/wangyi111/EuroSAT-SAR) -> matched each Sentinel-2 image in EuroSAT with one Sentinel-1 patch according to the geospatial coordinates
171 | 
172 | ## PatternNet
173 | Land use classification dataset with 38 classes and 800 RGB JPG images for each class
174 | * https://sites.google.com/view/zhouwx/dataset?authuser=0
175 | * Publication: [PatternNet: A Benchmark Dataset for Performance Evaluation of Remote Sensing Image Retrieval](https://arxiv.org/abs/1706.03424)
176 | 
177 | ## Gaofen Image Dataset (GID) for classification
178 | - https://captain-whu.github.io/GID/
179 | - a large-scale classification set and a fine land-cover classification set
180 | 
181 | ## Million-AID
182 | A large-scale benchmark dataset containing million instances for RS scene classification, 51 scene categories organized by the hierarchical category
183 | * https://captain-whu.github.io/DiRS/
184 | * [Pretrained models](https://github.com/ViTAE-Transformer/ViTAE-Transformer-Remote-Sensing)
185 | * Also see [AID](https://captain-whu.github.io/AID/), [AID-Multilabel-Dataset](https://github.com/Hua-YS/AID-Multilabel-Dataset) & [DFC15-multilabel-dataset](https://github.com/Hua-YS/DFC15-Multilabel-Dataset)
186 | 
187 | ## DIOR object detection dataset
188 | A large-scale benchmark dataset for object detection in optical remote sensing images, which consists of 23,463 images and 192,518 object instances annotated with horizontal bounding boxes
189 | * https://gcheng-nwpu.github.io/
190 | * https://arxiv.org/abs/1909.00133
191 | * [ors-detection](https://github.com/Vlad15lav/ors-detection) -> Object Detection on the DIOR dataset using YOLOv3
192 | * [dior_detect](https://github.com/hm-better/dior_detect) -> benchmarks for object detection on DIOR dataset
193 | * [Tools](https://github.com/CrazyStoneonRoad/Tools) -> for dealing with the DIOR
194 | * [Object_Detection_Satellite_Imagery_Yolov8_DIOR](https://github.com/JohnPPinto/Object_Detection_Satellite_Imagery_Yolov8_DIOR)
195 | 
196 | ## Multiscene
197 | MultiScene dataset aims at two tasks: Developing algorithms for multi-scene recognition & Network learning with noisy labels
198 | * https://multiscene.github.io/ & https://github.com/Hua-YS/Multi-Scene-Recognition
199 | 
200 | ## FAIR1M object detection dataset
201 | A Benchmark Dataset for Fine-grained Object Recognition in High-Resolution Remote Sensing Imagery
202 | * [arxiv papr](https://arxiv.org/abs/2103.05569)
203 | * Download at gaofen-challenge.com
204 | * [2020Gaofen](https://github.com/AICyberTeam/2020Gaofen) -> 2020 Gaofen Challenge data, baselines, and metrics
205 | 
206 | ## DOTA object detection dataset
207 | A Large-Scale Benchmark and Challenges for Object Detection in Aerial Images. Segmentation annotations available in iSAID dataset
208 | * https://captain-whu.github.io/DOTA/index.html
209 | * [DOTA_devkit](https://github.com/CAPTAIN-WHU/DOTA_devkit) for loading dataset
210 | * [Arxiv paper](https://arxiv.org/abs/1711.10398)
211 | * [Pretrained models in mmrotate](https://github.com/open-mmlab/mmrotate)
212 | * [DOTA2VOCtools](https://github.com/Complicateddd/DOTA2VOCtools) -> dataset split and transform to voc format
213 | * [dotatron](https://github.com/naivelogic/dotatron) -> 2021 Learning to Understand Aerial Images Challenge on DOTA dataset
214 | 
215 | ## iSAID instance segmentation dataset
216 | A Large-scale Dataset for Instance Segmentation in Aerial Images
217 | * https://captain-whu.github.io/iSAID/dataset.html
218 | * Uses images from the DOTA dataset
219 | 
220 | ## HRSC RGB ship object detection dataset
221 | * https://www.kaggle.com/datasets/guofeng/hrsc2016
222 | * [Pretrained models in mmrotate](https://github.com/open-mmlab/mmrotate)
223 | * [Rotation-RetinaNet-PyTorch](https://github.com/HsLOL/Rotation-RetinaNet-PyTorch)
224 | 
225 | ## SAR Ship Detection Dataset (SSDD)
226 | * https://github.com/TianwenZhang0825/Official-SSDD
227 | * [Rotation-RetinaNet-PyTorch](https://github.com/HsLOL/Rotation-RetinaNet-PyTorch)
228 | 
229 | ## High-Resolution SAR Rotation Ship Detection Dataset (SRSDD)
230 | * [Github](https://github.com/HeuristicLU/SRSDD-V1.0)
231 | * [A Lightweight Model for Ship Detection and Recognition in Complex-Scene SAR Images](https://www.mdpi.com/2072-4292/14/23/6053)
232 | 
233 | ## LEVIR ship dataset
234 | A dataset for tiny ship detection under medium-resolution remote sensing images. Annotations in bounding box format
235 | * [LEVIR-Ship](https://github.com/WindVChen/LEVIR-Ship)
236 | <!-- markdown-link-check-disable -->
237 | * Hosted on [Nucleus](https://dashboard.scale.com/nucleus/ds_cbsghny30nf00b1x3w7g?utm_source=open_dataset&utm_medium=github&utm_campaign=levir_ships)
238 | <!-- markdown-link-check-enable -->
239 | 
240 | ## SAR Aircraft Detection Dataset
241 | 2966 non-overlapped 224×224 slices are collected with 7835 aircraft targets
242 | * https://github.com/hust-rslab/SAR-aircraft-data
243 | 
244 | ## xView1: Objects in context for overhead imagery
245 | A fine-grained object detection dataset with 60 object classes along an ontology of 8 class types. Over 1,000,000 objects across over 1,400 km^2 of 0.3m resolution imagery. Annotations in bounding box format
246 | * [Official website](http://xviewdataset.org/)
247 | * [arXiv paper](https://arxiv.org/abs/1802.07856).
248 | * [paperswithcode](https://paperswithcode.com/dataset/xview)
249 | * [Satellite_Imagery_Detection_YOLOV7](https://github.com/Radhika-Keni/Satellite_Imagery_Detection_YOLOV7) -> YOLOV7 applied to xView1
250 | 
251 | ## xView2: xBD building damage assessment
252 | Annotated high-resolution satellite imagery for building damage assessment, precise segmentation masks and damage labels on a four-level spectrum, 0.3m resolution imagery
253 | * [Official website](https://xview2.org/)
254 | * [arXiv paper](https://arxiv.org/abs/1911.09296)
255 | * [paperswithcode](https://paperswithcode.com/paper/xbd-a-dataset-for-assessing-building-damage)
256 | * [xView2_baseline](https://github.com/DIUx-xView/xView2_baseline) -> baseline solution in tensorflow
257 | * [metadamagenet](https://github.com/nimaafshar/metadamagenet) -> pytorch solution
258 | * [U-Net models from michal2409](https://github.com/michal2409/xView2)
259 | * [DAHiTra](https://github.com/nka77/DAHiTra) -> code for 2022 [paper](https://arxiv.org/abs/2208.02205): Large-scale Building Damage Assessment using a Novel Hierarchical Transformer Architecture on Satellite Images. Uses xView2 xBD dataset
260 | * [Damage assessment using Amazon SageMaker geospatial capabilities and custom SageMaker models](https://aws.amazon.com/blogs/machine-learning/damage-assessment-using-amazon-sagemaker-geospatial-capabilities-and-custom-sagemaker-models/)
261 | * [Xview2_Strong_Baseline](https://github.com/PaulBorneP/Xview2_Strong_Baseline) -> a simple implementation of a strong baseline
262 | 
263 | ## xView3: Detecting dark vessels in SAR
264 | Detecting dark vessels engaged in illegal, unreported, and unregulated (IUU) fishing activities on synthetic aperture radar (SAR) imagery. With human and algorithm annotated instances of vessels and fixed infrastructure across 43,200,000 km^2 of Sentinel-1 imagery, this multi-modal dataset enables algorithms to detect and classify dark vessels
265 | * [Official website](https://iuu.xview.us/)
266 | * [arXiv paper](https://arxiv.org/abs/2206.00897)
267 | * [Github](https://github.com/DIUx-xView) -> all reference code, dataset processing utilities, and winning model codes + weights
268 | * [paperswithcode](https://paperswithcode.com/dataset/xview3-sar)
269 | * [xview3_ship_detection](https://github.com/naivelogic/xview3_ship_detection)
270 | 
271 | ## Vehicle Detection in Aerial Imagery (VEDAI)
272 | Vehicle Detection in Aerial Imagery. Bounding box annotations
273 | * https://downloads.greyc.fr/vedai/
274 | * [pytorch-vedai](https://github.com/MichelHalmes/pytorch-vedai)
275 | 
276 | ## Cars Overhead With Context (COWC)
277 | Large set of annotated cars from overhead. Established baseline for object detection and counting tasks. Annotations in bounding box format
278 | * http://gdo152.ucllnl.org/cowc/
279 | * https://github.com/LLNL/cowc
280 | * [Detecting cars from aerial imagery for the NATO Innovation Challenge](https://arthurdouillard.com/post/nato-challenge/)
281 | 
282 | ## AI-TOD & AI-TOD-v2 - tiny object detection
283 | The mean size of objects in AI-TOD is about 12.8 pixels, which is much smaller than other datasets. Annotations in bounding box format. V2 is a meticulous relabelling of the v1 dataset
284 | * https://github.com/jwwangchn/AI-TOD
285 | * https://chasel-tsui.github.io/AI-TOD-v2/
286 | * [NWD](https://github.com/jwwangchn/NWD) -> code for 2021 [paper](https://arxiv.org/abs/2110.13389): A Normalized Gaussian Wasserstein Distance for Tiny Object Detection. Uses AI-TOD dataset
287 | * [ORFENet](https://github.com/dyl96/ORFENet) -> Tiny Object Detection in Remote Sensing Images Based on Object Reconstruction and Multiple Receptive Field Adaptive Feature Enhancement. Uses LEVIR-ship & AI-TOD-v2
288 | 
289 | ## RarePlanes
290 | * [RarePlanes](https://registry.opendata.aws/rareplanes/) -> incorporates both real and synthetically generated satellite imagery including aircraft. Read the [arxiv paper](https://arxiv.org/abs/2006.02963) and checkout [this repo](https://github.com/jdc08161063/RarePlanes). Note the dataset is available through the AWS Open-Data Program for free download
291 | * [Understanding the RarePlanes Dataset and Building an Aircraft Detection Model](https://encord.com/blog/rareplane-dataset-aircraft-detection-model/) -> blog post
292 | * Read [this article from NVIDIA](https://developer.nvidia.com/blog/preparing-models-for-object-detection-with-real-and-synthetic-data-and-tao-toolkit/) which discusses fine tuning a model pre-trained on synthetic data (Rareplanes) with 10% real data, then pruning the model to reduce its size, before quantizing the model to improve inference speed
293 | * [yoltv4](https://github.com/avanetten/yoltv4) includes examples on the [RarePlanes dataset](https://registry.opendata.aws/rareplanes/)
294 | * [rareplanes-yolov5](https://github.com/jeffaudi/rareplanes-yolov5) -> using YOLOv5 and the RarePlanes dataset to detect and classify sub-characteristics of aircraft, with [article](https://medium.com/artificialis/detecting-aircrafts-on-airbus-pleiades-imagery-with-yolov5-5f3d464b75ad)
295 | 
296 | ## Counting from Sky
297 | A Large-scale Dataset for Remote Sensing Object Counting and A Benchmark Method
298 | * https://github.com/gaoguangshuai/Counting-from-Sky-A-Large-scale-Dataset-for-Remote-Sensing-Object-Counting-and-A-Benchmark-Method
299 | 
300 | ## AIRS (Aerial Imagery for Roof Segmentation)
301 | Public dataset for roof segmentation from very-high-resolution aerial imagery (7.5cm). Covers almost the full area of Christchurch, the largest city in the South Island of New Zealand.
302 | * [On Kaggle](https://www.kaggle.com/datasets/atilol/aerialimageryforroofsegmentation)
303 | * [Rooftop-Instance-Segmentation](https://github.com/MasterSkepticista/Rooftop-Instance-Segmentation) -> VGG-16, Instance Segmentation, uses the Airs dataset
304 | 
305 | ## Inria building/not building segmentation dataset
306 | RGB GeoTIFF at spatial resolution of 0.3 m. Data covering Austin, Chicago, Kitsap County, Western & Easter Tyrol, Innsbruck, San Francisco & Vienna
307 | * https://project.inria.fr/aerialimagelabeling/contest/
308 | * [SemSegBuildings](https://github.com/SharpestProjects/SemSegBuildings) -> Project using fast.ai framework for semantic segmentation on Inria building segmentation dataset
309 | * [UNet_keras_for_RSimage](https://github.com/loveswine/UNet_keras_for_RSimage) -> keras code for binary semantic segmentation
310 | 
311 | ## AICrowd Mapping Challenge: building segmentation dataset
312 | 300x300 pixel RGB images with annotations in COCO format. Imagery appears to be global but with significant fraction from North America
313 | * Dataset release as part of the [mapping-challenge](https://www.aicrowd.com/challenges/mapping-challenge)
314 | * Winning solution published by neptune.ai [here](https://github.com/neptune-ai/open-solution-mapping-challenge), achieved precision 0.943 and recall 0.954 using Unet with Resnet.
315 | * [mappingchallenge](https://github.com/krishanr/mappingchallenge) -> YOLOv5 applied to the AICrowd Mapping Challenge dataset
316 | 
317 | ## BONAI - building footprint dataset
318 | BONAI (Buildings in Off-Nadir Aerial Images) is a dataset for building footprint extraction (BFE) in off-nadir aerial images
319 | * https://github.com/jwwangchn/BONAI
320 | 
321 | ## LEVIR-CD building change detection dataset
322 | * https://justchenhao.github.io/LEVIR/
323 | * [FCCDN_pytorch](https://github.com/chenpan0615/FCCDN_pytorch) -> pytorch implemention of FCCDN for change detection task
324 | * [RSICC](https://github.com/Chen-Yang-Liu/RSICC) -> the Remote Sensing Image Change Captioning dataset uses LEVIR-CD imagery
325 | 
326 | ## Onera (OSCD) Sentinel-2 change detection dataset
327 | It comprises 24 pairs of multispectral images taken from the Sentinel-2 satellites between 2015 and 2018. 
328 | * [Onera Satellite Change Detection Dataset](https://ieee-dataport.org/open-access/oscd-onera-satellite-change-detection) comprises 24 pairs of multispectral images taken from the Sentinel-2 satellites between 2015 and 2018
329 | * [Website](https://rcdaudt.github.io/oscd/)
330 | * [change_detection_onera_baselines](https://github.com/previtus/change_detection_onera_baselines) -> Siamese version of U-Net baseline model
331 | * [Urban Change Detection for Multispectral Earth Observation Using Convolutional Neural Networks](https://github.com/rcdaudt/patch_based_change_detection) -> with [paper](https://ieeexplore.ieee.org/abstract/document/8518015)
332 | * [DS_UNet](https://github.com/SebastianHafner/DS_UNet) -> code for 2021 paper: Sentinel-1 and Sentinel-2 Data Fusion for Urban Change Detection using a Dual Stream U-Net, uses Onera Satellite Change Detection dataset
333 | * [ChangeDetection_wOnera](https://github.com/tonydp03/ChangeDetection_wOnera)
334 | * [OSCD + additional Dates](https://github.com/granularai/fabric) -> extended with three different dates
335 | * [MSOSCD](https://github.com/Lihy256/MSCDUnet) -> change detection datasets containing VHR, multispectral (Sentinel-2) and SAR (Sentinel-1)
336 | 
337 | ## SECOND - semantic change detection
338 | * https://captain-whu.github.io/SCD/
339 | * Change detection at the pixel level
340 | 
341 | ## Amazon and Atlantic Forest dataset
342 | For semantic segmentation with Sentinel 2
343 | * [Amazon and Atlantic Forest image datasets for semantic segmentation](https://zenodo.org/record/4498086#.Y6LPLuzP1hE)
344 | * [attention-mechanism-unet](https://github.com/davej23/attention-mechanism-unet) -> An attention-based U-Net for detecting deforestation within satellite sensor imagery
345 | * [TransUNetplus2](https://github.com/aj1365/TransUNetplus2) -> Rethinking attention gated TransU-Net for deforestation mapping
346 | 
347 | ## Functional Map of the World ( fMoW)
348 | * https://github.com/fMoW/dataset
349 | * RGB & multispectral variants
350 | * High resolution, chip classification dataset
351 | * Purpose: predicting the functional purpose of buildings and land use from temporal sequences of satellite images and a rich set of metadata features
352 | 
353 | ## HRSCD change detection
354 | * https://rcdaudt.github.io/hrscd/
355 | * 291 coregistered image pairs of high resolution RGB aerial images
356 | * Pixel-level change and land cover annotations are provided
357 | 
358 | ## MiniFrance-DFC22 - semi-supervised semantic segmentation
359 | * The [MiniFrance-DFC22 (MF-DFC22) dataset](https://ieee-dataport.org/competitions/data-fusion-contest-2022-dfc2022) extends and modifies the [MiniFrance dataset](https://ieee-dataport.org/open-access/minifrance) for training semi-supervised semantic segmentation models for land use/land cover mapping
360 | * [dfc2022-baseline](https://github.com/isaaccorley/dfc2022-baseline) -> baseline solution to the 2022 IEEE GRSS Data Fusion Contest (DFC2022) using TorchGeo, PyTorch Lightning, and Segmentation Models PyTorch to train a U-Net with a ResNet-18 backbone and a loss function of Focal + Dice loss to perform semantic segmentation on the DFC2022 dataset
361 | * https://github.com/mveo/mveo-challenge
362 | 
363 | ## FLAIR
364 | Semantic segmentation and domain adaptation challenge proposed by the French National Institute of Geographical and Forest Information (IGN). Uses a dataset composed of over 70,000 aerial imagery patches with pixel-based annotations and 50,000 Sentinel-2 satellite acquisitions.
365 | * [Challenge on codalab](https://codalab.lisn.upsaclay.fr/competitions/13447)
366 | * [FLAIR-2 github](https://github.com/IGNF/FLAIR-2)
367 | * [flair-2 8th place solution](https://github.com/association-rosia/flair-2)
368 | * [IGNF HuggingFace](https://huggingface.co/IGNF)
369 | 
370 | ## ISPRS
371 | Semantic segmentation dataset. 38 patches of 6000x6000 pixels, each consisting of a true orthophoto (TOP) extracted from a larger TOP mosaic, and a DSM. Resolution 5 cm
372 | * https://www.isprs.org/education/benchmarks/UrbanSemLab/2d-sem-label-potsdam.aspx
373 | 
374 | ## SpaceNet
375 | SpaceNet is a series of competitions with datasets and utilities provided. The challenges covered are: (1 & 2) building segmentation, (3) road segmentation, (4) off-nadir buildings, (5) road network extraction, (6) multi-senor mapping, (7) multi-temporal urban change, (8) Flood Detection Challenge Using Multiclass Segmentation
376 | * [spacenet.ai](https://spacenet.ai/) is an online hub for data, challenges, algorithms, and tools
377 | * [The SpaceNet 7 Multi-Temporal Urban Development Challenge: Dataset Release](https://medium.com/the-downlinq/the-spacenet-7-multi-temporal-urban-development-challenge-dataset-release-9e6e5f65c8d5)
378 | * [spacenet-three-topcoder](https://github.com/snakers4/spacenet-three-topcoder) solution
379 | * [official utilities](https://github.com/SpaceNetChallenge/utilities) -> Packages intended to assist in the preprocessing of SpaceNet satellite imagery dataset to a format that is consumable by machine learning algorithms
380 | * [andraugust spacenet-utils](https://github.com/andraugust/spacenet-utils) -> Display geotiff image with building-polygon overlay & label buildings using kNN on the pixel spectra
381 | * [Spacenet-Building-Detection](https://github.com/IdanC1s2/Spacenet-Building-Detection) -> uses keras and [Spacenet 1 dataset](https://spacenet.ai/spacenet-buildings-dataset-v1/)
382 | * [Spacenet 8 winners blog post](https://medium.com/@SpaceNet_Project/spacenet-8-a-closer-look-at-the-winning-approaches-75ff4033bf53)
383 | 
384 | ## WorldStrat Dataset
385 | Nearly 10,000 km² of free high-resolution satellite imagery of unique locations which ensure stratified representation of all types of land-use across the world: from agriculture to ice caps, from forests to multiple urbanization densities.
386 | * https://github.com/worldstrat/worldstrat
387 | * [Quick tour of the WorldStrat Dataset](https://medium.com/@robmarkcole/quick-tour-of-the-worldstrat-dataset-b2d1c2d435db)
388 | * Each high-resolution image (1.5 m/pixel) comes with multiple temporally-matched low-resolution images from the freely accessible lower-resolution Sentinel-2 satellites (10 m/pixel)
389 | * Several super-resolution benchmark models trained on it
390 | 
391 | ## Satlas Pretrain
392 | SatlasPretrain is a large-scale pre-training dataset for tasks that involve understanding satellite images. Regularly-updated satellite data is publicly available for much of the Earth through sources such as Sentinel-2 and NAIP, and can inform numerous applications from tackling illegal deforestation to monitoring marine infrastructure. 
393 | * [Website](https://satlas-pretrain.allen.ai/)
394 | * [Code](https://github.com/allenai/satlas)
395 | 
396 | ## FLAIR 1 & 2 Segmentation datasets
397 | * https://ignf.github.io/FLAIR/
398 | * The FLAIR #1 semantic segmentation dataset consists of 77,412 high resolution patches (512x512 at 0.2 m spatial resolution) with 19 semantic classes
399 | * FLAIR #2 includes an expanded dataset of Sentinel-2 time series for multi-modal semantic segmentation
400 | 
401 | ## Five Billion Pixels segmentation dataset
402 | * https://x-ytong.github.io/project/Five-Billion-Pixels.html
403 | * 4m Gaofen-2 imagery over China
404 | * 24 land cover classes
405 | * Paper and code demonstrating domain adaptation to Sentinel-2 and Planetscope imagery
406 | * Extends the [GID15 large scale semantic segmentation dataset](https://captain-whu.github.io/GID15/)
407 | * [GID](https://x-ytong.github.io/project/GID.html) -> the Gaofen Image Dataset is a large-scale land-cover dataset with Gaofen-2 (GF-2) satellite images
408 | 
409 | ## RF100 object detection benchmark
410 | RF100 is compiled from 100 real world datasets that straddle a range of domains. The aim is that performance evaluation on this dataset will enable a more nuanced guide of how a model will perform in different domains. Contains 10k aerial images
411 | * https://www.rf100.org/
412 | * https://github.com/roboflow-ai/roboflow-100-benchmark
413 | 
414 | ## SODA-A rotated bounding boxes
415 | * https://shaunyuan22.github.io/SODA/
416 | * SODA-A comprises 2513 high-resolution images of aerial scenes, which has 872069 instances annotated with oriented rectangle box annotations over 9 classes
417 | * https://github.com/shaunyuan22/CFINet
418 | 
419 | ## EarthView from Satellogic
420 | * https://huggingface.co/datasets/satellogic/EarthView
421 | * Dataset for foundational models, with Sentinel 1 & 2 and 1m RGB
422 | 
423 | ## Microsoft datasets
424 | * [US Building Footprints](https://github.com/Microsoft/USBuildingFootprints) -> building footprints in all 50 US states, GeoJSON format, generated using semantic segmentation. Also [Australia](https://github.com/microsoft/AustraliaBuildingFootprints), [Canadian](https://github.com/Microsoft/CanadianBuildingFootprints), [Uganda-Tanzania](https://github.com/microsoft/Uganda-Tanzania-Building-Footprints), [Kenya-Nigeria](https://github.com/microsoft/KenyaNigeriaBuildingFootprints) and [GlobalMLBuildingFootprints](https://github.com/microsoft/GlobalMLBuildingFootprints) are available. Checkout [RasterizingBuildingFootprints](https://github.com/mehdiheris/RasterizingBuildingFootprints) to convert vector shapefiles to raster layers
425 | * [Microsoft Planetary Computer](https://planetarycomputer.microsoft.com/) is a Dask-Gateway enabled JupyterHub deployment focused on supporting scalable geospatial analysis, [source repo](https://github.com/microsoft/planetary-computer-hub)
426 | * [landcover-orinoquia](https://github.com/microsoft/landcover-orinoquia) -> Land cover mapping of the Orinoquía region in Colombia, in collaboration with Wildlife Conservation Society Colombia. An #AIforEarth project
427 | * [RoadDetections dataset by Microsoft](https://github.com/microsoft/RoadDetections)
428 | 
429 | ## Google datasets
430 | * [open-buildings](https://sites.research.google/open-buildings/) -> A dataset of building footprints to support social good applications covering 64% of the African continent. Read [Mapping Africa’s Buildings with Satellite Imagery](https://ai.googleblog.com/2021/07/mapping-africas-buildings-with.html)
431 | 
432 | ## Google Earth Engine (GEE)
433 | Since there is a whole community around GEE I will not reproduce it here but list very select references. Get started at https://developers.google.com/earth-engine/
434 | * Various imagery and climate datasets, including Landsat & Sentinel imagery
435 | * Supports large scale processing with classical algorithms, e.g. clustering for land use. For deep learning, you export datasets from GEE as tfrecords, train on your preferred GPU platform, then upload inference results back to GEE
436 | * [awesome-google-earth-engine](https://github.com/gee-community/awesome-google-earth-engine)
437 | * [Awesome-GEE](https://github.com/giswqs/Awesome-GEE)
438 | * [awesome-earth-engine-apps](https://github.com/philippgaertner/awesome-earth-engine-apps)
439 | * [How to Use Google Earth Engine and Python API to Export Images to Roboflow](https://blog.roboflow.com/how-to-use-google-earth-engine-with-roboflow/) -> to acquire training data
440 | * [ee-fastapi](https://github.com/csaybar/ee-fastapi) is a simple FastAPI web application for performing flood detection using Google Earth Engine in the backend.
441 | * [How to Download High-Resolution Satellite Data for Anywhere on Earth](https://towardsdatascience.com/how-to-download-high-resolution-satellite-data-for-anywhere-on-earth-5e6dddee2803)
442 | * [wxee](https://github.com/aazuspan/wxee) -> Export data from GEE to xarray using wxee then train with pytorch or tensorflow models. Useful since GEE only suports tfrecord export natively
443 | 
444 | ## Image captioning datasets
445 | * [RSICD](https://github.com/201528014227051/RSICD_optimal) -> 10921 images with five sentences descriptions per image. Used in  [Fine tuning CLIP with Remote Sensing (Satellite) images and captions](https://huggingface.co/blog/fine-tune-clip-rsicd), models at [this repo](https://github.com/arampacha/CLIP-rsicd)
446 | * [RSICC](https://github.com/Chen-Yang-Liu/RSICC) -> the Remote Sensing Image Change Captioning dataset contains 10077 pairs of bi-temporal remote sensing images and 50385 sentences describing the differences between images. Uses LEVIR-CD imagery
447 | * [ChatEarthNet](https://github.com/zhu-xlab/ChatEarthNet) -> A Global-Scale Image-Text Dataset Empowering Vision-Language Geo-Foundation Models, utilizes Sentinel-2 data with captions generated by ChatGPT
448 | 
449 | ## Weather Datasets
450 | * NASA (make request and emailed when ready) -> https://search.earthdata.nasa.gov
451 | * NOAA (requires BigQuery) -> https://www.kaggle.com/datasets/noaa/goes16/home
452 | * Time series weather data for several US cities -> https://www.kaggle.com/datasets/selfishgene/historical-hourly-weather-data
453 | * [DeepWeather](https://github.com/adamhazimeh/DeepWeather) -> improve weather forecasting accuracy by analyzing satellite images
454 | 
455 | ## Cloud datasets
456 | * [Planet-CR](https://github.com/zhu-xlab/Planet-CR) -> A Multi-Modal and Multi-Resolution Dataset for Cloud Removal in High Resolution Optical Remote Sensing Imagery, 3m resolution, with [paper](https://arxiv.org/abs/2301.03432)
457 | * [The Azavea Cloud Dataset](https://www.azavea.com/blog/2021/08/02/the-azavea-cloud-dataset/) which is used to train this [cloud-model](https://github.com/azavea/cloud-model)
458 | * [Sentinel-2 Cloud Cover Segmentation Dataset](https://mlhub.earth/data/ref_cloud_cover_detection_challenge_v1) on Radiant mlhub
459 | * [cloudsen12](https://cloudsen12.github.io/) -> see [video](https://youtu.be/GhQwnVhJ1wo)
460 | * [HRC_WHU](https://github.com/dr-lizhiwei/HRC_WHU) -> High-Resolution Cloud Detection Dataset comprising 150 RGB images and a resolution varying from 0.5 to 15 m in different global regions
461 | * [AIR-CD](https://github.com/AICyberTeam/AIR-CD) -> a challenging cloud detection data set called AIR-CD, with higher spatial resolution and more representative landcover types
462 | * [Landsat 8 Cloud Cover Assessment Validation Data](https://landsat.usgs.gov/landsat-8-cloud-cover-assessment-validation-data)
463 | 
464 | ## Forest datasets
465 | * [OpenForest](https://github.com/RolnickLab/OpenForest) -> A catalogue of open access forest datasets
466 | * [awesome-forests](https://github.com/blutjens/awesome-forests) -> A curated list of ground-truth forest datasets for the machine learning and forestry community
467 | * [ReforesTree](https://github.com/gyrrei/ReforesTree) -> A dataset for estimating tropical forest biomass based on drone and field data
468 | * [yosemite-tree-dataset](https://github.com/nightonion/yosemite-tree-dataset) -> a benchmark dataset for tree counting from aerial images
469 | * [Amazon Rainforest dataset for semantic segmentation](https://zenodo.org/record/3233081#.Y6LPLOzP1hE) -> Sentinel 2 images. Used in the paper 'An attention-based U-Net for detecting deforestation within satellite sensor imagery'
470 | * [Amazon and Atlantic Forest image datasets for semantic segmentation](https://zenodo.org/record/4498086#.Y6LPLuzP1hE) -> Sentinel 2 images. Used in paper 'An attention-based U-Net for detecting deforestation within satellite sensor imagery'
471 | * [TreeSatAI](https://zenodo.org/records/6780578) -> Sentinel-1, Sentinel-2
472 | * [PureForest](https://huggingface.co/datasets/IGNF/PureForest) -> VHR RGB + Near-Infrared & lidar, each patch represents a monospecific forest
473 | 
474 | ## Geospatial datasets
475 | * [Resource Watch](https://resourcewatch.org/data/explore) provides a wide range of geospatial datasets and a UI to visualise them
476 | 
477 | ## Time series & change detection datasets
478 | * [BreizhCrops](https://github.com/dl4sits/BreizhCrops) -> A Time Series Dataset for Crop Type Mapping
479 | * The SeCo dataset contains image patches from Sentinel-2 tiles captured at different timestamps at each geographical location. [Download SeCo here](https://github.com/ElementAI/seasonal-contrast)
480 | * [SYSU-CD](https://github.com/liumency/SYSU-CD) -> The dataset contains 20000 pairs of 0.5-m aerial images of size 256×256 taken between the years 2007 and 2014 in Hong Kong
481 | 
482 | ### DEM (digital elevation maps)
483 | * Shuttle Radar Topography Mission, search online at usgs.gov
484 | * Copernicus Digital Elevation Model (DEM) on S3, represents the surface of the Earth including buildings, infrastructure and vegetation. Data is provided as Cloud Optimized GeoTIFFs. [link](https://registry.opendata.aws/copernicus-dem/)
485 | * [Awesome-DEM](https://github.com/DahnJ/Awesome-DEM)
486 | 
487 | ## UAV & Drone datasets
488 | * Many on https://www.visualdata.io
489 | * [AU-AIR dataset](https://bozcani.github.io/auairdataset) -> a multi-modal UAV dataset for object detection.
490 | * [ERA](https://lcmou.github.io/ERA_Dataset/) ->  A Dataset and Deep Learning Benchmark for Event Recognition in Aerial Videos.
491 | * [Aerial Maritime Drone Dataset](https://public.roboflow.ai/object-detection/aerial-maritime) -> bounding boxes
492 | * [RetinaNet for pedestrian detection](https://towardsdatascience.com/pedestrian-detection-in-aerial-images-using-retinanet-9053e8a72c6) -> bounding boxes
493 | * [BIRDSAI: A Dataset for Detection and Tracking in Aerial Thermal Infrared Videos](https://github.com/exb7900/BIRDSAI) -> Thermal IR videos of humans and animals
494 | * [ERA: A Dataset and Deep Learning Benchmark for Event Recognition in Aerial Videos](https://lcmou.github.io/ERA_Dataset/)
495 | * [DroneVehicle](https://github.com/VisDrone/DroneVehicle) -> Drone-based RGB-Infrared Cross-Modality Vehicle Detection via Uncertainty-Aware Learning. Annotations are rotated bounding boxes. With [Github repo](https://github.com/SunYM2020/UA-CMDet)
496 | * [UAVOD10](https://github.com/weihancug/10-category-UAV-small-weak-object-detection-dataset-UAVOD10) -> 10 class of objects at 15 cm resolution. Classes are; building, ship, vehicle, prefabricated house, well, cable tower, pool, landslide, cultivation mesh cage, and quarry. Bounding boxes
497 | * [Busy-parking-lot-dataset---vehicle-detection-in-UAV-video](https://github.com/zhu-xlab/Busy-parking-lot-dataset---vehicle-detection-in-UAV-video) -> Vehicle instance segmentation. Unsure format of annotations, possible Matlab specific
498 | * [dd-ml-segmentation-benchmark](https://github.com/dronedeploy/dd-ml-segmentation-benchmark) -> DroneDeploy Machine Learning Segmentation Benchmark
499 | * [SeaDronesSee](https://github.com/Ben93kie/SeaDronesSee) -> Vision Benchmark for Maritime Search and Rescue. Bounding box object detection, single-object tracking and multi-object tracking annotations
500 | * [aeroscapes](https://github.com/ishann/aeroscapes) -> semantic segmentation benchmark comprises of images captured using a commercial drone from an altitude range of 5 to 50 metres.
501 | * [ALTO](https://github.com/MetaSLAM/ALTO) -> Aerial-view Large-scale Terrain-Oriented. For deep learning based UAV visual place recognition and localization tasks.
502 | * [HIT-UAV-Infrared-Thermal-Dataset](https://github.com/suojiashun/HIT-UAV-Infrared-Thermal-Dataset) -> A High-altitude Infrared Thermal Object Detection Dataset for Unmanned Aerial Vehicles
503 | * [caltech-aerial-rgbt-dataset](https://github.com/aerorobotics/caltech-aerial-rgbt-dataset) -> synchronized RGB, thermal, GPS, and IMU data
504 | * [Leafy Spurge Dataset](https://leafy-spurge-dataset.github.io/) -> Real-world Weed Classification Within Aerial Drone Imagery
505 | * [UAV-HSI-Crop-Dataset](https://github.com/MrSuperNiu/UAV-HSI-Crop-Dataset) -> dataset for "HSI-TransUNet: A Transformer based semantic segmentation model for crop mapping from UAV hyperspectral imagery"
506 | * [UAVVaste](https://github.com/PUTvision/UAVVaste) -> COCO-like dataset and effective waste detection in aerial images
507 | 
508 | ## Other datasets
509 | * [land-use-land-cover-datasets](https://github.com/r-wenger/land-use-land-cover-datasets)
510 | * [EORSSD-dataset](https://github.com/rmcong/EORSSD-dataset) -> Extended Optical Remote Sensing Saliency Detection (EORSSD) Dataset
511 | * [RSD46-WHU](https://github.com/RSIA-LIESMARS-WHU/RSD46-WHU) -> 46 scene classes for image classification, free for education, research and commercial use
512 | * [RSOD-Dataset](https://github.com/RSIA-LIESMARS-WHU/RSOD-Dataset-) -> dataset for object detection in PASCAL VOC format. Aircraft, playgrounds, overpasses & oiltanks
513 | * [VHR-10_dataset_coco](https://github.com/chaozhong2010/VHR-10_dataset_coco) -> Object detection and instance segmentation dataset based on NWPU VHR-10 dataset. RGB & SAR
514 | * [HRSID](https://github.com/chaozhong2010/HRSID) -> high resolution sar images dataset for ship detection, semantic segmentation, and instance segmentation tasks
515 | * [MAR20](https://gcheng-nwpu.github.io/) -> Military Aircraft Recognition dataset
516 | * [RSSCN7](https://github.com/palewithout/RSSCN7) -> Dataset of the article “Deep Learning Based Feature Selection for Remote Sensing Scene Classification”
517 | * [Sewage-Treatment-Plant-Dataset](https://github.com/peijinwang/Sewage-Treatment-Plant-Dataset) -> object detection
518 | * [TGRS-HRRSD-Dataset](https://github.com/CrazyStoneonRoad/TGRS-HRRSD-Dataset) -> High Resolution Remote Sensing Detection (HRRSD)
519 | * [MUSIC4HA](https://github.com/gistairc/MUSIC4HA) -> MUltiband Satellite Imagery for object Classification (MUSIC) to detect Hot Area
520 | * [MUSIC4GC](https://github.com/gistairc/MUSIC4GC) -> MUltiband Satellite Imagery for object Classification (MUSIC) to detect Golf Course
521 | * [MUSIC4P3](https://github.com/gistairc/MUSIC4P3) -> MUltiband Satellite Imagery for object Classification (MUSIC) to detect Photovoltaic Power Plants (solar panels)
522 | * [ABCDdataset](https://github.com/gistairc/ABCDdataset) -> damage detection dataset to identify whether buildings have been washed-away by tsunami
523 | * [OGST](https://data.mendeley.com/datasets/bkxj8z84m9/3) -> Oil and Gas Tank Dataset
524 | * [LS-SSDD-v1.0-OPEN](https://github.com/TianwenZhang0825/LS-SSDD-v1.0-OPEN) -> Large-Scale SAR Ship Detection Dataset
525 | * [S2Looking](https://github.com/S2Looking/Dataset) -> A Satellite Side-Looking Dataset for Building Change Detection, [paper](https://arxiv.org/abs/2107.09244)
526 | * [AISD](https://github.com/RSrscoder/AISD) -> Aerial Imagery dataset for Shadow Detection
527 | * [Awesome-Remote-Sensing-Relative-Radiometric-Normalization-Datasets](https://github.com/ArminMoghimi/Awesome-Remote-Sensing-Relative-Radiometric-Normalization-Datasets)
528 | * [SearchAndRescueNet](https://github.com/michaelthoreau/SearchAndRescueNet) -> Satellite Imagery for Search And Rescue Dataset, with example Faster R-CNN model
529 | * [geonrw](https://ieee-dataport.org/open-access/geonrw) -> orthorectified aerial photographs, LiDAR derived digital elevation models and segmentation maps with 10 classes. With [repo](https://github.com/gbaier/geonrw)
530 | * [Thermal power plans dataset](https://github.com/wenxinYin/AIR-TPPDD)
531 | * [University1652-Baseline](https://github.com/layumi/University1652-Baseline) -> A Multi-view Multi-source Benchmark for Drone-based Geo-localization
532 | * [benchmark_ISPRS2021](https://github.com/whuwuteng/benchmark_ISPRS2021) -> A new stereo dense matching benchmark dataset for deep learning
533 | * [WHU-SEN-City](https://github.com/whu-csl/WHU-SEN-City) -> A paired SAR-to-optical image translation dataset which covers 34 big cities of China
534 | * [SAR_vehicle_detection_dataset](https://github.com/whu-csl/SAR_vehicle_detection_dataset) -> 104 SAR images for vehicle detection, collected from Sandia MiniSAR/FARAD SAR images and MSTAR images
535 | * [ERA-DATASET](https://github.com/zhu-xlab/ERA-DATASET) -> A Dataset and Deep Learning Benchmark for Event Recognition in Aerial Videos
536 | * [SSL4EO-S12](https://github.com/zhu-xlab/SSL4EO-S12) -> a large-scale dataset for self-supervised learning in Earth observation
537 | * [UBC-dataset](https://github.com/AICyberTeam/UBC-dataset) -> a dataset for building detection and classification from very high-resolution satellite imagery with the focus on object-level interpretation of individual buildings
538 | * [AIR-CD](https://github.com/AICyberTeam/AIR-CD) -> a challenging cloud detection data set called AIR-CD, with higher spatial resolution and more representative landcover types
539 | * [AIR-PolSAR-Seg](https://github.com/AICyberTeam/AIR-PolSAR-Seg) -> a challenging PolSAR terrain segmentation dataset
540 | * [HRC_WHU](https://github.com/dr-lizhiwei/HRC_WHU) -> High-Resolution Cloud Detection Dataset comprising 150 RGB images and a resolution varying from 0.5 to 15 m in different global regions
541 | * [AeroRIT](https://github.com/aneesh3108/AeroRIT) -> A New Scene for Hyperspectral Image Analysis
542 | * [Building_Dataset](https://github.com/QiaoWenfan/Building_Dataset) -> High-speed Rail Line Building Dataset Display
543 | * [Haiming-Z/MtS-WH-reference-map](https://github.com/Haiming-Z/MtS-WH-reference-map) -> a reference map for change detection based on MtS-WH
544 | * [MtS-WH-Dataset](https://github.com/rulixiang/MtS-WH-Dataset) -> Multi-temporal Scene WuHan (MtS-WH) Dataset
545 | * [Multi-modality-image-matching](https://github.com/StaRainJ/Multi-modality-image-matching-database-metrics-methods) -> image matching dataset including several remote sensing modalities
546 | * [RID](https://github.com/TUMFTM/RID) -> Roof Information Dataset for CV-Based Photovoltaic Potential Assessment. With [paper](https://www.mdpi.com/2072-4292/14/10/2299)
547 | * [APKLOT](https://github.com/langheran/APKLOT) -> A dataset for aerial parking block segmentation
548 | * [QXS-SAROPT](https://github.com/yaoxu008/QXS-SAROPT) -> Optical and SAR pairing dataset from the [paper](https://arxiv.org/abs/2103.08259): The QXS-SAROPT Dataset for Deep Learning in SAR-Optical Data Fusion
549 | * [SAR-ACD](https://github.com/AICyberTeam/SAR-ACD) -> SAR-ACD consists of 4322 aircraft clips with 6 civil aircraft categories and 14 other aircraft categories
550 | * [SODA](https://shaunyuan22.github.io/SODA/) -> A large-scale Small Object Detection dataset. SODA-A comprises 2510 high-resolution images of aerial scenes, which has 800203 instances annotated with oriented rectangle box annotations over 9 classes.
551 | * [Data-CSHSI](https://github.com/YuxiangZhang-BIT/Data-CSHSI) -> Open source datasets for Cross-Scene Hyperspectral Image Classification, includes Houston, Pavia & HyRank datasets
552 | * [SynthWakeSAR](https://data.bris.ac.uk/data/dataset/30kvuvmatwzij2mz1573zqumfx) -> A Synthetic SAR Dataset for Deep Learning Classification of Ships at Sea, with [paper](https://www.mdpi.com/2072-4292/14/16/3999)
553 | * [SAR2Opt-Heterogeneous-Dataset](https://github.com/MarsZhaoYT/SAR2Opt-Heterogeneous-Dataset) -> SAR-optical images to be used as a benchmark in change detection and image transaltion on remote sensing images
554 | * [urban-tree-detection-data](https://github.com/jonathanventura/urban-tree-detection-data) -> Dataset for training and evaluating tree detectors in urban environments with aerial imagery
555 | * [Landsat 8 Cloud Cover Assessment Validation Data](https://landsat.usgs.gov/landsat-8-cloud-cover-assessment-validation-data)
556 | * [Attribute-Cooperated-Classification-Datasets](https://github.com/CrazyStoneonRoad/Attribute-Cooperated-Classification-Datasets) -> Three datasets based on AID, UCM, and Sydney. For each image, there is a label of scene classification and a label vector of attribute items.
557 | * [dynnet](https://github.com/aysim/dynnet) -> DynamicEarthNet: Daily Multi-Spectral Satellite Dataset for Semantic Change Segmentation
558 | * [open_earth_map](https://github.com/bao18/open_earth_map) -> a benchmark dataset for global high-resolution land cover mapping
559 | * [Satellite imagery datasets containing ships](https://github.com/NaLiu613/Satellite-Imagery-Datasets-Containing-Ships) -> A list of radar and optical satellite datasets for ship detection, classification, semantic segmentation and instance segmentation tasks
560 | * [SolarDK](https://arxiv.org/abs/2212.01260) -> A high-resolution urban solar panel image classification and localization dataset
561 | * [Roofline-Extraction](https://github.com/loosgagnet/Roofline-Extraction) -> dataset for paper 'Knowledge-Based 3D Building Reconstruction (3DBR) Using Single Aerial Images and Convolutional Neural Networks (CNNs)'
562 | * [Building-detection-and-roof-type-recognition](https://github.com/loosgagnet/Building-detection-and-roof-type-recognition) -> datasets for the paper 'A CNN-Based Approach for Automatic Building Detection and Recognition of Roof Types Using a Single Aerial Image'
563 | * [PanCollection](https://github.com/liangjiandeng/PanCollection) -> Pansharpening Datasets from WorldView 2, WorldView 3, QuickBird, Gaofen 2 sensors
564 | * [OnlyPlanes](https://github.com/naivelogic/OnlyPlanes) -> Synthetic dataset and pretrained models for Detectron2
565 | * [Remote Sensing Satellite Video Dataset for Super-resolution](https://zenodo.org/record/6969604#.ZCBd-OzMJhE)
566 | * [WHU-Stereo](https://github.com/Sheng029/WHU-Stereo) -> A Challenging Benchmark for Stereo Matching of High-Resolution Satellite Images
567 | * [FireRisk](https://github.com/CharmonyShen/FireRisk) -> A Remote Sensing Dataset for Fire Risk Assessment with Benchmarks Using Supervised and Self-supervised Learning
568 | * [Road-Change-Detection-Dataset](https://github.com/fightingMinty/Road-Change-Detection-Dataset)
569 | * [3DCD](https://sites.google.com/uniroma1.it/3dchangedetection/home-page) -> infer 3D CD maps using only remote sensing optical bitemporal images as input without the need of Digital Elevation Models (DEMs)
570 | * [Hyperspectral Change Detection Dataset Irrigated Agricultural Area](https://github.com/SicongLiuRS/Hyperspectral-Change-Detection-Dataset-Irrigated-Agricultural-Area)
571 | * [CNN-RNN-Yield-Prediction](https://github.com/saeedkhaki92/CNN-RNN-Yield-Prediction) -> soybean dataset
572 | * [HySpecNet-11k](https://hyspecnet.rsim.berlin/) -> a large-scale hyperspectral benchmark dataset
573 | * [Mumbai-Semantic-Segmentation-Dataset](https://github.com/GeoAI-Research-Lab/Mumbai-Semantic-Segmentation-Dataset)
574 | * [SZTAKI](http://web.eee.sztaki.hu/remotesensing/airchange_benchmark.html) -> A Ground truth collection for change detection in optical aerial images taken with several years time differences
575 | * [DSIFN](https://github.com/GeoZcx/A-deeply-supervised-image-fusion-network-for-change-detection-in-remote-sensing-images/tree/master/dataset) -> change detection dataset, it consists of six large bi-temporal high resolution images covering six cities in China
576 | * [SV248S](https://github.com/xdai-dlgvv/SV248S) -> Single Object Tracking Dataset, tracking Vehicle, Large-Vehicle, Ship and Airplane
577 | * [GAMUS](https://github.com/EarthNets/RSI-MMSegmentation) ->  A Geometry-aware Multi-modal Semantic Segmentation Benchmark for Remote Sensing Data
578 | * [Oil and Gas Infrastructure Mapping (OGIM) database](https://zenodo.org/record/7922117) -> includes locations and facility attributes of oil and gas infrastructure types that are important sources of methane emissions
579 | * [openWUSU](https://github.com/AngieNikki/openWUSU) -> WUSU is a semantic understanding dataset focusing on urban structure and the urbanization process in Wuhan
580 | * [Digital Typhoon Dataset](https://github.com/kitamoto-lab/digital-typhoon/) -> aimed at benchmarking machine learning models for long-term spatio-temporal data
581 | * [RSE_Cross-city](https://github.com/danfenghong/RSE_Cross-city) -> Cross-City Matters: A Multimodal Remote Sensing Benchmark Dataset for Cross-City Semantic Segmentation using High-Resolution Domain Adaptation Networks
582 | * [AErial Lane](https://github.com/Jiawei-Yao0812/AerialLaneNet) -> AErial Lane (AEL) Dataset is a first large-scale aerial image dataset built for lane detection, with high-quality polyline lane annotations on high-resolution images of around 80 kilometers of road
583 | * [GeoPile pretraining dataset](https://github.com/mmendiet/GFM) -> compiles imagery from other datasets including RSD46-WHU, MLRSNet and RESISC45 for pretraining of Foundational models
584 | * [NWPU-MOC](https://github.com/lyongo/NWPU-MOC) -> A Benchmark for Fine-grained Multi-category Object Counting in Aerial Images
585 | * [Chesapeake Roads Spatial Context (RSC)](https://github.com/isaaccorley/ChesapeakeRSC)
586 | * [STARCOP dataset: Semantic Segmentation of Methane Plumes with Hyperspectral Machine Learning Models](https://zenodo.org/records/7863343)
587 | * [Toulouse Hyperspectral Data Set](https://www.toulouse-hyperspectral-data-set.com/)
588 | * [CloudTracks: A Dataset for Localizing Ship Tracks in Satellite Images of Clouds](https://zenodo.org/records/10042922) -> the dataset consists of 1,780 MODIS satellite images hand-labeled for the presence of more than 12,000 ship tracks.
589 | * [Vehicle Perception from Satellite](https://github.com/Chenxi1510/Vehicle-Perception-from-Satellite-Videos) -> a large-scale benchmark for traffic monitoring from satellite
590 | * [SARDet-100K](https://github.com/zcablii/SARDet_100K) -> Large-Scale Synthetic Aperture Radar (SAR) Object Detection
591 | * [So2Sat-POP-DL](https://github.com/zhu-xlab/So2Sat-POP-DL) -> Dataset discovery: So2Sat Population dataset covering 98 EU cities
592 | * [Urban Vehicle Segmentation Dataset (UV6K)](https://zenodo.org/records/8404754)
593 | * [TimeMatch](https://zenodo.org/records/5636422) -> dataset for cross-region adaptation for crop identification from SITS in four different regions in Europe
594 | * [BirdSAT](https://github.com/mvrl/BirdSAT) -> Cross-View iNAT Birds 2021: This cross-view birds species dataset consists of paired ground-level bird images and satellite images, along with meta-information associated with the iNaturalist-2021 dataset.
595 | * [OpenSARWake](https://github.com/libzzluo/OpenSARWake) -> A SAR ship wake rotation detection benchmark dataset.
596 | * [TUE-CD](https://github.com/RSMagneto/MSI-Net) -> A change detection detection for building damage estimation after earthquake
597 | * [Overhead Wind Turbine Dataset - NAIP](https://zenodo.org/records/7385227#.Y419qezMLdr)
598 | * [Toulouse Hyperspectral Data Set](https://github.com/Romain3Ch216/TlseHypDataSet)
599 | * [Hi-UCD](https://github.com/Daisy-7/Hi-UCD-S) -> ultra-High Urban Change Detection for urban semantic change detection
600 | * [LEVIR-CC-Dataset](https://github.com/Chen-Yang-Liu/LEVIR-CC-Dataset) -> A Large Dataset for Remote Sensing Image Change Captioning
601 | * [ShipRSImageNet](https://github.com/zzndream/ShipRSImageNet) -> A Large-scale Fine-Grained Dataset for Ship Detection in High-Resolution Optical Remote Sensing Images
602 | * [pangaea-bench](https://github.com/yurujaja/pangaea-bench) -> A Global and Inclusive Benchmark for Geospatial Foundation Models
603 | * [VRSBench: A Versatile Vision-Language Benchmark Dataset for Remote Sensing Image Understanding](https://vrsbench.github.io/)
604 | * [SeeFar](https://coastalcarbon.ai/seefar) -> Satellite Agnostic Multi-Resolution Dataset for Geospatial Foundation Models
605 | * [RSHaze+](https://zenodo.org/records/13837162) -> remote sensing dehazing datasets in PhDnet: A novel physic-aware dehazing network for remote sensing images
606 | * [GDCLD](https://zenodo.org/records/13612636) -> A globally distributed dataset of coseismic landslide mapping via multi-source high-resolution remote sensing images
607 | * [10,000 Crop Field Boundaries across India](https://zenodo.org/records/7315090) -> using Airbus SPOT
608 | * [BANet change dataset - RS image to cadastral map](https://github.com/lqycrystal/BANet)
609 | * [VME: A Satellite Imagery Dataset and Benchmark for Detecting Vehicles in the Middle East and Beyond](https://github.com/nalemadi/VME_CDSI_dataset_benchmark)
610 | * [HouseTS](https://www.kaggle.com/datasets/shengkunwang/housets-dataset) -> Long-term, Multimodal Housing Dataset Across 30 U.S. Metropolitan Area. Uses NAIP. [With paper](https://arxiv.org/abs/2506.00765)
611 | 
612 | ## Kaggle
613 | Kaggle hosts over > 200 satellite image datasets, [search results here](https://www.kaggle.com/search?q=satellite+image+in%3Adatasets).
614 | The [kaggle blog](http://blog.kaggle.com) is an interesting read.
615 | 
616 | ### Kaggle - Amazon from space - classification challenge
617 | * https://www.kaggle.com/c/planet-understanding-the-amazon-from-space/data
618 | * 3-5 meter resolution GeoTIFF images from planet Dove satellite constellation
619 | * 12 classes including - **cloudy, primary + waterway** etc
620 | * [1st place winner interview - used 11 custom CNN](http://blog.kaggle.com/2017/10/17/planet-understanding-the-amazon-from-space-1st-place-winners-interview/)
621 | * [FastAI Multi-label image classification](https://towardsdatascience.com/fastai-multi-label-image-classification-8034be646e95)
622 | * [Multi-Label Classification of Satellite Photos of the Amazon Rainforest](https://machinelearningmastery.com/how-to-develop-a-convolutional-neural-network-to-classify-satellite-photos-of-the-amazon-rainforest/)
623 | * [Understanding the Amazon Rainforest with Multi-Label Classification + VGG-19, Inceptionv3, AlexNet & Transfer Learning](https://towardsdatascience.com/understanding-the-amazon-rainforest-with-multi-label-classification-vgg-19-inceptionv3-5084544fb655)
624 | * [amazon-classifier](https://github.com/mikeskaug/amazon-classifier) -> compares random forest with CNN
625 | * [multilabel-classification](https://github.com/muneeb706/multilabel-classification) -> compares various CNN architecutres
626 | * [Planet-Amazon-Kaggle](https://github.com/Skumarr53/Planet-Amazon-Kaggle) -> uses fast.ai
627 | * [deforestation_deep_learning](https://github.com/schumanzhang/deforestation_deep_learning)
628 | * [Track-Human-Footprint-in-Amazon-using-Deep-Learning](https://github.com/sahanasub/Track-Human-Footprint-in-Amazon-using-Deep-Learning)
629 | * [Amazon-Rainforest-CNN](https://github.com/cldowdy/Amazon-Rainforest-CNN) -> uses a 3-layer CNN in Tensorflow
630 | * [rainforest-tagging](https://github.com/minggli/rainforest-tagging) -> Convolutional Neural Net and Recurrent Neural Net in Tensorflow for satellite images multi-label classification
631 | * [satellite-deforestation](https://github.com/drewhibbard/satellite-deforestation) -> Using Satellite Imagery to Identify the Leading Indicators of Deforestation, applied to the Kaggle Challenge Understanding the Amazon from Space
632 | 
633 | ### Kaggle - DSTL segmentation challenge
634 | * https://www.kaggle.com/c/dstl-satellite-imagery-feature-detection
635 | * Rating - medium, many good examples (see the Discussion as well as kernels), but as this competition was run a couple of years ago many examples use python 2
636 | * WorldView 3 - 45 satellite images covering 1km x 1km in both 3 (i.e. RGB) and 16-band (400nm - SWIR) images
637 | * 10 Labelled classes include - **Buildings, Road, Trees, Crops, Waterway, Vehicles**
638 | * [Interview with 1st place winner who used segmentation networks](http://blog.kaggle.com/2017/04/26/dstl-satellite-imagery-competition-1st-place-winners-interview-kyle-lee/) - 40+ models, each tweaked for particular target (e.g. roads, trees)
639 | * [ZF_UNET_224_Pretrained_Model 2nd place solution](https://github.com/ZFTurbo/ZF_UNET_224_Pretrained_Model) ->
640 | * [3rd place soluton](https://github.com/osin-vladimir/kaggle-satellite-imagery-feature-detection) -> which explored pansharpening & calculating reflectance indices, with [arxiv paper](https://arxiv.org/abs/1706.06169) 
641 | * [Deepsense 4th place solution](https://deepsense.ai/deep-learning-for-satellite-imagery-via-image-segmentation/)
642 | * [Entry by lopuhin](https://github.com/lopuhin/kaggle-dstl) using UNet with batch-normalization
643 | * [Multi-class semantic segmentation of satellite images using U-Net](https://github.com/rogerxujiang/dstl_unet) using DSTL dataset, tensorflow 1 & python 2.7. Accompanying [article](https://towardsdatascience.com/dstl-satellite-imagery-contest-on-kaggle-2f3ef7b8ac40)
644 | * [Deep-Satellite-Image-Segmentation](https://github.com/antoine-spahr/Deep-Satellite-Image-Segmentation)
645 | * [Dstl-Satellite-Imagery-Feature-Detection-Improved](https://github.com/dsp6414/Dstl-Satellite-Imagery-Feature-Detection-Improved)
646 | * [Satellite-imagery-feature-detection](https://github.com/ArangurenAndres/Satellite-imagery-feature-detection)
647 | * [Satellite_Image_Classification](https://github.com/aditya-sawadh/Satellite_Image_Classification) -> using XGBoost and ensemble classification methods
648 | * [Unet-for-Satellite](https://github.com/justinishikawa/Unet-for-Satellite)
649 | * [building-segmentation](https://github.com/jimpala/building-segmentation) -> TensorFlow U-Net implementation trained to segment buildings in satellite imagery
650 | 
651 | ### Kaggle - DeepSat land cover classification
652 | * https://www.kaggle.com/datasets/crawford/deepsat-sat4 & https://www.kaggle.com/datasets/crawford/deepsat-sat6
653 | * [DeepSat-Kaggle](https://github.com/athulsudheesh/DeepSat-Kaggle) -> uses Julia
654 | * [deepsat-aws-emr-pyspark](https://github.com/hellosaumil/deepsat-aws-emr-pyspark) -> Using PySpark for Image Classification on Satellite Imagery of Agricultural Terrains
655 | 
656 | ### Kaggle - Airbus ship detection challenge
657 | * https://www.kaggle.com/c/airbus-ship-detection/overview
658 | * Rating - medium, most solutions using deep-learning, many kernels, [good example kernel](https://www.kaggle.com/kmader/baseline-u-net-model-part-1)
659 | * [Detecting ships in satellite imagery: five years later…](https://medium.com/artificialis/detecting-ships-in-satellite-imagery-five-years-later-28df2e83f987)
660 | * I believe there was a problem with this dataset, which led to many complaints that the competition was ruined
661 | * [Lessons Learned from Kaggle’s Airbus Challenge](https://towardsdatascience.com/lessons-learned-from-kaggles-airbus-challenge-252e25c5efac)
662 | * [Airbus-Ship-Detection](https://github.com/kheyer/Airbus-Ship-Detection) -> This solution scored 139 out of 884 for the competition, combines ResNeXt50 based classifier and a U-net segmentation model
663 | * [Ship-Detection-Project](https://github.com/ZTong1201/Ship-Detection-Project) -> uses Mask R-CNN and UNet model
664 | * [Airbus_SDC](https://github.com/WillieMaddox/Airbus_SDC)
665 | * [Airbus_SDC_dup](https://github.com/WillieMaddox/Airbus_SDC_dup) -> Project focused on detecting duplicate regions of overlapping satellite imagery. Applied to Airbus ship detection dataset
666 | * [airbus-ship-detection](https://github.com/jancervenka/airbus-ship-detection) -> CNN with REST API
667 | * [Ship-Detection-from-Satellite-Images-using-YOLOV4](https://github.com/debasis-dotcom/Ship-Detection-from-Satellite-Images-using-YOLOV4) -> uses Kaggle Airbus Ship Detection dataset
668 | * [Image Segmentation: Kaggle experience](https://towardsdatascience.com/image-segmentation-kaggle-experience-9a41cb8924f0) -> Medium article by gold medal winner Vlad Shmyhlo
669 | 
670 | ### Kaggle - Ships in Google Earth
671 | * https://www.kaggle.com/datasets/tomluther/ships-in-google-earth
672 | * 794 jpegs showing various sized ships in satellite imagery, annotations in Pascal VOC format for object detection models
673 | * [/kaggle-ships-in-satellite-imagery-with-YOLOv8](https://github.com/robmarkcole/kaggle-ships-in-satellite-imagery-with-YOLOv8)
674 | 
675 | ### Kaggle - Classify Ships in San Franciso Bay using Planet satellite imagery
676 | * https://www.kaggle.com/datasets/rhammell/ships-in-satellite-imagery
677 | * 4000 80x80 RGB images labeled with either a "ship" or "no-ship" classification, 3 meter pixel size
678 | * [shipsnet-detector](https://github.com/rhammell/shipsnet-detector) -> Detect container ships in Planet imagery using machine learning
679 | * [DeepLearningShipDetection](https://github.com/PenguinDan/DeepLearningShipDetection)
680 | * [Ship-Detection-Using-Satellite-Imagery](https://github.com/Dhruvisha29/Ship-Detection-Using-Satellite-Imagery)
681 | 
682 | ### Kaggle - Planesnet classification dataset
683 | * https://www.kaggle.com/datasets/rhammell/planesnet -> Detect aircraft in Planet satellite image chips
684 | * 20x20 RGB images, the "plane" class includes 8000 images and the "no-plane" class includes 24000 images
685 | * [Dataset repo](https://github.com/rhammell/planesnet) and [planesnet-detector](https://github.com/rhammell/planesnet-detector) demonstrates a small CNN classifier on this dataset
686 | * [ergo-planes-detector](https://github.com/evilsocket/ergo-planes-detector) -> An ergo based project that relies on a convolutional neural network to detect airplanes from satellite imagery, uses the PlanesNet dataset
687 | * [Using AWS SageMaker/PlanesNet to process Satellite Imagery](https://github.com/kskalvar/aws-sagemaker-planesnet-imagery)
688 | * [Airplane-in-Planet-Image](https://github.com/MaxLenormand/Airplane-in-Planet-Image) -> pytorch model
689 | 
690 | ### Kaggle - CGI Planes in Satellite Imagery w/ BBoxes
691 | * https://www.kaggle.com/datasets/aceofspades914/cgi-planes-in-satellite-imagery-w-bboxes
692 | * 500 computer generated satellite images of planes
693 | * [Faster RCNN to detect airplanes](https://github.com/ShubhankarRawat/Airplane-Detection-for-Satellites)
694 | * [aircraft-detection-from-satellite-images-yolov3](https://github.com/emrekrtorun/aircraft-detection-from-satellite-images-yolov3)
695 | 
696 | ### Kaggle - Swimming pool and car detection using satellite imagery
697 | * https://www.kaggle.com/datasets/kbhartiya83/swimming-pool-and-car-detection
698 | * 3750 satellite images of residential areas with annotation data for swimming pools and cars
699 | * [Object detection on Satellite Imagery using RetinaNet](https://medium.com/@ije_good/object-detection-on-satellite-imagery-using-retinanet-part-1-training-e589975afbd5)
700 | 
701 | ### Kaggle - Draper challenge to place images in order of time
702 | * https://www.kaggle.com/c/draper-satellite-image-chronology/data
703 | * Rating - hard. Not many useful kernels.
704 | * Images are grouped into sets of five, each of which have the same setId. Each image in a set was taken on a different day (but not necessarily at the same time each day). The images for each set cover approximately the same area but are not exactly aligned.
705 | * Kaggle interviews for entrants who [used XGBOOST](http://blog.kaggle.com/2016/09/15/draper-satellite-image-chronology-machine-learning-solution-vicens-gaitan/) and a [hybrid human/ML approach](http://blog.kaggle.com/2016/09/08/draper-satellite-image-chronology-damien-soukhavong/)
706 | * [deep-cnn-sat-image-time-series](https://github.com/MickyDowns/deep-cnn-sat-image-time-series) -> uses LSTM
707 | 
708 | ### Kaggle - Dubai segmentation
709 | * https://www.kaggle.com/datasets/humansintheloop/semantic-segmentation-of-aerial-imagery
710 | * 72 satellite images of Dubai, the UAE, and is segmented into 6 classes
711 | * [dubai-satellite-imagery-segmentation](https://github.com/ayushdabra/dubai-satellite-imagery-segmentation) -> due to the small dataset, image augmentation was used
712 | * [U-Net for Semantic Segmentation on Unbalanced Aerial Imagery](https://towardsdatascience.com/u-net-for-semantic-segmentation-on-unbalanced-aerial-imagery-3474fa1d3e56) -> using the Dubai dataset
713 | * [Semantic-Segmentation-using-U-Net](https://github.com/Anay21110/Semantic-Segmentation-using-U-Net) -> uses keras
714 | * [unet_satelite_image_segmentation](https://github.com/nassimaliou/unet_satelite_image_segmentation)
715 | 
716 | ### Kaggle - Massachusetts Roads & Buildings Datasets - segmentation
717 | * https://www.kaggle.com/datasets/balraj98/massachusetts-roads-dataset
718 | * https://www.kaggle.com/datasets/balraj98/massachusetts-buildings-dataset
719 | * [Official published dataset](https://www.cs.toronto.edu/~vmnih/data/)
720 | * [Road_seg_dataset](https://github.com/parth1620/Road_seg_dataset) -> subset of the roads dataset containing only 200 images and masks
721 | * [Road and Building Semantic Segmentation in Satellite Imagery](https://github.com/Paulymorphous/Road-Segmentation) uses U-Net on the Massachusetts Roads Dataset & keras
722 | * [Semantic-segmentation repo by fuweifu-vtoo](https://github.com/fuweifu-vtoo/Semantic-segmentation) -> uses pytorch and the [Massachusetts Buildings & Roads Datasets](https://www.cs.toronto.edu/~vmnih/data/)
723 | * [ssai-cnn](https://github.com/mitmul/ssai-cnn) -> This is an implementation of Volodymyr Mnih's dissertation methods on his Massachusetts road & building dataset
724 | * [building-footprint-segmentation](https://github.com/fuzailpalnak/building-footprint-segmentation) -> pip installable library to train building footprint segmentation on satellite and aerial imagery, applied to Massachusetts Buildings Dataset and Inria Aerial Image Labeling Dataset
725 | * [Road detection using semantic segmentation and albumentations for data augmention](https://towardsdatascience.com/road-detection-using-segmentation-models-and-albumentations-libraries-on-keras-d5434eaf73a8) using the Massachusetts Roads Dataset, U-net & Keras
726 | * [Image-Segmentation)](https://github.com/mschulz/Image-Segmentation) -> using Massachusetts Road dataset and fast.ai
727 | 
728 | ### Kaggle - Deepsat classification challenge
729 | Not satellite but airborne imagery. Each sample image is 28x28 pixels and consists of 4 bands - red, green, blue and near infrared. The training and test labels are one-hot encoded 1x6 vectors. Each image patch is size normalized to 28x28 pixels. Data in `.mat` Matlab format. JPEG?
730 | * [Sat4](https://www.kaggle.com/datasets/crawford/deepsat-sat4) 500,000 image patches covering four broad land cover classes - **barren land, trees, grassland and a class that consists of all land cover classes other than the above three**
731 | * [Sat6](https://www.kaggle.com/datasets/crawford/deepsat-sat6) 405,000 image patches each of size 28x28 and covering 6 landcover classes - **barren land, trees, grassland, roads, buildings and water bodies.**
732 | 
733 | ### Kaggle - High resolution ship collections 2016 (HRSC2016)
734 | * https://www.kaggle.com/datasets/guofeng/hrsc2016
735 | * Ship images harvested from Google Earth
736 | * [HRSC2016_SOTA](https://github.com/ming71/HRSC2016_SOTA) -> Fair comparison of different algorithms on the HRSC2016 dataset
737 | 
738 | ### Kaggle - SWIM-Ship Wake Imagery Mass
739 | * https://www.kaggle.com/datasets/lilitopia/swimship-wake-imagery-mass
740 | * An optical ship wake detection benchmark dataset built for deep learning
741 | * [WakeNet](https://github.com/Lilytopia/WakeNet) -> A CNN-based optical image ship wake detector, code for 2021 paper: Rethinking Automatic Ship Wake Detection: State-of-the-Art CNN-based Wake Detection via Optical Images
742 | 
743 | ### Kaggle - Understanding Clouds from Satellite Images
744 | In this challenge, you will build a model to classify cloud organization patterns from satellite images.
745 | * https://www.kaggle.com/c/understanding_cloud_organization/
746 | * [3rd place solution on Github by naivelamb](https://github.com/naivelamb/kaggle-cloud-organization)
747 | * [15th place solution on Github by Soongja](https://github.com/Soongja/kaggle-clouds)
748 | * [69th place solution on Github by yukkyo](https://github.com/yukkyo/Kaggle-Understanding-Clouds-69th-solution)
749 | * [161st place solution on Github by michal-nahlik](https://github.com/michal-nahlik/kaggle-clouds-2019)
750 | * [Solution by yurayli](https://github.com/yurayli/satellite-cloud-segmentation)
751 | * [Solution by HazelMartindale](https://github.com/HazelMartindale/kaggle_understanding_clouds_learning_project) uses 3 versions of U-net architecture
752 | * [Solution by khornlund](https://github.com/khornlund/understanding-cloud-organization)
753 | * [Solution by Diyago](https://github.com/Diyago/Understanding-Clouds-from-Satellite-Images)
754 | * [Solution by tanishqgautam](https://github.com/tanishqgautam/Multi-Label-Segmentation-With-FastAI)
755 | 
756 | ### Kaggle - 38-Cloud Cloud Segmentation
757 | * https://www.kaggle.com/datasets/sorour/38cloud-cloud-segmentation-in-satellite-images
758 | * Contains 38 Landsat 8 images and manually extracted pixel-level ground truths
759 | * [38-Cloud Github repository](https://github.com/SorourMo/38-Cloud-A-Cloud-Segmentation-Dataset) and follow up [95-Cloud](https://github.com/SorourMo/95-Cloud-An-Extension-to-38-Cloud-Dataset) dataset
760 | * [How to create a custom Dataset / Loader in PyTorch, from Scratch, for multi-band Satellite Images Dataset from Kaggle](https://medium.com/analytics-vidhya/how-to-create-a-custom-dataset-loader-in-pytorch-from-scratch-for-multi-band-satellite-images-c5924e908edf)
761 | * [Cloud-Net: A semantic segmentation CNN for cloud detection](https://github.com/SorourMo/Cloud-Net-A-semantic-segmentation-CNN-for-cloud-detection) -> an end-to-end cloud detection algorithm for Landsat 8 imagery, trained on 38-Cloud Training Set
762 | * [Segmentation of Clouds in Satellite Images Using Deep Learning](https://medium.com/swlh/segmentation-of-clouds-in-satellite-images-using-deep-learning-a9f56e0aa83d) -> semantic segmentation using a Unet on the Kaggle 38-Cloud dataset
763 | 
764 | ### Kaggle - Airbus Aircraft Detection Dataset
765 | * https://www.kaggle.com/airbusgeo/airbus-aircrafts-sample-dataset
766 | * One hundred civilian airports and over 3000 annotated commercial aircrafts
767 | * [detecting-aircrafts-on-airbus-pleiades-imagery-with-yolov5](https://medium.com/artificialis/detecting-aircrafts-on-airbus-pleiades-imagery-with-yolov5-5f3d464b75ad)
768 | * [pytorch-remote-sensing](https://github.com/miko7879/pytorch-remote-sensing) -> Aircraft detection using the 'Airbus Aircraft Detection' dataset and Faster-RCNN with ResNet-50 backbone in pytorch
769 | 
770 | ### Kaggle - Airbus oil storage detection dataset
771 | * https://www.kaggle.com/airbusgeo/airbus-oil-storage-detection-dataset
772 | * [Oil-Storage Tank Instance Segmentation with Mask R-CNN](https://github.com/georgiosouzounis/instance-segmentation-mask-rcnn/blob/main/mask_rcnn_oiltanks_gpu.ipynb) with [accompanying article](https://medium.com/@georgios.ouzounis/oil-storage-tank-instance-segmentation-with-mask-r-cnn-77c94433045f)
773 | * [Oil Storage Detection on Airbus Imagery with YOLOX](https://medium.com/artificialis/oil-storage-detection-on-airbus-imagery-with-yolox-9e38eb6f7e62) -> uses the Kaggle Airbus Oil Storage Detection dataset
774 | * [Oil-Storage-Tanks-Data-Preparation-YOLO-Format](https://github.com/shah0nawaz/Oil-Storage-Tanks-Data-Preparation-YOLO-Format)
775 | 
776 | ### Kaggle - Satellite images of hurricane damage
777 | * https://www.kaggle.com/datasets/kmader/satellite-images-of-hurricane-damage
778 | * https://github.com/dbuscombe-usgs/HurricaneHarvey_buildingdamage
779 | 
780 | ### Kaggle - Austin Zoning Satellite Images
781 | * https://www.kaggle.com/datasets/franchenstein/austin-zoning-satellite-images
782 | * classify a images of Austin into one of its zones, such as residential, industrial, etc. 3667 satellite images
783 | 
784 | ### Kaggle - Statoil/C-CORE Iceberg Classifier Challenge
785 | Classify the target in a SAR image chip as either a ship or an iceberg. The dataset for the competition included 5000 images extracted from multichannel SAR data collected by the Sentinel-1 satellite. Top entries used ensembles to boost prediction accuracy from about 92% to 97%.
786 | * https://www.kaggle.com/c/statoil-iceberg-classifier-challenge/data
787 | * [An interview with David Austin: 1st place winner](https://pyimagesearch.com/2018/03/26/interview-david-austin-1st-place-25000-kaggles-popular-competition/)
788 | * [radar-image-recognition](https://github.com/siarez/radar-image-recognition)
789 | * [Iceberg-Classification-Using-Deep-Learning](https://github.com/mankadronit/Iceberg-Classification-Using-Deep-Learning) -> uses keras
790 | * [Deep-Learning-Project](https://github.com/singh-shakti94/Deep-Learning-Project) -> uses keras
791 | * [iceberg-classifier-challenge solution by ShehabSunny](https://github.com/ShehabSunny/iceberg-classifier-challenge) -> uses keras
792 | * [Analyzing Satellite Radar Imagery with Deep Learning](https://uk.mathworks.com/company/newsletters/articles/analyzing-satellite-radar-imagery-with-deep-learning.html) -> by Matlab, uses ensemble with greedy search
793 | * [16th place solution](https://github.com/sergeyshilin/kaggle-statoil-iceberg-classifier-challenge)
794 | * [fastai solution](https://github.com/smarkochev/ds_notebooks/blob/master/Statoil_Kaggle_competition_google_colab_notebook.ipynb)
795 | 
796 | ### Kaggle - Land Cover Classification Dataset from DeepGlobe Challenge - segmentation
797 | * https://www.kaggle.com/datasets/balraj98/deepglobe-land-cover-classification-dataset
798 | * [Satellite Imagery Semantic Segmentation with CNN](https://joshting.medium.com/satellite-imagery-segmentation-with-convolutional-neural-networks-f9254de3b907) -> 7 different segmentation classes, DeepGlobe Land Cover Classification Challenge dataset, with [repo](https://github.com/justjoshtings/satellite_image_segmentation)
799 | * [Land Cover Classification with U-Net](https://baratam-tarunkumar.medium.com/land-cover-classification-with-u-net-aa618ea64a1b) -> Satellite Image Multi-Class Semantic Segmentation Task with PyTorch Implementation of U-Net, uses DeepGlobe Land Cover Segmentation dataset, with [code](https://github.com/TarunKumar1995-glitch/land_cover_classification_unet)
800 | * [DeepGlobe Land Cover Classification Challenge solution](https://github.com/GeneralLi95/deepglobe_land_cover_classification_with_deeplabv3plus)
801 | 
802 | ### Kaggle - Next Day Wildfire Spread
803 | A Data Set to Predict Wildfire Spreading from Remote-Sensing Data
804 | * https://www.kaggle.com/datasets/fantineh/next-day-wildfire-spread
805 | * https://arxiv.org/abs/2112.02447
806 | 
807 | ### Kaggle - Satellite Next Day Wildfire Spread
808 | Inspired by the above dataset, using different data sources
809 | * https://www.kaggle.com/datasets/satellitevu/satellite-next-day-wildfire-spread
810 | * https://github.com/SatelliteVu/SatelliteVu-AWS-Disaster-Response-Hackathon
811 | 
812 | ## Kaggle - Spacenet 7 Multi-Temporal Urban Change Detection
813 | * https://www.kaggle.com/datasets/amerii/spacenet-7-multitemporal-urban-development
814 | * [SatFootprint](https://github.com/PriyanK7n/SatFootprint) -> building segmentation on the Spacenet 7 dataset
815 | 
816 | ## Kaggle - Satellite Images to predict poverty in Africa
817 | * https://www.kaggle.com/datasets/sandeshbhat/satellite-images-to-predict-povertyafrica
818 | * Uses satellite imagery and nightlights data to predict poverty levels at a local level
819 | * [Predicting-Poverty](https://github.com/jmather625/predicting-poverty-replication) -> Combining satellite imagery and machine learning to predict poverty, in PyTorch
820 | 
821 | ## Kaggle - NOAA Fisheries Steller Sea Lion Population Count
822 | * https://www.kaggle.com/competitions/noaa-fisheries-steller-sea-lion-population-count -> count sea lions from aerial images
823 | * [Sealion-counting](https://github.com/babyformula/Sealion-counting)
824 | * [Sealion_Detection_Classification](https://github.com/yyc9268/Sealion_Detection_Classification)
825 | 
826 | ## Kaggle - Arctic Sea Ice Image Masking
827 | * https://www.kaggle.com/datasets/alexandersylvester/arctic-sea-ice-image-masking
828 | * [sea_ice_remote_sensing](https://github.com/sum1lim/sea_ice_remote_sensing)
829 | 
830 | ## Kaggle - Overhead-MNIST
831 | * A Benchmark Satellite Dataset as Drop-In Replacement for MNIST
832 | * https://www.kaggle.com/datasets/datamunge/overheadmnist -> kaggle
833 | * https://arxiv.org/abs/2102.04266 -> paper
834 | * https://github.com/reveondivad/ov-mnist -> github
835 | 
836 | ## Kaggle - Satellite Image Classification
837 | * https://www.kaggle.com/datasets/mahmoudreda55/satellite-image-classification
838 | * [satellite-image-classification-pytorch](https://github.com/dilaraozdemir/satellite-image-classification-pytorch)
839 | 
840 | ## Kaggle - EuroSAT - Sentinel-2 Dataset
841 | * https://www.kaggle.com/datasets/raoofnaushad/eurosat-sentinel2-dataset
842 | * RGB Land Cover and Land Use Classification using Sentinel-2 Satellite
843 | * Used in paper [Image Augmentation for Satellite Images](https://arxiv.org/abs/2207.14580)
844 | 
845 | ## Kaggle - Satellite Images of Water Bodies
846 | * https://www.kaggle.com/datasets/franciscoescobar/satellite-images-of-water-bodies
847 | * [pytorch-waterbody-segmentation](https://github.com/gauthamk02/pytorch-waterbody-segmentation) -> UNET model trained on the Satellite Images of Water Bodies dataset from Kaggle. The model is deployed on Hugging Face Spaces
848 | 
849 | ## Kaggle - NOAA sea lion count
850 | * https://www.kaggle.com/c/noaa-fisheries-steller-sea-lion-population-count
851 | * [noaa](https://github.com/darraghdog/noaa) -> UNET, object detection and image level regression approaches
852 | 
853 | ### Kaggle - miscellaneous
854 | * https://www.kaggle.com/datasets/reubencpereira/spatial-data-repo -> Satellite + loan data
855 | * https://www.kaggle.com/datasets/towardsentropy/oil-storage-tanks -> Image data of industrial oil tanks with bounding box annotations, estimate tank fill % from shadows
856 | * https://www.kaggle.com/datasets/airbusgeo/airbus-wind-turbines-patches -> Airbus SPOT satellites images over wind turbines for classification
857 | * https://www.kaggle.com/datasets/aceofspades914/cgi-planes-in-satellite-imagery-w-bboxes -> CGI planes object detection dataset
858 | * https://www.kaggle.com/datasets/atilol/aerialimageryforroofsegmentation -> Aerial Imagery for Roof Segmentation
859 | * https://www.kaggle.com/datasets/andrewmvd/ship-detection -> 621 images of boats and ships
860 | * https://www.kaggle.com/datasets/alpereniek/vehicle-detection-from-satellite-images-data-set
861 | * https://www.kaggle.com/datasets/sergiishchus/maxar-satellite-data -> Example Maxar data at 15 cm resolution
862 | * https://www.kaggle.com/datasets/cici118/swimming-pool-detection-algarves-landscape
863 | * https://www.kaggle.com/datasets/donkroco/solar-panel-module -> object detection for solar panels
864 | * https://www.kaggle.com/datasets/balraj98/deepglobe-road-extraction-dataset -> segment roads
865 | * https://www.kaggle.com/datasets/towardsentropy/oil-storage-tanks -> Image data of industrial Oil Storage Tanks with bounding box annotations
866 | * https://www.kaggle.com/competitions/widsdatathon2019/ -> Palm oil plantations
867 | * https://www.kaggle.com/datasets/siddharthkumarsah/ships-in-aerial-images -> Ships/Vessels in Aerial Images
868 | * https://www.kaggle.com/datasets/jangsienicajzkowy/afo-aerial-dataset-of-floating-objects -> Aerial dataset for maritime Search and Rescue applications
869 | * https://www.kaggle.com/datasets/yaroslavnaychuk/satelliteimagesegmentation -> Segmentation on Gaofen Satellite Image, extracted from GID-15 dataset
870 | 
871 | # Competitions
872 | Competitions are an excellent source for accessing clean, ready-to-use satellite datasets and model benchmarks.  
873 | 
874 | * https://codalab.lisn.upsaclay.fr/competitions/9603 -> object detection from diversified satellite imagery
875 | * https://www.drivendata.org/competitions/143/tick-tick-bloom/ -> detect and classify algal bloom
876 | * https://www.drivendata.org/competitions/81/detect-flood-water/ -> map floodwater from radar imagery
877 | * https://platform.ai4eo.eu/enhanced-sentinel2-agriculture -> map cultivated land using Sentinel imagery
878 | * https://www.diu.mil/ai-xview-challenge -> multiple challenges ranging from detecting fishing vessals to estimating building damages
879 | * https://competitions.codalab.org/competitions/30440 -> flood detection
880 | * https://www.drivendata.org/competitions/83/cloud-cover/ -> cloud cover detection
881 | * https://www.drivendata.org/competitions/78/overhead-geopose-challenge/page/372/ -> predicts geocentric pose from single-view oblique satellite images
882 | * https://www.drivendata.org/competitions/60/building-segmentation-disaster-resilience/ -> building segmentation
883 | * https://captain-whu.github.io/DOTA/ -> large dataset for object detection in aerial imagery
884 | * https://spacenet.ai/ -> set of 8 challenges such as road network detection
885 | * https://huggingface.co/spaces/competitions/ChaBuD-ECML-PKDD2023 -> binary image segmentation task on forest fires monitored over California
886 | <!-- markdown-link-check-disable -->
887 | * https://spaceml.org/repo/project/6269285b14d764000d798fde -> ML for floods
888 | * https://spaceml.org/repo/project/60002402f5647f00129f7287 -> lightning and extreme weather
889 | * https://spaceml.org/repo/project/6025107d79c197001219c481/true -> ~1TB dataset for precipitation forecasting
890 | * https://spaceml.org/repo/project/61c0a1b9ff8868000dfb79e1/true -> Sentinel-2 image super-resolution
891 | <!-- markdown-link-check-enable --
892 | 


--------------------------------------------------------------------------------
/logo.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/satellite-image-deep-learning/datasets/8dd0136f922c73226d4052c1be9060d46d4ed3f5/logo.png


--------------------------------------------------------------------------------