├── images
    ├── example.jpg
    ├── bldgmetrics.JPG
    ├── polygonization.jpg
    ├── segmentation.jpg
    ├── small_building_example.jpg
    └── connected_buildings_example.JPG
├── LICENCE-DATA
├── CODE_OF_CONDUCT.md
├── SECURITY.md
└── README.md

/images/example.jpg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/microsoft/AustraliaBuildingFootprints/HEAD/images/example.jpg
--------------------------------------------------------------------------------
/images/bldgmetrics.JPG:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/microsoft/AustraliaBuildingFootprints/HEAD/images/bldgmetrics.JPG
--------------------------------------------------------------------------------
/images/polygonization.jpg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/microsoft/AustraliaBuildingFootprints/HEAD/images/polygonization.jpg
--------------------------------------------------------------------------------
/images/segmentation.jpg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/microsoft/AustraliaBuildingFootprints/HEAD/images/segmentation.jpg
--------------------------------------------------------------------------------
/LICENCE-DATA:
--------------------------------------------------------------------------------
Data in this repository has been licensed by Microsoft under the Open Data Commons Open Database License (ODbL).
--------------------------------------------------------------------------------
/images/small_building_example.jpg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/microsoft/AustraliaBuildingFootprints/HEAD/images/small_building_example.jpg
--------------------------------------------------------------------------------
/images/connected_buildings_example.JPG:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/microsoft/AustraliaBuildingFootprints/HEAD/images/connected_buildings_example.JPG
--------------------------------------------------------------------------------
/CODE_OF_CONDUCT.md:
--------------------------------------------------------------------------------
# Microsoft Open Source Code of Conduct

This project has adopted the [Microsoft Open Source Code of Conduct](https://opensource.microsoft.com/codeofconduct/).

Resources:

- [Microsoft Open Source Code of Conduct](https://opensource.microsoft.com/codeofconduct/)
- [Microsoft Code of Conduct FAQ](https://opensource.microsoft.com/codeofconduct/faq/)
- Contact [opencode@microsoft.com](mailto:opencode@microsoft.com) with questions or concerns

--------------------------------------------------------------------------------
/SECURITY.md:
--------------------------------------------------------------------------------
## Security

Microsoft takes the security of our software products and services seriously, which includes all source code repositories managed through our GitHub organizations, which include [Microsoft](https://github.com/Microsoft), [Azure](https://github.com/Azure), [DotNet](https://github.com/dotnet), [AspNet](https://github.com/aspnet), [Xamarin](https://github.com/xamarin), and [our GitHub organizations](https://opensource.microsoft.com/).

If you believe you have found a security vulnerability in any Microsoft-owned repository that meets [Microsoft's definition of a security vulnerability](https://docs.microsoft.com/en-us/previous-versions/tn-archive/cc751383(v=technet.10)), please report it to us as described below.

## Reporting Security Issues

**Please do not report security vulnerabilities through public GitHub issues.**

Instead, please report them to the Microsoft Security Response Center (MSRC) at [https://msrc.microsoft.com/create-report](https://msrc.microsoft.com/create-report).

If you prefer to submit without logging in, send email to [secure@microsoft.com](mailto:secure@microsoft.com). If possible, encrypt your message with our PGP key; please download it from the [Microsoft Security Response Center PGP Key page](https://www.microsoft.com/en-us/msrc/pgp-key-msrc).

You should receive a response within 24 hours. If for some reason you do not, please follow up via email to ensure we received your original message. Additional information can be found at [microsoft.com/msrc](https://www.microsoft.com/msrc).

Please include the requested information listed below (as much as you can provide) to help us better understand the nature and scope of the possible issue:

  * Type of issue (e.g. buffer overflow, SQL injection, cross-site scripting, etc.)
  * Full paths of source file(s) related to the manifestation of the issue
  * The location of the affected source code (tag/branch/commit or direct URL)
  * Any special configuration required to reproduce the issue
  * Step-by-step instructions to reproduce the issue
  * Proof-of-concept or exploit code (if possible)
  * Impact of the issue, including how an attacker might exploit the issue

This information will help us triage your report more quickly.

If you are reporting for a bug bounty, more complete reports can contribute to a higher bounty award. Please visit our [Microsoft Bug Bounty Program](https://microsoft.com/msrc/bounty) page for more details about our active programs.

## Preferred Languages

We prefer all communications to be in English.

## Policy

Microsoft follows the principle of [Coordinated Vulnerability Disclosure](https://www.microsoft.com/en-us/msrc/cvd).

--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
## Introduction
Bing Maps is releasing a country-wide open building footprints dataset for Australia. This dataset contains 11,334,866 computer-generated building footprints derived using Bing Maps algorithms on satellite imagery. The satellite imagery used for extraction comes from our imagery partners, Maxar Technologies among others. The data is freely available for download and use under the applicable license.

![](/images/example.jpg)

## License
This data is licensed by Microsoft under the [Open Data Commons Open Database License (ODbL)](https://opendatacommons.org/licenses/odbl/).

## FAQ
### What does the data include?
11,334,866 building footprint polygon geometries in Australia. You may download the data in GeoJSON format here:

| Country | Number of Buildings | Zipped MB | Unzipped MB |
| ------------- |:-------------:|:-----:|:-----:|
| [Australia](https://usbuildingdata.blob.core.windows.net/australia-buildings/Australia_2020-06-21.geojson.zip)|11,334,866|845|6,410|

### What is the GeoJSON format?
GeoJSON is a format for encoding a variety of geographic data structures.
For extensive documentation and tutorials, refer to the [GeoJSON website](http://geojson.org/).

### Why is the data being released?
Microsoft has a continued interest in supporting a thriving OpenStreetMap ecosystem.

### Should we import the data into OpenStreetMap?
Maybe. Never overwrite the hard work of other contributors or blindly import data into OSM without first checking the local quality. While our metrics show that this data meets or exceeds the quality of hand-drawn building footprints, the data does vary in quality from place to place, between rural and urban, mountains and plains, and so on. Inspect quality locally and discuss an import plan with the community. Always follow the [OSM import community guidelines](https://wiki.openstreetmap.org/wiki/Import/Guidelines).

### Will the data be used or made available in the larger OpenStreetMap ecosystem?
Yes. Currently, the Microsoft Open Buildings dataset is used in ml-enabler for task creation. You can try it out at [AI assisted Tasking Manager](https://tasks-assisted.hotosm.org/). The data will also be made available in Facebook [RapiD](https://mapwith.ai/rapid#background=Bing&disable_features=boundaries&map=2.00/0.0/0.0).

### What is the creation process for this data?
The building extraction is done in two stages:
1. Semantic Segmentation – Recognizing building pixels on the aerial image using DNNs
2. Polygonization – Converting building pixel blobs into polygons

#### Stage 1: Semantic Segmentation
![](/images/segmentation.jpg)

#### Stage 2: Polygonization
![](/images/polygonization.jpg)

### Are there any technical improvements in this round compared with previous ones?
To train models for Australia we only had a few thousand building labels, which made it hard to rely only on supervised training. Typically we’ve used hundreds of thousands or, in the best case, tens of millions of building labels for training.
In order to create a good and robust model for Australia we took advantage of self-supervised training and unsupervised domain adaptation techniques to leverage our training data from other countries and domains. We believe this is a good proof of concept for scaling building extraction to the whole world.

### Evaluation set metrics
The Australia evaluation set contains 6,785 buildings from several diverse and representative regions.

Building match metrics on the evaluation set:

| Metric | Value |
| --- | :---: |
| Precision | 98.59% |
| Recall | 64.95% |

We track the following metrics to measure the quality of matched buildings in the evaluation set:
1. Intersection over Union – This is a standard metric measuring the overlap quality against the labels
2. Shape distance – With this metric we measure the polygon outline similarity
3. Dominant angle rotation error – This measures the polygon rotation deviation

| IoU | Shape distance | Rotation error [deg] |
| :---: | :---: | :---: |
| 0.79 | 0.44 | 4.46 |

![](/images/bldgmetrics.JPG)

### False positive ratio in the corpus

We estimate a false positive ratio of ~1%, based on 1,000 buildings randomly sampled from the entire output corpus.

### Evaluation recall error space
Correctly detecting connected buildings and small buildings is sometimes difficult, even for a human labeller. There are often ambiguities in whether one is looking at multiple connected buildings or a single fragmented building. Similarly, it is sometimes hard to decide whether a small object should be classified as a building or not.

Output precision and recall metrics are calculated after optimal 1-to-1 matching between output polygons and labels, scored by polygon intersection over union. The labels are usually very granular, whilst it is sometimes very hard for the DNN model to separate connected buildings.
This results in a significant ratio of unmatched false negatives, which pushes the recall down.

| Error category | 35.05% Gap |
| --- | :---: |
| Very small buildings | 15.4% |
| Connected buildings | 14.0% |
| DNN | 2.8% |
| Various | 2.1% |
| Polygonization | 0.7% |

Small building example:
![](/images/small_building_example.jpg)

Connected buildings example:
![](/images/connected_buildings_example.JPG)

### What is the vintage of this data?
The vintage of the extracted building footprints depends on the vintage of the underlying imagery. Bing Imagery is a composite of multiple sources, so it is difficult to know the exact dates for individual pieces of data. However, we believe the vintage is anywhere from 2013 to 2018, with the majority being from 2018.

### How good is the data?
Our metrics show that in the vast majority of cases the quality is at least as good as hand-digitized buildings in OpenStreetMap. It is not perfect, particularly in dense urban areas, but it provides good recall in rural areas.

### What is the coordinate reference system?
EPSG:4326

### Will there be more data coming for other geographies?
Maybe. This is a work in progress.

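To make the GeoJSON and EPSG:4326 answers above concrete, here is a minimal Python sketch of reading footprints and computing their bounding boxes. It assumes the download is a GeoJSON `FeatureCollection` whose features carry `Polygon` geometries; the README only states "polygon geometries in GeoJSON format", so treat that layout as an assumption. The `SAMPLE` coordinates and the `footprint_bounds` helper are hypothetical and made up for illustration; they are not taken from the dataset.

```python
import json

# Hypothetical two-feature sample shaped like a GeoJSON FeatureCollection.
# The coordinates are invented for illustration, not taken from the dataset.
SAMPLE = """
{
  "type": "FeatureCollection",
  "features": [
    {"type": "Feature", "properties": {},
     "geometry": {"type": "Polygon", "coordinates": [[
       [151.2000, -33.8700], [151.2002, -33.8700],
       [151.2002, -33.8698], [151.2000, -33.8698],
       [151.2000, -33.8700]]]}},
    {"type": "Feature", "properties": {},
     "geometry": {"type": "Polygon", "coordinates": [[
       [115.8600, -31.9500], [115.8603, -31.9500],
       [115.8603, -31.9497], [115.8600, -31.9497],
       [115.8600, -31.9500]]]}}
  ]
}
"""


def footprint_bounds(feature):
    """Return (min_lon, min_lat, max_lon, max_lat) for a Polygon feature."""
    geometry = feature["geometry"]
    if geometry["type"] != "Polygon":
        raise ValueError(f"expected Polygon, got {geometry['type']}")
    exterior = geometry["coordinates"][0]  # the first ring is the exterior
    lons = [position[0] for position in exterior]  # GeoJSON order is [lon, lat]
    lats = [position[1] for position in exterior]
    return min(lons), min(lats), max(lons), max(lats)


collection = json.loads(SAMPLE)
bounds = [footprint_bounds(feature) for feature in collection["features"]]

for index, box in enumerate(bounds):
    print(index, box)
```

GeoJSON positions are ordered longitude first, then latitude, and the values here are in degrees because the data uses EPSG:4326. At roughly 6,410 MB unzipped, the full file may not fit comfortably in memory with `json.loads`, so a streaming parser is probably a better fit for the real download.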

## Contributing

This project welcomes contributions and suggestions. Most contributions require you to agree to a
Contributor License Agreement (CLA) declaring that you have the right to, and actually do, grant us
the rights to use your contribution. For details, visit https://cla.microsoft.com.

When you submit a pull request, a CLA-bot will automatically determine whether you need to provide
a CLA and decorate the PR appropriately (e.g., label, comment). Simply follow the instructions
provided by the bot. You will only need to do this once across all repos using our CLA.

This project has adopted the [Microsoft Open Source Code of Conduct](https://opensource.microsoft.com/codeofconduct/).
For more information see the [Code of Conduct FAQ](https://opensource.microsoft.com/codeofconduct/faq/) or
contact [opencode@microsoft.com](mailto:opencode@microsoft.com) with any additional questions or comments.

## Legal Notices

Microsoft, Windows, Microsoft Azure and/or other Microsoft products and services referenced in the documentation
may be either trademarks or registered trademarks of Microsoft in the United States and/or other countries.
The licenses for this project do not grant you rights to use any Microsoft names, logos, or trademarks.
Microsoft's general trademark guidelines can be found [here](http://go.microsoft.com/fwlink/?LinkID=254653).

Privacy information can be found [here](https://privacy.microsoft.com/en-us/).

Microsoft and any contributors reserve all other rights, whether under their respective copyrights, patents,
or trademarks, whether by implication, estoppel or otherwise.

--------------------------------------------------------------------------------