Microsoft Generates 125 million Building Footprints using AI and Deep Learning

by | Jul 2, 2018

Microsoft has announced the availability of approximately 125 million building footprint polygon geometries covering all 50 US states in an open source GeoJSON format. Using a two-step process centered on artificial intelligence (AI), deep learning, and computer vision, the Microsoft Maps team extracted 124,885,597 footprints in the United States. By comparison, OpenStreetMap currently holds 30,567,953 building footprints in the US (at last count), both from editor contributions and from various city- or county-wide imports. Bing is making this data available for download free of charge. MapShaper is a good tool for importing the data.
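For a quick look at the data without a GIS tool, the footprints can be read with Python's standard json module. The sketch below assumes each file is a GeoJSON FeatureCollection of Polygon features; the inline SAMPLE and the function name summarize_footprints are made up for illustration, and a real state file is far larger and may call for a streaming parser.

```python
import json

# Hypothetical one-building stand-in for a state file (coordinates are lon, lat).
SAMPLE = """
{"type": "FeatureCollection", "features": [
  {"type": "Feature", "properties": {},
   "geometry": {"type": "Polygon", "coordinates": [[
     [-97.0, 30.0], [-97.0, 30.0001], [-96.9999, 30.0001],
     [-96.9999, 30.0], [-97.0, 30.0]]]}}
]}
"""


def summarize_footprints(geojson_text):
    """Return (polygon_count, bbox) for a GeoJSON FeatureCollection string.

    bbox is (min_lon, min_lat, max_lon, max_lat), or None if no polygons exist.
    """
    collection = json.loads(geojson_text)
    count = 0
    lons, lats = [], []
    for feature in collection["features"]:
        geometry = feature["geometry"]
        if geometry["type"] != "Polygon":
            continue  # skip anything that is not a simple footprint polygon
        count += 1
        exterior_ring = geometry["coordinates"][0]
        for lon, lat in exterior_ring:
            lons.append(lon)
            lats.append(lat)
    bbox = (min(lons), min(lats), max(lons), max(lats)) if lons else None
    return count, bbox


count, bbox = summarize_footprints(SAMPLE)
print(count, bbox)
```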

The Maps team relied on the open source CNTK Unified Toolkit, which was developed by Microsoft. Using CNTK, Microsoft applied a deep neural network (DNN), ResNet34 with RefineNet up-sampling layers, to detect building footprints from the Bing imagery.

The building extraction was done in two stages:

  1. Semantic Segmentation – Recognizing building pixels on the aerial image using DNNs
  2. Polygonization – Converting building pixel blobs into polygons
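To make the hand-off between the two stages concrete, here is a minimal sketch of how a per-pixel building mask could be grouped into "blobs" before polygonization. It assumes the segmentation output has already been reduced to a binary grid and uses plain 4-connected flood fill; the function name label_blobs is hypothetical, and nothing here is taken from Microsoft's pipeline.

```python
from collections import deque


def label_blobs(mask):
    """Group 4-connected truthy cells of a 2D grid into blobs.

    Returns a list of blobs, each a list of (row, col) pixel coordinates.
    """
    rows = len(mask)
    cols = len(mask[0]) if rows else 0
    seen = [[False] * cols for _ in range(rows)]
    blobs = []
    for r in range(rows):
        for c in range(cols):
            if not mask[r][c] or seen[r][c]:
                continue
            # Breadth-first flood fill starting from an unvisited building pixel.
            queue = deque([(r, c)])
            seen[r][c] = True
            blob = []
            while queue:
                y, x = queue.popleft()
                blob.append((y, x))
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    inside = 0 <= ny < rows and 0 <= nx < cols
                    if inside and mask[ny][nx] and not seen[ny][nx]:
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            blobs.append(blob)
    return blobs


# Two separate "buildings" in a tiny 3x4 mask.
mask = [
    [1, 1, 0, 0],
    [1, 1, 0, 1],
    [0, 0, 0, 1],
]
blobs = label_blobs(mask)
print(len(blobs), sorted(len(b) for b in blobs))
```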

1. Semantic Segmentation

Training details

The training set consisted of 5 million labeled images. A majority of the satellite images covered diverse residential areas in the US. For the sake of good set representation, the dataset was enriched with samples from various areas covering mountains, glaciers, forests, deserts, beaches, coasts, etc. Images in the set are 256×256 pixels in size with a resolution of 1 ft/pixel. The training was done with the CNTK toolkit using 32 GPUs.
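The tile size and resolution above translate into a ground footprint per training image. The short calculation below works that out; the function name tile_ground_size is hypothetical, and the only outside fact used is the definition of the international foot (0.3048 m).

```python
FEET_TO_METERS = 0.3048  # exact, by definition of the international foot


def tile_ground_size(pixels_per_side=256, feet_per_pixel=1.0):
    """Return (side in feet, side in meters, area in square meters) for one tile."""
    side_ft = pixels_per_side * feet_per_pixel
    side_m = side_ft * FEET_TO_METERS
    return side_ft, side_m, side_m ** 2


side_ft, side_m, area_m2 = tile_ground_size()
print(f"{side_ft:.0f} ft per side = {side_m:.2f} m per side, {area_m2:.0f} m^2 per tile")
```

Each training image therefore covers a square roughly 78 m on a side, enough to contain a handful of typical houses.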

Read more about the semantic segmentation process used.

2. Polygonization

Method description

Microsoft developed a method that approximates the prediction pixels with polygons, making decisions based on the whole prediction feature space. This is very different from standard approaches, e.g. the Douglas-Peucker algorithm, which are greedy in nature. The method tries to impose some a priori building properties, which are, at the moment, manually defined and automatically tuned. Some of these a priori properties are:
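For reference, the standard approach mentioned above can be sketched in a few lines. This is a plain textbook Douglas-Peucker simplification of an open polyline, not Microsoft's method; the tolerance value and sample outline are made up. It keeps the point farthest from the current chord and recurses on each side, so every decision is local to one segment rather than informed by the whole shape.

```python
import math


def _point_segment_distance(p, a, b):
    """Shortest distance from point p to the segment a-b."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    if dx == 0 and dy == 0:
        return math.hypot(px - ax, py - ay)
    # Project p onto the segment and clamp to its endpoints.
    t = ((px - ax) * dx + (py - ay) * dy) / (dx * dx + dy * dy)
    t = max(0.0, min(1.0, t))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))


def douglas_peucker(points, tolerance):
    """Simplify a polyline, dropping points within `tolerance` of the kept chords."""
    if len(points) < 3:
        return list(points)
    first, last = points[0], points[-1]
    index, max_dist = 0, 0.0
    for i in range(1, len(points) - 1):
        d = _point_segment_distance(points[i], first, last)
        if d > max_dist:
            index, max_dist = i, d
    if max_dist <= tolerance:
        return [first, last]
    # Keep the farthest point and simplify each half independently.
    left = douglas_peucker(points[: index + 1], tolerance)
    right = douglas_peucker(points[index:], tolerance)
    return left[:-1] + right


# A noisy L-shaped wall trace; small wiggles are removed, the corner is kept.
outline = [(0, 0), (1, 0.05), (2, 0), (2, 1), (2.05, 2), (2, 3)]
print(douglas_peucker(outline, tolerance=0.1))
```

Nothing in this procedure knows that buildings tend to have right angles or a minimum wall length, which is the gap the priors listed next are meant to fill.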

  1. The building edge must be of at least some length, both relative and absolute, e.g. 3 meters
  2. Consecutive edge angles are likely to be 90 degrees
  3. Consecutive angles cannot be very sharp, i.e. smaller than some auto-tuned threshold, e.g. 30 degrees
  4. A building likely has very few dominant angles, meaning all building edges form an angle of (dominant angle ± nπ/2)
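The four properties above can be checked on a candidate polygon with a little vector arithmetic. The sketch below is only an illustration of what the priors measure, not the fitting method itself: it assumes planar coordinates in meters, a ring without a repeated closing vertex, and the example thresholds from the list (3 meters, 30 degrees). All function names are hypothetical.

```python
import math


def edge_vectors(ring):
    """Edge i runs from ring[i] to ring[i + 1], wrapping around at the end."""
    n = len(ring)
    return [
        (ring[(i + 1) % n][0] - ring[i][0], ring[(i + 1) % n][1] - ring[i][1])
        for i in range(n)
    ]


def corner_angles_deg(ring):
    """Angle (0..180 degrees) between the two edges meeting at each vertex."""
    edges = edge_vectors(ring)
    angles = []
    for i in range(len(edges)):
        ax, ay = edges[i - 1]  # edge arriving at vertex i
        bx, by = edges[i]      # edge leaving vertex i
        # Reverse the arriving edge so both vectors point away from the vertex.
        dot = (-ax) * bx + (-ay) * by
        cross = (-ax) * by - (-ay) * bx
        angles.append(abs(math.degrees(math.atan2(cross, dot))))
    return angles


def check_priors(ring, min_edge_m=3.0, min_angle_deg=30.0):
    """Report how a polygon fares against the four building priors."""
    edges = edge_vectors(ring)
    lengths = [math.hypot(dx, dy) for dx, dy in edges]
    angles = corner_angles_deg(ring)
    # Fold edge directions into [0, 90): parallel and perpendicular edges
    # land on the same value, which plays the role of the dominant angle.
    folded = [math.degrees(math.atan2(dy, dx)) % 90.0 for dx, dy in edges]
    return {
        "min_edge_ok": min(lengths) >= min_edge_m,                          # prior 1
        "max_deviation_from_90": max(abs(a - 90.0) for a in angles),        # prior 2
        "no_sharp_corners": min(angles) >= min_angle_deg,                   # prior 3
        "folded_directions": folded,                                        # prior 4
    }


rectangle = [(0, 0), (10, 0), (10, 6), (0, 6)]  # a 10 m x 6 m building
sliver = [(0, 0), (10, 0), (10, 1)]             # thin triangle with a sharp corner
print(check_priors(rectangle))
print(check_priors(sliver))
```

The rectangle passes every check with all folded directions at (or within rounding of) a single value, while the sliver fails both the minimum edge length and the sharp-corner test.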

Microsoft plans to deduce this information automatically in the near future using existing building information.

More Information

Data Vintage

The vintage of the footprints depends on the vintage of the underlying imagery. Because Bing Imagery is a composite of multiple sources it is difficult to know the exact dates for individual pieces of data.

How good is the data?

Metrics show that in the vast majority of cases the quality is at least as good as that of hand-digitized buildings in OpenStreetMap. It is not perfect, particularly in dense urban areas, but it is still awesome.

Read the full GitHub release from Microsoft.



Eric Pimpler
Eric is the founder and owner of GeoSpatial Training Services and has over 25 years of experience implementing and teaching GIS solutions using ESRI, Google Earth/Maps, and open source technology. Currently Eric focuses on ArcGIS scripting with Python, and the development of custom ArcGIS Server web and mobile applications using JavaScript. Eric is the author of Programming ArcGIS with Python Cookbook - 1st and 2nd Edition, Building Web and Mobile ArcGIS Server Applications with JavaScript, Spatial Analytics with ArcGIS, and ArcGIS Blueprints. Eric has a Bachelor’s degree in Geography from Texas A&M University and a Master's of Applied Geography degree with a concentration in GIS from Texas State University.
