AI Atlas Intro
By Grant Pearse
What This Is
The AI atlas is an experiment in training and applying deep learning models to New Zealand’s open aerial imagery and serving the results using open-source mapping technologies. My main motivation was that I thought it would be fun to build and I would learn a lot in the process. Both true so far.
Deep learning (AI) has unlocked all kinds of remote sensing applications that seemed impossible just a few years ago and it has really captured my interest. At the same time, cloud-native geospatial technologies have made it much easier to work with large volumes of data. To try and keep up, I wanted a project that would blend deep learning, geospatial and cloud technologies. Trying to build a demo of an AI-powered atlas of New Zealand seemed like a good fit.
I plan to run the site for a year or two and try to do the following:
- Get at least a few models working at scale using little or no training data from NZ.
- Set up an automated workflow for inference and spatial post-processing (a rough sketch of the post-processing step follows this list).
- Build some NZ-specific models from scratch using the new tools and models that promise much faster dataset creation.
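The post-processing half of that workflow is mostly standard geospatial plumbing. Below is a minimal sketch of one piece of it: merging overlapping detections produced by overlapping inference tiles. The GeoDataFrame, column names and file names are hypothetical, not the project's actual code.

```python
# Minimal sketch: merge detections that straddle the overlap between
# adjacent inference tiles so each object appears only once.
import geopandas as gpd


def merge_tile_detections(detections: gpd.GeoDataFrame, min_score: float = 0.5) -> gpd.GeoDataFrame:
    """Drop low-confidence boxes, then dissolve overlapping geometries."""
    kept = detections[detections["score"] >= min_score]
    # unary_union merges overlapping polygons; explode splits the merged
    # result back into one row per object.
    merged = gpd.GeoDataFrame(geometry=[kept.geometry.unary_union], crs=kept.crs)
    return merged.explode(index_parts=False, ignore_index=True)


# Example usage with detections written out by a tiled inference run:
# detections = gpd.read_file("vehicle_detections.gpkg")
# merge_tile_detections(detections).to_file("vehicles_merged.gpkg")
```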
What This Isn’t
- This is in no way intended to be an authoritative or up-to-date data source.
- The predictions are not ‘production-grade’ - the failure cases are often the most interesting.
- This is not a basemap tile service. I rely fully on the LINZ services to stream the underlying imagery. A big thank you to LINZ for granting me developer API access.
- This is not a commercial service and has nothing to do with my employer or any other service or company.
New Zealand’s Aerial Basemap
New Zealand has an amazing collection of open aerial imagery. The datasets are paid for by different local authorities for planning and monitoring. Land Information New Zealand (LINZ), our national spatial and geodesy agency, continuously gathers these datasets and publishes them under an open license on the LINZ Data Service.
LINZ also combines the imagery into a national aerial basemap using an open-source system you can find on GitHub. The team at LINZ gave an excellent presentation on their cloud-based processing chain at the 2022 FOSS4G meetup in Auckland. Things to note about the basemap:
- Imagery is generated by overlaying the datasets covering an area in a defined order - usually placing the newest imagery on top.
- The resolution varies quite a bit, with most, but not all, urban areas having much higher-resolution imagery (7.5 cm - 12.5 cm) than rural areas (20 cm - 75 cm).
- The whole process is done on AWS, leveraging technologies like AWS Lambda, STAC and Argo for Kubernetes to ingest the datasets and dynamically serve basemap imagery via APIs for [multiple tile grids and formats](https://www.linz.govt.nz/guidance/data-service/linz-basemaps-guide/how-use-linz-basemaps-apis) (a minimal tile request is sketched below).
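As a deliberately minimal example, the snippet below fetches a single aerial tile over the XYZ endpoint. The URL template and the `api` query parameter follow the LINZ Basemaps documentation linked above at the time of writing, but treat them as an assumption and check the docs; the tile indices and key are placeholders.

```python
# Illustrative only: request one aerial basemap tile over XYZ.
import requests

# Assumed template based on the LINZ Basemaps docs; verify before use.
TILE_URL = "https://basemaps.linz.govt.nz/v1/tiles/aerial/WebMercatorQuad/{z}/{x}/{y}.webp"


def fetch_tile(z: int, x: int, y: int, api_key: str) -> bytes:
    """Return the raw tile bytes for the given tile coordinates."""
    resp = requests.get(TILE_URL.format(z=z, x=x, y=y), params={"api": api_key}, timeout=30)
    resp.raise_for_status()
    return resp.content


# tile_bytes = fetch_tile(16, 64612, 39740, "YOUR_LINZ_API_KEY")
```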
The final product is an eye-pleasing composite mosaic covering New Zealand. The image below shows the blending of datasets with different resolutions and capture dates. More thanks to the LINZ team for describing this process to me and documenting it on GitHub.
Sources of Error
New Imagery, Old Predictions
New imagery is constantly added to the basemap. This means that the predictions in these areas are no longer current. Automating the inference pipeline is on the roadmap but for now the updates are manual and infrequent. This kind of error shows up as detections in likely areas (e.g. vehicle detections on roads or in parking lots) but with nothing underneath. I try to highlight these areas by marking them as ‘expired’ on the web map until I have time to process the new imagery.
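The ‘expired’ flag is conceptually simple: compare the date the predictions were generated against the capture date of whatever imagery now covers that area. The sketch below shows the idea with hypothetical layer and column names (`predicted_on`, `capture_date`), not the site's actual pipeline.

```python
# Sketch: flag detections whose underlying imagery has been replaced since inference.
import geopandas as gpd
import pandas as pd


def flag_expired(detections: gpd.GeoDataFrame, footprints: gpd.GeoDataFrame) -> gpd.GeoDataFrame:
    """Spatially join detections to imagery footprints and compare dates."""
    joined = gpd.sjoin(
        detections,
        footprints[["capture_date", "geometry"]],
        how="left",
        predicate="intersects",
    )
    joined["expired"] = pd.to_datetime(joined["capture_date"]) > pd.to_datetime(joined["predicted_on"])
    return joined


# detections = gpd.read_file("detections.gpkg")           # includes 'predicted_on'
# footprints = gpd.read_file("basemap_footprints.gpkg")   # includes 'capture_date'
# flag_expired(detections, footprints).to_file("detections_flagged.gpkg")
```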

False Positives and False Negatives
This is very much an experiment and the outputs are not edited for quality, so incorrect detections are easy to find. The failure cases are often interesting because they highlight differences between the source imagery and the NZ aerial imagery that aren’t always obvious. One of the great things about deep learning is that model capacity is typically large enough to absorb corrections, so the errors become the starting point for adding valuable examples to the dataset.
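One practical way to act on that (a sketch under assumed column names, not necessarily how this project does it) is to pull out the detections the model was least sure about and queue them for review and annotation:

```python
# Sketch: select uncertain detections as candidates for review/labelling.
import geopandas as gpd


def select_review_candidates(detections: gpd.GeoDataFrame, low: float = 0.3, high: float = 0.6) -> gpd.GeoDataFrame:
    """Mid-confidence detections are often the most informative to label."""
    mask = (detections["score"] >= low) & (detections["score"] < high)
    return detections[mask].copy()


# select_review_candidates(gpd.read_file("detections.gpkg")).to_file("review_candidates.gpkg")
```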
Upsampled Imagery
During inference I run much larger images through the model than the image chips served by the LINZ basemap tile service. Any lower-resolution imagery within a large tile’s extent gets upsampled to the highest resolution present in that extent. This can cause false positives and false negatives for models trained mostly on high-resolution data, and it is especially common at the borders between urban and rural datasets in the basemap.
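The snippet below illustrates the mechanism rather than the project's code: reading a tile at one fixed target resolution forces any lower-resolution source pixels inside that extent to be resampled up to match. The file name and the 7.5 cm target are illustrative.

```python
# Sketch: read an inference tile at a fixed target resolution, which
# upsamples any lower-resolution source imagery within its extent.
import rasterio
from rasterio.enums import Resampling

TARGET_RES = 0.075  # metres per pixel (the highest resolution in the tile)

with rasterio.open("inference_tile.tif") as src:
    scale = src.res[0] / TARGET_RES  # > 1 when the source is coarser than the target
    out_shape = (src.count, int(src.height * scale), int(src.width * scale))
    # Bilinear resampling smooths the upsampled pixels, but the model still
    # sees less real detail than genuinely high-resolution imagery provides.
    data = src.read(out_shape=out_shape, resampling=Resampling.bilinear)
```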

Datasets
Deep learning needs lots and lots of training data, and collecting these datasets can be expensive and time-consuming. Thankfully, there is a huge selection of public remote sensing datasets available for a range of tasks. I highly recommend the ‘awesome lists’ by Robin Cole and Christoph Rieke as good places to start.
Training and Inference
This is very much a hobby project, so resources are limited. I use a mix of AWS and GCP for training and my underpowered and overworked RTX 3060 for inference.