Making a mosaic of Landsat scenes
We just published a story about trash from mainland China ending up on Hong Kong's beaches.
Josh Horowitz a reporter in our Hong Kong bureau was working on the story and had cooked up this diagram based on his reporting.
The basic premise is that trash from the cities north of Hong Kong was flowing out south and west into the ocean, per usual after storms, but atypical winds were pushing that trash back towards Hong Kong.
I thought it was important to show the extent of the urban areas as that's where the trash was coming from. As you can see in Josh’s annotated google map you don’t get a good sense of how large or sprawling those urban areas are. This meant that relying on outlines from shapefiles (basically creating a prettier google map) wasn't going to cut it. We needed to be able to see the cities. We needed to use satellite images.
While I had done this before, processing satellite images is not a simple task. As to help me, my colleagues, and you in the future here’s a run through on how I made it:
1. Search for Landsat 8 data that covers the area around Hong Kong
The USGS has a website called EarthExplorer that lets you search through decades of satellite data. I limited my search to Landsat 8 data with less than 10% cloud cover.
(you can do this search on the command line with landsat-util too but i prefer the web interface. In the future I will probably use this online tool from Development Seed)
2. Copy the scene IDs
Sometimes you only have one, but in our case our area of focus spanned four scenes, which is where all the complexity comes from. Multiple scenes means having to match colors and hide seams where the scenes are stitched together.
I'm doing this for journalism, so I need repeatable processes. I used a Makefile in this case and set a variable for the list of IDs
LANDSAT_IDS = \ LC81220442016038LGN00 \ LC81220452016038LGN00 \ LC81210442014281LGN00 \ LC81210452014281LGN00
(Landsat IDs aren't random numbers, they're encoded with information about the scene. Here are the component parts
L C 8 121 045 2014 281 LGN 00
Which means, in order: Landsat, using OLI and TIRS, on satellite 8, captured at path 121, row 045, in 2014, on the 281 julian day of the year, downloaded at a LGN facility, with an archive version number of 00. You can learn more about that here. What's useful to know is that you can see these scenes are all adjacent because they're on path 121 or 122 and row 44 or 45)
3. Use landsat-util to download the scenes
landsat-util is a great piece of command line software that will download landsat scenes in just the way you want them. I only wanted the bands to make a natural color image so I used this command
landsat download LC81220442016038LGN00 --bands 4328
My makefile did this (and all the other commands below) for every scene based on the list of IDs
4. Use gdal to make 432 composites of the scenes
If you only have one scene you can use landsat-util for this, but because I need to color match all of them I used gdal so that there'd be no contrast adjustment.
gdal_merge.py \ -co "PHOTOMETRIC=RGB" \ -separate ~/landsat/downloads/LC81220442016038LGN00/LC81220442016038LGN00_B{4,3,2}.TIF \ -o LC81220442016038LGN00_rgb.tif
5. Resize and reproject the scenes
If you didn't know, these files are enormous: over 100MB each, in order to stitch them together I needed to reduce their size to 1000px wide to make sure I had enough memory.
At the same time I reprojected the scenes so that they'd be oriented how I wanted. I used UTM 50.
gdalwarp \ -co "PHOTOMETRIC=RGB" \ -t_srs '+proj=utm +zone=50 +datum=WGS84' \ -ts 1000 0 \ -r cubic \ LC81220442016038LGN00_rgb.tif \ LC81220442016038LGN00_rgb_resized_utm50.tif
6. Stitch all the images together
If you're stitching together two scenes from the same path you could probably use schooner-stitch in most cases
schooner-stitch \ LC81220442016038LGN00_rgb_resized_utm50.tif \ LC81220452016038LGN00_rgb_resized_utm50.tif \ stiched_432.tiff
Here's what those results looked like with my four scenes stitched together (after using some other tools to adjust the contrast on each one individually)
The seams are pretty apparent, despite being blended.
Here’s one where the contrast adjustment was done after the stitching (you can see how scenes on the same path—i.e. captures that immediately followed each other—blend much more easily than those on adjacent paths)
n.b. if you get an error from schooner-stitch referencing OpenCV it's probably because you don't have enough RAM to support the operation. You'll need to reduce the size of your images.
So I tried Photoshop's photomerge tool instead, here are where that put the seams:
As you’ll see in a moment, it's way better. Even in the ocean, they're tough to pick out.
7. Adjust the colors of the entire mosaic
Because I haven’t made any contrast adjustments yet the mosaic is still very dark.
Sensors on satellites have a huge range of sensitivity, the brightness of the clouds is vastly different than the brightness of the surface, thus the earth looks really dark, even in the day. So I add a bunch of adjustments to mosaic as layers
Again this allows me to be journalistically responsible by making sure my edits are reversible and non-destructive.
This left me with my final mosaic
8. Annotate
And here's how I used it in the the final graphic explaining the story:
Read all about it here http://qz.com/725498/heres-how-huge-amounts-of-trash-from-the-pearl-river-delta-washed-up-on-hong-kongs-shores/
And check out my complete Makefile
—David Yanofsky with thanks to Rob Simmon, Jeff Larson and Alex Kappel














