Raster processing using Python Tools

Introduction: Working with Raster Data


Teaching: 15 min
Exercises: 0 min
  • What is a raster?

  • What sorts of information does a raster typically model?

  • What are the major charateristics of a raster dataset?

  • What assumptions does the format imply?

  • Understand the raster data model


Scientists working with spatial data often need to manipulate datasets structured as arrays. A common example is in working with hydrological modelling (e.g. where will water flow in a storm event?) using topographical data gathered from the US Geological Survey, or based on aerial or lidar photography gathered from a plane flying below the clouds.

Common types of Geospatial data

Remote sensing is a major source for raster graphics.

Characteristics of a Raster:

Data Model

Rasters use a space-filling model, where all geographic features are represented by discreet cells, arranged in a specific sequence. Geospatial raster datasets are effectively matrices with additional metadata to show which part of the planet the dataset represents.


Bands are layers of values that overlap perfectly. Depending on the source of your raster and the information it represents, you may have multiple bands with numeric values that represent different things. A DEM might have a single band where values represent the height of a pixel relative to sea level. Or you could use a georeferenced natural-color aerial photograph using three bands, one each for Red, Green, and Blue. Raw LANDSAT data contains several other bands as well:

Bands of LANDSAT Thematic Mapper (TM) imagery
Graphic labelling six bands of LANDSAT aerial photography
Source: NASA

NoData Value

Since raster datasets use a space-filling model, every pixel has a value. Where pixel values should not have a value, a Nodata Value is used to indicate a lack of information in a pixel. Visually, this is often represented in GIS as transparency.

NW corner of /data/landcover.tif
Northwest corner os sample landcover raster demonstrating nodata

Coordinate Reference System

Raster datasets are fundamentally images, where each pixel has a value. The spatial element of this dataset, however, lies in its Coordinate Reference System (CRS). Here’s an example CRS, represented as Well-Known text :

PROJCS["WGS 84 / UTM zone 11N",                         # Name of the projected coordinate system
    GEOGCS["WGS 84",                                    # Geographic coordinate system
        DATUM["WGS_1984",                               # The datum (mathematical shape of the earth)
            SPHEROID["WGS 84",6378137,298.257223563,
    PROJECTION["Transverse_Mercator"],                  # Specific projection

This collection of numbers and standards describes how to transform the 3-dimensional surface of the earth into a 2-dimensional image. Getting projections right is a tricky business, and one that can take a great deal of expertise. ASTER DEMs come in the WGS84 geographic coordinate system, but are unprojected. The datasets included in this tutorial are projected into UTM zone 11N.

The projection you choose can make a very big difference in the display of information. Here’s an ASTER DEM from the Sierra Nevada mountains in California with different projections:

Unprojected Projected in UTM Zone 11N Projected in North Pole LAEA Alaska
Unprojected DEM UTM zone 11N North Pole LAEA

Affine GeoTransform

Unfortunately, the CRS by itself is not enough to place the raster on the planet. To do this, we need the Affine Geotransform, which allows us to map pixel coordinates into georeferenced space.

GT = (233025.03117445827, 30.0, 0.0, 4210078.842723392, 0.0, -30.0)
Xgeo = GT(0) + Xpixel*GT(1) + Yline*GT(2)
Ygeo = GT(3) + Xpixel*GT(4) + Yline*GT(5)

Note that this significantly affects how pixels are arranged in space. ‘Up’ could be either an increase or decrease in the row index, depending on the geotransform, and the image could be rotated.

Many raster datasets have square or near-square pixels as well, but this is not a strict requirement. The geotransform can support rectangular pixels of arbitrary size.

Challenges with handling geospatial data

Key Points