Overview
Teaching: 4 min Exercises: 0 min
Questions
What are common ways to encode vector geospatial data in Python, and how do they relate to broader encoding standards?
Objectives
Learn about common community standards (broader than Python) for vector data encoding, and how they’re implemented in core Python libraries.
Learn to transform from one encoding type to another.
Learn the GeoJSON format and exchange encoding interface, including the __geo_interface__ protocol implemented across libraries.
Standards exist for text (and binary) encoding of the geometry and feature types we saw. These include OGC/ISO Well-Known Text (WKT) and Well-Known Binary (WKB), OGC Geography Markup Language (GML), and GeoJSON (based on JSON, and now an IETF standard, RFC 7946). GML is a complex XML-based encoding that we won’t discuss much here. We also won’t touch on WKB in much detail. In this tutorial we’ll focus on WKT and especially GeoJSON.
WKT text representations of geometric entities are fairly intuitive. Here are some examples of the basic geometric entities we saw earlier (where coordinates are ordered as X Y):
POINT(0 0)
LINESTRING(0 0, 1 1, 1 2)
POLYGON((0 0, 4 0, 4 4, 0 4, 0 0), (1 1, 2 1, 2 2, 1 2, 1 1))
MULTIPOINT((0 0), (1 2))
WKT is especially common in interactions with spatially enabled relational databases, and also as a readable representation of internal binary encodings.
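To make the WKT layout concrete, here is a minimal sketch that builds WKT strings from GeoJSON-like geometry dictionaries using only the standard library. The helper name `geometry_to_wkt` is hypothetical and covers just three geometry types; in practice a library such as Shapely would normally handle this conversion.

```python
def _coords_to_text(coords):
    """Format a sequence of (x, y) pairs as 'x y, x y, ...'."""
    return ", ".join(f"{x} {y}" for x, y in coords)


def geometry_to_wkt(geom):
    """Convert a GeoJSON-like geometry dict to a WKT string.

    Hypothetical helper for illustration only: it handles 2D Point,
    LineString and Polygon geometries and nothing else.
    """
    kind = geom["type"]
    coords = geom["coordinates"]
    if kind == "Point":
        x, y = coords
        return f"POINT({x} {y})"
    if kind == "LineString":
        return f"LINESTRING({_coords_to_text(coords)})"
    if kind == "Polygon":
        # A polygon is a list of rings: the exterior first, then any holes.
        rings = ", ".join(f"({_coords_to_text(ring)})" for ring in coords)
        return f"POLYGON({rings})"
    raise ValueError(f"unsupported geometry type: {kind}")


line = {"type": "LineString", "coordinates": [(0, 0), (1, 1), (1, 2)]}
print(geometry_to_wkt(line))
```

Applied to the line above, the function reproduces the LINESTRING example shown earlier.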
__geo_interface__ interface
GeoJSON arose as a community convention leveraging the ubiquity of JSON encoding, especially on the web. Being JSON, it maps easily into Python dictionaries. See http://geojson.org, including examples at http://geojson.org/geojson-spec.html#examples. Its straightforward data structure has made it popular for data exchange in web applications and for passing data between software libraries. For large datasets, however, GeoJSON is a poor choice for geospatial data storage because it is a verbose, text-based format.
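As a minimal sketch of how GeoJSON maps onto Python dictionaries, the standard-library `json` module is enough to move between the text encoding and a `dict`. The coordinates below are made up for illustration.

```python
import json

# A GeoJSON geometry as it might arrive from a file or a web service.
geojson_text = '{"type": "Point", "coordinates": [-122.3, 47.6]}'

# Decoding gives an ordinary Python dictionary.
geometry = json.loads(geojson_text)
lon, lat = geometry["coordinates"]
print(geometry["type"], lon, lat)

# Encoding the dictionary again gives GeoJSON text.
round_tripped = json.dumps(geometry)
```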
The Python __geo_interface__ protocol (an attribute that returns a GeoJSON-like dictionary) has been widely adopted for passing feature data between code packages as GeoJSON-like objects (basically, Python dictionaries). We’ll explore this protocol in the next two episodes. To read more about it, see GeoJSON and the geo interface for Python, https://gist.github.com/sgillies/2217756, and Processing vector features in Python. For converting to and from convenient, Pythonic, GeoJSON-like objects, the lightweight geojson package is a good option.
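The idea behind __geo_interface__ can be sketched without any geospatial library: a producer exposes the attribute, and a consumer reads it without caring which package the object came from. The `Station` class and `as_geometry_dict` function below are hypothetical names invented for this illustration.

```python
class Station:
    """Hypothetical point-like object; not part of any library."""

    def __init__(self, name, lon, lat):
        self.name = name
        self.lon = lon
        self.lat = lat

    @property
    def __geo_interface__(self):
        # Expose the object as a GeoJSON-like geometry mapping.
        return {"type": "Point", "coordinates": (self.lon, self.lat)}


def as_geometry_dict(obj):
    """Return a GeoJSON-like dict from a dict or a __geo_interface__ provider."""
    if isinstance(obj, dict):
        return obj
    geo = getattr(obj, "__geo_interface__", None)
    if geo is None:
        raise TypeError("object does not provide __geo_interface__")
    return geo


station = Station("example", 1.0, 2.0)
print(as_geometry_dict(station))
```

Because the consumer depends only on the attribute, any object that provides it can be handled by the same function.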
GeoJSON Example of a point feature collection with two features:
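A minimal sketch of such a collection, written as a Python dictionary and serialized with the standard-library `json` module. The coordinates and property names are made up for illustration and are not taken from the original lesson.

```python
import json

# Two point features wrapped in a FeatureCollection (illustrative values).
feature_collection = {
    "type": "FeatureCollection",
    "features": [
        {
            "type": "Feature",
            "geometry": {"type": "Point", "coordinates": [0.0, 0.0]},
            "properties": {"name": "first point"},
        },
        {
            "type": "Feature",
            "geometry": {"type": "Point", "coordinates": [1.0, 2.0]},
            "properties": {"name": "second point"},
        },
    ],
}

# Serialize to GeoJSON text; indent only affects readability.
geojson_text = json.dumps(feature_collection, indent=2)
print(geojson_text)
```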
Key Points
Python vector packages implement community standards for vector encoding. While these can seem complex, tools exist for conversion into various forms, and many of the tools include common interfaces for seamless exchange of data across tools.
Packages exist for easily reading data from file-based and other serialized data formats.
Most of these packages require several steps to handle data, such as looping over features one at a time, and most tools do only one or two things.
GeoPandas addresses these challenges by enabling operations on feature collections in one step and bundling multiple tools via a coherent interface that builds on Pandas.