Geographic Data#

import hvplot.pandas  # noqa
import hvplot.xarray  # noqa
import xarray as xr

from bokeh.sampledata.airport_routes import airports

Introduction#

hvPlot can display geographic data and can be used to add contextualisation layers:

  • Overlay the data over a tiled web map

  • Overlay the data over geographic/administrative features

  • Plot the data in a specific projection

Many, but not all, of these features require GeoViews to be installed, which is the geographic extension of HoloViews built on Cartopy and pyproj. You can install GeoViews with pip install geoviews or conda install geoviews.

Before plotting geographical data, you should know in which coordinate system the data is represented. This is important for a few reasons:

  • The data may have to be projected to another coordinate system, and to do so you may have to provide the original coordinate system.

  • Projecting data can be a computationally expensive operation and you should be aware of when this operation happens, to perhaps optimize it.

Here are two common coordinate systems you will encounter when dealing with geographic data and using hvPlot:

  • Data represented by latitude/longitude (lat/lon) coordinates are often tied to the WGS84 geographic coordinate system (EPSG:4326), that’s how GPS data are located on the Earth. For example, the coordinates of Paris in this system are (lat=48.856697, lon=2.351462).

  • Tiled web maps are displayed in the Web Mercator projection (also known as Pseudo-Mercator or Google Mercator, EPSG:3857). Unlike WGS84 which expresses coordinates in lat/long decimal degrees, Web Mercator expresses them in easting (X) /northing (Y) metric units. For example, the coordinates of Paris in this system are (easting=261763.552, northing=6250580.761).

Supported plot types#

Only certain hvPlot types support geographic coordinates, currently including: points, polygons, paths, image, quadmesh, contour, and contourf.

Tiled web map#

A tiled web map, or tile map, is an interactive map displayed on a web browser that is divided into small, pre-rendered image tiles, allowing for efficient loading and seamless navigation across various zoom levels and geographic areas. hvPlot allows us to add a tile map as a basemap to a plot with the tiles parameter. Importantly, tiles is a parameter that can be used without installing GeoViews.

We’ll display this dataframe of all US airports (including military bases overseas), the points are expressed in latitude/longitude coordinates:

airports.head(3)
AirportID Name City Country IATA ICAO Latitude Longitude Altitude Timezone DST TZ Type source
0 3411 Barter Island LRRS Airport Barter Island United States BTI PABA 70.134003 -143.582001 2 -9 A America/Anchorage airport OurAirports
1 3413 Cape Lisburne LRRS Airport Cape Lisburne United States LUR PALU 68.875099 -166.110001 16 -9 A America/Anchorage airport OurAirports
2 3414 Point Lay LRRS Airport Point Lay United States PIZ PPIZ 69.732903 -163.005005 22 -9 A America/Anchorage airport OurAirports

We’ll first start by displaying the airports without GeoViews with tiles by setting tiles=True.

Under the hood, hvPlot projects lat/lon to easting/northing (EPSG:4326 to EPSG:3857) coordinates without additional package dependencies if it detects that the values falls within expected lat/lon ranges, unless the data is lazily loaded (dask / ibis), or the data is a spatialpandas object.

Note, this feature is only available after hvplot>=0.11.0; older versions, hvplot<0.11.0, require manual projection (see below).

airports.hvplot.points('Longitude', 'Latitude', tiles=True, color='red', alpha=0.2)

If you do not want it to be auto-projected under the hood for any reason, set projection=False, but note it will not be properly placed on the tiles without manual intervention. In doing so, the data will be plotted around the null island location, roughly 600km off the coast of West Africa. padding was added to demonstrate this.

airports.hvplot.points('Longitude', 'Latitude', tiles=True, projection=False, color='red', alpha=0.2, padding=10000)

To manually project use the lon_lat_to_easting_northing function provided by HoloViews, that doesn’t require installing any of the usual geo dependencies like pyproj or cartopy.

from holoviews.util.transform import lon_lat_to_easting_northing

airports['x'], airports['y'] = lon_lat_to_easting_northing(airports.Longitude, airports.Latitude)
airports[['Latitude', 'Longitude', 'x', 'y']].head(3)
airports.hvplot.points('x', 'y', tiles=True, color='red', alpha=0.2)

When GeoViews is installed, you can set geo=True to let hvPlot know that it can leverage it to display the data. You can directly plot the airports without having to project the data yourself, GeoViews will take care of projecting it to Web Mercator automatically.

airports.hvplot.points(
    'Longitude', 'Latitude', geo=True, tiles=True, color='red', alpha=0.2
)

The easiest ways to set tiles include:

  • to True to get the default layer which currently is from OpenStreetMap (prefer one of the explicit approaches below if you care about reproducibility)

  • with an xyzservices.TileProvider instance that can be created from the optional dependency xyzservices which gives you access to hundreds of tiled web maps

  • with the name of one the layers shipped by HoloViews and GeoViews (when it’s installed), like 'EsriTerrain'.

The tile map can be additionally configured by setting tiles_opts with a dictionary of options that will be applied to the basemap layer.

import xyzservices.providers as xyz
airports.hvplot.points(
    'x', 'y', tiles=xyz.Esri.WorldPhysical, tiles_opts={'alpha': 0.5}, color='red', alpha=0.2
)

Data and plot projections#

import cartopy.crs as ccrs 

As we have seen in the previous section, you can set geo=True to let hvPlot know that your data is geographic. GeoViews must be installed when setting geo=True, or when any of the other geographic options is set, except tiles. In this mode hvPlot:

  1. Assumes the data is expressed in lat/lon coordinates. Internally, hvPlot assumes a Plate Carrée projection, also known as the equirectangular projection.

  2. Displays the data in the projection Plate Carrée assumed in step (1), or, if tiles=True, projects and displays the data in the Web Mercator projection.

The input data projection assumed in step (1), and the output plot projection used in step (2) can both be overridden with the crs (for Coordinate Reference System) and projection parameters, respectively. Both parameters accept cartopy CRS objects (see the full list of values they accept in their definition). You can now see that that setting geo=True, tiles=True is strictly equivalent to setting crs=ccrs.PlateCarree(), projection=ccrs.GOOGLE_MERCATOR! Coordinate reference systems are described in more details in the GeoViews documentation and the full list of available CRSs is in the cartopy documentation.

For example, we can set projection to display the airports on an Orthographic projection centered over North America.

airports.hvplot.points(
    'Longitude', 'Latitude', color='red', alpha=0.2,
    coastline=True, projection=ccrs.Orthographic(-90, 30)
)

Raster data#

air_ds = xr.tutorial.open_dataset('air_temperature').load()
air_ds
<xarray.Dataset> Size: 31MB
Dimensions:  (lat: 25, time: 2920, lon: 53)
Coordinates:
  * lat      (lat) float32 100B 75.0 72.5 70.0 67.5 65.0 ... 22.5 20.0 17.5 15.0
  * lon      (lon) float32 212B 200.0 202.5 205.0 207.5 ... 325.0 327.5 330.0
  * time     (time) datetime64[ns] 23kB 2013-01-01 ... 2014-12-31T18:00:00
Data variables:
    air      (time, lat, lon) float64 31MB 241.2 242.5 243.5 ... 296.2 295.7
Attributes:
    Conventions:  COARDS
    title:        4x daily NMC reanalysis (1948)
    description:  Data is from NMC initialized reanalysis\n(4x/day).  These a...
    platform:     Model
    references:   http://www.esrl.noaa.gov/psd/data/gridded/data.ncep.reanaly...

When displaying raster data in a projection other than the one in which the data is stored, it is more accurate to render it as a quadmesh rather than an image. As you can see below, a QuadMesh will project each original bin or pixel into the correct non-rectangular shape determined by the projection, accurately showing the geographic extent covered by each sample. An Image, on the other hand, will always be rectangularly aligned in the 2D plane, which requires warping and resampling the data in a way that allows efficient display but loses accuracy at the pixel level.

air_ds.isel(time=0).hvplot.quadmesh('lon', 'lat', 'air', projection=ccrs.LambertConformal())

Unfortunately, rendering a large QuadMesh using Bokeh can be very slow, but there are two useful alternatives for datasets too large to be practical as native QuadMeshes.

The first is using the rasterize or datashade options to regrid the data before rendering it, i.e., rendering the data on the backend and then sending a more efficient image-based representation to the browser. One thing to note when using these operations is that it may be necessary to project the data before rasterizing it, e.g. to address wrapping issues. To do this provide project=True, which will project the data before it is rasterized (this also works for other types and even when not using these operations). Another reason why this is important when rasterizing the data is that if the CRS of the data does not match the displayed projection, all the data will be projected every time you zoom or pan, which can be very slow. Deciding whether to project is therefore a tradeoff between projecting the raw data ahead of time or accepting the overhead on dynamic zoom and pan actions.

rasm = xr.tutorial.open_dataset('rasm').load().isel(time=0)
rasm
<xarray.Dataset> Size: 1MB
Dimensions:  (y: 205, x: 275)
Coordinates:
    time     object 8B 1980-09-16 12:00:00
    xc       (y, x) float64 451kB 189.2 189.4 189.6 189.7 ... 17.4 17.15 16.91
    yc       (y, x) float64 451kB 16.53 16.78 17.02 17.27 ... 28.01 27.76 27.51
Dimensions without coordinates: y, x
Data variables:
    Tair     (y, x) float64 451kB nan nan nan nan ... 27.91 27.02 26.56 26.73
Attributes:
    title:                     /workspace/jhamman/processed/R1002RBRxaaa01a/l...
    institution:               U.W.
    source:                    RACM R1002RBRxaaa01a
    output_frequency:          daily
    output_mode:               averaged
    convention:                CF-1.4
    references:                Based on the initial model of Liang et al., 19...
    comment:                   Output from the Variable Infiltration Capacity...
    nco_openmp_thread_number:  1
    NCO:                       netCDF Operators version 4.7.9 (Homepage = htt...
    history:                   Fri Aug  7 17:57:38 2020: ncatted -a bounds,,d...
rasm.hvplot.quadmesh(
    'xc', 'yc', crs=ccrs.PlateCarree(), projection=ccrs.GOOGLE_MERCATOR,
    tiles=xyz.Esri.WorldGrayCanvas, project=True, rasterize=True,
    xlim=(-20, 40), ylim=(30, 70), cmap='viridis', frame_width=400,
)

The second option that’s still relatively slow for larger data but avoids sending large data into your browser is to plot the data using contour and contourf visualizations, generating a line or filled contour with a discrete number of levels:

rasm.hvplot.contourf(
    'xc', 'yc', crs=ccrs.PlateCarree(), projection=ccrs.GOOGLE_MERCATOR,
    tiles=xyz.Esri.WorldGrayCanvas, levels=10,
    xlim=(-20, 40), ylim=(30, 70), cmap='viridis', frame_width=400,
)

Geopandas usage#

Since a GeoPandas DataFrame is just a Pandas DataFrames with additional geographic information, it inherits the .hvplot method. We can thus easily load geographic files (GeoJSON, Shapefiles, etc) and plot them on a map:

import geopandas as gpd
import geodatasets
path = geodatasets.get_path("geoda.nyc_neighborhoods")
path
'/home/runner/.cache/geodatasets/nycnhood_acs.zip'
nyc = gpd.read_file(path)
nyc.head(3)
UEMPRATE cartodb_id borocode withssi withsocial withpubass struggling profession popunemplo poptot ... boroname popdty ntacode medianinco medianagem medianagef medianage HHsize gini geometry
0 0.095785 1 3 652 5067 277 6421 889 2225 48351 ... Brooklyn 497498.701 BK45 1520979 663.3 777.1 722.6 2.96421052631579 0.386315789473684 POLYGON ((-73.91716 40.63173, -73.91722 40.631...
1 0.090011 2 3 2089 7132 1016 10981 1075 2652 61584 ... Brooklyn 589296.926 BK17 1054259 791.4 868.5 827.6 2.46578947368421 0.448089473684211 POLYGON ((-73.91809 40.58657, -73.91813 40.586...
2 0.130393 3 3 3231 8847 2891 21235 712 6483 100130 ... Brooklyn 1506628.84 BK61 980637 863.1 983.9 923.8 2.42925925925926 0.473666666666667 POLYGON ((-73.92165 40.67887, -73.92171 40.678...

3 rows × 99 columns

nyc.hvplot(geo=True, tiles=True, c="poptot", alpha=0.8, tiles_opts={'alpha': 0.5})

The GeoPandas support allows plotting GeoDataFrames containing 'Point', 'Polygon', 'LineString' and 'LineRing' geometries, but not ones containing a mixture of different geometry types. Calling .hvplot will automatically figure out the geometry type to plot, but it also possible to call .hvplot.points, .hvplot.polygons, and .hvplot.paths explicitly.

Spatialpandas usage#

Spatialpandas is another powerful library for working with geometries and is optimized for rendering with datashader, making it possible to plot millions of individual geometries very quickly:

import spatialpandas as spd

spd_nyc = spd.GeoDataFrame(nyc)
spd_nyc.hvplot(
    datashade=True, project=True, aggregator='count_cat',
    c='boroname', color_key='Category10'
)

As you can see, hvPlot makes it simple to work with geographic data visually. For more complex plot types and additional details, see the GeoViews documentation.

Options#

The API provides various geo-specific options:

  • coastline (default=False): Whether to display a coastline on top of the plot, setting coastline='10m'/'50m'/'110m' specifies a specific scale

  • crs (default=None): Coordinate reference system of the data (input projection) specified as a string or integer EPSG code, a CRS or Proj pyproj object, a Cartopy CRS object or class name, a WKT string, or a proj.4 string. Defaults to PlateCarree.

  • features features (default=None): A list of features or a dictionary of features and the scale at which to render it. Available features include ‘borders’, ‘coastline’, ‘lakes’, ‘land’, ‘ocean’, ‘rivers’ and ‘states’. Available scales include ‘10m’/’50m’/’110m’.

  • geo (default=False): Whether the plot should be treated as geographic (and assume PlateCarree, i.e. lat/lon coordinates)

  • global_extent (default=False): Whether to expand the plot extent to span the whole globe

  • project (default=False): Whether to project the data before plotting (adds initial overhead but avoids projecting data when plot is dynamically updated)

  • projection (default=None): Coordinate reference system of the plot (output projection) specified as a string or integer EPSG code, a CRS or Proj pyproj object, a Cartopy CRS object or class name, a WKT string, or a proj.4 string. Defaults to PlateCarree.

  • tiles (default=False): Whether to overlay the plot on a tile source. If coordinate values fall within lat/lon bounds, auto-projects to EPSG:3857 unless projection=False or if the data is lazily loaded (dask / ibis). Accepts the following values:

    • True: OpenStreetMap layer

    • xyzservices.TileProvider instance (requires xyzservices to be installed)

    • a map string name based on one of the default layers made available by HoloViews (‘CartoDark’, ‘CartoLight’, ‘EsriImagery’, ‘EsriNatGeo’, ‘EsriUSATopo’, ‘EsriTerrain’, ‘EsriStreet’, ‘EsriReference’, ‘OSM’, ‘OpenTopoMap’) or GeoViews (‘CartoDark’, ‘CartoEco’, ‘CartoLight’, ‘CartoMidnight’, ‘EsriImagery’, ‘EsriNatGeo’, ‘EsriUSATopo’, ‘EsriTerrain’, ‘EsriReference’, ‘EsriOceanBase’, ‘EsriOceanReference’, ‘EsriWorldPhysical’, ‘EsriWorldShadedRelief’, ‘EsriWorldTopo’, ‘EsriWorldDarkGrayBase’, ‘EsriWorldDarkGrayReference’, ‘EsriWorldLightGrayBase’, ‘EsriWorldLightGrayReference’, ‘EsriWorldHillshadeDark’, ‘EsriWorldHillshade’, ‘EsriAntarcticImagery’, ‘EsriArcticImagery’, ‘EsriArcticOceanBase’, ‘EsriArcticOceanReference’, ‘EsriWorldBoundariesAndPlaces’, ‘EsriWorldBoundariesAndPlacesAlternate’, ‘EsriWorldTransportation’, ‘EsriDelormeWorldBaseMap’, ‘EsriWorldNavigationCharts’, ‘EsriWorldStreetMap’, ‘OSM’, ‘OpenTopoMap’). Note that Stamen tile sources require a Stadia account when not running locally; see stadiamaps.com.

    • a holoviews.Tiles or geoviews.WMTS instance or class

  • tiles_opts (default=None): Dictionary of plotting options to customize the tiles layer created when tiles is set, e.g. dict(alpha=0.5).

This web page was generated from a Jupyter notebook and not all interactivity will work on this website.