Geographic Data#
import hvplot.pandas # noqa
import hvplot.xarray # noqa
import xarray as xr
from bokeh.sampledata.airport_routes import airports
Introduction#
hvPlot can display geographic data and can be used to add contextualisation layers:
Overlay the data over a tiled web map
Overlay the data over geographic/administrative features
Plot the data in a specific projection
Many, but not all, of these features require GeoViews to be installed, which is the geographic extension of HoloViews built on Cartopy and pyproj. You can install GeoViews with pip install geoviews
or conda install geoviews
.
Before plotting geographical data, you should know in which coordinate system the data is represented. This is important for a few reasons:
The data may have to be projected to another coordinate system, and to do so you may have to provide the original coordinate system.
Projecting data can be a computationally expensive operation and you should be aware of when this operation happens, to perhaps optimize it.
Here are two common coordinate systems you will encounter when dealing with geographic data and using hvPlot:
Data represented by latitude/longitude (lat/lon) coordinates are often tied to the WGS84 geographic coordinate system (EPSG:4326), that’s how GPS data are located on the Earth. For example, the coordinates of Paris in this system are
(lat=48.856697, lon=2.351462)
.Tiled web maps are displayed in the Web Mercator projection (also known as Pseudo-Mercator or Google Mercator, EPSG:3857). Unlike WGS84 which expresses coordinates in lat/long decimal degrees, Web Mercator expresses them in easting (X) /northing (Y) metric units. For example, the coordinates of Paris in this system are
(easting=261763.552, northing=6250580.761)
.
Supported plot types#
Only certain hvPlot types support geographic coordinates, currently including: points
, polygons
, paths
, image
, quadmesh
, contour
, and contourf
.
Tiled web map#
A tiled web map, or tile map, is an interactive map displayed on a web browser that is divided into small, pre-rendered image tiles, allowing for efficient loading and seamless navigation across various zoom levels and geographic areas. hvPlot allows us to add a tile map as a basemap to a plot with the tiles
parameter. Importantly, tiles
is a parameter that can be used without installing GeoViews.
We’ll display this dataframe of all US airports (including military bases overseas), the points are expressed in latitude/longitude coordinates:
airports.head(3)
AirportID | Name | City | Country | IATA | ICAO | Latitude | Longitude | Altitude | Timezone | DST | TZ | Type | source | |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | 3411 | Barter Island LRRS Airport | Barter Island | United States | BTI | PABA | 70.134003 | -143.582001 | 2 | -9 | A | America/Anchorage | airport | OurAirports |
1 | 3413 | Cape Lisburne LRRS Airport | Cape Lisburne | United States | LUR | PALU | 68.875099 | -166.110001 | 16 | -9 | A | America/Anchorage | airport | OurAirports |
2 | 3414 | Point Lay LRRS Airport | Point Lay | United States | PIZ | PPIZ | 69.732903 | -163.005005 | 22 | -9 | A | America/Anchorage | airport | OurAirports |
We’ll first start by displaying the airports without GeoViews with tiles by setting tiles=True
.
Under the hood, hvPlot projects lat/lon to easting/northing (EPSG:4326 to EPSG:3857) coordinates without additional package dependencies if it detects that the values falls within expected lat/lon ranges, unless the data is lazily loaded (dask / ibis), or the data is a spatialpandas object.
Note, this feature is only available after hvplot>=0.11.0
; older versions, hvplot<0.11.0
, require manual projection (see below).
airports.hvplot.points('Longitude', 'Latitude', tiles=True, color='red', alpha=0.2)
If you do not want it to be auto-projected under the hood for any reason, set projection=False
, but note it will not be properly placed on the tiles without manual intervention. In doing so, the data will be plotted around the null island location, roughly 600km off the coast of West Africa. padding
was added to demonstrate this.
airports.hvplot.points('Longitude', 'Latitude', tiles=True, projection=False, color='red', alpha=0.2, padding=10000)
To manually project use the lon_lat_to_easting_northing
function provided by HoloViews, that doesn’t require installing any of the usual geo dependencies like pyproj
or cartopy
.
from holoviews.util.transform import lon_lat_to_easting_northing
airports['x'], airports['y'] = lon_lat_to_easting_northing(airports.Longitude, airports.Latitude)
airports[['Latitude', 'Longitude', 'x', 'y']].head(3)
airports.hvplot.points('x', 'y', tiles=True, color='red', alpha=0.2)
When GeoViews is installed, you can set geo=True
to let hvPlot know that it can leverage it to display the data. You can directly plot the airports without having to project the data yourself, GeoViews will take care of projecting it to Web Mercator automatically.
airports.hvplot.points(
'Longitude', 'Latitude', geo=True, tiles=True, color='red', alpha=0.2
)
The easiest ways to set tiles
include:
to
True
to get the default layer which currently is from OpenStreetMap (prefer one of the explicit approaches below if you care about reproducibility)with an
xyzservices.TileProvider
instance that can be created from the optional dependency xyzservices which gives you access to hundreds of tiled web mapswith the name of one the layers shipped by HoloViews and GeoViews (when it’s installed), like
'EsriTerrain'
.
The tile map can be additionally configured by setting tiles_opts
with a dictionary of options that will be applied to the basemap layer.
import xyzservices.providers as xyz
airports.hvplot.points(
'x', 'y', tiles=xyz.Esri.WorldPhysical, tiles_opts={'alpha': 0.5}, color='red', alpha=0.2
)
Data and plot projections#
import cartopy.crs as ccrs
As we have seen in the previous section, you can set geo=True
to let hvPlot know that your data is geographic. GeoViews must be installed when setting geo=True
, or when any of the other geographic options is set, except tiles
. In this mode hvPlot:
Assumes the data is expressed in lat/lon coordinates. Internally, hvPlot assumes a Plate Carrée projection, also known as the equirectangular projection.
Displays the data in the projection Plate Carrée assumed in step (1), or, if
tiles=True
, projects and displays the data in the Web Mercator projection.
The input data projection assumed in step (1), and the output plot projection used in step (2) can both be overridden with the crs
(for Coordinate Reference System) and projection
parameters, respectively. Both parameters accept cartopy CRS
objects (see the full list of values they accept in their definition). You can now see that that setting geo=True, tiles=True
is strictly equivalent to setting crs=ccrs.PlateCarree(), projection=ccrs.GOOGLE_MERCATOR
! Coordinate reference systems are described in more details in the GeoViews documentation and the full list of available CRSs is in the cartopy documentation.
For example, we can set projection
to display the airports on an Orthographic projection centered over North America.
airports.hvplot.points(
'Longitude', 'Latitude', color='red', alpha=0.2,
coastline=True, projection=ccrs.Orthographic(-90, 30)
)
Raster data#
air_ds = xr.tutorial.open_dataset('air_temperature').load()
air_ds
<xarray.Dataset> Size: 31MB Dimensions: (lat: 25, time: 2920, lon: 53) Coordinates: * lat (lat) float32 100B 75.0 72.5 70.0 67.5 65.0 ... 22.5 20.0 17.5 15.0 * lon (lon) float32 212B 200.0 202.5 205.0 207.5 ... 325.0 327.5 330.0 * time (time) datetime64[ns] 23kB 2013-01-01 ... 2014-12-31T18:00:00 Data variables: air (time, lat, lon) float64 31MB 241.2 242.5 243.5 ... 296.2 295.7 Attributes: Conventions: COARDS title: 4x daily NMC reanalysis (1948) description: Data is from NMC initialized reanalysis\n(4x/day). These a... platform: Model references: http://www.esrl.noaa.gov/psd/data/gridded/data.ncep.reanaly...
When displaying raster data in a projection other than the one in which the data is stored, it is more accurate to render it as a quadmesh
rather than an image
. As you can see below, a QuadMesh will project each original bin or pixel into the correct non-rectangular shape determined by the projection, accurately showing the geographic extent covered by each sample. An Image, on the other hand, will always be rectangularly aligned in the 2D plane, which requires warping and resampling the data in a way that allows efficient display but loses accuracy at the pixel level.
air_ds.isel(time=0).hvplot.quadmesh('lon', 'lat', 'air', projection=ccrs.LambertConformal())
Unfortunately, rendering a large QuadMesh using Bokeh can be very slow, but there are two useful alternatives for datasets too large to be practical as native QuadMeshes.
The first is using the rasterize
or datashade
options to regrid the data before rendering it, i.e., rendering the data on the backend and then sending a more efficient image-based representation to the browser. One thing to note when using these operations is that it may be necessary to project the data before rasterizing it, e.g. to address wrapping issues. To do this provide project=True
, which will project the data before it is rasterized (this also works for other types and even when not using these operations). Another reason why this is important when rasterizing the data is that if the CRS of the data does not match the displayed projection, all the data will be projected every time you zoom or pan, which can be very slow. Deciding whether to project
is therefore a tradeoff between projecting the raw data ahead of time or accepting the overhead on dynamic zoom and pan actions.
rasm = xr.tutorial.open_dataset('rasm').load().isel(time=0)
rasm
<xarray.Dataset> Size: 1MB Dimensions: (y: 205, x: 275) Coordinates: time object 8B 1980-09-16 12:00:00 xc (y, x) float64 451kB 189.2 189.4 189.6 189.7 ... 17.4 17.15 16.91 yc (y, x) float64 451kB 16.53 16.78 17.02 17.27 ... 28.01 27.76 27.51 Dimensions without coordinates: y, x Data variables: Tair (y, x) float64 451kB nan nan nan nan ... 27.91 27.02 26.56 26.73 Attributes: title: /workspace/jhamman/processed/R1002RBRxaaa01a/l... institution: U.W. source: RACM R1002RBRxaaa01a output_frequency: daily output_mode: averaged convention: CF-1.4 references: Based on the initial model of Liang et al., 19... comment: Output from the Variable Infiltration Capacity... nco_openmp_thread_number: 1 NCO: netCDF Operators version 4.7.9 (Homepage = htt... history: Fri Aug 7 17:57:38 2020: ncatted -a bounds,,d...
rasm.hvplot.quadmesh(
'xc', 'yc', crs=ccrs.PlateCarree(), projection=ccrs.GOOGLE_MERCATOR,
tiles=xyz.Esri.WorldGrayCanvas, project=True, rasterize=True,
xlim=(-20, 40), ylim=(30, 70), cmap='viridis', frame_width=400,
)
The second option that’s still relatively slow for larger data but avoids sending large data into your browser is to plot the data using contour
and contourf
visualizations, generating a line or filled contour with a discrete number of levels:
rasm.hvplot.contourf(
'xc', 'yc', crs=ccrs.PlateCarree(), projection=ccrs.GOOGLE_MERCATOR,
tiles=xyz.Esri.WorldGrayCanvas, levels=10,
xlim=(-20, 40), ylim=(30, 70), cmap='viridis', frame_width=400,
)
Geopandas usage#
Since a GeoPandas DataFrame
is just a Pandas DataFrames with additional geographic information, it inherits the .hvplot
method. We can thus easily load geographic files (GeoJSON, Shapefiles, etc) and plot them on a map:
import geopandas as gpd
import geodatasets
path = geodatasets.get_path("geoda.nyc_neighborhoods")
path
'/home/runner/.cache/geodatasets/nycnhood_acs.zip'
nyc = gpd.read_file(path)
nyc.head(3)
UEMPRATE | cartodb_id | borocode | withssi | withsocial | withpubass | struggling | profession | popunemplo | poptot | ... | boroname | popdty | ntacode | medianinco | medianagem | medianagef | medianage | HHsize | gini | geometry | |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | 0.095785 | 1 | 3 | 652 | 5067 | 277 | 6421 | 889 | 2225 | 48351 | ... | Brooklyn | 497498.701 | BK45 | 1520979 | 663.3 | 777.1 | 722.6 | 2.96421052631579 | 0.386315789473684 | POLYGON ((-73.91716 40.63173, -73.91722 40.631... |
1 | 0.090011 | 2 | 3 | 2089 | 7132 | 1016 | 10981 | 1075 | 2652 | 61584 | ... | Brooklyn | 589296.926 | BK17 | 1054259 | 791.4 | 868.5 | 827.6 | 2.46578947368421 | 0.448089473684211 | POLYGON ((-73.91809 40.58657, -73.91813 40.586... |
2 | 0.130393 | 3 | 3 | 3231 | 8847 | 2891 | 21235 | 712 | 6483 | 100130 | ... | Brooklyn | 1506628.84 | BK61 | 980637 | 863.1 | 983.9 | 923.8 | 2.42925925925926 | 0.473666666666667 | POLYGON ((-73.92165 40.67887, -73.92171 40.678... |
3 rows × 99 columns
nyc.hvplot(geo=True, tiles=True, c="poptot", alpha=0.8, tiles_opts={'alpha': 0.5})
The GeoPandas support allows plotting GeoDataFrames
containing 'Point'
, 'Polygon'
, 'LineString'
and 'LineRing'
geometries, but not ones containing a mixture of different geometry types. Calling .hvplot
will automatically figure out the geometry type to plot, but it also possible to call .hvplot.points
, .hvplot.polygons
, and .hvplot.paths
explicitly.
Spatialpandas usage#
Spatialpandas is another powerful library for working with geometries and is optimized for rendering with datashader, making it possible to plot millions of individual geometries very quickly:
import spatialpandas as spd
spd_nyc = spd.GeoDataFrame(nyc)
spd_nyc.hvplot(
datashade=True, project=True, aggregator='count_cat',
c='boroname', color_key='Category10'
)
As you can see, hvPlot makes it simple to work with geographic data visually. For more complex plot types and additional details, see the GeoViews documentation.
Options#
The API provides various geo-specific options:
coastline
(default=False): Whether to display a coastline on top of the plot, settingcoastline='10m'/'50m'/'110m'
specifies a specific scalecrs
(default=None): Coordinate reference system of the data (input projection) specified as a string or integer EPSG code, a CRS or Proj pyproj object, a Cartopy CRS object or class name, a WKT string, or a proj.4 string. Defaults to PlateCarree.features
features (default=None): A list of features or a dictionary of features and the scale at which to render it. Available features include ‘borders’, ‘coastline’, ‘lakes’, ‘land’, ‘ocean’, ‘rivers’ and ‘states’. Available scales include ‘10m’/’50m’/’110m’.geo
(default=False): Whether the plot should be treated as geographic (and assume PlateCarree, i.e. lat/lon coordinates)global_extent
(default=False): Whether to expand the plot extent to span the whole globeproject
(default=False): Whether to project the data before plotting (adds initial overhead but avoids projecting data when plot is dynamically updated)projection
(default=None): Coordinate reference system of the plot (output projection) specified as a string or integer EPSG code, a CRS or Proj pyproj object, a Cartopy CRS object or class name, a WKT string, or a proj.4 string. Defaults to PlateCarree.tiles
(default=False): Whether to overlay the plot on a tile source. If coordinate values fall within lat/lon bounds, auto-projects to EPSG:3857 unlessprojection=False
or if the data is lazily loaded (dask / ibis). Accepts the following values:True
: OpenStreetMap layerxyzservices.TileProvider
instance (requiresxyzservices
to be installed)a map string name based on one of the default layers made available by HoloViews (‘CartoDark’, ‘CartoLight’, ‘EsriImagery’, ‘EsriNatGeo’, ‘EsriUSATopo’, ‘EsriTerrain’, ‘EsriStreet’, ‘EsriReference’, ‘OSM’, ‘OpenTopoMap’) or GeoViews (‘CartoDark’, ‘CartoEco’, ‘CartoLight’, ‘CartoMidnight’, ‘EsriImagery’, ‘EsriNatGeo’, ‘EsriUSATopo’, ‘EsriTerrain’, ‘EsriReference’, ‘EsriOceanBase’, ‘EsriOceanReference’, ‘EsriWorldPhysical’, ‘EsriWorldShadedRelief’, ‘EsriWorldTopo’, ‘EsriWorldDarkGrayBase’, ‘EsriWorldDarkGrayReference’, ‘EsriWorldLightGrayBase’, ‘EsriWorldLightGrayReference’, ‘EsriWorldHillshadeDark’, ‘EsriWorldHillshade’, ‘EsriAntarcticImagery’, ‘EsriArcticImagery’, ‘EsriArcticOceanBase’, ‘EsriArcticOceanReference’, ‘EsriWorldBoundariesAndPlaces’, ‘EsriWorldBoundariesAndPlacesAlternate’, ‘EsriWorldTransportation’, ‘EsriDelormeWorldBaseMap’, ‘EsriWorldNavigationCharts’, ‘EsriWorldStreetMap’, ‘OSM’, ‘OpenTopoMap’). Note that Stamen tile sources require a Stadia account when not running locally; see stadiamaps.com.
a
holoviews.Tiles
orgeoviews.WMTS
instance or class
tiles_opts
(default=None): Dictionary of plotting options to customize the tiles layer created whentiles
is set, e.g.dict(alpha=0.5)
.