Timeseries Data#
Almost all the other sections in the user guide mention timeseries. This section demonstrates the special functionality that hvPlot provides specifically for dealing with time.
import numpy as np
import hvplot.pandas # noqa
from bokeh.sampledata.sea_surface_temperature import sea_surface_temperature as sst
sst.hvplot()
By default, the index will be used as the x-axis when plotting tabular data. If the index is composed of datetimes, and they are not in chronological order, hvPlot will try to sort them before plotting (unless you set sort_date=False).
scrambled = sst.sample(frac=1)
scrambled.hvplot()
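To see what the automatic sorting amounts to, the sketch below reproduces it by hand with plain pandas. The synthetic `df` frame and its `temperature` column are stand-ins invented for illustration, not the `sst` sample data, and using `sort_index()` is an assumption about equivalent behaviour rather than a description of hvPlot internals.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the sample data: daily readings on a datetime index.
index = pd.date_range("2016-01-01", periods=48, freq="D")
df = pd.DataFrame({"temperature": np.linspace(4.0, 6.0, len(index))}, index=index)

# Shuffle the rows so the datetime index is no longer chronological.
scrambled = df.sample(frac=1, random_state=0)

# Sorting on the index restores chronological order before plotting.
restored = scrambled.sort_index()
print(restored.index.is_monotonic_increasing)
```

Sorting explicitly like this may be useful when sort_date=False is set for other reasons but one particular plot still needs ordered dates.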
Tickers#
The datetime tickers default to a format that is meant to fit in the allotted space. If you'd rather use a different format, you can declare an explicit tick formatter according to the rules of Bokeh's DatetimeTickFormatter.
from bokeh.models.formatters import DatetimeTickFormatter
formatter = DatetimeTickFormatter(months='%b %Y')
sst.hvplot(xformatter=formatter)
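Bokeh's format strings follow strftime-style conventions, so a quick way to preview a candidate format is to apply it with Python's own `datetime.strftime`. This is only a sketch: the sample dates are invented, and Bokeh supports a few directives beyond the standard library's, so treat the preview as approximate.

```python
from datetime import datetime

# Hypothetical sample dates, chosen only to preview the label format.
dates = [datetime(2016, 3, 1), datetime(2016, 4, 1), datetime(2016, 5, 1)]

# '%b %Y' renders the abbreviated month name followed by the four-digit year.
labels = [d.strftime("%b %Y") for d in dates]
print(labels)
```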
Auto-range#
(Available with HoloViews >= 1.16)
Automatic ranging, aka auto-ranging, on the data in x or y is supported, making it easy to scale the given axes and fit the entire visible curve after a zoom or pan. Try zooming in on the plot and panning around; the y-range adapts to fit the curve.
sst.hvplot(autorange="y")
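Conceptually, auto-ranging recomputes the y-limits from whatever data falls inside the visible x-window. The sketch below illustrates that idea with plain pandas. The `visible_y_range` helper, its `padding` parameter, and the synthetic series are hypothetical, and this is not how HoloViews implements the feature.

```python
import numpy as np
import pandas as pd

def visible_y_range(series, x_start, x_end, padding=0.05):
    """Return (low, high) y-limits covering the points inside [x_start, x_end]."""
    window = series.loc[x_start:x_end]
    low, high = window.min(), window.max()
    margin = (high - low) * padding  # small margin so the curve does not touch the frame
    return low - margin, high + margin

# Hypothetical daily series that rises steadily over 100 days.
index = pd.date_range("2016-01-01", periods=100, freq="D")
series = pd.Series(np.arange(100, dtype=float), index=index)

# "Zooming" to the first ten days narrows the y-range to match the visible data.
print(visible_y_range(series, "2016-01-01", "2016-01-10"))
```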
Subcoordinate y-axis#
hvPlot enables you to create overlays where each element has its own distinct y-axis subcoordinate system, which is particularly useful for analysing multiple timeseries. To activate this feature, which automatically distributes overlay elements along the y-axis, set the subcoordinate_y keyword to True. subcoordinate_y also accepts a dictionary of related options; for example, set subcoordinate_y={'subcoordinate_scale': 2} to increase the scale of each sub-plot, so that each curve's vertical range overlaps 50% with its adjacent elements. Additionally, the y-axis wheel-zoom will apply to each curve's respective sub-coordinate y-axis, rather than the global coordinate frame. More information about this feature can be found in HoloViews' documentation.
For demonstration purposes, we'll temporarily add a new column of "sensors" that splits up the temperature data into several series.
sensor = np.random.choice(['s1', 's2', 's3', 's4'], size=len(sst))
sst.assign(sensor=sensor).hvplot(by='sensor', subcoordinate_y=True)
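Because `np.random.choice` is unseeded here, the sensor assignment changes on every run. Below is a minimal sketch of a reproducible variant, assuming a seeded NumPy generator is acceptable. The synthetic `df` frame is a hypothetical stand-in for `sst`, and the plotting call is left as a comment because it only repeats the call above with the dictionary form of `subcoordinate_y`.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the sample data.
index = pd.date_range("2016-01-01", periods=200, freq="D")
df = pd.DataFrame({"temperature": np.linspace(4.0, 6.0, len(index))}, index=index)

# A seeded generator yields the same sensor labels on every run.
rng = np.random.default_rng(42)
sensor = rng.choice(["s1", "s2", "s3", "s4"], size=len(df))
labelled = df.assign(sensor=sensor)
print(labelled["sensor"].value_counts().sort_index())

# With hvplot.pandas imported, the frame can then be plotted as described above:
# labelled.hvplot(by="sensor", subcoordinate_y={"subcoordinate_scale": 2})
```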
Pandas datetime features#
hvPlot takes advantage of datetime features to make it trivial to produce plots that are aggregated on some feature of the date. For instance, in the case of temperature data, it might be interesting to examine the monthly temperature distribution. We can easily do that by setting by='index.month'.
sst.hvplot.violin(by='index.month')
We can also use these datetime features as the x and y. Here we'll look at the mean temperature at each hour of the day for each month in our dataset.
sst.hvplot.heatmap(x='index.hour', y='index.month', C='temperature', cmap='reds')
WARNING:param.HeatMapPlot01003: HeatMap element index is not unique, ensure you aggregate the data before displaying it, e.g. using heatmap.aggregate(function=np.mean). Duplicate index values have been dropped.
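The warning above indicates that duplicate (hour, month) pairs were dropped rather than averaged. One way to obtain actual means is to aggregate with pandas before plotting. The sketch below assumes a frame with a datetime index and a `temperature` column; the synthetic data and the `hour` and `month` column names are illustrative choices.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in: hourly readings spanning two months.
index = pd.date_range("2016-01-01", "2016-02-29 23:00", freq="h")
df = pd.DataFrame({"temperature": np.arange(len(index), dtype=float)}, index=index)

# Average temperature for every (month, hour) combination.
means = (
    df.groupby([df.index.month, df.index.hour])["temperature"]
    .mean()
    .rename_axis(["month", "hour"])
    .reset_index()
)
print(means.head())

# Each (month, hour) pair now appears once, so with hvplot.pandas imported:
# means.hvplot.heatmap(x="hour", y="month", C="temperature", cmap="reds")
```

Aggregating up front makes the choice of statistic explicit, so swapping `mean()` for `median()` or `max()` changes what the heatmap shows without touching the plotting call.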
Combining this with the information from the section on widgets, we can even use the datetime features to produce a plot that steps through each month in the data.
sst.hvplot(groupby=['index.year', 'index.month'], widget_type='scrubber', widget_location='bottom')
Xarray datetime features#
The same datetime features can be used with xarray data as well, although for now, the functionality is only supported for non-gridded output.
import xarray as xr
import hvplot.xarray # noqa
air_ds = xr.tutorial.open_dataset('air_temperature').load()
air_ds
<xarray.Dataset> Size: 31MB
Dimensions:  (lat: 25, time: 2920, lon: 53)
Coordinates:
  * lat      (lat) float32 100B 75.0 72.5 70.0 67.5 65.0 ... 22.5 20.0 17.5 15.0
  * lon      (lon) float32 212B 200.0 202.5 205.0 207.5 ... 325.0 327.5 330.0
  * time     (time) datetime64[ns] 23kB 2013-01-01 ... 2014-12-31T18:00:00
Data variables:
    air      (time, lat, lon) float64 31MB 241.2 242.5 243.5 ... 296.2 295.7
Attributes:
    Conventions:  COARDS
    title:        4x daily NMC reanalysis (1948)
    description:  Data is from NMC initialized reanalysis\n(4x/day). These a...
    platform:     Model
    references:   http://www.esrl.noaa.gov/psd/data/gridded/data.ncep.reanaly...
As we did for sea surface temperature above, we can get the distribution of air temperature by month.
air_ds.hvplot.violin(y='air', by='time.month')
Once we reduce the dimensionality (by taking the mean over 'lat' and 'lon'), we can group by various datetime features.
air_ds.mean(dim=['lat', 'lon']).hvplot(by='time.hour', groupby=['time.year', 'time.month'])
Note that xarray supports grouping and aggregation using a similar syntax. To learn more about timeseries in xarray, see the xarray timeseries docs.
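As a minimal sketch of that xarray syntax, the example below groups a small synthetic dataset by `time.month` and averages each group. The dataset is invented for illustration; only the `air` variable name mirrors the tutorial data above.

```python
import numpy as np
import pandas as pd
import xarray as xr

# Hypothetical stand-in for the tutorial dataset: one value per day for 60 days.
time = pd.date_range("2013-01-01", periods=60, freq="D")
ds = xr.Dataset({"air": ("time", np.arange(60, dtype=float))}, coords={"time": time})

# Group on the month component of the time coordinate, then average each group.
monthly = ds.groupby("time.month").mean()
print(monthly["air"].values)
```

The result is indexed by a new `month` coordinate instead of `time`, which is the same reduction the 'time.month' string requests when passed to hvPlot's by or groupby arguments.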
Working with Large Timeseries#
Working with large timeseries presents new visualization challenges. Consult our Large Timeseries User Guide to learn about various approaches.