Interactive

hvPlot isn’t only a plotting library; it is also dedicated to making data exploration easier. In this guide you will see how it can help you get better control over your data pipelines. We define a data pipeline as a series of commands that transform some data, such as aggregating, filtering, reshaping, or renaming. A data pipeline may include a load step that provides the input data to the pipeline, e.g. reading the data from a database.

When you analyze data in a notebook, for instance data held in a Pandas DataFrame, you may find yourself having to re-run many cells after changing the parameters you provide to Pandas’ methods, either to get more insight into the data or to fine-tune an algorithm. .interactive() improves this rather cumbersome workflow: you replace the constant parameters in the pipeline with widgets (e.g. a number slider), which are automatically displayed next to your pipeline output and trigger an output update when they change. With this approach all your pipeline parameters are available in one place and you get full interactive control over the pipeline.

.interactive() works not only with DataFrames but also with Xarray data structures, which is what we are going to show in this guide. First we import hvplot.xarray, which makes the .interactive() accessor available on Xarray objects.

import hvplot.xarray  # noqa
import xarray as xr

We load a dataset and get a handle on its only data variable, air.

ds = xr.tutorial.load_dataset('air_temperature')
air = ds.air
ds
<xarray.Dataset> Size: 31MB
Dimensions:  (lat: 25, time: 2920, lon: 53)
Coordinates:
  * lat      (lat) float32 100B 75.0 72.5 70.0 67.5 65.0 ... 22.5 20.0 17.5 15.0
  * lon      (lon) float32 212B 200.0 202.5 205.0 207.5 ... 325.0 327.5 330.0
  * time     (time) datetime64[ns] 23kB 2013-01-01 ... 2014-12-31T18:00:00
Data variables:
    air      (time, lat, lon) float64 31MB 241.2 242.5 243.5 ... 296.2 295.7
Attributes:
    Conventions:  COARDS
    title:        4x daily NMC reanalysis (1948)
    description:  Data is from NMC initialized reanalysis\n(4x/day).  These a...
    platform:     Model
    references:   http://www.esrl.noaa.gov/psd/data/gridded/data.ncep.reanaly...

We want to better understand the temporal evolution of the air temperature over different latitudes compared to a baseline. The data pipeline we build includes:

  1. filtering the data at one latitude

  2. cleaning up the data

  3. aggregating the temperatures by time

  4. computing a rolling mean

  5. subtracting a baseline from the above

The output we choose for now is the result of Pandas’ .describe() method. This pipeline has two parameters: the latitude and the temporal window of the rolling operation.

LATITUDE = 30.
ROLLING_WINDOW = '1D'

baseline = air.sel(lat=LATITUDE).mean().item()
pipeline = (
    air
    .sel(lat=LATITUDE)
    .to_dataframe()
    .drop(columns='lat')
    .groupby('time').mean()
    .rolling(ROLLING_WINDOW).mean()
    - baseline
)
pipeline.describe()
air
count 2920.000000
mean 0.000418
std 3.185757
min -6.247181
25% -2.930566
50% -0.344752
75% 3.222418
max 5.372583
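
To make the two parameters explicit, the steps above can also be wrapped in a function and called once per combination of values. The following is only a sketch of that manual approach: it uses a small synthetic array instead of the tutorial dataset (an assumption, so that it runs offline), and the names air_demo and run_pipeline are hypothetical and not part of hvPlot.

```python
import numpy as np
import pandas as pd
import xarray as xr

# Small synthetic stand-in for the tutorial data (assumption: same dims as `air`)
rng = np.random.default_rng(0)
time = pd.date_range("2013-01-01", periods=40, freq="6h")
air_demo = xr.DataArray(
    280 + rng.normal(size=(40, 3, 2)),
    coords={"time": time, "lat": [30.0, 32.5, 35.0], "lon": [200.0, 202.5]},
    dims=("time", "lat", "lon"),
    name="air",
)


def run_pipeline(data, latitude, rolling_window):
    """Hypothetical helper: the same steps as above, with both parameters as arguments."""
    baseline = data.sel(lat=latitude).mean().item()
    return (
        data
        .sel(lat=latitude)                 # 1. filter at one latitude
        .to_dataframe()
        .drop(columns="lat")               # 2. clean up
        .groupby("time").mean()            # 3. aggregate by time
        .rolling(rolling_window).mean()    # 4. rolling mean
        - baseline                         # 5. subtract the baseline
    )


# Re-running the pipeline by hand for each point of the parameter space
summaries = {
    (latitude, window): run_pipeline(air_demo, latitude, window).describe()
    for latitude in (30.0, 35.0)
    for window in ("1D", "7D")
}
print(sorted(summaries))
```

Every new value of interest means another call and another output to compare by eye, which is the workflow the widgets introduced next are meant to replace.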

Without .interactive() we would manually change the values of LATITUDE and ROLLING_WINDOW to see how they affect the pipeline output. Instead we create two widgets with the values we expect them to take, which amounts to declaring our parameter space beforehand. To create the widgets we import Panel and pick two appropriate widgets from its Reference Gallery.

import panel as pn

w_latitude = pn.widgets.DiscreteSlider(name='Latitude', options=list(air.lat.values))
w_rolling_window = pn.widgets.RadioButtonGroup(name='Rolling window', options=['1D', '7D', '30D'])

Now we instantiate an Interactive object by calling .interactive() on our data. This object mirrors the API of the underlying object and accepts all of its natural operations. We replace the data with the interactive object in the pipeline, and replace the constant parameters with the widgets we have just created.

airi = air.interactive()
baseline = airi.sel(lat=w_latitude).mean().item()
pipeline = (
    airi
    .sel(lat=w_latitude)
    .to_dataframe()
    .drop(columns='lat')
    .groupby('time').mean()
    .rolling(w_rolling_window).mean()
    - baseline
)
pipeline.describe()

You can see that the rendered pipeline now consists of more than its output: it also includes the widgets that control it. Change the widgets’ values and observe how the output updates dynamically.

Note that .interactive() handles the change of data type within the pipeline (see the call to .to_dataframe()) and that it also supports math operators (- baseline).

A plot would be a better output for this pipeline. We will use .hvplot() to create an interactive Bokeh line plot. Note that using Pandas’ .plot() is also possible.

import hvplot.pandas  # noqa

pipeline.hvplot()

For more information on using .interactive(), take a look at the User Guide.

This web page was generated from a Jupyter notebook and not all interactivity will work on this website.