Getting started#

Welcome to hvPlot!

This getting started guide will get you set up with hvPlot and provide a basic overview of its features and strengths.

Installation#

hvPlot supports Python 3.9 and above on Linux, Windows, or Mac. You can install hvPlot via the following options:

pip install hvplot
conda install conda-forge::hvplot
uv pip install hvplot

To run the guides in this site locally, create an environment with the required dependencies:

conda create -n hvplot-env -c conda-forge --override-channels hvplot geoviews datashader xarray pandas geopandas dask streamz networkx intake intake-xarray intake-parquet s3fs scipy spatialpandas pooch rasterio fiona plotly matplotlib hvsampledata jupyterlab

Overview#

The core functionality provided by hvPlot is a simple and high-level plotting interface (API), modeled on Pandas’s .plot API and extended in various ways leveraging capabilities offered by the packages of the HoloViz ecosystem, most notably HoloViews. hvPlot can generate interactive plots with either Bokeh (default) or Plotly, or static plots with Matplotlib. hvPlot supports many data libraries of the Python ecosystem such as pandas, xarray, dask etc.

hvPlot diagram

Register .hvplot#

Let’s create a simple Pandas DataFrame we’ll plot later.

import numpy as np
import pandas as pd
idx = pd.date_range('1/1/2000', periods=1000)
df = pd.DataFrame(np.random.randn(1000, 4), index=idx, columns=list('ABCD')).cumsum()
df.head(2)
A B C D
2000-01-01 1.624345 -0.611756 -0.528172 -1.072969
2000-01-02 2.489753 -2.913295 1.216640 -1.834176

The most convenient way to use hvPlot is to register the .hvplot accessor on the data type you are working with. This is done with a special import of the form import hvplot.<data library>.

import hvplot.pandas  # noqa

In addition to registering the .hvplot accessor on Pandas objects, the import above sets the Bokeh plotting library as the default one and loads its corresponding extension.
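To make the idea of an accessor more concrete, here is a minimal sketch that registers a custom accessor with pandas' public extension API. It is only an illustration of the general mechanism: the accessor name `toy` and the `ToyAccessor` class are hypothetical, and nothing here is claimed to be how hvPlot itself registers `.hvplot`.

```python
import pandas as pd


# "toy" is a hypothetical accessor name chosen for this sketch.
@pd.api.extensions.register_dataframe_accessor("toy")
class ToyAccessor:
    """Minimal accessor exposing helper methods under df.toy."""

    def __init__(self, pandas_obj):
        # pandas passes the DataFrame the accessor is attached to.
        self._obj = pandas_obj

    def __call__(self):
        # Calling the accessor itself (like df.hvplot()) runs a default action.
        return self.summary()

    def summary(self):
        # Column-wise mean of the numeric columns.
        return self._obj.mean(numeric_only=True)


df = pd.DataFrame({"A": [1.0, 2.0, 3.0], "B": [4.0, 5.0, 6.0]})
print(df.toy())          # default action
print(df.toy.summary())  # explicit method, analogous to df.hvplot.line()
```

The accessor only becomes available once the registering code has run, which is why a plain import is enough to make a new attribute appear on every DataFrame.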

Attention

In a notebook, loading the extension injects some front-end code into the cell output of the import. HoloViews plots need this code to behave correctly, so make sure not to remove this cell!

Now simply call .hvplot() on the DataFrame as you would call Pandas’ .plot().

first_plot = df.hvplot()
first_plot

The same process can be applied to other libraries. Here is another example with Xarray.

import hvplot.xarray  # noqa
import hvsampledata

air_ds = hvsampledata.air_temperature("xarray")
air_ds
<xarray.Dataset> Size: 212kB
Dimensions:  (lat: 25, time: 20, lon: 53)
Coordinates:
  * lat      (lat) float32 100B 75.0 72.5 70.0 67.5 65.0 ... 22.5 20.0 17.5 15.0
  * lon      (lon) float32 212B 200.0 202.5 205.0 207.5 ... 325.0 327.5 330.0
  * time     (time) datetime64[ns] 160B 2014-02-24 ... 2014-02-28T18:00:00
Data variables:
    air      (time, lat, lon) float64 212kB ...
Attributes:
    Conventions:  COARDS
    title:        4x daily NMC reanalysis (1948)
    description:  Data is from NMC initialized reanalysis\n(4x/day).  These a...
    platform:     Model
    references:   http://www.esrl.noaa.gov/psd/data/gridded/data.ncep.reanaly...
air_ds.hvplot.image(data_aspect=1, frame_width=400, dynamic=False)

The default plots hvPlot generates are Bokeh plots. These plots are interactive and support panning, zooming, hovering, and clickable/selectable legends. It’s worth spending some time getting used to interacting with this kind of plot.

hvplot namespace#

The .hvplot namespace holds the range of supported plot methods (e.g. line, scatter, hist, etc.). Use tab completion in a notebook to explore the available plot types.

df.hvplot.<TAB>

As with Pandas’ API, every plot method accepts a wide range of parameters. You can explore them by calling hvplot.help('line') or using tab completion:

df.hvplot.line(<TAB>

Compose plots#

The object returned by an .hvplot() call is a HoloViews object whose repr includes the HoloViews element type (e.g. Curve) and the dimensions.

plot1 = df['A'].hvplot.area(alpha=0.2, color='red', height=150, responsive=True)
plot2 = df['B'].hvplot.line(height=150, responsive=True)
print(type(plot2))
print(plot2)
<class 'holoviews.element.chart.Curve'>
:Curve   [index]   (B)

HoloViews objects can be easily composed using the + and * operators:

  • <plot1> + <plot2> lays the plots out in a row container

  • (<plot1> + <plot2>).cols(1) lays the plots out in a column container

  • <plot1> * <plot2> overlays the plots

plot1 + plot2
(plot1 + plot2).cols(1)
plot1 * plot2
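If the behavior of + and * feels surprising, the following toy model may help. It is a sketch of the general operator-overloading pattern in plain Python, not HoloViews' implementation; the `Element`, `Layout`, and `Overlay` classes below are simplified, hypothetical stand-ins.

```python
class Element:
    """Toy stand-in for a single plot."""

    def __init__(self, name):
        self.name = name

    def __add__(self, other):
        # "+" places two plots side by side.
        return Layout([self, other])

    def __mul__(self, other):
        # "*" draws two plots on the same axes.
        return Overlay([self, other])


class Layout:
    """Toy container that arranges items in a grid, one row by default."""

    def __init__(self, items, ncols=None):
        self.items = list(items)
        self.ncols = ncols or len(self.items)

    def cols(self, n):
        # Return a new layout with at most n items per row.
        return Layout(self.items, ncols=n)

    def grid(self):
        # Split the items into rows of self.ncols items.
        return [
            self.items[i:i + self.ncols]
            for i in range(0, len(self.items), self.ncols)
        ]


class Overlay:
    """Toy container that stacks layers on shared axes."""

    def __init__(self, layers):
        self.layers = list(layers)


area, line = Element("area"), Element("line")
row = (area + line).grid()             # one row with two plots
column = (area + line).cols(1).grid()  # two rows with one plot each
stacked = (area * line).layers         # two layers on the same axes
```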

Widgets-based exploration#

df_penguins = hvsampledata.penguins("pandas")
df_penguins.head(2)
species island bill_length_mm bill_depth_mm flipper_length_mm body_mass_g sex year
0 Adelie Torgersen 39.1 18.7 181.0 3750.0 male 2007
1 Adelie Torgersen 39.5 17.4 186.0 3800.0 female 2007

The groupby parameter allows us to declare which dimension(s) of the dataset we would like to make explorable with widgets.

df_penguins.hvplot.scatter(
    x='bill_length_mm', y='bill_depth_mm', groupby=['island', 'sex'],
    height=300, width=400, dynamic=False,
)
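Conceptually, each combination of widget values selects one subset of the data to plot. The sketch below mimics that selection with a plain pandas groupby; the small DataFrame is made up for illustration, and this is not a description of how hvPlot implements groupby internally.

```python
import pandas as pd

# Made-up rows that reuse the penguins column names.
df = pd.DataFrame({
    "island": ["Biscoe", "Biscoe", "Dream", "Dream", "Dream"],
    "sex": ["male", "female", "male", "male", "female"],
    "bill_length_mm": [45.0, 43.0, 40.0, 41.0, 38.0],
    "bill_depth_mm": [15.0, 14.0, 19.0, 18.5, 17.0],
})

# One subset per (island, sex) combination, keyed by a tuple.
subsets = {key: sub for key, sub in df.groupby(["island", "sex"])}

# The values each widget would offer.
island_options = sorted(df["island"].unique())
sex_options = sorted(df["sex"].unique())

# Choosing "Dream" and "male" in the widgets amounts to this lookup.
selected = subsets[("Dream", "male")]
print(selected[["bill_length_mm", "bill_depth_mm"]])
```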

Display large data#

hvPlot provides multiple ways to display arbitrarily large datasets. The most versatile option depends on Datashader (an optional dependency) and simply consists of setting rasterize=True. The plot returned is an image in which each pixel is colored according to the number of points it contains, so every point contributes to the image. The plot is also dynamic: zooming or panning triggers a recomputation of the image. Luckily, this all happens really fast! The plot below is generated from 5 million points.

NUM = 1_000_000
dists = [
    pd.DataFrame(dict(x=np.random.normal(x, s, NUM), y=np.random.normal(y, s, NUM)))
     for x,  y,    s in [
       ( 5,  2, 0.20),
       ( 2, -4, 0.10),
       (-2, -3, 0.50),
       (-5,  2, 1.00),
       ( 0,  0, 3.00)]
]
df_large_data = pd.concat(dists, ignore_index=True)
print(len(df_large_data))
df_large_data.head(2)
5000000
x y
0 4.971926 2.216931
1 5.028328 2.174570
df_large_data.hvplot.points(
    'x', 'y', rasterize=True, cnorm='eq_hist',
    data_aspect=1, colorbar=False
)
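The pixel-counting idea behind rasterization can be sketched with NumPy alone. This is a simplified illustration and not Datashader's implementation; the helper name `rasterize_counts` and the chosen ranges and image size are assumptions made for the example.

```python
import numpy as np


def rasterize_counts(x, y, x_range, y_range, width, height):
    """Count how many points fall in each pixel of a width x height image."""
    counts, _, _ = np.histogram2d(
        x, y, bins=(width, height), range=[x_range, y_range]
    )
    # histogram2d puts x on the first axis; transpose so rows follow y.
    return counts.T


rng = np.random.default_rng(0)
x = rng.normal(0, 1, 100_000)
y = rng.normal(0, 1, 100_000)

# Full view: every point inside the viewport contributes to some pixel.
full = rasterize_counts(x, y, (-4, 4), (-3, 3), width=40, height=30)

# "Zooming in" means recomputing the counts over a smaller viewport.
zoomed = rasterize_counts(x, y, (-1, 1), (-1, 1), width=40, height=30)
```

Because the image has a fixed number of pixels, the cost of displaying it does not grow with the number of points; only the counting step does.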

Note

This interactive functionality requires a live Python process to be running.

Geographic plots#

hvPlot can generate geographic plots and handle geospatial data (e.g., GeoPandas DataFrame) on its own.

Without GeoViews#

hvPlot allows us to add a tile map as a basemap to a plot with the tiles parameter without having to install GeoViews. This is possible because hvPlot projects lat/lon values to easting/northing coordinates (EPSG:4326 to EPSG:3857) without additional package dependencies if it detects that the values fall within the expected lat/lon ranges.

earthquakes = hvsampledata.earthquakes("pandas")
earthquakes.head(2)
time lat lon depth depth_class mag mag_class place
0 2024-04-01 10:26:27.337 -5.7711 112.6357 11.954 Shallow 4.1 Light 125 km NNE of Paciran, Indonesia
1 2024-04-01 18:20:38.934 5.6918 126.5299 55.452 Shallow 4.2 Light 83 km SSE of Pondaguitan, Philippines
earthquakes.hvplot.points('lon', 'lat', tiles=True, alpha=0.6)
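For reference, the projection from lon/lat degrees (EPSG:4326) to Web Mercator easting/northing in meters (EPSG:3857) is the standard spherical Mercator formula, sketched below in plain Python. The function names are hypothetical, and the bounds used in `looks_like_lonlat` are an assumption for illustration; the exact check hvPlot performs is not described here.

```python
import math

EARTH_RADIUS_M = 6378137.0  # sphere radius used by EPSG:3857


def lonlat_to_web_mercator(lon, lat):
    """Project lon/lat in degrees to easting/northing in meters.

    Valid for latitudes strictly between -90 and 90 degrees.
    """
    easting = EARTH_RADIUS_M * math.radians(lon)
    northing = EARTH_RADIUS_M * math.log(
        math.tan(math.pi / 4 + math.radians(lat) / 2)
    )
    return easting, northing


def looks_like_lonlat(lons, lats):
    """Guess whether values are degrees (assumed bounds, see lead-in)."""
    return (
        all(-180 <= lon <= 360 for lon in lons)
        and all(-90 <= lat <= 90 for lat in lats)
    )


# First earthquake from the table above.
lon, lat = 112.6357, -5.7711
if looks_like_lonlat([lon], [lat]):
    easting, northing = lonlat_to_web_mercator(lon, lat)
    print(easting, northing)
```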

With GeoViews#

For more advanced mapping features, you can optionally install GeoViews.

import cartopy.crs as crs

air_ds.hvplot.quadmesh(
    'lon', 'lat', projection=crs.Orthographic(-90, 30), project=True,
    global_extent=True, cmap='viridis', coastline=True
)

For more information, see the User guide section on geographic plots.

Matplotlib or Plotly plots#

hvPlot offers the possibility to create Matplotlib and Plotly plots. Load and enable the chosen plotting library with the extension function.

hvplot.extension('matplotlib')
df_penguins.hvplot.scatter(x='bill_length_mm', y='bill_depth_mm', by='sex')

Rendering and saving output#

In notebook environments like Jupyter or VS Code, objects returned by hvPlot calls are displayed automatically. For example, this documentation page is built from a Jupyter notebook.

If you’re running code outside of a notebook (e.g., in a Python script or console), you can:

  • Display plots: Use hvplot.show(<obj>) to render plots in a browser or interactive viewer.

  • Save plots: Use hvplot.save(<obj>, 'this_plot.html') to save plots to an HTML file (e.g., for Bokeh) for later viewing.

Tip

When using Bokeh as the plotting backend, you can also save plots directly using the “Save” icon in the plot’s toolbar.

For more information about viewing or saving plots, check out the user guide section on viewing plots.

hvPlot explorer#

The Explorer is a Panel-based web application with which you can easily explore your data. While using .hvplot() is a convenient way to create plots from data, it assumes some a priori knowledge of the data and its structure, as well as knowledge of the .hvplot() API. The Explorer is a graphical interface that offers a simple way to select the kind of plot you want to visualize your data with, along with many options to customize that plot.

Set up#

Setting up the explorer is pretty simple in a notebook. You just need to make sure you have loaded the extension, either via a data type import (e.g. import hvplot.pandas) or directly (e.g. hvplot.extension('bokeh')).

Basic usage#

The explorer is available on the .hvplot namespace together with the other plotting methods. It accepts most of the parameters accepted by the .hvplot() API. To produce a nice example for the documentation, we will instantiate an explorer with some pre-defined parameters; usually you would instantiate it without any parameters.

explorer = df_penguins.hvplot.explorer(x='bill_length_mm', y='bill_depth_mm', by=['species'])
explorer

Spend some time browsing the explorer and the options it offers.

Note

This interactive functionality requires a live Python process to be running.

Once you are done exploring the data you may want to record the settings you have configured or save the plot. The easiest option is to open the Code tab next to Plot and copy the displayed code into a new notebook cell. Executing it will generate exactly the same plot as seen in the explorer.

hvPlot interactive#

hvPlot isn’t only a plotting library; it is dedicated to making data exploration easier. The hvplot.interactive() API can help you get better control over your data pipelines. We define a data pipeline as a series of commands that transform some data, such as aggregating, filtering, reshaping, renaming, etc. A data pipeline may include a load step that provides the input data to the pipeline, e.g. reading the data from a database.

When you analyze data in a notebook, for instance data held in a Pandas DataFrame, you may find yourself re-running many cells after changing the parameters you pass to Pandas’ methods, either to gain more insight into the data or to fine-tune an algorithm. .interactive() improves this rather cumbersome workflow: you replace the constant parameters in the pipeline with widgets (e.g. a number slider) that are automatically displayed next to your pipeline output and trigger an output update whenever they change. With this approach all your pipeline parameters are available in one place and you get full interactive control over the pipeline.

.interactive() works not only with DataFrames but also with Xarray data structures, which is what we are going to show in this guide. First we import hvplot.xarray, which makes the .interactive() accessor available on Xarray objects.

import hvplot.xarray  # noqa

We load the air_temperature dataset and get a handle on its unique air variable.

air = air_ds.air
air
<xarray.DataArray 'air' (time: 20, lat: 25, lon: 53)> Size: 212kB
[26500 values with dtype=float64]
Coordinates:
  * lat      (lat) float32 100B 75.0 72.5 70.0 67.5 65.0 ... 22.5 20.0 17.5 15.0
  * lon      (lon) float32 212B 200.0 202.5 205.0 207.5 ... 325.0 327.5 330.0
  * time     (time) datetime64[ns] 160B 2014-02-24 ... 2014-02-28T18:00:00
Attributes:
    long_name:     4xDaily Air temperature at sigma level 995
    units:         degK
    precision:     2
    GRIB_id:       11
    GRIB_name:     TMP
    var_desc:      Air temperature
    dataset:       NMC Reanalysis
    level_desc:    Surface
    statistic:     Individual Obs
    parent_stat:   Other
    actual_range:  [185.16 322.1 ]

We want to better understand the temporal evolution of the air temperature over different latitudes compared to a baseline. The data pipeline we build includes:

  1. filtering the data at one latitude

  2. cleaning up the data

  3. aggregating the temperatures by time

  4. computing a rolling mean

  5. subtracting a baseline from the above

The output we choose for now is the .describe() method of Pandas. This pipeline has two parameters, the latitude and the temporal window of the rolling operation.

LATITUDE = 30.
ROLLING_WINDOW = '1D'

baseline = air.sel(lat=LATITUDE).mean().item()
pipeline = (
    air
    .sel(lat=LATITUDE)
    .to_dataframe()
    .drop(columns='lat')
    .groupby('time').mean()
    .rolling(ROLLING_WINDOW).mean()
    - baseline
)
pipeline.describe()
air
count 20.000000
mean 0.124671
std 0.956915
min -1.425651
25% -0.759094
50% 0.623429
75% 0.822973
max 1.757887
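One manual way to study the effect of the two parameters is to wrap the pipeline in a function and call it once per parameter combination. The sketch below does this with pandas only, on a small synthetic DataFrame standing in for the air temperature data; the values, the `run_pipeline` name, and the chosen parameter grid are all assumptions made for illustration.

```python
import numpy as np
import pandas as pd

# Synthetic stand-in: 20 timestamps, 6 hours apart, at 3 latitudes.
rng = np.random.default_rng(0)
times = pd.date_range("2014-02-24", periods=20, freq=pd.Timedelta(hours=6))
lats = [15.0, 30.0, 45.0]
df = pd.DataFrame(
    [(t, lat, 300.0 - 0.5 * lat + rng.normal()) for t in times for lat in lats],
    columns=["time", "lat", "air"],
)


def run_pipeline(data, latitude, window):
    """Filter one latitude, average by time, smooth, subtract the baseline."""
    at_lat = data[data["lat"] == latitude]
    baseline = at_lat["air"].mean()
    return (
        at_lat.drop(columns="lat")
        .groupby("time").mean()
        .rolling(window).mean()
        - baseline
    )


# Re-run the whole pipeline for every point of the parameter space.
results = {
    (lat, window): run_pipeline(df, lat, window).describe()
    for lat in lats
    for window in ["1D", "2D"]
}
print(results[(30.0, "1D")])
```

This works, but every new parameter value requires another explicit call, and the outputs have to be compared by hand.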

Without using .interactive(), we would need to manually change the values of LATITUDE and ROLLING_WINDOW to observe how they affect the pipeline’s output. Instead, we can create two widgets that represent the range of values we expect these parameters to take. Essentially, this allows us to define our parameter space in advance. To create the widgets, we import Panel and select two appropriate widgets from its Reference Gallery.

import panel as pn

w_latitude = pn.widgets.DiscreteSlider(name='Latitude', options=list(air.lat.values))
w_rolling_window = pn.widgets.RadioButtonGroup(name='Rolling window', options=['1D', '7D', '30D'])

Now we instantiate an Interactive object by calling .interactive() on our data. This object mirrors the underlying object API, accepting all of its natural operations. We replace the data by the interactive object in the pipeline, and replace the constant parameters by the widgets we have just created.

airi = air.interactive()
baseline = airi.sel(lat=w_latitude).mean().item()
pipeline = (
    airi
    .sel(lat=w_latitude)
    .to_dataframe()
    .drop(columns='lat')
    .groupby('time').mean()
    .rolling(w_rolling_window).mean()
    - baseline
)
pipeline.describe()

When rendered, the pipeline now includes not only its output but also the widgets that control it. Change the widgets’ values and observe how the output dynamically updates.

Note

This interactive functionality requires a live Python process to be running.

Notice that .interactive() copes with the data type changing within the pipeline (see the call to .to_dataframe) and that it also supports math operators (- baseline).
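One way to picture an object that mirrors another object's API while deferring evaluation is the toy wrapper below. It records method calls and operators, then replays them whenever a parameter changes. This is a plain-Python sketch of the general pattern only; the `Param` and `Deferred` classes are hypothetical and are not hvPlot's implementation.

```python
class Param:
    """Hypothetical stand-in for a widget: holds a value that can change."""

    def __init__(self, value):
        self.value = value


class Deferred:
    """Record operations on an object and replay them on demand."""

    def __init__(self, obj, ops=()):
        self._obj = obj
        self._ops = tuple(ops)

    def __getattr__(self, name):
        # Only reached for names not defined on Deferred itself.
        if name.startswith("_"):
            raise AttributeError(name)

        def method(*args, **kwargs):
            # Record the call instead of executing it.
            return Deferred(self._obj, self._ops + ((name, args, kwargs),))

        return method

    def __sub__(self, other):
        # Math operators are recorded like any other method call.
        return Deferred(self._obj, self._ops + (("__sub__", (other,), {}),))

    def evaluate(self):
        # Replay the recorded calls using the current parameter values.
        result = self._obj
        for name, args, kwargs in self._ops:
            args = [a.value if isinstance(a, Param) else a for a in args]
            kwargs = {
                k: v.value if isinstance(v, Param) else v
                for k, v in kwargs.items()
            }
            result = getattr(result, name)(*args, **kwargs)
        return result


width = Param(3)
padded = Deferred("42").zfill(width)
first = padded.evaluate()   # evaluated with width == 3
width.value = 5
second = padded.evaluate()  # same pipeline, new parameter value

offset = Param(1)
shifted = (Deferred(10) - offset).evaluate()
```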

A plot would be a better output for this pipeline. We will use .hvplot() to create an interactive Bokeh line plot.

pipeline.hvplot(height=300, responsive=True)

Note

This interactive functionality requires a live Python process to be running.

For more information about the various hvPlot capabilities, take a look at the User Guide.

This web page was generated from a Jupyter notebook and not all interactivity will work on this website.