
hvplot#

The core functionality provided by hvPlot is a simple and high-level plotting interface (API), modeled on Pandas’s .plot API and extended in various ways leveraging capabilities offered by the packages of the HoloViz ecosystem, most notably HoloViews. hvPlot can generate interactive plots with either Bokeh (default) or Plotly, or static plots with Matplotlib. hvPlot supports many data libraries of the Python ecosystem:

  • Pandas: DataFrame, Series (columnar/tabular data)

  • Xarray: Dataset, DataArray (labelled multidimensional arrays)

  • GeoPandas: GeoDataFrame (geometry data)

  • RAPIDS cuDF: GPU DataFrame, Series (columnar/tabular data)

  • Polars: DataFrame, LazyFrame, Series (columnar/tabular data)

  • Dask: DataFrame, Series (distributed/out-of-core arrays and columnar data)

  • Streamz: DataFrame(s), Series(s) (streaming columnar data)

  • Intake: DataSource (data catalogues)

  • Ibis: DataFrame interface for many backends (DuckDB, SQLite, Snowflake, etc.)

  • NetworkX: Graph (network graphs)

Register .hvplot#

Let’s create a simple Pandas DataFrame we’ll plot later.

import numpy as np
import pandas as pd
np.random.seed(1)

idx = pd.date_range('1/1/2000', periods=1000)
df = pd.DataFrame(np.random.randn(1000, 4), index=idx, columns=list('ABCD')).cumsum()
df.head(2)
A B C D
2000-01-01 1.624345 -0.611756 -0.528172 -1.072969
2000-01-02 2.489753 -2.913295 1.216640 -1.834176

The most convenient way to use hvPlot is to register the .hvplot accessor on the data type you are working with. This is done with a special import of the form import hvplot.<data library>.

import hvplot.pandas  # noqa

In addition to registering the .hvplot accessor on Pandas objects (DataFrame and Series), the import above sets the Bokeh plotting library as the default one and loads its corresponding extension.

Note

In a notebook, loading the extension injects front-end code into the cell output of the import. HoloViews plots need this code to behave correctly, so make sure not to remove this cell!
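To make the idea of an accessor concrete, here is a minimal sketch built on pandas' public extension API. It is not how hvPlot registers itself; the accessor name `demo` and its `summary` method are hypothetical and exist only for illustration.

```python
import pandas as pd


# Hypothetical accessor name ("demo"). hvPlot attaches ".hvplot" through its
# own mechanism, which this sketch does not reproduce.
@pd.api.extensions.register_dataframe_accessor("demo")
class DemoAccessor:
    def __init__(self, pandas_obj):
        # pandas passes in the DataFrame the accessor was reached from.
        self._obj = pandas_obj

    def summary(self):
        """Return the column names and row count of the wrapped DataFrame."""
        return {"columns": list(self._obj.columns), "rows": len(self._obj)}


toy = pd.DataFrame({"A": [1, 2, 3], "B": [4, 5, 6]})
info = toy.demo.summary()
```

Once the decorator has run, every DataFrame exposes the `.demo` namespace, in the same way that every DataFrame exposes `.hvplot` after `import hvplot.pandas`.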

Now simply call .hvplot() on the DataFrame as you would call Pandas’ .plot().

first_plot = df.hvplot()
first_plot

The same process applies to other libraries; here is another example with Xarray.

import hvplot.xarray  # noqa
import xarray as xr

air_ds = xr.tutorial.open_dataset('air_temperature').load()
air_ds['air'].isel(time=slice(0, 1000, 60)).hvplot.image(dynamic=False)

Bokeh plots#

As you can see, the default plots hvPlot generates are Bokeh plots. These plots are interactive and support panning, zooming, hovering, and clickable/selectable legends. It’s worth spending some time getting used to interacting with this kind of plot. Try, for instance, zooming in on an axis and see what happens!

first_plot

hvplot namespace#

The .hvplot namespace holds the range of supported plot methods (e.g. line, scatter, hist, etc.). Use tab completion to explore the available plot types.

df.hvplot.<TAB>

Similarly to Pandas’ API, every plot method accepts a wide range of parameters. You can explore them by calling hvplot.help('line') or using tab completion:

df.hvplot.line(<TAB>

Compose plots#

The object returned by a .hvplot.<type>() call is a HoloViews object.

plot1 = df['A'].hvplot.area(alpha=0.2, color='red', width=300)
plot2 = df['B'].hvplot.line(width=300)
print(plot2)
:Curve   [index]   (B)

HoloViews objects can easily be composed using the + and * operators:

  • <plot1> + <plot2> lays the plots out in a row container

  • (<plot1> + <plot2>).cols(1) lays the plots out in a column container

  • <plot1> * <plot2> overlays the plots

plot1 + plot2
plot1 * plot2

Widgets-based exploration#

from bokeh.sampledata.penguins import data as df_penguins
df_penguins.head(2)
species island bill_length_mm bill_depth_mm flipper_length_mm body_mass_g sex
0 Adelie Torgersen 39.1 18.7 181.0 3750.0 MALE
1 Adelie Torgersen 39.5 17.4 186.0 3800.0 FEMALE

The groupby parameter lets you explore a dimension of the data using widgets.

df_penguins.hvplot.scatter(x='bill_length_mm', y='bill_depth_mm', groupby=['island', 'sex'], dynamic=False)

Display large data#

NUM = 1_000_000
dists = [
    pd.DataFrame(dict(x=np.random.normal(x, s, NUM), y=np.random.normal(y, s, NUM)))
     for x,  y,    s in [
       ( 5,  2, 0.20), 
       ( 2, -4, 0.10), 
       (-2, -3, 0.50), 
       (-5,  2, 1.00), 
       ( 0,  0, 3.00)]
]
df_large_data = pd.concat(dists, ignore_index=True)
len(df_large_data)
5000000

hvPlot provides multiple ways to display arbitrarily large datasets. The most versatile option depends on Datashader (an optional dependency) and simply consists of setting rasterize to True. The plot returned is an image in which each pixel is colored according to the number of points it contains, which means that every point contributes to the image. The plot is also dynamic: zooming and panning trigger a recomputation of the image. Luckily, this all happens really fast! The plot below is generated from no fewer than 5 million points.

df_large_data.hvplot.points('x', 'y', rasterize=True, cnorm='eq_hist', aspect=1, colorbar=False)
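The per-pixel counting behind rasterization can be approximated with plain NumPy. This is only a conceptual sketch, not Datashader's implementation; the 400 by 400 grid size and the variable names are arbitrary assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(0.0, 1.0, 100_000)
y = rng.normal(0.0, 1.0, 100_000)

# A 400x400 "image": each cell counts the points that fall inside it.
width, height = 400, 400
counts, x_edges, y_edges = np.histogram2d(
    x, y,
    bins=(width, height),
    range=[[x.min(), x.max()], [y.min(), y.max()]],
)

# Every point lands in exactly one cell, so nothing is dropped.
total_points = int(counts.sum())
```

The aggregated array has a fixed size no matter how many points go in, which is why only an image, rather than millions of individual glyphs, has to be sent to the browser.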