Pandas#

hvPlot is designed as a simple plotting interface to many data libraries, where a one-line call is enough in most cases. It was greatly inspired by Pandas’ original plotting interface, which is mostly a convenience layer over Matplotlib’s plotting API and lets you pass many arguments directly to Matplotlib. hvPlot’s plotting interface is similarly a convenience layer over HoloViews, and lets you pass many arguments directly to HoloViews. Matplotlib and HoloViews are different kinds of visualization library: the former is a pure plotting tool (i.e. it knows how to draw pixels on your screen), while the latter is more of a data exploration tool. This explains some of the differences you will observe between Pandas’ and hvPlot’s plotting APIs.

Pandas offers a mechanism to register a third-party plotting backend. When registered, <DataFrame|Series>.plot() calls are delegated to the third-party tool. hvPlot has implemented the required interface to be registered as Pandas’ plotting backend. As a consequence there are two main ways to generate hvPlot plots from Pandas objects:

By importing hvplot.pandas and using the .hvplot() namespace (recommended).
By registering hvPlot as Pandas’ plotting backend and using Pandas’ .plot() namespace.

Note

Pandas does not require third-party plotting tools like hvPlot to implement all of its plotting methods, nor does it require each method to support specific arguments.

In summary, regarding hvPlot’s compatibility with Pandas’ plotting API:

hvPlot can be registered as Pandas’ plotting backend.
As a convenient interface to HoloViews and not to Matplotlib, hvPlot does not aim to be 100% compatible with Pandas’ API. However, Pandas users will find the plotting methods they are used to, and most of the generic arguments they accept. In a sense, hvPlot aims more for familiarity than compatibility.

For a more in-depth comparison between Pandas and hvPlot APIs, visit the Pandas API reference that recreates the Pandas chart visualization guide using both APIs.

%matplotlib inline

import hvplot.pandas  # noqa
import hvsampledata
import numpy as np
import pandas as pd

df = hvsampledata.penguins('pandas')

Switch Pandas backend to hvPlot#

Hint

This approach is an easy way (one-line change) to convert some code from generating plots with Pandas & Matplotlib to Pandas & hvPlot and see whether you like the output or not. Generally, we recommend installing the hvplot namespace on Pandas objects by importing hvplot.pandas, and invoking hvPlot via this namespace, e.g. df.hvplot.line(), as it can be adapted to other data libraries (e.g. if you use Dask, you can install the hvplot namespace on Dask objects with import hvplot.dask).

Note

This requires pandas >= 0.25.0.

hvPlot can be registered as Pandas’ plotting backend instead of Matplotlib with:

pd.options.plotting.backend = 'hvplot'

Once registered, hvPlot plots are generated when calling Pandas .plot():

df.plot.scatter('bill_length_mm', 'bill_depth_mm')

Note

To function correctly, hvPlot needs to load some front-end content (JavaScript, CSS, etc.) in a notebook. This usually happens as a side effect of an import such as import hvplot.pandas. In the example above, this step is done in the first cell that calls .plot(). Do not delete this cell, otherwise you may run into hard-to-debug interactivity issues.

API comparison#

Kind	In Pandas	In hvPlot	Comment
`hvplot.hvPlot.area()`	✅	✅	Alpha set to 0.5 automatically in Pandas when `stacked=False`, not in hvPlot.
`hvplot.hvPlot.bar()`	✅	✅
`hvplot.hvPlot.barh()`	✅	✅
`hvplot.hvPlot.bivariate()`	❌	✅
`DataFrame.boxplot`	✅	✅
`hvplot.hvPlot.box()`	✅	✅	In Pandas `colors` can be used to specify the color of the components of the box plot, in hvPlot this can roughly be done via backend-specific style options. `sym` and `positions` are not supported in hvPlot. `vert` in Pandas can be replaced by `invert` in hvPlot.
`hvplot.hvPlot.density()`	✅	✅
`hvplot.hvPlot.errorbars()`	❌	✅	In Pandas, error bars are set with the `xerr` and `yerr` arguments.
`hvplot.hvPlot.heatmap()`	❌	✅
`hvplot.hvPlot.hexbin()`	✅	✅	`reduce_C_function` in Pandas is named `reduce_function` in hvPlot.
`hvplot.hvPlot.hist()`	✅	✅	Stacking not supported in hvPlot. hvPlot uses `invert=True` instead of `orientation='horizontal'`. Pandas’ `hist` method accepts a NumPy ndarray for `by` but hvPlot does not.
`DataFrame.hist`	✅	✅	Pandas’ `DataFrame.hist()` plots the histograms of the columns on multiple subplots. hvPlot instead creates an overlay of histogram plots. To reproduce Pandas’ behavior, set `subplots=True` to create a layout of plots (one per column in this case), and additionally call `.cols(2)` on the returned object to arrange the plots with at most 2 per row.
`hvplot.hvPlot.kde()`	✅	✅
`hvplot.hvPlot.labels()`	❌	✅
`hvplot.hvPlot.line()`	✅	✅	`colormap` not yet supported in hvPlot, use `color` instead.
`hvplot.hvPlot.ohlc()`	❌	✅
`hvplot.hvPlot.scatter()`	✅	✅
`hvplot.hvPlot.step()`	❌	✅
`hvplot.hvPlot.table()`	✅	✅	Pandas has a whole API dedicated to displaying and styling tables. It also offers `pandas.plotting.table()` to convert a DataFrame to a Matplotlib table.
`pie`	✅	❌	Not yet implemented in HoloViews, see this issue
`hvplot.hvPlot.points()`	❌	✅	For two independent variables; useful for geographic data, for example.
`hvplot.hvPlot.violin()`	❌	✅
`hvplot.plotting.andrews_curves()`	✅	✅
`autocorrelation_plot`	✅	❌
`bootstrap_plot`	✅	❌
`hvplot.plotting.lag_plot()`	✅	✅
`hvplot.plotting.parallel_coordinates()`	✅	✅
`radviz`	✅	❌
`hvplot.plotting.scatter_matrix()`	✅	✅

Notable differences#

pd.options.plotting.backend = 'matplotlib'

This section describes the main differences between the Pandas and hvPlot plotting APIs. More specific differences can be found in the Pandas API page that recreates the Pandas chart visualization guide.

Figure handling#

A plot call in Pandas returns a Matplotlib Axes object. This object can be passed to Pandas’ plot API via the ax argument, for example to overlay two different plots. The ax argument is not supported in hvPlot.

plot = df.plot.scatter('bill_length_mm', 'bill_depth_mm', figsize=(4, 3))

print(plot)

Axes(0.125,0.11;0.775x0.77)

hvPlot’s plotting API returns HoloViews objects. These objects are wrappers around the original dataset, whose rich representation is a plot.

plot = df.hvplot.scatter('bill_length_mm', 'bill_depth_mm', hover_cols=['species'])
print(plot)

:Scatter   [bill_length_mm]   (bill_depth_mm,species)

plot

plot.data.head()

	species	island	bill_length_mm	bill_depth_mm	flipper_length_mm	body_mass_g	sex	year
0	Adelie	Torgersen	39.1	18.7	181.0	3750.0	male	2007
1	Adelie	Torgersen	39.5	17.4	186.0	3800.0	female	2007
2	Adelie	Torgersen	40.3	18.0	195.0	3250.0	female	2007
3	Adelie	Torgersen	NaN	NaN	NaN	NaN	NaN	2007
4	Adelie	Torgersen	36.7	19.3	193.0	3450.0	female	2007

Using HoloViews’ API, this object can be further customized.

import holoviews as hv

plot.opts(
    height=300, width=300, color=hv.dim('species'),
    cmap='Category10', show_legend=False,
).hist(['bill_length_mm','bill_depth_mm'])

Overlays and layouts#

In Pandas, overlays are usually created by passing down an Axes object to another plot call via the ax argument. Layouts are created by setting subplots=True, and can be customized further with the layout argument, or with Matplotlib’s API.

The approach is quite different in hvPlot, as HoloViews offers a convenient operator-based API: * overlays plots and + lays them out side by side. Together with the subplots argument and HoloViews’ .cols(N) method, which limits the number of plots per row to N, this is flexible enough to handle most situations.

df1 = df.query('species == "Adelie"')
df2 = df.query('species == "Gentoo"')
ax = df1.plot.scatter('bill_length_mm', 'bill_depth_mm', color="blue", label="Adelie")
df2.plot.scatter('bill_length_mm', 'bill_depth_mm', color="green", label="Gentoo", ax=ax);

(
    df1.hvplot.scatter('bill_length_mm', 'bill_depth_mm', color="blue", label="Adelie")
    * df2.hvplot.scatter('bill_length_mm', 'bill_depth_mm', color="green", label="Gentoo")
)

dft = pd.DataFrame(np.random.randn(1000, 4), columns=list("ABCD")).cumsum()
dft.plot.line(subplots=True, layout=(2, 3), figsize=(8, 6));

dft.hvplot.line(subplots=True, width=220).cols(3)

dft['A'].hvplot.line(width=220) + dft['B'].hvplot.line(width=220)

Plot dimensions#

Setting plot dimensions in Pandas is done with the figsize argument, which accepts a tuple (width, height) in inches. figsize is not supported in hvPlot. Instead, plot dimensions are set with the width (default is 700) and height (default is 300) arguments, which accept integer values in pixels.

df.plot.scatter('bill_length_mm', 'bill_depth_mm', figsize=(4, 3));

df.hvplot.scatter('bill_length_mm', 'bill_depth_mm', width=350, height=250)
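When porting a call, a `figsize` in inches can be turned into a rough `width` and `height` in pixels by multiplying by the figure DPI. The helper below is hypothetical (it is not part of hvPlot or Pandas) and assumes Matplotlib's default of 100 dots per inch; notebook front ends may render at a different resolution, so treat the result as a starting point rather than an exact match.

```python
def figsize_to_pixels(figsize, dpi=100):
    """Convert a Matplotlib (width, height) in inches to integer pixels.

    Hypothetical helper for illustration; dpi=100 is Matplotlib's default
    figure DPI and is an assumption about how the figure was rendered.
    """
    width_in, height_in = figsize
    return round(width_in * dpi), round(height_in * dpi)


width, height = figsize_to_pixels((4, 3))
print(width, height)  # 400 300
```

The resulting values can then be passed as `width=` and `height=` to an hvPlot call and adjusted by eye.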

Default color cycle and colormap#

Pandas and hvPlot have different default color cycles and colormaps.

The default color cycle in Pandas is Matplotlib’s tab10 (or “Tableau 10”) 10-color sequence. hvPlot’s default color cycle is inherited from HoloViews and is a custom 12-color sequence.

dfl = pd.DataFrame({col: [0, i+1] for i, col in enumerate('ABCDEFGHIJLKMN')})
dfl.plot();

dfl.hvplot().opts(legend_cols=2)
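With 14 columns, both libraries run out of distinct colors. Assuming each cycle simply wraps around once it is exhausted, plain index arithmetic shows which series end up sharing a color. The helper name is made up for illustration, and the cycle lengths are the ones stated above.

```python
columns = list("ABCDEFGHIJLKMN")  # same 14 column names as dfl


def shared_colors(names, cycle_length):
    """Group series by cycle slot and keep only slots used more than once."""
    slots = {}
    for index, name in enumerate(names):
        slots.setdefault(index % cycle_length, []).append(name)
    return [group for group in slots.values() if len(group) > 1]


# 10-color cycle (Matplotlib's tab10): four pairs of series share a color.
print(shared_colors(columns, 10))
# 12-color cycle (HoloViews default): only two pairs share a color.
print(shared_colors(columns, 12))
```

If many series must remain distinguishable, passing an explicit `color` list or a larger categorical colormap avoids relying on either default cycle.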

Note

hvPlot’s default color cycle can be set via the HoloViews API. Make sure to do this before loading the plotting extension (e.g. hv.extension('bokeh'), which is done implicitly by import hvplot.pandas).

import holoviews as hv
import matplotlib

hv.Cycle.default_cycles['default_colors'] = list(map(matplotlib.colors.rgb2hex, matplotlib.colormaps['tab10'].colors))

import hvplot.pandas

...

The default categorical colormap in Pandas is a gray scale. In hvPlot, it is glasbey_category10, a colormap with 256 colors that extends Bokeh’s Category10 colormap (originally from D3).

categories = list('ABCDEFGHIJLKMNOPQRST')
dfc = pd.DataFrame({
    'x': np.random.rand(len(categories)),
    'y': np.random.rand(len(categories)),
    'category': categories,
})
dfc['category'] = dfc['category'].astype('category')
dfc.plot.scatter('x', 'y', c='category');

dfc.hvplot.scatter(
    'x', 'y', c='category', legend='top_right'
).opts(legend_cols=3)

The default colormap for numerical values is viridis in Pandas and kbc_r (cyan to very dark blue) in hvPlot (see more info in this issue).

df.plot.scatter('bill_length_mm', 'flipper_length_mm', c=df['body_mass_g']);

df.hvplot.scatter('bill_length_mm', 'flipper_length_mm', c=df['body_mass_g'])

Note

hvPlot does not yet allow the default colormap to be configured globally. Instead, pass the colormap (or cmap) argument in individual plot calls.

df.hvplot.scatter('bill_length_mm', 'flipper_length_mm', c=df['body_mass_g'], cmap='viridis')

Marker size#

The marker size in hvplot.hvPlot.scatter() and hvplot.hvPlot.points() plots can be controlled with the s argument. When converting a plot from Pandas to hvPlot, the size has to be increased to obtain a visually similar output.

df.plot.scatter('bill_length_mm', 'bill_depth_mm', s=50, figsize=(4, 4));

df.hvplot.scatter('bill_length_mm', 'bill_depth_mm', s=110, aspect=1)
