Pandas#

hvPlot is designed as a simple plotting interface to many data libraries; a one-line call is enough in most cases. It was greatly inspired by Pandas’ original plotting interface, which is mostly a convenient interface to Matplotlib’s plotting API and accepts many arguments that are passed directly to Matplotlib. hvPlot’s plotting interface is, in turn, a convenient interface to HoloViews, and similarly accepts many arguments that are passed directly to HoloViews. Matplotlib and HoloViews are different kinds of visualization library: the former is a pure plotting tool (i.e. it knows how to draw pixels on your screen), while the latter is more of a data exploration tool. These differences explain some of the differences you will observe between Pandas’ and hvPlot’s plotting APIs.

Pandas offers a mechanism to register a third-party plotting backend. When registered, <DataFrame|Series>.plot() calls are delegated to the third-party tool. hvPlot has implemented the required interface to be registered as Pandas’ plotting backend. As a consequence there are two main ways to generate hvPlot plots from Pandas objects:

  • By importing hvplot.pandas and using the .hvplot() namespace (recommended).

  • By registering hvPlot as Pandas’ plotting backend and using Pandas’ .plot() namespace.
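To make the delegation mechanism concrete, here is a minimal sketch of a toy plotting backend. It assumes only what Pandas documents for third-party backends: a module exposing a top-level plot function. The module name toy_plot_backend and the returned dictionary are hypothetical and purely illustrative; hvPlot's real backend returns HoloViews objects instead.

```python
import sys
import types

import pandas as pd

# Hypothetical backend module: Pandas only needs a top-level `plot` function.
toy_backend = types.ModuleType("toy_plot_backend")


def plot(data, x=None, y=None, kind="line", **kwargs):
    """Describe the requested plot instead of drawing it."""
    return {"kind": kind, "x": x, "y": y}


toy_backend.plot = plot
# Make the module importable by name so Pandas can find it.
sys.modules["toy_plot_backend"] = toy_backend

demo = pd.DataFrame({"x": [1, 2, 3], "y": [2, 4, 8]})

# While the option is set, .plot() calls are forwarded to toy_backend.plot.
with pd.option_context("plotting.backend", "toy_plot_backend"):
    result = demo.plot.scatter(x="x", y="y")

print(result)
```

Because the backend decides what to do with the arguments it receives, each backend is free to ignore or reinterpret arguments that Matplotlib would accept.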

Note

Pandas does not require third-party plotting tools like hvPlot to implement all of its plotting methods, nor does it require each method to accept specific arguments.

In summary, regarding hvPlot’s compatibility with Pandas’ plotting API:

  • hvPlot can be registered as Pandas’ plotting backend.

  • As a convenient interface to HoloViews and not to Matplotlib, hvPlot does not aim to be 100% compatible with Pandas’ API. However, Pandas users will find the plotting methods they are used to, and most of the generic arguments they accept. In a sense, hvPlot aims more for familiarity than compatibility.

For a more in-depth comparison between Pandas and hvPlot APIs, visit the Pandas API reference that recreates the Pandas chart visualization guide using both APIs.

%matplotlib inline
import hvplot.pandas  # noqa
import hvsampledata
import numpy as np
import pandas as pd

df = hvsampledata.penguins('pandas')

Switch Pandas backend to hvPlot#

Hint

This approach is an easy way (one-line change) to convert some code from generating plots with Pandas & Matplotlib to Pandas & hvPlot and see whether you like the output or not. Generally, we recommend installing the hvplot namespace on Pandas objects by importing hvplot.pandas, and invoking hvPlot via this namespace, e.g. df.hvplot.line(), as it can be adapted to other data libraries (e.g. if you use Dask, you can install the hvplot namespace on Dask objects with import hvplot.dask).

Note

This requires pandas >= 0.25.0.

hvPlot can be registered as Pandas’ plotting backend instead of Matplotlib with:

pd.options.plotting.backend = 'hvplot'

Once registered, hvPlot plots are generated when calling Pandas .plot():

df.plot.scatter('bill_length_mm', 'bill_depth_mm')
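If you only want to try hvPlot for a single plot without changing the global setting, one possible approach is Pandas' option_context, which restores the previous backend on exit. The sketch below is an illustration rather than an hvPlot recommendation; the helper name scatter_with_backend and the small demo frame are hypothetical, and calling the helper with its default backend requires hvPlot to be installed.

```python
import pandas as pd

demo = pd.DataFrame({"x": [1, 2, 3], "y": [2, 4, 8]})

# Remember which backend is active before experimenting.
previous_backend = pd.get_option("plotting.backend")


def scatter_with_backend(frame, x, y, backend="hvplot", **kwargs):
    """Draw one scatter plot with `backend`, leaving the global option untouched."""
    # option_context resets the option when the block exits, even on error.
    with pd.option_context("plotting.backend", backend):
        return frame.plot.scatter(x, y, **kwargs)


# Requires hvPlot to be installed:
# scatter_with_backend(demo, "x", "y")
```

After the call returns, pd.get_option("plotting.backend") is back to its previous value, so the rest of the notebook keeps using the original backend.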

Note

To function correctly, hvPlot needs to load some front-end content (JavaScript, CSS, etc.) in a notebook. This usually happens as a side effect of an import such as import hvplot.pandas. In the example above, this step is done in the first cell that calls .plot(). Do not delete this cell, or you may run into hard-to-debug interactivity issues.

API comparison#

Kind

In Pandas

In hvPlot

Comment

hvplot.hvPlot.area()

Alpha set to 0.5 automatically in Pandas when stacked=False, not in hvPlot.

hvplot.hvPlot.bar()

hvplot.hvPlot.barh()

hvplot.hvPlot.bivariate()

DataFrame.boxplot

hvplot.hvPlot.box()

In Pandas colors can be used to specify the color of the components of the box plot, in hvPlot this can roughly be done via backend-specific style options. sym and positions are not supported in hvPlot. vert in Pandas can be replaced by invert in hvPlot.

hvplot.hvPlot.density()

hvplot.hvPlot.errorbars()

Error bars can be set with xerr and yerr in Pandas.

hvplot.hvPlot.heatmap()

hvplot.hvPlot.hexbin()

reduce_C_function in Pandas is named reduce_function in hvPlot.

hvplot.hvPlot.hist()

Stacking is not supported in hvPlot. hvPlot uses invert=True instead of orientation='horizontal'. Pandas’ hist method accepts a NumPy ndarray for by, but hvPlot does not.

DataFrame.hist

Pandas’ DataFrame.hist() plots the histograms of the columns on multiple subplots. hvPlot instead creates an overlay of histogram plots. To reproduce Pandas’ behavior, set subplots=True to create a layout of plots (one per column in this case), and additionally call .cols(2) on the returned object to arrange the plots in a layout with at most 2 columns.

hvplot.hvPlot.kde()

hvplot.hvPlot.labels()

hvplot.hvPlot.line()

colormap is not yet supported in hvPlot; use color instead.

hvplot.hvPlot.ohlc()

hvplot.hvPlot.scatter()

hvplot.hvPlot.step()

hvplot.hvPlot.table()

Pandas has a whole API dedicated to displaying and styling tables. It also offers pandas.plotting.table() to convert a DataFrame to a Matplotlib table.

pie

Not yet implemented in HoloViews, see this issue

hvplot.hvPlot.points()

For two independent variables; useful for geographic data, for example.

hvplot.hvPlot.violin()

hvplot.plotting.andrews_curves()

autocorrelation_plot

bootstrap_plot

hvplot.plotting.lag_plot()

hvplot.plotting.parallel_coordinates()

radviz

hvplot.plotting.scatter_matrix()

Notable differences#

pd.options.plotting.backend = 'matplotlib'

This section describes the most notable differences between the Pandas and hvPlot plotting APIs. More specific differences can be found in the Pandas API page that recreates the Pandas chart visualization guide.

Figure handling#

A plot call in Pandas returns a Matplotlib Axes object. This object can be passed to Pandas’ plot API via the ax argument, for example to overlay two different plots. The ax argument is not supported in hvPlot.

plot = df.plot.scatter('bill_length_mm', 'bill_depth_mm', figsize=(4, 3))
print(plot)
Axes(0.125,0.11;0.775x0.77)

hvPlot’s plotting API returns HoloViews objects. These objects are wrappers around the original dataset, whose rich representation is a plot.

plot = df.hvplot.scatter('bill_length_mm', 'bill_depth_mm', hover_cols=['species'])
print(plot)
:Scatter   [bill_length_mm]   (bill_depth_mm,species)
plot
plot.data.head()
species island bill_length_mm bill_depth_mm flipper_length_mm body_mass_g sex year
0 Adelie Torgersen 39.1 18.7 181.0 3750.0 male 2007
1 Adelie Torgersen 39.5 17.4 186.0 3800.0 female 2007
2 Adelie Torgersen 40.3 18.0 195.0 3250.0 female 2007
3 Adelie Torgersen NaN NaN NaN NaN NaN 2007
4 Adelie Torgersen 36.7 19.3 193.0 3450.0 female 2007

Using HoloViews’ API, this object can be further customized.

import holoviews as hv

plot.opts(
    height=300, width=300, color=hv.dim('species'),
    cmap='Category10', show_legend=False,
).hist(['bill_length_mm','bill_depth_mm'])

Overlays and layouts#

In Pandas, overlays are usually created by passing down an Axes object to another plot call via the ax argument. Layouts are created by setting subplots=True, and can be customized further with the layout argument, or with Matplotlib’s API.

The approach is quite different in hvPlot, as HoloViews offers a convenient API based on operators: * overlays plots and + lays them out side by side. Together with the subplots argument and HoloViews’ .cols(N) method, which limits the number of plots per row to N, this forms an API flexible enough to handle most situations.

df1 = df.query('species == "Adelie"')
df2 = df.query('species == "Gentoo"')
ax = df1.plot.scatter('bill_length_mm', 'bill_depth_mm', color="blue", label="Adelie")
df2.plot.scatter('bill_length_mm', 'bill_depth_mm', color="green", label="Gentoo", ax=ax);
(
    df1.hvplot.scatter('bill_length_mm', 'bill_depth_mm', color="blue", label="Adelie")
    * df2.hvplot.scatter('bill_length_mm', 'bill_depth_mm', color="green", label="Gentoo")
)
dft = pd.DataFrame(np.random.randn(1000, 4), columns=list("ABCD")).cumsum()
dft.plot.line(subplots=True, layout=(2, 3), figsize=(8, 6));
dft.hvplot.line(subplots=True, width=220).cols(3)
dft['A'].hvplot.line(width=220) + dft['B'].hvplot.line(width=220)

Plot dimensions#

Plot dimensions in Pandas are set with the figsize argument, which accepts a (width, height) tuple in inches. figsize is not supported in hvPlot; instead, plot dimensions are set with the width (default is 700) and height (default is 300) arguments, which accept integer values in pixels.

df.plot.scatter('bill_length_mm', 'bill_depth_mm', figsize=(4, 3));
df.hvplot.scatter('bill_length_mm', 'bill_depth_mm', width=350, height=250)
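When porting many figsize values, a small conversion helper can give a starting point. The function figsize_to_pixels below is hypothetical (it is not part of hvPlot), and it assumes a resolution of 100 dots per inch, Matplotlib's default figure.dpi; the resolution used by your notebook may differ, so treat the result as an approximation.

```python
def figsize_to_pixels(figsize, dpi=100):
    """Convert a Matplotlib (width, height) in inches to pixel dimensions.

    `dpi` is an assumption: 100 is Matplotlib's default figure.dpi.
    """
    width_in, height_in = figsize
    return {"width": round(width_in * dpi), "height": round(height_in * dpi)}


size = figsize_to_pixels((4, 3))
print(size)

# The result can be unpacked into an hvPlot call, e.g.:
# df.hvplot.scatter('bill_length_mm', 'bill_depth_mm', **size)
```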

Default color cycle and colormap#

Pandas and hvPlot have different default color cycles and colormaps.

The default color cycle in Pandas is Matplotlib’s tab10 (or “Tableau 10”) 10-color sequence. hvPlot’s default color cycle is inherited from HoloViews and is a custom 12-color sequence.

dfl = pd.DataFrame({col: [0, i+1] for i, col in enumerate('ABCDEFGHIJLKMN')})
dfl.plot();
dfl.hvplot().opts(legend_cols=2)

Note

hvPlot’s default color cycle can be set via the HoloViews API. Make sure to run this before loading the plotting extension (e.g. hv.extension('bokeh'), which is done implicitly when running import hvplot.pandas).

import holoviews as hv
import matplotlib

hv.Cycle.default_cycles['default_colors'] = list(map(matplotlib.colors.rgb2hex, matplotlib.colormaps['tab10'].colors))

import hvplot.pandas

...

The default categorical colormap in Pandas is a gray scale. In hvPlot, it is glasbey_category10, a colormap with 256 colors that extends Bokeh’s Category10 colormap (originally from D3).

categories = list('ABCDEFGHIJLKMNOPQRST')
dfc = pd.DataFrame({
    'x': np.random.rand(len(categories)),
    'y': np.random.rand(len(categories)),
    'category': categories,
})
dfc['category'] = dfc['category'].astype('category')
dfc.plot.scatter('x', 'y', c='category');
dfc.hvplot.scatter(
    'x', 'y', c='category', legend='top_right'
).opts(legend_cols=3)

The default colormap for numerical values is viridis in Pandas and kbc_r (cyan to very dark blue) in hvPlot (see more info in this issue).

df.plot.scatter('bill_length_mm', 'flipper_length_mm', c=df['body_mass_g']);
df.hvplot.scatter('bill_length_mm', 'flipper_length_mm', c=df['body_mass_g'])

Note

hvPlot does not yet allow the default colormap to be configured globally. Use the colormap (or cmap) argument on individual plot calls instead.

df.hvplot.scatter('bill_length_mm', 'flipper_length_mm', c=df['body_mass_g'], cmap='viridis')

Marker size#

The marker size in hvplot.hvPlot.scatter() and hvplot.hvPlot.points() plots can be controlled with the s argument. When converting a plot from Pandas to hvPlot, the value of s has to be increased to obtain a visually similar output.

df.plot.scatter('bill_length_mm', 'bill_depth_mm', s=50, figsize=(4, 4));
df.hvplot.scatter('bill_length_mm', 'bill_depth_mm', s=110, aspect=1)
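As a rough way to pick a starting value, the sketch below rescales a Matplotlib s value. It rests on assumptions that are not stated in this guide: Matplotlib expresses s as an area in points squared (1 point = 1/72 inch), the hvPlot output is sized in screen pixels, and the figure is rendered at about 100 dots per inch. The helper scale_marker_size is hypothetical, and its result should be adjusted by eye.

```python
def scale_marker_size(s_points2, dpi=100):
    """Rescale a Matplotlib scatter `s` (points squared) to a pixel-based area.

    Heuristic only: `dpi` is an assumed rendering resolution.
    """
    # Lengths scale by dpi / 72, so areas scale by the square of that factor.
    return s_points2 * (dpi / 72) ** 2


print(round(scale_marker_size(50)))
```

With these assumptions, s=50 in Pandas maps to a value a little under 100, the same order of magnitude as the s=110 used above.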