Supported data libraries#

# hidden on the website
import numpy as np

np.random.seed(1)

The .hvplot() plotting API supports a wide range of data sources. For most of them, a special import registers the .hvplot accessor on specific objects. For instance, importing hvplot.pandas registers the .hvplot accessor on Pandas DataFrame and Series objects, allowing you to call df.hvplot.line().

Among the data sources introduced below, Pandas is the only library that doesn’t need to be installed separately as it is a direct dependency of hvPlot.

The following table provides a summary of the data sources supported by hvPlot. The HoloViews interface column indicates whether HoloViews offers an interface for that data source, meaning the data would in most cases not be converted to another type until the very last steps of the rendering process. When no interface is available for a data type, hvPlot internally casts the data to a supported type; for instance, Polars objects are cast upfront to Pandas objects.

| Source | Module | Type | HoloViews interface | Comment |
|---|---|---|---|---|
| Pandas | hvplot.pandas | Tabular | Yes | |
| Dask | hvplot.dask | Tabular | Yes | |
| GeoPandas | hvplot.pandas | Tabular | | |
| Ibis | hvplot.ibis | Tabular | | |
| Polars | hvplot.polars | Tabular | No | To Pandas |
| DuckDB | hvplot.duckdb | Tabular | No | To Pandas |
| RAPIDS cuDF | hvplot.cudf | Tabular | | GPU |
| Fugue | hvplot.fugue | Tabular | | Experimental |
| Xarray | hvplot.xarray | Multidimensional | | |
| Intake | hvplot.intake | Catalog | | |
| Streamz | hvplot.streamz | Streaming | | |
| NetworkX | - | Graph | - | Different API |

Note

Supporting so many data sources is hard work! We are aware that the support for some of them isn’t as good as we would like. If you encounter any issue, please report it on GitHub. We always welcome Pull Requests too!

Columnar/tabular#

Pandas#

.hvplot() supports Pandas DataFrame and Series objects. hvPlot can also be registered as Pandas’ default plotting backend to delegate Pandas .plot() calls to hvPlot directly instead of to Pandas’ Matplotlib backend. Find out more about hvPlot’s compatibility with Pandas’ plotting interface.

import hvplot.pandas  # noqa
import pandas as pd

df_pandas = pd.DataFrame(np.random.randn(1000, 4), columns=list('ABCD')).cumsum()
df_pandas.head(2)
A B C D
0 1.624345 -0.611756 -0.528172 -1.072969
1 2.489753 -2.913295 1.216640 -1.834176
# Pandas DataFrame
df_pandas.hvplot.line(height=150)
# Pandas Series
s_pandas = df_pandas['A']
s_pandas.hvplot.line(height=150)

Dask#

.hvplot() supports Dask DataFrame and Series objects.

import hvplot.dask  # noqa
import dask.dataframe as dd

df_dask = dd.from_pandas(df_pandas, npartitions=2)
df_dask
Dask DataFrame Structure:
A B C D
npartitions=2
0 float64 float64 float64 float64
500 ... ... ... ...
999 ... ... ... ...
Dask Name: frompandas, 1 expression
# Dask DataFrame
df_dask.hvplot.line(height=150)
# Dask Series
s_dask = df_dask['A']
s_dask.hvplot.line(height=150)

GeoPandas#

.hvplot() supports GeoPandas GeoDataFrame objects.

import hvplot.pandas  # noqa
import geopandas as gpd

p_geometry = gpd.points_from_xy(
    x=[12.45339, 12.44177, 9.51667, 6.13000],
    y=[41.90328, 43.93610, 47.13372, 49.61166],
    crs='EPSG:4326'
)
p_names = ['Vatican City', 'San Marino', 'Vaduz', 'Luxembourg']
gdf = gpd.GeoDataFrame(dict(name=p_names), geometry=p_geometry)
gdf.head(2)
name geometry
0 Vatican City POINT (12.45339 41.90328)
1 San Marino POINT (12.44177 43.9361)
# GeoPandas GeoDataFrame
gdf.hvplot.points(geo=True, tiles='CartoLight', frame_height=150, data_aspect=0.5)

Ibis#

Ibis is the “portable Python dataframe library”. It provides a unified interface to many data backends (e.g. DuckDB, SQLite, Snowflake, Google BigQuery). .hvplot() supports Ibis Expr objects.

import hvplot.ibis  # noqa
import ibis

table = ibis.memtable(df_pandas.reset_index())
table
InMemoryTable
  data:
    PandasDataFrameProxy:
           index          A          B          C          D
      0        0   1.624345  -0.611756  -0.528172  -1.072969
      1        1   2.489753  -2.913295   1.216640  -1.834176
      2        2   2.808792  -3.162665   2.678748  -3.894316
      3        3   2.486375  -3.546720   3.812517  -4.994207
      4        4   2.313947  -4.424578   3.854731  -4.411392
      ..     ...        ...        ...        ...        ...
      995    995  14.422083 -48.258487  19.318585  63.861029
      996    996  13.814368 -47.528673  18.431398  63.938357
      997    997  13.887784 -47.112647  16.552198  64.513816
      998    998  13.989847 -45.928343  15.757355  64.387913
      999    999  13.029500 -46.772256  16.385696  64.925127

      [1000 rows x 5 columns]
# Ibis Expr
table.hvplot.line(x='index', height=150)

Polars#

Note

Added in version 0.9.0.

Important

While other data sources like Pandas or Dask have built-in support in HoloViews, as of version 1.17.1 this is not yet the case for Polars. You can track this issue to follow the evolution of this feature in HoloViews. Internally hvPlot simply selects the columns that contribute to the plot and casts them to a Pandas object using Polars’ .to_pandas() method.

import hvplot.polars  # noqa 
import polars

df_polars = polars.from_pandas(df_pandas)
df_polars.head(2)
shape: (2, 4)
A B C D
f64 f64 f64 f64
1.624345 -0.611756 -0.528172 -1.072969
2.489753 -2.913295 1.21664 -1.834176

.hvplot() supports Polars DataFrame, LazyFrame and Series objects.

# Polars DataFrame
df_polars.hvplot.line(y=['A', 'B', 'C', 'D'], height=150)
# Polars LazyFrame
df_polars.lazy().hvplot.line(y=['A', 'B', 'C', 'D'], height=150)
# Polars Series
df_polars['A'].hvplot.line(height=150)

DuckDB#

Note

Added in version 0.11.0.

import numpy as np
import pandas as pd

df_pandas = pd.DataFrame(np.random.randn(1000, 4), columns=list('ABCD')).cumsum()
df_pandas.head(2)
A B C D
0 -0.140371 0.141642 0.311969 0.769085
1 0.443915 1.930234 0.287338 2.259375
import hvplot.duckdb  # noqa 
import duckdb

connection = duckdb.connect(':memory:')
relation = duckdb.from_df(df_pandas, connection=connection)
relation.to_view("example_view");

.hvplot() supports DuckDB DuckDBPyRelation and DuckDBPyConnection objects.

relation.hvplot.line(y=['A', 'B', 'C', 'D'], height=150)

Plotting from a DuckDBPyRelation is slightly more efficient because column subsetting happens inside DuckDB, before the data is converted to a pd.DataFrame.

So, it’s a good idea to use the connection.sql() method when possible, which gives you a DuckDBPyRelation, instead of connection.execute(), which returns a DuckDBPyConnection.

sql_expr = "SELECT * FROM example_view WHERE A > 0 AND B > 0"
connection.sql(sql_expr).hvplot.line(y=['A', 'B'], hover_cols=["C"], height=150)  # subsets A, B, C

Alternatively, you can directly subset the desired columns in the SQL expression.

sql_expr = "SELECT A, B, C FROM example_view WHERE A > 0 AND B > 0"
connection.execute(sql_expr).hvplot.line(y=['A', 'B'], hover_cols=["C"], height=150)

RAPIDS cuDF#

Important

RAPIDS cuDF is a Python GPU DataFrame library. Neither hvPlot’s nor HoloViews’ test suites currently run on a GPU as part of their CI, as of versions 0.9.0 and 1.17.1, respectively. This is because machines equipped with a GPU are not available on the free CI system we rely on (GitHub Actions). It is therefore possible that support for cuDF degrades in hvPlot without us noticing immediately. Please report any issue you might encounter.

.hvplot() supports cuDF DataFrame and Series objects.

Fugue#

Experimental

Fugue support, added in version 0.9.0, is experimental and may change in future versions.

hvPlot adds the hvplot plotting extension to FugueSQL.

import hvplot.fugue  # noqa
import fugue

fugue.api.fugue_sql(
    """
    OUTPUT df_pandas USING hvplot:line(
        height=150,
    )
    """
)
A B C D
0 -0.140371 0.141642 0.311969 0.769085
1 0.443915 1.930234 0.287338 2.259375
2 0.122838 2.491040 0.558123 3.026044
3 0.156186 4.251613 -1.475462 3.886280
4 0.854354 5.019826 0.276333 5.212350
... ... ... ... ...
995 33.079834 -30.932895 0.340049 -11.429079
996 33.609601 -30.549642 0.540773 -11.335922
997 33.697890 -29.223362 0.920503 -9.163789
998 33.376851 -27.716442 1.177258 -9.545096
999 34.413737 -27.044706 1.687819 -8.538965

1000 rows × 4 columns

Multidimensional#

Xarray#

.hvplot() supports Xarray Dataset and DataArray labelled multidimensional objects.

import hvplot.xarray  # noqa
import xarray as xr

ds = xr.Dataset({
    'A': (['x', 'y'], np.random.randn(100, 100)),
    'B': (['x', 'y'], np.random.randn(100, 100))},
    coords={'x': np.arange(100), 'y': np.arange(100)}
)
ds
<xarray.Dataset> Size: 162kB
Dimensions:  (x: 100, y: 100)
Coordinates:
  * x        (x) int64 800B 0 1 2 3 4 5 6 7 8 9 ... 91 92 93 94 95 96 97 98 99
  * y        (y) int64 800B 0 1 2 3 4 5 6 7 8 9 ... 91 92 93 94 95 96 97 98 99
Data variables:
    A        (x, y) float64 80kB -2.219 -1.234 1.828 ... 0.1273 0.3056 0.1591
    B        (x, y) float64 80kB -3.204 -0.4675 -0.2595 ... -0.2403 -0.7512
# Xarray Dataset
ds.hvplot.hist(height=150)
# Xarray DataArray
ds['A'].hvplot.image(height=150)

Catalog#

Intake#

.hvplot() supports Intake DataSource objects.

Streaming#

Streamz#

.hvplot() supports Streamz DataFrame, DataFrames, Series and Seriess objects.

Graph#

NetworkX#

The hvPlot NetworkX plotting API is meant as a drop-in replacement for the networkx.draw methods. The draw and other draw_<> methods are available in the hvplot.networkx module.

import hvplot.networkx as hvnx
import networkx as nx

G = nx.petersen_graph()
hvnx.draw(G, with_labels=True, height=150)
This web page was generated from a Jupyter notebook and not all interactivity will work on this website.