Data Options#
These options organize, group, and transform the dataset before visualization, and cover labels, sorting, and indexing:
| Parameters | Description |
|---|---|
| attr_labels (bool or None, default=None) | Whether to use an xarray object’s attributes as labels. Defaults to None to allow a best-effort attempt without raising a warning. Set to True to see a warning if the attrs can’t be found, or to False to disable the behavior. |
| by (str or list of str or None, default=None) | Dimension(s) by which to group the data categories. An NdOverlay is returned by default unless subplots=True, in which case an NdLayout is returned. |
| dynamic (bool, default=True) | Whether to return a dynamic plot which sends updates on widget and zoom/pan events or whether all the data should be embedded (warning: for large groupby operations embedded data can become very large if dynamic=False). |
| fields (dict, default={}) | A dictionary of fields for renaming or transforming data dimensions. |
| groupby (str or list or None, default=None) | Dimension(s) by which to group data, enabling widgets. Returns a DynamicMap if dynamic=True, otherwise a HoloMap. |
| group_label (str or None, default=None) | Sets a custom label for the dimension created when plotting multiple columns. When multiple columns are plotted (e.g., multiple y values), hvPlot automatically reshapes the data from wide to long format. It creates a new grouping dimension that holds the original column names. By default, this grouping dimension is labeled 'Variable'. |
| kind (str, default=’line’) | The type of plot to generate. Should only be set when calling the hvplot accessor directly, as in df.hvplot(kind='bar'). |
| label (str or None, default=None) | Label for the data, typically used in the plot title or legends. |
| persist (bool, default=False) | Whether to persist the data in memory when using Dask. |
| row (str or None, default=None) | Column name to use for splitting the plot into separate subplots by rows. |
| col (str or None, default=None) | Column name to use for splitting the plot into separate subplots by columns. |
| sort_date (bool, default=True) | Whether to sort the x-axis by date before plotting. |
| subplots (bool, default=False) | Whether to display data in separate subplots when using the by parameter. |
| transforms (dict, default={}) | A dictionary of HoloViews dim transforms to apply before plotting. |
| use_dask (bool, default=False) | Enables support for Dask-backed xarray datasets, allowing out-of-core computation and parallel processing. Only applicable when the input data is an xarray object. Has no effect on pandas or other non-xarray data structures. |
| use_index (bool, default=True) | Whether to use the data’s index for the x-axis by default. |
| value_label (str, default=’value’) | Sets a custom label for the values when the data is reshaped from wide to long format (e.g., when plotting multiple columns). This label is typically used for the y-axis, colorbar, or in hover tooltips. |
by
#
The by option allows you to group your data based on one or more categorical variables. By specifying a dimension name (or a list of dimension names) with by, the plot automatically separates the data into groups, making it easier to compare different subsets in a single visualization. By default, a HoloViews NdOverlay is returned, overlaying all groups in one plot. However, when you set subplots=True, an NdLayout is returned instead, arranging the groups as separate subplots.
import hvplot.pandas # noqa
df = hvplot.sampledata.penguins("pandas")
df.hvplot.scatter(x='bill_length_mm', y='bill_depth_mm', by='species')
import hvplot.pandas # noqa
df = hvplot.sampledata.penguins("pandas")
df.hvplot.scatter(
x='bill_length_mm', y='bill_depth_mm', by='species',
subplots=True, width=250
)
dynamic
#
The dynamic option controls whether the plot is interactive and updates in response to user actions such as zooming, panning, or widget changes. When set to True (the default), hvPlot returns a DynamicMap that updates the visualization on the fly, making it ideal for exploratory data analysis or streaming data scenarios. However, if you set dynamic=False, all the data is embedded directly into the plot. This static approach might be preferable for smaller datasets, but be cautious with large datasets since embedding a lot of data can impact performance.
import hvplot.pandas # noqa
df = hvplot.sampledata.penguins("pandas")
df.hvplot.scatter(
x='bill_length_mm', y='bill_depth_mm', groupby=['island', 'sex'],
height=300, width=400, dynamic=False,
)
In this example, setting dynamic=False produces a plot that stays interactive in the browser. Because all the data is embedded directly in the plot, you can use the plot’s widgets without an active Python session.
Warning
Using dynamic=False with very large datasets may significantly impact performance.
fields
#
The fields option lets you rename or transform your dataset’s dimensions before plotting. If your data contains dimension names that aren’t descriptive or need minor adjustments for clarity, you can use fields to rename them or apply simple transformations. You can also assign metadata such as custom display labels and units by passing HoloViews Dimension objects as the values in the fields dictionary.
Note
If you need to modify the data values themselves (for example, converting units or applying arithmetic operations), consider using the transforms option instead.
import hvplot.pandas # noqa
import holoviews as hv
df = hvplot.sampledata.penguins("pandas")
plot1 = df.hvplot.scatter(
x='bill_length_mm', y='bill_depth_mm',
fields={
'bill_length_mm': 'Bill Length',
'bill_depth_mm': 'Bill Depth'
},
title="Simple columns renaming",
width=350,
)
plot2 = df.hvplot.scatter(
x='bill_length_mm', y='bill_depth_mm',
fields={
'bill_length_mm': hv.Dimension('bill_length', label='Bill Length', unit='mm'),
'bill_depth_mm': hv.Dimension('bill_depth', label='Bill Depth', unit='mm')
},
title="Using HoloViews dimension metadata",
width=350,
)
plot1 + plot2
In this example, the fields dictionary changes the axis labels from the original dimension names to more reader-friendly ones.
groupby
#
The groupby option specifies one or more dimensions by which to partition your data into separate groups. This grouping enables the creation of interactive widgets that let users filter or switch between different groups. When dynamic=True (the default), each group is rendered interactively as a HoloViews DynamicMap, updating on the fly; otherwise, with dynamic=False, all groups are pre-rendered and returned as a HoloMap.
import hvplot.pandas # noqa
df = hvplot.sampledata.penguins("pandas")
df.hvplot.scatter(
x='bill_length_mm', y='bill_depth_mm', groupby='species',
dynamic=False, width=250
)
In this example, the plot automatically generates a widget that lets users select among the different species and updates the plot for the selected group. See dynamic for more information.
Note
While both by and groupby are used to segment your data based on categorical variables, they serve different purposes. The by option creates an overlay (or layout, if subplots=True) where all groups are displayed simultaneously, whereas the groupby option builds an interactive widget. With groupby, each group is rendered as a separate element (using a DynamicMap if dynamic=True or a HoloMap otherwise), allowing users to toggle between groups dynamically.
group_label
#
The group_label option lets you set a custom name for the dimension that distinguishes multiple columns when your data is automatically reshaped from wide to long format.
When you plot multiple columns (e.g. multiple y values), hvPlot reshapes the dataset internally to a long format by creating:
- a value column (default label: “value”), and
- a variable column (default label: “Variable”), which holds the names of the original columns.
The group_label keyword allows you to rename this "Variable" column to something more meaningful in your plot, improving clarity in legends and axis labels.
import hvplot.pandas # noqa
df = hvplot.sampledata.penguins("pandas")
df.hvplot.box(
y=['bill_length_mm', 'bill_depth_mm'], group_label='Bill size',
value_label='Measurement (mm)'
)
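The wide-to-long reshaping described above is conceptually similar to pandas' melt. The sketch below is only an analogy under that assumption (hvPlot's internal implementation may differ), with made-up values and with var_name and value_name standing in for group_label and value_label.

```python
import pandas as pd

# Hypothetical wide-format data: one column per measurement.
wide = pd.DataFrame({
    "bill_length_mm": [39.1, 46.5],
    "bill_depth_mm": [18.7, 17.9],
})

# Long format: one column holds the original column names,
# the other holds the corresponding values.
long = wide.melt(var_name="Bill size", value_name="Measurement (mm)")
print(long)
```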
kind
#
The kind option determines the type of plot to generate from your data. By specifying a plot kind (such as ‘line’, ‘scatter’, or ‘bar’), you instruct hvPlot to create a specific visualization. For tabular data, the default is ‘line’, which generates a line plot. However, when working with xarray data, hvPlot automatically infers the most appropriate plot type based on the structure of your dataset. For example, it may default to a ‘hist’ plot for two-dimensional data or ‘rgb’ for image-like data.
Changing the kind parameter allows you to experiment with different visual representations without altering your underlying data.
Tabular data#
import pandas as pd
import hvplot.pandas # noqa
df = pd.DataFrame({
'year': [2018, 2019, 2020, 2021],
'sales': [50, 100, 150, 200]
})
line_plot = df.hvplot(x='year', y='sales', title="Default line plot", width=300)
bar_plot = df.hvplot(x='year', y='sales', kind='bar', title="Bar plot", width=300)
line_plot + bar_plot
In this example, the first plot uses the default (kind='line'), while the second explicitly sets kind='bar' to create a bar chart. You can also select the plot kind by calling the corresponding method on the hvplot accessor:
df.hvplot.bar(x='year', y='sales')
Xarray data#
import hvplot.xarray # noqa
ds = hvplot.sampledata.air_temperature("xarray")
hist_plot = ds.hvplot(title="Hist plot", width=300)
image_plot = ds.isel(time=0).hvplot.image(title="Image plot", width=300)
hist_plot + image_plot
label
#
The label option lets you set the label attribute on the HoloViews objects that are returned by calls to hvPlot plotting methods. When set, the label is usually displayed as the plot title for simple plots, and in the legend for overlays.
Note
The plot’s title can alternatively be set with the title option, which takes precedence over label when both options are set. Note that even though the plot looks similar, the object returned wouldn’t have its label attribute set if the title is declared with the title option.
import hvplot.pandas # noqa
df = hvplot.sampledata.penguins("pandas")
plot = df.hvplot.scatter("bill_depth_mm", "bill_length_mm", width=300, label="label set")
print(plot.label)
plot
label set
import hvplot.pandas # noqa
df = hvplot.sampledata.penguins("pandas")
plot = df.hvplot.scatter(
"bill_depth_mm", "bill_length_mm", width=300,
label="label set", title="title takes precedence",
)
print(plot.label)
plot
label set
Below is a typical example showing how to use both label and title to set the legend labels and the plot title. Note that title can be set in either of the two plotting method calls.
import hvplot.pandas # noqa
import pandas as pd
df = pd.DataFrame({
'name': ["Mark", "Luke", "Ken", "June"],
'age': [15, 20, 25, 30]
})
line_plot = df.hvplot.line(x="name", y="age", color="red", label="Line plot")
bar_plot = df.hvplot.bar(
x="name", y="age", label="Bar plot",
width=400, title="Ages of students",
)
bar_plot * line_plot
persist
#
The persist option is useful when working with Dask-backed datasets. Setting persist=True tells Dask to compute and keep the data in memory, which can speed up subsequent interactions and visualizations for large or computationally expensive datasets.
row
#
The row and col options allow you to facet your plot into multiple subplots based on categorical variables. Using row arranges the facets vertically, while col arranges them horizontally.
Faceting makes it easier to compare different subsets of your data side by side within a single visualization.
import hvplot.pandas # noqa
import pandas as pd
df = hvplot.sampledata.penguins("pandas")
df.hvplot.scatter(x='bill_length_mm', y='bill_depth_mm', row='species', col='island')
In this example, the data is split into separate subplots: one row per species and one column per island, which allows for easy comparison between the different subsets.
col
#
See row above.
sort_date
#
The sort_date option ensures that the x-axis is sorted chronologically when your data contains date values. This helps to correctly display time series data even if the original dataset isn’t in order. It is set to True by default.
import hvplot.pandas # noqa
df = hvplot.sampledata.apple_stocks("pandas")
sampled = df.sample(frac=1)
print(sampled.head(3))
plot1 = sampled.hvplot.line(x='date', y='close', width=300)
plot2 = sampled.hvplot(x='date', y='close', sort_date=False, width=300)
plot1 + plot2
date open high low close volume adj_close
134 2019-07-16 51.15 51.53 50.88 51.12 67467200 49.11
760 2022-01-06 172.70 175.30 171.64 172.00 96904000 168.82
571 2021-04-09 129.80 133.04 129.47 133.00 106686700 129.94
In the first plot, even though the date column in the sampled DataFrame is unsorted, the plot’s x-axis displays the dates in chronological order. However, setting sort_date=False results in jumbled lines in the plot because the points are connected in the order in which they appear in the DataFrame.
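For comparison, the effect of sort_date=True resembles sorting the rows yourself before plotting. The sketch below does this with plain pandas on the three sampled rows printed above; it is an analogy, not a description of hvPlot's internal sorting.

```python
import pandas as pd

# The three shuffled rows from the printed sample (date and close only).
df = pd.DataFrame({
    "date": pd.to_datetime(["2019-07-16", "2022-01-06", "2021-04-09"]),
    "close": [51.12, 172.00, 133.00],
})

# Sort chronologically so a line would connect points in date order.
ordered = df.sort_values("date")
print(ordered)
```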
subplots
#
The subplots option is a boolean flag that, when enabled (set to True), displays each group specified by the by keyword in its own subplot. This contrasts with the default behavior of overlaying all groups in a single plot, and it can provide clearer side-by-side comparisons of grouped data.
Note
You can use subplots=True together with .cols(N) to specify the maximum number of columns to arrange the plots in.
import hvplot.pandas # noqa
df = hvplot.sampledata.penguins("pandas")
df.hvplot.scatter(
x='bill_length_mm', y='bill_depth_mm', by='species',
subplots=True, width=300
).cols(2)
In this example, setting .cols(2) arranges the plots into two columns.
See by for more example usage.
transforms
#
The transforms option allows you to modify data values for specific dimensions before plotting. Unlike the fields option, which only renames or adds metadata, transforms applies HoloViews expressions to the data. It accepts a dictionary where each key is a dimension (for example, a DataFrame column name) and each value is a HoloViews expression built with holoviews.dim() that defines how to transform that dimension.
For instance, if you have a ‘probability’ column with values between 0 and 1 and you want to display them as percentages, you can define a transformation as:
percent = hv.dim('probability') * 100
When passed via the transforms keyword, this expression multiplies all values in the ‘probability’ column by 100 before plotting.
import holoviews as hv
import hvplot.pandas # noqa
import numpy as np
import pandas as pd
df = pd.DataFrame({'value': np.random.randn(50), 'probability': np.random.rand(50)})
percent = hv.dim('probability') * 100
df.hvplot.scatter(
x='value', y='probability', transforms={'probability': percent}
)
use_dask
#
The use_dask option enables support for Dask-backed xarray datasets, allowing hvPlot to perform out-of-core and parallelized computations.
This is particularly useful for working with large multi-dimensional datasets that don’t fit in memory.
If you also set persist=True, the data is persisted in memory for improved performance on subsequent operations.
use_index
#
The use_index option determines whether the data’s index is used as the x-axis by default. By default, hvPlot automatically assigns the DataFrame’s index as a coordinate for plotting. This is particularly useful when the index contains meaningful information (such as timestamps) and when no explicit x-axis column is specified.
If you set use_index=False, hvPlot uses the first non-index column as the x-axis.
import hvplot.pandas # noqa
import pandas as pd
dates = pd.date_range('2024-01-01', periods=5, freq='D')
df = pd.DataFrame({
'open': [100, 102, 101, 103, 105],
'close': [101, 103, 102, 104, 106],
}, index=dates)
df.hvplot.line(y=['open', 'close'], group_label='Price')
Notice the use of the index column (dates) as the x-axis.
value_label
#
The value_label option sets a custom label for the data values, and is typically used to label the y-axis or to annotate legends. By default, it is set to ‘value’, but you can override it with a more descriptive name to better convey what the data represents.
import hvplot.pandas # noqa
df = hvplot.sampledata.penguins("pandas")
df.hvplot.box(
y=['bill_length_mm', 'bill_depth_mm'], group_label='Bill size',
value_label='Measurement (mm)'
)