hvPlot.kde#

hvPlot.kde(y=None, by=None, **kwds)[source]#

The kernel density estimate (KDE) plot shows the distribution of the data.

A KDE places a Gaussian kernel with the supplied bandwidth at each sample and sums the kernels to produce the density estimate. By default the bandwidth is determined using Scott’s method, which usually produces good results, but it can be overridden with an explicit value.
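
To make the mechanism concrete, here is a minimal pure-Python sketch of a Gaussian KDE. It is not hvPlot's implementation: the helper name gaussian_kde_sketch is hypothetical, the sample values are toy data, and the bandwidth is treated as the kernel's standard deviation in data units, which is an assumption of this sketch.

```python
import math


def gaussian_kde_sketch(samples, bandwidth):
    """Return a function evaluating a Gaussian KDE at a point x.

    Hypothetical helper for illustration only; `bandwidth` is assumed
    to be the standard deviation of each Gaussian kernel.
    """
    n = len(samples)
    norm = 1.0 / (n * bandwidth * math.sqrt(2.0 * math.pi))

    def density(x):
        # One Gaussian bump per sample, centred on that sample, then summed.
        return norm * sum(
            math.exp(-0.5 * ((x - s) / bandwidth) ** 2) for s in samples
        )

    return density


samples = [1.0, 2.0, 2.5, 4.0]  # toy values
density = gaussian_kde_sketch(samples, bandwidth=0.5)

# Approximate the integral of the estimate with a Riemann sum on [-5, 10].
step = 0.01
grid = [-5.0 + i * step for i in range(1501)]
area = sum(density(x) for x in grid) * step
```

Because each kernel integrates to one and the sum is divided by the number of samples, the estimate as a whole integrates to approximately one.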

density is an alias of kde.

Reference: https://hvplot.holoviz.org/ref/api/manual/hvplot.hvPlot.kde.html

Plotting options: https://hvplot.holoviz.org/ref/plotting_options/index.html

Parameters:
y : string or sequence

Field(s) in the data to compute distribution on. If not specified all numerical fields are used.

by : string or sequence

Field(s) in the data to group by.

bandwidth : float, optional

Explicit bandwidth of the kernel for the density estimate, used instead of the value from Scott’s method. Higher values yield smoother estimates. Default is None.

cut : float, optional

Draw the estimate out to cut * bandwidth beyond the extreme data points. Default is 3.

filled

Whether the bivariate contours should be filled. Default is True.

bw_method : optional

Not supported.

ind : optional

Not supported.

kwds : optional

Additional keyword arguments are documented in Plotting Options. Run hvplot.help('kde') for the full method documentation.

Returns:
holoviews.element.Distribution / Panel object

You can print the object to study its composition and run:

import holoviews as hv
hv.help(the_holoviews_object)

to learn more about its parameters and options.

See also

hist

Histogram plot.

bivariate

2D KDE plot.

contour

Isolines plot for gridded data.

Notes

This function requires scipy to be installed.


Backend-specific styling options#

Bokeh:

alpha, color, fill_alpha, fill_color, hover_alpha, hover_color, hover_fill_alpha, hover_fill_color, hover_line_alpha, hover_line_cap, hover_line_color, hover_line_dash, hover_line_dash_offset, hover_line_join, hover_line_width, line_alpha, line_cap, line_color, line_dash, line_dash_offset, line_join, line_width, muted, muted_alpha, muted_color, muted_fill_alpha, muted_fill_color, muted_line_alpha, muted_line_cap, muted_line_color, muted_line_dash, muted_line_dash_offset, muted_line_join, muted_line_width, nonselection_alpha, nonselection_color, nonselection_fill_alpha, nonselection_fill_color, nonselection_line_alpha, nonselection_line_cap, nonselection_line_color, nonselection_line_dash, nonselection_line_dash_offset, nonselection_line_join, nonselection_line_width, selection_alpha, selection_color, selection_fill_alpha, selection_fill_color, selection_line_alpha, selection_line_cap, selection_line_color, selection_line_dash, selection_line_dash_offset, selection_line_join, selection_line_width, visible

Matplotlib:

alpha, c, capstyle, color, ec, ecolor, edgecolor, facecolor, fc, fill, hatch, interpolate, joinstyle, linestyle, linewidth, lw, step

Examples#

Basic KDE#

This example shows a KDE plot built from a sample of a Weibull distribution using kde with its default parameters.

import hvplot.pandas
import numpy as np
import pandas as pd

df = pd.DataFrame({'values': np.random.weibull(5, size=1000)})

df.hvplot.kde()

Let’s visualise the KDE of a dataset containing the depth of earthquakes.

import hvplot.pandas # noqa

df = hvplot.sampledata.earthquakes("pandas")

df.hvplot.kde(y='depth')

Control smoothing with bandwidth#

You can control the smoothness of the estimate using the bandwidth argument that accepts a positive numerical value. Smaller values yield more detail. When not set, the bandwidth is internally computed using Scott’s rule of thumb.

import hvplot.pandas # noqa

df = hvplot.sampledata.earthquakes("pandas")

df.hvplot.kde(
    y='depth', bandwidth=0.1,
    width=300, title='bandwidth=0.1'
) +\
df.hvplot.kde(
    y='depth', bandwidth=0.5,
    width=300, shared_axes=False, title='bandwidth=0.5'
)
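
As a rough illustration of Scott’s rule of thumb for one-dimensional data, the sketch below uses the form implemented by scipy.stats.gaussian_kde, where the sample standard deviation is scaled by n ** (-1 / 5). That hvPlot arrives at exactly this number for a given dataset is an assumption, and scott_bandwidth is a hypothetical helper operating on toy values.

```python
import statistics


def scott_bandwidth(samples):
    """Scott's rule for 1-D data: sample std scaled by n ** (-1 / 5).

    Hypothetical helper; mirrors the n ** (-1 / (d + 4)) factor with d = 1.
    """
    n = len(samples)
    factor = n ** (-1.0 / 5.0)
    return factor * statistics.stdev(samples)


small = [1.0, 2.0, 3.0, 4.0, 5.0]  # toy values
large = small * 20                 # same spread, 100 samples

bw_small = scott_bandwidth(small)
bw_large = scott_bandwidth(large)
```

With a similar spread, more samples lead to a smaller automatic bandwidth, so the default estimate shows more detail as the dataset grows.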

Control evaluation extent with cut#

cut is a factor, multiplied by the smoothing bandwidth, that determines how far the evaluation grid extends past the extreme datapoints. When set to 0, the curve is truncated at the data limits.

import hvplot.pandas # noqa

df = hvplot.sampledata.earthquakes("pandas")

df.hvplot.kde(y='depth', width=300, title='default') +\
df.hvplot.kde(y='depth', cut=0, width=300, title='cut=0')
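
The effect of cut on the evaluation range can be sketched in a few lines. This assumes the range is simply the data limits padded by cut * bandwidth on each side, as described above; evaluation_range is a hypothetical helper, not part of hvPlot, and the inputs are toy values.

```python
def evaluation_range(samples, bandwidth, cut=3.0):
    """Return the (low, high) limits over which the estimate is drawn.

    Hypothetical helper: pads the data limits by cut * bandwidth.
    """
    pad = cut * bandwidth
    return min(samples) - pad, max(samples) + pad


samples = [1.0, 4.0]  # toy values
default_range = evaluation_range(samples, bandwidth=0.5)           # cut=3
truncated_range = evaluation_range(samples, bandwidth=0.5, cut=0)  # data limits
```

With cut=0 the padding vanishes and the range collapses to the minimum and maximum of the data.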

KDE from wide-form data#

When setting y to a list of variables, the object returned is an overlay of the distribution of each variable (HoloViews NdOverlay object). This example uses multiple numerical columns from the penguins dataset to compare their distributions using a kernel density estimate.

import hvplot.pandas # noqa

df = hvplot.sampledata.penguins("pandas")

df.hvplot.kde(
    y=["bill_length_mm", "bill_depth_mm"], color=["orange", "green"],
)

With subplots set to True, the object returned is a layout (HoloViews NdLayout object).

import hvplot.pandas # noqa

df = hvplot.sampledata.penguins("pandas")

df.hvplot.kde(
    y=["bill_length_mm", "bill_depth_mm"],
    width=300, subplots=True, shared_axes=False,
)

KDE from long-form data#

by can also be used to generate an overlay or layout of distributions by setting it to one or more categorical variables. This example shows how to use the by keyword to compare the distribution of bill lengths across penguin species.

import hvplot.pandas # noqa

df = hvplot.sampledata.penguins("pandas")

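df.hvplot.kde(y="bill_length_mm", by="species")

by also accepts a list of columns. Combined with subplots=True, each combination of groups is drawn in its own subplot.

import hvplot.pandas # noqa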

df = hvplot.sampledata.penguins("pandas")

df.hvplot.kde(y="bill_length_mm", by=["species", "sex"], subplots=True, width=300).cols(2)
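
Conceptually, grouping long-form data means splitting the rows into one sample per category before estimating each density. The pure-Python sketch below shows only that splitting step under this assumption; split_by is a hypothetical helper, and the rows are made-up toy values, not taken from the penguins dataset.

```python
from collections import defaultdict


def split_by(rows, value_key, by_key):
    """Group the values of `value_key` by the category in `by_key`.

    Hypothetical helper illustrating the long-form grouping step.
    """
    groups = defaultdict(list)
    for row in rows:
        groups[row[by_key]].append(row[value_key])
    return dict(groups)


# Toy long-form rows: one observation per row, category in its own column.
rows = [
    {"group": "a", "value": 1.0},
    {"group": "b", "value": 5.0},
    {"group": "a", "value": 2.0},
    {"group": "b", "value": 6.0},
]

samples_per_group = split_by(rows, value_key="value", by_key="group")
```

Each list in samples_per_group would then feed its own density estimate, which is what produces one curve per category.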

Xarray example#

import hvplot.xarray  # noqa

ds = hvplot.sampledata.air_temperature("xarray").sel(lat=[25, 50, 75])

ds.hvplot.kde("air", by="lat", alpha=0.5)