hvPlot.kde#
- hvPlot.kde(y=None, by=None, **kwds)#
The kernel density estimate (KDE) plot shows the distribution of the data.
The KDE works by placing a Gaussian kernel with the supplied bandwidth at each sample; the kernels are then summed to produce the density estimate. By default the bandwidth is determined using Scott’s method, which usually produces good results, but it may be overridden by an explicit value.
density is an alias of kde.
Reference: https://hvplot.holoviz.org/ref/api/manual/hvplot.hvPlot.kde.html
Plotting options: https://hvplot.holoviz.org/ref/plotting_options/index.html
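The kernel summation described at the top of this page can be sketched in plain Python. This is a minimal illustration and not hvPlot's implementation: the helper name `gaussian_kde_sketch` is hypothetical, and the default bandwidth assumes the one-dimensional form of Scott's rule used by `scipy.stats.gaussian_kde`, that is, the sample standard deviation multiplied by n**(-1/5).

```python
import math
import random
import statistics


def gaussian_kde_sketch(samples, grid, bandwidth=None):
    """Evaluate a Gaussian KDE of `samples` at each point of `grid`.

    Hypothetical helper, for illustration only.
    """
    n = len(samples)
    if bandwidth is None:
        # Assumption: 1-D Scott's rule, std * n**(-1/5).
        bandwidth = statistics.stdev(samples) * n ** (-1 / 5)
    # Normalisation so that the summed kernels integrate to 1.
    norm = n * bandwidth * math.sqrt(2 * math.pi)
    return [
        sum(math.exp(-0.5 * ((x - s) / bandwidth) ** 2) for s in samples) / norm
        for x in grid
    ]


random.seed(0)
# Weibull sample with scale 1 and shape 5, similar to the basic example.
samples = [random.weibullvariate(1, 5) for _ in range(200)]

# Evaluate on a regular grid that comfortably covers the data.
lo, hi = min(samples) - 1, max(samples) + 1
step = (hi - lo) / 400
grid = [lo + i * step for i in range(401)]
density = gaussian_kde_sketch(samples, grid)

# Riemann-sum check: the area under the estimate should be close to 1.
area = sum(density) * step
print(f"area under the estimate: {area:.3f}")
```

Each sample contributes one bell-shaped bump, and the estimate at any grid point is the average of those bumps. That is why the result is a valid density regardless of the bandwidth chosen.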
- Parameters:
- y : string or sequence
Field(s) in the data to compute distribution on. If not specified all numerical fields are used.
- by : string or sequence
Field(s) in the data to group by.
- bandwidth : float, optional
Explicit bandwidth of the kernel for the density estimate, used instead of the value given by Scott’s method. Higher values yield smoother results. Default is None.
- cut : float, optional
Draw the estimate to cut * bw from the extreme data points. Default is 3.
- filled
Whether the bivariate contours should be filled. Default is True.
- bw_method : optional
Not supported.
- ind : optional
Not supported.
- kwds : optional
Additional keyword arguments are documented in Plotting Options. Run hvplot.help('kde') for the full method documentation.
- Returns:
holoviews.element.Distribution / Panel object
You can print the object to study its composition and run:
import holoviews as hv
hv.help(the_holoviews_object)
to learn more about its parameters and options.
Notes
This function requires scipy to be installed.
Backend-specific styling options#
Bokeh: alpha, color, fill_alpha, fill_color, hover_alpha, hover_color, hover_fill_alpha, hover_fill_color, hover_line_alpha, hover_line_cap, hover_line_color, hover_line_dash, hover_line_dash_offset, hover_line_join, hover_line_width, line_alpha, line_cap, line_color, line_dash, line_dash_offset, line_join, line_width, muted, muted_alpha, muted_color, muted_fill_alpha, muted_fill_color, muted_line_alpha, muted_line_cap, muted_line_color, muted_line_dash, muted_line_dash_offset, muted_line_join, muted_line_width, nonselection_alpha, nonselection_color, nonselection_fill_alpha, nonselection_fill_color, nonselection_line_alpha, nonselection_line_cap, nonselection_line_color, nonselection_line_dash, nonselection_line_dash_offset, nonselection_line_join, nonselection_line_width, selection_alpha, selection_color, selection_fill_alpha, selection_fill_color, selection_line_alpha, selection_line_cap, selection_line_color, selection_line_dash, selection_line_dash_offset, selection_line_join, selection_line_width, visible
Matplotlib: alpha, c, capstyle, color, ec, ecolor, edgecolor, facecolor, fc, fill, hatch, interpolate, joinstyle, linestyle, linewidth, lw, step
Examples#
Basic KDE#
This example shows a KDE plot built from a sample of a Weibull distribution using kde with its default parameters.
import hvplot.pandas
import numpy as np
import pandas as pd
df = pd.DataFrame({'values': np.random.weibull(5, size=1000)})
df.hvplot.kde()
Let’s visualise the KDE of a dataset containing the depth of earthquakes.
import hvplot.pandas # noqa
df = hvplot.sampledata.earthquakes("pandas")
df.hvplot.kde(y='depth')
Control smoothing with bandwidth#
You can control the smoothness of the estimate using the bandwidth argument, which accepts a positive numerical value. Smaller values yield more detail. When not set, the bandwidth is internally computed using Scott’s rule of thumb.
import hvplot.pandas # noqa
df = hvplot.sampledata.earthquakes("pandas")
df.hvplot.kde(
    y='depth', bandwidth=0.1,
    width=300, title='bandwidth=0.1'
) +\
df.hvplot.kde(
    y='depth', bandwidth=0.5,
    width=300, shared_axes=False, title='bandwidth=0.5'
)
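To see why smaller bandwidths yield more detail, the following plain-Python sketch counts the local peaks of a Gaussian KDE at two kernel widths. The helpers `kde` and `count_peaks` are hypothetical and the data is synthetic. Here the bandwidth is taken to be the kernel standard deviation in data units, which is an assumption of the sketch and may not match how hvPlot scales the value internally.

```python
import math
import random


def kde(samples, grid, bandwidth):
    """Gaussian KDE with an explicit kernel width (hypothetical helper)."""
    norm = len(samples) * bandwidth * math.sqrt(2 * math.pi)
    return [
        sum(math.exp(-0.5 * ((x - s) / bandwidth) ** 2) for s in samples) / norm
        for x in grid
    ]


def count_peaks(curve):
    """Count strict local maxima of a sampled curve."""
    return sum(1 for a, b, c in zip(curve, curve[1:], curve[2:]) if a < b > c)


random.seed(1)
# Synthetic data made up for this sketch: 100 draws from a standard normal.
samples = [random.gauss(0, 1) for _ in range(100)]
grid = [-5 + i * 0.02 for i in range(501)]

narrow = kde(samples, grid, bandwidth=0.05)
wide = kde(samples, grid, bandwidth=0.5)

print("peaks with bandwidth=0.05:", count_peaks(narrow))
print("peaks with bandwidth=0.5: ", count_peaks(wide))
```

A narrow kernel resolves individual samples and produces many small bumps, while a wide kernel averages them into a smooth curve. Very small values therefore tend to show sampling noise rather than structure in the data.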
Control evaluation extent with cut#
cut is a factor, multiplied by the smoothing bandwidth, that determines how far the evaluation grid extends past the extreme data points. When set to 0, the curve is truncated at the data limits.
import hvplot.pandas # noqa
df = hvplot.sampledata.earthquakes("pandas")
df.hvplot.kde(y='depth', width=300, title='default') +\
df.hvplot.kde(y='depth', cut=0, width=300, title='cut=0')
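The arithmetic behind cut can be sketched as follows. The helper `evaluation_range` is hypothetical, the depths are made up, and the bandwidth is assumed to be expressed in the same units as the data; hvPlot's internal computation may differ in detail.

```python
def evaluation_range(samples, bandwidth, cut=3):
    """Return (low, high) limits extended by cut * bandwidth (hypothetical helper)."""
    pad = cut * bandwidth
    return min(samples) - pad, max(samples) + pad


# Made-up depths in km, for illustration only.
depths = [5.0, 10.0, 12.5, 33.0, 70.0]
bw = 4.0  # assumed bandwidth, in the same units as the data

print(evaluation_range(depths, bw))         # → (-7.0, 82.0)
print(evaluation_range(depths, bw, cut=0))  # → (5.0, 70.0)
```

With the default cut of 3 the curve extends below zero even though no depth is negative, which is one reason to set cut=0 for quantities that have a physical bound.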
KDE from wide-form data#
When y is set to a list of variables, the object returned is an overlay of the distribution of each variable (HoloViews NdOverlay object). This example uses multiple numerical columns from the penguins dataset to compare their distributions using a kernel density estimate.
import hvplot.pandas # noqa
df = hvplot.sampledata.penguins("pandas")
df.hvplot.kde(
    y=["bill_length_mm", "bill_depth_mm"], color=["orange", "green"],
)
When subplots is set to True, the object returned is a layout (HoloViews NdLayout object).
import hvplot.pandas # noqa
df = hvplot.sampledata.penguins("pandas")
df.hvplot.kde(
    y=["bill_length_mm", "bill_depth_mm"],
    width=300, subplots=True, shared_axes=False,
)
KDE from long-form data#
by can also be used to generate an overlay or layout of distributions, by setting it to one or more categorical variables. This example shows how to use the by keyword to compare the distribution of bill lengths across penguin species.
import hvplot.pandas # noqa
df = hvplot.sampledata.penguins("pandas")
df.hvplot.kde(y="bill_length_mm", by="species")
import hvplot.pandas # noqa
df = hvplot.sampledata.penguins("pandas")
df.hvplot.kde(y="bill_length_mm", by=["species", "sex"], subplots=True, width=300).cols(2)
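Conceptually, grouping with by amounts to splitting long-form rows by category and estimating a density for each group separately. The plain-Python sketch below shows only that split, using a handful of illustrative records. It is not how hvPlot implements grouping, and the per-group bandwidth assumes the one-dimensional form of Scott's rule (standard deviation multiplied by n**(-1/5)).

```python
from collections import defaultdict
import statistics

# Illustrative long-form records: (species, bill_length_mm).
rows = [
    ("Adelie", 39.1), ("Adelie", 39.5), ("Adelie", 40.3),
    ("Gentoo", 46.1), ("Gentoo", 50.0), ("Gentoo", 48.7),
]

# Split the values by category, as a group-by on the "species" column would.
groups = defaultdict(list)
for species, bill_length in rows:
    groups[species].append(bill_length)

for species, values in groups.items():
    # Assumption: each group gets its own bandwidth from its own values.
    bw = statistics.stdev(values) * len(values) ** (-1 / 5)
    print(f"{species}: n={len(values)}, bandwidth={bw:.2f}")
```

Because each group is estimated from its own values, groups with different spreads or sizes end up with different amounts of smoothing unless an explicit bandwidth is supplied.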
Xarray example#
import hvplot.xarray # noqa
ds = hvplot.sampledata.air_temperature("xarray").sel(lat=[25, 50, 75])
ds.hvplot.kde("air", by="lat", alpha=0.5)