hvPlot.hist#

hvPlot.hist(y=None, by=None, bins=20, bin_range=None, normed=False, cumulative=False, **kwds)

A histogram displays an approximate representation of the distribution of continuous data.

Reference: https://hvplot.holoviz.org/ref/api/manual/hvplot.hvPlot.hist.html

Plotting options: https://hvplot.holoviz.org/ref/plotting_options/index.html

Parameters:
y: string or sequence

Field(s) in the wide data to compute the distribution(s) from. The fields should contain continuous data, not categorical data.

by: string or sequence

Field(s) in the long data to group by.

bins: int or string or np.ndarray or list or tuple, optional

The number of bins in the histogram, or an explicit set of bin edges or a method to find the optimal set of bin edges, e.g. ‘auto’, ‘fd’, ‘scott’ etc. For more documentation on these approaches see the numpy.histogram_bin_edges() documentation. Default is 20.

bin_range: tuple, optional

The lower and upper range of the bins. Default is the minimum and maximum values of the continuous data.

normed: str or bool, optional

Controls normalization behavior. If True or 'integral', then density=True is passed to np.histogram, and the distribution is normalized such that the integral is unity. If False, then the frequencies will be raw counts. If 'height', then the frequencies are normalized such that the max bin height is unity. Default is False.

cumulative: bool, optional

If True, then a histogram is computed where each bin gives the counts in that bin plus all bins for smaller values. The last bin gives the total number of data points. Default is False.

kwds: optional

Additional keyword arguments are documented in Plotting Options. Run hvplot.help('hist') for the full method documentation.

Returns:
holoviews.element.Histogram / Panel object

You can print the object to study its composition and run:

import holoviews as hv
hv.help(the_holoviews_object)

to learn more about its parameters and options.

See also

kde

Kernel Density Estimate plot.

bivariate

2D KDE plot.

contour

Isolines plot for gridded data.

Backend-specific styling options#

Bokeh:

alpha, cmap, color, fill_alpha, fill_color, hover_alpha, hover_color, hover_fill_alpha, hover_fill_color, hover_line_alpha, hover_line_cap, hover_line_color, hover_line_dash, hover_line_dash_offset, hover_line_join, hover_line_width, line_alpha, line_cap, line_color, line_dash, line_dash_offset, line_join, line_width, muted, muted_alpha, muted_color, muted_fill_alpha, muted_fill_color, muted_line_alpha, muted_line_cap, muted_line_color, muted_line_dash, muted_line_dash_offset, muted_line_join, muted_line_width, nonselection_alpha, nonselection_color, nonselection_fill_alpha, nonselection_fill_color, nonselection_line_alpha, nonselection_line_cap, nonselection_line_color, nonselection_line_dash, nonselection_line_dash_offset, nonselection_line_join, nonselection_line_width, selection_alpha, selection_color, selection_fill_alpha, selection_fill_color, selection_line_alpha, selection_line_cap, selection_line_color, selection_line_dash, selection_line_dash_offset, selection_line_join, selection_line_width, visible

Matplotlib:

align, alpha, c, capsize, color, ec, ecolor, edgecolor, error_kw, facecolor, fc, hatch, linewidth, log, lw, visible

Examples#

Histograms are used to approximate the distribution of continuous data by dividing the range into bins and counting the number of observations in each bin.
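The binning step described above can be sketched with NumPy alone. This is only an illustration of the idea, not hvPlot's own code path; the seeded generator and the variable names are assumptions made for reproducibility.

```python
import numpy as np

# Seeded generator: an assumption so the sketch is reproducible
rng = np.random.default_rng(0)
values = rng.normal(loc=0, scale=1, size=1000)

# 20 equal-width bins spanning the minimum and maximum of the data,
# mirroring the documented defaults for bins and bin_range
counts, edges = np.histogram(values, bins=20)

print(len(counts), len(edges))  # 20 21
print(counts.sum())             # 1000
```

With n bins there are n + 1 edges, and when the bins span the full range of the data every observation is counted exactly once.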

Basic histogram plot#

import hvplot.pandas  # noqa
import numpy as np
import pandas as pd

df = pd.DataFrame({'values': np.random.normal(loc=0, scale=1, size=1000)})

df.hvplot.hist()

Basic histogram with bins#

You can control the number of bins by setting it with an integer value.

import hvplot.pandas  # noqa
import numpy as np
import pandas as pd

df = pd.DataFrame({'values': np.random.normal(loc=0, scale=1, size=1000)})

df.hvplot.hist(
    y='values', width=300,
    bins=30, title='Histogram (bins=30)'
) +\
df.hvplot.hist(
    y='values', width=300, shared_axes=False,
    bins=50, title='Histogram (bins=50)'
)

bins can also be a string naming one of the methods accepted by the bins keyword of np.histogram_bin_edges.

import hvplot.pandas  # noqa
import numpy as np
import pandas as pd

df = pd.DataFrame({'values': np.random.normal(loc=0, scale=1, size=1000)})

df.hvplot.hist(
    y='values', width=300,
    bins='auto', title='Histogram (bins="auto")'
) +\
df.hvplot.hist(
    y='values', width=300, shared_axes=False,
    bins='scott', title='Histogram (bins="scott")'
)
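To see which edges a given method would pick before plotting, np.histogram_bin_edges can be called directly. The sketch below assumes the string is forwarded to NumPy unchanged; the seeded generator is an assumption for reproducibility.

```python
import numpy as np

rng = np.random.default_rng(0)  # seeded for reproducibility (assumption)
values = rng.normal(loc=0, scale=1, size=1000)

# Compare how many bins each estimator chooses for the same data
for method in ("auto", "fd", "scott"):
    edges = np.histogram_bin_edges(values, bins=method)
    width = edges[1] - edges[0]
    print(f"{method}: {len(edges) - 1} bins of width {width:.3f}")
```

Each method returns equal-width edges covering the minimum and maximum of the data; only the number of bins differs.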

bins can also be a list or 1D NumPy array of bin edges.

import hvplot.pandas  # noqa
import numpy as np
import pandas as pd

df = pd.DataFrame({'values': np.random.normal(loc=0, scale=1, size=1000)})

df.hvplot.hist(
    y='values', width=300,
    bins=[-3+0.5*i for i in range(12)], title='Histogram (bins as a list)'
) +\
df.hvplot.hist(
    y='values', width=300, shared_axes=False,
    bins=np.arange(-3, 3, 0.25), title='Histogram (bins as a numpy array)'
)
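Explicit edges also define which observations are counted. The NumPy sketch below shows that np.arange excludes its stop value and that observations outside the outermost edges are not counted by np.histogram. It assumes the edges reach NumPy unchanged, so treat it as an illustration rather than a description of hvPlot internals.

```python
import numpy as np

rng = np.random.default_rng(0)  # seeded for reproducibility (assumption)
values = rng.normal(loc=0, scale=1, size=1000)

# The stop value 3 is excluded, so the last edge is 2.75
edges = np.arange(-3, 3, 0.25)
counts, _ = np.histogram(values, bins=edges)

# Observations inside the closed interval [edges[0], edges[-1]]
inside = np.count_nonzero((values >= edges[0]) & (values <= edges[-1]))

print(edges[-1])               # 2.75
print(counts.sum() == inside)  # True
print(len(values) - inside)    # number of observations left out
```

If the upper edge should reach 3, np.arange(-3, 3.25, 0.25) or np.linspace(-3, 3, 25) would include it.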

Histogram with bin_range#

bin_range limits the histogram to a specific range of values.

import hvplot.pandas  # noqa
import pandas as pd
import numpy as np

df = pd.DataFrame({'values': np.random.normal(loc=0, scale=1, size=1000)})

df.hvplot.hist(y='values', bin_range=(-2, 2), title='Histogram in range -2 to 2')
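In NumPy terms, bin_range plays the role of the range argument of np.histogram: an integer number of bins is spread evenly over the given interval instead of over the data's minimum and maximum. The sketch below assumes that correspondence and uses a seeded generator for reproducibility.

```python
import numpy as np

rng = np.random.default_rng(0)  # seeded for reproducibility (assumption)
values = rng.normal(loc=0, scale=1, size=1000)

# 20 equal-width bins between -2 and 2, regardless of the data's extremes
counts, edges = np.histogram(values, bins=20, range=(-2, 2))

print(edges[0], edges[-1])  # -2.0 2.0
print(counts.sum())         # observations that fall within [-2, 2]
```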

Normalized histogram#

You can normalize the histogram with normed=True or normed='integral' to show density instead of raw counts. If normed='height', then the frequencies are normalized such that the max bin height is unity.

import hvplot.pandas  # noqa
import pandas as pd
import numpy as np

df = pd.DataFrame({'values': np.random.normal(loc=0, scale=1, size=1000)})

df.hvplot.hist(
    y='values', width=300,
    normed=True, title='Normalized hist (normed=True)'
) +\
df.hvplot.hist(
    y='values', width=300, shared_axes=False,
    normed='height', title='Normalized hist (normed="height")'
)

Note

normed=True is equivalent to density=True in np.histogram.
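The two normalizations can be written out by hand to make the difference concrete. The density formula below is what np.histogram computes with density=True. The 'height' formula (dividing by the largest count) is an assumption based on the description above, not taken from the hvPlot source.

```python
import numpy as np

rng = np.random.default_rng(0)  # seeded for reproducibility (assumption)
values = rng.normal(loc=0, scale=1, size=1000)

counts, edges = np.histogram(values, bins=20)
widths = np.diff(edges)

# 'integral' normalization: bar areas sum to one
density = counts / (counts.sum() * widths)
# 'height' normalization (assumed formula): tallest bar equals one
height = counts / counts.max()

np_density, _ = np.histogram(values, bins=20, density=True)
print(np.allclose(density, np_density))           # True
print(np.isclose((density * widths).sum(), 1.0))  # True
print(height.max())                               # 1.0
```

Density values depend on the bin widths, so they can exceed one for narrow bins, whereas height-normalized values always lie between zero and one.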

Cumulative histogram#

A histogram generated with cumulative=True shows a running total of counts up to and including each bin.

import hvplot.pandas  # noqa
import pandas as pd
import numpy as np

df = pd.DataFrame({'values': np.random.normal(loc=0, scale=1, size=1000)})

df.hvplot.hist(y='values', cumulative=True, title='Cumulative Histogram')
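The running total can be reproduced with a cumulative sum over ordinary bin counts. This is a minimal NumPy sketch of the idea, with a seeded generator assumed for reproducibility.

```python
import numpy as np

rng = np.random.default_rng(0)  # seeded for reproducibility (assumption)
values = rng.normal(loc=0, scale=1, size=1000)

counts, edges = np.histogram(values, bins=20)

# Each entry is the count in that bin plus all bins for smaller values
running = np.cumsum(counts)

print(running[-1])  # 1000
```

The last bin equals the total number of data points, as stated in the description of the cumulative parameter.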

Overlay and layout of histogram plots#

When setting y to a list of variables, the object returned is an overlay of the distribution of each variable (HoloViews NdOverlay object).

import hvplot.pandas  # noqa

df = hvplot.sampledata.penguins('pandas')

df.hvplot.hist(y=['bill_depth_mm', 'bill_length_mm'])

When subplots is set to True, the object returned is a layout (HoloViews NdLayout object).

import hvplot.pandas  # noqa

df = hvplot.sampledata.penguins('pandas')

df.hvplot.hist(y=['bill_depth_mm', 'bill_length_mm'], subplots=True, width=300)

by can also be used to generate an overlay or layout of histograms by setting it to one or more categorical variables.

import hvplot.pandas  # noqa

df = hvplot.sampledata.penguins('pandas')

df.hvplot.hist(y='body_mass_g', by='sex', alpha=0.5)

import hvplot.pandas  # noqa

df = hvplot.sampledata.penguins('pandas')

df.hvplot.hist(y='body_mass_g', by=['species', 'sex'], subplots=True, width=300).cols(2)

Xarray example#

import hvplot.xarray  # noqa

ds = hvplot.sampledata.air_temperature("xarray").sel(lon=285.)

ds.hvplot.hist()