hvPlot.hist#
- hvPlot.hist(y=None, by=None, bins=20, bin_range=None, normed=False, cumulative=False, **kwds)#
A histogram displays an approximate representation of the distribution of continuous data.
Reference: https://hvplot.holoviz.org/ref/api/manual/hvplot.hvPlot.hist.html
Plotting options: https://hvplot.holoviz.org/ref/plotting_options/index.html
- Parameters:
- y: string or sequence
Field(s) in the wide data to compute the distribution(s) from. The fields should contain continuous data, not categorical data.
- by: string or sequence
Field(s) in the long data to group by.
- bins: int or string or np.ndarray or list or tuple, optional
The number of bins in the histogram, an explicit set of bin edges, or a method to find the optimal set of bin edges, e.g. 'auto', 'fd', 'scott'. For more documentation on these approaches see the numpy.histogram_bin_edges() documentation. Default is 20.
- bin_range: tuple, optional
The lower and upper range of the bins. Default is the minimum and maximum values of the continuous data.
- normed: str or bool, optional
Controls normalization behavior. If True or 'integral', then density=True is passed to np.histogram, and the distribution is normalized such that the integral is unity. If False, then the frequencies will be raw counts. If 'height', then the frequencies are normalized such that the max bin height is unity. Default is False.
- cumulative: bool, optional
If True, then a histogram is computed where each bin gives the counts in that bin plus all bins for smaller values. The last bin gives the total number of data points. Default is False.
- kwds: optional
Additional keyword arguments are documented in Plotting Options. Run hvplot.help('hist') for the full method documentation.
- Returns:
holoviews.element.Histogram / Panel object
You can print the object to study its composition and run:
import holoviews as hv
hv.help(the_holoviews_object)
to learn more about its parameters and options.
References
Bokeh: https://docs.bokeh.org/en/latest/docs/examples/topics/stats/histogram.html
HoloViews: https://holoviews.org/reference/elements/bokeh/Histogram.html
Pandas: https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.plot.hist.html
Matplotlib: https://matplotlib.org/stable/api/_as_gen/matplotlib.pyplot.hist.html
Seaborn: https://seaborn.pydata.org/generated/seaborn.histplot.html
Backend-specific styling options#
Bokeh: alpha, cmap, color, fill_alpha, fill_color, hover_alpha, hover_color, hover_fill_alpha, hover_fill_color, hover_line_alpha, hover_line_cap, hover_line_color, hover_line_dash, hover_line_dash_offset, hover_line_join, hover_line_width, line_alpha, line_cap, line_color, line_dash, line_dash_offset, line_join, line_width, muted, muted_alpha, muted_color, muted_fill_alpha, muted_fill_color, muted_line_alpha, muted_line_cap, muted_line_color, muted_line_dash, muted_line_dash_offset, muted_line_join, muted_line_width, nonselection_alpha, nonselection_color, nonselection_fill_alpha, nonselection_fill_color, nonselection_line_alpha, nonselection_line_cap, nonselection_line_color, nonselection_line_dash, nonselection_line_dash_offset, nonselection_line_join, nonselection_line_width, selection_alpha, selection_color, selection_fill_alpha, selection_fill_color, selection_line_alpha, selection_line_cap, selection_line_color, selection_line_dash, selection_line_dash_offset, selection_line_join, selection_line_width, visible
Matplotlib: align, alpha, c, capsize, color, ec, ecolor, edgecolor, error_kw, facecolor, fc, hatch, linewidth, log, lw, visible
Examples#
Histograms are used to approximate the distribution of continuous data by dividing the range into bins and counting the number of observations in each bin.
Basic histogram plot#
import hvplot.pandas # noqa
import numpy as np
import pandas as pd
df = pd.DataFrame({'values': np.random.normal(loc=0, scale=1, size=1000)})
df.hvplot.hist()
Basic histogram with bins#
You can control the number of bins by setting it with an integer value.
import hvplot.pandas # noqa
import numpy as np
import pandas as pd
df = pd.DataFrame({'values': np.random.normal(loc=0, scale=1, size=1000)})
df.hvplot.hist(
y='values', width=300,
bins=30, title='Histogram (bins=30)'
) +\
df.hvplot.hist(
y='values', width=300, shared_axes=False,
bins=50, title='Histogram (bins=50)'
)
Or with a string value referencing one of the values accepted by the bins keyword of np.histogram_bin_edges.
import hvplot.pandas # noqa
import numpy as np
import pandas as pd
df = pd.DataFrame({'values': np.random.normal(loc=0, scale=1, size=1000)})
df.hvplot.hist(
y='values', width=300,
bins='auto', title='Histogram (bins="auto")'
) +\
df.hvplot.hist(
y='values', width=300, shared_axes=False,
bins='scott', title='Histogram (bins="scott")'
)
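To see what the named methods actually choose for a given sample, the edges can be computed directly with NumPy. This is a minimal sketch that assumes the string is resolved the same way np.histogram_bin_edges resolves it; the names `values` and `edges_by_method` are illustrative only.

```python
import numpy as np

# Seeded generator so the sketch is reproducible.
rng = np.random.default_rng(0)
values = rng.normal(loc=0, scale=1, size=1000)

# Compute the edges each named method would choose for this sample.
edges_by_method = {
    method: np.histogram_bin_edges(values, bins=method)
    for method in ("auto", "fd", "scott")
}

for method, edges in edges_by_method.items():
    # n edges delimit n - 1 equal-width bins spanning min..max of the data.
    print(f"{method}: {len(edges) - 1} bins, width {edges[1] - edges[0]:.3f}")
```

The number of bins depends on the sample, so different methods can produce visibly different histograms for the same data.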
Or with a list or 1D NumPy array of bin edges.
import hvplot.pandas # noqa
import numpy as np
import pandas as pd
df = pd.DataFrame({'values': np.random.normal(loc=0, scale=1, size=1000)})
df.hvplot.hist(
y='values', width=300,
bins=[-3+0.5*i for i in range(12)], title='Histogram (bins as a list)'
) +\
df.hvplot.hist(
y='values', width=300, shared_axes=False,
bins=np.arange(-3, 3, 0.25), title='Histogram (bins as a numpy array)'
)
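When edges are given explicitly, n edges define n - 1 bins, and the last edge is whatever the sequence ends on. The sketch below inspects the two edge sequences used above with plain NumPy; it assumes the binning follows np.histogram semantics, where values outside the outermost edges are not counted.

```python
import numpy as np

# The same edge sequences as in the example above.
list_edges = [-3 + 0.5 * i for i in range(12)]
array_edges = np.arange(-3, 3, 0.25)  # the stop value 3 is excluded

rng = np.random.default_rng(0)
values = rng.normal(loc=0, scale=1, size=1000)

counts_list, _ = np.histogram(values, bins=list_edges)
counts_array, _ = np.histogram(values, bins=array_edges)

# Number of edges versus number of bins, and the last edge of each sequence.
print(len(list_edges), len(counts_list), list_edges[-1])
print(len(array_edges), len(counts_array), array_edges[-1])

# Values outside the outermost edges are dropped by np.histogram.
print(counts_list.sum(), counts_array.sum(), len(values))
```

Because np.arange excludes its stop value, the array version ends at 2.75 rather than 3; np.linspace(-3, 3, 25) is one way to include both endpoints.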
Histogram with bin_range#
Setting bin_range limits the bins to a specific range of values.
import hvplot.pandas # noqa
import pandas as pd
import numpy as np
df = pd.DataFrame({'values': np.random.normal(loc=0, scale=1, size=1000)})
df.hvplot.hist(y='values', bin_range=(-2, 2), title='Histogram in range -2 to 2')
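The effect of restricting the range can be checked numerically. This sketch assumes bin_range behaves like the range argument of np.histogram: the bins span exactly the given interval, and observations outside it are left out of the counts.

```python
import numpy as np

rng = np.random.default_rng(0)
values = rng.normal(loc=0, scale=1, size=1000)

# 20 equal-width bins spanning exactly [-2, 2].
counts, edges = np.histogram(values, bins=20, range=(-2, 2))

# Observations that fall inside the requested range (both ends inclusive).
n_inside = int(((values >= -2) & (values <= 2)).sum())

print(edges[0], edges[-1])
print(counts.sum(), n_inside, len(values))
```

The total of the counts matches the number of observations inside the range, not the size of the full sample.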
Normalized histogram#
You can normalize the histogram with normed=True or normed='integral' to show density instead of raw counts. If normed='height', then the frequencies are normalized such that the max bin height is unity.
import hvplot.pandas # noqa
import pandas as pd
import numpy as np
df = pd.DataFrame({'values': np.random.normal(loc=0, scale=1, size=1000)})
df.hvplot.hist(
y='values', width=300,
normed=True, title='Normalized hist (normed=True)'
) +\
df.hvplot.hist(
y='values', width=300, shared_axes=False,
normed='height', title='Normalized hist (normed="height")'
)
Note
normed=True is equivalent to density=True in np.histogram.
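The two normalization modes can be reproduced with NumPy to make the difference concrete. This is a sketch under the assumption that 'height' divides the raw counts by the largest count, as the parameter description states; the variable names are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
values = rng.normal(loc=0, scale=1, size=1000)

counts, edges = np.histogram(values, bins=20)
widths = np.diff(edges)

# Integral normalization: bar areas (height * width) sum to 1.
density, _ = np.histogram(values, bins=20, density=True)
integral = float((density * widths).sum())

# Height normalization: the tallest bar is scaled to 1.
height = counts / counts.max()

print(integral)
print(height.max())
```

With density normalization the bar heights depend on the bin width, so they can exceed 1 for narrow bins; with height normalization they never do.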
Cumulative histogram#
A histogram generated with cumulative=True shows a running total of counts up to each bin.
import hvplot.pandas # noqa
import pandas as pd
import numpy as np
df = pd.DataFrame({'values': np.random.normal(loc=0, scale=1, size=1000)})
df.hvplot.hist(y='values', cumulative=True, title='Cumulative Histogram')
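The running total can be sketched with NumPy as a cumulative sum over the per-bin counts. This illustrates the behavior described for cumulative=True and is not taken from the hvPlot implementation.

```python
import numpy as np

rng = np.random.default_rng(0)
values = rng.normal(loc=0, scale=1, size=1000)

counts, edges = np.histogram(values, bins=20)

# Each entry is the count in that bin plus all bins for smaller values.
cumulative = np.cumsum(counts)

print(cumulative[-1], len(values))
```

The cumulative counts never decrease, and the last bin equals the total number of data points.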
Overlay and layout of histogram plots#
When setting y to a list of variables, the object returned is an overlay of the distribution of each variable (HoloViews NdOverlay object).
import hvplot.pandas # noqa
df = hvplot.sampledata.penguins('pandas')
df.hvplot.hist(y=['bill_depth_mm', 'bill_length_mm'])
When subplots is set to True, the object returned is a layout (HoloViews NdLayout object).
import hvplot.pandas # noqa
df = hvplot.sampledata.penguins('pandas')
df.hvplot.hist(y=['bill_depth_mm', 'bill_length_mm'], subplots=True, width=300)
by can also be used to generate an overlay or layout of histograms, by setting it to one or more categorical variables.
import hvplot.pandas # noqa
df = hvplot.sampledata.penguins('pandas')
df.hvplot.hist(y='body_mass_g', by='sex', alpha=0.5)
import hvplot.pandas # noqa
df = hvplot.sampledata.penguins('pandas')
df.hvplot.hist(y='body_mass_g', by=['species', 'sex'], subplots=True, width=300).cols(2)
Xarray example#
import hvplot.xarray # noqa
ds = hvplot.sampledata.air_temperature("xarray").sel(lon=285.)
ds.hvplot.hist()