
Hist

import hvplot.pandas  # noqa

# hvplot.extension("matplotlib")

A histogram (hist) is often a good first look at continuous data because it gives a sense of the distribution. Related methods include kde (also available as density).

from bokeh.sampledata.autompg import autompg_clean

autompg_clean.sample(n=5)
      mpg  cyl  displ  hp  weight  accel  yr  origin         name                         mfr
236  33.5    4   98.0  83    2075   15.9  77  North America  dodge colt m/m               dodge
 81  23.0    4  120.0  97    2506   14.5  72  Asia           toyouta corona mark ii (sw)  toyota
118  20.0    4  114.0  91    2582   14.0  73  Europe         audi 100ls                   audi
276  31.5    4   89.0  71    1990   14.9  78  Europe         volkswagen scirocco          volkswagen
176  23.0    4  120.0  88    2957   17.0  75  Europe         peugeot 504                  peugeot
autompg_clean.hvplot.hist("weight")
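
As a rough sketch of what a histogram computes, the snippet below bins the weight values of the five sample rows shown above into three equal-width bins with NumPy. The choice of three bins and the direct use of numpy.histogram are assumptions made for illustration; this is not a description of how hvPlot computes bins internally.

```python
import numpy as np

# Weights of the five sample rows shown above (not the full dataset).
weights = np.array([2075, 2506, 2582, 1990, 2957])

# Split the observed range [1990, 2957] into 3 equal-width bins
# and count how many values fall into each one.
counts, edges = np.histogram(weights, bins=3)

print(counts.tolist())  # [2, 2, 1]
print(len(edges))       # 4 edges delimit 3 bins
```

The bar heights in a histogram correspond to counts like these, one bar per bin.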

When the by option is used, the plots are overlaid by default. To create subplots instead, use subplots=True.

autompg_clean.hvplot.hist("weight", by="origin", subplots=True, width=250)
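
To make the effect of grouping concrete, here is a minimal sketch that computes one set of bin counts per origin, using only the five sample rows shown earlier. Sharing the same bin edges across groups is an assumption made here so the groups are comparable; it is not a claim about what hvPlot does internally.

```python
import numpy as np
import pandas as pd

# Weight and origin of the five sample rows shown earlier.
df = pd.DataFrame({
    "weight": [2075, 2506, 2582, 1990, 2957],
    "origin": ["North America", "Asia", "Europe", "Europe", "Europe"],
})

# Assumption: compute the bin edges once on all rows so every group
# is counted against the same bins.
edges = np.histogram_bin_edges(df["weight"], bins=3)

# One array of bin counts per origin, i.e. one histogram per group.
per_group = {
    origin: np.histogram(group["weight"], bins=edges)[0].tolist()
    for origin, group in df.groupby("origin")
}

print(per_group)
```

Each entry in per_group corresponds to one histogram, which can then be drawn either on top of the others or in its own subplot.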

You can also plot histograms of datetime data:

import pandas as pd
from bokeh.sampledata.commits import data as commits

commits = commits.reset_index().sort_values("datetime")
commits.head(3)
      datetime                   day  time
4915  2012-12-29 11:57:50-06:00  Sat  11:57:50
4914  2013-01-02 17:46:43-06:00  Wed  17:46:43
4913  2013-01-03 16:28:49-06:00  Thu  16:28:49
commits.hvplot.hist(
    "datetime",
    bin_range=(pd.Timestamp('2012-11-30'), pd.Timestamp('2017-05-01')),
    bins=54,
)
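
To see what the bin_range and bins values above imply, the sketch below works out the bin width and the bin edges with pandas alone. It assumes equal-width bins between the two timestamps; the variable names are illustrative and not part of the hvPlot API.

```python
import pandas as pd

# Same range and bin count as in the call above.
start = pd.Timestamp("2012-11-30")
end = pd.Timestamp("2017-05-01")
n_bins = 54

# Width of one bin if the range is split into equal parts.
bin_width = (end - start) / n_bins

# n_bins bins are delimited by n_bins + 1 evenly spaced edges.
edges = pd.date_range(start, end, periods=n_bins + 1)

print(bin_width)   # 29 days 20:53:20
print(len(edges))  # 55
```

With these values each bin spans a little under 30 days, so the histogram shows roughly one bar per month.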

To plot the distribution of a categorical column, compute the counts with the pandas value_counts method and plot them with .hvplot.bar.

autompg_clean["mfr"].value_counts().hvplot.bar(invert=True, flip_yaxis=True, height=500)
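
As a small illustration of what value_counts returns, the sketch below counts the origin values of the five sample rows shown near the top of this page. It uses a hand-built Series rather than the full dataset, so the numbers are only those of the sample.

```python
import pandas as pd

# Origin of the five sample rows shown near the top of this page.
origin = pd.Series(
    ["North America", "Asia", "Europe", "Europe", "Europe"],
    name="origin",
)

# One row per distinct category, sorted by descending count.
counts = origin.value_counts()

print(counts.index[0], counts.iloc[0])  # Europe 3
```

The result is a Series indexed by category, which is the shape of data that the bar plot above draws: one bar per category, with the count as its length.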