Bar#
import pandas as pd
import hvplot.pandas # noqa
Introduction#
A bar plot represents categorical data with rectangular bars whose heights are proportional to the numerical values they represent.
The x-axis represents the categories and the y-axis represents the numerical value scale.
The bars are of equal width, which makes the values easy to compare at a glance.
pd.DataFrame({
"framework": ["hvPlot", "HoloViews", "Panel"],
"stars": [700, 2400, 2600]
}).hvplot.bar(x="framework", y="stars", color="gold", title="Bar Plot of GitHub Stars", ylabel="⭐")
Data#
Let’s import some data.
from bokeh.sampledata.autompg import autompg_clean as autompg
autompg.head()
| | mpg | cyl | displ | hp | weight | accel | yr | origin | name | mfr |
|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 18.0 | 8 | 307.0 | 130 | 3504 | 12.0 | 70 | North America | chevrolet chevelle malibu | chevrolet |
| 1 | 15.0 | 8 | 350.0 | 165 | 3693 | 11.5 | 70 | North America | buick skylark 320 | buick |
| 2 | 18.0 | 8 | 318.0 | 150 | 3436 | 11.0 | 70 | North America | plymouth satellite | plymouth |
| 3 | 16.0 | 8 | 304.0 | 150 | 3433 | 12.0 | 70 | North America | amc rebel sst | amc |
| 4 | 17.0 | 8 | 302.0 | 140 | 3449 | 10.5 | 70 | North America | ford torino | ford |
We define long-form data, i.e. one row per yr categorical value.
autompg_long_form = autompg.groupby("yr").mean(numeric_only=True).reset_index()
autompg_long_form.head()
| | yr | mpg | cyl | displ | hp | weight | accel |
|---|---|---|---|---|---|---|---|
| 0 | 70 | 17.689655 | 6.758621 | 281.413793 | 147.827586 | 3372.793103 | 12.948276 |
| 1 | 71 | 21.111111 | 5.629630 | 213.888889 | 107.037037 | 3030.592593 | 15.000000 |
| 2 | 72 | 18.714286 | 5.821429 | 218.375000 | 120.178571 | 3237.714286 | 15.125000 |
| 3 | 73 | 17.100000 | 6.375000 | 256.875000 | 130.475000 | 3419.025000 | 14.312500 |
| 4 | 74 | 22.769231 | 5.230769 | 170.653846 | 94.230769 | 2878.038462 | 16.173077 |
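As a minimal sketch of the same aggregation on a small hand-made table, the snippet below shows that grouping by yr and averaging yields exactly one row per distinct year. The toy DataFrame and its values are invented for illustration and are not taken from autompg.

```python
import pandas as pd

# Hypothetical toy data, not taken from the autompg dataset.
toy = pd.DataFrame({
    "yr": [70, 70, 71, 71, 71],
    "mpg": [18.0, 16.0, 25.0, 27.0, 23.0],
    "name": ["a", "b", "c", "d", "e"],
})

# numeric_only=True drops the non-numeric "name" column before averaging;
# reset_index() turns the "yr" group key back into a regular column.
toy_long_form = toy.groupby("yr").mean(numeric_only=True).reset_index()

print(toy_long_form)
```

The result has two rows (one per yr) and only the yr and mpg columns, which is the shape the x and y arguments expect.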
We define a dataset with a MultiIndex representing multiple categories (yr and origin).
autompg_multi_index = autompg.query("yr<=80").groupby(['yr', 'origin']).mean(numeric_only=True)
autompg_multi_index.head()
| yr | origin | mpg | cyl | displ | hp | weight | accel |
|---|---|---|---|---|---|---|---|
| 70 | Asia | 25.500000 | 4.000000 | 105.000000 | 91.500000 | 2251.0 | 14.750000 |
| | Europe | 25.200000 | 4.000000 | 107.800000 | 86.200000 | 2309.2 | 16.500000 |
| | North America | 15.272727 | 7.636364 | 336.909091 | 166.954545 | 3716.5 | 11.977273 |
| 71 | Asia | 29.500000 | 4.000000 | 88.250000 | 79.250000 | 1936.0 | 16.375000 |
| | Europe | 28.750000 | 4.000000 | 95.000000 | 74.000000 | 2024.0 | 16.750000 |
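To see how the grouping keys end up as index levels, here is a small sketch on invented data. The toy DataFrame is hypothetical; only the groupby pattern mirrors the cell above.

```python
import pandas as pd

# Hypothetical toy data, not taken from the autompg dataset.
toy = pd.DataFrame({
    "yr": [70, 70, 70, 71, 71],
    "origin": ["Asia", "Europe", "Europe", "Asia", "Europe"],
    "mpg": [25.0, 24.0, 26.0, 30.0, 28.0],
})

# Grouping by two keys produces a two-level MultiIndex.
toy_multi_index = toy.groupby(["yr", "origin"]).mean(numeric_only=True)

# The keys become the index levels in the order given: "yr" is the
# outer level and "origin" is the inner (nested) level.
print(toy_multi_index.index.names)
print(toy_multi_index.index.nlevels)
```

Checking index.names before plotting is a quick way to confirm which level is the outer category and which is the nested one.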
We define wide-form data, i.e. multiple columns, each representing one value of a category such as origin.
autompg_wide = autompg_multi_index.reset_index().pivot(index='yr', columns='origin', values='mpg')
autompg_wide.head()
| yr | Asia | Europe | North America |
|---|---|---|---|
| 70 | 25.500000 | 25.20 | 15.272727 |
| 71 | 29.500000 | 28.75 | 17.736842 |
| 72 | 24.200000 | 22.00 | 16.277778 |
| 73 | 20.000000 | 24.00 | 15.034483 |
| 74 | 29.333333 | 27.00 | 18.142857 |
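The relationship between the long and wide layouts can be sketched with a round trip on a few values copied from the tables above (Asia and Europe, years 70 and 71). The names toy_long, toy_wide, and toy_back are illustrative, and using melt for the reverse step is one possible approach rather than something this page prescribes.

```python
import pandas as pd

# A few mpg values copied from the tables above (Asia/Europe, 70 and 71).
toy_long = pd.DataFrame({
    "yr": [70, 70, 71, 71],
    "origin": ["Asia", "Europe", "Asia", "Europe"],
    "mpg": [25.5, 25.2, 29.5, 28.75],
})

# Long -> wide: one row per yr, one column per origin.
toy_wide = toy_long.pivot(index="yr", columns="origin", values="mpg")

# Wide -> long: fold the origin columns back into a single column.
toy_back = toy_wide.reset_index().melt(
    id_vars="yr", var_name="origin", value_name="mpg"
)

print(toy_wide)
print(toy_back)
```

Both layouts hold the same numbers; which one is more convenient depends on whether the categories are easier to pass as index levels or as a list of columns.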
Basic Bar Plots#
You can plot long-form data by specifying the categorical x-value with the x argument and the numerical y-value with the y argument.
autompg_long_form.hvplot.bar(x="yr", y="mpg", width=1000)
If you don’t specify the x argument, the index is used.
autompg_long_form.hvplot.bar(y="mpg", width=1000)
When the index is a MultiIndex, the x-axis shows the categories from every index level, with the outer index level displayed as the outer category.
autompg_multi_index.hvplot.bar(width=1000, rot=90)
Alternatively, you can stack the values of the nested index level (origin in this example) on the y-axis by setting stacked to True.
autompg_multi_index.hvplot.bar(stacked=True, width=1000, legend="top_left", height=500)
To plot multiple categories on the x-axis when the data is in wide form, provide a list of columns to y.
autompg_wide.hvplot.bar(y=['Asia', 'Europe', 'North America'], width=1000, ylabel="mpg", rot=90)
You can also stack the values of wide-form data.
autompg_wide.hvplot.bar(y=['Asia', 'Europe', 'North America'], ylabel="mpg", stacked=True, width=1000, legend="top_left", height=500)
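When sizing a stacked plot, it can help to know how tall each stack will be. Assuming all values are non-negative, the segments are drawn on top of one another, so the total height of each bar should equal the row-wise sum of the stacked columns. The sketch below computes those totals with plain pandas on the first two rows of the wide-form table shown earlier; the names toy_wide and stack_totals are illustrative.

```python
import pandas as pd

# First two rows of the wide-form table shown earlier (yr 70 and 71).
toy_wide = pd.DataFrame(
    {
        "Asia": [25.5, 29.5],
        "Europe": [25.2, 28.75],
        "North America": [15.272727, 17.736842],
    },
    index=pd.Index([70, 71], name="yr"),
)

# Sum across the columns passed to y: one total per yr.
stack_totals = toy_wide[["Asia", "Europe", "North America"]].sum(axis=1)

print(stack_totals)
```

Note that a sum of per-origin mpg averages has no physical meaning of its own; the totals are only useful here for judging the extent of the y-axis.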
Colorful Bar Plots#
You can control the bar color using the color argument. It accepts the name of a column, the name of a color, or a list of colors.
Here is an example using a single named color.
autompg_long_form.hvplot.bar(x="yr", y="mpg", color="teal", width=1000)
Here is an example using a list of colors.
autompg_wide.hvplot.bar(y=['Asia', 'Europe', 'North America'], width=1000, ylabel="mpg", color=["#ba2649", "#ffa7ca", "#1a6b54"], rot=90)
Here is an example using the name of a column.
autompg_long_form.hvplot.bar(y='mpg', color="weight", colorbar=True, clabel="Weight", cmap="bmy", width=1000)