Statistical Plots
In addition to the plots available via the plot interface, hvPlot provides several more sophisticated statistical plots modelled on pandas.plotting. To explore these, we will load the iris and stocks datasets from Bokeh:
import pandas as pd
import hvplot.pandas # noqa
from bokeh.sampledata import iris, stocks
iris = iris.flowers
Scatter Matrix
When working with multi-dimensional data, it is often difficult to understand the relationships among the variables. A scatter_matrix makes it possible to visualize all of the pairwise relationships in a compact format. hvplot.scatter_matrix is closely modelled on pandas.plotting.scatter_matrix:
hvplot.scatter_matrix(iris, c="species")
Compared to a static Seaborn/Matplotlib-based plot, here it is easy to explore the data interactively thanks to Bokeh’s linked zooming, linked panning, and linked brushing (using the box_select and lasso_select tools).
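To make the "all pairwise relationships" idea concrete, the following minimal sketch enumerates the panels of a scatter matrix for the four numeric iris columns. It uses only the standard library and does not call hvPlot. The column names are assumed to match the Bokeh iris sample data, and the variable names (panels, diagonal, unique_pairs) are purely illustrative.

```python
from itertools import combinations

# Numeric columns of the iris data (names assumed to match bokeh.sampledata.iris)
columns = ["sepal_length", "sepal_width", "petal_length", "petal_width"]

# A scatter matrix is a square grid: one panel per (row variable, column variable)
panels = [(y, x) for y in columns for x in columns]

# Diagonal panels pair a variable with itself; they typically show its distribution
diagonal = [(y, x) for (y, x) in panels if y == x]

# Each distinct pair of variables appears twice, once on each side of the diagonal
unique_pairs = list(combinations(columns, 2))

print(len(panels), len(diagonal), len(unique_pairs))  # 16 4 6
```

With d variables the grid has d × d panels but only d(d − 1)/2 distinct pairs, which is why the matrix stays readable for a handful of variables and quickly becomes crowded beyond that.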
Parallel Coordinates
Parallel coordinate plots provide another way of visualizing multivariate data. hvplot.parallel_coordinates provides a simple API to create such a plot, modelled on the API of pandas.plotting.parallel_coordinates():
hvplot.parallel_coordinates(iris, "species")
The plot quickly clarifies the relationship between different variables, highlighting how the “setosa” species differs from the others in the petal width and petal length dimensions.
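As a rough illustration of what a parallel coordinates plot draws, the sketch below turns one observation into a polyline with one vertex per variable. It is a standard-library sketch, not hvPlot's implementation. The three rows are hard-coded for illustration (one per species, assumed to match the classic iris measurements), and polyline is a hypothetical helper name.

```python
# Axis order: one vertical axis per numeric column
columns = ["sepal_length", "sepal_width", "petal_length", "petal_width"]

# One sample row per species, hard-coded for illustration (assumed values)
rows = {
    "setosa": [5.1, 3.5, 1.4, 0.2],
    "versicolor": [7.0, 3.2, 4.7, 1.4],
    "virginica": [6.3, 3.3, 6.0, 2.5],
}


def polyline(values):
    """Return the (axis_index, value) vertices of one observation's line."""
    return list(enumerate(values))


for species, values in rows.items():
    # Each observation becomes a line connecting its value on every axis
    print(species, polyline(values))
```

In these sample rows the setosa line drops much lower than the other two on the petal_length and petal_width axes (indices 2 and 3), which is the kind of separation the plot makes visible across the whole dataset.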
Andrews Curves
A similar approach is to use Andrews curves, which map each observation to a curve by using its features as the coefficients of a Fourier series, making aggregate differences between classes visible. The hvplot.andrews_curves() function provides a simple API to generate Andrews curves from a dataframe, closely matching the API of pandas.plotting.andrews_curves():
hvplot.andrews_curves(iris, "species")
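To show how a single observation becomes a curve, here is a minimal standard-library sketch of the commonly used Andrews formula, f(t) = x1/sqrt(2) + x2 sin(t) + x3 cos(t) + x4 sin(2t) + x5 cos(2t) + ..., evaluated for t between -pi and pi. The function name andrews_curve, the sample row, and the number of sample points are assumptions made for illustration; the exact internals of hvPlot may differ in detail.

```python
import math


def andrews_curve(features, t):
    """Evaluate the Andrews curve of one observation at angle t (radians)."""
    # The first feature is the constant term, scaled by 1/sqrt(2)
    value = features[0] / math.sqrt(2)
    # Remaining features alternate between sin and cos terms;
    # the harmonic increases by one after every sin/cos pair
    for i, x in enumerate(features[1:]):
        harmonic = i // 2 + 1
        if i % 2 == 0:
            value += x * math.sin(harmonic * t)
        else:
            value += x * math.cos(harmonic * t)
    return value


# One observation, hard-coded for illustration (assumed iris setosa row)
setosa = [5.1, 3.5, 1.4, 0.2]

# Sample the curve at evenly spaced points in [-pi, pi]
n_samples = 5
ts = [-math.pi + 2 * math.pi * k / (n_samples - 1) for k in range(n_samples)]
curve = [andrews_curve(setosa, t) for t in ts]

for t, y in zip(ts, curve):
    print(f"t={t:+.3f}  f(t)={y:+.3f}")
```

Observations with similar feature values produce curves that stay close together over the whole interval, so a class whose features differ systematically shows up as a visibly separate bundle of curves.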