hvplot.plotting.parallel_coordinates#
- hvplot.plotting.parallel_coordinates(data, class_column, cols=None, alpha=0.5, width=600, height=300, var_name='variable', value_name='value', cmap=None, colormap=None, **kwds)
Parallel coordinates plotting.
To show a set of points in an n-dimensional space, a backdrop is drawn consisting of n parallel lines. A point in n-dimensional space is represented as a polyline with vertices on the parallel axes; the position of the vertex on the i-th axis corresponds to the i-th coordinate of the point.
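The mapping described above can be sketched in a few lines of plain Python. This is only an illustration of the geometry, not hvPlot's implementation; the helper name `polyline_vertices` is hypothetical.

```python
def polyline_vertices(point):
    """Return the (axis_index, value) vertices of the polyline for one point.

    The vertex on the i-th parallel axis sits at the i-th coordinate of the point.
    """
    return [(i, value) for i, value in enumerate(point)]


# A single point in 4-dimensional space becomes a polyline with 4 vertices.
point = (5.1, 3.5, 1.4, 0.2)
vertices = polyline_vertices(point)
print(vertices)
```

Each row of the DataFrame yields one such polyline, and the lines are colored by the value in `class_column`.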
- Parameters:
- data : DataFrame
The DataFrame to be plotted.
- class_column : str
Column name containing class names.
- cols : list, optional
A list of column names to use.
- alpha : float, optional
The transparency of the lines. Default is 0.5.
- cmap/colormap : str or colormap object, optional
Colormap to use for groups. Defaults to Colorcet’s glasbey_category10.
- Returns:
- obj : HoloViews object
The HoloViews representation of the plot.
See also
pandas.plotting.parallel_coordinates: Matplotlib version of this routine.
Examples#
Basic parallel coordinates plot#
This example shows how to create a simple parallel coordinates plot from a dataframe with 4 features and a categorical column.
import hvplot
import numpy as np
import pandas as pd
np.random.seed(42)
df = pd.DataFrame({
    'feature_1': np.random.rand(10) * 20,
    'feature_2': np.random.rand(10) * 10,
    'feature_3': np.random.randint(0, 100, 10),
    'feature_4': np.random.normal(0, 1, 10),
    'class': np.random.choice(['A', 'B', 'C'], 10)
})
hvplot.plotting.parallel_coordinates(df, class_column='class')
Example with penguins#
In this example we use four features from the penguins dataset and examine how they relate to species. We can see, for instance, that Gentoo penguins dominate the flipper_length_mm feature, having the longest flippers.
Note
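It is important to normalize the features before plotting them, because they are measured on very different scales (for example, body_mass_g in grams versus bill_depth_mm in millimetres). This example leverages scikit-learn and its MinMaxScaler transform.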
import hvplot
import pandas as pd
from sklearn.preprocessing import MinMaxScaler
df = hvplot.sampledata.penguins("pandas")
cols = ["bill_length_mm", "bill_depth_mm", "flipper_length_mm", "body_mass_g"]
# Rescale each feature to the [0, 1] range
scaler = MinMaxScaler()
scaled_features = scaler.fit_transform(df[cols])
df_scaled = pd.DataFrame(scaled_features, columns=cols)
# Reattach the class column used to color the lines
df_scaled["species"] = df["species"]
hvplot.plotting.parallel_coordinates(df_scaled, class_column="species")
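If scikit-learn is not available, the same min-max normalization can be sketched with pandas alone. The small DataFrame and its column names below are made up for illustration; the arithmetic matches MinMaxScaler with its default (0, 1) feature range.

```python
import pandas as pd

# Hypothetical data: two numeric features on different scales plus a label
df = pd.DataFrame({
    "a": [1.0, 2.0, 3.0],
    "b": [10.0, 30.0, 20.0],
    "label": ["x", "y", "z"],
})
cols = ["a", "b"]

# Min-max scaling: (value - column minimum) / (column maximum - column minimum)
scaled = df.copy()
scaled[cols] = (df[cols] - df[cols].min()) / (df[cols].max() - df[cols].min())
print(scaled)
```

Note that a constant column would make the denominator zero and produce NaN values in this sketch, so such columns need separate handling.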
With Matplotlib#
Parallel coordinates plots can quickly become pretty large and slow to explore with the Bokeh plotting backend. This example shows how to render such a plot with Matplotlib.
import hvplot
import pandas as pd
from sklearn.preprocessing import MinMaxScaler
hvplot.extension("matplotlib")
df = hvplot.sampledata.penguins("pandas")
cols = ["bill_length_mm", "bill_depth_mm", "flipper_length_mm", "body_mass_g"]
scaler = MinMaxScaler()
scaled_features = scaler.fit_transform(df[cols])
df_scaled = pd.DataFrame(scaled_features, columns=cols)
df_scaled["species"] = df["species"]
hvplot.plotting.parallel_coordinates(df_scaled, class_column="species")
# Switch back to the Bokeh backend for subsequent plots
hvplot.output(backend="bokeh")
Customize#
parallel_coordinates offers multiple options to customize the plot: cols, var_name, value_name, cmap and alpha.
import hvplot
import numpy as np
import pandas as pd
np.random.seed(42)
df = pd.DataFrame({
    'feature_1': np.random.rand(10) * 20,
    'feature_2': np.random.rand(10) * 10,
    'feature_3': np.random.randint(0, 100, 10),
    'feature_4': np.random.normal(0, 1, 10),
    'class': np.random.choice(['A', 'B', 'C'], 10)
})
hvplot.plotting.parallel_coordinates(
    df, class_column='class', cols=['feature_2', 'feature_3', 'feature_4'],
    var_name='Species', value_name='Scaled', cmap='Set1', alpha=0.8,
)
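The var_name and value_name options share their names with parameters of pandas' DataFrame.melt, which reshapes a wide table into long form. The sketch below shows what those two names label in such a reshaping. That hvPlot reshapes the data this way internally is an assumption here, suggested only by the shared parameter names; the toy DataFrame is made up for illustration.

```python
import pandas as pd

# Hypothetical wide-format data: one column per feature, plus the class column
wide = pd.DataFrame({
    "feature_1": [1.0, 2.0],
    "feature_2": [3.0, 4.0],
    "class": ["A", "B"],
})

# Long format: one row per (class, feature, value) triple.
# var_name labels the column holding feature names,
# value_name labels the column holding the measurements.
long = wide.melt(id_vars="class", var_name="variable", value_name="value")
print(long)
```

Under that assumption, var_name would name the dimension listing the features and value_name the dimension holding their values.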