hvplot.plotting.scatter_matrix#
- hvplot.plotting.scatter_matrix(data, c=None, chart='scatter', diagonal='hist', alpha=0.5, nonselection_alpha=0.1, tools=None, cmap=None, colormap=None, diagonal_kwds=None, hist_kwds=None, density_kwds=None, datashade=False, rasterize=False, dynspread=False, spread=False, **kwds)[source]#
Scatter matrix of numeric columns.
A scatter_matrix shows all the pairwise relationships between the columns. Each off-diagonal plot shows the corresponding pair of columns plotted against each other, while each diagonal plot shows the distribution of an individual column.
This function is closely modelled on pandas.plotting.scatter_matrix().
- Parameters:
- data : DataFrame
The data to plot. Every column is compared to every other column.
- c : str, optional
Column to color by.
- chart : str, optional
Chart type for the off-diagonal plots (one of 'scatter', 'bivariate', 'hexbin'). Default is 'scatter'.
- diagonal : str, optional
Chart type for the diagonal plots (one of 'hist', 'kde'). Default is 'hist'.
- alpha : float, optional
Transparency level for the off-diagonal plots. Default is 0.5.
- nonselection_alpha : float, optional
Transparency level for nonselected objects in the off-diagonal plots. Default is 0.1.
- tools : list of str, optional
Interaction tools to include. Defaults are 'box_select' and 'lasso_select'.
- cmap/colormap : str or colormap object, optional
Colormap to use when c is set. Default is Category10 (see d3/d3-3.x-api-reference).
- diagonal_kwds/hist_kwds/density_kwds : dict, optional
Keyword options for the diagonal plots.
- datashade : bool, default=False
Whether to apply rasterization and shading (colormapping) using the Datashader library, returning an RGB object instead of individual points.
- rasterize : bool, default=False
Whether to apply rasterization using the Datashader library, returning an aggregated Image (to be colormapped by the plotting backend) instead of individual points.
- dynspread : bool, default=False
For plots generated with datashade=True or rasterize=True, automatically increase the point size when the data is sparse so that individual points become more visible. Supported kwds include max_px, threshold, shape, how and mask.
- spread : bool, default=False
For plots generated with datashade=True or rasterize=True, increase the point size to make points more visible by applying a fixed spreading of a certain number of cells/pixels. Supported kwds include px, shape, how and mask.
- **kwds : optional
Keyword options for the off-diagonal plots and for Datashader's spreading.
- Returns:
- obj : HoloViews object
The HoloViews representation of the plot.
See also
pandas.plotting.scatter_matrix()
Equivalent pandas function.
Examples#
Basic scatter matrix plot#
This example shows how to create a simple scatter matrix plot from a DataFrame. The plot includes all the columns of the DataFrame.
Tip
Enable the Box Select tool and select an area on one of the off-diagonal scatter plots. The selection is automatically displayed on the other scatter plots, which makes it easier to explore and understand the dataset. This feature is called linked brushing (read more about it in this HoloViews user guide). It is purely a front-end functionality and does not require a live Python process. It also works with the Lasso Select tool.
import hvplot
import numpy as np
import pandas as pd
df = pd.DataFrame(np.random.randn(200, 4), columns=['A','B','C','D'])
hvplot.plotting.scatter_matrix(df)
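To make the layout of the grid concrete, the short sketch below counts the panels for the four-column DataFrame used above. It uses only the standard library and does not call hvplot. It assumes one panel per ordered pair of columns, which matches the description of diagonal and off-diagonal plots given earlier; the variable names are illustrative only.

```python
from itertools import product

columns = ['A', 'B', 'C', 'D']  # same column names as the example above
n_rows = 200                    # number of rows in that DataFrame

# Assumption: every ordered (x, y) pair of columns gets one panel in the grid.
panels = list(product(columns, repeat=2))
diagonal = [(x, y) for x, y in panels if x == y]      # distribution plots
off_diagonal = [(x, y) for x, y in panels if x != y]  # pairwise plots

# Each off-diagonal panel draws one point per row.
n_points = len(off_diagonal) * n_rows

print(len(panels), len(diagonal), len(off_diagonal), n_points)
# prints: 16 4 12 2400
```

The number of rendered points therefore grows with both the number of rows and the square of the number of columns, which is worth keeping in mind for wide DataFrames.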
Customize the off-diagonal plot types#
The chart keyword changes the type of the off-diagonal plots.
import hvplot
import numpy as np
import pandas as pd
df = pd.DataFrame(np.random.randn(100, 2), columns=['A','B'])
hvplot.plotting.scatter_matrix(df, chart='bivariate') +\
hvplot.plotting.scatter_matrix(df, chart='hexbin')
Customize the diagonal plot types#
The diagonal parameter changes the type of the diagonal plots.
import hvplot
import numpy as np
import pandas as pd
df = pd.DataFrame(np.random.randn(100, 2), columns=['A','B'])
hvplot.plotting.scatter_matrix(df, diagonal='kde')
Control the tools#
Setting tools to include a selection tool like box_select and an inspection tool like hover permits further analysis.
import hvplot
import numpy as np
import pandas as pd
df = pd.DataFrame(np.random.randn(100, 2), columns=['A','B'])
hvplot.plotting.scatter_matrix(df, tools=['box_select', 'hover'])
Color the data per group#
The c parameter colors the data by a given column, here 'CAT'. Note also that the diagonal_kwds parameter (equivalent to hist_kwds in this case, or to density_kwds for kde plots) lets you customize the diagonal plots.
import hvplot
import numpy as np
import pandas as pd
df = pd.DataFrame(np.random.randn(100, 2), columns=['A','B'])
df['CAT'] = np.random.choice(['X', 'Y', 'Z'], len(df))
hvplot.plotting.scatter_matrix(df, c='CAT', diagonal_kwds=dict(alpha=0.3))
Display large data#
Scatter matrix plots may end up with a large number of points to render, which can slow down the browser or even crash it. In that case, consider setting the rasterize (or datashade) option to True. It uses Datashader to render the off-diagonal plots on the backend and then sends more efficient image-based representations to the browser.
The following scatter matrix plot has 1,200,000 (12x100,000) points that are rendered efficiently by Datashader.
import hvplot
import numpy as np
import pandas as pd
df = pd.DataFrame(np.random.randn(100_000, 4), columns=['A','B','C','D'])
hvplot.plotting.scatter_matrix(df, rasterize=True)