hvplot.plotting.scatter_matrix#

hvplot.plotting.scatter_matrix(data, c=None, chart='scatter', diagonal='hist', alpha=0.5, nonselection_alpha=0.1, tools=None, cmap=None, colormap=None, diagonal_kwds=None, hist_kwds=None, density_kwds=None, datashade=False, rasterize=False, dynspread=False, spread=False, **kwds)

Scatter matrix of numeric columns.

A scatter matrix shows all the pairwise relationships between the columns. Each off-diagonal plot shows the corresponding pair of columns plotted against each other, while each diagonal plot shows the distribution of an individual column.

This function is closely modelled on pandas.plotting.scatter_matrix().

Parameters:
data : DataFrame

The data to plot. Every column is compared to every other column.

c : str, optional

Column to color by.

chart : str, optional

Chart type for the off-diagonal plots (one of 'scatter', 'bivariate', 'hexbin'). Default is 'scatter'.

diagonal : str, optional

Chart type for the diagonal plots (one of 'hist', 'kde'). Default is 'hist'.

alpha : float, optional

Transparency level for the off-diagonal plots. Default is 0.5.

nonselection_alpha : float, optional

Transparency level for non-selected points in the off-diagonal plots. Default is 0.1.

tools : list of str, optional

Interaction tools to include. Defaults are 'box_select' and 'lasso_select'.

cmap/colormap : str or colormap object, optional

Colormap to use when c is set. Default is Category10 (see d3/d3-3.x-api-reference).

diagonal_kwds/hist_kwds/density_kwds : dict, optional

Keyword options for the diagonal plots.

datashade : bool, default=False

Whether to apply rasterization and shading (colormapping) using the Datashader library, returning an RGB object instead of individual points.

rasterize : bool, default=False

Whether to apply rasterization using the Datashader library, returning an aggregated Image (to be colormapped by the plotting backend) instead of individual points.

dynspread : bool, default=False

For plots generated with datashade=True or rasterize=True, automatically increase the point size when the data is sparse so that individual points become more visible. kwds supported include max_px, threshold, shape, how and mask.

spread : bool, default=False

For plots generated with datashade=True or rasterize=True, increase the point size by a fixed number of cells/pixels so that points become more visible. kwds supported include px, shape, how and mask.

**kwds : optional

Keyword options for the off-diagonal plots and for Datashader’s spreading operations.

Returns:
obj : HoloViews object

The HoloViews representation of the plot.

See also

pandas.plotting.scatter_matrix()

Equivalent pandas function.

Examples#

Basic scatter matrix plot#

This example shows how to create a simple scatter matrix plot from a DataFrame. The plot includes all the columns of the DataFrame.

Tip

Enable the Box Select tool and select an area on one of the off-diagonal scatter plots: the selection is automatically displayed on the other scatter plots, which helps you explore and understand the dataset. This feature is called linked brushing (read more about it in this HoloViews user guide). It does not require a live Python process, as it is purely a front-end feature. It also works with the Lasso Select tool.

import hvplot
import numpy as np
import pandas as pd

df = pd.DataFrame(np.random.randn(200, 4), columns=['A','B','C','D'])

hvplot.plotting.scatter_matrix(df)
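
To make the layout of such a matrix concrete, the following sketch enumerates the cells of the grid without using any plotting library. It assumes the same four column names as the example above and only illustrates which pairs of columns end up on and off the diagonal; it is not how hvPlot builds the plot internally.

```python
from itertools import product

# Same column names as the example above; any list of numeric columns works.
columns = ['A', 'B', 'C', 'D']

# One cell per ordered pair of columns.
cells = list(product(columns, repeat=2))

# Diagonal cells pair a column with itself and show its distribution.
diagonal = [(a, b) for a, b in cells if a == b]

# Off-diagonal cells pair two different columns and show them against each other.
off_diagonal = [(a, b) for a, b in cells if a != b]

print(len(cells), len(diagonal), len(off_diagonal))  # 16 4 12
```

Both ('A', 'B') and ('B', 'A') appear among the off-diagonal cells, so every pair of columns is shown twice, once on each side of the diagonal.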

Customize the off-diagonal plot types#

The chart keyword lets you change the type of the off-diagonal plots.

import hvplot
import numpy as np
import pandas as pd

df = pd.DataFrame(np.random.randn(100, 2), columns=['A','B'])

(hvplot.plotting.scatter_matrix(df, chart='bivariate')
 + hvplot.plotting.scatter_matrix(df, chart='hexbin'))

Customize the diagonal plot types#

The diagonal parameter lets you change the type of the diagonal plots.

import hvplot
import numpy as np
import pandas as pd

df = pd.DataFrame(np.random.randn(100, 2), columns=['A','B'])

hvplot.plotting.scatter_matrix(df, diagonal='kde')

Control the tools#

Setting tools to include a selection tool like box_select and an inspection tool like hover enables further analysis.

import hvplot
import numpy as np
import pandas as pd

df = pd.DataFrame(np.random.randn(100, 2), columns=['A','B'])

hvplot.plotting.scatter_matrix(df, tools=['box_select', 'hover'])

Color the data per group#

The c parameter colors the data by a given column, here 'CAT'. The diagonal_kwds parameter (equivalent to hist_kwds for histograms, or to density_kwds for kde plots) customizes the diagonal plots.

import hvplot
import numpy as np
import pandas as pd

df = pd.DataFrame(np.random.randn(100, 2), columns=['A','B'])
df['CAT'] = np.random.choice(['X', 'Y', 'Z'], len(df))

hvplot.plotting.scatter_matrix(df, c='CAT', diagonal_kwds=dict(alpha=0.3))

Display large data#

Scatter matrix plots can require rendering a very large number of points, which can slow down the browser or even crash it. In that case, consider setting rasterize=True (or datashade=True), which uses Datashader to render the off-diagonal plots on the backend and send more efficient image-based representations to the browser.

The following scatter matrix plot has 1,200,000 (12x100,000) points that are rendered efficiently by Datashader.

import hvplot
import numpy as np
import pandas as pd

df = pd.DataFrame(np.random.randn(100_000, 4), columns=['A','B','C','D'])

hvplot.plotting.scatter_matrix(df, rasterize=True)
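
The point count quoted above follows from the grid layout: with n columns there are n x (n - 1) off-diagonal plots, each drawing one point per row. A minimal sketch of that arithmetic, using a hypothetical helper named off_diagonal_points that is not part of hvPlot:

```python
def off_diagonal_points(n_rows: int, n_columns: int) -> int:
    """Total number of points drawn across all off-diagonal scatter plots."""
    # Full n x n grid minus the n diagonal plots.
    n_panels = n_columns * (n_columns - 1)
    return n_panels * n_rows


print(off_diagonal_points(100_000, 4))   # 1200000
print(off_diagonal_points(100_000, 10))  # 9000000
```

The count grows roughly with the square of the number of columns, which is why rasterizing tends to matter more for wide DataFrames.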

When rasterize (or datashade) is enabled, individual points can be made more visible by setting dynspread=True or spread=True. See the Working with large data using datashader guide of HoloViews to learn more about these operations and the parameters they accept (which can be passed as kwds to scatter_matrix).

import hvplot
import numpy as np
import pandas as pd

df = pd.DataFrame(np.random.randn(100_000, 4), columns=['A','B','C','D'])

hvplot.plotting.scatter_matrix(df, rasterize=True, dynspread=True)
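
The spreading parameters can be forwarded through kwds, as described in the parameter list above. The sketch below wraps the call in a hypothetical helper, dynspread_scatter_matrix, which is not part of hvPlot. The max_px and threshold values are illustrative assumptions rather than recommended settings, and the helper assumes that hvPlot and Datashader are installed.

```python
# Options documented for dynspread: max_px, threshold, shape, how and mask.
# The values below are illustrative assumptions, not recommended settings.
DYNSPREAD_KWDS = {'max_px': 5, 'threshold': 0.6}


def dynspread_scatter_matrix(df, **dynspread_kwds):
    """Rasterized scatter matrix with dynamic spreading (hypothetical helper)."""
    # Imported lazily so that defining the helper does not require hvPlot;
    # calling it requires both hvPlot and Datashader.
    import hvplot

    # Caller-supplied options take precedence over the illustrative defaults.
    options = {**DYNSPREAD_KWDS, **dynspread_kwds}
    return hvplot.plotting.scatter_matrix(
        df, rasterize=True, dynspread=True, **options
    )
```

Calling dynspread_scatter_matrix(df) with the DataFrame from the example above should return the same kind of HoloViews object as the direct call, with the extra options passed on to the dynspread operation.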