hvPlot.scatter#
- hvPlot.scatter(x=None, y=None, **kwds)
The scatter plot visualizes your points as markers in 2D space. You can visualize one more dimension by using colors.
The scatter plot is a good first choice for plotting data with non-continuous axes.
Reference: https://hvplot.holoviz.org/ref/api/manual/hvplot.hvPlot.scatter.html
Plotting options: https://hvplot.holoviz.org/ref/plotting_options/index.html
- Parameters:
- x: string, optional
Field name(s) to draw x-positions from. If not specified, the index is used. Can refer to continuous and categorical data.
- y: string or list, optional
Field name(s) to draw y-positions from. If not specified, all numerical fields are used.
- marker: string, optional
The marker shape depends on the activated plotting backend:
Bokeh: Bokeh marker styles and a subset of Matplotlib styles, e.g. 'circle' (default), 'dot', 'cross', 'x', 'square' for Bokeh markers and '+', 'x', 's' for Matplotlib-compatible markers. See https://docs.bokeh.org/en/latest/docs/examples/basic/scatters/markertypes.html for the list of Bokeh markers.
Matplotlib: Any supported marker, e.g. 's' (square), 'x' (cross), '+', etc. See https://matplotlib.org/stable/api/markers_api.html for the list of Matplotlib markers.
- c: string, optional
A color or a field name to draw the color of the marker from. Alias of color.
- s: int, optional, also available as 'size'
The size of the marker.
- scale: number, optional
Scaling factor to apply to point scaling. Default is 1.
- logz: bool
Whether to apply log scaling to the z-axis (for a scatter plot, the color axis). Default is False.
- **kwds: optional
Additional keyword arguments are documented in Plotting Options. Run hvplot.help('scatter') for the full method documentation.
- Returns:
holoviews.element.Scatter / Panel object
You can print the object to study its composition and run:
import holoviews as hv
hv.help(the_holoviews_object)
to learn more about its parameters and options.
References
Bokeh: https://docs.bokeh.org/en/latest/docs/examples/basic/scatters/color_scatter.html
HoloViews: https://holoviews.org/reference/elements/matplotlib/Scatter.html
Pandas: https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.plot.scatter.html
Matplotlib: https://matplotlib.org/stable/api/_as_gen/matplotlib.pyplot.scatter.html
Seaborn: https://seaborn.pydata.org/generated/seaborn.scatterplot.html
Backend-specific styling options#
Bokeh: alpha, angle, cmap, color, fill_alpha, fill_color, hit_dilation, hover_alpha, hover_color, hover_fill_alpha, hover_fill_color, hover_line_alpha, hover_line_cap, hover_line_color, hover_line_dash, hover_line_dash_offset, hover_line_join, hover_line_width, line_alpha, line_cap, line_color, line_dash, line_dash_offset, line_join, line_width, marker, muted, muted_alpha, muted_color, muted_fill_alpha, muted_fill_color, muted_line_alpha, muted_line_cap, muted_line_color, muted_line_dash, muted_line_dash_offset, muted_line_join, muted_line_width, nonselection_alpha, nonselection_color, nonselection_fill_alpha, nonselection_fill_color, nonselection_line_alpha, nonselection_line_cap, nonselection_line_color, nonselection_line_dash, nonselection_line_dash_offset, nonselection_line_join, nonselection_line_width, palette, radius, radius_dimension, selection_alpha, selection_color, selection_fill_alpha, selection_fill_color, selection_line_alpha, selection_line_cap, selection_line_color, selection_line_dash, selection_line_dash_offset, selection_line_join, selection_line_width, size, visible
Matplotlib: alpha, c, cmap, color, ec, ecolor, edgecolor, edgecolors, facecolors, linewidth, lw, marker, norm, s, visible, vmax, vmin
Examples#
Scatter plots are useful for exploring relationships, distributions, and potential correlations between numeric variables.
Basic scatter plot#
This example shows how to create a simple scatter plot.
import hvplot.pandas # noqa
import pandas as pd
df = pd.DataFrame({"x": [0, 1, 2, 3], "y": [0, 1, 4, 9]})
df.hvplot.scatter(x="x", y="y")
Let’s use a more realistic dataset.
import hvplot.pandas # noqa
df = hvplot.sampledata.penguins("pandas")
df.hvplot.scatter(
    x='bill_length_mm', y='flipper_length_mm',
    title='Bill Length vs Flipper Length'
)
Grouping by categories#
To distinguish categories visually, you can use the by parameter. This automatically colors points based on the specified column(s). The generated plot is a HoloViews NdOverlay.
import hvplot.pandas # noqa
df = hvplot.sampledata.penguins("pandas")
df.hvplot.scatter(
    x='bill_length_mm', y='flipper_length_mm',
    by=['sex', 'species'], title='Scatter plot grouped by sex and species with "by"',
)
Note
If your goal is simply to color the plot by a given categorical variable, you can use the color option instead of by. The former vectorizes the color styling (i.e., each marker has its own color), while the latter generates an overlay of scatter plots. As a consequence, using color is much more efficient in this case.
import hvplot.pandas # noqa
df = hvplot.sampledata.penguins("pandas")
df.hvplot.scatter(
    x='bill_length_mm', y='flipper_length_mm',
    color='species', title='Scatter plot colored by species with "color"',
)
Control marker style#
The marker style can be controlled with the styling option marker. For Bokeh plots, the option accepts Bokeh-based markers (see the plot below) and a subset of Matplotlib-compatible markers like '+' (note these markers cannot be vectorized). Matplotlib plots accept Matplotlib markers.
import bokeh as bk
import holoviews as hv
import hvplot.pandas # noqa
import itertools
import pandas as pd
bokeh_orig_markers = list(bk.core.enums.MarkerType)
hv_bk_mpl_compat_markers = list(hv.plotting.bokeh.styles.markers)
print('Bokeh original markers:')
print(*map(repr, bokeh_orig_markers), sep=', ', end='\n\n')
print('Matplotlib-compatible markers for Bokeh:')
print(*map(repr, hv_bk_mpl_compat_markers), sep=', ')
df = pd.DataFrame(list(itertools.product(range(6), range(6))), columns=['x', 'y'])
df['marker_col'] = bokeh_orig_markers + [''] * (len(df) - len(bokeh_orig_markers))
df.hvplot.scatter(
    x='x', y='y', marker='marker_col', s=150, title='Bokeh-specific markers'
) *\
df.assign(y=df.y+0.2).hvplot.labels(
    x='x', y='y', text='marker_col', text_color='black',
    text_baseline='bottom', text_font_size='9pt', padding=0.2
)
Bokeh original markers:
'asterisk', 'circle', 'circle_cross', 'circle_dot', 'circle_x', 'circle_y', 'cross', 'dash', 'diamond', 'diamond_cross', 'diamond_dot', 'dot', 'hex', 'hex_dot', 'inverted_triangle', 'plus', 'square', 'square_cross', 'square_dot', 'square_pin', 'square_x', 'star', 'star_dot', 'triangle', 'triangle_dot', 'triangle_pin', 'x', 'y'
Matplotlib-compatible markers for Bokeh:
'+', 'x', 's', 'd', '^', '>', 'v', '<', '1', '2', '3', '4', 'o', '*'
Control color and size#
You can also vary marker size with the s option and color with c (or color) using numeric columns.
import hvplot.pandas # noqa
df = hvplot.sampledata.earthquakes("pandas")
df.hvplot.scatter(
    x='lon', y='lat', c='mag', s='depth', cmap="inferno_r",
    clabel="Magnitude values", title='Earthquake depth (color by magnitude)',
)
Scatter plot with scaling and logarithmic color mapping#
This example shows how to fine-tune scatter plots by scaling point sizes and applying a logarithmic color scale. Note that we set the scale option to uniformly increase the marker size by a factor of 3.
import pandas as pd
import hvplot.pandas # noqa
import numpy as np
df = pd.DataFrame({
    'x': np.random.rand(100) * 10,
    'y': np.random.rand(100) * 10,
    'size': np.random.rand(100) * 100 + 10,
    'intensity': np.random.lognormal(mean=2, sigma=1, size=100)
})
df.hvplot.scatter(
    x='x', y='y', s='size', scale=3,
    c='intensity', cmap='Blues', logz=True,
    title='Scatter plot with size scaling and log color'
)
Xarray example#
import hvplot.xarray # noqa
ds = hvplot.sampledata.air_temperature("xarray").sel(lon=285.,lat=40.)
ds.hvplot.scatter(y="air")