# Getting started


Welcome to hvPlot!

This getting started guide will get you set up with hvPlot and provide a basic overview of its features and strengths.


## Installation


hvPlot supports Python 3.9 and above on Linux, Windows, or Mac. You can install hvPlot via the following options:


::::{tab-set}

:::{tab-item} pip
```bash
pip install hvplot
```
:::

:::{tab-item} conda
```bash
conda install conda-forge::hvplot
```
:::

:::{tab-item} uv
```bash
uv pip install hvplot
```
:::

:::{tab-item} other
[Installing for development](../developer_guide.md)

:::

::::

To run the guides on this site locally, create an environment with the required dependencies:

```bash
conda create -n hvplot-env -c conda-forge --override-channels hvplot geoviews datashader xarray pandas geopandas dask streamz networkx intake intake-xarray intake-parquet s3fs scipy spatialpandas pooch rasterio fiona plotly matplotlib hvsampledata jupyterlab
```

## Overview

The core functionality provided by hvPlot is a **simple and high-level plotting interface** (API), modeled on [Pandas](https://pandas.pydata.org)' `.plot` API and extended in various ways by leveraging capabilities offered by the packages of the [HoloViz](https://holoviz.org) ecosystem, most notably [HoloViews](https://holoviews.org/). hvPlot can generate interactive plots with either [Bokeh](https://bokeh.org) (default) or [Plotly](https://plotly.com/python/), or static plots with [Matplotlib](https://matplotlib.org). hvPlot supports many data libraries of the Python ecosystem, such as `pandas`, `xarray`, and `dask`.

```{image} ../assets/diagram.svg
---
alt: hvPlot diagram
align: center
width: 80%
---
```

## Register `.hvplot`

Let’s create a simple Pandas DataFrame we’ll plot later.

In [None]:
import numpy as np
import pandas as pd

In [None]:
np.random.seed(1)

In [None]:
idx = pd.date_range('1/1/2000', periods=1000)
df = pd.DataFrame(np.random.randn(1000, 4), index=idx, columns=list('ABCD')).cumsum()
df.head(2)

The most convenient way to use hvPlot is to register the `.hvplot` *accessor* on the data type you are working with. This is done with a special import of the form `import hvplot.<data library>`.

In [None]:
import hvplot.pandas  # noqa

In addition to registering the `.hvplot` accessor on Pandas objects, the import above sets the Bokeh plotting library as the default one and loads its corresponding extension.

:::{attention}
In a notebook, loading the extension injects front-end code into the cell output of the import. HoloViews plots need this code to behave correctly, so make sure not to remove this cell!
:::

Now simply call `.hvplot()` on the DataFrame as you would call Pandas' `.plot()`.

In [None]:
first_plot = df.hvplot()
first_plot

The same process can be applied to other libraries. Here is another example with Xarray.

In [None]:
import hvplot.xarray  # noqa
import hvsampledata

air_ds = hvsampledata.air_temperature("xarray")
air_ds

In [None]:
air_ds.hvplot.image(data_aspect=1, frame_width=400, dynamic=False)

The default plots hvPlot generates are Bokeh plots. These plots are interactive and support panning, zooming, hovering, and clickable/selectable legends. It's worth spending some time getting used to interacting with this kind of plot.

## hvplot namespace

The `.hvplot` namespace holds the range of supported plot methods (e.g. `line`, `scatter`, `hist`, etc.). Use tab completion in a notebook to explore the available plot types.

```python
df.hvplot.<TAB>
```

Similarly to Pandas' API, every plot method accepts a wide range of parameters. You can explore them by calling `hvplot.help('line')` or using tab completion:

```python
df.hvplot.line(<TAB>
```

## Compose plots

The object returned by an `.hvplot()` call is a [HoloViews](https://holoviews.org/index.html) object whose `repr` includes the HoloViews element type (e.g. `Curve`) and the dimensions.

In [None]:
plot1 = df['A'].hvplot.area(alpha=0.2, color='red', height=150, responsive=True)
plot2 = df['B'].hvplot.line(height=150, responsive=True)
print(type(plot2))
print(plot2)

`HoloViews` objects can be easily composed using the `+` and `*` operators:

- `<plot1> + <plot2>` lays the plots out in a *row* container
- `(<plot1> + <plot2>).cols(1)` lays the plots out in a *column* container
- `<plot1> * <plot2>` overlays the plots

In [None]:
plot1 + plot2

In [None]:
(plot1 + plot2).cols(1)

In [None]:
plot1 * plot2

## Widgets-based exploration

In [None]:
df_penguins = hvsampledata.penguins("pandas")
df_penguins.head(2)

The `groupby` parameter allows us to declare which dimension(s) of the dataset we would like to make explorable with widgets.

In [None]:
df_penguins.hvplot.scatter(
    x='bill_length_mm', y='bill_depth_mm', groupby=['island', 'sex'],
    height=300, width=400, dynamic=False,
)

## Display large data

hvPlot provides multiple ways to display arbitrarily large datasets. The most versatile option depends on [Datashader](https://datashader.org/) (an optional dependency) and simply consists of setting `rasterize=True`. The plot returned is an image in which each pixel is colored according to the number of points it contains, so every point contributes to the image. The plot is also dynamic: zooming or panning triggers a recomputation of the image, which happens quickly. The plot below is generated from 5 million points.

In [None]:
NUM = 1_000_000
dists = [
    pd.DataFrame(dict(x=np.random.normal(x, s, NUM), y=np.random.normal(y, s, NUM)))
     for x,  y,    s in [
       ( 5,  2, 0.20),
       ( 2, -4, 0.10),
       (-2, -3, 0.50),
       (-5,  2, 1.00),
       ( 0,  0, 3.00)]
]
df_large_data = pd.concat(dists, ignore_index=True)
print(len(df_large_data))
df_large_data.head(2)

In [None]:
df_large_data.hvplot.points(
    'x', 'y', rasterize=True, cnorm='eq_hist',
    data_aspect=1, colorbar=False
)

:::{note}
This interactive functionality requires a live Python process to be running.
:::
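
The idea of coloring each pixel by the number of points it contains can be sketched with plain NumPy. This is not how Datashader is implemented; it is only an analogy using `np.histogram2d`, with an arbitrary 300 by 200 grid standing in for the pixels.

```python
import numpy as np

rng = np.random.default_rng(0)
n_points = 100_000
x = rng.normal(0.0, 1.0, n_points)
y = rng.normal(0.0, 1.0, n_points)

# Bin the points into a 300 x 200 grid; each cell plays the role of a pixel
counts, x_edges, y_edges = np.histogram2d(x, y, bins=(300, 200))

print(counts.shape)
# Every point lands in exactly one cell, so the counts add up to n_points
print(int(counts.sum()) == n_points)
```

Zooming corresponds to repeating the binning over a smaller range, which is why a live Python process is needed to update the image.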

## Geographic plots

hvPlot can generate geographic plots and handle geospatial data (e.g., GeoPandas DataFrame) on its own.

### Without GeoViews

hvPlot allows us to add a tile map as a basemap to a plot with the `tiles` parameter without having to install [GeoViews](https://geoviews.org/). This is possible because hvPlot projects lat/lon values to easting/northing coordinates (EPSG:4326 to EPSG:3857) without additional package dependencies if it detects that the values fall within the expected lat/lon ranges.

In [None]:
earthquakes = hvsampledata.earthquakes("pandas")
earthquakes.head(2)

In [None]:
earthquakes.hvplot.points('lon', 'lat', tiles=True, alpha=0.6)
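
For reference, the EPSG:4326 to EPSG:3857 conversion mentioned above is the standard spherical Web Mercator formula. The sketch below writes it out with NumPy; `to_web_mercator` is a hypothetical helper named for this example, not hvPlot's own function.

```python
import numpy as np

EARTH_RADIUS_M = 6378137.0  # sphere radius used by Web Mercator (EPSG:3857)

def to_web_mercator(lon, lat):
    """Convert lon/lat in degrees to easting/northing in meters."""
    lon = np.asarray(lon, dtype=float)
    lat = np.asarray(lat, dtype=float)
    easting = EARTH_RADIUS_M * np.radians(lon)
    northing = EARTH_RADIUS_M * np.log(np.tan(np.pi / 4 + np.radians(lat) / 2))
    return easting, northing

# The origin maps to (0, 0) up to rounding; lon=180 maps to about 20037508 m
easting, northing = to_web_mercator([0.0, 180.0], [0.0, 0.0])
print(easting, northing)
```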

### With GeoViews

For more advanced mapping features, you can optionally install [GeoViews](https://geoviews.org/).

In [None]:
import cartopy.crs as crs

air_ds.hvplot.quadmesh(
    'lon', 'lat', projection=crs.Orthographic(-90, 30), project=True,
    global_extent=True, cmap='viridis', coastline=True
)

For more information, see the [User guide section](https://hvplot.holoviz.org/user_guide/Geographic_Data.html) on geographic plots.

## Matplotlib or Plotly plots

hvPlot offers the possibility to create [Matplotlib](https://matplotlib.org/) and [Plotly](https://plotly.com/) plots. Load and enable the chosen plotting library with the `extension` function.

In [None]:
hvplot.extension('matplotlib')

In [None]:
df_penguins.hvplot.scatter(x='bill_length_mm', y='bill_depth_mm', by='sex')

## Rendering and saving output

In notebook environments like Jupyter or VS Code, objects returned by hvPlot calls are displayed automatically. For example, this documentation page is built from a Jupyter notebook.

If you’re running code outside of a notebook (e.g., in a Python script or console), you can:
- Display plots: Use `hvplot.show(<obj>)` to render plots in a browser or interactive viewer.
- Save plots: Use `hvplot.save(<obj>, 'this_plot.html')` to save plots to an HTML file (e.g., for Bokeh) for later viewing.

:::{tip}
When using Bokeh as the plotting backend, you can also save plots directly using the “Save” icon in the plot’s toolbar.
:::

For more information about viewing or saving plots, check out the user guide section on [viewing plots](https://hvplot.holoviz.org/user_guide/Viewing.html#).

## hvPlot explorer

The *Explorer* is a [Panel](https://panel.holoviz.org)-based web application with which you can easily explore your data. While using `.hvplot()` is a convenient way to create plots from data, it assumes some *a priori* knowledge about the data and its structure, as well as knowledge of the `.hvplot()` API itself. The *Explorer* is a graphical interface that offers a simple way to select and visualize the kind of plot you want to see your data with, and many options to customize that plot.

### Set up

Setting up the explorer is pretty simple in a notebook. You just need to make sure you have loaded the extension, either via a data type import (e.g. `import hvplot.pandas`) or directly (e.g. `hvplot.extension('bokeh')`).

In [None]:
# not displayed on the site
hvplot.output(backend='bokeh')

### Basic usage

The explorer is available on the `.hvplot` namespace together with the other plotting methods. It accepts most of the parameters accepted by the `.hvplot()` API. To produce a nice example for the documentation, we instantiate an explorer with some pre-defined parameters; usually you would instantiate it without any parameters.

In [None]:
explorer = df_penguins.hvplot.explorer(x='bill_length_mm', y='bill_depth_mm', by=['species'])
explorer

Spend some time browsing the explorer and the options it offers.

:::{note}
This interactive functionality requires a live Python process to be running.
:::

Once you are done exploring the data, you may want to record the settings you have configured or save the plot. The easiest option is to open the *Code* tab next to *Plot* and copy the displayed code into a new notebook cell. Executing it will generate exactly the same plot as seen in the explorer.

## hvPlot interactive

hvPlot isn't only a plotting library; it is dedicated to making data exploration easier. The `hvplot.interactive()` API can help you get better control over your data pipelines. We define a *data pipeline* as a series of commands that *transform* some data, such as aggregating, filtering, reshaping, renaming, etc. A data pipeline may include a *load* step that provides the input data to the pipeline, e.g. by reading it from a database.

When you analyze data in a notebook, for instance data held in a Pandas DataFrame, you may find yourself re-running many cells after changing the parameters you pass to Pandas' methods, either to gain more insight into the data or to fine-tune an algorithm. `.interactive()` improves this rather cumbersome workflow: you replace the constant parameters in the pipeline with widgets (e.g. a number slider) that are automatically displayed next to your pipeline output and trigger an output update when changed. With this approach, all your pipeline parameters are available in one place and you get full interactive control over the pipeline.

`.interactive()` works not only with DataFrames but also with Xarray data structures, which is what we show in this guide. First we import `hvplot.xarray`, which makes the `.interactive()` accessor available on Xarray objects.

In [None]:
import hvplot.xarray  # noqa

We reuse the `air_temperature` dataset loaded earlier and get a handle on its only variable, *air*.

In [None]:
air = air_ds.air
air

We want to better understand the temporal evolution of the air temperature over different latitudes compared to a baseline. The data pipeline we build includes:

1. filtering the data at one latitude
2. cleaning up the data
3. aggregating the temperatures by time
4. computing a rolling mean
5. subtracting a baseline from the above

For now, we summarize the result with Pandas' `.describe()` method. This pipeline has two parameters: the latitude and the temporal window of the rolling operation.

In [None]:
LATITUDE = 30.
ROLLING_WINDOW = '1D'

baseline = air.sel(lat=LATITUDE).mean().item()
pipeline = (
    air
    .sel(lat=LATITUDE)
    .to_dataframe()
    .drop(columns='lat')
    .groupby('time').mean()
    .rolling(ROLLING_WINDOW).mean()
    - baseline
)
pipeline.describe()

Without using `.interactive()`, we would need to manually change the values of `LATITUDE` and `ROLLING_WINDOW` to observe how they affect the pipeline’s output. Instead, we can create two widgets that represent the range of values we expect these parameters to take. Essentially, this allows us to define our parameter space in advance. To create the widgets, we import [Panel](https://panel.holoviz.org) and select two appropriate widgets from its [Reference Gallery](https://panel.holoviz.org/reference/index.html#widgets).

In [None]:
import panel as pn

w_latitude = pn.widgets.DiscreteSlider(name='Latitude', options=list(air.lat.values))
w_rolling_window = pn.widgets.RadioButtonGroup(name='Rolling window', options=['1D', '7D', '30D'])

Now we instantiate an *Interactive* object by calling `.interactive()` on our data. This object mirrors the underlying object API, accepting all of its natural operations. We replace the data by the interactive object in the pipeline, and replace the constant parameters by the widgets we have just created.

In [None]:
airi = air.interactive()
baseline = airi.sel(lat=w_latitude).mean().item()
pipeline = (
    airi
    .sel(lat=w_latitude)
    .to_dataframe()
    .drop(columns='lat')
    .groupby('time').mean()
    .rolling(w_rolling_window).mean()
    - baseline
)
pipeline.describe()

When rendered, the pipeline now shows not only its output but also the widgets that control it. Change the widgets' values and observe how the output dynamically updates.

:::{note}
This interactive functionality requires a live Python process to be running.
:::

Note that `.interactive()` handles the change of data type within the pipeline (see the call to `.to_dataframe`) and also supports math operators (`- baseline`).

A plot would be a better output for this pipeline. We will use `.hvplot()` to create an interactive Bokeh line plot.

In [None]:
pipeline.hvplot(height=300, responsive=True)

:::{note}
This interactive functionality requires a live Python process to be running.
:::

For more information about the various hvPlot capabilities, take a look at the [User Guide](https://hvplot.holoviz.org/user_guide/index.html).