{ "cells": [ { "cell_type": "markdown", "id": "c611b682-e4ec-4028-9aa6-b33c7745c494", "metadata": {}, "source": [ "# interactive" ] }, { "cell_type": "markdown", "id": "66dddf49-c70a-4ed1-98e7-76703b82fa34", "metadata": {}, "source": [ "hvPlot isn't only a plotting library; it also aims to make data exploration easier. In this guide you will see how it can help you gain better control over your data pipelines. We define a *data pipeline* as a series of commands that *transform* some data, such as aggregating, filtering, reshaping, or renaming. A data pipeline may include a *load* step that provides the input data to the pipeline, e.g. reading the data from a database.\n", "\n", "When you analyze data in a notebook, for instance data held in a Pandas DataFrame, you may find yourself re-running many cells after changing the parameters you pass to Pandas' methods, either to gain more insight into the data or to fine-tune an algorithm. `.interactive()` improves this rather cumbersome workflow: you replace the constant parameters in the pipeline with widgets (e.g. a number slider) that are automatically displayed next to your pipeline output and trigger an output update whenever they change. With this approach all your pipeline parameters are available in one place and you get full interactive control over the pipeline.\n", "\n", "`.interactive()` works not only with DataFrames but also with Xarray data structures, which is what we show in this guide. First we import `hvplot.xarray`, which makes the `.interactive()` accessor available on Xarray objects." 
] }, { "cell_type": "code", "execution_count": null, "id": "e14aa834-59c2-4f13-af01-dcd653505350", "metadata": {}, "outputs": [], "source": [ "import hvplot.xarray # noqa\n", "import xarray as xr" ] }, { "cell_type": "markdown", "id": "d8f29718-cf71-4f84-b4be-3e1e3c4b8786", "metadata": {}, "source": [ "We load a dataset and get a handle on its only data variable, *air*." ] }, { "cell_type": "code", "execution_count": null, "id": "d60b6b3a-2a1e-4094-94b2-9cc8e5ea5242", "metadata": {}, "outputs": [], "source": [ "ds = xr.tutorial.load_dataset('air_temperature')\n", "air = ds.air\n", "ds" ] }, { "cell_type": "markdown", "id": "63da0ac9-3b01-4c55-8855-6436e4575aca", "metadata": {}, "source": [ "We want to better understand how the air temperature evolves over time at different latitudes, relative to a baseline. The data pipeline we build consists of:\n", "\n", "1. filtering the data at one latitude\n", "2. cleaning up the data\n", "3. aggregating the temperatures by time\n", "4. computing a rolling mean\n", "5. subtracting a baseline from the above\n", "\n", "For now, the output we choose is the summary returned by Pandas' `.describe()` method. This pipeline has two parameters: the latitude and the temporal window of the rolling operation." ] }, { "cell_type": "code", "execution_count": null, "id": "5f7905ce-da82-43f2-8b27-84c5b868a588", "metadata": {}, "outputs": [], "source": [ "LATITUDE = 30.\n", "ROLLING_WINDOW = '1D'\n", "\n", "# Baseline: mean temperature over all times and longitudes at the chosen latitude\n", "baseline = air.sel(lat=LATITUDE).mean().item()\n", "pipeline = (\n", " air\n", " .sel(lat=LATITUDE)  # 1. filter the data at one latitude\n", " .to_dataframe()\n", " .drop(columns='lat')  # 2. clean up: drop the now-constant latitude column\n", " .groupby('time').mean()  # 3. aggregate the temperatures by time\n", " .rolling(ROLLING_WINDOW).mean()  # 4. compute a rolling mean\n", " - baseline  # 5. subtract the baseline\n", ")\n", "pipeline.describe()" ] }, { "cell_type": "markdown", "id": "61d07327-08ee-45c4-b5cc-c08adf50f046", "metadata": {}, "source": [ "Without `.interactive()` we would have to manually change the values of `LATITUDE` and `ROLLING_WINDOW` and re-run the cell to see how they affect the pipeline output. 
Instead, we create two widgets populated with the values we expect the parameters to take; in effect, we declare our parameter space up front. To create the widgets we import [Panel](https://panel.holoviz.org) and pick two appropriate widgets from its [Reference Gallery](https://panel.holoviz.org/reference/index.html#widgets)." ] }, { "cell_type": "code", "execution_count": null, "id": "81684639-5d17-43aa-90b3-2590ed951272", "metadata": {}, "outputs": [], "source": [ "import panel as pn\n", "\n", "w_latitude = pn.widgets.DiscreteSlider(name='Latitude', options=list(air.lat.values))\n", "w_rolling_window = pn.widgets.RadioButtonGroup(name='Rolling window', options=['1D', '7D', '30D'])" ] }, { "cell_type": "markdown", "id": "a49cb2c6-2e69-4a17-9ca3-ceb918460765", "metadata": {}, "source": [ "Now we instantiate an *Interactive* object by calling `.interactive()` on our data. This object mirrors the API of the underlying object and accepts all of its natural operations. In the pipeline, we replace the data with the interactive object and the constant parameters with the widgets we have just created." ] }, { "cell_type": "code", "execution_count": null, "id": "55524639-5d5d-40c1-b76f-11da152fc6bd", "metadata": {}, "outputs": [], "source": [ "airi = air.interactive()" ] }, { "cell_type": "code", "execution_count": null, "id": "6f63c03e-79c2-4a6c-9203-2b4727894228", "metadata": {}, "outputs": [], "source": [ "baseline = airi.sel(lat=w_latitude).mean().item()\n", "pipeline = (\n", " airi\n", " .sel(lat=w_latitude)\n", " .to_dataframe()\n", " .drop(columns='lat')\n", " .groupby('time').mean()\n", " .rolling(w_rolling_window).mean()\n", " - baseline\n", ")\n", "pipeline.describe()" ] }, { "cell_type": "markdown", "id": "4bbe8553-f79c-4e46-9a58-93b7cdb3e92f", "metadata": {}, "source": [ "You can see that the rendered pipeline now consists not only of its output but also of the widgets that control it. 
Change the widgets' values and observe how the output updates dynamically.\n", "\n", "Note that `.interactive()` handles the change of data type within the pipeline (see the call to `.to_dataframe()`) and also supports math operators (`- baseline`).\n", "\n", "A plot would be a better output for this pipeline. We will use `.hvplot()` to create an interactive Bokeh line plot; using Pandas' `.plot()` is also possible." ] }, { "cell_type": "code", "execution_count": null, "id": "83e20794-0261-46f8-a094-1903937de550", "metadata": {}, "outputs": [], "source": [ "import hvplot.pandas # noqa\n", "\n", "pipeline.hvplot()" ] }, { "cell_type": "markdown", "id": "c458f2a3-eeca-477f-b7c1-6b53e4d0f18c", "metadata": {}, "source": [ "For more information on using `.interactive()`, take a look at the [User Guide](../user_guide/Interactive.ipynb)." ] } ], "metadata": { "language_info": { "name": "python", "pygments_lexer": "ipython3" } }, "nbformat": 4, "nbformat_minor": 5 }