The EUPP postprocessing benchmark dataset documentation

This website documents the EUMETNET postprocessing benchmark datasets, an initiative to provide high-quality datasets from which analysis-ready datasets can easily be derived for benchmarking weather forecast postprocessing methods.

The main tool to download and manage the data is a Python plugin. However, it can convert the data to formats that can then be processed in other languages, and a few lines of Python code suffice to obtain the datasets.

Note

  • Climetlab plugin version: 0.3.1

  • Intake catalogues version: 0.2.2

  • Base dataset version: v1.0

  • EUPPBench dataset version: v1.0

  • EUPreciPBench dataset version: 0.5

  • Dataset status: Datasets status

Using climetlab to access the data

A climetlab plugin to retrieve the EUMETNET postprocessing benchmark datasets is available.

It facilitates the download of the datasets' time-aligned forecasts, reforecasts (hindcasts) and observations (ERA5 reanalysis).

See the demo notebooks, which can be launched on Binder.

The climetlab Python plugin allows users to access the data with a few lines of code, such as:

# Uncomment the line below if climetlab and the plugin are not yet installed
#!pip install climetlab climetlab-eumetnet-postprocessing-benchmark
import climetlab as cml
ds = cml.load_dataset('eumetnet-postprocessing-benchmark-training-data-gridded-forecasts-surface', "2017-12-02", "2t", "highres")
fcs = ds.to_xarray()

which, for instance, downloads the deterministic (high-resolution) forecasts of the 2-metre temperature.
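
Because `to_xarray()` returns an xarray object, one way to hand the data over to other languages is to write it to a NetCDF file. The following is a minimal sketch and not part of the plugin: the helper name `export_to_netcdf` and the output path are hypothetical, and it assumes that a NetCDF backend (for example netCDF4) is installed alongside xarray.

```python
from pathlib import Path


def export_to_netcdf(dataset, path):
    """Write an xarray dataset to a NetCDF file and return the file path.

    `dataset` is assumed to be the object returned by `ds.to_xarray()`.
    This helper is illustrative only; it is not part of the climetlab plugin.
    """
    path = Path(path)
    # Create the target directory if it does not exist yet.
    path.parent.mkdir(parents=True, exist_ok=True)
    # xarray's `to_netcdf` method writes the dataset to disk.
    dataset.to_netcdf(path)
    return path


# Usage, continuing from the example above (hypothetical output path):
# export_to_netcdf(fcs, "data/highres_2t_2017-12-02.nc")
```

The resulting file can then be opened with the NetCDF readers available in most scientific computing environments.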

Using the Intake catalogue to access the data

An Intake catalogue is also available, as an alternative way to get the datasets. Note that the Base datasets over Europe’s domain cannot be retrieved by this method.

Access through the catalogue takes only a few lines in a Python session:

# Uncomment the line below if the catalogue is not yet installed
#!pip install euppbench-datasets
import euppbench_datasets
cat = euppbench_datasets.open_catalog()
ds = cat.euppbench.training_data.gridded.EUPPBench_highres_forecasts_surface.to_dask()

which downloads the original EUPPBench deterministic (high-resolution) forecasts as an xarray dataset.
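
The object returned by `to_dask()` is expected to be loaded lazily, so its content can be inspected before any large amount of data is read into memory. A minimal sketch of such an inspection follows, assuming the result behaves like an xarray dataset; the helper name `summarize` is hypothetical and not part of the catalogue package.

```python
def summarize(dataset):
    """Return the variable names and dimension sizes of an xarray dataset.

    Only metadata is read here, so no array values need to be loaded.
    """
    return {
        # Names of the data variables stored in the dataset.
        "variables": sorted(str(name) for name in dataset.data_vars),
        # Mapping from each dimension name to its length.
        "sizes": {str(dim): int(n) for dim, n in dataset.sizes.items()},
    }


# Usage, continuing from the example above:
# print(summarize(ds))
```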
