EUPreciPBench datasets

The EUPreciPBench datasets cover a small portion of Europe and are stored in Zarr format, which allows easy access and slicing. The forecast and observation datasets are already paired, providing analysis-ready data for benchmarking postprocessing methods.

It is a forecast and observation dataset on a regular latitude-longitude grid, containing high-resolution precipitation data along with some predictors in the column above the surface. Its horizontal resolution (0.025°) is higher than that of the EUPPBench datasets (0.25°), but they are conveniently defined on the same domain.

Figure (gridded_data_EUPP.jpg): in blue, the EUPreciPBench dataset domain inside the domain of the base datasets over Europe.

  • The gridded EUPreciPBench postprocessing benchmark dataset contains COSMO DE and D2 ensemble forecasts over a small domain in Europe, from 46.0° to 53.2° in latitude and from 2.5° to 10.4° in longitude, and covers the years 2017-2020.

  • It also contains the corresponding EURADCLIM radar composite, which provides the observations for the benchmark.

  • The forecasts provided are the COSMO runs initialized at 03Z.

  • The COSMO ensemble consists of 20 members.

  • The gridded data resolution is 0.025° x 0.025° which corresponds roughly to 2.5 kilometers. COSMO DE, D2 and EURADCLIM data have been regridded to this resolution from their native grid.

  • COSMO DE forecasts (prior to May 2018) only cover part of the EUPPBench domain. COSMO D2 forecasts cover the full EUPPBench domain.

  • Forecasts are hourly, from 1 hour up to 45 hours ahead (just under 2 days), and do not include the analysis at 03Z.
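As a quick consistency check on the figures in the list above, the grid size and the approximate spacing in kilometers can be worked out from the domain bounds and the 0.025° resolution. The sketch below is illustrative only: it assumes a spherical Earth with a mean radius of 6371 km, and the helper names are hypothetical, not part of any package.

```python
import math

EARTH_RADIUS_KM = 6371.0  # assumption: spherical Earth, mean radius
RESOLUTION_DEG = 0.025    # grid resolution quoted in the list above


def n_points(start_deg, end_deg, step_deg=RESOLUTION_DEG):
    """Number of grid points between two bounds, both bounds included."""
    return round((end_deg - start_deg) / step_deg) + 1


def spacing_km(latitude_deg, step_deg=RESOLUTION_DEG):
    """Approximate (north-south, east-west) grid spacing in km at a latitude."""
    km_per_degree = 2.0 * math.pi * EARTH_RADIUS_KM / 360.0
    north_south = step_deg * km_per_degree
    # Meridians converge towards the poles, hence the cosine factor.
    east_west = north_south * math.cos(math.radians(latitude_deg))
    return north_south, east_west


n_lat = n_points(46.0, 53.2)      # latitude bounds of the domain
n_lon = n_points(2.5, 10.4)       # longitude bounds of the domain
dy_km, dx_km = spacing_km(49.6)   # 49.6° is the mid-domain latitude
print(n_lat, n_lon, round(dy_km, 2), round(dx_km, 2))
```

Under these assumptions the spacing comes out at about 2.8 km north-south and about 1.8 km east-west at mid-domain, in line with the "roughly 2.5 kilometers" quoted above, and the point counts match the latitude and longitude dimensions (289 and 317) shown in the example outputs of this page.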

Datasets description

There are 4 gridded sub-datasets, followed by an explanation of their metadata:

1 - Precipitation Forecasts Data

It consists of the total precipitation accumulated over the past hour:

Parameter name       ECMWF key  Units  Remarks
-------------------  ---------  -----  -------
Total precipitation  tp         mm

Warning

The total precipitation units used here are not consistent with those of the EUPPBench datasets. Total precipitation is given here in millimeters, whereas EUPPBench uses meters, so the two differ by a factor of 1000.
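To make the factor of 1000 in the warning above concrete, here is a minimal sketch of the conversion in both directions. The function names are hypothetical and not part of climetlab or the Intake catalogue.

```python
MM_PER_M = 1000.0  # millimeters per meter


def tp_mm_to_m(tp_mm):
    """Convert total precipitation from mm (EUPreciPBench) to m (EUPPBench)."""
    return tp_mm / MM_PER_M


def tp_m_to_mm(tp_m):
    """Convert total precipitation from m (EUPPBench) to mm (EUPreciPBench)."""
    return tp_m * MM_PER_M


print(tp_mm_to_m(2.5))  # 0.0025
```

Because the functions only use plain arithmetic, they should also work element-wise on NumPy arrays or on an xarray variable such as `ds.tp`. Any units attribute carried by the data is not updated by the conversion and would have to be set by hand.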

Usage: The precipitation forecasts can be retrieved by calling

import climetlab as cml
ds = cml.load_dataset('EUPreciPBench-gridded-precipitation-forecasts')
ds.to_xarray()

Alternatively, one can use the Intake catalogue

import euppbench_datasets
cat = euppbench_datasets.open_catalog()
ds = cat.euprecipbench.EUPreciPBench_precipitation_forecasts.to_dask()

Example:

import climetlab as cml
ds = cml.load_dataset('EUPreciPBench-gridded-precipitation-forecasts')
ds.to_xarray()
By downloading data from this dataset, you agree to the terms and conditions defined at

    https://github.com/Climdyn/climetlab-eumetnet-postprocessing-benchmark/blob/main/DATA_LICENSE

If you do not agree with such terms, do not download the data. 
<xarray.Dataset>
Dimensions:     (latitude: 289, longitude: 317, number: 20, step: 45,
                 surface: 1, time: 1440)
Coordinates:
  * latitude    (latitude) float64 46.0 46.02 46.05 46.07 ... 53.15 53.17 53.2
  * longitude   (longitude) float64 2.5 2.525 2.55 2.575 ... 10.35 10.37 10.4
  * number      (number) int64 1 2 3 4 5 6 7 8 9 ... 12 13 14 15 16 17 18 19 20
  * step        (step) timedelta64[ns] 01:00:00 02:00:00 ... 1 days 21:00:00
  * surface     (surface) float64 0.0
  * time        (time) datetime64[ns] 2017-01-01T03:00:00 ... 2020-12-31T03:0...
    valid_time  (time, step) datetime64[ns] ...
Data variables:
    tp          (time, step, number, surface, latitude, longitude) float32 ...
Attributes:
    Conventions:             CF-1.7
    GRIB_centre:             edzw
    GRIB_centreDescription:  Offenbach
    GRIB_edition:            2
    GRIB_subCentre:          255
    history:                 2025-02-10T04:45 GRIB to CDM+CF via cfgrib-0.9.1...
    license:                 Creative Commons Attribution 4.0
    producer:                Deutsche Wetterdienst (DWD), Offenbach

2 - Predictors Forecasts Data

It consists of several forecast fields on pressure levels:

Parameter name       Levels (hPa)   ECMWF key  Units   Remarks
-------------------  -------------  ---------  ------  -------
Temperature          500, 700, 850  t          K
U component of wind  700, 950       u          m s^-1
V component of wind  700, 950       v          m s^-1
Relative humidity    700, 850, 950  r          %

These fields can be used, for example, to compute the Jefferson instability index, which can then serve as a predictor for postprocessing the precipitation ensemble forecasts.
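As an illustration of how derived predictors can be built from these fields, the sketch below computes the wind speed from u and v, and the dew point depression from t and r. It does not compute the Jefferson index itself. The function names are hypothetical, and the dew point uses the Magnus approximation with coefficients a = 17.625 and b = 243.04 °C, which is an assumption of this sketch rather than something prescribed by the dataset.

```python
import math


def wind_speed(u, v):
    """Horizontal wind speed (m s^-1) from the u and v components."""
    return math.hypot(u, v)


def dew_point_kelvin(t_kelvin, rh_percent):
    """Dew point (K) from temperature (K) and relative humidity (%).

    Magnus approximation over water; rh_percent must be strictly positive.
    """
    a, b = 17.625, 243.04  # assumed Magnus coefficients (b in °C)
    t_celsius = t_kelvin - 273.15
    gamma = math.log(rh_percent / 100.0) + a * t_celsius / (b + t_celsius)
    return b * gamma / (a - gamma) + 273.15


def dew_point_depression(t_kelvin, rh_percent):
    """Difference between temperature and dew point (K)."""
    return t_kelvin - dew_point_kelvin(t_kelvin, rh_percent)


print(wind_speed(3.0, 4.0))  # 5.0
print(round(dew_point_depression(293.15, 50.0), 1))
```

The functions are written for scalars using only the standard library. For whole fields, the same formulas can be applied with `numpy.hypot` and `numpy.log` on the arrays selected at a common pressure level, for instance 700 hPa, where t, u, v and r are all available.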

Usage: The predictors forecasts can be retrieved by calling

import climetlab as cml
ds = cml.load_dataset('EUPreciPBench-gridded-predictors-forecasts')
ds.to_xarray()

Alternatively, one can use the Intake catalogue

import euppbench_datasets
cat = euppbench_datasets.open_catalog()
ds = cat.euprecipbench.EUPreciPBench_predictors_forecasts.to_dask()

Example:

ds = cml.load_dataset('EUPreciPBench-gridded-predictors-forecasts')
ds.to_xarray()
By downloading data from this dataset, you agree to the terms and conditions defined at

    https://github.com/Climdyn/climetlab-eumetnet-postprocessing-benchmark/blob/main/DATA_LICENSE

If you do not agree with such terms, do not download the data. 
<xarray.Dataset>
Dimensions:        (isobaricInhPa: 4, latitude: 289, longitude: 317,
                    number: 20, time: 1439, step: 45)
Coordinates:
  * isobaricInhPa  (isobaricInhPa) float64 500.0 700.0 850.0 950.0
  * latitude       (latitude) float64 46.0 46.02 46.05 ... 53.15 53.17 53.2
  * longitude      (longitude) float64 2.5 2.525 2.55 2.575 ... 10.35 10.37 10.4
  * number         (number) int64 1 2 3 4 5 6 7 8 9 ... 13 14 15 16 17 18 19 20
  * step           (step) timedelta64[ns] 01:00:00 02:00:00 ... 1 days 21:00:00
  * time           (time) datetime64[ns] 2017-01-01T03:00:00 ... 2020-12-31T0...
    valid_time     (time, step) datetime64[ns] ...
Data variables:
    r              (time, step, number, isobaricInhPa, latitude, longitude) float32 ...
    t              (time, step, number, isobaricInhPa, latitude, longitude) float32 ...
    u              (time, step, number, isobaricInhPa, latitude, longitude) float32 ...
    v              (time, step, number, isobaricInhPa, latitude, longitude) float32 ...
Attributes:
    Conventions:             CF-1.7
    GRIB_centre:             edzw
    GRIB_centreDescription:  Offenbach
    GRIB_edition:            2
    GRIB_subCentre:          255
    history:                 2025-04-03T19:03 GRIB to CDM+CF via cfgrib-0.9.1...
    license:                 Creative Commons Attribution 4.0
    producer:                Deutsche Wetterdienst (DWD), Offenbach

3 - Precipitation Observations Data

It consists of the total precipitation accumulated over the past hour:

Parameter name       ECMWF key  Units  Remarks
-------------------  ---------  -----  -------
Total precipitation  tp         mm

Warning

The total precipitation units used here are not consistent with those of the EUPPBench datasets. Total precipitation is given here in millimeters, whereas EUPPBench uses meters, so the two differ by a factor of 1000.

Usage: The precipitation observations can be retrieved by calling

import climetlab as cml
ds = cml.load_dataset('EUPreciPBench-gridded-precipitation-observations')
ds.to_xarray()

Alternatively, one can use the Intake catalogue

import euppbench_datasets
cat = euppbench_datasets.open_catalog()
ds = cat.euprecipbench.EUPreciPBench_EURADCLIM_observations.to_dask()

Example:

ds = cml.load_dataset('EUPreciPBench-gridded-precipitation-observations')
ds.to_xarray()
By downloading data from this dataset, you agree to the terms and conditions defined at

    https://github.com/Climdyn/climetlab-eumetnet-postprocessing-benchmark/blob/main/DATA_LICENSE

If you do not agree with such terms, do not download the data. 
<xarray.Dataset>
Dimensions:     (latitude: 289, longitude: 317, step: 45, time: 1440)
Coordinates:
  * latitude    (latitude) float64 46.0 46.02 46.05 46.07 ... 53.15 53.17 53.2
  * longitude   (longitude) float64 2.5 2.525 2.55 2.575 ... 10.35 10.37 10.4
  * step        (step) timedelta64[ns] 01:00:00 02:00:00 ... 1 days 21:00:00
  * time        (time) datetime64[ns] 2017-01-01T03:00:00 ... 2020-12-31T03:0...
    valid_time  (time, step) datetime64[ns] ...
Data variables:
    tp          (time, step, latitude, longitude) float32 ...
Attributes:
    license:          Creative Commons Attribution 4.0
    product name:     EURADCLIM 1 hour accumulated radar precipitation data
    product version:  2.0
    source:           KNMI
    webpage:          https://dataplatform.knmi.nl/dataset/rad-opera-hourly-r...

4 - Static fields

Various static fields associated with the forecast grid can be obtained, to serve as predictors for the postprocessing.

Note

For consistency with the rest of the dataset, we use the ECMWF parameter names, terminology and units here. However, please note that, except for the Surface Geopotential, the fields provided come from non-ECMWF data sources evaluated at the grid points. Currently, the main data source is the Copernicus Land Monitoring Service.

It includes:

Parameter name        ECMWF key  Remarks
--------------------  ---------  -------
Land use              landu      Extracted from the CORINE 2018 dataset. Values and associated land types differ from the ECMWF ones. See the “legend” entry in the metadata for more details.
Model terrain height  mterh      Extracted from the EU-DEM v1.1 digital elevation model dataset.
Surface Geopotential  z          The model orography can be obtained by dividing the surface geopotential by g = 9.80665 m s^-2.
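The division by g mentioned for the Surface Geopotential can be sketched as follows. The function name is hypothetical, and the geopotential is assumed to be expressed in m^2 s^-2, so that the result is a height in meters.

```python
G = 9.80665  # standard gravity in m s^-2, as quoted above


def orography_m(surface_geopotential):
    """Model orography (m) from the surface geopotential z (m^2 s^-2)."""
    return surface_geopotential / G


# A geopotential of 9806.65 m^2 s^-2 corresponds to a height of about 1000 m.
print(orography_m(9806.65))
```

Since this is a plain division, it can also be applied directly to the `z` variable of the static-field dataset once it has been loaded.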

Usage: The static fields can be retrieved by calling

import climetlab as cml
ds = cml.load_dataset('EUPreciPBench-gridded-static-fields', parameter)
ds.to_xarray()

where the parameter argument is a string with one of the ECMWF keys described above. It is only possible to download one static field per call.
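Since only one static field can be downloaded per call, fetching all of them amounts to a loop over the ECMWF keys. The sketch below shows one possible way to organize this. The helper `load_static_fields` is hypothetical, and it takes the loading function as an argument so that the example can be run without downloading anything.

```python
STATIC_FIELD_KEYS = ("landu", "mterh", "z")  # ECMWF keys from the table above


def load_static_fields(load_one, keys=STATIC_FIELD_KEYS):
    """Call load_one(key) once per key and return the results in a dict."""
    return {key: load_one(key) for key in keys}


# Usage with climetlab (downloads data, so it is left commented out here):
# import climetlab as cml
# fields = load_static_fields(
#     lambda key: cml.load_dataset('EUPreciPBench-gridded-static-fields', key).to_xarray()
# )

# Dry run with a stand-in loader that simply echoes the key:
fields = load_static_fields(lambda key: f"<{key}>")
print(sorted(fields))  # ['landu', 'mterh', 'z']
```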

Alternatively, one can use the Intake catalogue

import euppbench_datasets
cat = euppbench_datasets.open_catalog()
# Fetching the land usage field
ds = cat.euprecipbench.EUPreciPBench_land_usage.to_dask()

The other static fields are available in the same way.

Example:

ds = cml.load_dataset('EUPreciPBench-gridded-static-fields', 'landu')
ds.to_xarray()
By downloading data from this dataset, you agree to the terms and conditions defined at

    https://github.com/Climdyn/climetlab-eumetnet-postprocessing-benchmark/blob/main/DATA_LICENSE

If you do not agree with such terms, do not download the data. 
<xarray.Dataset>
Dimensions:    (latitude: 289, longitude: 317)
Coordinates:
  * latitude   (latitude) float64 46.0 46.02 46.05 46.07 ... 53.15 53.17 53.2
  * longitude  (longitude) float64 2.5 2.525 2.55 2.575 ... 10.35 10.37 10.4
Data variables:
    landu      (latitude, longitude) float64 ...
Attributes:
    legend:                 {1: {'label': '111 - Continuous urban fabric', 'n...
    history:                Retrieved from https://land.copernicus.eu/pan-eur...
    full_dataset_metadata:  
    source:                 European Union, Copernicus Land Monitoring Servic...

5 - Explanation of the metadata

For all data, attributes specifying the sources and the license are always present. Depending on the kind of dataset, dimensions and information are embedded in the data as follows:

Metadata         Description
---------------  -----------
latitude *       Latitude of the grid points.
longitude *      Longitude of the grid points.
number *         Number of the ensemble member.
time *           Forecast initialization date.
step *           Step of the forecast (the lead time).
surface *        Layer of the variable considered (here there is just one, at the surface).
isobaricInhPa *  Pressure level in hectopascal (or millibar).
valid_time       Actual time and date of the corresponding forecast data.

Note

Metadata marked with an asterisk (*) denote dimensions indexing the datasets.
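The relation between time, step and valid_time described above can be illustrated with the standard library alone. This is a sketch under the assumption, consistent with the example outputs on this page, that the 45 steps run hourly from 1 to 45 hours. The function name is hypothetical.

```python
from datetime import datetime, timedelta


def valid_time(init_time, step):
    """Valid time of a forecast: initialization time plus lead time."""
    return init_time + step


init = datetime(2017, 1, 1, 3)                      # a run initialized at 03Z
steps = [timedelta(hours=h) for h in range(1, 46)]  # 45 hourly lead times
valid_times = [valid_time(init, s) for s in steps]

print(valid_times[0])   # 2017-01-01 04:00:00
print(valid_times[-1])  # 2017-01-03 00:00:00
```

In the datasets themselves this computation is not needed, because valid_time is already provided as a coordinate indexed by (time, step).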

Data License

See the DATA_LICENSE file.

The COSMO forecasts were produced and provided by the Deutscher Wetterdienst (DWD). The EURADCLIM data were produced and provided by KNMI. See https://dataplatform.knmi.nl/dataset/rad-opera-hourly-rainfall-accumulation-euradclim-2-0 and https://doi.org/10.5194/essd-15-1441-2023 .