Base datasets over Europe’s domain

These are the global (domain-wide) datasets, available over a large portion of Europe, which serve as the base for developing more specific benchmark datasets.

Warning

Access to the forecasts and observations is currently time-granular: these datasets cannot be sliced over the issuance time dimension.

Datasets description

There are two main datasets:

1 - Gridded Data

Figure: The global EUMETNET postprocessing benchmark domain.

  • The main gridded EUMETNET postprocessing benchmark dataset contains ECMWF ensemble and deterministic forecasts over a large portion of Europe, from 36° to 67° latitude and from -6° to 17° longitude, and covers the years 2017-2018.

  • It also contains the corresponding ERA5 reanalysis for the purpose of providing observations for the benchmark.

  • For some dates, it also contains reforecasts, which cover 20 years of past forecasts recomputed with the most recent model version.

  • All the forecasts and reforecasts provided are the noon ECMWF runs.

  • The ensemble forecasts and reforecasts also contain by default the control run (the 0-th member).

  • The gridded data resolution is 0.25° x 0.25° which corresponds roughly to 25 kilometers.

  • Please note that you can presently only retrieve one forecast date for each climetlab.load_dataset call.

There are 8 gridded sub-datasets:

1.1 - Extreme Forecast Index

All the Extreme Forecast Index (EFI) variables can be obtained for each forecast date.

It includes:

Parameter name                  ECMWF key  Remarks
------------------------------  ---------  -------
2 metre temperature efi         2ti
10 metre wind speed efi         10wsi
10 metre wind gust efi          10fgi
cape efi                        capei
cape shear efi                  capesi
Maximum temperature at 2m efi   mx2ti
Minimum temperature at 2m efi   mn2ti
Snowfall efi                    sfi
Total precipitation efi         tpi

The EFI are available for the model step range (in hours) 0-24, 24-48, 48-72, 72-96, 96-120, 120-144 and 144-168.

Usage: The EFI variables can be retrieved by calling

ds = cml.load_dataset('eumetnet-postprocessing-benchmark-training-data-gridded-forecasts-efi', date, parameter)
ds.to_xarray()

where the date argument is a string with a single date, and the parameter argument is a string or a list of strings with the ECMWF keys described above. Setting the parameter to 'all' downloads all the EFI parameters.

Example:

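import climetlab as cml

# Load the 2 metre temperature EFI for the forecast issued on 2 December 2017.
ds = cml.load_dataset('eumetnet-postprocessing-benchmark-training-data-gridded-forecasts-efi', "2017-12-02", "2ti")
ds.to_xarray()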

Note

By definition, observations are not available for Extreme Forecast Indices (EFI).

1.2 - Surface variable forecasts

The surface variables can be obtained for each forecast date, both for the ensemble (51 members) and deterministic runs.

It includes:

Parameter name                          ECMWF key  Remarks
--------------------------------------  ---------  --------------------------
2 metre temperature                     2t
10 metre U wind component               10u
10 metre V wind component               10v
Total cloud cover                       tcc
100 metre U wind component anomaly      100ua      Observations not available
100 metre V wind component anomaly      100va      Observations not available
Convective available potential energy   cape
Soil temperature level 1                stl1
Total column water                      tcw
Total column water vapour               tcwv
Volumetric soil water layer 1           swvl1
Snow depth                              sd
Convective inhibition                   cin        Observations not available
Visibility                              vis        Observations not available

Some missing observations will become available later.

The forecasts are available for the following model steps (in hours): every hour from 0 to 90, every 3 hours from 93 to 144, and every 6 hours from 150 to 240. All the steps are automatically retrieved.
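The step list follows a regular pattern, which can be convenient to rebuild when indexing the step dimension. A minimal sketch in plain Python; the name SURFACE_STEPS and the three intermediate lists are illustrative and not part of climetlab:

```python
# Model steps (in hours) of the surface variable forecasts:
# hourly up to 90 h, 3-hourly up to 144 h, 6-hourly up to 240 h.
hourly = list(range(0, 91))             # 0, 1, ..., 90
three_hourly = list(range(93, 145, 3))  # 93, 96, ..., 144
six_hourly = list(range(150, 241, 6))   # 150, 156, ..., 240

SURFACE_STEPS = hourly + three_hourly + six_hourly

print(len(SURFACE_STEPS))  # 125
```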

Usage: The surface variables forecasts can be retrieved by calling

ds = cml.load_dataset('eumetnet-postprocessing-benchmark-training-data-gridded-forecasts-surface', date, parameter, kind)
ds.to_xarray()

where the date argument is a string with a single date, and the parameter argument is a string or a list of strings with the ECMWF keys described above. Setting the parameter to 'all' downloads all the surface parameters. The kind argument selects the deterministic or the ensemble forecasts, by setting it to 'highres' or 'ensemble' respectively.

Example:

ds = cml.load_dataset('eumetnet-postprocessing-benchmark-training-data-gridded-forecasts-surface', "2017-12-02", "sd", "highres")
ds.to_xarray()
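Since only one forecast date can be retrieved per call, covering a period means looping over the dates. The sketch below is one possible way to do it; the helper names date_strings and load_surface_forecasts, as well as the loader argument, are hypothetical and only serve the illustration:

```python
from datetime import date, timedelta

SURFACE = "eumetnet-postprocessing-benchmark-training-data-gridded-forecasts-surface"


def date_strings(start, end):
    """Return the ISO date strings from start to end, both included."""
    n_days = (end - start).days
    return [(start + timedelta(days=i)).isoformat() for i in range(n_days + 1)]


def load_surface_forecasts(dates, parameter, kind, loader=None):
    """Issue one request per forecast date and return the results keyed by date.

    `loader` defaults to climetlab.load_dataset; it is an argument only so
    that the loop can be tried without downloading anything.
    """
    if loader is None:
        import climetlab as cml  # lazy import: only needed for real downloads
        loader = cml.load_dataset
    return {d: loader(SURFACE, d, parameter, kind) for d in dates}


dates = date_strings(date(2017, 12, 1), date(2017, 12, 3))
print(dates)  # ['2017-12-01', '2017-12-02', '2017-12-03']
```

Calling load_surface_forecasts(dates, "2t", "highres") would then issue three separate downloads, one per date.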

1.3 - Pressure level variable forecasts

The pressure level variables can be obtained for each forecast date, both for the ensemble (51 members) and deterministic runs.

It includes:

Parameter name       Level (hPa)  ECMWF key  Remarks
-------------------  -----------  ---------  -------
Temperature          850          t
U component of wind  700          u
V component of wind  700          v
Geopotential         500          z
Specific humidity    700          q
Relative humidity    850          r

The forecasts are available for the same model steps as the surface variables above.

Usage: The pressure level variables forecasts can be retrieved by calling

ds = cml.load_dataset('eumetnet-postprocessing-benchmark-training-data-gridded-forecasts-pressure', date, parameter, level, kind)
ds.to_xarray()

where the date argument is a string with a single date, and the parameter argument is a string or a list of strings with the ECMWF keys described above. Setting the parameter to 'all' downloads all the parameters at the given pressure level. The level argument is the pressure level, as a string or an integer. The kind argument selects the deterministic or the ensemble forecasts, by setting it to 'highres' or 'ensemble' respectively.

Example:

ds = cml.load_dataset('eumetnet-postprocessing-benchmark-training-data-gridded-forecasts-pressure', "2017-12-02", "z", 500, "highres")
ds.to_xarray()
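Because the level argument has to match a level at which the requested parameter is listed, a small lookup can help when preparing requests. This is only a sketch: it assumes that each parameter is provided solely at the level (in hPa) given in the table above, and the names PARAMETER_LEVELS and parameters_at are hypothetical:

```python
# Pressure level (hPa) listed for each ECMWF key in the table above.
PARAMETER_LEVELS = {"t": 850, "u": 700, "v": 700, "z": 500, "q": 700, "r": 850}


def parameters_at(level):
    """Return the ECMWF keys listed for a given pressure level (str or int)."""
    level = int(level)
    return sorted(key for key, lev in PARAMETER_LEVELS.items() if lev == level)


print(parameters_at(700))    # ['q', 'u', 'v']
print(parameters_at("850"))  # ['r', 't']
```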

1.4 - Processed surface variable forecasts

Processed surface variables can be obtained for each forecast date, both for the ensemble (51 members) and deterministic runs. A processed variable is either accumulated, averaged or filtered.

It includes:

Parameter name                        ECMWF key  Remarks
------------------------------------  ---------  -------
Total precipitation                   tp
Surface sensible heat flux            sshf
Surface latent heat flux              slhf
Surface net solar radiation           ssr
Surface net thermal radiation         str
Convective precipitation              cp
Maximum temperature at 2 metres       mx2t6
Minimum temperature at 2 metres       mn2t6
Surface solar radiation downwards     ssrd
Surface thermal radiation downwards   strd
10 metre wind gust                    10fg6

All these variables are accumulated or filtered over the 6 hours preceding a given forecast timestamp. Therefore, the forecasts are available every 6 hours, for the model steps (in hours) from 6 to 240 (6, 12, 18, ..., 234 and 240). All the steps are automatically retrieved.
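To make the 6-hour convention concrete, the sketch below rebuilds the step list and returns the lead-time window that a processed step describes. The names PROCESSED_STEPS and window are illustrative only:

```python
# A processed field at step s describes the 6 hours ending at s.
PROCESSED_STEPS = list(range(6, 241, 6))  # 6, 12, ..., 240


def window(step, length=6):
    """Return the (start, end) lead times in hours covered by a processed step."""
    return (step - length, step)


print(len(PROCESSED_STEPS))  # 40
print(window(24))            # (18, 24)
```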

Usage: The processed surface variables forecasts can be retrieved by calling

ds = cml.load_dataset('eumetnet-postprocessing-benchmark-training-data-gridded-forecasts-surface-processed', date, parameter, kind)
ds.to_xarray()

where the date argument is a string with a single date, and the parameter argument is a string or a list of strings with the ECMWF keys described above. The kind argument selects the deterministic or the ensemble forecasts, by setting it to 'highres' or 'ensemble' respectively.

Note

For technical reasons, most fields cannot be retrieved together with others and must be downloaded on their own. For example, a request with parameter=['tp', 'mx2t6'] will fail, while one with parameter='tp' will succeed.

Example:

ds = cml.load_dataset('eumetnet-postprocessing-benchmark-training-data-gridded-forecasts-surface-processed', "2017-12-02", "mx2t6", "highres")
ds.to_xarray()
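Since most processed fields must be requested on their own, several fields for the same date can be fetched with one request per parameter. A minimal sketch under that assumption; the helper load_processed and its loader argument are hypothetical and not part of climetlab:

```python
PROCESSED = (
    "eumetnet-postprocessing-benchmark-training-data-gridded-forecasts-surface-processed"
)


def load_processed(date, parameters, kind, loader=None):
    """Issue one request per parameter and return the results keyed by ECMWF key.

    `loader` defaults to climetlab.load_dataset; it is an argument only so
    that the loop can be tried without downloading anything.
    """
    if isinstance(parameters, str):
        parameters = [parameters]
    if loader is None:
        import climetlab as cml  # lazy import: only needed for real downloads
        loader = cml.load_dataset
    return {p: loader(PROCESSED, date, p, kind) for p in parameters}


# Intended usage (triggers two separate downloads):
# fields = load_processed("2017-12-02", ["tp", "mx2t6"], "highres")
# tp = fields["tp"].to_xarray()
```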

1.5 - Surface variable reforecasts

The surface variables for the ensemble reforecasts (11 members) can be obtained for each reforecast date. All the variables described in section 1.2 above are available.

The reforecasts are available every 6 hours, for the model steps (in hours) from 0 to 240 (0, 6, 12, ..., 234 and 240). All the steps are automatically retrieved.

Note

The ECMWF reforecasts are only available Mondays and Thursdays. Providing any other date will fail.

Usage: The surface variables reforecasts can be retrieved by calling

ds = cml.load_dataset('eumetnet-postprocessing-benchmark-training-data-gridded-reforecasts-surface', date, parameter)
ds.to_xarray()

where the date argument is a string with a single date, and the parameter argument is a string or a list of strings with the ECMWF keys. Setting the parameter to 'all' downloads all the surface parameters.

Example:

ds = cml.load_dataset('eumetnet-postprocessing-benchmark-training-data-gridded-reforecasts-surface', "2017-12-28", "sd")
ds.to_xarray()
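Because a reforecast request fails for any date that is not a Monday or a Thursday, it can help to check the weekday before calling load_dataset. The sketch below uses only the standard library; the function names are illustrative, and passing the check does not guarantee that a reforecast exists for that date in the dataset:

```python
from datetime import date, timedelta

REFORECAST_WEEKDAYS = (0, 3)  # datetime convention: Monday=0, Thursday=3


def is_reforecast_date(day):
    """True if `day` (a 'YYYY-MM-DD' string) falls on a Monday or a Thursday."""
    return date.fromisoformat(day).weekday() in REFORECAST_WEEKDAYS


def reforecast_dates(start, end):
    """List the Mondays and Thursdays between two ISO dates, both included."""
    first, last = date.fromisoformat(start), date.fromisoformat(end)
    days = (first + timedelta(days=i) for i in range((last - first).days + 1))
    return [d.isoformat() for d in days if d.weekday() in REFORECAST_WEEKDAYS]


print(is_reforecast_date("2017-12-28"))  # True (a Thursday)
print(is_reforecast_date("2017-12-02"))  # False (a Saturday)
print(reforecast_dates("2017-12-25", "2017-12-31"))  # ['2017-12-25', '2017-12-28']
```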

1.6 - Pressure level variable reforecasts

The pressure level variables for the ensemble reforecasts (11 members) can be obtained for each reforecast date. All the variables described in section 1.3 above are available.

The reforecasts are available for the same model steps as the surface variable reforecasts described in section 1.5.

Note

The ECMWF reforecasts are only available Mondays and Thursdays. Providing any other date will fail.

Usage: The pressure level variables reforecasts can be retrieved by calling

ds = cml.load_dataset('eumetnet-postprocessing-benchmark-training-data-gridded-reforecasts-pressure', date, parameter, level)
ds.to_xarray()

where the date argument is a string with a single date, and the parameter argument is a string or a list of strings with the ECMWF keys. Setting the parameter to 'all' downloads all the parameters at the given pressure level. The level argument is the pressure level, as a string or an integer.

Example:

ds = cml.load_dataset('eumetnet-postprocessing-benchmark-training-data-gridded-reforecasts-pressure', "2017-12-28", "z", 500)
ds.to_xarray()

1.7 - Processed surface variable reforecasts

Processed surface variables as described in section 1.4 can also be obtained as ensemble reforecasts (11 members).

The reforecasts are available for the same model steps as the surface variable reforecasts described in section 1.5.

Note

The ECMWF reforecasts are only available Mondays and Thursdays. Providing any other date will fail.

Usage: The processed surface variable reforecasts can be retrieved by calling

ds = cml.load_dataset('eumetnet-postprocessing-benchmark-training-data-gridded-reforecasts-surface-processed', date, parameter)
ds.to_xarray()

where the date argument is a string with a single date, and the parameter argument is a string or a list of strings with the ECMWF keys.

Note

For technical reasons, most fields cannot be retrieved together with others and must be downloaded on their own. For example, a request with parameter=['tp', 'mx2t6'] will fail, while one with parameter='tp' will succeed.

Example:

ds = cml.load_dataset('eumetnet-postprocessing-benchmark-training-data-gridded-reforecasts-surface-processed', "2017-12-28", "mx2t6")
ds.to_xarray()

1.8 - Static fields

Various static fields associated with the forecast grid can be obtained, to serve as predictors for the postprocessing.

Note

For consistency with the rest of the dataset, we use the ECMWF parameter names, terminology and units here. However, please note that, except for the Surface Geopotential, the fields provided come from non-ECMWF data sources evaluated at the grid points. Currently, the main data source used is the Copernicus Land Monitoring Service.

It includes:

Parameter name        ECMWF key  Remarks
--------------------  ---------  -------
Land use              landu      Extracted from the CORINE 2018 dataset. Values and associated land types differ from the ECMWF ones. Please look at the “legend” entry in the metadata for more details.
Model terrain height  mterh      Extracted from the EU-DEMv1.1 digital elevation model dataset.
Surface Geopotential  z          The model orography can be obtained by dividing the surface geopotential by \(g = 9.80665\ \mathrm{m\,s^{-2}}\).

Usage: The static fields can be retrieved by calling

ds = cml.load_dataset('eumetnet-postprocessing-benchmark-training-data-gridded-static-fields', parameter)
ds.to_xarray()

where the parameter argument is a string with one of the ECMWF keys described above. It is only possible to download one static field per call.

Example:

ds = cml.load_dataset('eumetnet-postprocessing-benchmark-training-data-gridded-static-fields', 'mterh')
ds.to_xarray()
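The conversion from surface geopotential to model orography mentioned for the z field is a single division. A minimal sketch on a plain number; the function name is illustrative, and the same division can be applied element-wise to the array obtained from ds.to_xarray():

```python
G = 9.80665  # standard gravity in m s^-2, the value quoted for the z field


def geopotential_to_height(z):
    """Convert surface geopotential (m^2 s^-2) to orography height (m)."""
    return z / G


print(round(geopotential_to_height(9806.65), 3))  # 1000.0
```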

2 - Stations Data

Not yet provided.

3 - Getting the observations corresponding to the (re)forecasts

Once a forecast or reforecast dataset has been loaded, the corresponding observations (if available) can be retrieved in the xarray format by using the get_observations_as_xarray method:

obs = ds.get_observations_as_xarray()

4 - Explanation of the metadata

The following metadata are available in the gridded forecast, reforecast and observation data:

  1. latitude: The latitude of the grid points.

  2. longitude: The longitude of the grid points.

  3. depthBelowLandLayer: the layer below the surface (valid for some variables only, here there is only the upper surface level).

  4. number: the number of the ensemble member. The 0-th member is the control run. Also present in observation, but set to 0.

  5. time: the forecast or reforecast date (reforecasts are only issued on Mondays and Thursdays).

  6. year: a dimension identifying the year in the past. year=1 denotes a forecast valid 20 years before the reforecast date (same day and month), and year=20 a forecast valid one year before the reforecast date. Only valid for reforecasts.

  7. step: the step of the forecast (the lead time).

  8. surface: the layer of the variable considered (here there is just one, at the surface).

  9. valid_time: the actual time and date of the corresponding forecast data.
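The relation between time, step, valid_time and the reforecast year index can be written out with the standard library. This is a sketch of the conventions described in the list above, assuming the noon issuance time stated earlier; the function names valid_time and reforecast_year are illustrative:

```python
from datetime import datetime, timedelta


def valid_time(time, step):
    """valid_time is the issuance time plus the lead time (step, in hours)."""
    return time + timedelta(hours=step)


def reforecast_year(reforecast_date, year):
    """Calendar year of the past forecast for a given `year` index (1 to 20).

    year=1 is 20 years before the reforecast date, year=20 is one year before.
    """
    return reforecast_date.year - (21 - year)


run = datetime(2017, 12, 28, 12)  # noon run of 28 December 2017
print(valid_time(run, 36))        # 2017-12-30 00:00:00
print(reforecast_year(run, 1))    # 1997
print(reforecast_year(run, 20))   # 2016
```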

5 - Major ECMWF model changes

In 2017 and 2018, there were two changes to the ECMWF model in total:

Implementation date  Summary of changes  Resolution  Full IFS documentation
-------------------  ------------------  ----------  -----------------------------
05-Jun-2018          Cycle 45r1          Unchanged   Cycle 45r1 full documentation
11-Jul-2017          Cycle 43r3          Unchanged   Cycle 43r3 full documentation

Source: https://www.ecmwf.int/en/forecasts/documentation-and-support/changes-ecmwf-model
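When training on data that spans a model change, it can be useful to tag each forecast date with the cycle that produced it. The sketch below only encodes the two changes listed above; it assumes that a forecast issued on an implementation date already uses the new cycle, and it returns None for dates before the first listed change. The names MODEL_CHANGES and cycle_on are hypothetical:

```python
from datetime import date

# Implementation dates and cycles of the two listed model changes.
MODEL_CHANGES = [(date(2017, 7, 11), "43r3"), (date(2018, 6, 5), "45r1")]


def cycle_on(day):
    """Return the listed cycle in use on `day`, or None if `day` precedes
    the first listed change."""
    current = None
    for implemented, cycle in MODEL_CHANGES:
        if day >= implemented:
            current = cycle
    return current


print(cycle_on(date(2018, 1, 1)))  # 43r3
print(cycle_on(date(2018, 6, 5)))  # 45r1
print(cycle_on(date(2017, 1, 1)))  # None
```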

Data License

See the DATA_LICENSE file.