Challenge #12 - Size, precision, speed - pick two: implementation #2

@EsperanzaCuartero

Challenge 12 - Size, precision, speed - pick two: implementation

Stream 1 - Software development for weather, climate and atmosphere

Goal

This project is a follow-up to the ESoWC 2020 data encoding optimisation challenge.
Based on the results and findings of that completed project, we will implement an improved data packing configuration in our production streams. We would also like to analyse some new atmospheric composition and meteorological datasets.

Mentors and skills

  • Mentors: @miha-at-ecmwf @juanjodd
  • Skills required:
    • Some knowledge of meteorological data formats (GRIB, NetCDF) and of the libraries and tools used to decode and manipulate them (ecCodes, NetCDF, CDO, NCO, ...)
    • Some knowledge of data encoding (data packing, accuracy, compression methods)
    • Knowledge of a software library to compute and present the results
    • Some familiarity with Chemical Transport Modelling (CTM) or Numerical Weather Prediction (NWP) would be beneficial for appreciating this challenge

Note: This challenge is funded by Copernicus. Only nationals from the European Union and ECMWF Member States are eligible to apply (see Terms and Conditions).


Challenge description

Data and software
We plan to use the CAMS global real-time forecast dataset, together with the ecCodes and NetCDF libraries, to test different encoding configurations and estimate the resulting data encoding errors. A software library (Python, R or Julia) will be used to compute and present the results.

What is the current problem?
Because the data encoding configuration is not optimal, there is a lot of artificial precision in our data. As a result, datasets are expensive to archive and move, and difficult to use.

What could be the solution?
We would like to remove artificial precision from the encoded fields without any loss of information. At the same time, we need to be mindful of operational constraints, so that the data encoding and decoding steps do not become prohibitively expensive. The desired solution would be a combination of data encoding settings and processing steps that achieves this goal.
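
To make the trade-off between packed size and precision concrete, here is a minimal sketch of linear quantization to a fixed number of bits per value, in the spirit of simple packing. It is an illustration only, not the ecCodes implementation: the function names (`pack_linear`, `unpack_linear`, `max_abs_error`) and the synthetic field are assumptions, and the step size is not restricted to a power of two.

```python
import math


def pack_linear(values, bits):
    """Quantize floats to unsigned integer codes of the given bit width."""
    ref = min(values)                      # reference value (field minimum)
    span = max(values) - ref
    levels = (1 << bits) - 1               # largest representable code
    step = span / levels if span > 0 else 1.0
    # Round to the nearest code and clamp to guard against float round-off.
    codes = [min(levels, round((v - ref) / step)) for v in values]
    return ref, step, codes


def unpack_linear(ref, step, codes):
    """Reconstruct approximate floats from the integer codes."""
    return [ref + c * step for c in codes]


def max_abs_error(values, bits):
    """Largest absolute difference between original and decoded values."""
    ref, step, codes = pack_linear(values, bits)
    decoded = unpack_linear(ref, step, codes)
    return max(abs(a - b) for a, b in zip(values, decoded))


# Synthetic, temperature-like field (hypothetical data, for illustration only).
field = [280.0 + 15.0 * math.sin(i / 50.0) for i in range(1000)]

for bits in (8, 12, 16, 24):
    print(bits, max_abs_error(field, bits))
```

The maximum error is bounded by half the quantization step, so each extra bit per value roughly halves it. Once that error falls below the real accuracy of the field, further bits only add artificial precision.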

Ideas for the implementation
Things to address: choosing more appropriate packing methods, encoding float arrays, and exploring suitable data compression algorithms.
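
One way these ideas might combine for float arrays is to round away low-order mantissa bits and then apply a general-purpose lossless compressor. The sketch below uses only the Python standard library; `round_mantissa`, `compressed_size`, the choice of 10 kept mantissa bits and the synthetic field are assumptions for illustration, not settings proposed by the challenge. It assumes finite values well inside the float32 range.

```python
import math
import struct
import zlib


def round_mantissa(value, keep_bits):
    """Round a float32 to keep_bits explicit mantissa bits.

    Rounds to nearest, with ties going away from zero. Assumes a finite
    value that is not close to the float32 maximum.
    """
    drop = 23 - keep_bits                  # float32 has 23 explicit mantissa bits
    (raw,) = struct.unpack("<I", struct.pack("<f", value))
    if drop <= 0:
        return struct.unpack("<f", struct.pack("<I", raw))[0]
    half = 1 << (drop - 1)
    mask = ~((1 << drop) - 1) & 0xFFFFFFFF
    raw = (raw + half) & mask              # add half a step, then clear dropped bits
    return struct.unpack("<f", struct.pack("<I", raw))[0]


def compressed_size(values):
    """Size in bytes of the float32 array after zlib compression."""
    payload = struct.pack(f"<{len(values)}f", *values)
    return len(zlib.compress(payload, 6))


# Synthetic field with small-scale noise (hypothetical data, for illustration only).
field = [
    280.0 + 15.0 * math.sin(i / 50.0) + 0.01 * math.sin(i * 12.9898)
    for i in range(4096)
]
rounded = [round_mantissa(v, 10) for v in field]

print("original:", compressed_size(field), "bytes")
print("rounded: ", compressed_size(rounded), "bytes")
```

Keeping 10 mantissa bits bounds the relative error at about 2^-11, and the zeroed trailing bits give the compressor long runs of identical bytes to exploit. The number of bits to keep would have to be chosen per parameter from its real information content.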

