Challenge 12 - Size, precision, speed - pick two: implementation
Stream 1 - Software development for weather, climate and atmosphere
Goal
This project is a follow-up of the ESoWC 2020 data encoding optimisation challenge.
Based on the results and findings of the completed project, we will implement an improved data packing configuration in our production streams. We would also like to analyze some new atmospheric composition and meteorological datasets.
Mentors and skills
- Mentors: @miha-at-ecmwf @juanjodd
- Skills required:
  - Some knowledge of meteorological data formats (GRIB, NetCDF) and of libraries to decode and manipulate them (ecCodes, netcdf, cdo, nco, ...)
  - Some knowledge of data encoding (data packing, accuracy, compression methods)
  - Knowledge of a software library to compute and present the results
  - Some familiarity with Chemical Transport Modelling (CTM) or Numerical Weather Prediction (NWP) would be beneficial for appreciating this challenge
Note: This challenge is funded by Copernicus. Only nationals from the European Union and ECMWF Member States are eligible to apply (see Terms and Conditions).
Challenge description
Data and software
We plan to use the CAMS global real-time forecast dataset, the ecCodes and NetCDF libraries to test different configurations and estimate data encoding errors, and a software library (Python, R or Julia) to compute and present the results.
What is the current problem?
Due to a non-optimal data encoding configuration, there is a lot of artificial precision in our data. As a result, datasets are expensive to archive and move, and difficult to use.
What could be the solution?
We would like to remove artificial precision from the encoded fields without any loss of information. At the same time, we need to be conscious of operational constraints, so that the data encoding and decoding steps do not become prohibitively expensive. The desired solution would be a combination of data encoding settings and steps that achieves this goal.
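To illustrate what removing artificial precision could look like, here is a minimal Python sketch using only the standard library. It rounds float32 values to a fixed number of mantissa bits and compares the zlib-compressed size before and after. The synthetic field, the choice of 10 kept bits, and the helper names `round_mantissa` and `compressed_size` are assumptions made for illustration; they are not part of the project, and the number of bits that can be dropped without losing information would have to be determined per variable.

```python
import math
import random
import struct
import zlib


def round_mantissa(value: float, keep_bits: int) -> float:
    """Round a value to float32 with only keep_bits explicit mantissa bits."""
    drop = 23 - keep_bits  # float32 stores 23 explicit mantissa bits
    if drop <= 0:
        return struct.unpack("<f", struct.pack("<f", value))[0]
    (bits,) = struct.unpack("<I", struct.pack("<f", value))
    bits += 1 << (drop - 1)                  # add half of the dropped range
    bits &= ~((1 << drop) - 1) & 0xFFFFFFFF  # zero the dropped bits
    return struct.unpack("<f", struct.pack("<I", bits))[0]


def compressed_size(values) -> int:
    """Size in bytes of the values stored as float32 and compressed with zlib."""
    raw = struct.pack(f"<{len(values)}f", *values)
    return len(zlib.compress(raw, 6))


# Hypothetical field: a smooth signal plus noise standing in for the
# "artificial precision" in the trailing mantissa bits.
rng = random.Random(0)
field = [
    1e-8 * (1.0 + 0.5 * math.sin(i / 50.0)) + rng.uniform(0.0, 1e-10)
    for i in range(10000)
]

KEEP_BITS = 10  # assumption; should come from an information analysis
rounded = [round_mantissa(v, KEEP_BITS) for v in field]

# Largest relative difference introduced by the rounding.
max_rel_err = max(abs(r - v) / abs(v) for r, v in zip(rounded, field))

print("compressed size, original:", compressed_size(field))
print("compressed size, rounded: ", compressed_size(rounded))
print("max relative error:       ", max_rel_err)
```

The rounding itself is a cheap bitwise operation, which matters given the operational constraints mentioned above; the compression step is where most of the encoding and decoding cost would be expected.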
Ideas for the implementation
Things to address: more appropriate packing methods, encoding of float arrays, and the use of suitable data compression algorithms.
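As a starting point for comparing packing configurations, the following Python sketch mimics GRIB simple packing, where a value is reconstructed as R + X * 2^E (reference value R, binary scale factor E, integer X, decimal scale factor assumed to be 0). It reports the maximum absolute error for several bits-per-value settings. This is a simplified illustration, not the ecCodes implementation; the synthetic field and the function names `simple_pack` and `simple_unpack` are hypothetical.

```python
import math


def simple_pack(values, nbits):
    """Quantise values to nbits-wide integers: value ~ ref + x * 2**e."""
    ref = min(values)
    span = max(values) - ref
    max_int = (1 << nbits) - 1
    if span == 0:
        return ref, 0, [0] * len(values)
    # Smallest binary scale factor for which the whole range fits in nbits.
    e = math.ceil(math.log2(span / max_int))
    scale = 2.0 ** e
    # Clamp to guard against floating-point edge cases at the top of the range.
    ints = [min(max_int, round((v - ref) / scale)) for v in values]
    return ref, e, ints


def simple_unpack(ref, e, ints):
    """Reconstruct approximate values from the packed representation."""
    scale = 2.0 ** e
    return [ref + x * scale for x in ints]


# Hypothetical field, e.g. a temperature-like variable in kelvin.
field = [250.0 + 30.0 * math.sin(i / 100.0) for i in range(5000)]

errors = {}  # bits per value -> maximum absolute encoding error
scales = {}  # bits per value -> binary scale factor chosen
for nbits in (8, 12, 16, 24):
    ref, e, ints = simple_pack(field, nbits)
    decoded = simple_unpack(ref, e, ints)
    errors[nbits] = max(abs(d - v) for d, v in zip(decoded, field))
    scales[nbits] = e
    print(f"{nbits:2d} bits: binary scale 2**{e}, max abs error {errors[nbits]:.3e}")
```

The maximum error is bounded by half the quantisation step, so each additional bit per value roughly halves the error while increasing the encoded size. Comparing this error against the real information content of a variable is one way to decide how many bits are justified.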
