Description
Is your feature request related to a problem? Please describe.
The current pruning mechanisms are time, tags and partitioning. We often run into queries that end up targeting every split in the index because they can leverage none of these. A typical example is when we have two time dimensions, event time and ingestion time. Only one of them can be used as the time field in Quickwit, but we might want to query on the second one.
Describe the solution you'd like
I would like to add a configuration (e.g. stats_fields) in the doc mapping, similar to tag_fields:
- it would be only compatible with numerical fields
- when packaging the split, we would compute min and max and add it to the metadata
- we would store the min and max in the metastore by one of the following:
  - adding a field to split_metadata_json. This is simple to set up; we could probably still push the pruning down to the metastore, but it would be quite expensive.
  - using an encoding similar to tags, with something like a Vec where min and max values would be encoded
  - using a JSON type
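To make the proposal concrete, here is a minimal sketch of the min/max bookkeeping and the pruning check it would enable. It is not Quickwit code: `FieldStats`, `SplitStats`, `record` and `can_prune` are hypothetical names, values are assumed to be `i64`, and how the stats are serialized into the metastore is left open.

```rust
use std::collections::HashMap;

/// Inclusive min/max of one numerical field over all documents in a split.
/// Hypothetical type, for illustration only.
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct FieldStats {
    pub min: i64,
    pub max: i64,
}

/// Hypothetical per-split statistics, keyed by field name.
#[derive(Debug, Default)]
pub struct SplitStats {
    pub fields: HashMap<String, FieldStats>,
}

impl SplitStats {
    /// Update the stats with one indexed value of a stats field.
    /// Would be called while the split is being built.
    pub fn record(&mut self, field: &str, value: i64) {
        self.fields
            .entry(field.to_string())
            .and_modify(|s| {
                s.min = s.min.min(value);
                s.max = s.max.max(value);
            })
            .or_insert(FieldStats { min: value, max: value });
    }

    /// Returns true if the split cannot contain a value of `field`
    /// in the inclusive query range `lower..=upper`.
    pub fn can_prune(&self, field: &str, lower: i64, upper: i64) -> bool {
        match self.fields.get(field) {
            // The split range and the query range do not overlap.
            Some(s) => s.max < lower || s.min > upper,
            // No stats recorded for this field: stay conservative, keep the split.
            None => false,
        }
    }
}
```

With this shape, a query filtering on ingestion time between 300 and 400 could skip any split whose recorded maximum is below 300 or whose minimum is above 400, without opening the split.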
Describe alternatives you've considered
We can come up with workarounds using tags. For instance, for the problem of the secondary time dimension, we could record the ingestion day as a tag. But it's less flexible, harder to set up for users, and more costly.