-
Notifications
You must be signed in to change notification settings - Fork 179
Description
What is the bug?
The correctness issue happens on following PPL queries on Clickbench:
Q15 and its plan
Q16 and its plan
Q17 and its plan
Q22 and its plan
Q34 and its plan
Q35 and its plan
For example, the query of Q16 is fully pushed down to DSL.
The following DSL query returns incorrect (approximate) TopK:
{
"from": 0,
"size": 0,
"timeout": "1m",
"aggregations": {
"UserID": {
"terms": {
"field": "UserID",
"size": 10,
"min_doc_count": 1,
"shard_min_doc_count": 0,
"show_term_doc_count_error": false,
"order": [
{
"count()": "desc"
},
{
"_key": "asc"
}
]
},
"aggregations": {
"count()": {
"value_count": {
"field": "_index"
}
}
}
}
}
}
The result is
{"took":4205,"timed_out":false,"terminated_early":true,"_shards":{"total":1,"successful":1,"skipped":0,"failed":0},"hits":{"total":{"value":10000,"relation":"gte"},"max_score":null,"hits":[]},"aggregations":{"UserID":{"doc_count_error_upper_bound":-1,"sum_other_doc_count":99890741,"buckets":[{"key":1313338681122956954,"doc_count":29097,"count()":{"value":29097}},{"key":1907779576417363396,"doc_count":25331,"count()":{"value":25331}},{"key":2305303682471783379,"doc_count":10594,"count()":{"value":10594}},{"key":7982623143712728547,"doc_count":7584,"count()":{"value":7584}},{"key":6018350421959114808,"doc_count":6678,"count()":{"value":6678}},{"key":1090981537032625727,"doc_count":6197,"count()":{"value":6197}},{"key":5730251990344211405,"doc_count":6019,"count()":{"value":6019}},{"key":7280399273658728997,"doc_count":6018,"count()":{"value":6018}},{"key":835157184735512989,"doc_count":5211,"count()":{"value":5211}},{"key":770542365400669095,"doc_count":4027,"count()":{"value":4027}}]}}}
But the expected result is
UserID,count(1)
1313338681122956954,29097
1907779576417363396,25333
2305303682471783379,10597
7982623143712728547,7584
6018350421959114808,6678
7280399273658728997,6411
1090981537032625727,6197
5730251990344211405,6019
835157184735512989,5211
770542365400669095,4906
The actual doc_count is 6411 by querying following DSL query:
{
"size": 0,
"track_total_hits": true,
"query": {
"bool": {
"must": [
{
"term": {
"UserID": "7280399273658728997"
}
}
]
}
}
}
This is an OpenSearch Core issue, cannot be fixed within SQL plugin.
Metadata
Metadata
Assignees
Labels
Type
Projects
Status