Pushdown the `top` `rare` commands to nested aggregation #4707

LantaoJin · 2025-10-30T15:30:22Z

Description

Push down `top 10 usenull=true bytes by method` to a nested terms aggregation:

{
  "size": 0,
  "aggs": {
    "topBy": {
      "terms": {
        "field": "method"
      },
      "aggs": {
        "topField": {
          "terms": {
            "field": "bytes",
            "size": 10,
            "order": {
              "_count": "desc"
            }
          }
        }
      }
    }
  }
}

And push down `rare 10 usenull=true bytes by method` to:

{
  "size": 0,
  "aggs": {
    "topBy": {
      "terms": {
        "field": "method"
      },
      "aggs": {
        "topField": {
          "terms": {
            "field": "bytes",
            "size": 10,
            "order": {
              "_count": "asc"
            }
          }
        }
      }
    }
  }
}
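The two request bodies above differ only in the `_count` order of the inner terms aggregation. A minimal sketch of that mapping follows, assuming plain string formatting rather than the OpenSearch aggregation builders the PR itself presumably uses. The class and method names (`RareTopAggSketch`, `nestedTermsAgg`) are hypothetical, and field names are not JSON-escaped.

```java
import java.util.Locale;

public final class RareTopAggSketch {

    /**
     * Builds the nested terms aggregation body for a top/rare query.
     * Hypothetical helper for illustration only; it is not part of the PR.
     *
     * @param byField     the grouping field (outer terms aggregation)
     * @param targetField the field whose values are ranked (inner terms aggregation)
     * @param k           how many values to keep per group
     * @param rare        true for rare (ascending count), false for top (descending count)
     */
    public static String nestedTermsAgg(String byField, String targetField, int k, boolean rare) {
        String order = rare ? "asc" : "desc";
        return String.format(
            Locale.ROOT,
            "{\"size\":0,\"aggs\":{\"topBy\":{\"terms\":{\"field\":\"%s\"},"
                + "\"aggs\":{\"topField\":{\"terms\":{\"field\":\"%s\",\"size\":%d,"
                + "\"order\":{\"_count\":\"%s\"}}}}}}}",
            byField, targetField, k, order);
    }

    public static void main(String[] args) {
        // top 10 bytes by method
        System.out.println(nestedTermsAgg("method", "bytes", 10, false));
        // rare 10 bytes by method
        System.out.println(nestedTermsAgg("method", "bytes", 10, true));
    }
}
```

Only the inner `size` and `order` depend on the command; the outer aggregation is the same for `top` and `rare`.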

Additionally, rename metrics to measure and rename the rule files.

Related Issues

Resolves #4671

Check List

New functionality includes testing.
New functionality has been documented.
New functionality has javadoc added.
New functionality has a user manual doc added.
New PPL command checklist all confirmed.
API changes companion pull request created.
Commits are signed per the DCO using --signoff or -s.
Public documentation issue/PR created.

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

Signed-off-by: Lantao Jin <[email protected]>

yuancu · 2025-10-31T07:42:32Z

integ-test/src/test/resources/expectedOutput/calcite/explain_rare_usenull_false.yaml

-      EnumerableCalc(expr#0..3=[{inputs}], expr#4=[2], expr#5=[<=($t3, $t4)], proj#0..2=[{exprs}], $condition=[$t5])
-        EnumerableWindow(window#0=[window(partition {0} order by [2] rows between UNBOUNDED PRECEDING and CURRENT ROW aggs [ROW_NUMBER()])])
-          CalciteEnumerableIndexScan(table=[[OpenSearch, opensearch-sql_test_index_account]], PushDownContext=[[AGGREGATION->rel#:LogicalAggregate.NONE.[](input=RelSubset#,group={0, 1},count=COUNT())], OpenSearchRequestBuilder(sourceBuilder={"from":0,"size":0,"timeout":"1m","aggregations":{"composite_buckets":{"composite":{"size":1000,"sources":[{"gender":{"terms":{"field":"gender.keyword","missing_bucket":false,"order":"asc"}}},{"state":{"terms":{"field":"state.keyword","missing_bucket":false,"order":"asc"}}}]}}}}, requestedTotalSize=2147483647, pageSize=null, startFrom=0)])
+    CalciteEnumerableIndexScan(table=[[OpenSearch, opensearch-sql_test_index_account]], PushDownContext=[[AGGREGATION->rel#:LogicalAggregate.NONE.[](input=RelSubset#,group={0, 1},count=COUNT()), RARE_TOP->rare 2 state by gender, PROJECT->[gender, state, count], LIMIT->10000], OpenSearchRequestBuilder(sourceBuilder={"from":0,"size":0,"timeout":"1m","aggregations":{"gender":{"terms":{"field":"gender.keyword","size":10000,"min_doc_count":1,"shard_min_doc_count":0,"show_term_doc_count_error":false,"order":[{"_count":"desc"},{"_key":"asc"}]},"aggregations":{"state":{"terms":{"field":"state.keyword","size":2,"min_doc_count":1,"shard_min_doc_count":0,"show_term_doc_count_error":false,"order":[{"_count":"asc"},{"_key":"asc"}]}}}}}}, requestedTotalSize=2147483647, pageSize=null, startFrom=0)])


[non-blocking] It's advised to use rare_terms aggregation in place of terms aggregation with ascending count order: https://docs.opensearch.org/latest/aggregations/bucket/rare-terms/

Replace rare in a follow-up PR?

Is it correct to push down the limit into the size of the first-level terms agg? I guess it will produce 10000 * 2 rows in the end if there are enough buckets.

You guessed right. In general, neither 65535 nor 10000 is precise if the first group has 100,000 keys.

This PR is not the right place to fix it. @qianheng-aws, can you submit a separate fix for limit pushdown on nested aggregations: push down to the first tier and keep EnumerableLimit?
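The arithmetic behind this thread can be sketched as follows. This is an illustration under assumptions, not the follow-up fix: `NestedLimitSketch`, `maxRows`, and `applyLimit` are hypothetical names, and `applyLimit` only stands in for what a retained EnumerableLimit would do after the nested buckets are flattened into rows.

```java
import java.util.ArrayList;
import java.util.List;

public final class NestedLimitSketch {

    /** Upper bound on rows from a two-level terms aggregation: outer buckets times inner size. */
    public static long maxRows(long outerBuckets, long innerSize) {
        return Math.multiplyExact(outerBuckets, innerSize);
    }

    /** Trims the flattened rows to the query limit, as a limit operator kept above the scan would. */
    public static <T> List<T> applyLimit(List<T> rows, int limit) {
        return new ArrayList<>(rows.subList(0, Math.min(limit, rows.size())));
    }

    public static void main(String[] args) {
        // Outer size 10000 (the pushed-down LIMIT) and inner size 2 (rare 2):
        System.out.println(maxRows(10_000, 2)); // 20000, twice the intended limit

        // Simulate the flattened result and trim it back to the limit.
        List<Integer> rows = new ArrayList<>();
        for (int i = 0; i < 20_000; i++) {
            rows.add(i);
        }
        System.out.println(applyLimit(rows, 10_000).size()); // 10000
    }
}
```

Pushing the limit into the first tier bounds the number of groups, not the number of rows, which is why the suggestion above keeps the limit operator as well.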

yuancu · 2025-10-31T07:51:29Z

core/src/main/java/org/opensearch/sql/calcite/CalciteRelNodeVisitor.java

+
+    //


Unintentional insertion of empty comments?

qianheng-aws · 2025-10-31T09:09:53Z

integ-test/src/test/resources/expectedOutput/calcite/explain_rare_usenull_false.yaml

-      EnumerableCalc(expr#0..3=[{inputs}], expr#4=[2], expr#5=[<=($t3, $t4)], proj#0..2=[{exprs}], $condition=[$t5])
-        EnumerableWindow(window#0=[window(partition {0} order by [2] rows between UNBOUNDED PRECEDING and CURRENT ROW aggs [ROW_NUMBER()])])
-          CalciteEnumerableIndexScan(table=[[OpenSearch, opensearch-sql_test_index_account]], PushDownContext=[[AGGREGATION->rel#:LogicalAggregate.NONE.[](input=RelSubset#,group={0, 1},count=COUNT())], OpenSearchRequestBuilder(sourceBuilder={"from":0,"size":0,"timeout":"1m","aggregations":{"composite_buckets":{"composite":{"size":1000,"sources":[{"gender":{"terms":{"field":"gender.keyword","missing_bucket":false,"order":"asc"}}},{"state":{"terms":{"field":"state.keyword","missing_bucket":false,"order":"asc"}}}]}}}}, requestedTotalSize=2147483647, pageSize=null, startFrom=0)])
+    CalciteEnumerableIndexScan(table=[[OpenSearch, opensearch-sql_test_index_account]], PushDownContext=[[AGGREGATION->rel#:LogicalAggregate.NONE.[](input=RelSubset#,group={0, 1},count=COUNT()), RARE_TOP->rare 2 state by gender, PROJECT->[gender, state, count], LIMIT->10000], OpenSearchRequestBuilder(sourceBuilder={"from":0,"size":0,"timeout":"1m","aggregations":{"gender":{"terms":{"field":"gender.keyword","size":10000,"min_doc_count":1,"shard_min_doc_count":0,"show_term_doc_count_error":false,"order":[{"_count":"desc"},{"_key":"asc"}]},"aggregations":{"state":{"terms":{"field":"state.keyword","size":2,"min_doc_count":1,"shard_min_doc_count":0,"show_term_doc_count_error":false,"order":[{"_count":"asc"},{"_key":"asc"}]}}}}}}, requestedTotalSize=2147483647, pageSize=null, startFrom=0)])


Is it correct to push down the limit into the size of the first-level terms agg? I guess it will produce 10000 * 2 rows in the end if there are enough buckets.

qianheng-aws · 2025-10-31T09:15:15Z

...earch/src/main/java/org/opensearch/sql/opensearch/storage/scan/AbstractCalciteIndexScan.java

                      rowCount,
                      RelMdUtil.guessSelectivity(((FilterDigest) operation.digest()).condition()));
                  case LIMIT -> Math.min(rowCount, ((LimitDigest) operation.digest()).limit());
+                  case RARE_TOP -> Math.min(


Is it accurate to set the row count to the rare/top number directly? Shouldn't it be the rare/top number * the estimated number of buckets?
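A minimal sketch of the estimate proposed in this comment, assuming the number of groups can be estimated somehow. The names `RareTopRowCountSketch`, `estimateRowCount`, and `estimatedGroups` are hypothetical; this is not necessarily what the later "Fix cost computation" commit does.

```java
public final class RareTopRowCountSketch {

    /**
     * Estimates the rows produced by a pushed-down rare/top: at most `number` rows
     * per group, and never more than the input row count.
     *
     * @param rowCount        estimated rows before rare/top is applied
     * @param number          the N in "top N" or "rare N"
     * @param estimatedGroups assumed estimate of distinct by-field values; treated as at least 1
     */
    public static double estimateRowCount(double rowCount, int number, double estimatedGroups) {
        return Math.min(rowCount, number * Math.max(1.0, estimatedGroups));
    }

    public static void main(String[] args) {
        System.out.println(estimateRowCount(1000.0, 2, 3.0)); // 6.0
        System.out.println(estimateRowCount(5.0, 2, 10.0));   // 5.0
    }
}
```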

qianheng-aws · 2025-10-31T09:15:29Z

...earch/src/main/java/org/opensearch/sql/opensearch/storage/scan/AbstractCalciteIndexScan.java

          // Because we'd like to push down LIMIT even when the fetch in LIMIT is greater than
          // dRows.
        case LIMIT -> dRows = Math.min(dRows, ((LimitDigest) operation.digest()).limit()) - 1;
+        case RARE_TOP -> dRows = Math.min(dRows, ((RareTopDigest) operation.digest()).number()) - 1;


Signed-off-by: Lantao Jin <[email protected]>

Optimize the top rare commands to nested aggregation

bb776bb

Signed-off-by: Lantao Jin <[email protected]>

LantaoJin added the enhancement New feature or request label Oct 30, 2025

LantaoJin marked this pull request as ready for review October 30, 2025 15:36

LantaoJin changed the title ~~Optimize the top rare commands to nested aggregation~~ Pushdown the top rare commands to nested aggregation Oct 30, 2025

Merge remote-tracking branch 'upstream/main' into issues/4671

359911a

Signed-off-by: Lantao Jin <[email protected]>

LantaoJin added the backport 2.19-dev label Oct 31, 2025

yuancu previously approved these changes Oct 31, 2025

qianheng-aws reviewed Oct 31, 2025

LantaoJin added 2 commits October 31, 2025 17:40

Merge remote-tracking branch 'upstream/main' into issues/4671

a5f0375

Fix cost computation

b6f2851

Signed-off-by: Lantao Jin <[email protected]>

LantaoJin dismissed yuancu’s stale review via b6f2851 October 31, 2025 10:20
