CNDB-15155: Segregate SAI's query metrics per query type #1969

adelapena · 2025-08-28T14:42:32Z

SAI tracks query metrics that provide information about latency, partition reads, filtered rows, etc. These metrics cover all the SAI queries for a given table. When we see a problem in queries, for example high latency, the questions that usually come next are: which query is producing this high latency? Are there vector queries? Are the queries for a single partition? In that case, what is the latency of those queries? How many rows are they selecting? To answer those questions, I think it would be useful to track metrics for specific types of query.

The types of query depend on multiple characteristics, like being top-k, containing disjunctions, etc. Combining those characteristics yields an overwhelming number of possible query types. Storing that many metrics is probably not feasible, so we should choose the ones we consider most interesting.

This PR only adds a few types of query: filtering, top-k, single-partition and range queries. If we need new types in the future, this patch should make it easy to add separate metrics for them, possibly with a one-liner.

The general query metrics remain untouched; this PR only adds new ones.

Here is what the SAI metrics look like without this patch:

And this is with this patch:

github-actions · 2025-08-28T14:42:54Z

eolivelli · 2025-08-28T14:51:04Z

src/java/org/apache/cassandra/index/sai/metrics/TableQueryMetrics.java

These are all counters. This will work well on Astra, and we create each metric only when needed.

We have to ensure that the metrics are unregistered in CNDB when the table is unloaded, but there is already a mechanism for that, and we will plug into it.

There are also ten histograms per type of query, in TableQueryMetrics.PerQuery. I still have to add config/system properties to optionally disable them in CC, independently of CNDB.
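The idea of creating a counter only when it is first needed, and dropping all of them when the table is unloaded, can be sketched as follows. This is not the code in TableQueryMetrics: the class `LazyCounters`, its methods and the counter names are hypothetical, and it uses plain `java.util.concurrent` types rather than the metrics registry Cassandra actually registers with.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.LongAdder;

// Hypothetical sketch: counters are created on first use and dropped together on unload.
final class LazyCounters
{
    private final ConcurrentHashMap<String, LongAdder> counters = new ConcurrentHashMap<>();

    // Creates the counter the first time a name is seen, then increments it.
    void increment(String name)
    {
        counters.computeIfAbsent(name, k -> new LongAdder()).increment();
    }

    // Returns 0 for a counter that was never created, without creating it.
    long count(String name)
    {
        LongAdder adder = counters.get(name);
        return adder == null ? 0 : adder.sum();
    }

    // Number of counters that currently exist.
    int registered()
    {
        return counters.size();
    }

    // Stand-in for unregistering every metric when the table is unloaded.
    void release()
    {
        counters.clear();
    }

    public static void main(String[] args)
    {
        LazyCounters metrics = new LazyCounters();
        metrics.increment("hypothetical.topk.queries"); // made-up counter name
        metrics.increment("hypothetical.topk.queries");
        System.out.println(metrics.count("hypothetical.topk.queries") + " queries, "
                           + metrics.registered() + " counter(s) registered");
        metrics.release();
        System.out.println(metrics.registered() + " counter(s) after release");
    }
}
```

Nothing is allocated for a kind of query that never runs against the table, which is the property the review comment relies on.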

adelapena · 2025-09-01T12:06:30Z

I'm also adding a separate metrics category for hybrid queries, meaning queries that are both filtering and top-k at the same time.

I'm also adding system properties to separately enable/disable SAI query type metrics for table (counters) and for query (histograms). The per-table counters are enabled by default, whereas the per-query histograms are disabled by default.
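As a rough illustration of those defaults, the sketch below reads the two property names listed in the merged commit message further down this page. Only the property names come from this PR. The helper class `QueryKindMetricsConfig` is hypothetical, and reading the values with plain `System.getProperty` is an assumption made for the example, not necessarily how the patch does it.

```java
// Hypothetical helper; only the two property names are taken from this PR.
final class QueryKindMetricsConfig
{
    static final String PER_TABLE = "cassandra.sai.metrics.query_kind.per_table.enabled";
    static final String PER_QUERY = "cassandra.sai.metrics.query_kind.per_query.enabled";

    // Per-table counters: enabled unless the property is explicitly set to something other than "true".
    static boolean perTableEnabled()
    {
        return Boolean.parseBoolean(System.getProperty(PER_TABLE, "true"));
    }

    // Per-query histograms: disabled unless the property is explicitly set to "true".
    static boolean perQueryEnabled()
    {
        return Boolean.parseBoolean(System.getProperty(PER_QUERY, "false"));
    }

    public static void main(String[] args)
    {
        System.out.println("per-table counters enabled: " + perTableEnabled());
        System.out.println("per-query histograms enabled: " + perQueryEnabled());
    }
}
```

With this sketch, starting the JVM with `-Dcassandra.sai.metrics.query_kind.per_query.enabled=true` would turn the histograms on.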

pkolaczk

Looks very good to me.
Just my usual nagging about naming ;)

pkolaczk · 2025-09-02T10:39:23Z

src/java/org/apache/cassandra/db/MultiRangeReadCommand.java

Suggested change

-        if (dataRanges.size() != 1)
-            return false;
-        return dataRanges.get(0).isSinglePartition();
+        return dataRanges.size() == 1 && dataRanges.get(0).isSinglePartition();
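A small self-contained check that the one-line form behaves like the original three lines, including for an empty list, where `&&` short-circuits before `get(0)` is reached. The `Range` interface is a hypothetical stand-in for the real data range type and models only `isSinglePartition()`.

```java
import java.util.List;

final class SinglePartitionCheck
{
    // Hypothetical stand-in for the real data range type.
    interface Range
    {
        boolean isSinglePartition();
    }

    // Form before the suggestion.
    static boolean original(List<Range> dataRanges)
    {
        if (dataRanges.size() != 1)
            return false;
        return dataRanges.get(0).isSinglePartition();
    }

    // Suggested form: get(0) is evaluated only when the size check passes.
    static boolean suggested(List<Range> dataRanges)
    {
        return dataRanges.size() == 1 && dataRanges.get(0).isSinglePartition();
    }

    public static void main(String[] args)
    {
        Range single = () -> true;
        Range multi = () -> false;
        List<List<Range>> cases = List.of(List.<Range>of(),
                                          List.of(single),
                                          List.of(multi),
                                          List.of(single, single));
        // Print whether both forms agree on each case.
        for (List<Range> ranges : cases)
            System.out.println(ranges.size() + " range(s): forms agree = "
                               + (original(ranges) == suggested(ranges)));
    }
}
```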

pkolaczk · 2025-09-02T10:49:59Z

src/java/org/apache/cassandra/index/sai/metrics/TableQueryMetrics.java

Minor suggestion: maybe name them "FilterOnly" and "TopKOnly" ?
Maybe I'm oversensitive, but I bet someone will misunderstand "Filter" as any query which does filtering (strictly adhering to the logic of usesIndexFiltering). I'm not strong on this one. Feel free to discard.

I was on the fence between those two and chose Filter/TopK just for brevity, but if it has made you wonder too, it's probably safer to use FilterOnly/TopKOnly.

pkolaczk · 2025-09-02T10:51:10Z

src/java/org/apache/cassandra/index/sai/metrics/TableQueryMetrics.java

Why not MultiPartition ?
Aren't we overloading the term "range" query? Because in Cassandra, single-partition queries can also be range queries, can't they?

I think SinglePartitionReadCommand is always single partition, and PartitionRangeReadCommand can be both single partition and multipartition. So indeed Range is confusing; I'm changing it to MultiPartition.

pkolaczk · 2025-09-02T10:57:29Z

src/java/org/apache/cassandra/index/sai/metrics/TableQueryMetrics.java

~~Why do we need type.isEmpty() ?~~
Aaaah, it's the name of the group of metrics / query type. I initially thought this was a CQL type.
Maybe rename to queryType or queryKind?

I have changed it to QueryKind, and replaced "type" with "kind" in all related usages. Not that I have ever been very good at discerning which one to use, but I think "kind" makes more sense for this sort of vague query subcategory.
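To make the resulting vocabulary concrete, here is a minimal sketch of how a query might be mapped to the kinds named in this thread. It is not the patch's implementation. The enum constants, the `kindsOf` method and its boolean parameters are hypothetical, and the sketch assumes that a query is counted under at most one of filter-only, top-k-only or hybrid, plus exactly one of single-partition or multi-partition.

```java
import java.util.EnumSet;
import java.util.Set;

final class QueryKindSketch
{
    // Hypothetical enum mirroring the kinds discussed in this PR.
    enum QueryKind { FILTER_ONLY, TOPK_ONLY, HYBRID, SINGLE_PARTITION, MULTI_PARTITION }

    // Assumption: the caller already knows these three facts about the query.
    static Set<QueryKind> kindsOf(boolean filters, boolean topK, boolean singlePartition)
    {
        EnumSet<QueryKind> kinds = EnumSet.noneOf(QueryKind.class);

        // Filtering and top-k together form their own category rather than counting twice.
        if (filters && topK)
            kinds.add(QueryKind.HYBRID);
        else if (filters)
            kinds.add(QueryKind.FILTER_ONLY);
        else if (topK)
            kinds.add(QueryKind.TOPK_ONLY);

        // Partition scope is tracked independently of filtering and ordering.
        kinds.add(singlePartition ? QueryKind.SINGLE_PARTITION : QueryKind.MULTI_PARTITION);
        return kinds;
    }

    public static void main(String[] args)
    {
        System.out.println(kindsOf(true, false, true));
        System.out.println(kindsOf(true, true, false));
    }
}
```

Under these assumptions a single query updates more than one group of metrics, so anything emitted once per query (such as a trace message) needs to be kept out of the per-kind path.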

adelapena · 2025-09-02T14:52:58Z

It seems this was breaking tracing by sending repeated messages for every kind of query, when we only want one. Just pushed a fix. I'm also removing an IMO redundant check in TraceTest.

Add separate SAI query metrics for filtering, top-k, single-partition and range queries. The new metrics are identical to the previous ones, but they are only applied to the relevant queries. The previous query metrics remain unaltered.

sonarqubecloud · 2025-09-16T12:26:23Z

Quality Gate passed

Issues
2 New issues
0 Accepted issues

Measures
0 Security Hotspots
91.9% Coverage on New Code
0.0% Duplication on New Code

See analysis details on SonarQube Cloud

cassci-bot · 2025-09-16T12:31:39Z

✔️ Build ds-cassandra-pr-gate/PR-1969 approved by Butler


…1969)

Add SAI query metrics per kind of query. There are metrics for:
* Filtering only queries.
* Top-k only queries, such as ANN, BM25 and generic ordering.
* Hybrid queries, combining both top-k and filtering.
* Single-partition queries.
* Multi-partition queries.

Each of these groups has the same records as the general query metrics.

These new metrics can be enabled/disabled with the system properties:
* cassandra.sai.metrics.query_kind.per_table.enabled
* cassandra.sai.metrics.query_kind.per_query.enabled

Rebase notes:
* required to re-implement CASSANDRA-18940
* will be replaced (made redundant) by CASSANDRA-20923

adelapena self-assigned this Aug 28, 2025

adelapena marked this pull request as draft August 28, 2025 14:43

eolivelli reviewed Aug 28, 2025


adelapena force-pushed the CNDB-15155-main branch from c107c88 to dcea2e0 September 1, 2025 11:59

adelapena marked this pull request as ready for review September 1, 2025 11:59

pkolaczk approved these changes Sep 2, 2025


adelapena force-pushed the CNDB-15155-main branch from 56fc429 to 3c8fdc0 September 2, 2025 14:59

adelapena added 6 commits September 15, 2025 12:03

CNDB-15155: Add metrics for hybrid queries

26f7134

CNDB-15155: Add config options to toggle SAI's query type metrics

663b76d

CNDB-15155: Disable per query type metrics by default

c4cd84e

CNDB-15155: Address review feedback

fa27108

CNDB-15155: Fix tracing

e20af50

adelapena force-pushed the CNDB-15155-main branch from 3c8fdc0 to e20af50 September 15, 2025 11:04

adelapena added 2 commits September 15, 2025 12:53

CNDB-15155: Ease access to metrics for CNDB tests

ff79a01

CNDB-15155: Add more tests

84cc224

adelapena merged commit 550263f into main Sep 17, 2025
493 checks passed

adelapena deleted the CNDB-15155-main branch September 17, 2025 16:27

michaelsembwever mentioned this pull request Sep 23, 2025

CNDB-15432: Port CNDB-15155 550263f to main-5.0 #2016

Merged

