CNDB-15919: Optimize SAI NOT queries, push logic into posting lists #2112

michaeljmarshall · 2025-11-06T22:30:37Z

What is the issue

Fixes: https://github.com/riptano/cndb/issues/15919
Test PR: https://github.com/riptano/cndb/pull/15949

What does this PR fix and why was it fixed

In the original implementation for #820, we introduced the PrimaryKeyMapIterator to iterate over all primary keys in an sstable and then anti-join them against the result of an equality query. That design works, but it requires additional, unnecessary disk reads to load the primary keys.
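For readers unfamiliar with the anti-join mentioned above, here is a minimal sketch of the idea over two sorted streams. It is an illustration only, not the PrimaryKeyMapIterator code: the class name `AntiJoinSketch`, the method `antiJoin`, and the use of plain `Comparable` keys are all assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

/** Hypothetical illustration; not code from this PR. */
final class AntiJoinSketch
{
    /**
     * Returns every key from allKeys that does not appear in excluded.
     * Both iterators must yield keys in ascending order.
     */
    static <K extends Comparable<K>> List<K> antiJoin(Iterator<K> allKeys, Iterator<K> excluded)
    {
        List<K> result = new ArrayList<>();
        K ex = excluded.hasNext() ? excluded.next() : null;
        while (allKeys.hasNext())
        {
            K key = allKeys.next();
            // Advance the excluded side until it catches up with the current key.
            while (ex != null && ex.compareTo(key) < 0)
                ex = excluded.hasNext() ? excluded.next() : null;
            // Keep the key only if the equality query did not return it.
            if (ex == null || ex.compareTo(key) != 0)
                result.add(key);
        }
        return result;
    }
}
```

In this sketch every key on the left side has to be materialized before it can be compared, which corresponds to the cost the description attributes to loading primary keys.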

There are two possible solutions:

1. We can use row ids (either sstable or segment) to take the complement of the resulting posting lists. This will be the most performant option, since it avoids object allocations. The main issue with this solution is that it is much more complicated to implement and has unaddressed edge cases.
2. We can use the primaryKeyFromRowId that takes primary key bounds and then uses a row id when rows are from the same sstable. This will be worse than solution 1 because it creates an object per key and requires comparing sstable ids before comparing sstable row ids, but it is a significant improvement over the current solution, which hits disk to load the primary key.
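To make the row id complement in solution 1 concrete, here is a rough sketch over a single sorted array of matching row ids. It is not the ComplementPostingList from this PR: the class `RowIdComplementSketch`, the method `complement`, and the inclusive `[minRowId, maxRowId]` bounds are hypothetical, and the sketch assumes the range is small enough to fit in an array.

```java
import java.util.Arrays;

/** Hypothetical illustration; not the PR's ComplementPostingList. */
final class RowIdComplementSketch
{
    /**
     * Returns every row id in [minRowId, maxRowId] that is absent from sortedMatches.
     * sortedMatches must be ascending and free of duplicates.
     * Assumes the range is small enough to fit in an int-sized array.
     */
    static long[] complement(long[] sortedMatches, long minRowId, long maxRowId)
    {
        if (maxRowId < minRowId)
            return new long[0];
        long[] out = new long[(int) (maxRowId - minRowId + 1)];
        int n = 0;
        int i = 0;
        for (long rowId = minRowId; rowId <= maxRowId; rowId++)
        {
            // Skip matches that sort before the current candidate row id.
            while (i < sortedMatches.length && sortedMatches[i] < rowId)
                i++;
            // A row id present in the matches is excluded by the NOT.
            if (i < sortedMatches.length && sortedMatches[i] == rowId)
                continue;
            out[n++] = rowId;
        }
        return Arrays.copyOf(out, n);
    }
}
```

The loop works on primitive longs only, which is the property the description points to when it says this approach avoids object allocations.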

When testing on my local machine and reviewing the JMH benchmarks, I can see that the current solution is about 16x worse than the minimum solution (2) and 32x worse than the optimal solution (1). Given that the benchmarks in question are highly specific to the use case, I do not think we have sufficient motivation to introduce the exceedingly complex solution (1).

Note that the ideal version of solution 1, which would have much less complexity, would convert the posting lists into a single iterator of sstable row ids and then take the complement of that iterator.
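As a rough illustration of that idea, the sketch below merges several sorted posting lists by row id and emits the gaps between merged ids. Everything in it is an assumption made for the example: the class `MergedComplementSketch`, the method `complementOfUnion`, the `long[]` representation of a posting list, and the returned `List<Long>` (a real implementation would presumably stream the ids rather than collect them).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.PriorityQueue;

/** Hypothetical illustration; not code from this PR. */
final class MergedComplementSketch
{
    /**
     * Complement, within [minRowId, maxRowId], of the union of several posting lists.
     * Each posting list must be sorted ascending.
     */
    static List<Long> complementOfUnion(List<long[]> postingLists, long minRowId, long maxRowId)
    {
        // Heap entries are {rowId, listIndex, offsetInList}, ordered by rowId.
        PriorityQueue<long[]> heap = new PriorityQueue<>((a, b) -> Long.compare(a[0], b[0]));
        for (int i = 0; i < postingLists.size(); i++)
            if (postingLists.get(i).length > 0)
                heap.add(new long[]{ postingLists.get(i)[0], i, 0 });

        List<Long> result = new ArrayList<>();
        long next = minRowId; // smallest row id not yet classified as match or non-match
        while (!heap.isEmpty() && next <= maxRowId)
        {
            long[] head = heap.poll();
            long rowId = head[0];
            // Every id between the previous match and this one is a non-match.
            for (; next < rowId && next <= maxRowId; next++)
                result.add(next);
            // Step past the matching id (duplicates across lists fall through here).
            if (rowId >= next)
                next = rowId + 1;
            // Refill the heap from the list that produced this entry.
            long[] list = postingLists.get((int) head[1]);
            int offset = (int) head[2] + 1;
            if (offset < list.length)
                heap.add(new long[]{ list[offset], head[1], offset });
        }
        // Ids after the last match are all non-matches.
        for (; next <= maxRowId; next++)
            result.add(next);
        return result;
    }
}
```

The merge and the complement happen in a single pass, so no intermediate union needs to be stored.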

github-actions · 2025-11-06T22:30:53Z

There are still some issues to be solved. Namely, dealing with version AA.

pkolaczk

In addition to the other minor things I mentioned, we need a test for verifying how NEQ works together with token range / partition key restriction.

src/java/org/apache/cassandra/index/sai/disk/v1/postings/ComplementPostingList.java

src/java/org/apache/cassandra/index/sai/disk/v1/IndexSearcher.java

src/java/org/apache/cassandra/index/sai/disk/v1/SegmentMetadata.java

src/java/org/apache/cassandra/index/sai/disk/v1/SegmentMetadataBuilder.java

src/java/org/apache/cassandra/index/sai/SSTableContext.java

Results on current commit:

[java] Result "org.apache.cassandra.test.microbench.index.sai.NEQQueryBench.queryNEQ":
[java]   7198946.987 ±(99.9%) 155705.068 ns/op [Average]
[java]   (min, avg, max) = (7086466.212, 7198946.987, 7464311.708), stdev = 102989.262
[java]   CI (99.9%): [7043241.919, 7354652.054] (assumes normal distribution)
[java]
[java] Benchmark               (numRowsWithinPartition)  Mode  Cnt        Score        Error  Units
[java] NEQQueryBench.queryNEQ                      1000  avgt   10  1109937.157 ±   2362.813  ns/op
[java] NEQQueryBench.queryNEQ                     10000  avgt   10  7198946.987 ± 155705.068  ns/op

Benchmark results for baseline:

[java] Result "org.apache.cassandra.test.microbench.index.sai.NEQQueryBench.queryNEQ":
[java]   327531893.145 ±(99.9%) 1209098.423 ns/op [Average]
[java]   (min, avg, max) = (326375963.710, 327531893.145, 329248861.548), stdev = 799743.744
[java]   CI (99.9%): [326322794.722, 328740991.569] (assumes normal distribution)
[java]
[java] Benchmark               (numRowsWithinPartition)  Mode  Cnt          Score         Error  Units
[java] NEQQueryBench.queryNEQ                      1000  avgt   10   32187371.679 ±   67260.318  ns/op
[java] NEQQueryBench.queryNEQ                     10000  avgt   10  327531893.145 ± 1209098.423  ns/op

[java] Result "org.apache.cassandra.test.microbench.index.sai.NEQQueryBench.queryNEQ":
[java]   17761666.739 ±(99.9%) 260418.205 ns/op [Average]
[java]   (min, avg, max) = (17644295.340, 17761666.739, 18241636.689), stdev = 172250.519
[java]   CI (99.9%): [17501248.533, 18022084.944] (assumes normal distribution)
[java]
[java] Benchmark               (numRowsWithinPartition)  Mode  Cnt         Score        Error  Units
[java] NEQQueryBench.queryNEQ                      1000  avgt   10   2169861.695 ±  11355.830  ns/op
[java] NEQQueryBench.queryNEQ                     10000  avgt   10  17761666.739 ± 260418.205  ns/op

michaeljmarshall · 2025-11-14T22:28:34Z

@pkolaczk - I added a benchmark and discovered that, even in the most extreme cases, we can get most of the benefit from an extremely small code change compared to what I had proposed before. I think we should go with this one for now and leave a deeper rewrite for later, if there is demand.

sonarqubecloud · 2025-11-14T23:29:38Z

Quality Gate passed

Issues
0 New issues
0 Accepted issues

Measures
0 Security Hotspots
100.0% Coverage on New Code
0.0% Duplication on New Code

See analysis details on SonarQube Cloud

cassci-bot · 2025-11-14T23:34:19Z

❌ Build ds-cassandra-pr-gate/PR-2112 rejected by Butler

1 regressions found
See build details here

Found 1 new test failures

Test	Explanation	Runs	Upstream
o.a.c.index.sai.cql.datamodels.QueryWriteLifecycleWithCompositePartitionKeyTest.terminated successfully	NEW	🔴⚪	0 / 17

Found 1 known test failures

michaeljmarshall added 3 commits November 6, 2025 20:28

Remove unnecessary state from IndexSearcherContext

cbc0a8d

CNDB-15919: Optimize SAI NOT queries, push logic into posting lists

de93b22

Use proper min/max row id and primary key

36fe292

There are still some issues to be solved. Namely, dealing with version AA.

michaeljmarshall force-pushed the cndb-15919 branch from 25d7f7e to 36fe292 Compare November 7, 2025 04:02

michaeljmarshall marked this pull request as ready for review November 7, 2025 04:12

michaeljmarshall added 4 commits November 7, 2025 16:40

Fix SSTableContext init with no rows

1489943

Close posting list on empty

a2ab887

Replace testUnsupportedOp with valid NEQ assertions

e0767cf

Handle reads disabled in SSTableContext constructor

b2f1420

michaeljmarshall self-assigned this Nov 10, 2025

michaeljmarshall requested review from eolivelli and pkolaczk November 10, 2025 20:50

michaeljmarshall added 2 commits November 10, 2025 14:55

Fix license header for new file

74c53d4

Merge remote-tracking branch 'datastax/main' into cndb-15919

b5b3645

pkolaczk requested changes Nov 13, 2025

michaeljmarshall force-pushed the cndb-15919 branch from bd99cd6 to b16fd5e Compare November 14, 2025 22:12

michaeljmarshall added 3 commits November 14, 2025 16:18

michaeljmarshall force-pushed the cndb-15919 branch from b16fd5e to 5c4fa18 Compare November 14, 2025 22:19
