CNDB-15638: Fix handling mixed key types in SAI iterators #2047

michaeljmarshall · 2025-10-09T20:22:45Z

Fixes: https://github.com/riptano/cndb/issues/15638
Cherry-picks: #2037

This commit fixes multiple issues with KeyRangeIterator implementations
occasionally skipping or emitting duplicate keys when working on
a mix of primary keys with empty / non-empty clusterings.
This situation is possible while scanning tables with static columns
or when some indexes are partition-aware (e.g. version AA) and
others have been updated to a row-aware version (e.g. DC or EC).
Due to those bugs, users could get incorrect results from SAI queries,
e.g. results containing duplicated rows, duplicated partitions or
even missing rows.

The commit introduces extensive randomized property-based tests for
KeyRangeUnionIterator and KeyRangeIntersectionIterator. Previously,
the tests did not cover keys with mixed empty/non-empty clusterings.

Changes in KeyRangeUnionIterator:

KeyRangeUnionIterator merges streams of primary keys in such a way that
duplicates are removed. Unfortunately it does not properly account
for the fact that if a key with an empty clustering meets a key
with a non-empty clustering and the same partition key, we must
always return the key with an empty clustering. A key with an empty
clustering will always fetch the rows matched by any specific row
key for the same partition, but the reverse is not true.

The iterator implementation has been modified to always pick the
key that matches more rows - a key with empty clustering wins
over a key with non-empty clustering. Additionally, once a key
with an empty clustering is emitted, no more keys in that partition
are emitted.
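
The union rule described above can be modeled in a few lines. This is a minimal Python sketch, not the Java code of the patch. It assumes keys are `(partition, clustering)` tuples, that `None` stands in for `Clustering.EMPTY`, and that every input stream is sorted so that a key with an empty clustering comes before the row keys of the same partition. The names `Key`, `sort_key` and `union` are hypothetical.

```python
import heapq
from typing import Iterable, Iterator, Optional, Tuple

# Assumption: (partition, clustering); None models Clustering.EMPTY.
Key = Tuple[int, Optional[int]]


def sort_key(key: Key) -> Tuple[int, int, int]:
    """Order by partition; within a partition, empty clustering sorts first."""
    partition, clustering = key
    return (partition, 0, 0) if clustering is None else (partition, 1, clustering)


def union(*streams: Iterable[Key]) -> Iterator[Key]:
    """Merge sorted key streams, dropping duplicates and covered row keys."""
    last: Optional[Key] = None
    for key in heapq.merge(*streams, key=sort_key):
        if last is not None and key[0] == last[0]:
            if last[1] is None:
                # The whole partition was already emitted; skip its row keys.
                continue
            if key == last:
                # Plain duplicate of the previously emitted row key.
                continue
        last = key
        yield key


if __name__ == "__main__":
    a = [(1, 1), (1, 2), (2, 1)]
    b = [(1, None), (2, 1), (3, None)]
    print(list(union(a, b)))
```

Under these assumptions the key with the empty clustering is always seen first for its partition, so "empty clustering wins" and "no more keys in that partition are emitted" reduce to a single check against the last emitted key.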

Changes in KeyRangeIntersectionIterator:

Due to a problem very similar to the one in KeyRangeUnionIterator,
KeyRangeIntersectionIterator could return either too few or
too many keys when keys with empty clusterings and keys
with non-empty clusterings were present in the input key streams.

In particular, consider two input streams A and B with the following
keys:

A:
0: (1, Clustering.EMPTY)

B:
0: (1, 1)
1: (1, 2)

Key A.0 matches the whole of partition 1. Therefore, the correct result
of the intersection is both keys of stream B. Unfortunately, the algorithm
before this patch would advance both the A and B iterators when emitting
the first matching key. At the beginning of the second step,
iterator A would already be exhausted and no more keys would
be produced. As a result, key B.1 would be missing from the results.
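
The failure mode can be reproduced with a deliberately simplified two-stream model. This Python sketch is not the pre-patch Java code. It only mimics the one behavior described above, advancing both inputs after a match. It assumes `(partition, clustering)` tuples with `None` standing in for `Clustering.EMPTY`. The names `matches`, `order` and `naive_intersect` are hypothetical, and which of the two matching keys gets emitted is irrelevant to the point.

```python
from typing import List, Optional, Tuple

# Assumption: (partition, clustering); None models Clustering.EMPTY.
Key = Tuple[int, Optional[int]]


def matches(a: Key, b: Key) -> bool:
    """Same partition, and either key is partition-wide or both name the same row."""
    return a[0] == b[0] and (a[1] is None or b[1] is None or a[1] == b[1])


def order(key: Key) -> Tuple[int, int, int]:
    """Order by partition; within a partition, empty clustering sorts first."""
    partition, clustering = key
    return (partition, 0, 0) if clustering is None else (partition, 1, clustering)


def naive_intersect(a: List[Key], b: List[Key]) -> List[Key]:
    out: List[Key] = []
    i = j = 0
    while i < len(a) and j < len(b):
        if matches(a[i], b[j]):
            out.append(max(a[i], b[j], key=order))
            # Illustrated bug: both inputs are consumed, even if one of them
            # holds a partition-wide key that still matches later rows.
            i += 1
            j += 1
        elif order(a[i]) < order(b[j]):
            i += 1
        else:
            j += 1
    return out


if __name__ == "__main__":
    stream_a = [(1, None)]
    stream_b = [(1, 1), (1, 2)]
    print(naive_intersect(stream_a, stream_b))  # (1, 2) is lost
```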

This patch fixes it by introducing two changes to the intersection
algorithm:

1. A key with non-empty clustering wins over a key with
   empty clustering and the same partition.
2. The selected highest key is not consumed while searching
   for the highest matching key; it is consumed only after the
   search loop finds a match. At that point we have more information
   about which iterators should be moved to the next item. Iterators
   positioned at a key with an empty clustering can be advanced only after
   we run out of keys with non-empty clustering in the same partition
   or if there are no other keys with non-empty clustering.
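
The two rules above can be sketched as follows. This is a minimal Python model under stated assumptions, not the Java implementation from the patch. Keys are `(partition, clustering)` tuples, `None` stands in for `Clustering.EMPTY`, and each input is a sorted sequence in which a key with an empty clustering sorts before the row keys of its partition. The names `sort_key`, `is_before` and `intersect` are hypothetical.

```python
from typing import Iterator, Optional, Sequence, Tuple

# Assumption: (partition, clustering); None models Clustering.EMPTY.
Key = Tuple[int, Optional[int]]


def sort_key(key: Key) -> Tuple[int, int, int]:
    """Order by partition; within a partition, empty clustering sorts first."""
    partition, clustering = key
    return (partition, 0, 0) if clustering is None else (partition, 1, clustering)


def is_before(key: Key, target: Key) -> bool:
    """True if `key` sorts before `target` and cannot match it."""
    if key[0] != target[0]:
        return key[0] < target[0]
    if key[1] is None or target[1] is None:
        return False  # a partition-wide key matches every row of its partition
    return key[1] < target[1]


def intersect(*streams: Sequence[Key]) -> Iterator[Key]:
    if not streams:
        return
    positions = [0] * len(streams)
    while all(pos < len(s) for pos, s in zip(positions, streams)):
        heads = [s[pos] for pos, s in zip(positions, streams)]
        # Rule 1: the highest key is the most specific one, because a row key
        # sorts after the partition-wide key of the same partition.
        target = max(heads, key=sort_key)
        lagging = [i for i, head in enumerate(heads) if is_before(head, target)]
        if lagging:
            for i in lagging:
                positions[i] += 1
            continue
        # Every head now equals `target` or is a partition-wide key covering it.
        yield target
        # Rule 2: consume only after the match, and only the inputs that sit
        # exactly on `target`; partition-wide heads stay for later rows.
        for i, head in enumerate(heads):
            if head == target:
                positions[i] += 1


if __name__ == "__main__":
    print(list(intersect([(1, None)], [(1, 1), (1, 2)])))
```

In this model an input holding `(1, None)` is left in place while the other input walks through `(1, 1)` and `(1, 2)`, and it is skipped only once the highest key moves to a later partition. For the inputs `[(1, None)]` and `[(1, 1)]` the sketch emits the row key `(1, 1)` rather than the partition-wide key.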

This patch also fixes another issue where we could return
a less-specific key matching a full partition instead of a key
matching one row:

A:
0: (1, Clustering.EMPTY)

B:
0: (1, 1)

In that case the iterator returned a key with empty clustering,
which would result in fetching and postfiltering many unnecessary rows.

(cherry picked from commit d7b8944)


github-actions · 2025-10-09T20:23:00Z

eolivelli

LGTM

looks the same as 3a6f3c4

The PrimaryKeyWithSource class has been present for two years in the code base as an optimization for hybrid vector workloads, which have to materialize many primary keys in the search-then-sort query path. However, the logic is invalid for version aa (because we have the bug where compacted sstables write per row, not per partition) and it is also invalid for static columns. This commit avoids creation of PrimaryKeyWithSource in those cases. (cherry picked from commit e942cae)

The patch is not exactly the same as https://github.com/datastax/cassandra/pull/2037/files
I don't feel comfortable approving.
We need Piotr's eyes here

sonarqubecloud · 2025-10-10T09:39:12Z

Quality Gate passed

Issues
0 New issues
3 Accepted issues

Measures
0 Security Hotspots
91.4% Coverage on New Code
2.3% Duplication on New Code

See analysis details on SonarQube Cloud

cassci-bot · 2025-10-10T09:42:48Z

❌ Build ds-cassandra-pr-gate/PR-2047 rejected by Butler

8 regressions found
See build details here

Found 8 new test failures

Test	Explanation	Runs	Upstream
o.a.c.distributed.test.sai.VectorDistributedTest.rangeRestrictedTest	NEW	🔵🔴	0 / 0
o.a.c.index.sai.cql.datamodels.QueryCellDeletionsWithCompoundKeyWithStaticsTest.testCellDeletions[aa] (compression)	NEW	🔵🔴	0 / 0
o.a.c.index.sai.cql.datamodels.QueryRowDeletionsWithCompoundKeyWithStaticsTest.testRowDeletions[aa] (compression)	NEW	🔵🔴	0 / 0
o.a.c.index.sai.cql.datamodels.QueryWriteLifecycleWithCompoundKeyWithStaticsTest.testWriteLifecycle[aa] (compression)	NEW	🔵🔴	0 / 0
o.a.c.index.sai.cql.datamodels.TinySegmentQueryCellDeletionsWithCompoundKeyWithStaticsTest.testCellDeletions[aa] (compression)	NEW	🔵🔴	0 / 0
o.a.c.index.sai.cql.datamodels.TinySegmentQueryRowDeletionsWithCompoundKeyWithStaticsTest.testRowDeletions[aa] (compression)	NEW	🔵🔴	0 / 0
o.a.c.index.sai.cql.datamodels.TinySegmentQueryWriteLifecycleWithCompoundKeyWithStaticsTest.testWriteLifecycle[aa] (compression)	NEW	🔵🔴	0 / 0
o.a.c.utils.binlog.BinLogTest.testTruncationReleasesLogSpace (compression)	NEW	🔴🔵	0 / 0

No known test failures found

pkolaczk · 2025-10-10T09:44:59Z

Butler seems to be wrong on this one. It highlights some old regressions we fixed. I checked the last CI run manually and the only failure there is BinLogTest (which is known to be flaky and I must try to fix it one more time).

michaeljmarshall changed the title ~~CNDB-15570: Fix handling mixed key types in SAI iterators~~ CNDB-15638: Fix handling mixed key types in SAI iterators Oct 9, 2025

eolivelli previously approved these changes Oct 9, 2025


pkolaczk approved these changes Oct 10, 2025


pkolaczk merged commit d9b690b into cndb-main-release-202505 Oct 10, 2025
486 of 491 checks passed

pkolaczk deleted the cndb-15638 branch October 10, 2025 09:45
