Skip to content

Conversation

@adelapena
Copy link

Skip indexing empty non-literal values.

Ideally we should index those empty values to have the same behaviour as legacy table-based secondary indexes. However, SAI has never been able to index values non-literal values. CASSANDRA-19461 was meant to add support for this, but it only did it correctly for literal types (text, ascii, boolean and froze multicells), breaking index building for all other types.

I think that, until we have a good way to represent empty values in the KD tree, we should skip indexing empty non-literal types, rather than fail the entire index build.

It's worth mentioning that all the types whose empty values will be ignored return true in AbstractType#isEmptyValueMeaningless. This means that an empty binary value doesn't represent any composed value. That way, type composition always presents these values as null.

Also, this patch prevents an encoding error when searching empty varint and decimal values.

@adelapena adelapena self-assigned this Feb 9, 2025
@github-actions
Copy link

github-actions bot commented Feb 9, 2025

Checklist before you submit for review

  • Make sure there is a PR in the CNDB project updating the Converged Cassandra version
  • Use NoSpamLogger for log lines that may appear frequently in the logs
  • Verify test results on Butler
  • Test coverage for new/modified code is > 80%
  • Proper code formatting
  • Proper title for each commit staring with the project-issue number, like CNDB-1234
  • Each commit has a meaningful description
  • Each commit is not very long and contains related changes
  • Renames, moves and reformatting are in distinct commits

@adelapena adelapena marked this pull request as draft February 9, 2025 19:13
Also, prevent encoding error when searching empty varint and decimal values
@adelapena adelapena marked this pull request as ready for review February 10, 2025 12:35
@adelapena adelapena force-pushed the CNDB-12774-skip-main branch from e5473cc to 45aa6f0 Compare February 10, 2025 12:35
@sonarqubecloud
Copy link

@cassci-bot
Copy link

❌ Build ds-cassandra-pr-gate/PR-1556 rejected by Butler


1 new test failure(s) in 2 builds
See build details here


Found 1 new test failures

Test Explanation Branch history Upstream history
...st.testTTLOverwriteHasCorrectOnDiskRowCount[ca] regression 🔴🔵 🔵🔵🔵🔵🔵🔵🔵

Found 8 known test failures

Copy link
Member

@michaeljmarshall michaeljmarshall left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. This is a good solution for existing index file format versions. Any future implementation that indexes values will likely need to version the skipsEmptyValue call to be properly backwards compatible.

@adelapena
Copy link
Author

Thanks for the reviews. PR bumping CNDB's CC version: https://github.com/riptano/cndb/pull/12904

@adelapena adelapena merged commit 60bfb7e into main Feb 11, 2025
466 of 477 checks passed
@adelapena adelapena deleted the CNDB-12774-skip-main branch February 11, 2025 13:16
djatnieks pushed a commit that referenced this pull request Mar 11, 2025
Also, prevent encoding error when searching empty varint and decimal values
djatnieks pushed a commit that referenced this pull request May 18, 2025
Also, prevent encoding error when searching empty varint and decimal values
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants