adelapena

The method AbstractReadQuery.toCQLString prints commands as CQL queries, including any column values: the queried values in the WHERE clause of a SELECT statement, or the written values in INSERT and UPDATE statements. The method is used by at least the slow query logger, so user data ends up in the logs.

This PR modifies AbstractReadQuery.toCQLString so that it no longer includes column values. A boolean flag allows opting out of redaction, since seeing the queried values can be useful while debugging.
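As a rough illustration of the idea (not the actual code in this PR), the sketch below renders a simple SELECT and replaces each column value with `?` unless redaction is switched off. The class `RedactedCqlSketch`, the `restrictions` map and the `redact` parameter are hypothetical names chosen for the example; the real method works on Cassandra's internal read commands rather than on a map of values.

```java
import java.util.Map;
import java.util.StringJoiner;

// Hypothetical sketch: none of these names come from the Cassandra code base.
class RedactedCqlSketch
{
    /**
     * Renders a SELECT with equality restrictions on the given columns.
     * When redact is true, every column value is replaced by '?'.
     */
    static String toCQLString(String table, Map<String, Object> restrictions, boolean redact)
    {
        StringBuilder sb = new StringBuilder("SELECT * FROM ").append(table);
        if (!restrictions.isEmpty())
        {
            StringJoiner where = new StringJoiner(" AND ", " WHERE ", "");
            for (Map.Entry<String, Object> e : restrictions.entrySet())
                where.add(e.getKey() + " = " + (redact ? "?" : literal(e.getValue())));
            sb.append(where);
        }
        return sb.toString();
    }

    // Quotes strings as CQL text literals, doubling embedded single quotes.
    private static String literal(Object value)
    {
        if (value instanceof String)
            return "'" + ((String) value).replace("'", "''") + "'";
        return String.valueOf(value);
    }
}
```

With an insertion-ordered map holding `k = 42` and `name = "alice"`, the redacted form would be `SELECT * FROM ks.users WHERE k = ? AND name = ?`, while passing `redact = false` would keep the literal values for debugging.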

The criteria for what should be redacted are:

  • Needs redaction: Messages that go to external monitoring systems, such as JMX, diagnostic events, etc.
  • Doesn't need redaction: User-facing exceptions such as InvalidRequestException, query tracing (Tracing.trace) and generic Object#toString() methods.
  • Ideally should use redaction: Things printed in logs. We treat logs as sensitive data and there is plenty of user data that is printed there. I think we should gradually move towards logs free of user data, and this PR does that for AbstractReadQuery.toCQLString, which is used for example by the slow query logger. However, there are still plenty of other things that print user data, for example partition keys. Discussion here: https://datastax.slack.com/archives/C05LHP4HX5J/p1757687570882049?thread_ts=1757533116.788859&cid=C05LHP4HX5J
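The criteria above could be written down as a small lookup, sketched here purely for illustration. The enum names and the `redact` helper are hypothetical and do not exist in the PR; in particular, letting the debugging opt-out apply only to the "ideally" category is an assumption made for this example.

```java
import java.util.EnumMap;
import java.util.Map;

// Hypothetical sketch of the redaction criteria; not part of the PR.
class RedactionPolicySketch
{
    enum Destination { JMX, DIAGNOSTIC_EVENT, LOG, USER_EXCEPTION, QUERY_TRACE, TO_STRING }

    enum Policy { REQUIRED, PREFERRED, NOT_NEEDED }

    private static final Map<Destination, Policy> POLICIES = new EnumMap<>(Destination.class);

    static
    {
        // External monitoring systems: needs redaction
        POLICIES.put(Destination.JMX, Policy.REQUIRED);
        POLICIES.put(Destination.DIAGNOSTIC_EVENT, Policy.REQUIRED);
        // Logs: ideally redacted
        POLICIES.put(Destination.LOG, Policy.PREFERRED);
        // User-facing exceptions, tracing and toString(): no redaction needed
        POLICIES.put(Destination.USER_EXCEPTION, Policy.NOT_NEEDED);
        POLICIES.put(Destination.QUERY_TRACE, Policy.NOT_NEEDED);
        POLICIES.put(Destination.TO_STRING, Policy.NOT_NEEDED);
    }

    /** Assumption: the debugging opt-out only affects the PREFERRED category. */
    static boolean redact(Destination destination, boolean debugOptOut)
    {
        Policy policy = POLICIES.get(destination);
        if (policy == Policy.REQUIRED)
            return true;
        if (policy == Policy.PREFERRED)
            return !debugOptOut;
        return false;
    }
}
```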

At the reviewer's request, this PR adds redaction separately from the closely related changes to the toCQLString methods made in this other PR. That PR originally combined both sets of changes in separate commits, and it already had multiple review comments about changes that are now in this PR.

@adelapena adelapena requested a review from k-rus October 7, 2025 10:18
@adelapena adelapena self-assigned this Oct 7, 2025

github-actions bot commented Oct 7, 2025

Checklist before you submit for review

  • This PR adheres to the Definition of Done
  • Make sure there is a PR in the CNDB project updating the Converged Cassandra version
  • Use NoSpamLogger for log lines that may appear frequently in the logs
  • Verify test results on Butler
  • Test coverage for new/modified code is > 80%
  • Proper code formatting
  • Proper title for each commit starting with the project-issue number, like CNDB-1234
  • Each commit has a meaningful description
  • Each commit is not very long and contains related changes
  • Renames, moves and reformatting are in distinct commits
  • All new files should contain the DataStax copyright header instead of the Apache License one

Replace column values by '?' when converting internal read queries to CQL,
so user data don't end up in logs or any other unprotected place.

# Conflicts:
#	src/java/org/apache/cassandra/db/Clustering.java
#	src/java/org/apache/cassandra/db/Slices.java
@adelapena adelapena force-pushed the CNDB-15280-main-redaction branch from c5339f8 to 2c07844 Compare October 15, 2025 16:08

@cassci-bot

✔️ Build ds-cassandra-pr-gate/PR-2038 approved by Butler



@adelapena adelapena changed the title CNDB-15280: Remove user data from AbstractReadQuery.toCQLString (redaction) CNDB-15280: Remove user data from AbstractReadQuery.toCQLString Oct 20, 2025