Skip to content

Conversation

@pcmanus
Copy link

@pcmanus pcmanus commented Mar 4, 2025

The goal of this commit is to allow low-level code, typically at the
level of file reads (say, a custom FileChannel implementation), to
obtain context on which higher level operation they correspond to. A
typical use case may for tiered storage where FileChannel
implementations may forward reads to remote storage, and where having
context on the overall operation could help, say, take prior work done
to set timeout, or aggregate metrics per "user-level" operations (rather
than per FileChannel operation).

To do this, we reuse the existing ExecutorLocals mechanism, and simply
have the top-level operation setup the proper context as an
ExecutorLocal, allowing it to be accessed by any low-level operations
operating on behalf of that operation. This is, in many way, similar to
tracing, but instead of a TraceState that collect what happens during
the operation, it is a relatively flexible notion of OperationContext.

As of this patch, this feature is fairly barebone and mostly exists for
extensions in the following sense:

  1. only user reads (ReadCommand execution) currently sets up such a
    context. The code is written in such a way that adding support for
    other operations should be easy, but this is not done.
  2. the context set by reads by default has barely any information: it
    merely has a toString() method that can roughly tell what the
    operation itself is, and so could have use for debugging, but not
    much more. Further, that context is not read by anything in this
    patch. However, said context are created through a "factory" and
    the factory class is configurable through a system property. So
    extension can override the factory in order to create contexts
    with more information/state, and they fetch/use those context where
    they see fit.

@github-actions
Copy link

github-actions bot commented Mar 4, 2025

Checklist before you submit for review

  • Make sure there is a PR in the CNDB project updating the Converged Cassandra version
  • Use NoSpamLogger for log lines that may appear frequently in the logs
  • Verify test results on Butler
  • Test coverage for new/modified code is > 80%
  • Proper code formatting
  • Proper title for each commit staring with the project-issue number, like CNDB-1234
  • Each commit has a meaningful description
  • Each commit is not very long and contains related changes
  • Renames, moves and reformatting are in distinct commits

@pcmanus pcmanus changed the title [WIP] Add support for a per-operation context Add support for a per-operation context Mar 5, 2025
@eolivelli
Copy link

I think that Counter Mutations perform reads from storage but they won't be tracked with this patch.

I don't know about LWTs or other statements that require reads

@pcmanus
Copy link
Author

pcmanus commented Mar 6, 2025

I think that Counter Mutations perform reads from storage but they won't be tracked with this patch.

Good point. It should be easy enough to also set the context on writes to cover all the case of "writes that end up doing reads", so I'll add it.

@pcmanus
Copy link
Author

pcmanus commented Mar 6, 2025

Actually, looking more closely, the current patch handles every execution of a ReadCommand, and everything that require a read (counter, LWT) ultimately does so by executing a ReadCommand, so those use case are covered in that sense.

What is true is that currently there is no good way to look at the context and tell if the read is part of a counter update, or a LWT, or it's a user reads, but:

  1. I don't think it matters for our current use cases, be it https://github.com/riptano/cndb/pull/13209 (nothing in the timeout handling really depends on why the request is done, not probably should it) or https://github.com/riptano/cndb/issues/13219 (in theory we could split the metric further by "kind of request", but cardinality is already a concern for that metric and this would make this worst, so in practice I don't think we want to do that even if we could).
  2. there is actually no simple way to add this information in the context: when we do a read for the condition of a LWT write for instance, then the nodes that concretely execute the read are not the same as the one that coordinate the LWT (the LWT coordinator may happen to also be a read replica but it's not guarantee, and even when that's the case, that it at best 1 node out of the quorum that is requested). And there is genuinely nothing currently on the write protocol that allows a replica that receives a read to tell if that read belong to a LWT write, a counter update, or other. Long story short, we'd have to make write protocol modification to even get that information available, and I don't think we care that badly at the moment.

@eolivelli
Copy link

I think that we don't care much about the type of query (counters, single read....), the only important thing is that the new mechanism covers all the types (in order to completely cover the CNDB side implementation)

I would keep this in the current form, no other changes. Thanks for checking

The goal of this commit is to allow low-level code, typically at the
level of file reads (say, a custom `FileChannel` implementation), to
obtain context on which higher level operation they correspond to. A
typical use case may for tiered storage where `FileChannel`
implementations may forward reads to remote storage, and where having
context on the overall operation could help, say, take prior work done
to set timeout, or aggregate metrics per "user-level" operations (rather
than per `FileChannel` operation).

To do this, we reuse the existing `ExecutorLocals` mechanism, and simply
have the top-level operation setup the proper context as an
`ExecutorLocal`, allowing it to be accessed by any low-level operations
operating on behalf of that operation. This is, in many way, similar to
tracing, but instead of a `TraceState` that collect what happens during
the operation, it is a relatively flexible notion of `OperationContext`.

As of this patch, this feature is fairly barebone and mostly exists for
extensions in the following sense:
1. only user reads (`ReadCommand` execution) currently sets up such a
   context. The code is written in such a way that adding support for
   other operations should be easy, but this is not done.
2. the context set by reads by default has barely any information: it
   merely has a `toString()` method that can roughly tell what the
   operation itself is, and so could have use for debugging, but not
   much more. Further, that context is not read by anything in this
   patch. However, said context are created through a "factory" and
   the factory class is configurable through a system property. So
   extension can override the factory in order to create contexts
   with more information/state, and they fetch/use those context where
   they see fit.
@sonarqubecloud
Copy link

@cassci-bot
Copy link

❌ Build ds-cassandra-pr-gate/PR-1619 rejected by Butler


1 new test failure(s) in 2 builds
See build details here


Found 1 new test failures

Test Explanation Branch history Upstream history
o.a.c.u.b.BinLogTest.testTruncationReleasesLogS... regression 🔴🔴 🔵🔵🔵🔵🔵🔵🔵

Found 2 known test failures

@pcmanus pcmanus merged commit 384ba23 into main Mar 14, 2025
474 of 480 checks passed
@pcmanus pcmanus deleted the cndb-13000-ctx branch March 14, 2025 14:27
djatnieks pushed a commit that referenced this pull request Apr 14, 2025
The goal of this commit is to allow low-level code, typically at the
level of file reads (say, a custom `FileChannel` implementation), to
obtain context on which higher level operation they correspond to. A
typical use case may for tiered storage where `FileChannel`
implementations may forward reads to remote storage, and where having
context on the overall operation could help, say, take prior work done
to set timeout, or aggregate metrics per "user-level" operations (rather
than per `FileChannel` operation).

To do this, we reuse the existing `ExecutorLocals` mechanism, and simply
have the top-level operation setup the proper context as an
`ExecutorLocal`, allowing it to be accessed by any low-level operations
operating on behalf of that operation. This is, in many way, similar to
tracing, but instead of a `TraceState` that collect what happens during
the operation, it is a relatively flexible notion of `OperationContext`.

As of this patch, this feature is fairly barebone and mostly exists for
extensions in the following sense:
1. only user reads (`ReadCommand` execution) currently sets up such a
   context. The code is written in such a way that adding support for
   other operations should be easy, but this is not done.
2. the context set by reads by default has barely any information: it
   merely has a `toString()` method that can roughly tell what the
   operation itself is, and so could have use for debugging, but not
   much more. Further, that context is not read by anything in this
   patch. However, said context are created through a "factory" and
   the factory class is configurable through a system property. So
   extension can override the factory in order to create contexts
   with more information/state, and they fetch/use those context where
   they see fit.
djatnieks pushed a commit that referenced this pull request May 18, 2025
The goal of this commit is to allow low-level code, typically at the
level of file reads (say, a custom `FileChannel` implementation), to
obtain context on which higher level operation they correspond to. A
typical use case may for tiered storage where `FileChannel`
implementations may forward reads to remote storage, and where having
context on the overall operation could help, say, take prior work done
to set timeout, or aggregate metrics per "user-level" operations (rather
than per `FileChannel` operation).

To do this, we reuse the existing `ExecutorLocals` mechanism, and simply
have the top-level operation setup the proper context as an
`ExecutorLocal`, allowing it to be accessed by any low-level operations
operating on behalf of that operation. This is, in many way, similar to
tracing, but instead of a `TraceState` that collect what happens during
the operation, it is a relatively flexible notion of `OperationContext`.

As of this patch, this feature is fairly barebone and mostly exists for
extensions in the following sense:
1. only user reads (`ReadCommand` execution) currently sets up such a
   context. The code is written in such a way that adding support for
   other operations should be easy, but this is not done.
2. the context set by reads by default has barely any information: it
   merely has a `toString()` method that can roughly tell what the
   operation itself is, and so could have use for debugging, but not
   much more. Further, that context is not read by anything in this
   patch. However, said context are created through a "factory" and
   the factory class is configurable through a system property. So
   extension can override the factory in order to create contexts
   with more information/state, and they fetch/use those context where
   they see fit.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants