-
Notifications
You must be signed in to change notification settings - Fork 174
refactor(state): improve error propagation for CommitCheckpointVerifiedBlock #9979
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
refactor(state): improve error propagation for CommitCheckpointVerifiedBlock #9979
Conversation
arya2
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks good, one small suggestions to avoid changing the error messages if you want to keep a CloneError variant in the new error enum.
Should CloneError be replaced?
It would be great if it were replaced there, though it looks like the change may be substantial if you want this to merge as-is.
Should CommitCheckpointVerifiedError wrap QueueAndCommitError?
I think that would be nice, CommitCheckpointVerifiedError could essentially be a copy of CommitSemanticallyVerifiedError with an added WriteError(rocksdb::Error) variant.
|
Hi @arya2, thank you for the feedback. I’ve restructured Some variants between the finalized and non-finalized paths are close in meaning, but not identical — so a full unification would require either small semantic shifts or renaming a few arms. Also, e.g., If you already have a naming or structural direction in mind, feel free to push it — I’m happy to clean up any resulting breakages or follow-up changes. TODOs before merge:
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is looking great!
I started making some changes based off this branch in #9999, where the write_block() method has been updated to return a CommitCheckpointVerifiedError. I think there may still be some room for improvement in the Duplicate error variant. I'm happy to work on that a bit more or for you to takeover from there, let me know what you would prefer.
Aside from improving the Duplicate error variant, these changes will likely require some updates to ChainSync::should_restart_sync().
Some variants between the finalized and non-finalized paths are close in meaning, but not identical — so a full unification would require either small semantic shifts or renaming a few arms.
I think we should fully unify them and have both request variants return a CommitBlockError, the only substantial difference is that committing checkpoint-verified blocks could return a rocksdb::Error, but only because those are expected not to happen when finalizing non-finalized blocks, so I think it would be fine to expect that they won't happen for checkpoint-verified blocks either. It doesn't seem like Zebra could recover from those errors anyway, see "an error during a write operation (write to WAL, Memtable Flush, background compaction etc) may cause the database instance to go into read-only mode and further user writes may not be accepted."
Also, e.g., QueueAndCommitError::Dropped isn't currently used — would you prefer to keep these around for now?
If they're not currently necessary, it seems easy enough to add them once they're needed.
I tried to beautify the new Currently, the On a related note: I noticed that Do you prefer that I do this cleanup as part of this PR, or would you rather wait and revisit it once the |
590b1af to
dead800
Compare
|
Yesterday, after a rebase, I had some issues with the acceptance tests in We’ve been working on parts of the code that might affect mempool logic, so I’m wondering — is there anything I should take a closer look at, or anything in particular to watch out for? |
|
Hi @arya2, I patched the
This commit only fixes a small part of #2908 — the fallback string matching is still a TODO, I think. |
|
After re-visiting the method, I made it a bit more defensive: it now checks for Do you want to merge this “safe but messy” fix first and clean it up later (e.g., by implementing |
I would prefer to address it in this PR if it helps make the
Whichever you prefer, I see no urgency to do it now nor much benefit to deferring it. Yesterday, after a rebase, I had some issues with the acceptan
No, these changes should have no effect on the mempool, that test and a handful of other acceptance tests just fail occasionally, it's not an issue unless they fail consistently or are relevant to the changes and fail in a way that could be due to a concurrency bug.
I want to address #2908 as part of this PR, we've been wanting to address it for a long time, seems convenient to do it here where it makes the changes easier to review, and it would be nice to merge this just after tagging the 3.0.0 release anyway so that there's plenty of time to test it before it gets tagged in a release. Let me know if you'd like me to make those changes. Thank you for making all of these changes, they're looking excellent. I'll closely review everything again soon. |
|
Thank you @arya2 for all the explanations and guidance! I really appreciate your help. I agree that we shouldn’t be lazy now 😉 — both TODOs seem manageable and in range to get closed here. |
|
Here are the TODOs we’ve identified.
|
|
I have reverted the changes to Update:
Questions for review:
|
implements review feedback Co-authored-by: Arya <[email protected]>
- Updates `KnownBlock` request handling to check for block hashes in the state's sent hashes, fixing a bug where it's possible for part of a chain to be duplicated in the non-finalized state if a block is sent to the state via a `CommitSemanticallyVerifiedBlock` request while it's in the block write channel or before the non-finalized state channel has been updated
…rror and removes the `CloneError` variant
15125c5 to
4889e85
Compare
Motivation
This PR addresses #9730 and focuses on the
CommitCheckpointVerifiedBlockpath.Solution
CommitCheckpointVerifiedBlockfollowing the example from refactor(state): improve error propagation forCommitSemanticallyVerifiedBlock#9848 and Add(state): Adds aMappedRequesthelper trait and refactors error types used byCommitSemanticallyVerifiedBlockrequests #9923.CommitCheckpointVerifiedErrorinerror.rs.MappedRequestforCommitCheckpointVerifiedBlock.Follow-up Work
CloneErrorbe replaced?CloneErrorwas introduced due tocommit_finalized_directreturning aBoxError. Replacing it will require additional changes to the error like implementingFrom<zebra_chain::parallel::tree::NoteCommitmentTreeError>.CommitCheckpointVerifiedErrorwrapQueueAndCommitError?There is some overlap between these error types, particularly with regard to the error messages. Adopting
QueueAndCommitErrorwould simplify error handling, but this would require modifying some of the existing error messages to match.Happy to adjust based on feedback.
PR Checklist