Update merge protocol & various improvements for production stability #188
Conversation
Commits (titles truncated):
- …equisite 2. RedundantRangeRemovalBalancer supports detecting and removing 'zombie' ranges
- …sDB Snapshot 2. enable heuristic compaction for data engine by default for dist/inbox/retain
- …nagement and error handling
- … WAL compaction 2. add a metric to monitor linearization latency
- …NoSuchElementException
- … server shutting down
- …earDown and improve server shutdown handling
- …ing FIFO response handling
- 2. Reduce sizing overhead
- 1. move signalling fetch to dedicated executor 2. reduce flush operation
- 2. enable merge behavior for EngineConfig 3. close rocksdb object in preferred order
```java
 */
public static <T> T parse(ByteString bytes, Parser<T> parser) throws InvalidProtocolBufferException {
    CodedInputStream input = bytes.newCodedInput();
    input.enableAliasing(true);
```
With this PR, the zero-copy parser is used more widely. What about the case of direct-buffer memory usage?
The zero-copy parser is only applied when parsing protobuf from a ByteString backed by a byte array. For direct buffers, things become far more complex; that will be considered in a future release.
I'm not quite sure. The enableAliasing Javadoc says: "Enables ByteString aliasing of the underlying buffer, trading off on buffer pinning for data copies. Only valid for buffer-backed streams." So I read the buffer pinning here to mean that direct buffer memory would stay pinned until the parsed protobuf object is garbage collected.
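For reference, here is a minimal self-contained sketch of the zero-copy parse path discussed above; the final parseFrom call and the helper class name are assumptions filling in the truncated diff, not the PR's exact code.

```java
import com.google.protobuf.ByteString;
import com.google.protobuf.CodedInputStream;
import com.google.protobuf.InvalidProtocolBufferException;
import com.google.protobuf.Parser;

// Hypothetical helper class; the PR's real method lives elsewhere.
final class ZeroCopyParser {
    // With aliasing enabled, ByteString fields of the parsed message may
    // reference (alias) the input buffer instead of copying from it, so the
    // backing buffer stays reachable until the message is garbage collected.
    static <T> T parse(ByteString bytes, Parser<T> parser) throws InvalidProtocolBufferException {
        CodedInputStream input = bytes.newCodedInput();
        input.enableAliasing(true); // safe for array-backed ByteStrings, which are immutable
        return parser.parseFrom(input); // assumed completion of the truncated diff
    }
}
```

That aliasing behavior is exactly why the pinning question above matters: for a direct buffer, any aliased ByteString field would keep the entire buffer pinned for the lifetime of the parsed object.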
```java
for (IKVEngineProvider p : providers.values()) {
    if (p.type().equalsIgnoreCase(type)) {
        if (found != null) {
            throw new IllegalStateException("Duplicate storage engine provider type: " + type);
```
Would it be better to log a warning and ignore the duplicate providers?
This is by design. It's dangerous to have name-conflicting local engine implementations running in a BifroMQ process, since the loading order is not guaranteed across restarts.
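To illustrate the design choice, here is a minimal sketch of fail-fast provider resolution, assuming discovery via java.util.ServiceLoader; the interface below is pared down to the one method visible in the diff, and the resolver class name is hypothetical.

```java
import java.util.ServiceLoader;

// Reduced to the single method visible in the diff; the real interface
// presumably carries engine factory methods as well.
interface IKVEngineProvider {
    String type();
}

final class EngineProviderResolver {
    // Fails fast on duplicate provider types: ServiceLoader iteration order
    // is not guaranteed to be stable across restarts, so silently picking
    // one of two same-named providers would be non-deterministic.
    static IKVEngineProvider resolve(String type) {
        IKVEngineProvider found = null;
        for (IKVEngineProvider p : ServiceLoader.load(IKVEngineProvider.class)) {
            if (p.type().equalsIgnoreCase(type)) {
                if (found != null) {
                    throw new IllegalStateException("Duplicate storage engine provider type: " + type);
                }
                found = p;
            }
        }
        if (found == null) {
            throw new IllegalStateException("No storage engine provider of type: " + type);
        }
        return found;
    }
}
```

Logging and skipping, as suggested above, would hide the conflict and make engine selection depend on classpath order.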
```java
    boundary.getEndKey().toStringUtf8()));
try (IKVSpaceIterator itr = new RocksDBKVSpaceIterator(new RocksDBSnapshot(dbHandle, null), boundary,
                                                       new IteratorOptions(false, 52428))) {
    for (itr.seekToFirst(); itr.isValid(); itr.next()) {
```
If an error occurs midway through this loop, what happens to the data that has already been flushed earlier in the same loop? Each operation checks shouldFlush.
Data migration is not transactional at this level; the split/merge process ensures consistency.
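To make the flush semantics concrete, here is a hypothetical sketch of the batched-copy pattern using the plain RocksDB Java API (RocksIterator and WriteBatch) instead of BifroMQ's IKVSpaceIterator; the method name, threshold parameter, and comments are illustrative assumptions.

```java
import org.rocksdb.ColumnFamilyHandle;
import org.rocksdb.RocksDB;
import org.rocksdb.RocksDBException;
import org.rocksdb.RocksIterator;
import org.rocksdb.WriteBatch;
import org.rocksdb.WriteOptions;

final class BatchedRangeCopy {
    // Copies every key the iterator yields into the target column family,
    // flushing whenever the accumulated batch exceeds flushThreshold bytes.
    // Not transactional: if an error is thrown midway, batches already
    // written stay in the DB, and the split/merge flow reconciles them later.
    static void copy(RocksDB db, ColumnFamilyHandle target, RocksIterator itr,
                     WriteOptions opts, long flushThreshold) throws RocksDBException {
        try (WriteBatch batch = new WriteBatch()) {
            long batchBytes = 0;
            for (itr.seekToFirst(); itr.isValid(); itr.next()) {
                byte[] key = itr.key();
                byte[] value = itr.value();
                batch.put(target, key, value);
                batchBytes += key.length + value.length;
                if (batchBytes >= flushThreshold) { // the shouldFlush check
                    db.write(opts, batch);
                    batch.clear();
                    batchBytes = 0;
                }
            }
            db.write(opts, batch); // flush whatever remains
        }
    }
}
```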
Base-KV improvements:
- Loosens the merge prerequisite on config alignment.
- Enhances the RedundantRangeRemovalBalancer to detect and purge zombie ranges.
- Refactors the local engine for pluggability and introduces pluggable split hinters.
- Rolls out multiple optimizations: snapshot session reuse, obsolete snapshot filtering, a leaner installation flow, lower split/merge memory pressure, faster distributed range lookup, toggles for RocksDB internal metrics, redefined WAL compaction thresholds, an improved WAL read path, and better default local-engine settings.
Inbox service optimizations:
Cuts read/write overhead in InboxStoreCoProc and accelerates BatchInsertRequest marshalling.
Multi-tenancy enhancements:
Adds tenant-level observability metrics and introduces a tenant-level switch to publish will messages during shutdown.
Bug fixes:
Resolves a NoSuchElementException when parsing AgentMetadata from CRDT and eliminates task duplication in BatchQueryCall.