
Conversation

@popduke (Contributor) commented Nov 7, 2025

  • Base-KV improvements:
    Loosens the merge prerequisite on config alignment, enhances the RedundantRangeRemovalBalancer to detect and purge zombie ranges, refactors the local engine for pluggability, and adds pluggable split hinters. Also rolls out multiple optimizations: snapshot session reuse, obsolete snapshot filtering, a leaner installation flow, lower split/merge memory pressure, faster distributed range lookup, RocksDB internal metrics toggles, redefined WAL compaction thresholds, an improved WAL read path, and better default local-engine settings.

  • Inbox service optimizations:
    Cuts read/write overhead in InboxStoreCoProc and accelerates BatchInsertRequest marshalling.

  • Multi-tenancy enhancements:
    Adds tenant-level observability metrics and introduces a tenant-level switch to publish will messages during shutdown.

  • Bug fixes:
    Resolves a NoSuchElementException when parsing AgentMetadata from CRDT and eliminates task duplication in BatchQueryCall.

Commits (titles truncated in the page capture):

  • …equisite
    2. RedundantRangeRemovalBalancer supports detecting and removing 'zombie' ranges
  • …sDB Snapshot
    2. enable heuristic compaction for data engine by default for dist/inbox/retain
  • … WAL compaction
    2. add a metric to monitor linearization latency
  • …earDown and improve server shutdown handling
    1. move signalling fetch to dedicated executor
    2. reduce flush operation
    2. enable merge behavior for EngineConfig
    3. close rocksdb object in preferred order
*/
public static <T> T parse(ByteString bytes, Parser<T> parser) throws InvalidProtocolBufferException {
    CodedInputStream input = bytes.newCodedInput();
    input.enableAliasing(true); // zero-copy: parsed bytes fields may alias the backing array of 'bytes'
    return parser.parseFrom(input); // assumed completion of the excerpt, shown here for readability
}
@Gujiawei-Edinburgh (Contributor) commented Nov 10, 2025

With this PR, the zero-copy parser is used more widely. What about the case of direct buffer memory usage?

@popduke (Contributor, Author) replied:

The zero-copy parser is only applied when parsing a PB from a ByteString backed by a byte array. For direct buffers, things get much more complex; that will be considered in a future release.

@Gujiawei-Edinburgh (Contributor) replied:

I'm not quite sure. The enableAliasing Javadoc says: "Enables ByteString aliasing of the underlying buffer, trading off on buffer pinning for data copies. Only valid for buffer-backed streams." So I think the buffer pinning here means the direct buffer memory would stay pinned until the parsed PB object is garbage collected.
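For readers following this thread, a small hypothetical sketch of the trade-off being discussed (not code from this PR; it uses the stock BytesValue wrapper message from protobuf-java): with aliasing enabled, a bytes field of the parsed message can be a view over the input buffer instead of a copy, which is exactly why the input stays reachable, i.e. pinned, for as long as the parsed object is alive.

```java
import com.google.protobuf.ByteString;
import com.google.protobuf.BytesValue;
import com.google.protobuf.CodedInputStream;

public final class AliasingSketch {
    public static void main(String[] args) throws Exception {
        // Serialize a BytesValue wrapper carrying a small payload.
        ByteString wire = BytesValue.newBuilder()
            .setValue(ByteString.copyFromUtf8("hello"))
            .build()
            .toByteString();

        CodedInputStream input = wire.newCodedInput();
        input.enableAliasing(true); // zero-copy: bytes fields may alias 'wire' instead of copying

        BytesValue parsed = BytesValue.parser().parseFrom(input);

        // parsed.getValue() may share the buffer behind 'wire'; no defensive copy was made,
        // so the source buffer (a direct buffer, in the case raised above) stays reachable
        // for as long as 'parsed' is alive.
        System.out.println(parsed.getValue().toStringUtf8());
    }
}
```

The point of the sketch is only the lifetime coupling: once aliasing is enabled, the parsed object can keep the source buffer alive.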

for (IKVEngineProvider p : providers.values()) {
    if (p.type().equalsIgnoreCase(type)) {
        if (found != null) {
            throw new IllegalStateException("Duplicate storage engine provider type: " + type);
        }
        found = p; // assumed continuation of the excerpt
    }
}
@Gujiawei-Edinburgh (Contributor) commented Nov 11, 2025

Would it be better to log a warning and ignore the duplicate providers?

@popduke (Contributor, Author) replied:

This is by design. It's dangerous to have local engine implementations with conflicting type names running in a BifroMQ process, since the loading order is not guaranteed across restarts.
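A minimal sketch of the loading-order concern, assuming providers are discovered through java.util.ServiceLoader (an assumption for illustration, not necessarily the mechanism used in the PR): discovery follows classpath order, which is not a stable contract across restarts or repackaging, so silently keeping "the first provider seen" could resolve to a different engine after a redeploy.

```java
import java.util.ServiceLoader;

// Hypothetical SPI shape; the real IKVEngineProvider interface in the PR has more to it.
interface StorageEngineProviderSpi {
    String type();
}

final class ProviderDiscoverySketch {
    public static void main(String[] args) {
        // Discovery order depends on the classpath, so a warn-and-ignore policy for
        // duplicate types would make the surviving provider an accident of packaging.
        for (StorageEngineProviderSpi p : ServiceLoader.load(StorageEngineProviderSpi.class)) {
            System.out.println("discovered provider type: " + p.type());
        }
    }
}
```

Failing fast with an IllegalStateException, as the excerpt above does, turns that nondeterminism into an immediate, visible startup error instead of a silent behavior change.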

boundary.getEndKey().toStringUtf8()));
try (IKVSpaceIterator itr = new RocksDBKVSpaceIterator(new RocksDBSnapshot(dbHandle, null), boundary,
new IteratorOptions(false, 52428))) {
for (itr.seekToFirst(); itr.isValid(); itr.next()) {
@Gujiawei-Edinburgh (Contributor) commented:

During this loop, if an error occurs midway, what happens to the data that has already been flushed in the same loop? Each operation checks shouldFlush.

@popduke (Contributor, Author) replied:

Data migration is not transactional at this level; the split/merge process ensures consistency.
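To illustrate what "not transactional at this level" means in practice, here is a small hypothetical sketch (plain maps instead of BifroMQ's KV spaces; the flush threshold stands in for the shouldFlush check): entries are flushed batch by batch, so a failure midway leaves the earlier batches in the target, and it is the surrounding split/merge protocol that decides whether to retry the migration or discard the partially populated range.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class RangeCopySketch {
    // Copy source entries into target, flushing whenever the pending batch exceeds flushBytes.
    static void copy(Map<String, byte[]> source, Map<String, byte[]> target, int flushBytes) {
        List<Map.Entry<String, byte[]>> batch = new ArrayList<>();
        int pending = 0;
        for (Map.Entry<String, byte[]> e : source.entrySet()) {
            batch.add(e);
            pending += e.getKey().length() + e.getValue().length;
            if (pending >= flushBytes) {   // analogous to the shouldFlush check in the loop above
                flush(batch, target);      // once flushed, these entries are not rolled back on failure
                batch.clear();
                pending = 0;
            }
        }
        flush(batch, target);              // flush the final partial batch
    }

    private static void flush(List<Map.Entry<String, byte[]>> batch, Map<String, byte[]> target) {
        for (Map.Entry<String, byte[]> e : batch) {
            target.put(e.getKey(), e.getValue());
        }
    }

    public static void main(String[] args) {
        Map<String, byte[]> src = Map.of("a", new byte[] {1}, "b", new byte[] {2, 3});
        Map<String, byte[]> dst = new HashMap<>();
        copy(src, dst, 4);
        System.out.println("copied " + dst.size() + " entries");
    }
}
```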

@popduke merged commit 2b342e9 into apache:main on Nov 12, 2025
6 checks passed
@popduke deleted the feat-merge-protocol-upgrade branch on November 12, 2025 at 01:18