Open
Labels
type/enhancement: The issue or PR belongs to an enhancement.
Description
Problem
Currently, the eventBroker can scan too many events for a single dispatcher, which quickly exhausts the memory quota. This causes several problems:
- Dispatcher Starvation: In DDL or syncpoint scenarios, certain dispatchers can be starved because one dispatcher monopolizes the available memory quota while others wait indefinitely.
- Frequent Reset Events: Under high-throughput workloads, the memory quota is hit frequently, triggering reset events. This makes synchronization uneven and degrades overall performance.
- Unbalanced Event Distribution: The eventBroker cannot fairly distribute scanning resources across dispatchers.
Proposed Solution
Introduce a memory-aware scan window algorithm that dynamically adjusts the scan interval based on the memory quota watermark. The key components are:
- Sliding Window Memory Monitoring: Track memory usage samples over a configurable time window (e.g., 30 seconds) to compute average, max, and trend statistics.
- Tiered Response Thresholds:
  - Critical (>90% of quota used): Aggressively reduce the scan interval to 1/4 of its current value
  - High (>70%): Reduce the scan interval to 1/2 of its current value
  - Trend Damping (>30% and rising): Proactively reduce the scan interval by 10%
  - Low (<20%): Increase the scan interval by 25%
  - Very Low (<10%): Increase the scan interval by 50%
- "Fast Brake, Slow Accelerate" Policy:
  - Decreases are applied immediately when memory pressure rises
  - Increases require cooldown periods and stable conditions to prevent oscillation
- Scan Window Coordination: Use a base timestamp (minSentTs) combined with the dynamic scan interval to compute the maximum timestamp (scanMaxTs) for each scan operation, ensuring dispatchers progress together.
Expected Benefits
- Eliminate Dispatcher Starvation: All dispatchers get fair access to scanning resources
- Reduce Reset Events: Memory quota is managed proactively, avoiding sudden exhaustion
- Smoother Synchronization: Consistent throughput without memory pressure spikes
- Better DDL/Syncpoint Handling: Critical operations complete without being blocked by memory exhaustion