saligia-tju

Introducing AsyncScheduler with PartitionedDeserializationScheduler

🚀 Core Architecture

Event Routing: Data-change events are partitioned by primary key and dispatched to dedicated single-thread partition workers, so all events for a given key are handled by the same worker thread.
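A minimal sketch of key-based routing, for illustration only. `PkPartitionRouter` and `partitionFor` are hypothetical names, and hashing the key fields modulo the worker count is an assumption; the PR may use a different function.

```java
import java.util.List;

/** Hypothetical router: maps a primary key to one of N partition workers. */
final class PkPartitionRouter {
    private final int workerCount;

    PkPartitionRouter(int workerCount) {
        if (workerCount <= 0) {
            throw new IllegalArgumentException("workerCount must be positive");
        }
        this.workerCount = workerCount;
    }

    /** Equal key fields always map to the same index in [0, workerCount). */
    int partitionFor(List<?> primaryKeyFields) {
        // floorMod keeps the index non-negative even when hashCode() is negative.
        return Math.floorMod(primaryKeyFields.hashCode(), workerCount);
    }
}
```

Because the mapping depends only on the key, two changes to the same row land on the same single-thread worker and are processed in arrival order.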

Queue Management: Each partition has a bounded ArrayBlockingQueue with a blocking policy. When a queue is full, the producer blocks instead of dropping events, so no data is lost under backpressure.
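The blocking, no-drop behaviour can be sketched with the standard `ArrayBlockingQueue.put` call. The `PartitionQueue` wrapper is a hypothetical name, not the class in this PR.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

/** Hypothetical per-partition queue: bounded, blocks when full, never drops. */
final class PartitionQueue<E> {
    private final BlockingQueue<E> queue;

    PartitionQueue(int capacity) {
        this.queue = new ArrayBlockingQueue<>(capacity);
    }

    /** Blocks the caller while the queue is full instead of discarding the event. */
    void enqueue(E event) throws InterruptedException {
        queue.put(event);
    }

    /** Blocks the worker while the queue is empty. */
    E take() throws InterruptedException {
        return queue.take();
    }

    int size() {
        return queue.size();
    }
}
```

Using `put` rather than `offer` is what turns a full queue into backpressure on the producer rather than a dropped event.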

Ordering Guarantees: The source thread's drainRound is the only collector, which preserves per-key ordering through the whole pipeline.
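A rough sketch of a single-collector drain loop. Only the name `drainRound` comes from this PR; the `SourceThreadCollector` class, the per-partition output queues, and the method body are assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.function.Consumer;

/** Hypothetical collector: workers hand over results, only the source thread emits. */
final class SourceThreadCollector<R> {
    private final List<Queue<R>> outputs = new ArrayList<>();

    SourceThreadCollector(int partitions) {
        for (int i = 0; i < partitions; i++) {
            outputs.add(new ConcurrentLinkedQueue<>());
        }
    }

    /** Called by partition worker {@code partition} after deserializing a record. */
    void offerResult(int partition, R record) {
        outputs.get(partition).add(record);
    }

    /** Called only from the source thread; returns the number of records emitted. */
    int drainRound(Consumer<R> collector) {
        int emitted = 0;
        for (Queue<R> queue : outputs) {
            R record;
            // Each partition queue is FIFO, so per-partition (and per-key) order is kept.
            while ((record = queue.poll()) != null) {
                collector.accept(record);
                emitted++;
            }
        }
        return emitted;
    }
}
```

Since a key always maps to one partition and each partition queue is drained in FIFO order by one thread, records for the same key reach the collector in their original order.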

Checkpoint Recovery: The binlog offset advances only after the corresponding records have been emitted, so a checkpoint never records an offset ahead of the emitted data and recovery resumes from a consistent position.
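The emit-then-advance rule can be sketched as follows. `OffsetTracker` and its method names are hypothetical, and a plain `long` stands in for the real binlog offset type.

```java
import java.util.List;
import java.util.function.Consumer;

/** Hypothetical tracker: the offset moves forward only after a batch is fully emitted. */
final class OffsetTracker {
    private long committedOffset = -1L;

    /** Emits every record first; the offset is updated only if no emit call throws. */
    <R> void emitThenAdvance(List<R> records, long batchEndOffset, Consumer<R> collector) {
        for (R record : records) {
            collector.accept(record);
        }
        committedOffset = batchEndOffset;
    }

    /** The offset a checkpoint would store; never ahead of emitted data. */
    long offsetForCheckpoint() {
        return committedOffset;
    }
}
```

If emission fails partway through, the stored offset still points at the previous batch, so the failed batch is re-read after recovery rather than skipped.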

🌐 Global Async Control Path

Control Event Support: Control events (DDL/Watermark/Heartbeat) are handled through a global async path.

Completion Management: A sequence-to-batch mapping lets batches complete out of order while they are still emitted strictly in sequence order.
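One way to picture the sequence-to-batch mapping is a small reorder buffer. `InOrderEmitter` is a hypothetical name, and starting sequence numbers at 0 is an assumption of this sketch.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical reorder buffer: completions arrive in any order, emission is sequential. */
final class InOrderEmitter<B> {
    private final Map<Long, B> completed = new HashMap<>();
    private long nextToEmit = 0L;

    /**
     * Records a completed (non-null) batch and returns every batch that can now be
     * emitted, in sequence order. Returns an empty list while an earlier sequence
     * number is still outstanding.
     */
    synchronized List<B> complete(long sequence, B batch) {
        completed.put(sequence, batch);
        List<B> ready = new ArrayList<>();
        B next;
        while ((next = completed.remove(nextToEmit)) != null) {
            ready.add(next);
            nextToEmit++;
        }
        return ready;
    }
}
```

For example, if batches 1 and 2 finish before batch 0, the first two calls return nothing and the third returns all three batches in order.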

⚙️ Flexible Configuration Framework

Builder-Driven Design: All options are set through the builder:

  • parallelDeserializeEnabled - Master switch (disabled by default)
  • parallelDeserializePkWorkers - Number of primary-key partition workers
  • parallelDeserializeThreads - Size of the deserialization thread pool
  • parallelDeserializeQueueCapacity - Capacity of each bounded partition queue
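A sketch of what such a builder could look like. The four option names come from the list above; the `ParallelDeserializeOptions` class, the method signatures, and every default other than "disabled" are placeholders invented for this example and are not taken from the PR.

```java
/** Hypothetical options holder; only the option names mirror the PR description. */
final class ParallelDeserializeOptions {
    final boolean enabled;
    final int pkWorkers;
    final int threads;
    final int queueCapacity;

    private ParallelDeserializeOptions(Builder b) {
        this.enabled = b.enabled;
        this.pkWorkers = b.pkWorkers;
        this.threads = b.threads;
        this.queueCapacity = b.queueCapacity;
    }

    static Builder builder() {
        return new Builder();
    }

    static final class Builder {
        private boolean enabled = false;  // disabled by default, per the description
        private int pkWorkers = 1;        // placeholder default (assumption)
        private int threads = 1;          // placeholder default (assumption)
        private int queueCapacity = 1024; // placeholder default (assumption)

        Builder parallelDeserializeEnabled(boolean value) {
            this.enabled = value;
            return this;
        }

        Builder parallelDeserializePkWorkers(int value) {
            this.pkWorkers = value;
            return this;
        }

        Builder parallelDeserializeThreads(int value) {
            this.threads = value;
            return this;
        }

        Builder parallelDeserializeQueueCapacity(int value) {
            this.queueCapacity = value;
            return this;
        }

        ParallelDeserializeOptions build() {
            if (pkWorkers <= 0 || threads <= 0 || queueCapacity <= 0) {
                throw new IllegalArgumentException("worker, thread and queue values must be positive");
            }
            return new ParallelDeserializeOptions(this);
        }
    }
}
```

Usage under the same assumptions: `ParallelDeserializeOptions.builder().parallelDeserializeEnabled(true).parallelDeserializePkWorkers(4).build()`.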

Backward Compatibility: When disabled, the system falls back to the original single-thread execution path with no scheduler overhead.

🛡️ Enhanced Stability & Performance

Concurrency Safety: Fixes a potential ConcurrentModificationException during emission by copying the pending records into an array snapshot before iterating, so the collection is never modified while an iterator is open on it.
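The snapshot-before-iterate pattern, sketched with a plain `List`. `SnapshotEmit` and `emitSnapshot` are hypothetical names. The sketch assumes the copy is taken while no other thread is mutating the list (for example on the owning thread or under the same lock).

```java
import java.util.List;
import java.util.function.Consumer;

/** Hypothetical helper: iterate over a copy so the live list may change during emit. */
final class SnapshotEmit {
    private SnapshotEmit() {}

    /** Emits the records present at call time; returns how many were emitted. */
    static <R> int emitSnapshot(List<R> pending, Consumer<R> collector) {
        // Copy first: later add/remove calls on 'pending' cannot invalidate this loop.
        Object[] snapshot = pending.toArray();
        for (Object element : snapshot) {
            @SuppressWarnings("unchecked")
            R record = (R) element;
            collector.accept(record);
        }
        return snapshot.length;
    }
}
```

Iterating `pending` directly with a for-each loop would throw ConcurrentModificationException as soon as the collector callback added to or removed from the list; the array copy avoids that at the cost of one allocation per emit.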

Code Quality: Adds Chinese comments, simplifies code paths, and removes legacy JVM system properties to improve maintainability.

📋 Migration Guide

Activation Steps:

  • Set parallelDeserializeEnabled=true
  • Tune parallelDeserializePkWorkers, parallelDeserializeThreads, and parallelDeserializeQueueCapacity for the workload

Transition: When disabled, behavior is identical to previous versions.

Integration Requirements: Ensure downstream systems respect per-key ordering constraints; all collection operations remain anchored to the source thread for consistency.

…ers, bounded blocking queues, source-thread replay)
@ThorneANN
Contributor

Thanks for your contribution. Please don't use the master branch to merge.

