MySQL CDC: Record Duplication Due to Incorrect Offset Restart After Debezium Connector Failure (Error 1236)

### Helm Chart Version

2.0

### What step the error happened?

During the Sync

### Relevant information

**Airbyte Platform Version:** 2.0.1
**Source Connector:** MySQL CDC (`3.51.5`)
**Destination Connector:** BigQuery (`3.0.16`)
**Sync Mode:** Change Data Capture (CDC)
**Target Write Schema:** Append

### 🐞 Bug Description

When a MySQL CDC sync job (Run 1) fails *after* it has started emitting data, the subsequent job (Run 2) incorrectly restarts from the **initial binlog offset of Run 1** instead of the last committed offset. Because the destination uses the **Append** write mode, records already sent to BigQuery are re-processed and written again, causing **data duplication**. The attached replication job logs cover both runs: the one that failed, and the subsequent one that completed but produced duplicates in the destination.

### Steps to Reproduce

1.  Configure a **MySQL CDC Source Connector (3.51.5)** syncing to a **BigQuery Destination (3.0.16)** using the **Append** write mode on Airbyte Platform **2.0.1**.
2.  Start **Run 1** (sync) which successfully begins streaming from a specific position:
    ```log
    2025-11-30 11:46:50 source ERROR : Requesting streaming from position filename: db05-slave.087542, position: 87056536
    ```
3.  **Force Run 1 to fail** shortly after it starts streaming, specifically triggering the Debezium/MySQL Error 1236 (replica ID conflict).
    * *The failure log excerpt:*
        ```log
        2025-11-30 11:46:56 source ERROR blc-db05-slave.bravofly.intra:3306 i.d.p.ErrorHandler(setProducerThrowable):52 Producer failure io.debezium.DebeziumException: A replica with the same server_uuid/server_id as this replica has connected to the source; the first event 'db05-slave.087542' at 87056536... Error code: 1236; SQLSTATE: HY000.
        ```
4.  Verify that BigQuery received records before Run 1 terminated.
5.  Correct the failure cause (e.g., resolve the server ID conflict) and start **Run 2**.
6.  **Observe the Run 2 log:** The connector logs immediately confirm it is restarting from the exact same binlog position where Run 1 started (`db05-slave.087542, position=87056536`), demonstrating the incorrect offset retrieval:
    * *Run 2 Log excerpt:*
        ```log
        2025-11-30 12:11:19 source INFO DefaultDispatcher-worker-3#global-round-1-create-partitions i.a.c.r.c.CdcPartitionsCreator(run):144 Current position 'MySqlSourceCdcPosition(fileName=db05-slave.087542, position=87056536)' does not exceed target position 'MySqlSourceCdcPosition(fileName=db05-slave.087546, position=19566958)'.
        ```
7.  Check the BigQuery table: the initial batch of records (those processed between Run 1 start and failure) is duplicated.
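
To make the comparison in steps 2 and 6 less error-prone, the two positions can be extracted from the log lines programmatically. The following is a minimal sketch, not part of Airbyte or Debezium: the regular expressions are assumptions based only on the two log formats quoted above, and `parse_position` is a hypothetical helper name.

```python
import re

# Patterns derived from the two log lines quoted in this issue (assumed formats).
REQUEST_RE = re.compile(
    r"position filename: (?P<file>[^,\s]+), position: (?P<pos>\d+)"
)
CURRENT_RE = re.compile(
    r"Current position 'MySqlSourceCdcPosition\("
    r"fileName=(?P<file>[^,]+), position=(?P<pos>\d+)\)'"
)


def parse_position(line):
    """Return (binlog_file, offset) from a log line, or None if none is found."""
    for pattern in (REQUEST_RE, CURRENT_RE):
        match = pattern.search(line)
        if match:
            return match.group("file"), int(match.group("pos"))
    return None


# Log lines copied from the Run 1 and Run 2 excerpts above.
run1_line = (
    "2025-11-30 11:46:50 source ERROR : Requesting streaming from position "
    "filename: db05-slave.087542, position: 87056536"
)
run2_line = (
    "2025-11-30 12:11:19 source INFO DefaultDispatcher-worker-3"
    "#global-round-1-create-partitions i.a.c.r.c.CdcPartitionsCreator(run):144 "
    "Current position 'MySqlSourceCdcPosition(fileName=db05-slave.087542, "
    "position=87056536)' does not exceed target position "
    "'MySqlSourceCdcPosition(fileName=db05-slave.087546, position=19566958)'."
)

run1_start = parse_position(run1_line)
run2_start = parse_position(run2_line)

# True means Run 2 began at the same binlog position as Run 1.
restarted_from_same_offset = run1_start == run2_start
print(run1_start, run2_start, restarted_from_same_offset)
```

If both runs report the same file and offset, Run 2 did not advance past the point where Run 1 began, which matches the behavior described in this report.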

### Expected Behavior

The subsequent job (Run 2) should resume from the **last binlog offset that was successfully confirmed (committed state)** by the BigQuery destination connector. This ensures that records already written to the target are not re-processed and duplicated.
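
The difference between the reported and the expected behavior can be illustrated with a toy model. This sketch does not reflect Airbyte's actual state handling; the event offsets, payloads, and the names `read_from`, `run1_start`, and `checkpoint` are all hypothetical and chosen only for illustration. It assumes a checkpoint stores the offset of the next event to read.

```python
# Toy binlog: (offset, payload) pairs. All values are made up for illustration.
events = [(100, "a"), (200, "b"), (300, "c"), (400, "d")]


def read_from(start_offset):
    """Return payloads of all events at or after start_offset."""
    return [payload for offset, payload in events if offset >= start_offset]


# Run 1 starts at offset 100. The destination confirms "a" and "b",
# so the committed checkpoint points at the next unread event (300).
# Run 1 then fails before "c" is confirmed.
run1_start = 100
written_by_run1 = ["a", "b"]
checkpoint = 300

# Reported behavior: Run 2 restarts from Run 1's initial offset.
# With an append-only destination, "a" and "b" are written twice.
reported = written_by_run1 + read_from(run1_start)

# Expected behavior: Run 2 resumes from the last committed checkpoint.
expected = written_by_run1 + read_from(checkpoint)

print(reported)
print(expected)
```

In the first case the append-only table ends up with `a` and `b` twice; in the second each record appears once.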

[db05_volagratis_soft_logs_2404_txt.txt](https://github.com/user-attachments/files/23848997/db05_volagratis_soft_logs_2404_txt.txt)
[db05_volagratis_soft_logs_2395_txt.txt](https://github.com/user-attachments/files/23848996/db05_volagratis_soft_logs_2395_txt.txt)


MySQL CDC: Record Duplication Due to Incorrect Offset Restart After Debezium Connector Failure (Error 1236) #70254
