[Feature][MySQL CDC] MySQL cdc support start by time #9144

Open
2 of 3 tasks
davidzollo opened this issue Apr 10, 2025 · 3 comments · May be fixed by #9285
@davidzollo
Contributor

davidzollo commented Apr 10, 2025

Search before asking

  • I searched the existing feature requests and found no similar requirement.

Description

MySQL CDC should support starting from a given time.

Currently, the MySQL CDC source supports starting from a specific binlog position or GTID. However, in many real-world scenarios, users expect to start a synchronization job from a human-friendly timestamp, for example:

  • Resume from "2024-04-01 10:00:00" after failure
  • Backfill data since a specific time window

Adding support for start-time (e.g. 2024-04-10 08:00:00) will greatly simplify CDC task configuration and make SeaTunnel more user-friendly in operational scenarios.


source {
  MySQL-CDC {
    hostname = "xxx"
    port = 3306
    ...
    start-time = "2024-04-10 08:00:00"  # Suggested new feature
  }
}

Error handling:

| Case | Behavior |
| --- | --- |
| start-time too old, binlog already purged | Fail fast with clear error: Start time is earlier than binlog available. Earliest = 2024-04-08 11:00:00 |
| start-time too new (after current time) | Allowed, CDC will wait until matching binlog is produced |
| Time parsing failure | Job fails with IllegalArgumentException |
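
The three cases above could be checked along the following lines. This is only a minimal sketch, not the connector's actual code: the class name `StartTimeValidator`, both method names, the `yyyy-MM-dd HH:mm:ss` pattern (inferred from the example value), and the use of `IllegalStateException` for the purged-binlog case are all assumptions made for illustration. How the earliest available binlog time is obtained is left out.

```java
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;
import java.time.format.DateTimeParseException;

// Hypothetical helper; none of these names exist in SeaTunnel.
public class StartTimeValidator {

    // Assumed format, taken from the example value "2024-04-10 08:00:00".
    private static final DateTimeFormatter FORMAT =
            DateTimeFormatter.ofPattern("yyyy-MM-dd HH:mm:ss");

    // Case "Time parsing failure": surface it as IllegalArgumentException.
    public static LocalDateTime parseStartTime(String raw) {
        try {
            return LocalDateTime.parse(raw, FORMAT);
        } catch (DateTimeParseException e) {
            throw new IllegalArgumentException(
                    "Invalid start-time '" + raw + "', expected yyyy-MM-dd HH:mm:ss", e);
        }
    }

    // earliestAvailable: time of the oldest binlog event still on the server
    // (assumed to be supplied by the caller).
    public static LocalDateTime validate(String raw, LocalDateTime earliestAvailable) {
        LocalDateTime startTime = parseStartTime(raw);

        // Case "start-time too old": fail fast with the message from the table.
        // The exception type is an assumption; the issue does not name one.
        if (startTime.isBefore(earliestAvailable)) {
            throw new IllegalStateException(
                    "Start time is earlier than binlog available. Earliest = "
                            + earliestAvailable.format(FORMAT));
        }

        // Case "start-time too new": deliberately no upper-bound check, so a
        // future time is accepted and the reader simply waits for events.
        return startTime;
    }
}
```

A call such as `StartTimeValidator.validate("2024-04-01 10:00:00", earliest)` with `earliest` set to 2024-04-08 11:00:00 would throw with the error text shown in the first row.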

User Scenario:

In real-world CDC scenarios, users often face recovery requirements like:

“I want to resume this CDC pipeline from 2024-04-10 00:00:00”

“I want to only capture changes after yesterday 08:00”

“Binlog filename is not available, but timestamp is known”
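
For the last scenario, the timestamp has to be mapped to a binlog file somehow. The issue does not say how; one possible approach, sketched below purely as an assumption, is to pick the newest binlog file whose first event is not later than the requested time and then skip events inside it that are older than the start time. `BinlogFileLocator`, `BinlogFile`, and `locate` are hypothetical names, and the sketch assumes the caller already knows each file's first-event timestamp.

```java
import java.util.List;

// Hypothetical helper; not part of SeaTunnel or any MySQL client library.
public class BinlogFileLocator {

    // A binlog file name plus the timestamp (epoch millis) of its first event.
    public static final class BinlogFile {
        final String name;
        final long firstEventMillis;

        public BinlogFile(String name, long firstEventMillis) {
            this.name = name;
            this.firstEventMillis = firstEventMillis;
        }
    }

    // files must be ordered from oldest to newest.
    // Returns the last file whose first event is at or before startMillis.
    public static String locate(List<BinlogFile> files, long startMillis) {
        String candidate = null;
        for (BinlogFile file : files) {
            if (file.firstEventMillis <= startMillis) {
                candidate = file.name;   // still at or before the start time
            } else {
                break;                   // every later file starts after it
            }
        }
        if (candidate == null) {
            // Even the oldest retained file begins after the requested time.
            throw new IllegalStateException(
                    "Start time is earlier than binlog available.");
        }
        return candidate;
    }
}
```

A start time later than every file's first event resolves to the newest file, which matches the "too new" behavior of waiting for matching events.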

Related issues

No response

Are you willing to submit a PR?

  • Yes I am willing to submit a PR!

Code of Conduct

@FrommyMind
Contributor

What is the difference between this and the CDC base option startup.timestamp?

@ocean-zhc
Contributor

If no one claims it, please assign it to me.

@davidzollo davidzollo moved this from Todo to Doing in SeaTunnel RoadMap Apr 12, 2025
@ocean-zhc
Contributor

#9285

@Hisoka-X Hisoka-X linked a pull request May 9, 2025 that will close this issue
4 tasks
Projects
Status: Doing
3 participants