
[Bug] [Connector-v2] Bug title The Source Common Options parallelism not work #9302


Closed
2 of 3 tasks
Aaronx-Gallagher opened this issue May 12, 2025 · 8 comments · May be fixed by #9319

@Aaronx-Gallagher

Search before asking

  • I had searched in the issues and found no similar issues.

What happened

I set parallelism to 5 in the env block and parallelism to 1 on the Source. However, the job still runs with parallelism 5 rather than 1. Is this a bug, or am I using it incorrectly?

Image
Setting parallelism per source is described on the Source Common Options page of the official documentation.

Image

I then experimented with the Doris and FakeSource connectors, but the parameter still had no effect.

I found the relevant parallelism handling in the code, and it appears to read the value only from the env configuration rather than from the Source.

Image
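
In other words, the behaviour described above amounts to a pattern like the one below, where only the env-level value is consulted. This is a hypothetical sketch with made-up names, not the actual SeaTunnel code from the screenshot:

import java.util.Map;

// Hypothetical illustration of the behaviour described above: parallelism is read
// from the env configuration only, so a source-level "parallelism" option is
// never consulted. Names are made up for illustration.
final class EnvOnlyParallelismSketch {
    static int resolveParallelism(Map<String, Object> envOptions, int engineDefault) {
        Object fromEnv = envOptions.get("parallelism");
        return fromEnv != null ? ((Number) fromEnv).intValue() : engineDefault;
    }
}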

SeaTunnel Version

2.3.10

SeaTunnel Config

env {
  parallelism = 5
  job.mode = "BATCH"
  spark.master = local
}

source {
  Http {
    parallelism = 1
    plugin_output = "http"
    url = "http://localhost:8083/ttt/testhttp"
    method = "GET"
    format = "json"
    schema = {
      fields {
        msg = string
      }
    }
  }
}

transform {
}

sink {
  Console {
    plugin_input = "http"
  }
}

Running Command

I run it from SeaTunnelApiExample.class.

Error Exception

Caused by: java.lang.IllegalArgumentException: A single split source allows only one single reader to be created. Please make sure source parallelism = 1
	at org.apache.seatunnel.shade.com.google.common.base.Preconditions.checkArgument(Preconditions.java:141)
	at org.apache.seatunnel.connectors.seatunnel.common.source.AbstractSingleSplitSource.createReader(AbstractSingleSplitSource.java:34)
	at org.apache.seatunnel.connectors.seatunnel.common.source.AbstractSingleSplitSource.createReader(AbstractSingleSplitSource.java:28)
	at org.apache.seatunnel.translation.source.ParallelSource.<init>(ParallelSource.java:106)
	... 30 more

Zeta or Flink or Spark Version

No response

Java or Scala Version

No response

Screenshots

No response

Are you willing to submit PR?

  • Yes I am willing to submit a PR!

Code of Conduct

@Aaronx-Gallagher Aaronx-Gallagher changed the title [Bug] [Module Name] Bug title The Source Common Options parallelism not work [Bug] [Connector-v2] Bug title The Source Common Options parallelism not work May 12, 2025
@CosmosNi
Contributor

The HTTP source inherits from the AbstractSingleSplitSource class, which only allows a single reader instance to be created and therefore enforces a parallelism of 1. To resolve this, you need to set the environment parallelism to 1 as well.
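
For context, the guard that produces the error in the reported stack trace behaves roughly like this. It is a simplified sketch: only the class name in the trace, the method name createReader, and the error message come from the report, everything else (the reader type and method signature) is illustrative:

import static org.apache.seatunnel.shade.com.google.common.base.Preconditions.checkArgument;

// Simplified sketch of the single-split guard suggested by the reported stack trace.
// The reader type and method signature are illustrative, not SeaTunnel's exact API.
public abstract class SingleSplitSourceSketch<T> {

    // A single-split source has exactly one split, so only reader index 0 with
    // parallelism 1 is allowed; anything else fails fast with the reported error.
    public Reader<T> createReader(int subtaskIndex, int parallelism) {
        checkArgument(
                subtaskIndex == 0 && parallelism == 1,
                "A single split source allows only one single reader to be created. "
                        + "Please make sure source parallelism = 1");
        return createSingleReader();
    }

    protected abstract Reader<T> createSingleReader();

    public interface Reader<R> {}
}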

@Aaronx-Gallagher
Author

Aaronx-Gallagher commented May 12, 2025 via email

@CosmosNi
Contributor

Not all sources support setting parallelism.

Image

@Aaronx-Gallagher
Author

Aaronx-Gallagher commented May 12, 2025 via email

@CosmosNi
Contributor

Because this source does not support setting parallelism, the parallelism = 1 you configured has no effect.

@Aaronx-Gallagher
Author

Aaronx-Gallagher commented May 13, 2025 via email

@joexjx

joexjx commented May 14, 2025

This should be a bug. I checked the source code and found that when sourcePluginExecuteProcessor calls the execute method, both the Flink and Spark engines set the parallelism on the environment, giving priority to the parallelism value in sourcePluginConfig and falling back to EnvCommonOptions only when it is not set. When I ran a job with your SeaTunnel config on the Flink execution engine, I did not hit the "A single split source allows only one single reader to be created" error; with the Spark execution engine, the same error occurred. The cause is a problem in the logic that sets the parallelism value on sparkRuntimeEnvironment inside the execute method of the Spark execution path. I have already fixed it.
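
For reference, the intended priority order described above is roughly the following. This is a minimal sketch with hypothetical class and method names working on plain maps; it is not the actual SeaTunnel code, which operates on SeaTunnel's own configuration objects:

import java.util.Map;

// Minimal sketch of the resolution order described above, assuming hypothetical
// names: the source-level "parallelism" option wins, the env-level value is the
// fallback. This is not the actual SeaTunnel implementation.
final class ParallelismPrioritySketch {

    static int resolveParallelism(Map<String, Object> sourcePluginConfig,
                                  Map<String, Object> envOptions,
                                  int engineDefault) {
        Object fromSource = sourcePluginConfig.get("parallelism");
        if (fromSource != null) {
            // Source Common Options take priority.
            return ((Number) fromSource).intValue();
        }
        Object fromEnv = envOptions.get("parallelism");
        if (fromEnv != null) {
            // Otherwise fall back to EnvCommonOptions.
            return ((Number) fromEnv).intValue();
        }
        return engineDefault;
    }

    public static void main(String[] args) {
        // Mirrors the config in this issue: env parallelism = 5, source parallelism = 1.
        System.out.println(resolveParallelism(
                Map.of("parallelism", 1), Map.of("parallelism", 5), 1)); // prints 1
    }
}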

Can you assign this fix to me? I will PR the changes into the dev branch. @davidzollo

@Aaronx-Gallagher
Author

Aaronx-Gallagher commented May 14, 2025 via email

joexjx added a commit to joexjx/seatunnel that referenced this issue May 14, 2025
…ng with Spark Engine (apache#9302)

This bug was caused by EnvCommonOptions overriding SourceCommonOptions when setting the parallelism in the sparkRuntimeEnvironment.
joexjx added a commit to joexjx/seatunnel that referenced this issue May 16, 2025
…ng with Spark Engine (apache#9302)

This bug was caused by EnvCommonOptions overriding SourceCommonOptions when setting the parallelism in the sparkRuntimeEnvironment.