[Bug] [Connector-v2] Bug title The Source Common Options parallelism not work #9302
Comments
HTTP Source inherits from the AbstractSingleSplitSource class, which only allows a single reader instance to be created and enforces a parallelism of 1. To solve this issue, you need to set the environment parallelism to 1 as well.
That's true. So does the parallelism set in the source have no effect? For certain reasons, it is not convenient for me to change the settings in env.
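To illustrate the constraint described above, here is a minimal sketch of how a single-split source might refuse to create more than one reader. The class `SingleReaderGuard`, its nested `Reader` type, and the `subtaskIndex` parameter are hypothetical stand-ins, not SeaTunnel's actual API; only the error message is taken from this thread.

```java
// Hypothetical sketch: not the real AbstractSingleSplitSource implementation.
final class SingleReaderGuard {

    /** Hypothetical stand-in for a source reader; not a SeaTunnel type. */
    static final class Reader {
        final int subtaskIndex;

        Reader(int subtaskIndex) {
            this.subtaskIndex = subtaskIndex;
        }
    }

    /**
     * Creates a reader only for the first subtask (index 0).
     * Any other index means the engine is running the source with
     * a parallelism greater than 1, which this kind of source rejects.
     */
    static Reader createReader(int subtaskIndex) {
        if (subtaskIndex != 0) {
            throw new IllegalArgumentException(
                    "A single split source allows only one single reader to be created.");
        }
        return new Reader(subtaskIndex);
    }
}
```

Under this assumption, a job that runs the source with an effective parallelism of 5 would fail as soon as a second subtask asks for a reader, which matches the error reported later in the thread.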
OK, I got it. The description in the documentation may be incorrect: it says that the parallelism in env will be replaced by the parallelism of the source, but in practice the opposite happens and the env value replaces the parallelism of the source. I hope you can take a look.
Not all sources support setting parallelism.
Because the source does not support setting parallelism, the parallelism = 1 you configured has no effect.
OK, thanks for your reply.
This should be a bug. I checked the source code and found that when sourcePluginExecuteProcessor calls the execute method, both the Flink and Spark engines set the parallelism in the Environment. Priority is given to the parallelism value from the sourcePluginConfig; only if it is not set there is the value read from EnvCommonOptions. When I ran a job using your SeaTunnel config with the Flink execution engine, I did not encounter the “A single split source allows only one single reader to be created” error. However, when running with the Spark execution engine, the same error occurred. It turns out that there was a problem with the logic that sets the parallelism value in sparkRuntimeEnvironment within the execute method under Spark execution. I have already fixed it. Can you assign this fix to me? I will open a PR against the dev branch. @davidzollo
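The intended precedence described in the comment above (source-level parallelism first, env-level parallelism as the fallback) can be sketched as follows. This is only an illustration under stated assumptions: `ParallelismResolver`, the plain `Map` configs, and the default of 1 are hypothetical and do not reproduce the actual SeaTunnel classes or the code in the fix.

```java
import java.util.Map;

// Hypothetical sketch of the precedence rule; not SeaTunnel's real code.
final class ParallelismResolver {

    static final String PARALLELISM = "parallelism";

    // Assumed fallback when neither config sets a value.
    static final int DEFAULT_PARALLELISM = 1;

    /**
     * Returns the source-level parallelism if present,
     * otherwise the env-level parallelism,
     * otherwise the assumed default.
     */
    static int resolve(Map<String, Integer> sourceConfig, Map<String, Integer> envConfig) {
        Integer fromSource = sourceConfig.get(PARALLELISM);
        if (fromSource != null) {
            return fromSource;
        }
        Integer fromEnv = envConfig.get(PARALLELISM);
        return fromEnv != null ? fromEnv : DEFAULT_PARALLELISM;
    }
}
```

With the configuration from this issue (5 in env, 1 in the source), this rule yields 1. The behavior reported for the Spark engine corresponds to the env value winning instead.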
Thank you for your attention.
Okay, it's actually quite simple. You just need to move the line of code indicated by the red arrow above the line indicated by the blue arrow. It seems to be in a file called SourceExecuteProcessor.java.
…ng with Spark Engine (apache#9302) This bug was caused by EnvCommonOptions overriding SourceCommonOptions when setting the parallelism in the sparkRuntimeEnvironment.
Search before asking
What happened
I set parallelism to 5 in env and to 1 in the Source. However, the result shows that the parallelism was not changed to 1 but remained at 5. Is this a bug, or am I using it incorrectly?
This option is described on the Source Common Options page of the official documentation.
I then repeated the experiment with Doris and FakeSource, but the parameter still had no effect.
I found the relevant parallelism settings in the code, and it seems that the value is not read from the Source but only from the Env.
SeaTunnel Version
The SeaTunnel version is 2.3.10.
SeaTunnel Config
Running Command
I ran it from SeaTunnelApiExample.class.
Error Exception
Zeta or Flink or Spark Version
No response
Java or Scala Version
No response
Screenshots
No response
Are you willing to submit PR?
Code of Conduct