[SPARK-49678][CORE] Support spark.test.master in SparkSubmitArguments
#48126
Conversation
Could you review this PR, @viirya ?

Thank you, @viirya !

Merged to master.
 // Global defaults. These should be keep to minimum to avoid confusing behavior.
-def master: String = maybeMaster.getOrElse("local[*]")
+def master: String =
+  maybeMaster.getOrElse(System.getProperty("spark.test.master", "local[*]"))
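The fallback chain in the diff can be illustrated with a small standalone sketch. This is a hypothetical Java rendering of the Scala one-liner, not Spark's actual code; the class and method names here are made up for illustration:

```java
public class MasterDefault {
    // Sketch of the resolution order in the patch: an explicitly supplied
    // master wins, otherwise the spark.test.master system property,
    // otherwise the historical hard-coded default "local[*]".
    static String master(String maybeMaster) {
        if (maybeMaster != null) {
            return maybeMaster;
        }
        return System.getProperty("spark.test.master", "local[*]");
    }

    public static void main(String[] args) {
        // Start from a clean state so the behavior below is deterministic.
        System.clearProperty("spark.test.master");
        // With no property set, the historical default is preserved.
        System.out.println(master(null));
        // Simulate passing -Dspark.test.master=local[1] on the JVM command line.
        System.setProperty("spark.test.master", "local[1]");
        System.out.println(master(null));
        // An explicitly supplied master is never overridden by the test property.
        System.out.println(master("yarn"));
    }
}
```

The key point of the design is that the property only changes the *default*: any caller that already sets a master, on the command line or programmatically, is unaffected.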
The change is fine but just curious. Why can't we just set spark.master instead? The Python documentation build actually works fine on my Mac, FWIW.
Yes, we can change it to accept spark.master here.
Why can't we just set spark.master instead?
For the following questions, I guess you already have #48129 to limit the number of SparkSubmit instances, which mitigates the situation. This PR and #48129 aim at two different layers (Spark submission and Spark local executor cores) during Python documentation generation. This PR is still valid for many-core machines like c6i.24xlarge.
The Python documentation build actually works fine in my mac fwiw.
BTW, the reason why I didn't use spark.master directly is that it could have some side effects on the K8s side. Let me check that part to make sure, @HyukjinKwon .
I made a follow-up to address the comments, @HyukjinKwon . After looking at the code, it seems to be okay. Currently, the follow-up PR is running the CIs. If CI passes, it will be okay. Thank you for the suggestion.
To @HyukjinKwon , I closed the above follow-up and kept this original PR. For the comment (#48134 (comment)), this PR has more contributions on top of Python doc generation. For example,

Sure, that's fine! Thanks for taking a look.

Thank you. I'll keep this as a test setting for now~ We can revisit this later.

What changes were proposed in this pull request?
This PR aims to support
spark.test.master in SparkSubmitArguments.

Why are the changes needed?
To allow users to control the default master setting during testing and documentation generation.
First, currently, we cannot build the Python documentation on M3 Max (and other high-core machines) without this. It only succeeds on GitHub Actions runners (4 cores) or an equivalent low-core docker run. Please try the following on your Macs.

BEFORE

AFTER
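Since SparkSubmitArguments reads the value via System.getProperty, one plausible way to inject it into anything that goes through the launcher scripts is SPARK_SUBMIT_OPTS. This invocation is a sketch under that assumption, not a command taken from the PR:

```shell
# Sketch only: SPARK_SUBMIT_OPTS is the conventional hook for extra JVM
# options in Spark's launcher scripts; whether the doc build uses exactly
# this mechanism is an assumption here.
export SPARK_SUBMIT_OPTS="-Dspark.test.master=local[1]"
# ./bin/spark-shell   # would then default to master local[1] instead of local[*]
```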
Second, in general, we can control all SparkSubmit invocations (e.g. Spark Shells) like the following.

BEFORE (local[*])

AFTER (local[1])

Does this PR introduce any user-facing change?
No.
spark.test.master is a new parameter.

How was this patch tested?
Manual tests.
Was this patch authored or co-authored using generative AI tooling?
No.