Commit 370453a
committed
### What changes were proposed in this pull request?
This PR aims to support `spark.test.master` in `SparkSubmitArguments`.
### Why are the changes needed?
To allow users to control the default master setting during testing and documentation generation.
#### First, currently, we cannot build `Python Documentation` on M3 Max (and high-core machines) without this. Only it succeeds on GitHub Action runners (4 cores) or equivalent low-core docker run. Please try the following on your Macs.
**BEFORE**
```
$ build/sbt package -Phive-thriftserver
$ cd python/docs
$ make html
...
java.lang.OutOfMemoryError: Java heap space
...
24/09/16 14:09:55 WARN PythonRunner: Incomplete task 7.0 in stage 30 (TID 177) interrupted: Attempting to kill Python Worker
...
make: *** [html] Error 2
```
**AFTER**
```
$ build/sbt package -Phive-thriftserver
$ cd python/docs
$ JDK_JAVA_OPTIONS="-Dspark.test.master=local[1]" make html
...
build succeeded.
The HTML pages are in build/html.
```
#### Second, in general, we can control all `SparkSubmit` (eg. Spark Shells) like the following.
**BEFORE (`local[*]`)**
```
$ bin/pyspark
Python 3.9.19 (main, Jun 17 2024, 15:39:29)
[Clang 15.0.0 (clang-1500.3.9.4)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
WARNING: Using incubator modules: jdk.incubator.vector
Using Spark's default log4j profile: org/apache/spark/log4j2-pattern-layout-defaults.properties
Setting default log level to "WARN".
To adjust logging level use sc.setLogLevel(newLevel). For SparkR, use setLogLevel(newLevel).
24/09/16 13:53:02 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Welcome to
____ __
/ __/__ ___ _____/ /__
_\ \/ _ \/ _ `/ __/ '_/
/__ / .__/\_,_/_/ /_/\_\ version 4.0.0-SNAPSHOT
/_/
Using Python version 3.9.19 (main, Jun 17 2024 15:39:29)
Spark context Web UI available at http://localhost:4040
Spark context available as 'sc' (master = local[*], app id = local-1726519982935).
SparkSession available as 'spark'.
>>>
```
**AFTER (`local[1]`)**
```
$ JDK_JAVA_OPTIONS="-Dspark.test.master=local[1]" bin/pyspark
NOTE: Picked up JDK_JAVA_OPTIONS: -Dspark.test.master=local[1]
Python 3.9.19 (main, Jun 17 2024, 15:39:29)
[Clang 15.0.0 (clang-1500.3.9.4)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
NOTE: Picked up JDK_JAVA_OPTIONS: -Dspark.test.master=local[1]
NOTE: Picked up JDK_JAVA_OPTIONS: -Dspark.test.master=local[1]
WARNING: Using incubator modules: jdk.incubator.vector
Using Spark's default log4j profile: org/apache/spark/log4j2-pattern-layout-defaults.properties
Setting default log level to "WARN".
To adjust logging level use sc.setLogLevel(newLevel). For SparkR, use setLogLevel(newLevel).
24/09/16 13:51:03 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Welcome to
____ __
/ __/__ ___ _____/ /__
_\ \/ _ \/ _ `/ __/ '_/
/__ / .__/\_,_/_/ /_/\_\ version 4.0.0-SNAPSHOT
/_/
Using Python version 3.9.19 (main, Jun 17 2024 15:39:29)
Spark context Web UI available at http://localhost:4040
Spark context available as 'sc' (master = local[1], app id = local-1726519863363).
SparkSession available as 'spark'.
>>>
```
### Does this PR introduce _any_ user-facing change?
No. `spark.test.master` is a new parameter.
### How was this patch tested?
Manual tests.
### Was this patch authored or co-authored using generative AI tooling?
No.
Closes #48126 from dongjoon-hyun/SPARK-49678.
Authored-by: Dongjoon Hyun <[email protected]>
Signed-off-by: Dongjoon Hyun <[email protected]>
1 parent 294af6e commit 370453a
File tree
1 file changed
+2
-1
lines changed- core/src/main/scala/org/apache/spark/deploy
1 file changed
+2
-1
lines changedLines changed: 2 additions & 1 deletion
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
43 | 43 | | |
44 | 44 | | |
45 | 45 | | |
46 | | - | |
| 46 | + | |
| 47 | + | |
47 | 48 | | |
48 | 49 | | |
49 | 50 | | |
| |||
0 commit comments