Commit 1a8ea00
[SPARK-17210][SPARKR] sparkr.zip is not distributed to executors when running sparkr in RStudio
## What changes were proposed in this pull request?
Spark will add sparkr.zip to archive only when it is yarn mode (SparkSubmit.scala).
```
if (args.isR && clusterManager == YARN) {
val sparkRPackagePath = RUtils.localSparkRPackagePath
if (sparkRPackagePath.isEmpty) {
printErrorAndExit("SPARK_HOME does not exist for R application in YARN mode.")
}
val sparkRPackageFile = new File(sparkRPackagePath.get, SPARKR_PACKAGE_ARCHIVE)
if (!sparkRPackageFile.exists()) {
printErrorAndExit(s"$SPARKR_PACKAGE_ARCHIVE does not exist for R application in YARN mode.")
}
val sparkRPackageURI = Utils.resolveURI(sparkRPackageFile.getAbsolutePath).toString
// Distribute the SparkR package.
// Assigns a symbol link name "sparkr" to the shipped package.
args.archives = mergeFileLists(args.archives, sparkRPackageURI + "#sparkr")
// Distribute the R package archive containing all the built R packages.
if (!RUtils.rPackages.isEmpty) {
val rPackageFile =
RPackageUtils.zipRLibraries(new File(RUtils.rPackages.get), R_PACKAGE_ARCHIVE)
if (!rPackageFile.exists()) {
printErrorAndExit("Failed to zip all the built R packages.")
}
val rPackageURI = Utils.resolveURI(rPackageFile.getAbsolutePath).toString
// Assigns a symbol link name "rpkg" to the shipped package.
args.archives = mergeFileLists(args.archives, rPackageURI + "#rpkg")
}
}
```
So it is necessary to pass spark.master from R process to JVM. Otherwise sparkr.zip won't be distributed to executor. Besides that I also pass spark.yarn.keytab/spark.yarn.principal to spark side, because JVM process need them to access secured cluster.
## How was this patch tested?
Verify it manually in R Studio using the following code.
```
Sys.setenv(SPARK_HOME="/Users/jzhang/github/spark")
.libPaths(c(file.path(Sys.getenv(), "R", "lib"), .libPaths()))
library(SparkR)
sparkR.session(master="yarn-client", sparkConfig = list(spark.executor.instances="1"))
df <- as.DataFrame(mtcars)
head(df)
```
…
Author: Jeff Zhang <[email protected]>
Closes #14784 from zjffdu/SPARK-17210.
(cherry picked from commit f62ddc5)
Signed-off-by: Felix Cheung <[email protected]>1 parent d3f90e7 commit 1a8ea00
2 files changed
+19
-0
lines changed| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
491 | 491 | | |
492 | 492 | | |
493 | 493 | | |
| 494 | + | |
| 495 | + | |
| 496 | + | |
| 497 | + | |
494 | 498 | | |
495 | 499 | | |
496 | 500 | | |
| |||
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
62 | 62 | | |
63 | 63 | | |
64 | 64 | | |
| 65 | + | |
| 66 | + | |
| 67 | + | |
| 68 | + | |
| 69 | + | |
| 70 | + | |
| 71 | + | |
| 72 | + | |
| 73 | + | |
| 74 | + | |
| 75 | + | |
| 76 | + | |
| 77 | + | |
| 78 | + | |
| 79 | + | |
65 | 80 | | |
66 | 81 | | |
67 | 82 | | |
| |||
0 commit comments