[SPARK-2051] In yarn.ClientBase spark.yarn.dist.* do not work #969
```diff
@@ -21,8 +21,7 @@ import scala.collection.mutable.{ArrayBuffer, HashMap}
 
 import org.apache.spark.SparkConf
 import org.apache.spark.scheduler.InputFormatInfo
-import org.apache.spark.util.IntParam
-import org.apache.spark.util.MemoryParam
+import org.apache.spark.util.{Utils, IntParam, MemoryParam}
 
 // TODO: Add code and support for ensuring that yarn resource 'tasks' are location aware !
```
```diff
@@ -45,6 +44,18 @@ class ClientArguments(val args: Array[String], val sparkConf: SparkConf) {
 
   parseArgs(args.toList)
 
+  // env variable SPARK_YARN_DIST_ARCHIVES/SPARK_YARN_DIST_FILES set in yarn-client then
+  // it should default to hdfs://
+  files = Option(files).getOrElse(sys.env.get("SPARK_YARN_DIST_FILES").orNull)
```
Review comment: again, this is already handled in YarnClientSchedulerBackend. It reads the env variables and passes in --files/--archives without resolveURI-extending them. The issue with that code is that it also looks at spark.yarn.dist.archives and spark.yarn.dist.files and doesn't resolveURI-extend them.
```diff
+  archives = Option(archives).getOrElse(sys.env.get("SPARK_YARN_DIST_ARCHIVES").orNull)
```
Review comment: same as the above comment.
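Aside (not part of the PR): a minimal sketch of what the resolveURI extension discussed in these comments would do to a comma-separated path list. The object name and the expected output are assumptions based on the diff's own comments (scheme-less entries become file: URIs, entries with an explicit scheme are left alone); Utils is private[spark], so the sketch assumes it is compiled inside Spark's package tree.

```scala
// Hypothetical file placed under org.apache.spark so the private[spark]
// Utils object is accessible; not code from the PR.
package org.apache.spark.deploy.yarn

import org.apache.spark.util.Utils

object ResolveUriSketch {
  def main(args: Array[String]): Unit = {
    // A bare local path next to an explicit hdfs:// URI.
    val raw = "/tmp/lookup.dat,hdfs:///user/alice/dict.txt"

    // Assumption: resolveURIs expands scheme-less entries to file: URIs and
    // leaves entries that already carry a scheme untouched.
    val resolved = Utils.resolveURIs(raw)
    println(resolved) // expected: "file:/tmp/lookup.dat,hdfs:///user/alice/dict.txt"
  }
}
```

Without that extension, scheme-less entries may be resolved against the cluster's default filesystem (typically hdfs://) rather than the local one, which is the behavior the spark.yarn.dist.* comments below are addressing.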
```diff
+
+  // spark.yarn.dist.archives/spark.yarn.dist.files defaults to use file:// if not specified,
+  // for both yarn-client and yarn-cluster
+  files = Option(files).getOrElse(sparkConf.getOption("spark.yarn.dist.files").
+    map(p => Utils.resolveURIs(p)).orNull)
+  archives = Option(archives).getOrElse(sparkConf.getOption("spark.yarn.dist.archives").
+    map(p => Utils.resolveURIs(p)).orNull)
+
   private def parseArgs(inputArgs: List[String]): Unit = {
     val userArgsBuffer: ArrayBuffer[String] = new ArrayBuffer[String]()
     val inputFormatMap: HashMap[String, InputFormatInfo] = new HashMap[String, InputFormatInfo]()
```
Review comment: I don't want to add support for the env variables in yarn-cluster mode. We only support them in yarn-client mode for backwards compatibility. Can you remove this?
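Taken together, the change as written gives files/archives the following precedence. The sketch below is a standalone restatement of that chain, not code from the PR: the object and method names are hypothetical, and resolveURIs is passed in as a parameter standing in for the private[spark] Utils.resolveURIs.

```scala
// Hypothetical restatement of the fallback order the diff establishes for
// --files; --archives follows the same pattern.
import org.apache.spark.SparkConf

object DistFilesPrecedenceSketch {
  def resolveFiles(cliValue: Option[String],
                   sparkConf: SparkConf,
                   resolveURIs: String => String): Option[String] = {
    cliValue                                                // 1. explicit --files flag
      .orElse(sys.env.get("SPARK_YARN_DIST_FILES"))         // 2. legacy env variable, not resolveURI-extended (defaults to hdfs://)
      .orElse(sparkConf.getOption("spark.yarn.dist.files")  // 3. spark.yarn.dist.files, resolveURI-extended (defaults to file://)
        .map(resolveURIs))
  }
}
```

Per the review comments above, step 2 would be dropped from ClientArguments and remain handled by YarnClientSchedulerBackend, so the env variables stay yarn-client-only for backwards compatibility.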