[SPARK-37205][YARN] Introduce a new config 'spark.yarn.am.tokenConfRegex' to support renewing delegation tokens in a multi-cluster environment #34635
Conversation
Test build #145343 has finished for PR 34635 at commit
Test build #145345 has finished for PR 34635 at commit
Kubernetes integration test starting
Kubernetes integration test status failure
resource-managers/yarn/src/main/scala/org/apache/spark/deploy/yarn/config.scala
      "'mapreduce.job.send-token-conf'. Please check YARN-5910 for more details.")
    .version("3.3.0")
    .stringConf
    .createWithDefault("")
Since this is a regex, what does an empty-string regex mean here as a default value?
If it's not clear, shall we use .createOptional?
Good suggestion. I think createOptional is better.
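The difference being discussed can be sketched in plain Scala. This is a minimal self-contained model, not Spark's actual `ConfigBuilder` API: with a `""` default every caller must remember that the empty string means "feature disabled", while an `Option`-valued config makes the unset case explicit.

```scala
// Hypothetical sketch of the two default-value styles discussed above.
object ConfigDefaultSketch {
  // createWithDefault("")-style: "" is a magic value meaning "disabled".
  def tokenConfWithDefault(conf: Map[String, String]): String =
    conf.getOrElse("spark.yarn.am.tokenConfRegex", "")

  // createOptional-style: the unset case is an explicit None.
  def tokenConfOptional(conf: Map[String, String]): Option[String] =
    conf.get("spark.yarn.am.tokenConfRegex")

  def main(args: Array[String]): Unit = {
    val unset = Map.empty[String, String]
    assert(tokenConfOptional(unset).isEmpty)      // clearly "feature off"
    assert(tokenConfWithDefault(unset).isEmpty)   // "" must be special-cased everywhere
    val set = Map("spark.yarn.am.tokenConfRegex" -> "^dfs\\..*$")
    assert(tokenConfOptional(set).contains("^dfs\\..*$"))
    println("ok")
  }
}
```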
      "needs to talk to multiple downstream HDFS clusters, where the YARN RM may not have " +
      "configs (e.g., dfs.nameservices, dfs.ha.namenodes.*, dfs.namenode.rpc-address.*) " +
      "to connect to these clusters. This config is very similar to " +
      "'mapreduce.job.send-token-conf'. Please check YARN-5910 for more details.")
We had better mention explicitly that this config is ignored in Hadoop 2.7, because we still have a Hadoop 2.7 distribution.
Yea I missed that, added.
    .createWithDefault(false)

  private[spark] val AM_SEND_TOKEN_CONF =
    ConfigBuilder("spark.yarn.am.sendTokenConf")
nit: spark.yarn.am.tokenConf instead of spark.yarn.am.sendTokenConf? sendTokenConf sounds like a boolean config, like send or not send.
I don't know what a good name for this is and just followed the Hadoop-side config name. send here is supposed to mean that the token conf is sent from the AM to the RM.
Use regexConf? Also, add .regex to the config name? (in addition to @dongjoon-hyun's suggestion for the rename).
Take a look at spark.redaction.string.regex for an example.
Thanks, I updated the config name to spark.yarn.am.tokenConfRegex. Let me know if this looks better.
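To illustrate how a value of `spark.yarn.am.tokenConfRegex` selects entries, here is a hypothetical helper (the key names are real HDFS configs, but this is an illustrative sketch, not Spark's implementation): the regex is matched against each Hadoop config key, and only matching entries would be forwarded from the AM to the RM for token renewal.

```scala
import java.util.regex.Pattern

// Illustrative sketch: filter Hadoop config entries by a key regex.
object TokenConfRegexSketch {
  def selectTokenConf(hadoopConf: Map[String, String], regex: String): Map[String, String] = {
    val pattern = Pattern.compile(regex)
    hadoopConf.filter { case (key, _) => pattern.matcher(key).find() }
  }

  def main(args: Array[String]): Unit = {
    val conf = Map(
      "dfs.nameservices" -> "ns1",
      "dfs.ha.namenodes.ns1" -> "nn1,nn2",
      "dfs.namenode.rpc-address.ns1.nn1" -> "host1:8020",
      "spark.app.name" -> "demo") // unrelated key, should not be selected
    val selected = selectTokenConf(
      conf, "^dfs.nameservices$|^dfs.namenode.rpc-address.*$|^dfs.ha.namenodes.*$")
    assert(selected.size == 3)
    assert(!selected.contains("spark.app.name"))
    println("ok")
  }
}
```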
      }
    }
    copy.write(dob);
    amContainer.setTokensConf(ByteBuffer.wrap(dob.getData))
Just a question. The compilation works with Hadoop 2.7, right?
Oops, it won't work for Hadoop 2.7. Hmm let me think how to make it work ..
I guess we can follow ResourceRequestHelper and use reflection to look up the method when using Hadoop 3.x, to avoid the compilation error.
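The reflection approach can be sketched roughly as follows. This is a hypothetical helper modeled on the ResourceRequestHelper pattern mentioned above, not the PR's actual code: `setTokensConf(ByteBuffer)` is resolved at runtime, so the call site compiles even against Hadoop 2.7, where `ContainerLaunchContext` lacks the method.

```scala
import java.nio.ByteBuffer

// Sketch: invoke setTokensConf reflectively so older Hadoop still compiles.
object SetTokensConfSketch {
  def trySetTokensConf(amContainer: AnyRef, payload: ByteBuffer): Boolean =
    try {
      val method = amContainer.getClass.getMethod("setTokensConf", classOf[ByteBuffer])
      method.invoke(amContainer, payload)
      true
    } catch {
      case _: NoSuchMethodException => false // e.g. Hadoop 2.7: method absent, skip
    }

  // Stand-ins for old/new ContainerLaunchContext, just for the demo below.
  class OldContext
  class NewContext {
    var tokensConf: ByteBuffer = _
    def setTokensConf(b: ByteBuffer): Unit = { tokensConf = b }
  }

  def main(args: Array[String]): Unit = {
    val payload = ByteBuffer.wrap(Array[Byte](1, 2, 3))
    assert(!trySetTokensConf(new OldContext, payload)) // method missing
    val ctx = new NewContext
    assert(trySetTokensConf(ctx, payload))             // method found and invoked
    assert(ctx.tokensConf == payload)
    println("ok")
  }
}
```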
dongjoon-hyun left a comment:
Thank you, @sunchao . It looks reasonable.
Sorry, but I need to ask if you think we can add test coverage for this.
Test build #145369 has finished for PR 34635 at commit
Kubernetes integration test starting
Kubernetes integration test status failure
dongjoon-hyun left a comment:
+1, LGTM. I agree with you about the test case. Thanks, @sunchao .
    }
    copy.write(dob);

    // since this method was added in Hadoop 2.9 and 3.0, we use reflection here to avoid
Are we doing this only for 3.x? If not, relax the isHadoop3 condition?
Yes, since Spark only supports built-in Hadoop 2.7 or 3.3, we have the check here. Do you mean supporting custom Hadoop versions 2.9+ too, with -Phadoop.version=2.9.x?
Exactly - both 2.9 and 2.10 for example.
Got it. I added the change.
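The relaxed version check discussed here can be sketched as a small predicate. This is a hypothetical helper, not the PR's actual code: it accepts any Hadoop 2.9+ or 3.x version string, since `setTokensConf` was added in Hadoop 2.9 and 3.0.

```scala
// Sketch: does this Hadoop version support ContainerLaunchContext.setTokensConf?
object HadoopVersionSketch {
  def supportsSetTokensConf(version: String): Boolean = {
    val parts = version.split("\\.")
    val (major, minor) = (parts(0).toInt, parts(1).toInt)
    major > 2 || (major == 2 && minor >= 9)
  }

  def main(args: Array[String]): Unit = {
    assert(!supportsSetTokensConf("2.7.4")) // built-in 2.7 profile: unsupported
    assert(supportsSetTokensConf("2.9.2"))  // custom 2.9 build: supported
    assert(supportsSetTokensConf("2.10.1")) // custom 2.10 build: supported
    assert(supportsSetTokensConf("3.3.1"))  // built-in 3.3 profile: supported
    println("ok")
  }
}
```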
Gently ping @mridulm . Does the latest change look good to you?
resource-managers/yarn/src/main/scala/org/apache/spark/deploy/yarn/config.scala
Test build #145464 has finished for PR 34635 at commit
Kubernetes integration test starting
Kubernetes integration test status failure
Could you review this once more please, @mridulm?
Test build #145752 has finished for PR 34635 at commit
Kubernetes integration test starting
Kubernetes integration test status failure
@dongjoon-hyun could you help double-check the new changes and see if they look good to you? If so, I'm going to merge this soon. Thanks.
dongjoon-hyun left a comment:
+1, LGTM. New changes look good to me, @sunchao .
Merged, thanks!
[SPARK-37205][YARN] Introduce a new config 'spark.yarn.am.tokenConfRegex' to support renewing delegation tokens in a multi-cluster environment (apache#1300)

What changes were proposed in this pull request?

This adds a new config `spark.yarn.am.tokenConfRegex` which is similar to `mapreduce.job.send-token-conf` introduced via [YARN-5910](https://issues.apache.org/jira/browse/YARN-5910). It is used by the YARN AM to pass Hadoop configs, such as `dfs.nameservices`, `dfs.ha.namenodes.*`, `dfs.namenode.rpc-address.*`, etc., to the RM for renewing delegation tokens.

Why are the changes needed?

[YARN-5910](https://issues.apache.org/jira/browse/YARN-5910) introduced a new config `mapreduce.job.send-token-conf` which can be used to pass a job's local configuration to the RM, which uses it when renewing delegation tokens. A typical use case is when a YARN cluster needs to talk to multiple HDFS clusters, where the RM may not have all the configs (e.g., `dfs.nameservices`, `dfs.ha.namenodes.<nameservice>.*`, `dfs.namenode.rpc-address`) to connect to these clusters when renewing delegation tokens. In this case, clients can use this feature to pass their local HDFS configs to the RM.

Does this PR introduce any user-facing change?

Yes, a new config `spark.yarn.am.tokenConfRegex` will be introduced to Spark users. By default it is disabled.

How was this patch tested?

It seems difficult to come up with a unit test for this. I manually tested it against a YARN cluster with Hadoop version 3.x and it worked as expected.

```
$SPARK_HOME/bin/spark-shell --master yarn \
  --deploy-mode client \
  --conf spark.driver.extraClassPath="${HADOOP_CONF_DIR}" \
  --conf spark.executor.extraClassPath="${HADOOP_CONF_DIR}" \
  --conf spark.yarn.am.tokenConfRegex="^dfs.nameservices$|^dfs.namenode.rpc-address.*$|^dfs.ha.namenodes.*$|^dfs.client.failover.proxy.provider.*$|^dfs.namenode.kerberos.principal|^dfs.namenode.kerberos.principal.pattern" \
  --conf spark.yarn.access.hadoopFileSystems="<HDFS_URI>"
```

Closes apache#34635 from sunchao/SPARK-37205.

Authored-by: Chao Sun <[email protected]>
Signed-off-by: Chao Sun <[email protected]>