[SPARK-21890] Credentials not being passed to add the tokens #19103
Conversation
|
Mind fixing the title to be, |
|
ok to test |
|
@redsanket please submit a PR against master as well. I doubt the master patch would cherry-pick back cleanly since all of that code has changed, so you can leave this one open as well. |
|
I think the tmpCreds here was just an optimization so we didn't have to look at all the creds for the renewer, so I think this is fine. @jerryshao do you remember anything else where it explicitly required not passing the old creds? |
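For reference, roughly the shape of the 2.2 code being discussed (a simplified sketch, not the exact source; `obtainTokens` and its parameters are illustrative): collecting into a fresh `tmpCreds` means `addDelegationTokens` never sees the tokens the caller already holds, so it always tries to fetch new ones.

```scala
import org.apache.hadoop.conf.Configuration
import org.apache.hadoop.fs.Path
import org.apache.hadoop.security.Credentials

// Tokens are collected into a fresh tmpCreds so only newly fetched tokens
// are merged back; the caller's existing tokens are never consulted.
def obtainTokens(
    hadoopConf: Configuration,
    nnsToAccess: Set[Path],
    renewer: String,
    creds: Credentials): Unit = {
  val tmpCreds = new Credentials()
  nnsToAccess.foreach { dst =>
    val dstFs = dst.getFileSystem(hadoopConf)
    dstFs.addDelegationTokens(renewer, tmpCreds) // empty creds => always fetches
  }
  creds.addAll(tmpCreds)
}
```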
|
Test build #81317 has finished for PR 19103 at commit
|
|
This needs to be opened against master first. |
So the problem is that Oozie will get tokens for Spark instead of letting Spark do it itself, and in the Oozie launcher we should not let Spark fetch them again. From my understanding of your issue, this seems like a more general problem with how the Oozie launcher and Spark manage tokens. This patch looks like it only addresses the HDFS issue; how do we handle Hive/HBase, which look like they still have issues? |
|
In general it feels like this code shouldn't even be running if the current user doesn't have a TGT to start with. But this patch restores the behavior from Spark 2.1, so if the PR is opened against master it should be ok to merge. |
|
@tgravescs, I think it is in |
|
That's when using principal / keytab and generating new tokens; it's separate from the code path being changed here. The initial tokens are obtained in |
|
@vanzin From my understanding, it seems like a workaround to avoid issuing new HDFS tokens (since this user's credentials already include HDFS tokens). But how do we handle the HBase/Hive side without a TGT? |
|
I don't know; perhaps they'll fail, which is why I think the correct behavior would be to skip this credential manager code altogether if a TGT doesn't exist. But that would at least be the same behavior as Spark 2.1, while the behavior in the HDFS provider has definitely changed. |
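A minimal sketch of the kind of guard being suggested, using only the standard Hadoop `UserGroupInformation` API (the decision logic itself is hypothetical, not code from the patch):

```scala
import org.apache.hadoop.security.UserGroupInformation

// A process launched with only delegation tokens (like the Oozie launcher)
// cannot obtain further tokens; that requires a TGT or keytab login.
val ugi = UserGroupInformation.getCurrentUser
val hasTgt = ugi.hasKerberosCredentials    // true for TGT/keytab logins
val hasTokens = !ugi.getTokens.isEmpty     // delegation tokens already present
if (!hasTgt && hasTokens) {
  // Oozie-launcher style environment: skip token fetching, reuse what we have.
}
```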
|
Hive and HBase token fetching can be turned off (i.e. spark.yarn.security.tokens.hive.enabled=false). I thought they didn't work the same as the core HDFS one as far as not fetching a token when you already have one, but I would need to check. You tell Oozie to get those before launching via the Oozie credentials configuration. |
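For illustration, a hedged sketch of setting those flags programmatically; the hive key comes straight from the comment above, and the hbase key is an assumed analog following the same naming pattern.

```scala
import org.apache.spark.SparkConf

// Disable Spark's own Hive/HBase token providers so the tokens Oozie
// obtained up front are the only ones used.
val conf = new SparkConf()
  .set("spark.yarn.security.tokens.hive.enabled", "false")  // named in the comment above
  .set("spark.yarn.security.tokens.hbase.enabled", "false") // assumed analog
```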
|
BTW Oozie can also disable the HDFS provider via configuration. This one looks ok, but it really needs to be opened against master instead. |
|
Yes, the user stated he will be opening one for master, but that one is quite a bit different due to the credentials code moving around, so I think this one will have to stay open anyway. But I agree we should definitely look at the master one first. |
|
@vanzin @tgravescs sorry for the delay; I put up a PR against master just for the workaround (#19140), and we can move further discussion about the suggested improvements there. |
|
Jenkins, test this please |
|
Test build #81522 has finished for PR 19103 at commit
|
|
Looks good (since the master PR didn't merge). Merging to 2.2. @redsanket please close the PR manually. |
What changes were proposed in this pull request?
I observed this while running an Oozie job trying to connect to HBase via Spark.
It looks like the creds are not being passed in https://github.com/apache/spark/blob/branch-2.2/resource-managers/yarn/src/main/scala/org/apache/spark/deploy/yarn/security/HadoopFSCredentialProvider.scala#L53 for the 2.2 release.
More info on why it fails on a secure grid:
The Oozie client gets the necessary tokens the application needs before launching. It passes those tokens along to the Oozie launcher job (an MR job), which then actually calls the Spark client to launch the Spark app, passing the tokens along.
The Oozie launcher job cannot get any more tokens because all it has is tokens (you can't get tokens with tokens; you need a TGT or a keytab).
The error occurs because the launcher job runs the Spark client to submit the Spark job, but the Spark client doesn't see that it already has the HDFS tokens, so it tries to get more, which ends in the exception.
SPARK-19021 generalized the HDFS credentials provider and changed it so that we no longer pass the existing credentials into the call to get tokens, so it doesn't realize it already has the necessary tokens.
https://issues.apache.org/jira/browse/SPARK-21890
Modified the provider to pass the current credentials into the call that obtains delegation tokens.
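A minimal sketch of the shape of the fix, assuming only the standard Hadoop FileSystem API (the helper name and signature are illustrative, not the exact patch): because `addDelegationTokens` only contacts the NameNode for filesystems that have no token in the supplied `Credentials`, passing the caller's existing creds makes the call a no-op when Oozie has already fetched the tokens.

```scala
import org.apache.hadoop.fs.FileSystem
import org.apache.hadoop.security.Credentials

// Passing the caller's existing creds (instead of a fresh Credentials())
// lets addDelegationTokens skip filesystems we already hold tokens for.
def fetchDelegationTokens(
    renewer: String,
    filesystems: Set[FileSystem],
    creds: Credentials): Credentials = {
  filesystems.foreach { fs =>
    fs.addDelegationTokens(renewer, creds)
  }
  creds
}
```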
How was this patch tested?
Manual testing on our secure cluster