Conversation

@redsanket commented Sep 1, 2017

What changes were proposed in this pull request?

I observed this while running an Oozie job that connects to HBase via Spark.
It looks like the creds are not being passed in the call at https://github.com/apache/spark/blob/branch-2.2/resource-managers/yarn/src/main/scala/org/apache/spark/deploy/yarn/security/HadoopFSCredentialProvider.scala#L53 for the 2.2 release.
More info on why it fails on a secure grid:
The Oozie client gets the necessary tokens the application needs before launching. It passes those tokens along to the Oozie launcher job (an MR job), which then actually calls the Spark client to launch the Spark app and passes the tokens along.
The Oozie launcher job cannot get any more tokens because all it has is tokens (you can't get tokens with tokens; you need a TGT or keytab).
The error here occurs because the launcher job runs the Spark client to submit the Spark job, but the Spark client doesn't see that it already has the HDFS tokens, so it tries to get more, which ends with the exception.
SPARK-19021 generalized the HDFS credentials provider and, in doing so, stopped passing the existing credentials into the call that gets tokens, so the provider doesn't realize it already has the necessary tokens.

https://issues.apache.org/jira/browse/SPARK-21890
Modified the provider to pass the existing creds when getting delegation tokens.
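
For illustration, a minimal sketch of the idea, assuming a hypothetical standalone helper (the names below are illustrative, not the actual Spark code): fetch the delegation tokens into the Credentials object the launcher already holds, so filesystems whose tokens are already present are skipped instead of triggering a new (and failing) token request.

    import org.apache.hadoop.conf.Configuration
    import org.apache.hadoop.fs.Path
    import org.apache.hadoop.security.Credentials

    // Hypothetical helper, not the actual HadoopFSCredentialProvider code.
    object ObtainHadoopFsTokensSketch {
      def obtainTokens(
          hadoopConf: Configuration,
          filesystems: Set[Path],
          renewer: String,
          creds: Credentials): Unit = {
        filesystems.foreach { fsPath =>
          val fs = fsPath.getFileSystem(hadoopConf)
          // addDelegationTokens only requests a token for this FS if `creds`
          // does not already contain one for its service, which matches the
          // Oozie launcher case (it has tokens but no TGT or keytab).
          fs.addDelegationTokens(renewer, creds)
        }
      }
    }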

How was this patch tested?

Manual testing on our secure cluster

@HyukjinKwon
Member

Mind fixing the title to be [SPARK-xxxx][COMPONENT] Title, and filling in the PR description to describe how this fixes the issue?

@tgravescs
Contributor

ok to test

@tgravescs
Contributor

@redsanket please submit a PR against master as well. I doubt the master fix will cherry-pick back cleanly since all of that code has changed, so you can leave this one open as well.

@tgravescs
Contributor

I think the tmpCreds here was just an optimization so we didn't have to look at all the creds for the renewer, so I think this is fine. @jerryshao, you don't remember anything else that explicitly required not passing the old creds, do you?

@SparkQA commented Sep 1, 2017

Test build #81317 has finished for PR 19103 at commit 7043d98.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

redsanket changed the title from "[SPARK-21890]" to "[SPARK-21890] Credentials not being passed to add the tokens" on Sep 1, 2017
@gatorsmile
Member

cc @vanzin @mgummelt

@vanzin
Contributor

vanzin commented Sep 1, 2017

This needs to be opened against master first.

@jerryshao
Contributor

The Oozie client gets the necessary tokens the application needs before launching. It passes those tokens along to the Oozie launcher job (an MR job), which then actually calls the Spark client to launch the Spark app and passes the tokens along.

The Oozie launcher job cannot get any more tokens because all it has is tokens (you can't get tokens with tokens; you need a TGT or keytab).

The error here occurs because the launcher job runs the Spark client to submit the Spark job, but the Spark client doesn't see that it already has the HDFS tokens, so it tries to get more, which ends with the exception.

So the problem is that Oozie gets the tokens for Spark instead of letting Spark do it itself, and in the Oozie launcher we should not let Spark's YARN Client get tokens itself, since there may be no TGT available in the Oozie launcher.

From my understanding of your issue, this seems like a more general problem with the Oozie launcher and Spark's token management. The patch looks like it only addresses the HDFS issue; how do we handle Hive/HBase? Those look like they would still have issues.

@vanzin
Contributor

vanzin commented Sep 1, 2017

In general it feels like this code shouldn't even be running if the current user doesn't have a TGT to start with.

But this patch restores the behavior from Spark 2.1, so if the PR is opened against master it should be ok to merge.

@jerryshao
Contributor

@tgravescs, I think it is in AMCredentialRenewer that we explicitly create a new Credentials object every time we issue new tokens.

    // HACK:
    // HDFS will not issue new delegation tokens, if the Credentials object
    // passed in already has tokens for that FS even if the tokens are expired (it really only
    // checks if there are tokens for the service, and not if they are valid). So the only real
    // way to get new tokens is to make sure a different Credentials object is used each time to
    // get new tokens and then the new tokens are copied over the current user's Credentials.
    // So:
    // - we login as a different user and get the UGI
    // - use that UGI to get the tokens (see doAs block below)
    // - copy the tokens over to the current user's credentials (this will overwrite the tokens
    // in the current user's Credentials object for this FS).
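
For context, a rough sketch of the pattern that comment describes, with illustrative names (this is not the actual AMCredentialRenewer code): log in from the keytab to get a fresh UGI, fetch tokens into a brand-new Credentials object inside doAs, then copy them over the current user's credentials.

    import java.security.PrivilegedExceptionAction

    import org.apache.hadoop.conf.Configuration
    import org.apache.hadoop.fs.FileSystem
    import org.apache.hadoop.security.{Credentials, UserGroupInformation}

    // Illustrative sketch of the workaround described in the comment above.
    object RenewTokensSketch {
      def renew(principal: String, keytab: String, conf: Configuration): Unit = {
        // Log in "as a different user" from the keytab to get a fresh UGI.
        val freshUgi =
          UserGroupInformation.loginUserFromKeytabAndReturnUGI(principal, keytab)
        val newCreds = freshUgi.doAs(new PrivilegedExceptionAction[Credentials] {
          override def run(): Credentials = {
            // A new Credentials object is required because HDFS will not issue
            // a token if one for that service is already present, even expired.
            val creds = new Credentials()
            FileSystem.get(conf).addDelegationTokens(principal, creds)
            creds
          }
        })
        // Copy the new tokens over the current user's credentials, overwriting
        // the (possibly expired) tokens for the same services.
        UserGroupInformation.getCurrentUser.addCredentials(newCreds)
      }
    }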

@vanzin
Contributor

vanzin commented Sep 1, 2017

That's when using principal / keytab and generating new tokens; it's separate from the code path being changed here. The initial tokens are obtained in Client.scala with the current user's credentials.

@jerryshao
Contributor

jerryshao commented Sep 1, 2017

@vanzin From my understanding it seems like a workaround to avoid issuing new HDFS tokens (since this user's credentials already have HDFS tokens). But how do we handle the HBase/Hive tokens without a TGT?

@vanzin
Contributor

vanzin commented Sep 2, 2017

I don't know; perhaps they'll fail, which is why I think the correct behavior would be to skip this credential manager code altogether if a TGT doesn't exist.

But that would at least be the same behavior as Spark 2.1, while the behavior in the HDFS provider has definitely changed.

@tgravescs
Contributor

Hive and HBase token fetching can be turned off (i.e. spark.yarn.security.tokens.hive.enabled=false). I thought they didn't work the same way as the core HDFS provider as far as skipping the fetch when you already have a token, but I would need to check. You tell Oozie to get those before launching via the Oozie credentials configuration.

@vanzin
Contributor

vanzin commented Sep 5, 2017

BTW Oozie can also disable the HDFS provider (spark.yarn.security.credentials.hadoopfs.enabled=false, I think). But it would be nice if Spark was able to do that by itself if the current UGI does not have a TGT (or, alternatively, some way to disable all of the credential providers with a single setting). But that's for a separate PR.
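
For reference, a hedged sketch of what those per-provider switches might look like for the Oozie-launched Spark job; the hadoopfs key is the one quoted above with an "I think", the hive key is the one quoted earlier in the thread, and the hbase key is an assumed analog, so all of them should be verified against the Spark version in use.

    # Assumed configuration keys; verify exact names for your Spark version.
    spark.yarn.security.credentials.hadoopfs.enabled=false
    spark.yarn.security.tokens.hive.enabled=false
    spark.yarn.security.tokens.hbase.enabled=false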

This one looks ok but it really needs to be opened against master instead.

@tgravescs
Contributor

Yes, the user stated he will be opening one for master, but that one is quite a bit different due to the credentials code moving around, so I think this one will have to stay open anyway. But I agree we should definitely look at the master one first.

@redsanket
Author

@vanzin @tgravescs Sorry for the delay. I put up a PR against master just for the workaround (#19140); we can move further discussion about the suggested improvements there.

@tgravescs
Contributor

Jenkins, test this please

@SparkQA commented Sep 7, 2017

Test build #81522 has finished for PR 19103 at commit 7043d98.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@vanzin
Contributor

vanzin commented Sep 7, 2017

Looks good (since the master PR didn't merge). Merging to 2.2. @redsanket please close the PR manually.

asfgit pushed a commit that referenced this pull request Sep 7, 2017
Author: Sanket Chintapalli <[email protected]>

Closes #19103 from redsanket/SPARK-21890.
redsanket closed this Sep 7, 2017
MatthewRBruce pushed a commit to Shopify/spark that referenced this pull request Jul 31, 2018
Author: Sanket Chintapalli <[email protected]>

Closes apache#19103 from redsanket/SPARK-21890.