@charliechen211

Related Jira:
https://issues.apache.org/jira/browse/SPARK-20608

Descriptions:
See PR (in branch-2.1): #17870

@AmplabJenkins

Can one of the admins verify this patch?

@charliechen211 changed the title from "[SPARK-20608] allow standby namenodes in spark.yarn.access.namenodes (PR in master branch)" to "[SPARK-20608] allow standby namenodes in spark.yarn.access.namenodes" on May 5, 2017
@charliechen211
Author

@srowen @jerryshao @steveloughran This is the latest PR. #17870 is deprecated.

} catch {
  case e: StandbyException =>
    logWarning(s"Namenode ${dst} is in state standby", e)
  case e: RemoteException =>
Contributor

I'd suggested adding a handler for UnknownHostException too, but now I think that could hide problems with client config. Best to leave as is.

@steveloughran
Contributor

At a glance, the patch LGTM.

  dstFs.addDelegationTokens(tokenRenewer, tmpCreds)
} catch {
  case e: StandbyException =>
    logWarning(s"Namenode ${dst} is in state standby", e)
Contributor

It's not accurate to say "Namenode" here, because this may be configured for other, non-HDFS file systems.

Author

Hmm. This code is actually fetching tokens from hadoopFS, inside hadoopFSCredentialProvider, so doesn't that mean it's exactly HDFS?

Contributor

@jerryshao jerryshao May 8, 2017

A Hadoop-compatible FS is not necessarily HDFS; this can be configured for wasb, adls, and others. wasb and adls also support fetching delegation tokens through the common FS API, so we should avoid mentioning "Namenode", which exists only in HDFS.
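
As a rough illustration of what a file-system-neutral warning could look like, here is a minimal sketch that identifies the target by its URI scheme instead of assuming an HDFS Namenode. `TokenLogMessages` and `tokenFetchFailed` are hypothetical names, and the message wording is an assumption, not what the patch ended up using.

```scala
import java.net.URI

// Hypothetical helper, not part of Spark: builds a warning that names the
// file system by its URI rather than calling it a "Namenode".
object TokenLogMessages {
  def tokenFetchFailed(fsUri: URI, cause: Throwable): String = {
    // getScheme returns null for relative URIs, so fall back to a placeholder.
    val scheme = Option(fsUri.getScheme).getOrElse("unknown")
    s"Failed to get delegation token from $scheme file system $fsUri: ${cause.getMessage}"
  }
}
```

With such a helper, the same call would cover `hdfs://`, `wasb://`, and other Hadoop-compatible schemes without implying HDFS-specific state.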

Contributor

Also, for the "RemoteException" case below, how do you know that the "RemoteException" is actually a standby exception?

Author

In our tests, two exceptions are possible when spark.yarn.access.namenodes=hdfs://activeNamenode,hdfs://standbyNamenode:

  1. Caused by: org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.ipc.StandbyException): Operation category WRITE is not supported in state standby
  2. Caused by: org.apache.hadoop.ipc.StandbyException: Operation category WRITE is not supported in state standby

Maybe RemoteException should be caught in a better way.

Contributor

What I mean is that if the "RemoteException" is caused by something else, it is incorrect to log it as "Namenode ${dst} is in state standby".
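
One possible way to address this concern, sketched under assumptions: only treat a RemoteException as a standby failure when the class name it carries is StandbyException. The `StandbyException` and `RemoteException` classes below are local stand-ins that mimic the Hadoop classes of the same names so the sketch runs without Hadoop on the classpath; `StandbyCheck` and `isStandbyFailure` are hypothetical names, not part of the patch. Hadoop's real `RemoteException` exposes `getClassName`, which reports the server-side exception class.

```scala
import java.io.IOException

// Stand-ins for org.apache.hadoop.ipc.StandbyException and RemoteException,
// defined here only so the sketch is self-contained.
class StandbyException(msg: String) extends IOException(msg)

class RemoteException(className: String, msg: String) extends IOException(msg) {
  // Name of the exception class thrown on the server side.
  def getClassName: String = className
}

object StandbyCheck {
  // True only for a standby rejection, whether thrown directly or wrapped
  // in a RemoteException; any other failure is reported as false.
  def isStandbyFailure(e: Throwable): Boolean = e match {
    case _: StandbyException => true
    case r: RemoteException => r.getClassName == classOf[StandbyException].getName
    case _ => false
  }
}
```

A catch block could then log the standby message only when `isStandbyFailure` returns true and rethrow or log differently otherwise. Note that this only classifies the failure; it does not obtain a token from the failing endpoint.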

Author

You are right. I will refactor the exception log.

Author

I refactored it. Please review it and let me know if you have further advice :)

import org.apache.hadoop.conf.Configuration
import org.apache.hadoop.fs.{FileSystem, Path}
import org.apache.hadoop.ipc.RemoteException
import org.apache.hadoop.ipc.StandbyException
Contributor

These two import lines can be merged into one.
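
For reference, the merged form would presumably use Scala's import selector syntax, which brings both classes in from the same package in one statement (this fragment needs Hadoop on the classpath to compile):

```scala
// Single import replacing the two separate org.apache.hadoop.ipc imports.
import org.apache.hadoop.ipc.{RemoteException, StandbyException}
```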

@jerryshao
Contributor

This change may conflict with #17723, but I think that is easy to resolve. CC @mgummelt.

@charliechen211
Author

@jerryshao done.

@srowen
Member

srowen commented May 12, 2017

I think @vanzin is saying this is not the right change

@vanzin
Contributor

vanzin commented May 12, 2017

Yes, I already explained in the discussion in the bug. The very fact you're getting an exception from the standby namenode means you're not actually getting the delegation token. Which makes this change pointless.

@srowen srowen mentioned this pull request May 17, 2017
@asfgit asfgit closed this in 5d2750a May 18, 2017
zifeif2 pushed a commit to zifeif2/spark that referenced this pull request Nov 22, 2025
## What changes were proposed in this pull request?

This PR proposes to close PRs ...

  - inactive on review comments for more than a month
  - WIP and inactive for more than a month
  - with a Jenkins build failure and inactive for more than a month
  - suggested to be closed, with no comment against that
  - obviously inappropriate (e.g., Branch 0.5)

To make sure, I left a comment on each PR about a week ago and did not get a response from the authors of the PRs below:

Closes apache#11129
Closes apache#12085
Closes apache#12162
Closes apache#12419
Closes apache#12420
Closes apache#12491
Closes apache#13762
Closes apache#13837
Closes apache#13851
Closes apache#13881
Closes apache#13891
Closes apache#13959
Closes apache#14091
Closes apache#14481
Closes apache#14547
Closes apache#14557
Closes apache#14686
Closes apache#15594
Closes apache#15652
Closes apache#15850
Closes apache#15914
Closes apache#15918
Closes apache#16285
Closes apache#16389
Closes apache#16652
Closes apache#16743
Closes apache#16893
Closes apache#16975
Closes apache#17001
Closes apache#17088
Closes apache#17119
Closes apache#17272
Closes apache#17971

Added:
Closes apache#17778
Closes apache#17303
Closes apache#17872

## How was this patch tested?

N/A

Author: hyukjinkwon

Closes apache#18017 from HyukjinKwon/close-inactive-prs.