@gaborgsomogyi
Contributor

@gaborgsomogyi gaborgsomogyi commented Jan 30, 2019

What changes were proposed in this pull request?

The delegation token provider interface currently takes a fileSystems parameter, but only HadoopFSDelegationTokenProvider needs it.

In this PR I've addressed this issue in the following way:

  • Removed fileSystems parameter from HadoopDelegationTokenProvider
  • Moved YarnSparkHadoopUtil.hadoopFSsToAccess into HadoopFSDelegationTokenProvider
  • Moved spark.yarn.stagingDir into core
  • Moved spark.yarn.access.namenodes into core and renamed to spark.kerberos.access.namenodes
  • Moved spark.yarn.access.hadoopFileSystems into core and renamed to spark.kerberos.access.hadoopFileSystems
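
The shape of this change can be illustrated with a minimal sketch. It uses simplified stand-in types rather than Spark's real classes: `Conf`, `Creds`, `TokenProvider`, `FsTokenProvider`, and the `defaultFS` key are hypothetical names chosen for illustration, and the token strings are placeholders. The point is only that the shared interface no longer receives `fileSystems`, while the one provider that needs the list derives it from configuration itself.

```scala
object TokenProviderSketch {
  // Simplified stand-ins for SparkConf and Hadoop Credentials (hypothetical).
  type Conf = Map[String, String]
  final class Creds { val tokens = scala.collection.mutable.Set.empty[String] }

  // After the change: no fileSystems parameter on the shared interface.
  trait TokenProvider {
    def obtainDelegationTokens(conf: Conf, creds: Creds): Unit
  }

  // Only the filesystem provider needs the list, so it computes it internally.
  object FsTokenProvider extends TokenProvider {
    def hadoopFSsToAccess(conf: Conf): Set[String] = {
      val extra = conf.get("spark.kerberos.access.hadoopFileSystems")
        .toSeq
        .flatMap(_.split(",").toSeq)
        .map(_.trim)
        .filter(_.nonEmpty)
      // The default filesystem is always included ("defaultFS" is an assumed key).
      (conf.get("defaultFS").toSeq ++ extra).toSet
    }

    def obtainDelegationTokens(conf: Conf, creds: Creds): Unit =
      hadoopFSsToAccess(conf).foreach(fs => creds.tokens += s"token-for-$fs")
  }
}
```

Other providers in this sketch would implement `TokenProvider` without ever seeing a filesystem list, which is the motivation stated above.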

How was this patch tested?

Existing unit tests.

@SparkQA

SparkQA commented Jan 30, 2019

Test build #101891 has finished for PR 23698 at commit 69646eb.

  • This patch fails to build.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA

SparkQA commented Jan 30, 2019

Test build #101895 has finished for PR 23698 at commit 6eb9ab1.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

Contributor

@vanzin vanzin left a comment

You'll need to update the docs that reference the settings you're renaming.

 * Config parameter deprecation
 * Return defaultFS all the time
 * get("spark.master", null)
@SparkQA

SparkQA commented Jan 31, 2019

Test build #101959 has finished for PR 23698 at commit 07ff492.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA

SparkQA commented Jan 31, 2019

Test build #101960 has finished for PR 23698 at commit e73250d.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@vanzin
Contributor

vanzin commented Jan 31, 2019

Still missing the doc update. running-on-yarn.md still mentions the old config, and that information is not restricted to YARN anymore.

* Doc update
* Param deprecation
@gaborgsomogyi
Contributor Author

gaborgsomogyi commented Feb 1, 2019

> Still missing the doc update.

Yeah, it was left out of the last commit :/ Now updated.

@SparkQA

SparkQA commented Feb 1, 2019

Test build #101994 has finished for PR 23698 at commit 32c5d5d.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

* Simplified hadoopFSsToAccess
* Moved doc to generic area
@SparkQA

SparkQA commented Feb 7, 2019

Test build #102077 has finished for PR 23698 at commit 1c87238.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@vanzin
Contributor

vanzin commented Feb 7, 2019

retest this please

@SparkQA

SparkQA commented Feb 8, 2019

Test build #102082 has finished for PR 23698 at commit 1c87238.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@vanzin
Contributor

vanzin commented Feb 8, 2019

Merging to master.

@asfgit asfgit closed this in d0443a7 Feb 8, 2019
jackylee-ch pushed a commit to jackylee-ch/spark that referenced this pull request Feb 18, 2019
…ionTokenProvider.obtainDelegationTokens

Closes apache#23698 from gaborgsomogyi/SPARK-26766.

Authored-by: Gabor Somogyi <[email protected]>
Signed-off-by: Marcelo Vanzin <[email protected]>
turboFei pushed a commit to turboFei/spark that referenced this pull request Nov 6, 2025
…nfigs of Hadoop Filesystems to access (apache#245)

[HADP-45851] Fix backward compatibility of alternative configs of Hadoop Filesystems to access (apache#119)

### What changes were proposed in this pull request?
Fix the precedence of the configs that specify which Hadoop Filesystems to access.

Before this PR
```
spark.kerberos.access.hadoopFileSystems -> spark.yarn.access.namenodes -> spark.yarn.access.hadoopFileSystems
```

After this PR
```
spark.kerberos.access.hadoopFileSystems -> spark.yarn.access.hadoopFileSystems -> spark.yarn.access.namenodes
```

### Why are the changes needed?
Before apache#23698, the precedence of the configs for Hadoop Filesystems to access was
```
spark.yarn.access.hadoopFileSystems -> spark.yarn.access.namenodes
```
Afterwards, it's
```
spark.kerberos.access.hadoopFileSystems -> spark.yarn.access.namenodes -> spark.yarn.access.hadoopFileSystems
```
When both `spark.yarn.access.hadoopFileSystems` and `spark.yarn.access.namenodes` are configured with different values, the precedence introduced by apache#23698 breaks backward compatibility and causes application failure.
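
The effect of the two orderings can be seen in a small sketch. This is not Spark's actual config-resolution code; it models the configuration as a plain `Map`, and `FsAccessPrecedence`, `resolve`, `beforeFix`, and `afterFix` are hypothetical names used only to illustrate "first key that is set wins".

```scala
object FsAccessPrecedence {
  val Kerberos = "spark.kerberos.access.hadoopFileSystems"
  val YarnFs   = "spark.yarn.access.hadoopFileSystems"
  val YarnNn   = "spark.yarn.access.namenodes"

  // Return the value of the first key in `order` that is set (hypothetical helper).
  def resolve(conf: Map[String, String], order: Seq[String]): Option[String] =
    order.collectFirst { case key if conf.contains(key) => conf(key) }

  // Precedence before the fix: namenodes shadows the yarn hadoopFileSystems key.
  val beforeFix: Seq[String] = Seq(Kerberos, YarnNn, YarnFs)
  // Precedence after the fix: the yarn hadoopFileSystems key wins over namenodes again.
  val afterFix: Seq[String] = Seq(Kerberos, YarnFs, YarnNn)
}
```

With both legacy keys set to different values, `resolve(conf, beforeFix)` picks the `spark.yarn.access.namenodes` value, while `resolve(conf, afterFix)` picks the `spark.yarn.access.hadoopFileSystems` value, which matches the ordering that applied before apache#23698.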

### Does this PR introduce _any_ user-facing change?
Yes. Fix backward compatibility.


### How was this patch tested?
Updated UT.

Co-authored-by: tianlzhang <[email protected]>