Skip to content

Conversation

@nishchalv
Copy link

What changes were proposed in this pull request?

With this change, executor's bindAddress is passed as an input parameter for RPCEnv.create.
A previous PR #21261 which addressed the same, was using a Spark Conf property to get the bindAddress which wouldn't have worked for multiple executors.
This PR is to enable anyone overriding CoarseGrainedExecutorBackend with their custom one to be able to invoke CoarseGrainedExecutorBackend.main() along with the option to configure bindAddress.

Why are the changes needed?

This is required when Kernel-based Virtual Machine (KVM)'s are used inside Linux container where the hostname is not the same as container hostname.

Does this PR introduce any user-facing change?

No

How was this patch tested?

Tested by running jobs with executors on KVMs inside a linux container.

@dongjoon-hyun
Copy link
Member

Hi, @nishchalv .
Could you rebase to the master? GitHub Action complains that the PR fails to build.

@nishchalv
Copy link
Author

@dongjoon-hyun fixed the build.

@nishchalv nishchalv force-pushed the SPARK-29670 branch 5 times, most recently from 0937237 to 62974f8 Compare October 31, 2019 00:16
@dbtsai
Copy link
Member

dbtsai commented Nov 1, 2019

Jenkins, test this please.

@dbtsai dbtsai changed the title [SPARK-29670][core] Make executor's bindAddress configurable [SPARK-24203][Core] Make executor's bindAddress configurable Nov 1, 2019
@nishchalv nishchalv changed the title [SPARK-24203][Core] Make executor's bindAddress configurable [SPARK-24203][core] Make executor's bindAddress configurable Nov 1, 2019
@SparkQA
Copy link

SparkQA commented Nov 1, 2019

Test build #113070 has finished for PR 26331 at commit 62974f8.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@holdenk
Copy link
Contributor

holdenk commented Nov 1, 2019

Oh interesting, this could potentially help with an issue I've been seeing in Kubeflow with Spark under certain scenarios. I'll try and take a proper look next week if no one else has a chance.

@holdenk
Copy link
Contributor

holdenk commented Nov 1, 2019

Let's CC the folks from the previous PR for their thoughts too: @lukmajercak & @bjliu & @vanzin

@holdenk
Copy link
Contributor

holdenk commented Nov 1, 2019

@nishchalv can you run the Kubernetes integration tests? I know this isn't directly changing anything there so I think think this is going to trigger the K8s CI but I think with a core change like this it's good to make sure the different cluster managers work as much as possible


Arguments(driverUrl, executorId, hostname, cores, appId, workerUrl,
if (bindAddress == null) {
bindAddress = hostname
Copy link
Member

@dongjoon-hyun dongjoon-hyun Nov 1, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We need a new test case for this code path, @nishchalv .

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@dongjoon-hyun Added couple of test cases.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you!

-1,
executorConf,
new SecurityManager(executorConf),
numUsableCores = 0,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we need this line in this PR?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is needed unless we add extra method in RpcEnv object.

@lukmajercak
Copy link

Thanks guys for picking this up. We internally patched this a while ago and have been running with this change since.

Copy link
Contributor

@vanzin vanzin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks ok, if you ignore how weird it is to have an internal API that isn't used by anyone, except people external to the project extending internal classes...

Also, Dongjoon's question.

ioEncryptionKey: Option[Array[Byte]],
isLocal: Boolean): SparkEnv = {
createExecutorEnv(conf, executorId, hostname,
hostname, numCores, ioEncryptionKey, isLocal);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

drop the semi-colon

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure, will do.

@dbtsai
Copy link
Member

dbtsai commented Nov 9, 2019

LGTM except the extra test requested by @dongjoon-hyun Thanks!

Copy link
Member

@dbtsai dbtsai left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Waiting the build.

Copy link
Member

@dongjoon-hyun dongjoon-hyun left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1, LGTM. Thank you for updating, @nishchalv .

@dongjoon-hyun
Copy link
Member

Retest this please.

Copy link
Contributor

@jiangxb1987 jiangxb1987 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@dbtsai dbtsai closed this in 833a9f1 Nov 13, 2019
@dbtsai
Copy link
Member

dbtsai commented Nov 13, 2019

Merged into master. Thanks!

@SparkQA
Copy link

SparkQA commented Nov 13, 2019

Test build #113726 has finished for PR 26331 at commit c2a2b95.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@gedeh
Copy link

gedeh commented May 19, 2021

Hi, any plan to merge this to Spark 2.4? Thanks!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

9 participants