This repository was archived by the owner on Jan 9, 2020. It is now read-only.

Conversation

@mccheah commented Aug 22, 2017

This is the first of several measures to make KubernetesClusterSchedulerBackend feasible to test. Requires #445, though only for convenience rather than as a semantic dependency.

The idea is to start breaking the functionality of KubernetesClusterSchedulerBackend down into multiple individually unit-testable units. The logic that builds the executor pod structure was by far the longest single method that could be isolated with relative ease.
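To make the shape of that refactor concrete, a rough sketch of what the extracted interface could look like is below; the trait name matches the one discussed later in this thread, but the method name and parameters are illustrative, not the exact signature in this PR:

    import io.fabric8.kubernetes.api.model.Pod

    // Sketch only: the real factory in this PR may take different arguments.
    // Putting pod construction behind a small trait lets it be unit-tested
    // (and stubbed) without standing up the whole scheduler backend.
    trait ExecutorPodFactory {
      def createExecutorPod(
          executorId: String,
          applicationId: String,
          driverUrl: String,
          executorEnvs: Seq[(String, String)],
          nodeToLocalTaskCount: Map[String, Int]): Pod
    }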

}.getOrElse(containerWithExecutorLimitCores)
val withMaybeShuffleConfigPod = shuffleServiceConfig.map { config =>
config.shuffleDirs.foldLeft(executorPod) { (builder, dir) =>
new PodBuilder(builder)
Author
Ah we lost the indentation here, I'll fix that.

looks fixed now
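For anyone skimming the snippet above: the foldLeft threads a single pod through one PodBuilder pass per shuffle directory. A minimal sketch of that pattern with the fabric8 client follows; the volume-naming scheme is illustrative and not the code in this PR:

    import io.fabric8.kubernetes.api.model.{Pod, PodBuilder}

    object ShuffleDirVolumesSketch {
      // Sketch: add one hostPath volume per shuffle directory to an existing pod.
      def withShuffleDirVolumes(basePod: Pod, shuffleDirs: Seq[String]): Pod =
        shuffleDirs.zipWithIndex.foldLeft(basePod) { case (pod, (dir, index)) =>
          new PodBuilder(pod)
            .editOrNewSpec()
              .addNewVolume()
                .withName(s"shuffle-dir-$index")
                .withNewHostPath()
                  .withPath(dir)
                .endHostPath()
              .endVolume()
            .endSpec()
            .build()
        }
    }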

@foxish left a comment

Refactor LGTM

ConfigurationUtils.parsePrefixedKeyValuePairs(
sparkConf,
KUBERNETES_NODE_SELECTOR_PREFIX,
"node-selector")
Member
"node selector" for consistency?

Author
That would be a break in the configuration I think. Aside from that, SparkConf keys have never had spaces in them.

I think that last string is only used for log output -- node selector seems like it would be fine
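For readers outside the codebase, the gist of the exchange above: the helper collects every SparkConf key under a prefix into a map, and the last argument most likely only feeds log or error messages, so renaming it would not affect user-facing configuration keys. A minimal sketch of that behavior, assuming SparkConf.getAllWithPrefix (the real ConfigurationUtils may differ):

    import org.apache.spark.SparkConf

    object ConfigurationUtilsSketch {
      // Sketch of a prefixed key/value parser. `configType` (e.g. "node-selector")
      // would typically only appear in log or error messages, which is why the
      // naming debate above is about readability rather than config compatibility.
      def parsePrefixedKeyValuePairs(
          sparkConf: SparkConf,
          prefix: String,
          configType: String): Map[String, String] = {
        // Collects every key of the form <prefix><suffix>, keyed by <suffix>.
        sparkConf.getAllWithPrefix(prefix).toMap
      }
    }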

"-DsimpleDriverConf=simpleDriverConfValue" +
" -Ddriverconfwithspaces='driver conf with spaces value'")
sparkConf.set("spark.files", driverJvmOptionsFile.getAbsolutePath)
sparkConf.set(SparkLauncher.EXECUTOR_EXTRA_JAVA_OPTIONS,
Member
This is a separate change from the refactor, correct?

@mccheah commented Aug 28, 2017

I believe the diff is corrupted and I have to fix it in git.

@mccheah commented Aug 28, 2017

Actually I just need to rebase against branch-2.2-kubernetes and make the diff that way.

This is the first of several measures to make
KubernetesClusterSchedulerBackend feasible to test.
@mccheah changed the base branch from support-executor-java-options to branch-2.2-kubernetes on August 29, 2017 18:31
Author
Believe this change is unintentional

@mccheah force-pushed the separate-executor-pod-construction branch from dbb113d to dc6b186 on August 29, 2017 18:38
@mccheah commented Aug 29, 2017

@foxish rebase complete.

please don't revert this change -- it went in in a PR and somehow your PRs keep coming close to reverting it.. ?

import org.apache.spark.deploy.kubernetes.submit.{InitContainerUtil, MountSmallFilesBootstrap}
import org.apache.spark.util.Utils

// Strictly an extension of KubernetesClusterSchedulerBakcne that is factored out for testing.

typo: KubernetesClusterSchedulerBakcne

by extension, you mean the method in this trait is the same as a method in KubernetesClusterSchedulerBackend? That implies to me it should do multiple inheritance.

I'm not sure the "strictly an extension" language makes sense -- maybe instead say that it's only used in the scheduler backend?

Author
Extension meaning a plugin, or functionality that is pretty much only used in the scheduler backend.
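Put differently, the factory is an injected collaborator of the scheduler backend rather than a subtype of it. Reusing the hypothetical ExecutorPodFactory trait sketched near the top of this thread, the consuming side could look roughly like this (class and method names are illustrative):

    import io.fabric8.kubernetes.api.model.Pod

    // Illustrative only: the backend depends on the trait, so tests can hand it a stub.
    class SchedulerBackendSketch(executorPodFactory: ExecutorPodFactory) {

      // The backend decides when an executor is needed; the factory decides
      // what the pod looks like. That split is what makes both halves testable.
      def buildPodForExecutor(executorId: String): Pod = {
        executorPodFactory.createExecutorPod(
          executorId,
          applicationId = "spark-application-id",
          driverUrl = "spark://CoarseGrainedScheduler@driver-svc:7078",
          executorEnvs = Seq.empty,
          nodeToLocalTaskCount = Map.empty)
      }
    }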

import ExecutorPodFactoryImpl._

private val EXECUTOR_ID_COUNTER = new AtomicLong(0L)

where is this used? I think it should be deleted because KubernetesClusterSchedulerBackend.scala already has one
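The suggestion above amounts to keeping ID generation in one place (the scheduler backend) and having the factory only consume the ID it is handed; a second counter in ExecutorPodFactoryImpl would be dead, or worse, divergent state. A tiny sketch of that split (names illustrative):

    import java.util.concurrent.atomic.AtomicLong

    // Sketch: the backend keeps the single source of truth for executor IDs...
    object ExecutorIdAllocatorSketch {
      private val executorIdCounter = new AtomicLong(0L)

      def nextExecutorId(): String = executorIdCounter.incrementAndGet().toString
    }

    // ...and the factory is simply called with that ID, e.g.
    //   factory.createExecutorPod(ExecutorIdAllocatorSketch.nextExecutorId(), ...)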

import org.apache.spark.deploy.kubernetes.constants.ANNOTATION_EXECUTOR_NODE_AFFINITY
import org.apache.spark.internal.Logging

// Strictly an extension of ExecutorPodFactory but extracted out for testing.

not sure this comment adds much -- it's good practice to have smaller more modular pieces anyway for understanding, regardless of testing purposes

@ash211 commented Aug 30, 2017

@mccheah conflicts on the GiB -> MiB conversion fix since you moved that elsewhere

Move MiB change to ExecutorPodFactory.
@mccheah commented Aug 30, 2017

rerun integration tests please

@ash211 commented Sep 6, 2017

Rerun unit tests please

@mccheah commented Sep 6, 2017

@aash @foxish good to merge this?

@ash211 merged commit fa02fb1 into branch-2.2-kubernetes Sep 6, 2017
@ash211 commented Sep 6, 2017

test coverage is unchanged (this is an internal refactor) and enables more granular testing in followup PRs
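As one concrete example of the more granular testing this unlocks: with pod construction behind a factory interface, the scheduler-backend side can be exercised against a stub factory, and the factory itself can be tested without a running backend. A ScalaTest sketch, reusing the hypothetical ExecutorPodFactory trait and SchedulerBackendSketch from earlier in this thread; the real suites added in follow-up PRs may look quite different:

    import io.fabric8.kubernetes.api.model.{Pod, PodBuilder}
    import org.scalatest.FunSuite

    class SchedulerBackendSketchSuite extends FunSuite {

      // Stub factory: returns a minimal pod named after the requested executor.
      private val stubFactory = new ExecutorPodFactory {
        override def createExecutorPod(
            executorId: String,
            applicationId: String,
            driverUrl: String,
            executorEnvs: Seq[(String, String)],
            nodeToLocalTaskCount: Map[String, Int]): Pod = {
          new PodBuilder()
            .withNewMetadata().withName(s"executor-$executorId").endMetadata()
            .build()
        }
      }

      test("backend delegates pod construction to the factory") {
        val backend = new SchedulerBackendSketch(stubFactory)
        assert(backend.buildPodForExecutor("7").getMetadata.getName == "executor-7")
      }
    }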

ifilonenko pushed a commit to ifilonenko/spark that referenced this pull request Feb 26, 2019
…k8s#452)

* Move executor pod construction to a separate class.

This is the first of several measures to make
KubernetesClusterSchedulerBackend feasible to test.

* Revert change to README

* Address comments.

* Resolve merge conflicts.

Move MiB change to ExecutorPodFactory.
puneetloya pushed a commit to puneetloya/spark that referenced this pull request Mar 11, 2019
…k8s#452)

* Move executor pod construction to a separate class.

This is the first of several measures to make
KubernetesClusterSchedulerBackend feasible to test.

* Revert change to README

* Address comments.

* Resolve merge conflicts.

Move MiB change to ExecutorPodFactory.