
Conversation

@devaraj-kavali

@devaraj-kavali devaraj-kavali commented Oct 31, 2017

What changes were proposed in this pull request?

This adds a new configuration, "spark.yarn.un-managed-am" (defaults to false), to enable an unmanaged AM in YARN client mode, which launches the Application Master service as part of the Client. It reuses the existing Application Master <-> Task Scheduler communication code for container requests, allocations, and launches, and eliminates the following:

  1. Allocating and launching the Application Master container
  2. Remote Node/Process communication between Application Master <-> Task Scheduler
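
For context, a minimal usage sketch (not part of the PR) of how an application might opt in, using the config key as proposed here; the key is renamed later in the review:

    import org.apache.spark.{SparkConf, SparkContext}

    // Assumes a YARN client-mode submission; "spark.yarn.un-managed-am" is the key as
    // originally proposed and is renamed during review.
    val conf = new SparkConf()
      .setAppName("unmanaged-am-example")
      .setMaster("yarn")
      .set("spark.submit.deployMode", "client")
      .set("spark.yarn.un-managed-am", "true")
    val sc = new SparkContext(conf)
    sc.parallelize(1 to 100).count()
    sc.stop()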

How was this patch tested?

I verified this manually by running applications in yarn-client mode with "spark.yarn.un-managed-am" enabled, and also confirmed that the existing execution flows are not affected.

I would like to hear others' feedback/thoughts on this.

@vanzin
Contributor

vanzin commented Dec 12, 2017

@devaraj-kavali why is this a WIP? Are you planning to work more on it before asking for feedback?

@devaraj-kavali
Author

@vanzin Thanks for looking into this.

I wanted to verify some scenarios before removing WIP; feedback is welcome anytime. I now see there are some code conflicts; I will resolve them and remove WIP.

@devaraj-kavali devaraj-kavali changed the title [SPARK-22404][YARN][WIP] Provide an option to use unmanaged AM in yarn-client mode [SPARK-22404][YARN] Provide an option to use unmanaged AM in yarn-client mode Dec 14, 2017
@vanzin
Contributor

vanzin commented Dec 19, 2017

ok to test

@SparkQA

SparkQA commented Dec 19, 2017

Test build #85125 has finished for PR 19616 at commit cba0c6d.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

private val isClusterMode = sparkConf.get("spark.submit.deployMode", "client") == "cluster"

private val isClientUnmanagedAMEnabled =
sparkConf.getBoolean("spark.yarn.un-managed-am", false) && !isClusterMode
Contributor

This should be a config constant. Also unmanagedAM is more in line with other config names.

Author

Updated the config name and also added config constants.

// UI's environment page. This works for client mode; for cluster mode, this is handled
// by the AM.
CACHE_CONFIGS.foreach(sparkConf.remove)
if (!isClientUnmanagedAMEnabled) {
Contributor

Why is this needed in the new mode?

Author

It clears the classpath entries, which leads to this error in the executors:

Error: Could not find or load main class org.apache.spark.executor.CoarseGrainedExecutorBackend

Contributor

I think this is happening because you're starting the AM after these are removed from the conf. Should probably juggle things around or change how these are provided to the AM, since these configs are super noisy and shouldn't really show up in the UI.
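
A rough sketch of the reordering idea (cachedResourcesConf is a hypothetical name; CACHE_CONFIGS comes from the surrounding code): snapshot the cache-related entries for the in-process AM before scrubbing them from the conf that feeds the UI.

    // Keep a copy of the cache configs for the unmanaged AM, then remove them from the
    // driver conf so they don't clutter the environment page.
    val cachedResourcesConf = new SparkConf(false)
    CACHE_CONFIGS.foreach { entry =>
      sparkConf.getOption(entry.key).foreach(cachedResourcesConf.set(entry.key, _))
    }
    CACHE_CONFIGS.foreach(sparkConf.remove)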

populateClasspath(args, hadoopConf, sparkConf, env, sparkConf.get(DRIVER_CLASS_PATH))
env("SPARK_YARN_STAGING_DIR") = stagingDirPath.toString
if (isClientUnmanagedAMEnabled) {
System.setProperty("SPARK_YARN_STAGING_DIR", stagingDirPath.toString)
Contributor

Can this be propagated some other way? Using system properties is kinda hacky, and makes it dangerous to run another Spark app later in the same JVM.

Author

Changed it to derive the path from the Spark conf and the application id.
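
A rough sketch of that derivation (STAGING_DIR and the .sparkStaging layout come from the surrounding Client code; hadoopConf and appId are assumed to be in scope):

    import org.apache.hadoop.fs.{FileSystem, Path}

    // Rebuild the staging path from the conf and the application id instead of reading
    // a JVM-wide system property.
    val appStagingBaseDir = sparkConf.get(STAGING_DIR).map { new Path(_) }
      .getOrElse(FileSystem.get(hadoopConf).getHomeDirectory())
    val stagingDirPath = new Path(appStagingBaseDir, s".sparkStaging/${appId.toString}")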

private def startApplicationMasterService(report: ApplicationReport) = {
// Add AMRMToken to establish connection between RM and AM
val token = report.getAMRMToken
val amRMToken: org.apache.hadoop.security.token.Token[AMRMTokenIdentifier] =
Contributor

Why do you need to make this copy? Isn't the Token above enough?

Author

report.getAMRMToken returns an org.apache.hadoop.yarn.api.records.Token instance, but currentUGI.addToken expects an org.apache.hadoop.security.token.Token instance.
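
A sketch of the conversion being described, assuming the usual Hadoop record shape (ByteBuffer-backed identifier/password, string kind/service):

    import org.apache.hadoop.io.Text
    import org.apache.hadoop.security.UserGroupInformation
    import org.apache.hadoop.security.token.Token
    import org.apache.hadoop.yarn.security.AMRMTokenIdentifier

    val token = report.getAMRMToken  // org.apache.hadoop.yarn.api.records.Token
    val amRMToken = new Token[AMRMTokenIdentifier](
      token.getIdentifier.array(), token.getPassword.array(),
      new Text(token.getKind), new Text(token.getService))
    UserGroupInformation.getCurrentUser.addToken(amRMToken)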

val currentUGI = UserGroupInformation.getCurrentUser
currentUGI.addToken(amRMToken)

System.setProperty(
Contributor

Same question about using system properties.

Author

I changed it to set the value in sparkConf and use it in ApplicationMaster when getting the containerId.

* Common application master functionality for Spark on Yarn.
*/
private[spark] class ApplicationMaster(args: ApplicationMasterArguments) extends Logging {
private[spark] class ApplicationMaster(args: ApplicationMasterArguments, sparkConf: SparkConf,
Contributor

This doesn't follow Spark's convention for multi-line arguments.

This also looks a little odd now, because there are conflicting arguments. ApplicationMasterArguments is now only used in cluster mode, and everything else is expected to be provided in the other parameters. So while this is the simpler change, it's also a little ugly.

I don't really have a good suggestion right now, but it's something to think about.

Author

I made changes to the default constructor and added another constructor. Please check and let me know if anything can be done better.

System.setProperty(
ApplicationConstants.Environment.CONTAINER_ID.name(),
ContainerId.newContainerId(report.getCurrentApplicationAttemptId, 1).toString)
val amArgs = new ApplicationMasterArguments(Array("--arg",
Contributor

This is pretty weird, I'd make this an explicit constructor argument for the AM instead. But if I understand this correctly, this is the address the AM will be connecting back to the driver, right?

It seems like there's an opportunity for better code here, since now they'd both be running in the same process. Like in the cluster mode case, where the AM uses the same RpcEnv instance as the driver (see runDriver()).

Author

I added another constructor that takes an RpcEnv instead of ApplicationMasterArguments, so the AM uses the same instance.

val amArgs = new ApplicationMasterArguments(Array("--arg",
sparkConf.get("spark.driver.host") + ":" + sparkConf.get("spark.driver.port")))
// Start Application Service in a separate thread and continue with application monitoring
new Thread() {
Contributor

Don't you want to keep a reference to this thread and join it at some point, to make sure it really goes away? Should it be a daemon thread instead?

Author

Changed it to a daemon thread.
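
For illustration, the general shape of such a daemon thread (the body is a placeholder):

    val amService = new Thread("Unmanaged Application Master Service") {
      override def run(): Unit = {
        // start the in-process ApplicationMaster here
      }
    }
    amService.setDaemon(true)
    amService.start()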

@SparkQA

SparkQA commented Feb 15, 2018

Test build #87464 has finished for PR 19616 at commit ce94235.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA

SparkQA commented Feb 15, 2018

Test build #87462 has finished for PR 19616 at commit 19b6c3a.

  • This patch fails PySpark unit tests.
  • This patch does not merge cleanly.
  • This patch adds no public classes.

@devaraj-kavali
Author

@vanzin Thanks for the review; can you have a look at the updated PR?

@vanzin
Contributor

vanzin commented Mar 8, 2018

Unlikely I'll be able to review this very soon.

@HyukjinKwon
Member

ok to test

@SparkQA

SparkQA commented Jul 16, 2018

Test build #93055 has finished for PR 19616 at commit ce94235.

  • This patch fails Spark unit tests.
  • This patch does not merge cleanly.
  • This patch adds no public classes.

@SparkQA

SparkQA commented Jul 17, 2018

Test build #93144 has finished for PR 19616 at commit 0921f7a.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

Contributor

@vanzin vanzin left a comment

Hi @devaraj-kavali , sorry this got left behind. It needs updating and I need to think some more about the changes to ApplicationMaster.scala, but it's in the right direction.


def getContainerId: ContainerId = {
val containerIdString = System.getenv(ApplicationConstants.Environment.CONTAINER_ID.name())
def getContainerId(sparkConf: SparkConf): ContainerId = {
val containerIdString =
Contributor

indentation

Author

corrected the indentation

val containerIdString = System.getenv(ApplicationConstants.Environment.CONTAINER_ID.name())
def getContainerId(sparkConf: SparkConf): ContainerId = {
val containerIdString =
if (System.getenv(ApplicationConstants.Environment.CONTAINER_ID.name()) != null) {
Contributor

better to use sparkConf.getenv.

Author

updated to use sparkConf.getenv
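
Roughly what that fallback looks like (the spark.yarn.containerId key is the intermediate approach at this point in the review; it is later replaced by passing the attempt id from Client.scala):

    import org.apache.hadoop.yarn.api.ApplicationConstants
    import org.apache.hadoop.yarn.api.records.ContainerId

    def getContainerId(sparkConf: SparkConf): ContainerId = {
      // Prefer the env var set by YARN; fall back to the value the client stored in the conf.
      val containerIdString =
        Option(sparkConf.getenv(ApplicationConstants.Environment.CONTAINER_ID.name()))
          .getOrElse(sparkConf.get("spark.yarn.containerId"))
      ContainerId.fromString(containerIdString)
    }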

sparkConf.set("spark.yarn.containerId",
ContainerId.newContainerId(report.getCurrentApplicationAttemptId, 1).toString)
// Start Application Service in a separate thread and continue with application monitoring
val amService = new Thread() {
Contributor

Thread name?

Author

added thread name

// Add AMRMToken to establish connection between RM and AM
val token = report.getAMRMToken
val amRMToken: org.apache.hadoop.security.token.Token[AMRMTokenIdentifier] =
new org.apache.hadoop.security.token.Token[AMRMTokenIdentifier](token
Contributor

Keep related calls on the same line (e.g. token.getIdentifier(), new Text(blah)).

Author

Made the change; please let me know if anything better can be done here.

val yarnConf: YarnConfiguration)
extends Logging {

def this(sparkConf: SparkConf,
Contributor

See above constructor for multi-line args style.

Author

Removed this constructor as part of the refactoring for the comment below.

RpcAddress(driverHost, driverPort),
YarnSchedulerBackend.ENDPOINT_NAME)
var driverRef : RpcEndpointRef = null
if (sparkConf.get(YARN_UNMANAGED_AM)) {
Contributor

I'm not a big fan of this change. Feels like you should have a different method here called runUnmanaged that is called instead of run(), and takes an RpcEnv.

That way you don't need to keep clientRpcEnv at all since it would be local to that method, since nothing else here needs it. In fact even rpcEnv could go away and become a parameter to createAllocator...

Author

Refactored it into a runUnmanaged method; please let me know if anything can be done better.
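
A skeleton of that refactor, assuming the method lives in ApplicationMaster with access to sparkConf, addAmIpFilter and createAllocator (the exact parameter list evolves in later revisions of the diff):

    def runUnmanaged(clientRpcEnv: RpcEnv): Unit = {
      // Resolve the driver endpoint from the shared RpcEnv instead of a remote address.
      val driverRef = clientRpcEnv.setupEndpointRef(
        RpcAddress(sparkConf.get("spark.driver.host"), sparkConf.get("spark.driver.port").toInt),
        YarnSchedulerBackend.ENDPOINT_NAME)
      addAmIpFilter(Some(driverRef))
      createAllocator(driverRef, sparkConf, clientRpcEnv)
    }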

@devaraj-kavali
Author

Thanks @vanzin for taking the time to look into this; I will update it with the changes.

@vanzin
Contributor

vanzin commented Dec 13, 2018

ok to test

@SparkQA

SparkQA commented Dec 13, 2018

Test build #100052 has finished for PR 19616 at commit 837d25f.

  • This patch fails to build.
  • This patch merges cleanly.
  • This patch adds no public classes.

@devaraj-kavali
Author

devaraj-kavali commented Dec 14, 2018

since these configs are super noisy and shouldn't really show up in the UI.

These configs are removed from sparkConf in ApplicationMaster after they are used.

@SparkQA

SparkQA commented Dec 14, 2018

Test build #100163 has finished for PR 19616 at commit 65aeba9.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@devaraj-kavali
Author

@vanzin can you check the updated changes? thanks

}

private def runImpl(): Unit = {
private def runImpl(opBlock: => Unit): Unit = {
Contributor

There are things in this method that don't look right when you think about an unmanaged AM.

e.g., overriding spark.master, spark.ui.port, etc., looks wrong.

The handling of app attempts also seems wrong, since with an unmanaged AM you don't have multiple attempts. Even the shutdown hooks seem a bit out of place.

Seems to me it would be easier not to try to use this method for the unmanaged AM.

Author

refactored this code

addAmIpFilter(Some(driverRef))
createAllocator(driverRef, sparkConf, clientRpcEnv)

// In client mode the actor will stop the reporter thread.
Contributor

actor?

Author

Removed this as part of the refactoring for the comment above.

val preserveFiles = sparkConf.get(PRESERVE_STAGING_FILES)
if (!preserveFiles) {
stagingDirPath = new Path(System.getenv("SPARK_YARN_STAGING_DIR"))
var stagingDir = System.getenv("SPARK_YARN_STAGING_DIR")
Contributor

val stagingDir = sys.props.get("...").getOrElse { ... }

Author

Made the change to pass the stagingDir from Client.scala

stagingDirPath = new Path(System.getenv("SPARK_YARN_STAGING_DIR"))
var stagingDir = System.getenv("SPARK_YARN_STAGING_DIR")
if (stagingDir == null) {
val appStagingBaseDir = sparkConf.get(STAGING_DIR).map { new Path(_) }
Contributor

This looks similar to the logic in Client.scala. Maybe the value calculated there should be plumbed through, instead of adding this code.

Author

Made the change to pass the stagingDir from Client.scala


if (isClientUnmanagedAMEnabled) {
// Set Unmanaged AM to true in Application Submission Context
appContext.setUnmanagedAM(true)
Contributor

appContext.setUnmanagedAM(isClientUnmanagedAMEnabled)

Which also makes the comment unnecessary.

Author

updated

}

if (state == YarnApplicationState.ACCEPTED && isClientUnmanagedAMEnabled
&& !amServiceStarted && report.getAMRMToken != null) {
Contributor

indent one more level

Author

updated

ContainerId.newContainerId(report.getCurrentApplicationAttemptId, 1).toString)
// Start Application Service in a separate thread and continue with application monitoring
val amService = new Thread("Unmanaged Application Master Service") {
override def run(): Unit = new ApplicationMaster(new ApplicationMasterArguments(Array.empty),
Contributor

This is a pretty long line. Break it down.

Author

updated it.

currentUGI.addToken(amRMToken)

sparkConf.set("spark.yarn.containerId",
ContainerId.newContainerId(report.getCurrentApplicationAttemptId, 1).toString)
Contributor

Won't this name be the same as the first executor created by the app?

I'd rather special-case getContainerId to return some baked-in string when the env variable is not set.

Author

made the change to pass the appAttemptId from Client.scala

@SparkQA

SparkQA commented Dec 20, 2018

Test build #100328 has finished for PR 19616 at commit 93b016f.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

"APPMASTER", sparkConf.get(APP_CALLER_CONTEXT),
Option(appAttemptId.getApplicationId.toString), None).setCurrentContext()

// This shutdown hook should run *after* the SparkContext is shut down.
Contributor

This is client mode, so you can't rely on shutdown hooks. You need to explicitly stop this service when the SparkContext is shut down.

Imagine someone just embeds sc = new SparkContext(); ...; sc.stop() in their app code, but the app itself runs for way longer than the Spark app.
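
A sketch of that idea with hypothetical names (stopUnmanaged is mentioned later in the review, but its exact signature and the owning method are assumptions): whoever owns the in-process AM tears it down when the SparkContext stops, rather than from a shutdown hook.

    // Hypothetical helper, called from the client-mode backend's stop path.
    def stopApplicationMasterService(): Unit = {
      if (appMaster != null) {
        appMaster.stopUnmanaged(stagingDirPath)  // signature assumed
        appMaster = null
      }
    }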

}
}

def runUnmanaged(clientRpcEnv: RpcEnv,
Contributor

Multi-line args start on the next line.

throw new SparkException("While loop is depleted! This should never happen...")
}

private def startApplicationMasterService(report: ApplicationReport) = {
Contributor

: Unit =

But given you should be explicitly stopping the AM, this should probably return the AM itself.

@SparkQA

SparkQA commented Jan 10, 2019

Test build #101002 has finished for PR 19616 at commit 23ad9de.

  • This patch fails to build.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA

SparkQA commented Jan 10, 2019

Test build #101003 has finished for PR 19616 at commit 1c02b7d.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA

SparkQA commented Jan 10, 2019

Test build #100997 has finished for PR 19616 at commit dc31940.

  • This patch passes all tests.
  • This patch does not merge cleanly.
  • This patch adds no public classes.

@SparkQA

SparkQA commented Jan 15, 2019

Test build #101277 has finished for PR 19616 at commit 2429e19.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@devaraj-kavali
Author

@vanzin can you review the latest changes when you have some time? thanks

Contributor

@vanzin vanzin left a comment

Some minor comments. It would be good to add a test for this in YarnClusterSuite.

// In cluster mode, do not rely on the disassociated event to exit
// This avoids potentially reporting incorrect exit codes if the driver fails
if (!isClusterMode) {
if (!(isClusterMode || sparkConf.get(YARN_UNMANAGED_AM))) {
Contributor

Update comment above?

private val isClusterMode = sparkConf.get("spark.submit.deployMode", "client") == "cluster"

private val isClientUnmanagedAMEnabled = sparkConf.get(YARN_UNMANAGED_AM) && !isClusterMode
private var amServiceStarted = false
Contributor

Do you need this extra flag? Could you just check if appMaster != null?

private val isClientUnmanagedAMEnabled = sparkConf.get(YARN_UNMANAGED_AM) && !isClusterMode
private var amServiceStarted = false
private var appMaster: ApplicationMaster = _
private var unManagedAMStagingDirPath: Path = _
Contributor

Seems better to just store this in a variable for all cases. It's recomputed from the conf in a few different places in this class.


/* Unmanaged AM configuration. */

private[spark] val YARN_UNMANAGED_AM = ConfigBuilder("spark.yarn.unmanagedAM")
Contributor

Add .enabled to the config key.
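
Combined with the earlier request for a config constant, the final entry could look roughly like this (standard ConfigBuilder pattern; the doc text is illustrative):

    private[spark] val YARN_UNMANAGED_AM = ConfigBuilder("spark.yarn.unmanagedAM.enabled")
      .doc("In client mode, whether to launch the Application Master service as part of " +
        "the client using unmanaged AM.")
      .booleanConf
      .createWithDefault(false)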

@SparkQA

SparkQA commented Jan 24, 2019

Test build #101610 has finished for PR 19616 at commit 6854fc4.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@devaraj-kavali
Author

@vanzin can you check the updated changes? thanks

Contributor

@vanzin vanzin left a comment

What's the behavior when the AM fails before the context is stopped?

From the code I see some stuff is printed to the logs and the YARN app is marked as finished. But does the context remain alive? Or should that event cause the context to be stopped?

I'm mostly concerned with how clear it is for the user that the jobs start failing because the context is now unusable.

finish(FinalApplicationStatus.FAILED,
ApplicationMaster.EXIT_UNCAUGHT_EXCEPTION,
"Uncaught exception: " + StringUtils.stringifyException(e))
if (!unregistered) {
Contributor

Is this code needed here? Won't it be called when the client calls stopUnmanaged?

Author

appMaster.runUnmanaged runs in a daemon thread. If something unexpected happens in appMaster.runUnmanaged, the daemon thread stops, but the monitor thread will not know about it and will continue with the status as ACCEPTED/RUNNING. This code unregisters with the RM so that the Client/monitor thread gets the application report status as FAILED and stops the services, including the SparkContext.

}

test("run Spark in yarn-client mode with unmanaged am") {
testBasicYarnApp(true, Map("spark.yarn.unmanagedAM.enabled" -> "true"))
Contributor

YARN_UNMANAGED_AM.key
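
That is, the test would reference the constant rather than the literal key, roughly:

    test("run Spark in yarn-client mode with unmanaged am") {
      testBasicYarnApp(true, Map(YARN_UNMANAGED_AM.key -> "true"))
    }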

@devaraj-kavali
Author

What's the behavior when the AM fails before the context is stopped?

From the code I see some stuff is printed to the logs and the YARN app is marked as finished. But does the context remain alive? Or should that event cause the context to be stopped?

I'm mostly concerned with how clear it is for the user that the jobs start failing because the context is now unusable.

If the AM fails before the context is stopped, the AM reports the FAILED status to the RM, and the Client receives the FAILED status as part of monitoring and stops the services, including the context.

@SparkQA

SparkQA commented Jan 25, 2019

Test build #101658 has finished for PR 19616 at commit 3b377af.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@vanzin
Contributor

vanzin commented Jan 25, 2019

If AM fails before the context is stopped...

That explains the YARN side of things. What about the Spark side of things? What will the user see? Is it clear to the user that the context is now unusable because the YARN app is not running anymore?

Or will they get weird errors because of executors not being allocated and things like that?

@devaraj-kavali
Author

That explains the YARN side of things. What about the Spark side of things? What will the user see? Is it clear to the user that the context is now unusable because the YARN app is not running anymore?

Or will they get weird errors because of executors not being allocated and things like that?

It logs an error from YarnClientSchedulerBackend that the YARN application has exited unexpectedly with state FAILED!..., and the user sees (No active SparkContext.) when trying to access it. There are no additional errors; the behavior is similar to an application failing in yarn-client mode without unmanaged AM enabled.

@vanzin
Contributor

vanzin commented Jan 25, 2019

Sounds good. Merging to master.

@asfgit asfgit closed this in f06bc0c Jan 25, 2019
@devaraj-kavali
Author

Thank you so much @vanzin.

// Add log urls
container.foreach { c =>
sys.env.get("SPARK_USER").foreach { user =>
sys.env.filterKeys(_.endsWith("USER")).foreach { user =>
Contributor

@HeartSaVioR HeartSaVioR Jan 26, 2019

@devaraj-kavali @vanzin

While resolving the merge conflict in #23260, I found two issues here:

  1. What output do we expect when the env has more than one matching key? This looks like it always uses the last key, which is nondeterministic if there is more than one matching key.

  2. Here user is not a value but a (key, value) pair, so we need to use either the key or the value (I guess we would like to pick the value).

Author

Thanks @HeartSaVioR for finding and reporting it; it was my mistake, I am sorry for that. I have created PR #23659 to fix the issue.
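
To illustrate observation 2 above (just the destructuring, not necessarily the fix taken in #23659): filterKeys yields (key, value) pairs, so the value has to be extracted explicitly.

    // `user` below is bound to the value of the (key, value) pair.
    sys.env.filterKeys(_.endsWith("USER")).foreach { case (_, user) =>
      // build the log urls for `user` here
    }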

jackylee-ch pushed a commit to jackylee-ch/spark that referenced this pull request Feb 18, 2019
…ent mode

## What changes were proposed in this pull request?

Providing a new configuration "spark.yarn.un-managed-am" (defaults to false) to enable the Unmanaged AM Application in Yarn Client mode which launches the Application Master service as part of the Client. It utilizes the existing code for communicating between the Application Master <-> Task Scheduler for the container requests/allocations/launch, and eliminates these,
1. Allocating and launching the Application Master container
2. Remote Node/Process communication between Application Master <-> Task Scheduler

## How was this patch tested?

I verified manually running the applications in yarn-client mode with "spark.yarn.un-managed-am" enabled, and also ensured that there is no impact to the existing execution flows.

I would like to hear others feedback/thoughts on this.

Closes apache#19616 from devaraj-kavali/SPARK-22404.

Authored-by: Devaraj K <[email protected]>
Signed-off-by: Marcelo Vanzin <[email protected]>