Commit ca66159

kasjain authored and Andrew Or committed
SPARK-5613: Catch the ApplicationNotFoundException exception to avoid thread from getting killed on yarn restart.
[SPARK-5613] Added a catch block to catch the ApplicationNotFoundException. Without this catch block the thread gets killed on occurrence of this exception. This Exception occurs when yarn restarts and tries to find an application id for a spark job which got interrupted due to yarn getting stopped. See the stacktrace in the bug for more details.

Author: Kashish Jain <[email protected]>

Closes apache#4392 from kasjain/branch-1.2 and squashes the following commits:

4831000 [Kashish Jain] SPARK-5613: Catch the ApplicationNotFoundException exception to avoid thread from getting killed on yarn restart.
1 parent b3872e0 commit ca66159

1 file changed: +9 -2 lines changed

yarn/src/main/scala/org/apache/spark/scheduler/cluster/YarnClientSchedulerBackend.scala

Lines changed: 9 additions & 2 deletions
@@ -20,6 +20,7 @@ package org.apache.spark.scheduler.cluster
 import scala.collection.mutable.ArrayBuffer
 
 import org.apache.hadoop.yarn.api.records.{ApplicationId, YarnApplicationState}
+import org.apache.hadoop.yarn.exceptions.ApplicationNotFoundException
 
 import org.apache.spark.{SparkException, Logging, SparkContext}
 import org.apache.spark.deploy.yarn.{Client, ClientArguments}
@@ -133,8 +134,14 @@ private[spark] class YarnClientSchedulerBackend(
       val t = new Thread {
         override def run() {
           while (!stopping) {
-            val report = client.getApplicationReport(appId)
-            val state = report.getYarnApplicationState()
+            var state: YarnApplicationState = null
+            try {
+              val report = client.getApplicationReport(appId)
+              state = report.getYarnApplicationState()
+            } catch {
+              case e: ApplicationNotFoundException =>
+                state = YarnApplicationState.KILLED
+            }
             if (state == YarnApplicationState.FINISHED ||
               state == YarnApplicationState.KILLED ||
               state == YarnApplicationState.FAILED) {
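
The pattern in this change (treat a "not found" lookup failure as a terminal KILLED state so the monitoring loop exits cleanly instead of the thread dying) can be illustrated in isolation. The following is a minimal sketch, not the Spark code: `AppState`, `AppNotFoundException`, `stateOrKilled`, `isTerminal`, and `monitor` are hypothetical stand-ins for `YarnApplicationState`, `ApplicationNotFoundException`, and the loop body, so that it runs without Hadoop on the classpath. The `maxPolls` bound is also an assumption added to keep the sketch finite.

```scala
object MonitorSketch {
  // Hypothetical stand-in for YarnApplicationState (only the states used here).
  sealed trait AppState
  case object Running extends AppState
  case object Finished extends AppState
  case object Killed extends AppState
  case object Failed extends AppState

  // Hypothetical stand-in for Hadoop's ApplicationNotFoundException.
  final class AppNotFoundException(msg: String) extends Exception(msg)

  // Returns the fetched state, mapping "application not found" to Killed
  // rather than letting the exception escape the caller's loop.
  def stateOrKilled(fetch: () => AppState): AppState =
    try fetch()
    catch { case _: AppNotFoundException => Killed }

  // Mirrors the FINISHED || KILLED || FAILED check in the diff.
  def isTerminal(state: AppState): Boolean = state match {
    case Finished | Killed | Failed => true
    case Running                    => false
  }

  // Polls up to maxPolls times and returns the first terminal state seen, if any.
  def monitor(fetch: () => AppState, maxPolls: Int): Option[AppState] =
    Iterator.continually(stateOrKilled(fetch)).take(maxPolls).find(isTerminal)
}
```

With a `fetch` that reports `Running` twice and then throws `AppNotFoundException`, `monitor` yields `Some(Killed)` on the third poll instead of propagating the exception. Exceptions of any other type still escape, which matches the narrow catch in the diff.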
