[SPARK-6449][YARN] Report failure status if driver throws exception #5130
Conversation
e.g., an OutOfMemoryError on the driver was leading to the application reporting SUCCESS to the history server and to the YARN RM.
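The fix targets the pattern of invoking the user's `main` reflectively and translating anything it throws into a failed final status. A minimal sketch of that idea (hypothetical names and status strings; the real `ApplicationMaster` code differs):

```scala
import java.lang.reflect.{InvocationTargetException, Method}

// Hypothetical sketch of the reporting logic this PR targets. The status
// strings stand in for YARN's FinalApplicationStatus values.
object DriverRunner {
  def run(target: AnyRef, mainMethod: Method, args: Array[String]): String = {
    try {
      // Scala wraps `args` as the single argument to main(Array[String]).
      mainMethod.invoke(target, args)
      "SUCCEEDED"
    } catch {
      // Reflection wraps anything thrown by the target method, including
      // Errors such as OutOfMemoryError, in InvocationTargetException.
      case e: InvocationTargetException =>
        println(s"Driver failed: ${e.getCause}")
        "FAILED"
    }
  }
}
```

The key point is catching `InvocationTargetException` and inspecting `getCause`, since a bare `catch { case e: Exception => ... }` around `invoke` would still see the wrapper, not the driver's original throwable.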
Test build #28970 has started for PR 5130 at commit

Test build #28970 has finished for PR 5130 at commit

Test PASSed.
@tgravescs - can you take a look?
@ryan-williams - can you add the JIRA ticket to the title, and add "YARN", e.g. "[SPARK-1234][YARN]"? Thanks.
oh yeah, sorry, I forgot to do that @rxin
If the driver throws an exception, the exception will be the cause of an `InvocationTargetException`.
As @zsxwing says, it appears that the code is already trying to handle this case. Also, how were we ending up with a success before? If anything forced us to break out of that try block, it seems like we wouldn't call the code that reports success. Last, what if we run into an `Error` rather than an `Exception`?
It will wrap an `Error`, too. I ran the following code on my machine,

```scala
import scala.collection.mutable.ArrayBuffer

class Foo {}

object Foo {
  def main(args: Array[String]): Unit = {
    val a = ArrayBuffer[String]()
    while (true) {
      a += "111111111111111111111111111111"
    }
  }
}

object Bar {
  def main(args: Array[String]): Unit = {
    val mainMethod = classOf[Foo].getMethod("main", classOf[Array[String]])
    mainMethod.invoke(null, null)
  }
}
```

and it outputs,
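The same wrapping behavior can be observed without exhausting the heap by throwing an `Error` directly; the names below are illustrative, not from the PR:

```scala
import java.lang.reflect.InvocationTargetException

// Illustrative only: shows that Method.invoke wraps thrown Errors
// (not just Exceptions) in InvocationTargetException.
class Thrower {
  def boom(): Unit = throw new OutOfMemoryError("simulated, not a real OOM")
}

object WrapDemo {
  // Returns the class of the throwable that reflection wrapped,
  // or null if the invoked method returned normally.
  def causeOf(target: AnyRef, methodName: String): Class[_] = {
    val m = target.getClass.getMethod(methodName)
    try {
      m.invoke(target)
      null
    } catch {
      case e: InvocationTargetException => e.getCause.getClass
    }
  }
}
```

So any handler that wants the driver's real failure must look at `getCause` of the `InvocationTargetException`.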
This has been fixed in #4773
After #4773, it should not end up with a success for OutOfMemoryError. However, in my experience, if the user doesn't create a SparkContext but the driver exits normally, the application is still reported as a failure, since the AM does not set a successful final status in that case.
@ryan-williams Are you seeing this exception with Spark 1.3, or with an older version (i.e., PR #4773 didn't fix this particular issue)?
@zsxwing Can you clarify this? Are you running something that never starts a SparkContext? I'm not sure what you mean by "the user doesn't create a spark context but the driver exits normally."
E.g., my application may check some folders at first. If they exist, it will create a SparkContext and run the jobs; if not, it exits normally without ever creating one.
That case is basically not handled right now. We expect one of the first things the application does is create the SparkContext, which is why the AM waits for the SparkContext to be initialized. Anything you do in your program before that initialization is relying on the fact that we wait a certain period for it to be initialized, and if you never create it, we consider that a failure. Seems like more of a thing a workflow manager should be doing, but if you want to handle that case I suggest filing a separate JIRA.
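The wait-for-initialization behavior described above can be sketched with a latch and a timeout. This is a hypothetical illustration of the pattern, not Spark's actual AM code:

```scala
import java.util.concurrent.{CountDownLatch, TimeUnit}

// Hypothetical sketch of "wait a certain period for the SparkContext to be
// initialized, else treat the application as failed".
class ContextWaiter(timeoutMs: Long) {
  private val latch = new CountDownLatch(1)

  // The driver thread calls this once its SparkContext is up.
  def notifyInitialized(): Unit = latch.countDown()

  // The AM thread blocks here and decides the status it would report:
  // SUCCEEDED if initialization was signaled in time, FAILED otherwise.
  def awaitStatus(): String =
    if (latch.await(timeoutMs, TimeUnit.MILLISECONDS)) "SUCCEEDED" else "FAILED"
}
```

Under this model a driver that exits (even cleanly) without ever signaling initialization times out and is reported as a failure, which matches the behavior @zsxwing observed.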
@tgravescs Thanks for the clarification |