-
Notifications
You must be signed in to change notification settings - Fork 28.9k
[SPARK-14454] Better exception handling while marking tasks as failed #12234
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
Test build #55196 has finished for PR 12234 at commit
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
so the thing is this exception doesn't make it to the driver. it would be great if the error message that made it to the driver can contain the error for both, and the original exception's cause. then users know there is another exception that failed during close/callback, and they can go look up in the executor for the full stacktrace
eebd2ef to
7964b2d
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nice - didn't know this existed
|
Test build #55197 has finished for PR 12234 at commit
|
7964b2d to
94b37f2
Compare
|
Test build #55198 has finished for PR 12234 at commit
|
94b37f2 to
c9aaff0
Compare
| * fail as well. This would then suppress the original/likely more meaningful | ||
| * exception from the original `out.write` call. | ||
| */ | ||
| def tryWithSafeCatchAndFailureCallbacks[T](block: => T)(catchBlock: => Unit): T = { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This seems redundant with the method above; they can be unified right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes, thanks. folded these changes in the method above
|
Test build #55199 has finished for PR 12234 at commit
|
|
@davies are there other occurrences of this pattern? |
|
test this please |
|
Test build #55297 has finished for PR 12234 at commit
|
|
Test build #55323 has finished for PR 12234 at commit
|
|
LGTM, |
|
@sameeragarwal can you create one for 1.6 backport? |
…failed Backports #12234 to 1.6. Original description below: ## What changes were proposed in this pull request? This patch adds support for better handling of exceptions inside catch blocks if the code within the block throws an exception. For instance here is the code in a catch block before this change in `WriterContainer.scala`: ```scala logError("Aborting task.", cause) // call failure callbacks first, so we could have a chance to cleanup the writer. TaskContext.get().asInstanceOf[TaskContextImpl].markTaskFailed(cause) if (currentWriter != null) { currentWriter.close() } abortTask() throw new SparkException("Task failed while writing rows.", cause) ``` If `markTaskFailed` or `currentWriter.close` throws an exception, we currently lose the original cause. This PR fixes this problem by implementing a utility function `Utils.tryWithSafeCatch` that suppresses (`Throwable.addSuppressed`) the exception that are thrown within the catch block and rethrowing the original exception. ## How was this patch tested? No new functionality added Author: Sameer Agarwal <[email protected]> Closes #12272 from sameeragarwal/fix-exception-1.6.
…failed Backports apache#12234 to 1.6. Original description below: ## What changes were proposed in this pull request? This patch adds support for better handling of exceptions inside catch blocks if the code within the block throws an exception. For instance here is the code in a catch block before this change in `WriterContainer.scala`: ```scala logError("Aborting task.", cause) // call failure callbacks first, so we could have a chance to cleanup the writer. TaskContext.get().asInstanceOf[TaskContextImpl].markTaskFailed(cause) if (currentWriter != null) { currentWriter.close() } abortTask() throw new SparkException("Task failed while writing rows.", cause) ``` If `markTaskFailed` or `currentWriter.close` throws an exception, we currently lose the original cause. This PR fixes this problem by implementing a utility function `Utils.tryWithSafeCatch` that suppresses (`Throwable.addSuppressed`) the exception that are thrown within the catch block and rethrowing the original exception. ## How was this patch tested? No new functionality added Author: Sameer Agarwal <[email protected]> Closes apache#12272 from sameeragarwal/fix-exception-1.6. (cherry picked from commit c12db0d)
What changes were proposed in this pull request?
This patch adds support for better handling of exceptions inside catch blocks if the code within the block throws an exception. For instance here is the code in a catch block before this change in
WriterContainer.scala:If
markTaskFailedorcurrentWriter.closethrows an exception, we currently lose the original cause. This PR fixes this problem by implementing a utility functionUtils.tryWithSafeCatchthat suppresses (Throwable.addSuppressed) the exception that are thrown within the catch block and rethrowing the original exception.How was this patch tested?
No new functionality added