Skip to content

Conversation

@BryanCutler
Copy link
Member

When an Executor process is destroyed, the FileAppender that is asynchronously reading the stderr stream of the process can throw an IOException during read because the stream is closed. Before the ExecutorRunner destroys the process, the FileAppender thread is flagged to stop. This PR wraps the inputStream.read call of the FileAppender in a try/catch block so that if an IOException is thrown and the thread has been flagged to stop, it will safely ignore the exception. Additionally, the FileAppender thread was changed to use Utils.tryWithSafeFinally to better log any exception that do occur. Added unit tests to verify a IOException is thrown and logged if FileAppender is not flagged to stop, and that no IOException when the flag is set.

@SparkQA
Copy link

SparkQA commented Jan 12, 2016

Test build #49198 has finished for PR 10714 at commit dc65737.

  • This patch fails Scala style tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA
Copy link

SparkQA commented Jan 12, 2016

Test build #49219 has finished for PR 10714 at commit 20d8e7b.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think it's worth changing as this is fine, but I think you could just write

case _: IOException if markedForStop => // do nothing because ...

... but maybe I'm missing something about how that works.

@BryanCutler
Copy link
Member Author

Thanks @srowen , I think your suggestion makes it easier to follow what is going on, so I made that change. I also changed the check for appending for only if positive bytes are read, which will just save a call to write if n is 0.

@SparkQA
Copy link

SparkQA commented Jan 12, 2016

Test build #49256 has finished for PR 10714 at commit 4684a58.

  • This patch fails PySpark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@BryanCutler
Copy link
Member Author

jenkins retest this please

@SparkQA
Copy link

SparkQA commented Jan 13, 2016

Test build #49263 has finished for PR 10714 at commit 4684a58.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@srowen
Copy link
Member

srowen commented Jan 13, 2016

LGTM

@asfgit asfgit closed this in 56cdbd6 Jan 14, 2016
asfgit pushed a commit that referenced this pull request Jan 14, 2016
When an Executor process is destroyed, the FileAppender that is asynchronously reading the stderr stream of the process can throw an IOException during read because the stream is closed.  Before the ExecutorRunner destroys the process, the FileAppender thread is flagged to stop.  This PR wraps the inputStream.read call of the FileAppender in a try/catch block so that if an IOException is thrown and the thread has been flagged to stop, it will safely ignore the exception.  Additionally, the FileAppender thread was changed to use Utils.tryWithSafeFinally to better log any exception that do occur.  Added unit tests to verify a IOException is thrown and logged if FileAppender is not flagged to stop, and that no IOException when the flag is set.

Author: Bryan Cutler <[email protected]>

Closes #10714 from BryanCutler/file-appender-read-ioexception-SPARK-9844.

(cherry picked from commit 56cdbd6)
Signed-off-by: Sean Owen <[email protected]>
@srowen
Copy link
Member

srowen commented Jan 14, 2016

Merged to master, 1.6

@BryanCutler
Copy link
Member Author

Thanks!

@srowen
Copy link
Member

srowen commented Jan 15, 2016

@yhuai it's also possibly my cherry-pick to 1.6 was somehow not correct:
0c67993

If it's failing consistently let's just revert it for 1.6. If it's not consistent, give Bryan a chance to look first I guess.

@yhuai
Copy link
Contributor

yhuai commented Jan 15, 2016

yea no problem. Will keep an eye on the test.

@BryanCutler
Copy link
Member Author

Ah, I know what's going on @yhuai and @srowen . The cherry pick is fine, but I had SPARK-12701 merged earlier because of this reason, and it just went into master. Since the thread isn't properly joined, the test can exit before the log is written, and that is what the test is checking. Can we get SPARK-12701 also on 1.6?

@yhuai
Copy link
Contributor

yhuai commented Jan 15, 2016

@srowen @zsxwing Is it safe to merge https://issues.apache.org/jira/browse/SPARK-12701 to branch 1.6? Or, we should revert it from branch 1.6?

@srowen
Copy link
Member

srowen commented Jan 15, 2016

I think it is safe to back-port. I'll do that now.

@BryanCutler
Copy link
Member Author

Yeah, it should be safe to backport, only the tests actually call awaitTermination for this thread.

@BryanCutler BryanCutler deleted the file-appender-read-ioexception-SPARK-9844 branch February 29, 2016 18:01
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants