-
Notifications
You must be signed in to change notification settings - Fork 28.9k
[SPARK-4989][CORE] avoid wrong eventlog conf cause cluster down in standalone mode #3824
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
Test build #24858 has started for PR 3824 at commit
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The word Relay was correct here
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, you are right, relay is correct, replay is not correct, thanks.
|
Test build #24861 has started for PR 3824 at commit
|
|
Test build #24858 has finished for PR 3824 at commit
|
|
Test PASSed. |
|
Test build #24861 has finished for PR 3824 at commit
|
|
Test PASSed. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It looks like eventLogFile is only read from inside the try block on the following line, so why not move it inside and make it into a val instead?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi @JoshRosen , eventLogFile is also used in catch block in this file.
|
This change seems okay to me overall, aside from one minor nit. Most of the change is just broadening the scope of the |
|
Test build #25025 has started for PR 3824 at commit
|
|
Test build #25025 has finished for PR 3824 at commit
|
|
Test PASSed. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
as commented elsewhere, I don't think logging the stack trace here is particularly useful
|
Test build #25199 has started for PR 3824 at commit
|
|
Test build #25199 has finished for PR 3824 at commit
|
|
Test PASSed. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
great. This used to say something like
No event logs found for application X in /history/not-found
which doesn't make sense.
|
Ok LGTM I'm merging this into master thanks |
|
@andrewor14 , I received an email of your comment about creating other PRs to fix this issue for other older branches, but not found on this page. I think you might have removed that comment, so do I still need to make new PRs or just ignore that message? |
|
Yes that would be great if you have the time. It seems that not all of the changes in this PR are applicable there, however. |
|
ok, I'll make new PRs for those old branches 1.0, 1.1, and 1.2. |
when enabling eventlog in standalone mode, if give the wrong configuration, the standalone cluster will down (cause master restart, lose connection with workers).
How to reproduce: just give an invalid value to "spark.eventLog.dir", for example: spark.eventLog.dir=hdfs://tmp/logdir1, hdfs://tmp/logdir2. This will throw illegalArgumentException, which will cause the Master restart. And the whole cluster is not available.