-
Notifications
You must be signed in to change notification settings - Fork 28.9k
[SPARK-4832][Deploy]some other processes might take the daemon pid #3683
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
Test build #24397 has started for PR 3683 at commit
|
|
Test build #24397 has finished for PR 3683 at commit
|
|
Test PASSed. |
sbin/spark-daemon.sh
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree this looks like a better test, but isn't it redundant with kill?
Or: in case you have detected that the PID file existed but isn't the expected process, log a message?
You could use ps -p for consistency in both cases.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yep it is redundant with kill. And I will handle the case that we might kill other processes by mistake.
|
Test build #24405 has started for PR 3683 at commit
|
|
Test build #24406 has started for PR 3683 at commit
|
|
Test build #24405 has finished for PR 3683 at commit
|
|
Test PASSed. |
|
Test build #24406 has finished for PR 3683 at commit
|
|
Test PASSed. |
|
@srowen About the process isn't the one we expected, is it necessary to log a message? Ignoring it would be better cause it doesn't do any hurt I think. |
|
It means the process didn't exit cleanly before, and the PID file is invalid. It can't hurt to note that. The PID file should probably be deleted too. |
|
If the process with the pid stored in pid file is not spark daemon, it cannot prove spark daemon didn't exist cleanly but just some other process take that pid after spark daemon ends. |
|
Hm, shouldn't the process or its script clean up a PID file? If the daemon is not running at that process ID, when would you want to leave a PID file saying it is? |
|
In case that user might start a daemon that is already running, we use pid file to check and give a message. Another case is that daemon might not exit after user stop it, according to the pid file we can prompt them. |
|
I think we can finish and merge this. The current logic of the script is "if anything is running at the PID given in the file, fail". Your change is "if the Spark daemon we expect is running at the PID given in the file, fail" -- meaning in the case where somehow the PID files wasn't cleaned up and another process has that PID, we continue. That's good. I had suggested cleaning up the PID file when it's known it's not the Spark daemon. But I see that in the start case, the PID file is overwritten anyway. In the stop case, it's not cleared, but, maybe that's not so important. So, I think that is actually already addressed. Could I get @nchammas to render an opinion on the bash syntax -- are we not doing back ticks? Otherwise I think it's a good one to merge. |
sbin/spark-daemon.sh
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Bash style: Use $(...) instead of backticks, which are discouraged.
Also, quote the output of the subshell, as well as the variable argument to cat:
TARGET_ID="$(cat "$pid")"
|
Generally looks good. Left a bunch of feedback about Bash style and protecting against word splitting bugs. |
|
Test build #27422 has started for PR 3683 at commit
|
|
@nchammas Updated as the feedback. Could you check it again, please? Thanks! |
|
LGTM Bash-wise. |
|
Test build #27422 has finished for PR 3683 at commit
|
|
Test PASSed. |
Some other processes might use the pid saved in pid file. In that case we should ignore it and launch daemons. JIRA is down for maintenance. I will file one once it return. Author: WangTaoTheTonic <[email protected]> Author: WangTaoTheTonic <[email protected]> Closes #3683 from WangTaoTheTonic/otherproc and squashes the following commits: daa86a1 [WangTaoTheTonic] some bash style fix 8befee7 [WangTaoTheTonic] handle the mistake scenario cf4ecc6 [WangTaoTheTonic] remove redundant condition f36cfb4 [WangTaoTheTonic] some other processes might take the pid (cherry picked from commit 1768bd5) Signed-off-by: Sean Owen <[email protected]>
Some other processes might use the pid saved in pid file. In that case we should ignore it and launch daemons. JIRA is down for maintenance. I will file one once it return. Author: WangTaoTheTonic <[email protected]> Author: WangTaoTheTonic <[email protected]> Closes #3683 from WangTaoTheTonic/otherproc and squashes the following commits: daa86a1 [WangTaoTheTonic] some bash style fix 8befee7 [WangTaoTheTonic] handle the mistake scenario cf4ecc6 [WangTaoTheTonic] remove redundant condition f36cfb4 [WangTaoTheTonic] some other processes might take the pid (cherry picked from commit 1768bd5) Signed-off-by: Sean Owen <[email protected]>
Some other processes might use the pid saved in pid file. In that case we should ignore it and launch daemons. JIRA is down for maintenance. I will file one once it return. Author: WangTaoTheTonic <[email protected]> Author: WangTaoTheTonic <[email protected]> Closes apache#3683 from WangTaoTheTonic/otherproc and squashes the following commits: daa86a1 [WangTaoTheTonic] some bash style fix 8befee7 [WangTaoTheTonic] handle the mistake scenario cf4ecc6 [WangTaoTheTonic] remove redundant condition f36cfb4 [WangTaoTheTonic] some other processes might take the pid (cherry picked from commit 1768bd5) Signed-off-by: Sean Owen <[email protected]>
Some other processes might use the pid saved in pid file. In that case we should ignore it and launch daemons.
JIRA is down for maintenance. I will file one once it return.