[SPARK-14069][SQL] Improve SparkStatusTracker to also track executor information #11888
|
cc @rxin |
|
Test build #53775 has finished for PR 11888 at commit
|
|
I found it difficult to write tests for this. Since it only collects information that the system already exposes, is it worth testing again? cc @rxin |
|
It's probably ok. |
|
Test build #53884 has finished for PR 11888 at commit
|
|
retest this please. |
|
Test build #53904 has finished for PR 11888 at commit
|
|
retest this please. |
|
Test build #53921 has finished for PR 11888 at commit
|
|
retest this please |
|
Test build #53925 has finished for PR 11888 at commit
|
/**
 * Returns a list of all known executors, represented by string with format: "host:port"
 */
def getExecutors(): Array[String] = {
It seems kind of arbitrary that getExecutors returns host:port but not IDs. I think it's better that we make a SparkExecutorInfo or something and expose the host:port there, along with other things like cache size, numRunningTasks etc. Then in the future we can add more things we want to expose without tying ourselves with the host:port identifier.
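To make the suggestion above concrete, here is a minimal sketch of what such an info record could look like. The names `ExecutorInfoSketch`, `ExecutorInfoSketchImpl`, and `ExecutorInfoDemo` are hypothetical and chosen for illustration only; this is not the interface that was merged, and the sample values are made up.

```scala
// Hypothetical sketch only: these names are illustrative, not Spark's API.
trait ExecutorInfoSketch {
  def host: String
  def port: Int
  def cacheSize: Long       // assumed to be a size in bytes
  def numRunningTasks: Int
}

// Simple immutable implementation; the case class fields implement the trait members.
final case class ExecutorInfoSketchImpl(
    host: String,
    port: Int,
    cacheSize: Long,
    numRunningTasks: Int) extends ExecutorInfoSketch

object ExecutorInfoDemo {
  // The "host:port" string can be derived from the richer record when needed.
  def hostPort(info: ExecutorInfoSketch): String = s"${info.host}:${info.port}"

  // Made-up sample data for demonstration.
  val sample: Array[ExecutorInfoSketch] = Array(
    ExecutorInfoSketchImpl("10.0.0.1", 40001, 1024L, 2),
    ExecutorInfoSketchImpl("10.0.0.2", 40002, 0L, 0))
}
```

With a record like this, new fields can be added later without changing what callers use as an identifier, which is the point the comment is making.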
If I just want a list of executors, why shouldn't I be able to get them? I think it makes sense to have a more detailed version (maybe replacing the following 2), but having a simple one that returns just the list of executors seems to make sense too.
The other thing is I don't know if we want to query the scheduler every time we want a list of executors.
We can expose both, but I'd rather call it something more explicit, like getExecutorHostPort. Elsewhere in Spark I would expect getExecutors: Array[String] to return the executor IDs.
getExecutorList
getExecutorList LGTM, I'll rename it to that.
Wait, how is getExecutorList different from getExecutors? Why not just be more specific about what the strings are?
actually let me move this to the main thread so it doesn't get collapsed.
|
Test build #54526 has finished for PR 11888 at commit
|
|
@rxin I don't understand the distinction between The other problem is the keys to the other maps are also expected to be |
|
My proposal: This is more consistent with the existing status API, where we have things like |
|
What are executor ids? Is that even an external concept? |
|
Can you paste me what an executor looks like? If you just tell me "executor id" as an end user, I have no clue what you are talking about. |
|
|
Yea that integer id is completely useless to users who want to figure out what to do with their clusters. |
|
OK, @rxin and I discussed this more offline. Our proposal is: Then we don't need to tie us down with the very specific |
|
Test build #54577 has finished for PR 11888 at commit
|
|
LGTM retest this please |
|
Test build #54588 has finished for PR 11888 at commit
|
|
Merged into master. Thanks, guys. |
|
This looks somewhat dodgy to me from a thread-safety perspective since |
// Number of tasks running on each executor
private val executorIdToTaskCount = new HashMap[String, Int]

def runningTasksByExecutors(): Map[String, Int] = executorIdToTaskCount.toMap
Adding a synchronized here would resolve the thread-safety issue, I think. I'll do this as part of a patch fixing another bug and also touching this line.
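A rough sketch of the fix described above, assuming the map is only ever mutated while holding the same lock. `TaskCountTracker`, `taskStarted`, and `taskFinished` are hypothetical stand-ins for the scheduler-side bookkeeping, not Spark's actual class or methods; only `executorIdToTaskCount` and `runningTasksByExecutors` come from the snippet under review.

```scala
import scala.collection.mutable.HashMap

// Hypothetical stand-in for the scheduler bookkeeping; not Spark's real class.
class TaskCountTracker {
  // Number of tasks running on each executor
  private val executorIdToTaskCount = new HashMap[String, Int]

  // Writers take the instance lock before mutating the map.
  def taskStarted(executorId: String): Unit = synchronized {
    executorIdToTaskCount(executorId) =
      executorIdToTaskCount.getOrElse(executorId, 0) + 1
  }

  // Decrement the count for a known executor, never going below zero.
  def taskFinished(executorId: String): Unit = synchronized {
    executorIdToTaskCount.get(executorId).foreach { n =>
      executorIdToTaskCount(executorId) = math.max(0, n - 1)
    }
  }

  // The immutable copy is taken under the same lock, so a reader on another
  // thread never iterates the mutable map while it is being modified.
  def runningTasksByExecutors(): Map[String, Int] = synchronized {
    executorIdToTaskCount.toMap
  }
}
```

The lock only helps if every writer uses it too; synchronizing the read alone would still leave the copy racing with unsynchronized updates.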
|
Why did we merge this when the description says "N/A"? |
|
@rxin Do you mean the N/A in "How was this patch tested?" Some guy said that the lack of tests was ok. #11888 (comment) |
|
Yea but "TODO: tests" and tests: N/A ... We needed to update the description. |



What changes were proposed in this pull request?
Track executor information like host and port, cache size, running tasks.
How was this patch tested?
manual test