
Conversation

@SOmeONee

What changes were proposed in this pull request?

When using Spark Streaming, --total-executor-cores must be greater than 1.
This pull request adds validation of spark.cores.max.
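A rough sketch of the kind of check being proposed (a hedged illustration only: the helper name, placement, and error message are assumptions; only the guarded read of spark.cores.max appears in the diff discussed below):

import org.apache.spark.{SparkConf, SparkException}

// Hypothetical helper, not the exact patch: fail fast when spark.cores.max
// is set but leaves no core for processing once a receiver takes one.
def validateTotalCores(conf: SparkConf): Unit = {
  if (conf.contains("spark.cores.max")) {
    val totalCores = conf.getInt("spark.cores.max", 1)
    if (totalCores <= 1) {
      throw new SparkException(
        "spark.cores.max must be greater than 1 for a receiver-based " +
          "streaming application")
    }
  }
}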

How was this patch tested?

manual tests:
--total-executor-cores 1
--total-executor-cores 2

Please review http://spark.apache.org/contributing.html before opening a pull request.

@AmplabJenkins

Can one of the admins verify this patch?

@srowen
Member

srowen commented Aug 15, 2017

I don't think that's valid. It doesn't catch all cases where you need > 1 core, and isn't necessary for cases that don't involve receivers, I think.

@SOmeONee
Author

Take the NetworkWordCount example: if it is run with '--total-executor-cores 1', it cannot succeed.
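For context, NetworkWordCount is receiver-based; a trimmed-down sketch of that example (simplified here, with the host and port as placeholders) shows where the single core goes:

import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}

// Simplified from the NetworkWordCount example; host/port are placeholders.
val sparkConf = new SparkConf().setAppName("NetworkWordCount")
val ssc = new StreamingContext(sparkConf, Seconds(1))

// socketTextStream creates a receiver, which occupies one core for the whole
// run; with --total-executor-cores 1 no core is left for the batch jobs.
val lines = ssc.socketTextStream("localhost", 9999)
lines.flatMap(_.split(" ")).map(word => (word, 1)).reduceByKey(_ + _).print()

ssc.start()
ssc.awaitTermination()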

@srowen
Member

srowen commented Aug 15, 2017

That's not the problem -- I'm saying there are cases where 1 is valid. There are obviously cases where it isn't. But your change makes them all fail.

@SOmeONee
Author

OK, sorry, I'll review it, thanks.

}

if (sc.conf.contains("spark.cores.max")) {
  val totalCores = sc.conf.getInt("spark.cores.max", 1)
Contributor

AFAIK, "spark.cores.max" is not set by default; with your change it will always return 1 when it is not set and then throw the exception you wrote.

Besides, if there are several receivers in a streaming application, even > 1 is not sufficient. So simply checking "spark.cores.max" is not a feasible way to address the issue, and there may not be a good way to handle this problem properly, so I'd suggest leaving the current code as it is.
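To illustrate the several-receivers point (a hedged sketch that builds on the NetworkWordCount snippet above; the hosts and ports are placeholders): each receiver pins a core for the lifetime of the application, so an application with N receivers needs more than N cores, and the receiver count is not visible in the configuration.

// Two socketTextStream calls create two receivers, each occupying a core;
// this application needs at least 3 cores in total, so checking
// spark.cores.max > 1 alone would still let it hang.
val streams = Seq(
  ssc.socketTextStream("host1", 9999),
  ssc.socketTextStream("host2", 9999)
)
val words = ssc.union(streams).flatMap(_.split(" "))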

Contributor

The config spark.cores.max is used to limit the max number of cores that a single executor can require, and as @jerryshao pointed out, it's not set by default. I'm not convinced that validating this config would help with your issue.

Author

@jerryshao spark.cores.max is only read when the conf contains it, so the default of 1 is never actually used.
This is the best chance we have to make a judgment here.

Contributor

jerryshao commented Aug 17, 2017

@SOmeONee I'm not quite following your comment here. You assume that the default value is "1", which changes the semantics of this configuration. What's more, checking whether it is "<= 1" is not sufficient, as I mentioned above.

Contributor

@jiangxb1987 "spark.cores.max" is a per-application configuration that limits the number of cores that can be requested for the application; it is not a per-executor limitation.

The config spark.cores.max is used to limit the max number of cores that a single executor can require

So if we have 2 receivers in one streaming application, the minimum should be > 2; checking against "1" here is still not feasible.

Since the number of receivers can only be known at run time, checking the configuration will not work as expected.

Contributor

Yeah, it seems the validation here is not very useful.

@SOmeONee
Author

@srowen I reviewed again and ran the tests; there are no test cases with spark.cores.max=1 in streaming.

@jerryshao
Contributor

The patch here is not solid; we will not merge it unless you have a better solution.

srowen added a commit to srowen/spark that referenced this pull request Sep 12, 2017
srowen mentioned this pull request Sep 12, 2017
asfgit closed this in dd88fa3 Sep 13, 2017
zifeif2 pushed a commit to zifeif2/spark that referenced this pull request Nov 22, 2025
Closes apache#18522
Closes apache#17722
Closes apache#18879
Closes apache#18891
Closes apache#18806
Closes apache#18948
Closes apache#18949
Closes apache#19070
Closes apache#19039
Closes apache#19142
Closes apache#18515
Closes apache#19154
Closes apache#19162
Closes apache#19187
Closes apache#19091

Author: Sean Owen <[email protected]>

Closes apache#19203 from srowen/CloseStalePRs3.