
Conversation

@windkit (Contributor) commented Oct 17, 2017

What changes were proposed in this pull request?

To limit the amount of resources a Spark job accepts from Mesos, we can currently only use spark.cores.max, which limits in terms of CPU cores.
However, when executors have large memory footprints, a job can still consume all of the cluster's memory.

This PR adds a spark.mem.max option for Mesos.
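For example, a hypothetical configuration combining the existing cores cap with the new memory cap might look like the sketch below (whether spark.mem.max accepts a size suffix such as 512g depends on the final implementation; all values and the master URL are illustrative):

```scala
import org.apache.spark.SparkConf

// Hypothetical values; spark.mem.max is the property proposed in this PR.
val conf = new SparkConf()
  .setMaster("mesos://zk://mesos-master:2181/mesos")  // illustrative master URL
  .set("spark.cores.max", "64")     // cap on total CPU cores acquired across the cluster
  .set("spark.mem.max", "512g")     // cap on total executor memory acquired across the cluster
  .set("spark.executor.cores", "4")
  .set("spark.executor.memory", "8g")
```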

How was this patch tested?

Added unit tests.

Contributor (review comment)

remove space

@skonto (Contributor) commented Oct 18, 2017

@windkit there is an open issue here: https://issues.apache.org/jira/browse/SPARK-22133
Could you please add documentation for the new property along with the old ones?

@skonto (Contributor) commented Oct 18, 2017 (review comment)

Can we defend against minimum values and fail fast? For example, the default executor memory footprint is about 1.4 GB. We could calculate the value returned by MesosSchedulerUtils.executorMemory. I don't think these values calculated in canLaunchTask ever change. The same applies to cpus.
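A back-of-the-envelope sketch of where that ~1.4 GB figure comes from, assuming the overhead rule of max(384 MB, 10% of spark.executor.memory) that MesosSchedulerUtils.executorMemory is understood to apply to the 1g default:

```scala
// Back-of-the-envelope only; mirrors the assumed Mesos overhead rule.
val executorMemoryMb = 1024                                     // spark.executor.memory default (1g)
val overheadMb       = math.max(384, (0.10 * executorMemoryMb).toInt)
val minOfferMb       = executorMemoryMb + overheadMb            // 1024 + 384 = 1408 MB ≈ 1.4 GB
```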

Contributor Author (review comment)

@skonto
For cpus, I think we can compare against minCoresPerExecutor.
For mem, we can call MesosSchedulerUtils.executorMemory to get the minimum requirement.

Then, here, we parse the option, check the minimum, and throw an exception if it is too small?
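A minimal sketch of the fail-fast validation being discussed, assuming the parsed spark.cores.max / spark.mem.max values are passed in; the parameter names are illustrative and do not correspond to actual fields in MesosCoarseGrainedSchedulerBackend:

```scala
// Hypothetical fail-fast check; names are illustrative.
def validateMaxResources(
    maxCores: Option[Int],               // parsed spark.cores.max, if set
    maxMemMb: Option[Int],               // parsed spark.mem.max, if set
    minCoresPerExecutor: Int,            // spark.executor.cores
    minMemPerExecutorMb: Int): Unit = {  // MesosSchedulerUtils.executorMemory(sc)
  maxMemMb.foreach { m =>
    require(m >= minMemPerExecutorMb,
      s"spark.mem.max ($m MB) is smaller than the memory needed for one executor " +
      s"($minMemPerExecutorMb MB), so no executor could ever be launched")
  }
  maxCores.foreach { c =>
    require(c >= minCoresPerExecutor,
      s"spark.cores.max ($c) is smaller than spark.executor.cores ($minCoresPerExecutor)")
  }
}
```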

@skonto (Contributor) commented Oct 20, 2017 (review comment)

An exception would be OK, I think; the idea is that if something is never going to work, we should let the user know, especially the novice user. In general, the minimum check could just be a warning if we don't want an exception thrown.

Contributor Author (review comment)

Should I add the check in this PR or in a separate one?

Review comment

Do we need to have a check similar to

but for memory, so that we know we'll "land" on the maximum?
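A rough sketch of what such a memory check could look like, mirroring the existing cores bookkeeping; the names here (totalMemAcquiredMb, maxMemMb, and the function itself) are illustrative, not the fields actually used in the scheduler backend:

```scala
// Hypothetical budget check, mirroring the existing cores bookkeeping.
// Returns true only if accepting this offer keeps the job within both caps.
def fitsWithinCaps(
    offerMemMb: Int, offerCpus: Int,
    execMemMb: Int, execCores: Int,
    totalMemAcquiredMb: Int, maxMemMb: Int,
    totalCoresAcquired: Int, maxCores: Int): Boolean = {
  offerMemMb >= execMemMb && offerCpus >= execCores &&
    totalMemAcquiredMb + execMemMb <= maxMemMb &&   // never overshoot spark.mem.max
    totalCoresAcquired + execCores <= maxCores      // never overshoot spark.cores.max
}
```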

@windkit (Contributor Author) commented Oct 20, 2017

@skonto Sure, I can add those in. Can you point me to where the documentation source is?

@skonto (Contributor) commented Oct 20, 2017

@skonto (Contributor) commented Oct 20, 2017

@susanxhuynh please review.

@susanxhuynh (Contributor) commented

@windkit Trying to understand the need for this config ... could you accomplish the same thing by setting spark.cores.max, spark.executor.cores, and spark.executor.memory? Could you give an example scenario where this is needed?

Contributor (review comment)

Typo: 'park'

Contributor (review comment)

Typo: 'park'

Contributor (review comment)

Typo: 'park'

Contributor (review comment)

Maybe add "across the cluster (not from each machine)". And add something about there being no maximum if this property is not set.

Contributor (review comment)

My suggestion: "Duration for which unused resources are considered declined, when the maximum number of cores (spark.cores.max) has been reached."
@ArtRand Is this the documentation you had in mind in https://issues.apache.org/jira/browse/SPARK-22133? Is this enough information for a non-Mesos expert to set this?

@windkit (Contributor Author) commented Nov 1, 2017

@susanxhuynh Thanks for reviewing.
I want to use both spark.mem.max and spark.cores.max to limit the resources one job can use within the cluster.
I am currently setting up a shared cluster for several users; they are allowed to configure spark.executor.cores and spark.executor.memory according to their needs, so I need a limit on both CPU cores and memory.
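As an illustration of how the two caps interact (all numbers below are hypothetical, and the overhead rule is assumed), the effective number of executors is bounded by whichever budget runs out first:

```scala
// Illustrative arithmetic only; all values are hypothetical.
val maxCores  = 64           // spark.cores.max, set by the cluster admin
val maxMemMb  = 128 * 1024   // spark.mem.max, set by the cluster admin
val execCores = 4            // spark.executor.cores, chosen by the user
val execMemMb = 8 * 1024     // spark.executor.memory, chosen by the user

// Assumed Mesos overhead rule: max(384 MB, 10% of executor memory).
val execFootprintMb = execMemMb + math.max(384, (0.10 * execMemMb).toInt)

// Executors stop being launched once either budget is exhausted.
val maxExecutors = math.min(maxCores / execCores, maxMemMb / execFootprintMb)  // min(16, 14) = 14
```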

asfgit pushed a commit that referenced this pull request on Nov 8, 2017:

## What changes were proposed in this pull request?
Documentation about Mesos Reject Offer Configurations

## Related PR
#19510 for `spark.mem.max`

Author: Li, YanKit | Wilson | RIT <[email protected]>

Closes #19555 from windkit/spark_22133.
@skonto (Contributor) commented May 19, 2018

@windkit Could you update the PR? @susanxhuynh I think this is fine to merge, are you OK with it?

@AmplabJenkins commented
Can one of the admins verify this patch?

@HyukjinKwon (Member) commented

gentle ping @windkit
