[SPARK-16441][YARN] Set maxNumExecutor depends on yarn cluster resources. #16819
Conversation
I don't think this is a necessary change. You already can't ask for more resources than the cluster has; the cluster won't grant them. Capping the value here means the app can't use more resources if the cluster suddenly gets more. I see the problem you're trying to solve, but the resource manager already ramps up requests slowly, so I don't think this is the issue.
Test build #72434 has finished for PR 16819 at commit
I agree. Resource managers already expect applications to request more than what's available, so we don't have to handle that again ourselves in Spark.
Applying this PR reduces the number of calls to CoarseGrainedSchedulerBackend.requestTotalExecutors(). The full log can be found here.
What problem does this solve, though? Calling that function is not a problem. It seems like you get the right behavior in both cases. Are you saying there's some RPC problem? The target goes very high, but as far as I can see it's correctly reflecting the fact that the app would use a lot of executors if it could -- that's fine.
@srowen The goal is to dynamically set spark.dynamicAllocation.maxExecutors based on cluster resources (see the description above). I added a unit test just now.
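(For reference, the target that drives those requestTotalExecutors() calls is essentially a function of the pending and running task counts, clamped by spark.dynamicAllocation.minExecutors/maxExecutors. A simplified sketch of that arithmetic, not the actual ExecutorAllocationManager source; all parameter names here are illustrative:)

```scala
// Illustrative only: a simplified version of the dynamic-allocation target.
// The parameters stand in for values the real ExecutorAllocationManager
// reads from its listener and from SparkConf.
def targetExecutors(
    pendingTasks: Int,
    runningTasks: Int,
    tasksPerExecutor: Int,
    minExecutors: Int,
    maxExecutors: Int): Int = {
  // Executors needed to run every known task at once, rounded up.
  val needed = (pendingTasks + runningTasks + tasksPerExecutor - 1) / tasksPerExecutor
  // Clamp to [minExecutors, maxExecutors]. With the default maxExecutors of
  // Int.MaxValue the upper bound effectively never applies, which is why the
  // requested target can climb very high on a large job.
  math.min(math.max(needed, minExecutors), maxExecutors)
}
```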
Test build #73147 has finished for PR 16819 at commit
Test build #73151 has finished for PR 16819 at commit
val defaultMaxNumExecutors = DYN_ALLOCATION_MAX_EXECUTORS.defaultValue.get
if (defaultMaxNumExecutors == sparkConf.get(DYN_ALLOCATION_MAX_EXECUTORS)) {
  val executorCores = sparkConf.getInt("spark.executor.cores", 1)
  val maxNumExecutors = yarnClient.getNodeReports().asScala.
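(The last expression in the excerpt is truncated. As a hedged guess at the shape of such a computation, not the PR's exact code, a node-report-based cap could look roughly like this, reusing the `yarnClient` and `executorCores` values above:)

```scala
// Hypothetical reconstruction: sum the vcores reported by the cluster's nodes
// and divide by the cores requested per executor to bound how many executors
// the cluster could host at this moment.
val totalClusterCores = yarnClient.getNodeReports().asScala
  .map(_.getCapability.getVirtualCores)
  .sum
val maxNumExecutors = math.max(1, totalClusterCores / executorCores)
```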
Shouldn't we take the queue's maxResources amount into account, from the ResourceManager REST APIs?
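(For reference, the ResourceManager's scheduler REST endpoint exposes per-queue capacity and usage information as JSON. A minimal sketch; the host and port are placeholders:)

```scala
import scala.io.Source

// Placeholder RM address; /ws/v1/cluster/scheduler describes each scheduler
// queue (capacities and current usage) as JSON.
val schedulerJson =
  Source.fromURL("http://resourcemanager:8088/ws/v1/cluster/scheduler").mkString
println(schedulerJson)
```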
Good suggestion. I will try the YarnClient API first. Pseudo-code:

import scala.collection.JavaConverters._
import org.apache.hadoop.yarn.client.api.YarnClient
import org.apache.hadoop.yarn.conf.YarnConfiguration

// Create and start a YARN client from the local Hadoop/YARN configuration.
val yarnConf = new YarnConfiguration()
val yarnClient = YarnClient.createYarnClient
yarnClient.init(yarnConf)
yarnClient.start()

// Returns the root-level queues (a java.util.List[QueueInfo]).
val rootQueues = yarnClient.getRootQueueInfos.asScala
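(A possible follow-up on that snippet, illustrative only; `rootQueues` is the value bound above:)

```scala
// Walk the root queues returned above and print their configured, current
// and maximum capacities; a real implementation would drill into the child
// queues of the queue the application was submitted to.
rootQueues.foreach { queue =>
  println(s"queue=${queue.getQueueName} " +
    s"capacity=${queue.getCapacity} " +
    s"current=${queue.getCurrentCapacity} " +
    s"max=${queue.getMaximumCapacity}")
  Option(queue.getChildQueues).foreach(_.asScala.foreach(child =>
    println(s"  child queue: ${child.getQueueName}")))
}
```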
Test build #73277 has finished for PR 16819 at commit
Test build #73282 has finished for PR 16819 at commit
I agree there's room for improvement in the current code; I even asked for SPARK-18769 to be filed to track that work. But I don't think setting the max to a fixed value at startup is the right approach. Queue configs change, node managers go up and down, new ones are added, old ones are removed. If this value ends up being calculated at the wrong time, the application will suffer. If you want to investigate a more dynamic approach here, I'm all for that, but I'm not a big fan of the current solution.
@vanzin We must pull the configuration from the ResourceManager; the ResourceManager can't push it. In fact, this is suitable for periodic tasks, e.g. ML, SQL.
Getting the config only at the beginning is not, to me, an acceptable solution. Getting it every once in a while is better, but it's not the only possible approach. I even suggested something different in the bug I mentioned above.
Test build #73515 has finished for PR 16819 at commit
I agree with the others; this is not the way to do this. There are different schedulers in YARN, each with different configs that could affect the actual resources you get. If you want to do something like this, it should look at the available resources after calling allocate on YARN (allocateResponse.getAvailableResources). When YARN returns, it tells you the available resources, which takes the various scheduler constraints into account. MapReduce refers to that as headroom and uses it to determine things like whether it needs to kill a reducer to run a map. We could use this to help with dynamic allocation and do more intelligent things.
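(A minimal sketch of reading that headroom, assuming an already-registered AMRMClient; the helper and parameter names are illustrative, not Spark's code:)

```scala
import org.apache.hadoop.yarn.api.records.Resource
import org.apache.hadoop.yarn.client.api.AMRMClient
import org.apache.hadoop.yarn.client.api.AMRMClient.ContainerRequest

// How many more executors could the cluster grant right now, judging by the
// headroom YARN reports on each allocate/heartbeat call?
def extraExecutorsPossible(
    amClient: AMRMClient[ContainerRequest],
    executorMemoryMB: Int,
    executorCores: Int): Int = {
  // allocate() doubles as the AM heartbeat; its response carries the headroom
  // (resources the scheduler is still willing to give this application).
  val headroom: Resource = amClient.allocate(0.1f).getAvailableResources
  math.min(headroom.getMemory / executorMemoryMB,
           headroom.getVirtualCores / executorCores)
}
```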
@vanzin What do you think about the current approach? I have tested it on the same Spark hive-thriftserver, the
So your current approach is to have a second connection to the RM, and ask for the RM's available resources every time the scheduler tries to change the number of resources. Did you look at Tom's suggestion of using `AllocateResponse.getAvailableResources()` instead? Seems like it would be simpler, cheaper, and could all be handled internally in `YarnAllocator.scala`.
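(To illustrate the "handle it internally" idea, a sketch only; these names are hypothetical, not the actual YarnAllocator fields:)

```scala
import org.apache.hadoop.yarn.api.records.Resource

// Cap the driver's requested target by what the cluster can currently grant,
// using the headroom cached from the most recent AllocateResponse. Because
// the headroom is refreshed on every heartbeat, the cap tracks queue and
// node changes instead of being fixed at startup.
def effectiveTarget(
    requestedTarget: Int,
    runningExecutors: Int,
    lastHeadroom: Resource,
    executorMemoryMB: Int,
    executorCores: Int): Int = {
  val roomForMore = math.min(lastHeadroom.getMemory / executorMemoryMB,
                             lastHeadroom.getVirtualCores / executorCores)
  math.min(requestedTarget, runningExecutors + roomForMore)
}
```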
Closes apache#16819
Closes apache#13467
Closes apache#16083
Closes apache#17135
Closes apache#8785
Closes apache#16278
Closes apache#16997
Closes apache#17073
Closes apache#17220
What changes were proposed in this pull request?
Dynamically set spark.dynamicAllocation.maxExecutors based on cluster resources.
How was this patch tested?
Manual test and unit test.
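(For reference, the settings the description refers to are normally configured by the user. An illustrative example with arbitrary values; this PR's goal was to derive maxExecutors automatically instead of relying on its Int.MaxValue default:)

```scala
import org.apache.spark.SparkConf

// Illustrative values only.
val conf = new SparkConf()
  .set("spark.dynamicAllocation.enabled", "true")
  .set("spark.shuffle.service.enabled", "true") // required for dynamic allocation on YARN
  .set("spark.dynamicAllocation.minExecutors", "1")
  .set("spark.dynamicAllocation.maxExecutors", "200")
  .set("spark.executor.cores", "2")
```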