Skip to content

Conversation

@xwu-intel
Copy link
Contributor

What changes were proposed in this pull request?

Complete Standalone, Yarn, Mesos support from #15579 and adapt to latest master branch.

How was this patch tested?

Standalone and Yarn mode tested on spark cluster with 1 master + 2 slaves, Mesos mode is not tested due to lack of resources.

How to use

  1. Global Environment Variable
    export SPARK_EXECUTOR_LAUNCH_PREFIX="/tmp/spark-numa-example.sh"
  2. Config Files
    • Standalone mode: add SPARK_EXECUTOR_LAUNCH_PREFIX="/tmp/spark-numa-example.sh" in conf/spark-env.sh
    • Yarn client mode: add spark.yarn.appMasterEnv.SPARK_EXECUTOR_LAUNCH_PREFIX="/tmp/spark-numa-example.sh" in conf/spark-defaults.conf

Attach the example script for adding executor launch prefix to enable NUMA-aware binding for executors. Same apply to adding other launch prefix such as strace, vtune etc..

spark-numa-example.zip

@srowen
Copy link
Member

srowen commented Dec 27, 2016

Why is this different from #16411 ?

@xwu-intel
Copy link
Contributor Author

@srowen do you mean #15579?

  1. Fix some character escape issues of [SPARK-17984][YARN][Mesos][Deploy][WIP] Added support for executor launch with leading arguments #15579 if the command string contains some special characters like '
  2. Add Standalone and Mesos support as [SPARK-17984][YARN][Mesos][Deploy][WIP] Added support for executor launch with leading arguments #15579 only support Yarn mode

@srowen
Copy link
Member

srowen commented Dec 28, 2016

Oops yes wrong copy/paste.
We have two overlapping PRs then. It'd be better to collaborate on one. If the other is inactive can you close it @sheepduke ? but I am not sure we're going to merge this anyway.

@xwu-intel
Copy link
Contributor Author

@srowen sheepduke is my colleague who just left. I am continuing to refine his work. It's OK to close #15579 . @sheepduke

@tgravescs
Copy link
Contributor

I'm a little worried this is very open ended and could cause a lot of issues with users using it wrong. This opens up customers to basically do anything they want while launching an executor. Even not launching an executor since really this is replacing the normal executor launch command with this script. It relies on that customers script to actually launch the executor based on the command passed in.

If this goes in it definitely need much better explanation and docs on how to properly write and use it. I would rather see it being more truly of a pre-init script then a total replacement. Perhaps the spark executor launch command is a script that will pre-pend some users stuff but then makes sure it still calls the normal java executor launch command.

Also what about yarn cluster mode?

Do you have test results of configuring numa that shows definite improvements? How does this compare to Automatic NUMA balancing that I believe is on by default in Rhel7. I realize perhaps most machines aren't running rhel7 yet but wondering if it was tried.

does numactl require special priveleges (like root) to do certain operations?

The script looks very basic which I understand for an example is fine but it seems like there are definitely things missing and things people could get wrong.
For instance, how do you handle multiple containers on a node. How does this work when you specify an executor to have X cores.

Note I haven't done any tuning of numa myself so sorry if some of these questions seem obvious.
How does processes with numa configured interact with processes that don't? It seems like tuning things right could be quite hard especially if running on something like yarn where other applications aren't using the same logic to configure numa.

@xwu-intel
Copy link
Contributor Author

@tgravescs Thanks for your comments. There are two things we have tried.

  1. To add a prefix command on executor launch
    I agree this opens a door for user to do anything for launching the executor. This patch is intended for profiling and debugging. May not fit for production. I am not sure it's the best form to implement, it fixed our problem quickly.

  2. NUMA
    The script attached about NUMA is only an example to show how to use this patch. User can customize it to fulfill their specific needs. Automatic NUMA balancing is by default enabled on our system. As mentioned in the original Redhat slides, It can only deal with certain cases and still can not beat manual pinning. From our experiments, not all cases have big NUMA penalties. We should use some platform tools such as Intel VTune to identify if there is a NUMA problem and tune case by case.

@AmplabJenkins
Copy link

Can one of the admins verify this patch?

@srowen srowen mentioned this pull request Aug 20, 2018
@asfgit asfgit closed this in b8788b3 Aug 21, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants