


@quanfuw quanfuw commented Oct 18, 2016

What changes were proposed in this pull request?

Add NUMA-aware support for the YARN-based deployment mode.
This patch optimizes memory allocation: executors on a worker node are bound to NUMA nodes in round-robin order, so that memory is allocated from the local NUMA node first and from remote nodes only when the local node does not have enough free memory.
Before this patch, Spark was NUMA-unaware, so many allocations landed on remote nodes and the resulting remote memory accesses hurt performance considerably. We observed a significant performance improvement while evaluating the NUMA-aware patch.
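
The round-robin binding described above could be sketched as follows. This is a minimal illustration, not the patch's actual code: the helper names (`numa_node_for`, `numactl_prefix`, `wrap_launch_command`) are hypothetical, and using the Linux `numactl` tool with `--cpunodebind` and `--preferred` is an assumption about how a "local node first, remote nodes as fallback" policy might be expressed. The description above does not say which mechanism the patch uses.

```python
from typing import List, Sequence


def numa_node_for(executor_index: int, num_numa_nodes: int) -> int:
    """Pick a NUMA node for the Nth executor on a worker, round-robin."""
    if num_numa_nodes < 1:
        raise ValueError("num_numa_nodes must be at least 1")
    return executor_index % num_numa_nodes


def numactl_prefix(node: int) -> List[str]:
    """Build a numactl prefix for one node (assumed mechanism).

    --cpunodebind pins the process to the CPUs of the node.
    --preferred allocates memory on the node when possible and
    falls back to other nodes when the node is out of memory.
    """
    return ["numactl", f"--cpunodebind={node}", f"--preferred={node}"]


def wrap_launch_command(
    command: Sequence[str], executor_index: int, num_numa_nodes: int
) -> List[str]:
    """Prepend the numactl prefix to an executor launch command."""
    node = numa_node_for(executor_index, num_numa_nodes)
    return numactl_prefix(node) + list(command)
```

On a worker with 2 NUMA nodes, successive executors would be assigned to nodes 0, 1, 0, 1, and so on. `--preferred` is chosen here over the stricter `--membind` because the description asks for remote nodes to be used when local memory runs out rather than for the allocation to fail.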

To Do:

  1. Add support for configuring the number of NUMA nodes, and test it.
  2. Add NUMA-aware support for the Mesos-based deployment mode, and test it.
  3. Add NUMA-aware support for the Standalone deployment mode, and test it.
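
For item 1, one possible shape for the node-count setting is an explicit value that falls back to detection when unset. The sketch below is only an assumption about how that could work, not part of this patch: the function names are hypothetical, and the detection relies on the Linux sysfs layout, where each NUMA node appears as a `nodeN` directory under `/sys/devices/system/node`.

```python
import os
import re
from typing import Optional

# Linux exposes one "nodeN" directory per NUMA node under this path.
DEFAULT_SYSFS_ROOT = "/sys/devices/system/node"
_NODE_DIR = re.compile(r"^node\d+$")


def count_numa_nodes(sysfs_root: str = DEFAULT_SYSFS_ROOT) -> int:
    """Count NUMA nodes via sysfs; report 1 if nothing can be detected."""
    try:
        entries = os.listdir(sysfs_root)
    except OSError:
        # Path missing or unreadable (for example, not Linux).
        return 1
    nodes = [name for name in entries if _NODE_DIR.match(name)]
    return max(len(nodes), 1)


def resolve_numa_node_count(
    configured: Optional[int] = None,
    sysfs_root: str = DEFAULT_SYSFS_ROOT,
) -> int:
    """Prefer an explicitly configured value, else detect it."""
    if configured is not None:
        if configured < 1:
            raise ValueError("configured NUMA node count must be >= 1")
        return configured
    return count_numa_nodes(sysfs_root)
```

Falling back to 1 keeps the round-robin assignment well defined on machines where the topology cannot be read.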

How was this patch tested?

We observed a significant performance improvement during evaluation with BigBench. The evaluation is still in progress, and more detailed results will be added as they become available.

Setup:
Cluster Topo: 1 Master + 4 Slaves (Spark on Yarn)
CPU: Intel(R) Xeon(R) CPU E5-2699 v3 @ 2.30GHz (72 Cores)
Memory: 128GB (2 NUMA Nodes)
NIC: 1 x 10Gb/Sec
Disk: Write: 1.5GB/Sec, Read: 5GB/Sec
SW Version: Hadoop-5.7.0 + Spark-2.0.0

NUMA Introduction

As the diagram below depicts, in the UMA (Uniform Memory Access) model all processors share one bus, and contention on that bus becomes very heavy as the number of processors scales up. A NUMA (Non-Uniform Memory Access) system scales better by dividing processors and memory blocks into nodes, which are interconnected by an additional bus.
With NUMA, accessing memory on a remote node is much slower than accessing memory on the local node, whereas with UMA the cost of a memory access is uniform regardless of which memory is accessed.

[Image: diagram comparing the UMA and NUMA models]

For more NUMA information, please refer to https://en.wikipedia.org/wiki/Non-uniform_memory_access.
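
To see why the share of remote accesses matters, a simple weighted-average model can help. This is only a back-of-the-envelope sketch: the function name and its parameters are illustrative, and it ignores caches, interconnect contention, and other effects that real systems exhibit.

```python
def average_access_latency(
    local_ns: float, remote_ns: float, remote_fraction: float
) -> float:
    """Mean memory access latency for a given share of remote accesses.

    local_ns and remote_ns are the latencies of a local and a remote
    access; remote_fraction is the share of accesses (0.0 to 1.0)
    served by a remote NUMA node.
    """
    if not 0.0 <= remote_fraction <= 1.0:
        raise ValueError("remote_fraction must be between 0.0 and 1.0")
    return (1.0 - remote_fraction) * local_ns + remote_fraction * remote_ns
```

With made-up inputs of 100 ns for a local access and 200 ns for a remote one, half of the accesses going remote gives an average of 150 ns. These numbers are hypothetical and are not measurements from the cluster described above.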

@AmplabJenkins

Can one of the admins verify this patch?

@quanfuw quanfuw changed the title add numa aware support(WIP, not ready for review) [Spark][JIRA: SPARK-17984][YARN, Mesos, Deploy][WIP] add support for numa aware feature Oct 18, 2016
@quanfuw quanfuw changed the title [Spark][JIRA: SPARK-17984][YARN, Mesos, Deploy][WIP] add support for numa aware feature [SPARK-17984][YARN][Mesos][Deploy][WIP] add support for numa aware feature Oct 18, 2016
@quanfuw quanfuw changed the title [SPARK-17984][YARN][Mesos][Deploy][WIP] add support for numa aware feature [SPARK-17984][YARN][Mesos][Deploy][WIP] Add support for numa aware feature Oct 18, 2016
@quanfuw quanfuw changed the title [SPARK-17984][YARN][Mesos][Deploy][WIP] Add support for numa aware feature [SPARK-17984][YARN][Mesos][Deploy][WIP] Add support for NUMA aware feature Oct 18, 2016
@srowen
Member

srowen commented Oct 25, 2016

This should be closed in favor of #15579 at least

srowen added a commit to srowen/spark that referenced this pull request Oct 31, 2016
@asfgit asfgit closed this in 26b07f1 Oct 31, 2016