[SPARK-3453] Netty-based BlockTransferService, extracted from Spark core #2753
Conversation
Also includes some partial support for uploading blocks.
…lockFetcherIteratorSuite
- use same pool for boss and worker
- remove ioRatio
- disable caching of byte buf allocator
- childOption sendbuf/receivebuf
- fire exception through pipeline

In addition:
- fire failure handler BlockFetchingListener at least once per block
- enabled a bunch of ignored tests
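For readers unfamiliar with these Netty knobs, here is a minimal sketch of what the bulleted settings above look like against the Netty 4.x bootstrap API. The class name, parameter names, and arena counts are illustrative, not the PR's actual code; page size and max order match Netty's defaults.

    import io.netty.bootstrap.ServerBootstrap;
    import io.netty.buffer.PooledByteBufAllocator;
    import io.netty.channel.ChannelOption;
    import io.netty.channel.nio.NioEventLoopGroup;
    import io.netty.channel.socket.nio.NioServerSocketChannel;

    public class ServerBootstrapSketch {
      public static ServerBootstrap configure(int numThreads, int sendBuf, int recvBuf) {
        // Same pool for boss (accept) and worker (I/O): one shared event loop group.
        NioEventLoopGroup group = new NioEventLoopGroup(numThreads);

        // Pooled allocator with all thread-local caches disabled (SPARK-3503):
        // buffers are allocated and released on different threads, so the
        // caches only waste memory.
        PooledByteBufAllocator allocator = new PooledByteBufAllocator(
            true /* preferDirect */,
            numThreads /* nHeapArena */, numThreads /* nDirectArena */,
            8192 /* pageSize */, 11 /* maxOrder */,
            0 /* tinyCacheSize */, 0 /* smallCacheSize */, 0 /* normalCacheSize */);

        return new ServerBootstrap()
            .group(group, group)  // boss and worker share the pool
            .channel(NioServerSocketChannel.class)
            .option(ChannelOption.ALLOCATOR, allocator)
            // SO_SNDBUF/SO_RCVBUF apply to the accepted child sockets, so they
            // must be set via childOption rather than option (SPARK-3502).
            .childOption(ChannelOption.SO_SNDBUF, sendBuf)
            .childOption(ChannelOption.SO_RCVBUF, recvBuf)
            .childOption(ChannelOption.ALLOCATOR, allocator);
      }
    }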
QA tests have started for PR 2753 at commit
QA tests have finished for PR 2753 at commit
Test FAILed.
Test FAILed.
Jenkins, retest this please.
Test build #22421 has started for PR 2753 at commit
Test build #22421 has finished for PR 2753 at commit
Test PASSed.
That is a pass with Netty turned on. Now I am turning it off in preparation for the merge.
Test build #22427 has started for PR 2753 at commit
Test build #22427 has finished for PR 2753 at commit
Test FAILed.
Jenkins, retest this please.
Test build #22437 has started for PR 2753 at commit
Test build #22437 has finished for PR 2753 at commit
Test PASSed. |
nit - it might be slightly more clear to write it this way:

    if (cachedClient != null) {
      if (cachedClient.isActive()) {
        return cachedClient;
      } else {
        // Remove inactive clients so a fresh connection is created below.
        connectionPool.remove(address, cachedClient);
      }
    }

You can do it in the next PR.
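For context, a rough sketch of the createClient path this nit refers to, under the connection-pooling scheme from SPARK-3002. TransportClientStub, connect, and the surrounding class are placeholders for illustration, not the PR's actual API.

    import java.io.IOException;
    import java.net.InetSocketAddress;
    import java.util.concurrent.ConcurrentHashMap;

    public class ClientPoolSketch {
      // One cached client per remote address, shared across threads (SPARK-3002).
      private final ConcurrentHashMap<InetSocketAddress, TransportClientStub> connectionPool =
          new ConcurrentHashMap<>();

      public TransportClientStub createClient(String host, int port) throws IOException {
        InetSocketAddress address = new InetSocketAddress(host, port);
        TransportClientStub cachedClient = connectionPool.get(address);
        if (cachedClient != null) {
          if (cachedClient.isActive()) {
            return cachedClient;  // Reuse the live pooled connection.
          } else {
            connectionPool.remove(address, cachedClient);  // Evict the dead client.
          }
        }
        TransportClientStub newClient = connect(address);
        // putIfAbsent keeps a race between concurrent callers benign: only one
        // client per address survives in the pool; the loser closes its connection.
        TransportClientStub existing = connectionPool.putIfAbsent(address, newClient);
        if (existing != null) {
          newClient.close();
          return existing;
        }
        return newClient;
      }

      private TransportClientStub connect(InetSocketAddress address) throws IOException {
        throw new UnsupportedOperationException("placeholder for the Netty connect logic");
      }

      // Minimal stand-in for the pooled client type.
      public interface TransportClientStub {
        boolean isActive();
        void close();
      }
    }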
Merging this into master. Thanks!
This patch introduces the tooling necessary to construct an external shuffle service which is independent of Spark executors, and then use this service inside Spark. An example (just for the sake of this PR) of the service creation can be found in Worker, and the service itself is used by plugging in the StandaloneShuffleClient as Spark's ShuffleClient (set up in BlockManager).

This PR continues the work from #2753, which extracted out the transport layer of Spark's block transfer into an independent package within Spark. A new package was created which contains the Spark business logic necessary to retrieve the actual shuffle data, which is completely independent of the transport layer introduced in the previous patch. Similar to the transport layer, this package must not depend on Spark, as we anticipate plugging this service in as a lightweight process within, say, the YARN NodeManager, and do not wish to include Spark's dependencies (including Scala itself).

There are several outstanding tasks which must be complete before this PR can be merged:
- [x] Complete unit testing of the network/shuffle package.
- [x] Performance and correctness testing on a real cluster.
- [x] Remove example service instantiation from Worker.scala.

There are even more shortcomings of this PR which should be addressed in followup patches:
- Don't use the Java serializer for the RPC layer! It is not cross-version compatible.
- Handle shuffle file cleanup for dead executors once the application terminates or the ContextCleaner triggers.
- Documentation of the feature in the Spark docs.
- Improve behavior if the shuffle service itself goes down (right now we don't blacklist it, and new executors cannot spawn on that machine).
- SSL and SASL integration.
- Nice to have: handle shuffle file consolidation (this would require changes to Spark's implementation).

Author: Aaron Davidson <[email protected]>

Closes #3001 from aarondav/shuffle-service and squashes the following commits:

4d1f8c1 [Aaron Davidson] Remove changes to Worker
705748f [Aaron Davidson] Rename Standalone* to External*
fd3928b [Aaron Davidson] Do not unregister executor outputs unduly
9883918 [Aaron Davidson] Make suggested build changes
3d62679 [Aaron Davidson] Add Spark integration test
7fe51d5 [Aaron Davidson] Fix SBT integration
56caa50 [Aaron Davidson] Address comments
c8d1ac3 [Aaron Davidson] Add unit tests
2f70c0c [Aaron Davidson] Fix unit tests
5483e96 [Aaron Davidson] Fix unit tests
46a70bf [Aaron Davidson] Whoops, bracket
5ea4df6 [Aaron Davidson] [SPARK-3796] Create external service which can serve shuffle files
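For a sense of how such a standalone service can be stood up in its own JVM, here is a rough sketch built on the transport layer from this PR. The class names mirror the "Rename Standalone* to External*" commit above, but the constructor and createServer signatures are assumptions, not verified against the merged code.

    import org.apache.spark.network.TransportContext;
    import org.apache.spark.network.server.TransportServer;
    import org.apache.spark.network.shuffle.ExternalShuffleBlockHandler;
    import org.apache.spark.network.util.SystemPropertyConfigProvider;
    import org.apache.spark.network.util.TransportConf;

    public class ExternalShuffleServiceLauncher {
      public static void main(String[] args) {
        int port = Integer.parseInt(args[0]);
        // Transport settings come from a ConfigProvider rather than SparkConf,
        // keeping the process free of Spark (and Scala) dependencies.
        TransportConf conf = new TransportConf(new SystemPropertyConfigProvider());
        // The handler holds the shuffle-specific logic: it resolves (appId,
        // executorId, blockId) requests to shuffle files on local disk.
        // Constructor arguments here are an assumption.
        ExternalShuffleBlockHandler handler = new ExternalShuffleBlockHandler();
        // The transport layer extracted in this PR carries the RPCs and data.
        TransportServer server = new TransportContext(conf, handler).createServer(port);
        System.out.println("External shuffle service listening on " + server.getPort());
      }
    }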
This commit removes the netty shuffle implementation (the default shuffle implementation uses Java's NIO library). According to Patrick Wendell, this never worked well anyway (there were a bunch of corner cases it didn't support, and in the more recent Spark branch this code was completely rewritten: apache/spark#2753), and having multiple implementations of the network layer made it much more difficult to change some of the code.
This PR encapsulates #2330, which is itself a continuation of #2240. The first goal of this PR is to provide an alternate, simpler implementation of the ConnectionManager which is based on Netty.
In addition to this goal, however, we want to resolve SPARK-3796, which calls for a standalone shuffle service which can be integrated into the YARN NodeManager, the Standalone Worker, or run on its own. This PR takes the first step in this direction by ensuring that the actual Netty service is as small as possible and extracted from Spark core. Given this, we should be able to construct a standalone jar which can be included in other JVMs without incurring significant dependency or runtime issues. The actual work to ensure that such a standalone shuffle service would work in Spark will be left for a future PR, however.
In order to minimize dependencies and allow for the service to be long-running (possibly much longer-running than Spark, and possibly having to support multiple versions of Spark simultaneously), the entire service has been ported to Java, where we have full control over the binary compatibility of the components and do not depend on the Scala runtime or version.
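As an example of the pure-Java API surface this produces, the per-block callback mentioned in the commit notes near the top ("fire failure handler BlockFetchingListener at least once per block") is a plain Java interface along these lines. This is a reconstruction with the contract stated as comments, not the verbatim source.

    import java.util.EventListener;
    import org.apache.spark.network.buffer.ManagedBuffer;

    public interface BlockFetchingListener extends EventListener {
      // Called once per successfully fetched block. If `data` is used beyond
      // this call, the listener must retain() it, since the transport layer
      // may release the underlying buffer afterwards.
      void onBlockFetchSuccess(String blockId, ManagedBuffer data);

      // Per the commit notes, fired at least once for every block whose fetch
      // fails, so callers can clean up deterministically.
      void onBlockFetchFailure(String blockId, Throwable exception);
    }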
The following issues have been addressed by folding in #2330:
SPARK-3453: Refactor Netty module to use BlockTransferService interface
SPARK-3018: Release all buffers upon task completion/failure (see the buffer-lifecycle sketch after this list)
SPARK-3002: Create a connection pool and reuse clients across different threads
SPARK-3017: Integration tests and unit tests for connection failures
SPARK-3049: Make sure client doesn't block when server/connection has error(s)
SPARK-3502: SO_RCVBUF and SO_SNDBUF should be bootstrap childOption, not option
SPARK-3503: Disable thread local cache in PooledByteBufAllocator
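To illustrate the discipline behind SPARK-3018 above: fetched blocks are reference-counted ManagedBuffers, and the completion/failure path must release each one exactly once. A minimal sketch; everything except ManagedBuffer's retain()/release() is a hypothetical name.

    import org.apache.spark.network.buffer.ManagedBuffer;

    public final class BufferLifecycleSketch {
      // Hand a fetched block to a consumer while guaranteeing the buffer is
      // released exactly once, whether the task completes or fails.
      public static void consume(ManagedBuffer buffer) {
        buffer.retain();  // Take a reference before using the buffer.
        try {
          process(buffer);
        } finally {
          buffer.release();  // Runs on success and on failure alike.
        }
      }

      // Hypothetical consumer of the block's bytes.
      private static void process(ManagedBuffer buffer) { /* ... */ }
    }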
TODO before mergeable: