
Conversation

@JoshRosen
Contributor

This patch adds support for caching blocks in the executor processes using direct / off-heap memory.

User-facing changes

Updated semantics of OFF_HEAP storage level: In Spark 1.x, the OFF_HEAP storage level indicated that an RDD should be cached in Tachyon. Spark 2.x removed the external block store API that Tachyon caching was based on (see #10752 / SPARK-12667), so OFF_HEAP became an alias for MEMORY_ONLY_SER. As of this patch, OFF_HEAP means "serialized and cached in off-heap memory or on disk". Via the StorageLevel constructor, useOffHeap can be set (only when serialized == true) to construct custom storage levels which support replication; see the sketch below.
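
For illustration, a minimal sketch of the new semantics. It assumes an existing RDD named `rdd`; the named arguments follow the public `StorageLevel` factory, but double-check the signature before relying on it:

```scala
import org.apache.spark.storage.StorageLevel

// The built-in level: serialized, cached in off-heap memory or on disk.
rdd.persist(StorageLevel.OFF_HEAP)

// A custom replicated off-heap level; useOffHeap is only valid for
// serialized data, so deserialized must be false.
val offHeapReplicated = StorageLevel(
  useDisk = true,
  useMemory = true,
  useOffHeap = true,
  deserialized = false,
  replication = 2)
```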

Storage UI reporting: the storage UI will now report whether in-memory blocks are stored on- or off-heap.

Only supported by UnifiedMemoryManager: for simplicity, this feature is only supported when the default UnifiedMemoryManager is used. Applications that use the legacy memory manager (spark.memory.useLegacyMode=true) cannot currently allocate off-heap storage memory, so off-heap caching fails with an error when legacy memory management is enabled. Given that we plan to eventually remove the legacy memory manager, this is not a significant restriction.

Memory management policies: the policies for dividing available memory between execution and storage are the same for both on- and off-heap memory. For off-heap memory, the total amount of memory available for use by Spark is controlled by spark.memory.offHeap.size, which is an absolute size. Off-heap storage memory obeys spark.memory.storageFraction in order to control the amount of unevictable storage memory. For example, if spark.memory.offHeap.size is 1 gigabyte and Spark uses the default storageFraction of 0.5, then up to 512 megabytes of off-heap cached blocks will be protected from eviction due to execution memory pressure (see the configuration sketch below). If necessary, we can split spark.memory.storageFraction into separate on- and off-heap configurations, but this doesn't seem necessary now and can be done later without any breaking changes.
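
As a hedged illustration of that arithmetic (the configuration keys are the ones named above; the literal values are assumptions):

```scala
import org.apache.spark.SparkConf

// 1 GiB of off-heap memory available to Spark in total.
val conf = new SparkConf()
  .set("spark.memory.offHeap.size", (1024L * 1024 * 1024).toString)
  .set("spark.memory.storageFraction", "0.5")

// Unevictable off-heap storage = offHeap.size * storageFraction
//                              = 1024 MiB * 0.5 = 512 MiB.
```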

Use of off-heap memory does not imply use of off-heap execution (or vice-versa): for now, the settings controlling the use of off-heap execution memory (spark.memory.offHeap.enabled) and off-heap caching are completely independent, so Spark SQL can be configured to use off-heap memory for execution while continuing to cache blocks on-heap. If desired, we can change this in a followup patch so that spark.memory.offHeap.enabled affects the default storage level for cached SQL tables.

Internal changes

  • Rename ByteArrayChunkOutputStream to ChunkedByteBufferOutputStream
    • It now returns a ChunkedByteBuffer instead of an array of byte arrays.
    • Its constructor now accepts an allocator function which is called to allocate ByteBuffers. This allows us to control whether it allocates regular ByteBuffers or off-heap DirectByteBuffers (see the allocator sketch after this list).
    • Because block serialization is now performed during the unroll process, a ChunkedByteBufferOutputStream which is configured with a DirectByteBuffer allocator will use off-heap memory for both unroll and storage memory.
  • The MemoryStore's MemoryEntry now tracks whether a block is stored on- or off-heap.
    • evictBlocksToFreeSpace() now accepts a MemoryMode parameter so that we don't try to evict off-heap blocks in response to on-heap memory pressure (or vice-versa).
  • Make sure that off-heap buffers are properly de-allocated during MemoryStore eviction.
  • The JVM limits the total size of allocated direct byte buffers using the -XX:MaxDirectMemorySize flag and the default tends to be fairly low (< 512 megabytes in some JVMs). To work around this limitation, this patch adds a custom DirectByteBuffer allocator which ignores this memory limit (sketched below).
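
A rough sketch of the allocator idea from the bullets above. None of this is the patch's actual code: the allocator-function shape and the reflection trick are assumptions based on this description, targeting JDK-8-era APIs:

```scala
import java.nio.ByteBuffer

// On-heap allocator: ordinary byte-array-backed buffers.
val heapAllocator: Int => ByteBuffer = size => ByteBuffer.allocate(size)

// Off-heap allocator: allocate raw memory with sun.misc.Unsafe and wrap it
// in a DirectByteBuffer via its package-private (long, int) constructor.
// Unlike ByteBuffer.allocateDirect, this path is not counted against
// -XX:MaxDirectMemorySize. A real allocator must also arrange to free the
// memory, e.g. by attaching a sun.misc.Cleaner that calls freeMemory.
val directAllocator: Int => ByteBuffer = { size =>
  val unsafeField = classOf[sun.misc.Unsafe].getDeclaredField("theUnsafe")
  unsafeField.setAccessible(true)
  val unsafe = unsafeField.get(null).asInstanceOf[sun.misc.Unsafe]
  val address = unsafe.allocateMemory(size)
  val ctor = Class.forName("java.nio.DirectByteBuffer")
    .getDeclaredConstructor(java.lang.Long.TYPE, java.lang.Integer.TYPE)
  ctor.setAccessible(true)
  ctor.newInstance(Long.box(address), Int.box(size)).asInstanceOf[ByteBuffer]
}
```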

@SparkQA

SparkQA commented Mar 18, 2016

Test build #53483 has finished for PR 11805 at commit feffc20.

  • This patch fails Scala style tests.
  • This patch does not merge cleanly.
  • This patch adds no public classes.

@SparkQA

SparkQA commented Mar 22, 2016

Test build #53819 has finished for PR 11805 at commit 222d80b.

  • This patch fails Scala style tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA

SparkQA commented Mar 22, 2016

Test build #53825 has finished for PR 11805 at commit bf62983.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA

SparkQA commented Mar 24, 2016

Test build #54096 has finished for PR 11805 at commit 5a15de2.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA

SparkQA commented Mar 25, 2016

Test build #54125 has finished for PR 11805 at commit 3c122af.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA

SparkQA commented Mar 25, 2016

Test build #54215 has finished for PR 11805 at commit 334b727.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@JoshRosen
Contributor Author

/cc @andrewor14, @nongli, @rxin, can you glance over the PR description to check whether the user-facing semantics are okay for now? We might want to change some bits later (maybe to add separate nice shorthands for off-heap-memory-only vs off-heap-memory-and-disk), but I'd like to understand whether any semantics need to change here.

@rxin
Contributor

rxin commented Mar 26, 2016

Looks pretty good actually. There are some details about naming and on/off defaults we need to figure out, but that can go in later.

@SparkQA

SparkQA commented Mar 26, 2016

Test build #54239 has finished for PR 11805 at commit 4d9489a.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

asfgit pushed a commit that referenced this pull request Mar 26, 2016
…ryManager

This patch extends Spark's `UnifiedMemoryManager` to add bookkeeping support for off-heap storage memory, a requirement for enabling off-heap caching (which will be done by #11805). The `MemoryManager`'s `storageMemoryPool` has been split into separate on- and off-heap pools, and the storage and unroll memory allocation methods have been updated to accept a `memoryMode` parameter specifying whether allocations should be performed on- or off-heap (see the sketch after this commit message).

In order to reduce the testing surface, the `StaticMemoryManager` does not support off-heap caching (we plan to eventually remove the `StaticMemoryManager`, so this isn't a significant limitation).

Author: Josh Rosen <[email protected]>

Closes #11942 from JoshRosen/off-heap-storage-memory-bookkeeping.
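
A toy sketch of the bookkeeping split described above; all names are illustrative stand-ins rather than the actual `UnifiedMemoryManager` internals:

```scala
sealed trait MemoryMode
object MemoryMode {
  case object ON_HEAP extends MemoryMode
  case object OFF_HEAP extends MemoryMode
}

// One storage pool per memory mode, so on- and off-heap usage are
// tracked independently.
class StoragePool(val poolSize: Long) {
  private var used = 0L
  def acquire(numBytes: Long): Boolean =
    if (used + numBytes <= poolSize) { used += numBytes; true } else false
}

class MemoryManagerSketch(onHeap: StoragePool, offHeap: StoragePool) {
  // The memoryMode parameter routes the request to the matching pool.
  def acquireStorageMemory(numBytes: Long, memoryMode: MemoryMode): Boolean =
    memoryMode match {
      case MemoryMode.ON_HEAP  => onHeap.acquire(numBytes)
      case MemoryMode.OFF_HEAP => offHeap.acquire(numBytes)
    }
}
```
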
@SparkQA

SparkQA commented Mar 27, 2016

Test build #54272 has finished for PR 11805 at commit df8be62.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@JoshRosen
Contributor Author

Discovered a potentially major problem: if we bail out of a getOrElseUpdate call, don't end up caching a block, and then don't fully consume or dispose of the resulting valuesIterator, we might leak off-heap memory. This problem didn't exist before because we never used off-heap memory for unroll space. I may have to add some task completion callbacks to dispose of PartiallySerializedBlocks (see the sketch below).
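
Something along these lines might work; `TaskContext.addTaskCompletionListener` is an existing Spark API, but `block` and its `discard()` helper are hypothetical names for "the partially serialized block" and "free its off-heap buffers":

```scala
import org.apache.spark.TaskContext

// Assumed context: `block` holds off-heap unroll buffers that would
// otherwise leak if its values iterator is abandoned mid-stream.
Option(TaskContext.get()).foreach { ctx =>
  ctx.addTaskCompletionListener { _ =>
    block.discard() // hypothetical: dispose the block's off-heap buffers
  }
}
```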

@JoshRosen
Contributor Author

Alright, this should now be ready for an initial review pass. I just pushed a somewhat hacky patch to work around the direct buffer allocation limit (42d0356) and added a few simple tests. I also addressed the potential leak of direct memory in cases where partially serialized blocks' iterators aren't fully consumed.

This probably needs a bit more work; I'll do another self-review pass tomorrow once I've had time to set it aside for a bit.

@JoshRosen
Contributor Author

/cc @andrewor14 @davies for review

package org.apache.spark.streaming.rdd

import java.io.File
import java.nio.ByteBuffer
Contributor

not used

@andrewor14
Contributor

@JoshRosen Overall looks good. I have not verified comprehensively whether all the things are disposed and released correctly, but it looks OK from what I can tell. If there are actually leaks I'm sure we'll find them quickly.

  val transfer = transferService
    .getOrElse(new NettyBlockTransferService(conf, securityMgr, numCores = 1))
- val memManager = new StaticMemoryManager(conf, Long.MaxValue, maxMem, numCores = 1)
+ val memManager = UnifiedMemoryManager(conf, numCores = 1)
Contributor

great!

@andrewor14
Contributor

Forgot to mention, you'll need to update documentation for spark.memory.storageFraction. Right now its description is pretty specific to the on-heap stuff.

 * and then serializing the values from the original input iterator.
 */
def finishWritingToStream(os: OutputStream): Unit = {
  ByteStreams.copy(unrolled.toInputStream(), os)
Contributor Author

There's actually a bug here: we aren't guaranteed to free the underlying off-heap memory. I think that in my benchmarks it just so happened to be cleaned by the sun.misc.Cleaner that I attached to the DirectByteBuffer in my custom allocator. I'm going to take a closer look at this block to fix the leak issues (see the sketch below).
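
For reference, a hedged sketch of freeing a direct buffer deterministically instead of waiting for the GC to run its cleaner; this reflection pattern targets JDK 8 and is not necessarily what the final fix will look like:

```scala
import java.nio.ByteBuffer

// Invoke the buffer's cleaner directly so the off-heap memory is released
// immediately, rather than whenever the GC collects the buffer object.
def dispose(buffer: ByteBuffer): Unit = {
  if (buffer != null && buffer.isDirect) {
    val cleanerMethod = buffer.getClass.getMethod("cleaner")
    cleanerMethod.setAccessible(true)
    val cleaner = cleanerMethod.invoke(buffer)
    if (cleaner != null) {
      cleaner.getClass.getMethod("clean").invoke(cleaner)
    }
  }
}
```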

@SparkQA

SparkQA commented Apr 1, 2016

Test build #54678 has finished for PR 11805 at commit 61920a9.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA

SparkQA commented Apr 1, 2016

Test build #54681 has finished for PR 11805 at commit fde020f.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA

SparkQA commented Apr 1, 2016

Test build #54682 has finished for PR 11805 at commit a604078.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@andrewor14
Contributor

deep-review this please

@DeepSparkBot

@andrewor14 Review complete. In the future we should add tests to ensure buffers are released. @JoshRosen please file a JIRA for this issue.

LGTM.

@JoshRosen
Contributor Author

Alright, I'm going to merge this into master and will file followup JIRAs to make sure that we update the user-facing documentation and add more tests before 2.0. There are a number of questions to answer about how best to expose the relevant off-heap configurations to users, but I'd rather address them separately.

@asfgit asfgit closed this in e41acb7 Apr 1, 2016
@JoshRosen JoshRosen deleted the off-heap-caching branch April 1, 2016 21:37
@JoshRosen
Contributor Author

Filed https://issues.apache.org/jira/browse/SPARK-14336 as followup
