Commit e05ad88

Jie Xiong authored and hvanhovell committed
[SPARK-18208][SHUFFLE] Executor OOM due to a growing LongArray in BytesToBytesMap
## What changes were proposed in this pull request?

BytesToBytesMap currently does not release its in-memory storage (the longArray variable) after it spills to disk. This is typically not a problem during aggregation, because the longArray should be much smaller than the pages and because we grow the longArray at a conservative rate.

However, this can lead to an OOM when an already running task is allocated more than its fair share of memory, which can happen because of a scheduling delay. In this case the longArray can grow beyond the task's fair share of memory. This becomes problematic when the task spills and the longArray is not freed: subsequent memory allocation requests are then denied by the memory manager, resulting in an OOM.

This PR fixes this issue by freeing the longArray when the BytesToBytesMap spills.

## How was this patch tested?

Existing tests, and tested on real-world workloads.

Author: Jie Xiong <[email protected]>
Author: jiexiong <[email protected]>

Closes #15722 from jiexiong/jie_oom_fix.

(cherry picked from commit c496d03)
Signed-off-by: Herman van Hovell <[email protected]>
1 parent f5c5a07 commit e05ad88

1 file changed: +5 -2 lines changed

core/src/main/java/org/apache/spark/unsafe/map/BytesToBytesMap.java

Lines changed: 5 additions & 2 deletions
@@ -169,6 +169,8 @@ public final class BytesToBytesMap extends MemoryConsumer {
 
   private long peakMemoryUsedBytes = 0L;
 
+  private final int initialCapacity;
+
   private final BlockManager blockManager;
   private final SerializerManager serializerManager;
   private volatile MapIterator destructiveIterator = null;
@@ -201,6 +203,7 @@ public BytesToBytesMap(
       throw new IllegalArgumentException("Page size " + pageSizeBytes + " cannot exceed " +
         TaskMemoryManager.MAXIMUM_PAGE_SIZE_BYTES);
     }
+    this.initialCapacity = initialCapacity;
     allocate(initialCapacity);
   }
 
@@ -897,12 +900,12 @@ public LongArray getArray() {
   public void reset() {
     numKeys = 0;
     numValues = 0;
-    longArray.zeroOut();
-
+    freeArray(longArray);
     while (dataPages.size() > 0) {
       MemoryBlock dataPage = dataPages.removeLast();
       freePage(dataPage);
     }
+    allocate(initialCapacity);
     currentPage = null;
     pageCursor = 0;
   }
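
The effect described in the commit message can be illustrated with a small, self-contained sketch. This is not Spark code: `ToyMemoryManager`, `ToyMap`, `SpillSketch`, the 1000-byte budget, and the 16-bytes-per-slot figure are all hypothetical stand-ins chosen for illustration, and a fixed budget is used in place of a changing fair share. The sketch only shows that a `reset()` which keeps a grown array leaves less memory for later requests than one which frees the array and reallocates it at the initial capacity.

```java
import java.util.ArrayDeque;
import java.util.Deque;

/** Toy per-task memory budget (hypothetical; not Spark's TaskMemoryManager). */
final class ToyMemoryManager {
  private final long budgetBytes;
  private long usedBytes = 0L;

  ToyMemoryManager(long budgetBytes) { this.budgetBytes = budgetBytes; }

  /** Grants the request only if it fits in the remaining budget. */
  boolean acquire(long bytes) {
    if (usedBytes + bytes > budgetBytes) return false;
    usedBytes += bytes;
    return true;
  }

  void release(long bytes) { usedBytes -= bytes; }
}

/** Toy map holding one pointer array plus a list of data pages (hypothetical). */
final class ToyMap {
  private final ToyMemoryManager mm;
  private final int initialCapacity;
  private final boolean freeArrayOnReset; // true mimics the patched reset()
  private final Deque<Long> dataPages = new ArrayDeque<>();
  private long arrayBytes = 0L;

  ToyMap(ToyMemoryManager mm, int initialCapacity, boolean freeArrayOnReset) {
    this.mm = mm;
    this.initialCapacity = initialCapacity;
    this.freeArrayOnReset = freeArrayOnReset;
    allocate(initialCapacity);
  }

  private void allocate(int capacity) {
    long bytes = capacity * 16L; // assumed slot size for this sketch only
    if (!mm.acquire(bytes)) throw new IllegalStateException("cannot allocate array");
    arrayBytes = bytes;
  }

  /** Doubles the array; the new array is acquired before the old one is released. */
  boolean growArray() {
    long newBytes = arrayBytes * 2;
    if (!mm.acquire(newBytes)) return false;
    mm.release(arrayBytes);
    arrayBytes = newBytes;
    return true;
  }

  boolean addPage(long bytes) {
    if (!mm.acquire(bytes)) return false;
    dataPages.addLast(bytes);
    return true;
  }

  /** Called after a spill: always frees the pages, optionally frees the array too. */
  void reset() {
    if (freeArrayOnReset) {
      mm.release(arrayBytes);
      arrayBytes = 0L;
    }
    while (!dataPages.isEmpty()) {
      mm.release(dataPages.removeLast());
    }
    if (freeArrayOnReset) {
      allocate(initialCapacity);
    }
  }
}

final class SpillSketch {
  /** Returns whether a 600-byte page can be acquired after a spill. */
  static boolean pageFitsAfterSpill(boolean freeArrayOnReset) {
    ToyMemoryManager mm = new ToyMemoryManager(1000);
    ToyMap map = new ToyMap(mm, 4, freeArrayOnReset); // array starts at 64 bytes
    for (int i = 0; i < 3; i++) {
      map.growArray();                                // 64 -> 128 -> 256 -> 512 bytes
    }
    map.addPage(400);                                 // 912 of 1000 bytes in use
    map.reset();                                      // simulate the spill
    return map.addPage(600);
  }

  public static void main(String[] args) {
    System.out.println("array kept on reset:  " + pageFitsAfterSpill(false));
    System.out.println("array freed on reset: " + pageFitsAfterSpill(true));
  }
}
```

In this toy setup the retained 512-byte array leaves only 488 bytes free, so the 600-byte request is denied; with the array freed and reallocated at 64 bytes, the same request is granted.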
