Commit 3a156b8

SPARK-3211 .take() is OOM-prone with empty partitions
Instead of jumping straight from 1 partition to all remaining partitions when the first scan finds no rows, grow the number of partitions to try exponentially: each retry attempts twice as many partitions as have already been scanned. Fix proposed by Paul Nepywoda.
1 parent fb0db77 commit 3a156b8

1 file changed, +1 -1 lines changed
core/src/main/scala/org/apache/spark/rdd/RDD.scala

Lines changed: 1 addition & 1 deletion
@@ -1068,7 +1068,7 @@ abstract class RDD[T: ClassTag](
         // Otherwise, interpolate the number of partitions we need to try, but overestimate it
         // by 50%.
         if (buf.size == 0) {
-          numPartsToTry = totalParts - 1
+          numPartsToTry = partsScanned * 2
         } else {
           numPartsToTry = (1.5 * num * partsScanned / buf.size).toInt
         }
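
To illustrate the effect of the changed line, here is a minimal, self-contained sketch of a take()-style scan loop using the new growth rule. It is not the actual RDD.take implementation: TakeSketch, scanBatches, and the use of a plain Seq[Seq[Int]] in place of an RDD's partitions are assumptions made for illustration. Only the two numPartsToTry formulas are copied from the diff.

```scala
import scala.collection.mutable.ArrayBuffer

// Hypothetical stand-in for the scan loop around the changed line.
// Each inner Seq models one partition; no Spark dependency is needed.
object TakeSketch {
  // Returns the elements taken and the size of each batch of partitions scanned.
  def scanBatches(partitions: Seq[Seq[Int]], num: Int): (List[Int], List[Int]) = {
    val buf = ArrayBuffer.empty[Int]
    val batches = ArrayBuffer.empty[Int]
    val totalParts = partitions.length
    var partsScanned = 0
    while (buf.size < num && partsScanned < totalParts) {
      var numPartsToTry = 1 // the first pass always looks at a single partition
      if (partsScanned > 0) {
        if (buf.isEmpty) {
          // New rule from the diff: grow relative to what was already scanned,
          // instead of jumping to totalParts - 1.
          numPartsToTry = partsScanned * 2
        } else {
          // Unchanged rule from the diff: interpolate, overestimating by 50%.
          numPartsToTry = (1.5 * num * partsScanned / buf.size).toInt
        }
      }
      // Cap the batch at the number of partitions that remain (assumption of this sketch).
      val end = math.min(partsScanned + numPartsToTry, totalParts)
      batches += (end - partsScanned)
      for (p <- partitions.slice(partsScanned, end)) buf ++= p.take(num - buf.size)
      partsScanned = end
    }
    (buf.toList, batches.toList)
  }

  def main(args: Array[String]): Unit = {
    // Ten partitions; only the eighth (index 7) holds any data.
    val parts = Seq.fill(7)(Seq.empty[Int]) ++ Seq(Seq(1, 2, 3)) ++ Seq.fill(2)(Seq.empty[Int])
    val (taken, batches) = scanBatches(parts, 2)
    println(s"taken=$taken batches=$batches")
  }
}
```

In this example the loop scans batches of 1, 2, and then 6 partitions before finding the two requested elements. With the old rule the second pass would have requested all 9 remaining partitions at once, which is the behavior the commit title describes as OOM-prone.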
