
Commit aba837b

viper-kun authored and CodingCat committed
[SPARK-9973] [SQL] Correct in-memory columnar buffer size

The `initialSize` argument of `ColumnBuilder.initialize()` should be the number of rows rather than the number of bytes. However, `InMemoryColumnarTableScan` passes in a byte size, which makes Spark SQL allocate more memory than necessary when building in-memory columnar buffers.

Author: Kun Xu <[email protected]>

Closes apache#8189 from viper-kun/errorSize.
1 parent: 6790328 · commit: aba837b
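
To make the over-allocation concrete, here is a small self-contained Scala sketch (not Spark code; the names batchSize and defaultSize mirror the patch, and the default batch size of 10000 rows comes from spark.sql.inMemoryColumnarStorage.batchSize). Because `ColumnBuilder.initialize()` treats its size argument as a row count and multiplies by the column type's per-value size internally, pre-multiplying in the caller, as the old code did, applied defaultSize twice:

    object BufferSizeDemo {
      // Mirrors the contract of ColumnBuilder.initialize(): `initialSize` is a
      // ROW COUNT, and the builder converts rows to bytes itself.
      def allocatedBytes(initialSize: Int, defaultSize: Int): Long =
        4L + initialSize.toLong * defaultSize // 4-byte type-id header + payload

      def main(args: Array[String]): Unit = {
        val batchSize   = 10000 // rows per cached batch (config default)
        val defaultSize = 8     // bytes per value, e.g. for a LongType column

        // Before the fix: a byte size was passed, so defaultSize applied twice.
        val before = allocatedBytes(defaultSize * batchSize, defaultSize) // ~640 KB
        // After the fix: the row count is passed, as the API expects.
        val after = allocatedBytes(batchSize, defaultSize) // ~80 KB

        println(s"before: $before bytes, after: $after bytes")
      }
    }

For an 8-byte column type the old code thus allocated roughly eight times the needed buffer, and wider types fared proportionally worse.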

File tree

1 file changed: +1 −2 lines


sql/core/src/main/scala/org/apache/spark/sql/columnar/InMemoryColumnarTableScan.scala

Lines changed: 1 addition & 2 deletions
@@ -121,8 +121,7 @@ private[sql] case class InMemoryRelation(
       def next(): CachedBatch = {
         val columnBuilders = output.map { attribute =>
           val columnType = ColumnType(attribute.dataType)
-          val initialBufferSize = columnType.defaultSize * batchSize
-          ColumnBuilder(attribute.dataType, initialBufferSize, attribute.name, useCompression)
+          ColumnBuilder(attribute.dataType, batchSize, attribute.name, useCompression)
         }.toArray

         var rowCount = 0
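
For reference, a hedged paraphrase of what the builder does with that second argument (simplified from memory of the Spark 1.5 codebase; the stub type, field names, and default row count here are illustrative, not Spark's actual internals). The point is that the rows-to-bytes conversion happens inside `initialize()`, which is why the caller must pass `batchSize` rather than a byte size:

    import java.nio.{ByteBuffer, ByteOrder}

    // Simplified stand-in for org.apache.spark.sql.columnar.ColumnType.
    case class ColumnTypeStub(typeId: Int, defaultSize: Int)

    class ColumnBuilderSketch(columnType: ColumnTypeStub) {
      private val DefaultInitialRowCount = 1024 // placeholder, not Spark's default
      private var buffer: ByteBuffer = _

      // `initialSize` is a row count; rows are converted to bytes HERE, inside
      // the builder, so callers must not pre-multiply by defaultSize.
      def initialize(initialSize: Int): Unit = {
        val rows = if (initialSize == 0) DefaultInitialRowCount else initialSize
        buffer = ByteBuffer.allocate(4 + rows * columnType.defaultSize)
        buffer.order(ByteOrder.nativeOrder()).putInt(columnType.typeId)
      }
    }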
