Conversation

@wankunde (Contributor):

What changes were proposed in this pull request?

Add a memory reserve policy for WritableColumnVector:

  • If the vector capacity is < VECTORIZED_HUGE_VECTOR_THRESHOLD, reserve requested capacity * 2 memory.
  • If the vector capacity is >= VECTORIZED_HUGE_VECTOR_THRESHOLD, reserve requested capacity * VECTORIZED_HUGE_VECTOR_RESERVE_RATIO memory.
  • Free the WritableColumnVector memory if the vector capacity is >= VECTORIZED_HUGE_VECTOR_THRESHOLD.

This reuses the allocated array objects for small column vectors and frees the memory of huge column vectors (see the sketch after this list).
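Below is a minimal sketch of this reserve policy, illustrative only and not the exact merged code; MAX_CAPACITY stands in for WritableColumnVector's array-size cap, and the method name newCapacity is chosen here just for the example:

// Illustrative sketch; the real logic lives in WritableColumnVector#reserve.
static final int MAX_CAPACITY = Integer.MAX_VALUE - 15; // assumed cap

static int newCapacity(long requiredCapacity, long hugeVectorThreshold,
                       double hugeVectorReserveRatio) {
  if (hugeVectorThreshold < 0 || requiredCapacity < hugeVectorThreshold) {
    // Small vectors: reserve 2x the requested capacity so the arrays can be reused.
    return (int) Math.min(MAX_CAPACITY, requiredCapacity * 2L);
  }
  // Huge vectors: reserve only ratio * required (e.g. 1.2x); the memory is
  // released when the vector is reset instead of being kept for reuse.
  return (int) Math.min(MAX_CAPACITY, (long) (requiredCapacity * hugeVectorReserveRatio));
}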

Why are the changes needed?

When Spark reads a data file into a WritableColumnVector, the memory allocated by the WritableColumnVectors is not freed until the VectorizedColumnReader completes.

Reusing the allocated array objects saves memory allocation time, but it also holds on to too much unused memory after the current large vector batch has been read.

This PR adds a memory reserve policy for this scenario, which reuses the allocated array objects for small column vectors and frees the memory of huge column vectors.

![image](https://github.com/apache/spark/assets/3626747/a7a487bd-f184-4b24-bea0-75e530702887)
![image](https://github.com/apache/spark/assets/3626747/01d0268f-68e7-416f-b9b3-6c9d60919596)

Does this PR introduce any user-facing change?

No

How was this patch tested?

Added UT

@github-actions github-actions bot added the SQL label Jun 29, 2023
@wankunde wankunde changed the title [SPARK-44239][SQL] Free memory allocated by huge column vector [SPARK-44239][SQL] Free memory allocated by large vectors when vectors are reset Jun 29, 2023
@wankunde wankunde changed the title [SPARK-44239][SQL] Free memory allocated by large vectors when vectors are reset [WIP][SPARK-44239][SQL] Free memory allocated by large vectors when vectors are reset Jun 30, 2023
@wankunde wankunde changed the title [WIP][SPARK-44239][SQL] Free memory allocated by large vectors when vectors are reset [SPARK-44239][SQL] Free memory allocated by large vectors when vectors are reset Jul 3, 2023
@wankunde (Contributor Author) commented Jul 5, 2023:

Hi, @cloud-fan @sunchao @viirya Could you help to review this PR?

Contributor:

I think it's worth mentioning here that this ratio only takes effect when VECTORIZED_HUGE_VECTOR_THRESHOLD is enabled and the required memory is larger than the threshold.

Contributor Author:

Done

Contributor:

Maybe make this abstract as well? Also for defaultCapacity; it does seem to be necessary in the base abstract class.

Contributor Author:

Done

Contributor:

This may lead to counter-intuitive behavior: if the threshold is 1000, then requiring 999 might look more memory-hungry than requiring 1001. How about we make it

(requiredCapacity - hugeThreshold) * hugeReserveRatio + hugeThreshold * 2L

to be more consistent?

Contributor Author:

Thanks @liuzqt for your review.

(requiredCapacity - hugeThreshold) * hugeReserveRatio + hugeThreshold * 2L is equal to requiredCapacity * hugeReserveRatio + hugeThreshold * (2L - hugeReserveRatio). If hugeThreshold = 100M and requiredCapacity = 1000, before this change Spark would allocate only 2000 bytes, while after this change Spark would allocate 1000 * 1.2 + 100M * 0.8 ≈ 80M bytes. Wouldn't small column vectors then take up too much memory?

Contributor:

Oh sorry for the confusion, the above formula only applies to the else case (i.e., requiredCapacity >= hugeThreshold); for requiredCapacity < hugeThreshold it's still requiredCapacity * 2L as in your code.

if (hugeThreshold < 0 || requiredCapacity < hugeThreshold) {
    currentCapacity = (int) Math.min(MAX_CAPACITY, requiredCapacity * 2L);
} else {
    // we only change this branch
    currentCapacity = (int) Math.min(MAX_CAPACITY,
        (requiredCapacity - hugeThreshold) * hugeReserveRatio + hugeThreshold * 2L);
}

The idea is that for requiredCapacity >= hugeThreshold, only the part exceeding the threshold is multiplied by hugeReserveRatio.
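For illustration only: with hugeThreshold = 1000 and hugeReserveRatio = 1.2, requiring 999 reserves 999 * 2 = 1998, while requiring 1001 reserves (1001 - 1000) * 1.2 + 1000 * 2 ≈ 2001, so the reserved size stays continuous across the threshold instead of dropping from 2x to 1.2x.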

Contributor Author:

Thanks for your idea, I have updated the formula.

@liuzqt (Contributor) commented Aug 2, 2023:

LGTM, @cloud-fan mind taking another look?

@wankunde (Contributor Author):

Hi, @liuzqt @cloud-fan any thoughts about this PR?

Contributor:

We can use .bytesConf(ByteUnit.BYTE) so that people can set it to 1g, which is more convenient.

Contributor Author:

OK

Contributor:

is it an existing bug?

Contributor Author:

Before this PR there was no bug; after this PR, for Parquet tables whose columns have associated DEFAULT values, the result may be incorrect.
https://github.com/apache/spark/blob/master/sql/core/src/main/java/org/apache/spark/sql/execution/datasources/parquet/ParquetColumnVector.java#L84C15-L100

So I added a flag protected boolean hasDefaultValue = false; to indicate whether the current column vector has a default value, and we skip cleaning the data when it's true.

Contributor:

This looks a bit over-designed as we only have one policy; can we inline the logic in WritableColumnVector? And I feel it's not well designed to track the previously allocated memory size in a policy.

Contributor Author:

OK, I'll inline the policy logic.

And do you think we should track the previously allocated memory size?

@cloud-fan (Contributor), Aug 15, 2023:

Why can't we reset vectors with default values? Can't we reset the default value as well?

Contributor Author:

The default value was set into the column vector before reading data.

The default value lives in ParquetColumnVector, and its internal vector (final WritableColumnVector vector) does not know about the default value.

Contributor:

this does not match the config doc. It should be requiredCapacity * hugeVectorReserveRatio

Contributor Author:

Done

Contributor:

I'm not convinced about it. We can reset it and set the default value again, can't we?

@wankunde (Contributor Author), Aug 20, 2023:

I'm sorry, it seems to be another issue.

If a column vector has default values, we will always set isConstant to true, so we don't need the hasDefaultValue field. But we should also set isConstant to true for all its child vectors.
Change: 128c8ff

Contributor:

shall we just put it in ColumnVectorSuite?

Contributor Author:

Done

Contributor:

shall we add a new method releaseMemory to share the code between reset and close?
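A rough sketch of that refactor, using the hugeVectorThreshold, capacity, and defaultCapacity fields discussed in this thread (illustrative only, not the merged code; reserveInternal re-allocates the buffers at the given capacity):

public abstract class WritableColumnVector extends ColumnVector {
  // Subclasses (on-heap / off-heap) free their own buffers here.
  protected abstract void releaseMemory();

  @Override
  public void close() {
    releaseMemory();
  }

  public void reset() {
    // ... reset element counts, null counters, etc. ...
    if (hugeVectorThreshold > 0 && capacity > hugeVectorThreshold) {
      // Huge vectors give their memory back instead of keeping it for reuse.
      capacity = defaultCapacity;
      releaseMemory();
      reserveInternal(capacity);
    }
  }
}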

Contributor:

let's mark these variables as final if the value won't change

@cloud-fan (Contributor), Aug 18, 2023:

Can we set this to 1 and see if there are any test failures? If not, we can change it back to -1 and merge it.

Contributor Author:

[screenshot: test results]
When VECTORIZED_HUGE_VECTOR_THRESHOLD = 1, there are two UT failures, as expected.

Member:

Suggested change:
- .doc("spark will reserve requiredCapacity * this ratio memory next time. This is only " +
+ .doc("Spark reserves requiredCapacity * this ratio memory next time. This is only " +

Member:

Can we avoid using variable names such as requiredCapacity? Also, it's a bit difficult to understand from the description here. I think you should explain this more explicitly, including the default behaviour of reserving 2x.

@wankunde wankunde force-pushed the vector branch 2 times, most recently from cc1c1b2 to 5012869 Compare August 21, 2023 02:07
Contributor:

nit: if we can move the declaration of this function to the parent class, then reset and close can both be implemented in the parent class. The child classes only need to implement releaseMemory.

Contributor:

This change makes sense. Do you know why there was no problem before?

Contributor Author:

For example, for alter table t add column s array<int> default array(1, 2), Spark will create a vector for column s and a vector for the items of this column.
Before this PR, neither of those two vectors would be reset.
After this PR, the second vector would be reset without this change.


Contributor:

do we still need it?

Contributor Author:

Removed this function.

@wankunde (Contributor Author):

[error] spark-sql: Failed binary compatibility check against org.apache.spark:spark-sql_2.12:3.4.0! Found 2 potential problems (filtered 417)
[error]  * abstract method close()Unit in class org.apache.spark.sql.vectorized.ColumnVector does not have a correspondent in current version
[error]    filter with: ProblemFilters.exclude[DirectAbstractMethodProblem]("org.apache.spark.sql.vectorized.ColumnVector.close")
[error]  * abstract method close()Unit in class org.apache.spark.sql.vectorized.ColumnVector does not have a correspondent in current version
[error]    filter with: ProblemFilters.exclude[DirectAbstractMethodProblem]("org.apache.spark.sql.vectorized.ColumnVector.close")

}

@Override
public void close() {
Contributor:

This seems like a bug in MiMa... anyway, it's fine to have this workaround for MiMa.

"reserves required memory * 2 memory; otherwise, spark reserves " +
"required memory * this ratio memory, and will release this column vector memory before " +
"reading the next batch rows.")
.version("3.5.0")
Contributor:

one last comment: 3.5.0 is already at RC2 and it's too late to merge this feature to 3.5. Can we update it to 4.0.0?

.doc("When the required memory is larger than this, spark reserves required memory * " +
s"${VECTORIZED_HUGE_VECTOR_RESERVE_RATIO.key} memory next time and release this column " +
s"vector memory before reading the next batch rows. -1 means disabling the optimization.")
.version("3.5.0")
Contributor:

ditto

@wankunde wankunde requested a review from cloud-fan August 30, 2023 09:27
@cloud-fan (Contributor):

thanks, merging to master!

@cloud-fan cloud-fan closed this in 27f9ac2 Aug 30, 2023
@wankunde wankunde deleted the vector branch January 3, 2024 06:57
numNulls = 0;
}

if (hugeVectorThreshold > 0 && capacity > hugeVectorThreshold) {
Member:

shouldn't this be hugeVectorThreshold > -1?

Contributor Author:

If hugeVectorThreshold == 0, or hugeVectorThreshold is a small value, the ColumnVector will always releaseMemory() and reserve new memory, which may be slower than before.

Member:

I know, but according to the doc and impl, this should be > -1, right?

Contributor Author:

Yes, the doc and the code don't match. Sorry.

Member:

Can you send a followup?

Contributor Author:

Sorry for the late reply; filed a follow-up PR: https://github.com/apache/spark/pull/47988/files

public abstract class WritableColumnVector extends ColumnVector {
private final byte[] byte8 = new byte[8];

protected abstract void releaseMemory();
Member:

@cloud-fan do we treat WritableColumnVector as a public API? If so, we should give it a default implementation instead of an abstract method; otherwise, a third-party subclass that doesn't implement this method will fail with

java.lang.AbstractMethodError: org.apache.spark.sql.execution.vectorized.WritableColumnVector.releaseMemory()V
	at org.apache.spark.sql.execution.vectorized.WritableColumnVector.close(WritableColumnVector.java:92)
	at io.glutenproject.vectorized.ArrowWritableColumnVector.close(ArrowWritableColumnVector.java:362)

Contributor:

It's not a public API. I think third-party libs should update and recompile their code when upgrading Spark versions if private APIs were used.

Member:

got it, thanks for the information.

turboFei pushed a commit to turboFei/spark that referenced this pull request Nov 6, 2025
…s are reset (apache#237)

Closes apache#41782 from wankunde/vector.

Lead-authored-by: Kun Wan <[email protected]>

Signed-off-by: Wenchen Fan <[email protected]>
Co-authored-by: Kun Wan <[email protected]>