@chenhao-db
Contributor

What changes were proposed in this pull request?

Increase variant size limit from 16MiB to 128MiB. It is difficult to control the size limit with a flag, because the limit is accessed in many places where the SQL config is not available (e.g., VariantVal.toString). Future memory instability is possible, but this change won't break any existing workload.

Why are the changes needed?

It enhances the ability of the variant data type to process larger data.

Does this PR introduce any user-facing change?

Yes, as stated above.

How was this patch tested?

Unit test.

Was this patch authored or co-authored using generative AI tooling?

No.

@github-actions github-actions bot added the SQL label May 16, 2025
@chenhao-db
Contributor Author

@cloud-fan Could you help review? Thanks!

…essions/variant/VariantExpressionEvalUtilsSuite.scala
}
  for (json <- Seq("\"" + "a" * (16 * 1024 * 1024) + "\"",
-   (0 to 4 * 1024 * 1024).mkString("[", ",", "]"))) {
+   (0 to 100 * 1024 * 1024).mkString("[", ",", "]"))) {
Contributor

still can't trigger the size limitation, I'll leave it to @chenhao-db

@cloud-fan
Contributor

This looks like a real test failure: `from_csv with variant *** FAILED ***`

@chenhao-db
Contributor Author

@cloud-fan It is just another test that assumes the old limit. Fixed.

@cloud-fan
Contributor

cloud-fan commented May 16, 2025

thanks, merging to master!

@cloud-fan cloud-fan closed this in af6499f May 16, 2025
@cloud-fan
Contributor

@chenhao-db can you open a backport PR for 4.0? thanks!

chenhao-db added a commit to chenhao-db/oss that referenced this pull request May 16, 2025
Closes apache#50913 from chenhao-db/variant_large_size_limit.

Lead-authored-by: Chenhao Li <[email protected]>
Co-authored-by: Wenchen Fan <[email protected]>
Signed-off-by: Wenchen Fan <[email protected]>
cloud-fan pushed a commit that referenced this pull request May 17, 2025
### What changes were proposed in this pull request?

This is a cherry-pick of #50913

Increase variant size limit from 16MiB to 128MiB. It is difficult to control the size limit with a flag, because the limit is accessed in many places where the SQL config is not available (e.g., `VariantVal.toString`). Future memory instability is possible, but this change won't break any existing workload.

### Why are the changes needed?

It enhances the ability of the variant data type to process larger data.

### Does this PR introduce _any_ user-facing change?

Yes, as stated above.

### How was this patch tested?

Unit test.

### Was this patch authored or co-authored using generative AI tooling?

No.

Closes #50927 from chenhao-db/variant_large_size_limit_4.0.

Authored-by: Chenhao Li <[email protected]>
Signed-off-by: Wenchen Fan <[email protected]>
@LuciferYang
Contributor

LuciferYang commented May 19, 2025

- // case is the input exceeds the variant size limit (16MiB).
- val largeInput = "a" * (16 * 1024 * 1024)
+ // case is the input exceeds the variant size limit (128MiB).
+ val largeInput = "a" * (128 * 1024 * 1024)
Contributor

Maybe the GA machines are not powerful enough to run tests that take 128 MB. One idea is to have a hardcoded testing limit of 16 MB. e.g.

public static final int SIZE_LIMIT = System.getenv("SPARK_TESTING") != null ? U24_MAX + 1 : 128 * 1024 * 1024;
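
A minimal, self-contained sketch of the idea suggested above, in case it helps the discussion. The class name `SizeLimitSketch` and the helper `sizeLimit` are hypothetical, and `U24_MAX = 0xFFFFFF` is an assumption about the existing constant (so that `U24_MAX + 1` equals 16 MiB). Splitting the decision into a pure helper is only there so both branches can be checked without changing the environment.

```java
public final class SizeLimitSketch {
    // Assumption: largest unsigned 24-bit value, so U24_MAX + 1 is 16 MiB.
    static final int U24_MAX = 0xFFFFFF;
    static final int PRODUCTION_LIMIT = 128 * 1024 * 1024;

    // Pure helper: returns the smaller limit when running under tests.
    static int sizeLimit(boolean testing) {
        return testing ? U24_MAX + 1 : PRODUCTION_LIMIT;
    }

    // Resolved once, at class-load time, from the environment.
    public static final int SIZE_LIMIT =
        sizeLimit(System.getenv("SPARK_TESTING") != null);
}
```

Because the value is fixed when the class is loaded, a test cannot switch limits mid-run; that matches the trade-off raised in this thread about error messages differing between environments.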

Contributor

should use sys.env.contains("GITHUB_ACTIONS")

Contributor

Then local tests and GA tests would produce different error messages, and we would have to complicate the test cases to make them pass in both environments.

Contributor

I've opened #50950

LuciferYang pushed a commit that referenced this pull request May 20, 2025
### What changes were proposed in this pull request?

This is a follow-up of #50913 . It sets the variant size limit back to 16 MB in test environment, to reduce OOM in tests.

### Why are the changes needed?

make tests more reliable.

### Does this PR introduce _any_ user-facing change?

no

### How was this patch tested?

existing tests

### Was this patch authored or co-authored using generative AI tooling?

no

Closes #50950 from cloud-fan/test.

Authored-by: Wenchen Fan <[email protected]>
Signed-off-by: yangjie01 <[email protected]>
cloud-fan added a commit to cloud-fan/spark that referenced this pull request May 20, 2025
cloud-fan added a commit that referenced this pull request May 20, 2025
This is a 4.0 backport of #50950

### What changes were proposed in this pull request?

This is a follow-up of #50913 . It sets the variant size limit back to 16 MB in test environment, to reduce OOM in tests.

### Why are the changes needed?

make tests more reliable.

### Does this PR introduce _any_ user-facing change?

no

### How was this patch tested?

existing tests

### Was this patch authored or co-authored using generative AI tooling?

no

Closes #50954 from cloud-fan/test.

Authored-by: Wenchen Fan <[email protected]>
Signed-off-by: Wenchen Fan <[email protected]>
@tedyu
Contributor

tedyu commented May 21, 2025

The instantiation of BigDecimal is used by VariantValueConverter.parseDecimal.

      def parseDecimal(): DataType = {
        try {
          var d = decimalParser(s)
          if (d.scale() < 0) {
            d = d.setScale(0)
          }
          if (d.scale() <= VariantUtil.MAX_DECIMAL16_PRECISION &&
            d.precision() <= VariantUtil.MAX_DECIMAL16_PRECISION) {
            builder.appendDecimal(d)
            // The actual decimal type doesn't matter. `appendDecimal` will use the smallest
            // possible decimal type to store the value.
            DecimalType.USER_DEFAULT
          } else {
            if (options.preferDate) parseDate() else parseTimestampNTZ()
          }

The above code can be refactored so that we get scale and precision from the string directly. Date and timestamp parsing doesn't require a BigDecimal instance.
This way, memory consumption can be reduced.
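
To make the suggestion concrete, here is a rough sketch of reading precision and scale straight from the text, so that a `BigDecimal` is only built when the value would actually be stored as a decimal. Everything here is hypothetical (`DecimalShape`, `of`, `fitsWithin`) and not part of Spark. It assumes a plain literal with an optional sign, ASCII digits and at most one `.`; exponent notation and locale-specific formats are not handled and return `null`, in which case the caller would fall back to the existing `decimalParser` path.

```java
public final class DecimalShape {
    final int precision;
    final int scale;

    private DecimalShape(int precision, int scale) {
        this.precision = precision;
        this.scale = scale;
    }

    // Mirrors the check in parseDecimal: both scale and precision must fit.
    boolean fitsWithin(int maxPrecision) {
        return precision <= maxPrecision && scale <= maxPrecision;
    }

    // Returns null when s is not a plain decimal literal.
    static DecimalShape of(String s) {
        int i = 0;
        int n = s.length();
        if (n > 0 && (s.charAt(0) == '+' || s.charAt(0) == '-')) {
            i++;
        }
        int digits = 0;       // all digits seen
        int significant = 0;  // digits left after dropping leading zeros
        int scale = 0;        // digits after the decimal point
        boolean seenDot = false;
        for (; i < n; i++) {
            char c = s.charAt(i);
            if (c == '.') {
                if (seenDot) return null;
                seenDot = true;
            } else if (c >= '0' && c <= '9') {
                digits++;
                if (seenDot) scale++;
                if (significant > 0 || c != '0') significant++;
            } else {
                return null; // exponent, whitespace, grouping separators, ...
            }
        }
        if (digits == 0) return null;
        // A value of zero still has precision 1, as in BigDecimal.
        return new DecimalShape(Math.max(significant, 1), scale);
    }
}
```

For plain literals the result is intended to agree with `BigDecimal.precision()` and `BigDecimal.scale()`; whether the saved allocation matters in practice would need measuring.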

I have a feeling that increasing the limit without providing a new config knob may result in surprising OOM issues when the new release goes to production.

@cloud-fan
Contributor

@tedyu I'm confused: how can increasing the limit bring new OOM issues? We didn't have a knob before this PR either.

@tedyu
Contributor

tedyu commented May 22, 2025

I am still trying to understand the implication of OOM in tests. The previous comment proposed one refactor that avoids unnecessary creation of BigDecimal instances.
In my opinion, providing a config knob for the limit would give us enough room for handling potential issues after release.

@cloud-fan
Contributor

@tedyu you are welcome to add such a config, but I think it's hard, as the variant component is a standalone module that does not depend on Spark SQL.

@tedyu
Contributor

tedyu commented May 22, 2025

It seems we can pass Configuration explicitly to VariantVal.toString:

def toString(conf: VariantConf): String

where VariantConf is a lightweight case class.
This requires a large change across modules. Not sure it is worth the effort.

@chenhao-db
Contributor Author

Increasing the limit won't bring any new OOM to existing workloads. The new OOM in tests is because we changed the test code to use a bigger input.

I'm not sure whether we are able to find all the use cases of toString, which can happen in many places implicitly. Even if you provide this new version of toString, you cannot easily guarantee the no-parameter version is not called.
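
A small illustration of that concern, using made-up types rather than the real classes: `VariantConf` and `FakeVariant` below are hypothetical stand-ins. Even when a config-taking overload exists, string concatenation (and similar implicit conversions) still goes through the no-argument `toString()`, so the caller's limit is silently ignored there.

```java
public final class ToStringOverloadSketch {
    // Hypothetical lightweight config holder; not an existing Spark class.
    static final class VariantConf {
        final int sizeLimit;
        VariantConf(int sizeLimit) { this.sizeLimit = sizeLimit; }
    }

    // Hypothetical stand-in for a variant value.
    static final class FakeVariant {
        static final int DEFAULT_LIMIT = 128 * 1024 * 1024;
        private final byte[] value;

        FakeVariant(byte[] value) { this.value = value; }

        // Config-aware overload: enforces the caller's limit.
        String toString(VariantConf conf) {
            if (value.length > conf.sizeLimit) {
                throw new IllegalStateException(
                    "variant exceeds " + conf.sizeLimit + " bytes");
            }
            return "variant(" + value.length + " bytes)";
        }

        // No-argument version: has no access to a config, so it can only
        // fall back to a hardcoded default.
        @Override
        public String toString() {
            return toString(new VariantConf(DEFAULT_LIMIT));
        }
    }

    // Concatenation calls toString() implicitly, bypassing any VariantConf.
    static String implicitCall(FakeVariant v) {
        return "value=" + v;
    }
}
```

With a 32-byte value and a 16-byte `VariantConf`, the explicit overload throws while `implicitCall` succeeds, which is the gap described above.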

@tedyu
Contributor

tedyu commented May 22, 2025

> The new OOM in tests is because we changed the test code to use a bigger input.

With the release containing this PR, some users would feed their workloads with bigger input.

@cloud-fan
Contributor

cloud-fan commented May 23, 2025

@tedyu Please note that we don't have a size limitation for many data types: UTF8String, binary, struct/array/map values. I don't think VARIANT makes things worse. And it's a new feature, so there is nothing wrong with shipping it using 128 MB as the limit.

yhuang-db pushed a commit to yhuang-db/spark that referenced this pull request Jun 9, 2025
yhuang-db pushed a commit to yhuang-db/spark that referenced this pull request Jun 9, 2025
zifeif2 pushed a commit to zifeif2/spark that referenced this pull request Nov 14, 2025
zifeif2 pushed a commit to zifeif2/spark that referenced this pull request Nov 14, 2025