[SPARK-52181] Increase variant size limit to 128MiB #50913
Conversation
@cloud-fan Could you help review? Thanks!
...cala/org/apache/spark/sql/catalyst/expressions/variant/VariantExpressionEvalUtilsSuite.scala
```diff
   }
   for (json <- Seq("\"" + "a" * (16 * 1024 * 1024) + "\"",
-      (0 to 4 * 1024 * 1024).mkString("[", ",", "]"))) {
+      (0 to 100 * 1024 * 1024).mkString("[", ",", "]"))) {
```
still can't trigger the size limitation, I'll leave it to @chenhao-db
Seems a real test failure in

@cloud-fan It is just another test that assumes the old limit. Fixed.

Thanks, merging to master!

@chenhao-db can you open a backport PR for 4.0? Thanks!
### What changes were proposed in this pull request?
Increase variant size limit from 16MiB to 128MiB. It is difficult to control the size limit with a flag, because the limit is accessed in many places where the SQL config is not available (e.g., `VariantVal.toString`). Future memory instability is possible, but this change won't break any existing workload.

### Why are the changes needed?
It enhances the ability of the variant data type to process larger data.

### Does this PR introduce _any_ user-facing change?
Yes, as stated above.

### How was this patch tested?
Unit test.

### Was this patch authored or co-authored using generative AI tooling?
No.

Closes apache#50913 from chenhao-db/variant_large_size_limit.

Lead-authored-by: Chenhao Li <[email protected]>
Co-authored-by: Wenchen Fan <[email protected]>
Signed-off-by: Wenchen Fan <[email protected]>
### What changes were proposed in this pull request?
This is a cherry-pick of #50913

Increase variant size limit from 16MiB to 128MiB. It is difficult to control the size limit with a flag, because the limit is accessed in many places where the SQL config is not available (e.g., `VariantVal.toString`). Future memory instability is possible, but this change won't break any existing workload.

### Why are the changes needed?
It enhances the ability of the variant data type to process larger data.

### Does this PR introduce _any_ user-facing change?
Yes, as stated above.

### How was this patch tested?
Unit test.

### Was this patch authored or co-authored using generative AI tooling?
No.

Closes #50927 from chenhao-db/variant_large_size_limit_4.0.

Authored-by: Chenhao Li <[email protected]>
Signed-off-by: Wenchen Fan <[email protected]>
It seems that
Could you take a look when you have time? @chenhao-db
```diff
-    // case is the input exceeds the variant size limit (16MiB).
-    val largeInput = "a" * (16 * 1024 * 1024)
+    // case is the input exceeds the variant size limit (128MiB).
+    val largeInput = "a" * (128 * 1024 * 1024)
```
Maybe the GA machines are not powerful enough to run tests that allocate 128 MB inputs. One idea is to have a hardcoded testing limit of 16 MB, e.g.:

```java
public static final int SIZE_LIMIT =
    System.getenv("SPARK_TESTING") != null ? U24_MAX + 1 : 128 * 1024 * 1024;
```
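(For context: `U24_MAX` is the largest unsigned 24-bit value, 0xFFFFFF, so `U24_MAX + 1` is exactly 16 MiB, the old limit.)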
should use `sys.env.contains("GITHUB_ACTIONS")`
Then local tests and GA tests will have different error messages, and we will have to complicate the test cases to make them pass in both environments.
I've opened #50950
### What changes were proposed in this pull request?
This is a follow-up of #50913. It sets the variant size limit back to 16 MB in the test environment, to reduce OOM in tests.

### Why are the changes needed?
Make tests more reliable.

### Does this PR introduce _any_ user-facing change?
No.

### How was this patch tested?
Existing tests.

### Was this patch authored or co-authored using generative AI tooling?
No.

Closes #50950 from cloud-fan/test.

Authored-by: Wenchen Fan <[email protected]>
Signed-off-by: yangjie01 <[email protected]>
This is a 4.0 backport of #50950

### What changes were proposed in this pull request?
This is a follow-up of #50913. It sets the variant size limit back to 16 MB in the test environment, to reduce OOM in tests.

### Why are the changes needed?
Make tests more reliable.

### Does this PR introduce _any_ user-facing change?
No.

### How was this patch tested?
Existing tests.

### Was this patch authored or co-authored using generative AI tooling?
No.

Closes #50954 from cloud-fan/test.

Authored-by: Wenchen Fan <[email protected]>
Signed-off-by: Wenchen Fan <[email protected]>
The instantiation of BigDecimal is used by
The above code can be refactored so that we get the scale and precision from the string directly. Date and timestamp parsing doesn't require a BigDecimal instance.

I have a feeling that increasing the limit without providing a new config knob may result in surprising OOM issues when the new release goes to production.
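For what it's worth, a minimal sketch of the kind of refactor being suggested here (a hypothetical helper, not Spark's actual parsing code; it assumes plain decimal strings with an optional sign and no exponent notation):

```java
// Hypothetical helper: derive scale and precision from a plain decimal
// string without allocating a BigDecimal.
public final class DecimalShape {
  /** Returns {scale, precision} for strings like "-123.4500" or "0.001". */
  public static int[] scaleAndPrecision(String s) {
    int start = (s.charAt(0) == '+' || s.charAt(0) == '-') ? 1 : 0;
    int dot = s.indexOf('.');
    // Scale: number of digits after the decimal point (0 if there is none).
    int scale = (dot < 0) ? 0 : s.length() - dot - 1;
    // Precision: significant digits, skipping the sign, the dot, and
    // leading zeros.
    int precision = 0;
    boolean significant = false;
    for (int i = start; i < s.length(); i++) {
      char c = s.charAt(i);
      if (c == '.') continue;
      if (c != '0') significant = true;
      if (significant) precision++;
    }
    return new int[] {scale, Math.max(precision, 1)}; // "0.00" -> precision 1
  }
}
```

For example, `scaleAndPrecision("-123.4500")` returns `{4, 7}`, matching `new BigDecimal("-123.4500").scale()` and `.precision()`, without the allocation.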
@tedyu I'm confused, how can increasing the limit bring new OOM issues? We didn't have a knob before this PR either.
I am still trying to understand the implication of the OOM in tests. The previous comment provided one refactor that avoids unnecessary creation of BigDecimals.
@tedyu you are welcome to add such a config, but I think it's hard, as the variant component is a standalone module that does not depend on Spark SQL.
It seems we can pass
where
Increasing the limit won't bring any new OOM to existing workloads. The new OOM in tests is because we changed the test code to use a bigger input. I'm not sure whether we are able to find all the use cases of
> The new OOM in tests is because we changed the test code to use a bigger input.

With the release containing this PR, some users would feed their workloads with bigger input.
@tedyu Please note that we don't have a size limitation for many data types: UTF8String, binary, struct/array/map values. I don't think VARIANT makes things worse. And it's a new feature, so there is nothing wrong with shipping it with 128 MB as the limit.
What changes were proposed in this pull request?
Increase variant size limit from 16MiB to 128MiB. It is difficult to control the size limit with a flag, because the limit is accessed in many places where the SQL config is not available (e.g., `VariantVal.toString`). Future memory instability is possible, but this change won't break any existing workload.
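For reference, a minimal sketch of the shape of the change, assuming the limit is the `SIZE_LIMIT` constant the review thread above mentions (the class name and exact location in the variant module are assumptions):

```java
// Illustrative sketch only; SIZE_LIMIT and U24_MAX are the names used in
// the review thread, but where they live is assumed here.
public final class VariantLimits {
  // Largest unsigned 24-bit value, so U24_MAX + 1 is exactly 16 MiB.
  public static final int U24_MAX = 0xFFFFFF;

  // Before this PR: U24_MAX + 1 (16 MiB). After: 128 MiB.
  public static final int SIZE_LIMIT = 128 * 1024 * 1024;
}
```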
Why are the changes needed?
It enhances the ability of the variant data type to process larger data.
Does this PR introduce any user-facing change?
Yes, as stated above.
How was this patch tested?
Unit test.
Was this patch authored or co-authored using generative AI tooling?
No.