[SPARK-27938][SQL] Remove feature flag LEGACY_PASS_PARTITION_BY_AS_OPTIONS #24784
Test build #106123 has finished for PR 24784 at commit
sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala
  "turning the flag on provides a way for these sources to see these partitionBy columns.")
  .booleanConf
- .createWithDefault(false)
+ .createWithDefault(true)
Actually, why do we need this configuration if it's not invasive?
I think it was to avoid possible regression for the release 2.4.3, which was reasonable at that time.
If it's not intrusive, then strictly speaking there should be no regression. But yes, I understand it was an extra precaution. Can we remove it now in master?
Agree, we should be removing these in 3.0 rather than defaulting back to legacy behavior.
Sounds good to me. I will go ahead and remove this feature flag altogether if there are no objections.
+1
cc @gatorsmile
+1
I have updated the PR to remove the config. Thanks for the feedback!
Test build #106256 has finished for PR 24784 at commit
LGTM pending Jenkins.
Test build #106258 has finished for PR 24784 at commit
  .save()

val partColumns = LastOptions.parameters(DataSourceUtils.PARTITIONING_COLUMNS_KEY)
assert(DataSourceUtils.decodePartitioningColumns(partColumns) === Seq("col1", "col2"))
`decodePartitioningColumns` is under the execution package, which is not supposed to be exposed, so users shouldn't use this util directly.
Did we document this option in any public datasource v1 API? We should also say that its value is a JSON string.
LGTM too. Strictly speaking, #24784 (comment) can be done separately. Merged to master.
…TIONS

## What changes were proposed in this pull request?

In PR apache#24365, we pass in the partitionBy columns as options in `DataFrameWriter`. To make this change less intrusive for a patch release, we added a feature flag `LEGACY_PASS_PARTITION_BY_AS_OPTIONS` with the default to be false.

For 3.0, we should just do the correct behavior for DSV1, i.e., always passing partitionBy as options, and remove this legacy feature flag.

## How was this patch tested?

Existing tests.

Closes apache#24784 from liwensun/SPARK-27453-default.

Authored-by: liwensun <[email protected]>
Signed-off-by: HyukjinKwon <[email protected]>
What changes were proposed in this pull request?
In PR #24365, we pass in the partitionBy columns as options in `DataFrameWriter`. To make this change less intrusive for a patch release, we added a feature flag `LEGACY_PASS_PARTITION_BY_AS_OPTIONS` with the default to be false.

For 3.0, we should just do the correct behavior for DSV1, i.e., always passing partitionBy as options, and remove this legacy feature flag.
How was this patch tested?
Existing tests.
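
For readers wondering what "passing partitionBy as options" looks like from a DSV1 source's side, here is a minimal sketch. It assumes, as the review discussion indicates, that the option value is a JSON array of column names. The object name `PartitionByOptionSketch`, the helper `partitionColumns`, and the literal key `"__partition_columns"` are illustrative assumptions; the key is believed to be the value of `DataSourceUtils.PARTITIONING_COLUMNS_KEY`, but check it against your Spark version. The sketch uses json4s (which ships with Spark) so that it avoids the internal `decodePartitioningColumns` util.

```scala
import org.json4s.{Formats, NoTypeHints}
import org.json4s.jackson.Serialization

// Hypothetical helper, not part of Spark: shows how a DSV1 source might read
// the partitionBy columns out of the options map it receives on write.
object PartitionByOptionSketch {
  // Assumption: the string behind DataSourceUtils.PARTITIONING_COLUMNS_KEY.
  // Verify this against the Spark version you build against.
  val PartitioningColumnsKey = "__partition_columns"

  implicit val formats: Formats = Serialization.formats(NoTypeHints)

  // Returns the partition columns encoded in the options, or an empty Seq
  // when the writer did not call partitionBy (the key is then absent).
  def partitionColumns(parameters: Map[String, String]): Seq[String] =
    parameters.get(PartitioningColumnsKey) match {
      case Some(json) => Serialization.read[Seq[String]](json) // JSON array of names
      case None       => Seq.empty
    }
}
```

A source's `createRelation` could call `PartitionByOptionSketch.partitionColumns(parameters)` on the options it is handed; with this PR merged, the key is expected to be present whenever the writer used `partitionBy`, with no feature flag involved.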