[SPARK-24923][SQL][WIP] Add unpartitioned CTAS and RTAS support for DataSourceV2 #21877
@cloud-fan, @gatorsmile, @marmbrus, this PR demonstrates how plans would use the catalog changes introduced in #21306. To see the changes, you may want to look at just the last commit, because this PR also includes changes from other PRs.
Expression is internal and should not be used in public APIs. To avoid using Expression in the TableCatalog API, this commit adds a small set of transformations that are used to communicate partitioning to catalog implementations. This also adds an apply transformation that passes the name of a transform instead of a Transform class. This can be used to pass transforms that are unknown to Spark to the underlying catalog implementation.
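As a rough illustration of the idea in this commit message, the sketch below models partition transforms as a small closed set of types plus a name-based fallback. All names here (`PartitionTransform`, `Identity`, `Bucket`, `Apply`, `TransformSketch`) are hypothetical and chosen for illustration; they are not taken from the PR and may not match the classes it adds.

```scala
// Hypothetical model of partition transforms; not Spark's actual classes.
sealed trait PartitionTransform

// Transforms the engine knows about are modeled as dedicated types.
final case class Identity(column: String) extends PartitionTransform
final case class Bucket(numBuckets: Int, column: String) extends PartitionTransform

// Fallback that carries only a transform name and its column arguments,
// so a transform the engine does not know can still reach the catalog.
final case class Apply(name: String, columns: Seq[String]) extends PartitionTransform

object TransformSketch {
  // Renders a transform as text, the way a catalog implementation might
  // inspect it without depending on any internal expression classes.
  def describe(t: PartitionTransform): String = t match {
    case Identity(column)          => s"identity($column)"
    case Bucket(numBuckets, column) => s"bucket($numBuckets, $column)"
    case Apply(name, columns)      => s"$name(${columns.mkString(", ")})"
  }
}
```

Under these assumptions, `TransformSketch.describe(Apply("zorder", Seq("x", "y")))` yields `zorder(x, y)`: the catalog receives the name `zorder` even though no dedicated class for it exists.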
This uses the catalog API introduced in SPARK-24252 to implement CTAS and RTAS plans.
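To make the difference between CTAS (create table as select) and RTAS (replace table as select) concrete, here is a minimal sketch against an in-memory stand-in for a catalog. `InMemoryCatalog` and `TableWrites` are hypothetical names and do not reflect the catalog API from SPARK-24252 or the plans in this PR. The drop-then-create behavior for RTAS, including creating the table when it did not previously exist, is an assumption of this sketch.

```scala
import scala.collection.mutable
import scala.util.control.NonFatal

// Hypothetical in-memory catalog; not Spark's TableCatalog interface.
final class InMemoryCatalog {
  private val tables = mutable.Map.empty[String, Seq[String]]

  def tableExists(name: String): Boolean = tables.contains(name)

  // Fails with IllegalArgumentException if the table already exists.
  def createTable(name: String): Unit = {
    require(!tableExists(name), s"Table already exists: $name")
    tables(name) = Seq.empty
  }

  // Returns true if a table was removed.
  def dropTable(name: String): Boolean = tables.remove(name).isDefined

  def append(name: String, rows: Seq[String]): Unit = {
    tables(name) = tables(name) ++ rows
  }

  def rows(name: String): Seq[String] = tables(name)
}

object TableWrites {
  // CTAS: create the table, then write the query result into it.
  // If the write fails, drop the table that was just created.
  def createTableAsSelect(
      catalog: InMemoryCatalog,
      name: String,
      query: () => Seq[String]): Unit = {
    catalog.createTable(name)
    try {
      catalog.append(name, query())
    } catch {
      case NonFatal(e) =>
        catalog.dropTable(name)
        throw e
    }
  }

  // RTAS (assumed semantics): drop any existing table, then behave like CTAS.
  // This is not atomic; a failed write leaves no table behind.
  def replaceTableAsSelect(
      catalog: InMemoryCatalog,
      name: String,
      query: () => Seq[String]): Unit = {
    catalog.dropTable(name)
    createTableAsSelect(catalog, name, query)
  }
}
```

In this sketch, calling `createTableAsSelect` twice for the same name fails on the second call and leaves the existing rows untouched, while `replaceTableAsSelect` discards the old rows and writes the new query result.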
We're closing this PR because it hasn't been updated in a while. This isn't a judgement on the merit of the PR in any way. It's just a way of keeping the PR queue manageable.
What changes were proposed in this pull request?
- ReadSupport and WriteSupport classes for use with Table
- TableCatalog to DataFrameReader and DataFrameWriter
- TableV2Relation for tables that are loaded by TableCatalog and have no DataSource instance
- DataSourceV2Implicits to avoid future churn

Note that this doesn't handle partitionBy in DataFrameWriter. Adding support for partitioned tables will require validation rules.

This is based on unmerged work and includes the commits from #21306 and #21305.
How was this patch tested?
Adding unit tests for CTAS and RTAS.