[SPARK-45597][PYTHON][SQL] Support creating table using a Python data source in SQL #44269
What changes were proposed in this pull request?
This PR adds a general framework to support any user-defined data source (name is not finalized yet) as a v2 source.
A user-defined data source is a data source defined outside the JVM. For now, it can only be a Python data source. A user-defined data source can be read with an arbitrary query plan. This PR looks up user-defined data sources and turns them into DataSourceV2Relation with a fake v2 table. A separate rule then turns DataSourceV2Relation into the query plan that the user-defined data source needs.
Why are the changes needed?
Unify the code to look up data sources.
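The lookup-then-rewrite flow described under "What changes were proposed" can be illustrated with a small, Spark-free toy model. This is only a conceptual sketch, not the code in this PR: every name below (UserDefinedSource, FakeTable, Relation, SourcePlan, register, lookup, rewrite, execute) is hypothetical and chosen for illustration, and the "query plan" is reduced to a plain Python callable that produces rows.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Iterable, List, Tuple

Row = Tuple[int, str]


@dataclass(frozen=True)
class UserDefinedSource:
    """Hypothetical stand-in for a data source defined outside the JVM."""
    name: str
    plan: Callable[[], Iterable[Row]]  # the source's own way of producing rows


@dataclass(frozen=True)
class FakeTable:
    """Placeholder table that only remembers which source it came from."""
    source: UserDefinedSource


@dataclass(frozen=True)
class Relation:
    """Toy analogue of a v2 relation wrapping a table."""
    table: FakeTable


@dataclass(frozen=True)
class SourcePlan:
    """Toy analogue of the plan the user-defined source actually needs."""
    plan: Callable[[], Iterable[Row]]


# Single registry, so every lookup goes through one code path.
REGISTRY: Dict[str, UserDefinedSource] = {}


def register(source: UserDefinedSource) -> None:
    REGISTRY[source.name] = source


def lookup(name: str) -> Relation:
    """Step 1: resolve a source name to a relation over a fake table."""
    if name not in REGISTRY:
        raise LookupError(f"unknown data source: {name}")
    return Relation(FakeTable(REGISTRY[name]))


def rewrite(node: object) -> object:
    """Step 2: a rule that replaces the relation with the source's own plan."""
    if isinstance(node, Relation) and isinstance(node.table, FakeTable):
        return SourcePlan(node.table.source.plan)
    return node  # anything else is left untouched


def execute(node: SourcePlan) -> List[Row]:
    return list(node.plan())


# Usage: register a source, resolve it by name, rewrite, then run.
register(UserDefinedSource("my_source", lambda: [(1, "a"), (2, "b")]))
relation = lookup("my_source")
rows = execute(rewrite(relation))
```

Keeping the fake table as a thin marker means name resolution only needs to know that a source exists, while the rewrite rule is the single place that knows how to turn that marker into something executable.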
Does this PR introduce any user-facing change?
Yes. Users can now use a Python data source in SQL CREATE TABLE.
How was this patch tested?
new test
Was this patch authored or co-authored using generative AI tooling?
No