@cloud-fan
Contributor

What changes were proposed in this pull request?

This PR adds a general framework for supporting any user-defined data source (the name is not finalized yet) as a v2 source.

A user-defined data source is a data source defined outside of the JVM. For now, the only such source is the Python data source. Reading a user-defined data source may require an arbitrary query plan. This PR looks up user-defined data sources and turns each one into a DataSourceV2Relation with a fake v2 table. A separate rule then turns that DataSourceV2Relation into the query plan that the user-defined data source needs.
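
The two-step flow described above (look up the source as a relation over a fake table, then rewrite that relation into the plan the source needs) can be illustrated with a small toy model. This is only a sketch in plain Python, not the Spark implementation: `FakeTable`, `V2Relation`, `CustomReadPlan`, `USER_DEFINED_SOURCES`, `lookup`, and `rewrite` are all hypothetical names made up for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Dict

# NOTE: every name in this block is a hypothetical stand-in, not a Spark class.


@dataclass(frozen=True)
class FakeTable:
    """Placeholder v2 table that only remembers which source it stands for."""
    source_name: str


@dataclass(frozen=True)
class V2Relation:
    """Stand-in for a DataSourceV2Relation node in the logical plan."""
    table: object


@dataclass(frozen=True)
class CustomReadPlan:
    """Stand-in for the arbitrary query plan a user-defined source needs."""
    description: str


# Registry of user-defined sources: name -> function that builds the read plan.
USER_DEFINED_SOURCES: Dict[str, Callable[[], CustomReadPlan]] = {
    "my_python_source": lambda: CustomReadPlan(
        "run Python reader for my_python_source"
    ),
}


def lookup(name: str) -> V2Relation:
    """Step 1: resolve a source name to a relation over a fake table."""
    if name not in USER_DEFINED_SOURCES:
        raise KeyError(f"unknown user-defined data source: {name}")
    return V2Relation(FakeTable(name))


def rewrite(plan: object) -> object:
    """Step 2: replace relations over fake tables with the source's own plan."""
    if isinstance(plan, V2Relation) and isinstance(plan.table, FakeTable):
        return USER_DEFINED_SOURCES[plan.table.source_name]()
    # Relations over ordinary tables are left untouched.
    return plan


resolved = lookup("my_python_source")
final_plan = rewrite(resolved)
print(final_plan)
```

In this sketch the fake table carries no data or schema of its own. It exists only so that the generic lookup path can return an ordinary relation node, which the later rule recognizes and replaces.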

Why are the changes needed?

To unify the code that looks up data sources.

Does this PR introduce any user-facing change?

Yes. Users can now use a Python data source in SQL CREATE TABLE.

How was this patch tested?

new test

Was this patch authored or co-authored using generative AI tooling?

No
