[SPARK-23321][SQL]: Validate datasource v2 writes #20488
Closed
What changes were proposed in this pull request?
DataSourceV2 does not currently apply any validation rules when writing. Other write paths attempt to validate that a DataFrame can be written to a target table or path, and these changes add the same logic to v2.
This updates the logical plan to use InsertIntoTable and applies the insert preprocess rules to writes. It also adds a conversion rule from InsertIntoTable to DataSourceV2Write because InsertIntoTable cannot be used in logical plans after analysis.
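To make the two-step flow concrete, here is a minimal, Spark-free sketch in Python. It is a toy model, not the code in this PR: only the names InsertIntoTable and DataSourceV2Write come from the description above, while `AnalysisError`, `preprocess_insert`, `convert_to_v2_write`, and `analyze` are hypothetical names chosen for illustration. The validation shown (matching column count and exact type equality) is an assumed simplification of what the insert preprocess rules do.

```python
from dataclasses import dataclass
from typing import List, Tuple

# A schema is modeled as a list of (column name, type name) pairs.
Schema = List[Tuple[str, str]]


class AnalysisError(Exception):
    """Hypothetical stand-in for an analysis-time failure."""


@dataclass(frozen=True)
class InsertIntoTable:
    """Toy logical node: only valid before analysis completes."""
    table_schema: Schema
    query_schema: Schema


@dataclass(frozen=True)
class DataSourceV2Write:
    """Toy logical node: what the plan holds after analysis."""
    table_schema: Schema
    query_schema: Schema


def preprocess_insert(plan: InsertIntoTable) -> InsertIntoTable:
    # Assumed simplification: require the same number of columns and
    # identical type names, compared by position.
    if len(plan.query_schema) != len(plan.table_schema):
        raise AnalysisError(
            f"expected {len(plan.table_schema)} columns, "
            f"got {len(plan.query_schema)}"
        )
    for (t_name, t_type), (_, q_type) in zip(plan.table_schema, plan.query_schema):
        if t_type != q_type:
            raise AnalysisError(
                f"cannot write {q_type} to column {t_name} of type {t_type}"
            )
    return plan


def convert_to_v2_write(plan: InsertIntoTable) -> DataSourceV2Write:
    # Conversion step: replace the pre-analysis node with the v2 write node.
    return DataSourceV2Write(plan.table_schema, plan.query_schema)


def analyze(plan: InsertIntoTable) -> DataSourceV2Write:
    # Validate first, then convert, so an invalid write never reaches v2.
    return convert_to_v2_write(preprocess_insert(plan))
```

In this sketch, calling `analyze` on a plan whose query schema has too few columns raises `AnalysisError` before any conversion happens, which mirrors the intent of running the preprocess rule ahead of the conversion rule.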
InsertIntoTable is not necessarily the right logical plan. It assumes that the table exists and can report its schema. I would like to hear suggestions for what the correct logical plan is. Other paths appear to use a source-specific command.
This also relies on changes in #20387, which hasn't been merged yet.
How was this patch tested?
Added a test that fails analysis in the preprocess rule.