[SPARK-49163][SQL] Attempt to create table based on broken parquet partition data results should return user-facing error #47668
What changes were proposed in this pull request?
Create an example parquet table with partitions and insert data in Spark:
Go into the `parquet-test` path in the filesystem and copy a parquet data file from the `col1=a/col2=b` directory into `col1=a`. After that, try to create a new table based on the parquet data in Spark:
This query fails with an internal error. Stack trace excerpts:
Fix this by replacing the internal error with a user-facing error.
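The filesystem side of the reproduction steps above can be illustrated without Spark. The following is a minimal sketch and not Spark's partition discovery code: the helper names (`partition_columns`, `find_conflicting_layouts`, `build_broken_table`), the data file name, and the placeholder file contents are all assumptions made for illustration. It builds the broken layout in a temporary directory and shows that data files end up under two different sets of partition columns.

```python
import shutil
import tempfile
from pathlib import Path


def partition_columns(leaf_dir: Path, root: Path) -> tuple:
    """Return the partition column names encoded in a directory path.

    Hypothetical helper: only looks at `name=value` path segments.
    """
    relative = leaf_dir.relative_to(root)
    return tuple(part.split("=", 1)[0] for part in relative.parts if "=" in part)


def find_conflicting_layouts(root: Path) -> dict:
    """Map each distinct partition-column tuple to the directories using it.

    More than one key means data files sit at different partition depths,
    which is the broken layout described in the steps above.
    """
    layouts = {}
    for data_file in sorted(root.rglob("*.parquet")):
        columns = partition_columns(data_file.parent, root)
        directory = data_file.parent.relative_to(root).as_posix()
        layouts.setdefault(columns, set()).add(directory)
    return layouts


def build_broken_table(root: Path) -> None:
    """Create col1=a/col2=b with one data file, then copy it up into col1=a."""
    leaf = root / "col1=a" / "col2=b"
    leaf.mkdir(parents=True)
    original = leaf / "part-00000.parquet"  # hypothetical file name
    original.write_bytes(b"placeholder, not real parquet data")
    shutil.copy(original, root / "col1=a" / original.name)


with tempfile.TemporaryDirectory() as tmp:
    table_root = Path(tmp) / "parquet-test"
    table_root.mkdir()
    build_broken_table(table_root)

    layouts = find_conflicting_layouts(table_root)
    has_conflict = len(layouts) > 1

    # Print each partition-column tuple with the directories that use it.
    for columns in sorted(layouts):
        print(columns, sorted(layouts[columns]))
    print("conflicting partition layouts:", has_conflict)
```

Here one copy of the data file implies only `col1` as a partition column while the other implies `col1` and `col2`. A mismatch of this kind is what the table creation in the steps above runs into, and it is the situation this PR reports as a user-facing error.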
Why are the changes needed?
To replace an internal error with a user-facing one for a valid sequence of Spark SQL operations.
Does this PR introduce any user-facing change?
Yes, it presents the user with a regular error instead of an internal error.
How was this patch tested?
Added checks to `ParquetPartitionDiscoverySuite` which simulate the described scenario by manually breaking a parquet table in the filesystem.
Was this patch authored or co-authored using generative AI tooling?
No.