fix: prevent UnionExec panic with empty inputs #17449
Open
+83
−23
Summary
This PR fixes a panic in UnionExec when constructed with empty inputs, replacing the crash with proper error handling and descriptive error messages.
Fixes: #17052
Problem
When UnionExec::new(vec![]) was called with an empty input vector, it would panic. This occurred because union_schema() directly accessed inputs[0] without checking if the array was empty.
Solution
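As a minimal illustration of the access pattern involved, the sketch below contrasts unchecked indexing with a checked lookup on a plain slice. The helper names first_unchecked and first_checked are hypothetical and are not part of DataFusion; they only show why indexing an empty collection panics while a checked access can be turned into an error.

```rust
// Hypothetical helper mirroring the old access pattern: no length check.
fn first_unchecked(inputs: &[u32]) -> u32 {
    // Indexing panics with an out-of-bounds error when `inputs` is empty.
    inputs[0]
}

// Hypothetical helper mirroring the fixed pattern: the empty case is a value,
// not a panic, so the caller can convert it into a descriptive error.
fn first_checked(inputs: &[u32]) -> Option<u32> {
    inputs.first().copied()
}
```

The checked form leaves the decision about how to report the empty case to the caller, which is what makes a Result-returning constructor possible.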
Core Changes
- Made UnionExec::new() return Result<Self>:
  - Checks inputs.is_empty()
  - Returns the error "UnionExec requires at least one input"
- Made union_schema() return Result<SchemaRef>:
  - Checks for empty inputs before accessing inputs[0]
  - Returns the error "Cannot create union schema from empty inputs"
- Updated all call sites (7 files) to handle the Result return type:
  - physical_planner.rs - Core DataFusion integration
  - repartition/mod.rs - Internal dependencies
Error Handling
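A minimal sketch of the validation described under Core Changes, written against hypothetical stand-in types (Schema, Input, Union, PlanError) rather than DataFusion's real ExecutionPlan, SchemaRef and error types. The error strings are the ones quoted in this description; taking the first input's schema is a simplification for illustration and is not necessarily what union_schema() computes.

```rust
use std::fmt;
use std::sync::Arc;

// Hypothetical stand-in for a schema; not DataFusion's SchemaRef.
#[derive(Debug, Clone, PartialEq)]
struct Schema {
    fields: Vec<String>,
}

// Hypothetical stand-in for the planner's error type.
#[derive(Debug, PartialEq)]
struct PlanError(String);

impl fmt::Display for PlanError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "{}", self.0)
    }
}

impl std::error::Error for PlanError {}

// Hypothetical stand-in for a child execution plan.
#[derive(Debug)]
struct Input {
    schema: Arc<Schema>,
}

// Hypothetical stand-in for the union operator.
#[derive(Debug)]
struct Union {
    inputs: Vec<Input>,
    schema: Arc<Schema>,
}

// Returns an error instead of panicking when there is nothing to inspect.
fn union_schema(inputs: &[Input]) -> Result<Arc<Schema>, PlanError> {
    let first = inputs.first().ok_or_else(|| {
        PlanError("Cannot create union schema from empty inputs".to_string())
    })?;
    // Simplification: reuse the first input's schema as the union schema.
    Ok(Arc::clone(&first.schema))
}

impl Union {
    // Fallible constructor: the empty case is rejected up front.
    fn try_new(inputs: Vec<Input>) -> Result<Self, PlanError> {
        if inputs.is_empty() {
            return Err(PlanError(
                "UnionExec requires at least one input".to_string(),
            ));
        }
        let schema = union_schema(&inputs)?;
        Ok(Self { inputs, schema })
    }
}
```

Checking in both places is deliberate in this sketch: the schema helper can be called on its own, so it cannot rely on the constructor having validated the inputs first.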
Testing
Added 4 comprehensive tests:
- test_union_empty_inputs() - Verifies empty input validation
- test_union_schema_empty_inputs() - Tests schema creation with empty inputs
- test_union_single_input() - Ensures single input still works
- test_union_multiple_inputs_still_works() - Verifies existing functionality unchanged
Test Results:
Backward Compatibility
UnionExec::new() now returns Result<Self> instead of Self.
This is a breaking change but justified because:
- … Union, which requires ≥2 inputs
Files Changed
- datafusion/physical-plan/src/union.rs - Core fix + tests (main changes)
- datafusion/core/src/physical_planner.rs - Handle Result return
- datafusion/physical-plan/src/repartition/mod.rs - Update internal calls
The fix provides robust error handling while maintaining all existing functionality for valid use cases.
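For downstream code affected by the signature change, the sketch below shows two ways a call site might adapt to a constructor that now returns a Result. UnionNode, plan_union and describe are hypothetical names used only for illustration, with String as a placeholder error type; they are not the actual call sites touched by this PR.

```rust
// Hypothetical stand-in for an operator with a fallible constructor.
#[derive(Debug)]
struct UnionNode {
    input_count: usize,
}

impl UnionNode {
    fn new(inputs: Vec<&str>) -> Result<Self, String> {
        if inputs.is_empty() {
            return Err("UnionExec requires at least one input".to_string());
        }
        Ok(Self { input_count: inputs.len() })
    }
}

// A caller that already returns Result can forward the error with `?`.
fn plan_union(inputs: Vec<&str>) -> Result<UnionNode, String> {
    let node = UnionNode::new(inputs)?;
    Ok(node)
}

// A caller that cannot propagate errors has to handle both outcomes.
fn describe(inputs: Vec<&str>) -> String {
    match UnionNode::new(inputs) {
        Ok(node) => format!("union over {} input(s)", node.input_count),
        Err(e) => format!("planning failed: {e}"),
    }
}
```

In code that already returns a Result, the migration is typically a one-character change per call (adding the ? operator); only callers with infallible signatures need an explicit match or a signature change of their own.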