minor: UnionExec inputs validation #17052

@gruuya

Description

Is your feature request related to a problem or challenge?

When physical plans are constructed directly and dynamically (i.e. skipping logical plans/dataframes), the inputs of a UnionExec can end up being an empty vector.

In turn, this causes a panic such as:

thread '...' panicked at datafusion/physical-plan/src/union.rs:542:24:
index out of bounds: the len is 0 but the index is 0

Describe the solution you'd like

Along the lines of the existing validation for the logical union plan, maybe this should be an error instead:

    if inputs.len() < 2 {
        return plan_err!("UNION requires at least two inputs");
    }
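To illustrate what such a check could look like at construction time, here is a minimal sketch. It deliberately does not use DataFusion: MockPlan, PlanError and MockUnion::try_new are hypothetical stand-ins invented for this example, and the real UnionExec constructor may end up shaped differently.

```rust
use std::fmt;

/// Hypothetical stand-in for a physical plan node (not a DataFusion type).
#[derive(Debug, Clone, PartialEq)]
struct MockPlan {
    schema: Vec<String>,
}

/// Hypothetical error type, standing in for whatever `plan_err!` would produce.
#[derive(Debug, PartialEq)]
struct PlanError(String);

impl fmt::Display for PlanError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "Error during planning: {}", self.0)
    }
}

/// Hypothetical union node that validates its inputs when it is built.
#[derive(Debug)]
struct MockUnion {
    inputs: Vec<MockPlan>,
}

impl MockUnion {
    /// Fallible constructor: rejects fewer than two inputs up front,
    /// so later code can index into `inputs` without panicking.
    fn try_new(inputs: Vec<MockPlan>) -> Result<Self, PlanError> {
        if inputs.len() < 2 {
            return Err(PlanError(
                "UNION requires at least two inputs".to_string(),
            ));
        }
        Ok(Self { inputs })
    }
}
```

With a constructor like this, a dynamically built plan with an empty input vector gets an Err it can propagate with `?` rather than a panic deep inside schema computation.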

Alternatively, since returning an error would entail changing the signature of UnionExec::new, the panic could at least carry a clearer message, e.g.:

--- a/datafusion/physical-plan/src/union.rs
+++ b/datafusion/physical-plan/src/union.rs
@@ -538,8 +538,11 @@ pub fn can_interleave<T: Borrow<Arc<dyn ExecutionPlan>>>(
             .all(|partition| partition == *reference)
 }
 
-fn union_schema(inputs: &[Arc<dyn ExecutionPlan>]) -> SchemaRef {
-    let first_schema = inputs[0].schema();
+fn union_schema(inputs: &[Arc<dyn ExecutionPlan>]) -> Result<SchemaRef> {
+    let first_schema = inputs
+        .get(0)
+        .expect("No union input plans provided")
+        .schema();
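For comparison, the two options discussed here (a clearer panic versus a returned error) can be sketched side by side. This is only an illustration under assumptions: StubPlan and both function names are hypothetical and stand in for Arc<dyn ExecutionPlan> and union_schema, and a plain String is used as the error type for brevity.

```rust
/// Hypothetical stand-in for an execution plan; it only exposes a schema.
#[derive(Debug, Clone)]
struct StubPlan {
    fields: Vec<&'static str>,
}

impl StubPlan {
    fn schema(&self) -> Vec<&'static str> {
        self.fields.clone()
    }
}

/// Variant 1: still panics on empty input, but with a descriptive message
/// instead of a bare "index out of bounds".
fn first_schema_or_panic(inputs: &[StubPlan]) -> Vec<&'static str> {
    inputs
        .first()
        .expect("No union input plans provided")
        .schema()
}

/// Variant 2: reports the empty-input case as an error the caller can handle.
fn first_schema_checked(inputs: &[StubPlan]) -> Result<Vec<&'static str>, String> {
    inputs
        .first()
        .map(StubPlan::schema)
        .ok_or_else(|| "No union input plans provided".to_string())
}
```

Variant 1 keeps the existing signature and only improves the diagnostic; variant 2 requires callers to handle a Result, which is the signature change mentioned above.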


Labels

enhancement (New feature or request)
