Per file filter evaluation #15057
Conversation
The example is not working yet. It gets …
OK, the example is now working. I think the overall approach is interesting, but I don't think it's quite close to a workable solution yet.
The example is now working and even does stats pruning of shredded columns 🚀
```rust
let parquet_source = ParquetSource::default()
    .with_predicate(self.schema.clone(), filter)
    .with_pushdown_filters(true)
    .with_filter_expression_rewriter(Arc::new(StructFieldRewriter) as _);
```
This is the API for users to attach this rewriter to their plan
```rust
struct StructFieldRewriterImpl {
    file_schema: SchemaRef,
}

impl TreeNodeRewriter for StructFieldRewriterImpl {
    type Node = Arc<dyn PhysicalExpr>;

    fn f_down(
        &mut self,
        expr: Arc<dyn PhysicalExpr>,
    ) -> Result<Transformed<Arc<dyn PhysicalExpr>>> {
        if let Some(scalar_function) = expr.as_any().downcast_ref::<ScalarFunctionExpr>() {
            if scalar_function.name() == "get_field" {
                if scalar_function.args().len() == 2 {
                    // First argument is the column, second argument is the field name
                    let column = scalar_function.args()[0].clone();
                    let field_name = scalar_function.args()[1].clone();
                    if let Some(literal) =
                        field_name.as_any().downcast_ref::<expressions::Literal>()
                    {
                        if let Some(field_name) = literal.value().try_as_str().flatten() {
                            if let Some(column) =
                                column.as_any().downcast_ref::<expressions::Column>()
                            {
                                let column_name = column.name();
                                let source_field =
                                    self.file_schema.field_with_name(column_name)?;
                                let expected_flattened_column_name =
                                    format!("_{}.{}", column_name, field_name);
                                // Check if the flattened column exists in the file schema
                                // and has the same type
                                if let Ok(shredded_field) = self
                                    .file_schema
                                    .field_with_name(&expected_flattened_column_name)
                                {
                                    if source_field.data_type() == shredded_field.data_type() {
                                        // Rewrite the expression to use the flattened column
                                        let rewritten_expr = expressions::col(
                                            &expected_flattened_column_name,
                                            &self.file_schema,
                                        )?;
                                        return Ok(Transformed::yes(rewritten_expr));
                                    }
                                }
                            }
                        }
                    }
                }
            }
        }

        Ok(Transformed::no(expr))
    }
}
```
Example implementation of a rewriter
@alamb I think this is ready for a first round of review when you have a chance!
The main issue I've found with this approach is marking filters as exact (see `datafusion/datasource-parquet/src/row_filter.rs`, lines 333 to 336 at 9382add).
Okay, I think I can answer my own question: https://github.com/pydantic/datafusion/blob/38356998059a2d08113401ea8111f238899ab0b8/datafusion/core/src/datasource/listing/table.rs#L961-L995. Based on this it seems like it's safe to mark filters as exact if they are getting pushed down 😄
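Concretely, that decision lives in `TableProvider::supports_filters_pushdown`. A minimal sketch of marking everything exact, assuming the scan really does apply every pushed filter (as with `ParquetSource` when `pushdown_filters` is enabled):

```rust
use datafusion::common::Result;
use datafusion::logical_expr::{Expr, TableProviderFilterPushDown};

// Inside an `impl TableProvider for MyTable` block (hypothetical provider):
// reporting filters as Exact tells the optimizer it can drop the re-checking
// FilterExec, trusting the scan to apply the predicates itself.
fn supports_filters_pushdown(
    &self,
    filters: &[&Expr],
) -> Result<Vec<TableProviderFilterPushDown>> {
    Ok(vec![TableProviderFilterPushDown::Exact; filters.len()])
}
```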
Okay folks, sorry for the churn; I thought this was in a better state than it ended up being. I've now reworked it to minimize the diff and make sure all existing tests pass. I'm going to add tests for the new functionality now to complement the example.
```rust
// Note about schemas: we are actually dealing with _4_ different schemas here:
// - The table schema as defined by the TableProvider. This is what the user sees,
//   what they get when they `SELECT * FROM table`, etc.
// - The "virtual" file schema: the table schema minus any hive partition columns.
//   This is what the file schema is coerced to.
// - The physical file schema: the schema as defined by the parquet file, i.e. what
//   the parquet file actually contains.
// - The filter schema: a hybrid of the virtual file schema and the physical file
//   schema. If a filter is rewritten to reference columns that are in the physical
//   file schema but not the virtual file schema, we need to add those columns to
//   the filter schema so that the filter can be evaluated. This schema is generated
//   by taking any columns from the virtual file schema that are referenced by the
//   filter and adding any columns from the physical file schema that are referenced
//   by the filter but not in the virtual file schema. Columns from the virtual file
//   schema are added in the order they appear in the virtual file schema; columns
//   from the physical file schema are always appended at the end, in the order they
//   appear in the physical file schema.
//
// I think it might be wise to do some renaming of parameters where possible, e.g.
// rename `file_schema` to `table_schema_without_partition_columns` and
// `physical_file_schema` or something like that.
```
This is an interesting bit to ponder upon
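To make that construction concrete, here is a small sketch of how the hybrid filter schema could be assembled (`build_filter_schema` and the `referenced` set are illustrative names, not part of the PR):

```rust
use std::collections::HashSet;
use std::sync::Arc;

use arrow::datatypes::{Schema, SchemaRef};

/// Build the hybrid "filter schema" described above from the set of column
/// names a rewritten predicate references.
fn build_filter_schema(
    virtual_file_schema: &Schema,
    physical_file_schema: &Schema,
    referenced: &HashSet<String>,
) -> SchemaRef {
    let mut fields = Vec::new();
    // Virtual-file-schema columns first, in their original order.
    for field in virtual_file_schema.fields() {
        if referenced.contains(field.name()) {
            fields.push(field.clone());
        }
    }
    // Then any physical-only columns, appended in physical-file order.
    for field in physical_file_schema.fields() {
        if referenced.contains(field.name())
            && virtual_file_schema.field_with_name(field.name()).is_err()
        {
            fields.push(field.clone());
        }
    }
    Arc::new(Schema::new(fields))
}
```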
```rust
/// Rewrite an expression to take into account this file's particular schema.
/// This can be used to evaluate expressions against shredded variant columns
/// or columns that pre-compute expressions (e.g. `day(timestamp)`).
pub trait FileExpressionRewriter: Debug + Send + Sync {
    /// Rewrite an expression in the context of a file schema.
    fn rewrite(
        &self,
        file_schema: SchemaRef,
        expr: Arc<dyn PhysicalExpr>,
    ) -> Result<Arc<dyn PhysicalExpr>>;
}
```
Note: if users need the table_schema they can bind that inside of TableProvider::scan
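For example, a rewriter that needs the table schema could capture it when the provider constructs it inside `scan` (a sketch against the trait above; `MyRewriter` is a made-up name):

```rust
use std::sync::Arc;

use arrow::datatypes::SchemaRef;
use datafusion::common::Result;
use datafusion::physical_expr::PhysicalExpr;

#[derive(Debug)]
struct MyRewriter {
    // Bound inside TableProvider::scan, since `rewrite` itself only
    // receives the per-file schema.
    table_schema: SchemaRef,
}

impl FileExpressionRewriter for MyRewriter {
    fn rewrite(
        &self,
        file_schema: SchemaRef,
        expr: Arc<dyn PhysicalExpr>,
    ) -> Result<Arc<dyn PhysicalExpr>> {
        // Compare self.table_schema with file_schema and rewrite `expr`
        // accordingly; pass-through shown for brevity.
        let _ = (&self.table_schema, &file_schema);
        Ok(expr)
    }
}
```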
I will try and give this a look over the next few days.
I would like to resume this work. Some thoughts: should the rewrite happen via a new trait as I'm currently doing, or should we add a method to PhysicalExpr?

I suspect the hard bit with this approach will be edge cases: what if a filter cannot adapt itself to the file schema, but we could cast the column to make it work? I'm thinking something like a UDF that only accepts certain types.

I think @jayzhan211 proposed something similar in https://github.com/apache/datafusion/pull/15685/files#diff-2b3f5563d9441d3303b57e58e804ab07a10d198973eed20e7751b5a20b955e42.

@alamb any thoughts?
This method is too general, and it is unclear what we need to do with the provided schema for each PhysicalExpr; I don't think it is a good idea.
I think it is unavoidable that we need to cast the columns to be able to evaluate the filter. Another question: isn't the filter created based on the table schema? The batch is then read with the file schema, cast to the table schema, and evaluated by the filter. What we could do instead is rewrite the filter based on the file schema. Assume we have …
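A rough sketch of that kind of per-file rewrite, assuming a `Utf8` table column stored as `Utf8View` in one file (`rewrite_eq_for_file_schema` is an illustrative helper, not an existing API):

```rust
use std::sync::Arc;

use arrow::datatypes::{DataType, Schema};
use datafusion::common::{Result, ScalarValue};
use datafusion::logical_expr::Operator;
use datafusion::physical_expr::expressions::{BinaryExpr, CastExpr, Column, Literal};
use datafusion::physical_expr::PhysicalExpr;

/// Rebuild `col = <literal>` against one file: bind the column to the file's
/// physical type and cast it to the table type before comparing.
fn rewrite_eq_for_file_schema(
    table_type: &DataType,
    file_schema: &Schema,
    column_name: &str,
    literal: ScalarValue,
) -> Result<Arc<dyn PhysicalExpr>> {
    let idx = file_schema.index_of(column_name)?;
    let file_type = file_schema.field(idx).data_type();
    // Column bound to the *file* schema (e.g. Utf8View in this file).
    let mut col: Arc<dyn PhysicalExpr> = Arc::new(Column::new(column_name, idx));
    if file_type != table_type {
        // Cast just this column reference (Utf8View -> Utf8) instead of
        // casting whole record batches to the table schema.
        col = Arc::new(CastExpr::new(col, table_type.clone(), None));
    }
    Ok(Arc::new(BinaryExpr::new(
        col,
        Operator::Eq,
        Arc::new(Literal::new(literal)),
    )))
}
```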
Yes this is exactly the case.
Yes, that is exactly what I am proposing above; thank you for giving a more concrete example. The other point is whether we can use this same mechanism to handle shredding for the variant type. In other words, can we "optimize" …

And if that all makes sense: how do we do those optimizations? Is it something like an optimizer that has to downcast-match on the expressions, or do we add methods to PhysicalExpr for each expression to describe how it handles this behavior?
```rust
println!("\n=== Demonstrating default value injection in filter predicates ===");
let query = "SELECT id, name FROM example_table WHERE status = 'active' ORDER BY id";
```
@alamb this is the example we talked about on the call today
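The transform behind this example is roughly the following (a sketch; the `status` column and `'active'` default are assumptions carried over from the query above):

```rust
use std::sync::Arc;

use arrow::datatypes::Schema;
use datafusion::common::tree_node::{Transformed, TreeNode};
use datafusion::common::{Result, ScalarValue};
use datafusion::physical_expr::expressions::{Column, Literal};
use datafusion::physical_expr::PhysicalExpr;

/// If the file being scanned lacks a column referenced by the filter, replace
/// the column reference with the table's declared default value so the
/// pushed-down predicate stays evaluable.
fn inject_defaults(
    expr: Arc<dyn PhysicalExpr>,
    physical_file_schema: &Schema,
) -> Result<Arc<dyn PhysicalExpr>> {
    expr.transform(|e| {
        if let Some(col) = e.as_any().downcast_ref::<Column>() {
            if physical_file_schema.field_with_name(col.name()).is_err() {
                // Assumed default: missing `status` columns default to 'active'.
                let default: Arc<dyn PhysicalExpr> =
                    Arc::new(Literal::new(ScalarValue::from("active")));
                return Ok(Transformed::yes(default));
            }
        }
        Ok(Transformed::no(e))
    })
    .map(|t| t.data)
}
```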
What would be even more interesting: an example showing generated expression defaults:
ALTER TABLE t ADD COLUMN g DEFAULT (other_col);
alamb left a comment:
Thank you @adriangb -- I think this is looking great. Thank you for sticking with it
I think it would be great too if we could file a ticket to track the consolidation of SchemaAdapter and PhysicalExprRewriter
```rust
// Important: PhysicalExprAdapter is specifically designed for rewriting filter
// predicates that get pushed down to file scans. For handling missing columns
// in projections, other mechanisms in DataFusion are used (like SchemaAdapter).
```
Maybe here would be a good place to leave a link to the ticket describing the unification of SchemaAdapter and PhysicalExprAdapter.
```rust
    expr: Arc<dyn PhysicalExpr>,
    logical_file_schema: &Schema,
    physical_file_schema: &Schema,
    partition_values: &[(FieldRef, ScalarValue)],
```
Not for this PR, but it seems to me that partition_values shouldn't really be in PhysicalExprAdapter -- rather, the ListingTable should provide a PhysicalExprAdapter that knows how to fill in partition values.
As a follow-on PR, perhaps.
Very interesting idea. Let's see how the refactoring of FileScanConfigBuilder & co. goes, then we can revisit moving the injection of the partition values around.
Maybe we end up with a PhysicalExprAdapterFactory that users create and that then gets called as `let adapter = physical_expr_adapter_factory.create_adapter(logical_file_schema, physical_file_schema).with_partition_values(...);` followed by `let expr = adapter.rewrite(expr);`.
I don't think it's necessary for the TableProvider to create it. The FileOpener has the partition values, the logical file schema and the physical file schema. Splitting up initialization will add complexity I think.
Part of my thinking behind PhysicalExprAdapterFactory is that when we get around to projection pushdown we'll want to adapt multiple expressions. By having a factory we can (1) minimize boilerplate and (2) have the opportunity to pre-compute e.g. missing columns or other mappings to make the rewrites more performant.
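Sketched out, the factory shape under discussion might look roughly like this (names taken from the comments above; not a finalized API):

```rust
use std::fmt::Debug;
use std::sync::Arc;

use arrow::datatypes::{FieldRef, SchemaRef};
use datafusion::common::{Result, ScalarValue};
use datafusion::physical_expr::PhysicalExpr;

pub trait PhysicalExprAdapterFactory: Debug + Send + Sync {
    // Created once per file; a chance to pre-compute column mappings,
    // missing columns, etc.
    fn create_adapter(
        &self,
        logical_file_schema: SchemaRef,
        physical_file_schema: SchemaRef,
    ) -> Arc<dyn PhysicalExprAdapter>;
}

pub trait PhysicalExprAdapter: Debug + Send + Sync {
    // Optional per-file partition values.
    fn with_partition_values(
        self: Arc<Self>,
        partition_values: Vec<(FieldRef, ScalarValue)>,
    ) -> Arc<dyn PhysicalExprAdapter>;

    // Re-used for every expression that must be adapted to this file.
    fn rewrite(&self, expr: Arc<dyn PhysicalExpr>) -> Result<Arc<dyn PhysicalExpr>>;
}
```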
…r API

- Update all tests in schema_rewriter.rs to use DefaultPhysicalExprAdapterFactory
- Update documentation examples to demonstrate factory pattern
- Update default_column_values.rs example to use factory-style API
- Convert from rewrite_to_file_schema method to rewrite method with factory pattern
- Add proper partition values handling with with_partition_values method

🤖 Generated with [Claude Code](https://claude.ai/code)
Co-Authored-By: Claude <[email protected]>
I plan on merging #15057 once CI passes. It has been approved / reviewed, and we need the customization of the rewriters for #16235 (comment). I will follow up with PRs to:
Thank you @adriangb
This relates to projection pushdown into TableProviders (#14993). I decided to tackle filter pushdown first: the idea is that we can experiment with filters because it's less work, and later re-use FileExpressionRewriter to do projection pushdown once we flesh out the details and apply learnings from this piece of work.