alamb commented Jul 28, 2025

Which issue does this PR close?

Merging this PR will update apache#16861

Rationale for this change

I think we should have an slt "end to end" reproducer for (not) pushing down volatile predicates.

What changes are included in this PR?

Add a slt test

Are these changes tested?

Only tests

Are there any user-facing changes?

alamb commented Jul 28, 2025

This test fails on main like this:

Completed 1 test files in 0 seconds
External error: 1 errors in file /Users/andrewlamb/Software/datafusion/datafusion/sqllogictest/test_files/parquet_filter_pushdown.slt

1. query result mismatch:
[SQL] EXPLAIN select a from t_pushdown where b > random();
[Diff] (-expected|+actual)
    logical_plan
    01)Projection: t_pushdown.a
    02)--Filter: CAST(t_pushdown.b AS Float64) > random()
    03)----TableScan: t_pushdown projection=[a, b]
-   physical_plan
-   01)CoalesceBatchesExec: target_batch_size=8192
-   02)--FilterExec: CAST(b@1 AS Float64) > random(), projection=[a@0]
-   03)----DataSourceExec: file_groups={2 groups: [[WORKSPACE_ROOT/datafusion/sqllogictest/test_files/scratch/parquet_filter_pushdown/parquet_table/1.parquet], [WORKSPACE_ROOT/datafusion/sqllogictest/test_files/scratch/parquet_filter_pushdown/parquet_table/2.parquet]]}, projection=[a, b], file_type=parquet
+   physical_plan DataSourceExec: file_groups={2 groups: [[WORKSPACE_ROOT/datafusion/sqllogictest/test_files/scratch/parquet_filter_pushdown/parquet_table/1.parquet], [WORKSPACE_ROOT/datafusion/sqllogictest/test_files/scratch/parquet_filter_pushdown/parquet_table/2.parquet]]}, projection=[a], file_type=parquet, predicate=CAST(b@1 AS Float64) > random(), pruning_predicate=b_null_count@1 != row_count@2 AND CAST(b_max@0 AS Float64) > random(), required_guarantees=[]
at /Users/andrewlamb/Software/datafusion/datafusion/sqllogictest/test_files/parquet_filter_pushdown.slt:411



Error: Execution("1 failures")
error: test failed, to rerun pass `-p datafusion-sqllogictest --test sqllogictests`

Caused by:
  process didn't exit successfully: `/Users/andrewlamb/Software/datafusion/target/ci/deps/sqllogictests-28a2a2385ac148cc parquet_filter_pushdown` (exit status: 1)
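
For readers less familiar with the issue: in the actual plan above, `random()` ends up inside the scan's `predicate` and `pruning_predicate`, so it would be evaluated against file statistics rather than once per row in a `FilterExec`. A rough sketch of the kind of check that keeps such predicates out of the scan is below. This is a toy model only: `Expr`, `is_volatile`, and `split_pushdown` are hypothetical names made up for illustration and are not DataFusion's `PhysicalExpr` API or the code in this PR.

```rust
/// Toy expression tree (hypothetical; not DataFusion's `PhysicalExpr`).
#[allow(dead_code)]
#[derive(Debug, Clone, PartialEq)]
enum Expr {
    Column(String),
    Literal(f64),
    /// Scalar function call; `volatile` is true for functions like `random()`
    /// that may return a different value on every evaluation.
    Call { name: String, volatile: bool, args: Vec<Expr> },
    Cast(Box<Expr>),
    Gt(Box<Expr>, Box<Expr>),
}

/// Returns true if any node in the expression tree is a volatile call.
fn is_volatile(expr: &Expr) -> bool {
    match expr {
        Expr::Column(_) | Expr::Literal(_) => false,
        Expr::Call { volatile, args, .. } => *volatile || args.iter().any(is_volatile),
        Expr::Cast(inner) => is_volatile(inner),
        Expr::Gt(left, right) => is_volatile(left) || is_volatile(right),
    }
}

/// Splits filter conjuncts into (pushed down to the scan, kept in the filter).
/// Volatile predicates always stay in the filter so they run once per row.
fn split_pushdown(filters: Vec<Expr>) -> (Vec<Expr>, Vec<Expr>) {
    filters.into_iter().partition(|f| !is_volatile(f))
}
```

Under this model, a predicate shaped like `CAST(b AS Float64) > random()` lands in the second (kept) vector, which corresponds to the expected plan retaining the `FilterExec` above the `DataSourceExec`.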

03)----TableScan: t_pushdown projection=[a, b]
physical_plan
01)CoalesceBatchesExec: target_batch_size=8192
02)--FilterExec: CAST(b@1 AS Float64) > random(), projection=[a@0]
Member

Another place where it would be nice to cast expressions instead of data 😀

Author

All we need is a library that can rewrite expressions ... 😆

@adriangb adriangb merged commit d63b3e7 into pydantic:blacklist-exprs Jul 28, 2025
27 checks passed
@alamb alamb deleted the alamb/proposed_patch branch July 28, 2025 22:09
adriangb added a commit that referenced this pull request Jul 30, 2025
* dissallow pushdown of volatile PhysicalExprs

* fix

* add FilteredVec helper to handle filter / remap pattern (#34)

* checkpoint: Address PR feedback in https://github.com/apach...

* add FilteredVec to consolidate handling of filter / remap pattern

* lint

* Add slt test for pushing volatile predicates down (#35)

---------

Co-authored-by: Andrew Lamb <[email protected]>