@zhuqi-lucas
…n't need to cast back to utf8

Which issue does this PR close?

No linked issue. This PR brings the benchmark back to normal.

Rationale for this change

ArrowReaderOptions should read with physical_file_schema, so that we don't need to cast back to utf8 afterwards.

This brings the benchmark back to normal.

What changes are included in this PR?

Make ArrowReaderOptions read with physical_file_schema.
Bring the benchmark back to normal.

Are these changes tested?

Yes

Are there any user-facing changes?

No

- ArrowReaderOptions::new().with_page_index(true),
+ ArrowReaderOptions::new()
+     .with_page_index(true)
+     .with_schema(physical_file_schema.clone()),
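
For readers skimming the change, here is a minimal std-only sketch of why handing the schema to the reader options can remove a later cast. It is a toy model, not the parquet crate's API: `DataType`, `Field`, `ReaderOptions`, `output_schema`, `columns_needing_cast`, and `field` are all hypothetical stand-ins, and the `Utf8` versus `Utf8View` pairing is an illustrative assumption rather than something stated in this PR.

```rust
// Toy model only: every type and function here is a hypothetical stand-in
// used for illustration, not the parquet or arrow crate API.

#[derive(Clone, Debug, PartialEq)]
enum DataType {
    Int64,
    Utf8,
    Utf8View,
}

#[derive(Clone, Debug, PartialEq)]
struct Field {
    name: String,
    data_type: DataType,
}

type Schema = Vec<Field>;

/// Hypothetical analogue of reader options with an optional schema override.
#[derive(Default)]
struct ReaderOptions {
    page_index: bool,
    schema: Option<Schema>,
}

impl ReaderOptions {
    fn new() -> Self {
        Self::default()
    }

    fn with_page_index(mut self, enabled: bool) -> Self {
        self.page_index = enabled;
        self
    }

    fn with_schema(mut self, schema: Schema) -> Self {
        self.schema = Some(schema);
        self
    }
}

/// Schema of the batches the toy reader would yield: the override if one
/// was supplied, otherwise the schema embedded in the file.
fn output_schema(embedded: &Schema, options: &ReaderOptions) -> Schema {
    options.schema.clone().unwrap_or_else(|| embedded.clone())
}

/// Names of columns whose type still differs from the expected schema,
/// i.e. columns that would need a cast after reading.
fn columns_needing_cast(read: &Schema, expected: &Schema) -> Vec<String> {
    read.iter()
        .zip(expected.iter())
        .filter(|(r, e)| r.data_type != e.data_type)
        .map(|(r, _)| r.name.clone())
        .collect()
}

/// Small helper to build a field.
fn field(name: &str, data_type: DataType) -> Field {
    Field {
        name: name.to_string(),
        data_type,
    }
}
```

In this model, options built without `with_schema` yield the embedded types, so any column whose type differs from the expected schema shows up in `columns_needing_cast`; options built with the expected schema yield an empty list.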
this is a nice catch

@adriangb adriangb merged commit cd6d766 into pydantic:move-predicate Apr 4, 2025
1 check passed
adriangb added a commit that referenced this pull request Apr 6, 2025
… ParquetOpener (apache#15561)

* parquet reader: move pruning predicate creation from ParquetSource to ParquetOpener

* use file schema, avoid loading page index if unnecessary

* Add comment

* add comment

* Add comment

* remove check

* fix clippy

* update sqllogictest

* restore to explain plans

* reverted

* modify access

* Fix ArrowReaderOptions should read with physical_file_schema so we do… (#17)

* Fix ArrowReaderOptions should read with physical_file_schema so we don't need to cast back to utf8

* Fix fmt

* Update opener.rs

* Always apply per-file schema during parquet read (#18)

* Update datafusion/datasource-parquet/src/opener.rs

---------

Co-authored-by: Qi Zhu <[email protected]>
Co-authored-by: Andrew Lamb <[email protected]>