`ParquetRecordBatchStream` API to fetch the next row group while decoding

**Is your feature request related to a problem or challenge? Please describe what you are trying to do.**
I've noticed low CPU utilization when reading from low-bandwidth filesystems
using a `ParquetRecordBatchStream`. This appears to be because the stream
fetches row group data on demand rather than ahead of time, so the CPU sits
idle during each fetch. In my specific scenario, I'm reading a Parquet file
from S3 with four 128 MB row groups. It takes ~2 seconds to fetch each row
group and ~500 ms to decode it. In all, it takes around 10 seconds to read and
decode the entire file.
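
As a rough back-of-the-envelope check on these numbers, here is a minimal timing model. It assumes constant fetch and decode times per row group and that a fetch can fully overlap with a decode. The function names are hypothetical and not part of the `parquet` crate.

```rust
/// Total time when each row group is fetched and then decoded, one after another.
fn serial_ms(row_groups: u64, fetch_ms: u64, decode_ms: u64) -> u64 {
    row_groups * (fetch_ms + decode_ms)
}

/// Total time when the fetch of the next row group starts as soon as decoding
/// of the current one starts (one row group of lookahead).
fn pipelined_ms(row_groups: u64, fetch_ms: u64, decode_ms: u64) -> u64 {
    if row_groups == 0 {
        return 0;
    }
    // The first fetch and the last decode cannot overlap with anything;
    // every step in between costs the slower of the two operations.
    fetch_ms + (row_groups - 1) * fetch_ms.max(decode_ms) + decode_ms
}

fn main() {
    // Figures from the scenario above: 4 row groups, ~2 s fetch, ~500 ms decode.
    println!("serial:    {} ms", serial_ms(4, 2000, 500)); // 10000 ms
    println!("pipelined: {} ms", pipelined_ms(4, 2000, 500)); // 8500 ms
}
```

Under these assumptions the read stays fetch-bound, so overlapping saves roughly the decode time of all but the last row group.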

**Describe the solution you'd like**
I'd like to add the option for `ParquetRecordBatchStream` to fetch the data for
the next row group while decoding data for the current row group.
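
To illustrate the overlap I have in mind, here is a minimal sketch using only the standard library. It is not the `parquet` crate's API: the real stream is async, and `fetch_row_group`, `decode_row_group`, and `read_with_prefetch` are hypothetical stand-ins. A zero-capacity channel means the fetcher holds at most one row group ahead of the decoder.

```rust
use std::sync::mpsc::sync_channel;
use std::thread;

/// Hypothetical stand-in for fetching a row group's bytes from object storage.
fn fetch_row_group(index: usize) -> Vec<u8> {
    vec![index as u8; 4]
}

/// Hypothetical stand-in for decoding a row group; here it just sums the bytes.
fn decode_row_group(bytes: &[u8]) -> u64 {
    bytes.iter().map(|&b| u64::from(b)).sum()
}

/// Decodes row groups in order while a background thread fetches the next one.
fn read_with_prefetch(num_row_groups: usize) -> Vec<u64> {
    // Capacity 0: `send` blocks until the decoder is ready to receive, so the
    // fetcher is never more than one row group ahead.
    let (tx, rx) = sync_channel::<Vec<u8>>(0);

    let fetcher = thread::spawn(move || {
        for index in 0..num_row_groups {
            let bytes = fetch_row_group(index);
            // Stop fetching if the decoder side has gone away.
            if tx.send(bytes).is_err() {
                break;
            }
        }
    });

    // While this thread decodes row group i, the fetcher is already fetching i + 1.
    let decoded: Vec<u64> = rx.iter().map(|bytes| decode_row_group(&bytes)).collect();
    fetcher.join().expect("fetch thread panicked");
    decoded
}

fn main() {
    let decoded = read_with_prefetch(4);
    println!("{:?}", decoded);
}
```

Limiting the lookahead to a single row group keeps the extra memory bounded to one row group's worth of buffered data, which matters with 128 MB row groups.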

**Describe alternatives you've considered**


**Additional context**



`ParquetRecordBatchStream` API to fetch the next row group while decoding #6559

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

ParquetRecordBatchStream API to fetch the next row group while decoding #6559

Description

Metadata

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Issue actions

`ParquetRecordBatchStream` API to fetch the next row group while decoding #6559