Open
Labels
enhancement (New feature or request), performance (Make DataFusion faster)
Description
Is your feature request related to a problem or challenge?
While reviewing Format Date32 to string given timestamp specifiers #15361, I noticed that the to_char implementation for arrays could likely be improved substantially.
Specifically the code that
The use of Vec<Option<String>> to build a string array is not ideal, as it requires an additional allocation for each output row along with a second copy of the data:
fn to_char_array(args: &[ColumnarValue]) -> Result<ColumnarValue> {
let arrays = ColumnarValue::values_to_arrays(args)?;
let mut results: Vec<Option<String>> = vec![]; // <--- this is not ideal

Also, the code that falls back to date formatting calls cast for a single row every time the formatting fails. Since the failure is likely due to a date specifier that will be hit for all rows, it would probably be faster to convert the entire input array to Date once, the first time the specifier is retried.
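To make the two ideas concrete, here is a rough, standard-library-only sketch. It is not the DataFusion implementation. `FlatStrings`, `try_push`, `format_date`, `format_millis`, and `to_char_sketch` are hypothetical names invented for illustration, and the toy "days"/"millis" specifiers stand in for real format strings. In the real code the analogous pieces would presumably be Arrow's `StringBuilder` and a single whole-array cast. The sketch writes each row straight into one shared buffer, and widens the whole input column only once, on the first formatting failure.

```rust
use std::fmt::{self, Write};

/// Hypothetical stand-in for an Arrow-style string array: one shared value
/// buffer, row offsets into it, and a validity flag per row.
#[derive(Debug)]
struct FlatStrings {
    values: String,
    offsets: Vec<usize>, // row i spans offsets[i]..offsets[i + 1]
    valid: Vec<bool>,
}

impl FlatStrings {
    fn new() -> Self {
        FlatStrings { values: String::new(), offsets: vec![0], valid: Vec::new() }
    }

    /// Lets `f` write one row directly into the shared buffer, so no per-row
    /// String is allocated. On error the partial output is rolled back.
    fn try_push(&mut self, f: impl FnOnce(&mut String) -> fmt::Result) -> fmt::Result {
        let start = self.values.len();
        match f(&mut self.values) {
            Ok(()) => {
                self.offsets.push(self.values.len());
                self.valid.push(true);
                Ok(())
            }
            Err(e) => {
                self.values.truncate(start);
                Err(e)
            }
        }
    }

    fn push_null(&mut self) {
        self.offsets.push(self.values.len());
        self.valid.push(false);
    }

    fn get(&self, row: usize) -> Option<&str> {
        if self.valid[row] {
            Some(&self.values[self.offsets[row]..self.offsets[row + 1]])
        } else {
            None
        }
    }
}

const MILLIS_PER_DAY: i64 = 86_400_000;

/// Toy date-only formatter: accepts "days", rejects anything time-related.
fn format_date(days: i32, spec: &str, out: &mut String) -> fmt::Result {
    match spec {
        "days" => write!(out, "{}", days),
        _ => Err(fmt::Error),
    }
}

/// Toy wider formatter, used only after the fallback conversion.
fn format_millis(millis: i64, spec: &str, out: &mut String) -> fmt::Result {
    match spec {
        "days" => write!(out, "{}", millis / MILLIS_PER_DAY),
        "millis" => write!(out, "{}", millis),
        _ => Err(fmt::Error),
    }
}

/// Returns the formatted column and how many whole-column conversions ran.
fn to_char_sketch(input: &[Option<i32>], spec: &str) -> Result<(FlatStrings, usize), fmt::Error> {
    let mut out = FlatStrings::new();
    let mut widened: Option<Vec<Option<i64>>> = None;
    let mut conversions = 0;

    for (row, value) in input.iter().enumerate() {
        let days = match value {
            Some(d) => *d,
            None => {
                out.push_null();
                continue;
            }
        };
        if widened.is_none() {
            if out.try_push(|buf| format_date(days, spec, buf)).is_ok() {
                continue;
            }
            // First failure: convert the entire column once, not once per row.
            widened = Some(
                input.iter().map(|v| v.map(|d| i64::from(d) * MILLIS_PER_DAY)).collect(),
            );
            conversions += 1;
        }
        let millis = widened.as_ref().unwrap()[row].unwrap();
        out.try_push(|buf| format_millis(millis, spec, buf))?;
    }
    Ok((out, conversions))
}
```

Calling `to_char_sketch(&[Some(1), None, Some(2)], "millis")` takes the fallback path on the first row and reuses the converted column for every later row. Whether this helps in practice is something the existing benchmark would have to confirm.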
Describe the solution you'd like
Make to_char faster
You can evaluate your changes with the existing benchmark
cargo bench --bench to_char
Describe alternatives you've considered
No response
Additional context
No response