
Conversation

@nikhilsinhaparseable
Contributor

Fixes #616

Create a parquet file by grouping all arrow files (in staging) for the duration provided in the env variable P_STORAGE_UPLOAD_INTERVAL. Also check that the arrow files vector is not empty; if it is not, sort the arrow files and create the parquet file key from the last file in the sorted vector.

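For reference, here is a minimal, self-contained sketch of the grouping step described above. The `arrow_path_to_parquet` helper and the owned `Vec<PathBuf>` signature are simplified assumptions for illustration; the actual staging code in Parseable wires this logic into its own types.

```rust
use std::collections::HashMap;
use std::path::{Path, PathBuf};

// Simplified stand-in for the staging helper: swap the arrow extension for
// `.parquet` to derive the target parquet file path.
fn arrow_path_to_parquet(path: &Path) -> PathBuf {
    path.with_extension("parquet")
}

// Group every staged arrow file under one parquet key derived from the last
// (latest) file of the sorted list; an empty list produces no group at all.
fn group_arrow_files(mut arrow_files: Vec<PathBuf>) -> HashMap<PathBuf, Vec<PathBuf>> {
    let mut grouped_arrow_file: HashMap<PathBuf, Vec<PathBuf>> = HashMap::new();
    arrow_files.sort();
    if let Some(last) = arrow_files.last().cloned() {
        let key = arrow_path_to_parquet(&last);
        grouped_arrow_file.entry(key).or_default().extend(arrow_files);
    }
    grouped_arrow_file
}
```

With this shape, all arrow files staged during one P_STORAGE_UPLOAD_INTERVAL window end up under a single parquet key, so one parquet file is written per interval.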
@nikhilsinhaparseable changed the title from "Fixes #616" to "Fixes #616 feature: allow configurable duration for data push to S3" on Jan 16, 2024
.or_default()
.push(arrow_file_path);

// check that the arrow files list is not empty, then fetch the parquet file path from the last file in the sorted arrow file list

@sinhaashish Jan 24, 2024

Try this out

Don't use unwrap(). It panics.

if !arrow_files.is_empty() {
    arrow_files.sort();
    // The emptiness check above guarantees a last element, so derive the
    // parquet key from it without calling unwrap(), which would panic on None.
    if let Some(last) = arrow_files.last().cloned() {
        let key = Self::arrow_path_to_parquet(&last);

        let entry = grouped_arrow_file.entry(key.clone()).or_default();
        entry.extend(arrow_files);
    }
}

@nitisht merged commit 7b1e9dd into parseablehq:main on Jan 29, 2024
@github-actions bot locked and limited conversation to collaborators on Jan 29, 2024
@nikhilsinhaparseable deleted the issue616 branch on July 12, 2025 08:55