Closed
Labels
enhancement, parquet
Description
Describe the bug
If a parquet file is written with timestamps in a time unit other than ns, reading that file produces incorrect dates, whereas pandas reads the same dates correctly.
To Reproduce
Generate parquet file as follows:
`
import pandas as pd
import numpy as np
np.random.seed(0)
# create an array of 5 timestamps starting at '2020-01-01', one per hour
rng = pd.date_range('2020-01-01', periods=5, freq='H')
df = pd.DataFrame({ 'Date': rng, 'Val': np.random.randn(len(rng)) })
df.to_parquet('data/myfile.parquet', coerce_timestamps='ms', allow_truncated_timestamps=True)
`
Expected behavior
Data is not corrupted and dates are read back correctly.
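The report does not show what the incorrect dates look like or why they occur. As a purely illustrative sketch, assuming the reader interprets the stored millisecond values as nanoseconds (an assumption, not a confirmed diagnosis), the following standard-library Python shows how the same integer decodes to very different dates depending on the unit. The helpers `to_epoch_ms` and `decode` are hypothetical names introduced here, not part of pandas or the parquet crate.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)

def to_epoch_ms(dt):
    """Hypothetical helper: whole milliseconds since the Unix epoch."""
    return (dt - EPOCH) // timedelta(milliseconds=1)

def decode(value, unit):
    """Hypothetical helper: decode an integer epoch offset in the given unit."""
    per_second = {"ms": 1_000, "us": 1_000_000, "ns": 1_000_000_000}[unit]
    seconds, remainder = divmod(value, per_second)
    # datetime only has microsecond resolution, so sub-second parts are truncated.
    micros = remainder * 1_000_000 // per_second
    return EPOCH + timedelta(seconds=seconds, microseconds=micros)

# First timestamp from the reproduction script, stored as milliseconds.
first = datetime(2020, 1, 1, tzinfo=timezone.utc)
raw = to_epoch_ms(first)

correct = decode(raw, "ms")  # unit matches what was written
wrong = decode(raw, "ns")    # assumed failure mode: unit ignored, treated as ns

print("stored value:", raw)
print("decoded as ms:", correct.isoformat())
print("decoded as ns:", wrong.isoformat())
```

If the dates read back land shortly after 1970-01-01, a unit mismatch of this kind is one plausible explanation; other causes are possible and the actual reader code would need to be checked.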
Additional context
_