@yongtang yongtang commented Aug 23, 2019

Parquet is a columnar file format that naturally fits table/column data. Since a Parquet file is itself indexable, reducing it to an iterable dataset is undesirable because it loses convenience and flexibility.

This PR adds tfio.IOTensor.from_parquet support so that Parquet data can be accessed through natural `__getitem__` operations.
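To illustrate why indexable access matters, here is a minimal standard-library sketch of a column stored as row groups and read through `__getitem__`. It is not the tfio implementation: the class `ChunkedColumn`, its `groups_read` counter, and the in-memory lists standing in for Parquet row groups are all hypothetical names chosen for illustration.

```python
from bisect import bisect_right


class ChunkedColumn:
    """Hypothetical stand-in for one column stored as several row groups."""

    def __init__(self, row_groups):
        self._groups = [list(group) for group in row_groups]
        # _offsets[i] is the global index of the first row in group i.
        self._offsets = []
        total = 0
        for group in self._groups:
            self._offsets.append(total)
            total += len(group)
        self._length = total
        self.groups_read = 0  # counts simulated reads, for illustration only

    def __len__(self):
        return self._length

    def _read_group(self, i):
        # A real reader would decode the row group from the file here.
        self.groups_read += 1
        return self._groups[i]

    def __getitem__(self, key):
        if isinstance(key, slice):
            start, stop, step = key.indices(self._length)
            if step != 1:
                raise ValueError("this sketch supports contiguous slices only")
            out = []
            if start >= stop:
                return out
            first = bisect_right(self._offsets, start) - 1
            for i in range(first, len(self._groups)):
                base = self._offsets[i]
                if base >= stop:
                    break
                rows = self._read_group(i)
                out.extend(rows[max(start - base, 0):stop - base])
            return out
        index = key + self._length if key < 0 else key
        if not 0 <= index < self._length:
            raise IndexError("row index out of range")
        i = bisect_right(self._offsets, index) - 1
        return self._read_group(i)[index - self._offsets[i]]


column = ChunkedColumn([[1, 2, 3], [4, 5], [6, 7, 8, 9]])
single = column[4]      # touches only the row group that holds row 4
window = column[2:6]    # touches only the groups overlapping rows 2..5
```

An iterable dataset would have to walk every preceding row to reach row 4, whereas the indexed lookup above reads a single group.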

Signed-off-by: Yong Tang [email protected]

@yongtang yongtang force-pushed the io_tensor_parquet branch 2 times, most recently from 200cd33 to 37bf6b7 Compare August 25, 2019 19:36
@yongtang
Member Author

I intend to merge this PR soon, as it has been open for some time. It follows the same pattern as the other PRs and adds support for partitioned reads.
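The comment does not spell out what a partitioned read involves. One plausible reading, assumed here, is that an indexable source is consumed in fixed-size row ranges rather than all at once. The helper `partition_ranges` and the `capacity` parameter below are hypothetical names for this sketch and are not taken from the PR.

```python
def partition_ranges(total_rows, capacity):
    """Split [0, total_rows) into consecutive (start, stop) ranges of at most `capacity` rows."""
    if capacity <= 0:
        raise ValueError("capacity must be positive")
    return [
        (start, min(start + capacity, total_rows))
        for start in range(0, total_rows, capacity)
    ]


# Any indexable source works; a plain list stands in for a Parquet column here.
rows = list(range(10))
ranges = partition_ranges(len(rows), capacity=4)

# Read one partition at a time through ordinary slicing.
partitions = [rows[start:stop] for start, stop in ranges]
```

Because each range is independent, partitions can be read lazily or in any order, which is only possible when the source supports indexing.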

@yongtang yongtang merged commit a0cc8b7 into tensorflow:master Sep 21, 2019
@yongtang yongtang deleted the io_tensor_parquet branch September 21, 2019 16:41
i-ony pushed a commit to i-ony/io that referenced this pull request Feb 8, 2021