Commit 1de1d12

Add read_avro and list_avro_columns for rework on Splittable Avro support
This PR is part of the effort to rework Dataset so that large files are first read into Tensors, which speeds up performance. See #382 and #366 for related discussions.

Summary:
1) read_avro can read an Avro file within the range [offset, offset+length] (splittable).
2) The primitive read_avro C++ op reads data in big chunks, which are then wired up with tf.data.Dataset.
3) read_avro can also be used in other places.
4) AvroDataset automatically determines the dtype in eager mode; in graph mode, the user has to specify the dtype in kwargs.

Signed-off-by: Yong Tang <[email protected]>
1 parent 8cb9f63 commit 1de1d12
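
Point 1 of the summary describes reading a file by byte range, which is what makes the format splittable. The sketch below only illustrates how a file could be divided into (offset, length) ranges. The helper name `split_ranges` and the `chunk_size` parameter are hypothetical and are not part of this commit; the actual signature of read_avro is not shown here, so the sketch does not call it.

```python
from typing import List, Tuple


def split_ranges(file_size: int, chunk_size: int) -> List[Tuple[int, int]]:
    """Return (offset, length) pairs that together cover file_size bytes.

    Hypothetical helper for illustration only; it is not part of
    tensorflow_io and does not read any Avro data itself.
    """
    if file_size < 0:
        raise ValueError("file_size must be non-negative")
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    ranges = []
    for offset in range(0, file_size, chunk_size):
        # The last range is shortened so it does not run past the file end.
        length = min(chunk_size, file_size - offset)
        ranges.append((offset, length))
    return ranges


if __name__ == "__main__":
    # Each pair could be handed to a range-aware reader independently.
    for offset, length in split_ranges(10, 4):
        print(offset, length)
```

Each pair is independent of the others, so ranges could be read in parallel or lazily. How a given range is aligned to Avro block boundaries is up to the reader and is not covered by this sketch.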

9 files changed: +517 -365 lines changed

tensorflow_io/avro/BUILD

Lines changed: 1 addition & 1 deletion
@@ -10,7 +10,7 @@ load(
 cc_library(
     name = "avro_ops",
     srcs = [
-        "kernels/avro_input.cc",
+        "kernels/avro_kernels.cc",
         "ops/avro_ops.cc",
     ],
     copts = tf_io_copts(),

tensorflow_io/avro/__init__.py

Lines changed: 6 additions & 0 deletions
@@ -15,18 +15,24 @@
 """Avro Dataset.
 
 @@AvroDataset
+@@list_avro_columns
+@@read_avro
 """
 
 from __future__ import absolute_import
 from __future__ import division
 from __future__ import print_function
 
 from tensorflow_io.avro.python.ops.avro_ops import AvroDataset
+from tensorflow_io.avro.python.ops.avro_ops import list_avro_columns
+from tensorflow_io.avro.python.ops.avro_ops import read_avro
 
 from tensorflow.python.util.all_util import remove_undocumented
 
 _allowed_symbols = [
     "AvroDataset",
+    "list_avro_columns",
+    "read_avro_",
 ]
 
 remove_undocumented(__name__, allowed_exception_list=_allowed_symbols)

tensorflow_io/avro/kernels/avro_input.cc

Lines changed: 0 additions & 248 deletions
This file was deleted.
