Migrate MNIST to 2.0 tf.data.Dataset #268
Merged
With the upcoming TF 1.14.0 release (rc0 is out; the final could be released in several weeks) and the TF 2.0 beta release (expected next week), it is time for us to move forward with the 2.0 version of tf.data.Dataset.
There are lots of changes in tf.data.DatasetV2. The biggest one is that iteration can now be done in eager mode with `for ...` (instead of the previous awkward `tf.Session.run()` iteration).

For tensorflow-io, one immediate question is: with the co-existence of 1.x and 2.0, what should we support? I was also asked multiple times what exactly the advantage is of using a tf.data pipeline vs. a graph with custom-written ops.
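To make the change in iteration style concrete, here is a minimal sketch, assuming TensorFlow 2.x with eager execution enabled (the default). It uses a small in-memory dataset built with `tf.data.Dataset.from_tensor_slices` as a stand-in; the MNISTDataset from this PR is not used, and the zero-filled arrays are placeholders rather than real MNIST data.

```python
import numpy as np
import tensorflow as tf

# Placeholder data standing in for MNIST (assumption, not real data):
# four 28x28 "images" and four labels.
images = np.zeros((4, 28, 28), dtype=np.uint8)
labels = np.arange(4, dtype=np.uint8)

dataset = tf.data.Dataset.from_tensor_slices((images, labels))

# In eager mode the dataset is a plain Python iterable, so no
# initializer op or tf.Session.run() call is needed.
seen_labels = []
for image, label in dataset:
    # Each element is an eager tensor; .numpy() gives the concrete value.
    seen_labels.append(int(label.numpy()))

print(seen_labels)
```

The 1.x graph-mode equivalent would need an initializable iterator plus separate session calls for the init op and the next-element op, which is the pattern this PR stops testing.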
For tf.data, there are several useful aspects:
`tf.Session.run(init_ops); tf.Session.run(next_ops);` pattern. However, with the upcoming 2.0 it is now possible to iterate through a dataset with `for ...`.
`@tf.function`. As we can see, `@tf.function` carries a signature of `tf.TensorSpec`, which could fit into tf.data. This could be something we want to explore.

I think the idea is to clean up anything that does not carry enough value. For example, should we really test iteration in V1 with `tf.Session.run(init_ops); tf.Session.run(next_ops);`, which most users will never use?

This PR is an effort to clean up and rework MNISTDataset so that it fits into DatasetV2 and allows future expansion:
a. tf.keras support in 1.x with non-eager mode (default mode in 1.x)
b. tf.keras support in 2.0 with eager mode (default mode in 2.0)
c. iteration with `for ...` in 2.0 eager mode (default mode in 2.0)
d. iteration with `for ...` in 1.x eager mode (non-default mode in 1.x)
e. iteration with tf.Session in 1.x non-eager mode has been dropped.
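As a rough illustration of the tf.keras cases (a) and (b), the sketch below passes a `tf.data.Dataset` directly to `model.fit`. It assumes TensorFlow 2.x; the random arrays, the batch size, and the tiny model are placeholders chosen for illustration and do not come from this PR, which would supply MNISTDataset instead.

```python
import numpy as np
import tensorflow as tf

# Random placeholder data shaped like MNIST (assumption, not real data).
images = np.random.rand(32, 28, 28).astype(np.float32)
labels = np.random.randint(0, 10, size=(32,)).astype(np.int64)

# Keras consumes batched (features, labels) pairs from the dataset.
dataset = tf.data.Dataset.from_tensor_slices((images, labels)).batch(8)

# A deliberately tiny classifier; it is built on the first batch it sees.
model = tf.keras.Sequential([
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(10, activation="softmax"),
])
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy")

# No manual iteration is needed: fit() pulls batches from the dataset.
history = model.fit(dataset, epochs=1, verbose=0)
print(sorted(history.history.keys()))
```

Because the dataset object is handed to Keras as-is, the same call shape can apply whether the dataset is iterated eagerly or inside a graph, which is part of why dropping the explicit tf.Session iteration in (e) costs little.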
Note: This is a rework of #195.
Signed-off-by: Yong Tang [email protected]