Update with compiling optimization and DT_UINT32 support for HDF5 #379
Conversation
For compilation optimization flags, the default (`-march=native`) optimizes the generated code for your machine's CPU type ([see here](https://www.tensorflow.org/install/source#configuration_options)).
Thanks for your pull request. It looks like this may be your first contribution to a Google open source project. Before we can look at your pull request, you'll need to sign a Contributor License Agreement (CLA). 📝 Please visit https://cla.developers.google.com/ to sign. Once you've signed (or fixed any issues), please reply here.
CLAs look good, thanks!
I signed it!
LGTM. Thanks for the fix!
We already had some discussion about batch size and the overall column-based data (e.g., Parquet, Feather, HDF5) pipeline in #366 (comment)
Previously, batching was a limitation of the overall tf.data.Dataset pipeline, which generates records one by one. That is not an issue for large records such as image files, but it really slows everything down when each record is, say, one integer or one float32.
We added a batch concept in tensorflow-io to speed things up, but we were using the same batch as tf.keras, which is actually a different concept (number of samples).
My way of thinking is that we may want to:
1. read and process as much as possible in one chunk of big memory (if not the whole file) for each "batch process" in the tf.data pipeline;
2. then rebatch() to align with tf.keras' batch if needed (see the sketch below).

That likely needs some changes in the overall tf.data pipeline (or moving much of the logic out of it). With TF 2.0 I think the effort will be smaller.
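As a minimal sketch of that idea (the chunked generator and all sizes below are hypothetical stand-ins for a real columnar reader, not tensorflow-io's implementation, and `unbatch()` plus `batch()` stands in for a dedicated rebatch op):

```python
import numpy as np
import tensorflow as tf

# Hypothetical chunked reader: each call yields one large block of records
# (random uint32 data here stands in for a column read from an HDF5 file).
def read_chunks(num_chunks=4, chunk_size=65536):
    for _ in range(num_chunks):
        yield np.random.randint(0, 2**32, size=(chunk_size,), dtype=np.uint32)

# One dataset element per *chunk*, so I/O happens in a few large reads.
chunks = tf.data.Dataset.from_generator(
    read_chunks, output_types=tf.uint32, output_shapes=(65536,))

# Rebatch to the per-sample batch size tf.keras expects.
samples = chunks.unbatch()    # one element per record
batches = samples.batch(32)   # 32 samples per training step
```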
/cc @BryanCutler
Thanks for your review and detailed reply!
…nsorflow#379)
* Update README.md
* Update with compiling optimization: for compilation optimization flags, the default (`-march=native`) optimizes the generated code for your machine's CPU type ([see here](https://www.tensorflow.org/install/source#configuration_options))
* Add DT_UINT32 support
More than 50% acceleration can be achieved when reading compressed HDF5 files with the compiling optimization enabled, and even more with a large batch_size.
In addition, DT_UINT32 is now supported.
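For reference, a rough sketch of how this might look from the user side (the module path, `HDF5Dataset` arguments, and the file and column names are assumptions, not necessarily the exact tensorflow-io API; consult the project docs for the real signature):

```python
import tensorflow_io.hdf5 as hdf5_io  # assumed import path

# Read a uint32 column from a compressed HDF5 file. A large `batch`
# pulls many records per read, which is where the compile-time
# optimization pays off most.
dataset = hdf5_io.HDF5Dataset("data.h5", ["/my_uint32_column"], batch=8192)

for block in dataset:
    print(block.dtype, block.shape)  # expected: uint32 blocks of up to 8192 values
```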