Description
System information
- TensorFlow version (you are using): tf-nightly-2.0-preview
- TensorFlow Addons version: source
- Is it in the tf.contrib (if so, where): no
- Are you willing to contribute it (yes/no): yes
- Are you willing to maintain it going forward? (yes/no): yes
Describe the feature and the current behavior/state.
Currently, tfa.image.mean_filter2d is implemented with tf.image.extract_patches, although it is functionally equivalent to applying a 2-D convolution with a box filter (or uniform filter) channel by channel. This can be done easily with tf.nn.depthwise_conv2d, with a significant speed improvement.
Here is a notebook example comparing the performance of the two implementations. On the Colab platform, the depthwise-convolution version is about 4x faster than the original for both single and batched images. Note that I extended the original implementation with tf.map_fn so it supports 4-D input, and some trivial normalization is omitted.
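A minimal sketch of the proposed approach (the function name `mean_filter2d_depthwise` is mine, not part of any existing API): a box filter with weights 1/(fh*fw) is applied per channel via tf.nn.depthwise_conv2d on a 4-D float image.

```python
import tensorflow as tf

def mean_filter2d_depthwise(image, filter_shape=(3, 3)):
    """Mean filter via depthwise convolution (sketch).

    image: 4-D float tensor of shape [batch, height, width, channels].
    """
    fh, fw = filter_shape
    channels = image.shape[-1]
    # Box filter: every weight is 1/(fh*fw); one identical filter per channel.
    kernel = tf.ones((fh, fw, channels, 1), dtype=image.dtype) / (fh * fw)
    return tf.nn.depthwise_conv2d(
        image, kernel, strides=[1, 1, 1, 1], padding="SAME")
```

Because depthwise_conv2d operates on the full 4-D batch at once, no tf.map_fn loop over images is needed.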
I'm still wondering why normalization and casting are performed in the original implementations of both mean_filter2d and median_filter2d (no offense, just want to know why). For comparison, other libraries preserve the input type:
- For OpenCV:
dst: destination array of the same size and type as src.
- For SciPy:
output : array or dtype, optional
The array in which to place the output, or the dtype of the returned array. By default an array of the same dtype as input will be created.
- For MATLAB:
Output image, returned as a numeric matrix of the same class as the input image I.
As far as I'm concerned, there is no need to restrict the computation to the range [0, 1], nor to transform the output back to the uint8 range when the input is not of type uint8. All we need to do is compute the output (possibly casting the image to float first, because depthwise_conv2d does not accept non-float kernels or inputs) and cast it back to the original data type. For mean and median operations the result stays within the input range, so no over/underflow can occur. Hence casting the output back to the original data type should be safe, though I think it's better to discuss this first.
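The cast-and-cast-back idea can be sketched as a small wrapper (the helper name `filter_preserving_dtype` is hypothetical, used only for illustration): cast to float for the convolution, then cast back to image.dtype, with no [0, 1] normalization in between.

```python
import tensorflow as tf

def filter_preserving_dtype(image, filter_fn):
    """Sketch: apply a float-only filter while preserving the input dtype.

    filter_fn: any filter that requires float input
    (e.g. a depthwise-convolution mean filter).
    """
    orig_dtype = image.dtype
    # depthwise_conv2d does not accept non-float input, so cast up first.
    x = tf.cast(image, tf.float32)
    y = filter_fn(x)
    # Mean/median stay within the input range, so casting back is safe.
    return tf.cast(y, orig_dtype)
```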
Moreover, if users want to do post-processing on intermediate feature maps from CNNs, those feature maps are unlikely to lie in [0, 1] or the uint8 range; they are more likely to be real-valued tensors whose range depends on the activation function used.
Will this change the current api? How?
mean_filter2d(image, filter_shape=(3, 3), padding="REFLECT", constant_values=0, name=None)
The new API is similar to scipy.ndimage.uniform_filter, which supports padding modes. The default padding mode is "REFLECT" because both SciPy and OpenCV use reflect padding by default (while some MATLAB toolboxes adopt zero-padding by default). Besides, tf.image.sobel_edges, another filter-based op, also uses REFLECT padding.
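The padding modes in the proposed signature map directly onto tf.pad, which already supports "REFLECT", "SYMMETRIC", and "CONSTANT". A sketch (the helper name `pad_for_filter` is mine): pad the image so a subsequent VALID convolution produces SciPy-style border behavior.

```python
import tensorflow as tf

def pad_for_filter(image, filter_shape=(3, 3), mode="REFLECT",
                   constant_values=0):
    """Sketch: pad a 4-D image for a filter of the given shape.

    mode: one of "REFLECT", "SYMMETRIC", "CONSTANT" (tf.pad's modes).
    A VALID convolution on the result matches the padded-border semantics.
    """
    fh, fw = filter_shape
    ph, pw = fh // 2, fw // 2
    # Pad only the spatial dimensions, not batch or channels.
    paddings = [[0, 0], [ph, ph], [pw, pw], [0, 0]]
    return tf.pad(image, paddings, mode=mode,
                  constant_values=constant_values)
```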
Who will benefit with this feature?
People who want to process images.
Any Other info.
To conclude:
- Implement with tf.nn.depthwise_conv2d, which speeds things up considerably and supports batch-wise computation.
- Cast the input image to float if needed and cast the output back to image.dtype.
- Support padding modes.
Some references for API design: