Description
System information
- OS Platform and Distribution (e.g., Linux Ubuntu 16.04): Linux Ubuntu 18.04
- TensorFlow version and how it was installed (source or binary): binary(2.0.0-dev20190920)
- TensorFlow-Addons version and how it was installed (source or binary): binary(0.6.0-dev)
- Python version: 3.6
- Is GPU used? (yes/no): Yes
Describe the bug
In the context of eager execution, we need to re-set up the attention memory on each training step. However, the current API does not seem to support this. The following snippet is from the `_BaseAttentionMechanism.__call__(...)` method:
```python
if self._memory_initialized:
    if len(inputs) not in (2, 3):
        raise ValueError(
            "Expect the inputs to have 2 or 3 tensors, got %d" %
            len(inputs))
    if len(inputs) == 2:
        # We append the calculated memory here so that the graph will be
        # connected.
        inputs.append(self.values)
return super(_BaseAttentionMechanism, self).__call__(inputs, **kwargs)
```

As you can see, once the memory has been initialized, the method assumes all future inputs are queries against that memory. A second call intended to re-set up the memory will therefore raise a `ValueError`.
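For concreteness, a minimal reproduction sketch (the shapes, `BahdanauAttention`, and the call sequence are illustrative; only the final `ValueError` from the check above is the point):

```python
import tensorflow as tf
import tensorflow_addons as tfa

batch, max_time, depth, units = 8, 10, 16, 32
attention = tfa.seq2seq.BahdanauAttention(units)

# Step 1: initialize the memory from this batch's encoder output.
memory = tf.random.normal([batch, max_time, depth])
attention.setup_memory(memory)

# Querying the initialized memory works as intended:
# len(inputs) == 2, so self.values is appended and the call succeeds.
query = tf.random.normal([batch, units])
state = tf.zeros([batch, max_time])  # previous alignments
attention([query, state])

# Step 2 of eager training: passing a fresh memory through __call__
# fails, because len(inputs) == 1 is neither 2 nor 3.
new_memory = tf.random.normal([batch, max_time, depth])
attention([new_memory])  # ValueError: Expect the inputs to have 2 or 3 tensors, got 1
```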
Other info / logs
I encountered this issue while I was working on #335.
Ideas to solve:
1- Use `AttentionMechanism.setup_memory(...)` to re-set up the attention memory. However, as far as I know, the API does not recommend this usage.
2- Set `AttentionMechanism._memory_initialized` to False at the beginning of the model's `call` method (see the sketch after this list).
3- Fix this behavior internally, e.g. change `AttentionMechanism.__call__(...)` so that it supports re-setting the memory.
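A rough sketch of how ideas 1 and 2 might look in user code (the `Decoder` model and its argument names are hypothetical, and `_memory_initialized` is a private attribute, so this is a workaround rather than a supported pattern):

```python
class Decoder(tf.keras.Model):
    def __init__(self, units):
        super().__init__()
        self.attention = tfa.seq2seq.BahdanauAttention(units)

    def call(self, encoder_outputs, query, state):
        # Idea 2: clear the private flag so the mechanism will accept a
        # fresh memory on this training step.
        self.attention._memory_initialized = False
        # Idea 1: re-set up the memory, even though the API does not
        # document setup_memory() for repeated use.
        self.attention.setup_memory(encoder_outputs)
        return self.attention([query, state])
```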