AttentionMechanism is not compatible with Eager Execution #535

@kazemnejad

Description

System information

  • OS Platform and Distribution (e.g., Linux Ubuntu 16.04): Linux Ubuntu 18.04
  • TensorFlow version and how it was installed (source or binary): binary (2.0.0-dev20190920)
  • TensorFlow-Addons version and how it was installed (source or binary): binary (0.6.0-dev)
  • Python version: 3.6
  • Is GPU used? (yes/no): Yes

Describe the bug

In the context of eager execution, the attention memory has to be set up again on every step of training. However, the current API does not appear to support this. The following snippet is from the _BaseAttentionMechanism.__call__(...) method:

if self._memory_initialized:
    if len(inputs) not in (2, 3):
        raise ValueError(
            "Expect the inputs to have 2 or 3 tensors, got %d" %
            len(inputs))
    if len(inputs) == 2:
        # We append the calculated memory here so that the graph will be
        # connected.
        inputs.append(self.values)
return super(_BaseAttentionMechanism, self).__call__(inputs, **kwargs)

As you can see, once the memory has been initialized, the method assumes that every future input is a query against that memory. A second call that passes a single memory tensor (to re-set the memory) therefore fails the len(inputs) check and raises a ValueError.
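
For reference, here is a minimal repro sketch under TF 2.0 eager mode. The constructor-with-memory usage, the [query, state] input convention, and the returned (alignments, next_state) pair reflect my reading of the 0.6.0-dev API; the shapes are arbitrary:

import tensorflow as tf
import tensorflow_addons as tfa

batch, max_time, units = 4, 10, 8

# First training step: construct the mechanism with an initial memory,
# which flips _memory_initialized to True.
memory = tf.random.normal([batch, max_time, units])
attention = tfa.seq2seq.BahdanauAttention(units, memory=memory)

# Querying the initialized memory works as intended.
query = tf.random.normal([batch, units])
state = tf.zeros([batch, max_time])  # initial alignments
alignments, next_state = attention([query, state])

# Second training step: try to set up a fresh memory. The single-tensor
# input fails the len(inputs) check in __call__ above.
new_memory = tf.random.normal([batch, max_time, units])
attention([new_memory])
# ValueError: Expect the inputs to have 2 or 3 tensors, got 1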

Other info / logs
I encountered this issue while working on #335.

Ideas to solve:

1. Use AttentionMechanism.setup_memory(...) to re-set the attention memory. But as far as I know, the API does not recommend this usage.
2. Set AttentionMechanism._memory_initialized to False at the beginning of the model's call method (see the sketch after this list).
3. Fix the behavior internally, e.g. change AttentionMechanism.__call__(...) to account for re-setting the memory.
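
A rough sketch of how ideas 1 and 2 could combine inside a custom model. The Decoder class here is hypothetical, and _memory_initialized is a private attribute, so this leans on an implementation detail rather than a supported API:

import tensorflow as tf
import tensorflow_addons as tfa

class Decoder(tf.keras.Model):
    # Hypothetical decoder that receives a fresh memory on every call.

    def __init__(self, units):
        super().__init__()
        self.attention = tfa.seq2seq.BahdanauAttention(units)

    def call(self, query, state, memory):
        # Clear the private flag so the mechanism accepts a new memory,
        # then re-set the memory (idea 2 followed by idea 1).
        self.attention._memory_initialized = False
        self.attention.setup_memory(memory)
        # Query the freshly initialized memory.
        alignments, next_state = self.attention([query, state])
        return alignments, next_state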
