
Conversation

@guillaumekln (Contributor)

This fixes using a custom attention layer, which previously required the AttentionMechanism instances to be initialized with a memory at the time the AttentionWrapper was created.

Fixes #461.
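A minimal sketch of the usage this change enables, assuming the tfa.seq2seq API of TensorFlow Addons at the time of this PR; the cell, layer sizes, and shapes are illustrative only and not taken from the PR itself:

```python
import tensorflow as tf
import tensorflow_addons as tfa

units = 128  # illustrative size

# Create the attention mechanism without a memory; the memory is set later.
attention_mechanism = tfa.seq2seq.LuongAttention(units)

# Previously, passing a custom attention_layer here failed because the wrapper
# needed the memory to compute the attention layer's output size at
# construction time. With this change the size is computed lazily.
cell = tfa.seq2seq.AttentionWrapper(
    tf.keras.layers.LSTMCell(units),
    attention_mechanism,
    attention_layer=tf.keras.layers.Dense(units, use_bias=False),
)

# Attach the memory once the encoder output is available.
memory = tf.random.normal([4, 10, units])
attention_mechanism.setup_memory(memory)
```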

@guillaumekln changed the title from "Lazily the compute attention layer size" to "Lazily compute the attention layer size" on Sep 9, 2019
@qlzh727 previously approved these changes on Sep 9, 2019
@qlzh727 (Member) left a comment:

I am OK with this change to support custom attention. On the other hand, it might introduce some latency, since the value is now computed on the fly multiple times. Can we cache the value once after it is calculated?
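A sketch of the caching pattern suggested here, using a hypothetical class and attribute names rather than the actual AttentionWrapper code: the sizes are computed on first access and the cached value is reused afterwards.

```python
import tensorflow as tf


class LazySizeCache:
    """Illustrative only: computes the attention layer output sizes lazily
    and caches the result so repeated accesses do not recompute it."""

    def __init__(self, attention_layers, memory_depth):
        self._attention_layers = attention_layers
        self._memory_depth = memory_depth
        self._cached_sizes = None  # hypothetical cache attribute

    @property
    def attention_layer_sizes(self):
        if self._cached_sizes is None:
            # Computed once, on first access, instead of at construction time.
            self._cached_sizes = [
                int(layer.compute_output_shape([None, self._memory_depth])[-1])
                for layer in self._attention_layers
            ]
        return self._cached_sizes


# The first access computes the sizes; later accesses reuse the cache.
layers = [tf.keras.layers.Dense(64), tf.keras.layers.Dense(32)]
cache = LazySizeCache(layers, memory_depth=128)
print(cache.attention_layer_sizes)  # [64, 32]
```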

@seanpmorgan merged commit 6633c43 into tensorflow:master on Sep 10, 2019
@guillaumekln deleted the lazy-attention-layer-size branch on June 9, 2020 at 08:16

Development

Successfully merging this pull request may close these issues.

Setting a custom attention_layer fails with AttentionMechanism without memory

4 participants