mask zero and activation in HATT

We used this code to build our project, but found that accuracy dropped. After reviewing the code, we found the following issues.

1. The code does not implement masking in the `AttLayer` class.
2. We believe the dense layer should be implemented inside the `AttLayer` class, rather than as a separate `Dense` layer outside it.
3. The dense layer is missing an activation function.
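To illustrate the first point, here is a minimal NumPy sketch (not the repository's code) of attention weights computed with and without a mask. The function name `attention_weights`, the example scores, and the `eps` value are assumptions made for illustration only.

```python
import numpy as np

def attention_weights(scores, mask=None, eps=1e-7):
    """Softmax over the time axis; masked (padded) steps get weight 0."""
    weights = np.exp(scores)
    if mask is not None:
        # Zero out padded timesteps before normalising.
        weights = weights * mask.astype(weights.dtype)
    return weights / (weights.sum(axis=1, keepdims=True) + eps)

# One sequence of length 4 whose last two timesteps are padding.
scores = np.array([[1.0, 2.0, 0.5, 0.5]])
mask = np.array([[1, 1, 0, 0]])

unmasked = attention_weights(scores)
masked = attention_weights(scores, mask)
print(unmasked.round(3))
print(masked.round(3))
```

Without the mask, roughly a quarter of the attention weight in this example lands on the two padded timesteps. With the mask, those positions get zero weight and the real timesteps share the full weight.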

We made the above changes, and accuracy increased by 4-5 percent over the baseline on our task (text classification).

Here is our `AttLayer` class. Its input is the direct output of the GRU, without an additional dense layer:
```python
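# Imports assume standalone Keras 2.x.
from keras import backend as K
from keras import initializers
from keras.layers import Layer


class AttLayer(Layer):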
    def __init__(self, attention_dim):
        self.init = initializers.get('normal')
        self.supports_masking = True
        self.attention_dim = attention_dim
        super(AttLayer, self).__init__()

    def build(self, input_shape):
        assert len(input_shape) == 3
        self.W = K.variable(self.init((input_shape[-1], self.attention_dim)))
        self.b = K.variable(self.init((self.attention_dim, )))
        self.u = K.variable(self.init((self.attention_dim, 1)))
        self.trainable_weights = [self.W, self.b, self.u]
        super(AttLayer, self).build(input_shape)

    def compute_mask(self, inputs, mask=None):
        return mask

    def call(self, x, mask=None):
        # x: [batch_size, seq_len, input_dim]
        # W: [input_dim, attention_dim], b: [attention_dim], u: [attention_dim, 1]
        # uit = tanh(xW + b), shape [batch_size, seq_len, attention_dim]
        uit = K.tanh(K.bias_add(K.dot(x, self.W), self.b))
        ait = K.dot(uit, self.u)
        ait = K.squeeze(ait, -1)

        ait = K.exp(ait)

        if mask is not None:
            # Cast the mask to floatX to avoid float64 upcasting in theano
            ait *= K.cast(mask, K.floatx())
        ait /= K.cast(K.sum(ait, axis=1, keepdims=True) + K.epsilon(), K.floatx())
        ait = K.expand_dims(ait)
        weighted_input = x * ait
        output = K.sum(weighted_input, axis=1)

        return output

    def compute_output_shape(self, input_shape):
        return (input_shape[0], input_shape[-1])

```
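As a sanity check on the layer above, the same forward pass can be sketched in plain NumPy. This is an illustrative mirror of `call`, not part of the original code. The function name `att_layer_forward`, the random test data, and the `eps` value (standing in for `K.epsilon()`) are assumptions.

```python
import numpy as np

def att_layer_forward(x, W, b, u, mask=None, eps=1e-7):
    """NumPy mirror of AttLayer.call.

    x: (batch, seq_len, input_dim), W: (input_dim, attention_dim),
    b: (attention_dim,), u: (attention_dim, 1),
    mask: (batch, seq_len) or None. Returns (batch, input_dim).
    """
    uit = np.tanh(x @ W + b)                 # (batch, seq_len, attention_dim)
    ait = np.squeeze(uit @ u, axis=-1)       # (batch, seq_len)
    ait = np.exp(ait)
    if mask is not None:
        ait = ait * mask.astype(ait.dtype)   # padded steps get zero weight
    ait = ait / (ait.sum(axis=1, keepdims=True) + eps)
    return (x * ait[..., None]).sum(axis=1)  # weighted sum over time

rng = np.random.default_rng(0)
batch, seq_len, input_dim, attention_dim = 2, 5, 4, 3
x = rng.normal(size=(batch, seq_len, input_dim))
W = rng.normal(size=(input_dim, attention_dim))
b = rng.normal(size=(attention_dim,))
u = rng.normal(size=(attention_dim, 1))
# First sequence has two padded timesteps, second has none.
mask = np.array([[1, 1, 1, 0, 0], [1, 1, 1, 1, 1]])

out = att_layer_forward(x, W, b, u, mask)

# Changing the padded timesteps must not change the output.
x_perturbed = x.copy()
x_perturbed[0, 3:] = 100.0
out_perturbed = att_layer_forward(x_perturbed, W, b, u, mask)
print(out.shape, np.allclose(out, out_perturbed))
```

The check at the end shows the property the mask is meant to guarantee: whatever values sit in the padded positions, the pooled output stays the same. One design note: applying `exp` directly, as the layer does, can overflow for large scores. Subtracting the per-sequence maximum before `exp` is a common way to avoid this.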


