This repository was archived by the owner on Sep 10, 2025. It is now read-only.

Conversation

@ebsmothers (Contributor)

The bool attention mask was being cast to float incorrectly. This diff fixes the cast and adds an additional test case for bool masks.
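
For reference, such a test could look roughly like the sketch below. This is only an illustration: the helper name, the module's forward signature (the attn_mask keyword), and the mask shape are assumptions, not the actual torchtext test.

import torch

def check_bool_mask_matches_float(attn_module, query, seq_len):
    # A bool mask (True = masked position) should give the same output as the
    # equivalent additive float mask with large negative values at masked positions.
    bool_mask = torch.zeros(seq_len, seq_len, dtype=torch.bool)
    bool_mask[:, -1] = True  # e.g. disallow attending to the last position
    float_mask = torch.zeros(seq_len, seq_len, dtype=torch.float32)
    float_mask.masked_fill_(bool_mask, -1e8)

    out_bool = attn_module(query, attn_mask=bool_mask)
    out_float = attn_module(query, attn_mask=float_mask)
    torch.testing.assert_close(out_bool, out_float)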

Summary:
Rather than raise an exception whenever head_dim != 64, we can simply infer the scaling value and emit a warning instead.

Also add an assertion that embed_dim is a multiple of num_heads (otherwise forward will break).
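
A minimal sketch of what the summary describes, assuming a constructor shaped roughly like torchtext's MultiHeadSelfAttention; the argument names and warning text here are illustrative, not the exact implementation.

import warnings
import torch.nn as nn

class MultiHeadSelfAttention(nn.Module):
    def __init__(self, embed_dim, num_heads, scaling=None):
        super().__init__()
        # forward() splits embed_dim across heads, so it must divide evenly.
        assert embed_dim % num_heads == 0, "embed_dim must be a multiple of num_heads"
        head_dim = embed_dim // num_heads
        expected_scaling = float(head_dim) ** -0.5
        if scaling is None:
            # Rather than raising when head_dim != 64, infer the scaling value and warn.
            warnings.warn(f"Scaling not provided; inferring scaling={expected_scaling} from head_dim={head_dim}")
            scaling = expected_scaling
        self.num_heads = num_heads
        self.scaling = scaling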

Reviewed By: parmeet

Differential Revision: D32193989

fbshipit-source-id: 30f68c55f3ec37932252c77c355ae55b8bf34ded
# Convert a bool attention mask (True = masked) into an additive float mask
# filled with a large negative value at masked positions:
if attn_mask.dtype == torch.bool:
    new_attn_mask = torch.zeros_like(attn_mask, dtype=input.dtype)
    new_attn_mask.masked_fill_(attn_mask, -1e8 if input.dtype == torch.float32 else -1e4)
    attn_mask = new_attn_mask
Contributor

I wonder if it is necessary to add the conversion here, since we have already added it inside TransformerEncoderLayer. Following the same argument, perhaps we could also remove it from TransformerEncoderLayer, since we know the mask is going to be passed to MultiHeadSelfAttention, which would do this conversion. This was also discussed a bit in #1435 (comment). I feel we could avoid the redundancy here. wdyt?

Contributor (Author)

Good question. I think it makes sense to perform the bool-to-float conversion only in MultiHeadSelfAttention, since that is where the mask is actually used. Then we can just keep the checks in TransformerEncoder and TransformerEncoderLayer to ensure the attention mask is either bool or float.
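
In other words, roughly the split sketched below: the encoder layers only validate the mask dtype, while the module that actually consumes the mask performs the bool-to-float conversion. The helper names are illustrative assumptions, not the actual torchtext code.

import torch

def _check_attn_mask_dtype(attn_mask):
    # Kept in TransformerEncoder / TransformerEncoderLayer: only validate the dtype.
    if attn_mask.dtype != torch.bool and not torch.is_floating_point(attn_mask):
        raise TypeError(f"attn_mask must be a bool or float tensor, got {attn_mask.dtype}")

def _to_additive_float_mask(attn_mask, dtype):
    # Done once inside MultiHeadSelfAttention, where the mask is actually applied.
    if attn_mask.dtype == torch.bool:
        new_attn_mask = torch.zeros_like(attn_mask, dtype=dtype)
        new_attn_mask.masked_fill_(attn_mask, -1e8 if dtype == torch.float32 else -1e4)
        return new_attn_mask
    return attn_mask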

@ebsmothers marked this pull request as ready for review on December 8, 2021 17:23
codecov bot commented Dec 8, 2021

Codecov Report

Merging #1454 (b584f30) into main (9f2fb3f) will increase coverage by 0.17%.
The diff coverage is 75.00%.


@@            Coverage Diff             @@
##             main    #1454      +/-   ##
==========================================
+ Coverage   86.35%   86.52%   +0.17%     
==========================================
  Files          58       58              
  Lines        2220     2219       -1     
==========================================
+ Hits         1917     1920       +3     
+ Misses        303      299       -4     
Impacted Files                         Coverage           Δ
torchtext/models/roberta/modules.py    84.75% <75.00%>    (+2.33%) ⬆️

Continue to review full report at Codecov.

Legend
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 9f2fb3f...b584f30.

@parmeet (Contributor) left a comment

LGTM, thanks for fixing the issue!

@parmeet merged commit a074cb2 into pytorch:main on Dec 8, 2021