Make T5 model torchscriptable #1876

pmabbo13 · 2022-08-08T18:42:02Z

Description

Make minor design changes that allow for the T5 model to be scripted

Process

A few design changes had to be made for the model to become scriptable:

relative_attention_bias is no longer instantiated when a T5 layer is instantiated. Previously it had to be passed down as an input argument to T5MultiheadAttention, where the embedding is actually used, but TorchScript does not support nn.Embedding input arguments. relative_attention_bias is now instantiated inside T5MultiheadAttention, so it never has to be passed in as an argument.
all_outputs, all_sa_scores, and all_ca_scores (in T5Stack.forward) caused errors when typed as tuples, probably because tensors are "appended" to them after every layer, and tuples are a poor fit for what are effectively mutable objects. Changing their types to List cleared the TorchScript errors.
After correcting the above and a few other minor errors, scripting still failed with a vague TorchScript error, RuntimeError: Unsupported value kind: Tensor, which gave no pointer to where in the code the issue arose. It is unclear exactly why this worked, but splitting T5Layer and T5Stack into T5EncoderLayer, T5DecoderLayer, T5Encoder, and T5Decoder seemed to resolve the issue.
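The first two changes can be illustrated with a small, self-contained sketch. This is not the torchtext implementation: ToyAttention and ToyStack are hypothetical stand-ins for T5MultiheadAttention and the encoder/decoder stacks, and the distance bucketing is deliberately simplified. It only shows the two patterns described above: the attention module owns its nn.Embedding instead of receiving it as a forward argument, and per-layer results are accumulated in List[Tensor] rather than tuples.

```python
from typing import List, Tuple

import torch
import torch.nn as nn
from torch import Tensor


class ToyAttention(nn.Module):
    """Hypothetical stand-in for T5MultiheadAttention (not the real module)."""

    def __init__(self, num_buckets: int, num_heads: int) -> None:
        super().__init__()
        self.num_buckets = num_buckets
        # The embedding lives on the module that uses it, so it never has to
        # be passed through forward() as an argument.
        self.relative_attention_bias = nn.Embedding(num_buckets, num_heads)

    def forward(self, x: Tensor) -> Tuple[Tensor, Tensor]:
        seq_len = x.size(1)
        positions = torch.arange(seq_len, dtype=torch.long, device=x.device)
        # Simplified bucketing (clamped absolute distance); an assumption for
        # illustration, not the T5 bucketing scheme.
        distance = (positions.unsqueeze(0) - positions.unsqueeze(1)).abs()
        buckets = distance.clamp(0, self.num_buckets - 1)
        bias = self.relative_attention_bias(buckets).permute([2, 0, 1])  # (heads, seq, seq)
        logits = torch.matmul(x, x.transpose(1, 2)).unsqueeze(1) + bias
        scores = torch.softmax(logits, dim=-1)  # (batch, heads, seq, seq)
        output = torch.matmul(scores.mean(dim=1), x)  # (batch, seq, dim)
        return output, scores


class ToyStack(nn.Module):
    """Hypothetical stack that collects per-layer results in lists."""

    def __init__(self, num_layers: int, num_buckets: int, num_heads: int) -> None:
        super().__init__()
        self.layers = nn.ModuleList(
            [ToyAttention(num_buckets, num_heads) for _ in range(num_layers)]
        )

    def forward(self, x: Tensor) -> Tuple[Tensor, List[Tensor], List[Tensor]]:
        # Annotated lists can be appended to inside scripted code.
        all_outputs: List[Tensor] = []
        all_scores: List[Tensor] = []
        for layer in self.layers:
            all_outputs.append(x)
            x, scores = layer(x)
            all_scores.append(scores)
        all_outputs.append(x)
        return x, all_outputs, all_scores


torch.manual_seed(0)
stack = ToyStack(num_layers=2, num_buckets=8, num_heads=4)
scripted = torch.jit.script(stack)

x = torch.randn(1, 5, 16)
eager_out, eager_hidden, eager_scores = stack(x)
script_out, script_hidden, script_scores = scripted(x)
```

In this sketch the scripted and eager modules share the same weights, so their outputs can be compared directly.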

Testing

Updated the pre-trained weights saved in S3/Manifold and modified the integration tests to test the scripted versions of the models.

pytest test/prototype/integration_tests/test_models.py
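The contents of the integration tests are not shown here, so the following is only a rough sketch of what a scripted-versus-eager check might look like. The helper name assert_scripted_matches_eager and the toy model are assumptions made for illustration; they are not taken from test_models.py.

```python
import io

import torch
import torch.nn as nn


def assert_scripted_matches_eager(model: nn.Module, *inputs: torch.Tensor) -> None:
    """Hypothetical helper: script a model and compare it with eager mode."""
    model.eval()
    scripted = torch.jit.script(model)

    # Round-trip through serialization to check the scripted module can be
    # saved and reloaded.
    buffer = io.BytesIO()
    torch.jit.save(scripted, buffer)
    buffer.seek(0)
    reloaded = torch.jit.load(buffer)

    with torch.no_grad():
        expected = model(*inputs)
        assert torch.allclose(scripted(*inputs), expected)
        assert torch.allclose(reloaded(*inputs), expected)


# Toy model standing in for a pre-trained T5 bundle (an assumption).
toy_model = nn.Sequential(nn.Embedding(10, 4), nn.Linear(4, 2))
tokens = torch.tensor([[1, 2, 3]])
assert_scripted_matches_eager(toy_model, tokens)
```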

parmeet

Overall LGTM! Thanks @pmabbo13 for making the model torchscriptable.

Nayef211

LGTM! Great work on this!

pmabbo13 added 3 commits August 8, 2022 10:32

type annotate device

3321854

refactor relative_attention_bias

59d0e9e

breaking out encoder and decoder layer and stacks

22809fa

facebook-github-bot added the cla signed label Aug 8, 2022

pmabbo13 added 4 commits August 8, 2022 15:23

Merge branch 'main' into feature/t5-torchscript

8fba7b2

updating doc strings

2b4f61c

correcting type annotations

d595a7f

update integration tests to test scripted version of models

5441e9a

pmabbo13 mentioned this pull request Aug 9, 2022

Add T5 Model and Demo on Text Summarization using CNNDM Dataset #1800

Closed

25 tasks

pmabbo13 marked this pull request as ready for review August 10, 2022 15:02

pmabbo13 requested review from Nayef211, abhinavarora and parmeet and removed request for Nayef211 August 10, 2022 15:02

parmeet approved these changes Aug 10, 2022

Nayef211 approved these changes Aug 10, 2022

pmabbo13 merged commit e7bcf3c into pytorch:main Aug 11, 2022

pmabbo13 deleted the feature/t5-torchscript branch August 11, 2022 14:36

pmabbo13 mentioned this pull request Aug 12, 2022

Testing T5Model #1848

Merged

2 tasks
