Skip to content
This repository was archived by the owner on Sep 10, 2025. It is now read-only.

Conversation

@pmabbo13
Copy link
Contributor

@pmabbo13 pmabbo13 commented Aug 8, 2022

Description

Make minor design changes that allow for the T5 model to be scripted

Process

A few design changes had to be made for the model to become scriptable:

  1. relative_attention_bias is no longer instantiated when a T5 layer is being instantiated. This is because relative_attention_bias then needs to get passed down to T5MultiheadAttention as an input argument, where the embedding is actually used, but Torchscript does not support nn.Embedding input args. So instead, relative_attention_bias gets instantiated with T5MultiheadAttention. This completely avoids having to pass it in as an input argument.

  2. There seemed to be an issue with all_outputs, all_sa_scores, and all_ca_scores (in T5Stack.forward) being tuples (probably because after every layer we "append" tensors to this object, and tuples aren't the best data structures for what we are effectively using as mutable objects). Changing their types to List cleared the torchscript errors.

  3. After correcting the above and a few other minor errors, we were then getting a vague error message of torchscript RuntimeError: Unsupported value kind: Tensor. The error message did not provide any pointers to where in the code this issue was arising. It is unclear why exactly this worked, but breaking out T5Layer and T5Stack to be T5EncoderLayer, T5DecoderLayer, T5Encoder, T5Decoder seemed to resolve this issue.

Testing

Updated pre-trained weights saved in S3/Manifold and modified integrations tests to test scripted version of models.

pytest test/prototype/integration_tests/test_models.py

@pmabbo13 pmabbo13 marked this pull request as ready for review August 10, 2022 15:02
@pmabbo13 pmabbo13 requested review from Nayef211, abhinavarora and parmeet and removed request for Nayef211 August 10, 2022 15:02
Copy link
Contributor

@parmeet parmeet left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall LGTM! Thanks @pmabbo13 for making the model torchscriptable.

Copy link
Contributor

@Nayef211 Nayef211 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM! Great work on this!

@pmabbo13 pmabbo13 merged commit e7bcf3c into pytorch:main Aug 11, 2022
@pmabbo13 pmabbo13 deleted the feature/t5-torchscript branch August 11, 2022 14:36
@pmabbo13 pmabbo13 mentioned this pull request Aug 12, 2022
2 tasks
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants