
Conversation

pmabbo13 added a commit that referenced this pull request Jul 13, 2022
…er model

ghstack-source-id: 946c573
Pull Request resolved: #1829
@pmabbo13 pmabbo13 requested a review from Nayef211 July 13, 2022 18:15
Contributor

@Nayef211 Nayef211 left a comment

@parmeet I wonder if you think it makes sense to update our docs to include the T5Model class as well as the yet-to-be-created T5Bundle. Similarly, is there a reason why we don't include the RobertaModel class in the docstring if it's a public-facing component?

pmabbo13 added a commit that referenced this pull request Jul 14, 2022
…er model

ghstack-source-id: 0219b63
Pull Request resolved: #1829
@pmabbo13
Contributor Author

Description

The T5Model implementation is very similar to the nn.Transformer implementation, with some additional functionality. The model:

  1. Takes a tokenized encoder input sequence and a decoder input sequence and transforms them into word embeddings.
  2. Computes the padding masks for the input sequences based on the padding_idx argument passed when initializing the model.
  3. Generates a causal mask for decoder self-attention unless one has already been provided via the decoder_mask argument to the forward method.
  4. Returns the output of the final layer of the encoder and decoder, the output at each layer of the encoder and decoder, the self-attention scores of each layer of the encoder and decoder, and the cross-attention scores of each layer of the decoder.
  5. Runs only the encoder portion and returns its corresponding outputs if encoder_only=True is passed when initializing the model (see the usage sketch below).
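
A minimal usage sketch of the behavior described above. The import path, constructor arguments, and return structure here are assumptions based on this description, not necessarily the final prototype API:

```python
import torch
from torchtext.prototype.models import T5Model  # import path is an assumption

# Hypothetical configuration; argument names follow the description above.
model = T5Model(encoder_only=False, padding_idx=0)

# Tokenized input sequences (batch of 2), padded with padding_idx=0.
encoder_tokens = torch.tensor([[13, 37, 42, 0], [7, 19, 0, 0]])
decoder_tokens = torch.tensor([[1, 5, 9], [1, 8, 0]])

# Padding masks are computed internally from padding_idx (item 2), and a
# causal mask is generated for decoder self-attention because no
# decoder_mask is supplied here (item 3).
outputs = model(encoder_tokens, decoder_tokens)

# Per item 4, the result bundles final-layer outputs, per-layer outputs,
# and self-/cross-attention scores for the encoder and decoder.
```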

@pmabbo13 pmabbo13 requested a review from parmeet July 15, 2022 15:58
@pmabbo13 pmabbo13 requested a review from abhinavarora July 15, 2022 15:58
pmabbo13 added a commit that referenced this pull request Jul 15, 2022
…er model

ghstack-source-id: b5a8a5e
Pull Request resolved: #1829
Contributor

@parmeet parmeet left a comment

Overall LGTM!

@parmeet
Contributor

parmeet commented Jul 15, 2022

I wonder if you think it makes sense to update our docs to include the T5Model class as well as the yet to be created T5Bundle.

This is just a prototype feature, so it may not be necessary to include it in the docs. I don't see the other domains doing this either.

Similarly is there a reason why we don't include the RobertaModel class in the docstring if it's a public facing component?

Hmm, that's a good catch. Not really. It looks like we missed including the docs for it; perhaps we were only focusing on the Roberta Bundler API that exposes this model to users.

pmabbo13 added a commit that referenced this pull request Jul 15, 2022
…er model

ghstack-source-id: 0ae4e98
Pull Request resolved: #1829
pmabbo13 added a commit that referenced this pull request Jul 18, 2022
…er model

ghstack-source-id: a5da3a7
Pull Request resolved: #1829
@pmabbo13 pmabbo13 merged commit adbc511 into gh/pmabbo13/9/base Jul 18, 2022
pmabbo13 added a commit that referenced this pull request Jul 18, 2022
* compute relative position buckets for relative attention bias
* compute relative position bias for t5 attention
* compute attention scores for t5 model using relative attention bias
* perform multihead attention using relative attention bias for t5 model
* create T5MultiheadAttention module
* add layer norm module for t5 model
* add t5 layer module that can be used for both encoder or decoder stack
* add t5 stack that can function as either the encoder or decoder of a t5 model

* add t5 model that can function as either encoder-only or encoder-decoder model (#1829)
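
For context on the first commit above, here is a minimal sketch of the relative position bucketing scheme from the T5 paper (Raffel et al., 2020); the function name, signature, and defaults are illustrative assumptions, not necessarily this PR's exact code:

```python
import math

import torch


def relative_position_bucket(relative_position: torch.Tensor,
                             bidirectional: bool = True,
                             num_buckets: int = 32,
                             max_distance: int = 128) -> torch.Tensor:
    """Map (key_pos - query_pos) offsets to bucket ids for the relative
    attention bias: exact buckets for nearby offsets, logarithmically
    spaced buckets for distant ones."""
    buckets = torch.zeros_like(relative_position)
    if bidirectional:
        # Encoder case: split the buckets between positive and negative offsets.
        num_buckets //= 2
        buckets += (relative_position > 0).long() * num_buckets
        n = relative_position.abs()
    else:
        # Decoder (causal) case: only past positions are attended to.
        n = torch.clamp(-relative_position, min=0)

    max_exact = num_buckets // 2
    is_small = n < max_exact
    # Log-spaced bucket index for large offsets, capped at the last bucket.
    large = max_exact + (
        torch.log(n.float() / max_exact)
        / math.log(max_distance / max_exact)
        * (num_buckets - max_exact)
    ).long()
    large = torch.clamp(large, max=num_buckets - 1)
    return buckets + torch.where(is_small, n, large)
```

A learned embedding over these bucket ids then yields the per-head bias added to the raw attention scores, which is what the second and third commits describe wiring into T5MultiheadAttention.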
@facebook-github-bot facebook-github-bot deleted the gh/pmabbo13/9/head branch August 18, 2022 14:20