Updating T5 demo to use beam search for generator #1869
Conversation
from torch import Tensor
from torchtext.prototype.models import T5Model
One thing to note is that when generating the first tokens of the sequences, decoder_tokens has shape (batch_size, 1). Since we are using a beam search, at that first iteration each sequence has k candidate tokens with which to start (since we choose the top k tokens to expand a given sequence). Because the decoder expects decoder_tokens to be 2D, with one sequence per row, we treat each beam as its own sequence, so decoder_tokens now has shape (batch_size * beam_size, 2), where the first k rows are the beams belonging to the original sequence 1, the next k rows are the beams belonging to the original sequence 2, etc.
Since the decoder also requires the encoder outputs as an input argument, we must likewise reshape the encoder output, since it only holds the output for batch_size sequences. Lines 239-244 define a new order in which each encoder output is repeated k times. This means that along dim=0, the first k indices contain the encoder output for the original sequence 1, the next k for the original sequence 2, etc. That way, as we pass in each beam as its own sequence in decoder_tokens, the decoder has the correct corresponding encoder output for that sequence.
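As a minimal sketch of that reshaping (tensor names and sizes here are illustrative, not the tutorial's exact code), torch.repeat_interleave produces exactly this ordering:

    import torch

    batch_size, beam_size, src_len, d_model = 2, 3, 8, 512  # illustrative sizes

    # Encoder output for the original batch: one row of hidden states per input sequence.
    encoder_output = torch.randn(batch_size, src_len, d_model)

    # After the first expansion each original sequence owns beam_size beams,
    # flattened into the batch dimension: (batch_size * beam_size, 2).
    decoder_tokens = torch.zeros(batch_size * beam_size, 2, dtype=torch.long)

    # Repeat each encoder output beam_size times along dim=0 so that each row of
    # decoder_tokens lines up with the encoder output of its original sequence:
    # rows 0..k-1 -> sequence 1, rows k..2k-1 -> sequence 2, etc.
    encoder_output = encoder_output.repeat_interleave(beam_size, dim=0)

    assert encoder_output.shape[0] == decoder_tokens.shape[0]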
) -> Tensor:

def beam_search(
    beam_size: int,
    step: int,
step here is equivalent to the current length of the sequences in decoder_tokens. The first time beam_search is called, step=1 because decoder_tokens is initialized with the padding token as the start token of each sequence.
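For illustration, the initialization implied here could look like the following (the padding index and tensor names are assumptions, not the tutorial's exact code):

    import torch

    batch_size = 4
    padding_idx = 0  # assumed id of the padding token used as the start token

    # Every sequence starts as just the padding token, so the current sequence
    # length -- and therefore step on the first call to beam_search -- is 1.
    decoder_tokens = torch.full((batch_size, 1), padding_idx, dtype=torch.long)
    step = decoder_tokens.shape[1]  # == 1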
parmeet
left a comment
LGTM!
Description
Update T5 tutorial to use a beam search for decoding, as opposed to a greedy search.
Process
A beam search was implemented, which keeps track of the log probability of multiple candidate sequences generated for a single input sequence and prunes them at each iteration, keeping only the top k most likely. This allows the generator to create sequences that are more probable than those generated by a greedy search. We find that the sequences generated via beam search tend to be longer, but still mostly capture the main points expressed in the target summaries. We expect further improvement in the summaries with the addition of constraints such as a length penalty, an n-gram repetition limit, and minimum/maximum lengths.
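To make the pruning step concrete, below is a minimal sketch of one beam-search step over per-token log probabilities; the function and variable names are illustrative, not the tutorial's actual implementation:

    import torch

    def beam_search_step(log_probs, beam_scores, beam_size):
        """One pruning step: expand every beam by the full vocabulary, then
        keep only the top-k (beam, token) continuations per input sequence.

        log_probs:   (batch * beam, vocab) log probabilities for the next token
        beam_scores: (batch * beam,) cumulative log probability of each beam
        """
        num_beams, vocab = log_probs.shape
        batch = num_beams // beam_size

        # Cumulative score of every candidate continuation of every beam.
        scores = beam_scores.unsqueeze(-1) + log_probs        # (batch * beam, vocab)
        scores = scores.view(batch, beam_size * vocab)        # all candidates per input

        # Prune: keep only the beam_size most likely continuations per input.
        top_scores, top_idx = scores.topk(beam_size, dim=-1)  # (batch, beam_size)
        beam_idx = top_idx // vocab    # which beam each kept candidate extends
        token_idx = top_idx % vocab    # which token it appends to that beam

        return top_scores, beam_idx, token_idx

A full implementation also treats the first step specially (all beams of a sequence start identical, so only one should be expanded), and this is where constraints such as a length penalty would be applied.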
Testing
We tested the logic by ascertaining that the sequences generated when beam_size=1 were the same as those generated by a greedy decoder --> see the Generate Summaries section of this notebook.
Run BUILD_GALLERY=1 make 'SPHINXOPTS=-W' html in docs and review the rendered document in docs/build/html/tutorials/cnndm_summarization.html.
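As a toy illustration of why that check holds, pruning with beam_size=1 keeps exactly the argmax token at each step; reusing the hypothetical beam_search_step sketch from the Process section above:

    import torch

    log_probs = torch.log_softmax(torch.randn(1, 10), dim=-1)  # one beam, vocab of 10
    beam_scores = torch.zeros(1)
    top_scores, beam_idx, token_idx = beam_search_step(log_probs, beam_scores, beam_size=1)
    assert token_idx.item() == log_probs.argmax().item()  # same token a greedy decoder picks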