This repository was archived by the owner on Sep 10, 2025. It is now read-only.

Contributor

@pmabbo13 pmabbo13 commented Jul 28, 2022

Description

Add a tutorial which demonstrates how to use the T5 model for text summarization.

Process

Demonstrate the end-to-end process for generating text summaries of articles from the CNNDM dataset using TorchText's pre-trained T5 model with the base configuration. The tutorial shows how to:

  1. Initialize the text pre-processing pipeline
  2. Load the CNNDM dataset and process its text to include the task prefix needed for the model
  3. Create a generator that iteratively calls the decoder to extend the output sequences until every sequence in the batch has generated the end-of-sequence token. A greedy search is used to predict the next token in each sequence
  4. Load the pre-trained T5 model and call the generator to produce summaries for a small batch of articles taken from the CNNDM test set
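
For readers skimming this PR, steps 2 and 3 can be sketched in plain Python as follows. This is a minimal illustration, not the tutorial's code: `add_task_prefix`, `greedy_generate`, and `toy_decoder` are hypothetical names, the token ids `PAD = 0` and `EOS = 1` are assumptions, and the toy decoder stands in for the real T5 model, which is not loaded here.

```python
PAD = 0  # assumed padding token id, also used here to start decoding
EOS = 1  # assumed end-of-sequence token id


def add_task_prefix(articles, prefix="summarize: "):
    """Prepend the task prefix so the model knows which task to perform."""
    return [prefix + article for article in articles]


def greedy_generate(decoder_step, batch_size, max_len=20):
    """Extend every sequence by one token per step until all have emitted EOS.

    `decoder_step` is a hypothetical callable: it takes the batch of token-id
    sequences and returns one list of vocabulary scores per sequence.
    """
    sequences = [[PAD] for _ in range(batch_size)]
    finished = [False] * batch_size
    for _ in range(max_len):
        all_scores = decoder_step(sequences)
        for i, scores in enumerate(all_scores):
            if finished[i]:
                # Pad finished rows so the batch stays rectangular.
                sequences[i].append(PAD)
                continue
            # Greedy search: pick the single highest-scoring token.
            next_token = max(range(len(scores)), key=lambda t: scores[t])
            sequences[i].append(next_token)
            finished[i] = next_token == EOS
        if all(finished):
            break
    return sequences


def toy_decoder(sequences):
    """Stand-in for a real decoder: row i emits token 2 for i + 1 steps, then EOS."""
    all_scores = []
    for i, seq in enumerate(sequences):
        target = 2 if len(seq) <= i + 1 else EOS
        all_scores.append([1.0 if t == target else 0.0 for t in range(4)])
    return all_scores


print(add_task_prefix(["Some article text."]))  # ['summarize: Some article text.']
print(greedy_generate(toy_decoder, batch_size=2))  # [[0, 2, 1, 0], [0, 2, 2, 1]]
```

The loop stops only when every row has produced the end-of-sequence token, so shorter outputs are padded while longer ones keep growing. `max_len` is an added safeguard in this sketch against a decoder that never emits that token.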

Test

Run BUILD_GALLERY=1 make 'SPHINXOPTS=-W' html in docs and review the rendered document at docs/build/html/tutorials/cnndm_summarization.html

Follow-Up

@pmabbo13 pmabbo13 marked this pull request as ready for review August 1, 2022 15:25
Contributor

@parmeet parmeet left a comment

Overall, LGTM! Thank you @pmabbo13 for adding this end-to-end tutorial; it will be quite helpful for users onboarding to the T5 model. Let's make sure the tutorial can be built locally before merging :).

Contributor

@Nayef211 Nayef211 left a comment

LGTM! Great work on adding a descriptive tutorial that is very easy to follow!

@pmabbo13 pmabbo13 merged commit 466f2e2 into pytorch:main Aug 2, 2022
@pmabbo13 pmabbo13 deleted the feature/t5-demo branch August 2, 2022 15:34