[Feature] Add ability to load HF checkpoints into T5 model #1918
Nayef211
left a comment
LGTM! Thanks for adding this capability @joecummings! I would recommend pulling in these changes and ensuring it works with internal Meta infra before merging in the PR. 😄
) -> T5Model:
    """Build T5Model model from a HuggingFace checkpoint.
    Note: Only works with Huggingface models saved in the PyTorch format. Will not work \
nit: do we need the \ in the docstrings?
Sorry - not entirely sure what you're referring to here?
Sorry, updated the comment. Looks like backslash characters don't show up when you put quotes around them, which was a noob mistake 😅
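For context on the nit above, here is a minimal sketch of what a trailing backslash does inside a regular (non-raw) triple-quoted docstring: Python treats backslash-newline as a line continuation and drops the newline from the string. The function names and docstring text are hypothetical and are not taken from the PR.

```python
def joined():
    """First half of a long sentence, \
    second half of the same sentence."""


def split():
    """First half of a long sentence,
    second half of the same sentence."""


# With the backslash, the line break is removed from the docstring itself.
print(repr(joined.__doc__))

# Without it, the docstring keeps the newline between the two lines.
print(repr(split.__doc__))
```

So the backslash only matters if the rendered docstring should read as one unbroken line. Note that the indentation of the continuation line stays in the joined string as extra spaces.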
fbd2af3 to a624031
Add the ability to load a T5 model from pretrained HuggingFace weights
Changes
Add build_model_from_huggingface_ckpt static method to the T5Bundler.
Testing
Future considerations: This does not yet support loading from external URLs. I plan to add this in a follow-up Diff from the internal FB side so I can test with Manifold (and possibly GDrive, static public links, GitHub).
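Since the loader is limited to local checkpoints saved in the PyTorch format, a caller may want to check a directory before handing it over. The sketch below is an illustration only, not code from this PR: `find_pytorch_weights` is a hypothetical helper, and it assumes the default file names Hugging Face transformers uses when saving weights (`pytorch_model.bin` for PyTorch, `tf_model.h5` for TensorFlow, `flax_model.msgpack` for Flax).

```python
from pathlib import Path

# Assumed default weight file names written by Hugging Face transformers.
PYTORCH_WEIGHTS = "pytorch_model.bin"
OTHER_FORMATS = {"tf_model.h5": "TensorFlow", "flax_model.msgpack": "Flax"}


def find_pytorch_weights(ckpt_dir):
    """Hypothetical helper: return the PyTorch weights path in a local directory.

    Raises FileNotFoundError if the directory does not exist and ValueError
    if it holds no PyTorch-format weights.
    """
    ckpt_dir = Path(ckpt_dir)
    if not ckpt_dir.is_dir():
        raise FileNotFoundError(f"{ckpt_dir} is not a local directory")

    weights = ckpt_dir / PYTORCH_WEIGHTS
    if weights.is_file():
        return weights

    # Name any other framework's weights that are present, to make the error useful.
    found = [fw for name, fw in OTHER_FORMATS.items() if (ckpt_dir / name).is_file()]
    hint = f" (found {', '.join(found)} weights instead)" if found else ""
    raise ValueError(f"No {PYTORCH_WEIGHTS} in {ckpt_dir}{hint}")
```

Failing early with a message that names the format actually found is easier to debug than an error raised partway through loading a state dict.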