
Conversation

@Jack-Khuu (Contributor) commented Jul 29, 2024

This adds an initial index for model customization options accessible from the README.

This introduces a model_customization.md that will be iterated on and expanded over time.

Note that some of the content is extracted from, or inspired by, quantization.md and the ADVANCED_USERS docs, which are marked as outdated/unstable.

READMEs:
https://github.com/pytorch/torchchat/blob/readme-model-customization-guide/README.md
https://github.com/pytorch/torchchat/blob/readme-model-customization-guide/docs/model_customization.md

@pytorch-bot bot commented Jul 29, 2024

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/torchchat/962

Note: Links to docs will display an error until the docs builds have been completed.

❌ 1 New Failure, 2 Unrelated Failures

As of commit a340f0c with merge base 900b6d4:

NEW FAILURE - The following job has failed:

BROKEN TRUNK - The following jobs failed but were present on the merge base:

👉 Rebase onto the `viable/strict` branch to avoid these failures

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@facebook-github-bot added the CLA Signed label (managed by the Meta Open Source bot) on Jul 29, 2024
@Jack-Khuu merged commit d78710f into main on Jul 29, 2024
@digantdesai left a comment


Love it. Compile, though, is not really a model customization; to be more complete, it might be more comprehensive/useful to extend that category of this doc to encompass the full set of execution modes (et | aoti | eager | compile).

Also, it would be great to have a "support matrix" view of these four knobs for torchchat's flagship model, which I suppose would be llama3-8b.
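For reference, a minimal sketch of the four execution modes mentioned above (eager | compile | aoti | et). The flag names and the `llama3` model alias are assumptions based on torchchat's CLI around the time of this PR and may have changed; verify with `python3 torchchat.py --help`:

```bash
# Eager (the default): plain PyTorch execution.
python3 torchchat.py generate llama3 --prompt "Hello"

# Compile: JIT-compile with torch.compile on top of eager.
python3 torchchat.py generate llama3 --compile --prompt "Hello"

# AOTI: export an AOT Inductor shared library, then run against it.
python3 torchchat.py export llama3 --output-dso-path llama3.so
python3 torchchat.py generate llama3 --dso-path llama3.so --prompt "Hello"

# ET: export an ExecuTorch .pte program for on-device runtimes.
python3 torchchat.py export llama3 --output-pte-path llama3.pte
python3 torchchat.py generate llama3 --pte-path llama3.pte --prompt "Hello"
```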

@@ -0,0 +1,60 @@
# Model Customization

By default, torchchat (and PyTorch) uses unquantized [eager execution](https://pytorch.org/blog/optimizing-production-pytorch-performance-with-graph-transformations/).


Would it be helpful to be specific about the unquantized dtype, i.e. fp32, fp16, or bf16?

@Jack-Khuu (Contributor, Author) replied:

It's the checkpoint dtype, so it'll vary.
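Since the default dtype follows the checkpoint, an explicit override would look something like the sketch below; the `--dtype` flag and its accepted values are assumptions based on torchchat's CLI at the time (verify with `python3 torchchat.py --help`):

```bash
# Hedged sketch: force a specific compute dtype instead of inheriting
# the checkpoint's dtype. "--dtype" and the value "bf16" are assumptions.
python3 torchchat.py generate llama3 --dtype bf16 --prompt "Hello"
```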


This page goes over the different options torchchat provides for customizing model execution for inference.
- Device
- Compilation


IMHO, this doesn't fit very well in this otherwise awesome, high-level, almost-orthogonal categorization of optimization knobs. It could be just me.

@Jack-Khuu (Contributor, Author) replied Jul 29, 2024

Agreed: it's not an optimization knob, it's a model customization knob.

Luckily, quantization/optimization gets its own page for that.
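For concreteness, a hedged sketch of the two knobs the quoted doc lists (Device and Compilation), using assumed flag names (`--device`, `--compile`); check `python3 torchchat.py --help` for the actual CLI:

```bash
# Device: pick the hardware backend (e.g. cpu, cuda, mps).
python3 torchchat.py generate llama3 --device cuda --prompt "Hello"

# Compilation: add torch.compile on top of eager execution.
python3 torchchat.py generate llama3 --device cuda --compile --prompt "Hello"
```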

