README: Add a model customization guide #962
Conversation
🔗 Helpful Links: 🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/torchchat/962
Note: Links to docs will display an error until the docs builds have been completed.
❌ 1 New Failure, 2 Unrelated Failures as of commit a340f0c with merge base 900b6d4.
NEW FAILURE: the following job has failed.
BROKEN TRUNK: the following jobs failed but were present on the merge base. 👉 Rebase onto the `viable/strict` branch to avoid these failures.
This comment was automatically generated by Dr. CI and updates every 15 minutes.
digantdesai left a comment:
Love it. For compile: it is not really a model customization, so to be more complete it might be more comprehensive/useful to extend that category of this doc to encompass all four execution modes (et | aoti | eager | compile).
Also, it would be great to have a "support matrix" view of these four knobs for torchchat's flagship model, which I suppose is llama3-8b.
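A sketch of the shape such a support matrix could take (the execution modes are the four named in the comment above; the row labels and all cell values are placeholders, not verified support data):

```markdown
| Knob / Mode                | eager | compile | aoti | et |
|----------------------------|-------|---------|------|----|
| Device (cpu / cuda / mps)  |   ?   |    ?    |  ?   | ?  |
| dtype (fp32 / fp16 / bf16) |   ?   |    ?    |  ?   | ?  |
| Quantization               |   ?   |    ?    |  ?   | ?  |
```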
docs/model_customization.md (Outdated)

```diff
@@ -0,0 +1,60 @@
+# Model Customization
+
+By default, torchchat (and PyTorch) default to unquantized [eager execution](https://pytorch.org/blog/optimizing-production-pytorch-performance-with-graph-transformations/).
```
Would it be helpful to be specific about the unquantized dtype, i.e. fp32, fp16, or bf16?
It's the checkpoint dtype, so it'll vary.
```diff
+This page goes over the different options torchchat provides for customizing the model execution for inference.
+- Device
+- Compilation
```
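For context, the knobs listed in the doc surface as CLI flags on torchchat's `generate` command. A hedged sketch of how they might be combined (the flag names `--device` and `--compile` and the model alias `llama3` are assumptions based on the docs under review, not verified against this PR):

```shell
# Illustrative only: pick the execution device and opt into torch.compile
# at generation time. Run from a torchchat checkout.
python3 torchchat.py generate llama3 \
  --prompt "Hello, world" \
  --device cuda \
  --compile
```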
This doesn't fit very well, IMHO, in this otherwise awesome high-level categorization of almost orthogonal optimization knobs. It could be just me.
Agreed, it's not an optimization knob; it's a model customization knob.
Luckily, quantization/optimization gets its own page for that.
This adds an initial index of model customization options, accessible from the README.
It introduces a `model_customization.md` that will be iterated on and expanded over time.
Note that some of the content is extracted from, or inspired by, quantization.md and the ADVANCED_USERS docs, which are marked as outdated/unstable.

READMEs:
https://github.com/pytorch/torchchat/blob/readme-model-customization-guide/README.md
https://github.com/pytorch/torchchat/blob/readme-model-customization-guide/docs/model_customization.md