Commit 543209b
Add floating point options for autoquant and add accuracy measurement (#1355)
* Add floating point options for autoquant and add accuracy measurement
Summary:
* This PR adds float32/float16/bfloat16 as options for autoquant; each option converts the input, weight, bias, and output to the specified dtype.
* It also adds min_sqnr (https://en.wikipedia.org/wiki/Signal-to-quantization-noise_ratio), which lets users filter out quantization methods whose output deviates too much numerically from the original output.
Note that the accuracy measurement currently uses randomly generated input activations; this can be improved by adding support for real inputs.
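To illustrate the min_sqnr idea, here is a minimal sketch of computing SQNR in decibels and dropping candidates that fall below a threshold. It uses plain Python lists rather than tensors so it runs without extra dependencies. The helper names `sqnr_db` and `filter_by_min_sqnr` are hypothetical and are not taken from torchao; the formula is the standard definition, 10 * log10(signal power / noise power).

```python
import math


def sqnr_db(reference, candidate):
    """Signal-to-quantization-noise ratio in dB (standard definition).

    `reference` is the original output, `candidate` is the output of a
    quantized (or lower-precision) variant. Hypothetical helper, not torchao API.
    """
    signal_power = sum(r * r for r in reference)
    noise_power = sum((r - c) ** 2 for r, c in zip(reference, candidate))
    if noise_power == 0.0:
        return math.inf  # identical outputs: no quantization noise at all
    if signal_power == 0.0:
        return -math.inf  # all-zero reference but non-zero error
    return 10.0 * math.log10(signal_power / noise_power)


def filter_by_min_sqnr(reference, candidates, min_sqnr):
    """Keep only the candidate names whose SQNR is at least `min_sqnr` dB."""
    return [
        name
        for name, output in candidates.items()
        if sqnr_db(reference, output) >= min_sqnr
    ]


reference = [1.0, -2.0, 3.0, -4.0]
candidates = {
    "close": [1.01, -2.0, 3.0, -4.0],  # small error, high SQNR
    "far": [0.0, 0.0, 0.0, 0.0],       # error as large as the signal, about 0 dB
}
kept = filter_by_min_sqnr(reference, candidates, min_sqnr=20.0)
```

A higher threshold is stricter: it keeps only methods whose output stays close to the original, at the cost of ruling out more aggressive quantization options.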
Test Plan:
python generate.py --checkpoint_path $CHECKPOINT_PATH/$MODEL_REPO/model.pth --compile --compile_prefill --quantization autoquant-fp
Reviewers:
Subscribers:
Tasks:
Tags:
* update docstring
* fix
* ruff
* skip if no cuda
1 parent 04a25e7 commit 543209b
File tree
5 files changed, +190 -19 lines changed
- examples/sam2_amg_server
- test/integration
- torchao
- _models/llama
- quantization