Commit 5642f44
committed
Update on "Add generic fake quantized linear for QAT"
**Summary:** This commit adds a generic fake quantized linear module
to replace the uses of the existing more specific QAT linears.
For example, `Int8DynActInt4WeightQATLinear` can be expressed
as follows:
```
from torchao.quantization.prototype.qat.api import FakeQuantizeConfig
from torchao.quantization.prototype.qat.linear import FakeQuantizedLinear
activation_config = FakeQuantizeConfig(torch.int8, "per_token", is_symmetric=False)
weight_config = FakeQuantizeConfig(torch.int4, group_size=8)
fq_linear = FakeQuantizedLinear(16, 32, False, activation_config, weight_config)
```
The main motivation is to provide a more flexible way to perform
QAT on models with linear layers. Previously, we would have to
create a new linear class every time we wish to experiment with
different fake quantization settings, e.g. different group size
or different bit width. Now we can express this easily using a
single linear module.
**Test Plan:**
python test/quantization/test_qat.py -k test_fake_quantize_config_granularity
python test/quantization/test_qat.py -k test_fake_quantize_config_granularity_error_cases
python test/quantization/test_qat.py -k test_fake_quantize_config_mapping_type
python test/quantization/test_qat.py -k test_fake_quantized_linear_8da4w
python test/quantization/test_qat.py -k test_fake_quantized_linear_4w
[ghstack-poisoned]2 files changed
+11
-8
lines changed| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
1126 | 1126 | | |
1127 | 1127 | | |
1128 | 1128 | | |
| 1129 | + | |
1129 | 1130 | | |
1130 | 1131 | | |
1131 | 1132 | | |
| |||
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
58 | 58 | | |
59 | 59 | | |
60 | 60 | | |
| 61 | + | |
61 | 62 | | |
62 | 63 | | |
63 | 64 | | |
| |||
753 | 754 | | |
754 | 755 | | |
755 | 756 | | |
756 | | - | |
757 | | - | |
758 | | - | |
759 | | - | |
760 | | - | |
761 | | - | |
762 | | - | |
763 | | - | |
| 757 | + | |
| 758 | + | |
| 759 | + | |
| 760 | + | |
| 761 | + | |
| 762 | + | |
| 763 | + | |
| 764 | + | |
| 765 | + | |
764 | 766 | | |
765 | 767 | | |
766 | 768 | | |
| |||
0 commit comments