Commit fe2f567
Sync External Code (#9)
* Fix rtn tuning_device issue (#893)
Signed-off-by: Kaihui-intel <[email protected]>
* fix vlm gguf ut (#895)
Signed-off-by: n1ck-guo <[email protected]>
* update alg_ext.abi3.so with python compatible version (#894)
* move ste from quant to round for nvfp4 (#889)
Signed-off-by: He, Xin3 <[email protected]>
* Add GPT-OSS quant support (#887)
* better help printing information (#883)
* better help printing information
Signed-off-by: n1ck-guo <[email protected]>
* speedup quant and evaluation, fix recompile issue (#897)
* rewrite the implementation for ease-of-maintain
Signed-off-by: He, Xin3 <[email protected]>
* fix bug
Signed-off-by: He, Xin3 <[email protected]>
* fix quant performance
Signed-off-by: He, Xin3 <[email protected]>
* Update auto_round/compressors/base.py
---------
Signed-off-by: He, Xin3 <[email protected]>
* fix nvfp act quantization bug (#891)
* fix nvfp act quantization bug
Signed-off-by: Zhang, Weiwei1 <[email protected]>
* add cuda ut for moe nvfp quantize
Signed-off-by: Zhang, Weiwei1 <[email protected]>
* add cpu UT, refine cuda UT
Signed-off-by: Zhang, Weiwei1 <[email protected]>
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* fix ut typo
Signed-off-by: Zhang, Weiwei1 <[email protected]>
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* fix cpu ut
Signed-off-by: Zhang, Weiwei1 <[email protected]>
* enhance experts amax match, refine UT
Signed-off-by: Zhang, Weiwei1 <[email protected]>
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
---------
Signed-off-by: Zhang, Weiwei1 <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
* support automatic mixed bits assignment (#851)
* try to fix gguf issue (#886)
* remove numba from requirements (#905)
Signed-off-by: yiliu30 <[email protected]>
* Extend mxfp loading dtypes (#907)
* block dataset logger info (#908)
Signed-off-by: n1ck-guo <[email protected]>
* fix torch compile issue in AutoScheme (#909)
* Revert "Extend mxfp loading dtypes (#907)" (#915)
This reverts commit 0c2619c.
* support disable_opt_rtn in auto-scheme (#913)
* fix llama 4 ut (#896)
* fix ut of llama 4
Signed-off-by: n1ck-guo <[email protected]>
* add numba for cpu lib (#919)
Signed-off-by: yiliu30 <[email protected]>
* Loosen the packing restrictions for mxfp&nvfp (#911)
* Loosen the packing restrictions for mxfp&nvfp, enable Qwen1.5-MoE-A2.7B quantize
Signed-off-by: Zhang, Weiwei1 <[email protected]>
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* fix UT
Signed-off-by: Zhang, Weiwei1 <[email protected]>
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* refine mxfp&nvfp layer checker
Signed-off-by: Zhang, Weiwei1 <[email protected]>
* fix pylint
Signed-off-by: Zhang, Weiwei1 <[email protected]>
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
---------
Signed-off-by: Zhang, Weiwei1 <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
* Extend mxfp loading dtypes (#916)
Signed-off-by: root <[email protected]>
Signed-off-by: yiliu30 <[email protected]>
Co-authored-by: root <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
* Fix act config exporting for mixed schemes (#903)
* fp8 exporting bugfix
Signed-off-by: Zhang, Weiwei1 <[email protected]>
* fix act related config saving
Signed-off-by: Zhang, Weiwei1 <[email protected]>
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* add ut for act_config check
Signed-off-by: Zhang, Weiwei1 <[email protected]>
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* refine extra_config saving, add UTs
Signed-off-by: Zhang, Weiwei1 <[email protected]>
* fix ut typo
Signed-off-by: Zhang, Weiwei1 <[email protected]>
* fix ut typo
Signed-off-by: Zhang, Weiwei1 <[email protected]>
* fixtypo
Signed-off-by: Zhang, Weiwei1 <[email protected]>
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* fix CI
Signed-off-by: Zhang, Weiwei1 <[email protected]>
* fix scan issue
Signed-off-by: Zhang, Weiwei1 <[email protected]>
* fix scan issue
Signed-off-by: Zhang, Weiwei1 <[email protected]>
* rm global variable
Signed-off-by: Zhang, Weiwei1 <[email protected]>
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* rerun ut
Signed-off-by: Zhang, Weiwei1 <[email protected]>
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* refine ut
Signed-off-by: Zhang, Weiwei1 <[email protected]>
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
---------
Signed-off-by: Zhang, Weiwei1 <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
* optimize rtn for int woq (#924)
* fix bug of gguf and support for LiquidAI/LFM2-1.2B (#927)
Signed-off-by: n1ck-guo <[email protected]>
* remove numpy<2.0 limitation (#921)
* enable regex quantization config saving for mixed bits (#825)
* enable dynamic quantization config saving
Signed-off-by: Zhang, Weiwei1 <[email protected]>
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* fixtypo
Signed-off-by: Zhang, Weiwei1 <[email protected]>
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* rebase code, refine config saving
Signed-off-by: Zhang, Weiwei1 <[email protected]>
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* refine ut
Signed-off-by: Zhang, Weiwei1 <[email protected]>
* fix UT
Signed-off-by: Zhang, Weiwei1 <[email protected]>
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* enable hf loading for regex, add UTs
Signed-off-by: Zhang, Weiwei1 <[email protected]>
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* refine export, enhance gptq UT
Signed-off-by: Zhang, Weiwei1 <[email protected]>
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
---------
Signed-off-by: Zhang, Weiwei1 <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
* Fix Flux tuning issue (#936)
Signed-off-by: Mengni Wang <[email protected]>
* gguf support for inclusionAI/Ling-flash-2.0 (#940)
* remove low_cpu_mem (#934)
* Add compatibility test (#918)
* Add commit hash to version (#941)
Signed-off-by: Sun, Xuehao <[email protected]>
* gguf weight type align with original, output.weight, token_embed (#900)
* support attention mask in user's dataset (#930)
* Add diffusion README (#923)
* update readme (#949)
* refactor utils file (#943)
* refact utils
Signed-off-by: n1ck-guo <[email protected]>
* update readme for sglang support (#953)
* update readme for sglang support
Signed-off-by: Zhang, Weiwei1 <[email protected]>
* refine doc
Signed-off-by: Zhang, Weiwei1 <[email protected]>
* Update README.md
---------
Signed-off-by: Zhang, Weiwei1 <[email protected]>
Co-authored-by: Wenhua Cheng <[email protected]>
* update gguf and support for CompressedLinear (#950)
* Reduce AutoScheme VRAM usage by up to 10X (#944)
* add self attribution and fix avg_bits error (#956)
* add self attribution and fix avg_bits error
---------
Signed-off-by: He, Xin3 <[email protected]>
Co-authored-by: Wenhua Cheng <[email protected]>
* add logo (#960)
* refine AutoScheme readme/code (#958)
* update readme (#962)
* fix critic disable_opt_rtn regression (#963)
* [1/N] Initial vllm-ext evaluation support (MXFP4 MOE) (#935)
Signed-off-by: yiliu30 <[email protected]>
* fix bug of imatrix contains 0 (#955)
* fix rtn bug (#966)
* enhance flux doc (#967)
* clean code (#968)
* support for model scope (#957)
* support for model scope
Signed-off-by: n1ck-guo <[email protected]>
* merge main branch to alg_ext (#970)
* fix cuda CI backend issue, fixtypo (#974)
* disable compile packing by default (#975)
Signed-off-by: yiliu30 <[email protected]>
* enhance auto device map and support XPU (#961)
* enhance auto device map and support XPU
---------
Signed-off-by: He, Xin3 <[email protected]>
* refine readme (#978)
* cli support for positional arguments model (#979)
Signed-off-by: n1ck-guo <[email protected]>
* update bits (#986)
Signed-off-by: He, Xin3 <[email protected]>
* fix gguf scheme and device_map bug (#969)
* add support for Magistral-Small (#980)
* support model_dtype and fix bug of scheme contains quotes, mllm eval (#985)
---------
Signed-off-by: Kaihui-intel <[email protected]>
Signed-off-by: n1ck-guo <[email protected]>
Signed-off-by: He, Xin3 <[email protected]>
Signed-off-by: Zhang, Weiwei1 <[email protected]>
Signed-off-by: yiliu30 <[email protected]>
Signed-off-by: root <[email protected]>
Signed-off-by: Mengni Wang <[email protected]>
Signed-off-by: Sun, Xuehao <[email protected]>
Co-authored-by: Tang Kaihui <[email protected]>
Co-authored-by: Heng Guo <[email protected]>
Co-authored-by: Xin He <[email protected]>
Co-authored-by: Yi Liu <[email protected]>
Co-authored-by: Weiwei <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Wenhua Cheng <[email protected]>
Co-authored-by: root <[email protected]>
Co-authored-by: Wang, Mengni <[email protected]>
Co-authored-by: Sun, Xuehao <[email protected]>
1 parent 3243987 commit fe2f567
File tree
119 files changed
+13340
-8422
lines changed
- .azure-pipelines
- scripts/ut
- template
- auto_round_extension
- torch
- triton
- vllm_ext
- tests
- auto_round
- auto_scheme
- compressors
- diffusion
- mllm
- data_type
- eval
- export
- export_to_autogptq
- export_to_autoround
- export_to_awq
- export_to_gguf
- export_to_itrex
- export_to_llmcompressor
- inference
- low_cpu_mem
- modelling
- utils
- docs
- imgs
- test
- test_cpu
- test_cuda