
Conversation


@xin3he xin3he commented Nov 7, 2023

Type of Change

feature

Description

Hugging Face GPTQ models are typically compressed with the popular repo https://github.com/qwopqwop200/GPTQ-for-LLaMa.
This PR adds an argument, get_compressed_model(use_hf_format=True), to align our exported format with that repo's.

The main changes for this argument are as follows:

1. compression_dim: weight = 1, zeros = 0, and both are transposed.
2. zeros -= 1 before compression, to match the reference repo's packing convention (the stored zeros are offset by -1 and the offset is undone at dequantization).
3. g_idx: store the same group index for every channel in a group instead of recording the channel order.
4. Parameter names changed, e.g. 'packed_weight' -> 'qweight'.
5. zeros are always stored, even for sym quantization.
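The g_idx and zeros conventions in points 2 and 3 can be sketched as follows. This is a minimal illustration with assumed shapes and an assumed example zero point, not the PR's actual packing code:

```python
import numpy as np

# Assumed example dimensions for illustration only.
in_features = 512
group_size = 128

# Point 3: in the HF/GPTQ-for-LLaMa format, g_idx maps each input channel
# to its group index (0, 0, ..., 0, 1, 1, ...), rather than recording a
# per-channel reordering.
g_idx = np.arange(in_features) // group_size

# Point 2: zeros are stored with a -1 offset before packing; the offset is
# added back at dequantization, so the round trip is lossless.
zero_point = 8                      # hypothetical 4-bit zero point
stored_zero = zero_point - 1        # what lands in qzeros
restored_zero = stored_zero + 1     # recovered at dequantization
```

With group_size = 128, channels 0..127 share group index 0, channels 128..255 share group index 1, and so on; the stored zero round-trips back to the original value.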

Expected Behavior & Potential Risk

Unit tests pass.

How has this PR been tested?

Tested locally.

Dependency Change?

N/A

xin3he added 21 commits November 7, 2023 21:58
Signed-off-by: Xin He <[email protected]>
@chensuyue chensuyue added the enhancement New feature or request label Nov 14, 2023
@chensuyue chensuyue merged commit 5179da1 into master Nov 17, 2023
@chensuyue chensuyue deleted the xinhe/hf_format branch November 17, 2023 06:27
@xin3he xin3he changed the title add use_HF_format for export_compressed_model add use_hf_format for export_compressed_model Nov 22, 2023
