Encoding a matrix with uint1 weights does not work correctly: two different uint1 matrices produce the same result after transform_weight. Sample matrices are a row of eight 1s followed by zeros, and a row of nine 1s followed by zeros.
In both cases the result is [15, 15, 0, 0], even though in the second case it should probably be [31, 15, 0, 0].
Minimal example to reproduce / illustrate:
```python
import bitblas
import torch

M = 1
NK = 32
matmul_config = bitblas.MatmulConfig(
    M=M,                    # M dimension
    N=NK,                   # N dimension
    K=NK,                   # K dimension
    A_dtype="float16",      # activation A dtype
    W_dtype="uint1",        # weight W dtype
    accum_dtype="float32",  # accumulation dtype
    out_dtype="float16",    # output dtype
    layout="nt",            # "nt": A is non-transposed, W is transposed
    with_bias=False,        # bias
    # configs for weight-only quantization
    group_size=None,        # setting for grouped quantization
    with_scaling=False,     # setting for scaling factor
    with_zeros=False,       # setting for zeros
    zeros_mode=None,        # setting for how to calculate zeros
)
matmul = bitblas.Matmul(config=matmul_config)

t1 = torch.zeros((NK, NK), dtype=torch.uint8).cuda()
t1[0, 0:8] += 1          # eight 1s followed by zeros
out1 = matmul.transform_weight(t1)

t2 = t1.clone()
t2[0, 8] += 1            # nine 1s followed by zeros
out2 = matmul.transform_weight(t2)

print(torch.equal(out1, out2))  # This should not be True, but it is
print(torch.equal(t1, t2))      # False: the inputs do differ
```
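For reference, a plain little-endian bit-packing of the two first rows (a generic sketch only; `pack_bits` is a hypothetical helper, and BitBLAS's actual transform_weight layout may interleave bits differently) shows that the packed outputs must at least differ between the two inputs:

```python
def pack_bits(bits):
    """Pack a list of 0/1 values into bytes, LSB-first within each byte."""
    out = []
    for i in range(0, len(bits), 8):
        byte = 0
        for j, b in enumerate(bits[i:i + 8]):
            byte |= (b & 1) << j
        out.append(byte)
    return out

row1 = [1] * 8 + [0] * 24   # eight 1s followed by zeros
row2 = [1] * 9 + [0] * 23   # nine 1s followed by zeros

print(pack_bits(row1))  # [255, 0, 0, 0]
print(pack_bits(row2))  # [255, 1, 0, 0] -- distinct from row1's packing
```

Whatever packing convention transform_weight uses, distinct uint1 inputs should map to distinct packed outputs; here they do not.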
Tested with bitblas==0.1.0.post1 and bitblas==0.1.0 from pip on:
- NVIDIA A100-SXM4-80GB, CUDA 12.7, Python 3.12.11
- NVIDIA GeForce RTX 2080, CUDA 12.4, Python 3.11.6