Closed
Labels
- feature request: New feature or request. This includes new model, dtype, functionality support.
- triaged: Issue has been triaged by maintainers.
Description
An error occurred while quantizing the bf16 model, in `tensorrt_llm_0.5.0/examples/gpt/weight.py:265`:

```python
elif use_weight_only:
    processed_torch_weights, torch_weight_scales = torch.ops.fastertransformer.symmetric_quantize_last_axis_of_batched_matrix(
        torch.tensor(t), plugin_weight_only_quant_type)
```

Error:

```
TypeError: can't convert np.ndarray of type numpy.void. The only supported types are: float64, float32, float16, complex64, complex128, int64, int32, int16, int8, uint8, and bool.
```
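The error suggests `torch.tensor(t)` is being handed a numpy array whose dtype numpy reports as a 2-byte `void` type, which typically happens because numpy has no native bfloat16 dtype, so bf16 weights loaded from disk arrive as raw 2-byte records. A possible workaround (a sketch, not the maintainers' fix) is to reinterpret the raw bits as `int16`, hand that to torch, and then view the result as `torch.bfloat16`; the helper name `bf16_ndarray_to_torch` below is hypothetical, and the snippet assumes a little-endian layout matching how the bf16 file was written:

```python
import numpy as np
import torch


def bf16_ndarray_to_torch(t: np.ndarray) -> torch.Tensor:
    """Reinterpret a bf16 numpy array stored as a 2-byte void dtype
    as a torch.bfloat16 tensor, without copying or altering the bits."""
    assert t.dtype.itemsize == 2, "expected 2-byte (bf16) elements"
    # numpy cannot represent bf16, but it can view the same buffer as int16,
    # which torch.from_numpy accepts; Tensor.view(dtype) then reinterprets
    # those 16-bit elements as bfloat16 bit-for-bit.
    return torch.from_numpy(t.view(np.int16)).view(torch.bfloat16)


# Example: simulate a bf16 weight array by truncating float32 to its top
# 16 bits (the bf16 representation), stored under a void dtype as numpy
# would report it after loading raw bf16 bytes.
vals = np.array([1.0, -2.0, 0.5], dtype=np.float32)
bf16_as_void = (vals.view(np.uint32) >> 16).astype(np.uint16).view("V2")
recovered = bf16_ndarray_to_torch(bf16_as_void)
# recovered.float() -> tensor([ 1.0000, -2.0000,  0.5000])
```

The quantization op in the traceback may still require an fp16/fp32 input, so a follow-up cast such as `recovered.to(torch.float16)` might be needed before calling `symmetric_quantize_last_axis_of_batched_matrix`.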