Support Blockwise Quantization #1181

DaniAffCH · 2024-05-29T08:42:33Z

This PR contains the test data necessary to verify the correctness of blockwise quantization introduced in opencv/opencv#25644

[GSoC] dnn: Blockwise quantization support #25644 This PR introduces blockwise quantization in DNN allowing the parsing of ONNX models quantized in blockwise style. In particular it modifies the `Quantize` and `Dequantize` operations. The related PR opencv/opencv_extra#1181 contains the test data. Additional notes: - The original quantization issue has been fixed. Previously, for 1D scale and zero-point, the operation applied was $y = int8(x/s - z)$ instead of $y = int8(x/s + z)$. Note that the operation was already correctly implemented when the scale and zero-point were scalars. The previous implementation failed the ONNX test cases, but now all have passed successfully. [Reference](https://github.com/onnx/onnx/blob/main/docs/Operators.md#QuantizeLinear) - the function `block_repeat` broadcasts scale and zero-point to the input shape. It repeats all the elements of a given axis n times. This function generalizes the behavior of `repeat` from the core module which is defined just for 2 axis assuming `Mat` has 2 dimensions. If appropriate and useful, you might consider moving `block_repeat` to the core module. - Now, the scale and zero-point can be taken as layer inputs. This increases the ONNX layers' coverage and enables us to run the ONNX test cases (previously disabled) being fully compliant with ONNX standards. Since they are now supported, I have enabled the test cases for: `test_dequantizelinear`, `test_dequantizelinear_axis`, `test_dequantizelinear_blocked`, `test_quantizelinear`, `test_quantizelinear_axis`, `test_quantizelinear_blocked` just in CPU backend. All of them pass successfully. ### Pull Request Readiness Checklist See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request - [x] I agree to contribute to the project under Apache 2 License. - [x] To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV - [x] The PR is proposed to the proper branch - [ ] There is a reference to the original bug report and related work - [x] There is accuracy test, performance test and test data in opencv_extra repository, if applicable Patch to opencv_extra has the same branch name. - [x] The feature is well documented and sample code can be built with the project CMake

[GSoC] dnn: Blockwise quantization support opencv#25644 This PR introduces blockwise quantization in DNN allowing the parsing of ONNX models quantized in blockwise style. In particular it modifies the `Quantize` and `Dequantize` operations. The related PR opencv/opencv_extra#1181 contains the test data. Additional notes: - The original quantization issue has been fixed. Previously, for 1D scale and zero-point, the operation applied was $y = int8(x/s - z)$ instead of $y = int8(x/s + z)$. Note that the operation was already correctly implemented when the scale and zero-point were scalars. The previous implementation failed the ONNX test cases, but now all have passed successfully. [Reference](https://github.com/onnx/onnx/blob/main/docs/Operators.md#QuantizeLinear) - the function `block_repeat` broadcasts scale and zero-point to the input shape. It repeats all the elements of a given axis n times. This function generalizes the behavior of `repeat` from the core module which is defined just for 2 axis assuming `Mat` has 2 dimensions. If appropriate and useful, you might consider moving `block_repeat` to the core module. - Now, the scale and zero-point can be taken as layer inputs. This increases the ONNX layers' coverage and enables us to run the ONNX test cases (previously disabled) being fully compliant with ONNX standards. Since they are now supported, I have enabled the test cases for: `test_dequantizelinear`, `test_dequantizelinear_axis`, `test_dequantizelinear_blocked`, `test_quantizelinear`, `test_quantizelinear_axis`, `test_quantizelinear_blocked` just in CPU backend. All of them pass successfully. ### Pull Request Readiness Checklist See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request - [x] I agree to contribute to the project under Apache 2 License. - [x] To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV - [x] The PR is proposed to the proper branch - [ ] There is a reference to the original bug report and related work - [x] There is accuracy test, performance test and test data in opencv_extra repository, if applicable Patch to opencv_extra has the same branch name. - [x] The feature is well documented and sample code can be built with the project CMake

add blockwise quantization test data

408b752

DaniAffCH mentioned this pull request May 29, 2024

[GSoC] dnn: Blockwise quantization support opencv/opencv#25644

Merged

6 tasks

update onnx test cases

ce14f47

asmorkalov merged commit 9d0d24d into opencv:4.x Jul 30, 2024

asmorkalov mentioned this pull request Aug 6, 2024

5.x merge 4.x #1199

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Support Blockwise Quantization #1181

Support Blockwise Quantization #1181

Uh oh!

DaniAffCH commented May 29, 2024

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Support Blockwise Quantization #1181

Support Blockwise Quantization #1181

Uh oh!

Conversation

DaniAffCH commented May 29, 2024

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants