Quantize NafNet deblurring: ORT pipeline + validation script + robust handling #299
Summary
Adds ONNX Runtime (ORT) static quantization support for deblurring_nafnet, plus a validation script that compares FP32 and INT8 outputs with PSNR/SSIM and timing.
Improves quantization robustness and memory handling across the tools.
Changes
tools/quantize/quantize-ort.py
Adds deblurring_nafnet entry to the central models dict.
Uses QuantFormat.QOperator + op_types_to_quantize=['Conv','MatMul'] for better CPU EP performance, with an automatic fallback to QDQ Conv-only when QOperator quantization fails.
Optional pre-processing (skips if onnxruntime-extensions is missing).
Lazy DataReader creation to avoid importing all datasets at module import.
MinMax calibration; input resize to 512x512; max_samples=1 to reduce memory.
Auto-excludes Conv nodes without bias initializers to avoid bias=None errors; the overall flow is sketched below.
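A condensed sketch of that flow, assuming hypothetical filenames, an input tensor named "input", and a random stand-in calibration sample (the real script derives all of these from its central models dict and dataset-backed DataReader):

```python
import numpy as np
import onnx
from onnxruntime.quantization import (CalibrationDataReader, CalibrationMethod,
                                      QuantFormat, QuantType, quantize_static)

FP32 = "deblurring_nafnet_2025may.onnx"        # assumed paths
INT8 = "deblurring_nafnet_2025may_int8.onnx"

class OneSampleReader(CalibrationDataReader):
    """Single 512x512 sample (max_samples=1) to keep calibration memory low.
    Random data stands in for the real dataset-backed DataReader."""
    def __init__(self, input_name):
        self._it = iter([{input_name:
                          np.random.rand(1, 3, 512, 512).astype(np.float32)}])
    def get_next(self):
        return next(self._it, None)

# Exclude Conv nodes with no bias initializer (Conv inputs are X, W[, B]);
# quantizing them can otherwise fail with bias=None errors.
graph = onnx.load(FP32).graph
no_bias_convs = [n.name for n in graph.node
                 if n.op_type == "Conv" and len(n.input) < 3]

def run(quant_format, op_types):
    quantize_static(FP32, INT8, OneSampleReader("input"),  # input name assumed
                    calibrate_method=CalibrationMethod.MinMax,
                    quant_format=quant_format,
                    op_types_to_quantize=op_types,
                    nodes_to_exclude=no_bias_convs,
                    activation_type=QuantType.QInt8,
                    weight_type=QuantType.QInt8)

try:
    run(QuantFormat.QOperator, ["Conv", "MatMul"])  # preferred: QOperator Conv+MatMul
except Exception:
    run(QuantFormat.QDQ, ["Conv"])                  # fallback: QDQ, Conv only
```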
tools/quantize/transform.py
Makes HandAlign lazy-load its palm detector so unrelated quant runs don't load the Mediapipe ONNX at import time.
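The pattern, in a minimal illustrative form (the palm-detector filename and the `__call__` body are placeholders; only the lazy-load mechanics matter here):

```python
# Illustrative lazy-load pattern; the real HandAlign lives in
# tools/quantize/transform.py and wraps the Mediapipe palm detector ONNX.
class HandAlign:
    def __init__(self, model_path="palm_detection_mediapipe.onnx"):  # assumed name
        self._model_path = model_path
        self._detector = None              # nothing loaded at import/construct time

    @property
    def detector(self):
        # Load the palm detector only on first use, so quantizing
        # unrelated models never pays the ONNX load cost.
        if self._detector is None:
            import onnxruntime as ort      # deferred import
            self._detector = ort.InferenceSession(
                self._model_path, providers=["CPUExecutionProvider"])
        return self._detector

    def __call__(self, image):
        _ = self.detector                  # triggers the one-time load
        ...                                # alignment logic unchanged
```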
models/deblurring_nafnet/validate_quantization.py
New: runs the FP32 and INT8 models via ORT, reports timing, PSNR, and SSIM, and shows side-by-side results; the core comparison is sketched below.
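A minimal sketch of that comparison (argument parsing and the side-by-side display are omitted; the 1/255 input scaling and [0,1] output range are assumptions about NafNet's pre/post-processing):

```python
import time
import cv2
import numpy as np
import onnxruntime as ort
from skimage.metrics import structural_similarity

def infer(model_path, blob):
    sess = ort.InferenceSession(model_path, providers=["CPUExecutionProvider"])
    name = sess.get_inputs()[0].name
    t0 = time.perf_counter()
    out = sess.run(None, {name: blob})[0]
    return out, time.perf_counter() - t0

img = cv2.imread("example_outputs/licenseplate_motion.jpg")
blob = cv2.dnn.blobFromImage(img, scalefactor=1 / 255.0)   # NCHW float32

fp32_out, t_fp32 = infer("deblurring_nafnet_2025may.onnx", blob)
int8_out, t_int8 = infer("deblurring_nafnet_2025may_int8.onnx", blob)

def to_image(out):
    # Assumes a [0,1] NCHW output; convert back to an HWC uint8 image.
    return (np.clip(out[0], 0, 1).transpose(1, 2, 0) * 255).astype(np.uint8)

a, b = to_image(fp32_out), to_image(int8_out)
print(f"FP32 {t_fp32:.3f}s, INT8 {t_int8:.3f}s")
print(f"PSNR {cv2.PSNR(a, b):.2f} dB, "
      f"SSIM {structural_similarity(a, b, channel_axis=2):.4f}")
```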
Usage
Quantize NafNet:
cd tools/quantize
python quantize-ort.py deblurring_nafnet
Output: models/deblurring_nafnet/deblurring_nafnet_2025may_int8.onnx
Validate:
cd models/deblurring_nafnet
python validate_quantization.py --input example_outputs/licenseplate_motion.jpg --model_fp32 deblurring_nafnet_2025may.onnx --model_int8 deblurring_nafnet_2025may_int8.onnx
Notes on performance
On CPUExecutionProvider, INT8 speed-ups are not guaranteed unless the provider fuses/accelerates int8 kernels.
For acceleration, consider provider-specific EPs: OpenVINOExecutionProvider (CPU) or DmlExecutionProvider (GPU) if available; a selection snippet follows these notes.
The script quantizes to QOperator Conv/MatMul by default; falls back to QDQ Conv-only for robustness.
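Selecting an accelerated EP when one is installed is a one-line change at session creation; filtering against the available providers lets the snippet degrade gracefully to CPU:

```python
import onnxruntime as ort

preferred = ["OpenVINOExecutionProvider",   # onnxruntime-openvino build
             "DmlExecutionProvider",        # onnxruntime-directml build
             "CPUExecutionProvider"]        # always available
available = [p for p in preferred if p in ort.get_available_providers()]
sess = ort.InferenceSession("deblurring_nafnet_2025may_int8.onnx",
                            providers=available)
print(sess.get_providers())                 # EPs actually in use
```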
Known limitations
Quantization depends on ORT support for the model’s ops and shapes; the script includes automatic mitigations (resize, sample limit, excludes).
We do not commit generated ONNX or output images.
Screenshot: (image attachment omitted)