Added memory optimization for ONNX transforms #538
base: main
Conversation
```python
from typing import Optional, Tuple

import numpy as np
from onnx import ModelProto, external_data_helper, numpy_helper

from QEfficient.utils.constants import ONNX_TRANSFROM_MEMORY_CLEANUP_INTERVAL
```
NIT: spell check - `ONNX_TRANSFROM_MEMORY_CLEANUP_INTERVAL` should read TRANSFORM, not TRANSFROM.
```python
    :param model: The ONNX model to check
    :returns: True if external data is already loaded, False otherwise
    """
    for tensor in external_data_helper._get_all_tensors(model):
```
Can we skip this extra loop that checks whether external data has been loaded for every tensor? At the place where we load the external data we can maintain a flag: it defaults to False and is set to True once all the external data is loaded. The rest of the code then only needs to check the flag, and this function may not be needed at all if the flag is used directly.
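A minimal sketch of that suggestion, assuming a class-level flag; the names `_external_data_loaded` and `_load_external_data_once` are illustrative, not from the PR:

```python
from onnx import ModelProto, external_data_helper


class OnnxTransform:
    # Hypothetical flag (illustrative name): False until external data is in memory.
    _external_data_loaded: bool = False

    @classmethod
    def _load_external_data_once(cls, model: ModelProto, onnx_base_dir: str) -> None:
        # Load external tensor data a single time; subsequent calls are no-ops,
        # so no per-tensor scan is needed to detect whether data is loaded.
        if not cls._external_data_loaded:
            external_data_helper.load_external_data_for_model(model, onnx_base_dir)
            cls._external_data_loaded = True
```

One caveat with this shape: if the transform can be applied to more than one model, the state would need to be tracked per model rather than on the class.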
```diff
@@ -61,6 +89,15 @@ def apply(cls, model: ModelProto, *, onnx_base_dir: Optional[str] = None, **kwar
 tensor.CopyFrom(new_tensor)
 transformed = True

+del neg_inf_mask, clipped_tensor, new_tensor
```
You can check and update the flag within this loop itself.
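An illustrative way to do that, folding the bookkeeping into the existing per-tensor loop instead of running a separate pass (the helper name is hypothetical, and the flag is the one from the sketch above):

```python
from onnx import ModelProto, external_data_helper


def process_and_track(model: ModelProto) -> bool:
    """Do the per-tensor work and report whether all external data is loaded."""
    all_loaded = True
    for tensor in external_data_helper._get_all_tensors(model):
        if external_data_helper.uses_external_data(tensor):
            all_loaded = False  # this tensor's raw data still lives on disk
        # ... existing per-tensor clipping work, including the `del` above ...
    return all_loaded  # caller stores this in the hypothetical class-level flag
```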
Signed-off-by: Rishin Raj <[email protected]>
force-pushed from 570aad1 to 85c8a0d
Added periodic memory cleanup to FP16ClipTransform and SplitTensorsTransform to reduce peak memory usage while processing large tensors. Also skips reloading external data when it is already present in the model.
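A minimal sketch of the periodic-cleanup pattern the description refers to, assuming an interval constant like the PR's ONNX_TRANSFROM_MEMORY_CLEANUP_INTERVAL; the function name and the loop body are illustrative, not the PR's actual code:

```python
import gc

from onnx import ModelProto, external_data_helper

CLEANUP_INTERVAL = 100  # stand-in for ONNX_TRANSFROM_MEMORY_CLEANUP_INTERVAL


def transform_with_periodic_cleanup(model: ModelProto) -> None:
    """Apply per-tensor work, forcing garbage collection every N tensors."""
    for idx, tensor in enumerate(external_data_helper._get_all_tensors(model)):
        # ... per-tensor work: clip out-of-range fp16 values, split tensors, ...
        # intermediate numpy buffers are del'd in the loop body (as in the PR)
        if (idx + 1) % CLEANUP_INTERVAL == 0:
            gc.collect()  # reclaim the buffers released so far
```

Collecting on every iteration would dominate runtime on models with many tensors, which is presumably why the cleanup is gated behind an interval constant rather than run unconditionally.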