🐛 [Bug] Refitter test failed when constant fold is disabled #3752

@lanluo-nvidia

Bug Description

Constant folding produces NaN outputs on RTX (see #3699), so I disabled it. With constant folding disabled, the refit test fails because the eps weights are reported as missing.

Looking at the code, all of the eps weights end up unknown to the refitter because we only capture weight, bias, running_mean, and running_var:
https://github.com/pytorch/TensorRT/blob/main/py/torch_tensorrt/dynamo/conversion/_TRTInterpreter.py#L540
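As an illustration of why eps might only surface once constant folding is off, here is a minimal sketch of inference-mode batch norm in plain Python. It assumes the standard formula y = (x - running_mean) / sqrt(running_var + eps) * weight + bias; the function names are hypothetical and this is not the Torch-TensorRT converter code. When the constants are folded ahead of time, eps is absorbed into a scale and shift; when they are not, eps stays a separate constant in the graph.

```python
import math


def batch_norm_unfolded(x, weight, bias, running_mean, running_var, eps):
    # Hypothetical helper: eps is a separate constant operand here, which is
    # the situation where it could show up as its own named weight.
    return (x - running_mean) / math.sqrt(running_var + eps) * weight + bias


def fold_batch_norm(weight, bias, running_mean, running_var, eps):
    # Hypothetical helper: precompute scale and shift so that eps no longer
    # appears as a standalone constant.
    scale = weight / math.sqrt(running_var + eps)
    shift = bias - running_mean * scale
    return scale, shift


# Arbitrary illustrative values, not taken from the failing model.
scale, shift = fold_batch_norm(1.5, 0.1, 0.2, 4.0, 1e-5)
folded = 3.0 * scale + shift
unfolded = batch_norm_unfolded(3.0, 1.5, 0.1, 0.2, 4.0, 1e-5)
```

Both forms compute the same value; the difference is only in which constants remain visible as separate inputs.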

(Pdb) refitter.get_missing_weights()
['layer3.0.downsample.1/_native_batch_norm_legit_no_training_12_eps CONSTANT',
 'layer4.1.bn2/_native_batch_norm_legit_no_training_19_eps CONSTANT',
 'layer1.1.bn1/_native_batch_norm_legit_no_training_3_eps CONSTANT',
 'layer3.0.bn1/_native_batch_norm_legit_no_training_10_eps CONSTANT',
 'layer1.0.bn2/_native_batch_norm_legit_no_training_2_eps CONSTANT',
 'layer2.0.bn1/_native_batch_norm_legit_no_training_5_eps CONSTANT',
 'layer4.0.bn1/_native_batch_norm_legit_no_training_15_eps CONSTANT',
 'layer4.0.bn2/_native_batch_norm_legit_no_training_16_eps CONSTANT',
 'layer1.0.bn1/_native_batch_norm_legit_no_training_1_eps CONSTANT',
 'layer2.0.bn2/_native_batch_norm_legit_no_training_6_eps CONSTANT',
 'layer3.1.bn1/_native_batch_norm_legit_no_training_13_eps CONSTANT',
 'layer3.0.bn2/_native_batch_norm_legit_no_training_11_eps CONSTANT',
 'layer2.1.bn1/_native_batch_norm_legit_no_training_8_eps CONSTANT',
 'layer1.1.bn2/_native_batch_norm_legit_no_training_4_eps CONSTANT',
 'layer2.1.bn2/_native_batch_norm_legit_no_training_9_eps CONSTANT',
 'layer4.0.downsample.1/_native_batch_norm_legit_no_training_17_eps CONSTANT',
 'layer4.1.bn1/_native_batch_norm_legit_no_training_18_eps CONSTANT',
 'layer3.1.bn2/_native_batch_norm_legit_no_training_14_eps CONSTANT',
 'bn1/_native_batch_norm_legit_no_training_eps CONSTANT',
 'layer2.0.downsample.1/_native_batch_norm_legit_no_training_7_eps CONSTANT']
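
To confirm that every missing entry is an eps constant and nothing else, the list can be summarized with a small helper. This is a rough sketch: `classify_missing` is a hypothetical name, and it assumes each entry has the form "<weight name> <role>" as in the output above. The sample below uses three entries copied from that output.

```python
from collections import Counter

# Suffixes for the weights the interpreter currently captures (per the issue).
CAPTURED_SUFFIXES = ("_weight", "_bias", "_running_mean", "_running_var")


def classify_missing(entries):
    # Hypothetical helper: count missing weights by the kind of suffix
    # their name ends with.
    counts = Counter()
    for entry in entries:
        # Split off the trailing role (e.g. "CONSTANT") from the weight name.
        weight_name, _, _role = entry.rpartition(" ")
        if weight_name.endswith("_eps"):
            counts["eps"] += 1
        elif weight_name.endswith(CAPTURED_SUFFIXES):
            counts["captured"] += 1
        else:
            counts["other"] += 1
    return counts


sample = [
    "layer3.0.downsample.1/_native_batch_norm_legit_no_training_12_eps CONSTANT",
    "layer4.1.bn2/_native_batch_norm_legit_no_training_19_eps CONSTANT",
    "bn1/_native_batch_norm_legit_no_training_eps CONSTANT",
]
summary = classify_missing(sample)
```

In the debugger session, the same helper could be called as `classify_missing(refitter.get_missing_weights())`; a non-zero "captured" or "other" count would point to a different problem than the uncaptured eps constants.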

To Reproduce

Steps to reproduce the behavior:

Expected behavior

Environment

Build information about Torch-TensorRT can be found by turning on debug messages

  • Torch-TensorRT Version (e.g. 1.0.0):
  • PyTorch Version (e.g. 1.0):
  • CPU Architecture:
  • OS (e.g., Linux):
  • How you installed PyTorch (conda, pip, libtorch, source):
  • Build command you used (if compiling from source):
  • Are you using local sources or building from archives:
  • Python version:
  • CUDA version:
  • GPU models and configuration:
  • Any other relevant information:

Additional context
