Bug Description
Constant folding causes a NaN issue in RTX (see #3699), so I disabled it. With constant folding disabled, the refit test fails because the eps weights are missing.
Looking at the code, all the eps weights end up unknown because we only capture weight/bias/running_mean/running_var:
https://github.com/pytorch/TensorRT/blob/main/py/torch_tensorrt/dynamo/conversion/_TRTInterpreter.py#L540
(Pdb) refitter.get_missing_weights()
['layer3.0.downsample.1/_native_batch_norm_legit_no_training_12_eps CONSTANT', 'layer4.1.bn2/_native_batch_norm_legit_no_training_19_eps CONSTANT', 'layer1.1.bn1/_native_batch_norm_legit_no_training_3_eps CONSTANT', 'layer3.0.bn1/_native_batch_norm_legit_no_training_10_eps CONSTANT', 'layer1.0.bn2/_native_batch_norm_legit_no_training_2_eps CONSTANT', 'layer2.0.bn1/_native_batch_norm_legit_no_training_5_eps CONSTANT', 'layer4.0.bn1/_native_batch_norm_legit_no_training_15_eps CONSTANT', 'layer4.0.bn2/_native_batch_norm_legit_no_training_16_eps CONSTANT', 'layer1.0.bn1/_native_batch_norm_legit_no_training_1_eps CONSTANT', 'layer2.0.bn2/_native_batch_norm_legit_no_training_6_eps CONSTANT', 'layer3.1.bn1/_native_batch_norm_legit_no_training_13_eps CONSTANT', 'layer3.0.bn2/_native_batch_norm_legit_no_training_11_eps CONSTANT', 'layer2.1.bn1/_native_batch_norm_legit_no_training_8_eps CONSTANT', 'layer1.1.bn2/_native_batch_norm_legit_no_training_4_eps CONSTANT', 'layer2.1.bn2/_native_batch_norm_legit_no_training_9_eps CONSTANT', 'layer4.0.downsample.1/_native_batch_norm_legit_no_training_17_eps CONSTANT', 'layer4.1.bn1/_native_batch_norm_legit_no_training_18_eps CONSTANT', 'layer3.1.bn2/_native_batch_norm_legit_no_training_14_eps CONSTANT', 'bn1/_native_batch_norm_legit_no_training_eps CONSTANT', 'layer2.0.downsample.1/_native_batch_norm_legit_no_training_7_eps CONSTANT']
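To make the pattern in that list easier to see, here is a minimal pure-Python sketch that groups the missing entries by their trailing suffix and skips anything ending in one of the four captured kinds. It is not the Torch-TensorRT implementation: the helper names `split_missing_entry` and `uncaptured_suffixes` are hypothetical, and matching by name suffix is only an assumption about how the entries relate to the captured weights.

```python
# Kinds of batch-norm weights the issue says are captured for refit.
CAPTURED_SUFFIXES = ("weight", "bias", "running_mean", "running_var")


def split_missing_entry(entry):
    """Split an entry like '<layer name> CONSTANT' into (name, layer_type).

    Hypothetical helper: assumes the layer type follows the last space.
    """
    parts = entry.rsplit(" ", 1)
    name = parts[0]
    layer_type = parts[1] if len(parts) == 2 else ""
    return name, layer_type


def uncaptured_suffixes(missing_entries, captured=CAPTURED_SUFFIXES):
    """Count missing constants whose name does not end in a captured suffix."""
    counts = {}
    for entry in missing_entries:
        name, _ = split_missing_entry(entry)
        if name.endswith(captured):
            continue  # this kind of weight is already captured
        suffix = name.rsplit("_", 1)[-1]  # e.g. 'eps'
        counts[suffix] = counts.get(suffix, 0) + 1
    return counts


# A few entries copied from the get_missing_weights() output above.
sample = [
    "bn1/_native_batch_norm_legit_no_training_eps CONSTANT",
    "layer1.0.bn1/_native_batch_norm_legit_no_training_1_eps CONSTANT",
    "layer1.0.bn2/_native_batch_norm_legit_no_training_2_eps CONSTANT",
]
result = uncaptured_suffixes(sample)
print(result)
```

Run against the full list, every entry falls into the single `eps` bucket, which matches the observation that only the eps constants are missing from the captured weights.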
To Reproduce
Steps to reproduce the behavior:
Expected behavior
Environment
Build information about Torch-TensorRT can be found by turning on debug messages
- Torch-TensorRT Version (e.g. 1.0.0):
- PyTorch Version (e.g. 1.0):
- CPU Architecture:
- OS (e.g., Linux):
- How you installed PyTorch (conda, pip, libtorch, source):
- Build command you used (if compiling from source):
- Are you using local sources or building from archives:
- Python version:
- CUDA version:
- GPU models and configuration:
- Any other relevant information: