🐛 [Bug] TensorRT engine exceptions are not raised #2367

@ralbertazzi

Bug Description

I would expect torch_tensorrt to raise an exception when inference fails for any reason, such as a wrong input tensor shape. Instead, a warning is printed to the console and the program continues as if nothing had happened. This can have serious implications in production environments.

To Reproduce

I have a model compiled with float16 that accepts a static input shape of (1, 3, 538, 538):

import torch
import torch_tensorrt

model = torch.jit.load("model.ts")
input_tensor = torch.zeros((1, 3, 538, 538), dtype=torch.float16, device="cuda")
output_tensor = model(input_tensor)

This is what happens if I pass a wrong shape:

>>> out = model(torch.zeros((1, 3, 500, 500), dtype=torch.float16, device="cuda"))
ERROR: [Torch-TensorRT] - 3: [executionContext.cpp::setInputShape::2020] Error Code 3: API Usage Error (Parameter check failed at: runtime/api/executionContext.cpp::setInputShape::2020, condition: engineDims.d[i] == dims.d[i]. Static dimension mismatch while setting input shape.
)
>>> # NO EXCEPTION IS RAISED
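Until this is fixed in the runtime, a user-side guard can raise explicitly before the call reaches the engine. This is only a sketch: `EXPECTED_SHAPE`, `EXPECTED_DTYPE`, and `validated_forward` are hypothetical names, and it assumes the engine's static shape and dtype are known ahead of time.

```python
import torch

# Hypothetical constants: the static shape/dtype this engine was compiled for
EXPECTED_SHAPE = (1, 3, 538, 538)
EXPECTED_DTYPE = torch.float16

def validated_forward(model, x: torch.Tensor) -> torch.Tensor:
    """Validate inputs up front and raise a Python exception, instead of
    relying on the TensorRT runtime to propagate its own errors."""
    if tuple(x.shape) != EXPECTED_SHAPE:
        raise ValueError(
            f"expected input shape {EXPECTED_SHAPE}, got {tuple(x.shape)}"
        )
    if x.dtype != EXPECTED_DTYPE:
        raise TypeError(f"expected dtype {EXPECTED_DTYPE}, got {x.dtype}")
    return model(x)
```

This does not address the root cause (the engine error being swallowed), but it guarantees a hard failure at the Python level for malformed inputs.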

This is what happens if I pass a wrong dtype:

>>> out = model(torch.zeros((1, 3, 538, 538), dtype=torch.float32, device="cuda"))
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/wizard/mambaforge/envs/remini/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
RuntimeError: The following operation failed in the TorchScript interpreter.
Traceback of TorchScript, serialized code (most recent call last):
  File "code/__torch__/<model>.py", line 8, in forward
    input_0: Tensor) -> Tensor:
    __torch___<model>_trt_engine_ = self_1.__torch___<model>_trt_engine_
    _0 = ops.tensorrt.execute_engine([input_0], __torch___<model>_trt_engine_)
         ~~~~~~~~~~~~~~~~~~~~~~~~~~~ <--- HERE
    _1, = _0
    return _1

Traceback of TorchScript, original code (most recent call last):
RuntimeError: [Error thrown at core/runtime/execute_engine.cpp:136] Expected inputs[i].dtype() == expected_type to be true but got false
Expected input tensors to have type Half, found type float
>>> # An exception IS raised in this case

Expected behavior

An exception should be raised if the TensorRT Engine returns an error.
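The expected contract could be expressed roughly as follows. This is a sketch: `expect_raises` is a hypothetical helper, and the `model` call is left commented out since it requires the compiled engine and a CUDA device.

```python
def expect_raises(fn, *args):
    """Return True if calling fn(*args) raises any exception."""
    try:
        fn(*args)
    except Exception:
        return True
    return False

# With the expected behavior, a static-shape mismatch would raise:
# bad_input = torch.zeros((1, 3, 500, 500), dtype=torch.float16, device="cuda")
# assert expect_raises(model, bad_input)
```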

Environment

Build information about Torch-TensorRT can be found by turning on debug messages

  • Torch-TensorRT Version (e.g. 1.0.0): 1.4.0
  • PyTorch Version (e.g. 1.0): 2.0.1
  • CPU Architecture: x86_64
  • OS (e.g., Linux): Linux
  • How you installed PyTorch (conda, pip, libtorch, source): pip (custom whl)
  • Build command you used (if compiling from source): -
  • Are you using local sources or building from archives: -
  • Python version: 3.9
  • CUDA version: 11.8
  • GPU models and configuration: Tesla T4 / Tesla L4
  • Any other relevant information: tensorrt 8.5.3.1
