Add support for TensorRT-RTX #3753

lanluo-nvidia · 2025-08-06T18:49:35Z

Description

Add initial support for TensorRT-RTX.
The following are the currently identified issues:
RTX team side:

The mobilenet/efficientnet tests fail when the input dtype is bfloat16 (float16 and float32 work).
5439176
The Flux model does not work with FP4 or FP8 (FP16 and FP32 work).
5400490
NonZero is not yet supported in RTX (support will be added in the future).
5407733
INT8 quantization supports only weight quantization; activation quantization is not yet supported.
5402295
deconv3d does not work in RTX.
5459328
CUDA graphs do not work in RTX.
5481821
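
Until the items above are resolved, the affected tests can be skipped on RTX builds (the commit history of this PR includes "add skip test for rtx" and "add skip test for bfloat16"). The following is only a minimal sketch of that idea using the standard `unittest` module. The environment variable `USE_TRT_RTX` and the helpers `is_rtx_build` and `skip_on_rtx` are hypothetical names chosen for illustration; the project may detect an RTX build and mark its tests differently.

```python
import os
import unittest


def is_rtx_build(env=None):
    """Return True when the (hypothetical) USE_TRT_RTX flag is set."""
    env = os.environ if env is None else env
    return env.get("USE_TRT_RTX", "0").lower() in ("1", "true")


def skip_on_rtx(reason):
    """Build a decorator that skips a test on TensorRT-RTX builds."""
    # The condition is evaluated once, when the test class is defined.
    return unittest.skipIf(is_rtx_build(), f"TensorRT-RTX: {reason}")


class ExampleTests(unittest.TestCase):
    @skip_on_rtx("bfloat16 inputs are not supported yet")
    def test_bfloat16_input(self):
        # Placeholder body; a real test would compile and run a model here.
        self.assertTrue(True)
```

Keeping the reason string next to each skip makes it easier to find and re-enable the test once the corresponding issue is fixed.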

Our side:

A few test cases fail under strong typing.
PRs in progress:
fix: atan2 strong type support & bug fix for integer dynamic shape #3751
add strong typing fix #3749
BatchNorm constant folding produces NaN (skipping constant folding produces a valid result).
🐛 [Bug] TensorRT-RTX BatchNorm constant fold got nan #3699
The refit test fails on missing eps weights because we only capture bias, weight, running_mean, and running_var.
🐛 [Bug] TensorRT-RTX Refitter test failed when constant fold is disabled #3752
The CUDA graph test fails.
🐛 [Bug] TensorRT-RTX: Cuda graph test failed #3781
The TorchScript test fails.
🐛 [Bug] TensorRT - RTX: torch-script test failure #3782
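
For context on the BatchNorm and refit items, the sketch below shows the textbook way an inference-mode batch norm folds into a per-channel scale and shift. It illustrates that eps is a fifth input alongside bias, weight, running_mean, and running_var. This is not taken from the PR's code, and it does not claim to explain the NaN; `fold_batchnorm` is a hypothetical helper written in plain Python.

```python
import math


def fold_batchnorm(weight, bias, running_mean, running_var, eps):
    """Fold batch norm into y = scale * x + shift, per channel.

    Standard formula (an assumption about what "constant fold" computes here):
        scale = weight / sqrt(running_var + eps)
        shift = bias - running_mean * scale
    """
    scale = [w / math.sqrt(v + eps) for w, v in zip(weight, running_var)]
    shift = [b - m * s for b, m, s in zip(bias, running_mean, scale)]
    return scale, shift


# One channel: weight=1.0, bias=0.5, mean=2.0, var=4.0, eps=0.0
scale, shift = fold_batchnorm([1.0], [0.5], [2.0], [4.0], eps=0.0)
# scale[0] = 1.0 / sqrt(4.0) = 0.5; shift[0] = 0.5 - 2.0 * 0.5 = -0.5
```

Because eps appears in the denominator, a refit path that records only the four tensors listed above has no value to substitute for it.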

Fixes # (issue)

Type of change

Please delete options that are not relevant and/or add your own.

Bug fix (non-breaking change which fixes an issue)
New feature (non-breaking change which adds functionality)
Breaking change (fix or feature that would cause existing functionality to not work as expected)
This change requires a documentation update

Checklist:

My code follows the style guidelines of this project (You can use the linters)
I have performed a self-review of my own code
I have commented my code, particularly in hard-to-understand areas and hacks
I have made corresponding changes to the documentation
I have added tests to verify my fix or my feature
New and existing unit tests pass locally with my changes
I have added the relevant labels to my PR so that relevant reviewers are notified

pytorch-bot · 2025-08-06T18:50:56Z

No ciflow labels are configured for this repo.
For information on how to enable CIFlow bot see this wiki

.github/workflows/build-test-linux-x86_64_rtx.yml

narendasan

@lanluo-nvidia just remove the PTQ Calibrator feature from Python and C++ and put in deprecation errors.
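
One way to read the suggestion above on the Python side is to keep the old entry point importable but have it fail immediately with a clear message. This is a minimal sketch under that assumption; `CalibratorRemovedError` and `make_calibrator` are hypothetical names and do not correspond to the project's actual API.

```python
class CalibratorRemovedError(RuntimeError):
    """Raised when a removed PTQ calibrator entry point is called."""


def make_calibrator(*args, **kwargs):
    """Hypothetical stand-in for a removed calibrator factory."""
    # Arguments are accepted so that existing call sites reach the error
    # message instead of failing earlier with a TypeError.
    raise CalibratorRemovedError(
        "The PTQ calibrator feature has been removed and is no longer supported."
    )
```

Raising at call time rather than at import time lets code that merely imports the name keep loading, while any real use fails with an explicit explanation.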

.github/workflows/build-test-windows_rtx.yml

core/conversion/conversionctx/ConversionCtx.cpp

core/conversion/conversion.cpp

cpp/bin/torchtrtc/main.cpp

py/torch_tensorrt/fx/converters/acc_ops_converters.py

py/torch_tensorrt/trt_alias.py

toolchains/ci_workspaces/MODULE.bazel.tmpl

pyproject_rtx.toml.temp

setup.py

.github/scripts/install-tensorrt-rtx.sh

docsrc/getting_started/tensorrt_rtx.rst

py/torch_tensorrt/dynamo/conversion/_TRTInterpreter.py

github-actions

There are some changes that do not conform to Python style guidelines:

--- /home/runner/work/TensorRT/TensorRT/tests/py/ts/integrations/test_trt_intercompatibility.py	2025-08-27 00:39:48.227435+00:00
+++ /home/runner/work/TensorRT/TensorRT/tests/py/ts/integrations/test_trt_intercompatibility.py	2025-08-27 00:40:20.832194+00:00
@@ -34,10 +34,11 @@

        trt_engine = torchtrt.ts.convert_method_to_trt_engine(
            self.ts_model, "forward", **compile_spec
        )
        import tensorrt as trt
+
        TRT_LOGGER = trt.Logger(trt.Logger.WARNING)
        with trt.Runtime(TRT_LOGGER) as rt:
            engine = rt.deserialize_cuda_engine(trt_engine)
            with engine.create_execution_context() as ctx:
                out = torch.empty(

py/torch_tensorrt/dynamo/conversion/impl/quantize.py

py/torch_tensorrt/dynamo/conversion/aten_ops_converters.py

narendasan

Just the decorator then you are good to merge

initial check in for tensorrt rtx

e1a1731

meta-cla bot added the cla signed label Aug 6, 2025

github-actions bot added the component: evaluators, component: runtime, component: partitioning, component: fx, and component: dynamo labels Aug 6, 2025

facebook-github-bot added the fx label Aug 6, 2025

github-actions bot requested a review from narendasan August 6, 2025 18:49

lanluo-nvidia marked this pull request as ready for review August 6, 2025 18:50

lanluo-nvidia added the ciflow/binaries/all label Aug 6, 2025

narendasan reviewed Aug 6, 2025

.github/workflows/build-test-linux-x86_64_rtx.yml (resolved)

LD_LIBRARY_PATH fix for windows smoke test

7d34b49

narendasan reviewed Aug 6, 2025

lanluo-nvidia added 5 commits August 6, 2025 14:31

resolve comments

184c84c

change the pyproject.toml to make dependencies dynamic

cf80997

add documentation

7727fc8

Merge branch 'main' into lluo/rtx_pr

e31afc8

resolve merge conflict

8eee6af

narendasan reviewed Aug 12, 2025

.github/scripts/install-tensorrt-rtx.sh (outdated, resolved)

github-actions bot removed the component: fx label Aug 15, 2025

lanluo-nvidia added 9 commits August 22, 2025 12:14

add skip test for rtx

6c72548

add skip test

887438c

Merge branch 'main' into lluo/rtx_pr

5459d63

add skip test for bfloat16

036c5c6

fix the ts test failures.

ec49512

test

3e4c412

ignore cudagraph tests

714905c

add fx flag in setup.py

26486c3

test

b768b91

narendasan reviewed Aug 26, 2025

docsrc/getting_started/tensorrt_rtx.rst (outdated, resolved)

narendasan reviewed Aug 26, 2025

py/torch_tensorrt/dynamo/conversion/_TRTInterpreter.py (outdated, resolved)

narendasan reviewed Aug 26, 2025

py/torch_tensorrt/dynamo/conversion/_TRTInterpreter.py (outdated, resolved)

lanluo-nvidia added 4 commits August 26, 2025 15:52

resolve comments

4cd7755

Merge branch 'main' into lluo/rtx_pr

661ff9c

resolve comments

20c45c5

resolve comments

1a6a57b

github-actions bot requested changes Aug 27, 2025

lanluo-nvidia added 4 commits August 26, 2025 18:49

resolve comments

74719dc

test

3935a8d

ignore refit error test and added TODO to fix later

b81bdeb

add black-forest-labs/FLUX.1-Kontext-dev support for rtx perf

c2c7c0e

narendasan reviewed Aug 27, 2025

py/torch_tensorrt/dynamo/conversion/impl/quantize.py (outdated, resolved)

narendasan reviewed Aug 27, 2025

py/torch_tensorrt/dynamo/conversion/aten_ops_converters.py (outdated, resolved)

narendasan approved these changes Aug 27, 2025

lanluo-nvidia added 4 commits August 27, 2025 12:27

resolve comments

50be4b7

resolve comments

5b1b251

fix test error

a30261f

fix test error

0d0ab96

lanluo-nvidia merged commit 996e33b into main Aug 28, 2025
16 checks passed

4 participants