
Why is FMHA not supported on V100 and T4? #320

Description

@jiangsongHW

I'm running TensorRT-LLM on a V100. When I enable FMHA with --enable_context_fmha,
I get this error message:
[TensorRT-LLM][ERROR] Assertion failed: Unsupported architecture (/home/build/TensorRT_LLM/TensorRT-LLM-master/cpp/tensorrt_llm/kernels/contextFusedMultiHeadAttention/fmhaRunner.cpp:87)

I checked the code of FusedMHARunnerV2, and it seems that sm70 and sm75 are not supported.
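
For anyone hitting the same assertion, here is a minimal sketch of checking the GPU's compute capability up front before deciding whether to pass --enable_context_fmha. It uses only the standard CUDA runtime API, not TensorRT-LLM internals, and the SM >= 80 threshold is an assumption inferred from sm70 (V100) and sm75 (T4) being rejected, not taken from the TensorRT-LLM source.

```cpp
#include <cstdio>
#include <cuda_runtime.h>

int main() {
    // Query the properties of device 0 via the CUDA runtime.
    cudaDeviceProp prop{};
    if (cudaGetDeviceProperties(&prop, /*device=*/0) != cudaSuccess) {
        std::fprintf(stderr, "Failed to query device 0\n");
        return 1;
    }

    // Combine major/minor into the usual SM number, e.g. 70 for V100, 75 for T4.
    int const sm = prop.major * 10 + prop.minor;
    std::printf("Detected SM %d\n", sm);

    // Assumed threshold: context FMHA appears to require SM 80 or newer.
    if (sm < 80) {
        std::printf("Context FMHA is likely unsupported on this GPU; "
                    "build without --enable_context_fmha\n");
    } else {
        std::printf("Context FMHA should be available on this GPU\n");
    }
    return 0;
}
```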

May I know why V100 is not supported for FMHA? Or is support planned?

Thanks!

Labels

feature request (New feature or request. This includes new model, dtype, functionality support), triaged (Issue has been triaged by maintainers)
