Skip to content

Conversation

@arsenm
Copy link
Contributor

@arsenm arsenm commented Nov 22, 2024

The 2-pass XDL write VGPR, read by non-XDL SGEMM/DGEMM case
was 1 wait state overly conservative. Previously, for gfx940,
the XDL/non-XDL cases happened to have the same number of cycles
in all cases. Now the XDL consumer case has an additional state for
2 pass sources.

This was referenced Nov 22, 2024
Copy link
Contributor Author

arsenm commented Nov 22, 2024

This stack of pull requests is managed by Graphite. Learn more about stacking.

This was referenced Nov 26, 2024
searlmc1 pushed a commit to ROCm/llvm-project that referenced this pull request Feb 3, 2025
The 2-pass XDL write VGPR, read by non-XDL SGEMM/DGEMM case
was 1 wait state overly conservative. Previously, for gfx940,
the XDL/non-XDL cases happened to have the same number of cycles
in all cases. Now the XDL consumer case has an additional state for
2 pass sources.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants