[RFC] vulkan: fix matmul pipeline selection for small n values #16681
Change the mul_mat and mul_mat_id pipeline selection heuristic to prevent Intel Arc GPU hangs.
The previous logic selected the small pipeline (mul_mat_id_s) when one dimension was small, which caused hangs on Intel Arc for shapes such as m=512 and n=23, as happens with IBM Granite 4.
Signed-off-by: Giuseppe Scrivano <[email protected]>
@0cc4m can you please take a look? Do you have any suggestions for a better fix?
This just avoids the problem by swapping the shader for one that doesn't trigger the issue. Instead, we should figure out why the small one is not working on Intel. That could be a driver bug, or an issue on our side that only gets triggered on Intel. I'll try to reproduce it on my A770.
The issue does not happen if I stress only the mul_mat_id operation. I get it only on the second chat message when using
On what hardware? Let's maybe move this conversation into an issue.
I am testing it on the iGPU that is available on my ThinkPad:
and the vulkaninfo output:
Created an issue: #16684
Change the mul_mat and mul_mat_id pipeline selection heuristic to prevent Intel Arc GPU hangs.
The previous logic selected the small pipeline (mul_mat_id_s) when one dimension was small, which caused hangs on Intel Arc for shapes such as m=512 and n=23, as happens with IBM Granite 4.
I've tested the change on an NVIDIA L4 and it doesn't affect performance there.
Marked as RFC because I am not sure it is the right thing to do, but it definitely solves the problem on Intel Arc, and after this change I have no problems running the IBM Granite models.
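To make the heuristic discussion concrete, here is a minimal, self-contained sketch of the two selection styles. It is not the backend's actual code: the enum, the function names, and the thresholds (32 and 64) are hypothetical, and the "both dimensions" rule is only one possible alternative, not necessarily what this PR implements.

```cpp
#include <cstdint>

// Hypothetical stand-ins for the small/medium/large matmul pipeline variants.
enum class PipelineSize { Small, Medium, Large };

// Assumed thresholds, for illustration only.
constexpr uint32_t kSmallLimit  = 32;
constexpr uint32_t kMediumLimit = 64;

// "Either dimension" rule: a single small dimension is enough to pick the
// small pipeline, which is the behavior described for the previous logic.
PipelineSize select_either_small(uint32_t m, uint32_t n) {
    if (m <= kSmallLimit || n <= kSmallLimit) {
        return PipelineSize::Small;
    }
    if (m <= kMediumLimit || n <= kMediumLimit) {
        return PipelineSize::Medium;
    }
    return PipelineSize::Large;
}

// "Both dimensions" rule (an assumption, not the confirmed fix): the small
// pipeline is chosen only when m and n are both small.
PipelineSize select_both_small(uint32_t m, uint32_t n) {
    if (m <= kSmallLimit && n <= kSmallLimit) {
        return PipelineSize::Small;
    }
    if (m <= kMediumLimit && n <= kMediumLimit) {
        return PipelineSize::Medium;
    }
    return PipelineSize::Large;
}
```

Under these assumed thresholds, the shape from the report (m=512, n=23) maps to the small pipeline with the first rule and to the large pipeline with the second, which is the kind of shift in selection the description refers to.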