@ricardoV94
Member

@ricardoV94 ricardoV94 commented Nov 11, 2025

PyTorch CI started failing in a recent PR. I guess they made some breaking changes. Strangely enough, it can do import torch.compiler?

Other things are failing too, such as it not liking numpy dtypes?

Closes #1723

@ricardoV94 ricardoV94 added bug Something isn't working torch PyTorch backend labels Nov 11, 2025
@ricardoV94
Member Author

It seems to be installing a rather old version of pytorch?

  pytorch                    2.0.1         py3.11_cpu_0           pytorch    

So maybe that's why. The latest PyTorch release is currently 2.9.0.

CC @maresb @Ch0ronomato
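To make the version mismatch above easy to spot in a CI log, a check along the following lines could be run before the test suite. This is only a minimal sketch: the helper names `release_tuple`, `installed_version`, and `check_minimum` are hypothetical, the distribution name `torch` is an assumption about how the package registers its metadata, and the versions compared are simply the two quoted in this thread.

```python
import re
from importlib import metadata


def release_tuple(version):
    """Parse the leading 'X.Y[.Z]' of a version string into a tuple of ints."""
    match = re.match(r"(\d+)\.(\d+)(?:\.(\d+))?", version)
    if match is None:
        raise ValueError(f"unrecognised version string: {version!r}")
    # A missing patch component is treated as 0.
    return tuple(int(group or 0) for group in match.groups())


def installed_version(dist="torch"):
    """Return the installed version of `dist`, or None if it is not installed.

    Assumption: the PyTorch distribution is registered under the name 'torch'.
    """
    try:
        return metadata.version(dist)
    except metadata.PackageNotFoundError:
        return None


def check_minimum(installed, minimum):
    """True if `installed` is at least `minimum`, comparing release numbers only."""
    return release_tuple(installed) >= release_tuple(minimum)


if __name__ == "__main__":
    found = installed_version("torch")
    print("installed torch:", found)
    # The two versions mentioned in this thread: the one CI resolved and the latest.
    print("2.0.1 >= 2.9.0:", check_minimum("2.0.1", "2.9.0"))
```

Printing the resolved version at the start of the job would show immediately when the solver has fallen back to an old build.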

@maresb
Contributor

maresb commented Nov 11, 2025

@ricardoV94, it's caused by 8a6e407

@ricardoV94
Member Author

I've lost the context for that MKL thing.

@maresb
Contributor

maresb commented Nov 11, 2025

I'm pretty sure you're safe reverting that upper bound.

@ricardoV94
Member Author

Odd error:

FAILED tests/link/pytorch/test_extra_ops.py::test_pytorch_Unique_axis[1] - AssertionError: expected size 3==3, stride 1==2 at dim=0; expected size 2==2, stride 3==1 at dim=1
Error in op: torch.ops.aten.unique_dim.default
This error most often comes from a incorrect fake (aka meta) kernel for a custom op.

@ricardoV94
Member Author

It automagically stopped failing in other CI runs. I'll try to rerun without the pin anyway.

@ricardoV94 ricardoV94 changed the title Pytorch ci failing Pytorch CI failing Nov 18, 2025
@ricardoV94 ricardoV94 added no releasenotes GitHub CI/CD and removed bug Something isn't working labels Nov 18, 2025
@ricardoV94 ricardoV94 mentioned this pull request Nov 18, 2025
@ricardoV94
Member Author

Two tests fail with the most recent PyTorch:

  • FAILED tests/link/pytorch/test_extra_ops.py::test_pytorch_Unique_axis[1] - AssertionError: expected size 3==3, stride 1==2 at dim=0; expected size 2==2, stride 3==1 at dim=1
    Error in op: torch.ops.aten.unique_dim.default

  • FAILED tests/link/pytorch/test_nlinalg.py::test_eig - AssertionError: expected size 3==3, stride 1==3 at dim=0; expected size 3==3, stride 3==1 at dim=1
    Error in op: torch.ops.aten.linalg_eig.default

It may be worth checking this so we can remove the MKL pin. Or is it indeed related to the pin?
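For reading the two assertion messages above: the stride pairs they report match the row-major and column-major layouts of the same shape. The sketch below is plain Python with hypothetical helper names and no PyTorch dependency. It computes contiguous element strides for both layouts, and it does not say which side of each `==` is the fake kernel and which is the real one.

```python
def row_major_strides(shape):
    """Element strides of a C-contiguous (row-major) array of this shape."""
    strides = []
    step = 1
    # The last dimension varies fastest, so walk the shape from the end.
    for size in reversed(shape):
        strides.append(step)
        step *= size
    return tuple(reversed(strides))


def column_major_strides(shape):
    """Element strides of a Fortran-contiguous (column-major) array of this shape."""
    strides = []
    step = 1
    # The first dimension varies fastest, so walk the shape from the start.
    for size in shape:
        strides.append(step)
        step *= size
    return tuple(strides)


if __name__ == "__main__":
    # (3, 2) is the shape in the unique_dim failure, (3, 3) in the linalg_eig one.
    for shape in [(3, 2), (3, 3)]:
        print(shape, row_major_strides(shape), column_major_strides(shape))
```

For shape (3, 2) the two layouts give strides (2, 1) and (1, 3), and for (3, 3) they give (3, 1) and (1, 3), which are exactly the pairs in the messages. That would be consistent with a disagreement about memory layout rather than about values, though the thread does not confirm the cause.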

@maresb
Contributor

maresb commented Nov 19, 2025

I'm guessing that this happens in practice, and you're only seeing it by busting the cache.

Unpinned dependencies are such a massive headache. I really need to finish that pixi PR. 😞
