
[Bug]: Inconsistency between is_contiguous and stride API in HIPGRAPH #2020

@tjtanaa

🐛 Describe the bug

I could not reproduce this with a simple test script. The bug was found in vLLM on ROCm while running meta-llama/Llama-4-Scout-17B-16E-Instruct. It occurs only in HIPGraph mode + torch.compile. In EAGER mode + torch.compile, the is_contiguous() and stride() APIs are consistent.

Under HIPGraph, a tensor A can end up with the following properties:
.shape: [1024, 1]
.is_contiguous(): True
.stride(): [1, 1024]
.is_contiguous(memory_format=torch.channels_last): False
.is_contiguous(memory_format=torch.contiguous_format): True

Expected behaviour: since is_contiguous() is True, .stride(1) == 1 should also be True.
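
For reference, here is a minimal pure-Python sketch of how the reported shape and stride can pass a contiguity check while stride(1) is not 1. It assumes, to my understanding of PyTorch, that the row-major contiguity check skips dimensions of size 1. The helper names `is_row_major_contiguous` and `strict_row_major_strides` are hypothetical and are not PyTorch APIs.

```python
from typing import Sequence, Tuple


def is_row_major_contiguous(shape: Sequence[int], stride: Sequence[int]) -> bool:
    """Row-major contiguity check that ignores the stride of size-1 dims.

    Assumption: this mirrors the rule PyTorch applies for
    torch.contiguous_format; it is a sketch, not the library's code.
    """
    if any(size == 0 for size in shape):
        return True  # a tensor with no elements is trivially contiguous
    expected = 1
    for size, st in zip(reversed(shape), reversed(stride)):
        if size != 1:  # the stride of a size-1 dim never affects addressing
            if st != expected:
                return False
            expected *= size
    return True


def strict_row_major_strides(shape: Sequence[int]) -> Tuple[int, ...]:
    """Canonical row-major strides, the form many kernels expect."""
    strides = []
    expected = 1
    for size in reversed(shape):
        strides.append(expected)
        expected *= max(size, 1)
    return tuple(reversed(strides))


reported_shape, reported_stride = (1024, 1), (1, 1024)
print(is_row_major_contiguous(reported_shape, reported_stride))  # True
print(strict_row_major_strides(reported_shape))                  # (1, 1)
```

Under that assumption, code that needs stride(1) == 1 (for example a kernel that reads raw strides) may want to compare against the canonical strides explicitly instead of relying on is_contiguous() alone.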

Calling A = A.contiguous() does not fix the issue; .stride() is still [1, 1024].

The current workaround is A = A.view(-1).reshape(A.shape).
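
A small sketch of that workaround wrapped in a helper, run in eager mode on CPU. It builds the reported layout by hand with `as_strided`, so it only illustrates the stride values and does not reproduce the HIPGraph behaviour. The helper name `normalize_strides` is hypothetical.

```python
import torch


def normalize_strides(t: torch.Tensor) -> torch.Tensor:
    """Flatten and reshape so size-1 dims get canonical row-major strides.

    Intended for tensors that already report is_contiguous() == True;
    view(-1) raises an error for layouts that cannot be flattened as a view.
    """
    return t.view(-1).reshape(t.shape)


# Hand-built tensor with the layout from the report: shape [1024, 1], stride [1, 1024].
base = torch.arange(1024, dtype=torch.float32)
a = base.as_strided((1024, 1), (1, 1024))

print(a.is_contiguous(), a.stride())
print(a.contiguous().stride())   # contiguous() may return the same tensor unchanged
fixed = normalize_strides(a)
print(fixed.stride())
```

Both operations are views over the same storage here, so the helper should not copy data in this case.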

On CUDA, stride() returns (1, 1), but on ROCm it returns (1, 1024).

Versions

Collecting environment information...                                                                                                             
PyTorch version: 2.7.0a0+git295f2ed                                                                                                               
Is debug build: False                                                                                                                             
CUDA used to build PyTorch: N/A                                                                                                                   
ROCM used to build PyTorch: 6.3.42133-1b9c17779                                                                                                   
                                                                                                                                                  
OS: Ubuntu 22.04.5 LTS (x86_64)                                                                                                                   
GCC version: (Ubuntu 11.4.0-1ubuntu1~22.04) 11.4.0                                                                                                
Clang version: 18.0.0git (https://github.com/RadeonOpenCompute/llvm-project roc-6.3.1 24491 1e0fda770a2079fbd71e4b70974d74f62fd3af10)             
CMake version: version 3.31.6                                                                                                                     
Libc version: glibc-2.35                                                                                                                          
                                                                                                                                                  
Python version: 3.12.9 (main, Feb  5 2025, 08:49:00) [GCC 11.4.0] (64-bit runtime)                                                                
Python platform: Linux-5.15.0-116-generic-x86_64-with-glibc2.35                                                                                   
Is CUDA available: True                                                                                                                           
CUDA runtime version: Could not collect                                                                                                           
CUDA_MODULE_LOADING set to: LAZY                                                                                                                  
GPU models and configuration: AMD Instinct MI300X (gfx942:sramecc+:xnack-)                                                                        
Nvidia driver version: Could not collect
cuDNN version: Could not collect
HIP runtime version: 6.3.42133                                           
MIOpen runtime version: 3.3.0                                            
Is XNNPACK available: True          

Versions of relevant libraries:
[pip3] numpy==1.26.4
[pip3] torch==2.7.0a0+git295f2ed
[pip3] torchvision==0.21.0+7af6987
[pip3] triton==3.2.0+gite5be006a
