scalar getindex when shared index changes order #28

@MacKenzieHnC

I was asked to make this an issue, but I'm not sure there's anything to be done about it.

TensorCast falls back to scalar indexing when a shared index changes position in the output, which is nightmarishly slow for large calculations on the GPU.

using TensorCast
using CUDA
CUDA.allowscalar(false)  # make scalar indexing of GPU arrays throw instead of running slowly

a, b, c, d = 1, 2, 3, 4
A = cu(rand(a,b,c))
B = cu(rand(a,c,d))

# Fast
@reduce C[a,b,d] := sum(c) A[a,b,c] * B[a,c,d]

# Fast
@reduce C[a,d,b] := sum(c) A[a,b,c] * B[a,c,d]

# Not fast: the shared index a moves from position 1 to position 2
@reduce C[b,a,d] := sum(c) A[a,b,c] * B[a,c,d]

# Fast
@reduce C[b,c,d] := sum(a) A[a,b,c] * B[a,c,d]

# Not fast: the shared index c moves ahead of b
@reduce C[c,b,d] := sum(a) A[a,b,c] * B[a,c,d]
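One possible workaround, sketched below and not benchmarked: do the reduction in an ordering that stays fast, then reorder the result with Base's `permutedims`. The names `C_fast` and `C_ref` are only for illustration. The sketch uses small CPU arrays so it runs anywhere. Whether `permutedims` on a `CuArray` avoids the scalar fallback here is an assumption I haven't verified on a GPU.

```julia
using TensorCast

# Small CPU arrays so the sketch runs without a GPU; on the GPU these would be CuArrays.
A = rand(Float32, 2, 3, 4)   # indices a, b, c
B = rand(Float32, 2, 4, 5)   # indices a, c, d

# Step 1: reduce with the shared index a kept in its original (first) position.
@reduce C_fast[a,b,d] := sum(c) A[a,b,c] * B[a,c,d]

# Step 2: move a afterwards; permutedims(X, (2, 1, 3))[b,a,d] == X[a,b,d].
C = permutedims(C_fast, (2, 1, 3))

# Reference: the ordering reported above as slow on the GPU.
@reduce C_ref[b,a,d] := sum(c) A[a,b,c] * B[a,c,d]
```

On the CPU, `C` and `C_ref` should agree up to floating-point rounding. The cost is one extra copy of the output array.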

Labels: gpu (anything involving a CuArray or similar)