@mcabbott (Member) commented Mar 22, 2020

I think that we can allow batched_gemm! to be called on many (but not all) PermutedDimsArrays, and this is generally much faster than first calling permutedims. This PR is an attempt to implement this. It also extends batched_mul! to take α, β scales like mul!.
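As a rough illustration of the semantics described above, here is a minimal sketch that uses only Base and LinearAlgebra. The function `batched_mul_ref!` is a hypothetical reference loop written for this example; it is not NNlib's `batched_mul!` and does not call `batched_gemm!`. It assumes the batch index is the third dimension and that α and β behave as in the 5-argument `mul!`, i.e. each slice computes `C = α*A*B + β*C`. The second half shows that a `PermutedDimsArray` is a lazy view whose strides are a permutation of the parent's, which is the property a strided BLAS call could exploit instead of copying with `permutedims`.

```julia
using LinearAlgebra

# Hypothetical reference loop (not NNlib's implementation): apply the
# 5-argument mul! to every slice along the third (batch) dimension.
function batched_mul_ref!(C, A, B, α=true, β=false)
    for k in axes(C, 3)
        # Per slice: C[:, :, k] = α * A[:, :, k] * B[:, :, k] + β * C[:, :, k]
        @views mul!(C[:, :, k], A[:, :, k], B[:, :, k], α, β)
    end
    return C
end

A = rand(3, 4, 5)
B = rand(4, 2, 5)
C = ones(3, 2, 5)
C0 = copy(C)                      # keep the old contents to check the β term
batched_mul_ref!(C, A, B, 2.0, 0.5)

# A lazy permutation shares memory with its parent and only reorders strides.
X = rand(4, 3, 5)
Xp = PermutedDimsArray(X, (2, 1, 3))   # size (3, 4, 5), no copy made
println(strides(X), " -> ", strides(Xp))

# The lazy wrapper and the eager copy give the same batched product.
C_lazy = batched_mul_ref!(zeros(3, 2, 5), Xp, B)
C_copy = batched_mul_ref!(zeros(3, 2, 5), permutedims(X, (2, 1, 3)), B)
println(C_lazy ≈ C_copy)
```

In `Xp` the unit stride sits on the second dimension rather than the first, so each slice looks like a transposed matrix to BLAS. A permutation that leaves neither of the two matrix dimensions with unit stride would presumably be among the cases that cannot be passed through directly.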

EARLIER:

@mcabbott (Member, Author) commented Apr 3, 2020

This has changed a lot. The current form uses https://github.com/JuliaMatrices/ArrayLayouts.jl to keep track of dimensions & strides, via the traits UnitStrideFirst and UnitStride{2}, added there in JuliaLinearAlgebra/ArrayLayouts.jl#12 (which is not yet merged).

See also #191 for an alternative approach.

Either of these can be made to work with JuliaGPU/CuArrays.jl#664, which will similarly allow permutations.

@mcabbott mcabbott marked this pull request as draft April 17, 2020 20:54
@mcabbott (Member, Author) commented:

This could possibly now be done with https://github.com/SciML/ArrayInterface.jl instead of https://github.com/JuliaMatrices/ArrayLayouts.jl. But I'm hesitant to add dependencies (which would probably need to be added to CUDA.jl too), and I think #191 is a simpler, lower-tech solution.

@mcabbott mcabbott closed this Oct 29, 2020