
Conversation

@jishnub (Member) commented Feb 14, 2025

This reinstates slightly altered versions of the methods that were removed in JuliaLang/julia#52389. Sort of fixes #1205, although this doesn't recover the full performance. However, this version is more general, and works with the example presented in JuliaLang/julia#52389. There's still a performance regression, but the full performance may only be obtained for mutable matrices, and we may not assume mutability in general.

Performance:
v1.10:

```julia
julia> n = 100
100

julia> A = adjoint(sparse(Float64, I, n, n));

julia> B = Diagonal(ones(n));

julia> @btime $A * $B;
  837.119 ns (5 allocations: 2.59 KiB)
```

This PR:

```julia
julia> @btime $A * $B;
  1.106 μs (15 allocations: 5.56 KiB)
```

We need double the allocations here compared to earlier, since we first materialize `D' * A'` and then copy the adjoint of that result. I wonder whether this may be reduced.
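
For reference, here is a minimal sketch of that approach, assuming the reinstated method reduces `A' * D` to a `Diagonal`-times-parent product followed by a copy (an illustration of the idea under that assumption, not the exact PR code; `adj_diag_mul_sketch` is a made-up name):

```julia
using LinearAlgebra, SparseArrays

# Sketch: A' * D == (D' * parent(A'))', so the fast Diagonal-times-sparse
# method does the work; copying the final adjoint is the second allocation
# mentioned above.
adj_diag_mul_sketch(A::Adjoint, D::Diagonal) = copy(adjoint(D' * parent(A)))

A = adjoint(sparse(1.0I, 100, 100));
D = Diagonal(ones(100));
adj_diag_mul_sketch(A, D) ≈ A * D  # true
```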

@codecov codecov bot commented Feb 14, 2025

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 91.93%. Comparing base (5cf41c4) to head (cf94706).
Report is 1 commit behind head on master.

Additional details and impacted files
```diff
@@           Coverage Diff           @@
##           master    #1207   +/-   ##
=======================================
  Coverage   91.92%   91.93%           
=======================================
  Files          34       34           
  Lines       15372    15384   +12     
=======================================
+ Hits        14131    14143   +12     
  Misses       1241     1241           
```


@jishnub added the `backport 1.11` and `backport 1.12` labels Feb 17, 2025
@dkarrasch (Member)

Should we add methods for strided `AdjOrTrans` for even better performance? But that doesn't guarantee mutability of individual entries, right? Otherwise, looks good to go.
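
A hypothetical sketch of what such a strided specialization could look like (illustrative only; `adj_diag_mul_strided` and its body are assumptions, not code from this PR):

```julia
using LinearAlgebra

# For a strided parent, allocate a plain mutable output and fill it
# directly: (P')[i, j] * d[j] == adjoint(P[j, i]) * d[j].
function adj_diag_mul_strided(A::Adjoint{<:Any,<:StridedMatrix}, D::Diagonal)
    size(A, 2) == size(D, 1) || throw(DimensionMismatch())
    P, d = parent(A), D.diag
    T = promote_type(eltype(A), eltype(D))
    C = similar(P, T, size(A))
    @inbounds for j in axes(C, 2), i in axes(C, 1)
        C[i, j] = adjoint(P[j, i]) * d[j]
    end
    return C
end
```

This fills a freshly allocated output in a single pass, avoiding the intermediate `D' * A'` product entirely.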

@jishnub (Member, Author) commented Feb 18, 2025

We may add these methods, but non-strided mutable matrices (such as those in SparseArrays) will still be slow. Perhaps we need a mutability trait, as in ArrayInterface.jl.
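
Such a trait might look like the following sketch (the name `ismutablearray` is an assumption here; ArrayInterface.jl exposes a comparable query as `can_setindex`):

```julia
using LinearAlgebra

# Hypothetical mutability trait: opt-in per array type.
ismutablearray(::Type{<:AbstractArray}) = false   # conservative default
ismutablearray(::Type{<:Array}) = true            # plain Arrays are mutable
ismutablearray(A::AbstractArray) = ismutablearray(typeof(A))

# Methods could then take the in-place fast path only when mutation is safe:
maybe_rmul!(A::AbstractMatrix, D::Diagonal) =
    ismutablearray(A) ? rmul!(A, D) : A * D
```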

@dkarrasch (Member)

> SparseArrays) will still be slow

We could take care of that case specifically over there. But performance is not too bad for that case, as you showed.

@jishnub marked this pull request as draft February 20, 2025 05:45
@jishnub force-pushed the jishnub/adj_diag_mul branch from 0016612 to 35a3df2 February 20, 2025 08:06
@jishnub marked this pull request as ready for review February 20, 2025 08:33
@dkarrasch (Member) left a comment

LGTM. Could you perhaps take a look at SparseArrays.jl to see what it needs there to bring back the in-place multiplication?
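
For context, a minimal sketch of in-place right-multiplication by a `Diagonal` for a CSC sparse matrix (illustrative only; `rmul_diag!` is an assumed name, not SparseArrays.jl's actual implementation):

```julia
using LinearAlgebra, SparseArrays

# Column j of A * D is d[j] times column j of A, so the stored
# nonzeros can be scaled in place, one column at a time.
function rmul_diag!(A::SparseMatrixCSC, D::Diagonal)
    size(A, 2) == size(D, 1) || throw(DimensionMismatch())
    nzv, d = nonzeros(A), D.diag
    for j in axes(A, 2)
        for k in nzrange(A, j)
            nzv[k] *= d[j]
        end
    end
    return A
end
```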

@jishnub (Member, Author) commented Feb 20, 2025

For the future: if #1210 is merged, we may specialize `mul` directly, instead of redirecting `*` to `_diag_adj_mul`.

@dkarrasch (Member)

> For the future: if #1210 is merged, we may specialize `mul` directly, instead of redirecting `*` to `_diag_adj_mul`.

I'd suggest we do that: keep only generic `*` methods, and dispatch everything else (like bidiag*hessenberg and whatnot) through `mul`.
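
A hypothetical sketch of that indirection (`mymul` stands in for the proposed `mul` layer; the names and method bodies are assumptions, not the actual #1210 code):

```julia
using LinearAlgebra, SparseArrays

# A single generic entry point forwards to an indirection layer, and
# structured types specialize the layer instead of `*` itself, which
# avoids method ambiguities between pairs of wrapper types.
mymul(A::AbstractMatrix, B::AbstractMatrix) = A * B    # generic fallback
mymul(A::Adjoint{<:Any,<:SparseMatrixCSC}, D::Diagonal) =
    copy(adjoint(D' * parent(A)))                      # specialized path
```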

@dkarrasch (Member)

I forgot: if we want to backport, we can't use the new stuff. So `mul` has to go into a new PR.

@jishnub merged commit 579b5f7 into master Feb 20, 2025
4 checks passed
@jishnub deleted the jishnub/adj_diag_mul branch February 20, 2025 16:50
dkarrasch added a commit that referenced this pull request Feb 20, 2025
@dkarrasch mentioned this pull request Feb 20, 2025
KristofferC added a commit that referenced this pull request Feb 25, 2025
Backported PRs:
- [x] #1194 
- [x] #1207 
- [x] #1196 <!-- Explicitly declare type constructor imports -->
- [x] #1202 <!-- Add fast path in generic matmul -->
- [x] #1203 <!-- Restrict Diagonal sqrt branch to positive diag -->
- [x] #1210 <!-- Indirection in matrix multiplication to avoid
ambiguities -->
@jishnub mentioned this pull request May 12, 2025
@jishnub removed the `backport 1.12` label May 13, 2025

Labels

`backport 1.11`: Change should be backported to the 1.11 release

Development

Successfully merging this pull request may close these issues.

Performance regression for (*)(A, D::Diagonal) when A is the adjoint of a sparse matrix
