Specialize Diagonal * Adjoint #1207
Conversation
Codecov Report: All modified and coverable lines are covered by tests ✅

```
@@           Coverage Diff           @@
##           master    #1207   +/-  ##
=======================================
  Coverage   91.92%   91.93%
=======================================
  Files          34       34
  Lines       15372    15384   +12
=======================================
+ Hits        14131    14143   +12
  Misses       1241     1241
```
Should we add methods for strided …

We may add these methods, but non-strided mutable matrices (such as …
We could take care of that case specifically over there. But performance is not too bad in that case, as you showed.
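For concreteness, here is a minimal sketch of what such a strided specialization could look like. The helper name and method shape are assumptions for illustration, not code from this PR: for a mutable strided parent we can materialize the adjoint once and then scale in place, needing a single allocation.

```julia
using LinearAlgebra

# Hypothetical helper (not part of the PR): compute A * D for an adjoint
# of a strided matrix with one allocation instead of two.
function adjtimesdiag_strided(A::Adjoint{<:Any,<:StridedMatrix}, D::Diagonal)
    B = copy(A)         # materialize the adjoint into a mutable Matrix
    return rmul!(B, D)  # scale column j of B by D.diag[j], in place
end
```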
Force-pushed from 0016612 to 35a3df2.
LGTM. Could you perhaps take a look at SparseArrays.jl to see what it needs there to bring back the in-place multiplication?
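For reference, the in-place entry point in question is the standard three-argument `mul!`. A sketch of the call shape, assuming the generic `Diagonal` right-multiplication path applies to these argument types (the setup mirrors the benchmark below):

```julia
using LinearAlgebra, SparseArrays

n = 100
A = adjoint(sparse(Float64, I, n, n))
D = Diagonal(ones(n))

# In-place product into a preallocated dense destination; the question
# above is about restoring an efficient equivalent for sparse arguments.
C = Matrix{Float64}(undef, n, n)
mul!(C, A, D)
```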
For the future: if #1210 is merged, we may specialize …

I'd suggest we do that: having only generic …

I forgot: if we want to backport, we can't use the new stuff. So …
This reinstates slightly altered versions of the methods that were removed in JuliaLang/julia#52389. Sort of fixes #1205, although this doesn't recover the full performance. However, this version is more general, and works with the example presented in JuliaLang/julia#52389. There's still a performance regression, but the full performance may only be obtained for mutable matrices, and we may not assume mutability in general.

Performance:

v1.10:
```julia
julia> n = 100
100

julia> A = adjoint(sparse(Float64, I, n, n));

julia> B = Diagonal(ones(n));

julia> @btime $A * $B;
  837.119 ns (5 allocations: 2.59 KiB)
```

This PR:
```julia
julia> @btime $A * $B;
  1.106 μs (15 allocations: 5.56 KiB)
```

We need double the allocations here compared to earlier, as we first materialize `D' * A'` and then copy the adjoint of that result. I wonder if this may be reduced.

---------

Co-authored-by: Daniel Karrasch <[email protected]>
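To make the two-step path described in the last paragraph concrete, here is a minimal sketch of the strategy (the helper name is hypothetical and this is not the PR's literal code): `A * D` is computed as the adjoint of the conjugate product `D' * parent(A)`, which accounts for the two allocations seen in the benchmark.

```julia
using LinearAlgebra, SparseArrays

# Hypothetical helper illustrating the generic two-allocation path:
# A * D == (D' * parent(A))'.
function adjtimesdiag_generic(A::Adjoint{<:Any,<:AbstractMatrix}, D::Diagonal)
    tmp = D' * parent(A)       # first allocation: works for any parent type
    return copy(adjoint(tmp))  # second allocation: materialize the adjoint
end

# Matches the benchmark setup from the description:
n = 100
A = adjoint(sparse(Float64, I, n, n))
B = Diagonal(ones(n))
adjtimesdiag_generic(A, B) == A * B  # true
```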
Backported PRs:
- [x] #1194
- [x] #1207
- [x] #1196 <!-- Explicitly declare type constructor imports -->
- [x] #1202 <!-- Add fast path in generic matmul -->
- [x] #1203 <!-- Restrict Diagonal sqrt branch to positive diag -->
- [x] #1210 <!-- Indirection in matrix multiplication to avoid ambiguities -->