@jishnub jishnub commented Apr 24, 2025

Instead of marking the entire size-check function as `@noinline`, we now move the error-throwing part into its own function and mark only that function as `@noinline`. This way, the size check itself may still be inlined, and only the error path stays out of line.

This improves performance for small matmul.

```julia
julia> A = [1 2; 3 4];

julia> @btime $A * $A;
  53.361 ns (2 allocations: 112 bytes) # v"1.13.0-DEV.438"
  47.504 ns (2 allocations: 112 bytes) # this PR
```
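
The pattern described above can be sketched roughly as follows. This is only an illustration under stated assumptions: the names `check_mul_sizes` and `throw_size_mismatch` are hypothetical and are not the functions touched by this PR, and the error message is made up.

```julia
# Hypothetical names; a sketch of the pattern, not the LinearAlgebra source.

# Cold path: kept out of line so that the caller stays small.
@noinline function throw_size_mismatch(szA, szB)
    throw(DimensionMismatch(
        "cannot multiply a matrix of size $szA with a matrix of size $szB"))
end

# Hot path: a cheap comparison that the compiler remains free to inline.
function check_mul_sizes(A::AbstractMatrix, B::AbstractMatrix)
    size(A, 2) == size(B, 1) || throw_size_mismatch(size(A), size(B))
    return nothing
end

# Compatible sizes pass silently.
check_mul_sizes(ones(2, 3), ones(3, 4))
```

Calling `check_mul_sizes(ones(2, 3), ones(2, 3))` would throw a `DimensionMismatch`. The string interpolation for the message lives only in the out-of-line function, which is presumably what keeps the checking function cheap enough to inline.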

@jishnub jishnub added performance Must go faster backport 1.12 Change should be backported to release-1.12 labels Apr 24, 2025
@jishnub jishnub marked this pull request as draft April 24, 2025 13:56
@jishnub jishnub marked this pull request as ready for review April 24, 2025 14:15
@jishnub jishnub force-pushed the jishnub/mul_size_check branch from f6922cb to 651edc8 Compare April 26, 2025 05:45
codecov bot commented Apr 26, 2025

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 93.72%. Comparing base (e7a8a15) to head (02cbcb4).
Report is 2 commits behind head on master.

Additional details and impacted files
@@            Coverage Diff             @@
##           master    #1310      +/-   ##
==========================================
- Coverage   93.73%   93.72%   -0.02%     
==========================================
  Files          34       34              
  Lines       15682    15685       +3     
==========================================
+ Hits        14699    14700       +1     
- Misses        983      985       +2     

@jishnub jishnub requested a review from dkarrasch April 26, 2025 07:26
@dkarrasch dkarrasch left a comment

If it helps, why not.

@jishnub jishnub merged commit e4e8c19 into master Apr 26, 2025
4 checks passed
@jishnub jishnub deleted the jishnub/mul_size_check branch April 26, 2025 08:44
jishnub added a commit that referenced this pull request May 12, 2025
Instead of `@noinline` on the entire size-check function, we now
separate the error-throwing part into a separate function and mark it as
`@noinline`. This way, the size check may still be evaluated inline, and
only the error path will not be inlined.

This improves performance for small matmul.

```julia
julia> A = [1 2; 3 4];

julia> @btime $A * $A;
  53.361 ns (2 allocations: 112 bytes) # v"1.13.0-DEV.438"
  47.504 ns (2 allocations: 112 bytes) # this PR
```

(cherry picked from commit e4e8c19)
@jishnub jishnub mentioned this pull request May 12, 2025
27 tasks
jishnub added a commit that referenced this pull request May 26, 2025
Backported PRs:
- [x] #1209 <!-- Remove `LinearAlgebra` qualifications in `cholesky.jl`
-->
- [x] #1230 <!-- Avoid materializing `diag` in `Diagonal` `kron` -->
- [x] #1240 <!-- Reduce `stable_muladdmul` branches in `generic
matvecmul!` -->
- [x] #1247 <!-- fix dispatch to herk -->
- [x] #1255 <!-- use smaller matrix size in `peakflops` on 32-bit -->
- [x] #1310 <!-- Only `@noinline` error path in `matmul_size_check` -->
- [x] #1267 <!-- Refine column ranges in `_isbanded_impl` -->
- [x] #1320 <!-- Copy matrices in `triu`/`tril` if no zero exists for
the `eltype` -->
- [x] #1324 <!-- Fix empty `Tridiagonal` broadcast -->
- [x] #1327 <!-- `iszero` check in hessenberg setindex -->
- [x] #1326 <!-- Fix multiplication with empty `HessenbergQ` -->
- [x] #1332 <!-- Unwrap triangular matrices in broadcast -->
- [x] #1337 <!-- Change `1:size` to `axes` in bidiag mul -->
- [x] #1342 <!-- `Char` uplo in `Bidiagonal` constructor -->
- [x] #1344 <!-- Update the docstring of ldiv! -->
- [x] #1335 <!-- Test: prune old LA based on ENV variable -->
- [x] #1346 <!-- Fix scaling unit triangular matrices -->
- [x] #1355 <!-- Add compat notice for `diagview` -->
- [x] #1349 <!-- Prune `LinearAlgebra` module in ambiguity test -->

Contains multiple commits, manual intervention needed:
- [x] #1238 <!-- Ensure positive-definite matrix in lapack posv test -->
- [x] #1298 <!-- Add `diagm` example -->
- [x] #1312 <!-- WIP: Try use method deletion instead of custom sysimage
-->
- [x] #1333 <!-- Make `fillstored!` public -->
- [x] #1331 <!-- Document SingularException throw for
inv(::AbstractMatrix) -->
- [x] #1350 <!-- Fix copy for partly initialized unit triangular -->

Non-merged PRs with backport label:
- [x] #1352 <!-- log for dense diagonal matrix with negative elements
-->
- [ ] #1305 <!-- Bounds-checking in triangular indexing branches -->

---------

Co-authored-by: Mateus Araújo <[email protected]>
Co-authored-by: Jeff Bezanson <[email protected]>
Co-authored-by: Steven G. Johnson <[email protected]>
Co-authored-by: WalterMadelim <[email protected]>
Co-authored-by: Kristoffer Carlsson <[email protected]>
Co-authored-by: Daniel Karrasch <[email protected]>
Co-authored-by: Michael Abbott <[email protected]>
ViralBShah added a commit that referenced this pull request Jun 2, 2025
This PR adds a size check in the 2-argument `mul`, so that now the
destination array is allocated only if the sizes of the arguments are
compatible with matrix multiplication. This means that we don't allocate
in case of an error anymore.

The performance for small-matrix multiplication seems largely similar
(it's comparable to #1310, and seems identical within the noise limit):
```julia
julia> A = [1 2; 3 4];

julia> @btime $A * $A;
  42.304 ns (2 allocations: 112 bytes) # before this PR
  44.203 ns (2 allocations: 112 bytes) # this PR
```

We also redirect the generic `mul` to `_mul` now, which is the function
that defines the multiplication code. This allows us to reuse the `_mul`
definition elsewhere without having to repeat code. Currently, this is
mainly necessary in the `Bidiagonal`-triangular multiplications.
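
A minimal sketch of the two ideas in this commit message, checking sizes before allocating the destination and forwarding the generic entry point to an implementation function. The names `my_mul`, `_my_mul` and `throw_dims` are hypothetical stand-ins, and the naive triple loop and `promote_type` element type are simplifications, not what LinearAlgebra actually does.

```julia
# Hypothetical names; a sketch, not the LinearAlgebra implementation.

@noinline throw_dims(szA, szB) =
    throw(DimensionMismatch("cannot multiply sizes $szA and $szB"))

# Implementation function: validate sizes first, allocate afterwards.
function _my_mul(A::AbstractMatrix, B::AbstractMatrix)
    size(A, 2) == size(B, 1) || throw_dims(size(A), size(B))
    T = promote_type(eltype(A), eltype(B))   # simplified element type
    C = zeros(T, size(A, 1), size(B, 2))     # reached only if sizes match
    for j in axes(B, 2), k in axes(A, 2), i in axes(A, 1)
        C[i, j] += A[i, k] * B[k, j]
    end
    return C
end

# Generic entry point forwards to the implementation, so that more
# specialised methods can call `_my_mul` directly without repeating code.
my_mul(A::AbstractMatrix, B::AbstractMatrix) = _my_mul(A, B)
```

With this layout, `my_mul(ones(2, 3), ones(2, 3))` fails in the size check before `zeros` is ever called, so no destination array is allocated on the error path.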
@jishnub jishnub removed the backport 1.12 Change should be backported to release-1.12 label Jun 5, 2025