Rewrite 4x4 det as product of 2x2 minors for improved performance #1046
The SMatrix{4,4} det code (added in #260) performs 72 multiplications and 23 additions/subtractions. Rewriting it as products of 2x2 minors reduces this to 30 multiplications and 17 additions/subtractions. Benchmarking (below) shows a performance improvement from 11 ns to <5 ns, commensurate with the decrease in operations.

I see that in pull request #597 @ryanelandt (with the approval of @andyferris) implicitly rewrote inv using 2x2 minors. Thus, my pull request just extends the performance improvement to det.

julia> bmold()
BenchmarkTools.Trial: 10000 samples with 999 evaluations.
Range (min … max): 10.975 ns … 47.887 ns ┊ GC (min … max): 0.00% … 0.00%
Time (median): 11.259 ns ┊ GC (median): 0.00%
Time (mean ± σ): 11.398 ns ± 1.592 ns ┊ GC (mean ± σ): 0.00% ± 0.00%
Memory estimate: 0 bytes, allocs estimate: 0.
julia> bm()
BenchmarkTools.Trial: 10000 samples with 1000 evaluations.
Range (min … max): 4.582 ns … 34.553 ns ┊ GC (min … max): 0.00% … 0.00%
Time (median): 4.707 ns ┊ GC (median): 0.00%
Time (mean ± σ): 4.770 ns ± 0.865 ns ┊ GC (mean ± σ): 0.00% ± 0.00%
Memory estimate: 0 bytes, allocs estimate: 0.
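
For readers unfamiliar with the 2x2-minor formulation mentioned in the description, here is a minimal sketch of the idea. It is not the code from this pull request: the function name det4_minors is hypothetical, and it works on a plain Matrix rather than an SMatrix so that it only needs the standard library. It expands the determinant along the first two rows, pairing each 2x2 minor of rows 1-2 with the complementary 2x2 minor of rows 3-4, which gives the 30 multiplications and 17 additions/subtractions quoted above.

```julia
using LinearAlgebra  # stdlib; used here only to cross-check against `det`

# Hypothetical helper for illustration; not the implementation in the PR.
function det4_minors(A::AbstractMatrix)
    size(A) == (4, 4) || throw(DimensionMismatch("expected a 4x4 matrix"))

    # 2x2 minors of rows 1-2, one per column pair (12 mults, 6 subs in total
    # together with the block below).
    s0 = A[1,1]*A[2,2] - A[2,1]*A[1,2]  # columns 1,2
    s1 = A[1,1]*A[2,3] - A[2,1]*A[1,3]  # columns 1,3
    s2 = A[1,1]*A[2,4] - A[2,1]*A[1,4]  # columns 1,4
    s3 = A[1,2]*A[2,3] - A[2,2]*A[1,3]  # columns 2,3
    s4 = A[1,2]*A[2,4] - A[2,2]*A[1,4]  # columns 2,4
    s5 = A[1,3]*A[2,4] - A[2,3]*A[1,4]  # columns 3,4

    # 2x2 minors of rows 3-4, one per column pair.
    c0 = A[3,1]*A[4,2] - A[4,1]*A[3,2]  # columns 1,2
    c1 = A[3,1]*A[4,3] - A[4,1]*A[3,3]  # columns 1,3
    c2 = A[3,1]*A[4,4] - A[4,1]*A[3,4]  # columns 1,4
    c3 = A[3,2]*A[4,3] - A[4,2]*A[3,3]  # columns 2,3
    c4 = A[3,2]*A[4,4] - A[4,2]*A[3,4]  # columns 2,4
    c5 = A[3,3]*A[4,4] - A[4,3]*A[3,4]  # columns 3,4

    # Each rows-1-2 minor multiplies the rows-3-4 minor on the complementary
    # columns; the signs follow from the Laplace expansion (6 mults, 5 adds/subs).
    return s0*c5 - s1*c4 + s2*c3 + s3*c2 - s4*c1 + s5*c0
end

# Arbitrary test matrix, compared against the generic LinearAlgebra.det.
A = [2.0 0.0 1.0 3.0
     1.0 4.0 0.0 2.0
     0.0 1.0 5.0 1.0
     3.0 2.0 1.0 6.0]
@assert det4_minors(A) ≈ det(A)
```

The sketch only illustrates the operation count; the timings reported above come from the PR's own static-array implementation and should not be expected from this version.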