Optimize BandedBlockBandedMatrix #74
Codecov Report
@@            Coverage Diff             @@
##           master      #74      +/-   ##
==========================================
+ Coverage   67.98%   70.66%   +2.68%
==========================================
  Files           7        7
  Lines         406      450      +44
==========================================
+ Hits          276      318      +42
- Misses        130      132       +2
src/jacobians.jl
Outdated
end
end

@inline function _colorediteration!(Jac::BlockBandedMatrices.BandedBlockBandedMatrix,
Put this in an @require block.
@require BlockBandedMatrices="ffab5731-97b5-5995-9138-79e8c1846df0" begin
_use_findstructralnz(::BlockBandedMatrices.BandedBlockBandedMatrix) = false

@inline function _colorediteration!(Jac::BlockBandedMatrices.BandedBlockBandedMatrix,
What about just standard BlockBandedMatrix and BandedMatrix? Are those fine without a special iteration?
Yeah, I'm working on that.
The original index-based assignment is slow for BandedBlockBandedMatrix compared to a CSC matrix (#67). This PR switches to a column-based iteration. The evaluation time of the colored Jacobian for a BandedBlockBandedMatrix drops from 120 ms to 87 ms, while the performance for CSC matrices stays the same.
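To illustrate the idea of column-based iteration, here is a minimal sketch, not the code in this PR. It assumes a plain dense matrix with a known band structure rather than a BandedBlockBandedMatrix; the helper name decompress_columns! and the arguments l and u (lower and upper bandwidth) are hypothetical and exist only for this example. For each color, it walks the columns of that color and copies the rows inside the band from the compressed difference vector.

```julia
# Hypothetical helper (not part of the PR): fill the columns of J that carry
# `color` from the compressed difference vector dx, one column at a time.
# Assumes a banded structure with lower bandwidth l and upper bandwidth u.
function decompress_columns!(J::AbstractMatrix, dx::AbstractVector,
                             colors::AbstractVector{<:Integer},
                             color::Integer, l::Integer, u::Integer)
    m, n = size(J)
    for j in 1:n
        colors[j] == color || continue
        # Rows of column j that lie inside the band: j - u <= i <= j + l
        for i in max(1, j - u):min(m, j + l)
            J[i, j] = dx[i]
        end
    end
    return J
end

# Usage: recover a tridiagonal matrix A from three "colored" products A * d.
n = 6
A = zeros(n, n)
for j in 1:n, i in max(1, j - 1):min(n, j + 1)
    A[i, j] = 10i + j
end
colors = [mod1(j, 3) for j in 1:n]   # columns 3 apart never share a row
J = zeros(n, n)
for c in 1:3
    d = Float64.(colors .== c)       # sum of unit vectors of color c
    dx = A * d                       # stands in for a finite difference
    decompress_columns!(J, dx, colors, c, 1, 1)
end
@assert J == A
```

The inner loop touches each column's stored rows contiguously, which is the access pattern a column-oriented storage format favors. The actual speedup for a BandedBlockBandedMatrix depends on its internal layout and is not measured by this sketch.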