stable pinv least-squares via rtol,atol in svd, ldiv! #1387
Conversation
|
Note that the advice from JuliaLang/julia#8859 to use a tolerance … For example, here is the ill-conditioned Hilbert-matrix example from that PR:

```julia
hilb(n::Integer) = [1/(i+j-1) for i = 1:n, j = 1:n]

# form an ill-conditioned matrix
m, n = 1000, 100
A = randn(m, n) * hilb(n);

# create a right-hand side in the range of A
x0 = randn(n);
b = A*x0;

# find the least-squares solution
x = A \ b;

# the error in the rhs should be near machine precision
norm(A*x - b) / norm(b)
```

The final backwards relative error using the default (QR) method is near machine precision. However, if you use `pinv(A) * b` … While it's true that an … |
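For concreteness, here is a minimal sketch of the comparison this comment describes, using only standard `LinearAlgebra` calls; the exact error magnitudes depend on the random data, and the `err_*` names are just for illustration:

```julia
using LinearAlgebra

hilb(n::Integer) = [1/(i+j-1) for i = 1:n, j = 1:n]

m, n = 1000, 100
A = randn(m, n) * hilb(n)   # ill-conditioned least-squares matrix
b = A * randn(n)            # right-hand side in the range of A

# QR-based backslash: backward relative error near machine precision
err_qr = norm(A * (A \ b) - b) / norm(b)

# explicit pseudo-inverse: typically loses many digits on this problem
err_pinv = norm(A * (pinv(A) * b) - b) / norm(b)

# factored SVD application V * (Σ⁺ * (U' * b)), without forming pinv(A)
F = svd(A)
err_svd = norm(A * (F.V * (F.S .\ (F.U' * b))) - b) / norm(b)

(err_qr, err_pinv, err_svd)
```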
|
(Note that this adds a missing … ) |
Codecov Report

Attention: Patch coverage is …

```
@@           Coverage Diff            @@
##           master    #1387    +/- ##
=========================================
  Coverage   93.81%    93.82%
=========================================
  Files          34        34
  Lines       15722     15733      +11
=========================================
+ Hits        14750     14761      +11
  Misses        972       972
```
|
|
I'm not sure if there is a better reference than the offhand remarks in @ahbarnett's papers cited above for the fact that `pinv(A) * b` is not backwards stable. |
|
I forgot that Michael Stewart (github handle?) commented on this topic on julia discourse as well, where he pointed out that Higham's book shows that solving via an explicitly computed inverse is not backwards stable even in the square case.
|
|
Hi Steven - nice demo, and guilty as charged with respect to my "offhand remarks" :) To this day I do not know a good reference that pinv(A)*b is not backward stable for ill-conditioned LSQ problems, and I keep needing the idea in other papers. You guys are right to point to the Higham book Sec 14.1 for the square case - this is much more well known, and I probably should refer to it. But it's not satisfactory since square vs LSQ problems are usually treated differently. Higham Ch 20 (on LSQ) has many references in its afternotes but I do not see anything with the statement we need. Let me know if you find something! |
On second thought, I've reverted this change (but left a commented-out note), since it is unrelated to this PR; I'll file another PR once this is merged. @dkarrasch, do you have any thoughts on merging this? |
Just a small proposal, but otherwise this is good to merge from my perspective.
Co-authored-by: Daniel Karrasch <[email protected]>
|
Shouldn't we consider making `pinv(::SVD)` return an `SVD` of the pseudo-inverse instead of an explicit matrix? |
|
Oh, that's an idea. We have the freedom to do that since the `pinv(::SVD)` method is new in this PR, so there is no backwards-compatibility concern. On the other hand, it's not clear why anyone would do this rather than just doing … |
|
It would probably only really make a practical difference if … |
|
How do we create LinearAlgebra NEWS items for the next Julia release? Do we have to open a separate PR to the julia repo? |
|
No idea. Maybe @KristofferC knows. |
As suggested by @andreasnoack in #1387 (comment), this changes the `pinv(::SVD)` function to instead return the SVD of the pseudo-inverse, rather than an explicit matrix, so that it can be applied stably. (No backwards-compatibility issue since the `pinv(::SVD)` method was introduced in #1387.)
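The following is a hypothetical sketch of the idea behind that change, not the actual #1398 implementation (the helper name `pinv_svd_sketch` and the tolerance defaults here are assumptions): the pseudo-inverse $A^+ = V \Sigma^+ U^*$ has an SVD obtained by swapping the roles of $U$ and $V$ and inverting the retained singular values, so it can be applied in factored form without forming a dense matrix.

```julia
using LinearAlgebra

# Hypothetical sketch (not the actual #1398 code): build an SVD object
# representing pinv(A) from F = svd(A), dropping singular values below a
# tolerance and reversing the order so that the inverted values remain
# sorted in decreasing order, as an SVD expects.
function pinv_svd_sketch(F::SVD; atol::Real = 0,
                         rtol::Real = eps(eltype(F.S)) * length(F.S))
    tol = max(atol, rtol * (isempty(F.S) ? zero(eltype(F.S)) : first(F.S)))
    k = count(>(tol), F.S)   # number of singular values kept
    idx = k:-1:1             # reverse so that 1 ./ S stays in decreasing order
    return SVD(F.V[:, idx], 1 ./ F.S[idx], Matrix(F.U[:, idx]'))
end

# Applying it to a right-hand side b without forming a dense pseudo-inverse:
#   P = pinv_svd_sketch(svd(A))
#   x = P.U * (P.S .* (P.Vt * b))   # = V * (Σ⁺ * (U' * b))
```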


It recently came to my attention that solving a least-squares problem $\min_x \Vert Ax - b \Vert$ by the pseudo-inverse $\hat{x} = A^+ b$ is not backwards stable if you explicitly compute the pseudo-inverse matrix $A^+$. That is, given the SVD $A = U \Sigma V^*$, the right way is to compute $V (\Sigma^+ (U^* b))$, not $A^+ b = (V \Sigma^+ U^*) b$. See e.g. Liu & Barnett (2016) and references therein: …

`pinv(A) * b` is numerically unstable. If you want to use the SVD, the backwards-stable approach is `svd(A) \ b`. However, `pinv(A) * b` is convenient in that it allows you to specify explicit tolerances `atol, rtol` to regularize the problem by dropping small singular values, whereas if you use `svd(A)` directly there was no such truncation method provided, and furthermore the drop tolerance in `svd(A) \ b` was inconsistent with the default drop tolerance in `pinv(A) * b`.

To make it easier to do the right thing, this PR makes the following changes (a short usage sketch follows the list):

- `ldiv!(svd(A), b; atol, rtol)` now allows you to pass tolerances, and its default tolerances are consistent with `pinv`. (This gives you a backwards-stable algorithm, but is somewhat low-level.)
- You can now pass the `atol, rtol` keywords directly to the `svd` function, via `svd(A; atol, rtol)`, in order to compute a truncated SVD. This provides a higher-level, backwards-stable API, `svd(A; atol, rtol) \ b`, to compute a regularized least-squares solution without explicitly forming the `pinv` matrix.
- There is a new method `pinv(svd(A))` that lets you pass the SVD to `pinv` along with `atol, rtol` tolerances. (This still computes an explicit `pinv`, which is undesirable in ill-conditioned problems, but I got the extra method "for free" as the result of some refactoring.) Update: this method is improved in "return SVD from pinv(::SVD)" #1398 to return an `SVD` object, making its application backwards-stable.
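As a usage sketch of the higher-level API described above (the `atol`/`rtol` keywords are as described in this PR and are not available in earlier releases; the specific `rtol` value below is an arbitrary illustration, not a recommendation):

```julia
using LinearAlgebra

# the same ill-conditioned Hilbert-matrix example as in the thread above
hilb(n) = [1/(i+j-1) for i in 1:n, j in 1:n]
A = randn(1000, 100) * hilb(100)
b = A * randn(100)

# Truncated SVD with an explicit drop tolerance, then a backwards-stable
# regularized least-squares solve, without forming a dense pseudo-inverse.
F = svd(A; rtol = 1e-12)   # rtol chosen arbitrarily for illustration
x = F \ b

norm(A*x - b) / norm(b)    # backward relative error of the regularized solution
```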