@stevengj (Member) commented Jun 19, 2025

It recently came to my attention that solving a least-squares problem $\min_x \Vert Ax - b \Vert$ by the pseudo-inverse $\hat{x} = A^+ b$ is not backwards stable if you explicitly compute the pseudo-inverse matrix $A^+$. That is, pinv(A) * b is numerically unstable. If you want to use the SVD $A = U \Sigma V^*$, the right way is to compute $V (\Sigma^+ (U^* b))$, not $A^+ b = (V \Sigma^+ U^*) b$. See e.g. Liu & Barnett (2016) and references therein:

[figure: excerpt from Liu & Barnett (2016) on the instability of applying an explicitly computed pseudo-inverse]
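In code, the difference between the two approaches is just the order of operations. A minimal sketch (well-conditioned example, so both orderings agree here; they diverge for ill-conditioned A):

```julia
using LinearAlgebra

A = randn(100, 10)
b = randn(100)
F = svd(A)    # thin SVD: A ≈ F.U * Diagonal(F.S) * F.V'

# Unstable: explicitly form the pseudo-inverse matrix A⁺ = V Σ⁺ Uᴴ,
# then multiply it by b.
x_bad = (F.V * Diagonal(1 ./ F.S) * F.U') * b

# Backwards-stable: apply the factors right to left, so Uᴴb is computed
# first and the pseudo-inverse matrix is never formed.
x_good = F.V * ((F.U' * b) ./ F.S)
```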

However, pinv(A) * b is convenient in that it allows you to specify explicit tolerances atol, rtol to regularize the problem by dropping small singular values, whereas svd(A) provided no such truncation option, and furthermore the drop tolerance in svd(A) \ b was inconsistent with the default drop tolerance in pinv(A) * b.

To make it easier to do the right thing, this PR makes the following changes:

  • ldiv!(svd(A), b; atol, rtol) now allows you to pass tolerances, and its default tolerances are consistent with pinv. (This gives you a backwards-stable algorithm, but is somewhat low-level.)
  • You can now pass atol, rtol keywords directly to the svd function, via svd(A; atol, rtol), in order to compute a truncated SVD. This provides a higher-level, backwards-stable API svd(A; atol, rtol) \ b to compute a regularized least-squares solution without explicitly forming the pinv matrix.
  • A new method pinv(svd(A)) that lets you pass the SVD to pinv along with atol, rtol tolerances. (This still computes an explicit pinv, which is undesirable in ill-conditioned problems, but I got the extra method "for free" as the result of some refactoring.) Update: this method is improved in #1398 ("return SVD from pinv(::SVD)") to return an SVD object, making its application backwards-stable.
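The truncation these keywords perform can be sketched against the plain svd. The following is a hand-rolled equivalent for illustration, not the PR's code; the default rtol matching pinv is roughly eps() * min(m, n):

```julia
using LinearAlgebra

# Sketch of what svd(A; atol, rtol) \ b is meant to do: drop singular
# values below max(atol, rtol * σ₁), then apply the truncated factors
# right to left (never forming a pinv matrix).
function truncated_lsq(A, b; atol = 0.0,
                       rtol = eps(float(real(eltype(A)))) * minimum(size(A)))
    F = svd(A)
    tol = max(atol, rtol * first(F.S))   # F.S is sorted in descending order
    k = count(>(tol), F.S)               # number of retained singular values
    return F.V[:, 1:k] * ((F.U[:, 1:k]' * b) ./ F.S[1:k])
end
```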

@stevengj stevengj added the enhancement New feature or request label Jun 19, 2025
@stevengj (Member, Author) commented Jun 19, 2025

Note that the advice from JuliaLang/julia#8859 to use a tolerance $\sqrt{\epsilon}$ for pinv in ill-conditioned least-squares problems seems to be misleading. The correct advice is not to use pinv at all in such cases (at least, not with a small tolerance; with a large tolerance, you regularize away the ill-conditioning, but of course that changes the answer).

For example, here is the ill-conditioned Hilbert-matrix example from that PR:

hilb(n::Integer) = [1/(i+j-1) for i = 1:n, j=1:n]

# form an ill-conditioned matrix
m, n = 1000, 100
A = randn(m,n) * hilb(n);

# create a right hand side in the range of A
x0 = randn(n);
b = A*x0;

# find the least squares solution
x = A \ b;

# the error in rhs should be near machine precision
norm(A*x - b) / norm(b)

The final backwards relative error using the default (QR) method A \ b is indeed near machine precision, e.g. 5.642891778507215e-16 for one random example on my machine. You get similar backwards error from svd(A) \ b.

However, if you use pinv(A; rtol) * b, for any value of rtol, the backwards error never goes below 1e-10:
[figure: backwards relative error of pinv(A; rtol) * b versus rtol; the curve never drops below about 1e-10]

While it's true that an rtol near $\sqrt{\epsilon}$ gives the smallest backwards error, it is much better to not use pinv at all. (Unless you are intentionally using a large tolerance to regularize the problem, independent of precision, as noted above, which changes the answer.)
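Extending the script above, the comparison can be reproduced along these lines (the exact plateau level varies with the random matrix):

```julia
using LinearAlgebra

hilb(n::Integer) = [1/(i+j-1) for i = 1:n, j = 1:n]
m, n = 1000, 100
A = randn(m, n) * hilb(n)
b = A * randn(n)

backerr(x) = norm(A*x - b) / norm(b)

backerr(A \ b)        # near machine precision (pivoted QR)
backerr(svd(A) \ b)   # similar

# Sweeping rtol: the backwards error of pinv bottoms out near √ε but
# stays orders of magnitude above machine precision.
errs = [backerr(pinv(A; rtol) * b) for rtol in 10.0 .^ (-16:-4)]
```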

@stevengj (Member, Author) commented:

(Note that this adds a missing checksquare to inv(::SVD), which I would argue is a bug from JuliaLang/julia#32126. If you want the pseudo-inverse, you should call pinv(::SVD) explicitly, not rely on magic undocumented behavior of inv.)

codecov bot commented Jun 19, 2025

Codecov Report

Attention: Patch coverage is 97.22222% with 1 line in your changes missing coverage. Please review.

Project coverage is 93.82%. Comparing base (c5d7122) to head (e70acb3).
Report is 4 commits behind head on master.

Files with missing lines Patch % Lines
src/svd.jl 97.05% 1 Missing ⚠️
Additional details and impacted files
@@           Coverage Diff           @@
##           master    #1387   +/-   ##
=======================================
  Coverage   93.81%   93.82%           
=======================================
  Files          34       34           
  Lines       15722    15733   +11     
=======================================
+ Hits        14750    14761   +11     
  Misses        972      972           

@stevengj (Member, Author) commented Jun 19, 2025

I'm not sure if there is a better reference than the offhand remarks in @ahbarnett's papers cited above for the fact that pinv(A) * b is unstable. Trefethen & Bau, Golub & Van Loan, and Björck all recommend using the SVD for least-squares problems via $V \Sigma^+ (U^* b)$, with $U^* b$ computed first, but don't explicitly say that the other parenthesization is unstable.

@stevengj (Member, Author) commented Jun 23, 2025

I forgot that Michael Stewart (github handle?) commented on this topic on Julia Discourse as well, where he pointed out that Higham's book shows that $A^{-1}b$ is not a backwards stable way to solve $Ax=b$ for an invertible square matrix, and since this is a special case of the pseudo-inverse the result follows. On page 260, Higham writes:

[figure: excerpt from Higham, Accuracy and Stability of Numerical Algorithms, p. 260]
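Higham's square special case is easy to reproduce; a quick sketch using a Hilbert matrix (condition number around 1/ε):

```julia
using LinearAlgebra

hilb(n::Integer) = [1/(i+j-1) for i = 1:n, j = 1:n]
A = hilb(12)                  # cond(A) ≈ 1e16
b = A * ones(12)

# Normalized residual as a proxy for the backwards error of Ax = b.
res(x) = norm(A*x - b) / (norm(A) * norm(x))

res(A \ b)        # tiny: LU with pivoting is backwards stable
res(inv(A) * b)   # typically many orders of magnitude larger
```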

@ahbarnett commented:

Hi Steven - nice demo, and guilty as charged with respect to my "offhand remarks" :) To this day I do not know a good reference showing that pinv(A)*b is not backwards stable for ill-conditioned LSQ problems, and I keep needing the idea in other papers. You guys are right to point to the Higham book Sec 14.1 for the square case - this is much better known, and I probably should refer to it. But it's not satisfactory, since square and LSQ problems are usually treated differently. Higham Ch 20 (on LSQ) has many references in its afternotes, but I do not see anything with the statement we need. Let me know if you find something!
Best, Alex

@stevengj (Member, Author) commented:

> (Note that this adds a missing checksquare to inv(::SVD), which I would argue is a bug from JuliaLang/julia#32126. If you want the pseudo-inverse, you should call pinv(::SVD) explicitly, not rely on magic undocumented behavior of inv.)

On second thought, I've reverted this change (but left a commented-out note), since it is unrelated to this PR; I'll file another PR once this is merged.

@dkarrasch, do you have any thoughts on merging this?

@dkarrasch (Member) left a review comment:

Just a small proposal, but otherwise this is good to merge from my perspective.

Co-authored-by: Daniel Karrasch <[email protected]>
@dkarrasch dkarrasch merged commit db95a2a into master Jun 30, 2025
4 checks passed
@dkarrasch dkarrasch deleted the tsvd_pinv branch June 30, 2025 14:20
@andreasnoack (Member) commented:

Shouldn't we consider making pinv(::SVD) return an SVD or some other struct that would allow pinv(A) * b to do the right thing?

@stevengj (Member, Author) commented Jul 1, 2025

Oh, that's an idea. We have the freedom to do that since pinv(::SVD) is new.

On the other hand, it's not clear why anyone would do this rather than just doing F = svd(A); F \ b.

@andreasnoack (Member) commented:

It would probably only really make a practical difference if pinv(::Matrix) returned such a struct, but we probably can't get away with that. It just feels unnecessary to densify the result of pinv(::SVD), particularly now that we know that doing so is worse.

@stevengj (Member, Author) commented Jul 4, 2025

How do we create LinearAlgebra NEWS items for the next Julia release? Do we have to open a separate PR to the julia repo?

@andreasnoack (Member) commented:

No idea. Maybe @KristofferC knows.

stevengj added a commit that referenced this pull request Jul 7, 2025
As suggested by @andreasnoack in
#1387 (comment),
this changes the `pinv(::SVD)` function to instead return the SVD of the
pseudo-inverse, rather than an explicit matrix, so that it can be
applied stably.

(No backwards-compatibility issue since the `pinv(::SVD)` method was
introduced in #1387.)
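The observation behind that change is that the pseudo-inverse of $A = U \Sigma V^*$ is $A^+ = V \Sigma^+ U^*$, which is itself an SVD once the inverted singular values are reversed into descending order. A sketch of the identity (illustrative only, not the PR's actual code):

```julia
using LinearAlgebra

A = randn(8, 5)    # full column rank (almost surely)
F = svd(A)

# A⁺ = V Σ⁺ Uᴴ; reversing the columns keeps the singular values of the
# pseudo-inverse (1 ./ F.S) sorted in descending order, as SVD requires.
k = length(F.S)
rev = k:-1:1
Fp = SVD(F.V[:, rev], 1 ./ F.S[rev], Matrix(F.U[:, rev]'))

Fp.U * Diagonal(Fp.S) * Fp.Vt ≈ pinv(A)   # true
```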