Solving Linear Equations via SVD
- Solution set of a linear equation
- Pseudo-inverse
- Sensitivity analysis and condition number
Solution set
Consider a linear equation
data:image/s3,"s3://crabby-images/71ff3/71ff3a79b24cd5df19e48f4fd25b94f4821336f1" alt="Ax = y,"
where and
are given. We can completely describe the set of solutions via SVD, as follows. Let us assume that
admits an SVD given here. With
, pre-multiply the linear equation by the inverse of
,
; then we express the equation in terms of the rotated vector
. This leads to
data:image/s3,"s3://crabby-images/ec077/ec07756d36890972b3400fc307e7d73c2c51d0b8" alt="tilde{ {S}} tilde{x} = tilde{y},"
where is the ‘‘rotated’’ right-hand side of the equation. Due to the simple form of
, the above writes
data:image/s3,"s3://crabby-images/c3aad/c3aadcf79d38dd70a2e7879dbc34964637421821" alt="sigma_i tilde{x}_i = tilde{y}_i , ;; i = 1,ldots, r, ;; 0 = tilde{y}_i , ;; i=r+1,ldots,m."
Two cases can occur.
- If the last
components of
are not zero, then the above system is infeasible, and the solution set is empty. This occurs when
is not in the range of
.
- If
is in the range of
, then the last set of conditions in the above system hold, and we can solve for
with the first set of conditions:
data:image/s3,"s3://crabby-images/5e640/5e640eaddd5c3322af363c0eb0df8f9ed00d3898" alt="tilde{x}_i = frac{tilde{y}_i}{sigma_i} , ;; i = 1,ldots, r."
The last components of
are free. This corresponds to elements in the nullspace of
. If
is full column rank (its nullspace is reduced to
), then there is a unique solution.
Pseudo-inverse
Definition
The solution set is conveniently described in terms of the pseudo-inverse of , denoted by
, and defined via the SVD of
:
data:image/s3,"s3://crabby-images/a7596/a7596d6c3399bbd15044753f7a129b8597f31ab3" alt="A = U tilde{ {S}} V^T, ;; tilde{ {S}} = mbox{bf diag}(sigma_1,ldots,sigma_r,0,ldots,0)"
as one with same SVD, with non-zero singular values inverted, and the matrix transposed:
data:image/s3,"s3://crabby-images/7b9a4/7b9a4814ceb89c509c3e4ba420368ebb0c485026" alt="A^dagger := V tilde{ {S}}^dagger U^T, ;; tilde{ {S}} = mbox{bf diag}(sigma_1^{-1},ldots,sigma_r^{-1},0,ldots,0)."
The pseudo-inverse of a matrix is always well-defined, and that it has the same size as the transpose . When the matrix is invertible (it is square and full column or row rank:
), then it reduces to the inverse.
Example: pseudo-inverse of a matrix.
>> Adagger = pinv(A);
Link with solution set
From the above development, we see that the solution set can be written as
data:image/s3,"s3://crabby-images/c66ca/c66cad30e13ae689c5e3c86486d1cd9be53dc16e" alt="mathbf{S} = left{ A^dagger y + z ~:~ z in {cal N}(A) right},"
where is the nullspace of
. Both
and a basis for the nullspace can be computed via the SVD.
Case when
is full rank
If is full column rank, the pseudo-inverse can be written as
data:image/s3,"s3://crabby-images/d6180/d61808e2aa999444b0c7d6962e16cee4e1cd0d45" alt="A^dagger = (A^TA)^{-1}A^T."
In that case, is a left-inverse of
, since
.
If is full row-rank, then the pseudo-inverse can be written as
data:image/s3,"s3://crabby-images/c31ac/c31ac24570a9b2a95702a77832a212612053cb7a" alt="A^dagger = A^T(AA^T)^{-1}."
In that case, is a right-inverse of
, since
.
Sensitivity analysis and condition number
Sensitivity analysis refers to the process of quantifying the impact of changes in the linear equations’ coefficients (the matrix and vector
), on the solution. To simplify, let us assume that
is square and invertible, and analyze the effects of errors in
only. The condition number of the matrix
quantifies this.
We start from the linear equation above, which has the unique solution . Now assume that
is changed into
, where
is a vector that contains the changes in
. Then let us denote by
the new solution, which is
. From the equations
data:image/s3,"s3://crabby-images/7c866/7c866efd40b81b6212c0cd13f1b8278fdf859146" alt="delta x = A^{-1} delta y, ;;"
and via the definition of the largest singular value norm, we obtain:
data:image/s3,"s3://crabby-images/ac267/ac267d6285466106f3a4434c9f9d2a7d604fbef0" alt="|delta x|_2 le |A^{-1}| cdot |delta y|_2 , ;; |y|_2 le |A| cdot |x|_2 ."
Combining the two inequalities we obtain:
data:image/s3,"s3://crabby-images/9975c/9975cd0c5bc1e2ad122a7f0a26a6f225cdd3fb4f" alt="frac{|delta x|_2}{|x|_2} le kappa(A) cdot frac{|delta y|_2}{|y|_2} ,"
where is the condition number of
, defined as
data:image/s3,"s3://crabby-images/3cec3/3cec32d7355465d99235d098603f18aa732403fa" alt="kappa(A) := |A| cdot |A^{-1}|."
We can express the condition number as the ratio between the largest and smallest singular values of :
data:image/s3,"s3://crabby-images/7ff45/7ff45afa68771f0c89f0dd47b280ef68e9d69b17" alt="kappa(A) = frac{sigma_1(A)}{sigma_n(A)}."
The condition number gives a bound on the ratio between the relative error in the left-hand side to that of the solution. We can also analyze the effect of errors in the matrix itself on the solution. The condition number turns out to play a crucial role there as well.