36.1. Solution set
Consider a linear equation
\[ Ax = y, \]
where \( A \in \mathbb{R}^{m\times n} \) and \( y \in \mathbb{R}^m \) are given. We can completely describe the set of solutions via the SVD, as follows. Let us assume that \( A \) admits the SVD \( A = U\tilde{S}V^T \). Pre-multiplying the linear equation by \( U^T \), the inverse of the orthogonal matrix \( U \), and expressing the equation in terms of the rotated vector \( \tilde{x} := V^Tx \), we arrive at
\[ \tilde{S}\tilde{x}=\tilde{y}, \]
where \( \tilde{y} := U^Ty \) is the “rotated” right-hand side of the equation.
Due to the simple form of \( \tilde{S} \), the above equation reads
\[ \begin{aligned} \sigma_i \tilde{x}_i &= \tilde{y}_i, & i &= 1, \ldots, r, \\ 0 &= \tilde{y}_i, & i &= r+1, \ldots, m. \end{aligned} \]
Two cases can occur:
- If the last \( m-r \) components of \( \tilde{y} \) are not all zero, then the above system is infeasible, and the solution set is empty. This occurs when \( y \) is not in the range of \( A \).
- If \( y \) is in the range of \( A \), then the last set of conditions in the above system holds, and we can solve for \( \tilde{x} \) with the first set of conditions:
\[ \tilde{x}_i = \frac{\tilde{y}_i}{\sigma_i}, \quad i = 1,\ldots, r. \]
The last \( n-r \) components of \( \tilde{x} \) are free; they correspond to elements in the nullspace of \( A \). If \( A \) is full column rank (its nullspace is reduced to \( \{0\} \)), then there is a unique solution.
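As an illustration, here is a minimal Python sketch (using numpy) of the recipe above: rotate the right-hand side, check feasibility, and solve for the first \( r \) components of \( \tilde{x} \). The matrix \( A \) and vector \( y \) are made-up examples, not taken from the text.

```python
import numpy as np

# Made-up example: a rank-1 matrix (m=2, n=3) and a feasible right-hand side.
A = np.array([[1.0, 2.0, 0.0],
              [2.0, 4.0, 0.0]])
y = np.array([1.0, 2.0])   # lies in the range of A

U, s, Vt = np.linalg.svd(A)
tol = 1e-10
r = int(np.sum(s > tol))           # numerical rank

y_tilde = U.T @ y
# Feasibility: the last m - r components of U^T y must vanish.
feasible = np.allclose(y_tilde[r:], 0.0, atol=tol)

if feasible:
    x_tilde = np.zeros(A.shape[1])
    x_tilde[:r] = y_tilde[:r] / s[:r]   # the free components are set to zero
    x = Vt.T @ x_tilde                  # one particular solution
    print("particular solution:", x)
else:
    print("no solution: y is not in the range of A")
```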
36.2. Pseudo-inverse
Definition
The solution set is conveniently described in terms of the pseudo-inverse of \( A \), denoted by \( A^\dagger \), and defined via the SVD of \( A \):
\[ A = U \tilde{S} V^T, \quad \tilde{S} = \textbf{diag}(\sigma_1,\ldots,\sigma_r,0,\ldots,0), \]
as the matrix with the same SVD, in which the non-zero singular values are inverted and the matrix \( \tilde{S} \) is transposed:
\[ A^\dagger := V \tilde{S}^\dagger U^T, \quad \tilde{S}^\dagger = \textbf{diag}(\sigma_1^{-1},\ldots,\sigma_r^{-1},0,\ldots,0) \in \mathbb{R}^{n \times m}. \]
The pseudo-inverse of a matrix is always well-defined, and it has the same size as the transpose \( A^T \). When the matrix is invertible (that is, square and full rank: \( m=n=r \)), the pseudo-inverse reduces to the usual inverse \( A^{-1} \).
Example: pseudo-inverse of a \( 4 \times 5 \) matrix.
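The following sketch builds the pseudo-inverse of a \( 4 \times 5 \) matrix from its SVD; the specific matrix of the original example is not reproduced here, so a random matrix stands in for it.

```python
import numpy as np

# A random 4 x 5 matrix (almost surely of full rank 4).
rng = np.random.default_rng(0)
A = rng.standard_normal((4, 5))

U, s, Vt = np.linalg.svd(A)        # U: 4x4, s: (4,), Vt: 5x5
S_pinv = np.zeros((5, 4))          # same shape as A^T
S_pinv[:len(s), :len(s)] = np.diag(1.0 / s)   # invert the non-zero singular values
A_pinv = Vt.T @ S_pinv @ U.T

# Sanity check against numpy's built-in pseudo-inverse.
assert np.allclose(A_pinv, np.linalg.pinv(A))
print(A_pinv.shape)                # (5, 4), the shape of A^T
```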
Link with solution set
From the above development, we see that whenever \( y \) is in the range of \( A \), the solution set can be written as
\[ \mathbf{S} = \left\{ A^\dagger y + z :~ z \in \mathbf{N}(A) \right\}, \]
where \( \mathbf{N}(A) \) is the nullspace of \( A \). Both \( A^\dagger \) and a basis for the nullspace can be computed via the SVD.
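A short sketch of this construction, reusing the made-up \( A \) and \( y \) from the earlier snippet: the particular solution is \( A^\dagger y \), and the last \( n-r \) rows of \( V^T \) give a basis for the nullspace.

```python
import numpy as np

A = np.array([[1.0, 2.0, 0.0],
              [2.0, 4.0, 0.0]])
y = np.array([1.0, 2.0])

U, s, Vt = np.linalg.svd(A)
tol = 1e-10
r = int(np.sum(s > tol))

x_part = np.linalg.pinv(A) @ y     # A^dagger y, a particular solution
N = Vt[r:, :].T                    # columns form a basis of the nullspace of A

# Any solution is x_part + N @ c for some coefficient vector c.
c = np.array([0.5, -1.0])          # arbitrary coefficients
x = x_part + N @ c
assert np.allclose(A @ x, y)
```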
Case when \( A \) is full rank
If \( A \) is full column rank, the pseudo-inverse can be written as
\[ A^\dagger = (A^T A)^{-1} A^T. \]
In that case, \( A^\dagger \) is a left-inverse of \( A \), since \( A^\dagger A = I_n \).
If \( A \) is full row rank, then the pseudo-inverse can be written as
\[ A^\dagger = A^T (A A^T)^{-1}. \]
In that case, \( A^\dagger \) is a right-inverse of \( A \), since \( A A^\dagger = I_m \).
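A quick numerical check of both closed-form expressions; random matrices stand in for \( A \), since a generic random matrix has full rank with probability one.

```python
import numpy as np

rng = np.random.default_rng(1)

A = rng.standard_normal((6, 3))    # full column rank (m > n)
left = np.linalg.solve(A.T @ A, A.T)           # (A^T A)^{-1} A^T
assert np.allclose(left, np.linalg.pinv(A))
assert np.allclose(left @ A, np.eye(3))        # left inverse: A^dagger A = I_n

B = rng.standard_normal((3, 6))    # full row rank (m < n)
right = B.T @ np.linalg.inv(B @ B.T)           # A^T (A A^T)^{-1}
assert np.allclose(right, np.linalg.pinv(B))
assert np.allclose(B @ right, np.eye(3))       # right inverse: A A^dagger = I_m
```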
36.3. Sensitivity analysis and condition number
Sensitivity analysis refers to quantifying the impact of changes in the coefficients of the linear equation (the matrix \( A \) and the vector \( y \)) on the solution. To simplify, let us assume that \( A \) is square and invertible, and analyze only the effect of errors in \( y \). The condition number of the matrix \( A \) quantifies this effect.
We start from the linear equation above, which has the unique solution \( x = A^{-1} y \). Now assume that \( y \) is changed into \( y + \delta y \), where \( \delta y \) is a vector containing the changes in \( y \). Denote by \( x + \delta x = A^{-1}(y + \delta y) \) the new solution. Subtracting the two equations gives
\[ \delta x = A^{-1} \delta y, \]
and using the definition of the largest singular value norm (which guarantees \( \|Mv\|_2 \leq \|M\| \cdot \|v\|_2 \) for any matrix \( M \) and vector \( v \)), together with \( y = Ax \), we obtain:
\[ \|\delta x\|_2 \leq \|A^{-1}\| \cdot \|\delta y\|_2, \quad \|y\|_2 \leq \|A\| \cdot \|x\|_2. \]
Combining the two inequalities we get:
\[ \frac{\|\delta x\|_2}{\|x\|_2} \leq \kappa(A) \cdot \frac{\|\delta y\|_2}{\|y\|_2}, \]
where \( \kappa(A) \) is the condition number of \( A \), defined as:
\[ \kappa(A) := \|A\| \cdot \|A^{-1}\|. \]
We can express the condition number as the ratio between the largest and smallest singular values of \( A \), since \( \|A\| = \sigma_1(A) \) and \( \|A^{-1}\| = 1/\sigma_n(A) \):
\[ \kappa(A) = \frac{\sigma_1(A)}{\sigma_n(A)}. \]
The condition number thus bounds the ratio of the relative error in the solution to the relative error in the right-hand side. We can also analyze the effect of errors in the matrix \( A \) itself on the solution; the condition number turns out to play a crucial role there as well.
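To close, a small experiment checking the bound above on a made-up ill-conditioned matrix; with the perturbation chosen along the smallest singular direction, the bound is nearly tight.

```python
import numpy as np

# Diagonal matrix with condition number 10^4.
A = np.array([[1.0, 0.0],
              [0.0, 1e-4]])
y = np.array([1.0, 0.0])
x = np.linalg.solve(A, y)

s = np.linalg.svd(A, compute_uv=False)
kappa = s[0] / s[-1]               # sigma_1 / sigma_n
assert np.isclose(kappa, np.linalg.cond(A))

dy = 1e-6 * np.array([0.0, 1.0])   # small perturbation of the right-hand side
dx = np.linalg.solve(A, dy)

lhs = np.linalg.norm(dx) / np.linalg.norm(x)
rhs = kappa * np.linalg.norm(dy) / np.linalg.norm(y)
print(lhs <= rhs + 1e-15, lhs, rhs)   # the bound holds, and is nearly tight here
```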