
[latexpage]

Theorem: Singular Value Decomposition (SVD)

 

An arbitrary matrix $A \in \mathbb{R}^{m \times n}$ admits a decomposition of the form

$$
A=\sum_{i=1}^r \sigma_i u_i v_i^T=U \tilde{S} V^T, \quad \tilde{S}:=\left(\begin{array}{cc}
S & 0 \\
0 & 0
\end{array}\right)
$$

where $U \in \mathbb{R}^{m \times m}, V \in \mathbb{R}^{n \times n}$ are both orthogonal matrices, and the matrix $S$ is diagonal:

$$
S=\mathbf{diag}\left(\sigma_1, \ldots, \sigma_r\right),
$$

where the positive numbers $\sigma_1 \geq \ldots \geq \sigma_r>0$ are unique, and are called the singular values of $A$. The number $r \leq \min (m, n)$ is equal to the rank of $A$, and the triplet $(U, \tilde{S}, V)$ is called a singular value decomposition (SVD) of $A$. The first $r$ columns of $U: u_i, i=1, \ldots, r$ (resp. $V: v_i, i=1, \ldots, r$ ) are called left (resp. right) singular vectors of $A$, and satisfy

$$
A v_i=\sigma_i u_i, \quad u_i^T A=\sigma_i v_i^T, \quad i=1, \ldots, r .
$$
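
As a quick numerical sanity check, these identities can be verified against the decomposition returned by NumPy's `np.linalg.svd` (a sketch; the small matrix $A$ below is a hypothetical example chosen for illustration):

```python
import numpy as np

# Hypothetical example matrix; any real m x n matrix works here.
A = np.array([[3.0, 1.0, 0.0],
              [1.0, 3.0, 0.0]])  # m = 2, n = 3

# NumPy returns U (m x m), the singular values in decreasing order, and V^T.
U, s, Vt = np.linalg.svd(A, full_matrices=True)
V = Vt.T
r = int(np.sum(s > 1e-12))  # numerical rank of A

# Check A v_i = sigma_i u_i and u_i^T A = sigma_i v_i^T for i = 1, ..., r.
for i in range(r):
    assert np.allclose(A @ V[:, i], s[i] * U[:, i])
    assert np.allclose(U[:, i] @ A, s[i] * V[:, i])

# The rank-r sum of outer products sigma_i u_i v_i^T reconstructs A.
A_rec = sum(s[i] * np.outer(U[:, i], V[:, i]) for i in range(r))
assert np.allclose(A, A_rec)
```
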

Proof: The $n \times n$ matrix $A^T A$ is real and symmetric. According to the spectral theorem, it admits an eigenvalue decomposition of the form $A^T A=V \Lambda V^T$, with $V$ an $n \times n$ matrix whose columns form an orthonormal basis (that is, $V^T V=V V^T=I_n$), and $\Lambda=\mathbf{diag}\left(\lambda_1, \ldots, \lambda_r, 0, \ldots, 0\right)$. Here, $r$ is the rank of $A^T A$, which equals the rank of $A$ (if $r=n$ then there are no trailing zeros in $\Lambda$). Since $A^T A$ is positive semi-definite, the $\lambda_j$'s are non-negative, and we can define the positive quantities $\sigma_j:=\sqrt{\lambda_j}$, $j=1, \ldots, r$.

Note that when $j>r, A v_j=0$, since then $\left\|A v_j\right\|_2^2=v_j^T A^T A v_j=\lambda_j v_j^T v_j=0$.

Let us construct an $m \times m$ orthogonal matrix $U$ as follows. We set

$$
u_i=\frac{1}{\sigma_i} A v_i, \quad i=1, \ldots, r .
$$

These $m$-vectors are unit-norm and mutually orthogonal: for $i, j \leq r$,

$$
u_i^T u_j=\frac{1}{\sigma_i \sigma_j} v_i^T A^T A v_j=\frac{\lambda_j}{\sigma_i \sigma_j} v_i^T v_j=\delta_{i j},
$$

since the $v_j$'s are orthonormal eigenvectors of $A^T A$. Using (say) the Gram-Schmidt orthogonalization procedure, we can complete this set of vectors (if necessary, that is, in the case $r<m$) by $u_{r+1}, \ldots, u_m$ in order to form an orthogonal matrix $U:=\left(u_1, \ldots, u_m\right) \in \mathbb{R}^{m \times m}$.
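
The construction in the proof can be carried out numerically. The sketch below (assuming NumPy; the example matrix and the small Gram-Schmidt helper `complete_basis` are illustrative choices, not part of the text) builds $V$ from the eigendecomposition of $A^T A$, forms $u_i=A v_i / \sigma_i$, and completes the basis:

```python
import numpy as np

def complete_basis(U_r):
    """Complete orthonormal columns U_r (m x r) to an m x m orthogonal
    matrix via Gram-Schmidt against the standard basis vectors."""
    m, r = U_r.shape
    cols = [U_r[:, i] for i in range(r)]
    for e in np.eye(m):
        w = e - sum((u @ e) * u for u in cols)  # project out current span
        norm = np.linalg.norm(w)
        if norm > 1e-10:
            cols.append(w / norm)
        if len(cols) == m:
            break
    return np.column_stack(cols)

# Hypothetical example: m = 3, n = 2, rank r = 2, so completion is needed.
A = np.array([[2.0, 0.0],
              [0.0, 0.0],
              [0.0, 1.0]])

# Spectral decomposition A^T A = V Lambda V^T, eigenvalues sorted decreasing.
lam, V = np.linalg.eigh(A.T @ A)
order = np.argsort(lam)[::-1]
lam, V = lam[order], V[:, order]
r = int(np.sum(lam > 1e-12))
sigma = np.sqrt(lam[:r])

# u_i = A v_i / sigma_i for i = 1, ..., r, then complete to an m x m basis.
U = complete_basis(A @ V[:, :r] / sigma)

# Verify U^T A V = S-tilde, with S = diag(sigma) in the top-left block.
S_tilde = np.zeros(A.shape)
S_tilde[:r, :r] = np.diag(sigma)
assert np.allclose(U.T @ A @ V, S_tilde)
assert np.allclose(U.T @ U, np.eye(A.shape[0]))
```
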

Let us check that $U, V$ satisfy the conditions of the theorem, by showing that

$$
U^T A V=\tilde{S}:=\mathbf{diag}\left(\sigma_1, \ldots, \sigma_r, 0, \ldots, 0\right).
$$

We have

$$
\left(U^T A V\right)_{i j}=u_i^T A v_j= \begin{cases}\sigma_j u_i^T u_j & \text { if } j \leq r \\ 0 & \text { otherwise, }\end{cases}
$$

where the second case stems from the fact that $A v_j=0$ when $j>r$. Since the columns of $U$ are orthonormal, $u_i^T u_j=\delta_{i j}$, so $\left(U^T A V\right)_{i j}=\sigma_j \delta_{i j}$ and $U^T A V=\tilde{S}$, as claimed.

License


This work (Đại số tuyến tính, i.e. Linear Algebra, by Tony Tin) is free of known copyright restrictions.