Matrix Properties via SVD

  • Nullspace
  • Range, rank
  • Fundamental theorem of linear algebra
  • Matrix norms and condition number


Nullspace

Finding a basis for the nullspace

The SVD allows the computation of an orthonormal basis for the nullspace of a matrix. To understand this, let us first consider a matrix of the form

    \[\tilde{S} = \begin{pmatrix} 1.3 & 0 & 0 \\ 0 & 2.1 & 0 \end{pmatrix}.\]

The nullspace of this matrix is readily found by solving the equation \tilde{S}x=0. We obtain that x=(x_{1}, x_{2}, x_{3}) is in the nullspace if and only if the first two components of x are zero: x_{1}=x_{2}=0, with x_{3} free.

What about a general matrix A, which admits the SVD as given in the SVD theorem? Since U is orthogonal, we can pre-multiply the nullspace equation Ax=0 by U^{T}, and solve in terms of the "rotated" variable \tilde{x} := V^{T}x. We obtain the condition on \tilde{x}:

    \[0=\tilde{S}\tilde{x}=(\sigma_{1}\tilde{x}_{1},\dots,\sigma_{r}\tilde{x}_{r},0,\dots,0).\]

The above is equivalent to the first r components of \tilde{x} being zero. Since x = V\tilde{x}, this corresponds to the fact that x belongs to the span of the last n-r columns of V. Note that these columns form a set of mutually orthogonal, normalized vectors that span the nullspace: hence they form an orthonormal basis for it.

Theorem: nullspace via SVD

The nullspace of a matrix A with SVD

    \[A = U\tilde{S}V^{T}, \quad \tilde{S} := \begin{pmatrix} S & 0 \\ 0 & 0 \end{pmatrix}, \quad S = \mathrm{diag}(\sigma_{1},\dots,\sigma_{r}),\]

where U \in \mathbb{R}^{m \times m}, V \in \mathbb{R}^{n \times n} are both orthogonal matrices, admits the last n-r columns of V as an orthonormal basis.

Example: Nullspace of a 4 \times 5 matrix.
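
As a numerical illustration of the theorem, the following NumPy sketch extracts an orthonormal basis of the nullspace from the SVD. The test matrix, variable names, and rank tolerance are arbitrary choices made for this sketch, not part of the text.

    import numpy as np

    # Illustrative rank-2 matrix of size 4 x 5 (a sum of two dyads).
    rng = np.random.default_rng(0)
    A = rng.standard_normal((4, 2)) @ rng.standard_normal((2, 5))

    # Full SVD: U is 4 x 4, Vt is 5 x 5, s holds the singular values.
    U, s, Vt = np.linalg.svd(A, full_matrices=True)

    # Numerical rank r: number of singular values above a small tolerance.
    tol = max(A.shape) * np.finfo(float).eps * s[0]
    r = int(np.sum(s > tol))

    # Last n - r columns of V (rows of Vt): orthonormal basis of the nullspace.
    N = Vt[r:].T

    print(r)                                     # 2
    print(np.allclose(A @ N, 0))                 # True: basis vectors lie in N(A)
    print(np.allclose(N.T @ N, np.eye(5 - r)))   # True: the basis is orthonormal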

Full column-rank matrices

One-to-one (or, full column rank) matrices are the matrices with nullspace reduced to \{0\}. If the dimension of the nullspace is zero, then we must have r=n. Thus, full column rank matrices are those with an SVD of the form

    \[A=U \begin{pmatrix} S \\ 0 \end{pmatrix}V^{T}.\]
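
Numerically, one can test this property by counting the singular values above a tolerance and comparing with n; the short sketch below (the matrix and the tolerance are illustrative assumptions) follows this route.

    import numpy as np

    # A 3 x 2 matrix chosen to be one-to-one (full column rank).
    A = np.array([[1.0, 0.0],
                  [0.0, 2.0],
                  [3.0, 1.0]])

    s = np.linalg.svd(A, compute_uv=False)
    tol = max(A.shape) * np.finfo(float).eps * s[0]
    r = int(np.sum(s > tol))

    print(r == A.shape[1])    # True: r = n, so the nullspace is {0}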

Range, rank via the SVD

Basis of the range

As with the nullspace, we can express the range in terms of the SVD of the matrix A. Indeed, the range of A is the set of vectors of the form

    \[Ax=U \begin{pmatrix} S & 0\\ 0 & 0 \end{pmatrix}V^{T}x,\]

where x \in \mathbb{R}^{n}. Since V is orthogonal, when x spans \mathbb{R}^{n}, so does \tilde{x} := V^{T}x. Decomposing the latter vector into two sub-vectors \tilde{x}=(\tilde{x}_{r},\tilde{x}_{n-r}), we obtain that the range is the set of vectors U\tilde{y}, with

    \[\tilde{y} = \begin{pmatrix} S & 0\\ 0 & 0 \end{pmatrix} \begin{pmatrix} \tilde{x}_{r} \\ \tilde{x}_{n-r} \end{pmatrix} = \begin{pmatrix} S\tilde{x}_{r} \\ 0 \end{pmatrix},\]

where \tilde{x}_{r} is an arbitrary vector of \mathbb{R}^{r}. Since S is invertible, z := S\tilde{x}_{r} also spans \mathbb{R}^{r}. We obtain that the range is the set of vectors U\tilde{y}, where \tilde{y} is of the form (z, 0) with z \in \mathbb{R}^{r} arbitrary. This means that the range is the span of the first r columns of the orthogonal matrix U, and that these columns form an orthonormal basis for it. Hence, the number r of dyads appearing in the SVD is indeed the rank (the dimension of the range).

Theorem: range and rank via SVD

The range of a matrix A with SVD

    \[A = U\tilde{S}V^{T}, \quad \tilde{S} = \mathrm{diag}(\sigma_{1},\dots,\sigma_{r},0,\dots,0),\]

where U \in \mathbb{R}^{m \times m}, V \in \mathbb{R}^{n \times n} are both orthogonal matrices, admits the first r columns of U as an orthonormal basis.
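
A minimal NumPy sketch of this result, assuming an arbitrary low-rank test matrix and the same rank tolerance as above, recovers the rank and an orthonormal basis of the range from the SVD.

    import numpy as np

    # Illustrative rank-2 matrix of size 4 x 5.
    rng = np.random.default_rng(1)
    A = rng.standard_normal((4, 2)) @ rng.standard_normal((2, 5))

    U, s, Vt = np.linalg.svd(A, full_matrices=True)
    tol = max(A.shape) * np.finfo(float).eps * s[0]
    r = int(np.sum(s > tol))

    # First r columns of U: orthonormal basis for the range of A.
    R = U[:, :r]

    print(r)                                  # 2
    print(np.allclose(R @ (R.T @ A), A))      # True: projecting columns of A onto span(R) changes nothing
    print(np.allclose(R.T @ R, np.eye(r)))    # True: the basis is orthonormal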

Full row rank matrices

An onto (or, full row rank) matrix has rank r=m, so that its range is the whole output space \mathbb{R}^{m}. These matrices are characterized by an SVD of the form

    \[A=U \begin{pmatrix} S & 0 \end{pmatrix}V^{T}.\]

Example: Range of a 4 \times 5 matrix.

Fundamental theorem of linear algebra

The theorem already mentioned here allows us to decompose any vector into the sum of two orthogonal vectors, the first in the nullspace of a matrix A, and the second in the range of its transpose. This theorem will be useful for writing optimality conditions for linearly constrained optimization problems.

Fundamental theorem of linear algebra

Let A \in \mathbb{R}^{m \times n}. The sets N(A) and R(A^{T}) form an orthogonal decomposition of \mathbb{R}^{n}, in the sense that any vector x \in \mathbb{R}^{n} can be written as

    \[x = y+z, y \in N(A), z \in R(A^{T}), y^{T}z=0.\]

In particular, we obtain that a vector x is orthogonal to every vector in the nullspace of A if and only if it belongs to the range of A^{T}:

    \[x^{T}y = 0 \text{ whenever } Ay=0 \quad\Leftrightarrow\quad \exists \lambda \in \mathbb{R}^{m}: x = A^{T}\lambda.\]

Proof.
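
A minimal sketch of this decomposition, assuming NumPy and arbitrary test data, splits a vector x into its nullspace and row-space components using the right singular vectors of A.

    import numpy as np

    # Illustrative rank-2 matrix of size 3 x 5 and an arbitrary vector x.
    rng = np.random.default_rng(2)
    A = rng.standard_normal((3, 2)) @ rng.standard_normal((2, 5))
    x = rng.standard_normal(5)

    U, s, Vt = np.linalg.svd(A, full_matrices=True)
    tol = max(A.shape) * np.finfo(float).eps * s[0]
    r = int(np.sum(s > tol))

    Vr = Vt[:r].T        # columns span R(A^T)
    Vn = Vt[r:].T        # columns span N(A)

    z = Vr @ (Vr.T @ x)  # component in the range of A^T
    y = Vn @ (Vn.T @ x)  # component in the nullspace of A

    print(np.allclose(x, y + z))      # True: x = y + z
    print(np.allclose(A @ y, 0))      # True: y is in N(A)
    print(np.isclose(y @ z, 0.0))     # True: the two components are orthogonal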

Matrix norms, condition number

Matrix norms are useful to measure the size of a matrix. Some of them can be interpreted in terms of input-output properties of the corresponding linear map; for example, the Frobenius norm measures the average response to unit vectors, while the largest singular value (LSV) norm measures the peak gain. These two norms can be easily read from the SVD.

Frobenius norm

The Frobenius norm can be defined as

    \[||A||_{F} = \sqrt{Tr A^{T}A}\]

Using the SVD (U, \tilde{S}, V) of A, we obtain

    \[||A||_{F}^{2} = Tr(V\tilde{S}^{T}\tilde{S}V^{T})=Tr(V^{T}V\tilde{S}^{T}\tilde{S})=Tr(\tilde{S}^{T}\tilde{S})=\sum_{i=1}^{r}\sigma_{i}^{2}.\]

Hence the squared Frobenius norm is nothing else than the sum of the squares of the singular values.
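
This identity is easy to check numerically; the one-liner below (with illustrative random data) compares the Frobenius norm computed directly with the sum of squared singular values.

    import numpy as np

    # Illustrative 4 x 5 matrix.
    rng = np.random.default_rng(3)
    A = rng.standard_normal((4, 5))

    s = np.linalg.svd(A, compute_uv=False)
    print(np.isclose(np.linalg.norm(A, 'fro')**2, np.sum(s**2)))   # True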

Largest singular value norm

An alternate way to measure matrix size is based on asking for the maximum ratio of the norm of the output to the norm of the input. When the norm used is the Euclidean norm, the corresponding quantity

    \[||A||_{LSV} := \max_{x : ||x||_{2} \le 1}||Ax||_{2}\]

is called the largest singular value (LSV) norm. The reason for this wording is given by the following theorem.

Theorem: largest singular value norm

For any matrix A,

    \[||A||_{LSV} := \max_{x : ||x||_{2} \le 1}||Ax||_{2} = \sigma_{1}(A),\]

where \sigma_{1}(A) is the largest singular value of A. Any unit-norm right singular vector associated with the largest singular value achieves the maximum in the above.

Example: Norms of a 4 \times 5 matrix.
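
The theorem can also be checked numerically; in the sketch below (random illustrative data), the spectral norm returned by NumPy matches the largest singular value, and the maximum is attained at the first right singular vector.

    import numpy as np

    # Illustrative 4 x 5 matrix.
    rng = np.random.default_rng(4)
    A = rng.standard_normal((4, 5))

    U, s, Vt = np.linalg.svd(A)
    v1 = Vt[0]       # unit-norm right singular vector for sigma_1

    print(np.isclose(np.linalg.norm(A, 2), s[0]))       # True: ||A||_LSV = sigma_1
    print(np.isclose(np.linalg.norm(A @ v1), s[0]))     # True: the maximum is attained at v1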

Condition number

The condition number of an invertible n \times n matrix A is the ratio between its largest and smallest singular values; it also equals the product of the LSV norms of A and of its inverse:

    \[\kappa(A)=\dfrac{\sigma_{1}}{\sigma_{n}}=||A||_{LSV}\cdot||A^{-1}||_{LSV}.\]

As seen in the next section, this number provides a measure of the sensitivity of the solution of a linear equation to changes in A.
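
As a sanity check, the following sketch (assuming a random, hence almost surely invertible, test matrix) verifies that the ratio of extreme singular values agrees with the norm-product expression.

    import numpy as np

    # Illustrative invertible 4 x 4 matrix.
    rng = np.random.default_rng(5)
    A = rng.standard_normal((4, 4))

    s = np.linalg.svd(A, compute_uv=False)
    kappa = s[0] / s[-1]

    print(np.isclose(kappa, np.linalg.cond(A, 2)))   # True: sigma_1 / sigma_n
    print(np.isclose(kappa, np.linalg.norm(A, 2) * np.linalg.norm(np.linalg.inv(A), 2)))   # True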

