19
19.1. Set of solutions
Consider the linear equation in [latex]x \in \mathbb{R}^n[/latex]:
[latex]\begin{align*}Ax = y,\end{align*}[/latex]
where [latex]A \in \mathbb{R}^{m \times n}[/latex] and [latex]y \in \mathbb{R}^m[/latex] are given, and [latex]x \in \mathbb{R}^n[/latex] is the variable.
The set of solutions to the above equation, if it is not empty, is an affine subspace. That is, it is of the form [latex]x_0 + L[/latex] where [latex]L[/latex] is a subspace.
We’d like to be able to
- determine if a solution exists;
- if so, determine if it is unique;
- compute a solution [latex]x_0[/latex] if one exists;
- find an orthonormal basis of the subspace [latex]L[/latex].
19.2. Existence: range and rank of a matrix
Range
The range (or, image) of an [latex]m \times n[/latex] matrix [latex]A[/latex] is defined as the following subset of [latex]\mathbb{R}^m[/latex]:
[latex]\begin{align*} \mathbf{R}(A) := \{Ax : x \in \mathbb{R}^n\}. \end{align*}[/latex]
The range describes the vectors [latex]y = Ax[/latex] that can be attained in the output space by an arbitrary choice of a vector [latex]x[/latex], in the input space. The range is simply the span of the columns of [latex]A[/latex].
If [latex]y \not \in \mathbf{R}(A)[/latex], we say that the linear equation [latex]Ax = y[/latex] is infeasible. The set of solutions to the linear equation is empty.
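One way to test feasibility in practice uses the standard fact that [latex]y \in \mathbf{R}(A)[/latex] exactly when appending [latex]y[/latex] to [latex]A[/latex] as an extra column does not increase the rank. The sketch below is only an illustration: the helper names `rank` and `is_feasible` are hypothetical, and the elimination is done in exact rational arithmetic so that no tolerance is needed.

```python
from fractions import Fraction


def rank(rows):
    """Rank of a matrix (given as a list of rows) by Gaussian elimination."""
    m = [[Fraction(v) for v in row] for row in rows]
    n_cols = len(m[0]) if m else 0
    r = 0  # number of pivots found so far
    for c in range(n_cols):
        # Look for a row at or below position r with a nonzero entry in column c.
        pivot = next((i for i in range(r, len(m)) if m[i][c] != 0), None)
        if pivot is None:
            continue
        m[r], m[pivot] = m[pivot], m[r]
        for i in range(r + 1, len(m)):
            f = m[i][c] / m[r][c]
            m[i] = [a - f * b for a, b in zip(m[i], m[r])]
        r += 1
    return r


def is_feasible(A, y):
    """True when y lies in the range of A, that is, rank([A y]) == rank(A)."""
    augmented = [list(row) + [yi] for row, yi in zip(A, y)]
    return rank(augmented) == rank(A)


A = [[1, 2], [2, 4]]            # both columns are multiples of (1, 2)
print(is_feasible(A, [1, 2]))   # y is the first column of A
print(is_feasible(A, [1, 3]))   # y is not a multiple of (1, 2)
```

With floating-point data, an exact rank test of this kind is unreliable, so a tolerance-based method would be needed instead.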
From a matrix [latex]A[/latex], it is possible to compute a matrix [latex]U[/latex] whose columns span the range of [latex]A[/latex], are mutually orthogonal, and have unit norm. Hence, [latex]U^TU = I_r[/latex], where [latex]r[/latex] is the dimension of the range. One algorithm to obtain the matrix [latex]U[/latex] is the Gram-Schmidt procedure.
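As a minimal sketch of the Gram-Schmidt idea, the code below orthonormalizes the columns of a matrix one at a time and drops any column that depends on the earlier ones. The function name `gram_schmidt` and the absolute tolerance `tol` are assumptions made for this illustration; the variant shown subtracts projections sequentially (often called modified Gram-Schmidt).

```python
import math


def gram_schmidt(columns, tol=1e-12):
    """Return orthonormal vectors spanning the same space as `columns`."""
    basis = []
    for a in columns:
        v = list(a)
        # Remove the components of v along the vectors already in the basis.
        for u in basis:
            coeff = sum(ui * vi for ui, vi in zip(u, v))
            v = [vi - coeff * ui for ui, vi in zip(u, v)]
        norm = math.sqrt(sum(vi * vi for vi in v))
        if norm > tol:  # a (near-)zero remainder means the column is dependent
            basis.append([vi / norm for vi in v])
    return basis


# Columns of a 3 x 3 matrix whose third column is the sum of the first two.
cols = [(1, 1, 0), (1, 0, 1), (2, 1, 1)]
U = gram_schmidt(cols)
print(len(U))  # number of orthonormal vectors found, i.e. the rank
```

Here the list `U` plays the role of the columns of the matrix [latex]U[/latex], and its length is the dimension [latex]r[/latex] of the range.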
Example: An infeasible linear system.
Rank
The dimension of the range is called the rank of the matrix, denoted [latex]r[/latex]. As we will see later, the rank cannot exceed either of the dimensions of the matrix [latex]A[/latex]: [latex]r \leq \min(m, n)[/latex]. A matrix is said to be full rank if [latex]r = \min(m, n)[/latex].
Note that the rank is a very ‘‘brittle’’ notion, in that small changes in the entries of the matrix can dramatically change its rank. For instance, a matrix whose entries are drawn independently from a continuous distribution is full rank with probability one. We will develop a better, more numerically reliable notion.
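To make the brittleness concrete, consider the following family of matrices, which is an illustration of our own rather than an example from the text:

[latex]\begin{align*} A_\epsilon = \begin{pmatrix} 1 & 1 \\ 1 & 1 + \epsilon \end{pmatrix}, \qquad \det A_\epsilon = \epsilon. \end{align*}[/latex]

For [latex]\epsilon = 0[/latex] the two columns coincide and the rank is [latex]1[/latex]. For any [latex]\epsilon \neq 0[/latex], however small, the determinant is nonzero and the rank jumps to [latex]2[/latex].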
[latexpage]
Example 1: Range and rank of a simple matrix.
Let’s consider the matrix
\[ A = \begin{pmatrix} 1 & 2 \\ 3 & 4 \end{pmatrix}. \]
Range: The columns of \(A\) are
\[ c_1 = \begin{pmatrix} 1 \\ 3 \end{pmatrix}, \quad c_2 = \begin{pmatrix} 2 \\ 4 \end{pmatrix}. \]
Any linear combination of these vectors can be represented as \(Ax\), where \(x \in \mathbb{R}^2\). For our matrix \(A\), the range is the plane spanned by \(c_1\) and \(c_2\), which here is all of \(\mathbb{R}^2\).
Rank: The rank of a matrix is the dimension of its range. For our matrix \(A\), since the two column vectors are linearly independent, the rank is:
\[ \text{rank}(A) = 2. \]
Thus, the matrix \(A\) is of full rank.
Full row rank matrices
The matrix [latex]A[/latex] is said to be full row rank (or, onto) if the range is the whole output space, [latex]\mathbb{R}^m[/latex]. The name ‘‘full row rank’’ comes from the fact that the rank equals the row dimension of [latex]A[/latex]. Since the rank never exceeds the smaller of the number of rows and the number of columns, an [latex]m \times n[/latex] matrix of full row rank necessarily has no more rows than columns (that is, [latex]m \leq n[/latex]).
An equivalent condition for [latex]A[/latex] to be full row rank is that the square, [latex]m \times m[/latex] matrix [latex]AA^T[/latex] is invertible, meaning that it has full rank, [latex]m[/latex].
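The invertibility test on [latex]AA^T[/latex] can be tried on small cases. The sketch below uses two hand-picked [latex]2 \times 3[/latex] matrices and hypothetical helpers `gram_rows` and `det2`; the determinant formula is specific to the [latex]2 \times 2[/latex] case.

```python
def gram_rows(A):
    """Return A A^T, the m x m matrix of inner products between rows of A."""
    return [[sum(a * b for a, b in zip(ri, rj)) for rj in A] for ri in A]


def det2(M):
    """Determinant of a 2 x 2 matrix."""
    return M[0][0] * M[1][1] - M[0][1] * M[1][0]


A_full = [[1, 0, 1], [0, 1, 1]]       # the two rows are independent
A_deficient = [[1, 2, 1], [2, 4, 2]]  # the second row is twice the first

print(det2(gram_rows(A_full)))       # 3, so A A^T is invertible
print(det2(gram_rows(A_deficient)))  # 0, so A A^T is singular
```

In floating-point work, a determinant is a poor indicator of near-singularity, so this check is meant only for small exact examples.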
19.3. Unicity: nullspace of a matrix
Nullspace
The nullspace (or, kernel) of an [latex]m \times n[/latex] matrix [latex]A[/latex] is the following subspace of [latex]\mathbb{R}^n[/latex]:
[latex]\begin{align*} \mathbf{N}(A) := \{ x \in \mathbb{R}^n : Ax = 0 \}. \end{align*}[/latex]
The nullspace describes the ambiguity in [latex]x[/latex] given [latex]y = Ax[/latex]: any [latex]z \in \mathbf{N}(A)[/latex] satisfies [latex]A(x+z) = y[/latex], so [latex]x[/latex] cannot be determined from [latex]y[/latex] alone unless the nullspace reduces to the singleton [latex]\{0\}[/latex].
From a matrix [latex]A[/latex], we can obtain a matrix [latex]U[/latex] whose columns span the nullspace of [latex]A[/latex], are mutually orthogonal, and have unit norm. Hence, [latex]U^TU = I_p[/latex], where [latex]p[/latex] is the dimension of the nullspace.
[latexpage]
Example 2: Nullspace of a simple matrix.
Consider the matrix
\[ A = \begin{pmatrix} 1 & 2 & 1\\ 2 & 4 & 2 \end{pmatrix}. \]
The nullspace, \( \mathbf{N}(A) \), is defined as
\[ \mathbf{N}(A) = \{ x \in \mathbb{R}^3 : Ax = 0 \}. \]
Both rows of \( A \) are multiples of \( (1, 2, 1) \), so \( Ax = 0 \) if and only if \( x_1 + 2x_2 + x_3 = 0 \). The second and third components of \( x \) can be chosen freely, and the first is then \( x_1 = -2x_2 - x_3 \).
For example, the vector
\[ x = \begin{pmatrix} -2 \\ 1 \\ 0 \end{pmatrix}, \]
satisfies \( Ax = 0 \) and is thus in the nullspace of \( A \). The dimension of this nullspace, \( p \), is 2 (since we have two free variables).
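A basis of the nullspace can be computed mechanically by row reduction: each free variable is set to one in turn, and the pivot variables are read off the reduced rows. The sketch below is one possible way to do this in exact arithmetic; the function name `nullspace_basis` is hypothetical, and the basis it returns is not orthogonal in general.

```python
from fractions import Fraction


def nullspace_basis(rows):
    """Basis of {x : Ax = 0}, one vector per free variable, via row reduction."""
    m = [[Fraction(v) for v in row] for row in rows]
    n = len(m[0])
    pivots = []  # pivot column of each pivot row, in order
    r = 0
    for c in range(n):
        p = next((i for i in range(r, len(m)) if m[i][c] != 0), None)
        if p is None:
            continue  # no pivot in this column: it is a free variable
        m[r], m[p] = m[p], m[r]
        scale = m[r][c]
        m[r] = [v / scale for v in m[r]]  # make the pivot equal to 1
        for i in range(len(m)):
            if i != r and m[i][c] != 0:
                f = m[i][c]
                m[i] = [a - f * b for a, b in zip(m[i], m[r])]
        pivots.append(c)
        r += 1
    basis = []
    for free in (c for c in range(n) if c not in pivots):
        x = [Fraction(0)] * n
        x[free] = Fraction(1)
        # Each pivot row reads: x[pivot] + sum(row[free] * x[free]) = 0.
        for row, pc in zip(m, pivots):
            x[pc] = -row[free]
        basis.append(x)
    return basis


A = [[1, 2, 1], [2, 4, 2]]
basis = nullspace_basis(A)
print([[str(v) for v in x] for x in basis])  # the vectors (-2, 1, 0) and (-1, 0, 1)
```

The two vectors returned correspond to the two free variables [latex]x_2[/latex] and [latex]x_3[/latex], which matches the nullspace dimension of 2 found above.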
Nullity
The nullity of a matrix is the dimension of the nullspace. The rank-nullity theorem states that the nullity of an [latex]m \times n[/latex] matrix [latex]A[/latex] is [latex]n - r[/latex], where [latex]r[/latex] is the rank of [latex]A[/latex].
Full column rank matrices
The matrix [latex]A[/latex] is said to be full column rank (or, one-to-one) if its nullspace is the singleton [latex]\{0\}[/latex]. In this case, if we denote by [latex]a_i[/latex] the [latex]n[/latex] columns of [latex]A[/latex], the equation
[latex]\begin{align*} (Ax =) \sum_{i=1}^n a_ix_i = 0 \end{align*}[/latex]
has [latex]x = 0[/latex] as the unique solution. Hence, [latex]A[/latex] is one-to-one if and only if its columns are independent. Since the rank never exceeds the smaller of the number of rows and the number of columns, an [latex]m \times n[/latex] matrix of full column rank necessarily has no more columns than rows (that is, [latex]m \geq n[/latex]).
The term ‘‘one-to-one’’ comes from the fact that for such matrices, the condition [latex]y = Ax[/latex] uniquely determines [latex]x[/latex], since [latex]Ax_1 = y[/latex] and [latex]Ax_2 = y[/latex] implies [latex]A(x_1 - x_2) = 0[/latex], so that the solution is unique: [latex]x_1 = x_2[/latex]. The name ‘‘full column rank’’ comes from the fact that the rank equals the column dimension of [latex]A[/latex].
An equivalent condition for [latex]A[/latex] to be full column rank is that the square, [latex]n \times n[/latex] matrix [latex]A^TA[/latex] is invertible, meaning that it has full rank, [latex]n[/latex].
Example: Nullspace of a transpose incidence matrix.
19.4. Fundamental facts
We state two important results about the nullspace and range of a matrix.
Rank-nullity theorem
The nullity (dimension of the nullspace) and the rank (dimension of the range) of an [latex]m \times n[/latex] matrix [latex]A[/latex] add up to the column dimension of [latex]A[/latex], [latex]n[/latex].
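As a quick check of the theorem on the [latex]2 \times 3[/latex] matrix of Example 2 (the counting below is our own worked illustration): both rows are multiples of [latex](1, 2, 1)[/latex], so the rank is [latex]1[/latex], while the nullspace has dimension [latex]2[/latex]. The two numbers add up to the number of columns:

[latex]\begin{align*} \underbrace{1}_{\text{rank}} + \underbrace{2}_{\text{nullity}} = 3 = n. \end{align*}[/latex]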
Another important result involves the notion of the orthogonal complement of a subspace.
Fundamental theorem of linear algebra
The range of a matrix is the orthogonal complement of the nullspace of its transpose. That is, for an [latex]m \times n[/latex] matrix [latex]A[/latex]: [latex]\begin{align*} \mathbf{R}(A)^{\perp} = \mathbf{N}(A^T). \end{align*}[/latex]
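The theorem can be checked by hand on a small case. In the sketch below, the [latex]3 \times 2[/latex] matrix `A` and the vector `z` are chosen for illustration only, with `z` picked so that [latex]A^Tz = 0[/latex]; the helper names are hypothetical. The code verifies that `z` is orthogonal to several vectors of the form [latex]Ax[/latex].

```python
def matvec(M, x):
    """Matrix-vector product M x, with M given as a list of rows."""
    return [sum(a * b for a, b in zip(row, x)) for row in M]


def transpose(M):
    """Transpose of a matrix given as a list of rows."""
    return [list(col) for col in zip(*M)]


def dot(u, v):
    """Inner product of two vectors."""
    return sum(a * b for a, b in zip(u, v))


A = [[1, 2], [3, 4], [5, 6]]
z = [1, -2, 1]  # chosen by hand so that A^T z = 0

print(matvec(transpose(A), z))  # [0, 0], so z is in the nullspace of A^T

# z is orthogonal to every vector Ax in the range of A.
products = [dot(z, matvec(A, x)) for x in ([1, 0], [0, 1], [3, -7])]
print(products)  # [0, 0, 0]
```

The second check follows from the identity [latex]z^T(Ax) = (A^Tz)^Tx[/latex], which vanishes for every [latex]x[/latex] as soon as [latex]A^Tz = 0[/latex].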