Set of solutions to the least-squares problem via QR decomposition

The set \mathbf{S} of solutions to the least-squares problem

\min _x\|A x-y\|_2^2 \text {, }

where A \in \mathbf{R}^{m \times n} and y \in \mathbf{R}^m are given, can be expressed in terms of the full QR decomposition of A:

A P=Q R, \quad R=\left(\begin{array}{cc} R_{11} & R_{12} \\ 0 & 0 \end{array}\right),

where r is the rank of A, R_{11} \in \mathbf{R}^{r \times r} is upper triangular and invertible, R_{12} \in \mathbf{R}^{r \times(n-r)}, P is an n \times n permutation matrix, and Q is m \times m and orthogonal.
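As a minimal numerical sketch, the full QR decomposition with column pivoting can be computed with SciPy's `scipy.linalg.qr`; the rank-2 matrix `A` below is an illustrative assumption, not taken from the text:

```python
import numpy as np
from scipy.linalg import qr

# An illustrative rank-deficient example: m = 4, n = 3, rank r = 2
# (the third column is the sum of the first two).
A = np.array([[1.0, 0.0, 1.0],
              [0.0, 1.0, 1.0],
              [1.0, 1.0, 2.0],
              [2.0, 0.0, 2.0]])

# Full pivoted QR: A[:, piv] = Q @ R, with Q a 4x4 orthogonal matrix.
Q, R, piv = qr(A, pivoting=True)   # mode='full' is the default
P = np.eye(A.shape[1])[:, piv]     # permutation matrix so that A @ P = A[:, piv]

assert np.allclose(A @ P, Q @ R)   # A P = Q R as in the text
r = np.linalg.matrix_rank(A)       # here r = 2; rows of R below row r are ~0
```

With pivoting, the leading r x r block of R is the invertible upper triangular R_{11}, and the trailing rows of R are (numerically) zero.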

Precisely, we have \mathbf{S}=\left\{x_0+N z: z \in \mathbf{R}^{n-r}\right\}, where N is a matrix whose columns span the nullspace of A:

x_0=P\left(\begin{array}{c} R_{11}^{-1} y_1 \\ 0 \end{array}\right), \quad N=P\left(\begin{array}{c} -R_{11}^{-1} R_{12} \\ I \end{array}\right) \in \mathbf{R}^{n \times(n-r)},

where y_1 \in \mathbf{R}^r collects the first r components of Q^T y, as constructed in the proof below.
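The formulas for x_0 and N can be checked numerically; the sketch below reuses an illustrative rank-2 matrix A and a vector y of my choosing, and builds the top block of N with the minus sign coming from x_1 = R_{11}^{-1}(y_1 - R_{12} x_2) in the proof:

```python
import numpy as np
from scipy.linalg import qr

# Illustrative rank-2 data (assumption, not from the text).
A = np.array([[1.0, 0.0, 1.0],
              [0.0, 1.0, 1.0],
              [1.0, 1.0, 2.0],
              [2.0, 0.0, 2.0]])
y = np.array([1.0, 2.0, 0.0, 1.0])

Q, R, piv = qr(A, pivoting=True)
P = np.eye(A.shape[1])[:, piv]
r = np.linalg.matrix_rank(A)
n = A.shape[1]

R11, R12 = R[:r, :r], R[:r, r:]
y1 = (Q.T @ y)[:r]

# x0 = P (R11^{-1} y1; 0),  N = P (-R11^{-1} R12; I).
x0 = P @ np.concatenate([np.linalg.solve(R11, y1), np.zeros(n - r)])
N = P @ np.vstack([-np.linalg.solve(R11, R12), np.eye(n - r)])

assert np.allclose(A @ N, 0)       # columns of N lie in the nullspace of A
z = np.array([3.0])                # any z gives the same residual
assert np.allclose(np.linalg.norm(A @ x0 - y),
                   np.linalg.norm(A @ (x0 + N @ z) - y))
```

Since A N = 0, every point x_0 + N z attains the same residual, which is the content of the claim.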

Proof: Since Q and P are orthogonal, we have, with \bar{x}:=P^T x, \bar{y}:=Q^T y:

A x-y=A P P^T x-y=Q R \bar{x}-y=Q\left(R \bar{x}-Q^T y\right)=Q(R \bar{x}-\bar{y})

Exploiting the fact that Q leaves Euclidean norms invariant, we express the original least-squares problem in the equivalent form:

\min _{\bar{x}}\|R \bar{x}-\bar{y}\|_2 \text {. }

Once the above is solved, and \bar{x} is found, we recover the original variable x with x=P \bar{x}.

Now let us decompose \bar{x} and \bar{y} in a manner consistent with the block structure of R: \bar{x}=\left(x_1, x_2\right), \bar{y}=\left(y_1, y_2\right), with x_1, y_1 two r-vectors. Then

R \bar{x}-\bar{y}=\left(\begin{array}{c} R_{11} x_1+R_{12} x_2-y_1 \\ -y_2 \end{array}\right),

which leads to the following expression for the objective function:

\|R \bar{x}-\bar{y}\|_2^2=\left\|R_{11} x_1+R_{12} x_2-y_1\right\|_2^2+\left\|y_2\right\|_2^2 .

The optimal choice for the variables x_1, x_2 is to make the first term zero, which is achievable with

x_1=R_{11}^{-1}\left(y_1-R_{12} x_2\right),

where x_2 is free and describes the ambiguity in the solution. The optimal value of the objective is \left\|y_2\right\|_2^2.
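The claim that no choice of x can affect the component y_2, so the optimal objective is \|y_2\|_2^2, can be checked against NumPy's own least-squares solver; the data below are the same illustrative assumptions as before:

```python
import numpy as np
from scipy.linalg import qr

# Illustrative rank-2 data (assumption, not from the text).
A = np.array([[1.0, 0.0, 1.0],
              [0.0, 1.0, 1.0],
              [1.0, 1.0, 2.0],
              [2.0, 0.0, 2.0]])
y = np.array([1.0, 2.0, 0.0, 1.0])

Q, R, piv = qr(A, pivoting=True)
r = np.linalg.matrix_rank(A)
y2 = (Q.T @ y)[r:]                        # the part no choice of x can remove

# Compare with a least-squares solution computed by lstsq:
x_ls = np.linalg.lstsq(A, y, rcond=None)[0]
assert np.allclose(np.linalg.norm(A @ x_ls - y), np.linalg.norm(y2))
```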

We are essentially done: with x_2=z, we can write

P^T x=\bar{x}=\left(\begin{array}{c} x_1 \\ x_2 \end{array}\right)=\left(\begin{array}{c} R_{11}^{-1} y_1 \\ 0 \end{array}\right)+\left(\begin{array}{c} -R_{11}^{-1} R_{12} \\ I \end{array}\right) z,

that is: x=P \bar{x}=x_0+N z, with

x_0=P\left(\begin{array}{c} R_{11}^{-1} y_1 \\ 0 \end{array}\right), \quad N=P\left(\begin{array}{c} -R_{11}^{-1} R_{12} \\ I \end{array}\right) \in \mathbf{R}^{n \times(n-r)} .

License

Hyper-Textbook: Optimization Models and Applications Copyright © by L. El Ghaoui. All Rights Reserved.
