[latexpage]
The set $\mathbf{S}$ of solutions to the least-squares problem \[ \min _x\|A x-y\|_2^2, \] where $A \in \mathbb{R}^{m \times n}$, and $y \in \mathbb{R}^m$ are given, can be expressed in terms of the full QR decomposition of $A$: \[ A P=Q R, \quad R=\left(\begin{array}{cc} R_{11} & R_{12} \\ 0 & 0 \end{array}\right), \] where $R_{11} \in \mathbb{R}^{r \times r}$ is upper triangular and invertible, $R_{12} \in \mathbb{R}^{r \times(n-r)}, P$ is a permutation matrix, and $Q$ is $m \times m$ and orthogonal. Precisely we have $\mathbf{S}=\left\{x_0+N z: z \in \mathbb{R}^{n-r}\right\}$, with $N$ a matrix whose columns span the nullspace of $A$: \[ x_0=P\left(\begin{array}{c} R_{11}^{-1} y_1 \\ 0 \end{array}\right), \quad N=\left(\begin{array}{c} R_{11}^{-1} R_{12} \\ I \end{array}\right) \in \mathbb{R}^{n \times(n-r)}. \] |
Proof: Since $Q$ and $P$ are orthogonal, we have, with $\bar{x}:=P^T x, \bar{y}:=Q^T y$:
\[ A x-y=A P P^T x-y=Q R \bar{x}-y=Q\left(R \bar{x}-Q^T y\right)=Q(R \bar{x}-\bar{y}), \]
Exploiting the fact that $Q$ leaves Euclidean norms invariant, we express the original least-squares problem in the equivalent form:
\[ \min _{\bar{x}}\|R \bar{x}-\bar{y}\|_2. \]
Once the above is solved, and $\bar{x}$ is found, we recover the original variable $x$ with $x=P \bar{x}$.
Now let us decompose $\bar{x}$ and $\bar{y}$ in a manner consistent with the block structure of $R: \bar{x}=\left(x_1, x_2\right), \bar{y}=\left(y_1, y_2\right)$, with $x_1, y_1$ two $r$-vectors. Then
\[ R \bar{x}-\bar{y}=\left(\begin{array}{c} R_{11} x_1+R_{12} x_2-y_1 \\ -y_2 \end{array}\right), \]
which leads to the following expression for the objective function:
\[ \|R \bar{x}-\bar{y}\|_2^2=\left\|R_{11} x_1+R_{12} x_2-y_1\right\|_2^2+\left\|y_2\right\|_2^2. \]
The optimal choice for the variables $x_1, x_2$ is to make the first term zero, which is achievable with
\[ x_1=R_{11}^{-1}\left(y_1-R_{12} x_2\right), \]
where $x_2$ is free and describes the ambiguity in the solution. The optimal residual is $\left\|y_1\right\|_2^2$.
We are essentially done: with $x_2=z$, we can write
\[ P^T x=\bar{x}=\left(\begin{array}{c} x_1 \\ x_2 \end{array}\right) = \left(\begin{array}{c} R_{11}^{-1} y_1 \\ 0 \end{array}\right)+\left(\begin{array}{c} R_{11}^{-1} R_{12} \\ I \end{array}\right) z, \]
that is: $x=P \bar{x}=x_0+N z$, with
\[ x_0=P\left(\begin{array}{c} R_{11}^{-1} y_1 \\ 0 \end{array}\right), \quad N=P\left(\begin{array}{c} R_{11}^{-1} R_{12} \\ I \end{array}\right) \in \mathbb{R}^{n \times(n-r)}. \]