"

[latexpage]

 

The set $\mathbf{S}$ of solutions to the least-squares problem

\[ \min _x\|A x-y\|_2^2, \]

where $A \in \mathbb{R}^{m \times n}$, and $y \in \mathbb{R}^m$ are given, can be expressed in terms of the full QR decomposition of $A$:

\[ A P=Q R, \quad R=\left(\begin{array}{cc} R_{11} & R_{12} \\ 0 & 0 \end{array}\right), \]

where $R_{11} \in \mathbb{R}^{r \times r}$ is upper triangular and invertible, $R_{12} \in \mathbb{R}^{r \times(n-r)}, P$ is a permutation matrix, and $Q$ is $m \times m$ and orthogonal.

Precisely we have $\mathbf{S}=\left\{x_0+N z: z \in \mathbb{R}^{n-r}\right\}$, with $N$ a matrix whose columns span the nullspace of $A$:

\[ x_0=P\left(\begin{array}{c} R_{11}^{-1} y_1 \\ 0 \end{array}\right), \quad N=\left(\begin{array}{c} R_{11}^{-1} R_{12} \\ I \end{array}\right) \in \mathbb{R}^{n \times(n-r)}. \]

Proof: Since $Q$ and $P$ are orthogonal, we have, with $\bar{x}:=P^T x, \bar{y}:=Q^T y$:

\[ A x-y=A P P^T x-y=Q R \bar{x}-y=Q\left(R \bar{x}-Q^T y\right)=Q(R \bar{x}-\bar{y}), \]

Exploiting the fact that $Q$ leaves Euclidean norms invariant, we express the original least-squares problem in the equivalent form:

\[ \min _{\bar{x}}\|R \bar{x}-\bar{y}\|_2. \]

Once the above is solved, and $\bar{x}$ is found, we recover the original variable $x$ with $x=P \bar{x}$.

Now let us decompose $\bar{x}$ and $\bar{y}$ in a manner consistent with the block structure of $R: \bar{x}=\left(x_1, x_2\right), \bar{y}=\left(y_1, y_2\right)$, with $x_1, y_1$ two $r$-vectors. Then

\[ R \bar{x}-\bar{y}=\left(\begin{array}{c} R_{11} x_1+R_{12} x_2-y_1 \\ -y_2 \end{array}\right), \]

which leads to the following expression for the objective function:

\[ \|R \bar{x}-\bar{y}\|_2^2=\left\|R_{11} x_1+R_{12} x_2-y_1\right\|_2^2+\left\|y_2\right\|_2^2. \]

The optimal choice for the variables $x_1, x_2$ is to make the first term zero, which is achievable with

\[ x_1=R_{11}^{-1}\left(y_1-R_{12} x_2\right), \]

where $x_2$ is free and describes the ambiguity in the solution. The optimal residual is $\left\|y_1\right\|_2^2$.

We are essentially done: with $x_2=z$, we can write

\[ P^T x=\bar{x}=\left(\begin{array}{c} x_1 \\ x_2 \end{array}\right) = \left(\begin{array}{c} R_{11}^{-1} y_1 \\ 0 \end{array}\right)+\left(\begin{array}{c} R_{11}^{-1} R_{12} \\ I \end{array}\right) z, \]

that is: $x=P \bar{x}=x_0+N z$, with

\[ x_0=P\left(\begin{array}{c} R_{11}^{-1} y_1 \\ 0 \end{array}\right), \quad N=P\left(\begin{array}{c} R_{11}^{-1} R_{12} \\ I \end{array}\right) \in \mathbb{R}^{n \times(n-r)}. \]

License

Icon for the Public Domain license

This work (Đại số tuyến tính by Tony Tin) is free of known copyright restrictions.