Optimal set of Least-Squares via SVD
The optimal set of the OLS problem

$$p^* = \min_x \|Ax - y\|_2, \qquad A \in \mathbb{R}^{m \times n}, \ y \in \mathbb{R}^m,$$

can be expressed as

$$\mathbf{X}^{\rm opt} = A^\dagger y + \mathcal{N}(A),$$

where $A^\dagger$ is the pseudo-inverse of $A$, and $A^\dagger y$ is the minimum-norm point in the optimal set. If $A$ is full column rank, the solution is unique, and equal to

$$x^* = A^\dagger y = (A^T A)^{-1} A^T y.$$
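As a quick numerical illustration of this statement (the random data and the NumPy routines below are my own choices, not part of the source), the following sketch builds a rank-deficient $A$ and checks that the pseudo-inverse route and NumPy's least-squares solver return the same minimum-norm optimal point:

```python
import numpy as np

rng = np.random.default_rng(0)

# Deliberately rank-deficient A (rank 2, shape 6 x 4) and an arbitrary target y.
A = rng.standard_normal((6, 2)) @ rng.standard_normal((2, 4))
y = rng.standard_normal(6)

x_pinv = np.linalg.pinv(A) @ y                     # A^† y, the minimum-norm optimal point
x_lstsq, *_ = np.linalg.lstsq(A, y, rcond=None)    # also returns the minimum-norm solution

print(np.allclose(x_pinv, x_lstsq))        # True: same point
print(np.linalg.norm(A @ x_pinv - y))      # the optimal residual p*
```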
Proof: The following proof relies on the SVD of $A$, and on the rotational invariance of the Euclidean norm.
Optimal value of the problem
Using the SVD of $A$, we can find the optimal set of the least-squares optimization problem

$$p^* = \min_x \|Ax - y\|_2.$$

Indeed, if $A = U\tilde{S}V^T$ is an SVD of $A$, the problem can be written

$$p^* = \min_x \left\| U\tilde{S}V^T x - y \right\|_2 = \min_x \left\| \tilde{S}V^T x - U^T y \right\|_2,$$

where we have exploited the fact that the Euclidean norm is invariant under the orthogonal transformation $U^T$. With $\tilde{y} := U^T y$, $\tilde{x} := V^T x$, and changing the variable from $x$ to $\tilde{x}$, we express the above as

$$p^* = \min_{\tilde{x}} \left\| \tilde{S}\tilde{x} - \tilde{y} \right\|_2, \qquad \tilde{S} = \begin{pmatrix} S & 0 \\ 0 & 0 \end{pmatrix}, \quad S = \mathbf{diag}(\sigma_1,\ldots,\sigma_r),$$

with $r$ the rank of $A$.
Expanding the terms, and using the partitioned notations $\tilde{x} = (\tilde{x}_1, \tilde{x}_2)$, $\tilde{y} = (\tilde{y}_1, \tilde{y}_2)$, with $\tilde{x}_1, \tilde{y}_1 \in \mathbb{R}^r$, we obtain

$$(p^*)^2 = \min_{\tilde{x}_1,\,\tilde{x}_2} \left\| S\tilde{x}_1 - \tilde{y}_1 \right\|_2^2 + \left\| \tilde{y}_2 \right\|_2^2.$$

Since $S$ is invertible, we can reduce the first term in the objective to zero with the choice $\tilde{x}_1 = S^{-1}\tilde{y}_1$. Hence the optimal value is

$$p^* = \left\| \tilde{y}_2 \right\|_2.$$
We observe that the optimal value is zero if and only if $\tilde{y}_2 = 0$, which is exactly the same as $y \in \mathcal{R}(A)$.
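As a hedged sanity check (on the same kind of random, rank-deficient data as in the earlier sketch, chosen purely for illustration), the closed-form value $p^* = \|\tilde{y}_2\|_2$ matches the residual actually achieved by a least-squares solution, and it vanishes when $y$ lies in the range of $A$:

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.standard_normal((6, 2)) @ rng.standard_normal((2, 4))   # rank r = 2
y = rng.standard_normal(6)

U, s, Vt = np.linalg.svd(A)             # full SVD: U is 6x6, Vt is 4x4
r = int(np.sum(s > 1e-10 * s[0]))       # numerical rank

y_tilde = U.T @ y
p_star = np.linalg.norm(y_tilde[r:])    # || y_tilde_2 ||_2

x_opt = np.linalg.pinv(A) @ y
print(np.isclose(p_star, np.linalg.norm(A @ x_opt - y)))    # True

# If y is in the range of A, the optimal value is (numerically) zero.
y_in_range = A @ rng.standard_normal(4)
print(np.linalg.norm(U[:, r:].T @ y_in_range) < 1e-10)      # True
```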
Optimal set
Let us detail the optimal set for the problem. The variable $\tilde{x}$ is partly determined, via its first $r$ components: $\tilde{x}_1 = S^{-1}\tilde{y}_1$. The remaining $n - r$ variables contained in $\tilde{x}_2$ are free, as $\tilde{x}_2$ does not appear in the objective function of the above problem.
Thus, optimal points are of the form $x = V\tilde{x}$, with $\tilde{x} = (\tilde{x}_1, \tilde{x}_2)$, $\tilde{x}_1 = S^{-1}\tilde{y}_1$, and $\tilde{x}_2 \in \mathbb{R}^{n-r}$ free.
To express this in terms of the original SVD of $A$, we observe that $\tilde{y} = U^T y$ means that

$$\tilde{y}_1 = U_1^T y, \qquad \tilde{y}_2 = U_2^T y,$$

where $U$ is partitioned as $U = (U_1, U_2)$, with $U_1 \in \mathbb{R}^{m \times r}$ and $U_2 \in \mathbb{R}^{m \times (m-r)}$. Similarly, the vector $x = V\tilde{x}$ can be expressed as $x = V_1\tilde{x}_1 + V_2\tilde{x}_2$, with $V_1$ formed with the first $r$ columns of $V$, and $V_2$ with the last $n - r$. Thus, any element $x$ in the optimal set is of the form

$$x = x_{\rm LS} + V_2\tilde{x}_2,$$

where $x_{\rm LS} := V_1 S^{-1} U_1^T y$. (We will soon explain the acronym appearing in the subscript.) The free components $\tilde{x}_2$ correspond to the degrees of freedom allowed to $x$ by the nullspace of $A$.
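To make the structure of the optimal set concrete, here is a small illustrative sketch (random data again assumed, as before): every point of the form $x_{\rm LS} + V_2\tilde{x}_2$ attains the same residual, whatever $\tilde{x}_2$ we pick, because $AV_2 = 0$.

```python
import numpy as np

rng = np.random.default_rng(2)
A = rng.standard_normal((6, 2)) @ rng.standard_normal((2, 4))   # rank r = 2
y = rng.standard_normal(6)

U, s, Vt = np.linalg.svd(A)
r = int(np.sum(s > 1e-10 * s[0]))
U1, V1, V2 = U[:, :r], Vt[:r].T, Vt[r:].T

x_ls = V1 @ np.diag(1.0 / s[:r]) @ U1.T @ y     # particular solution x_LS
res_ls = np.linalg.norm(A @ x_ls - y)

# Moving along V2 (the nullspace directions) leaves the residual unchanged.
for _ in range(3):
    x2_tilde = rng.standard_normal(A.shape[1] - r)
    x = x_ls + V2 @ x2_tilde
    print(np.isclose(np.linalg.norm(A @ x - y), res_ls))        # True each time
```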
Minimum-norm optimal point
The particular solution to the problem, $x_{\rm LS}$, is the minimum-norm solution, in the sense that it is the element of $\mathbf{X}^{\rm opt}$ that has the smallest Euclidean norm. This is best understood in the space of $\tilde{x}$-variables.
Indeed, the particular choice $\tilde{x}_2 = 0$ corresponds to the element of the optimal set that has the smallest Euclidean norm. The norm of $x$ is the same as that of its rotated version, $\tilde{x} = V^T x$. The first $r$ elements of $\tilde{x}$ are fixed at $\tilde{x}_1 = S^{-1}\tilde{y}_1$, and since $\|\tilde{x}\|_2^2 = \|\tilde{x}_1\|_2^2 + \|\tilde{x}_2\|_2^2$, we see that the minimal norm is obtained with $\tilde{x}_2 = 0$.
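A brief numerical confirmation of the minimum-norm property, under the same illustrative random setup: perturbing $x_{\rm LS}$ along the nullspace directions can only increase the norm, since those directions are orthogonal to $x_{\rm LS}$.

```python
import numpy as np

rng = np.random.default_rng(3)
A = rng.standard_normal((6, 2)) @ rng.standard_normal((2, 4))   # rank r = 2
y = rng.standard_normal(6)

x_ls = np.linalg.pinv(A) @ y            # candidate minimum-norm optimal point

U, s, Vt = np.linalg.svd(A)
r = int(np.sum(s > 1e-10 * s[0]))
V2 = Vt[r:].T                           # columns span the nullspace of A

# Any other optimal point x_ls + V2 @ z is no shorter: V2 @ z is orthogonal to x_ls.
for _ in range(3):
    z = rng.standard_normal(A.shape[1] - r)
    print(np.linalg.norm(x_ls + V2 @ z) >= np.linalg.norm(x_ls))   # True each time
```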
Optimal set via the pseudo-inverse
The matrix $V_1 S^{-1} U_1^T$, which appears in the expression of the particular solution $x_{\rm LS}$ mentioned above, is nothing else than the pseudo-inverse of $A$, which is denoted $A^\dagger$. Indeed, we can express the pseudo-inverse in terms of the SVD as

$$A^\dagger = V \begin{pmatrix} S^{-1} & 0 \\ 0 & 0 \end{pmatrix} U^T = V_1 S^{-1} U_1^T.$$

With this convention, the minimum-norm optimal point is $x_{\rm LS} = A^\dagger y$. Recall that the last $n - r$ columns of $V$ (that is, the columns of $V_2$) form a basis for the nullspace of $A$. Hence the optimal set of the problem is

$$\mathbf{X}^{\rm opt} = A^\dagger y + \mathcal{N}(A) = \left\{ A^\dagger y + V_2 z ~:~ z \in \mathbb{R}^{n-r} \right\}.$$
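The identity $A^\dagger = V_1 S^{-1} U_1^T$ can be checked against NumPy's pinv, again on illustrative random data:

```python
import numpy as np

rng = np.random.default_rng(4)
A = rng.standard_normal((6, 2)) @ rng.standard_normal((2, 4))   # rank r = 2

U, s, Vt = np.linalg.svd(A)
r = int(np.sum(s > 1e-10 * s[0]))
U1, V1 = U[:, :r], Vt[:r].T
A_dagger = V1 @ np.diag(1.0 / s[:r]) @ U1.T     # V1 S^{-1} U1^T

print(np.allclose(A_dagger, np.linalg.pinv(A)))   # True
```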
When $A$ is full column rank ($r = n$), the optimal set reduces to a singleton, as the nullspace is $\{0\}$. The unique optimal point is

$$x_{\rm LS} = A^\dagger y = (A^T A)^{-1} A^T y.$$
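Finally, a sketch of the full-column-rank case (a tall random $A$, full column rank with probability one, assumed only for illustration): the pseudo-inverse solution coincides with the normal-equations formula $(A^T A)^{-1} A^T y$.

```python
import numpy as np

rng = np.random.default_rng(5)
A = rng.standard_normal((8, 3))         # tall matrix, full column rank (almost surely)
y = rng.standard_normal(8)

x_pinv = np.linalg.pinv(A) @ y
x_normal_eq = np.linalg.solve(A.T @ A, A.T @ y)    # (A^T A)^{-1} A^T y

print(np.allclose(x_pinv, x_normal_eq))    # True: the unique least-squares solution
```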