Least-squares and SVD
- Set of solutions via the pseudo inverse
- Sensitivity analysis
- BLUE property
Set of solutions
The following theorem provides all the solutions (optimal set) of a least-squares problem.
The optimal set of the OLS problem
data:image/s3,"s3://crabby-images/7461d/7461dd9d40b9e64d34a0c2dcd007e9a8cbb66c89" alt="Rendered by QuickLaTeX.com p^*:=\min _x\|A x-y\|_2"
can be expressed as
data:image/s3,"s3://crabby-images/527c7/527c733d1788768a1541d3a17992cf09385e2e53" alt="Rendered by QuickLaTeX.com \mathbf{X}^{\text {opt }}=A^{\dagger} y+\mathbf{N}(A) ."
where is the pseudo-inverse of
, and
is the minimum-norm point in the optimal set. If
is full column rank, the solution is unique, and equal to
data:image/s3,"s3://crabby-images/b3dd9/b3dd9648782c8bf65f1d122828463d5267978214" alt="Rendered by QuickLaTeX.com x^*=A^{\dagger} y=\left(A^T A\right)^{-1} A^T y ."
In general, the particular solution is the minimum-norm solution to the least-squares problem.
Proof: here.
Sensitivity analysis
We consider the situation where
data:image/s3,"s3://crabby-images/2a9b3/2a9b30ca0f636cbdf3ef86a15fb063ddba2d85a6" alt="Rendered by QuickLaTeX.com y+\delta y=A x,"
with
the data matrix (known), with
full column rank (hence
).
is the measurement (known).
is the vector to be estimated (unknown).
is a measurement noise or error (unknown).
We can use OLS to provide an estimate of
. The idea is to seek the smallest vector
such that the above equation becomes feasible, that is,
data:image/s3,"s3://crabby-images/a693c/a693c17310af79b232c2a169a8b483196f7fb84a" alt="Rendered by QuickLaTeX.com \min _{x, \delta y}\|\delta y\|_2: y+\delta y=A x ."
This leads to the OLS problem:
data:image/s3,"s3://crabby-images/c1a92/c1a92c6f1ca56747151a35b411c1f068f338d2a0" alt="Rendered by QuickLaTeX.com \min _x\|A x-y\|_2 ."
Since is full column rank, its SVD can be expressed as
where contains the singular values of
, with
.
Since is full column rank, the solution
to the OLS problem is unique, and can be written as a linear function of the measurement vector
:
data:image/s3,"s3://crabby-images/bc5a9/bc5a90b99dcc53c42899e45bf1b50fe8cdc8e39c" alt="Rendered by QuickLaTeX.com \hat{x}_{\mathrm{LS}}=A^{\dagger} y,"
with the pseudo-inverse of
. Again, since
is full column rank,
data:image/s3,"s3://crabby-images/1ab91/1ab91c5511950cd7a2d039b1a5ca9d1235ee595d" alt="Rendered by QuickLaTeX.com A^{\dagger}=\left(A^T A\right)^{-1} A^T=V\left(\begin{array}{cc} \Sigma^{-1} & 0 \end{array}\right) U^T ."
The OLS formulation provides an estimate of the input
such that the residual error vector
is minimized in norm. We are interested in analyzing the impact of perturbations in the vector
, on the resulting solution
. We begin by analyzing the absolute errors in the estimate and then turn to the analysis of relative errors.
Set of possible errors
Let us assume a simple model of potential perturbations: we assume that belongs to a unit ball:
, where
is given. We will assume
for simplicity; the analysis is easily extended to any
.
We have
data:image/s3,"s3://crabby-images/36031/36031cac962bf8dc813d0dc88a7aba449050045b" alt="Rendered by QuickLaTeX.com \begin{aligned} \delta x:=x-\hat{x} & =x-A^{\dagger} y \\ & =x-A^{\dagger}(A x-\delta y) \\ & =\left(I-A^{\dagger} A\right) x+A^{\dagger} \delta y \\ & =A^{\dagger} \delta y . \end{aligned}"
In the above, we have exploited the fact that is a left inverse of
, that is,
.
The set of possible errors on the solution is then given by
data:image/s3,"s3://crabby-images/4545d/4545d36a2b3dccb3e042cb3c6c7ea2274bbaab78" alt="Rendered by QuickLaTeX.com \mathbf{E}=\left\{A^{\dagger} \delta y:\|\delta y\|_2 \leq 1\right\},"
which is an ellipsoid centered at zero, with principal axes given by the singular values of . This ellipsoid can be interpreted as an ellipsoid of confidence for the estimate
, with size and shape determined by the matrix
.
We can draw several conclusions from this analysis:
- The largest absolute error in the solution that can result from a unit-norm, additive perturbation on
is of the order of
, where
is the smallest singular value of
.
- The largest relative error is
, the condition number of
.
BLUE property
We now return to the case of an OLS with full column rank matrix .
Unbiased linear estimators
Consider the family of linear estimators, which are of the form
data:image/s3,"s3://crabby-images/8d0f0/8d0f026fff47178959eaec86e1f4592b4ad7e2c9" alt="Rendered by QuickLaTeX.com \hat{x}=B y,"
where . To this estimator, we associate the error
data:image/s3,"s3://crabby-images/49c53/49c53368beb5470670078e2752eae7036129a989" alt="Rendered by QuickLaTeX.com \delta x=x-\hat{x}=x-B y=x-B(A x-\delta y)=(I-B A) x+B \delta y ."
We say that the estimator (as determined by matrix ) is unbiased if the first term is zero:
data:image/s3,"s3://crabby-images/e3132/e3132a1f1e838ff60feb70bbdc767951bb89da44" alt="Rendered by QuickLaTeX.com B A=I ."
Unbiased estimators only exist when the above equation is feasible, that is, has a left inverse. This is equivalent to our condition that
be full column rank. Since
is a left-inverse of
, the OLS estimator is a particular case of an unbiased linear estimator.
Best unbiased linear estimator
The above analysis leads to the following question: which is the best unbiased linear estimator? One way to formulate this problem is to assume that the perturbation vector is bounded in some way, and try to minimize the possible impact of such bounded errors on the solution.
Let us assume that belongs to a unit ball:
. The set of resulting errors on the solution
is then
data:image/s3,"s3://crabby-images/75557/75557b111e50c04d07a35cfd5a1a20a80e8fd299" alt="Rendered by QuickLaTeX.com \mathbf{E}=\left\{B \delta y:\|\delta\|_2 \leq 1\right\},"
which is an ellipsoid centered at zero, with principal axes given by the singular values of . This ellipsoid can be interpreted as an ellipsoid of confidence for the estimate
, with size and shape determined by the matrix
.
It can be shown that the OLS estimator is optimal in the sense that it provides the ‘‘smallest’’ ellipsoid of confidence among all unbiased linear estimators. Specifically:
data:image/s3,"s3://crabby-images/80b4c/80b4cd20edc1a967deb223cf82a303f33b53ac34" alt="Rendered by QuickLaTeX.com B B^T \succeq A^{\dagger}\left(A^{\dagger}\right)^T"
This optimality of the LS estimator is referred to as the BLUE (Best Linear Unbiased Estimator) property.