Variants of the Least-Squares Problem
- Linearly-constrained least-squares
- Minimum-norm solutions to linear equations
- Regularized least-squares
Linearly constrained least-squares
Definition
An interesting variant of the ordinary least-squares problem involves equality constraints on the decision variable :
data:image/s3,"s3://crabby-images/c61b0/c61b0f4e99a6e0d3b22ad50611ca6304c50815e9" alt="Rendered by QuickLaTeX.com \min _x\|A x-y\|_2^2 ; \quad C x=d_1"
where
data:image/s3,"s3://crabby-images/a2903/a29031fa9fd48749c0eb8fc0755a5af9c4215421" alt="Rendered by QuickLaTeX.com C \in \mathbf{R}^{p \times n}"
data:image/s3,"s3://crabby-images/efcf2/efcf2ff9251c443450a94536e7fae8cf161850f8" alt="Rendered by QuickLaTeX.com d \in \mathbf{R}^p"
Examples:
Solution
We can express the solution by first computing the null space of . Assuming that the feasible set of the constrained iS problem is not empty, that is,
is in the range of
, this set can be expressed as
where is the dimension of the nullspace of
is a matrix whose columns span the nullspace of
, and
is a particular solution to the equation
.
Expressing in terms of the free variable
, we can write the constrained problem as an unconstrained one:
Minimum-norm solution to linear equations
A special case of linearly constrained LS is
in which we implicitly assume that the linear equation in , has a solution, that is,
is in the range of
.
The above problem allows selecting a particular solution to a linear equation, in the case when there are possibly many, that is, the linear system is under-determined.
As seen here, when is full row rank, that is, the matrix
is invertible, the above has the closed-form solution
Examples: Control positioning of a mass.
Regularized least-squares
In the case when the matrix in the OLS problem is not full column rank, the closed-form solution cannot be applied. A remedy often used in practice is to transform the original problem into one where the full column rank property holds.
The regularized least-squares problem has the form
data:image/s3,"s3://crabby-images/361ca/361ca9af7c55c01cdad4c2ed85601e750d0efba4" alt="Rendered by QuickLaTeX.com \min _x\|A x-y\|_2^2+\lambda \mid x \|_2^2,"
where
data:image/s3,"s3://crabby-images/cd5fb/cd5fbddb85fa3e0dd46e9a9083d2c32fe214f0aa" alt="Rendered by QuickLaTeX.com \lambda>0"
The regularized problem can be expressed as an ordinary least-squares problem, where the data matrix is full column rank, Indeed, the above problem can be written as the ordinary LS problem
data:image/s3,"s3://crabby-images/eb962/eb9628f66a23cdd207b1bc49ad5e614da34906f0" alt="Rendered by QuickLaTeX.com \min _x\|\bar{A} x-\bar{y}\|_2^2"
where
data:image/s3,"s3://crabby-images/81a58/81a5877bc008e0893d0b45ad87e36a7163abaad9" alt="Rendered by QuickLaTeX.com \tilde{A}:=\left(\begin{array}{c} A \\ \sqrt{\lambda} I_n \end{array}\right), \bar{y}:=\left(\begin{array}{c} y \\ 0 \end{array}\right) ."
The presence of the identity matrix in the
data:image/s3,"s3://crabby-images/fd4b2/fd4b242c16d2216a86301824cadfbaf39fcc135c" alt="Rendered by QuickLaTeX.com (m+n) \times n"
data:image/s3,"s3://crabby-images/08b41/08b41abeebc3d3d62eab2e2a66a55a84a04c47c1" alt="Rendered by QuickLaTeX.com \bar{A}"
Solution
Since the data matrix in the regularized LS problem has full column rank, the formula seen here applies. The solution is unique and given by
data:image/s3,"s3://crabby-images/d976c/d976cb85f35f9dff9eca9d0e9efe8b2a8aacee2b" alt="Rendered by QuickLaTeX.com x^*=(\tilde{A}^T\tilde{A})^{-1}\tilde{A}^T\tilde{y}=(A^TA+\lambda I_n)^{-1}A^Ty"
For , we recover the ordinary LS expression that is valid when the original data matrix is full rank.
The above formula explains one of the motivations for using regularized least-squares in the case of a rank-deficient matrix : if
, but is small, the above expression is still defined, even if
is rank-deficient.
Weighted regularized least-squares
Sometimes, as in kernel methods, we are led to problems of the form
data:image/s3,"s3://crabby-images/2e907/2e90750a9375db2e1ea2a12f5db1f80e46965ba2" alt="Rendered by QuickLaTeX.com \min _{x}\ ||Ax-y||_2^2+x^TWx,"
where is positive definite (that is,
for every non-zero
). The solution is again unique and given by
data:image/s3,"s3://crabby-images/d916a/d916a978f70bb31590c027e846ecc0e9d7962270" alt="Rendered by QuickLaTeX.com x^*=(A^TA+W)^-1A^Ty"