Exercises
Interpretation of covariance matrix
We are given a set of $m$ points $x_1, \ldots, x_m$ in $\mathbb{R}^n$. We assume that the average and variance of the data projected along a given direction do not change with the direction. In this exercise, we will show that the sample covariance matrix is then proportional to the identity.
We formalize this as follows. To a given normalized direction $u \in \mathbb{R}^n$ ($\|u\|_2 = 1$), we associate the line with direction $u$ passing through the origin, $\mathcal{L}(u) := \{ t u \,:\, t \in \mathbb{R} \}$. We then consider the projection of the points $x_i$, $i = 1, \ldots, m$, on the line $\mathcal{L}(u)$, and look at the associated coordinates of the points on the line. These projected values are given by

$$ t_i(u) := u^\top x_i, \quad i = 1, \ldots, m. $$
We assume that for any $u$, the sample average $\hat{t}(u)$ of the projected values $t_i(u)$, $i = 1, \ldots, m$, and their sample variance $\sigma^2(u)$, are both constant, independent of the direction $u$ (with $\|u\|_2 = 1$). Denote by $\hat{t}$ and $\sigma^2$ the (constant) sample average and variance.
Justify your answers to the following questions as carefully as you can.
- Show that $\hat{t} = 0$.
- Show that the sample average of the data points,

$$ \hat{x} := \frac{1}{m} \sum_{i=1}^m x_i, $$

is zero.
- Show that the sample covariance matrix of the data points,

$$ \Sigma := \frac{1}{m} \sum_{i=1}^m (x_i - \hat{x})(x_i - \hat{x})^\top, $$

is of the form $\sigma^2 I_n$, where $I_n$ is the identity matrix of order $n$. (Hint: the largest eigenvalue $\lambda_{\max}$ of the matrix $\Sigma$ can be written as $\lambda_{\max} = \max_{u \,:\, \|u\|_2 = 1} u^\top \Sigma u$, and a similar expression holds for the smallest eigenvalue.)
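As a quick numerical illustration (an assumed example, not part of the exercise statement): points placed on a regular $m$-gon of radius $r$ have a direction-independent projected average and variance, and their sample covariance is $(r^2/2) I_2$, matching the claim that isotropic projections force $\Sigma = \sigma^2 I$.

```python
import numpy as np

# Assumed example: m points on a regular polygon of radius r in R^2.
m, r = 7, 2.0
theta = 2 * np.pi * np.arange(m) / m
X = r * np.column_stack([np.cos(theta), np.sin(theta)])  # m x 2 data matrix

x_hat = X.mean(axis=0)                    # sample average, ~ 0
Sigma = (X - x_hat).T @ (X - x_hat) / m   # sample covariance matrix

# Projected values t_i = u^T x_i along a few random unit directions u:
rng = np.random.default_rng(0)
for _ in range(3):
    u = rng.normal(size=2)
    u /= np.linalg.norm(u)
    t = X @ u
    print(t.mean(), t.var())              # ~ 0 and ~ r**2 / 2 for every u

print(Sigma)                              # ~ (r**2 / 2) * I_2
```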
Eigenvalue decomposition
Let $p, q \in \mathbb{R}^n$ be two linearly independent vectors, with unit norm ($\|p\|_2 = \|q\|_2 = 1$). Define the symmetric matrix $A := p q^\top + q p^\top$. In your derivations, it may be useful to use the notation $c := p^\top q$.
- Show that $p + q$ and $p - q$ are eigenvectors of $A$, and determine the corresponding eigenvalues.
- Determine the nullspace and rank of $A$.
- Find an eigenvalue decomposition of $A$. Hint: use the previous two parts.
- What is the answer to the previous part if $p, q$ are not normalized?
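The first two parts can be checked numerically. The sketch below uses assumed random unit vectors $p, q$: with $c = p^\top q$, one finds $A(p+q) = (1+c)(p+q)$ and $A(p-q) = (c-1)(p-q)$, and $A$ has rank 2 (its nullspace is the orthogonal complement of $\mathrm{span}\{p, q\}$).

```python
import numpy as np

# Assumed example vectors (any pair of independent unit vectors works).
rng = np.random.default_rng(1)
p = rng.normal(size=4); p /= np.linalg.norm(p)
q = rng.normal(size=4); q /= np.linalg.norm(q)

c = p @ q                                # c = p^T q
A = np.outer(p, q) + np.outer(q, p)      # A = p q^T + q p^T, symmetric

# p + q and p - q are eigenvectors, with eigenvalues 1 + c and c - 1:
print(np.allclose(A @ (p + q), (1 + c) * (p + q)))   # True
print(np.allclose(A @ (p - q), (c - 1) * (p - q)))   # True

# Rank 2: everything orthogonal to both p and q is mapped to zero.
print(np.linalg.matrix_rank(A))                      # 2
```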
Positive-definite matrices, ellipsoids
- In this problem, we examine the geometrical interpretation of the positive definiteness of a matrix. For each of the following cases, determine the shape of the region generated by the constraint $x^\top A x \leq 1$.
.
.
.
- Show that if a square, $n \times n$ symmetric matrix $A$ is positive semi-definite, then for every $m \times n$ matrix $B$, $B A B^\top$ is also positive semi-definite. (Here, $m$ is an arbitrary integer.)
- Drawing an ellipsoid. How would you efficiently draw an ellipsoid in $\mathbb{R}^2$, if the ellipsoid is described by a quadratic inequality of the form

$$ \mathcal{E} = \left\{ x \in \mathbb{R}^2 \;:\; x^\top A x + 2 b^\top x + c \leq 0 \right\}, $$

where $A$ is a $2 \times 2$ symmetric, positive-definite matrix, $b \in \mathbb{R}^2$, and $c \in \mathbb{R}$? Describe your algorithm as precisely as possible. (You are welcome to provide Matlab code.) Draw the ellipsoid

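One possible approach (a sketch, not necessarily the intended solution, using assumed example data $A$, $b$, $c$): complete the square to write $\mathcal{E} = \{x : (x - x_0)^\top A (x - x_0) \leq r\}$ with center $x_0 = -A^{-1} b$ and $r = b^\top A^{-1} b - c$, then map the unit circle onto the boundary through a Cholesky factor of $A$.

```python
import numpy as np

# Assumed example data (the specific ellipsoid of the statement is not used).
A = np.array([[4.0, 1.0], [1.0, 2.0]])   # symmetric positive definite
b = np.array([1.0, -1.0])
c = -1.0

x0 = -np.linalg.solve(A, b)              # center: x0 = -A^{-1} b
r = b @ np.linalg.solve(A, b) - c        # radius term; r > 0 iff E is nonempty
L = np.linalg.cholesky(A)                # A = L L^T

# Boundary points: x = x0 + sqrt(r) * L^{-T} [cos t; sin t].
t = np.linspace(0, 2 * np.pi, 200)
circle = np.vstack([np.cos(t), np.sin(t)])           # 2 x 200 unit circle
boundary = x0[:, None] + np.sqrt(r) * np.linalg.solve(L.T, circle)

# Sanity check: each boundary point satisfies x^T A x + 2 b^T x + c = 0.
P = boundary.T                                       # 200 x 2
vals = np.einsum('ij,jk,ik->i', P, A, P) + 2 * P @ b + c
print(np.abs(vals).max())                            # ~ 0 (roundoff only)
# To plot: import matplotlib.pyplot as plt; plt.plot(*boundary); plt.axis('equal')
```

The correctness of the parametrization follows from $(x - x_0)^\top A (x - x_0) = r \, [\cos t \; \sin t] \, L^{-1} (L L^\top) L^{-\top} [\cos t \; \sin t]^\top = r$.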
Least-squares estimation
- BLUE property of least-squares. Consider a system of linear equations in vector $x$:

$$ y = A x + v, $$

where $v$ is a noise vector, and the input is $A \in \mathbb{R}^{m \times n}$, a full-rank, tall matrix ($m \geq n$), and $y \in \mathbb{R}^m$.
We do not know anything about $v$, except that it is bounded: $\|v\|_2 \leq \alpha$, with $\alpha \geq 0$ a measure of the level of noise. Our goal is to provide an estimate $\hat{x}$ of $x$ via a linear estimator, that is, a function $\hat{x} = B y$ with $B$ an $n \times m$ matrix. We restrict attention to unbiased estimators, which are such that $\hat{x} = x$ when $v = 0$. This implies that $B$ should be a left inverse of $A$, that is, $B A = I_n$. An example of a linear estimator is obtained by solving the least-squares problem

$$ \min_x \; \|A x - y\|_2. $$

The solution is, when $A$ is full column rank, of the form $\hat{x}_{\mathrm{ls}} = B_{\mathrm{ls}} y$, with $B_{\mathrm{ls}} = (A^\top A)^{-1} A^\top$. We note that $B_{\mathrm{ls}} A = I_n$, which means that the LS estimator is unbiased. In this exercise, we show that $B_{\mathrm{ls}}$ is the best unbiased linear estimator. (This is often referred to as the BLUE property.)
- Show that the estimation error of an unbiased linear estimator is $\hat{x} - x = B v$.
- This motivates us to minimize the size of $B$, say using the Frobenius norm:

$$ \min_B \; \|B\|_F \;:\; B A = I_n. $$
Show that $B_{\mathrm{ls}}$ is the best unbiased linear estimator (BLUE), in the sense that it solves the above problem. Hint: Show that any unbiased linear estimator $B$ can be written as $B = B_{\mathrm{ls}} + Z$ with $Z A = 0$, and that $B B^\top - B_{\mathrm{ls}} B_{\mathrm{ls}}^\top$ is positive semi-definite.
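The structure of the hint can be checked numerically on assumed random data: build some $Z$ with $Z A = 0$ by projecting onto the orthogonal complement of the range of $A$; then $B = B_{\mathrm{ls}} + Z$ is another left inverse of $A$, and since $B_{\mathrm{ls}} Z^\top = 0$, Pythagoras gives $\|B\|_F^2 = \|B_{\mathrm{ls}}\|_F^2 + \|Z\|_F^2 \geq \|B_{\mathrm{ls}}\|_F^2$.

```python
import numpy as np

# Assumed random instance: a tall, generically full-column-rank matrix A.
rng = np.random.default_rng(2)
m, n = 8, 3
A = rng.normal(size=(m, n))
B_ls = np.linalg.solve(A.T @ A, A.T)     # B_ls = (A^T A)^{-1} A^T

# Z with Z A = 0: project a random n x m matrix onto the nullspace of A^T.
P = np.eye(m) - A @ B_ls                 # orthogonal projector onto range(A)^perp
Z = rng.normal(size=(n, m)) @ P
B = B_ls + Z                             # another unbiased (left-inverse) estimator

print(np.allclose(B_ls @ A, np.eye(n)))  # True: LS estimator is unbiased
print(np.allclose(B @ A, np.eye(n)))     # True: B is unbiased too
print(np.linalg.norm(B_ls), np.linalg.norm(B))  # ||B_ls||_F <= ||B||_F
```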