Exercises

Interpretation of covariance matrix

We are given m of points x_1, \cdots, x_m in \mathbb{R}^n. We assume that the average and variance of the data projected along a given direction do not change with the direction. In this exercise, we will show that the sample covariance matrix is then proportional to the identity.

We formalize this as follows. To a given normalized direction w \in \mathbb{R}^n (||w||_2 =1), we associate the line with direction w passing through the origin, \mathcal{L}(w)=\{tw: t\in \mathbb{R}\}. We then consider the projection of the points x_i, i = 1, \cdots, m, on the line \mathcal{L}(w), and look at the associated coordinates of the points on the line. These projected values are given by

t_i(w):=\arg\min\limits_t ||tw-x_i||_2, \quad i = 1, \cdots, m.

We assume that for any w, the sample average \hat{t}(w) of the projected values t_i(w), i = 1, \cdots, m, and their sample variance \sigma^2(w), are both constant, independent of the direction w (with ||w||_2 =1). Denote by \hat{t} and \sigma^2 the (constant) sample average and variance.

Justify your answer to the following as carefully as you can.

  1. Show that t_i(w) = x_i^Tw, \quad i = 1,\cdots, m.
  2. Show that the sample average of the data points
\hat{x}:=\frac{1}{m} \sum\limits_{i=1}^m x_i,

is zero.

  1. Show that the sample covariance matrix of the data points,
\sum: = \frac{1}{m} (x_i - \hat{x})(x_i -\hat{x})^T,

is of the form \sigma^2 \cdot I, where I is the identity matrix of order n. (Hint: the largest eigenvalue \lambda_{\max} of the matrix \sum can be written as: \lambda_{\max} = \max_w \{w^T\sum w: w^Tw=1\}, and a similar expression holds for the smallest eigenvalue.)

Eigenvalue decomposition

Let p,q \in \mathbb{R}^n be two linearly independent vectors, with unit norm (||p||_2 = ||q||_2 = 1). Define the symmetric matrix A: = pq^T + qp^T. In your derivations, it may be useful to use the notation c:=p^Tq.

  1. Show that p+q and p-q are eigenvectors of A, and determine the corresponding eigenvalues.
  2. Determine the nullspace and rank of A.
  3. Find an eigenvalue decomposition of AHint: use the previous two parts.
  4. What is the answer to the previous part if p,q are not normalized?

Positive-definite matrices, ellipsoids

  1. In this problem, we examine the geometrical interpretation of the positive definiteness of a matrix. For each of the following cases determine the shape of the region generated by the constraint x^TAx \leq 1.
    1. A = \left(\begin{array}{ll} 2 & 1 \\ 1 & 2 \end{array}\right).
    2. A = \left(\begin{array}{cc} 1 & -1 \\ -1 & 1 \end{array}\right).
    3. A = \left(\begin{array}{cc} -1 & 0 \\ 0 & -1 \end{array}\right).
  2. Show that if a square, n \times n symmetric matrix A is positive semi-definite, then for every n \times k matrix B, B^TAB is also positive semi-definite. (Here, k is an arbitrary integer.)
  3. Drawing an ellipsoid. How would you efficiently draw an ellipsoid in \mathbb{R}^2, if the ellipsoid is described by a quadratic inequality of the form
{\bf E} = \{x^TAx+2b^Tx+c \leq 0\},

where A is 2 \times 2 and symmetric, positive-definite, c \in \mathbb{R}^2, and b \in \mathbb{R} ? Describe your algorithm as precisely as possible. (You are welcome to provide a Matlab code.) Draw the ellipsoid

{\bf E} = \{4x_1^2 +2x_2^2 +3x_1x_2+4x_1+5x_2+3\leq 1\}.

Least-squares estimation

  1. BLUE property of least-squares. Consider a system of linear equations in vector x
Ax=y+v,

where v is a noise vector, and the input is A \in \mathbb{R}^{m\times n}, a full rank, tall matrix (m \ge n), and y \in \mathbb{R}^m. We do not know anything about v, except that it is bounded: ||v||_2 \leq \alpha, with \alpha \ge 0 a measure of the level of noise. Our goal is to provide an estimate \hat{x} of x via a linear estimator, that is, a function \hat{x} = By with B a n \times m matrix. We restrict attention to unbiased estimators, which are such that \hat{x} = x when v = 0. This implies that B should be a left inverse of A, that is, BA= I_n. An example of the linear estimator is obtained by solving the least-squares problem

\min\limits_x ||Ax-y||_2

The solution is, when A is full column rank, of the form x^* = B_{ls}y, with B_{ls} = (A^TA)^-1 A^T. We note that B_{ls}A=I, which means that the LS estimator is unbiased. In this exercise, we show that B_{ls} is the best unbiased linear estimator. (This is often referred to as the BLUE property.)

    1. Show that the estimation error of an unbiased linear estimator is x - \hat{x} = -Bv.
    2. This motivates us to minimize the size of B, say using the Frobenius norm:
\min\limits_B ||B||_F: BA=I.

Show that B_{ls} is the best unbiased linear estimator (BLUE), in the sense that it solves the above problem. Hint: Show that any unbiased linear estimator B can be written as B= B_{ls} +Z with ZB_{ls}^T = 0, and that BB^T = ZZ^T + B_{ls}B_{ls}^T is positive semi-definite.

License

Hyper-Textbook: Optimization Models and Applications Copyright © by L. El Ghaoui. All Rights Reserved.

Share This Book