Positive Semi-Definite Matrices

  • Definitions
  • Special cases and examples
  • Square root and Cholesky decomposition
  • Ellipsoids

Definitions

For a given symmetric matrix A\in \mathbb{R}^{n\times n}, the associated quadratic form is the function q: \mathbb{R}^n \rightarrow \mathbb{R} with values

q(x) = x^TAx.
  • A symmetric matrix A is said to be positive semi-definite (PSD, notation: A \succeq 0) if and only if the associated quadratic form q is non-negative everywhere:
q(x) \ge 0 \quad \forall x\in \mathbb{R}^n.
  • It is said to be positive definite (PD, notation: A \succ 0) if the quadratic form is positive definite, that is, q(x) > 0 for every x \ne 0.

It turns out that a symmetric matrix A is PSD if and only if its eigenvalues are non-negative. Thus, we can check whether a form is PSD by computing the eigenvalue decomposition of the underlying symmetric matrix.
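As an illustration, the eigenvalue test is easy to run numerically. The sketch below (using NumPy; the matrices are illustrative) decides PSD-ness by inspecting the smallest eigenvalue:

```python
import numpy as np

def is_psd(A, tol=1e-10):
    # Eigenvalue test: a symmetric A is PSD iff all its eigenvalues are >= 0.
    # eigvalsh exploits symmetry and returns real eigenvalues in ascending order.
    return bool(np.linalg.eigvalsh(A)[0] >= -tol)

A = np.array([[2.0, -1.0], [-1.0, 2.0]])  # eigenvalues 1 and 3: PD, hence PSD
B = np.array([[1.0, 2.0], [2.0, 1.0]])    # eigenvalues -1 and 3: not PSD
print(is_psd(A), is_psd(B))  # True False
```

The small tolerance guards against eigenvalues that are zero in exact arithmetic but slightly negative in floating point.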

Theorem: eigenvalues of PSD matrices

A quadratic form q(x)= x^TAx, with A \in {\bf S}^n (the set of symmetric n \times n matrices), is non-negative (resp. positive-definite) if and only if every eigenvalue of the symmetric matrix A is non-negative (resp. positive).

Proof.

Write the SED of A as A = U \Lambda U^T, with U orthogonal and \Lambda = {\bf diag}(\lambda_1, \cdots, \lambda_n). Setting \tilde{x} = U^Tx, and noting that \tilde{x} ranges over all of \mathbb{R}^n as x does, we obtain

q(x) = x^TU\Lambda U^Tx = \sum\limits_{i=1}^n \lambda_i \tilde{x}_i^2.

This sum is non-negative for every \tilde{x} if and only if each \lambda_i \ge 0, and positive for every \tilde{x} \ne 0 if and only if each \lambda_i > 0.

In particular, the PSD and PD properties depend only on the eigenvalues of the matrix, not on its eigenvectors. Also, if the n \times n matrix A is PSD, then for every matrix B with n rows, the matrix B^TAB also is, since x^T(B^TAB)x = (Bx)^TA(Bx) \ge 0 for every x.

Special cases and examples

Symmetric dyads

Special cases of PSD matrices include symmetric dyads. Indeed, if A = uu^T for some vector u \in \mathbb{R}^n, then for every x:

q_A(x) = x^Tuu^Tx = (u^Tx)^2 \ge 0.

More generally, if B \in \mathbb{R}^{m\times n}, then A = B^TB is PSD, since

q_A(x) = x^TB^TBx = ||Bx||_2^2 \ge 0.
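This Gram-matrix construction is easy to check numerically. A short sketch (NumPy; the random B is purely illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)
B = rng.standard_normal((3, 5))  # an arbitrary m x n matrix
A = B.T @ B                      # the Gram matrix B^T B, an n x n PSD matrix

# q_A(x) = x^T A x = ||Bx||_2^2, hence non-negative for every x
x = rng.standard_normal(5)
q = x @ A @ x
print(np.isclose(q, np.linalg.norm(B @ x) ** 2), q >= 0)  # True True
```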

Diagonal matrices

A diagonal matrix is PSD (resp. PD) if and only if all of its (diagonal) elements are non-negative (resp. positive).

Square root and Cholesky decomposition

For PSD matrices, we can generalize the notion of the ordinary square root of a non-negative number. Indeed, if A is PSD, there exists a unique PSD matrix, denoted A^{1/2}, such that A = (A^{1/2})^2. We can express this matrix square root in terms of the symmetric eigenvalue decomposition (SED) of A, A = U \Lambda U^T, as A^{1/2} = U \Lambda^{1/2} U^T, where \Lambda^{1/2} is obtained from \Lambda by taking the square root of its diagonal elements. If A is PD, then so is its square root.
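A sketch of this square root via the SED, assuming NumPy (`numpy.linalg.eigh` computes the SED of a symmetric matrix):

```python
import numpy as np

def psd_sqrt(A):
    # SED of A: A = U diag(lam) U^T, with lam >= 0 for a PSD matrix
    lam, U = np.linalg.eigh(A)
    lam = np.clip(lam, 0.0, None)  # guard against tiny negative round-off
    return U @ np.diag(np.sqrt(lam)) @ U.T

A = np.array([[2.0, 1.0], [1.0, 2.0]])  # PD example: eigenvalues 1 and 3
R = psd_sqrt(A)
print(np.allclose(R @ R, A))  # True: (A^{1/2})^2 = A
```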

Any PSD matrix can be written as a product A = LL^T for an appropriate matrix L. The decomposition is not unique, and L = A^{1/2} is only one possible choice (the only PSD one). Another choice, in terms of the SED A = U \Lambda U^T, is L = U \Lambda^{1/2}. If A is positive-definite, then we can choose L to be lower triangular and invertible; the decomposition is then known as the Cholesky decomposition of A.
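For a PD matrix, the Cholesky factor is exactly what `numpy.linalg.cholesky` returns: a lower-triangular L with A = LL^T. A minimal sketch (the matrix is illustrative):

```python
import numpy as np

A = np.array([[4.0, 2.0], [2.0, 3.0]])  # a PD matrix
L = np.linalg.cholesky(A)               # lower-triangular factor with A = L L^T
print(np.allclose(L @ L.T, A), np.allclose(L, np.tril(L)))  # True True
```

Note that A^{1/2} and the Cholesky factor are two different matrices L factoring the same A, illustrating the non-uniqueness of the decomposition.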

Ellipsoids

There is a strong correspondence between ellipsoids and PSD matrices.

Definition

We define an ellipsoid to be an affine transformation of the unit ball for the Euclidean norm:

{\bf E} = \{\hat{x} +Lz: ||z||_2 \leq 1 \},

where L \in \mathbb{R}^{n\times n} is an arbitrary non-singular matrix. We can express the ellipsoid as

 \mathbf{E}=\left\{x:\left\|L^{-1}(x-\hat{x})\right\|_2 \leq 1\right\}=\left\{x:(x-\hat{x})^T A^{-1}(x-\hat{x}) \leq 1\right\},

where A := LL^T is PD, so that A^{-1} = L^{-T} L^{-1}.
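The two representations describe the same set, which can be checked numerically. The sketch below (NumPy; \hat{x} and L are the illustrative values used in the example later in this section) verifies that boundary points x = \hat{x} + Lz with \|z\|_2 = 1 satisfy (x-\hat{x})^T A^{-1}(x-\hat{x}) = 1:

```python
import numpy as np

x_hat = np.array([2.0, 1.0])
L = np.array([[1.0, 2.0], [0.0, 3.0]])  # non-singular
A = L @ L.T                             # then A^{-1} = L^{-T} L^{-1}

# boundary points of E: x = x_hat + L z with ||z||_2 = 1
theta = np.linspace(0.0, 2.0 * np.pi, 200)
Z = np.vstack([np.cos(theta), np.sin(theta)])
D = L @ Z                               # columns are x - x_hat

# each column satisfies (x - x_hat)^T A^{-1} (x - x_hat) = 1
vals = np.einsum("ij,ij->j", D, np.linalg.solve(A, D))
print(np.allclose(vals, 1.0))  # True
```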

Geometric interpretation via SED

We can interpret the eigenvectors and associated eigenvalues of A in terms of the geometrical properties of the ellipsoid, as follows. Consider the SED of A = LL^T: A = U \Lambda U^T, with U^TU = I and \Lambda diagonal, with positive diagonal elements. The SED of its inverse is A^{-1} = U \Lambda^{-1} U^T. Let \tilde{x} = U^T(x- \hat{x}). We can express the condition x \in {\bf E} as

\tilde{x}^T\Lambda^{-1} \tilde{x} = \sum\limits_{i=1}^n \frac{\tilde{x}_i^2}{\lambda_i} \leq 1.

Now set \overline{x}_i = \tilde{x}_i/\sqrt{\lambda_i}, i = 1, \cdots, n. The above condition writes \overline{x}^T \overline{x} \leq 1: in \overline{x}-space, the ellipsoid is simply a unit ball. In \tilde{x}-space, the ellipsoid is obtained from the unit ball by scaling the i-th axis by \sqrt{\lambda_i}; its principal axes are parallel to the coordinate axes in \tilde{x}-space. We then apply a rotation and a translation to get the ellipsoid in the original x-space. The rotation is determined by the eigenvectors of A, which are the columns of the orthogonal matrix U. Thus, the geometry of the ellipsoid can be read from the SED of the PD matrix A = LL^T: the eigenvectors give the principal directions, and the semi-axis lengths are the square roots of the eigenvalues.

Example (figure omitted): the ellipsoid \{\hat{x} +Lz: ||z||_2 \leq 1 \}, with

\hat{x}=\left(\begin{array}{l} 2 \\ 1 \end{array}\right), \quad L=\left(\begin{array}{ll} 1 & 2 \\ 0 & 3 \end{array}\right) \text {. }

The matrix LL^T admits the SED

L L^T=U \Lambda U^T, \quad U=\left(\begin{array}{cc} 0.8112 & 0.5847 \\ -0.5847 & 0.8112 \end{array}\right), \quad \Lambda=\left(\begin{array}{cc} 0.6754 & 0 \\ 0 & 13.3246 \end{array}\right).

We check that the columns of U determine the principal directions, and that the semi-axis lengths are \sqrt{\lambda_2} = 3.6503 and \sqrt{\lambda_1} = 0.8219.
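These numbers can be reproduced in a few lines (NumPy sketch):

```python
import numpy as np

L = np.array([[1.0, 2.0], [0.0, 3.0]])
lam, U = np.linalg.eigh(L @ L.T)  # SED of L L^T; eigenvalues in ascending order

semi_axes = np.sqrt(lam)          # semi-axis lengths of the ellipsoid
print(np.round(lam, 4))           # eigenvalues 0.6754 and 13.3246
print(np.round(semi_axes, 4))     # semi-axis lengths 0.8219 and 3.6503
```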

The above shows in particular that an equivalent representation of an ellipsoid is

\{x: (x-\hat{x})^TB(x-\hat{x}) \leq 1 \}

where B:=A^{-1} is PD.

It is possible to define degenerate ellipsoids, which correspond to cases when the matrix B above, or its inverse A, is singular. For example, cylinders or slabs (intersections of two parallel half-spaces) are degenerate ellipsoids.

License

Hyper-Textbook: Optimization Models and Applications Copyright © by L. El Ghaoui. All Rights Reserved.
