30. Positive semi-definite matrices
30.1. Definitions
For a given symmetric matrix [latex]A\in \mathbb{R}^{n\times n}[/latex], the associated quadratic form is the function [latex]q: \mathbb{R}^n \rightarrow \mathbb{R}[/latex] with values
[latex]\begin{align*}q(x) = x^TAx.\end{align*}[/latex]
- A symmetric matrix [latex]A[/latex] is said to be positive semi-definite (PSD, notation: [latex]A \succeq 0[/latex]) if and only if the associated quadratic form [latex]q[/latex] is non-negative everywhere:
[latex]\begin{align*}q(x) \ge 0 \quad (\forall x\in \mathbb{R}^n).\end{align*}[/latex]
- It is said to be positive definite (PD, notation: [latex]A \succ 0[/latex]) if the quadratic form is non-negative and definite, that is, [latex]q(x) = 0[/latex] if and only if [latex]x=0[/latex]; equivalently, [latex]q(x) > 0[/latex] for every [latex]x \ne 0[/latex].
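As a quick numerical illustration, here is a minimal numpy sketch (the matrix A and the helper quadratic_form are hypothetical, chosen for illustration) that evaluates the quadratic form at random points. Note that sampling finitely many points only gives evidence of non-negativity, not a proof; the eigenvalue test below settles the question.

```python
import numpy as np

def quadratic_form(A, x):
    """Evaluate q(x) = x^T A x."""
    return x @ A @ x

# A hypothetical symmetric matrix used for illustration.
A = np.array([[2.0, 1.0],
              [1.0, 2.0]])

# Checking q(x) >= 0 at finitely many random points is only evidence,
# not a certificate that A is PSD.
rng = np.random.default_rng(0)
print(all(quadratic_form(A, x) >= 0 for x in rng.standard_normal((1000, 2))))
```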
It turns out that a matrix is PSD if and only if the eigenvalues of [latex]A[/latex] are non-negative. Thus, we can check if a form is PSD by computing the eigenvalue decomposition of the underlying symmetric matrix.
Theorem: eigenvalues of PSD matrices
A quadratic form [latex]q(x)= x^TAx[/latex], with [latex]A \in {\bf S}^n[/latex], is non-negative (resp. positive definite) if and only if every eigenvalue of the symmetric matrix [latex]A[/latex] is non-negative (resp. positive).
The theorem shows that the PSD and PD properties depend only on the eigenvalues of the matrix, not on its eigenvectors. Also, if the [latex]n \times n[/latex] matrix [latex]A[/latex] is PSD, then for every matrix [latex]B[/latex] with [latex]n[/latex] rows, the matrix [latex]B^TAB[/latex] is also PSD, since [latex]x^TB^TABx = (Bx)^TA(Bx) \ge 0[/latex] for every [latex]x[/latex].
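The eigenvalue criterion is easy to apply numerically. Below is a minimal sketch (the helper is_psd and the tolerance are assumptions of this sketch, not part of the text): np.linalg.eigvalsh returns the eigenvalues of a symmetric matrix, and a small tolerance absorbs floating-point roundoff. The last lines illustrate the remark that [latex]B^TAB[/latex] stays PSD.

```python
import numpy as np

def is_psd(A, tol=1e-10):
    """Test whether the symmetric matrix A is PSD via its eigenvalues."""
    return np.linalg.eigvalsh(A).min() >= -tol

A = np.array([[2.0, 1.0],
              [1.0, 2.0]])      # eigenvalues 1 and 3: PSD (in fact PD)
print(is_psd(A))                 # True

# For any B with n rows, B^T A B is again PSD.
B = np.random.default_rng(1).standard_normal((2, 5))
print(is_psd(B.T @ A @ B))       # True (up to roundoff)
```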
30.2. Special cases and examples
Symmetric dyads
Special cases of PSD matrices include symmetric dyads. Indeed, if [latex]A = uu^T[/latex] for some vector [latex]u \in \mathbb{R}^n[/latex], then for every [latex]x[/latex]:
[latex]\begin{align*}q_A(x) = x^Tuu^Tx = (u^Tx)^2 \ge 0.\end{align*}[/latex]
More generally, if [latex]B \in \mathbb{R}^{m\times n}[/latex], then [latex]A = B^TB[/latex] is PSD, since
[latex]\begin{align*}q_A(x) = x^TB^TBx = ||Bx||_2^2 \ge 0.\end{align*}[/latex]
Diagonal matrices
A diagonal matrix is PSD (resp. PD) if and only if all of its (diagonal) elements are non-negative (resp. positive).
Examples of PSD matrices
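With the special cases above in mind, here is a small numpy sketch (all matrices below are hypothetical, chosen for illustration) that builds a symmetric dyad, a matrix of the form [latex]B^TB[/latex], and a diagonal matrix with non-negative entries, and checks that the eigenvalues of each are non-negative up to roundoff.

```python
import numpy as np

rng = np.random.default_rng(0)

u = rng.standard_normal(4)
dyad = np.outer(u, u)                 # symmetric dyad u u^T
B = rng.standard_normal((6, 4))
gram = B.T @ B                        # matrix of the form B^T B
D = np.diag([3.0, 1.0, 0.5, 0.0])     # diagonal with non-negative entries

for M in (dyad, gram, D):
    print(np.linalg.eigvalsh(M).min() >= -1e-10)   # True for each
```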
30.3. Square root and Cholesky decomposition
For PSD matrices, we can generalize the notion of the ordinary square root of a non-negative number. Indeed, if [latex]A[/latex] is PSD, there exists a unique PSD matrix, denoted [latex]A^{1/2}[/latex], such that [latex]A = (A^{1/2})^2[/latex]. We can express this matrix square root in terms of the SED (symmetric eigenvalue decomposition) of [latex]A[/latex], [latex]A = U \Lambda U^T[/latex], as [latex]A^{1/2} = U \Lambda^{1/2} U^T[/latex], where [latex]\Lambda^{1/2}[/latex] is obtained from [latex]\Lambda[/latex] by taking the square root of its diagonal elements. If [latex]A[/latex] is PD, then so is its square root.
Any PSD matrix can be written as a product [latex]A= LL^T[/latex] for an appropriate matrix [latex]L[/latex]. The decomposition is not unique, and [latex]L = A^{1/2}[/latex] is only one possible choice (the only PSD one). Another choice, in terms of the SED [latex]A = U \Lambda U^T[/latex], is [latex]L = U \Lambda^{1/2}[/latex]. If [latex]A[/latex] is positive definite, then we can choose [latex]L[/latex] to be lower triangular and invertible. The decomposition is then known as the Cholesky decomposition of [latex]A[/latex].
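Both factorizations are available numerically. The sketch below (the matrices are hypothetical; numpy's eigh and cholesky are used, with cholesky returning a lower-triangular factor [latex]L[/latex] such that [latex]A = LL^T[/latex]) builds the square root from the SED and then a Cholesky factor.

```python
import numpy as np

# A hypothetical PD matrix, built as B^T B with B of full column rank
# (which holds with probability one for this random draw).
rng = np.random.default_rng(0)
B = rng.standard_normal((5, 3))
A = B.T @ B

# Matrix square root from the SED A = U diag(lam) U^T.
lam, U = np.linalg.eigh(A)
A_half = U @ np.diag(np.sqrt(lam)) @ U.T
print(np.allclose(A_half @ A_half, A))   # A = (A^{1/2})^2

# Cholesky factor: lower-triangular, invertible L with A = L L^T.
L = np.linalg.cholesky(A)
print(np.allclose(L @ L.T, A))
```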
30.4. Ellipsoids
There is a strong correspondence between ellipsoids and PSD matrices.
Definition
We define an ellipsoid to be an affine transformation of the unit ball for the Euclidean norm:
[latex]\begin{align*}{\bf E} = \{\hat{x} +Lz: ||z||_2 \leq 1 \},\end{align*}[/latex]
where [latex]L \in \mathbb{R}^{n\times n}[/latex] is an arbitrary non-singular matrix. We can express the ellipsoid as
[latex]\begin{align*} \mathbf{E}=\left\{x:\left\|L^{-1}(x-\hat{x})\right\|_2 \leq 1\right\}=\left\{x:(x-\hat{x})^T A^{-1}(x-\hat{x}) \leq 1\right\},\end{align*}[/latex]
where [latex]A := LL^T[/latex] is PD.
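A minimal numpy sketch of the two equivalent membership tests (the particular L, center and test points are hypothetical, chosen for illustration):

```python
import numpy as np

L = np.array([[2.0, 0.0],
              [1.0, 1.0]])        # non-singular
x_hat = np.array([1.0, -1.0])     # center
A = L @ L.T                       # A := L L^T is PD

def in_ellipsoid(x):
    # ||L^{-1}(x - x_hat)||_2 <= 1  and  (x - x_hat)^T A^{-1} (x - x_hat) <= 1
    d = x - x_hat
    return (np.linalg.norm(np.linalg.solve(L, d)) <= 1,
            d @ np.linalg.solve(A, d) <= 1)

print(in_ellipsoid(x_hat))                               # (True, True): the center
print(in_ellipsoid(x_hat + L @ np.array([3.0, 0.0])))    # (False, False): outside
```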
Geometric interpretation via SED
We can interpret the eigenvectors and associated eigenvalues of [latex]A[/latex] in terms of the geometrical properties of the ellipsoid, as follows. Consider the SED of [latex]A[/latex]: [latex]A = U \Lambda U^T[/latex], with [latex]U^T U = I[/latex] and [latex]\Lambda[/latex] diagonal, with positive diagonal elements. The SED of its inverse is [latex]A^{-1} = U \Lambda^{-1} U^T[/latex]. Let [latex]\tilde{x} = U^T(x- \hat{x})[/latex]. We can express the condition [latex]x \in {\bf E}[/latex], that is, [latex](x-\hat{x})^TA^{-1}(x-\hat{x}) \leq 1[/latex], as
[latex]\begin{align*}\tilde{x}^T\Lambda^{-1} \tilde{x} = \sum\limits_{i=1}^n \frac{\tilde{x}_i^2}{\lambda_i} \leq 1.\end{align*}[/latex]
Now set [latex]\overline{x}_i = \tilde{x}_i/\sqrt{\lambda_i}[/latex], [latex]i = 1, \cdots, n[/latex]. The above can be written as [latex]\overline{x}^T \overline{x} \leq 1[/latex]: in [latex]\overline{x}[/latex]-space, the ellipsoid is simply a unit ball. In [latex]\tilde{x}[/latex]-space, the ellipsoid corresponds to scaling each [latex]\overline{x}[/latex]-axis by the square root of the corresponding eigenvalue [latex]\lambda_i[/latex]. The ellipsoid has principal axes parallel to the coordinate axes in [latex]\tilde{x}[/latex]-space. We then apply a rotation and a translation to get the ellipsoid in the original [latex]x[/latex]-space. The rotation is determined by the eigenvectors of [latex]A[/latex], which are the columns of the orthogonal matrix [latex]U[/latex]. Thus, the geometry of the ellipsoid can be read off the SED of the PD matrix [latex]A = LL^T[/latex]: the eigenvectors give the principal directions, and the semi-axis lengths are the square roots of the eigenvalues.
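The same reading of the geometry can be done numerically. In the sketch below (using the same hypothetical L as before), np.linalg.eigh returns the eigenvalues in ascending order together with orthonormal eigenvectors.

```python
import numpy as np

L = np.array([[2.0, 0.0],
              [1.0, 1.0]])
A = L @ L.T                     # SED: A = U diag(lam) U^T

lam, U = np.linalg.eigh(A)      # lam > 0 since A is PD
print(U)                        # columns: principal directions of the ellipsoid
print(np.sqrt(lam))             # corresponding semi-axis lengths
```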
The above shows in particular that an equivalent representation of an ellipsoid is
[latex]\begin{align*}\{x: (x-\hat{x})^TB(x-\hat{x}) \leq 1 \}\end{align*}[/latex]
where [latex]B:=A^{-1}[/latex] is PD.
It is possible to define degenerate ellipsoids, which correspond to cases where the matrix [latex]B[/latex] in the above (or the matrix [latex]A[/latex] in the earlier representation) is singular, that is, PSD but not PD. For example, cylinders or slabs (intersections of two parallel half-spaces) are degenerate ellipsoids.