28
28.1. Symmetric matrices and quadratic functions
Symmetric matrices
A square matrix [latex]A \in \mathbb{R}^{n\times n}[/latex] is symmetric if it is equal to its transpose. That is,
[latex]\begin{equation*} A_{ij} = A_{ji}, \quad 1 \leq i,j \leq n. \end{equation*}[/latex]
The set of symmetric [latex]n \times n[/latex] matrices is denoted [latex]{\bf S}^n[/latex]. This set is a subspace of [latex]\mathbb{R}^{n\times n}[/latex].
Example 1: A [latex]3 \times 3[/latex] symmetric matrix.
The matrix
[latex]\begin{equation*} A = \begin{pmatrix} 4 & 3/2 & 2 \\ 3/2 & 2 & 5/2 \\ 2 & 5/2 & 2 \end{pmatrix} \end{equation*}[/latex]
is symmetric. The matrix
[latex]\begin{equation*} B = \begin{pmatrix} 4 & 3/2 & 2 \\ 3/2 & 2 & \mathbf{5} \\ 2 & \mathbf{5/2} & 2 \end{pmatrix} \end{equation*}[/latex]
is not, since it is not equal to its transpose: [latex]B_{23} = 5 \neq 5/2 = B_{32}[/latex].
See also:
- Representation of a weighted, undirected graph.
- Laplacian matrix of a graph.
- Hessian of a function.
- Gram matrix of data points.
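As a quick numerical illustration (a minimal NumPy sketch, using the matrices of Example 1 above), symmetry can be checked by comparing a matrix with its transpose:

```python
import numpy as np

# Symmetric matrix A from Example 1.
A = np.array([[4.0, 1.5, 2.0],
              [1.5, 2.0, 2.5],
              [2.0, 2.5, 2.0]])

# Matrix B from Example 1: its (2,3) and (3,2) entries differ.
B = np.array([[4.0, 1.5, 2.0],
              [1.5, 2.0, 5.0],
              [2.0, 2.5, 2.0]])

def is_symmetric(M, tol=1e-12):
    """Return True if M equals its transpose, up to the given tolerance."""
    return bool(np.allclose(M, M.T, atol=tol))

print(is_symmetric(A))  # True
print(is_symmetric(B))  # False
```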
Quadratic functions
A function [latex]q: \mathbb{R}^n \rightarrow \mathbb{R}[/latex] is said to be a quadratic function if it can be expressed as
[latex]\begin{align*} q(x) &= \sum\limits_{i=1}^n \sum\limits_{j=1}^n A_{ij}x_ix_j+2\sum\limits_{i=1}^n b_ix_i+c, \end{align*}[/latex]
for numbers [latex]A_{ij}[/latex], [latex]b_i[/latex], and [latex]c[/latex], with [latex]i,j \in \{1,\cdots,n\}[/latex]. A quadratic function is thus a linear combination of the [latex]x_i[/latex]'s and of all the "cross-products" [latex]x_i x_j[/latex], plus a constant. We observe that the coefficient of [latex]x_i x_j[/latex] in this expression is [latex](A_{ij} + A_{ji})[/latex].
The function is said to be a quadratic form if there are no linear or constant terms in it:
[latex]\begin{equation*} b_i = 0, \quad i = 1,\cdots,n, \qquad c = 0. \end{equation*}[/latex]
Note that the Hessian (matrix of second derivatives) of a quadratic function is constant.
Link between quadratic functions and symmetric matrices
There is a natural relationship between symmetric matrices and quadratic functions. Indeed, any quadratic function [latex]q: \mathbb{R}^n \rightarrow \mathbb{R}[/latex] can be written as
[latex]\begin{equation*} q(x) = \begin{pmatrix} x \\ 1 \end{pmatrix}^T\begin{pmatrix} A & b \\ b^T & c \end{pmatrix}\begin{pmatrix} x \\ 1 \end{pmatrix} = x^T A x + 2 b^T x + c \end{equation*}[/latex]
for an appropriate symmetric matrix [latex]A \in {\bf S}^n[/latex], vector [latex]b \in \mathbb{R}^n[/latex] and scalar [latex]c \in \mathbb{R}[/latex]. Here:
- [latex]A_{ii}[/latex] is the coefficient of [latex]x_i^2[/latex] in [latex]q[/latex];
- for [latex]i \neq j[/latex], [latex]2A_{ij}[/latex] is the coefficient of the term [latex]x_i x_j[/latex] in [latex]q[/latex];
- [latex]2b_i[/latex] is the coefficient of the term [latex]x_i[/latex];
- [latex]c[/latex] is the constant term, [latex]q(0)[/latex].
If [latex]q[/latex] is a quadratic form, then [latex]b=0[/latex], [latex]c=0[/latex], and we can write [latex]q(x) = x^TAx[/latex] where [latex]A \in {\bf S}^n[/latex].
Examples: Two-dimensional example.
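The correspondence above is easy to check numerically. The sketch below (a NumPy illustration with arbitrarily chosen [latex]A[/latex], [latex]b[/latex], [latex]c[/latex]) evaluates a quadratic function both directly as [latex]x^TAx + 2b^Tx + c[/latex] and through the block matrix acting on the augmented vector [latex](x, 1)[/latex]:

```python
import numpy as np

# Data defining q(x) = x^T A x + 2 b^T x + c (values chosen arbitrarily).
A = np.array([[4.0, 1.5],
              [1.5, 2.0]])   # symmetric
b = np.array([1.0, -1.0])
c = 3.0

def q_direct(x):
    return x @ A @ x + 2 * b @ x + c

def q_block(x):
    # Block matrix [[A, b], [b^T, c]] applied to the augmented vector (x, 1).
    M = np.block([[A, b[:, None]],
                  [b[None, :], np.array([[c]])]])
    z = np.append(x, 1.0)
    return z @ M @ z

x = np.array([0.5, 2.0])
print(q_direct(x), q_block(x))  # the two evaluations agree
```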
28.2. Second-order approximations of non-quadratic functions
We have seen how linear functions arise when one seeks a simple, linear approximation to a more complicated non-linear function. Likewise, quadratic functions arise naturally when one seeks to approximate a given non-quadratic function by a quadratic one.
One-dimensional case
If [latex]f: \mathbb{R} \rightarrow \mathbb{R}[/latex] is a twice-differentiable function of a single variable, then the second-order approximation (or, second-order Taylor expansion) of [latex]f[/latex] at a point [latex]x_0[/latex] is of the form
[latex]\begin{align*} f(x) &\approx q(x) = f(x_0)+f'(x_0)(x-x_0)+\frac{1}{2}f''(x_0)(x-x_0)^2, \end{align*}[/latex]
where [latex]f'(x_0)[/latex] is the first derivative, and [latex]f''(x_0)[/latex] the second derivative, of [latex]f[/latex] at [latex]x_0[/latex]. We observe that the quadratic approximation [latex]q[/latex] has the same value, derivative, and second-derivative as [latex]f[/latex], at [latex]x_0[/latex].
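As an illustration (a short NumPy sketch, using [latex]f(x) = \log(1+x)[/latex] purely as an example), we can compare [latex]f[/latex] with its second-order approximation at [latex]x_0 = 1[/latex]:

```python
import numpy as np

# f(x) = log(1 + x), with f'(x) = 1/(1+x) and f''(x) = -1/(1+x)^2.
f = lambda x: np.log(1.0 + x)
df = lambda x: 1.0 / (1.0 + x)
d2f = lambda x: -1.0 / (1.0 + x) ** 2

x0 = 1.0

def q(x):
    # Second-order Taylor expansion of f at x0.
    return f(x0) + df(x0) * (x - x0) + 0.5 * d2f(x0) * (x - x0) ** 2

for x in (0.8, 1.0, 1.2):
    print(x, f(x), q(x))  # f and q are close near x0
```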
Multi-dimensional case
In multiple dimensions, we have a similar result. Let us approximate a twice-differentiable function [latex]f: \mathbb{R}^n \rightarrow \mathbb{R}[/latex] by a quadratic function [latex]q[/latex], so that [latex]f[/latex] and [latex]q[/latex] coincide up to and including the second derivatives at a given point [latex]x_0[/latex].
The function [latex]q[/latex] must be of the form
[latex]\begin{align*} q(x) &= x^TAx+2b^Tx+c, \end{align*}[/latex]
where [latex]A \in {\bf S}^n[/latex], [latex]b \in \mathbb{R}^n[/latex], and [latex]c \in \mathbb{R}[/latex]. Our condition that [latex]q[/latex] coincides with [latex]f[/latex] up to and including the second derivatives at [latex]x_0[/latex] shows that we must have
[latex]\begin{align*} \nabla^2 q(x_0) &= 2A = \nabla^2 f(x_0), \\ \nabla q(x_0) &= 2(Ax_0+b) = \nabla f(x_0), \\ q(x_0) = x_0^TAx_0 + 2b^Tx_0 + c &= f(x_0), \end{align*}[/latex]
where [latex]\nabla^2 f(x_0)[/latex] is the Hessian, and [latex]\nabla f(x_0)[/latex] the gradient, of [latex]f[/latex] at [latex]x_0[/latex].
Solving for [latex]A,b,c[/latex] we obtain the following result:
Second-order expansion of a function. The second-order approximation of a twice-differentiable function [latex]f[/latex] at a point [latex]x_0[/latex] is of the form
[latex]\begin{align*} f(x) &\approx q(x) = f(x_0)+\nabla f(x_0)^T(x-x_0)+\frac{1}{2}(x-x_0)^T\nabla^2 f(x_0)(x-x_0), \end{align*}[/latex]
where [latex]\nabla f(x_0) \in \mathbb{R}^n[/latex] is the gradient of [latex]f[/latex] at [latex]x_0[/latex], and the symmetric matrix [latex]\nabla^2 f(x_0)[/latex] is the Hessian of [latex]f[/latex] at [latex]x_0[/latex].
Example: Second-order expansion of the log-sum-exp function.
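Below is a minimal NumPy sketch of that example, under the standard definitions: for [latex]f(x) = \log \sum_i e^{x_i}[/latex], the gradient is the softmax vector [latex]z[/latex] and the Hessian is [latex]{\bf diag}(z) - zz^T[/latex]; the point [latex]x_0[/latex] and the perturbation are chosen arbitrarily.

```python
import numpy as np

def f(x):
    # log-sum-exp, computed in a numerically stable way.
    m = x.max()
    return m + np.log(np.exp(x - m).sum())

def grad_f(x):
    z = np.exp(x - x.max())
    return z / z.sum()                  # softmax of x

def hess_f(x):
    z = grad_f(x)
    return np.diag(z) - np.outer(z, z)  # diag(z) - z z^T

x0 = np.array([1.0, 0.0, -1.0])

def q(x):
    # Second-order approximation of f at x0.
    d = x - x0
    return f(x0) + grad_f(x0) @ d + 0.5 * d @ hess_f(x0) @ d

x = x0 + np.array([0.05, -0.02, 0.01])
print(f(x), q(x))  # the two values are close for x near x0
```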
28.3. Special symmetric matrices
Diagonal matrices
Perhaps the simplest special case of symmetric matrices is the class of diagonal matrices, whose entries off the main diagonal are all zero.
If [latex]\lambda \in \mathbb{R}^n[/latex], we denote by [latex]{\bf diag}(\lambda_1, \cdots, \lambda_n)[/latex], or [latex]{\bf diag} (\lambda)[/latex] for short, the [latex]n \times n[/latex] (symmetric) diagonal matrix with [latex]\lambda[/latex] on its diagonal. Diagonal matrices correspond to quadratic functions of the form
[latex]\begin{align*} q(x) &= \sum\limits_{i=1}^n \lambda_i x_i^2 = x^T{\bf diag}(\lambda)x. \end{align*}[/latex]
Such functions do not have any "cross-terms" of the form [latex]x_i x_j[/latex] with [latex]i \neq j[/latex].
Example 3: A diagonal matrix and its associated quadratic form.
Define a diagonal matrix:
[latex]\begin{align*} D &:= \mathbf{diag}(1,4,-3) =\left(\begin{array}{ccc} 1 & 0 & 0 \\ 0 & 4 & 0 \\ 0 & 0 & -3 \end{array}\right). \end{align*}[/latex]
For the matrix above, the associated quadratic form is
[latex]\begin{align*} q(x) &= x^T\left(\begin{array}{ccc} 1 & 0 & 0 \\ 0 & 4 & 0 \\ 0 & 0 & -3 \end{array}\right) x = x_1^2 +4x_2^2 - 3x_3^2. \end{align*}[/latex]
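A quick numerical check of Example 3 (a NumPy sketch with an arbitrary test vector [latex]x[/latex]):

```python
import numpy as np

D = np.diag([1.0, 4.0, -3.0])   # the matrix D of Example 3
x = np.array([2.0, -1.0, 0.5])  # an arbitrary test point

q_matrix = x @ D @ x                                   # x^T diag(lambda) x
q_sum = 1.0 * x[0]**2 + 4.0 * x[1]**2 - 3.0 * x[2]**2  # sum of lambda_i x_i^2
print(q_matrix, q_sum)                                 # both equal 7.25
```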
Symmetric dyads
Another important class of symmetric matrices is that of matrices of the form [latex]uu^T[/latex], where [latex]u \in \mathbb{R}^n[/latex]. Such a matrix has elements [latex]u_i u_j[/latex], [latex]1 \le i,j \le n[/latex], and is symmetric. Matrices of this form are called symmetric dyads. (If [latex]||u||_2=1[/latex], then the dyad is said to be normalized.)
Symmetric dyads correspond to quadratic functions that are simply squared linear forms: [latex]q(x) = (u^Tx)^2[/latex].
Example: A squared linear form.
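A minimal NumPy sketch (with [latex]u[/latex] and [latex]x[/latex] chosen arbitrarily) verifying that the quadratic form of the dyad [latex]uu^T[/latex] is indeed the squared linear form [latex](u^Tx)^2[/latex]:

```python
import numpy as np

u = np.array([1.0, -2.0, 0.5])
x = np.array([3.0, 1.0, 2.0])

dyad = np.outer(u, u)   # the symmetric dyad u u^T
print(x @ dyad @ x)     # x^T (u u^T) x
print((u @ x) ** 2)     # (u^T x)^2 -- the same value (4.0 here)
```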