
Linear Maps

  • Definitions and interpretation
  • First-order approximation of non-linear maps

Definition and Interpretation

Definition

A map $f: \mathbf{R}^n \rightarrow \mathbf{R}^m$ is linear (resp. affine) if and only if each of its components $f_i$, $i=1, \ldots, m$, is linear (resp. affine). The formal definition we saw for functions applies verbatim to maps.
To an $m \times n$ matrix $A$, we can associate a linear map $f: \mathbf{R}^n \rightarrow \mathbf{R}^m$, with values $f(x)=A x$. Conversely, to any linear map, we can uniquely associate a matrix $A$ which satisfies $f(x)=A x$ for every $x$.
Indeed, if the components $f_i$, $i=1, \ldots, m$, of $f$ are linear, then each can be expressed as $f_i(x)=a_i^T x$ for some $a_i \in \mathbf{R}^n$. The matrix $A$ is the matrix that has $a_i^T$ as its $i$-th row:
$$
f(x)=\left(\begin{array}{c}
f_1(x) \\
\vdots \\
f_m(x)
\end{array}\right)=\left(\begin{array}{c}
a_1^T x \\
\vdots \\
a_m^T x
\end{array}\right)=A x, \quad \text { with } A:=\left(\begin{array}{c}
a_1^T \\
\vdots \\
a_m^T
\end{array}\right) \in \mathbf{R}^{m \times n} .
$$
Hence, there is a one-to-one correspondence between matrices and linear maps. This extends what we saw for vectors, which are in one-to-one correspondence with linear functions.
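As a small illustration (the map below is chosen only for this example and is not part of the original text), take $f: \mathbf{R}^3 \rightarrow \mathbf{R}^2$ with components $f_1(x)=x_1+2 x_3$ and $f_2(x)=3 x_2-x_3$. Reading off the coefficients gives $a_1=(1,0,2)$ and $a_2=(0,3,-1)$, so that
$$
f(x)=\left(\begin{array}{c}
a_1^T x \\
a_2^T x
\end{array}\right)=A x, \quad \text { with } A=\left(\begin{array}{ccc}
1 & 0 & 2 \\
0 & 3 & -1
\end{array}\right) \in \mathbf{R}^{2 \times 3} .
$$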
This is summarized as follows.
Representation of affine maps via the matrix-vector product. A function $f: \mathbf{R}^n \rightarrow \mathbf{R}^m$ is affine if and only if it can be expressed via a matrix-vector product:
$$
f(x)=A x+b,
$$
for some unique pair $(A, b)$, with $A \in \mathbf{R}^{m \times n}$ and $b \in \mathbf{R}^m$. The function is linear if and only if $b=0$. $\diamond$
The result above shows that a matrix can be seen as a (linear) map from the “input” space $\mathbf{R}^n$ to the “output” space $\mathbf{R}^m$. Both points of view (matrices as simple collections of vectors, or as linear maps) are useful.
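The representation result can be checked numerically. The sketch below assumes an affine map is available only as a black-box function, and recovers the pair $(A, b)$ from the identities $b=f(0)$ and $A e_j=f(e_j)-f(0)$, where $e_j$ is the $j$-th standard basis vector. The names `affine_representation` and `example_map` are hypothetical and serve only this illustration.

```python
def affine_representation(f, n):
    """Return (A, b) with f(x) = A x + b, assuming f is affine on R^n."""
    zero = [0.0] * n
    b = list(f(zero))                      # b = f(0)
    m = len(b)
    A = [[0.0] * n for _ in range(m)]
    for j in range(n):
        # j-th standard basis vector e_j
        e_j = [1.0 if k == j else 0.0 for k in range(n)]
        col = f(e_j)                       # f(e_j) = A e_j + b
        for i in range(m):
            A[i][j] = col[i] - b[i]        # j-th column of A
    return A, b


def example_map(x):
    # Hypothetical affine map from R^3 to R^2, used only for illustration.
    return [x[0] + 2.0 * x[2] + 1.0, 3.0 * x[1] - x[2] - 4.0]


A, b = affine_representation(example_map, 3)
print(A)
print(b)
```

If the input function is not actually affine, the returned pair is meaningless; the sketch makes no attempt to detect this.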

Interpretations

Consider an affine map $x \mapsto y=A x+b$. The element $A_{i j}$ is the coefficient of influence of $x_j$ on $y_i$: a unit change in $x_j$ changes $y_i$ by $A_{i j}$. In this sense, if $|A_{13}| \gg |A_{14}|$, we can say that $x_3$ has much more influence on $y_1$ than $x_4$ does. Likewise, $A_{24}=0$ says that $y_2$ does not depend on $x_4$ at all. The constant term $b=f(0)$ is often referred to as the “bias” vector.
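The "coefficient of influence" reading can be seen by perturbing one input at a time. In the sketch below, the matrix `A`, the vector `b`, and the helper names are assumptions made for illustration: the entries are chosen so that $A_{13}$ is much larger than $A_{14}$ and $A_{24}=0$.

```python
def affine_map(A, b, x):
    """Evaluate y = A x + b for A given as a list of rows."""
    return [sum(a_ij * x_j for a_ij, x_j in zip(row, x)) + b_i
            for row, b_i in zip(A, b)]


def effect_of_input(A, b, x, j, delta=1.0):
    """Change in y when x_j (0-based index j) is increased by delta."""
    x_new = list(x)
    x_new[j] += delta
    y_old = affine_map(A, b, x)
    y_new = affine_map(A, b, x_new)
    return [u - v for u, v in zip(y_new, y_old)]


# Hypothetical data: A_13 = 5 is much larger than A_14 = 0.5, and A_24 = 0.
A = [[1.0, 0.0, 5.0, 0.5],
     [2.0, 1.0, 0.0, 0.0]]
b = [0.0, 1.0]
x = [1.0, 1.0, 1.0, 1.0]

change_from_x3 = effect_of_input(A, b, x, j=2)  # third column of A
change_from_x4 = effect_of_input(A, b, x, j=3)  # fourth column of A
print(change_from_x3, change_from_x4)
```

For a unit perturbation, the change in $y$ is exactly the corresponding column of $A$, so the second output is unaffected by $x_4$ here.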

First-order approximation of non-linear maps

Since maps are just collections of functions, we can approximate a map with a linear (or affine) map, just as we did with functions. If $f: \mathbf{R}^n \rightarrow \mathbf{R}^m$ is differentiable, then we can approximate the (vector) values of $f$ near a given point $x_0 \in \mathbf{R}^n$ by an affine map $\tilde{f}$:
$$
f(x) \approx \tilde{f}(x):=f\left(x_0\right)+A\left(x-x_0\right),
$$
where $A_{i j}=\frac{\partial f_i}{\partial x_j}\left(x_0\right)$ is the partial derivative of the $i$-th component of $f$ with respect to $x_j$, evaluated at $x_0$. ($A$ is referred to as the Jacobian matrix of $f$ at $x_0$.)
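A rough numerical sketch of this approximation follows. It assumes the Jacobian is estimated by central finite differences rather than computed analytically, and the map `example_map`, the step size `h`, and the function names are all hypothetical choices made for illustration.

```python
import math


def numerical_jacobian(f, x0, h=1e-6):
    """Estimate A_ij = d f_i / d x_j at x0 by central differences."""
    n = len(x0)
    m = len(f(x0))
    A = [[0.0] * n for _ in range(m)]
    for j in range(n):
        x_plus = list(x0)
        x_minus = list(x0)
        x_plus[j] += h
        x_minus[j] -= h
        f_plus, f_minus = f(x_plus), f(x_minus)
        for i in range(m):
            A[i][j] = (f_plus[i] - f_minus[i]) / (2.0 * h)
    return A


def affine_approximation(f, x0):
    """Return the map x -> f(x0) + A (x - x0), with A the estimated Jacobian."""
    f0 = f(x0)
    A = numerical_jacobian(f, x0)

    def f_tilde(x):
        return [f0[i] + sum(A[i][j] * (x[j] - x0[j]) for j in range(len(x0)))
                for i in range(len(f0))]

    return f_tilde


def example_map(x):
    # Hypothetical non-linear map from R^2 to R^2, used only for illustration.
    return [x[0] ** 2 + x[1], math.sin(x[0]) * x[1]]


x0 = [1.0, 2.0]
A = numerical_jacobian(example_map, x0)
f_tilde = affine_approximation(example_map, x0)

x_near = [1.01, 1.98]
error = max(abs(u - v) for u, v in zip(example_map(x_near), f_tilde(x_near)))
print(A)
print(error)
```

For this particular map the exact Jacobian at $x_0=(1,2)$ has rows $(2,1)$ and $(2 \cos 1, \sin 1)$, which the finite-difference estimate should match closely. The approximation error shrinks roughly quadratically as $x$ approaches $x_0$.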

License

Hyper-Textbook: Optimization Models and Applications Copyright © by L. El Ghaoui. All Rights Reserved.