Gradient of a function
The gradient of a differentiable function $f : \mathbb{R}^n \to \mathbb{R}$ is the vector of the first partial derivatives of the function with respect to each variable. The gradient gives the linear (first-order) approximation of the function near a point $x_0$: $f(x) \approx f(x_0) + \nabla f(x_0)^T (x - x_0)$.
- Definition
- Examples
- Composition rule with an affine function
- Geometric interpretation
Definition
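The gradient of $f : \mathbb{R}^n \to \mathbb{R}$ at a point $x_0$, denoted $\nabla f(x_0)$, is the vector in $\mathbb{R}^n$ given by
$$\nabla f(x_0) = \begin{pmatrix} \dfrac{\partial f}{\partial x_1}(x_0) \\ \vdots \\ \dfrac{\partial f}{\partial x_n}(x_0) \end{pmatrix}.$$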
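To make the definition concrete, here is a minimal sketch that approximates each partial derivative by a central difference and then uses the result in a linear approximation. The helper name `numerical_gradient`, the step size `h`, and the test function are illustrative assumptions, not part of the text above.

```python
import numpy as np

def numerical_gradient(f, x0, h=1e-6):
    """Approximate the gradient of f at x0 by central differences."""
    x0 = np.asarray(x0, dtype=float)
    grad = np.zeros_like(x0)
    for i in range(x0.size):
        step = np.zeros_like(x0)
        step[i] = h  # perturb only coordinate i
        grad[i] = (f(x0 + step) - f(x0 - step)) / (2 * h)
    return grad

# Hypothetical test function f(x) = x_1^2 + 3 x_1 x_2,
# whose exact gradient is (2 x_1 + 3 x_2, 3 x_1).
f = lambda x: x[0] ** 2 + 3 * x[0] * x[1]
x0 = np.array([1.0, 2.0])
g = numerical_gradient(f, x0)
print(g)  # close to the exact gradient (8, 3)

# Linear approximation near x0: f(x0 + dx) ~ f(x0) + g^T dx
dx = np.array([1e-3, -2e-3])
exact = f(x0 + dx)
linear = f(x0) + g @ dx
print(exact, linear)
```

The two printed values in the last line should agree up to a term of second order in `dx`, which is what a first-order approximation promises.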
Examples:
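- Distance function: The distance function from a point $p \in \mathbb{R}^n$ to another point $x \in \mathbb{R}^n$ is defined as
$$\rho(x) = \|x - p\|_2.$$
The function is differentiable, provided $x \ne p$, which we assume. Then
$$\nabla \rho(x) = \frac{1}{\|x - p\|_2} (x - p).$$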
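- Log-sum-exp function: Consider the "log-sum-exp" function $\mathrm{lse} : \mathbb{R}^2 \to \mathbb{R}$, with values
$$\mathrm{lse}(x) = \log\left(e^{x_1} + e^{x_2}\right).$$
The gradient of $\mathrm{lse}$ at $x$ is
$$\nabla \mathrm{lse}(x) = \frac{1}{Z} \begin{pmatrix} e^{x_1} \\ e^{x_2} \end{pmatrix},$$
where $Z = e^{x_1} + e^{x_2}$. More generally, the gradient of the function $\mathrm{lse} : \mathbb{R}^n \to \mathbb{R}$ with values
$$\mathrm{lse}(x) = \log\left(\sum_{i=1}^n e^{x_i}\right)$$
is given by
$$\nabla \mathrm{lse}(x) = \frac{1}{Z} z,$$
where $z = (e^{x_1}, \ldots, e^{x_n})$, and $Z = \sum_{i=1}^n z_i$.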
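The two closed-form gradients above can be checked numerically. The following is a sketch only: the function names, the sample points, and the central-difference checker are assumptions introduced for illustration. The log-sum-exp code subtracts `max(x)` before exponentiating, a common stabilization that leaves both the value and the gradient unchanged.

```python
import numpy as np

def distance_grad(x, p):
    """Gradient of rho(x) = ||x - p||_2; only valid when x != p."""
    d = x - p
    return d / np.linalg.norm(d)

def lse(x):
    """log(sum_i exp(x_i)), shifted by max(x) to avoid overflow."""
    m = np.max(x)
    return m + np.log(np.sum(np.exp(x - m)))

def lse_grad(x):
    """Gradient of lse: z / Z with z_i = exp(x_i), Z = sum_i z_i."""
    z = np.exp(x - np.max(x))  # the common shift cancels in z / Z
    return z / np.sum(z)

def central_diff(f, x, h=1e-6):
    """Central-difference approximation of the gradient, for checking."""
    return np.array([(f(x + h * e) - f(x - h * e)) / (2 * h)
                     for e in np.eye(x.size)])

# Distance function: x - p = (3, 4) has norm 5, so the gradient is (0.6, 0.8).
p = np.array([1.0, -1.0])
x = np.array([4.0, 3.0])
rho = lambda v: np.linalg.norm(v - p)
print(distance_grad(x, p), central_diff(rho, x))

# Log-sum-exp in dimension 3: the gradient has positive entries summing to 1.
y = np.array([0.5, -1.0, 2.0])
print(lse_grad(y), central_diff(lse, y))
```

Note that the gradient of the distance function always has unit norm, and the gradient of log-sum-exp is the vector often called the softmax of $x$.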
Composition rule with an affine function
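If $A \in \mathbb{R}^{m \times n}$ is a matrix, and $b \in \mathbb{R}^m$ is a vector, the function $g : \mathbb{R}^n \to \mathbb{R}$ with values
$$g(x) = f(Ax + b)$$
is called the composition of the affine map $x \mapsto Ax + b$ with $f : \mathbb{R}^m \to \mathbb{R}$. By the chain rule, its gradient is given by
$$\nabla g(x) = A^T \nabla f(Ax + b).$$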
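As a sanity check of the composition rule, the sketch below takes $f$ to be the log-sum-exp function, draws an arbitrary $A$, $b$, and $x$ (the random seed and dimensions are assumptions for illustration), and compares $A^T \nabla f(Ax + b)$ against a central-difference estimate of the gradient of $g$.

```python
import numpy as np

def lse(y):
    """log(sum_i exp(y_i)), shifted by max(y) to avoid overflow."""
    m = np.max(y)
    return m + np.log(np.sum(np.exp(y - m)))

def lse_grad(y):
    """Gradient of lse at y: exp(y_i) / sum_j exp(y_j)."""
    z = np.exp(y - np.max(y))
    return z / np.sum(z)

# Arbitrary data: A is m x n with m = 3, n = 2.
rng = np.random.default_rng(0)
A = rng.standard_normal((3, 2))
b = rng.standard_normal(3)
x = rng.standard_normal(2)

# g(x) = f(Ax + b) with f = lse
g = lambda v: lse(A @ v + b)

# Composition rule: grad g(x) = A^T grad f(Ax + b), a vector in R^n
grad_g = A.T @ lse_grad(A @ x + b)

# Central-difference estimate of the same gradient
h = 1e-6
fd = np.array([(g(x + h * e) - g(x - h * e)) / (2 * h) for e in np.eye(2)])
print(grad_g, fd)
```

The transpose matters for the dimensions: $\nabla f(Ax + b)$ lives in $\mathbb{R}^m$, while $\nabla g(x)$ must live in $\mathbb{R}^n$.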
Geometric interpretation
Geometrically, the gradient can be read off a plot of the level sets of the function. Specifically, at any point $x$ where it is nonzero, the gradient $\nabla f(x)$ is perpendicular to the level set $\{y : f(y) = f(x)\}$ and points outwards from the sub-level set $\{y : f(y) \le f(x)\}$ (that is, it points towards higher values of the function).
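The perpendicularity claim can be illustrated on a simple case. The sketch below assumes the hypothetical function $f(x) = x_1^2 + 4 x_2^2$, whose level set $f(x) = 4$ is an ellipse that can be parametrized as $(2\cos t, \sin t)$; the parameter value `t` and the step size are arbitrary choices.

```python
import numpy as np

def f(x):
    """Hypothetical example: f(x) = x_1^2 + 4 x_2^2."""
    return x[0] ** 2 + 4 * x[1] ** 2

def grad_f(x):
    """Exact gradient of f: (2 x_1, 8 x_2)."""
    return np.array([2 * x[0], 8 * x[1]])

# A point on the level set f(x) = 4, parametrized by t
t = 0.7
x = np.array([2 * np.cos(t), np.sin(t)])

# Tangent to the level set at x: derivative of the parametrization in t
tangent = np.array([-2 * np.sin(t), np.cos(t)])

# The gradient is perpendicular to the tangent of the level set
dot = grad_f(x) @ tangent
print(dot)  # zero up to rounding error

# A small step along the gradient leaves the sub-level set {f <= 4}
x_step = x + 1e-3 * grad_f(x)
print(f(x), f(x_step))
```

The second print shows the function value increasing after a small step in the gradient direction, which matches the statement that the gradient points towards higher values of the function.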