2.5: Chain Rule
In the last two sections we discussed the derivative rules for functions arising from the classic algebraic operations on basic types of functions: addition, subtraction, multiplication, and division. There is one more type of function that we will want to know how to differentiate: composition. The so-called Chain Rule will let us find the derivative of a composition. (This is the last derivative rule we will learn in this course!)
Example 1
Find the derivative of [latex]y=\left(4x^3+15x\right)^2[/latex]
Answer:
This is not a simple polynomial, so we can’t use the basic building block rules yet. It is a product, so we could write it as
[latex]y=\left(4x^3+15x\right)^2=\left(4x^3+15x\right)\left(4x^3+15x\right)[/latex]
and then use the product rule. Or we could multiply it out and simply differentiate the resulting polynomial. Let’s try doing it the second way:
[latex]\begin{align*} y&=\left(4x^3+15x\right)^2\\ &=16x^6+120x^4+225x^2\\\\ &\Rightarrow y'=96x^5+480x^3+450x \end{align*}[/latex]
Now suppose we want to find the derivative of [latex]y=\left(4x^3+15x\right)^{20}[/latex]. We could write it as a product with 20 factors and use the product rule 19 times over, or we could multiply it out. You are likely to agree that either of the two approaches would be extremely time-consuming.
We need an easier way, a rule that will handle a composition like this, called the chain rule. The chain rule may seem a little complicated at first, but it saves us the much more complicated algebra of multiplying something like this out. It will also handle compositions where it wouldn’t be possible to multiply it out.
Students often make mistakes when applying the chain rule. Part of the reason is that the notation takes a little getting used to. And part of the reason is that students often forget to use it when they should.
When should you use the chain rule? You should use it any time when your function contains a composition of functions that cannot be dealt with through the other, simpler rules.
How should you use it?
Derivative of a composition of functions using Chain Rule
In what follows, [latex]f[/latex] and [latex]g[/latex] are differentiable functions where [latex]y = f(g(x))[/latex]. By convention, we use the letter [latex]u[/latex] to think of [latex]g(x)[/latex] as the input variable for [latex]f[/latex]. This allows us to see the function [latex]g[/latex] both as the output of the inside function and as the input of the outside function. This distinction in how we view [latex]g[/latex] is very important in the application of the chain rule as we calculate the derivative of the composition.
Chain Rule (Leibniz notation)
[latex]\frac{dy}{dx}=\frac{dy}{du}\cdot\frac{du}{dx}=\frac{du}{dx}\cdot\frac{dy}{du}[/latex]
Note that the [latex]du[/latex]’s seem to cancel. They do not! There is no division involved here! However, this is one advantage of the Leibniz notation – it can remind you of how the chain rule chains together the derivatives of the functions involved in the composition.
Without introducing the [latex]y[/latex], we can also write this as
[latex]\frac{d}{dx}f\left (g(x)\right)=\frac{df}{dg}\cdot\frac{dg}{dx}=\frac{dg}{dx}\cdot \frac{df}{dg}[/latex]
Chain Rule (using prime notation)
[latex]\frac{d}{dx} f\left(g(x)\right) =f'\left(g(x)\right)\cdot g'(x)=g'(x)\cdot f'\left(g(x)\right)[/latex]
Chain Rule (in words)
The derivative of a composition is the derivative of the outside (with the inside staying the same) times the derivative of the inside. Alternatively, we can go in the opposite direction and say that the derivative of a composition is the derivative of the inside times the derivative of the outside (with the inside staying the same).
You should recite the version in words each time you take a derivative, especially if the function is complicated.
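As one way to see the Leibniz form at work, here is a short sketch using the function from Example 1. The intermediate variable [latex]u[/latex] follows the convention introduced above: set [latex]u=4x^3+15x[/latex], so that [latex]y=u^2[/latex].

```latex
\begin{align*}
y &= u^2, \qquad u = 4x^3+15x\\
\frac{dy}{du} &= 2u, \qquad \frac{du}{dx} = 12x^2+15\\
\frac{dy}{dx} &= \frac{dy}{du}\cdot\frac{du}{dx}
              = 2u\cdot\left(12x^2+15\right)
              = 2\left(4x^3+15x\right)\left(12x^2+15\right)
\end{align*}
```

Note that the last step replaces [latex]u[/latex] with [latex]4x^3+15x[/latex], so the final answer is written in terms of [latex]x[/latex] only.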
Video Demonstration
Compositions of Functions and the Chain Rule
© 2014 Eric Bancroft
Example 2
Find the derivative of [latex]y=\left(4x^3+15x\right)^2[/latex]
Answer: [latex]y'=?[/latex]
This is the same function as in Example 1, which we differentiated by first multiplying out the square. This time, let’s use the Chain Rule: The inside function is what appears inside the parentheses: [latex]4x^3+15x[/latex]. The outside function is the first thing we find as we come in from the outside – it’s the square function, [latex]({inside})^2[/latex].
In [latex]y=\left(4x^3+15x\right)^2[/latex], the inside function is [latex]4x^3+15x[/latex], which is then squared, so the square is the outside function.
The derivative of this outside function with respect to the inside function is [latex]2\cdot\left({inside}\right)[/latex]. The derivative of the inside with respect to [latex]x[/latex] is [latex]4\cdot 3x^{3-1}+15[/latex]. Now using the chain rule, the derivative of our original function is the derivative of the outside with respect to the inside times the derivative of the inside with respect to [latex]x[/latex] :
[latex]y=\overbrace{\left(\underbrace{4x^3+15x}_{inside}\right)^2}^{outside}[/latex]
[latex]\Rightarrow y'=\overset{\text{derivative of outside wrt inside}}{2\left(4x^3+15x\right)^{2-1}}\cdot\underset{\text{derivative of inside wrt }x}{\left(4\cdot 3x^{3-1}+15 \right)}=2\left(4x^3+15x\right)\left(12x^2+15 \right)[/latex]
Alternatively, we could have gone in the in-then-out direction:
[latex]y=\overbrace{\left(\underbrace{4x^3+15x}_{inside}\right)^2}^{outside}[/latex]
[latex]\Rightarrow y'=\overset{\text{derivative of inside wrt }x}{\left(4\cdot 3x^{3-1}+15\right)}\cdot \underset{\text{derivative of outside wrt inside}}{2\left(4x^3+15x\right)^{2-1}}=2\left(12x^2+15 \right)\left(4x^3+15x\right)[/latex]
Note that the answer using this approach is the same as the answer in Example 1 if we multiply it out.
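If you would like to confirm this without expanding by hand, a short script can compare the two answers. This is only a sanity check, not part of the method; the function names `dy_expanded` and `dy_chain` are made up for this illustration.

```python
# Compare the two answers for y = (4x^3 + 15x)^2 at a few integer inputs.

def dy_expanded(x):
    # Example 1: derivative of the expanded polynomial 16x^6 + 120x^4 + 225x^2.
    return 96 * x**5 + 480 * x**3 + 450 * x

def dy_chain(x):
    # Example 2: derivative of the outside times derivative of the inside.
    return 2 * (4 * x**3 + 15 * x) * (12 * x**2 + 15)

for x in range(-3, 4):
    # Integer arithmetic is exact, so the two values must agree exactly.
    assert dy_expanded(x) == dy_chain(x)
    print(x, dy_chain(x))
```

Agreement at a handful of points is not a proof, but two different polynomials of degree 5 can agree at no more than five points, so matching at seven points is convincing here.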
Let’s now try the more complicated function we considered earlier.
Example 3
Find the derivative of [latex]y=\left(4x^3+15x\right)^{20}[/latex].
Answer: [latex]y'=?[/latex]
We note that this function is a composition of two functions: a power function applied to a polynomial.
Therefore, the derivative can be calculated using the chain rule, where the outside function is the power function and the inside function is the polynomial. Since the derivative of a composite function is the derivative of the outside times the derivative of the inside, we have:
[latex]y'=\overset{out'}{20\left(4x^3+15x\right)^{20-1}}\cdot \overset{in'}{\left(4\cdot 3x^{3-1}+15\right)}=20\left(4x^3+15x\right)^{19}\cdot \left(12x^{2}+15\right)[/latex]
Alternatively, going the in-then-out route,
[latex]y'=\overset{in'}{\left(4\cdot 3x^{3-1}+15\right)}\cdot\overset{out'}{20\left(4x^3+15x\right)^{20-1}}=20\left(12x^{2}+15\right)\left(4x^3+15x\right)^{19}[/latex]
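Multiplying out the 20th power to double-check this answer is impractical, but a numerical check is easy. The sketch below compares the chain rule result, with the derivative of the inside simplified to [latex]12x^2+15[/latex], against a central difference approximation of the slope. The helper `central_difference`, the step size, and the test point are choices made for this illustration.

```python
import math

def y(x):
    return (4 * x**3 + 15 * x) ** 20

def dy_chain(x):
    # out' = 20 * (inside)^19, in' = 12x^2 + 15
    return 20 * (4 * x**3 + 15 * x) ** 19 * (12 * x**2 + 15)

def central_difference(f, x, h=1e-6):
    # Slope of the secant line through (x - h, f(x - h)) and (x + h, f(x + h)).
    return (f(x + h) - f(x - h)) / (2 * h)

x0 = 0.1  # a small input keeps the 20th power at a manageable size
exact = dy_chain(x0)
approx = central_difference(y, x0)
print(exact, approx)
print(math.isclose(exact, approx, rel_tol=1e-6))
```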
Chain Rule summarized:
What – When – How
What: Chain Rule is a rule we apply to determine the derivative of a composite function.
When: We apply the chain rule on the components of a given function that are made of composite layers of functions.
How: We apply the chain rule by multiplying sequentially the derivatives of each layer of the composite function with respect to the next inner layer.
Example 4
Differentiate [latex]y=e^{x^2+5}[/latex].
Answer: [latex]y'=?[/latex]
This isn’t a simple exponential function because the exponent is not simply the variable – the exponent itself is another function (a polynomial). As a result, [latex]y[/latex] is a composition of functions. Hence we will be applying the chain rule to calculate the derivative, which means we first have to identify the inside and the outside function.
To determine the inside function and the outside function, let’s analyze what happens to our input variable. To produce the value for [latex]y[/latex], our variable [latex]x[/latex] is first used in a polynomial, which is [latex]x^2+5[/latex]. This tells us that [latex]x^2+5[/latex] is the inside function. Then the polynomial is put into the exponent of [latex]e[/latex], so [latex]e^{\text{inside}}[/latex] is the outside function.
Now we can use the chain rule: We want the derivative of the outside times the derivative of the inside. The outside is the [latex]e^{something}[/latex], so its derivative with respect to [latex]something[/latex] is the same thing. The derivative of what’s inside is [latex]2x^{2-1}+0[/latex]. So
[latex]\frac{d}{dx}\left( e^{x^2+5} \right)=\overset{out'}{ \left( e^{x^2+5} \right)}\cdot \overset{in'}{\left(2x^{2-1}+0\right)}=2xe^{x^2+5}[/latex]
Video Demonstration
More Chain Rule Examples
© 2014 Eric Bancroft
How Chain Rule got its name
So far we only considered compositions of functions with two layers. We can, of course, have compositions that have many more layers, say:
[latex]f(x)=\left(f_n\circ f_{n-1}\circ\ldots\circ{f_2}\circ f_1\right)(x)[/latex]
In this case, using the chain rule, we have that
[latex]\frac{df}{dx}=\frac{df_1}{dx}\cdot\frac{df_2}{df_1}\cdot\frac{df_3}{df_2}\cdot \ldots\cdot\frac{df_n}{df_{n-1}}[/latex]
In other words, a derivative of a composite function is a multiplicative chain of the derivatives of each layer in the composition.
Let’s consider one such example…
Example 5
Let [latex]f(x)=4\cdot 2^{\left(5x^5-x+7\right)^{13}}[/latex]. Find [latex]f'(x)[/latex].
Answer: [latex]f'(x)=?[/latex]
By analyzing
[latex]f(x)=4\cdot 2^{\left(5x^5-x+7\right)^{13}}[/latex]
we can see that [latex]f(x)[/latex] is a composite function.
We can split it into layers in a couple of ways and one of them is:
- outside: [latex]4\cdot 2^{(\ldots)}[/latex] (constant multiple of exponential fn applied to …)
- middle: [latex](\ldots)^{13}[/latex] (power fn applied to …)
- inside: [latex]5x^5-x+7[/latex] (polynomial)
Let’s apply the chain rule going in-towards-out:
[latex]\begin{align*} f'(x)&=\overset{in'}{\left(5\cdot 5x^{5-1}-1+0\right)}\cdot \overset{mid'}{\left(13\left(5x^5-x+7\right)^{13-1}\right)}\cdot \overset{out'}{\left(4\cdot \ln(2)\cdot 2^{\left(5x^5-x+7\right)^{13}}\right)}\\\\ &=52\ln(2)\left(25x^4-1\right)\left(5x^5-x+7\right)^{12}\cdot 2^{\left(5x^5-x+7\right)^{13}} \end{align*}[/latex]
Important note: Constant Multiple Rule is just a special case of Chain Rule. This is because a constant multiple of a function is simply a vertical stretch of that function, which is a composition of multiplication by a constant applied to the function. As a result, when multiplication by a constant is one of the layers in the given composition of functions, we can join it with the next layer down through use of Constant Multiple Rule, as we did in the previous example.
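The "multiplicative chain" idea can be sketched in code: store each layer together with its derivative, pass the input through the layers from the inside out, and multiply the layer derivatives along the way. The sketch below does this for the three layers of Example 5, using the fact that the derivative of [latex]4\cdot 2^{u}[/latex] with respect to [latex]u[/latex] is [latex]4\ln(2)\cdot 2^{u}[/latex]. The names `layers`, `chain_derivative`, and `central_difference`, and the test point, are assumptions made for this illustration.

```python
import math

LN2 = math.log(2)

# Each layer is a pair (function, derivative), listed from the inside out.
layers = [
    (lambda x: 5 * x**5 - x + 7, lambda x: 25 * x**4 - 1),   # inside
    (lambda u: u**13, lambda u: 13 * u**12),                 # middle
    (lambda v: 4 * 2**v, lambda v: 4 * LN2 * 2**v),          # outside
]

def chain_derivative(layers, x):
    """Return (value, derivative) of the composition of the layers at x."""
    value, product = x, 1.0
    for func, deriv in layers:
        product *= deriv(value)  # derivative of this layer at its own input
        value = func(value)      # the output becomes the next layer's input
    return value, product

def f(x):
    return 4 * 2 ** ((5 * x**5 - x + 7) ** 13)

def central_difference(func, x, h=1e-6):
    return (func(x + h) - func(x - h)) / (2 * h)

x0 = -1.07  # chosen so that the exponent stays small
value, exact = chain_derivative(layers, x0)
approx = central_difference(f, x0)
print(value, exact, approx)
```

Each layer's derivative is evaluated at that layer's own input, which is exactly what "with the inside staying the same" means in the verbal form of the rule.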
Example 6
If 2400 people are currently using an app, and the number of people using the app appears to double every three years, then the number of people expected to be using the app in [latex]t[/latex] years is [latex]n(t)=2400\cdot 2^{t/3}[/latex].
a. How many people are expected to be using the app in two years?
b. When is the number of app users expected to be 50,000?
c. How fast is the number of app users expected to grow now and two years from now?
Answer:
a. Since [latex]n(2) = 2400\cdot 2^{2/3} \approx 3810[/latex], in two years, the expected number of app users is 3,810.
b. [latex]t=?[/latex] when [latex]n(t) = 50,000[/latex]
Therefore
[latex]\begin{align*} 50,000 &= 2400\cdot 2^{t/3}\\\\ &\Rightarrow \frac{50000}{2400}=2^{t/3} \Rightarrow\ln\left(\frac{50000}{2400}\right)=\ln\left(2^{t/3}\right)\\ &\Rightarrow\ln\left(\frac{50000}{2400}\right)=\frac{t}{3}\cdot \ln(2)\Rightarrow t=\frac{3\ln\left(\frac{50000}{2400}\right)}{\ln(2)}\approx 13.14\text{ years} \end{align*}[/latex]
Therefore we expect 50,000 people to be using the app about 13.14 years from now.
c. [latex]n'(0)=?, n'(2)=?[/latex]
By analyzing, we see that the function [latex]n(t)=2400\cdot 2^{t/3}[/latex] is a composition of functions: a constant multiple of an exponential function applied to a linear function. Hence, to find its derivative we must use the chain rule. The inside function is [latex]\frac{1}{3}t[/latex] and the outside function is [latex]2400\cdot 2^{(inside)}[/latex].
[latex]\begin{align*} \frac{dn}{dt} & = \frac{d}{dt}\left(2400\cdot 2^{t/3}\right) \\ & =\overset{out'}{ 2400\cdot \ln(2)\cdot 2^{t/3}}\cdot \overset{in'}{\frac{1}{3}} \\ & =800\ln(2)\cdot 2^{t/3} \end{align*}[/latex]
Therefore
[latex]\begin{align*} n'(0)&=800\ln(2)\cdot 2^{0/3}\approx 554.5\\ n'(2)&=800\ln(2)\cdot 2^{2/3}\approx 880 \end{align*}[/latex]
So currently the number of app users is growing by approximately 554.5 users per year while in two years it is expected to be increasing by approximately 880 users per year.
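The three parts of this example can be reproduced with a few lines of code. This is a minimal sketch: the function names `n` and `n_prime` and the variable names are chosen for this illustration, and the formulas are the ones derived above.

```python
import math

def n(t):
    # Expected number of app users t years from now.
    return 2400 * 2 ** (t / 3)

def n_prime(t):
    # Chain rule: 2400 * ln(2) * 2^(t/3) * (1/3) = 800 * ln(2) * 2^(t/3)
    return 800 * math.log(2) * 2 ** (t / 3)

# a. users expected in two years
users_in_two_years = n(2)

# b. solve 50000 = 2400 * 2^(t/3) for t using logarithms
years_to_50000 = 3 * math.log(50000 / 2400) / math.log(2)

# c. growth rates now and in two years (users per year)
print(round(users_in_two_years))
print(round(years_to_50000, 2))
print(round(n_prime(0), 1), round(n_prime(2), 1))
```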
Derivatives of Complicated Functions
You’re now ready to take the derivative of some mighty complicated functions. But how do you tell what rule applies first? One way is to work your way in from the outside – which operation do you encounter first? What is the derivative rule that applies to that operation? Use the Product, Quotient, and Chain Rules to peel off the layers, one at a time, until you’re all the way inside.
Example 7
Find [latex]\frac{d}{dx}\left( e^{3x}\cdot\ln(5x+7) \right)[/latex].
Answer: Let’s first analyze the function.
Looking at the whole expression, we see that this is a product of two compositions of functions. So we’ll need the product rule first and, within it, we’ll need to apply the chain rule. We can do this step by step, by first finding the derivatives of each factor in the product and then applying the product rule. Or we can do it all in one shot. We will do the latter.
[latex]\begin{align*} \frac{d}{dx}\left( e^{3x}\cdot\ln(5x+7) \right)&=\overset{first'}{\left(\overset{in'}{3}\cdot \overset{out'}{e^{3x}}\right)}\cdot\overset{second}{\overset{}{\overset{}{\ln(5x+7)}}} + \overset{first}{\overset{}{\overset{}e^{3x}}}\cdot\overset{second'}{\left(\overset{in'}{(5+0)}\cdot\overset{out'}{\frac{1}{5x+7}}\right)}\\\\ &=e^{3x}\left(3\ln(5x+7)+\frac{5}{5x+7}\right) \end{align*}[/latex]
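When several rules are combined, a quick numerical comparison can catch a dropped factor. The sketch below checks the final expression above against a central difference approximation; the helper name, step size, and test points are choices made for this illustration.

```python
import math

def f(x):
    return math.exp(3 * x) * math.log(5 * x + 7)

def df(x):
    # Product rule, with the chain rule applied inside each factor.
    return math.exp(3 * x) * (3 * math.log(5 * x + 7) + 5 / (5 * x + 7))

def central_difference(func, x, h=1e-6):
    return (func(x + h) - func(x - h)) / (2 * h)

for x0 in (0.0, 1.0):
    print(x0, df(x0), central_difference(f, x0))
```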
Example 8
Differentiate [latex]z=\left(\dfrac{3t^3}{e^t(t-1)}\right)^4[/latex]
Answer: [latex]z'(t)=?[/latex]
Don’t panic! Let’s analyze.
As we look at the function, we can see that we have some complicated thing, raised to the power of 4. That tells us that we are dealing with a composition of functions: to find the value of the function, the thing in the bracket gets calculated first, then it is raised to the power of 4. So the inside function is what is inside the bracket. This tells us that we will have to use the chain rule. In order to use the chain rule, we will have to calculate the derivative of the outside function (power function) and the derivative of the inside function (the complicated thing).
Now, when we look at the inside function, we can see that it is a quotient, or a fraction, of two functions. This means we will have to use the quotient rule when we differentiate the inside function. This quotient is made of a polynomial in the numerator and a product of functions in the denominator, for which we will have to use the product rule.
We can go two ways: step by step or all in one shot. Let’s do both, starting with the latter. (Tip: start with reading “one-shot”, then read “step-wise”, then circle back and re-read “one-shot” – ideally you want to get yourself to the stage where you can do even complicated things such as this all in one shot.)
All in one shot:
[latex]\begin{align*} \frac{dz}{dt}&=\overset{out'}{\overset{}{\overset{}{4\left(\dfrac{3t^3}{e^t(t-1)}\right)^{3}}}}\cdot\overset{in'}{\frac{\overset{top'}{\overset{}{\overset{}{(9t^{2})}}}\cdot \overset{bottom}{\overset{}{\overset{}{(e^t(t-1))}}}-\overset{top}{\overset{}{\overset{}{(3t^3)}}}\cdot\overset{bottom'}{\left(\overset{first'}{e^t}\cdot\overset{second}{(t-1)}+\overset{first}{e^t}\cdot\overset{second'}{(1)}\right)}}{\underset{bottom^2}{\left(e^t(t-1)\right)^2}}}\\\\ &\overset{clean-up}{=}\ldots \text{(not necessary for our purposes)} \end{align*}[/latex]
Step by step:
Step One: Use the chain rule. The derivative of the outside times the derivative of the inside:
[latex]\frac{dz}{dt}=\frac{d}{dt}\left(\frac{3t^3}{e^t(t-1)}\right)^4=4\left(\frac{3t^{3}}{e^t(t-1)}\right)^3\cdot \frac{d}{dt}\left(\frac{3t^3}{e^t(t-1)}\right)[/latex]
Now we’re one step in, and we can concentrate on just the [latex]\frac{d}{dt}\left(\frac{3t^3}{e^t(t-1)}\right)[/latex]. This is the derivative of a quotient of two functions.
Step Two: Use the Quotient Rule. The derivative of the numerator is straightforward, so we can just calculate it. The derivative of the denominator uses product rule, so we’ll leave it for now:
[latex]\frac{d}{dt}\left(\frac{3t^3}{e^t(t-1)}\right)=\frac{\left( 9t^2 \right)\left( e^t(t-1) \right)-\left( 3t^3 \right)\left( \frac{d}{dt}\left( e^t(t-1) \right) \right)}{\left(e^t(t-1)\right)^2}[/latex]
Now we’ve gone one more step in, and we can concentrate on just the derivative of the denominator, [latex]\frac{d}{dt}\left( e^t(t-1) \right)[/latex], which involves a product.
Step Three: Using the product rule,
[latex]\frac{d}{dt}\left( e^t(t-1)\right) = \left( e^t \right)(t-1)+\left( e^t \right)(1)[/latex]
And now we’re all the way in – no more derivatives to take!
Step Four: Now it’s just a question of substituting back – be careful now!
[latex]\frac{d}{dt}\left( e^t(t-1)\right) = \left( e^t \right)(t-1)+\left( e^t \right)(1)[/latex]
so
[latex]\frac{d}{dt}\left(\frac{3t^3}{e^t(t-1)}\right)=\frac{\left( 9t^2 \right)\left( e^t(t-1) \right)-\left( 3t^3 \right)\left( \left( e^t \right)(t-1)+\left( e^t \right)(1) \right)}{\left(e^t(t-1)\right)^2}[/latex]
so
[latex]\begin{align*} \frac{dz}{dt}&=\frac{d}{dt}\left(\frac{3t^3}{e^t(t-1)}\right)^4\\\\ &=4\left(\frac{3t^3}{e^t(t-1)}\right)^3\cdot \left( \frac{\left( 9t^2 \right)\left( e^t(t-1) \right)-\left( 3t^3 \right)\left( \left( e^t \right)(t-1)+\left( e^t \right)(1) \right)}{\left(e^t(t-1)\right)^2} \right) \end{align*}[/latex]
Phew!
Now circle back to the “one-shot” approach and see if your reasoning through it is clearer. Can you work it out yourself trying both ways?
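The step-wise approach translates almost line for line into code, which may help you see how the pieces nest. In this sketch the intermediate names (`top`, `bottom`, `d_inside`, and so on) are invented for illustration, and the result is compared with a central difference approximation at points where [latex]t\neq 1[/latex].

```python
import math

def z(t):
    return (3 * t**3 / (math.exp(t) * (t - 1))) ** 4

def dz(t):
    top, d_top = 3 * t**3, 9 * t**2
    # Step Three: product rule for the denominator e^t (t - 1).
    bottom = math.exp(t) * (t - 1)
    d_bottom = math.exp(t) * (t - 1) + math.exp(t) * 1
    # Step Two: quotient rule for the inside function.
    d_inside = (d_top * bottom - top * d_bottom) / bottom**2
    # Step One: chain rule, 4 * (inside)^3 times the derivative of the inside.
    return 4 * (top / bottom) ** 3 * d_inside

def central_difference(func, t, h=1e-6):
    return (func(t + h) - func(t - h)) / (2 * h)

for t0 in (2.0, 3.0):
    print(t0, dz(t0), central_difference(z, t0))
```

Notice that the code computes the innermost derivative first and the chain rule last, which is the order of Step Four (substituting back), while the reasoning on paper peels the layers from the outside in.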
Example 9
The normalised sinc function [latex]\text{sinc}(x)[/latex], also called the “sampling function”, is a function that arises frequently in digital signal processing and information theory:
[latex]\text{sinc}(x)=\frac{\sin(\pi x)}{\pi x}[/latex]
Find the derivative function of [latex]\text{sinc}(x)[/latex]. Use Desmos to graph the function and compare the behaviour of the function to its derivative.
Answer: [latex]\frac{d}{dx}(\text{sinc} (x))=?[/latex]
By analyzing [latex]\frac{\sin(\pi x)}{\pi x}[/latex], we see this is a quotient of two functions: the sine function applied to a linear function, and a linear function. Thus, we will have to apply the quotient rule and, within it, the chain rule, the sine function rule, and the power rule.
[latex]\begin{align*} \frac{d}{dx}(\text{sinc}(x))&=\frac{d}{dx}\left(\frac{\sin(\pi x)}{\pi x}\right)=\frac{\pi\cos(\pi x)\cdot \pi x-\sin(\pi x)\cdot \pi}{(\pi x)^2}\\\\ &=\frac{\pi^2 x\cos(\pi x)-\pi\sin(\pi x)}{\pi^2x^2}=\frac{\pi\left(\pi x\cos(\pi x)-\sin(\pi x)\right)}{\pi^2x^2}\\\\ &=\frac{\pi x\cos(\pi x)-\sin(\pi x)}{\pi x^2} \end{align*}[/latex]
Desmos visualization: link
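Alongside the graph, the derivative formula can be spot-checked numerically. The sketch below evaluates the simplified derivative at a few nonzero inputs and compares it with a central difference approximation; the function names and test points are assumptions made for this illustration, and the formula is only used for [latex]x\neq 0[/latex].

```python
import math

def sinc(x):
    # Normalised sinc, for x != 0.
    return math.sin(math.pi * x) / (math.pi * x)

def dsinc(x):
    # Simplified result of the quotient rule and chain rule, for x != 0.
    return (math.pi * x * math.cos(math.pi * x) - math.sin(math.pi * x)) / (math.pi * x**2)

def central_difference(func, x, h=1e-6):
    return (func(x + h) - func(x - h)) / (2 * h)

for x0 in (0.5, 1.0, 2.5):
    print(x0, dsinc(x0), central_difference(sinc, x0))
```

At [latex]x=1[/latex] the function crosses zero while its slope is close to [latex]-1[/latex], which you can also read off the graph.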
What if the Derivative Doesn’t Exist?
Differentiable functions
We’ve been acting as if derivatives exist everywhere for every function. This is true for most of the functions that you will run into in this course. But there are some common places where the derivative doesn’t exist.
Remember that the derivative is the slope of the tangent line to the curve. That’s what we need to think about.
Where can a slope not exist? First of all, the slope can’t exist where the function is not defined. So the derivative will not exist at values that are outside of the domain of the function.
Second, if the tangent line is vertical, its slope is not defined, and so the derivative at the point that has a vertical tangent line will not exist.
Example 10
Show that [latex]f(x)=\sqrt[3]{x}=x^{1/3}[/latex] is not differentiable at [latex]x = 0[/latex].
Answer: Proof that [latex]f'(0)[/latex] does not exist?
At [latex]x=0[/latex], [latex]f(x)=0^{1/3}=0[/latex] and so the function is defined at 0.
Finding the derivative, we get [latex]f'(x)=\frac{1}{3}x^{-2/3}=\frac{1}{3x^{2/3}}[/latex]. At [latex]x = 0[/latex], the derivative function is undefined.
From the graph, we can see that the tangent line to this curve at [latex]x = 0[/latex] is vertical with undefined slope, which is why the derivative does not exist at [latex]x = 0[/latex].
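The vertical tangent can also be seen numerically. The sketch below computes the slope of the secant line between [latex]0[/latex] and [latex]h[/latex] for smaller and smaller [latex]h[/latex]; the helper names and the chosen step sizes are assumptions for this illustration.

```python
def cube_root(x):
    # Real cube root; abs() avoids complex results for negative x.
    return abs(x) ** (1 / 3) * (1 if x >= 0 else -1)

def difference_quotient(h):
    # Slope of the secant line through (0, f(0)) and (h, f(h)).
    return (cube_root(0 + h) - cube_root(0)) / h

for h in (1e-1, 1e-3, 1e-6, -1e-6):
    print(h, difference_quotient(h))
```

The slopes keep growing as [latex]h[/latex] shrinks, from either side, instead of settling on a finite number, which is what a vertical tangent line looks like in the limit definition of the derivative.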
We could also have a case where the tangent line does not exist at all. Here are a couple of examples when this happens:
- If there is a sharp corner (cusp) in the graph, the derivative will not exist at that point because there is no well-defined tangent line (a teetering tangent, if you will).
- If there is a discontinuity in the graph (a jump, a break, a hole in the graph, or a vertical asymptote), the tangent line will be different on either side and the derivative will not exist at that point.
Example 11
Show that [latex]f(x)=|x|[/latex] is not differentiable at [latex]x = 0[/latex].
Answer: Proof that [latex]f'(0)[/latex] does not exist?
On the left side of the graph, the slope of the line is -1. On the right side of the graph, the slope is +1. There is no well-defined tangent line at the sharp corner at [latex]x = 0[/latex], so the function is not differentiable at that point.
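The disagreement between the two sides can be checked with one-sided difference quotients. In this sketch, `one_sided_slopes` is a hypothetical helper written for this illustration; it is applied to the absolute value function at [latex]x=0[/latex] and, for contrast, to a smooth function.

```python
def one_sided_slopes(f, a, h=1e-6):
    # Secant slopes using a point just to the left and just to the right of a.
    left = (f(a) - f(a - h)) / h
    right = (f(a + h) - f(a)) / h
    return left, right

# Sharp corner: the two slopes disagree, so no single tangent slope exists.
print(one_sided_slopes(abs, 0.0))

# Smooth function for comparison: x^2 at x = 1, where both slopes are close to 2.
print(one_sided_slopes(lambda x: x**2, 1.0))
```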
Video Demonstration
When is a function not differentiable?
© 2014 Eric Bancroft
Section Exercises
Work on the following exercises. Discuss your solutions with your peers and/or course instructor.
IC4NITS Exercises 2.5 – Chain Rule