I’m soon going to embark on teaching the chain rule in calculus. I have found ways to help kids remember the chain rule (“the outer function is the mama, the inner function is the baby… when you take the derivative, you derive the mama and leave the baby inside, and then you multiply by the derivative of baby”), ways to write things down so their information stays organized, and I have shown them enough patterns to let them *see* it’s true. **But I have never yet found a way to conceptually get them to understand it without confusing them.** (The gear thing doesn’t help *me* get it… Although I understand the analogy, it feels divorced from the actual functions themselves… and these functions have a constant rate of change.)

I think I now have a way that might help students conceptually understand what’s going on. I only had the insight 10 minutes ago so I’m going to use this blogpost to see if I can’t get the ideas straight in my head… The point of this post is *not* to share a way I’ve made the chain rule understandable. It’s for me to work through some unformed ideas. I am not yet sure if I have a way to turn this into something that my kids will understand.

So here’s where I’m starting from. Every “nice” function (and those are the functions we’re dealing with) is basically like an infinite number of little line segments connected together. Thus, when we take a derivative, we’re pretty much just asking “what’s the slope of the little line segment at this particular point?”
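To make the "little line segment" picture concrete, here is a minimal sketch. The function `cube`, the point 2, and the helper name `secant_slope` are my own illustrative choices, not anything from class. It measures the slope of shorter and shorter segments of the curve around one point and shows them settling toward a single number.

```python
def secant_slope(f, x, h):
    """Slope of the segment joining (x - h, f(x - h)) and (x + h, f(x + h))."""
    return (f(x + h) - f(x - h)) / (2 * h)


def cube(x):
    # Assumed example function, chosen only for illustration.
    return x ** 3


# Shrink the segment around x = 2 and watch the slope settle down.
for h in (1.0, 0.1, 0.001):
    print(h, secant_slope(cube, 2.0, h))
```

As the segment shrinks, the printed slopes should close in on the derivative of the cube function at 2, which is 12.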

Now here’s the magic. In my class, we’ve learned that **whatever transformations a function undergoes, the tangent line undergoes the same transformations**! If you want to see that, you can check it out here.

For a quick example, let’s look at and .

We see that is secretly which has undergone a vertical stretch of 2, a horizontal shrink of 1/5, and has been moved up 1.

Let’s look at the tangent line to at . It is approximately .

Now let’s put that tangent line through the transformations:

Vertical Stretch of 2:

Horizontal shrink of 1/5:

Shift up 1:

Now let’s plot and our transmogrified tangent line:

Yay! It worked! (But of course we knew that would happen.)

The whole point of this is to show **that tangent lines undergo the same transformations as the functions** — because the functions themselves are pretty much just a bunch of these infinitely tiny tangent line segments all connected together! So it would actually be *weird* if the tangent lines didn’t behave like the functions.
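If you want to poke at this claim numerically, here is a rough sketch. I am assuming a base function of sin(x) and a tangency point of x = 1 purely for illustration (they are not necessarily the ones used above), and the helper names `tangent_line` and `transform` are made up. It pushes both the curve and its tangent line through the same three transformations (vertical stretch of 2, horizontal shrink of 1/5, shift up 1) and checks that the transformed line still touches the transformed curve with the same slope.

```python
import math


def tangent_line(f, df, x0):
    """Return the tangent line to f at x0, as a function of x."""
    return lambda x: f(x0) + df(x0) * (x - x0)


def transform(func):
    """Vertical stretch of 2, horizontal shrink of 1/5, shift up 1."""
    return lambda x: 2 * func(5 * x) + 1


f, df = math.sin, math.cos   # assumed base function and its derivative
x0 = 1.0                     # assumed point of tangency on the base function

g = transform(f)                           # transformed curve
line = transform(tangent_line(f, df, x0))  # transformed tangent line

x1 = x0 / 5   # the horizontal shrink moves the tangency point to x0 / 5
h = 1e-6
g_slope = (g(x1 + h) - g(x1 - h)) / (2 * h)
line_slope = (line(x1 + h) - line(x1 - h)) / (2 * h)

print(g(x1), line(x1))      # heights of curve and line at the new point
print(g_slope, line_slope)  # slopes of curve and line at the new point
```

Both pairs of numbers should agree (up to rounding), which is just the statement that the transformed line is tangent to the transformed curve.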

## My Thought For Using This for The Chain Rule

So why not look at function composition in the same way?

**We can look at a composition of functions at a point as simply a composition of these little line segments.**

Let’s see if I can’t clear this up by making it concrete with an example.

Let’s look at .

And so we can be super concrete, let’s try to find , which is simply the slope of the tangent line of at .

I’m going to argue that just as and are composed to get our final function, we can compose the tangent lines to these two functions to get the final tangent line at .

Let’s start with the . At , the tangent line is (I’m not showing the work, but you can trust me that it’s true, or work it out yourself.)

Now let’s turn to the square root function. We have to be thoughtful about this. The square root is applied to the output of the inner function, which really means that we’re taking the square root of 9. So we want the tangent line to $\sqrt{x}$ at $x = 9$. That turns out to be (again, trust me?): $y = \frac{1}{6}x + \frac{3}{2}$.

So now we have our two line segments.

We have to compose them.

This simplifies to:

Let’s look at a graph of and our tangent line:

Yup!

Where did we ultimately get the slope of 2 from? When we composed the two lines together, we multiplied the slope of the inner function (12) by the slope of the outer function (1/6). And that became our new line’s slope.

Chain rule!
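Here is a small sketch of that line composition in code. One loud assumption: I am using x³ + 1 at x = 2 as the inner function, simply because it has value 9 and slope 12 there, matching the numbers above. Treat it as a stand-in, not necessarily the exact function in the example. The helper name `compose_lines` is also made up.

```python
import math


def compose_lines(outer, inner):
    """Compose two lines given as (slope, intercept) pairs: outer(inner(x))."""
    m_out, b_out = outer
    m_in, b_in = inner
    # outer(inner(x)) = m_out * (m_in * x + b_in) + b_out
    return (m_out * m_in, m_out * b_in + b_out)


# Assumed inner function x**3 + 1 at x = 2: value 9, slope 12, so y = 12x - 15.
inner_line = (12.0, -15.0)
# Tangent line to sqrt(x) at x = 9: value 3, slope 1/6, so y = x/6 + 3/2.
outer_line = (1 / 6, 1.5)

slope, intercept = compose_lines(outer_line, inner_line)
print(slope, intercept)


def composite(x):
    # The assumed composite function, used only to check the slope numerically.
    return math.sqrt(x ** 3 + 1)


eps = 1e-6
numeric_slope = (composite(2 + eps) - composite(2 - eps)) / (2 * eps)
print(numeric_slope)
```

The composed line should come out as roughly y = 2x - 1, and the numerically measured slope of the composite at x = 2 should land on about 2 as well. The only place the two slopes interact is the product `m_out * m_in`.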

## How we generalize this to the chain rule

For any composition of functions, we are going to have an inner and an outer function. Let’s write $h(x) = f(g(x))$, where we can clearly remember which one is the inner function ($g$) and which one is the outer function ($f$). Let’s pick a point $x = a$ where we want to find the derivative.

We are going to have to find the little line segment of the inner function at $x = a$ and compose that with the little line segment of the outer function at the input it actually receives, $g(a)$. That will approximate the function $h$ near $x = a$.

The line segment of the inner function is going to be

$$y = g'(a)\,x + \text{blah1}$$

The line segment of the outer function is going to be

$$y = f'(g(a))\,x + \text{blah2}$$

I am going to leave those terms as blah1 and blah2 because we won’t really need them. Let’s remember we *only want the derivative* (the slope of the tangent line), not the tangent line itself. So our task becomes easier.

Let’s compose them:

$$y = f'(g(a))\,\big[g'(a)\,x + \text{blah1}\big] + \text{blah2}$$

This simplifies to

$$y = f'(g(a))\,g'(a)\,x + f'(g(a))\cdot\text{blah1} + \text{blah2}$$

And since we only want the *slope* of this line (the derivative is the slope of the tangent line, remember), we have:

$$h'(a) = f'(g(a))\,g'(a)$$

Of course we chose an arbitrary point to take the derivative at. So we really have:

$$h'(x) = f'(g(x))\,g'(x)$$

Which is the chain rule.
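As a sanity check on the general statement, here is a hedged numerical sketch. The names `derivative`, `chain_rule_slope`, and `direct_slope` and the three example compositions are all my own picks for illustration. It measures the slope of the outer function's little segment (at the value the inner function hands it) and the slope of the inner function's little segment, multiplies them, and compares the product against the slope of the composite measured directly.

```python
import math


def derivative(func, x, h=1e-6):
    """Approximate slope of the little line segment of func at x."""
    return (func(x + h) - func(x - h)) / (2 * h)


def chain_rule_slope(outer, inner, a):
    """Slope of outer's segment at inner(a) times slope of inner's segment at a."""
    return derivative(outer, inner(a)) * derivative(inner, a)


def direct_slope(outer, inner, a):
    """Slope of the composite outer(inner(x)) at a, measured directly."""
    return derivative(lambda x: outer(inner(x)), a)


# Assumed example compositions: (outer, inner, point).
examples = [
    (math.sin, lambda x: x * x, 1.3),
    (math.exp, math.cos, 0.4),
    (math.sqrt, lambda x: x ** 3 + 1, 2.0),
]

for outer, inner, a in examples:
    print(chain_rule_slope(outer, inner, a), direct_slope(outer, inner, a))
```

Each printed pair should match to several decimal places. Note that the outer function's slope is taken at `inner(a)`, not at `a`, which is the "be thoughtful" step from the square root example.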