# An unformed idea for teaching understanding of the chain rule

I’m soon going to embark on teaching the chain rule in calculus. I have found ways to help kids remember the chain rule (“the outer function is the mama, the inner function is the baby… when you take the derivative, you derive the mama and leave the baby inside, and then you multiply by the derivative of baby”), ways to write things down so their information stays organized, and I have shown them enough patterns to let them see it’s true. But I have never yet found a way to get them to understand it conceptually without confusing them. (The gear thing doesn’t help me get it… Although I understand the analogy, it feels divorced from the actual functions themselves… and these functions have a constant rate of change.)

I think I now have a way that might help students conceptually understand what’s going on. I only had the insight 10 minutes ago so I’m going to use this blogpost to see if I can’t get the ideas straight in my head… The point of this post is not to share a way I’ve made the chain rule understandable. It’s for me to work through some unformed ideas. I am not yet sure if I have a way to turn this into something that my kids will understand.

So here’s where I’m starting from. Every “nice” function (and those are the functions we’re dealing with) is basically like an infinite number of little line segments connected together. Thus, when we take a derivative, we’re pretty much just asking “what’s the slope of the little line segment at $x=3$?” for example.

Now here’s the magic. In my class, we’ve learned that whatever transformations a function undergoes, the tangent line undergoes the same transformations! If you want to see that, you can check it out here.

For a quick example, let’s look at $f(x)=\sin{x}$ and $g(x)=2\sin{(5x)}+1$.

We see that $g(x)$ is secretly $f(x)$ which has undergone a vertical stretch of 2, a horizontal shrink of 1/5, and has been moved up 1.

Let’s look at the tangent line to $f(x)$ at $x=\pi/3$. It is approximately $y=0.5x+0.34$.

Now let’s put that tangent line through the transformations:

$y=0.5x+0.34$

Vertical Stretch of 2: $y=2(0.5x+0.34)=x+0.68$

Horizontal shrink of 1/5: $y=5x+0.68$

Shift up 1: $y=5x+1.68$

Now let’s plot $g(x)$ and our transmogrified tangent line:

Yay! It worked! (But of course we knew that would happen.)

The whole point of this is to show that tangent lines undergo the same transformations as the functions — because the functions themselves are pretty much just a bunch of these infinitely tiny tangent line segments all connected together! So it would actually be weird if the tangent lines didn’t behave like the functions.
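For anyone who wants to double check those numbers without graphing, here is a minimal Python sketch. The helper name `tangent_line` is made up for illustration, and the slopes are estimated numerically with a central difference (my choice here, so that no derivative rule is assumed). Note that the point of tangency moves from $x=\pi/3$ to $x=\pi/15$ because of the horizontal shrink.

```python
import math

def tangent_line(f, a, h=1e-6):
    """Estimate (slope, intercept) of the tangent line to f at x = a."""
    slope = (f(a + h) - f(a - h)) / (2 * h)  # central difference estimate
    return slope, f(a) - slope * a

# Tangent line to f(x) = sin(x) at x = pi/3.
m, b = tangent_line(math.sin, math.pi / 3)

# Push that line through the same transformations as the function:
# vertical stretch of 2, horizontal shrink of 1/5 (x -> 5x), shift up 1.
m_transformed = 2 * m * 5
b_transformed = 2 * b + 1

# Tangent line to g(x) = 2 sin(5x) + 1 found directly at the shrunken x-value.
g = lambda x: 2 * math.sin(5 * x) + 1
m_direct, b_direct = tangent_line(g, math.pi / 15)

print("original:   ", round(m, 2), round(b, 2))
print("transformed:", round(m_transformed, 2), round(b_transformed, 2))
print("direct:     ", round(m_direct, 2), round(b_direct, 2))
```

The last two printed lines should agree, which is the claim being made: transforming the tangent line gives the tangent line of the transformed function.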

## My Thought For Using This for The Chain Rule

So why not look at function composition in the same way?

We can look at a composition of functions at a point as simply a composition of these little line segments.

Let’s see if I can’t clear this up by making it concrete with an example.

Let’s look at $m(x)=\sqrt{x^3+1}$.

And so we can be super concrete, let’s try to find $m'(2)$, which is simply the slope of the tangent line of $m(x)$ at $x=2$.

I’m going to argue that just as $\sqrt{x}$ and $x^3+1$ are composed to get our final function, we can compose the tangent lines to these two functions to get the final tangent line at $x=2$.

Let’s start with the $x^3+1$. At $x=2$, the tangent line is $y_{inner}=12x-15$ (I’m not showing the work, but you can trust me that it’s true, or work it out yourself.)

Now let’s start with the square root function. We have to be thoughtful about this. We are dealing with $m(2)$ which really means that we’re taking the square root of 9. So we want the tangent line to $\sqrt{x}$ at $x=9$. That turns out to be (again, trust me?): $y_{outer}=\frac{1}{6}x+\frac{3}{2}$.

So now we have our two line segments.

We have to compose them.

$y_{composed}=\frac{1}{6}[12x-15]+\frac{3}{2}$

This simplifies to:

$y_{composed}=2x-1$

Let’s look at a graph of $m(x)$ and our tangent line:

Yup!

Where did we ultimately get the slope of 2 from? When we composed the two lines together, we multiplied the slope of the inner function (12) by the slope of the outer function (1/6). And that became our new line’s slope.

Chain rule!
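If you don’t want to trust me, here is a small Python sketch of that composition. The function name `compose_lines` is just something I made up for this check, and the comparison at the end uses a numerical estimate of the slope of $m(x)$ rather than any derivative rule.

```python
import math

def compose_lines(outer, inner):
    """Compose two lines given as (slope, intercept): outer(inner(x))."""
    m_out, b_out = outer
    m_in, b_in = inner
    return m_out * m_in, m_out * b_in + b_out

inner_line = (12, -15)       # tangent line to x**3 + 1 at x = 2
outer_line = (1 / 6, 3 / 2)  # tangent line to sqrt(x) at x = 9

slope, intercept = compose_lines(outer_line, inner_line)

# Numerical estimate of the slope of m(x) = sqrt(x**3 + 1) at x = 2.
m = lambda x: math.sqrt(x**3 + 1)
h = 1e-6
estimate = (m(2 + h) - m(2 - h)) / (2 * h)

print("composed line: slope", round(slope, 4), "intercept", round(intercept, 4))
print("numerical slope of m at 2:", round(estimate, 4))
```

The composed slope is the product of the two slopes, and it should match the numerical slope of $m(x)$ at $x=2$.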

## How we generalize this to the chain rule

For any composition of functions, we are going to have an inner and an outer function. Let’s write $c(x)=o(i(x))$ where we can clearly remember which one is the inner and which one is the outer function. Let’s pick a point $x_0$ where we want to find the derivative.

We are going to have to find the little line segment of the inner function at $x_0$ and compose that with the little line segment of the outer function at $i(x_0)$ (because the outer function is fed the output of the inner function). That will approximate the function $c(x)$ near $x_0$.

The line segment of the inner function is going to be $y_{inner}=i'(x_0)x+blah1$

The line segment of the outer function is going to be $y_{outer}=o'(i(x_0))x+blah2$

I am going to keep those terms blah1 and blah2 only because we won’t really need them. Let’s remember we only want the derivative (the slope of the tangent line), not the tangent line itself. So our task becomes easier.

Let’s compose them: $y_{composed}=o'(i(x_0))[i'(x_0)x+blah1]+blah2$

This simplifies to $y_{composed}=o'(i(x_0))i'(x_0)x+blah3$

And since we only want the slope of this line (the derivative is the slope of the tangent line, remember), we have:

$o'(i(x_0))i'(x_0)$.

Of course we chose an arbitrary point $x_0$ to take the derivative at. So we really have:

$c'(x)=o'(i(x))i'(x)$

Which is the chain rule.
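Here is a rough Python sketch of that general argument. Everything in it is an assumption for illustration: the helper names, the choice of $i(x)=x^2+1$ and $o(x)=\sin(x)$, and the point $x_0=0.7$. The slopes are estimated numerically, so this is a sanity check and not a proof.

```python
import math

def numerical_slope(f, x, h=1e-6):
    """Central difference estimate of the slope of f at x."""
    return (f(x + h) - f(x - h)) / (2 * h)

def chain_rule_slope(outer, inner, x0):
    """Multiply the slope of the inner line at x0 by the slope of the outer line at i(x0)."""
    inner_slope = numerical_slope(inner, x0)
    outer_slope = numerical_slope(outer, inner(x0))  # the outer line lives at i(x0)
    return outer_slope * inner_slope

inner = lambda x: x**2 + 1           # hypothetical inner function
outer = math.sin                     # hypothetical outer function
composed = lambda x: outer(inner(x))

x0 = 0.7
product = chain_rule_slope(outer, inner, x0)
direct = numerical_slope(composed, x0)

print("product of slopes:", round(product, 5))
print("slope of composition:", round(direct, 5))
```

The two printed numbers should agree to several decimal places, whichever nice functions and point you swap in.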

# I got rid* of Limits in Calculus (*almost entirely)

I’ve been meaning to write this post for a while. I teach non-AP Calculus. My goal in this course is to get my kids to understand calculus with depth — that means my primary focus is on conceptual understanding, where facility with fancy-algebra things is secondary. Now don’t go thinking my kids come out of calculus not knowing how to do real calculus. They do. It is just that I pare things down so that they don’t have to find the derivatives of things like $y=\cot(x)$. Why? Because even though I could teach them that (and I have in the past), I would rather spend my time doing less work on moving through algebraic hoops, and more work on deep conceptual understanding.

Everything I do in my course aims for this. Sometimes I succeed. Sometimes I fail. But I don’t lose sight of my goal.

Each year, I have parts of the calculus curriculum I rethink, or have insights on. In the past few years, I’ve done a lot of thinking about limits and where they fit in the big picture of things. Each year, they lose more and more value in my mind. I used to spend a quarter of a year on them. In more recent years, I spent maybe a sixth of a year on them. And this year, I’ve reduced the time I spend on limits to about 5 minutes.*

*Okay, not really. But kinda. I’ll explain.

First I’ll explain my reasoning behind this decision. Then I’ll explain how I did it.

## Reasoning Behind My Decision to Eliminate Limits

For me, calculus has two major parts: the idea of the derivative, and the idea of the integral.

Limits show up in both [1]. But where do they show up in derivatives?

• when you use the formal definition of the derivative

and… that’s pretty much it. And where do they show up in integrals?

• when you say you are taking the sum of an infinite number of infinitely thin rectangles

and… that’s pretty much it. I figure if that’s all I need limits for, I can target how I introduce and use limits to really focus on those things. Do I really need them to understand limits at infinity of rational functions? Or limits of piecewise functions? Or limits of things like $y=\sin(1/x)$ as $x\rightarrow 0$?

Nope. And this way I’m not wasting a whole quarter (or even half a quarter) with such a simple idea. All I really need — at least for derivatives — is how to find the limit as one single variable goes to 0. C’est tout!

## How I did it

This was our trajectory:

(1) Students talked about average rate of change.

(2) Students talked about the idea of instantaneous rate of change. They saw it was problematic, because how can something be changing at an instant? If you say you’re travelling “58 mph at 2:03pm,” what exactly does that mean? There is no time interval for this 58 mph to pop out of, since we’re talking about an instant, a single moment in time (of 2:03pm). So we problematized the idea of instantaneous rate of change. But we also recognized that we understand that instantaneous rates of change do exist, because we believe the speedometers in our cars which say 60 mph. So we have something that feels philosophically impossible but in our guts and everyday experience feels right. Good. We have a problem we need to resolve. What might an instantaneous rate of change mean? Is it an oxymoron to have a rate of change at an instant?

(3) Students came to understand that we could approximate the instantaneous rate of change by taking the slope of two points really really really close to each other on a function. And the closer that we got, the better our approximation was. (Understanding why we got a better and better approximation was quite hard conceptual work.) Similarly students began to recognize graphically that the slope of two points really close to each other is actually almost the slope of the tangent line to the function.

(4) Now we wanted to know if we could make things exact. We knew we could make things exact if we could bring the two points infinitely close to each other. But each time we tried that, we either got two points pretty close to each other or the two points lay directly on top of each other (and you can’t find the slope between a point and itself). So still we have a problem.

And this is where I introduced the idea of introducing a new variable, and eventually, limits.

We encountered the question: “what is the exact instantaneous rate of change for $f(x)=x^2$ at $x=3$?”

We started by picking two points close to each other: $(3,9)$ and $(3+h,(3+h)^2)$

This was the hardest thing for students to understand. Why would we introduce this extra variable $h$? But we talked about how $(3.0001,3.0001^2)$ wasn’t a good second point, and how $(3.0000001,3.0000001^2)$ also wasn’t a good second point. But if they trusted me on using this variable thingie, they would see how our problems would be resolved.

We then found the average rate of change between the two points, recognizing that the second point could be really far away from the first point if $h$ were a large positive or negative number… or close to the first point if $h$ were close to 0.

Yes, students had to first understand that $h$ could be any number. And they had to come to the understanding that $h$ represented where the second point was in relation to the first point (more specifically: how far horizontally the second point was from the first point).

And so we found the average rate of change between the two points to be:

$AvgRateOfChange=\frac{(3+h)^2-9}{(3+h)-3}$

We then said: how can we make this exact? How can we bring the two points infinitely close to each other? Ahhh, yes, by letting $h$ get infinitely close to 0.

And so I introduce the idea of the limit as such:

If I have $\lim_{h\rightarrow 0} blah$, it means what blah gets infinitely close to if $h$ gets infinitely close to 0 but is not equal to 0. That last part is key. And honestly, that’s pretty much the entirety of my explanation about limits. So that’s the 5 minutes I spend talking about limits.

So to find the instantaneous rate of change, we simply have:

$InstRateOfChange=\lim_{h\rightarrow0} \frac{(3+h)^2-9}{(3+h)-3}$

This is simply the slope between two points which have been brought infinitely close together. Yes, that’s what limits do for you.

And then we simplify:

$InstRateOfChange=\lim_{h\rightarrow0} \frac{9+6h+h^2-9}{h}$

$InstRateOfChange=\lim_{h\rightarrow0} \frac{6h+h^2}{h}$

$InstRateOfChange=\lim_{h\rightarrow0} \frac{h(6+h)}{h}$

$InstRateOfChange=\lim_{h\rightarrow0} \frac{h}{h} \frac{(6+h)}{1}$

Now because we know that $h$ is close to 0, but not equal to 0, we can say with confidence that $\frac{h}{h}=1$. Thus we can say:

$InstRateOfChange=\lim_{h\rightarrow 0} (6+h)$

And now as $h$ goes to 0, we see that $6+h$ gets infinitely close to 6.

Done. (Here’s a do now I did in class.)
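To watch the numbers do what the algebra says, here is a tiny Python sketch (the function name and the particular values of $h$ are my own picks for illustration). It just plugs smaller and smaller nonzero values of $h$ into the average rate of change.

```python
def avg_rate_of_change(h):
    """Slope between (3, 9) and (3 + h, (3 + h)**2); h must not be 0."""
    return ((3 + h) ** 2 - 9) / ((3 + h) - 3)

for h in [1, 0.1, 0.01, 0.001, -0.001, -0.01]:
    print(h, round(avg_rate_of_change(h), 6))
```

Each printed slope is (up to rounding) $6+h$, so the values crowd in on 6 from both sides as $h$ gets close to 0. Plugging in $h=0$ itself would divide by zero, which is exactly why the “but is not equal to 0” part matters.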

We did this again and again to find the instantaneous rate of change of various functions at a point. For example, functions like:

$f(x)=x^3-2x+1$ at $x=1$

$g(x)=\sqrt{2-3x}$ at $x=-2$

$h(x)=\frac{5}{2-x}$ at $x=1$

For these, the algebra got more gross, but the idea and the reasoning was the same in every problem. Notice to do all of these, you don’t need any more knowledge of limits than what I outlined above with that single example. You need to know why you can “remove” the $\frac{h}{h}$ (why it is allowed to be “cancelled” out), and then what happens as $h$ goes to 0. That’s all.

Yup, again, notice I only needed to rely on this very basic understanding of limits to solve these three problems algebraically: $\lim_{h\rightarrow 0} blah$ means what blah gets infinitely close to if $h$ gets infinitely close to 0 but is not equal to 0.
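As a hedged numerical companion to the algebra (not a replacement for it), this Python sketch computes the average rate of change for those three functions with shrinking $h$. The names `difference_quotient` and `examples` are invented for this sketch. If I have done the algebra right, the values should head toward 1, about $-0.53$, and 5.

```python
import math

def difference_quotient(f, a, h):
    """Average rate of change of f between x = a and x = a + h."""
    return (f(a + h) - f(a)) / h

examples = {
    "x^3 - 2x + 1 at x = 1": (lambda x: x**3 - 2 * x + 1, 1),
    "sqrt(2 - 3x) at x = -2": (lambda x: math.sqrt(2 - 3 * x), -2),
    "5 / (2 - x) at x = 1": (lambda x: 5 / (2 - x), 1),
}

for label, (f, a) in examples.items():
    values = [round(difference_quotient(f, a, h), 4) for h in (0.1, 0.01, 0.001)]
    print(label, values)
```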

(5) Eventually we generalize to find the instantaneous rate of change at any point, using the exact same process and understanding. At this point, the only difference is that the algebra gets slightly more challenging to keep track of. But not really that much more challenging.

(6) Finally, waaaay at the end, I say: “Surprise! The instantaneous rate of change has a fancy calculus word — derivative.”

Apologies in advance if any of this was unclear. I feel I didn’t explain things as well as I could have. I also want to point out that I understand if you don’t agree with this approach. We all have different thoughts about what we find important and why. I can make (and in fact, in the past, I have made) the case that going into depth into limits is of critical importance. I personally just don’t see things the same way anymore.

Now I should also say that there have been a few downsides to this approach, but on the whole it’s been working well for me so far. I would elaborate on the downsides but right now I’m just too exhausted. Night night!

[1] Okay, I should also note that limits show up in the definition for continuity. But since in my course I don’t really focus on “ugly” functions, I haven’t seen the need to really spend time on the idea of continuity except in the conceptual sense. Yes, I can ask my kids to draw the derivative of $y=|x|$ and they will be able to. They will see there is a jump at $x=0$. I don’t need more than that.

A couple years ago, Kate Nowak asked us to ask our kids:

“What is 1 Radian?” Try it. Dare ya. They’ll do a little better with: “What is 1 Degree?”

I really loved the question, and I did it last year with my precalculus kids, and then again this year. In fact, today I had a mini-assessment in precalculus which had the question:

What, conceptually, is 3 radians? Don’t convert to degrees — rather, I want you to explain radians on their own terms as if you don’t know about degrees. You may (and are encouraged to) draw pictures to help your explanation.

My kids did pretty well. They still were struggling with a bit of the writing aspect, but for the most part, they had the concept down. Why? It’s because my colleague and geogebra-amaze-face math teacher friend made this applet which I used in my class. Since this blog can’t embed geogebra files, I entreat you to go to the geogebratube page to check it out.

Although very simple, I dare anyone to leave the applet not understanding: “a radian is the angle subtended by the bit of the circumference of a circle that has a length of a single radius.” What makes it so powerful is that it shows radii being pulled out of the center of the circle, like a clown pulls a neverending set of colorful handkerchiefs out of his pocket.

If you want to see the applet work but are too lazy to go to the page, I have made a short video showing it work.

PS. Again, I did not make this applet. My awesome colleague did. And although there are other radian applets out there, there is something that is just perfect about this one.

# Mission #8: Sharing is Caring in the MTBoS

It's amazing. You're amazing. You joined in the Explore the MathTwitterBlogosphere set of missions, and you've made it to the eighth week. It's Sam Shah here, and whether you only did one or two missions, or you were able to carve out the time and energy to do all seven so far, I am proud of you.

I've seen so many of you find things you didn't know were out there, and you tried them out.

Here I'm reblogging our last mission from the Explore the #MTBoS!

# Trig War

This is going to be a quick post.

Kate Nowak played “log war” with her classes. I stole it and LOVED it. Her post is here. It really gets them thinking in the best kind of way. Last year I wanted to do “inverse trig war” with my precalculus class because Jonathan C. had the idea. His post is here. I didn’t end up having time so I couldn’t play it with my kids, sadly.

This year, I am teaching precalculus, and I’m having kids figure out trig on the unit circle (in both radians and degrees). So what do I make? The obvious: “trig war.”

The way it works…

I have a bunch of cards with trig expressions (just sine, cosine, and tangent for now) and special values on the unit circle — in both radians and degrees.

You can see all the cards below, and can download the document here (doc).

They played it like a regular game of war:

I let kids use their unit circle for the first 7 minutes, and then they had to put it away for the next 10 minutes.

And that was it!

# An expanded understanding of basic derivatives – graphically

The guilt that I feel for not blogging more regularly this year has been considerable, and yet, it has not driven me to post more. I’ve been overwhelmed and busy, and my philosophy about blogging is: do it when you feel motivated. And so, I haven’t.

Today, I feel a slight glimmer of motivation. And so here I am.

Here’s what I want to talk about.

In calculus, we all have our own ways of introducing the power rule for derivatives. Graphically. Algebraically. Whatever. But then, armed with this knowledge…

that if $f(x)=x^n$, then $f'(x)=nx^{n-1}$

…we tend to drive forward quickly. We immediately jump to problems like:

take the derivative of $g(x)=4x^3-3x^{-5}+2x^7$

and we hurtle on, racing to the product and quotient rules… We get so algebraic, and we go so quickly, that we lose sight of something beautiful and elegant. This year I decided to take an extra few days after the power rule but before problems like the one listed above to illustrate the graphical side of things.

Here’s what I did. We first got to the point where we comfortably proved the power rule for derivatives (for n being a counting number). Actually, before I move on and talk about the crux of this post, I should show you what we did…

Okay. Now I started the next class with kids getting Geogebra out and plotting on two graphics windows the following:

and they saw the following:

At this point, we saw the transformations. On the left hand graph, we saw that the function merely shifted up one unit. On the right hand graph, we saw a vertical stretch for one function, and a vertical shrink for the other.

Here’s what I’m about to try to illustrate for the kids.

Whatever transformation a function undergoes, the tangent lines to the function also undergo the exact same transformation.

What this means is that if a function is shifted up one unit, then all tangent lines are shifted up one unit (like in the left hand graph). And if a function undergoes vertical stretching or shrinking, all tangent lines undergo the same vertical stretching or shrinking.

I want them to see this idea come alive both graphically and algebraically.

So I have them plot all the points on the functions where $x=1$. And all the tangent lines.

For the graph with the vertical shift, they see:

The original tangent line (to $f(x)=x^2$) was $y=2x-1$. When the function moved up one unit, we see the tangent line simply moved up one unit too.

Our conclusion?

Yup. The tangent line changed. But the slope did not. (Thus, the derivative is not affected by simply shifting a function up or down. Because even though the tangent lines are different, the slopes are the same.)

Then we went to the second graphics view — the vertical stretching and shrinking. We drew the points at $x=1$ and their tangent lines…

…and we see that the tangent lines are similar, but not the same. How are they similar? Well the original function’s tangent line is the red one, and has the equation $y=2x-1$. Now the green function has undergone a vertical shrink of 1/4. And lo and behold, the tangent line has also!

To show that clearly, we did the following. The original tangent line has equation $y=2x-1$. So to apply a vertical shrink of 1/4 to this, you are going to see $y=\frac{1}{4}(2x-1)$ (because you are multiplying all y-coordinates by 1/4). And that simplifies to $y=0.5x-0.25$. Yup, that’s what Geogebra said the equation of the tangent line was!

Similarly, for the blue function with a vertical stretch of 3, we get $y=3(2x-1)=6x-3$. And yup, that’s what Geogebra said the equation of the tangent line was.

What do we conclude?

And in this case, with the vertical stretching and shrinking of the functions, we get a vertical stretching and shrinking of the tangent lines. And unlike moving the function up or down, this transformation does affect the slope!
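For readers without Geogebra handy, here is a minimal Python sketch of the same check. The four functions are my reading of the description above ($x^2$, the shift up by 1, the shrink by 1/4, and the stretch by 3), the helper name `tangent_line` is made up, and the slopes are estimated numerically.

```python
def tangent_line(f, a, h=1e-6):
    """Estimate (slope, intercept) of the tangent line to f at x = a."""
    slope = (f(a + h) - f(a - h)) / (2 * h)  # central difference estimate
    return slope, f(a) - slope * a

functions = {
    "x^2": lambda x: x**2,
    "x^2 + 1": lambda x: x**2 + 1,
    "(1/4)x^2": lambda x: 0.25 * x**2,
    "3x^2": lambda x: 3 * x**2,
}

# Tangent line at x = 1 for each function.
lines = {name: tangent_line(f, 1) for name, f in functions.items()}
for name, (slope, intercept) in lines.items():
    print(f"{name}: slope {slope:.2f}, intercept {intercept:.2f}")
```

The shift should leave the slope at 2 and only move the intercept, while the shrink and the stretch should scale both the slope and the intercept by 1/4 and by 3.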

I repeat the big conclusion:

Whatever transformation a function undergoes, the tangent lines to the function also undergo the exact same transformation.

I didn’t actually tell this to my kids. I had them sort of see and articulate this.

Now they see that if a function gets shifted up or down, they can see that the derivative stays the same. And if there is a vertical stretch/shrink, the derivative is also vertically stretched/shrunk.

The next day, I started with the following “do now.” We haven’t learned the derivative of $\sin(x)$, so I show them what Wolfram Alpha gives them.

For (a), I expect them to give the answer $g'(x)=3\cos(x)$ and for (b), $h'(x)=-\cos(x)$.

The good thing here is now I get to go for depth. WHY?

And I hear conversations like: “Well, g(x) is a transformation of the sine function which gives a vertical stretch of 3, and then shifts the function up 4. Well since the function undergoes those transformations, so do the tangent lines. So each tangent line is going to be vertically stretched by 3 and moved up 4 units. Since the derivative is only the slope of the tangent line, we have to see what transformations affect the slope. Only the vertical stretch affects the slope. So if the original slope of the sine function was $\cos(x)$, then we know that the slope of the transformed function is $3\cos(x)$.”

That’s beautiful depth. Beautiful.

For (b), I heard talk about how the negative sign is a reflection over the x-axis, so the tangent lines are reflected over the x-axis also. Thus, the slopes are the opposite sign… If the original sine function’s slope of the tangent lines was $\cos(x)$, then the new slopes are going to be $-\cos(x)$.

This isn’t easy for my kids, so when I saw them struggling with the conceptual part of things, I whipped up this sheet (.docx).

And here are the solutions.

And here is a Geogebra sheet which shows the transformations, and the new tangent line (and equation), for this worksheet.

Now to be fair, I don’t think I did a killer job with this. It was my first time doing it. I think some kids didn’t come out the stronger for this. But I do feel that the kids who do get it have a much more intuitive understanding of what’s going on.

I am much happier to know that if I ask kids what the derivative of $q(x)=6x^9$ is, they immediately think (or at least can understand) that we get $q'(x)=6*9x^8$, because…

our base function is $x^9$ which has derivative (aka slope of the tangent line) $9x^8$… Thus the transformed function $6x^9$ is going to be a vertical stretch, so all the tangent lines are going to be stretched vertically by a factor of 6 too… thus the derivative of this (aka the slope of the tangent line) is $q'(x)=6*9x^8$.

To me, that sort of explanation for something super simple brings so much graphical depth to things. And that makes me feel happy.

# Infinite Geometric Series

I did a bad job (in my opinion) of teaching infinite geometric series in precalculus in my previous class. I told them I did a bad job. I was rushing. They were confused. (One of them said: “you did a fine job, Mr. Shah” which made me feel better, but I still felt like they were super confused.)

At the start of the lesson, I gave each group one colored piece of paper. (I got this idea last year from my friend Bowen Kerins on Facebook! He is not only a math genius but he’s also a 5 time world pinball player champion. Seriously.) I don’t know why but it was nice to give each group a different color piece of paper. Then I had them designate one person to be the “paper master” and two people to be the friends of the paper master. Any group with a fourth person simply had to have the fourth person be the observer.

I did not document this, so I have made photographs to illustrate ex post facto.

I started, “Paper master, you have a whole sheet of paper! One whole sheet of paper! And you have two friends. You feel like being kind, sharing is caring, so why don’t you give them each a third of your paper.”

The paper master divided the paper in thirds, tore it, and shared their paper.

Then I said: “Your friends loveeeed their paper gift. They want just a little bit more. Why don’t you give them each some more… Maybe divide what you have left into thirds so you can keep some too.”

And the paper master took what they had, divided it into thirds, and shared it.

To the friends, I said: “Hey, friends, how many of you LOOOOOVE all these presents you’re getting? WHO WANTS MORE?” and the friends replied “MEEEEEEEEEEEEEEE!”

“Paper master, your friends are getting greedy. And they demand more paper. They said you must give them more or they won’t be your friends. And you are peer pressured into giving them more. So divide what little you have left and hand it to them.”

They do.

“Now do it again. Because your greedy friends are greedy and evil, but they’re still your friends.”

“Again.”

“Again.”

Here we stop. The friends have a lot of slips of paper of varying sizes. The paper master has a tiny speck.

I ask the class: “If we continue this, how much paper is the paper master going to eventually end up with?”

(Discussion ensues about whether the answer is 0 or super duper super close to 0.)

I ask the class: “If we continue this, how much paper are each of the friends going to have?”

(A more lively short discussion ensues… Eventually they agree… each friend will have about 1/2 the paper, since there was a whole piece of paper to start, each friend gets the same amount, and the paper master has essentially no paper left.)

I then go to the board.

I write $\frac{1}{2}=$

and then I say: “How much paper did you get in your initial gift, friends?”

I write $\frac{1}{2}=\frac{1}{3}+$

and then we continue, until I have:

$\frac{1}{2}=\frac{1}{3}+\frac{1}{9}+\frac{1}{27}+\frac{1}{81}+...$

Ooohs and aahs.
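If you want to keep tearing long after the paper is too small to hold, here is a short Python sketch of the sharing game. The variable names are mine, and I use exact fractions so that rounding doesn’t muddy the picture.

```python
from fractions import Fraction

master = Fraction(1)  # the paper master starts with one whole sheet
friend = Fraction(0)  # what ONE friend has collected so far

for round_number in range(1, 11):
    gift = master / 3   # each friend gets a third of what is left
    friend += gift
    master -= 2 * gift  # two friends each took a gift
    print(round_number, friend, round(float(friend), 6), master)
```

After each round the paper master holds a third of what they had before, and each friend holds exactly half of whatever the paper master has given away, so the friend’s pile creeps up toward 1/2 without ever passing it.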

Next year I am going to task each student to do this with two friends or people from their family, and have them write down their friends/family member’s reactions…

I love this.