There might be light at the end of the Chain Rule tunnel… maybe.

This is going to be a half-formed post. I wanted to get a conceptual way for kids to grok why the chain rule works in calculus. But without doing too much handwaving. And I wanted something visual.

The hard part is: if we have a function g(f(x)), we can approximate the derivative at a particular point by doing the following.

Find two points close to each other, like (x,g(f(x))) and (x+0.001,g(f(x+0.001))).

Find the slope between those two points: \frac{g(f(x+0.001))-g(f(x))}{(x+0.001)-x}.

There we go. An approximation for the derivative! (We can use limits to write the exact expression for the derivative if we want.)

But that doesn’t help us understand that \frac{d}{dx}[g(f(x))]=g'(f(x))f'(x) on any level. They seem disconnected!

But I’m on my way there. I’m following things in this way: x \rightarrow f \rightarrow g

Check out this thing I whipped up after school today. The diagram on top does x \rightarrow f and the diagram on the bottom does f \rightarrow g. The diagram on the right does both. It shows how two initial inputs (in this case, 3 and 3.001) change as they go through the functions f and g.

At the very bottom, you see the heart of this. It has \frac{\Delta g}{\Delta f}\cdot\frac{\Delta f}{\Delta x}=\frac{\Delta g}{\Delta x}
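Here is a minimal sketch of that bottom line in Python. The diagram's actual functions aren't reproduced here, so f(x)=x^2 and g(u)=sin(u) are stand-ins I picked purely for illustration. The point is only that the two small slopes multiply to give the slope of the whole trip from x to g.

```python
import math

# Hypothetical stand-ins for the two functions in the diagram (an assumption).
def f(x):
    return x ** 2

def g(u):
    return math.sin(u)

x1, x2 = 3.0, 3.001          # the two nearby inputs
f1, f2 = f(x1), f(x2)        # where they land after f
g1, g2 = g(f1), g(f2)        # where they land after g

delta_x = x2 - x1
delta_f = f2 - f1
delta_g = g2 - g1

slope_f = delta_f / delta_x          # approximates f'(3)
slope_g = delta_g / delta_f          # approximates g'(f(3))
slope_composed = delta_g / delta_x   # approximates the derivative of g(f(x)) at 3

print(slope_g * slope_f, slope_composed)
# For these stand-in functions, g'(f(3)) * f'(3) is exactly cos(9) * 6.
print(math.cos(9.0) * 6.0)
```

The first two printed numbers agree because the \Delta f cancels, and both sit close to the exact chain rule value on the last line.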

And then I thought: okay, this is getting me somewhere, but it’s too abstract. So I went more concrete and started thinking of something physical: maybe someone is heating something up, and in three seconds, the temperature rises dramatically. The temperature measurements are made in Fahrenheit, but you are a true scientist at heart and want to see how the temperature changed in Celsius.
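A rough sketch of the temperature idea, with made-up readings (70°F rising to 124°F over three seconds is my assumption, not the numbers from my page). The rate in Celsius per second comes out as the Celsius-per-Fahrenheit rate times the Fahrenheit-per-second rate.

```python
def fahrenheit_to_celsius(temp_f):
    return (temp_f - 32) * 5 / 9

# Hypothetical measurements (assumed for illustration).
t_start, t_end = 0.0, 3.0
temp_f_start, temp_f_end = 70.0, 124.0

delta_t = t_end - t_start
delta_f = temp_f_end - temp_f_start
delta_c = fahrenheit_to_celsius(temp_f_end) - fahrenheit_to_celsius(temp_f_start)

rate_f_per_sec = delta_f / delta_t   # degrees Fahrenheit per second
rate_c_per_f = delta_c / delta_f     # degrees Celsius per degree Fahrenheit (5/9)
rate_c_per_sec = delta_c / delta_t   # degrees Celsius per second

print(rate_f_per_sec, rate_c_per_f, rate_c_per_sec)
print(rate_c_per_f * rate_f_per_sec)  # same as rate_c_per_sec, up to rounding
```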


I love this. I’m proud of this page.

And then of course when I got home, I wanted to see this process visualized, so I hopped on Geogebra and had fun creating this applet (click here or on the image below to go to the applet). These sorts of input-output diagrams going from numberline to numberline are called dynagraphs. You can change the two functions, and you can drag the two initial points on the left around. (The scale of the middle and right bar change automatically with new functions you type! Fancy!)


And of course after doing all this, I remembered watching a video that Jim Fowler made on the chain rule for his online calculus course, and yes, all my thinking is pretty much recapturing his progression.

This, to be clear, is about the fourth idea I’ve had as I’ve been thinking about how to conceptually get at the chain rule for my kids. The other ideas weren’t bad! I just didn’t have time to blog about them, but I also abandoned them because they still felt too tough for my kids. But I think this approach has some promise. It’s definitely not there yet, and I don’t know if I’ll have time to get there this year (so I might have to work on it for next year). But I know to get there I’ll have to focus on making the abstract very tangible, and not have too many logical leaps (so the chain of logic gets lost).

If I’m going to create something I’m proud of, kids are going to have to come out saying “oh, yeah… OBVIOUSLY the chain rule makes sense.” Not “Oh, I guess we did a lot of stuff and it all worked out, so it must be true.”

A blogpost of unformed thoughts, and an applet. Sorry, not sorry. This is my process!


POP! Popcorn Optimization Problem

I’m in the middle of optimization in my calculus class now. I had a “long block” (every seven school days, I see a class once for 90 minutes) and for the second half of that long block, I like to do something slightly different. Since I knew my kids hadn’t seen or done the traditional “box optimization problem” in precalculus (since I taught them last year also!), I decided to do that.

This might jog your memory if you don’t know what I’m talking about. You take a piece of paper. You cut out four squares (the same size squares) from the corners. You then fold up the four flaps and tape the box shut. There you go!


You can probably tell that the box’s volume is going to be based on the original paper you start with, and the size of the square you decide to cut out. The question is: what’s the largest volume you can get for this box?

If you cut out a teeeny tiiiiny square, you’re going to have a very large base for the box, but almost no height. And if you cut out a giant square, you’re going to have a large height but a teeeeeeeny tiiny base. And somewhere between a teeeeeny tiny square and a giant square is going to be the perfect square to cut out which will give you the largest volume.

So the question is: given a specific piece of paper, what size square do you need to cut out to get the maximum volume?

This question has been done to death in middle school classes, in Algebra II classes, in Precalculus classes, and in Calculus classes. And I recognize that this post is just another rehashing of the same old problem. But I remember reading about a teacher who did a variation of this by including popcorn. And I wanted to do the same. No surprise, when I looked it up, it was dear Fawn. But I had such a lovely time in class today watching this unfold that I wanted to share the specific sheet I made up for kids to do this.

[2018-01-31 Popcorn Activity .docx version to download]

Teacher Moves / Outline

This activity requires students to know and use the quadratic formula. My kids (standard level calculus) are pretty weak with algebra, so I started the class with a “do now” that had kids use the QF. So I recommend that.

Show kids the popcorn. (I had two different flavors.) Show your excitement about the activity. (I was genuinely excited!) Get them psyched. Hand out the worksheet but nothing else.

Put a three minute timer on the board. Explain the problem. Show kids a piece of cardstock with 4 squares drawn on it. Show kids a second piece of cardstock with those same four squares cut out and the flaps folded up so it looks like a box (but untaped, so you can unfold it too). Tell them the volume they create is the amount of popcorn they are going to get. And that you aren’t going to overfill their boxes — just to the brim. Tell them they have 3 minutes to work with their partner to come up with the best size square they want to cut out. And they are not allowed to do any calculations. Just visual estimation. 

At that point, give cardstock, ruler, scissors, and tape to kids. Do not let kids start until you press “GO” on the timer. Then… GO!

After three minutes, my kids were done. They measured the side length of the square they cut out and recorded it on the worksheet. They then cut and taped. They weren’t allowed to get their popcorn until they did one more thing… some math…


It was super important to me that kids didn’t measure anything, except for the side length of the square, to do these problems. Why? Because this is where I want kids to recognize the side length of the square is the height (that was obvious to all my kids), but also that when calculating the length and width, they were going to be doing 216-2x and 279-2x (where x was their side length). Only a few kids didn’t get the 2x bit (they only subtracted x), but I sent them back to their seats to rethink their length and width and they immediately got it. It was actually awesome to hear their OOOOOOHHHHH moment. But yeah, no measuring. They have to use their brainzzz to come up with the length and width with what they are given!

Only after checking their volume with me, and I said it was correct, could they fill their boxes with popcorn.

As an aside, when writing this activity, I had to decide what level of scaffolding I wanted to give for this. I decided not much. So I didn’t include any diagrams. (Well, I did put two on the very last page of the worksheet in case a kid needed some additional help. Turns out no one did.) I also initially wrote the worksheet to be in inches, but then changed to centimeters, and then after thinking a bit more, I changed to millimeters. Why? So kids don’t have to deal with fractions (inches) or decimals (centimeters), and we could keep our eye on the prize. It also made the volume huge — and so kids would have to do a little work to get the correct window when graphing.

At this point, I sent them back to their seats with popcorn in their box to then solve the general case. Close to the end of class, I posted the different volumes students got by estimation (it was a tiny class today… kids were absent or at sports).
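For anyone who wants to check the general case, here is a sketch of the calculus in Python. It assumes only the 216 mm by 279 mm sheet from the worksheet; the variable names are mine. It sets V'(x)=0 and uses the quadratic formula, which is why the “do now” mattered.

```python
import math

# Sheet dimensions in millimeters, as on the worksheet.
WIDTH, LENGTH = 216, 279

def volume(x):
    """Volume of the box when a square of side x is cut from each corner."""
    return x * (WIDTH - 2 * x) * (LENGTH - 2 * x)

# V(x) = 4x^3 - 2(WIDTH + LENGTH)x^2 + WIDTH*LENGTH*x, so
# V'(x) = 12x^2 - 4(WIDTH + LENGTH)x + WIDTH*LENGTH.
a = 12
b = -4 * (WIDTH + LENGTH)
c = WIDTH * LENGTH

# Quadratic formula applied to V'(x) = 0.
disc = b * b - 4 * a * c
roots = [(-b - math.sqrt(disc)) / (2 * a), (-b + math.sqrt(disc)) / (2 * a)]

# Only cuts shorter than half the short side make a real box.
best_x = [x for x in roots if 0 < x < WIDTH / 2][0]
print(round(best_x, 1), round(volume(best_x)))
```

The second root is larger than half the width of the sheet, so it can’t be cut out. That is a nice conversation to have with kids about why one answer gets thrown away.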


Overall, I spent about 35 minutes on this in class. One pair finished completely. All the others are at the place where they are in the middle of the calculus work (close to being done).

Play! Create! Adult!: My Second TMC17 Recap Post

Here are some more TMC17 notes!

Don’t play with your food, damnit! Play with your math!

I love the idea of having kids engaging in recreational math. I don’t have much time to encourage that in my curriculum — or at least the only way I’ve found for that to happen is with my explore math project [posts 1, 2, 3; website]. Some kids get some extra math problems to work on at math club (usually problems from math competitions), and kids do math problems on our math team. But that isn’t the spirit of what I want to bring to my school. I want to get kids just fooling around with math for fun! Tinkering! Thinkering! Building! Collaborating! So that’s why I fell in love with Joey Kelly (@joeykelly89)’s my favorite presentation, where he shared with us Play With Your Math.

He and a friend created it. Right now it has 15 sheets of paper that can be printed out, each with a challenge. The name, inspired. Design wise, fantastic. But the problems are captivating, easy to dive into, and many have this open-endedness that can lead to obsession. When I was at the Desmos Fellowship a couple weeks ago, they had these for us to work on as a way to get to know each other. Each table had a different one and we were encouraged to play, and meet others who were playing, and then move to a different table and meet and play when we felt like it. The one I spent all my time on, trying to come up with a strategy? One that I know will get my kids in competitive mode? Poster 5:


I liked getting to know people and I liked these problems! At TMC we were given poster 14 and I became obsessed. And eventually, I solved it (and a second more complicated one). But it took A LONG TIME and I DIDN’T CARE. I refused to go play boardgames at gamenite until I had climbed this mountain!

I need to brainstorm if and how I am going to use these in my school. Some initial ideas:

1. Leave copies of these in the library for kids to use. Or put many copies of all of them on a bulletin board for kids to take, so when they’re bored and standing there, they just grab one and start thinking.

2. Use these when I need to fill a long block (we have double periods one out of every five times we meet our kids) and I don’t have a good idea.

3. Plan an Upper School math night, where we gather at a space in the school, do math, order pizza. Like PCMI’s “pizza and math” (was that what it was called? we can do better!). These can be the amuse bouche or the main event!

Math Art!

Speaking of recreational math, at TMC17 there was so much math art. I just wanted to share some of it!

Captivating! I hope at some point to learn how to make crochet coral. It feels like once I get in the rhythm, it could be so soothing. Actually, I wonder if it would be fun to have a MAKER MATH club where we make math stuff together. And create our own math art gallery. Things like the things shown here, but also like these, and origami (Demaine and Lang), and a Menger sponge made of business cards, and design and 3d print these optical illusions, and carefully color in pictures from Patterns of the Universe, and create our own mathart coloring pages. If you are reading this and have ideas of things that we could make, let me know in the comments! You probably can tell this is something I’m actually totally *feeling*. (FYI, for me, the definitive math art page is @mathhombre’s page here.)

How To Adult: Let’s Buy A House

So @rawrdimus gave a my favorite on how to adult. He was teaching calculus and wanted to keep his seniors engaged. So he came up with this project that had kids pick a few houses and figure out what they’d need to buy it. He was the banker (a hilarious banker) and gave them two different mortgage options (a 15 year and a 30 year, with different interest rates) and they had to figure out their monthly payments.
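To get a feel for the kind of arithmetic kids would be doing, here is a sketch using the standard fixed-payment amortization formula. The loan amount and the two interest rates are numbers I made up; they are not from the presentation, and I don’t know whether his project used exactly this formula.

```python
def monthly_payment(principal, annual_rate, years):
    """Fixed monthly payment for a fully amortized loan."""
    r = annual_rate / 12          # monthly interest rate
    n = years * 12                # number of payments
    if r == 0:
        return principal / n
    return principal * r * (1 + r) ** n / ((1 + r) ** n - 1)

# Hypothetical numbers (assumed for illustration only).
loan = 240_000
options = {"15 year": (0.035, 15), "30 year": (0.040, 30)}

for name, (rate, years) in options.items():
    payment = monthly_payment(loan, rate, years)
    total = payment * years * 12
    print(f"{name}: {payment:.2f} per month, {total:.2f} paid overall")
```

The shorter loan costs more each month but less overall, which is the tradeoff the two mortgage options are meant to surface.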

I know come the spring, the kids in my calculus class will have their attention wane. So I think something like this could work (this investigation on wealth inequality worked a few years ago)! But right now it’s a little bit like trying to put a square peg into a round hole. I need it to have some more calculus before I do something like this though. Maybe we’ll spend some time talking about e or we’ll do something with summing (in)finite geometric series, and maybe seeing that as a riemann sum? I think it’s totally doable — I just need to think a bit more! But if you want to get a sense of why I’m trying to make this happen, just watch Jonathan’s presentation and you’ll totally get it. (Here’s his blogpost.)

The Formal Definition of the Derivative, or Why Holes Matter

Lucky you! Two calculus posts in one day. Mainly because I don’t want some of these ideas to disappear in my hiatus from teaching it. This one deals with our favorite topic: the formal definition of the derivative.

\lim_{h \rightarrow 0}\frac{f(x+h)-f(x)}{(x+h)-(x)}

I see that expression and my mind goes to the following places:

  • Doing a bunch of tedious algebraic calculations for a particular function in order to find the derivative.
  • I “see” in the expression the slope of two points close together.
  • I envision the following image, showing a secant line turning into a tangent line

And I think many teachers and most calculus students think something similar.

However I asked my (non-AP) calculus kids what the h stood for. Out of two sections of kids, I think only one or two kids got it with minimal prompting. (Eventually I worked on getting the rest to understand, and I think I did a decent job.) I dare you to ask your kids and see what you get as a response.

What I suspect is that kids get told the meaning of \lim_{h \rightarrow 0}\frac{f(x+h)-f(x)}{(x+h)-(x)} and it gets drilled into their heads so thoroughly that they might not fully understand what is going on with it algebraically.

It was only a few years ago that I came to the conclusion that even I myself didn’t understand it. And when I finally thought it all through, I came to the conclusion that all of differential calculus is based on the question: how do you find the height of a hole? I started seeing holes as the lynchpin to a conceptual understanding of derivatives. I never got to fully exploit this idea in my classes, but I did start doing it. It felt good to dig deep.

The big thing I realized is that I rarely looked at the formal definition of the derivative as an equation. I almost always looked at it as an expression. But if it’s an equation…

f'(x)=\lim_{h \rightarrow 0}\frac{f(x+h)-f(x)}{(x+h)-(x)}

… what is it an equation of? An equation with a limit as part of it?! Let’s ignore the limit for now.

Without the limit, we have an average rate of change function, between (x,f(x)) and (x+h,f(x+h)). And since we have removed the limit, we really have a function of two variables:

AvgRateOfChange(x,h)=\frac{f(x+h)-f(x)}{(x+h)-(x)}

We feed an x and h into the function, and we get an output of a slope! It’s the slope between (x,f(x)) and (x+h,f(x+h))!
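As a sketch, that two-input function is only a couple of lines of Python. I’m using sin(x) as a stand-in for f because the applet’s own function isn’t given here, so treat that choice as an assumption.

```python
import math

# Hypothetical stand-in for the function graphed in the applet (an assumption).
def f(x):
    return math.sin(x)

def avg_rate_of_change(x, h):
    """Slope of the secant line between (x, f(x)) and (x + h, f(x + h))."""
    return (f(x + h) - f(x)) / ((x + h) - x)

# Feed in an x and an h, get back a slope.
print(avg_rate_of_change(1.64, 2.0))
print(avg_rate_of_change(1.64, -3.0))
```

Holding x fixed and letting h vary gives a one-variable function of h, and that is the graph to stare at.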

Let’s get concrete. Check out this applet (click the image to have it open up):


On the left is the original function. We are going to calculate the “average rate of change function” with an x-input of 1.64 (the x-value the applet opens up with). We are now going to vary h and see what our average rate of change function looks like: AvgRateOfChange(1.64,h)=\frac{f(1.64+h)-f(1.64)}{h}. That’s what the yellow point is.

Before varying h, notice in the image when h is a little above 2, the yellow “Average Rate of Change” dot is negative. That’s because the slope of the secant line between the original point (1.64, f(1.64)) and a second point on the function that is a little over 2 units to the right is negative. (Look at the secant line on the graph on the left!)

Now let’s change h. Drag the point on the right graph that says “h value.” As you drag it, you’ll see the second point on the function move, and also the yellow point will change with the corresponding new slope. As you drag h, you’re populating points on the right hand graph. What’s being drawn on the right hand graph is the average rate of change graph for all these various distances h!

Here’s an image of what it looks like after you drag h for a bit.


Notice now when our h-value is almost -3 (so the second point is 3 horizontal units left of the original point of interest), we have a positive slope for the secant line… a positive average rate of change.

The left graph is an x-f(x) graph (those are the axes). The right graph is an h-AvgRateOfChange graph (those are the axes).

Okay okay, this is all well and dandy. But who cares?


We may have generated an average rate of change function, but we wanted a derivative function. That is what we get when h approaches 0. We want to examine our average rate of change graph near where h is 0. Recall the horizontal axis is the h-axis on the right graph. So when h is close to 0, we’re looking at the vertical axis… Let’s look…

Oh dear, missing points! Why? Let’s drag the h value to exactly h=0.


The yellow average rate of change point disappeared. And it says the average rate of change is undefined! 0/0. We have a hole! Why?

(When h=0 exactly, our average rate of change function is: \frac{f(x+0)-f(x)}{(x+0)-x} which is 0/0. YIKES!)

But the height of the hole is precisely the value of the derivative. Because remember the derivative is what happens as h gets super duper infinitely close to 0.

We can drag h to be close to 0. Here h is 0.02.

But that is not infinitely close. So this is a good approximation. But it isn’t perfect.
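You can watch the approximation tighten numerically. This sketch again assumes sin(x) as a stand-in for the applet’s function; it prints the average rate of change at x=1.64 for shrinking h on both sides of the hole, and then tries h=0 itself.

```python
import math

# Hypothetical stand-in for the applet's function (an assumption).
def f(x):
    return math.sin(x)

def avg_rate_of_change(x, h):
    return (f(x + h) - f(x)) / h

x = 1.64
for h in [0.02, 0.002, 0.0002, -0.0002, -0.002, -0.02]:
    print(h, avg_rate_of_change(x, h))

# At h = 0 exactly there is no value at all: 0/0.
try:
    avg_rate_of_change(x, 0)
except ZeroDivisionError:
    print("undefined at h = 0: this is the hole")

# For this stand-in, the height of the hole is cos(1.64).
exact = math.cos(x)
print(exact)
```

The printed slopes crowd around one number from both sides, but no finite h ever lands on it.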

And this is why I have concluded that all of differential calculus actually reduces to the problem of finding the height of a hole. 

Here are three different average rate of change applets that you might find fun to play with:

one (this is the one above)     two     three

In short (now that you’ve made it this far):

  • Look at the formal definition of the derivative as an equation, not an expression. It yields a function.
  • What kind of function does it indicate? An average rate of change function. And in fact, thinking deeply, it actually forces you to create a function with two inputs: an x-value and an h-value.
  • Now to make it a derivative, and not an average rate of change, you need to bring h close to 0.
  • As you do this, you will see you create a new function, but with a hole at h=0.
  • It is the height of this hole that is the derivative.

PS. A random thought… This could be useful in a multivariable calculus course. Let’s look at the average rate of change function for f(x)=x^2:

AvgRateOfChange(x,h)=\frac{(x+h)^2-x^2}{h}

Let’s convert this to a more traditional form:

z=\frac{(x+y)^2-x^2}{y}

Now we have a function of two variables. We want to find what happens as h (I mean y) gets closer and closer to 0 for a given x-value. So to do this, we can just visually look at what happens to the function near y=0. Even though the function will be undefined at all points where y=0, visually the intersection of the plane y=0 and the average rate of change function should carve out the derivative function.

If this doesn’t make sense, I did some quick graphs on WinPlot…

This is for f(x)=x^2. And I graphed the plane where y=0. We should get the intersection to look like the line f'(x)=2x.


Yup. Cool.
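The picture can be backed up with a line of algebra. Writing the average rate of change surface for f(x)=x^2 with y playing the role of h:

```latex
z = \frac{(x+y)^2 - x^2}{y}
  = \frac{2xy + y^2}{y}
  = 2x + y \qquad (y \neq 0)
```

So away from y=0 the surface is just the plane z=2x+y with a slit cut out of it, and as y gets close to 0 the heights approach z=2x.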

I did it for f(x)=\sin(x) also… The intersection should look like f'(x)=\cos(x).


u-substitution, visually

I created some calculus Geogebra applet thingies last summer that I wanted to use last year. Alas, time ran short and we never got to use them. However since I’m no longer teaching calculus (at least not next year), I figured I’d throw them up in case anyone else out there finds them useful.

They deal with u-substitution. I’ve always had a problem with teaching it. Here’s how it goes… You have some integral in terms of x. You convert all the xs and dxs into us and dus. And voilà! It works out. It’s very powerful. And it’s procedural. And kids have throughout the years learned this “substitution”-y thing works [1]. So kids tend to like it.

But here’s the thing. For my kids, it’s just a random method to evaluate an integral. They don’t conceptually understand what is going on… what this changing of variables is doing.

When I thought deeply about this, I realized what truly is happening is that we are transforming space… From the x-f(x) plane to a much more convoluted u-f(u) plane. But it is our particular choice of u that makes the change in space beautiful, because it takes something that looks particularly nasty and converts it into something that looks rather nice. Ish.

Here is a screenshot from one of my geogebra applets illustrating this (you can click on the screenshot to be taken to the applet):


We start with a pretty ugly function that we’re integrating. But by using this substitution to morph space, we end up with a much nicer function. I mean, throw both of these up and ask your kids — which one of these would they rather find the integral of. They’ll say the one on the right! The u-substitution one. Although not perfect [2], it’s pretty kewl.

The applets are here:

One     Two     Three     Four     Five

And the applets are dynamic! You can change the lower and upper bounds on the x-f(x) graphs and the lower and upper bounds automatically change on the u-f(u) graph! But because math is awesome, the areas are preserved!
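If you want to see the area preservation without the applets, here is a sketch with a made-up example (the integrand 2x·cos(x^2) with u=x^2 is my choice, not one of the five applets). It converts the x-bounds to u-bounds and estimates both areas with a plain midpoint Riemann sum.

```python
import math

def midpoint_area(func, lower, upper, n=10_000):
    """Signed area under func from lower to upper via a midpoint Riemann sum."""
    width = (upper - lower) / n
    return sum(func(lower + (k + 0.5) * width) for k in range(n)) * width

# Hypothetical example (assumed): integrate 2x*cos(x^2) using u = x^2.
def x_integrand(x):
    return 2 * x * math.cos(x ** 2)

def u_of_x(x):
    return x ** 2

def u_integrand(u):
    return math.cos(u)

x_lower, x_upper = 0.5, 1.5
u_lower, u_upper = u_of_x(x_lower), u_of_x(x_upper)   # bounds move with the substitution

area_x = midpoint_area(x_integrand, x_lower, x_upper)
area_u = midpoint_area(u_integrand, u_lower, u_upper)
print(u_lower, u_upper)
print(area_x, area_u)
```

The two graphs look nothing alike and the bounds are different, yet the two areas agree to many decimal places.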

Some things I maybe would have done with the applets in my class:

  • Let kids play with the applets and get familiar with them.
  • For the first applet (starting simple), have kids count the boxes and estimate the area on one graph, and then do it on the other (careful though! the gridlines are different on the two graphs!). Whoa, they are always the same!
  • For the first applet (again, starting simple), ask them to drag the upper limit to the left of the lower limit. Explain what happens and why.
  • The second applet is my favorite! Put the lower limit at x=0. Drag the upper limit to the right. Explain what is happening graphically — and then tie that graphical understanding to the particular u-substitution chosen.
  • In the second applet, can students find three different sets of bounds which give a signed area of 0?
  • In the fourth applet, have students put the lower and upper bounds on x=6 and x=7. Have them calculate the average height of that function in that interval (the area is given!). Do they have visual confirmation of this average height for this interval? Now looking at the u-graph, the bounds are u=8 and u=10. Have them estimate the average height of that function in that interval (again, the area is given)! (The average height “halves” in order to compensate for the wider interval. It has to, since the areas must be the same.) Have students do this again for any lower and upper bounds for this graph. It will always work!
  • In the fifth applet, have students put the lower bound at x=0, and have them drag the upper bound to the right. What can they conclude about the areas of each of the pink regions on the x-f(x) graph? (Alternatively, you can ask: you can see from the u-f(u) graph that the signed area on the original graph will never get bigger than 1, no matter what bounds you choose. Try it! It is impossible! Armed with that information, what can you conclude about the pink regions in the original graph?)

I’m confident I had more ideas about how to use these when I made them [3]. But it was over a year ago and I haven’t really thought of them since. But anyway, I hope they are of some use to you. Even if you just show them to your kids cursorily to illustrate what graphically is going on when you are doing u-substitution. 


[1] Though I bet if you asked a class why they can use “substitution” when solving a system of equations, what the reasoning is behind this method, they might draw a bit of a blank… But that’s neither here nor there…

[2] What would actually be perfect would be a copy of individual Riemann Sum rectangles from the x-f(x) graph “leaving” the first graph, then in front of the viewer stretching/shrinking their height and width for the appropriate u-f(u) graph, and then floating over to the u-f(u) graph and placing itself at the appropriate place on the u axis. And then a second rectangle does that. And a third. And a fourth. You get the picture. But even though the height and width morph, the area of the original rectangle and the area of the new rectangle will be the same (or to be technical, very very close to the same, since we’re just doing approximations). In this sort of applet, you’d see the actual morphing. That’s what is hidden in my applets above. But that’s actually where the magic happens!

[3] I recall now I was going to make kids do some stuff by hand. For example: before they use the applets, kids would be given lower and upper x-bounds, and asked to calculate lower and upper u bounds. And then use the applets to confirm. Similarly, given lower and upper u-bounds, calculate lower and upper x-bounds. Use the applets to confirm.

An unformed idea to teach understanding to the chain rule

I’m soon going to embark on teaching the chain rule in calculus. I have found ways to help kids remember the chain rule (“the outer function is the mama, the inner function is the baby… when you take the derivative, you derive the mama and leave the baby inside, and then you multiply by the derivative of baby”), ways to write things down so their information stays organized, and I have shown them enough patterns to let them see it’s true. But I have never yet found a way to conceptually get them to understand it without confusing them. (The gear thing doesn’t help me get it… Although I understand the analogy, it feels divorced from the actual functions themselves… and these functions have a constant rate of change.)

I think I now have a way that might help students conceptually understand what’s going on. I only had the insight 10 minutes ago so I’m going to use this blogpost to see if I can’t get the ideas straight in my head… The point of this post is not to share a way I’ve made the chain rule understandable. It’s for me to work through some unformed ideas. I am not yet sure if I have a way to turn this into something that my kids will understand.

So here’s where I’m starting from. Every “nice” function (and those are the functions we’re dealing with) is basically like an infinite number of little line segments connected together. Thus, when we take a derivative, we’re pretty much just asking “what’s the slope of the little line segment at x=3?” for example.

Now here’s the magic. In my class, we’ve learned that whatever transformations a function undergoes, the tangent line undergoes the same transformations! If you want to see that, you can check it out here.

For a quick example, let’s look at f(x)=\sin{x} and g(x)=2\sin{(5x)}+1.

We see that g(x) is secretly f(x) which has undergone a vertical stretch of 2, a horizontal shrink of 1/5, and has been moved up 1.

Let’s look at the tangent line to f(x) at x=\pi/3. It is approximately y=0.5x+0.34.


Now let’s put that tangent line through the transformations:


Vertical Stretch of 2: y=2(0.5x+0.34)=x+0.68

Horizontal shrink of 1/5: y=5x+0.68

Shift up 1: y=5x+1.68

Now let’s plot g(x) and our transmogrified tangent line:


Yay! It worked! (But of course we knew that would happen.)

The whole point of this is to show that tangent lines undergo the same transformations as the functions — because the functions themselves are pretty much just a bunch of these infinitely tiny tangent line segments all connected together! So it would actually be weird if the tangent lines didn’t behave like the functions.
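Here is a quick numerical check of the sine example, as a sketch. It pushes the tangent line to sin(x) at x=\pi/3 through the three transformations and compares the result with the tangent line to g(x)=2\sin(5x)+1 computed directly at the shrunken point x=\pi/15.

```python
import math

x0 = math.pi / 3

# Tangent line to f(x) = sin(x) at x0, written as y = slope*x + intercept.
slope = math.cos(x0)
intercept = math.sin(x0) - slope * x0

# Vertical stretch of 2: multiply the outputs by 2.
slope, intercept = 2 * slope, 2 * intercept
# Horizontal shrink of 1/5: replace x with 5x.
slope = 5 * slope
# Shift up 1.
intercept = intercept + 1

# Tangent line to g(x) = 2*sin(5x) + 1 at x0/5, found directly from g.
x1 = x0 / 5
g_slope = 10 * math.cos(5 * x1)
g_intercept = (2 * math.sin(5 * x1) + 1) - g_slope * x1

print(round(slope, 2), round(intercept, 2))
print(round(g_slope, 2), round(g_intercept, 2))
```

Both lines come out with slope 5 and intercept about 1.68. Note that the point of tangency moves too: the horizontal shrink sends x=\pi/3 to x=\pi/15.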

My Thought For Using This for The Chain Rule

So why not look at function composition in the same way?

We can look at a composition of functions at a point as simply a composition of these little line segments. 

Let’s see if I can’t clear this up by making it concrete with an example.

Let’s look at m(x)=\sqrt{x^3+1}.

And so we can be super concrete, let’s try to find m'(2), which is simply the slope of the tangent line of m(x) at x=2.

I’m going to argue that just as \sqrt{x} and x^3+1 are composed to get our final function, we can compose the tangent lines to these two functions to get the final tangent line at x=2.

Let’s start with the x^3+1. At x=2, the tangent line is y_{inner}=12x-15 (I’m not showing the work, but you can trust me that it’s true, or work it out yourself.)

Now let’s look at the square root function. We have to be thoughtful about this. We are dealing with m(2), which really means that we’re taking the square root of 9. So we want the tangent line to \sqrt{x} at x=9. That turns out to be (again, trust me?): y_{outer}=\frac{1}{6}x+\frac{3}{2}.

So now we have our two line segments.

We have to compose them:

y_{composed}=\frac{1}{6}(12x-15)+\frac{3}{2}

This simplifies to:

y_{composed}=2x-1

Let’s look at a graph of m(x) and our tangent line:



Where did we ultimately get the slope of 2 from? When we composed the two lines together, we multiplied the slope of the inner function (12) by the slope of the outer function (1/6). And that became our new line’s slope.

Chain rule!
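A sketch to double check the example: compose the two tangent lines as Python functions, read off the slope and intercept of the result, and compare with a numerical derivative of m(x)=\sqrt{x^3+1} at x=2. The helper names are mine.

```python
import math

def inner(x):
    return x ** 3 + 1

def outer(u):
    return math.sqrt(u)

def m(x):
    return outer(inner(x))

def y_inner(x):
    return 12 * x - 15      # tangent line to x^3 + 1 at x = 2

def y_outer(u):
    return u / 6 + 3 / 2    # tangent line to sqrt(u) at u = 9

def y_composed(x):
    return y_outer(y_inner(x))

# A line is pinned down by two of its points.
intercept = y_composed(0.0)
slope = y_composed(1.0) - y_composed(0.0)

# Symmetric difference quotient for m'(2), as an independent check.
h = 1e-6
numeric = (m(2 + h) - m(2 - h)) / (2 * h)

print(slope, intercept, numeric)
print(y_composed(2.0), m(2.0))  # the composed line touches m at x = 2
```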

How we generalize this to the chain rule

For any composition of functions, we are going to have an inner and an outer function. Let’s write c(x)=o(i(x)) where we can clearly remember which one is the inner and which one is the outer function. Let’s pick a point x_0 where we want to find the derivative.

We are going to have to find the little line segment of the inner function at x_0 and compose that with the little line segment of the outer function at i(x_0), since i(x_0) is the input the outer function actually receives. That will approximate the function c(x) at x_0.

The line segment of the inner function is going to be y_{inner}=i'(x_0)x+blah1

The line segment of the outer function is going to be y_{outer}=o'(i(x_0))x+blah2

I am going to leave those terms as blah1 and blah2 because we won’t really need them. Let’s remember we only want the derivative (the slope of the tangent line), not the tangent line itself. So our task becomes easier.

Let’s compose them: y_{composed}=o'(i(x_0))[i'(x_0)x+blah1]+blah2

This simplifies to y_{composed}=o'(i(x_0))i'(x_0)x+blah3

And since we only want the slope of this line (the derivative is the slope of the tangent line, remember), we have:

c'(x_0)=o'(i(x_0))i'(x_0)

Of course we chose an arbitrary point x_0 to take the derivative at. So we really have:

c'(x)=o'(i(x))i'(x)

Which is the chain rule.
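If you don’t want to take the blah terms on faith, here is the same composition with the tangent lines written in point-slope form. Nothing new is assumed beyond c(x)=o(i(x)) and the point x_0:

```latex
\begin{aligned}
y_{inner} &= i'(x_0)\,(x - x_0) + i(x_0) \\
y_{outer} &= o'(i(x_0))\,\bigl(x - i(x_0)\bigr) + o(i(x_0)) \\
y_{composed} &= o'(i(x_0))\,\bigl[\,i'(x_0)(x - x_0) + i(x_0) - i(x_0)\,\bigr] + o(i(x_0)) \\
             &= o'(i(x_0))\,i'(x_0)\,(x - x_0) + o(i(x_0))
\end{aligned}
```

The i(x_0) terms cancel, so the composed line passes through (x_0, c(x_0)) with slope o'(i(x_0))i'(x_0), which is what a tangent line to c at x_0 should do.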

I got rid* of Limits in Calculus (*almost entirely)

I’ve been meaning to write this post for a while. I teach non-AP Calculus. My goal in this course is to get my kids to understand calculus with depth — that means my primary focus is on conceptual understanding, where facility with fancy-algebra things is secondary. Now don’t go thinking my kids come out of calculus not knowing how to do real calculus. They do. It is just that I pare things down so that they don’t have to find the derivatives of things like y=\cot(x). Why? Because even though I could teach them that (and I have in the past), I would rather spend my time doing less work on moving through algebraic hoops, and more work on deep conceptual understanding.

Everything I do in my course aims for this. Sometimes I succeed. Sometimes I fail. But I don’t lose sight of my goal.

Each year, I have parts of the calculus curriculum I rethink, or have insights on. In the past few years, I’ve done a lot of thinking about limits and where they fit in the big picture of things. Each year, they lose more and more value in my mind. I used to spend a quarter of a year on them. In more recent years, I spent maybe a sixth of a year on them. And this year, I’ve reduced the time I spend on limits to about 5 minutes.*

*Okay, not really. But kinda. I’ll explain.

First I’ll explain my reasoning behind this decision. Then I’ll explain how I did it.

Reasoning Behind My Decision to Eliminate Limits

For me, calculus has two major parts: the idea of the derivative, and the idea of the integral.

Limits show up in both [1]. But where do they show up in derivatives?

  • when you use the formal definition of the derivative

and… that’s pretty much it. And where do they show up in integrals?

  • when you say you are taking the sum of an infinite number of infinitely thin rectangles

and… that’s pretty much it. I figure if that’s all I need limits for, I can target how I introduce and use limits to really focus on those things. Do I really need them to understand limits at infinity of rational functions? Or limits of piecewise functions? Or limits of things like y=\sin(1/x) as x\rightarrow 0?

Nope. And this way I’m not wasting a whole quarter (or even half a quarter) with such a simple idea. All I really need — at least for derivatives — is how to find the limit as one single variable goes to 0. C’est tout!

How I did it

This was our trajectory:

(1) Students talked about average rate of change.

(2) Students talked about the idea of instantaneous rate of change. They saw it was problematic, because how can something be changing at an instant? If you say you’re travelling “58 mph at 2:03pm,” what exactly does that mean? There is no time interval for this 58 mph to pop out of, since we’re talking about an instant, a single moment in time (2:03pm). So we problematized the idea of instantaneous rate of change. But we also recognized that instantaneous rates of change do exist, because we believe the speedometers in our cars when they say 60 mph. So we have something that feels philosophically impossible but in our guts and everyday experience feels right. Good. We have a problem we need to resolve. What might an instantaneous rate of change mean? Is it an oxymoron to have a rate of change at an instant?

(3) Students came to understand that we could approximate the instantaneous rate of change by taking the slope of two points really really really close to each other on a function. And the closer that we got, the better our approximation was. (Understanding why we got a better and better approximation was quite hard conceptual work.) Similarly students began to recognize graphically that the slope of two points really close to each other is actually almost the slope of the tangent line to the function.
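Here is a small sketch of what that looks like with actual numbers. The function f(x)=x^2, the point x=3, the list of gaps, and the helper name `secant_slope` are all my choices for illustration. It just computes the slope between (3, f(3)) and a second point that keeps sliding closer.

```python
def f(x):
    return x ** 2

def secant_slope(func, x1, x2):
    """Average rate of change of func between x1 and x2 (requires x1 != x2)."""
    return (func(x2) - func(x1)) / (x2 - x1)

# How far the second point sits to the right of x = 3 (example values).
gaps = [1, 0.1, 0.01, 0.001, 0.0001]

slopes = [secant_slope(f, 3, 3 + gap) for gap in gaps]

for gap, slope in zip(gaps, slopes):
    print(f"second point at x = {3 + gap}: slope = {slope}")
```

The printed slopes should creep toward a single value as the gap shrinks, which is the "better and better approximation" idea. Notice that no gap in the list is 0, because then there would be no second point to take a slope with.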

(4) Now we wanted to know if we could make things exact. We knew we could make things exact if we could bring the two points infinitely close to each other. But each time we tried that, we either got two points pretty close to each other or two points lying directly on top of each other (and you can’t find the slope between a point and itself). So we still had a problem.

And this is where I brought in the idea of a new variable, and eventually, limits.

We encountered the question: “what is the exact instantaneous rate of change for f(x)=x^2 at x=3?”

We started by picking two points close to each other: (3,9) and (3+h,(3+h)^2)

This was the hardest thing for students to understand. Why would we introduce this extra variable h? But we talked about how (3.0001,3.0001^2) wasn’t a good second point, and how (3.0000001,3.0000001^2) also wasn’t a good second point. And if they trusted me on using this variable thingie, they would see how our problems would be resolved.

We then found the average rate of change between the two points, recognizing that the second point could be really far away from the first point if h were a large positive or negative number… or close to the first point if h were close to 0.

Yes, students had to first understand that h could be any number. And they had to come to the understanding that h represented where the second point was in relation to the first point (more specifically: how far horizontally the second point was from the first point).

And so we found the average rate of change between the two points to be:

AvgRateOfChange=\frac{(3+h)^2-9}{(3+h)-3}

We then said: how can we make this exact? How can we bring the two points infinitely close to each other? Ahhh, yes, by letting h get infinitely close to 0.

And so I introduce the idea of the limit as such:

If I have \lim_{h\rightarrow 0} blah, it means what blah gets infinitely close to if h gets infinitely close to 0 but is not equal to 0. That last part is key. And honestly, that’s pretty much the entirety of my explanation about limits. So that’s the 5 minutes I spend talking about limits.

So to find the instantaneous rate of change, we simply have:

InstRateOfChange=\lim_{h\rightarrow0} \frac{(3+h)^2-9}{(3+h)-3}

This is simply the slope between two points which have been brought infinitely close together. Yes, that’s what limits do for you.

And then we simplify:

InstRateOfChange=\lim_{h\rightarrow0} \frac{9+6h+h^2-9}{h}

InstRateOfChange=\lim_{h\rightarrow0} \frac{6h+h^2}{h}

InstRateOfChange=\lim_{h\rightarrow0} \frac{h(6+h)}{h}

InstRateOfChange=\lim_{h\rightarrow0} \frac{h}{h} \frac{(6+h)}{1}

Now because we know that h is close to 0, but not equal to 0, we can say with confidence that \frac{h}{h}=1. Thus we can say:

InstRateOfChange=\lim_{h\rightarrow 0} (6+h)

And now as h goes to 0, we see that 6+h gets infinitely close to 6.
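A quick sketch to make the "close to 0 but not equal to 0" part concrete. The function name `average_rate_of_change` and the sample values of h are my own, just for illustration. For any nonzero h, the unsimplified slope formula and the simplified 6+h should agree (up to tiny rounding), while plugging in h=0 directly breaks the unsimplified formula.

```python
def average_rate_of_change(h):
    """Slope between (3, 9) and (3+h, (3+h)^2). Not defined when h is 0."""
    return ((3 + h) ** 2 - 9) / ((3 + h) - 3)

# Nonzero values of h, some far from 0 and some close (example values).
for h in [0.5, -0.5, 0.01, -0.01, 0.0001]:
    print(f"h = {h}: slope = {average_rate_of_change(h)}, 6 + h = {6 + h}")

# With h exactly 0 the two points sit on top of each other,
# and the formula asks us to divide 0 by 0.
try:
    average_rate_of_change(0)
    h_equals_zero_works = True
except ZeroDivisionError:
    h_equals_zero_works = False

print("can we plug in h = 0 directly?", h_equals_zero_works)
```

This is why the limit matters: the simplified expression 6+h tells us where the slopes are heading, even though the original slope formula has nothing to say at h=0 itself.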

Done. (Here’s a do now I did in class.)

We did this again and again to find the instantaneous rate of change of various functions at various points. For example, functions like:

f(x)=x^3-2x+1 at x=1

g(x)=\sqrt{2-3x} at x=-2

h(x)=\frac{5}{2-x} at x=1

For these, the algebra got more gross, but the idea and the reasoning were the same in every problem. Notice that to do all of these, you don’t need any more knowledge of limits than what I outlined above with that single example. You need to know why you can “remove” the \frac{h}{h} (why it is allowed to be “cancelled” out), and then what happens as h goes to 0. That’s all.
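If you want a sanity check on the algebra for those three problems, here is a rough numerical sketch. It does not replace the limit work. It just uses a very small nonzero h to estimate each slope. The helper name `slope_with_small_h` and the step size are my assumptions, and I renamed the third function to `h_func` in the code so it doesn't collide with the variable h.

```python
import math

def f(x):
    return x ** 3 - 2 * x + 1

def g(x):
    return math.sqrt(2 - 3 * x)

def h_func(x):
    # The third function from the list, renamed to avoid clashing with the step h.
    return 5 / (2 - x)

def slope_with_small_h(func, a, h=1e-6):
    """Slope between (a, func(a)) and (a+h, func(a+h)) for a small, nonzero h.
    This is an estimate of the instantaneous rate of change, not the exact value."""
    return (func(a + h) - func(a)) / h

estimates = {
    "f at x=1": slope_with_small_h(f, 1),
    "g at x=-2": slope_with_small_h(g, -2),
    "h at x=1": slope_with_small_h(h_func, 1),
}

for label, value in estimates.items():
    print(label, value)
```

By my own hand calculation with the limit process, the exact answers are 1, -\frac{3}{4\sqrt{2}} (about -0.53), and 5, so the estimates should land very near those.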

Yup, again, notice I only needed to rely on this very basic understanding of limits to solve these three problems algebraically: \lim_{h\rightarrow 0} blah means what blah gets infinitely close to if h gets infinitely close to 0 but is not equal to 0. 

(5) Eventually we generalize to find the instantaneous rate of change at any point, using the exact same process and understanding. At this point, the only difference is that the algebra gets slightly more challenging to keep track of. But not really that much more challenging.
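As an illustration of what that generalizing looks like (this is my own worked version, using the same f(x)=x^2 from the earlier example, and not necessarily the exact problem from class), the second point becomes (x+h,(x+h)^2) and every step mirrors the x=3 case:

```latex
\begin{aligned}
\text{InstRateOfChange at } x
  &= \lim_{h\rightarrow 0} \frac{(x+h)^2-x^2}{(x+h)-x} \\
  &= \lim_{h\rightarrow 0} \frac{x^2+2xh+h^2-x^2}{h} \\
  &= \lim_{h\rightarrow 0} \frac{h(2x+h)}{h} \\
  &= \lim_{h\rightarrow 0} (2x+h) \\
  &= 2x
\end{aligned}
```

Plugging in x=3 gives 6, which matches the answer from the specific example above.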

(6) Finally, waaaay at the end, I say: “Surprise! The instantaneous rate of change has a fancy calculus word: derivative.”

Apologies in advance if any of this was unclear. I feel I didn’t explain things as well as I could have. I also want to point out that I understand if you don’t agree with this approach. We all have different thoughts about what we find important and why. I can make (and in fact, in the past, I have made) the case that going into depth into limits is of critical importance. I personally just don’t see things the same way anymore.

Now I should also say that there have been a few downsides to this approach, but on the whole it’s been working well for me so far. I would elaborate on the downsides but right now I’m just too exhausted. Night night!

[1] Okay, I should also note that limits show up in the definition for continuity. But since in my course I don’t really focus on “ugly” functions, I haven’t seen the need to really spend time on the idea of continuity except in the conceptual sense. Yes, I can ask my kids to draw the derivative of y=|x| and they will be able to. They will see there is a jump at x=0. I don’t need more than that.