The Formal Definition of the Derivative, or Why Holes Matter

Lucky you! Two calculus posts in one day. Mainly because I don’t want some of these ideas to disappear in my hiatus from teaching it. This one deals with our favorite topic: the formal definition of the derivative.

\lim_{h \rightarrow 0}\frac{f(x+h)-f(x)}{(x+h)-(x)}

I see that expression and my mind goes to the following places:

  • Doing a bunch of tedious algebraic calculations for a particular function in order to find the derivative.
  • I “see” in the expression the slope of two points close together.
  • I envision the following image, showing a secant line turning into a tangent line

And I think many teachers and most calculus students think something similar.

However, I asked my (non-AP) calculus kids what the h stood for. Out of two sections of kids, I think only one or two kids got it with minimal prompting. (Eventually I worked on getting the rest to understand, and I think I did a decent job.) I dare you to ask your kids and see what you get as a response.

What I suspect is that kids get told the meaning of \lim_{h \rightarrow 0}\frac{f(x+h)-f(x)}{(x+h)-(x)} and it gets drilled into their heads so thoroughly that they might not fully understand what is going on with it algebraically.

It was only a few years ago that I came to the conclusion that even I myself didn’t understand it. And when I finally thought it all through, I came to the conclusion that all of differential calculus is based on the question: how do you find the height of a hole? I started seeing holes as the lynchpin to a conceptual understanding of derivatives. I never got to fully exploit this idea in my classes, but I did start doing it. It felt good to dig deep.

The big thing I realized is that I rarely looked at the formal definition of the derivative as an equation. I almost always looked at it as an expression. But if it’s an equation…

f'(x)=\lim_{h \rightarrow 0}\frac{f(x+h)-f(x)}{(x+h)-(x)}

… what is it an equation of? An equation with a limit as part of it?! Let’s ignore the limit for now.

Without the limit, we have an average rate of change function, between (x,f(x)) and (x+h,f(x+h)). And since we have removed the limit, we really have a function of two variables:

AvgRateOfChange(x,h)=\frac{f(x+h)-f(x)}{(x+h)-(x)}


We feed an x and h into the function, and we get an output of a slope! It’s the slope between (x,f(x)) and (x+h,f(x+h))!
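If it helps to see the "two inputs, one slope" idea outside of an applet, here is a minimal Python sketch. The name avg_rate_of_change and the sample function x^2 are my own illustrative choices, not something taken from the applet.

```python
def avg_rate_of_change(f, x, h):
    """Slope of the secant line through (x, f(x)) and (x + h, f(x + h)).

    Assumes h != 0; at h == 0 the quotient is 0/0 and Python raises
    ZeroDivisionError.
    """
    return (f(x + h) - f(x)) / ((x + h) - x)


def square(x):
    # Illustrative function only; any function of one variable works.
    return x ** 2


# Two inputs in (an x and an h), one slope out.
slope_right = avg_rate_of_change(square, 1.0, 2.0)   # second point 2 units to the right
slope_left = avg_rate_of_change(square, 1.0, -3.0)   # second point 3 units to the left
print(slope_right, slope_left)
```

Holding x fixed and varying only h is exactly what dragging the "h value" point does.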

Let’s get concrete. Check out this applet (click the image to have it open up):


On the left is the original function. We are going to calculate the “average rate of change function” with an x-input of 1.64 (the x-value the applet opens up with). We are now going to vary h and see what our average rate of change function looks like: AvgRateOfChange(1.64,h)=\frac{f(1.64+h)-f(1.64)}{h}. That’s what the yellow point is.

Before varying h, notice in the image when h is a little above 2, the yellow “Average Rate of Change” dot is negative. That’s because the slope of the secant line between the original point (1.64, f(1.64)) and a second point on the function that is a little over 2 units to the right is negative. (Look at the secant line on the graph on the left!)

Now let’s change h. Drag the point on the right graph that says “h value.” As you drag it, you’ll see the second point on the function move, and also the yellow point will change with the corresponding new slope. As you drag h, you’re populating points on the right hand graph. What’s being drawn on the right hand graph is the average rate of change graph for all these various distances h!

Here’s an image of what it looks like after you drag h for a bit.


Notice now when our h-value is almost -3 (so the second point is 3 horizontal units left of the original point of interest), we have a positive slope for the secant line… a positive average rate of change.

The left graph is an x-f(x) graph (those are the axes). The right graph is an h-AvgRateOfChange graph (those are the axes).

Okay okay, this is all well and dandy. But who cares?


We may have generated an average rate of change function, but we wanted a derivative function. That is what we get when h approaches 0. We want to examine our average rate of change graph near where h is 0. Recall the horizontal axis is the h-axis on the right graph. So when h is close to 0, we’re looking at the vertical axis… Let’s look…

Oh dear, missing points! Why? Let’s drag the h value to exactly h=0.


The yellow average rate of change point disappeared. And it says the average rate of change is undefined! 0/0. We have a hole! Why?

(When h=0 exactly, our average rate of change function is \frac{f(x+0)-f(x)}{(x+0)-x}, which is 0/0. YIKES!)

But the height of the hole is precisely the value of the derivative. Because remember the derivative is what happens as h gets super duper infinitely close to 0.

We can drag h to be close to 0. Here h is 0.02.

But that is not infinitely close. So this is a good approximation. But it isn’t perfect.
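Here is a rough Python sketch of the same experiment. The applet's actual function isn't reproduced here, so I'm using sin(x) as a stand-in (that choice is an assumption); the x-value 1.64 is the one the applet opens with.

```python
import math


def avg_rate_of_change(f, x, h):
    # Secant slope between (x, f(x)) and (x + h, f(x + h)).
    return (f(x + h) - f(x)) / h


x = 1.64      # the x-value the applet opens with
f = math.sin  # stand-in function (an assumption); not the applet's own function

# At h = 0 exactly there is no value: this is the hole.
try:
    avg_rate_of_change(f, x, 0.0)
    hole_is_undefined = False
except ZeroDivisionError:
    hole_is_undefined = True

# Near h = 0 the values settle toward the height of the hole.
approximations = {h: avg_rate_of_change(f, x, h) for h in (0.02, 0.002, 0.0002)}
for h, slope in approximations.items():
    print(f"h = {h}: slope = {slope:.6f}")
print("cos(1.64) =", round(math.cos(x), 6))
```

Every finite h gives only an approximation, but the approximations crowd in on one number, and that number is the height of the hole.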

And this is why I have concluded that all of differential calculus actually reduces to the problem of finding the height of a hole. 

Here are three different average rate of change applets that you might find fun to play with:

one (this is the one above)     two     three

In short (now that you’ve made it this far):

  • Look at the formal definition of the derivative as an equation, not an expression. It yields a function.
  • What kind of function does it indicate? An average rate of change function. And in fact, thinking deeply, it actually forces you to create a function with two inputs: an x-value and an h-value.
  • Now to make it a derivative, and not an average rate of change, you need to bring h close to 0.
  • As you do this, you will see you create a new function, but with a hole at h=0.
  • It is the height of this hole that is the derivative.

 PS. A random thought… This could be useful in a multivariable calculus course. Let’s look at the average rate of change function for f(x)=x^2:


Let’s convert this to a more traditional form:


Now we have a function of two variables. We want to find what happens as h (I mean y) gets closer and closer to 0 for a given x-value. So to do this, we can just visually look at what happens to the function near y=0. Even though the function will be undefined at all points where y=0, visually the intersection of the plane y=0 and the average rate of change function should carve out the derivative function.
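For f(x)=x^2 the algebra bears this out. The formulas pictured in the original post aren't reproduced here, so treat this as my reconstruction, with z as an assumed name for the output and y standing in for h:

```latex
z = \frac{(x+y)^2 - x^2}{(x+y) - x}
  = \frac{2xy + y^2}{y}
  = 2x + y \qquad (y \neq 0)
```

So the surface is the plane z=2x+y with the line along y=0 missing, and the height along that missing line is 2x.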

If this doesn’t make sense, I did some quick graphs on WinPlot…

This is for f(x)=x^2. And I graphed the plane where y=0. We should get the intersection to look like the line f'(x)=2x.


Yup. Cool.

I did it for f(x)=\sin(x) also… The intersection should look like f'(x)=\cos(x).



u-substitution, visually

I created some calculus Geogebra applet thingies last summer that I wanted to use last year. Alas, time ran short and we never got to use them. However since I’m no longer teaching calculus (at least not next year), I figured I’d throw them up in case anyone else out there finds them useful.

They deal with u-substitution. I’ve always had a problem with teaching it. Here’s how it goes… You have some integral in terms of x. You convert all the xs and dxs into us and dus. And voilà! It works out. It’s very powerful. And it’s procedural. And kids have throughout the years learned this “substitution”-y thing works [1]. So kids tend to like it.

But here’s the thing. For my kids, it’s just a random method to evaluate an integral. They don’t conceptually understand what is going on… what this changing of variables is doing.

When I thought deeply about this, I realized what truly is happening is that we are transforming space… From the x-f(x) plane to a much more convoluted u-f(u) plane. But it is our particular choice of u that makes the change in space beautiful, because it turns something that looks particularly nasty into something that looks rather nice. Ish.

Here is a screenshot from one of my geogebra applets illustrating this (you can click on the screenshot to be taken to the applet):


We start with a pretty ugly function that we’re integrating. But by using this substitution to morph space, we end up with a much nicer function. I mean, throw both of these up and ask your kids: which one of these would they rather find the integral of? They’ll say the one on the right! The u-substitution one. Although not perfect [2], it’s pretty kewl.

The applets are here:

One     Two     Three     Four     Five

And the applets are dynamic! You can change the lower and upper bounds on the x-f(x) graphs and the lower and upper bounds automatically change on the u-f(u) graph! But because math is awesome, the areas are preserved!
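If you'd like to check the "areas are preserved" claim numerically, here is a small Python sketch. The integrand 2x\cos(x^2) with u=x^2 is a hypothetical example I picked for illustration; it is not necessarily one of the functions in the applets.

```python
import math


def midpoint_sum(f, lower, upper, n=100_000):
    # Midpoint Riemann sum; if upper < lower the width is negative,
    # so the signed area flips sign, just like the integral does.
    width = (upper - lower) / n
    return sum(f(lower + (k + 0.5) * width) for k in range(n)) * width


def integrand_x(x):
    # Hypothetical "ugly" integrand, chosen for illustration only.
    return 2 * x * math.cos(x ** 2)


def u_of(x):
    # The substitution u = x^2 that goes with the integrand above.
    return x ** 2


def integrand_u(u):
    # What the integrand becomes after substituting: cos(u) du.
    return math.cos(u)


x_lower, x_upper = 0.5, 2.0
area_x = midpoint_sum(integrand_x, x_lower, x_upper)
area_u = midpoint_sum(integrand_u, u_of(x_lower), u_of(x_upper))
print(area_x, area_u)
```

The bounds change (0.5 and 2 become 0.25 and 4) and the curve changes, but the two signed areas agree up to the approximation error of the sums.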

Some things I maybe would have done with the applets in my class:

  • Let kids play with the applets and get familiar with them.
  • For the first applet (starting simple), have kids count the boxes and estimate the area on one graph, and then do it on the other (careful though! the gridlines are different on the two graphs!). Whoa, they are always the same!
  • For the first applet (again, starting simple), ask them to drag the upper limit to the left of the lower limit. Explain what happens and why.
  • The second applet is my favorite! Put the lower limit at x=0. Drag the upper limit to the right. Explain what is happening graphically, and then tie that graphical understanding to the particular u-substitution chosen.
  • In the second applet, can students find three different sets of bounds which give a signed area of 0?
  • In the fourth applet, have students put the lower and upper bounds on x=6 and x=7. Have them calculate the average height of that function in that interval (the area is given!). Do they have visual confirmation of this average height for this interval? Now, looking at the u-graph, the bounds are u=8 and u=10. Have them estimate the average height of that function in that interval (again, the area is given)! (The average height “halves” in order to compensate for the wider interval. It has to, since the areas must be the same.) Have students do this again for any lower and upper bounds for this graph. It will always work!
  • In the fifth applet, have students put the lower bound at x=0, and have them drag the upper bound to the right. What can they conclude about the areas of each of the pink regions on the x-f(x) graph? (Alternatively, you can ask: you can see from the u-f(u) graph that the signed area on the original graph will never get bigger than 1, no matter what bounds you choose. Try it! It is impossible! Armed with that information, what can you conclude about the pink regions in the original graph?)

I’m confident I had more ideas about how to use these when I made them [3]. But it was over a year ago and I haven’t really thought of them since. But anyway, I hope they are of some use to you. Even if you just show them to your kids cursorily to illustrate what graphically is going on when you are doing u-substitution. 


[1] Though I bet if you asked a class why they can use “substitution” when solving a system of equations, what the reasoning is behind this method, they might draw a bit of a blank… But that’s neither here nor there…

[2] What would actually be perfect would be a copy of individual Riemann Sum rectangles from the x-f(x) graph “leaving” the first graph, then in front of the viewer stretching/shrinking their height and width for the appropriate u-f(u) graph, and then floating over to the u-f(u) graph and placing itself at the appropriate place on the u axis. And then a second rectangle does that. And a third. And a fourth. You get the picture. But even though the height and width morph, the area of the original rectangle and the area of the new rectangle will be the same (or to be technical, very very close to the same, since we’re just doing approximations). In this sort of applet, you’d see the actual morphing. That’s what is hidden in my applets above. But that’s actually where the magic happens!

[3] I recall now I was going to make kids do some stuff by hand. For example: before they use the applets, kids would be given lower and upper x-bounds, and asked to calculate lower and upper u bounds. And then use the applets to confirm. Similarly, given lower and upper u-bounds, calculate lower and upper x-bounds. Use the applets to confirm.

An unformed idea to teach understanding to the chain rule

I’m soon going to embark on teaching the chain rule in calculus. I have found ways to help kids remember the chain rule (“the outer function is the mama, the inner function is the baby… when you take the derivative, you derive the mama and leave the baby inside, and then you multiply by the derivative of baby”),  ways to write things down so their information stays organized, and I have shown them enough patterns to let them see it’s true. But I have never yet found a way to conceptually get them to understand it without confusing them. (The gear thing doesn’t help me get it… Although I understand the analogy, it feels divorced from the actual functions themselves… and these functions have a constant rate of change.)

I think I now have a way that might help students conceptually understand what’s going on. I only had the insight 10 minutes ago so I’m going to use this blogpost to see if I can’t get the ideas straight in my head… The point of this post is not to share a way I’ve made the chain rule understandable. It’s for me to work through some unformed ideas. I am not yet sure if I have a way to turn this into something that my kids will understand.

So here’s where I’m starting from. Every “nice” function (and those are the functions we’re dealing with) is basically like an infinite number of little line segments connected together. Thus, when we take a derivative, we’re pretty much just asking “what’s the slope of the little line segment at x=3?” for example.

Now here’s the magic. In my class, we’ve learned that whatever transformations a function undergoes, the tangent line undergoes the same transformations! If you want to see that, you can check it out here.

For a quick example, let’s look at f(x)=\sin{x} and g(x)=2\sin{(5x)}+1.

We see that g(x) is secretly f(x) which has undergone a vertical stretch of 2, a horizontal shrink of 1/5, and has been moved up 1.

Let’s look at the tangent line to f(x) at x=\pi/3. It is approximately y=0.5x+0.34.


Now let’s put that tangent line through the transformations:


Vertical Stretch of 2: y=2(0.5x+0.34)=x+0.68

Horizontal shrink of 1/5: y=5x+0.68

Shift up 1: y=5x+1.68

Now let’s plot g(x) and our transmogrified tangent line:


Yay! It worked! (But of course we knew that would happen.)
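For anyone who wants to double-check the arithmetic without graphing, here is a short Python sketch. The variable names are mine, and the tiny symmetric secant at the end is just a stand-in for "the slope of the tangent line" (an assumption on my part, used only as a check).

```python
import math

x0 = math.pi / 3

# Tangent line to f(x) = sin(x) at x0: slope cos(x0), through (x0, sin(x0)).
slope_f = math.cos(x0)
intercept_f = math.sin(x0) - slope_f * x0

# Put the line y = slope_f * x + intercept_f through g's transformations.
slope_g = 2 * slope_f * 5          # vertical stretch by 2, then x replaced by 5x
intercept_g = 2 * intercept_f + 1  # vertical stretch by 2, then shift up 1


def g(x):
    return 2 * math.sin(5 * x) + 1


# The point of tangency moves with the horizontal shrink: x0 becomes x0 / 5.
x1 = x0 / 5
line_at_x1 = slope_g * x1 + intercept_g

# Independent check on the slope of g at x1, using a tiny symmetric secant.
step = 1e-6
numeric_slope = (g(x1 + step) - g(x1 - step)) / (2 * step)

print(round(slope_g, 2), round(intercept_g, 2))
print(line_at_x1, g(x1), numeric_slope)
```

One thing the sketch makes explicit: the transformed line touches g(x) at x=\pi/15, not at x=\pi/3, because the horizontal shrink moves the point of tangency too.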

The whole point of this is to show that tangent lines undergo the same transformations as the functions — because the functions themselves are pretty much just a bunch of these infinitely tiny tangent line segments all connected together! So it would actually be weird if the tangent lines didn’t behave like the functions.

My Thought For Using This for The Chain Rule

So why not look at function composition in the same way?

We can look at a composition of functions at a point as simply a composition of these little line segments. 

Let’s see if I can’t clear this up by making it concrete with an example.

Let’s look at m(x)=\sqrt{x^3+1}.

And so we can be super concrete, let’s try to find m'(2), which is simply the slope of the tangent line of m(x) at x=2.

I’m going to argue that just as \sqrt{x} and x^3+1 are composed to get our final function, we can compose the tangent lines to these two functions to get the final tangent line at x=2.

Let’s start with the x^3+1. At x=2, the tangent line is y_{inner}=12x-15 (I’m not showing the work, but you can trust me that it’s true, or work it out yourself.)

Now let’s start with the square root function. We have to be thoughtful about this. We are dealing with m(2), which really means that we’re taking the square root of 9. So we want the tangent line to \sqrt{x} at x=9. That turns out to be (again, trust me?): y_{outer}=\frac{1}{6}x+\frac{3}{2}.

So now we have our two line segments.

We have to compose them:

y_{composed}=\frac{1}{6}(12x-15)+\frac{3}{2}


This simplifies to:

y_{composed}=2x-1


Let’s look at a graph of m(x) and our tangent line:



Where did we ultimately get the slope of 2 from? When we composed the two lines together, we multiplied the slope of the inner function (12) by the slope of the outer function (1/6). And that became our new line’s slope.

Chain rule!
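A quick way to sanity check this example is to compose the two tangent lines in code and compare against m(x) itself. This is only a sketch; the function names are mine, and the tiny symmetric secant is my stand-in for the true slope of m(x) at x=2.

```python
import math


def inner_line(x):
    # Tangent line to x^3 + 1 at x = 2.
    return 12 * x - 15


def outer_line(x):
    # Tangent line to sqrt(x) at x = 9.
    return x / 6 + 3 / 2


def composed_line(x):
    # Feed the inner tangent line into the outer tangent line.
    return outer_line(inner_line(x))


def m(x):
    return math.sqrt(x ** 3 + 1)


# Slope of the composed line, read off from two points one unit apart.
composed_slope = composed_line(3) - composed_line(2)

# Independent numerical estimate of the slope of m at x = 2.
step = 1e-6
estimated_slope = (m(2 + step) - m(2 - step)) / (2 * step)

print(composed_slope, composed_line(2), m(2), estimated_slope)
```

The composed line passes through (2, m(2)) and its slope matches the estimate, which is the whole claim in miniature.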

How we generalize this to the chain rule

For any composition of functions, we are going to have an inner and an outer function. Let’s write c(x)=o(i(x)) where we can clearly remember which one is the inner and which one is the outer function. Let’s pick a point x_0 where we want to find the derivative.

We are going to have to find the little line segment of the inner function at x_0 and compose that with the little line segment of the outer function at i(x_0). That will approximate the function c(x) at x_0.

The line segment of the inner function is going to be y_{inner}=i'(x_0)x+blah1

The line segment of the outer function is going to be y_{outer}=o'(i(x_0))x+blah2

I am going to leave those terms as blah1 and blah2 because we won’t really need them. Let’s remember we only want the derivative (the slope of the tangent line), not the tangent line itself. So our task becomes easier.

Let’s compose them: y_{composed}=o'(i(x_0))[i'(x_0)x+blah1]+blah2

This simplifies to y_{composed}=o'(i(x_0))i'(x_0)x+blah3

And since we only want the slope of this line (the derivative is the slope of the tangent line, remember), we have:

c'(x_0)=o'(i(x_0))i'(x_0)


Of course we chose an arbitrary point x_0 to take the derivative at. So we really have:

c'(x)=o'(i(x))i'(x)


Which is the chain rule.
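To convince myself the general version behaves, here is a small Python sketch that multiplies the two slopes and compares the result with a direct numerical estimate. The helper names and the example c(x)=\sin(x^2+1) are hypothetical choices of mine, not part of the argument above.

```python
import math


def chain_rule_slope(outer_slope, inner, inner_slope, x0):
    # Slope of the outer line (taken at i(x0)) times slope of the inner line (at x0).
    return outer_slope(inner(x0)) * inner_slope(x0)


def numerical_slope(func, x0, step=1e-6):
    # Symmetric secant slope, used only as an independent check.
    return (func(x0 + step) - func(x0 - step)) / (2 * step)


# Illustrative choice: o(x) = sin(x) and i(x) = x^2 + 1, so c(x) = sin(x^2 + 1).
def inner(x):
    return x ** 2 + 1


def inner_slope(x):
    return 2 * x


def composite(x):
    return math.sin(inner(x))


x0 = 0.7
from_rule = chain_rule_slope(math.cos, inner, inner_slope, x0)
from_secant = numerical_slope(composite, x0)
print(from_rule, from_secant)
```

Notice where each slope is evaluated: the inner slope at x_0, the outer slope at i(x_0). That is the "be thoughtful" step from the square root example.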

I got rid* of Limits in Calculus (*almost entirely)

I’ve been meaning to write this post for a while. I teach non-AP Calculus. My goal in this course is to get my kids to understand calculus with depth — that means my primary focus is on conceptual understanding, where facility with fancy-algebra things is secondary. Now don’t go thinking my kids come out of calculus not knowing how to do real calculus. They do. It is just that I pare things down so that they don’t have to find the derivatives of things like y=\cot(x). Why? Because even though I could teach them that (and I have in the past), I would rather spend my time doing less work on moving through algebraic hoops, and more work on deep conceptual understanding.

Everything I do in my course aims for this. Sometimes I succeed. Sometimes I fail. But I don’t lose sight of my goal.

Each year, I have parts of the calculus curriculum I rethink, or have insights on. In the past few years, I’ve done a lot of thinking about limits and where they fit in the big picture of things. Each year, they lose more and more value in my mind. I used to spend a quarter of a year on them. In more recent years, I spent maybe a sixth of a year on them. And this year, I’ve reduced the time I spend on limits to about 5 minutes.*

*Okay, not really. But kinda. I’ll explain.

First I’ll explain my reasoning behind this decision. Then I’ll explain how I did it.

Reasoning Behind My Decision to Eliminate Limits

For me, calculus has two major parts: the idea of the derivative, and the idea of the integral.

Limits show up in both [1]. But where do they show up in derivatives?

  • when you use the formal definition of the derivative

and… that’s pretty much it. And where do they show up in integrals?

  • when you say you are taking the sum of an infinite number of infinitely thin rectangles

and… that’s pretty much it. I figure if that’s all I need limits for, I can target how I introduce and use limits to really focus on those things. Do I really need them to understand limits at infinity of rational functions? Or limits of piecewise functions? Or limits of things like y=\sin(1/x) as x\rightarrow 0?

Nope. And this way I’m not wasting a whole quarter (or even half a quarter) with such a simple idea. All I really need — at least for derivatives — is how to find the limit as one single variable goes to 0. C’est tout!

How I did it

This was our trajectory:

(1) Students talked about average rate of change.

(2) Students talked about the idea of instantaneous rate of change. They saw it was problematic, because how can something be changing at an instant? If you say you’re travelling “58 mph at 2:03pm,” what exactly does that mean? There is no time interval for this 58mph to pop out of, since we’re talking about an instant, a single moment in time (of 2:03pm). So we problematized the idea of instantaneous rate of change. But we also recognized that we understand that instantaneous rates of change do exist, because we believe our speedometers in our car which say 60mph. So we have something that feels philosophically impossible but in our guts and everyday experience feels right. Good. We have a problem we need to resolve. What might an instantaneous rate of change mean? Is it an oxymoron to have a rate of change at an instant?

(3) Students came to understand that we could approximate the instantaneous rate of change by taking the slope of two points really really really close to each other on a function. And the closer that we got, the better our approximation was. (Understanding why we got a better and better approximation was quite hard conceptual work.) Similarly students began to recognize graphically that the slope of two points really close to each other is actually almost the slope of the tangent line to the function.

(4) Now we wanted to know if we could make things exact. We knew we could make things exact if we could bring the two points infinitely close to each other. But each time we tried that, we either got two points pretty close to each other or the two points lay directly on top of each other (and you can’t find the slope between a point and itself). So still we have a problem.

And this is where I introduced the idea of introducing a new variable, and eventually, limits.

We encountered the question: “what is the exact instantaneous rate of change for f(x)=x^2 at x=3?”

We started by picking two points close to each other: (3,9) and (3+h,(3+h)^2)

This was the hardest thing for students to understand. Why would we introduce this extra variable h? But we talked about how (3.0001,3.0001^2) wasn’t a good second point, and how (3.0000001,3.0000001^2) also wasn’t a good second point. But if they trusted me on using this variable thingie, they would see how our problems would be resolved.

We then found the average rate of change between the two points, recognizing that the second point could be really far away from the first point if h were a large positive or negative number… or close to the first point if h were close to 0.

Yes, students had to first understand that h could be any number. And they had to come to the understanding that h represented where the second point was in relation to the first point (more specifically: how far horizontally the second point was from the first point).

And so we found the average rate of change between the two points to be:

AvgRateOfChange=\frac{(3+h)^2-9}{(3+h)-3}


We then said: how can we make this exact? How can we bring the two points infinitely close to each other? Ahhh, yes, by letting h get infinitely close to 0.

And so I introduce the idea of the limit as such:

If I have \lim_{h\rightarrow 0} blah, it means what blah gets infinitely close to if h gets infinitely close to 0 but is not equal to 0. That last part is key. And honestly, that’s pretty much the entirety of my explanation about limits. So that’s the 5 minutes I spend talking about limits.

So to find the instantaneous rate of change, we simply have:

InstRateOfChange=\lim_{h\rightarrow0} \frac{(3+h)^2-9}{(3+h)-3}

This is simply the slope between two points which have been brought infinitely close together. Yes, that’s what limits do for you.

And then we simplify:

InstRateOfChange=\lim_{h\rightarrow0} \frac{9+6h+h^2-9}{h}

InstRateOfChange=\lim_{h\rightarrow0} \frac{6h+h^2}{h}

InstRateOfChange=\lim_{h\rightarrow0} \frac{h(6+h)}{h}

InstRateOfChange=\lim_{h\rightarrow0} \frac{h}{h} \frac{(6+h)}{1}

Now because we know that h is close to 0, but not equal to 0, we can say with confidence that \frac{h}{h}=1. Thus we can say:

InstRateOfChange=\lim_{h\rightarrow 0} (6+h)

And now as h goes to 0, we see that 6+h gets infinitely close to 6.

Done. (Here’s a do now I did in class.)

We did this again and again to find the instantaneous rate of change of various functions at a point. For example, functions like:

f(x)=x^3-2x+1 at x=1

g(x)=\sqrt{2-3x} at x=-2

h(x)=\frac{5}{2-x} at x=1

For these, the algebra got more gross, but the idea and the reasoning was the same in every problem. Notice to do all of these, you don’t need any more knowledge of limits than what I outlined above with that single example. You need to know why you can “remove” the \frac{h}{h} (why it is allowed to be “cancelled” out), and then what happens as h goes to 0. That’s all. 

Yup, again, notice I only needed to rely on this very basic understanding of limits to solve these three problems algebraically: \lim_{h\rightarrow 0} blah means what blah gets infinitely close to if h gets infinitely close to 0 but is not equal to 0. 
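To show what I mean, here is how the first of those three might be written out for f(x)=x^3-2x+1 at x=1. This is my own write-up of the same steps, not a scan of what we did in class.

```latex
\begin{aligned}
InstRateOfChange &= \lim_{h\rightarrow 0} \frac{\left[(1+h)^3-2(1+h)+1\right]-\left[1^3-2(1)+1\right]}{(1+h)-1} \\
&= \lim_{h\rightarrow 0} \frac{(1+3h+3h^2+h^3)-2-2h+1-0}{h} \\
&= \lim_{h\rightarrow 0} \frac{h+3h^2+h^3}{h} \\
&= \lim_{h\rightarrow 0} \frac{h}{h}\,\frac{(1+3h+h^2)}{1} \\
&= \lim_{h\rightarrow 0} (1+3h+h^2) = 1
\end{aligned}
```

The algebra is longer than in the x^2 example, but the only limit ideas used are the same two: \frac{h}{h}=1 because h is not equal to 0, and then seeing what is left as h goes to 0.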

(5) Eventually we generalize to find the instantaneous rate of change at any point, using the exact same process and understanding. At this point, the only difference is that the algebra gets slightly more challenging to keep track of. But not really that much more challenging.

(6) Finally, waaaay at the end, I say: “Surprise! The instantaneous rate of change has a fancy calculus word — derivative.”

Apologies in advance if any of this was unclear. I feel I didn’t explain things as well as I could have. I also want to point out that I understand if you don’t agree with this approach. We all have different thoughts about what we find important and why. I can make (and in fact, in the past, I have made) the case that going into depth into limits is of critical importance. I personally just don’t see things the same way anymore.

Now I should also say that there have been a few downsides to this approach, but on the whole it’s been working well for me so far. I would elaborate on the downsides but right now I’m just too exhausted. Night night!

[1] Okay, I should also note that limits show up in the definition for continuity. But since in my course I don’t really focus on “ugly” functions, I haven’t seen the need to really spend time on the idea of continuity except in the conceptual sense. Yes, I can ask my kids to draw the derivative of y=|x| and they will be able to. They will see there is a jump at x=0. I don’t need more than that.

An expanded understanding of basic derivatives – graphically

The guilt that I feel for not blogging more regularly this year has been considerable, and yet, it has not driven me to post more. I’ve been overwhelmed and busy, and my philosophy about blogging is: do it when you feel motivated. And so, I haven’t.

Today, I feel a slight glimmer of motivation. And so here I am.

Here’s what I want to talk about.

In calculus, we all have our own ways of introducing the power rule for derivatives. Graphically. Algebraically. Whatever. But then, armed with this knowledge…

that if f(x)=x^n, then f'(x)=nx^{n-1}

…we tend to drive forward quickly. We immediately jump to problems like:

take the derivative of g(x)=4x^3-3x^{-5}+2x^7

and we hurtle on, racing to the product and quotient rules… We get so algebraic, and we go so quickly, that we lose sight of something beautiful and elegant. This year I decided to take an extra few days after the power rule but before problems like the one listed above to illustrate the graphical side of things.

Here’s what I did. We first got to the point where we comfortably proved the power rule for derivatives (for n being a counting number). Actually, before I move on and talk about the crux of this post, I should show you what we did…

Okay. Now I started the next class with kids getting Geogebra out and plotting on two graphics windows the following:

and they saw the following:

At this point, we saw the transformations. On the left hand graph, we saw that the function merely shifted up one unit. On the right hand graph, we saw a vertical stretch for one function, and a vertical shrink for the other.

Here’s what I’m about to try to illustrate for the kids.

Whatever transformation a function undergoes, the tangent lines to the function also undergo the exact same transformation.

What this means is that if a function is shifted up one unit, then all tangent lines are shifted up one unit (like in the left hand graph). And if a function undergoes vertical stretching or shrinking, all tangent lines undergo the same vertical stretching or shrinking.

I want them to see this idea come alive both graphically and algebraically.

So I have them plot all the points on the functions where x=1. And all the tangent lines.

For the graph with the vertical shift, they see:


The original tangent line (to f(x)=x^2) was y=2x-1. When the function moved up one unit, we see the tangent line simply moved up one unit too.

Our conclusion?

Yup. The tangent line changed. But the slope did not. (Thus, the derivative is not affected by simply shifting a function up or down. Because even though the tangent lines are different, the slopes are the same.)

Then we went to the second graphics view — the vertical stretching and shrinking. We drew the points at x=1 and their tangent lines…


…and we see that the tangent lines are similar, but not the same. How are they similar? Well the original function’s tangent line is the red one, and has the equation y=2x-1. Now the green function has undergone a vertical shrink of 1/4. And lo and behold, the tangent line has also!

To show that clearly, we did the following. The original tangent line has equation y=2x-1. So to apply a vertical shrink of 1/4 to this, you are going to see y=\frac{1}{4}(2x-1) (because you are multiplying all y-coordinates by 1/4). And that simplifies to y=0.5x-0.25. Yup, that’s what Geogebra said the equation of the tangent line was!

Similarly, for the blue function with a vertical stretch of 3, we get y=3(2x-1)=6x-3. And yup, that’s what Geogebra said the equation of the tangent line was.

What do we conclude?


And in this case, with the vertical stretching and shrinking of the functions, we get a vertical stretching and shrinking of the tangent lines. And unlike moving the function up or down, this transformation does affect the slope!

I repeat the big conclusion:

Whatever transformation a function undergoes, the tangent lines to the function also undergo the exact same transformation.

I didn’t actually tell this to my kids. I had them sort of see and articulate this.

Now they see that if a function gets shifted up or down, they can see that the derivative stays the same. And if there is a vertical stretch/shrink, the derivative is also vertically stretched/shrunk.

The next day, I started with the following “do now.” We haven’t learned the derivative of \sin(x), so I show them what Wolfram Alpha gives them.


For (a), I expect them to give the answer g'(x)=3\cos(x) and for (b), h'(x)=-\cos(x).

The good thing here is now I get to go for depth. WHY?

And I hear conversations like: “Well, g(x) is a transformation of the sine function which gives a vertical stretch of 3, and then shifts the function up 4. Well since the function undergoes those transformations, so do the tangent lines. So each tangent line is going to be vertically stretched by 3 and moved up 4 units. Since the derivative is only the slope of the tangent line, we have to see what transformations affect the slope. Only the vertical stretch affects the slope. So if the original slope of the sine function was \cos(x), then we know that the slope of the transformed function is 3\cos(x).”

That’s beautiful depth. Beautiful.

For (b), I heard talk about how the negative sign is a reflection over the x-axis, so the tangent lines are reflected over the x-axis also. Thus, the slopes are the opposite sign… If the slopes of the tangent lines to the original sine function were \cos(x), then the new slopes are going to be -\cos(x).
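If you want to check those two answers numerically, here is a minimal sketch. It assumes g(x)=3\sin(x)+4 (from the student explanation above) and h(x)=-\sin(x) (my guess at part (b), since only the reflection is described). The helper name `slope_at`, the step size, and the sample x-values are my own choices.

```python
import math

def slope_at(f, x, h=1e-6):
    """Approximate the slope of the tangent line to f at x with the
    difference quotient (f(x+h) - f(x)) / h for one small h."""
    return (f(x + h) - f(x)) / h

def g(x):
    # Vertical stretch by 3, then shift up 4.
    return 3 * math.sin(x) + 4

def h_reflected(x):
    # Assumed form of part (b): reflection over the x-axis.
    return -math.sin(x)

for x in [0.0, 1.0, 2.5]:
    print(f"x = {x}")
    print(f"  g slope ~ {slope_at(g, x):.4f}    3cos(x) = {3 * math.cos(x):.4f}")
    print(f"  h slope ~ {slope_at(h_reflected, x):.4f}    -cos(x) = {-math.cos(x):.4f}")
```

The +4 never shows up in the slopes, and the 3 and the negative sign both do, which is exactly what the kids were arguing.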

This isn’t easy for my kids, so when I saw them struggling with the conceptual part of things, I whipped up this sheet (.docx).

And here are the solutions

And here is a Geogebra sheet which shows the transformations, and the new tangent line (and equation), for this worksheet.

Now to be fair, I don’t think I did a killer job with this. It was my first time doing it. I think some kids didn’t come out the stronger for this. But I do feel that the kids who do get it have a much more intuitive understanding of what’s going on.

I am much happier to know that if I ask kids what the derivative of q(x)=6x^9 is, they immediately think (or at least can understand) that we get q'(x)=6*9x^8, because…

our base function is x^9 which has derivative (aka slope of the tangent line) 9x^8… Thus the transformed function 6x^9 is going to be a vertical stretch, so all the tangent lines are going to be stretched vertically by a factor of 6 too… thus the derivative of this (aka the slope of the tangent line) is q'(x)=6*9x^8.
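For anyone who wants to see the same fact come out of the formal definition instead of the picture, here is a short sketch. It takes the derivative of x^9 as 9x^8 as given, just as the explanation above does.

```latex
\begin{aligned}
q'(x) &= \lim_{h \rightarrow 0}\frac{6(x+h)^9 - 6x^9}{h} \\
      &= 6 \cdot \lim_{h \rightarrow 0}\frac{(x+h)^9 - x^9}{h} \\
      &= 6 \cdot 9x^8 = 54x^8
\end{aligned}
```

The 6 factors out of every secant slope before the limit is ever taken, which is the algebraic version of saying every tangent line gets stretched by the same factor as the function.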

To me, that sort of explanation for something super simple brings so much graphical depth to things. And that makes me feel happy.

Do They Get It? The Instantaneous Rate of Change Exactly

Today in calculus I wanted to check if students really understood what they were doing when they were finding the instantaneous rate of change. (We haven’t learned the word derivative yet, but this is the formal definition of the derivative.)

So I handed out this worked out problem.

And I had them next to each of the letters write a note answering the following individually (not as a group):

A: write what the expression represents graphically and conceptually

B: write what the notation \lim_{h\rightarrow0} actually means. Why does it need to be there to calculate the instantaneous rate of change? (Be sure to address what h means.)

C: write what mathematical simplification is happening, and why we are allowed to do that

D: write what the reasoning is behind why we are allowed to make this mathematical move

E: explain what this number (-1) means, both conceptually and graphically

It was a great activity. I had them do it individually, but I should have had students (after completing it) discuss in groups before we went to the whole group context. Next time…

Anyway, the answers I was looking for (written more drawn out):

A: the expression represents the average rate of change between two points, one fixed, and the other one defined in relation to that first point. The average rate of change is the constant rate the function would have to go at to start at one point and end up at the second. Graphically, it is the slope of the secant line going through those two points.

B: the \lim_{h\rightarrow0} is simply a fancy way to say we want to bring h closer and closer and closer to zero (infinitely close) but never equal to zero. That’s all. The expression that comes after it is the average rate of change between two points. As h gets closer and closer to 0, the two points get closer and closer to each other. We learned that if we take the average rate of change of two points super close to each other, that will be a good approximation for the instantaneous rate of change. If the two points are infinitely close to each other, then we are going to get an exact instantaneous rate of change!

C: we see that \frac{h}{h} is actually 1. We normally would not be allowed to say that, because there is the possibility that h is 0, and then the expression wouldn’t simplify to 1. However we know from the limit that h is really close to 0, but not equal to 0. Thus we can say with mathematical certainty that \frac{h}{h}=1.

D: as we bring h closer and closer to 0, we see that h-1 gets closer and closer to -1. Thus if we bring h infinitely close to 0, we see that h-1 gets infinitely close to -1.

E: the -1 represents the instantaneous rate of change of x^2-5x+1 at x=2. This is how fast the function is changing at that instant/point. It is graphically understood as the slope of the tangent line drawn at x=2.
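Here is a small numerical sketch of what answers B through E are describing. The function x^2-5x+1 and the point x=2 come from answer E; the helper name `average_rate` and the particular h values are my own choices. Algebraically, f(2+h)-f(2)=h^2-h, so for any h that isn't 0 the average rate of change is \frac{h^2-h}{h}=h-1, which is where the h-1 in answer D comes from.

```python
def f(x):
    return x**2 - 5*x + 1

def average_rate(f, x, h):
    """Slope of the secant line through (x, f(x)) and (x + h, f(x + h)).
    Only meaningful for h != 0, just like the expression inside the limit."""
    return (f(x + h) - f(x)) / h

# Bring h closer and closer to 0 (from both sides) and watch the secant slopes.
for h in [1, 0.1, 0.01, 0.001, -0.001, -0.01]:
    print(f"h = {h:>6}   average rate = {average_rate(f, 2, h):.6f}   h - 1 = {h - 1:.6f}")
```

Each printed average rate agrees with h-1 (up to floating-point rounding), and both columns close in on -1 as h shrinks. Trying h=0 directly raises a ZeroDivisionError, which is the hole that the limit lets us get around.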

I loved doing this because if a student is able to properly answer each of the questions, they really, truly understand what is going on.

Starting Calculus with Area Functions

So I decided to try a new beginning to (non-AP) calculus this year. Instead of doing an algebra bootcamp and diving into limits, I decided to teach kids a new kind of function transformation. I’d say this is something that makes my classroom uniquely mine (this is my contribution to Mission 1 of Explore the MTBoS). I don’t think anyone else I know does something like this.

You see, I was talking with a fellow calculus teacher, and we had a big realization. Yes, calculus is hard for kids because of all the algebra. But also, calculus involves something that students have never seen before.

It involves transformations that morph one graph into another graph. And not just standard up, down, left, right, stretch, shrink, reflect transformations. Although they do transform functions, they don’t make them look too different from the original. Given a function and a basic up, left, reflect, shrink transformation of it, you’d be able to pair them up and say they were related… But in calculus, students start grappling with seriously weird and abstract transformations. For example: if you hold an f(x) graph and an f'(x) graph next to each other — they don’t look alike at all. You would never pair them up and say “oh, these are related.”

So I wanted to start out with a unit on abstract and weird function transformations. Turns out, even though the other teacher and I had brainstormed 5 different abstract function transformations, I got so much mileage out of one of them that I didn’t have to do anything else. You see: I introduced my kids to integrals, without ever saying the word integrals. Well, to be fair, I introduced them to something called the area transformation and the only difference between this and integrals is that we can’t have negative area. [1]

You can look at this geogebra page to see what I mean by area functions.

Here’s the packet I created (.docx)

That packet was just the bare bones of what we did. There was a lot of groupwork in class, a lot of conceptual questions posed to them, and more supplemental documents that were created as I started to realize this was going to morph into a much larger unit because I was getting so much out of it. (I personally was finding so much richness in it! A perfect blend of the concrete and the abstract!)

Here are other supplemental documents:

2013-09-16 Abstract Functions 1.5

2013-09-17 Abstract Functions 1.75

2013-09-20 Area Function Concept Questions

2013-09-23 Abstract Functions 1.9375

The benefits I’m already reaping:

  • It’s conceptual, so those kids who aren’t strong with the algebraic stuff gain confidence at the start of the year
  • Kids start to understand the idea of integration as accumulation (though they don’t know that’s what they are doing!)
  • Kids understand that something can be increasing at a decreasing rate, increasing at a constant rate, or increasing at an increasing rate. They discovered those terms, and realized what that looks like graphically.
  • Kids already know why the integral of a constant function is a linear function, and why the integral of a linear function is a quadratic function.
  • Kids are talking about steepness and flatness of a function, and giving the steepness and flatness meaning… They are making statements like “because the original graph is close to the x-axis near x=2, not much area is being added as we inch forward on the original graph, so the area function will remain pretty flat, slightly increasing… but over near x=4, since the original function is far from the x-axis, a lot of area is being added as we inch forward on the original graph, so the area function shoots up, thus it is pretty steep”
  • Once we finish investigating the concept of “instantaneous rate of change” (which is soon), kids will have encountered and explored the conceptual side of both major ideas of calculus: derivatives and integrals. All without me having used the terms. I’m being a sneaky teacher… having kids do secret learning.
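For the curious, the claims in the list above about constant and linear functions can be checked with a rough sketch of the area transformation. This is my own approximation using thin rectangles, not the Geogebra applet or anything from the packet; the function name `area_function` and the sample functions are hypothetical examples.

```python
def area_function(f, x, n=10_000):
    """Approximate the area between f and the x-axis from 0 to x by adding up
    n thin rectangles, each as tall as |f| at its midpoint.
    Using |f| means area below the x-axis still counts as positive."""
    width = x / n
    return sum(abs(f((i + 0.5) * width)) * width for i in range(n))

def constant(t):
    return 2          # hypothetical constant function

def linear(t):
    return 3 * t      # hypothetical linear function

for x in [1, 2, 3, 4]:
    print(x, round(area_function(constant, x), 4), round(area_function(linear, x), 4))
```

The areas under the constant function grow like 2x (a line), and the areas under the linear function grow like 1.5x^2 (a parabola), which is the constant-to-linear and linear-to-quadratic pattern the kids found.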

I mean… I worked these kids hard. Here is a copy of my assessment so you can see what was expected of them.

I love it.

Love. It.



I’m going to put a picture gallery below of some things from my smartboards.


[1] To be super technical, I am having kids relate f(x) and \int_{0}^{x} |f(t)|dt