Why is the gradient related to the normal vector to a surface?
16 Jan
Today in Multivariable Calculus I was supposed to teach my students how to find the plane tangent to a surface at a point.
The book, however, was not clear about how to do this. It had an equation involving the gradient of a function, but the equation was derived via local linear approximations. Fine and dandy, but I didn’t like it. I didn’t “see” it or grasp what was going on.
What’s clear is that to find the equation for the plane — for any plane — we need a point and a vector pointed in the direction normal to the plane. We are given the point, but we need to find the direction normal to the plane. That’s the same as the direction normal to the surface!
So I set my class up with the task of doing this on their own. They’re still working on it.
But honestly, I’m not quite there yet. I don’t want to just give them the equation and method on how to apply it, but I don’t think I can explain it in any good way. I’m almost there, at a conceptual tipping point, but I need one last shove over the edge. Anyone out there ready to help?
First of all, I decided that working with surfaces is silly and I’d reduce the problem to curves. So let’s start simple.
Let’s say we have the graph of $y = x^2$ and we want to find vectors normal to the curve at $(0, 0)$ and $(1, 1)$ (the blue and green dots).
Well, traditionally, we’d be crazy and parametrize the parabola by creating the vector-valued function $\vec{r}(t) = \langle t, t^2 \rangle$ and then calculate the unit tangent vector ($\vec{T}(t)$) and then from that calculate the unit normal vector ($\vec{N}(t)$). [1] Then we’d calculate $\vec{N}(0)$ and $\vec{N}(1)$ to find the vectors.
But trust me, this is an awful amount of work, and $\vec{N}(t)$ is not a pretty function. We had to parametrize, take derivatives, and plug in values. And if you remember, we started out with such a simple equation, $y = x^2$. Why can’t it be easier?
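For the record, here is a sketch of how that long way goes. I’m assuming the parametrization $\vec{r}(t) = \langle t, t^2 \rangle$, with the two dots at $t = 0$ and $t = 1$:

```latex
\begin{aligned}
\vec{r}'(t) &= \langle 1, 2t \rangle \\
\vec{T}(t) &= \frac{\langle 1, 2t \rangle}{\sqrt{1 + 4t^2}} \\
\vec{T}'(t) &= \frac{\langle -4t, 2 \rangle}{(1 + 4t^2)^{3/2}} \\
\vec{N}(t) &= \frac{\vec{T}'(t)}{\lVert \vec{T}'(t) \rVert} = \frac{\langle -2t, 1 \rangle}{\sqrt{1 + 4t^2}} \\
\vec{N}(0) &= \langle 0, 1 \rangle, \qquad \vec{N}(1) = \frac{\langle -2, 1 \rangle}{\sqrt{5}}
\end{aligned}
```

So at $t = 0$ the normal points straight up, and at $t = 1$ it points up and to the left.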
And it can. And this is where I need your help.
Instead of considering the plain old boring function $y = x^2$, we turn this into a surface by introducing a $z$ direction: $z = F(x, y) = y - x^2$.
The function $z = F(x, y) = y - x^2$ is a surface. We’re only interested in one slice of the surface, when $z = 0$ (when the height is 0). This will then reduce to our original equation $y = x^2$. The set of level curves of the surface is below. Note that the level curve that goes through the origin is the level curve we’re interested in.
Remember that one important (perhaps the most important) property of the gradient is that the gradient of a function points in the direction of maximum steepness on a graph of level curves.
Let’s look at the points we’re interested in!
Just looking at the graph shows we’re onto something. Look at the blue dot. Which direction is the steepest, if you were standing at the blue dot and wanted to walk in the steepest direction? Well, clearly it would be directly north. (You want to walk the shortest distance to get to the next level curve. Since the change in heights between level curves is constant, you want to minimize the distance you’ve walked to get to the next height to have the steepest slope.) What about the green dot? Clearly, northwest.
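And actually calculating the gradient of $F(x, y) = y - x^2$ gives us $\nabla F = \langle -2x, 1 \rangle$.
At the blue dot, we get $\nabla F(0, 0) = \langle 0, 1 \rangle$, which is a vector pointing straight up.
At the green dot, we get $\nabla F(1, 1) = \langle -2, 1 \rangle$, which is a vector pointing northwest.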
I’m plotting them below.
And without all the pesky level curves to distract us.
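If you want to check this numerically, here is a minimal Python sketch. It assumes the surface is $F(x, y) = y - x^2$ and uses the tangent direction $\langle 1, 2x \rangle$ to the parabola (from $dy/dx = 2x$); the helper names are made up for illustration. It approximates the gradient with central differences and checks that it is perpendicular to the tangent at both dots.

```python
# Surface assumed for this sketch: F(x, y) = y - x^2,
# whose level curve F = 0 is the parabola y = x^2.
def F(x, y):
    return y - x**2


def gradient(f, x, y, h=1e-6):
    """Approximate the gradient of f at (x, y) with central differences."""
    dfdx = (f(x + h, y) - f(x - h, y)) / (2 * h)
    dfdy = (f(x, y + h) - f(x, y - h)) / (2 * h)
    return (dfdx, dfdy)


def tangent(x):
    """Tangent direction to y = x^2 at a given x, from dy/dx = 2x."""
    return (1.0, 2.0 * x)


def dot(a, b):
    return a[0] * b[0] + a[1] * b[1]


# The blue dot (0, 0) and the green dot (1, 1).
results = {}
for point in [(0.0, 0.0), (1.0, 1.0)]:
    g = gradient(F, *point)
    d = dot(g, tangent(point[0]))
    results[point] = (g, d)
    print(point, "gradient ~", g, "gradient . tangent ~", d)
```

The dot products come out as zero (up to rounding), which is just another way of saying the gradient is normal to the curve at those points.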
Clearly this method works. We take the original function and bring it into a higher dimension ($z = y - x^2$). We use the fact that the gradient gives us the direction which is “steepest” on this surface, if we were trapped at a particular point (in this case, $(0, 0)$ or $(1, 1)$). Notice these points lie on the level curve we care about, the level curve which actually is the equation we were initially concerned about ($y = x^2$). Then we recognize, somehow, that the gradient of the higher-dimensional equation gives us the normal vector of the original equation we were concerned with.
The questions I have after doing this:
(1) Why did we have to change our nice curve into a surface $z = y - x^2$ to solve this problem? And why this surface?
(2) How can we understand that the vector normal to the curve somehow is “magically” the gradient of the surface we created — one of whose level curves is the curve we’re interested in.
(3) Extending this analysis to problems where we want to find the normal vector to a surface like an ellipsoid at a particular point, we’re going to be using a function of three variables whose level curves will be surfaces, stacked one on top of another. To find the normal vector, we take the point on the “level surface” which describes our ellipsoid, and find the quickest way to get to the next “level surface”? Is that right? I think that seems right. Strange, but right.
(For a picture of some level surfaces, check it out here.)
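To make question (3) concrete, here is a hedged Python sketch. The ellipsoid $x^2/4 + y^2/9 + z^2 = 1$ is a hypothetical example picked for illustration (not necessarily the one from class), and the angle parametrization and helper names are assumptions too. It checks that the gradient of $F(x, y, z) = x^2/4 + y^2/9 + z^2$ is perpendicular to two independent tangent directions at a point on the level surface $F = 1$.

```python
import math


# Hypothetical example: the ellipsoid is the level surface F = 1 of this function.
def F(x, y, z):
    return x**2 / 4 + y**2 / 9 + z**2


def grad_F(x, y, z):
    """Gradient of F, computed by hand: <x/2, 2y/9, 2z>."""
    return (x / 2, 2 * y / 9, 2 * z)


# Parametrize the ellipsoid by two angles (u, v) so we can get tangent directions.
def point(u, v):
    return (2 * math.sin(u) * math.cos(v),
            3 * math.sin(u) * math.sin(v),
            math.cos(u))


def tangent_u(u, v):
    """Partial derivative of point(u, v) with respect to u."""
    return (2 * math.cos(u) * math.cos(v),
            3 * math.cos(u) * math.sin(v),
            -math.sin(u))


def tangent_v(u, v):
    """Partial derivative of point(u, v) with respect to v."""
    return (-2 * math.sin(u) * math.sin(v),
            3 * math.sin(u) * math.cos(v),
            0.0)


def dot(a, b):
    return sum(p * q for p, q in zip(a, b))


u, v = 0.7, 1.1            # an arbitrary point on the ellipsoid
p = point(u, v)
n = grad_F(*p)             # candidate normal vector
on_surface = F(*p)         # should be 1
dot_u = dot(n, tangent_u(u, v))
dot_v = dot(n, tangent_v(u, v))
print("F(p) =", on_surface)
print("grad . tangent_u =", dot_u)
print("grad . tangent_v =", dot_v)
```

Both dot products are zero up to rounding, so at least for this example the gradient really is normal to the level surface.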
Anyway, this is just my musings, my way of thinking through this. I’m not quite there. Any help you can give, great. If not, that’s cool too.
[1] I guess to make things simpler, we could simply calculate the direction of the normal vector and not worry about making it a unit normal vector, so we could simply calculate $\vec{T}'(t)$ only. We’re not concerned about the magnitude of the normal vector, only the fact that it’s normal.

I am several decades removed from studying calculus :-) so I may be missing the point completely, but why is it any more complicated than this:
1. Slope of curve y = x**2 is found by differentiating : dy/dx = 2x
2. Slope of the vector normal to the curve is the negative reciprocal of this : -1/(2x).
3. Can’t compute -1/(2x) for x=0 of course, but x=1 gives -1/2, x=2 gives -1/4 etc. which seems correct.
@Will: hahaha, you’re TOTALLY right about using this method to help us with our 2D function, instead of having to go to gradients and all that other nonsense. (It’s hilarious, because when I reduced the problem in class, everyone went straight to MV Calc tools to solve it. When we all learned how to do this type of work in pre-calc.)
But the reason I did it with gradients is that, for 3D, I don’t see a way to make your method work for surfaces. (Find the vector that points normal to a surface.)
Best,
Sam
I picked this up in my reader, and came here to leave the exact same comment as Will (starting with, with no little embarrassment, “decades removed…”)
But I see your answer. And I think, didn’t I reduce this problem twice, once to f(x) in one plane (|| to axis) containing the point, and then to f(y) in a plane containing the point, and end up with x and y components of something analogous to slope that would let us write the equation of the plane?
Or am I miles off?
Jonathan
@Sam,
Ah, yes, I see the point now. I had made the erroneous assumption that there existed a technique to do ‘differentiation in 3D’ to get the gradient of the plane – without knowing whether such a technique actually existed!
I guess that’s the difference between a mathematician and an engineer (me): a mathematician would demand to see the proof that ‘differentiation in 3D’ existed before using it, but an engineer would just try extrapolating from 2D and then try to verify empirically the results were correct (or at least close enough for the problem at hand). Which would not work in this case…
Come to think of it (now I’m really beyond what I recall from calculus) isn’t there a notion of partial derivatives where you can get the rate of change of z with respect to x AND y? Wouldn’t that be the ‘gradient’ of the plane that is tangential to our surface? Or can partial derivatives only be done ‘one axis at a time’?
@Will and Jonathan,
In fact, you’re both heading in the same direction as the book. You do need to take partial derivatives. To refresh, $f_x(x, y)$ is the slope of the function if you hold y constant and vary x. Similarly, $f_y(x, y)$ is the slope of the function if you hold x constant and vary y.
The tangent plane to a surface $z = f(x, y)$ at a point $(x_0, y_0, f(x_0, y_0))$ is $z = f(x_0, y_0) + f_x(x_0, y_0)(x - x_0) + f_y(x_0, y_0)(y - y_0)$.
However, when I read this, it didn’t make clear what was going on intuitively/concretely. In some sense, the first of the three terms makes sure that the point $(x_0, y_0, f(x_0, y_0))$ is on the plane, and the second and third terms describe what’s happening locally when you move slightly in the x or y direction. This sounds like what you are talking about, Jonathan.
But I don’t think that explanation is totally correct, or at least it doesn’t make great sense to me. And then the book goes on to say that another way to write the tangent plane is $\nabla F(x_0, y_0, z_0) \cdot \langle x - x_0, y - y_0, z - z_0 \rangle = 0$, for any function $F(x, y, z)$. Well, that was a real challenge for me: where the heck does this come from?
Which led to this post.
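For what it’s worth, here is a sketch of how the two forms of the tangent plane seem to fit together. Assume the surface is $z = f(x, y)$, write $z_0 = f(x_0, y_0)$, and define $F(x, y, z) = f(x, y) - z$ (this particular choice of $F$ is my assumption, not something the book spells out), so the surface is the level surface $F = 0$:

```latex
\begin{aligned}
\nabla F(x_0, y_0, z_0) &= \langle f_x(x_0, y_0),\ f_y(x_0, y_0),\ -1 \rangle \\
\nabla F(x_0, y_0, z_0) \cdot \langle x - x_0,\ y - y_0,\ z - z_0 \rangle &= 0 \\
f_x(x_0, y_0)(x - x_0) + f_y(x_0, y_0)(y - y_0) - (z - z_0) &= 0 \\
z &= f(x_0, y_0) + f_x(x_0, y_0)(x - x_0) + f_y(x_0, y_0)(y - y_0)
\end{aligned}
```

So under that assumption, the gradient form is the local linear approximation form with everything moved to one side.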
Yeah, I don’t work with this much, and my writing is not so good. But that looks like the 3 space analog of the point-slope form of the equation of a line.
Jonathan
I am kind of late to this discussion, but here goes…
The rigorous proof that the gradient produces a vector normal to a given level surface is evidently pretty complex, since it is omitted from my vector analysis text. There is, however, an informal explanation.
Imagine a function f(x,y,z) defined everywhere in a box. Pick a point (x0,y0,z0), and let’s say that f(x0,y0,z0)=C. Then it isn’t hard to believe that a smooth surface could exist where f(x,y,z)=C, which passes through (x0,y0,z0) and is continuous throughout the box.
If you take a directional derivative in a direction that stays on the surface, it is 0, since f(x,y,z) is constant on the surface.
Notice that the directional derivative is in a direction tangent to the surface, and its magnitude is zero. df/du = grad(f).u = 0 implies that this direction, tangent to the surface, is perpendicular to the gradient, which points in the direction of greatest change. This means that the direction of greatest change (aka the gradient vector) is normal to the level surface.
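The same informal argument can be sketched with the chain rule. Assume $\vec{r}(t)$ is any smooth curve that stays on the level surface $f(x, y, z) = C$ and passes through $(x_0, y_0, z_0)$ at $t = t_0$ (the name $\vec{r}$ is just for illustration):

```latex
f(\vec{r}(t)) = C \ \text{ for all } t
\quad\Longrightarrow\quad
\frac{d}{dt}\, f(\vec{r}(t)) \Big|_{t = t_0}
= \nabla f(x_0, y_0, z_0) \cdot \vec{r}'(t_0) = 0
```

Each such $\vec{r}'(t_0)$ is a tangent direction to the surface at the point, so, provided the gradient is not the zero vector there, it is perpendicular to all of those tangent directions, which is what being normal to the level surface means.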
At the green dot, we get F(1,1) = which is a vector pointing northwest.
Seems to me you plotted F(1,1) =