Here’s my challenge, created by me, for me. I want to explain where the line of best fit comes from. Not just the algorithm to find it, but conceptually how it is found. My intended audience: students in Algebra II. Where does the derivation come from? Multivariable calculus.
So here we go.
Let’s say we have a set of 5 points: (1,1), (3,5), (4,5), (6,8), (8,8).
We want a “line of best fit.” It’s tricky because we don’t know exactly what that means quite yet, but we do know that we want a line that passes near a lot of the points. We want the line to “model” the points, so the line and the points should be close together. In other words, even without knowing exactly what a “line of best fit” is, we can say pretty certainly that it is not:
Instead, we know it probably looks like one of the following lines:
LINE A: y=1.1x
or
Of course it doesn’t have to be either of those lines… but we can be pretty sure it will look similar to one of them. You should notice the lines are slightly different. The y-intercepts are different and the slopes are different. But both actually lie fairly close to the points. So is Line A or Line B a better model for the data? And an even more important question: might there be another line that is an even better model for the data?
In other words, our key question is now:
How are we going to choose one line, out of all the possible lines we could draw, that seems like it fits the data well? (One line to rule them all…)
Another way to think of this question: is there a way to measure the “closeness” of the data to the line, so we can decide if Line A or Line B is a better fit for the data? And more importantly, is there an even better line (besides Line A or Line B) that fits the data?
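To make “closeness” a little more concrete, here is a minimal sketch in Python. It assumes one possible measure: the vertical gap between each point and the line (the point’s actual y-value minus the y-value the line predicts). That choice is an assumption for illustration, not a definition this post has settled on, and the helper name `vertical_gaps` is made up for this example. It uses the five points above and Line A (y = 1.1x).

```python
# The five data points from the post.
points = [(1, 1), (3, 5), (4, 5), (6, 8), (8, 8)]

def vertical_gaps(slope, intercept, pts):
    """Return (actual y) - (y predicted by the line) for each point.

    Vertical gap is an assumed measure of closeness, chosen only
    for illustration; other measures are possible.
    """
    return [y - (slope * x + intercept) for x, y in pts]

# Line A from the post: y = 1.1x, i.e. slope 1.1 and y-intercept 0.
gaps_a = vertical_gaps(1.1, 0, points)

for (x, y), gap in zip(points, gaps_a):
    # A positive gap means the point sits above the line; negative means below.
    print(f"({x}, {y}): gap = {gap:+.2f}")
```

Any other candidate line can be checked the same way by passing in its slope and y-intercept. How to combine the five gaps into a single score for a line is exactly the question left open here.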
(Part II to come…)
UPDATE: Part II here



