Mathematics for the least-squares slope

This page follows on from the page on mean squared deviations. Like that page, it assumes a lot more maths than the standard flow of the course. In particular, it assumes that you know the basics of finding the derivative of a function.

Please make sure you have read and understood the mean squared deviation page, because this page builds on that argument.

You should also check the page on means and slopes for the problem we are trying to solve.

In our problem, we have $n$ $x$ values $x_1, x_2, ..., x_n$ that we want to use to predict $n$ corresponding $y$ values $y_1, y_2, ..., y_n$. For example, in terms of the mean and slopes page, we have 158 hemoglobin concentration values, so $n = 158$ and we can write our hemoglobin values as $x_1, x_2, ..., x_{158}$. We have 158 packed cell volume values, and we can write these as $y_1, y_2, ..., y_{158}$.

We decide we will use a straight line going through the origin to predict our $y$ points from our $x$ points. We define this line with its slope $s$. This is the number of units that $y$ increases for every unit increase in $x$. Our predicted values will therefore be $s x_1, s x_2, ..., s x_n$.

We want to choose $s$ such that it minimizes the sum of squared prediction errors. We define the prediction error for the first value as the actual value $y_1$ minus the prediction for that value, $s x_1$. We have $n$ prediction errors $y_1 - s x_1, y_2 - s x_2, ..., y_n - s x_n$. The thing we want to minimize is the sum of squared prediction errors for a particular slope $s$, defined as:

$$
\mathrm{SSE}_s \triangleq \sum_i (y_i - s x_i)^2 \tag{3}
$$

The symbol $\triangleq$ means "is defined as".
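As a minimal sketch of equation (3), the function below computes the sum of squared prediction errors for one candidate slope. The name `sse` and the three toy points are assumptions made for illustration; they are not the hemoglobin and packed cell volume data.

```python
def sse(s, x, y):
    """Sum of squared prediction errors for a line through the origin with slope s."""
    return sum((y_i - s * x_i) ** 2 for x_i, y_i in zip(x, y))


# Toy values for illustration only (not the course data).
x = [1.0, 2.0, 3.0]
y = [2.0, 4.0, 7.0]

# With s = 2 the predictions are 2, 4, 6, so the errors are 0, 0, 1.
print(sse(2.0, x, y))  # prints 1.0
```

Trying other values of `s` with the same points shows how the error changes as the line tilts.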

This is the general formula for the specific plot we saw at the end of the mean and slopes page, where the value for $s$ is on the horizontal axis, and the value for $\mathrm{SSE}_s$ is on the vertical axis.

We want to find the value of $s$ that gives the smallest value for $\mathrm{SSE}_s$.

The plot turned out to be U-shaped; we want to find the horizontal axis location ($s$ value) corresponding to the bottom of the U (minimum of the corresponding $\mathrm{SSE}_s$ values).
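The U shape can also be seen numerically by evaluating the sum of squared errors over a grid of trial slopes and picking the smallest value. This is a brute-force sketch, not the calculus argument this page develops; the grid range and step, the `sse` helper, and the toy points are all assumptions.

```python
def sse(s, x, y):
    """Sum of squared prediction errors for a line through the origin with slope s."""
    return sum((y_i - s * x_i) ** 2 for x_i, y_i in zip(x, y))


# Toy values for illustration only (not the course data).
x = [1.0, 2.0, 3.0]
y = [2.0, 4.0, 7.0]

# Trial slopes 0.00, 0.01, ..., 4.00 (an arbitrary range chosen for this example).
slopes = [i / 100 for i in range(401)]
sse_values = [sse(s, x, y) for s in slopes]

# Position of the smallest SSE value: the bottom of the U on this grid.
best_index = min(range(len(slopes)), key=lambda i: sse_values[i])
best_slope = slopes[best_index]

print(best_slope, sse_values[best_index])
```

The grid answer is only as precise as the step size, which is one reason to look for an exact formula instead.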

We follow the same scheme as for the mean squared deviations page; we transform the formula in (3) above into a formula for the gradient of the curve that (3) represents, by taking the derivative with respect to $s$. When the derivative of equation (3) is equal to zero, the gradient of (3) is 0, and this is true when we are at a peak or a trough of (3). We want the trough.

First we expand (3), and use the laws of sums to simplify the result:

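$$
\begin{aligned}
\mathrm{SSE}_s &\triangleq \sum_i (y_i - s x_i)^2 \\
&= \sum_i (y_i^2 - 2 y_i s x_i + s^2 x_i^2) \\
&= \sum_i y_i^2 - 2 s \sum_i y_i x_i + s^2 \sum_i x_i^2
\end{aligned}
$$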
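A quick numerical check of this expansion: the sketch below computes the sum of squared errors directly, and again from the three separate sums, and compares the two. The variable names, the trial slope of 1.5, and the toy points are assumptions for illustration.

```python
import math

# Toy values for illustration only (not the course data).
x = [1.0, 2.0, 3.0]
y = [2.0, 4.0, 7.0]
s = 1.5  # an arbitrary trial slope

# Direct form: sum of (y_i - s * x_i) squared.
direct = sum((y_i - s * x_i) ** 2 for x_i, y_i in zip(x, y))

# Expanded form: sum(y^2) - 2 s sum(y x) + s^2 sum(x^2).
sum_yy = sum(y_i ** 2 for y_i in y)
sum_yx = sum(x_i * y_i for x_i, y_i in zip(x, y))
sum_xx = sum(x_i ** 2 for x_i in x)
expanded = sum_yy - 2 * s * sum_yx + s ** 2 * sum_xx

print(direct, expanded, math.isclose(direct, expanded))
```

The three sums depend only on the data, so in the expanded form the only thing that varies is `s`.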

Now differentiate with respect to s:

$$
\frac{\partial \mathrm{SSE}_s}{\partial s} = -2 \sum_i y_i x_i + 2 s \sum_i x_i^2 \tag{4}
$$
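One way to gain confidence in equation (4) is to compare it with a numerical estimate of the gradient. The sketch below uses a central difference with a small step `h`; the function names, the step size, and the toy points are assumptions for illustration.

```python
import math


def sse(s, x, y):
    """Sum of squared prediction errors for a line through the origin with slope s."""
    return sum((y_i - s * x_i) ** 2 for x_i, y_i in zip(x, y))


def sse_gradient(s, x, y):
    """Equation (4): derivative of the SSE with respect to the slope s."""
    sum_yx = sum(x_i * y_i for x_i, y_i in zip(x, y))
    sum_xx = sum(x_i ** 2 for x_i in x)
    return -2 * sum_yx + 2 * s * sum_xx


# Toy values for illustration only (not the course data).
x = [1.0, 2.0, 3.0]
y = [2.0, 4.0, 7.0]
s = 1.5  # an arbitrary trial slope
h = 1e-6  # small step for the numerical estimate

# Central difference: rise over run, measured symmetrically around s.
numerical = (sse(s + h, x, y) - sse(s - h, x, y)) / (2 * h)
analytic = sse_gradient(s, x, y)

print(analytic, numerical)
```

The two values should agree to several decimal places; a negative gradient here means the curve is still heading downhill at this slope.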

Find the zero(s) for equation (4):

$$
\begin{aligned}
2 \sum_i y_i x_i &= 2 s \sum_i x_i^2 \\
\frac{\sum_i y_i x_i}{\sum_i x_i^2} &= s
\end{aligned}
$$
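The result translates directly into code. This is a minimal sketch, assuming plain Python lists of equal length; the function name `least_squares_slope` and the toy points are assumptions for illustration.

```python
import math


def least_squares_slope(x, y):
    """Slope of the least-squares line through the origin: sum(y x) / sum(x^2)."""
    sum_yx = sum(x_i * y_i for x_i, y_i in zip(x, y))
    sum_xx = sum(x_i ** 2 for x_i in x)
    return sum_yx / sum_xx


# Toy values for illustration only (not the course data).
x = [1.0, 2.0, 3.0]
y = [2.0, 4.0, 7.0]

best_s = least_squares_slope(x, y)
print(best_s)  # 31 / 14, roughly 2.214

# The gradient from equation (4) should be (numerically) zero at this slope.
gradient_at_best = (-2 * sum(x_i * y_i for x_i, y_i in zip(x, y))
                    + 2 * best_s * sum(x_i ** 2 for x_i in x))
print(gradient_at_best)
```

Note that the function divides by the sum of the squared x values, so it fails with a `ZeroDivisionError` if every x value is zero.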

Equation (4) only has one zero.

We take the derivative of (4), which is the second derivative of (3), to see if the solution for $s$ is at a trough (with a positive second derivative) or a peak (with a negative second derivative).

$$
\frac{\partial^2 \mathrm{SSE}_s}{\partial s^2} = 2 \sum_i x_i^2
$$

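$\sum_i x_i^2$ is always positive (as long as at least one $x_i$ is not zero); this means that the second derivative is always positive, and therefore, it is also positive at our zero point $s = \frac{\sum_i y_i x_i}{\sum_i x_i^2}$. So, equation (3) has only one trough, at $s = \frac{\sum_i y_i x_i}{\sum_i x_i^2}$, and no peaks.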
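The trough argument can be checked on a small example: the second derivative should be positive, and nudging the slope in either direction away from the formula's value should increase the sum of squared errors. The function names, the nudge size of 0.1, and the toy points are assumptions for illustration.

```python
def sse(s, x, y):
    """Sum of squared prediction errors for a line through the origin with slope s."""
    return sum((y_i - s * x_i) ** 2 for x_i, y_i in zip(x, y))


def sse_second_derivative(x):
    """Second derivative of the SSE with respect to s; it does not depend on s or y."""
    return 2 * sum(x_i ** 2 for x_i in x)


# Toy values for illustration only (not the course data).
x = [1.0, 2.0, 3.0]
y = [2.0, 4.0, 7.0]

best_s = sum(x_i * y_i for x_i, y_i in zip(x, y)) / sum(x_i ** 2 for x_i in x)

print(sse_second_derivative(x))  # prints 28.0

# Moving away from best_s in either direction makes the error larger.
print(sse(best_s - 0.1, x, y) > sse(best_s, x, y))  # prints True
print(sse(best_s + 0.1, x, y) > sse(best_s, x, y))  # prints True
```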

This is the value s for the slope that minimizes the sum of squared errors, also called the sum of squared deviations, also called the sum of squared prediction errors.