Now it is your turn. To understand the ideas of
linear regression and correlation, try the above
interactive example. You can use your own data or
the data from any homework problem in Chapter
12 of Introductory Statistics. Close the window
when you are finished and you will return to this
portion of the lesson.
Note: You will need to have Shockwave installed in your
browser to try the above example.
Kim, a personal trainer, was interested in
knowing if there was a linear relationship between
the number of visits her clients made to the gym
each week (independent variable) and the average
amount of time her clients exercised per visit
(dependent variable).
She collected the following data.
Client                                   1     2     3     4     5     6
Number of visits per week (x)            1     3     4     2     3     5
Average time spent exercising per
visit, in hours (y)                      2     1.5   1     2     1     0.3
The line of best fit (the regression equation) and
the correlation coefficient are:
y-hat = 2.62 - 0.44x
r = -0.9381
The graph shows the scatter plot and line of best
fit.
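For readers without a calculator at hand, the same numbers can be checked with a short Python sketch. This is only an illustration, not part of the lesson's TI-83 procedure; the function name least_squares and the variable names are assumptions made for this example.

```python
import math

# Kim's data from the table above
visits = [1, 3, 4, 2, 3, 5]        # x: visits per week
hours = [2, 1.5, 1, 2, 1, 0.3]     # y: average hours per visit


def least_squares(x, y):
    """Return (a, b, r) for the line y-hat = a + b*x and the correlation r."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    syy = sum((yi - y_bar) ** 2 for yi in y)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b = sxy / sxx                      # slope
    a = y_bar - b * x_bar              # y-intercept
    r = sxy / math.sqrt(sxx * syy)     # correlation coefficient
    return a, b, r


a, b, r = least_squares(visits, hours)
print(f"y-hat = {a:.2f} + ({b:.2f})x")
print(f"r = {r:.4f}, r^2 = {r * r:.2f}")
```

The printed slope, intercept, r and r^2 should agree with the calculator output shown in the TI-83 steps.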
TI-83 Steps:
Do this step only once. When you turn the
calculator off, it will stay as you set it.
Press 2nd CATALOG. Arrow down to DiagnosticOn.
Press ENTER twice.
Clear lists L1 and L2.
Enter the data into L1 (x values) and L2 (y
values): press STAT, then choose EDIT 1:Edit.
Press 2nd QUIT to go to the home screen.
Press Y= and clear any equations using CLEAR.
Press 2nd STATPLOT.
Press 4 and ENTER.
Press 2nd STATPLOT.
Press 1 and ENTER. Arrow down to the scatter
plot picture (the first picture) and press
ENTER.
Arrow down to Xlist: and enter L1.
Arrow down to Ylist: and enter L2.
Arrow down to Mark: and press ENTER.
Press ZOOM and 9. You should see the scatter
plot.
Press the TRACE key and the arrow keys to move
from point to point.
Press STAT and arrow over to CALC.
Press 8. Enter L1, L2 and press ENTER. You
should see:
LinReg
y = a+bx
a = 2.62
b = -.44
r^2 = .88
r = -.9381
(to 4 decimal places)
Press Y=.
Press VARS.
Press 5.
Arrow over to EQ and press 1. You should see
\Y1 = 2.62+-.44X.
Press GRAPH. You should see the scatterplot
together with the line of best fit.
Is the correlation coefficient significant?
Go to the 95% critical values chart to find
the critical value. The degrees of freedom are
n - 2 = 6 - 2 = 4, so the critical value is 0.811.
Since r = -0.9381 is negative, we compare r to
-0.811. (Close the window when you are finished
with the chart.)
-0.9381 < -0.811
r is significant.
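The comparison above amounts to a simple decision rule, sketched here in Python. The values of r, n and the critical value are taken from the example; the variable names are assumptions for illustration, and the critical value still has to be looked up in the chart.

```python
# Values from the example: r from the regression, and the critical value
# read from the 95% chart at n - 2 = 4 degrees of freedom.
r = -0.9381
n = 6
critical_value = 0.811

df = n - 2
# r is significant when it lies outside the interval
# (-critical_value, +critical_value).
is_significant = abs(r) > critical_value

print(f"df = {df}, |r| = {abs(r):.4f}, critical value = {critical_value}")
print("r is significant" if is_significant else "r is not significant")
```

Using the absolute value of r covers both cases at once: a negative r is compared to the negative critical value, and a positive r to the positive one.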
Using the line of best fit, we will estimate the
average time spent exercising per visit for 4
visits per week.
TI-83 Steps (You should already have done linear
regression):
Go to the home screen and clear it.
Press VARS.
Press 5.
Arrow over to EQ and press 1.
Arrow onto the X and press the multiplication
key.
Press 4 and ENTER. You should see .86. This
means that the estimated average time spent
exercising per visit, for a client who makes 4
visits per week, is 0.86 hours.
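The same estimate can be reproduced by substituting x = 4 into the line of best fit. The sketch below uses the rounded coefficients from the example; the helper name predict is hypothetical.

```python
def predict(a, b, x):
    """Evaluate the line of best fit y-hat = a + b*x at x."""
    return a + b * x


a, b = 2.62, -0.44               # coefficients from the example
estimate = predict(a, b, 4)      # 4 visits per week
print(f"{estimate:.2f} hours")
```

Estimates like this are meaningful only for x values within the range of the data (here, 1 to 5 visits per week).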
Redoing the above example using
LinRegTTest on the TI-83/84 calculator
Data is entered into L1 and L2 as before. (See
page 643 in Introductory Statistics for
more explicit details.)
Go to STAT, TESTS on the TI-83/84 calculator,
scroll up to LinRegTTest, and press ENTER.
L1 should be in Xlist, L2 in Ylist. The
Freq is 1. The not equal sign should be
highlighted.
Leave RegEQ blank.
Highlight Calculate and press ENTER.
The calculator will give you the test statistic,
the p-value, the degrees of freedom, a, b, s (the
standard deviation), r^2 (the coefficient of
determination) and r: everything you need to
decide if the correlation coefficient is
significant.
Note that the p-value is 0.0056. With an
alpha of 0.05, the p-value is less than alpha,
so we reject the null hypothesis. Rejecting the
null hypothesis means that the correlation
coefficient IS significant, and therefore the
least squares line can be used for prediction.
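To see where the test statistic and p-value come from, here is a minimal Python sketch that uses only the standard library. It assumes the usual formula t = r * sqrt(n - 2) / sqrt(1 - r^2) with n - 2 degrees of freedom, and approximates the two-tailed p-value by integrating the t density numerically with Simpson's rule. The function names are made up for this example, and this is not a description of how the calculator computes its result internally.

```python
import math


def t_pdf(t, df):
    """Density of Student's t distribution with df degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (
        math.sqrt(df * math.pi) * math.gamma(df / 2)
    )
    return coef * (1 + t * t / df) ** (-(df + 1) / 2)


def two_tailed_p(t, df, steps=2000):
    """Two-tailed p-value: 1 - 2 * (area under the density from 0 to |t|).

    The area is approximated with Simpson's rule; steps must be even.
    """
    upper = abs(t)
    h = upper / steps
    total = t_pdf(0, df) + t_pdf(upper, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 1 - 2 * (total * h / 3)


r, n = -0.9381, 6                # values from the example
df = n - 2
t_stat = r * math.sqrt(df) / math.sqrt(1 - r * r)
p_value = two_tailed_p(t_stat, df)

print(f"t = {t_stat:.2f}, df = {df}, p = {p_value:.4f}")
print("reject the null hypothesis" if p_value < 0.05 else "do not reject")
```

The p-value printed here should round to the 0.0056 reported by LinRegTTest.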
The other thing to note is that the calculator
gives you the standard deviation s, so you don't
need to go through the steps described in 12.6
Outliers. Just multiply s = 0.26 by 1.9 to get
0.49. Any point with |y - y-hat| > 0.49 is a
potential outlier. Where can you find those
y - y-hat values without calculating each of them?
Your calculator did the work; you just need to
find the RESID list, which is where the calculator
put them. Press 2nd LIST, scroll down to
RESID, and press ENTER twice. The numbers
-0.18, 0.20, 0.14, 0.26, -0.30, -0.12 will be
visible (use the right arrow key to scroll to the
right). Taking the absolute value of these
numbers, we see that none of them is above 0.49,
which means there are no potential outliers for
this problem.
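The outlier check can also be sketched in Python. The example below recomputes the residuals y - y-hat from the rounded line of best fit, takes s as the square root of SSE / (n - 2), and applies the 1.9 * s cutoff used in the lesson. The variable names are assumptions for illustration.

```python
import math

visits = [1, 3, 4, 2, 3, 5]        # x values (L1)
hours = [2, 1.5, 1, 2, 1, 0.3]     # y values (L2)
a, b = 2.62, -0.44                 # line of best fit from the example

# Residuals y - y-hat, the same quantities the calculator stores in RESID
residuals = [y - (a + b * x) for x, y in zip(visits, hours)]

# s: standard deviation of the residuals, with n - 2 degrees of freedom
sse = sum(e * e for e in residuals)
s = math.sqrt(sse / (len(visits) - 2))

cutoff = 1.9 * s
outliers = [
    (x, y) for x, y, e in zip(visits, hours, residuals) if abs(e) > cutoff
]

print([round(e, 2) for e in residuals])
print(f"s = {s:.2f}, cutoff = {cutoff:.2f}, potential outliers = {outliers}")
```

An empty list of potential outliers matches the conclusion reached from the RESID list.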
Examples Showing the
Keypad and Keystrokes of the TI-83
Example
Is there a linear relationship between the age of
a backpacker and the number of days the backpacker
backpacks on one backpack trip?
The following
linear regression and correlation example
attempts to answer that question. This
example will also show you a way to answer this
question using a LinRegTTest. Close the window when
you are finished viewing the example. You will
return here.
Think About It
Runners consider stride rate to be one of the
most important measures of form. Stride
rate is the number of steps taken per second
and should increase as running speed increases. In
a study of some of the best American female
runners, researchers measured the stride rate for
different speeds. The following table gives speed
in feet per second (x) and the stride rate (y)
for these runners.
Speed (x)         15.86   16.88   17.50   18.62   19.97   21.06   22.11
Stride rate (y)    3.05    3.12    3.17    3.25    3.36    3.46    3.55
Verify the following graph by creating a
scatter plot and then plotting the line of best
fit. Before you graph the line of best fit, make
sure you do linear regression. Does the scatter
plot indicate a linear relationship
between speed and stride rate?
Verify that the line of best fit and the
correlation coefficient are (to 4 decimal places):
y-hat = 1.7661 + 0.0803x
r = 0.9990
Is the correlation coefficient r
significant? Go to the 95% critical values
chart to find the critical value. Close
the window when you are finished with the chart.
If the speed is 18 feet per second, what is
the estimated stride rate?
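If you would like to check your calculator work on these questions, the sketch below runs the same least-squares computation on the stride rate data and evaluates the line at 18 feet per second. It is an optional cross-check, not part of the lesson; the function and variable names are assumptions for this example. The critical value for the significance question still comes from the 95% chart, with n - 2 = 5 degrees of freedom.

```python
import math

speed = [15.86, 16.88, 17.50, 18.62, 19.97, 21.06, 22.11]   # x, feet per second
stride_rate = [3.05, 3.12, 3.17, 3.25, 3.36, 3.46, 3.55]    # y, steps per second


def least_squares(x, y):
    """Return (a, b, r) for the line y-hat = a + b*x and the correlation r."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    syy = sum((yi - y_bar) ** 2 for yi in y)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b = sxy / sxx
    a = y_bar - b * x_bar
    r = sxy / math.sqrt(sxx * syy)
    return a, b, r


a, b, r = least_squares(speed, stride_rate)
estimate = a + b * 18            # estimated stride rate at 18 feet per second

print(f"y-hat = {a:.4f} + {b:.4f}x")
print(f"r = {r:.4f}, degrees of freedom = {len(speed) - 2}")
print(f"estimated stride rate at 18 ft/s: {estimate:.2f}")
```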