```                            LINEAR REGRESSION

I.  Analyzing the relationship between two interval level
(numeric) variables.
A.  There are two commonly used statistics that measure the
relationship between interval level variables--the linear
regression coefficient and Pearson's correlation
coefficient.  Both are measures of the linear relationship
between two interval variables.
B.  Both provide some measure of how strong a relationship is
and the direction of the relationship (positive or
negative).  Pearson's correlation coefficient is similar
to the measures of association for categorical variables
discussed earlier in the course.

II.  Linear relationships and regression.
A.  Two variables are linearly related if increases in one
variable are accompanied by relatively consistent
increases in the other (a positive linear relationship)
or if increases in one variable are accompanied by
relatively consistent decreases in the other (a negative
linear relationship).
B.  Linear relationships can be illustrated graphically using
scattergrams to plot the scores of cases on the two
variables simultaneously.  Scores on one variable (the
independent variable) are represented by the horizontal
axis and scores on the other variable (the dependent
variable) are represented by the vertical axis.  Dots are
used to represent the scores of individual respondents on
the two variables.
C.  If a linear relationship exists, the dots on a
scattergram will line up in a relatively consistent
pattern.  That does not necessarily mean they are exactly
in line with each other, but that they cluster in a
somewhat linear fashion.  If there is no linear
relationship, the dots will be distributed randomly, or
according to some other pattern.
D.  If a linear relationship exists, a line can be drawn that
represents the general linear pattern of the dots.  The
goal is to draw the line so that the distance between
each dot and the line is as small as possible.  This line
is called the least-squares regression line because the
sum of the squared distances between each dot and the
line (the errors) is the smallest possible.  It is
represented by the formula  Y = a + bX, where Y is the
predicted value of the dependent variable, a is the Y
intercept (the place where the regression line crosses
the vertical axis), b is the slope of the line, and X is
the score on the independent variable.
E.  The Y intercept (a), sometimes called the constant, is
the value of Y when X is zero.
1.  Calculating the constant.
2.  Interpreting the constant.
F.  The slope of the line (b), sometimes called the
regression coefficient, represents how many units the
dependent variable (Y) changes, on the average, when the
independent variable (X) increases by one unit.
1.  Calculating the slope.
2.  Interpreting the slope.
a.  Negative and positive slopes.
b.  Magnitude of the slope.
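The intercept (E) and slope (F) above can be computed directly from their
definitional formulas.  A minimal sketch in Python; the x and y values are
made-up illustration data, not from the course:

```python
# Least-squares line Y = a + bX, from the standard formulas:
#   b = sum((X - mean_X)(Y - mean_Y)) / sum((X - mean_X)^2)
#   a = mean_Y - b * mean_X

def least_squares(x, y):
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    num = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    den = sum((xi - mean_x) ** 2 for xi in x)
    b = num / den            # slope: average change in Y per one-unit increase in X
    a = mean_y - b * mean_x  # intercept: predicted Y when X = 0
    return a, b

x = [1, 2, 3, 4, 5]  # hypothetical independent-variable scores
y = [2, 4, 5, 4, 5]  # hypothetical dependent-variable scores
a, b = least_squares(x, y)
print(a, b)  # a is about 2.2, b = 0.6
```

Note that the line always passes through the point (mean of X, mean of Y),
which is why the intercept formula works.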

III.  Testing hypotheses about regression coefficients.
A.  State the null hypothesis.
B.  Choose a statistical test.
C.  Check the assumptions.
1.  Random probability samples.
2.  Two interval variables.
3.  Distributions of both variables normal.
4.  Linear relationship.
5.  Distribution of dependent variable same at all
levels of independent variable (homoscedasticity).
6.  Sample size sufficiently large.
D.  Choose an alpha level.
E.  Compute the test statistic and make a decision about the null.
1.  Compute the statistic.
2.  Determine the probability associated with the statistic.
3.  Compare the probability to the alpha level.
4.  Make a decision about the null.
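Step E can be sketched for the test of H0: slope = 0, using the usual
statistic t = b / SE(b) with df = n - 2.  The data and alpha level are
hypothetical; 3.182 is the two-tailed t critical value for alpha = .05
with df = 3, taken from a t table:

```python
import math

# Test of H0: population slope = 0, with t = b / SE(b), df = n - 2.
x = [1, 2, 3, 4, 5]  # hypothetical data
y = [2, 4, 5, 4, 5]
n = len(x)
mean_x, mean_y = sum(x) / n, sum(y) / n

ss_x = sum((xi - mean_x) ** 2 for xi in x)
b = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y)) / ss_x
a = mean_y - b * mean_x

# Standard error of b: sqrt(SSE / (n - 2)) / sqrt(sum((X - mean_X)^2))
sse = sum((yi - (a + b * xi)) ** 2 for xi, yi in zip(x, y))
se_b = math.sqrt(sse / (n - 2)) / math.sqrt(ss_x)

t = b / se_b
t_crit = 3.182  # two-tailed, alpha = .05, df = n - 2 = 3
print(t)                # about 2.12
print(abs(t) > t_crit)  # False -> fail to reject the null
```

With a sample this small the observed t falls short of the critical
value, so the null hypothesis of no linear relationship is not rejected.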

IV.  The correlation coefficient.
A.  While the regression coefficient (b) provides an
indication of the relationship between the variables, it
has no upper or lower limits and its size depends on
how the two variables are measured, making its use as a
measure of association somewhat problematic.
B.  Pearson's correlation coefficient (r), on the other hand,
ranges from -1 to +1 (negative numbers indicate negative
associations, positive numbers indicate positive
associations), with values approaching 1 (or -1)
indicating strong associations and values near 0
indicating weak associations.
C.  It can be seen as a measure of how closely individual
observations fall to the least-squares regression line.
The more observations cluster around the regression line,
the larger the absolute value of the coefficient.
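Pearson's r can be computed as the covariation of X and Y divided by the
product of their separate spreads.  A minimal sketch, again with
hypothetical data:

```python
import math

# Pearson's r:
#        sum((X - mean_X)(Y - mean_Y))
#   -----------------------------------------------
#   sqrt(sum((X - mean_X)^2) * sum((Y - mean_Y)^2))

def pearson_r(x, y):
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    num = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    den = math.sqrt(sum((xi - mean_x) ** 2 for xi in x)
                    * sum((yi - mean_y) ** 2 for yi in y))
    return num / den

x = [1, 2, 3, 4, 5]  # hypothetical data
y = [2, 4, 5, 4, 5]
r = pearson_r(x, y)
print(r)  # about 0.775: a fairly strong positive association
```

Because the numerator and denominator are both in the units of X times
the units of Y, the units cancel, which is why r is bounded between -1
and +1 regardless of how the variables are measured.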

V.  The coefficient of determination.
A.  The square of Pearson's correlation coefficient (r^2)
is called the coefficient of determination and has a PRE
(proportional reduction in error) interpretation.  It
represents the amount of improvement in predicting the
dependent variable when the least-squares regression line
is used compared to when the mean of the dependent
variable is used.
B.  When the mean of the dependent variable is used to
predict the value of the dependent variable, the sum of
the squared errors represents the total amount of
variation in the dependent variable (TSS).
C.  When the least-squares regression line is used to
predict the value of the dependent variable, the sum of
squared errors (SSE) represents the amount of variation
in the dependent variable unexplained by the independent
variable.
D.  The difference between these two represents the amount of
variation in the dependent variable explained by the
independent variable.
E.  The coefficient of determination represents the ratio of
the amount of variation explained by the independent
variable to the total amount of variation.  In other
words, it is the proportion of the total variation in the
dependent variable explained by the independent variable.
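The PRE logic in B through E can be checked numerically: the ratio
(TSS - SSE) / TSS should equal the square of Pearson's r.  A sketch
using hypothetical data:

```python
# Coefficient of determination as a PRE measure:
#   TSS = squared errors when predicting every Y with mean_Y
#   SSE = squared errors when predicting Y with the regression line
#   r^2 = (TSS - SSE) / TSS, the proportion of variation explained

x = [1, 2, 3, 4, 5]  # hypothetical data
y = [2, 4, 5, 4, 5]
n = len(x)
mean_x, mean_y = sum(x) / n, sum(y) / n

b = (sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
     / sum((xi - mean_x) ** 2 for xi in x))
a = mean_y - b * mean_x

tss = sum((yi - mean_y) ** 2 for yi in y)                     # total
sse = sum((yi - (a + b * xi)) ** 2 for xi, yi in zip(x, y))   # unexplained
r_squared = (tss - sse) / tss                                 # explained / total
print(tss, sse, r_squared)  # TSS = 6.0, SSE about 2.4, r^2 about 0.6
```

Here about 60 percent of the variation in the dependent variable is
explained by the independent variable; using the regression line instead
of the mean reduces prediction error by that proportion.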

VI.  Testing hypotheses about correlation coefficients.
A.  State the null hypotheses.
B.  Choose a statistical test.
C.  Check the assumptions.
1.  Random probability samples.
2.  Two interval variables.
3.  Distributions of both variables normal.
4.  Linear relationship.
5.  Distribution of dependent variable same at all
levels of independent variable (homoscedasticity).
6.  Sample size sufficiently large.
D.  Select an alpha level.
E.  Calculate the test statistic and make a decision about
the null hypothesis.
1.  Calculate the statistic.
2.  Determine the probability.
3.  Compare the probability to the alpha level.
4.  Make a decision about the null.
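Step E for the correlation coefficient can be sketched the same way as the
test for the slope.  For H0: population correlation = 0, the usual statistic
is t = r * sqrt(n - 2) / sqrt(1 - r^2), again with df = n - 2; with a single
independent variable this t is identical to the t for the slope.  Data and
critical value are hypothetical illustration, as before:

```python
import math

# Test of H0: population correlation = 0, using
#   t = r * sqrt(n - 2) / sqrt(1 - r^2),  df = n - 2.
x = [1, 2, 3, 4, 5]  # hypothetical data
y = [2, 4, 5, 4, 5]
n = len(x)
mean_x, mean_y = sum(x) / n, sum(y) / n

num = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
den = math.sqrt(sum((xi - mean_x) ** 2 for xi in x)
                * sum((yi - mean_y) ** 2 for yi in y))
r = num / den

t = r * math.sqrt(n - 2) / math.sqrt(1 - r ** 2)
t_crit = 3.182  # two-tailed, alpha = .05, df = 3
print(t)                # about 2.12, the same t as the test of the slope
print(abs(t) > t_crit)  # False -> fail to reject the null
```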
```