LINEAR REGRESSION

I. Analyzing the relationship between two interval-level (numeric) variables.
  A. Two commonly used statistics measure the relationship between interval-level variables: the linear regression coefficient and Pearson's correlation coefficient. Both are measures of the linear relationship between two interval variables.
  B. Both provide some measure of how strong a relationship is and of its direction (positive or negative). Pearson's correlation coefficient is similar to the measures of association for categorical variables discussed earlier in the course.
II. Linear relationships and regression.
  A. Two variables are linearly related if increases in one variable are accompanied by relatively consistent increases in the other (a positive linear relationship) or if increases in one variable are accompanied by relatively consistent decreases in the other (a negative linear relationship).
  B. Linear relationships can be illustrated graphically using scattergrams, which plot the scores of cases on the two variables simultaneously. Scores on one variable (the independent variable) are represented by the horizontal axis and scores on the other variable (the dependent variable) are represented by the vertical axis. Dots are used to represent the scores of individual respondents on the two variables.
  C. If a linear relationship exists, the dots on a scattergram will line up in a relatively consistent pattern. That does not necessarily mean they fall exactly in line with each other, but that they cluster in a somewhat linear fashion. If there is no linear relationship, the dots will be distributed randomly, or according to some other pattern.
  D. If a linear relationship exists, a line can be drawn that represents the general linear pattern of the dots. The goal is to draw the line so that the distance between each dot and the line is as small as possible.
     This line is called the least-squares regression line because the sum of the squared vertical distances between each dot and the line (the errors) is the smallest possible. It is represented by the formula Y = a + bX, where Y is the predicted value of the dependent variable, a is the Y intercept (the place where the regression line crosses the vertical axis), b is the slope of the line, and X is the score on the independent variable.
  E. The Y intercept (a), sometimes called the constant, is the value of Y when X is zero.
    1. Calculating the constant.
    2. Interpreting the constant.
  F. The slope of the line (b), sometimes called the regression coefficient, represents how many units the dependent variable (Y) changes, on average, when the independent variable (X) increases by one unit.
    1. Calculating the slope.
    2. Interpreting the slope.
      a. Negative and positive slopes.
      b. Magnitude of the slope.
III. Testing hypotheses about regression coefficients.
  A. State the null hypothesis.
  B. Choose a statistical test.
  C. Check the assumptions.
    1. Random probability samples.
    2. Two interval variables.
    3. Distributions of both variables normal.
    4. Linear relationship.
    5. Distribution of dependent variable same at all levels of independent variable (homoscedasticity).
    6. Sample size sufficiently large.
  D. Choose an alpha level.
  E. Compute the test statistic and make a decision about the null.
    1. Compute the statistic.
    2. Determine the probability associated with the statistic.
    3. Compare the probability to the alpha level.
    4. Make a decision about the null.
IV. The correlation coefficient.
  A. While the regression coefficient (b) provides an indication of the relationship between the variables, it has no upper or lower limits and its size depends on the units in which the two variables are measured, making its use as a measure of association somewhat problematic.
  B. Pearson's correlation coefficient (r), on the other hand, ranges from -1 to +1 (negative numbers indicate negative associations, positive numbers indicate positive associations), with values approaching 1 (or -1) indicating strong associations and values near 0 indicating weak associations.
  C. It can be seen as a measure of how closely individual observations fall to the least-squares regression line. The more closely observations cluster around the regression line, the larger the absolute value of the coefficient.
V. The coefficient of determination.
  A. The square of Pearson's correlation coefficient (r²) is called the coefficient of determination and has a proportional reduction in error (PRE) interpretation. It represents the proportional improvement in predicting the dependent variable when the least-squares regression line is used compared to when the mean of the dependent variable is used.
  B. When the mean of the dependent variable is used to predict the value of the dependent variable, the sum of the squared errors represents the total amount of variation in the dependent variable (TSS).
  C. When the least-squares regression line is used to predict the value of the dependent variable, the sum of squared errors represents the amount of variation in the dependent variable unexplained by the independent variable.
  D. The difference between these two represents the amount of variation in the dependent variable explained by the independent variable.
  E. The coefficient of determination represents the ratio of the amount of variation explained by the independent variable to the total amount of variation. In other words, it is the proportion of the total variation in the dependent variable explained by the independent variable.
VI. Testing hypotheses about correlation coefficients.
  A. State the null hypothesis.
  B. Choose a statistical test.
  C. Check the assumptions.
    1. Random probability samples.
    2. Two interval variables.
    3. Distributions of both variables normal.
    4. Linear relationship.
    5. Distribution of dependent variable same at all levels of independent variable (homoscedasticity).
    6. Sample size sufficiently large.
  D. Select an alpha level.
  E. Calculate the test statistic and make a decision about the null hypothesis.
    1. Calculate the statistic.
    2. Determine the probability.
    3. Compare the probability to the alpha level.
    4. Make a decision about the null.
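
Worked example. The calculations named in this outline (the slope, the constant, r, r², and the test statistic) can be sketched in a few lines of Python. This is a minimal illustration, not the course's prescribed procedure. The five pairs of scores are made up, and the names fit_line and result are hypothetical. It assumes the usual textbook formulas: b is the sum of cross-products divided by the sum of squares of X, a is the mean of Y minus b times the mean of X, and the t statistic has n - 2 degrees of freedom.

```python
import math

# Hypothetical scores for five cases (illustrative only, not from the notes).
x = [1, 2, 3, 4, 5]   # independent variable (X), horizontal axis
y = [2, 4, 5, 4, 5]   # dependent variable (Y), vertical axis


def fit_line(x, y):
    """Return least-squares and correlation quantities for Y = a + bX."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n

    # Sums of squares and cross-products around the means.
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    tss = sum((yi - mean_y) ** 2 for yi in y)  # errors when predicting with the mean of Y
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))

    b = sxy / sxx              # slope (regression coefficient)
    a = mean_y - b * mean_x    # Y intercept (constant)
    r = sxy / math.sqrt(sxx * tss)  # Pearson's correlation coefficient

    # Errors when predicting with the regression line (unexplained variation).
    sse = sum((yi - (a + b * xi)) ** 2 for xi, yi in zip(x, y))
    r_squared = (tss - sse) / tss   # explained variation / total variation

    # t statistic with n - 2 degrees of freedom. For a single X it tests the
    # slope and the correlation at once. Undefined when r is exactly +1 or -1.
    t = r * math.sqrt(n - 2) / math.sqrt(1 - r ** 2)

    return {"a": a, "b": b, "r": r, "r_squared": r_squared,
            "tss": tss, "sse": sse, "t": t, "df": n - 2}


result = fit_line(x, y)
for name, value in result.items():
    print(name, round(value, 3))
```

For these made-up scores the sketch gives b = 0.6, a = 2.2, r = 0.775, r² = 0.6, and t = 2.121 with 3 degrees of freedom. A standard t table gives a two-tailed critical value of 3.182 at alpha = .05 with 3 degrees of freedom, so with only five cases the null hypothesis would not be rejected even though r is fairly large.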