Exercise 15: Regression diagnostics
Download this
data file. It contains simulated, correlated variables. The outcome
and predictors should be obvious from their variable names. Write a
statement or two for each of the following questions:
- Are the residuals normally distributed? Use a P-P plot, as
described in lecture. Describe what you see.
- Also check a histogram of the residuals for normality.
Eyeball it against a normal distribution overlay. Write a brief
statement about what you see.
- Run the regression model, this time saving distance (standardized
residual), leverage, and influence (use Cook's D; although it is under
the "Distance" box, it can be considered a measure of influence). List
the cases (by "casenum") with the top four scores on each of these
three measures.
- Write a sentence or two describing which cases you
might consider "throwing out." Choose one case to throw out based on
this description. Tell me which case.
- What happens to the regression model when you toss that case out?
(A simple SPSS challenge: To toss out a case, use the "Select Cases"
function and an "If..." condition based on casenum.) Write a few
statements describing the effect of throwing that case out.
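
If you want to cross-check your SPSS output somewhere else, the steps
above can be roughed out in Python. This is only a sketch under stated
assumptions: the data below are made up on the spot (they are NOT the
course data file), the names x1, x2, and y are placeholders, and the
helper functions are hypothetical ones written for this illustration,
using only NumPy.

```python
from math import erf, sqrt
import numpy as np

def regression_diagnostics(X, y):
    """Fit OLS with an intercept and return per-case diagnostics."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    n = len(y)
    design = np.column_stack([np.ones(n), X])    # intercept + predictors
    p = design.shape[1]                          # number of coefficients
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    resid = y - design @ coef
    mse = resid @ resid / (n - p)
    # Diagonal of the hat matrix: h_i = x_i' (X'X)^-1 x_i (raw leverage)
    xtx_inv = np.linalg.inv(design.T @ design)
    hat = np.einsum("ij,jk,ik->i", design, xtx_inv, design)
    zresid = resid / np.sqrt(mse)                # residual / sqrt(MSE)
    cooks = resid**2 * hat / (p * mse * (1 - hat) ** 2)
    return {"coef": coef, "zresid": zresid, "leverage": hat, "cooks_d": cooks}

def top_cases(values, casenum, k=4):
    """Case numbers of the k largest scores, ranked by absolute value."""
    order = np.argsort(-np.abs(values))[:k]
    return [int(casenum[i]) for i in order]

def pp_points(zresid):
    """Coordinates for a normal P-P plot: observed vs. expected cum. prob."""
    z = np.sort(zresid)
    n = len(z)
    observed = (np.arange(1, n + 1) - 0.5) / n   # one common plotting position
    expected = np.array([0.5 * (1 + erf(v / sqrt(2))) for v in z])
    return observed, expected

# Made-up illustration data (an assumption, not the exercise file)
rng = np.random.default_rng(0)
n = 50
casenum = np.arange(1, n + 1)
x1 = rng.normal(size=n)
x2 = 0.6 * x1 + 0.8 * rng.normal(size=n)         # correlated with x1
y = 1 + 2 * x1 + 0.5 * x2 + rng.normal(size=n)
x1[6], y[6] = 6.0, -10.0                         # plant one bad case: casenum 7

X = np.column_stack([x1, x2])
diag = regression_diagnostics(X, y)
for name in ("zresid", "leverage", "cooks_d"):
    print(name, top_cases(diag[name], casenum))

# The "Select Cases ... If" step: refit with casenum 7 excluded
keep = casenum != 7
refit = regression_diagnostics(X[keep], y[keep])
print("with case 7:   ", np.round(diag["coef"], 2))
print("without case 7:", np.round(refit["coef"], 2))
```

Plotting the two arrays from pp_points against each other gives the
P-P plot; points hugging the diagonal suggest roughly normal residuals.
One caution when comparing numbers: to my understanding SPSS saves
centered leverage (the raw value minus 1/n), so the leverage values
here may be offset from the SPSS column even though the ranking of
cases is the same.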
Please send responses with a self-addressed stamped envelope to me
and David at PO Box psyc7302@gmail.com with the subject line
"EXERCISE 15", noting any partners.
Something valuable to consider: feel free to read Keith's section on
multicollinearity and VIF/Tolerance over the weekend (pp. 199-202). Have
a fantastic weekend.