Exercise 15: Regression diagnostics

Download this data file. It contains simulated, correlated variables. The outcome and predictors can be identified from their variable names. Write a statement or two for each of the following questions:

  1. Are the residuals normally distributed? Use a P-P plot, as described in lecture. Describe what you see.
  2. Also check the histogram of the residuals. Eyeball it against a normal distribution overlay, and write a brief statement about what you see.
  3. Run the regression model again, this time saving distance (standardized residual), leverage, and influence (use Cook's D; although it appears under the "Distance" box, it can be considered a measure of influence). List the top four scores (identifying each case by "casenum") for each of these three measures.
  4. Write a sentence or two describing which you might consider "throwing out." Choose 1 case to throw out based on this description. Tell me which case.
  5. What happens to the regression model when you toss that case out? (A simple SPSS challenge: To toss out a case, use the "Select Cases" function and an "If..." condition based on casenum.) Write a few statements describing the effect of throwing that case out.
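
If it helps to see what the three saved measures in items 3 through 5 actually compute, here is a minimal sketch in Python. It is not part of the SPSS assignment. It assumes a single predictor and uses a small made-up data set (the `cases` list, the function names, and the dictionary keys are all hypothetical, not taken from the exercise data file). Note that SPSS may report centered leverage (h minus 1/n), so its saved leverage values need not match the uncentered values computed here.

```python
import math

# Hypothetical toy data: (casenum, x, y). This is NOT the exercise data file.
cases = [(1, 1.0, 2.1), (2, 2.0, 3.9), (3, 3.0, 6.2), (4, 4.0, 7.8),
         (5, 5.0, 10.1), (6, 6.0, 11.9), (7, 12.0, 14.0)]


def fit(data):
    """Ordinary least squares for y = b0 + b1 * x; returns (b0, b1)."""
    n = len(data)
    xbar = sum(x for _, x, _ in data) / n
    ybar = sum(y for _, _, y in data) / n
    sxx = sum((x - xbar) ** 2 for _, x, _ in data)
    sxy = sum((x - xbar) * (y - ybar) for _, x, y in data)
    b1 = sxy / sxx
    return ybar - b1 * xbar, b1


def diagnostics(data):
    """Per-case standardized residual, leverage, and Cook's D."""
    n = len(data)
    p = 2  # estimated coefficients: intercept and one slope
    b0, b1 = fit(data)
    xbar = sum(x for _, x, _ in data) / n
    sxx = sum((x - xbar) ** 2 for _, x, _ in data)
    resid = [y - (b0 + b1 * x) for _, x, y in data]
    mse = sum(e ** 2 for e in resid) / (n - p)
    rows = []
    for (casenum, x, _), e in zip(data, resid):
        h = 1 / n + (x - xbar) ** 2 / sxx            # uncentered leverage
        z = e / math.sqrt(mse)                       # standardized residual
        d = (e ** 2) * h / (p * mse * (1 - h) ** 2)  # Cook's D
        rows.append({"casenum": casenum, "zresid": z,
                     "leverage": h, "cooks_d": d})
    return rows


def top(rows, key, k=4):
    """Case numbers of the k largest values of `key`, by absolute value."""
    ranked = sorted(rows, key=lambda r: abs(r[key]), reverse=True)
    return [r["casenum"] for r in ranked[:k]]


rows = diagnostics(cases)
for measure in ("zresid", "leverage", "cooks_d"):
    print(measure, top(rows, measure))

# Refit without one case, analogous to Select Cases with an If condition.
dropped = 7
slope_all = fit(cases)[1]
slope_without = fit([c for c in cases if c[0] != dropped])[1]
print("slope with all cases:", round(slope_all, 3))
print("slope without case", dropped, ":", round(slope_without, 3))
```

In this toy data the case with the largest standardized residual is not the case with the largest leverage or Cook's D, which is why the exercise asks you to list all three separately before deciding which case to throw out.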

Please email responses to me and David at psyc7302@gmail.com with the subject line "EXERCISE 15", noting any partners. Something to consider: feel free to read Keith's section on multicollinearity and VIF/Tolerance over the weekend (pp. 199-202). Have a fantastic weekend.