* ch03eg.sas, chapter 3 example, residual plots
* using the Toluca Company data (page 19);
options ls=76;
;
data toluca;
  input x y;
  run = _N_;
lines;
80 399
30 121
50 221
90 376
70 361
60 224
120 546
80 352
100 353
50 157
40 160
70 252
90 389
20 113
110 435
100 420
30 212
50 268
90 377
110 421
30 273
90 468
40 244
80 342
70 323
;
* plot the raw data;
proc plot;
  plot y*x / vpos=20;
;
proc reg;
  model y = x;
  output out=B p=yhat r=e;
  title 'Generate residual plots: Toluca Company Example (page 19).';
  title2 '(X = lot size, Y = work hours)';
;
* create the normal score variable "nscore" for NP plots;
proc rank normal=Blom;  * use Blom normal scores;
  var e;
  ranks nscore;
  * nscore = probit((rank-0.375)/(n+0.25)), see (3.6), p111;
;
* print the data, first unsorted, then sorted by residuals;
proc print;
proc sort; by e;
proc print;
;
* generate residual plots;
proc plot;
  plot e*x / vpos=25 vref=0;
  plot e*run / vpos=25 vref=0;
  plot e*yhat / vpos=25 vref=0;
  plot e*nscore / vpos=25 vref=0 href=0;


The SAS System

Plot of y*x.  Legend: A = 1 obs, B = 2 obs, etc.

[Printer plot: y (vertical axis, 0 to 600) against x (horizontal axis, 20 to 120).]


Generate residual plots: Toluca Company Example (page 19).
(X = lot size, Y = work hours)

The REG Procedure
Model: MODEL1
Dependent Variable: y

Analysis of Variance

                                 Sum of          Mean
Source              DF          Squares        Square    F Value    Pr > F

Model                1           252378        252378     105.88    <.0001
Error               23            54825    2383.71562
Corrected Total     24           307203

Root MSE             48.82331    R-Square    0.8215
Dependent Mean      312.28000    Adj R-Sq    0.8138
Coeff Var            15.63447

Parameter Estimates

                   Parameter      Standard
Variable    DF      Estimate         Error    t Value    Pr > |t|

Intercept    1      62.36586      26.17743       2.38      0.0259
x            1       3.57020       0.34697      10.29      <.0001

Obs      x      y    run       yhat           e      nscore

  1     80    399      1    347.982      51.018     1.06444
  2     30    121      2    169.472     -48.472    -0.90336
  3     50    221      3    240.876     -19.876    -0.30236
  4     90    376      4    383.684      -7.684    -0.19987
  5     70    361      5    312.280      48.720     0.90336
  6     60    224      6    276.578     -52.578    -1.06444
  7    120    546      7    490.790      55.210     1.25930
  8     80    352      8    347.982       4.018     0.19987
  9    100    353      9    419.386     -66.386    -1.51920
 10     50    157     10    240.876     -83.876    -1.96422
 11     40    160     11    205.174     -45.174    -0.76286
 12     70    252     12    312.280     -60.280    -1.25930
 13     90    389     13    383.684       5.316     0.30236
 14     20    113     14    133.770     -20.770    -0.51871
 15    110    435     15    455.088     -20.088    -0.40814
 16    100    420     16    419.386       0.614     0.09944
 17     30    212     17    169.472      42.528     0.76286
 18     50    268     18    240.876      27.124     0.51871
 19     90    377     19    383.684      -6.684    -0.09944
 20    110    421     20    455.088     -34.088    -0.63604
 21     30    273     21    169.472     103.528     1.96422
 22     90    468     22    383.684      84.316     1.51920
 23     40    244     23    205.174      38.826     0.63604
 24     80    342     24    347.982      -5.982     0.00000
 25     70    323     25    312.280      10.720     0.40814

Obs      x      y    run       yhat           e      nscore

  1     50    157     10    240.876     -83.876    -1.96422
  2    100    353      9    419.386     -66.386    -1.51920
  3     70    252     12    312.280     -60.280    -1.25930
  4     60    224      6    276.578     -52.578    -1.06444
  5     30    121      2    169.472     -48.472    -0.90336
  6     40    160     11    205.174     -45.174    -0.76286
  7    110    421     20    455.088     -34.088    -0.63604
  8     20    113     14    133.770     -20.770    -0.51871
  9    110    435     15    455.088     -20.088    -0.40814
 10     50    221      3    240.876     -19.876    -0.30236
 11     90    376      4    383.684      -7.684    -0.19987
 12     90    377     19    383.684      -6.684    -0.09944
 13     80    342     24    347.982      -5.982     0.00000
 14    100    420     16    419.386       0.614     0.09944
 15     80    352      8    347.982       4.018     0.19987
 16     90    389     13    383.684       5.316     0.30236
 17     70    323     25    312.280      10.720     0.40814
 18     50    268     18    240.876      27.124     0.51871
 19     40    244     23    205.174      38.826     0.63604
 20     30    212     17    169.472      42.528     0.76286
 21     70    361      5    312.280      48.720     0.90336
 22     80    399      1    347.982      51.018     1.06444
 23    120    546      7    490.790      55.210     1.25930
 24     90    468     22    383.684      84.316     1.51920
 25     30    273     21    169.472     103.528     1.96422


Plot of e*x.  Legend: A = 1 obs, B = 2 obs, etc.

[Printer plot: Residual (vertical axis, -100 to 100) against x (horizontal axis, 20 to 120), with a horizontal reference line at 0.]


Plot of e*run.  Legend: A = 1 obs, B = 2 obs, etc.

[Printer plot: Residual (vertical axis, -100 to 100) against run (horizontal axis, 0 to 25), with a horizontal reference line at 0.]


Plot of e*yhat.  Legend: A = 1 obs, B = 2 obs, etc.

[Printer plot: Residual (vertical axis, -100 to 100) against Predicted Value of y (horizontal axis, 133.77 to 490.79), with a horizontal reference line at 0.]


Plot of e*nscore.  Legend: A = 1 obs, B = 2 obs, etc.

[Printer plot: Residual (vertical axis, -100 to 100) against Rank for Variable e (horizontal axis, -2 to 2), with reference lines at 0 on both axes.]
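
The program comment gives the Blom normal score as nscore = probit((rank-0.375)/(n+0.25)). The short data step below is a sketch, not part of ch03eg.sas, that evaluates that formula directly for n = 25 so the values can be compared with the nscore column of the sorted listing. The data set name blomchk is made up for illustration, and the sketch assumes there are no tied residuals, so the ranks are simply 1 through 25.

```sas
* hypothetical check, not part of ch03eg.sas;
* recompute the Blom normal scores from the formula in the comment;
data blomchk;                 * blomchk is an illustrative name;
  n = 25;                     * number of residuals in the Toluca example;
  do rank = 1 to n;
    * same expression as in the proc rank comment;
    nscore = probit((rank - 0.375) / (n + 0.25));
    output;
  end;
;
proc print data=blomchk;
  var rank nscore;
  format nscore 8.5;          * five decimals, as in the listing;
run;
```

For the middle rank, 13, the argument is 12.625/25.25 = 0.5, so the score is probit(0.5) = 0, which agrees with the 0.00000 listed for the 13th sorted residual. The remaining values should line up with the sorted nscore column in the same way, if proc rank applies the formula exactly as the comment states.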