* egreg1.sas, simple linear regression example,
* using the Toluca Company data (page 19);
options ls=76;
;
data toluca;
title 'Simple Linear Regression: Toluca Company example, page 19.';
input x y;
* create new variables;
x2=x**2;
y2=y**2;
xy=x*y;
lines;
80 399
30 121
50 221
90 376
70 361
60 224
120 546
80 352
100 353
50 157
40 160
70 252
90 389
20 113
110 435
100 420
30 212
50 268
90 377
110 421
30 273
90 468
40 244
80 342
70 323
;
* print the data set;
proc print;
;
* compute the sum of values for each variable;
proc means sum;
;
* add an observation with x=65 and y missing to the data set;
data newobs;
x=65; y=.;
output;
data toluca2;
set toluca newobs;  * concatenate data sets;
;
* fit a simple linear regression model, saving residuals, predicted
* values, and other statistics into a new data set named "stats";
proc reg alpha=0.05;  * specify alpha for CIs and PIs;
model y = x;
output out=stats residual=e predicted=yhat
       LCLM=LCLM UCLM=UCLM stdp=stdm
       LCL=LCL UCL=UCL stdi=stdi;
;
* plot the fitted regression line and data;
proc plot;
plot y*x yhat*x='*' / overlay;
;
* print the data fitted values, residuals, and other statistics;
proc print;
var x y yhat e LCLM UCLM stdm LCL UCL stdi;
;
* compute Blom's normal scores for the residuals;
proc rank normal=Blom;
var e;
ranks nscore;
;
* generate residual plots;
proc plot;
plot e*x / vpos=20 vref=0;
plot e*yhat / vpos=20 vref=0;
plot e*nscore / vpos=20 vref=0 href=0;
;
* compute sample correlations;
proc corr nosimple;
var x y e nscore;

Simple Linear Regression: Toluca Company example, page 19.
08:47 Monday, September 18, 2006

Obs      x      y       x2        y2       xy
  1     80    399     6400    159201    31920
  2     30    121      900     14641     3630
  3     50    221     2500     48841    11050
  4     90    376     8100    141376    33840
  5     70    361     4900    130321    25270
  6     60    224     3600     50176    13440
  7    120    546    14400    298116    65520
  8     80    352     6400    123904    28160
  9    100    353    10000    124609    35300
 10     50    157     2500     24649     7850
 11     40    160     1600     25600     6400
 12     70    252     4900     63504    17640
 13     90    389     8100    151321    35010
 14     20    113      400     12769     2260
 15    110    435    12100    189225    47850
 16    100    420    10000    176400    42000
 17     30    212      900     44944     6360
 18     50    268     2500     71824    13400
 19     90    377     8100    142129    33930
 20    110    421    12100    177241    46310
 21     30    273      900     74529     8190
 22     90    468     8100    219024    42120
 23     40    244     1600     59536     9760
 24     80    342     6400    116964    27360
 25     70    323     4900    104329    22610

The MEANS Procedure

Variable             Sum
------------------------
x                1750.00
y                7807.00
x2             142300.00
y2            2745173.00
xy             617180.00
------------------------

The REG Procedure
Model: MODEL1
Dependent Variable: y

Number of Observations Read                         26
Number of Observations Used                         25
Number of Observations with Missing Values           1

                      Analysis of Variance

                            Sum of          Mean
Source             DF      Squares        Square   F Value   Pr > F
Model               1       252378        252378    105.88   <.0001
Error              23        54825    2383.71562
Corrected Total    24       307203

Root MSE           48.82331    R-Square    0.8215
Dependent Mean    312.28000    Adj R-Sq    0.8138
Coeff Var          15.63447

                      Parameter Estimates

                  Parameter     Standard
Variable    DF     Estimate        Error   t Value   Pr > |t|
Intercept    1     62.36586     26.17743      2.38     0.0259
x            1      3.57020      0.34697     10.29     <.0001

Plot of y*x.     Legend: A = 1 obs, B = 2 obs, etc.
Plot of yhat*x.  Symbol used is '*'.
[Line-printer plot not reproduced: y (vertical axis, 113.00 to 577.13) against x (horizontal axis, 20 to 120), with fitted values yhat overlaid as '*'.]

NOTE: 1 obs had missing values.  17 obs hidden.

Obs    x    y     yhat        e     LCLM     UCLM     stdm      LCL      UCL     stdi
  1   80  399  347.982   51.018  326.545  369.419  10.3628  244.733  451.231  49.9110
  2   30  121  169.472  -48.472  134.367  204.577  16.9697   62.546  276.397  51.6884
  3   50  221  240.876  -19.876  216.095  265.657  11.9793  136.882  344.870  50.2715
  4   90  376  383.684   -7.684  358.903  408.465  11.9793  279.690  487.678  50.2715
  5   70  361  312.280   48.720  292.080  332.480   9.7647  209.281  415.279  49.7902
  6   60  224  276.578  -52.578  255.141  298.015  10.3628  173.329  379.827  49.9110
  7  120  546  490.790   55.210  449.608  531.973  19.9079  381.718  599.862  52.7261
  8   80  352  347.982    4.018  326.545  369.419  10.3628  244.733  451.231  49.9110
  9  100  353  419.386  -66.386  389.862  448.911  14.2723  314.160  524.612  50.8666
 10   50  157  240.876  -83.876  216.095  265.657  11.9793  136.882  344.870  50.2715
 11   40  160  205.174  -45.174  175.649  234.698  14.2723   99.948  310.400  50.8666
 12   70  252  312.280  -60.280  292.080  332.480   9.7647  209.281  415.279  49.7902
 13   90  389  383.684    5.316  358.903  408.465  11.9793  279.690  487.678  50.2715
 14   20  113  133.770  -20.770   92.587  174.952  19.9079   24.698  242.842  52.7261
 15  110  435  455.088  -20.088  419.983  490.193  16.9697  348.163  562.014  51.6884
 16  100  420  419.386    0.614  389.862  448.911  14.2723  314.160  524.612  50.8666
 17   30  212  169.472   42.528  134.367  204.577  16.9697   62.546  276.397  51.6884
 18   50  268  240.876   27.124  216.095  265.657  11.9793  136.882  344.870  50.2715
 19   90  377  383.684   -6.684  358.903  408.465  11.9793  279.690  487.678  50.2715
 20  110  421  455.088  -34.088  419.983  490.193  16.9697  348.163  562.014  51.6884
 21   30  273  169.472  103.528  134.367  204.577  16.9697   62.546  276.397  51.6884
 22   90  468  383.684   84.316  358.903  408.465  11.9793  279.690  487.678  50.2715
 23   40  244  205.174   38.826  175.649  234.698  14.2723   99.948  310.400  50.8666
 24   80  342  347.982   -5.982  326.545  369.419  10.3628  244.733  451.231  49.9110
 25   70  323  312.280   10.720  292.080  332.480   9.7647  209.281  415.279  49.7902
 26   65    .  294.429     .     273.913  314.945   9.9176  191.368  397.490  49.8204

Plot of e*x.  Legend: A = 1 obs, B = 2 obs, etc.

[Line-printer plot not reproduced: Residual (vertical axis, -100 to 100, reference line at 0) against x (horizontal axis, 20 to 120).]

NOTE: 1 obs had missing values.

Plot of e*yhat.  Legend: A = 1 obs, B = 2 obs, etc.

[Line-printer plot not reproduced: Residual (vertical axis, -100 to 100, reference line at 0) against Predicted Value of y (horizontal axis, 133.77 to 490.79).]

NOTE: 1 obs had missing values.

Plot of e*nscore.  Legend: A = 1 obs, B = 2 obs, etc.

[Line-printer plot not reproduced: Residual (vertical axis, -100 to 100, reference line at 0) against Rank for Variable e (horizontal axis, -2 to 2, reference line at 0).]

NOTE: 1 obs had missing values.
The CORR Procedure

4 Variables:    x    y    e    nscore

            Pearson Correlation Coefficients
               Prob > |r| under H0: Rho=0
                 Number of Observations

                            x          y          e     nscore

x                     1.00000    0.90638    0.00000    0.03606
                                  <.0001     1.0000     0.8641
                           26         25         25         25

y                     0.90638    1.00000    0.42245    0.45155
                       <.0001                0.0354     0.0235
                           25         25         25         25

e                     0.00000    0.42245    1.00000    0.99151
Residual               1.0000     0.0354                <.0001
                           25         25         25         25

nscore                0.03606    0.45155    0.99151    1.00000
Rank for Variable e    0.8641     0.0235     <.0001
                           25         25         25         25
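
The program creates x2, y2, and xy and sums them with PROC MEANS, which suggests the sums are meant for checking the PROC REG results by hand. The following is a sketch of that check, not part of the original listing. The symbols S_xx, S_xy, and S_yy are notation assumed here for the corrected sums of squares and cross products, and n = 25 is the number of observations used by PROC REG.

```latex
\begin{align*}
S_{xx} &= \sum x^2 - \frac{(\sum x)^2}{n}
        = 142300 - \frac{1750^2}{25} = 19800 \\
S_{xy} &= \sum xy - \frac{(\sum x)(\sum y)}{n}
        = 617180 - \frac{(1750)(7807)}{25} = 70690 \\
S_{yy} &= \sum y^2 - \frac{(\sum y)^2}{n}
        = 2745173 - \frac{7807^2}{25} = 307203.04 \\[4pt]
b_1 &= \frac{S_{xy}}{S_{xx}} = \frac{70690}{19800} \approx 3.57020 \\
b_0 &= \bar{y} - b_1 \bar{x} = 312.28 - (3.57020)(70) \approx 62.366 \\[4pt]
SSR &= \frac{S_{xy}^2}{S_{xx}} \approx 252377.58, \qquad
SSE = S_{yy} - SSR \approx 54825.46 \\
MSE &= \frac{SSE}{n-2} = \frac{54825.46}{23} \approx 2383.72, \qquad
R^2 = \frac{SSR}{S_{yy}} \approx 0.8215
\end{align*}
```

These values agree, up to rounding, with the Parameter Estimates and Analysis of Variance tables printed by PROC REG (slope 3.57020, intercept 62.36586, error mean square 2383.71562, R-Square 0.8215).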