* ch03eg2.sas, chapter 3 example, tests involving residuals,
* using the Toluca Company data (page 19);
options ls=76;
filename data1 'toluca.dat';
;
data toluca;
  infile data1;
  input x y;
;
proc print;
;
proc reg;
  model y = x;
  output out=B p=yhat r=e;
  title 'Tests involving residuals: Toluca Company data';
;
* F-test for Lack of Fit (LOF) (Section 3.7, p119);
* The MSE generated by the ANOVA model is the MSPE;
proc glm;
  class x;
  model y = x;
* or, to get the test statistic directly...;
data B;
  set B;
  A=x;
proc glm;
  class A;
  model y = x A;
* (see pp121-123 concerning full and reduced models);
* (see p 126 for the ANOVA table, but the book uses
* a different example -- a Bank Example);
;
* create the normal score variable "nscore" for NP plots;
proc rank normal=Blom;    * use Blom's normal scores;
  var e;
  ranks nscore;
* nscore = probit((rank-0.375)/(N+0.25)), see p111;
;
* Compute sample correlation between residuals and normal
* scores (see Section 3.5, p115);
proc corr nosimple;       * Don't print simple statistics;
  var e nscore;


                               The SAS System                              1

                            Obs      x      y

                              1     80    399
                              2     30    121
                              :      :      :
                             23     40    244
                             24     80    342
                             25     70    323

Reduced and full models:
----------------------------------------------------------------------------
              Tests involving residuals: Toluca Company data               2

                             The REG Procedure
                               Model: MODEL1
                           Dependent Variable: y

                            Analysis of Variance

                                   Sum of          Mean
Source                 DF         Squares        Square   F Value   Pr > F

Model                   1          252378        252378    105.88   <.0001
Error                  23           54825    2383.71562
Corrected Total        24          307203

----------------------------------------------------------------------------
              Tests involving residuals: Toluca Company data               3

                             The GLM Procedure

                          Class Level Information

Class     Levels    Values
x             11    20 30 40 50 60 70 80 90 100 110 120

              Tests involving residuals: Toluca Company data               4

                             The GLM Procedure

Dependent Variable: y

                                   Sum of
Source                 DF         Squares    Mean Square   F Value   Pr > F

Model                  10     269622.2067     26962.2207     10.04   <.0001
Error                  14      37580.8333      2684.3452
Corrected Total        24     307203.0400

Source                 DF       Type I SS    Mean Square   F Value   Pr > F

x                      10     269622.2067     26962.2207     10.04   <.0001

Both models in one:
----------------------------------------------------------------------------
              Tests involving residuals: Toluca Company data               5

                             The GLM Procedure

                          Class Level Information

Class     Levels    Values
A             11    20 30 40 50 60 70 80 90 100 110 120

              Tests involving residuals: Toluca Company data               6

                             The GLM Procedure

Dependent Variable: y

                                   Sum of
Source                 DF         Squares    Mean Square   F Value   Pr > F

Model                  10     269622.2067     26962.2207     10.04   <.0001
Error                  14      37580.8333      2684.3452
Corrected Total        24     307203.0400

Source                 DF       Type I SS    Mean Square   F Value   Pr > F

x                       1     252377.5808    252377.5808     94.02   <.0001
A                       9      17244.6259      1916.0695      0.71   0.6893

----------------------------------------------------------------------------
Testing normality:
              Tests involving residuals: Toluca Company data               7

                             The CORR Procedure

                     2  Variables:    e        nscore

                 Pearson Correlation Coefficients, N = 25
                        Prob > |r| under H0: Rho=0

                                        e        nscore

                  e               1.00000       0.99151
                  Residual                       <.0001

                  nscore          0.99151       1.00000
                  Rank for Variable e  <.0001
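
Lack-of-fit F statistic by hand:

The "A" row of the Type I table in the second GLM run is the lack-of-fit test. As a check, the same statistic can be rebuilt from the sums of squares printed above. This is a sketch that assumes the textbook's usual notation: SSE(R) for the reduced (linear regression) model, SSE(F) = SSPE for the full (cell means) model, SSLF, MSLF and MSPE, with n = 25 cases and c = 11 distinct x levels. None of these symbols appear in the SAS output itself.

```latex
\begin{align*}
SSE(R) &= 307203.0400 - 252377.5808 = 54825.4592
        && \text{(REG prints 54825, df } n-2 = 23\text{)} \\
SSE(F) &= SSPE = 37580.8333
        && \text{(GLM Error line, df } n-c = 14\text{)} \\
SSLF   &= SSE(R) - SSE(F) = 54825.4592 - 37580.8333 = 17244.6259
        && \text{(df } c-2 = 9\text{)} \\
MSLF   &= \frac{SSLF}{c-2} = \frac{17244.6259}{9} = 1916.0695 \\
MSPE   &= \frac{SSPE}{n-c} = \frac{37580.8333}{14} = 2684.3452 \\
F^{*}  &= \frac{MSLF}{MSPE} = \frac{1916.0695}{2684.3452} \approx 0.71
\end{align*}
```

This agrees with the F Value of 0.71 on the A line. With Pr > F = 0.6893 from the same line, these data give no evidence of lack of fit for the linear regression function at the usual significance levels.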