* ch16eg1;
* This SAS program: (i) fits a cell means (1-way ANOVA)
*   model to the data of Table 16.1 (aka Kenton Food Co.)
*   in a standard way, (ii) shows the LSEs SAS uses for
*   the cell means model, and (iii) gets the same fit and
*   ANOVA as a regression model;

options ls=76 noovp;

data KFC;
  input y A j;   * A=i;
  lines;
11 1 1
17 1 2
16 1 3
14 1 4
15 1 5
12 2 1
10 2 2
15 2 3
19 2 4
11 2 5
23 3 1
20 3 2
18 3 3
17 3 4
27 4 1
33 4 2
22 4 3
26 4 4
28 4 5
;

* Do ANOVA using proc glm, each factor being a class(ification) variable;
proc glm;
  class A;
  model y = A;
  means A;
  output out=B p=yhat r=e;
  title 'Kenton Food Company, Table 16.1 data.';

* plot the raw data and fitted model;
proc plot;
  plot yhat*A='*' y*A / overlay;
  title2 'Plot of raw data and fitted model';

* get the LSEs;
proc glm;
  class A;
  model y = A / solution;
  title2 'One set of LSEs under an overparameterized model';

* regression approach to ANOVA;
data KFC_reg;
  set KFC;   * make a copy of the data set KFC;
  * create 3 indicator variables (level A=4 is the reference level);
  if A=1 then x1=1; else x1=0;
  if A=2 then x2=1; else x2=0;
  if A=3 then x3=1; else x3=0;

* fit a regression model that does the ANOVA;
proc glm;
  model y = x1 x2 x3;
  title2 'regression model equivalent to SAS ANOVA LSEs';


Kenton Food Company, Table 16.1 data.                                 1
                                        12:51 Thursday, January 8, 2009

The GLM Procedure

Class Level Information

Class    Levels    Values
A             4    1 2 3 4

Number of Observations Read    19
Number of Observations Used    19

The GLM Procedure

Dependent Variable: y

                                Sum of
Source             DF          Squares    Mean Square   F Value   Pr > F
Model               3      588.2210526    196.0736842     18.59   <.0001
Error              15      158.2000000     10.5466667
Corrected Total    18      746.4210526

R-Square    Coeff Var    Root MSE      y Mean
0.788055     17.43042    3.247563    18.63158

Source             DF        Type I SS    Mean Square   F Value   Pr > F
A                   3      588.2210526    196.0736842     18.59   <.0001

Source             DF      Type III SS    Mean Square   F Value   Pr > F
A                   3      588.2210526    196.0736842     18.59   <.0001


The GLM Procedure

Level of         --------------y--------------
A          N           Mean           Std Dev
1          5     14.6000000        2.30217289
2          5     13.4000000        3.64691651
3          4     19.5000000        2.64575131
4          5     27.2000000        3.96232255


Plot of raw data and fitted model

[Line-printer plot: yhat*A (symbol '*') overlaid on y*A
 (legend: A = 1 obs, B = 2 obs, etc.). Vertical axis: yhat, 10.0 to 35.0.
 Horizontal axis: A = 1, 2, 3, 4. NOTE: 15 obs hidden.]


One set of LSEs under an overparameterized model

The GLM Procedure

Class Level Information

Class    Levels    Values
A             4    1 2 3 4

Number of Observations Read    19
Number of Observations Used    19

The GLM Procedure

Dependent Variable: y

                                Sum of
Source             DF          Squares    Mean Square   F Value   Pr > F
Model               3      588.2210526    196.0736842     18.59   <.0001
Error              15      158.2000000     10.5466667
Corrected Total    18      746.4210526

R-Square    Coeff Var    Root MSE      y Mean
0.788055     17.43042    3.247563    18.63158

Source             DF        Type I SS    Mean Square   F Value   Pr > F
A                   3      588.2210526    196.0736842     18.59   <.0001

Source             DF      Type III SS    Mean Square   F Value   Pr > F
A                   3      588.2210526    196.0736842     18.59   <.0001

                                    Standard
Parameter          Estimate            Error    t Value    Pr > |t|
Intercept       27.20000000 B     1.45235441      18.73      <.0001
A         1    -12.60000000 B     2.05393930      -6.13      <.0001
A         2    -13.80000000 B     2.05393930      -6.72      <.0001
A         3     -7.70000000 B     2.17853162      -3.53      0.0030
A         4      0.00000000 B      .                .         .

NOTE: The X'X matrix has been found to be singular, and a generalized
      inverse was used to solve the normal equations. Terms whose
      estimates are followed by the letter 'B' are not uniquely estimable.


regression model equivalent to SAS ANOVA LSEs

The GLM Procedure

Number of Observations Read    19
Number of Observations Used    19

The GLM Procedure

Dependent Variable: y

                                Sum of
Source             DF          Squares    Mean Square   F Value   Pr > F
Model               3      588.2210526    196.0736842     18.59   <.0001
Error              15      158.2000000     10.5466667
Corrected Total    18      746.4210526

R-Square    Coeff Var    Root MSE      y Mean
0.788055     17.43042    3.247563    18.63158

Source             DF        Type I SS    Mean Square   F Value   Pr > F
x1                  1      110.2924812    110.2924812     10.46   0.0056
x2                  1      346.1730159    346.1730159     32.82   <.0001
x3                  1      131.7555556    131.7555556     12.49   0.0030

Source             DF      Type III SS    Mean Square   F Value   Pr > F
x1                  1      396.9000000    396.9000000     37.63   <.0001
x2                  1      476.1000000    476.1000000     45.14   <.0001
x3                  1      131.7555556    131.7555556     12.49   0.0030

                                  Standard
Parameter          Estimate          Error    t Value    Pr > |t|
Intercept       27.20000000     1.45235441      18.73      <.0001
x1             -12.60000000     2.05393930      -6.13      <.0001
x2             -13.80000000     2.05393930      -6.72      <.0001
x3              -7.70000000     2.17853162      -3.53      0.0030
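
As a check that the regression fit reproduces the one-way ANOVA, the arithmetic below recovers the four cell means, the standard errors, and the F statistic from the printed estimates. The notation is an assumption introduced here for illustration and does not appear in the program: $\hat\beta_0,\dots,\hat\beta_3$ are the estimates for Intercept, x1, x2, x3; $\hat\mu_i$ is the fitted mean for level $A=i$; $n_i$ is the number of observations at level $i$; MSE and MSR are the Error and Model mean squares.

```latex
\begin{align*}
\hat\mu_4 &= \hat\beta_0 = 27.2 \\
\hat\mu_1 &= \hat\beta_0 + \hat\beta_1 = 27.2 - 12.6 = 14.6 \\
\hat\mu_2 &= \hat\beta_0 + \hat\beta_2 = 27.2 - 13.8 = 13.4 \\
\hat\mu_3 &= \hat\beta_0 + \hat\beta_3 = 27.2 - 7.7 = 19.5 \\[4pt]
s\{\hat\beta_0\} &= \sqrt{\mathrm{MSE}/n_4}
                  = \sqrt{10.5466667/5} \approx 1.4524 \\
s\{\hat\beta_1\} = s\{\hat\beta_2\}
                 &= \sqrt{\mathrm{MSE}\left(\tfrac{1}{5} + \tfrac{1}{5}\right)}
                  \approx 2.0539 \\
s\{\hat\beta_3\} &= \sqrt{\mathrm{MSE}\left(\tfrac{1}{4} + \tfrac{1}{5}\right)}
                  \approx 2.1785 \\[4pt]
F &= \frac{\mathrm{MSR}}{\mathrm{MSE}}
   = \frac{196.0736842}{10.5466667} \approx 18.59
\end{align*}
```

The recovered means agree with the Mean column printed by `means A`, and the same relations hold for the `solution` estimates of the overparameterized model, where SAS sets the effect of the last level (A = 4) to zero.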