
15. Analysis of Variance
There are situations and experiments that require processes to be compared at more than two levels. Data from such experiments can be analysed using analysis of variance or ANOVA.
15.1. Comparing Population Means
ANOVA is not the only way to compare population means, but the other methods assume either paired observations or independent random samples, and they can compare only two population means. ANOVA can compare more than two populations, and it is built on assessments of variation, which pose a substantial problem for the other methods.
15.2. One-Way ANOVA
The procedure for testing the equality of population means is called a one-way ANOVA. This procedure is based on the assumption that all included populations have a common variance.
The total sum of squares (SST) in this procedure is made up of a within-group sum of squares (SSW) and a between groups sum of squares (SSG): SST = SSW + SSG
This division of the SST forms the basis of the one-way ANOVA, as it expresses the total variability around the mean for the sample observations.
If the null hypothesis is true (all population means are the same) then both SSW and SSG can be used to estimate the common population variance. This is done by dividing by the appropriate number of degrees of freedom.
When the null hypothesis is true, SSW and SSG, each divided by its degrees of freedom, both provide an unbiased estimate of the common population variance, so a large difference between the two estimates indicates that the null hypothesis is false. The test of the null hypothesis is thus based on the ratio of mean squares: $F = \frac{MSG}{MSW}$
where $MSG = \frac{SSG}{K - 1}$ and $MSW = \frac{SSW}{n - K}$. The test assumes that the population variances are equal and that the population distributions are normal.
The closer the ratio is to 1, the less indication there is that the null hypothesis is false.
These results are also summarized in a one-way ANOVA table, which has the following format:
Source of Variation | Sum of Squares | Degrees of Freedom | Mean Squares | F-ratio |
Between groups | SSG | K – 1 | MSG | MSG/MSW |
Within groups | SSW | n – K | MSW | |
Total | SST | n – 1 | | |
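The one-way decomposition can be illustrated with a minimal sketch in plain Python. The function name `one_way_anova` and the sample data are made up for illustration and do not come from the summary; the code only applies SST = SSW + SSG and the degrees of freedom K – 1 and n – K listed in the table.

```python
from statistics import mean


def one_way_anova(groups):
    """Return sums of squares, mean squares and the F-ratio for K samples.

    `groups` is a list of K lists of observations (hypothetical input format).
    """
    K = len(groups)                      # number of populations
    n = sum(len(g) for g in groups)      # total number of observations
    overall = sum(sum(g) for g in groups) / n
    group_means = [mean(g) for g in groups]

    # Within-groups: squared deviations of observations from their own group mean.
    ssw = sum((x - m) ** 2 for g, m in zip(groups, group_means) for x in g)
    # Between-groups: squared deviations of group means from the overall mean,
    # weighted by group size.
    ssg = sum(len(g) * (m - overall) ** 2 for g, m in zip(groups, group_means))
    # Total: squared deviations of all observations from the overall mean.
    sst = sum((x - overall) ** 2 for g in groups for x in g)

    msg = ssg / (K - 1)
    msw = ssw / (n - K)
    return {"SSW": ssw, "SSG": ssg, "SST": sst,
            "MSG": msg, "MSW": msw, "F": msg / msw}


# Made-up data for three groups of three observations each.
samples = [[1.0, 2.0, 3.0], [2.0, 3.0, 4.0], [6.0, 7.0, 8.0]]
result = one_way_anova(samples)
print(result)
```

For this made-up data the group means are 2, 3 and 7 around an overall mean of 4, so SSW = 6, SSG = 42 and SST = 48, which gives MSG = 21, MSW = 1 and F = 21. The F-ratio would still have to be compared with a critical value from an F table with K – 1 and n – K degrees of freedom.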
It is also possible to calculate a minimum significant difference (MSD) between two sample means, which serves as evidence for concluding whether the corresponding population means differ. It is calculated as: $MSD(K) = Q \frac{s_p}{\sqrt{n}}$
where $s_p$ is the square root of the estimate of variance ($s_p = \sqrt{MSW}$), n the number of observations, K the number of populations, and Q a factor from Table 13 of the Appendix.
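As a rough sketch of how the MSD might be used, the snippet below assumes the form MSD(K) = Q · sp / √n with sp = √MSW. The function name, the value of Q, and the sample means are all hypothetical; in practice Q has to be looked up in the studentized range table for the chosen significance level, K, and n – K.

```python
from itertools import combinations
from math import sqrt


def minimum_significant_difference(q, msw, n):
    """Assumed form: MSD(K) = Q * s_p / sqrt(n), with s_p = sqrt(MSW)."""
    s_p = sqrt(msw)
    return q * s_p / sqrt(n)


# Placeholder inputs, NOT real table values or real data.
q_from_table = 3.5
msd = minimum_significant_difference(q_from_table, msw=4.0, n=16)

# Hypothetical sample means for three populations.
means = {"A": 10.0, "B": 11.0, "C": 14.0}

# Pairs whose sample means differ by more than the MSD.
different = [(a, b) for a, b in combinations(means, 2)
             if abs(means[a] - means[b]) > msd]
print(msd, different)
```

With these placeholder numbers the MSD is 1.75, so the pairs (A, C) and (B, C) are flagged as different while (A, B) is not.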
15.3. Kruskal-Wallis Test
The Kruskal-Wallis test is a nonparametric alternative to the one-way ANOVA and is used when there is a strong indication that the parent population distributions are markedly different from the normal. Like the majority of nonparametric tests, this test is based on the ranks of the sample observations. The test of the null hypothesis is based on the statistic: $W = \frac{12}{n(n+1)} \sum_{i=1}^{K} \frac{R_i^2}{n_i} - 3(n+1)$
where $R_i$ is the sum of the ranks of the observations in sample i, $n_i$ is the number of observations in that sample, and n is the total number of observations. The null hypothesis is rejected if W is larger than $\chi^2_{K-1,\alpha}$, the number that is exceeded with probability α by a random χ² variable with (K – 1) degrees of freedom.
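The rank-based calculation can be sketched in plain Python as follows, assuming the usual Kruskal-Wallis statistic W = 12 / (n(n + 1)) · Σ Ri² / ni – 3(n + 1), where Ri is the rank sum of sample i. The function name and data are made up; tied values receive the average of their ranks, and no further tie correction is applied.

```python
def kruskal_wallis_w(groups):
    """Kruskal-Wallis statistic from pooled ranks (no tie correction)."""
    # Pool all observations, remembering which sample each one came from.
    pooled = sorted((x, gi) for gi, g in enumerate(groups) for x in g)
    n = len(pooled)
    rank_sums = [0.0] * len(groups)

    i = 0
    while i < n:
        # Find the run of tied values starting at position i.
        j = i
        while j < n and pooled[j][0] == pooled[i][0]:
            j += 1
        # Ranks i+1 .. j are shared; each tied value gets their average.
        avg_rank = (i + 1 + j) / 2
        for k in range(i, j):
            rank_sums[pooled[k][1]] += avg_rank
        i = j

    total = sum(r ** 2 / len(g) for r, g in zip(rank_sums, groups))
    return 12.0 / (n * (n + 1)) * total - 3 * (n + 1)


# Made-up data: three samples of three observations each.
w = kruskal_wallis_w([[1, 2, 3], [4, 5, 6], [7, 8, 9]])

# Critical value of chi-square with K - 1 = 2 degrees of freedom at alpha = 0.05.
CHI2_CRITICAL_2DF_05 = 5.991
reject_null = w > CHI2_CRITICAL_2DF_05
print(w, reject_null)
```

Here the rank sums are 6, 15 and 24, so W is approximately 7.2, which exceeds 5.991 and leads to rejecting the null hypothesis at the 5% level for this made-up data.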
15.4. Two-Way ANOVA
If there is a situation where a second factor also influences the outcome, it is best to design the experiment in such a manner that the influence of this factor can also be taken into account. This additional variable is then called a blocking variable and this design is called a randomized block design, the outcomes of which can be analysed using a two-way ANOVA.
The design is called randomized because the categories of the two independent variables are combined at random.
Using the observation for the ith group and the jth block, the population model can be written as follows: Xij = μ + Gi + Bj + εij.
Here Xij is the random variable, μ is the overall mean, the parameter Gi measures the discrepancy between the mean of group i and μ, the parameter Bj measures the discrepancy between the mean of block j and μ, and εij represents the experimental error.
In a two-way ANOVA the SST is split into the between-blocks sum of squares (SSB), the between-groups sum of squares (SSG), and the error sum of squares (SSE): SST = SSB + SSG + SSE.
The null hypothesis that the population group means are equal is then tested through the ratio of the mean square for groups to the mean square error: $F = \frac{MSG}{MSE}$.
The results of a two-way ANOVA are also best summarized in a two-way ANOVA table. This has the same set-up as a one-way ANOVA table, except for the sources of variation (between groups, between blocks, error, and total).
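The randomized block decomposition can be sketched as below, assuming one observation per group-block combination. The function name, the symbol H for the number of blocks, and the data are assumptions made for this illustration; the degrees of freedom used are K – 1 for groups, H – 1 for blocks, and (K – 1)(H – 1) for error.

```python
def randomized_block_anova(table):
    """Two-way ANOVA for table[i][j] = observation of group i in block j."""
    K = len(table)       # number of groups
    H = len(table[0])    # number of blocks (symbol H is an assumption)
    n = K * H
    overall = sum(sum(row) for row in table) / n
    group_means = [sum(row) / H for row in table]
    block_means = [sum(table[i][j] for i in range(K)) / K for j in range(H)]

    ssg = H * sum((m - overall) ** 2 for m in group_means)
    ssb = K * sum((m - overall) ** 2 for m in block_means)
    # Error: what is left after removing the group and block effects.
    sse = sum((table[i][j] - group_means[i] - block_means[j] + overall) ** 2
              for i in range(K) for j in range(H))
    sst = sum((x - overall) ** 2 for row in table for x in row)

    msg = ssg / (K - 1)
    msb = ssb / (H - 1)
    mse = sse / ((K - 1) * (H - 1))
    return {"SSG": ssg, "SSB": ssb, "SSE": sse, "SST": sst,
            "F_groups": msg / mse, "F_blocks": msb / mse}


# Made-up data: rows are groups, columns are blocks.
block_result = randomized_block_anova([[1, 3, 2],
                                       [4, 4, 7],
                                       [7, 8, 9]])
print(block_result)
```

For this made-up table SSG = 54, SSB = 6 and SSE = 4, which add up to SST = 64; the F-ratio for groups is 27 and the F-ratio for blocks is 3.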
15.5. Two-Way ANOVA with Multiple Observations per Cell
It is also possible to have more than one observation per cell. This has two advantages:
- More sample data leads to more precise estimates, meaning that the differences among the population means can be distinguished better.
- The interaction between groups and blocks, as a source of variability, can be isolated.
This model thus has three null hypotheses: no difference between group means, no difference between block means, and no group-block interaction.
In this model the SST contains one more component, the interaction sum of squares (SSI), which corresponds to the extra source of variation (interaction): SST = SSG + SSB + SSI + SSE.
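A minimal sketch of the decomposition with several observations per cell is given below. It assumes a balanced design (the same number of observations in every cell); the function name, the symbols H (number of blocks) and L (observations per cell), and the data are all made up for illustration.

```python
def two_way_anova_with_interaction(cells):
    """cells[i][j] is the list of observations for group i and block j.

    Assumes a balanced design: every cell holds the same number of values.
    """
    K, H, L = len(cells), len(cells[0]), len(cells[0][0])
    n = K * H * L
    overall = sum(x for row in cells for cell in row for x in cell) / n
    cell_means = [[sum(cell) / L for cell in row] for row in cells]
    group_means = [sum(row) / H for row in cell_means]
    block_means = [sum(cell_means[i][j] for i in range(K)) / K
                   for j in range(H)]

    ssg = H * L * sum((m - overall) ** 2 for m in group_means)
    ssb = K * L * sum((m - overall) ** 2 for m in block_means)
    # Interaction: cell means not explained by group and block effects alone.
    ssi = L * sum((cell_means[i][j] - group_means[i] - block_means[j]
                   + overall) ** 2
                  for i in range(K) for j in range(H))
    # Error: variation of observations around their own cell mean.
    sse = sum((x - cell_means[i][j]) ** 2
              for i in range(K) for j in range(H) for x in cells[i][j])
    sst = sum((x - overall) ** 2 for row in cells for cell in row for x in cell)

    mse = sse / (K * H * (L - 1))
    return {"SSG": ssg, "SSB": ssb, "SSI": ssi, "SSE": sse, "SST": sst,
            "F_groups": ssg / (K - 1) / mse,
            "F_blocks": ssb / (H - 1) / mse,
            "F_interaction": ssi / ((K - 1) * (H - 1)) / mse}


# Made-up data: 2 groups x 2 blocks, 2 observations per cell.
cell_result = two_way_anova_with_interaction([[[1, 3], [5, 7]],
                                              [[4, 6], [2, 4]]])
print(cell_result)
```

In this made-up example the group means coincide, so SSG = 0, while SSB = 2, SSI = 18 and SSE = 8 add up to SST = 28. Most of the variability is therefore attributed to the group-block interaction, with an interaction F-ratio of 9.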
Summary: Statistics for Business and Economics
Summary for the course Statistics for Business and Economics at the Rijksuniversiteit Groningen. Chapters 12, 13, 15, 16, and 17.