What is ANOVA? – Chapter 12

12.1 How do dummy variables replace categories?
12.2 How do you make multiple comparisons of means?
12.3 What is one-way ANOVA?
12.4 What is two-way ANOVA?
12.5 How does ANOVA with repeated measures work?
12.6 How does two-way ANOVA with repeated measures of a factor work?

12.1 How do dummy variables replace categories?

For analyzing categorical variables without assigning a ranking, dummy variables are an option. This means that artificial indicator variables, each equal to 0 or 1, are created from the observations:

z₁ = 1 and z₂ = 0 : observations of category 1 (men)

z₁ = 0 and z₂ = 1 : observations of category 2 (women)

z₁ = 0 and z₂ = 0 : observations of category 3 (transgender and other identities)

The model is: E(y) = α + β₁z₁ + β₂z₂. The means are deduced from the model: μ₁ = α + β₁, μ₂ = α + β₂ and μ₃ = α. Three categories require only two dummy variables, because an observation with z₁ = 0 and z₂ = 0 must fall in category 3.
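A minimal Python sketch of this coding, using made-up observations. The data and the helper names `dummy_code` and `fit_dummy_model` are illustrative assumptions. The sketch relies on the fact that, for a model containing only the dummy variables of one factor, the least-squares fit reproduces the sample mean of each category.

```python
from statistics import mean

def dummy_code(category):
    """Return (z1, z2) for a three-category variable; category 3 is the baseline."""
    return (1 if category == 1 else 0, 1 if category == 2 else 0)

def fit_dummy_model(categories, y):
    """Return (alpha, beta1, beta2) for E(y) = alpha + beta1*z1 + beta2*z2.

    With only dummy predictors, least squares gives the group means,
    so the coefficients follow directly from those means.
    """
    group_means = {
        g: mean(v for c, v in zip(categories, y) if c == g) for g in (1, 2, 3)
    }
    alpha = group_means[3]            # baseline mean, mu3
    beta1 = group_means[1] - alpha    # mu1 - mu3
    beta2 = group_means[2] - alpha    # mu2 - mu3
    return alpha, beta1, beta2

# Hypothetical data: category of each subject and a quantitative response.
categories = [1, 1, 2, 2, 3, 3]
y = [10.0, 12.0, 15.0, 17.0, 8.0, 10.0]

alpha, beta1, beta2 = fit_dummy_model(categories, y)
z1, z2 = dummy_code(1)
predicted_mean_1 = alpha + beta1 * z1 + beta2 * z2  # equals the mean of category 1
```

The coefficients β₁ and β₂ are differences from the baseline category, so choosing another baseline changes the coefficients but not the fitted means.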

A significance test using the F-distribution tests whether the means are the same. The null hypothesis H₀ : μ₁ = μ₂ = μ₃ is the same as H₀ : β₁ = β₂ = 0. A large F means a small P and much evidence against the null hypothesis.

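The F-test is robust against moderate violations of normality and against moderate differences in the standard deviations, but it does not handle very skewed data well. Whatever the shape of the distribution, the inference is only valid when the samples are random, which is why randomization is important.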

12.2 How do you make multiple comparisons of means?

A small P doesn't say which means differ or by how much. Confidence intervals give more information. A confidence interval can be constructed for every mean, or for the difference between two means. A confidence interval for the difference in population means μ_i – μ_j is:

$(\bar{y}_i-\bar{y}_j)\pm ts\sqrt{\frac{1}{n_i}+\frac{1}{n_j}}$

Here s is the square root of the within-groups estimate of the variance. The degrees of freedom of the t-score are df = N – g, in which g is the number of categories and N is the combined sample size (n₁ + n₂ + … + n_g). When the confidence interval doesn't contain 0, this is evidence of a difference between the means.
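The interval can be sketched in a few lines of Python. All input values below are hypothetical, and the t-score is passed in as a number looked up in a table or software (2.08 is roughly the 95% score for df = 21) rather than computed here.

```python
from math import sqrt

def difference_ci(mean_i, mean_j, n_i, n_j, s, t):
    """Confidence interval for mu_i - mu_j.

    s is the square root of the within-groups variance estimate and
    t is the t-score for df = N - g.
    """
    margin = t * s * sqrt(1 / n_i + 1 / n_j)
    diff = mean_i - mean_j
    return diff - margin, diff + margin

# Hypothetical inputs: two sample means, group sizes, pooled s and a t-score.
lower, upper = difference_ci(mean_i=16.0, mean_j=11.0, n_i=8, n_j=8, s=2.0, t=2.08)
contains_zero = lower <= 0 <= upper
```

With these made-up numbers the interval lies entirely above 0, which would count as evidence that the first population mean is larger.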

When many groups are compared, some confidence intervals may indicate a difference even if all population means are equal, because the probability of at least one error grows with the number of comparisons. Multiple comparison methods control the probability that all intervals in a set of comparisons contain the true differences. With separate 95% confidence intervals, each comparison on its own has an error probability of 5%; the multiple comparison error rate is the probability that at least one interval in the whole set does not contain the true difference. One such method is the Bonferroni method, which divides the desired overall error rate by the number of comparisons (5% / 4 comparisons = 1.25% per comparison). Another option is Tukey's method, which is calculated with software and uses the so-called Studentized range, a special kind of distribution. The advantage of Tukey's method is that it gives narrower confidence intervals than Bonferroni's method.
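The Bonferroni arithmetic is simple enough to sketch in Python. The function names are illustrative assumptions; the 5% / 4 example is the one from the text above.

```python
def pairwise_comparisons(g):
    """Number of distinct pairs among g group means."""
    return g * (g - 1) // 2

def bonferroni_level(overall_error, n_comparisons):
    """Error probability to use for each individual interval."""
    return overall_error / n_comparisons

per_comparison = bonferroni_level(0.05, 4)       # the 5% / 4 example
confidence_each = 1 - per_comparison             # confidence level per interval
pairs_for_three_groups = pairwise_comparisons(3)
```

Each of the four intervals is then built at the 98.75% confidence level, so that the set as a whole has an error rate of at most 5%.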

12.3 What is one-way ANOVA?

Analysis of variance (ANOVA) is an inferential method to compare the means of multiple groups. This is an independence test between a quantitative response variable and a categorical explanatory variable. The categorical explanatory variables are called factors in ANOVA. The test is an F-test. The assumptions are the same: a normal distribution of the response in each group, equal standard deviations for the groups and independent random samples. The null hypothesis is H₀ : μ₁ = μ₂ = … = μ_g and the alternative hypothesis is H_a : at least two means differ.

The F-test uses two measures of variance. The between-groups estimate is based on the variability between each sample mean ȳ_i and the overall mean ȳ. The within-groups estimate is based on the variability of the observations within each group around their own group mean (ȳ₁, ȳ₂, etc.). This is an estimate of the variance σ². Generally, the bigger the variability between the sample means and the smaller the variability within the groups, the more evidence that the population means are unequal. The test statistic is F = between-groups estimate / within-groups estimate. When F increases, P decreases.

In an ANOVA table the mean squares (MS) are the between-groups estimate and the within-groups estimate. The within-groups estimate always estimates the population variance σ²; the between-groups estimate does so only when the null hypothesis is true, and tends to be larger otherwise. The between-groups estimate is the sum of squares between the groups (the regression SS) divided by df₁. The within-groups estimate is the sum of squares within the groups (the remaining SS, or SSE) divided by df₂. Together the SS between the groups and the SSE add up to the TSS, the total sum of squares.
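Written out, with y_ij denoting observation j in group i (a notation assumed here for illustration), the decomposition of the total sum of squares is:

$\sum_i\sum_j{(y_{ij}-\bar{y})^2} = \sum_i{n_i(\bar{y}_i-\bar{y})^2} + \sum_i\sum_j{(y_{ij}-\bar{y}_i)^2}$

The left-hand side is the TSS, the first term on the right is the between-groups SS and the second term is the SSE. The degrees of freedom split in the same way: N – 1 = (g – 1) + (N – g).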

The degrees of freedom of the within-groups estimate are: df₂ = N (total sample size) – g (number of groups). The estimate of variance by the within-groups sum of squares is:

$s^2 = \frac{(n_1-1)s_1^2+(n_2-1)s_2^2+...+(n_g-1)s_g^2}{N-g}$

The degrees of freedom of the between-groups estimate are: df₁ = g – 1. The between-groups estimate of the variance is:

$\text{between-groups estimate} = \frac{\sum_i{n_i(\bar{y}_i-\bar{y})^2}}{g-1}$

The more the sample means differ from each other, the larger this value becomes. When it is large relative to the within-groups estimate, F is large and there is more evidence against the null hypothesis.
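The two estimates and their ratio can be computed directly from the formulas above. The following Python sketch uses made-up samples and a hypothetical helper name `one_way_anova`; it returns only F and the degrees of freedom, leaving the P-value to a table or statistical software.

```python
from statistics import mean, variance

def one_way_anova(groups):
    """Return (F, df1, df2) for a list of samples, one list per group."""
    g = len(groups)
    N = sum(len(sample) for sample in groups)
    grand_mean = mean(y for sample in groups for y in sample)

    # Between-groups estimate: sum of n_i * (ybar_i - ybar)^2, divided by g - 1.
    between = sum(len(s) * (mean(s) - grand_mean) ** 2 for s in groups) / (g - 1)

    # Within-groups estimate: sum of (n_i - 1) * s_i^2, divided by N - g.
    within = sum((len(s) - 1) * variance(s) for s in groups) / (N - g)

    return between / within, g - 1, N - g

# Hypothetical samples for three groups.
groups = [[10.0, 12.0, 11.0], [15.0, 17.0, 16.0], [8.0, 10.0, 9.0]]
F, df1, df2 = one_way_anova(groups)
```

In this example the sample means (11, 16 and 9) are far apart compared with the spread inside each group, so F is large.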

For a distribution very different from the normal distribution, the nonparametric Kruskal-Wallis test is an option. This test uses the ranks of the data instead of the values themselves, so it also works for distributions far from normal.
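To illustrate what ranking the data means, here is a small Python sketch of the usual Kruskal-Wallis H statistic, H = 12 / (N(N + 1)) · Σ Rᵢ² / nᵢ – 3(N + 1), where Rᵢ is the sum of the ranks in group i. The formula is not given in the summary itself and is stated here as an assumption of the sketch; the version below assigns midranks to tied values but omits the tie correction, and the data are made up.

```python
def kruskal_wallis_h(groups):
    """Kruskal-Wallis H statistic (midranks for ties, no tie correction)."""
    pooled = sorted(y for sample in groups for y in sample)
    N = len(pooled)

    # Midrank of each distinct value: average of the positions it occupies.
    rank_of = {}
    for value in set(pooled):
        first = pooled.index(value) + 1   # 1-based position of first occurrence
        count = pooled.count(value)
        rank_of[value] = first + (count - 1) / 2

    # Sum over groups of (rank sum)^2 / group size.
    rank_sum_term = sum(
        sum(rank_of[y] for y in sample) ** 2 / len(sample) for sample in groups
    )
    return 12 / (N * (N + 1)) * rank_sum_term - 3 * (N + 1)

# Hypothetical samples for three groups, without ties.
samples = [[10.0, 12.0, 11.0], [15.0, 17.0, 16.0], [8.0, 9.5, 9.0]]
H = kruskal_wallis_h(samples)
```

Because only the ranks enter the calculation, replacing the largest value by an extreme outlier would leave H unchanged.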

12.4 What is two-way ANOVA?

One-way ANOVA works for a quantitative dependent variable and the categories of a single explanatory variable. Two-way ANOVA works for two categorical explanatory variables. Each factor has a null hypothesis to test the main effect of that individual factor on the response variable, while controlling for the other factor. The F-statistic for the main effect of a factor is: MS of the factor / residual MS. Each MS is calculated by dividing the sum of squares by the degrees of freedom. Because two-way ANOVA is complex, software is used that shows the MS and the degrees of freedom in an ANOVA table.
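As a small illustration of how such a table is read, the Python sketch below forms the F-statistic of one factor from its sum of squares and degrees of freedom. The table entries are invented for the example and the function names are assumptions.

```python
def mean_square(sum_of_squares, df):
    """MS = SS / df."""
    return sum_of_squares / df

def main_effect_f(ss_factor, df_factor, ss_residual, df_residual):
    """F-statistic for a main effect: MS of the factor / residual MS."""
    return mean_square(ss_factor, df_factor) / mean_square(ss_residual, df_residual)

# Hypothetical ANOVA table entries for one factor and for the residual.
F_factor = main_effect_f(ss_factor=120.0, df_factor=2, ss_residual=300.0, df_residual=30)
```

The same residual MS serves as the denominator for the F-statistic of the other factor.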

ANOVA can be done by creating dummy variables. For instance, in research about the grocery spending of vegetarians, taking into account how someone identifies:

v₁ = 1 if the subject is vegetarian, 0 if the subject isn't

v₂ = 1 if the subject is vegan, 0 if the subject isn't

If someone is neither vegetarian nor vegan, then they fall in the remaining category (meat eaters).

k = 1 if the subject identifies as budget-minded, 0 if the subject doesn't

Then the model is: E(y) = α + β₁v₁ + β₂v₂ + β₃k. The prediction equation can be deduced. A confidence interval indicates the difference between the effects.

In practice, two-way ANOVA needs to be checked for interaction effects first, using an expanded model: E(y) = α + β₁v₁ + β₂v₂ + β₃k + β₄(v₁ × k) + β₅(v₂ × k).
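The role of the interaction terms becomes clearer when the expanded model is evaluated for each group. The Python sketch below does this with invented coefficients (not estimates from real data); the function names are assumptions made for the example.

```python
def expected_spending(vegetarian, vegan, budget_minded, coef):
    """E(y) = a + b1*v1 + b2*v2 + b3*k + b4*(v1*k) + b5*(v2*k)."""
    v1, v2, k = int(vegetarian), int(vegan), int(budget_minded)
    a, b1, b2, b3, b4, b5 = coef
    return a + b1 * v1 + b2 * v2 + b3 * k + b4 * v1 * k + b5 * v2 * k

def budget_effect(vegetarian, vegan, coef):
    """Difference in E(y) between budget-minded and other subjects in one diet group."""
    return (expected_spending(vegetarian, vegan, True, coef)
            - expected_spending(vegetarian, vegan, False, coef))

# Hypothetical coefficients (alpha, beta1, ..., beta5).
coef = (100.0, -10.0, -15.0, -20.0, 5.0, 8.0)

effects = {
    "meat eater": budget_effect(False, False, coef),  # beta3
    "vegetarian": budget_effect(True, False, coef),   # beta3 + beta4
    "vegan": budget_effect(False, True, coef),        # beta3 + beta5
}
```

Without interaction (β₄ = β₅ = 0) the three effects would all equal β₃; with interaction, the effect of being budget-minded depends on the diet group.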

The sum of squares of one of the (dummy) variables is called the partial sum of squares or Type III sum of squares. This is the variability in y that is explained by that variable when the other variables are already in the model.

ANOVA with multiple factors is factorial ANOVA. The advantage of factorial ANOVA and two-way ANOVA compared to one-way ANOVA is that it's possible to study the interaction of effects.

12.5 How does ANOVA with repeated measures work?

In research, samples sometimes depend on each other, as with repeated measures taken at different moments in time from the same subjects. Then the subjects themselves are treated as a factor, with each subject as one category. Three moments (for instance before, during and after treatment) give three means and therefore three pairs of means to compare, which requires multiple comparison methods. The Bonferroni method divides the overall error probability over the several confidence intervals.

An assumption of ANOVA with repeated measures is sphericity. This means that the variances of the differences between all possible pairs of repeated measurements (the categories of the within-subjects factor) are the same. If, more strictly, the standard deviations are all the same and the correlations are all the same, then there is compound symmetry. Software tests for sphericity with Mauchly's test. If sphericity is lacking, then software uses the Greenhouse-Geisser adjustment of the degrees of freedom to allow for an approximate F-test.
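What sphericity refers to can be made concrete by computing the sample variance of the within-subject differences for every pair of moments. The Python sketch below uses invented scores and a hypothetical helper `difference_variances`; it only describes the sample and does not carry out Mauchly's test.

```python
from itertools import combinations
from statistics import variance

def difference_variances(scores):
    """Variance of the within-subject differences for each pair of conditions.

    scores maps a condition name to a list of measurements, ordered so that
    position i always refers to the same subject.
    """
    return {
        (a, b): variance([x - y for x, y in zip(scores[a], scores[b])])
        for a, b in combinations(scores, 2)
    }

# Hypothetical scores of four subjects at three moments.
scores = {
    "before": [10.0, 12.0, 9.0, 11.0],
    "during": [12.0, 15.0, 10.0, 13.0],
    "after": [15.0, 16.0, 13.0, 16.0],
}
variances = difference_variances(scores)
```

Sphericity is a statement about the population, so sample variances that differ somewhat, as they do here, are not by themselves a violation.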

The advantage of using the same subjects is that certain characteristics of the subjects are held constant across the measurements. This is called blocking.

Factors for which the analysis includes all the categories of interest are fixed effects. Random effects are factors whose categories are a random sample of all possible categories, like the random people who happen to become the research subjects.

12.6 How does two-way ANOVA with repeated measures of a factor work?

In research with repeated measures, more than one fixed effect can be involved. An example of a within-subjects factor is time (before/during/after treatment), because every category is measured on the same subjects. The subjects are crossed with the factor. A between-subjects factor is, for example, the kind of treatment, because it compares the experiences of different subjects. Then the subjects are nested within the factor.

Due to these two kinds of factors, the SSE consists of two kinds of errors. To analyze every difference between two categories, a confidence interval is required. Because there are two kinds of errors, a single residual term can't be used as the error estimate for these intervals. What can be used instead are multiple one-way ANOVA F-tests with Bonferroni's method.

Multivariate analysis of variance (MANOVA) is a method that can handle multivariate responses and that makes fewer assumptions. The disadvantage of making fewer assumptions is that it has lower power.

A disadvantage of repeated measures in general is that it requires data from all subjects at all moments. A model that has both fixed effects and random effects is called a mixed model.

Summary of Statistical methods for the social sciences by Agresti, 5th edition, 2018. Summary in English.

Author: Annemarie JoHo
