Summary of Discovering statistics using IBM SPSS statistics by Andy Field - 5th edition
There are three main misconceptions of statistical significance:
The use of NHST encourages ‘all-or-nothing’ thinking: a result is either significant or not. A confidence interval that contains zero indicates that the population effect could plausibly be zero.
An empirical probability is the proportion of events that have the outcome in which you’re interested in an indefinitely large collective of events. The p-value is the probability of getting a test statistic at least as large as the one observed, relative to all the values the test statistic could take if the null hypothesis were true, across an infinite number of identical replications of the experiment. In other words, it is the frequency of the observed test statistic relative to all possible values that could be observed in the collective of identical experiments. The p-value is therefore affected by the intentions of the researcher: because it is defined relative to all possible values in identical experiments, decisions such as the sample size and the time at which data collection stops (the intentions) can influence it.
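The idea of a p-value as "at least as large as the one observed, under the null hypothesis" can be illustrated with a small sketch. The coin-flip scenario, the function name `binomial_tail_p` and the numbers are hypothetical and not taken from the book; the test statistic is simply the number of heads, and the null hypothesis is a fair coin.

```python
from math import comb


def binomial_tail_p(heads, flips, p_null=0.5):
    """One-sided p-value: P(number of heads >= observed) if the null is true.

    Hypothetical illustration: the test statistic is the count of heads
    and the null hypothesis is a coin with P(heads) = p_null.
    """
    return sum(
        comb(flips, k) * p_null**k * (1 - p_null) ** (flips - k)
        for k in range(heads, flips + 1)
    )


# Assumed data: 8 heads observed in 10 flips.
p_value = binomial_tail_p(8, 10)
print(round(p_value, 4))  # 56/1024, about 0.0547
```

With these assumed numbers the result falls just above the conventional .05 threshold, which also shows how all-or-nothing thinking treats nearly identical evidence very differently.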
In journals, based on NHST, there is a publication bias. Significant results are more likely to get published. Researcher degrees of freedom are ways in which the researcher could influence the p-value. This could be used to make it more likely to find a significant result (e.g. by excluding some cases to make the result significant). Researcher degrees of freedom could include not using some observations and not publishing key findings.
P-hacking refers to the selective reporting of significant p-values by trying multiple analyses and reporting only the significant ones. HARKing refers to forming a hypothesis after data collection and presenting it as if it had been made before data collection. P-hacking and HARKing make results difficult to replicate. Tests of excess success (e.g. looking at multiple studies of the same effect and calculating the probability of them all being successful) are used to see whether it is likely that p-hacking or something similar has occurred.
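The logic behind a test of excess success can be sketched as follows. This is a minimal illustration under two assumptions that are not from the summary: the studies are independent, and each study's chance of a significant result equals its statistical power. The power values are hypothetical.

```python
from math import prod


def prob_all_significant(powers):
    """Probability that every study is significant, assuming independent studies
    whose chance of success equals their (assumed) power."""
    return prod(powers)


# Hypothetical power of four studies that all reported a significant effect.
powers = [0.5, 0.6, 0.5, 0.7]
p_all = prob_all_significant(powers)
print(round(p_all, 3))  # 0.105
```

If the probability of all studies succeeding is this low, a published set in which every study succeeded looks suspicious.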
EMBERS
There is an abbreviation for how to tackle the problems of NHST: Effect sizes (E), Meta-analysis (M), Bayesian Estimation (BE), Registration (R) and Sense (S), together making EMBERS.
SENSE
There are six principles for using NHST sensibly:
The problems of NHST and p-hacking can be combatted by pre-registering research and using open science.
EFFECT SIZES
A statistically significant result does not tell us anything about the importance of an effect. The size of an effect can be measured by calculating the effect size. This is an objective and standardized measure of the magnitude of the observed effect. There are several measures of effect size.
One way to do this is using Cohen’s d, which is the difference between two means divided by the standard deviation:
$$d = \frac{\bar{X}_1 - \bar{X}_2}{s}$$
It is expressed in standard deviation units. The rules of thumb for using Cohen’s d are the following: d=0.2 (small), d=0.5 (medium), d=0.8 (large). If the standard deviations of the two groups are not equal, it is possible to use the pooled standard deviation, which uses the following formula:
$$s_p = \sqrt{\frac{(N_1 - 1)s_1^2 + (N_2 - 1)s_2^2}{N_1 + N_2 - 2}}$$
N denotes the sample size of each group and s denotes the standard deviation. Another way of calculating effect sizes is making use of Pearson’s r. It is a measure of the strength of a relationship between two continuous variables and uses the following rules of thumb:
r | Effect size | Variance explained |
0.10 | Small effect | 1% of total variance |
0.30 | Medium effect | 9% of total variance |
0.50 | Large effect | 25% of total variance |
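Returning to Cohen’s d, the calculation with a pooled standard deviation described above can be sketched as follows. The function names and the two small groups of scores are hypothetical, chosen only to make the arithmetic easy to follow.

```python
from math import sqrt
from statistics import mean, stdev


def pooled_sd(group1, group2):
    """Pooled standard deviation, weighting each group's variance by N - 1."""
    n1, n2 = len(group1), len(group2)
    s1, s2 = stdev(group1), stdev(group2)  # sample standard deviations
    return sqrt(((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / (n1 + n2 - 2))


def cohens_d(group1, group2):
    """Difference between the group means in pooled standard deviation units."""
    return (mean(group1) - mean(group2)) / pooled_sd(group1, group2)


# Hypothetical scores for two groups.
treatment = [6, 7, 8, 9, 10]
control = [4, 5, 6, 7, 8]
d = cohens_d(treatment, control)
print(round(d, 2))  # about 1.26, a large effect by the rules of thumb
```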
Pearson’s r can vary from -1 to 1. Cohen’s d is favoured if the group sizes are very discrepant, because r can be quite biased compared to d in that situation. Another way to calculate the effect size is by making use of the odds ratio. It is useful for count data (a contingency table). The odds of an event occurring are the probability of the event occurring divided by the probability of that event not occurring:
$$\text{odds} = \frac{P(\text{event})}{P(\text{no event})}$$
The odds ratio is the odds of an event in one condition divided by the odds of the event in another condition:
$$\text{odds ratio} = \frac{\text{odds}_1}{\text{odds}_2}$$
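As a minimal sketch, the odds and the odds ratio can be computed from the counts in a 2 × 2 contingency table. The function names and the counts are hypothetical.

```python
def odds(event_count, no_event_count):
    """Odds = P(event) / P(no event), estimated from counts."""
    p_event = event_count / (event_count + no_event_count)
    return p_event / (1 - p_event)


def odds_ratio(a_event, a_no_event, b_event, b_no_event):
    """Odds of the event in condition A divided by the odds in condition B."""
    return odds(a_event, a_no_event) / odds(b_event, b_no_event)


# Hypothetical counts: 30 of 40 recover with treatment, 20 of 40 without.
ratio = odds_ratio(30, 10, 20, 20)
print(ratio)  # 3.0: the odds of recovery are three times higher with treatment
```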
META-ANALYSIS
A basic meta-analysis takes the average of the effect sizes of the studies. For k studies with effect sizes d_i, it uses the following formula:
$$\bar{d} = \frac{\sum_{i=1}^{k} d_i}{k}$$
It is the sum of all the effect sizes divided by the number of studies. An actual meta-analysis uses a weighted average, instead of the regular average.
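The difference between the simple and the weighted average can be sketched as follows. The effect sizes are hypothetical, and weighting by sample size is an assumption made here for illustration only; the summary does not say which weights are used, and real meta-analyses typically weight by the precision of each estimate.

```python
def mean_effect(effects):
    """Simple average: sum of the effect sizes divided by the number of studies."""
    return sum(effects) / len(effects)


def weighted_mean_effect(effects, weights):
    """Weighted average: studies with larger weights contribute more."""
    return sum(w * d for d, w in zip(effects, weights)) / sum(weights)


# Hypothetical effect sizes (Cohen's d) and sample sizes from three studies.
effects = [0.2, 0.5, 0.8]
sample_sizes = [100, 50, 50]  # assumed weights, for illustration only

simple = mean_effect(effects)
weighted = weighted_mean_effect(effects, sample_sizes)
print(round(simple, 3), round(weighted, 3))  # 0.5 0.425
```

In this example the largest study found the smallest effect, so the weighted average is pulled below the simple average.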
BAYESIAN APPROACHES
Bayesian statistics is about using the data you collect to update your beliefs about a model parameter or a hypothesis. Beliefs are updated based on new information. Bayes’ theorem is used to calculate conditional probabilities. Conditional probability deals with finding the probability of an event when you know that the outcome was in some particular part of the sample space. It is most commonly used to find a probability about a category for one variable (e.g. a person being a drug user).
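For events A and B, the conditional probability of event A, given that event B has occurred, is:
$$P(A|B) = \frac{P(A \cap B)}{P(B)}$$
For example, the table below gives the conditional probabilities of a positive (Pos) or negative (Neg) test result, given that a person has depression (D) or does not have depression (Dc):
Depression | Positive | Negative | Total probability |
Yes | P(Pos|D) = 0.99 | P(Neg|D) = 0.01 | 1 |
No | P(Pos|Dc) = 0.02 | P(Neg|Dc) = 0.98 | 1 |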
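A conditional probability can be calculated using a tree diagram or by using Bayes’ theorem. Bayes’ theorem uses the following formulas:
$$P(A|B) = \frac{P(B|A)\,P(A)}{P(B)}$$
$$\text{posterior probability} = \frac{\text{likelihood} \times \text{prior probability}}{\text{marginal likelihood}}$$
The likelihood is the probability of the observed data given the hypothesis.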
Posterior probability is our belief in a hypothesis after having considered the data. The prior probability is our belief in a hypothesis before considering the data (base rate). The marginal likelihood is the probability of the observed data.
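Bayes’ theorem can be applied to the depression test in the table above. The sketch below uses the two conditional probabilities from the table, but the table gives no base rate, so the prior probability of depression (0.10) is a hypothetical value chosen only for illustration. The function name is also hypothetical.

```python
def posterior_probability(p_pos_given_d, p_pos_given_not_d, prior_d):
    """P(D | Pos) = P(Pos | D) * P(D) / P(Pos), via Bayes' theorem."""
    # Marginal likelihood: overall probability of a positive test.
    p_pos = p_pos_given_d * prior_d + p_pos_given_not_d * (1 - prior_d)
    return p_pos_given_d * prior_d / p_pos


# From the table: P(Pos|D) = 0.99 and P(Pos|Dc) = 0.02.
# Assumed, not from the source: a base rate P(D) of 0.10.
posterior = posterior_probability(0.99, 0.02, prior_d=0.10)
print(round(posterior, 3))  # about 0.846
```

Under this assumed base rate, a positive test raises the belief that a person has depression from 10% to roughly 85%.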
When estimating a parameter, the prior probability is a distribution of possibilities. An informative prior distribution shows which values of the parameter you believe to be more probable before considering the data. In an uninformative prior distribution (a flat line) you are prepared to believe all possible outcomes with equal probability. A credible interval gives the limits between which 95% of the values in the posterior distribution fall.
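A credible interval under an uninformative prior can be sketched with a simple grid approximation. Everything specific here is an assumption for illustration: the parameter is a proportion, the data are 7 successes in 10 trials, and the function name and grid size are hypothetical.

```python
def credible_interval(successes, trials, mass=0.95, grid_size=1001):
    """Equal-tailed credible interval for a proportion, using a flat prior
    and a grid approximation of the posterior distribution."""
    grid = [i / (grid_size - 1) for i in range(grid_size)]
    prior = [1.0] * grid_size  # flat line: all values equally believable
    likelihood = [t**successes * (1 - t) ** (trials - successes) for t in grid]
    unnormalised = [p * l for p, l in zip(prior, likelihood)]
    total = sum(unnormalised)
    posterior = [u / total for u in unnormalised]

    tail = (1 - mass) / 2
    cumulative = 0.0
    lower = upper = None
    for theta, prob in zip(grid, posterior):
        cumulative += prob
        if lower is None and cumulative >= tail:
            lower = theta
        if upper is None and cumulative >= 1 - tail:
            upper = theta
    return lower, upper


# Assumed data: 7 successes in 10 trials.
lower, upper = credible_interval(7, 10)
print(lower, upper)
```

Replacing the flat `prior` list with values that favour some proportions over others would turn this into an informative prior.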
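Bayes’ theorem can be used to compare two hypotheses using posterior odds. The posterior odds are the Bayes factor multiplied by the prior odds:
$$\frac{P(H_1|\text{data})}{P(H_0|\text{data})} = \frac{P(\text{data}|H_1)}{P(\text{data}|H_0)} \times \frac{P(H_1)}{P(H_0)}$$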
A Bayes factor is the ratio of the probability of the data given the alternative hypothesis to that for the null hypothesis. A Bayes factor greater than 1 suggests that the observed data are more likely given the alternative hypothesis than given the null.
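The relation between the Bayes factor, the prior odds and the posterior odds can be sketched as follows. The function names and all numbers are hypothetical.

```python
def bayes_factor(p_data_given_h1, p_data_given_h0):
    """Ratio of the probability of the data under the alternative vs the null."""
    return p_data_given_h1 / p_data_given_h0


def posterior_odds(bf, prior_odds):
    """Posterior odds = Bayes factor * prior odds."""
    return bf * prior_odds


# Hypothetical values: the data are four times as likely under H1 as under H0,
# and both hypotheses are believed equally before seeing the data.
bf = bayes_factor(0.8, 0.2)
odds_after = posterior_odds(bf, prior_odds=1.0)
p_h1 = odds_after / (1 + odds_after)  # convert odds back to a probability
print(round(bf, 2), round(odds_after, 2), round(p_h1, 2))  # 4.0 4.0 0.8
```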
An advantage of Bayesian statistics is that it is not affected by the problems of NHST; a disadvantage is that it requires a prior belief, which is subjective.