Summary of Discovering statistics using IBM SPSS statistics by Field - 5th edition
Statistics
Chapter 2
The spine of statistics
What is the spine of statistics?
The spine of statistics: (an acronym for)
Testing hypotheses involves building statistical models of the phenomenon of interest.
Scientists build (statistical) models of real-world processes to predict how these processes operate under certain conditions. The models need to be as accurate as possible so that the predictions we make about the real world are accurate too.
The degree to which a statistical model represents the data collected is known as the fit of the model.
The data we observe can be predicted from the model we choose to fit plus some amount of error.
Scientists are usually interested in finding results that apply to an entire population of entities.
Populations can be very general or very narrow.
Usually, scientists strive to infer things about general populations rather than narrow ones.
We collect data from a smaller subset of the population known as a sample, and use these data to infer things about the population as a whole.
The bigger the sample, the more likely it is to reflect the whole population.
Statistical models are made up of variables and parameters.
Parameters are not measured and are (usually) constants believed to represent some fundamental truth about the relations between variables in the model.
(Like mean and median).
We can predict values of an outcome variable based on a model. The form of the model changes, but there will always be some error in prediction, and there will always be parameters that tell us about the shape or form of the model.
To work out what the model looks like, we estimate the parameters.
The mean as a statistical model
The mean is a hypothetical value and not necessarily one that is observed in the data.
Estimates are written with a hat (^) over the symbol to show that they are estimates.
Assessing the fit of a model: sums of squares and variance revisited.
The error or deviance for a particular entity is the score predicted by the model for that entity subtracted from the corresponding observed score.
Degrees of freedom (df): the number of scores used to compute the total adjusted for the fact that we’re trying to estimate the population value.
The degrees of freedom relate to the number of observations that are free to vary.
We can use the sum of squared errors and the mean squared error to assess the fit of a model.
When the model is the mean, the mean squared error is the variance.
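The steps above (errors, sum of squared errors, degrees of freedom, mean squared error) can be sketched in a few lines of Python. This is only an illustration: the scores are made up, and the variable names are assumptions rather than anything from the book.

```python
from statistics import mean, variance

# Hypothetical scores, invented for illustration only.
scores = [1, 2, 3, 3, 4]

model = mean(scores)                          # the mean is the model
errors = [x - model for x in scores]          # observed score minus predicted score
sum_squared_errors = sum(e ** 2 for e in errors)
df = len(scores) - 1                          # degrees of freedom: N - 1
mean_squared_error = sum_squared_errors / df  # equals the variance for this model

print(model, round(sum_squared_errors, 2), round(mean_squared_error, 2))
print(round(variance(scores), 2))             # same value from the standard library
```

The raw errors always sum to zero when the model is the mean, which is why they are squared before being added up.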
Estimating parameters
The equation for the mean is designed to estimate that parameter to minimize the error.
Although the equations for estimating the parameters will differ from that of the mean, they are based on the principle of minimizing error: they will give you the parameter that has the least error given the data you have.
Method of least squares (or ordinary least squares OLS): the principle of minimizing the sum of squared errors.
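As a rough check of the least-squares principle, the sketch below tries a grid of candidate values for the parameter and keeps the one with the smallest sum of squared errors. The data and the helper name `sum_of_squares` are hypothetical; a grid search is used only to make the idea visible, not because it is how estimates are normally computed.

```python
from statistics import mean

scores = [1, 2, 3, 3, 4]  # hypothetical data

def sum_of_squares(b, data):
    """Sum of squared errors when every score is predicted to be b."""
    return sum((x - b) ** 2 for x in data)

# Candidate parameter values 0.0, 0.1, ..., 5.0.
candidates = [i / 10 for i in range(0, 51)]
best = min(candidates, key=lambda b: sum_of_squares(b, scores))

print(best, mean(scores))  # the best candidate coincides with the mean
```

For these data the winning candidate is the sample mean, which is what the least-squares principle predicts.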
The standard deviation tells us about how well the mean represents the sample data.
Sampling variation: samples vary because they contain different members of the population.
Sampling distribution: the frequency distribution of sample means from the same population.
Only hypothetical.
The sampling distribution of the mean tells us about the behaviour of samples from the population. It is centred at the same value as the mean of the population.
If we took the average of all the sample means, we get the value of the population mean.
We can use the sampling distribution to tell us how representative a sample is of the population.
Standard error of the mean (SE), or standard error: the standard deviation of sample means.
Central limit theorem: tells us that as samples get large, the sampling distribution has a normal distribution with a mean equal to the population mean, and a standard deviation of σ/√N (estimated in practice as s/√N, the sample standard deviation divided by the square root of the sample size).
When the sample is relatively small (fewer than 30), the sampling distribution is not normal. It is a t-distribution.
Any parameter that can be estimated in a sample has a hypothetical sampling distribution and standard error.
In short:
The standard error of the mean is the standard deviation of sample means. As such, it is a measure of how representative of the population a sample mean is likely to be. A large standard error (relative to the sample mean) means that there is a lot of variability between the means of different samples and so the sample mean we have might not be representative of the population mean. A small standard error indicates that most sample means are similar to the population mean.
Applies to any parameter.
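The sampling distribution is hypothetical, but it can be imitated by simulation. The sketch below assumes an invented normal population (mean 100, standard deviation 15) and repeatedly draws samples of 25; all names and numbers are assumptions for illustration.

```python
import random
from statistics import mean, stdev

random.seed(1)  # fixed seed so the sketch is repeatable

# Hypothetical population of 10,000 scores.
population = [random.gauss(100, 15) for _ in range(10_000)]
n = 25

# Draw many samples and record each sample mean.
sample_means = [mean(random.sample(population, n)) for _ in range(2_000)]

mean_of_means = mean(sample_means)   # should sit close to the population mean
empirical_se = stdev(sample_means)   # standard deviation of the sample means

# With a single sample, the standard error is estimated as s / sqrt(N).
one_sample = random.sample(population, n)
estimated_se = stdev(one_sample) / n ** 0.5

print(round(mean(population), 2), round(mean_of_means, 2))
print(round(empirical_se, 2), round(estimated_se, 2))
```

The standard deviation of the simulated sample means should land near 15/√25 = 3, and the single-sample estimate should be in the same region.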
Calculating confidence intervals
Point estimate: a single value from the sample.
Interval estimate: use our sample value as the midpoint, but set a lower and upper limit as well.
A confidence interval shows us how often, in the long run, an interval contains the true value of the parameter we’re trying to estimate.
They are limits constructed such that, for a certain percentage of samples, the true value of the population parameter falls within the limit.
To calculate the confidence interval, we need to know the limits within which 95% of sample means will fall.
We know (in large samples) that the sampling distribution of means will be normal, and the standard normal distribution has a mean of 0 and a standard deviation of 1.
For a 95% confidence interval, the limits in z-score units are -1.96 and 1.96.
For samples above about 30, the sampling distribution will be approximately normal.
The mean is always the centre of the confidence interval.
We assume the confidence interval contains the true mean.
If the interval is small, the sample mean must be very close to the true mean.
Calculating other confidence intervals
If we want to compute confidence intervals for a value other than 95%, we need to look up the value of z for the percentage we want.
The values of z are multiplied by the standard error to calculate the confidence interval.
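A minimal sketch of the large-sample calculation, using Python's standard-library `statistics.NormalDist` to look up z. The function names `z_critical` and `z_confidence_interval` and the scores are hypothetical.

```python
from statistics import NormalDist, mean, stdev

def z_critical(level):
    """Two-sided critical z for a confidence level such as 0.95."""
    return NormalDist().inv_cdf(1 - (1 - level) / 2)

def z_confidence_interval(data, level=0.95):
    """Large-sample interval: mean plus or minus z times the standard error."""
    m = mean(data)
    se = stdev(data) / len(data) ** 0.5
    z = z_critical(level)
    return m - z * se, m + z * se

# Hypothetical scores, invented for illustration (40 values).
scores = [52, 48, 55, 51, 49, 53, 50, 54, 47, 56] * 4

for level in (0.90, 0.95, 0.99):
    lower, upper = z_confidence_interval(scores, level)
    print(f"{level:.0%}: z = {z_critical(level):.3f}, CI = ({lower:.2f}, {upper:.2f})")
```

A higher confidence level gives a larger z and therefore a wider interval around the same sample mean.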
Calculating confidence intervals in small samples
For small samples, the sampling distribution is not normal. It has a t-distribution.
The t-distribution is a family of probability distributions that change shape as the sample size gets bigger.
It is the same principle as z, but instead we use the value for t.
n-1 is the degrees of freedom, which tell us which of the t-distributions to use.
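For small samples the same recipe applies with t in place of z. Python's standard library has no t-distribution, so the sketch below approximates the critical value numerically (Simpson's rule plus bisection). That numerical route is an assumption made to keep the example dependency-free; in practice one would normally read t from a table or a statistics package. All function names and the data are hypothetical.

```python
import math
from statistics import mean, stdev

def t_pdf(t, df):
    """Density of the t-distribution with df degrees of freedom."""
    log_c = math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
    c = math.exp(log_c) / math.sqrt(df * math.pi)
    return c * (1 + t * t / df) ** (-(df + 1) / 2)

def t_cdf(t, df, steps=2000):
    """P(T <= t) via Simpson's rule on [0, |t|] and symmetry about 0."""
    a = abs(t)
    h = a / steps
    total = t_pdf(0.0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3
    return 0.5 + area if t >= 0 else 0.5 - area

def t_critical(level, df):
    """Two-sided critical t, found by bisection on the CDF."""
    target = 1 - (1 - level) / 2
    low, high = 0.0, 100.0  # search range; assumes the answer is below 100
    for _ in range(60):
        mid = (low + high) / 2
        if t_cdf(mid, df) < target:
            low = mid
        else:
            high = mid
    return (low + high) / 2

def t_confidence_interval(data, level=0.95):
    """Small-sample interval: mean plus or minus t(n - 1) times the standard error."""
    n = len(data)
    m = mean(data)
    se = stdev(data) / math.sqrt(n)
    t = t_critical(level, n - 1)
    return m - t * se, m + t * se

scores = [1, 2, 3, 3, 4]  # hypothetical small sample
lower, upper = t_confidence_interval(scores)
print(f"t = {t_critical(0.95, 4):.3f}, CI = ({lower:.2f}, {upper:.2f})")
```

With only 4 degrees of freedom the critical value is well above 1.96, so the interval is wider than a z-based one would be.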
Showing confidence intervals visually
Confidence intervals provide us with information about a parameter.
A confidence interval for the mean is a range of scores such that the population mean will fall within this range in 95% of samples.
The confidence interval is NOT an interval within which we are 95% confident that the population mean will fall.
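The long-run reading of a confidence interval can be illustrated by simulation: draw many samples from a population whose mean is known, build a 95% interval from each, and count how many intervals contain the true mean. The population values and names below are assumptions chosen for the sketch.

```python
import random
from statistics import mean, stdev

random.seed(7)  # fixed seed so the sketch is repeatable

TRUE_MEAN, TRUE_SD = 50.0, 10.0  # hypothetical population values
N, REPEATS = 50, 1000

hits = 0
for _ in range(REPEATS):
    sample = [random.gauss(TRUE_MEAN, TRUE_SD) for _ in range(N)]
    m = mean(sample)
    se = stdev(sample) / N ** 0.5
    lower, upper = m - 1.96 * se, m + 1.96 * se
    if lower <= TRUE_MEAN <= upper:
        hits += 1

coverage = hits / REPEATS
print(f"{coverage:.3f} of the intervals contain the true mean")
```

The proportion should come out close to 0.95. Any single interval either contains the true mean or it does not; the 95% describes the procedure across many samples.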
Fisher’s p-value
Scientists tend to use 5% as a threshold for confidence. Only when there is a 5% chance (or less) of getting the result we have (or one more extreme) if no effect exists are we confident enough to accept that the effect is genuine.
Types of hypothesis
Null hypothesis: an effect is absent.
Alternative hypothesis: there is an effect.
The null hypothesis is the baseline against which we evaluate how plausible our alternative hypothesis is.
Hypotheses can be directional or non-directional.
The process of NHST
The p-value represents a long-run probability.
We can never be completely sure that either hypothesis is correct.
Test statistics
NHST relies on fitting a model to the data and then evaluating the probability of this model, given the assumption that no effect exists.
Systematic variation: variation that can be explained by the model that we’ve fitted to the data.
Unsystematic variation: variation that cannot be explained by the model that we’ve fitted. It is error, or variation not attributable to the effect we’re investigating.
To test whether the model fits the data, we compare the systematic variation against the unsystematic variation.
In effect, we look at a signal-to-noise ratio.
Effect/error
The best way to test a parameter is to look at the size of the parameter relative to the background noise that produced it: the ratio of how big a parameter is to how much it can vary across samples.
The ratio of effect relative to error is a test statistic.
The exact form of the equation changes depending on which test statistic you’re calculating.
They all represent the same thing: signal-to-noise, or the amount of variance explained by the model we’ve fitted to the data compared to the variance that can’t be explained by the model.
A test statistic: a statistic for which we know how frequently different values occur.
As test statistics get bigger, the probability of them occurring becomes smaller.
If this probability falls below a certain value (p < 0.05), we presume that the test statistic is as large as it is because our model explains a sufficient amount of variation to reflect a genuine effect in the real world.
The test statistic is said to be statistically significant.
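One simple instance of the effect/error ratio is a one-sample test of a mean against a null value. The sketch below assumes a sample large enough for the normal approximation; the function name `one_sample_z`, the data, and the null value of 50 are all hypothetical.

```python
from statistics import NormalDist, mean, stdev

def one_sample_z(data, null_value):
    """Return (test statistic, two-tailed p) for a mean against a null value."""
    effect = mean(data) - null_value         # signal
    error = stdev(data) / len(data) ** 0.5   # noise: the standard error
    z = effect / error
    p = 2 * (1 - NormalDist().cdf(abs(z)))   # probability of a z this extreme under the null
    return z, p

# Hypothetical scores, invented for illustration (40 values, mean 51.5).
scores = [52, 48, 55, 51, 49, 53, 50, 54, 47, 56] * 4

z, p = one_sample_z(scores, null_value=50)
print(f"z = {z:.2f}, p = {p:.4f}")
print("significant at 0.05" if p < 0.05 else "not significant at 0.05")
```

A larger ratio of effect to error gives a larger test statistic and a smaller p-value.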
One- and two-tailed tests
Hypotheses can be directional.
One-tailed test: a statistical model that tests a directional hypothesis
Two-tailed test: a statistical model that tests a non-directional hypothesis.
If the result of a one-tailed test is in the opposite direction to what you expected, you cannot and must not reject the null hypothesis.
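The difference between the two kinds of test can be seen by converting the same test statistic into both p-values. The sketch assumes a z statistic and a directional hypothesis that predicts a positive effect; the function names are hypothetical.

```python
from statistics import NormalDist

def one_tailed_p(z):
    """p for a directional hypothesis that predicts a positive effect."""
    return 1 - NormalDist().cdf(z)

def two_tailed_p(z):
    """p for a non-directional hypothesis."""
    return 2 * (1 - NormalDist().cdf(abs(z)))

for z in (1.80, -1.80):
    print(f"z = {z:+.2f}: one-tailed p = {one_tailed_p(z):.4f}, "
          f"two-tailed p = {two_tailed_p(z):.4f}")
```

A z of 1.80 is significant at 0.05 one-tailed but not two-tailed. A z of -1.80 falls in the wrong tail for the directional hypothesis, so its one-tailed p is very large and the null hypothesis is not rejected.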
Type I and type II errors
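There are two types of errors we can make when we test hypotheses: a type I error (believing there is an effect when there is none) and a type II error (believing there is no effect when there is one).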
There is a trade-off between these two errors.
If we lower the probability of accepting an effect as genuine, we increase the probability that we’ll reject an effect that does genuinely exist.
Inflated error rates
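If a test uses a 0.05 level of significance, then the chance of making a type I error is only 5%.
In practice we often need to conduct several tests on the same data.
Familywise or experimentwise error rate: the probability of making at least one type I error across a set of tests. The more tests, the greater the chance of a type I error.
Familywise error = 1 - (0.95)^n, where n is the number of tests carried out.
The most popular way to correct this is to divide alpha by the number of comparisons (the Bonferroni correction).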
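The familywise error rate and the corrected alpha are easy to compute directly. The sketch assumes independent tests that each use the same alpha; the function names are hypothetical.

```python
def familywise_error(n_tests, alpha=0.05):
    """Probability of at least one type I error across n independent tests."""
    return 1 - (1 - alpha) ** n_tests

def corrected_alpha(n_tests, alpha=0.05):
    """Per-test alpha after dividing by the number of comparisons."""
    return alpha / n_tests

for n_tests in (1, 3, 10):
    print(f"{n_tests:>2} tests: familywise error = {familywise_error(n_tests):.3f}, "
          f"corrected alpha = {corrected_alpha(n_tests):.4f}")
```

With three tests the familywise error is already about 14%, and with ten tests it is about 40%.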
Statistical power
Statistical power: the ability of a test to find an effect.
β-level: the type II error rate.
The power of a test is 1 - β.
We typically aim to achieve a power of 0.8.
The power of a statistical test depends on the size of the effect, the alpha level, and the sample size.
Calculate the power of a test:
Given that we’ve conducted our experiment, we will have already selected a value of alpha, we can estimate the effect size based on our sample data, and we will know how many participants we used. From these we can calculate the power. If this value is 0.8 or more, we can be confident that we have achieved sufficient power to detect any effects that might have existed.
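As a rough sketch of such a calculation, the function below computes power for a two-tailed one-sample z-test from alpha, a standardized effect size and the sample size. The z-test setting and the name `z_test_power` are assumptions made for illustration; other designs need other formulas or dedicated software.

```python
from statistics import NormalDist

def z_test_power(effect_size, n, alpha=0.05):
    """Power of a two-tailed one-sample z-test for a standardized effect size."""
    nd = NormalDist()
    z_crit = nd.inv_cdf(1 - alpha / 2)
    shift = effect_size * n ** 0.5  # expected value of z if the effect is real
    # Probability that z lands beyond either critical value.
    return (1 - nd.cdf(z_crit - shift)) + nd.cdf(-z_crit - shift)

for n in (10, 32, 100):
    print(f"effect size 0.5, n = {n:>3}: power = {z_test_power(0.5, n):.3f}")
```

Under these assumptions, an effect size of 0.5 needs roughly 32 participants to reach the conventional power of 0.8. With an effect size of zero, the function returns alpha itself.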
Confidence intervals and statistical significance
Guidelines:
Sample size and statistical significance
There is a connection between the sample size and the p-value associated with a test statistic.
If the sample gets larger, the standard error (and therefore the confidence interval) gets smaller.
The significance of a test is directly linked to the sample size. The same effect will have different p-values in different-sized samples. Small differences can be deemed significant in large samples, and large effects might be deemed non-significant in small samples.
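The link between sample size and significance can be illustrated by holding a standardized effect fixed and changing only the sample size. The sketch assumes a one-sample z-test, where the test statistic is the effect size times √n; the function name and numbers are hypothetical.

```python
from statistics import NormalDist

def two_tailed_p_for_effect(effect_size, n):
    """Two-tailed p when the observed standardized effect is effect_size in a sample of n."""
    z = effect_size * n ** 0.5  # same effect, smaller standard error as n grows
    return 2 * (1 - NormalDist().cdf(abs(z)))

small_effect = 0.2
for n in (20, 100, 400):
    p = two_tailed_p_for_effect(small_effect, n)
    print(f"n = {n:>3}: p = {p:.5f}")
```

The same small effect is non-significant with 20 participants and highly significant with 400, which is why a p-value should be read alongside the sample size and the effect size.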
This is a summary of the book "Discovering statistics using IBM SPSS statistics" by A. Field. It contains everything that second-year psychology students at the UvA will need. The content needed in the first three blocks is already online, and the rest