The spine of statistics - summary of chapter 2 of Statistics by A. Field (5th edition)

Statistics
Chapter 2
The spine of statistics

What is the spine of statistics?

SPINE is an acronym for:

  • Standard error
  • Parameters
  • Interval estimates (confidence intervals)
  • Null hypothesis significance testing
  • Estimation

Statistical models

Testing hypotheses involves building statistical models of the phenomenon of interest.
Scientists build (statistical) models of real-world processes to predict how these processes operate under certain conditions. The models need to be as accurate as possible so that the predictions we make about the real world are accurate too.
The degree to which a statistical model represents the data collected is known as the fit of the model.

The data we observe can be predicted from the model we choose to fit plus some amount of error.

Populations and samples

Scientists are usually interested in finding results that apply to an entire population of entities.
Populations can be very general or very narrow.
Usually, scientists strive to infer things about general populations rather than narrow ones.

We collect data from a smaller subset of the population known as a sample, and use these data to infer things about the population as a whole.
The bigger the sample, the more likely it is to reflect the whole population.

P is for parameters

Statistical models are made up of variables and parameters.
Parameters are not measured and are (usually) constants believed to represent some fundamental truth about the relations between variables in the model.
Examples are the mean and the median.

We can predict values of an outcome variable based on a model. The form of the model changes, but there will always be some error in prediction, and there will always be parameters that tell us about the shape or form of the model.

To work out what the model looks like, we estimate the parameters.

The mean as a statistical model

The mean is a hypothetical value and not necessarily one that is observed in the data.

Estimates are written with a hat (^) to show that they are estimates rather than the true population values.

Assessing the fit of a model: sums of squares and variance revisited.

The error or deviance for a particular entity is the score predicted by the model for that entity subtracted from the corresponding observed score.

Degrees of freedom (df): the number of scores used to compute the total adjusted for the fact that we’re trying to estimate the population value.
The degrees of freedom relate to the number of observations that are free to vary.

We can use the sum of squared errors and the mean squared error to assess the fit of a model.
When the model is the mean, the mean squared error is the variance.
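
As a minimal sketch of these quantities, the Python snippet below fits the mean as a model to five illustrative scores, then computes each error, the sum of squared errors and the mean squared error. The data and variable names are assumptions made for the example.

```python
from statistics import mean, variance

# Illustrative scores (assumed for this example)
scores = [1, 3, 4, 3, 2]

model = mean(scores)                          # the mean as a model: 2.6
errors = [x - model for x in scores]          # observed score minus predicted score
sum_of_squares = sum(e ** 2 for e in errors)  # sum of squared errors (SS)
df = len(scores) - 1                          # degrees of freedom: N - 1
mean_squared_error = sum_of_squares / df      # SS divided by the degrees of freedom

print(round(sum(errors), 10))          # raw errors cancel out, which is why they are squared
print(round(sum_of_squares, 10))
print(round(mean_squared_error, 10))
print(variance(scores))                # library sample variance, for comparison
```

The last two printed values should agree, because the sample variance is the sum of squared errors around the mean divided by N - 1.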

Estimating parameters

The equation for the mean is designed to estimate that parameter to minimize the error.

Although the equations for estimating the parameters will differ from that of the mean, they are based on the principle of minimizing error: they will give you the parameter that has the least error given the data you have.

Method of least squares (or ordinary least squares OLS): the principle of minimizing the sum of squared errors.
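
The least squares principle can be illustrated with a small search, assuming the same kind of illustrative scores as a stand-in for real data. The sketch tries a grid of candidate values for the parameter and keeps the one with the smallest sum of squared errors; the function name `sum_squared_error` and the grid are hypothetical choices for this example.

```python
from statistics import mean

# Illustrative scores (assumed for this example)
scores = [1, 3, 4, 3, 2]

def sum_squared_error(candidate):
    """Sum of squared errors when `candidate` is used to predict every score."""
    return sum((x - candidate) ** 2 for x in scores)

# Candidate parameter values from 0.0 to 5.0 in steps of 0.1
candidates = [i / 10 for i in range(0, 51)]
best = min(candidates, key=sum_squared_error)

print(best)          # candidate with the least squared error
print(mean(scores))  # the sample mean, for comparison
```

On this grid the best candidate coincides with the sample mean, which is what the equation for the mean is designed to achieve.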

Standard error

The standard deviation tells us about how well the mean represents the sample data.

Sampling variation: samples vary because they contain different members of the population.

Sampling distribution: the frequency distribution of sample means from the same population.
Only hypothetical.
The sampling distribution of the mean tells us about the behaviour of samples from the population. It is centred at the same value as the mean of the population.
If we took the average of all the sample means, we would get the value of the population mean.

We can use the sampling distribution to tell us how representative a sample is of the population.

Standard error of the mean (SE), or standard error: the standard deviation of sample means.

  • hypothetically, the standard error could be calculated by taking the difference between each sample mean and the overall mean, squaring the differences, adding them up, and dividing by the number of samples. And then taking the square root of this value to get the standard deviation of sample means.
  • in the real world, we compute the standard error from a mathematical approximation.
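
Both routes described above can be sketched with a simulation. The population values, sample size and number of samples below are assumptions chosen for illustration; the approximation used is the sample standard deviation divided by the square root of the sample size.

```python
import random
from math import sqrt
from statistics import mean, pstdev, stdev

rng = random.Random(42)       # fixed seed so the run is repeatable
POP_MEAN, POP_SD = 100, 15    # hypothetical population values
N, N_SAMPLES = 25, 2000       # size of each sample, number of samples

# Hypothetical route: draw many samples and record each sample mean
sample_means = []
for _ in range(N_SAMPLES):
    sample = [rng.gauss(POP_MEAN, POP_SD) for _ in range(N)]
    sample_means.append(mean(sample))

# Standard deviation of the sample means (dividing by the number of samples)
sd_of_sample_means = pstdev(sample_means)

# Real-world route: approximate the standard error from a single sample
one_sample = [rng.gauss(POP_MEAN, POP_SD) for _ in range(N)]
approx_se = stdev(one_sample) / sqrt(N)

print(round(mean(sample_means), 2))   # close to the population mean
print(round(sd_of_sample_means, 2))   # close to POP_SD / sqrt(N)
print(round(approx_se, 2))            # estimate from one sample only
```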

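Central limit theorem: tells us that as samples get large, the sampling distribution of the mean approaches a normal distribution with a mean equal to the population mean and a standard deviation of $\sigma_{\bar{X}} = s / \sqrt{N}$, where $s$ is the sample standard deviation and $N$ is the sample size.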

When the sample is relatively small (fewer than 30), the sampling distribution is not normal. It is a t-distribution.

Any parameter that can be estimated in a sample has a hypothetical sampling distribution and standard error.

In short:

The standard error of the mean is the standard deviation of sample means. As such, it is a measure of how representative of the population a sample mean is likely to be. A large standard error (relative to the sample mean) means that there is a lot of variability between the means of different samples and so the sample mean we have might not be representative of the population mean. A small standard error indicates that most sample means are similar to the population mean.

(Confidence) interval

  • we usually use a sample value as an estimate of a parameter in the population.
  • the estimate of a parameter will differ across samples.
  • we can use the standard error to get some idea of the extent to which these estimates differ across samples.
  • we can use this information to calculate boundaries within which we believe the population value will fall. → confidence intervals.

Applies to any parameter.

Calculating confidence intervals

Point estimate: a single value from the sample.

Interval estimate: use our sample value as the midpoint, but set a lower and upper limit as well.

A confidence interval shows us how often, in the long run, an interval contains the true value of the parameter we’re trying to estimate.
The limits are constructed such that, for a certain percentage of samples, the true value of the population parameter falls within them.

To calculate the confidence interval, we need to know the limits within which 95% of the sample means will fall.
We know (in large samples) that the sampling distribution of means will be normal, and the standard normal distribution has been precisely defined such that it has a mean of 0 and a standard deviation of 1.

In the standard normal distribution, 95% of z-scores fall between -1.96 and 1.96, so the limits of the 95% confidence interval are the sample mean minus and plus 1.96 times the standard error.
For samples above about 30, the sampling distribution will be approximately normal.

The mean is always the centre of the confidence interval.
We assume the confidence interval contains the true mean.
If the interval is small, the sample mean must be very close to the true mean.

Calculating other confidence intervals

If we want to compute confidence intervals for a value other than 95%, we need to look up the value of z for the percentage we want.
The values of z are multiplied by the standard error to calculate the confidence interval.
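
A minimal sketch of this calculation in Python, assuming a large sample so that z is appropriate. The function name `z_confidence_interval` and the simulated sample are hypothetical; `NormalDist().inv_cdf` from the standard library stands in for looking up z in a table.

```python
import random
from math import sqrt
from statistics import NormalDist, mean, stdev

def z_confidence_interval(sample, level=0.95):
    """Confidence interval for the mean: sample mean +/- z * standard error."""
    # z that leaves (1 - level) / 2 in each tail of the standard normal distribution
    z = NormalDist().inv_cdf(1 - (1 - level) / 2)
    m = mean(sample)
    se = stdev(sample) / sqrt(len(sample))
    return m - z * se, m + z * se

# Simulated sample of 100 scores (assumed data, for illustration only)
rng = random.Random(1)
sample = [rng.gauss(50, 10) for _ in range(100)]

ci95 = z_confidence_interval(sample, 0.95)
ci99 = z_confidence_interval(sample, 0.99)
print(ci95)
print(ci99)   # a higher confidence level gives a wider interval
```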

Calculating confidence intervals in small samples

For small samples, the sampling distribution is not normal. It has a t-distribution.
The t-distribution is a family of probability distributions that change shape as the sample size gets bigger.
It is the same principle as z, but instead we use the value for t.

n-1 is the degrees of freedom, which tell us which of the t-distributions to use.

Showing confidence intervals visually

Confidence intervals provide us with information about a parameter.

A confidence interval for the mean is a range of scores such that the population mean will fall within this range in 95% of samples.
The confidence interval is NOT an interval within which we are 95% confident that the population mean will fall.
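
The long-run reading of a confidence interval can be checked with a simulation. The sketch below assumes a hypothetical normal population, draws many samples, builds a 95% interval from each using z = 1.96, and counts how many intervals contain the population mean.

```python
import random
from math import sqrt
from statistics import mean, stdev

rng = random.Random(7)                 # fixed seed so the run is repeatable
POP_MEAN, POP_SD = 50, 10              # hypothetical population values
N, N_SAMPLES = 100, 1000               # size of each sample, number of samples

hits = 0
for _ in range(N_SAMPLES):
    sample = [rng.gauss(POP_MEAN, POP_SD) for _ in range(N)]
    m = mean(sample)
    se = stdev(sample) / sqrt(N)
    # Does this sample's 95% interval contain the population mean?
    if m - 1.96 * se <= POP_MEAN <= m + 1.96 * se:
        hits += 1

coverage = hits / N_SAMPLES
print(coverage)   # proportion of intervals that contain the population mean
```

The proportion should land near 0.95: the 95% describes the procedure across samples, not any single interval.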

Null hypothesis significance testing

Fisher’s p-value

Scientists tend to use 5% as a threshold for confidence. Only when there is a 5% chance (or less) of getting the result we have, or one more extreme, if no effect exists are we confident enough to accept that the effect is genuine.

Types of hypothesis

Null hypothesis: an effect is absent

Alternative hypothesis: there is an effect.

The null hypothesis is the baseline against which we evaluate how plausible our alternative hypothesis is.

Hypotheses can be:

  • Directional
    States that an effect will occur, and also states the direction of that effect
  • Non-directional
    States that an effect will occur, but doesn’t state the direction of the effect.

The process of NHST

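The p-value represents the long-run probability of getting a test statistic at least as large as the one observed if the null hypothesis were true.

We can never be completely sure that either hypothesis is correct.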

Test statistics

NHST relies on fitting a model to the data and then evaluating the probability of this model, given the assumption that no effect exists.

Systematic variation: variation that can be explained by the model that we’ve fitted to the data.

Unsystematic variation: variation that cannot be explained by the model that we’ve fitted. It is error, or variation not attributable to the effect we’re investigating.

To test whether the model fits the data, we compare the systematic variation against the unsystematic variation.
In effect, we look at a signal-to-noise ratio.

Test statistic = signal / noise = effect / error
The best way to test a parameter is to look at the size of the parameter relative to the background noise that produced it: the ratio of how big a parameter is to how much it can vary across samples.

The ratio of effect relative to error is a test statistic.
The exact form of the equation changes depending on which test statistic you’re calculating.
They all represent the same thing: signal-to-noise, or the amount of variance explained by the model we’ve fitted to the data compared to the variance that can’t be explained by the model.

A test statistic: a statistic for which we know how frequently different values occur.

As test statistics get bigger, the probability of them occurring becomes smaller.
If this probability falls below a certain value (p < 0.05), we presume that the test statistic is as large as it is because our model explains a sufficient amount of variation to reflect a genuine effect in the real world.
The test statistic is said to be statistically significant.
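
To make the link between the size of a test statistic and its probability concrete, the sketch below assumes a test statistic that follows the standard normal distribution when the null hypothesis is true (a z-statistic). That distribution and the helper name `two_tailed_p` are assumptions for illustration; other test statistics have their own distributions.

```python
from statistics import NormalDist

def two_tailed_p(z):
    """Probability of a z-statistic at least this far from 0 if the null is true."""
    return 2 * (1 - NormalDist().cdf(abs(z)))

# Larger test statistics are less probable under the null hypothesis
for z in (0.5, 1.0, 1.96, 3.0):
    p = two_tailed_p(z)
    verdict = "significant" if p < 0.05 else "not significant"
    print(z, round(p, 4), verdict)
```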

One- and two-tailed tests

Hypotheses can be directional.

One-tailed test: a statistical model that tests a directional hypothesis

Two-tailed test: a statistical model that tests a non-directional hypothesis.

If the result of a one-tailed test is in the opposite direction to what you expected, you cannot and must not reject the null hypothesis.

Type I and type II errors

There are two types of errors we can make when we test hypotheses.

  • Type I error: occurs when we believe that there is a genuine effect in our population, when in fact there isn’t.
    Its probability is the α-level.
  • Type II error: occurs when we believe that there is no effect in the population when, in reality, there is.
    Its probability is the β-level.

There is a trade-off between these two errors.
If we lower the probability of accepting an effect as genuine, we increase the probability that we’ll reject an effect that does genuinely exist.

Inflated error rates

If a single test uses a 0.05 level of significance, then the chance of making a type I error is only 5%.

In practice, we usually need to conduct several tests to answer a research question.

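Familywise or experimentwise error rate: the type I error rate across a set of tests on the same data. The more tests, the greater the chance of a type I error.

Familywise error = $1 - 0.95^{n}$, where $n$ is the number of tests carried out.

The most popular way to correct this is to divide alpha by the number of comparisons (the Bonferroni correction).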
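
A short sketch of both calculations, assuming independent tests at α = 0.05. The function names are hypothetical.

```python
def familywise_error(n_tests, alpha=0.05):
    """Probability of at least one type I error across n independent tests."""
    return 1 - (1 - alpha) ** n_tests

def bonferroni_alpha(n_tests, alpha=0.05):
    """Per-test significance level after dividing alpha by the number of tests."""
    return alpha / n_tests

for n in (1, 3, 10):
    corrected = bonferroni_alpha(n)
    print(
        n,
        round(familywise_error(n), 4),             # uncorrected familywise error
        round(corrected, 4),                       # per-test alpha after correction
        round(familywise_error(n, corrected), 4),  # familywise error after correction
    )
```

With the corrected per-test level, the familywise error stays at or below 0.05 regardless of the number of tests.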

Statistical power

Statistical power: the ability of a test to find an effect.

The β-level is the type II error rate.

The power of a test is 1 - β.

We typically aim to achieve a power of 0.8.

The power of a statistical test depends on:

  • how big the effect is (effect size)
  • how strict we are about deciding that an effect is significant
  • sample size

Calculate the power of a test:
Given that we’ve conducted our experiment, we will have already selected a value of alpha, we can estimate the effect size based on our sample data, and we will know how many participants we used. From these we can calculate the power. If this value is 0.8 or more, we can be confident that we have achieved sufficient power to detect any effects that might have existed.
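
How alpha, effect size and sample size combine into power can be sketched for the simplest case. The example assumes a two-tailed one-sample z-test with a standardized effect size (the effect divided by the standard deviation); this is a simplification chosen for illustration, not necessarily the calculation used in the book, and the function name `z_test_power` is hypothetical.

```python
from math import sqrt
from statistics import NormalDist

def z_test_power(effect_size, n, alpha=0.05):
    """Power of a two-tailed one-sample z-test for a standardized effect size."""
    nd = NormalDist()
    z_crit = nd.inv_cdf(1 - alpha / 2)   # critical value for the chosen alpha
    shift = effect_size * sqrt(n)        # expected test statistic if the effect is real
    # Probability that the test statistic lands beyond either critical value
    return (1 - nd.cdf(z_crit - shift)) + nd.cdf(-z_crit - shift)

for n in (10, 32, 100):
    print(n, round(z_test_power(0.5, n), 3))   # power rises with sample size
```

Under these assumptions, power rises with a larger effect, a larger sample, or a less strict alpha, which matches the three factors listed above.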

Confidence intervals and statistical significance

Guidelines:

  • 95% confidence intervals that just about touch end-to-end represent a p-value for testing the null hypothesis of no differences of approximately 0.01.
  • If there is a gap between the upper end of one 95% confidence interval and the lower end of another, then p < 0.01.
  • A p-value of 0.05 is represented by moderate overlap between the bars.

Sample size and statistical significance

There is a connection between the sample size and the p-value associated with a test statistic.

If the sample gets larger, the standard error (and therefore the confidence interval) gets smaller.

The significance of a test is directly linked to the sample size. The same effect will have different p-values in different-sized samples. Small differences can be deemed significant in large samples, and large effects might be deemed non-significant in small samples.
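
The sketch below holds an effect constant and varies only the sample size. It assumes a one-sample z-test with a hypothetical mean difference of 2 and standard deviation of 10; the function name and numbers are illustrative.

```python
from math import sqrt
from statistics import NormalDist

def p_value_for_effect(effect, sd, n):
    """Two-tailed p-value of a one-sample z-test for a fixed effect and sample size."""
    standard_error = sd / sqrt(n)     # shrinks as the sample gets larger
    z = effect / standard_error       # test statistic: effect relative to error
    return 2 * (1 - NormalDist().cdf(abs(z)))

EFFECT, SD = 2, 10   # the same hypothetical effect in every sample
for n in (20, 100, 500):
    print(n, round(p_value_for_effect(EFFECT, SD, n), 4))
```

The effect never changes, yet the p-value drops as n grows, because the standard error in the denominator of the test statistic gets smaller.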

 

 

Summary of Discovering statistics using IBM SPSS statistics by Field - 5th edition

Author: SanneA