Why is my evil lecturer forcing me to learn statistics? - summary of chapter 1 of Statistics by A. Field (5th edition)

Statistics
Chapter 1
Why is my evil lecturer forcing me to learn statistics?

The research process
Collecting data: research design
Analysing data
Reporting data

The research process

Initial observation: finding something that needs explaining

To see whether an observation is true, you need to define one or more variables that quantify the thing you’re trying to measure.

Generating and testing theories and hypotheses

A theory: an explanation or set of principles that is well substantiated by repeated testing and explains a broad phenomenon.

A hypothesis: a proposed explanation for a fairly narrow phenomenon or set of observations.
An informed, theory-driven attempt to explain what has been observed.

A theory explains a wide set of phenomena with a small set of well-established principles.
A hypothesis typically seeks to explain a narrower phenomenon and is, as yet, untested.
Both theories and hypotheses exist in the conceptual domain, and you cannot observe them directly.

To test a hypothesis, we need to operationalize it in a way that enables us to collect and analyse data that have a bearing on it.
Predictions emerge from a hypothesis. A prediction tells us something about the hypothesis from which it is derived.

Falsification: the act of disproving a hypothesis or theory.

Collecting data: measurement

Independent and dependent variable

Variables: things that can change

Independent variable: a variable thought to be the cause of some effect.

Dependent variable: a variable thought to be affected by changes in an independent variable.

Predictor variable: a variable thought to predict an outcome variable. (independent)

Outcome variable: a variable thought to change as a function of changes in a predictor variable (dependent)

Levels of measurement

The level of measurement: the relationship between what is being measured and the numbers that represent what is being measured.

Variables can be categorical or continuous, and can have different levels of measurement.

A categorical variable is made up of categories.
It names distinct entities.
In its simplest form it names just two distinct types of things (like male or female).
Binary variable: there are only two categories.
Nominal variable: there are more than two categories.

Ordinal variable: when categories are ordered.
Ordinal data tell us not only that things have occurred, but also the order in which they occurred.
They still tell us nothing about the size of the differences between points on the scale.

Continuous variable: a variable that gives us a score for each person and can take on any value on the measurement scale that we are using.
Interval variable: to say that data are interval, we must be certain that equal intervals on the scale represent equal differences in the property being measured.
Ratio variables: in addition to the measurement scale meeting the requirements of an interval variable, the ratios of values along the scale should be meaningful. For this to be true, the scale must have a true and meaningful zero point.

Discrete variable: can take on only certain values (usually whole numbers) on the scale, whereas a continuous variable can take on any value on the scale.

Measurement error

Ideally we want our measure to be calibrated such that values have the same meaning over time and across situations.

Measurement error: the discrepancy between the numbers we use to represent the thing we’re measuring and the actual value of the thing we’re measuring.

Validity and reliability

Validity: whether an instrument measures what it sets out to measure.

Reliability: whether an instrument can be interpreted consistently across different situations.

Criterion validity: whether you can establish that an instrument measures what it claims to measure through comparison to objective criteria.

Often impractical because objective criteria that can be measured easily may not exist.

Concurrent validity: when data are recorded simultaneously using the new instrument and existing criteria.

Content validity: the degree to which individual items represent the construct being measured.

Predictive validity: when data from the new instrument are used to predict observations at a later point in time.

Validity is a necessary but not sufficient condition of a measure.
To be valid, the instrument must first be reliable.

Test-retest reliability: assessed by testing the same group of people twice. A reliable instrument will produce similar scores at both points in time.

Collecting data: research design

In correlational or cross-sectional research we observe what naturally goes on in the world without directly interfering with it.

In experimental research we manipulate one variable to see its effects on another.

Correlational research methods

Observing natural events. Not interfering. No causal conclusions!
Correlational research provides a very natural view of the question we’re researching because we’re not influencing what happens and the measures of the variables should not be biased by the researcher being there.

In correlational research variables are often measured simultaneously.
Problems with this:

It provides no information about the contiguity between different variables.
Effects might be caused by a third variable (a tertium quid, also called a confounding variable).

Longitudinal research: measuring variables repeatedly at different time points.

Experimental research methods

Makes a causal link between variables.
Uses an independent variable (the proposed cause) and a dependent variable (the proposed effect).

Experimental methods strive to provide a comparison of situations in which the proposed cause is present or absent.

The levels of the independent variable are the ways in which it is manipulated.

Two methods of data collection

There are two ways to manipulate the independent variable.

Test different entities: a between-groups, between-subjects or independent design.
Use the same entities: a within-subject or repeated-measures design.

Two types of variation

Unsystematic variation: variation due to random factors that exist between the experimental conditions.
Systematic variation: variation due to the experimenter doing something in one condition but not in the other condition.

Randomization

Randomization is important because it eliminates most other sources of systematic variation.

Counterbalancing: varying the order in which people participate in the conditions.
We can use randomization to determine in which order the conditions are completed. We randomly determine whether a participant completes condition 1 before condition 2 or the other way round.
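
The random ordering described above can be sketched in a few lines of Python. This is only an illustration: the function name `assign_orders`, the participant labels and the `seed` argument are assumptions made for the example, not something taken from the book.

```python
import random

def assign_orders(participants, conditions=("condition 1", "condition 2"), seed=None):
    """Randomly decide, for each participant, the order of the conditions."""
    rng = random.Random(seed)  # a seed makes the example repeatable
    orders = {}
    for person in participants:
        order = list(conditions)
        rng.shuffle(order)  # random order for this participant only
        orders[person] = order
    return orders

# Hypothetical participant labels
orders = assign_orders(["P1", "P2", "P3", "P4"], seed=1)
for person, order in orders.items():
    print(person, "->", order)
```

Every participant still completes both conditions; only the order is left to chance.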

Analysing data

Frequency distributions

Frequency distribution (or histogram): a graph plotting values of observations on the horizontal axis, with a bar showing how many times each value occurred in the data set.

Can be very useful for assessing properties of the distribution of scores.
They come in many different shapes and sizes.
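
As a rough illustration, a frequency distribution can be built by counting how often each value occurs. The scores below are made up for the example, and the text-based bars are just a stand-in for a real histogram.

```python
from collections import Counter

scores = [2, 3, 3, 4, 4, 4, 5, 5, 6]  # made-up scores
frequencies = Counter(scores)          # maps each value to how often it occurred

# One row per value, with one '#' per occurrence
for value in sorted(frequencies):
    print(f"{value}: {'#' * frequencies[value]}")
```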

Normal distribution: data that are distributed symmetrically around the centre, in the shape of a bell. This shape implies that the majority of scores lie around the centre of the distribution.
There are two ways in which a distribution can deviate from normal:

Skew: lack of symmetry.
The most frequent scores are clustered at one end of the scale.
A skewed distribution can be:
Positively skewed: the frequent scores are clustered at the lower end and the tail points toward the higher or more positive scores.
Negatively skewed: the frequent scores are clustered at the higher end and the tail points toward the lower or more negative scores.
Kurtosis: pointyness. The degree to which scores cluster at the ends of the distribution (the tails) and this tends to express itself in how pointy a distribution is.
This can be:
Leptokurtic (positive kurtosis) has many scores in the tails and is pointy.
Platykurtic (negative kurtosis) is relatively thin in the tails and tends to be flatter than normal.

In a normal distribution the values of skew and kurtosis are 0.
If a distribution has values of skew or kurtosis above or below 0, this indicates a deviation from normal.
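
To make these values concrete, here is a minimal sketch that computes skew and kurtosis from the simple moment-based formulas, with kurtosis expressed relative to the normal distribution (so a normal distribution scores 0 on both). This is an assumption about which formula to use: statistical packages often apply small-sample corrections, so their output may differ slightly. The function name and the data are made up for illustration.

```python
def skew_and_kurtosis(scores):
    """Return (skew, excess kurtosis) using simple moment-based formulas."""
    n = len(scores)
    mean = sum(scores) / n
    # Second, third and fourth central moments
    m2 = sum((x - mean) ** 2 for x in scores) / n
    m3 = sum((x - mean) ** 3 for x in scores) / n
    m4 = sum((x - mean) ** 4 for x in scores) / n
    skew = m3 / m2 ** 1.5
    excess_kurtosis = m4 / m2 ** 2 - 3  # subtract 3 so that normal = 0
    return skew, excess_kurtosis

symmetric = [1, 2, 3, 4, 5]            # no skew
long_right_tail = [1, 1, 1, 2, 2, 3, 10]  # scores clustered at the lower end

print(skew_and_kurtosis(symmetric))
print(skew_and_kurtosis(long_right_tail))
```

The second data set has its frequent scores at the lower end and a tail toward the higher scores, so its skew comes out positive.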

The mode

Central tendency: where the centre of a frequency distribution lies.

The mode: the score that occurs most frequently in a data-set.
The mode can take on several values:

bimodal: two modes
multimodal: data sets with more than two modes.

The median

Median: the middle score when scores are ranked in order of magnitude.
If there is an even number of scores, we add the two middle scores and divide the sum by two.

The median is relatively unaffected by extreme scores at either end of the distribution.
It is also relatively unaffected by skewed distributions, and can be used with ordinal, interval and ratio data. (Not nominal data, for these data have no numerical order).

The mean

To calculate the mean we add up all of the scores and then divide the sum by the total number of scores we have.

Disadvantage: it can be influenced by extreme scores.
It is also affected by skewed distributions and can be used only with interval or ratio data.

But:

The mean uses every score
The mean tends to be stable in different samples.
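
The three measures of central tendency can be compared with Python's standard `statistics` module. The scores are made up; the last one is deliberately extreme to show how it pulls the mean but not the median.

```python
import statistics

scores = [1, 2, 3, 3, 4, 5, 6, 40]  # made-up scores; 40 is an extreme score

modes = statistics.multimode(scores)  # list of the most frequent value(s)
median = statistics.median(scores)    # even number of scores: average of the two middle ones
mean = statistics.mean(scores)        # sum of the scores divided by how many there are

print("mode(s):", modes)
print("median:", median)
print("mean:", mean)

# A bimodal data set has two modes
print("bimodal example:", statistics.multimode([1, 1, 2, 2, 3]))
```

Here the median stays close to the bulk of the scores, while the single extreme score drags the mean well above most of the data.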

The dispersion in a distribution

The range of scores: take the largest score and subtract from it the smallest score of a data-set.

Problem: the range is affected dramatically by extreme scores.

Interquartile range: cut off the top and bottom 25% of scores and calculate the range of the middle 50% of scores.

Quartiles: the three values that split the data into four equal parts.

the second quartile: the median (splits our data into two equal parts)
the lower quartile: the median of the lower half of the data
the upper quartile: the median of the upper half of the data

The median is not included in the two halves when the data are split, although you can choose to include it.

The interquartile range isn’t affected by extreme scores at either end of the distribution, but you lose data!

Quantiles: values that split a data set into equal portions.
Quartiles are quantiles that split data into four equal parts.
Percentiles are quantiles that split data into 100 equal parts.
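
A small sketch of the range, quartiles and interquartile range, following the "median of each half" description above. It assumes the convention in which the median is left out of both halves when the number of scores is odd; other conventions give slightly different quartiles. The function name and data are made up for illustration.

```python
import statistics

def quartiles(scores):
    """Return (lower quartile, median, upper quartile) via the median-of-halves method."""
    ordered = sorted(scores)
    n = len(ordered)
    half = n // 2
    lower_half = ordered[:half]
    # With an odd number of scores the median itself is excluded from both halves
    upper_half = ordered[half + 1:] if n % 2 else ordered[half:]
    return (statistics.median(lower_half),
            statistics.median(ordered),
            statistics.median(upper_half))

scores = [1, 3, 4, 6, 7, 8, 9, 12, 13, 15, 40]  # made-up scores; 40 is extreme

score_range = max(scores) - min(scores)
lower, median, upper = quartiles(scores)
iqr = upper - lower

print("range:", score_range)
print("quartiles:", lower, median, upper)
print("interquartile range:", iqr)
```

The extreme score inflates the range but leaves the interquartile range untouched, because it falls outside the middle 50% of scores.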

If we want to use all the data rather than half of it, we can calculate the spread of scores by looking at how different each score is from the centre of the distribution.
Deviance: the difference between each score and the mean.

If we want to know the total deviance, we could add up the deviances for each data point.
The problem is that positive and negative deviances cancel out, so the total is always zero.

Sum of squared errors (SS): square all the deviances and add them up.
We can use the sum of squares as an indicator of the total dispersion, or total deviance of scores from the mean.
The problem with using the total is that its size will depend on how many scores we have in the data.

Variance: the average error between the mean and the observations made.
It is the sum of squares divided by the number of observations minus one (N − 1).

The variance gives us a measure in units squared.

Standard deviation: the square root of the variance.

The sum of squares, variance and standard deviation are all measures of the dispersion or spread of data around the mean.
A small standard deviation indicates that the data points are close to the mean.
A large standard deviation indicates that the data points are distant from the mean.
A standard deviation of 0 indicates that the scores were the same.
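
The chain from deviances to standard deviation can be worked through step by step. This sketch uses made-up scores and divides the sum of squares by N − 1, the usual convention for a sample; dividing by N instead would give the population variance.

```python
import math

scores = [1, 3, 4, 3, 2]  # made-up scores
n = len(scores)
mean = sum(scores) / n

# Deviance: the difference between each score and the mean
deviances = [x - mean for x in scores]
total_deviance = sum(deviances)  # positive and negative deviances cancel out

# Sum of squared errors: square each deviance, then add them up
ss = sum(d ** 2 for d in deviances)

# Variance (sample convention, N - 1) and standard deviation
variance = ss / (n - 1)
sd = math.sqrt(variance)

print("mean:", mean)
print("total deviance:", round(total_deviance, 10))
print("sum of squares:", round(ss, 10))
print("variance:", round(variance, 10))
print("standard deviation:", round(sd, 3))
```

The variance is in squared units; taking the square root returns the measure to the original units of the scores.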

Using a frequency distribution to go beyond data

Another way to think about frequency distributions is not in terms of how often scores actually occurred, but how likely it is that a score would occur.

Probability distribution: just like a histogram except that the lumps and bumps have been smoothed out so that we see a nice smooth curve. The area under the curve tells us something about the probability of a value occurring.

We often use a normal distribution with a mean of 0 and a standard deviation of 1 as the standard probability distribution.

To centre the data around zero, we take each score (X) and subtract from it the mean of all scores (X̄). Then we divide the resulting score by the standard deviation (s): z = (X − X̄) / s.

The resulting scores are denoted by the letter z and are the z-scores.
The sign of the z-score tells us whether the original score was above or below the mean. The value of the z-score tells us how far the score was from the mean in standard deviation units.
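
Converting scores to z-scores can be sketched as follows. The scores are made up, the function name is an assumption for the example, and the standard deviation used is the sample version (N − 1) from Python's `statistics` module.

```python
import statistics

def z_scores(scores):
    """Convert raw scores to z-scores: (score - mean) / standard deviation."""
    mean = statistics.mean(scores)
    sd = statistics.stdev(scores)  # sample standard deviation (N - 1)
    return [(x - mean) / sd for x in scores]

scores = [1, 3, 4, 3, 2]  # made-up scores
z = z_scores(scores)

for raw, standardized in zip(scores, z):
    print(raw, "->", round(standardized, 2))
```

Scores below the mean get a negative z-score and scores above it a positive one; after the conversion the set of z-scores has a mean of 0 and a standard deviation of 1.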

Reporting data

Dissemination of research

Sharing information is a fundamental part of being a scientist.

Scientific journal: a collection of articles written by scientists on a vaguely similar topic.

A capital N represents the entire sample.
A lower case n represents a subsample.


Summary of Discovering statistics using IBM SPSS statistics by Field - 5th edition

This is a summary of the book "Discovering statistics using IBM SPSS statistics" by A. Field. In this summary, everything students in the second year of psychology at the UvA will need is present. The content needed in the first three blocks is already online, and the rest

...
