Lecture Notes Introduction to Research Methods and Statistics, Psychology Bachelor 1, University of Leiden 2018/19

Week 1

Lecture 1 05/09/18

What is psychology?

In the terms of statistics, psychology is a scientific discipline in which behaviour and mental processes are measured and observed, using scientific methods. Important in this process are three things:

  1. Public Verification. This means that you must always note down precisely what you did and how you did it, so that other scientists can recreate your experiment if they wish. This ensures that the results of a study can be checked and validated by others.
  2. Systematic Empiricism. This means using a systematic way of measuring and testing to test, formulate or support hypotheses.
  3. Solvable problems. Science works with the things that we see occur and logical conclusions. Solving the problem of the reason of our existence, for instance, would not be a solvable problem and thus not be very scientific.

It is a scientist's job to discover and document new patterns, phenomena or correlations, but also to develop and evaluate explanations for these phenomena. This can be done in four ways:

  1. A descriptive study. Describing observations, patterns and results in words and numbers.
  2. A correlational study. With this kind of study, you can only find evidence that there is a relationship between two variables; you cannot say which influences the other and how. This is because in a correlational study, you simply observe and do not interfere, and thus have no control over any other interfering variables.
  3. An experimental study. Because an experimental study is done in a controlled environment, this kind of study can explain a cause-effect relationship between two or more variables.
  4. A quasi-experimental study. Doing a quasi-experimental study means that you do interfere with the occurrences, but outside a fully controlled environment, typically without randomly assigning participants to conditions.

 

The empirical cycle

The empirical cycle consists of the stages that a study goes through when it is developed, tested and evaluated. There are five stages that lead in a circle: Observation > Induction > Deduction > Testing > Evaluation > Observation again.

The observation stage is often also described as the freedom of design. In this stage, the idea for a study or research question occurs. This idea can come from anywhere, from looking out of your window to reading a newspaper at the breakfast table.

Induction means that you begin to formulate your idea into a general theory. You come up with a set of rules or guidelines that tries to explain the connection between two or more concepts.

Deduction means that you form a very specific research question, often phrased as a hypothesis, which is a prediction that is based on your theory. You also come up with how you want to test this hypothesis. This is done with two definitions: a conceptual one (what is it that you are testing?) and an operational one (how do you measure your outcomes?).

Moving into the testing phase means that you carry out your experiment and collect the results. After this you analyse the data and draw a conclusion from your results.

Evaluation means that you take the conclusions drawn in the testing phase and look back on the theory. What do my results say about my hypothesis? About my specific research question? About my general theory? You should hold your study to critical review and where necessary expand, improve or adjust before you start the cycle all over again.

A theory can never really be proven; it can only be supported or falsified. Data that match the prediction support a theory rather than prove it, simply because our current understanding of science and the world is ever evolving, ever changing.

 

Variability and variance

In studies, there are always one or more variables. The amount that a variable changes or differs is called variability. Variability is very important in psychology because it describes and explains differences between people.

Variance is the measure of variability. It shows how much variability there is within a study. The symbol for variance is $s^2$, and it is calculated by dividing the sum of squares by the number of observations minus one. As a formula, it looks like this:

$$s^2 = \frac{\sum (y_{ij} - \bar{y})^2}{N - 1}$$

In which: $\bar{y}$ = the grand mean, $y_{ij}$ = the score of individual i in group j, $\sum$ means summation, N = the number of observations and $\sum (y_{ij} - \bar{y})^2$ = the sum of squares, also labelled SS(total).

Within an experiment, there is always a certain amount of error that is outside the examiner's control, such as, for instance, a genetic predisposition for losing weight or naturally higher anxiety levels. Therefore, the total variance is built up of systematic variance (real variance) and error variance (the part caused by such uncontrolled factors).

Systematic variance and error variance can be calculated by the formulas:

$$s^2_{\text{systematic}} = \frac{\sum n_j (\bar{y}_j - \bar{y})^2}{N - 1}$$

where $\bar{y}_j$ = the mean of group j and $n_j$ = the number of participants in group j, and

$$s^2_{\text{error}} = \frac{\sum (y_{ij} - \bar{y}_j)^2}{N - 1}$$

The sums of squares of these two formulas are respectively called SS(between) and SS(within). Together they add up to SS(total).

The proportion of systematic variance within the total variance is also called Variance Accounted For (VAF). It is calculated by dividing the systematic variance by the total.
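
As a rough illustration of how SS(total), SS(between), SS(within) and the VAF fit together, here is a minimal Python sketch. The function name and the two groups of scores are made up for this example and do not come from the lecture.

```python
from statistics import mean

def variance_decomposition(groups):
    """Split the total variability of grouped scores into a systematic and an error part."""
    scores = [y for group in groups for y in group]
    n = len(scores)
    grand_mean = mean(scores)

    # SS(total): squared distance of every score to the grand mean
    ss_total = sum((y - grand_mean) ** 2 for y in scores)
    # SS(between): squared distance of each group mean to the grand mean, weighted by group size
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # SS(within): squared distance of every score to its own group mean
    ss_within = sum((y - mean(g)) ** 2 for g in groups for y in g)

    return {
        "ss_total": ss_total,
        "ss_between": ss_between,
        "ss_within": ss_within,
        "total_variance": ss_total / (n - 1),
        "systematic_variance": ss_between / (n - 1),
        "error_variance": ss_within / (n - 1),
        "vaf": ss_between / ss_total,  # proportion of systematic variance
    }

# Hypothetical scores for two groups (illustration only)
example = variance_decomposition([[2, 4, 6], [8, 10, 12]])
print(example)
```

Because all three variances share the same denominator (N - 1), dividing the systematic variance by the total variance gives the same VAF as dividing SS(between) by SS(total).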

 

Week 2

Lecture 2 12/09/18

Observation and measurement

Observation in psychology means that you put things you see, hear or smell into categories for later evaluation. To measure means that you give a value to these categories. This can be in words (good, bad, heavy, light, big, small) or in numbers (46%, 0.1, 5 years).

A variable is a collection of these values. It is important in a study that all your measurements and observations comply with two ground rules: exhaustiveness and mutual exclusivity.

An observation is exhaustive when there is a value for each and every measurement. This means that it must be possible, for instance in a questionnaire, for every person to always give an answer. This can be done by, for example, adding an 'other' category to multiple choice questions, so that you include everyone.

Mutual exclusivity means that it must be impossible to give more than one answer. There mustn't be any overlap between answers. For instance, the categories 'more than once', 'more than five times' and 'more than ten times' are not mutually exclusive because six can be more than once but also more than five. The categories '0', '1-5' and '6-10', however, are, because there is only one option.
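
The two ground rules can be checked mechanically. The sketch below is only an illustration: the helper names are invented, and it assumes that the possible answers run from 0 to 10 times.

```python
def coverage_counts(categories, possible_values):
    """For every possible answer, count how many categories it falls into."""
    return {
        value: sum(1 for belongs in categories.values() if belongs(value))
        for value in possible_values
    }

def is_exhaustive(categories, possible_values):
    # Exhaustive: every possible answer fits in at least one category
    return all(count >= 1 for count in coverage_counts(categories, possible_values).values())

def is_mutually_exclusive(categories, possible_values):
    # Mutually exclusive: no answer fits in more than one category
    return all(count <= 1 for count in coverage_counts(categories, possible_values).values())

possible = range(0, 11)  # assumption: answers between 0 and 10 times

overlapping = {
    "more than once": lambda v: v > 1,
    "more than five times": lambda v: v > 5,
    "more than ten times": lambda v: v > 10,
}
clean = {
    "0": lambda v: v == 0,
    "1-5": lambda v: 1 <= v <= 5,
    "6-10": lambda v: 6 <= v <= 10,
}

print(is_exhaustive(overlapping, possible), is_mutually_exclusive(overlapping, possible))
print(is_exhaustive(clean, possible), is_mutually_exclusive(clean, possible))
```

In this sketch the overlapping categories fail both checks: an answer of six falls into two categories, and answers of zero or one fall into none.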

 

Levels of Measurement

The way you measure an observation can also be placed into categories, called the scales of measurement. There are five levels in total.

  1. Nominal. This means that the numbers that are given to a group serve only as labels without a larger meaning. Example = 'group 1' and 'group 2'
  2. Ordinal. There is a ranking order, but the steps are not of equal size. Example = a race or competition.
  3. Interval. There is a ranking order and the steps are equal, but the 0 point doesn't mean there is nothing to be measured. Example = temperature in °C or °F. Both scales have a 0 point, but it is an arbitrary point that lies at a different temperature on each scale.
  4. Ratio. There is a ranking order and the steps are equal and there is a fixed 0 point, but the unit of measure can vary. Example = distance or volume.
  5. Absolute. There is a ranking order and the steps are equal and there is a fixed 0 point and a fixed unit of measure. Very few things are of an absolute level of measurement. Example = probability and chance.

 

Important in observations

When doing an observation, there are always a few factors you must take into account.

What is the research setting? A lab might influence the subject's behaviour, or maybe what you are measuring is affected by the environment. How are you going to control it?

Is the subject aware? There are two types of observation: direct and disguised. Direct observation, also called undisguised observation, means that the subject knows you are there. This might influence the subject's behaviour, but can also sometimes be necessary. Disguised observation means that the subjects do not know you are observing them, for instance when you sit on a park bench and study the behaviour of people with dogs.

 

Week 3

Lecture 3 19/09/18

How to measure correctly

When doing an experiment or observation, two factors are very important when it comes to measuring your variable: reliability and validity. Reliability means that you measure your variable consistently, with the least amount of error, and validity means that you actually measure what you intend to measure.

 

Reliability

Reliability can be compared to the Variance Accounted For (Week 1). When measuring a variable, the measured score always consists of two parts, the true score and the measurement error. This measurement error can come from a few sources:

  • Situational factors (such as environment or social engagements)
  • Characteristics of the measure (e.g. if it's based on a person's judgement)
  • Mistakes
  • Stability (are you sure your measurement isn't influenced by mechanical factors)

There are several ways to make sure that your experiment or study is reliable, such as;

  1. Test-retest reliability. This simply means that you measure the same thing more than once; consistent scores show the consistency of your study.
  2. Parallel form reliability. You find different instruments to test the same thing. Make sure that they do actually test the same thing and not something different. (E.g. the CITO and the NIHO)
  3. Interitem reliability. This means that you test whether all the parts of your measuring instrument work together. This can be done, for instance with a questionnaire, by splitting the results in half and comparing the halves, or by using Cronbach's Alpha.
  4. Interrater reliability. This means you use different, similarly trained, examiners to avoid single-person bias.
  5. Replication. This means that you do your entire study all over again, expecting the same results.
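
Cronbach's Alpha is only named in the list above, so the sketch below uses the standard textbook formula, alpha = k / (k - 1) × (1 - sum of item variances / variance of the total scores), which is an addition and not something spelled out in the lecture. The questionnaire scores are invented for illustration.

```python
from statistics import variance

def cronbach_alpha(items):
    """Cronbach's Alpha for a list of items.

    items: one list of scores per item, with respondents in the same order in every list.
    """
    k = len(items)
    # Total score per respondent (sum over all items)
    totals = [sum(scores) for scores in zip(*items)]
    item_variance_sum = sum(variance(scores) for scores in items)
    return (k / (k - 1)) * (1 - item_variance_sum / variance(totals))

# Hypothetical answers of five respondents to three questionnaire items
items = [
    [2, 3, 4, 5, 4],
    [3, 3, 5, 5, 4],
    [2, 4, 4, 5, 3],
]
alpha = cronbach_alpha(items)
print(round(alpha, 3))
```

The more the items move together across respondents, the larger the variance of the total scores is compared to the separate item variances, and the closer alpha gets to 1.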

 

Validity

Validity means that you make sure you actually measure what you want to measure. There are four types of validity; 

  1. Face validity. Does it look like you're measuring what you want to measure?
  2. Content validity. Does it cover every aspect of what you want to measure or do you leave something out?
  3. Construct validity. Does it have the correlations with other variables that it should have? There are two types of construct validity. There is convergent, meaning it has strong correlations with related variables (for instance, introversion and number of friends, or introversion and level of shyness), and divergent, meaning it has weak to no correlation with unrelated variables (for instance, introversion and blood pressure).
  4. Criterion-related validity. Does the measurement relate to a particular behavioural criterion or situation? Of this there are also two types. Concurrent, which concerns present behaviour that can be influenced by a situation, like for instance taking your driver's test, and predictive, which predicts future behaviour or results, like an entrance examination for a school or study.

 

Week 4

Lecture 4 26/09/18

Descriptive research

When doing descriptive research, there are different ways to go about it. You can do a survey, measuring attitudes, problems or thought patterns, or you can do a demographic study, noting patterns in life events such as birth, marriage or voting. You can also do an epidemiological study, meaning you catalogue things such as disease and death. When doing a survey, there are three ways to do it:

  1. Cross-sectional. Meaning you do it once, in various places in the population.
  2. Successive independent. This means you perform a cross-sectional survey multiple times. The downside of this is that you will have different respondents every time, so can you really compare them?
  3. Longitudinal (also called panel survey). This means you perform the survey with the same people, over several periods of time. The only risk you have is that people might drop out, for instance if they move to a different place.

 

Describing data

The entire overview of your data, hereafter called the distribution, can tell you a lot about the pattern in society and there are several things you can take into account when analysing it;

  1. The shape. When drawing your distribution up in a graph, take a look at the shape. Does it have one, or more peaks? Is it symmetrical or is the data skewed?
  2. The central tendency. The mean, the mode and the median, where do they lie?
  3. The spread. How far apart is the data? 
  4. Outliers. Are there any data points that lie so far from the rest that they need to be looked at separately, and possibly left out?

 

Frequency

When dealing with a bunch of raw data, it is sometimes easier to get an overview if you take the frequency into consideration. There are two types of frequency, absolute and relative. Absolute frequency is the exact number of participants that had a certain score. The downside of this is that it is hard to interpret on its own. Easier is using the relative frequency, meaning that you express it as a proportion (P) or percentage of the total. This way, you can clearly see how big a group is compared to the rest.

When you have only a small number of participants in each category, or when a variable has many categories, you can decide to make a grouped frequency table. This means you distribute the raw data over K class intervals and make a new frequency distribution. There are two guidelines for this. One, the number of classes (K) = √n, and two, the class interval width (I) = range / number of classes (range being the highest score – lowest score).

A cumulative frequency distribution adds up the proportions class by class, so that the last class reaches 1, as in this table.

Class interval   Frequency (f)   Proportion (P)   Cum. proportion (F)
1-7              2               0.04             0.04
8-14             16              0.32             0.36
15-21            24              0.48             0.84
22-28            8               0.16             1.00
total            50              1.00
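
The table can be rebuilt from the absolute frequencies alone. The following Python sketch (function names are hypothetical) applies the two guidelines for the number and width of classes and then computes the proportions and cumulative proportions.

```python
from math import sqrt

def suggest_classes(scores):
    """Guidelines from the notes: K = sqrt(n) classes, width I = range / K."""
    k = round(sqrt(len(scores)))
    width = (max(scores) - min(scores)) / k
    return k, width

def frequency_table(class_frequencies):
    """class_frequencies: list of (class label, absolute frequency), lowest class first."""
    n = sum(f for _, f in class_frequencies)
    rows = []
    cumulative = 0
    for label, f in class_frequencies:
        cumulative += f
        # proportion P = f / n, cumulative proportion F = running total / n
        rows.append((label, f, f / n, cumulative / n))
    return rows

# Absolute frequencies taken from the table above
rows = frequency_table([("1-7", 2), ("8-14", 16), ("15-21", 24), ("22-28", 8)])
for label, f, p, cum in rows:
    print(f"{label:>6}  f={f:>2}  P={p:.2f}  F={cum:.2f}")
```

The cumulative column is computed from the running count rather than by adding rounded proportions, which avoids small rounding differences in the last class.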

 

Measures of spread, 5 point summary and boxplots

When measuring the spread of a distribution, there are four measures you can use;

  1. Range (R) = Highest score – Lowest score
  2. Interquartile range (IQR) = Q3 – Q1  
  3. Standard deviation (s or σ) = spread around the mean  
  4. Variance (s² or σ²) = spread around the mean

There is also the 5 point summary that gives a clear overview of any raw data. Like the name says, this consists of calculating five numbers.

  • The minimum = Lowest (non-outlying) score
  • Q1 = 25th percentile (25% lower, 75% higher)
  • The median (=Q2) = 50th percentile
  • Q3 = 75th percentile
  • The maximum = Highest (non-outlying) score

With these five points, you can also draw up a boxplot, which displays the five-number summary graphically.
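
A minimal sketch of the 5 point summary in Python follows. Two things here are assumptions rather than lecture content: quartiles are computed with the "inclusive" method of the standard library (other textbooks and programs use slightly different rules), and a score counts as outlying when it lies more than 1.5 × IQR below Q1 or above Q3. The scores are made up.

```python
from statistics import quantiles

def five_number_summary(scores):
    data = sorted(scores)
    q1, median, q3 = quantiles(data, n=4, method="inclusive")
    iqr = q3 - q1
    # Assumption: the 1.5 * IQR rule decides which scores are outlying
    low_fence = q1 - 1.5 * iqr
    high_fence = q3 + 1.5 * iqr
    inside = [y for y in data if low_fence <= y <= high_fence]
    return {
        "min": inside[0],    # lowest non-outlying score
        "q1": q1,
        "median": median,
        "q3": q3,
        "max": inside[-1],   # highest non-outlying score
        "iqr": iqr,
        "range": data[-1] - data[0],
        "outliers": [y for y in data if y < low_fence or y > high_fence],
    }

# Hypothetical scores with one extreme value
summary = five_number_summary([1, 2, 3, 4, 5, 6, 7, 8, 30])
print(summary)
```

In this example the score of 30 is flagged as an outlier, so the maximum of the summary is 8 while the range still uses all scores.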

 

Week 5

Lecture 5 10/10/18

Recap: How can we portray and describe a distribution?

There are several ways to show and describe a distribution. The best way to show a clear overview is through a graphical display. This can be done with a table, a histogram or a line graph. When describing a distribution, take note of patterns and significant deviations, as well as numerical descriptions such as range and spread.

 

Density curves and the normal distribution

Density curves are often described as "an ideal approximation of empirical data". This means that the curve isn't always a hundred percent match to the data; it is the smooth curve that comes closest to it. A density curve can be positively skewed, negatively skewed or symmetrical. A positive skew means the top of the curve is more to the left of the distribution (with a long tail to the right), a negative skew means it is more to the right.

The normal distribution has a top that is right in the middle, and is perfectly symmetrical on both sides. It is often described as 'bell-shaped'. The symbol $\mu$ stands for the mean of the distribution and $\sigma$ for the standard deviation. Distributions with a larger standard deviation are low and wide, while distributions with a smaller standard deviation are high and narrow. Just like any other density curve, the normal distribution is an approximation and will not always fit the data 100%.

For the normal distribution there are three rules of thumb. About 68% of the curve lies between -1 and +1 standard deviation from the mean, about 95% between -2 and +2, and about 99.7% between -3 and +3. This is often referred to as the 68-95-99.7% rule.
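
The three percentages can be checked against the standard normal distribution. This is a small sketch using Python's standard library; the function name is made up.

```python
from statistics import NormalDist

standard_normal = NormalDist(mu=0, sigma=1)

def proportion_within(k):
    """Proportion of a normal distribution that lies within k standard deviations of the mean."""
    return standard_normal.cdf(k) - standard_normal.cdf(-k)

for k in (1, 2, 3):
    print(k, round(proportion_within(k), 4))
```

The exact values are slightly different from the rounded rule (for instance about 95.45% within two standard deviations), which is why the rule is only a rule of thumb.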

 

Standardizing with z-scores

To find the statistics or probabilities of a distribution, it is often easier to standardize the distribution. Standardizing has no effect on the shape of a distribution, but it does allow you to judge a distribution without having to take context into account.

Standardizing is done by using the mean and standard deviation to transform the scores into z-scores. This is done with a simple formula, wherein X is the score you want to transform:

$$z = \frac{X - \mu}{\sigma}$$

When you have the z-score, you can look up the corresponding proportion in a conversion table. Take into account that these proportions are cumulative: they cover the area from the far left of the distribution up to your score. So if you want to know which proportion lies above your score, you have to calculate 1 minus the proportion from the table. Such a table can be found at, for instance: https://en.wikipedia.org/wiki/Standard_normal_table

 

Practise

On the math aptitude test for the second grade, girls have an average score of 77 and boys of 73. The standard deviation is 16. Which percentage of all girls has a score that is equal to or higher than the mean score of the boys?

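How to go about this: First, draw a picture and define what you are looking for. In this case, the proportion of girls with a score of at least the boys' mean, or rather, P(Xg ≥ 73). The z-score of 73 in the girls' distribution is z = (73 - 77) / 16 = -0.25.

Looking up z = -0.25 in the table gives a cumulative proportion (F) of 0.4013. This is the proportion of girls who score below 73. So 1 - 0.4013 = 0.5987, which means that about 60% of the girls have a score equal to or higher than the mean score of the boys.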
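
The same practice question can be checked without a table. The sketch below uses Python's standard library and, like the table, assumes that the girls' scores are normally distributed.

```python
from statistics import NormalDist

girls_mean, boys_mean, sd = 77, 73, 16

# z-score of the boys' mean within the girls' distribution
z = (boys_mean - girls_mean) / sd

# Cumulative proportion to the left of z (the value a z-table gives)
below = NormalDist().cdf(z)

# Proportion of girls scoring at or above the boys' mean
at_or_above = 1 - below

print(z, round(below, 4), round(at_or_above, 4))
```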

 

Week 6 

Lecture 6 17/10/18

Relationships between variables

There are two common relationships between variables: association and dependence. Association is also known as interdependence. This is when both variables have the same role in a scenario or phenomenon and often work alongside each other. Occasionally, there is also a third (dependent) variable involved. Dependence is often related to causality. There is the independent or explanatory variable X and the dependent or response variable Y.

 

Scatterplots

Scatterplots are handy for graphically showing the relationship between variables. When assessing a scatterplot, there are a few points that can help you judge the correlation of the variables:

  • Direction. It can either be high-high or high-low. High-high means that high scores on the X-axis go with high scores on the Y-axis. The other way around, high-low means high scores on the X-axis go with low scores on the Y-axis.
  • Strength. The stronger the correlation, the closer all the points will be around a straight line. If the points are very spread out, the correlation is weaker.
  • Shape. Scatterplots can be linear or non-linear.
  • Deviations. Outliers can heavily influence the correlation if they are located far from the others.

If you want to practice guessing and judging correlations, try the game http://guessthecorrelation.com/

 

Covariance

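Covariance is the variability of two variables together. Because it involves two different variables, it does not use the sum of squares but the sum of cross products: for each observation, the deviation of x from its mean is multiplied by the deviation of y from its mean. The formula of covariance is:

$$s_{xy} = \frac{\sum (x_i - \bar{x})(y_i - \bar{y})}{N - 1}$$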

Pearson Product-Moment Correlation

Pearson Product-Moment Correlation, also called Pearson's r, is a measure of the linear relationship between two variables. The value of Pearson's r always lies somewhere between -1.00 and +1.00. Pearson's r is a standardised measure and thus does not depend on the units in which the variables are measured. The formula for Pearson's r is:

$$r_{xy} = \frac{\sum z_x z_y}{n - 1} \qquad \text{or} \qquad r_{xy} = \frac{s_{xy}}{s_x s_y}$$

Where $z_x z_y$ is the product of the z-scores of x and y for each observation, $s_{xy}$ stands for the covariance of x and y, and $s_x$ and $s_y$ are the standard deviations of x and y.

The rule of thumb for interpreting r is that (in absolute value) 0.1 is a small, 0.3 a medium and 0.5 or up a large correlation.
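
To see that both formulas for Pearson's r give the same answer, here is a short Python sketch. The function names and the five pairs of scores are invented for illustration.

```python
from statistics import mean, stdev

def covariance(x, y):
    """Sum of cross products of the deviations, divided by N - 1."""
    mean_x, mean_y = mean(x), mean(y)
    cross_products = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    return cross_products / (len(x) - 1)

def pearson_r_from_covariance(x, y):
    """r = covariance divided by the product of both standard deviations."""
    return covariance(x, y) / (stdev(x) * stdev(y))

def pearson_r_from_z_scores(x, y):
    """r = sum of the products of the z-scores, divided by n - 1."""
    mean_x, mean_y = mean(x), mean(y)
    sd_x, sd_y = stdev(x), stdev(y)
    z_x = [(xi - mean_x) / sd_x for xi in x]
    z_y = [(yi - mean_y) / sd_y for yi in y]
    return sum(a * b for a, b in zip(z_x, z_y)) / (len(x) - 1)

# Hypothetical scores on two variables for five participants
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

print(covariance(x, y))
print(round(pearson_r_from_covariance(x, y), 4))
print(round(pearson_r_from_z_scores(x, y), 4))
```

Both routes standardise the covariance: the first divides the covariance by the two standard deviations at the end, the second standardises each score before multiplying.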

 

Week 7

Lecture 7 - 24/10/18

Relationships in variables

There are three ways to explain a relationship between two variables, and how they interact with each other. These three are;

  • Causality - Variable X causes change in variable Y
  • Common response - Variable Z influences both variable X and variable Y
  • Confounding - Variable X and variable Z both influence variable Y

A lurking variable is a variable that, just like confounding's Z, has an influence on your test, but you are not aware of it or do not have a way to measure it. For instance, if you are studying the amount of money spent on clothes and measure ads and spending habits, but forget about the social pressure among teenagers, that would be a lurking variable.

 

Determining causality

There are three things important in determining causality;

  • Covariance - do the variables actually vary together (correlate)?
  • Directionality - does the cause precede the effect over time?
  • Internal validity - eliminating all other possible explanations

 

Internal validity

Proper internal validity means that you draw correct conclusions from your study because you have eliminated confounding variables. Your internal validity can be raised by performing experimental control checks. To eliminate your confounds, three things are very important: environment, instructions and invasiveness.

Your environment can always influence your participant. In some studies, a lab setting can unsettle or sensitize the subject, while in others, an uncontrolled, non-lab environment can have confounding influences. Make sure that your instructions are so clear that there cannot be any mistakes in reading or understanding them, which would confound your study. Lastly, your presence and invasiveness in the study can also be confounding and influencing. By controlling these three things, you can eliminate as many of your confounding variables as possible.

However, there are always independent variables that you cannot control, such as gender or age or upbringing. Keep in mind that these can have an influence as well.

An experiment needs two kinds of conditions for proper internal validity. You need an experimental condition, in which treatment takes place. This can be quantitative - many times - or qualitative - highly condensed and focussed. Next, you also need a control condition. This means there is either no treatment or a dummy treatment, for instance a placebo.

 

Assignment and threats on internal validity

There are three ways of randomly assigning people to conditions:

  1. Simple random assignment - drawing names, for instance
  2. Matched random assignment - matching similar people to each other and then randomly assigning the members of each match to different conditions
  3. Blocking - take groups of equal size of similar participants and assign them randomly to conditions
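
As a rough sketch of how random assignment could be carried out in practice, the Python example below shows simple random assignment and assignment within groups of similar participants. The function names and participant labels are hypothetical, and the second function assumes that every group contains exactly one participant per condition, which is only one possible way to set this up.

```python
import random

def simple_random_assignment(participants, conditions, rng):
    """Shuffle the participants, then deal them out over the conditions in turn."""
    shuffled = list(participants)
    rng.shuffle(shuffled)
    return {person: conditions[i % len(conditions)] for i, person in enumerate(shuffled)}

def assign_within_blocks(blocks, conditions, rng):
    """Within each block of similar participants, give every condition to one random member.

    Assumption: each block has exactly as many participants as there are conditions.
    """
    assignment = {}
    for block in blocks:
        order = list(conditions)
        rng.shuffle(order)
        for person, condition in zip(block, order):
            assignment[person] = condition
    return assignment

rng = random.Random(2018)  # fixed seed so the example can be repeated
conditions = ["treatment", "control"]

people = ["p1", "p2", "p3", "p4", "p5", "p6"]
print(simple_random_assignment(people, conditions, rng))

# Hypothetical blocks: pairs of participants with similar pretest scores
blocks = [["a1", "a2"], ["b1", "b2"], ["c1", "c2"]]
print(assign_within_blocks(blocks, conditions, rng))
```

Dealing out the shuffled participants in turn keeps the conditions equal in size, while assigning within blocks also keeps the conditions similar on the variable that was used to form the blocks.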

When working on internal validity, there are several situations in which something happens with the participants that is outside of your control and yet still influences your study. A couple of these occurrences are:

  • Selection = biased assignment of participants, based on the judgement of the examiner
  • Attrition = loss of participants, due to e.g. moving or illness
  • Testing = effect of pretest on Y
  • (Pretest) Sensitization = interaction pretest x treatment, because of the pretest, participants become more aware of the treatment
  • Maturation = changes within participants, such as growing up or practise
  • History = an event happening during the study that changes something with the participant

Expectancy, of both the examiner and the participant, can also majorly influence your results and thus also your internal validity. As an examiner, beware of the examiner-expectancy effect. This means that you, as observer, will be tempted to only see and take note of the observations that help your hypothesis and tend to forget about the rest.

For the participant, there are usually two major expectancy effects, the demand characteristics and the placebo effect. The placebo effect means that the participant thinks they've received, for instance, a medicine while they actually have not, and yet still seems to experience the effects of said medicine. The demand characteristics are in play when the participant thinks something is expected of them, and as a result slightly changes their behaviour to match that demand.

 

Week 8

Lecture 8 - 31/10/18

Different designs

When creating a study, there are a lot of designs you can pick from and need to take into consideration. Here is an overview of some of the major designs.

  • One way design - A simple study with a single independent variable
  • Factorial - A study with two or more independent variables
  • Posttest only - a treatment with a test that comes after the manipulation
  • Pretest-Posttest - a test before the treatment and a test after
  • Between-subject - several groups that perform the task under different conditions
  • Within-subject - multiple tests with the same group, spread out over time
  • Experimental conditions - treatment or manipulation
  • Control conditions - no treatment or intervention

Symbols

When describing a design, it is easier to use symbols to describe it. It could, for instance, look like this: 

NR    III     O0.....X.....O1

In this case, NR = No Random assignment. Its opposite would be R = Random assignment. The III is the group number in Roman numerals and X is the treatment. O0 is the pretest and O1 is the posttest.

 

Solomon 4 group design

The Solomon 4 group design allows you to calculate the influences of the treatment (X), the Testing effect (T), Sensitization (S) and History/Maturation (H/M).

R   I     O0 (6) --- X --- O1 (23)

R   II    O0 (7) --------- O1 (15)

R   III              X --- O1 (19)

R   IV                     O1 (11)

 

How to draw conclusions from this:

  • Posttest - pretest group 1: O1 (I) – O0 (I) = 23 – 6 = 17 → HM + T + X + S (1)
  • Posttest - pretest group 2: O1 (II) – O0 (II) = 15 – 7 = 8 → HM + T (2)
  • So the effect of X could be: (1) – (2) = 17 – 8 = 9 → X + S (3)
  • Rule out Sensitization: O1 (III) – O1 (IV) = 19 – 11 = 8 → X (4)
  • So the value of S is: (3) – (4) = 9 – 8 = 1 → S (5)
  • Value of Testing: O1 (II) – O1 (IV) = 15 – 11 = 4 → T (6)
  • So value of HM: (2) – (6) = 8 – 4 = 4 → HM (7)
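
The seven steps above can be written out as a small Python function. The function and argument names are made up for this sketch; the numbers are the ones from the design shown earlier.

```python
def solomon_effects(pre_1, post_1, pre_2, post_2, post_3, post_4):
    """Separate treatment, sensitization, testing and history/maturation effects."""
    change_1 = post_1 - pre_1        # (1) HM + T + X + S
    change_2 = post_2 - pre_2        # (2) HM + T
    x_plus_s = change_1 - change_2   # (3) X + S
    x = post_3 - post_4              # (4) X, without any pretest
    s = x_plus_s - x                 # (5) S
    t = post_2 - post_4              # (6) T
    hm = change_2 - t                # (7) HM
    return {"X": x, "S": s, "T": t, "HM": hm}

effects = solomon_effects(pre_1=6, post_1=23, pre_2=7, post_2=15, post_3=19, post_4=11)
print(effects)
```

As a check, the four effects add up to the full pretest-posttest change in group I: 8 + 1 + 4 + 4 = 17.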

Exam tips

  • Week 5: Always draw a picture of your normal distribution, it gives you a better idea what you are looking for. 
  • Week 7: Keep in mind that the Z variable in a confounding relationship cannot be measured. Living expenses would thus not be a confounding variable but social pressure could be.

Follow the author: Emy