Work Group Exercises Introduction to Research Methods and Statistics - Psychology Bachelor 1, University of Leiden 2018/19
In terms of statistics, psychology is a scientific discipline in which behaviour and mental processes are measured and observed using scientific methods. Three things are important in this process:
It is a scientist's job to discover and document new patterns, phenomena or correlations, but also to develop and evaluate explanations for these phenomena. This can be done in four ways:
The empirical cycle consists of the stages that a study goes through when it is developed, tested and evaluated. There are five stages that lead in a circle: Observation > Induction > Deduction > Testing > Evaluation > Observation again.
The observation stage is often also described as the freedom of design. In this stage, the idea for a study or research question occurs. This idea can come from anywhere, from looking out of your window to reading a newspaper at the breakfast table.
Induction means that you begin to formulate your idea into a general theory. You come up with a set of rules or guidelines that tries to explain the connection between two or more concepts.
Deduction means that you form a very specific research question, often phrased as a hypothesis, which is a prediction based on your theory. You also decide how you want to test this hypothesis. This is done with two definitions: the conceptual definition (what is it that you are testing?) and the operational definition (how do you measure your outcomes?).
Moving into the testing phase means that you carry out your experiment and collect the results. After this, you analyse the data and draw a conclusion from your results.
Evaluation means that you take the conclusions drawn in the testing phase and look back on the theory. What do my results say about my hypothesis? About my specific research question? About my general theory? You should hold your study to critical review and where necessary expand, improve or adjust before you start the cycle all over again.
A theory can never really be proven or disproven, it can only be confirmed or falsified. Successful data supports a theory rather than proving it, simply because our current understanding of science and the world is ever evolving, ever changing.
In studies, there are always one or more variables. The extent to which such a variable changes or differs is called variability. Variability is very important in psychology because it describes and explains differences between people.
Variance is a measure of variability: it shows how much variability there is within a study. The symbol for variance is S², and it is calculated by dividing the sum of squares by the number of observations minus one. As a formula:

S² = ∑ (yij - ȳ)² / (N - 1)

In which: ȳ = the grand mean, yij = the score of individual i in group j, ∑ means summation, N = the number of observations, and ∑ (yij - ȳ)² = the sum of squares, also labelled SS(total).
Within an experiment, there is always a certain amount of error that is outside the examiner's control, such as, for instance, a genetic predisposition for losing weight or naturally higher anxiety levels. Therefore, the total variance is built up of systematic variance (real variance) and error variance (the mistakes).
Systematic variance and error variance can be calculated by the formulas:
Systematic S² = ∑ nj (ȳj - ȳ)² / (N - 1), where ȳj = the mean of group j and nj = the number of participants in group j
Error S² = ∑ (yij - ȳj)² / (N - 1)
The sums of squares in these two formulas are called SS(between) and SS(within) respectively.
The proportion of systematic variance within the total variance is also called Variance Accounted For (VAF). It is calculated by dividing the systematic variance by the total.
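The variance decomposition above can be checked with a short Python sketch. The two groups and their scores below are made-up numbers, purely for illustration:

```python
# A quick check of the variance decomposition: total = systematic + error.
# Group names and scores are hypothetical.
groups = {
    "treatment": [6, 7, 8, 9],
    "control": [2, 3, 4, 5],
}

scores = [y for g in groups.values() for y in g]
N = len(scores)
grand_mean = sum(scores) / N

# Total variance: SS(total) / (N - 1)
s2_total = sum((y - grand_mean) ** 2 for y in scores) / (N - 1)

# Systematic variance: SS(between) / (N - 1), each group mean weighted by group size
s2_systematic = sum(
    len(g) * (sum(g) / len(g) - grand_mean) ** 2 for g in groups.values()
) / (N - 1)

# Error variance: SS(within) / (N - 1), deviations from each group's own mean
s2_error = sum(
    (y - sum(g) / len(g)) ** 2 for g in groups.values() for y in g
) / (N - 1)

vaf = s2_systematic / s2_total  # Variance Accounted For
print(s2_total, s2_systematic, s2_error, vaf)
```

A handy sanity check: systematic and error variance add up exactly to the total variance, and VAF is the systematic share of that total.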
Observation in psychology means that you place the things you see, hear or smell into several categories for later evaluation, and measuring means that you assign a value to these categories. This can be in words (good, bad, heavy, light, big, small) or in numbers (46%, 00.1, 5 years).
A variable is a collection of these values, and it is important in a study that all your measurements and observations comply with two ground rules: exhaustiveness and mutual exclusivity.
An observation is exhaustive when there is a value for each and every measurement. This means that it must be possible, for instance in a questionnaire, for every person to always give an answer. This can be done by, for example, adding an 'other' category to multiple-choice questions, so that you include everyone.
Mutual exclusivity means that it must be impossible to give more than one answer: there must not be any overlap between answers. For instance, the categories 'more than once', 'more than five times' and 'more than ten times' are not mutually exclusive, because six is more than once but also more than five. The categories '0', '1-5' and '6-10', however, are, because every score fits only one option.
The way you measure an observation can also be placed into categories, called the scales of measurement. There are five levels in total.
When doing an observation, there are always a few factors you must take into account.
What is the research setting? A lab might influence the subject's behaviour, or what you are measuring might be affected by the environment. How are you going to control it?
Is the subject aware? There are two types of observation: direct and disguised. Direct observation, also called undisguised observation, means that the subject knows you are there. This might influence the subject's behaviour, but can sometimes be necessary. Disguised observation means that the subject(s) do not know you are there; for instance, sitting on a park bench and studying the behaviour of people with dogs.
When doing an experiment or observation, two factors are very important when it comes to measuring your variable: reliability and validity. Reliability means that you measure your variable consistently, with the least amount of error, and validity means that you actually measure what you intend to measure.
Reliability can be compared to the Variance Accounted For (Week 1). When measuring a variable, the measured score always consists of two parts: the true score and the measurement error. This measurement error can consist of a few things:
There are several ways to make sure that your experiment or study is reliable, such as:
Validity means that you make sure you actually measure what you want to measure. There are four types of validity:
When doing descriptive research, there are different ways to go about it. You can do a survey, measuring attitudes, problems or thought patterns, or you can do a demographic study, noting patterns in life such as births, marriages or voting. You can also put together an epidemiological study, meaning you catalogue things such as disease and death. When doing a survey, there are three ways to do it:
The entire overview of your data, hereafter called the distribution, can tell you a lot about patterns in society, and there are several things you can take into account when analysing it:
When dealing with a lot of raw data, it is sometimes easier to get an overview if you take the frequency into consideration. There are two types of frequency: absolute and relative. Absolute frequency means the exact number of participants that had a certain score. The downside of this is that it is hard to interpret. Easier is using the relative frequency, meaning that you convert it into proportions or percentages, using P for proportion. This way, you can clearly see how big a group you are looking at compared to the rest.
When you have only a small number of participants in each category, or when you are dealing with variables with many categories, you can decide to make a grouped frequency table. This means you distribute the raw data over K class intervals and make a new frequency distribution. There are two guidelines for this. One, the number of new classes K = √n, and two, the class interval width I = range / number of classes (the range being the highest score minus the lowest score).
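The two guidelines can be worked out in a few lines of Python; here they are applied to a hypothetical data set of n = 50 scores ranging from 1 to 28 (these are guidelines, not hard rules, so a coarser grouping such as 4 classes of width 7 is also acceptable):

```python
import math

# Hypothetical data set: n = 50 scores, lowest score 1, highest score 28.
n = 50
lowest, highest = 1, 28

# Guideline 1: number of classes K = sqrt(n), rounded to a whole number.
K = round(math.sqrt(n))            # sqrt(50) ≈ 7.07 → 7

# Guideline 2: class interval width I = range / K, rounded up so all scores fit.
score_range = highest - lowest     # 28 - 1 = 27
I = math.ceil(score_range / K)     # ceil(27 / 7) = 4

print(K, I)
```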
When making a cumulative frequency distribution, you add up the proportions until they eventually reach 1, as in this table:
Class interval | Frequency (f) | Proportion (P) | Cum. Frequency (F) |
1-7 | 2 | 0.04 | 0.04 |
8-14 | 16 | 0.32 | 0.36 |
15-21 | 24 | 0.48 | 0.84 |
22-28 | 8 | 0.16 | 1.00 |
Total | 50 | 1.00 | |
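The proportions and cumulative frequencies in the table can be recomputed from the absolute frequencies alone, which is a useful check when building such a table yourself:

```python
# Recompute proportions (P) and cumulative frequencies (F) from the
# absolute frequencies in the table above.
freqs = [("1-7", 2), ("8-14", 16), ("15-21", 24), ("22-28", 8)]
n = sum(f for _, f in freqs)  # 50

rows = []
cum = 0.0
for interval, f in freqs:
    p = f / n        # relative frequency (proportion)
    cum += p         # cumulative proportion
    rows.append((interval, f, round(p, 2), round(cum, 2)))

for row in rows:
    print(row)       # the last cumulative proportion is always 1.00
```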
When measuring the spread of a distribution, there are four measures you can use:
There is also the five-point summary, which gives a clear overview of any raw data. As the name says, it consists of calculating five numbers.
With these five points, you can also draw up a good boxplot, which is the most accurate graphical display for this kind of data.
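A five-number summary (minimum, first quartile, median, third quartile, maximum) can be computed with Python's standard library. Note that textbooks differ in their quartile conventions; the values below follow one common convention (the "exclusive" method), and the scores are made-up numbers:

```python
import statistics

data = [2, 4, 4, 5, 7, 9, 11, 12, 15]  # hypothetical raw scores

# statistics.quantiles with n=4 returns the three quartile cut points
# (method="exclusive" is the default; other conventions give slightly
# different Q1/Q3 values).
q1, median, q3 = statistics.quantiles(data, n=4)
summary = (min(data), q1, median, q3, max(data))
print(summary)
```

These five numbers are exactly what you need to draw the box (Q1 to Q3, with the median inside) and whiskers (out to the minimum and maximum) of a boxplot.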
There are several ways to show and describe a distribution. The best way to give a clear overview is through a graphical display. This can be done through a table, a histogram or a line graph. When describing a distribution, take note of patterns and significant deviations, as well as numerical descriptions such as range and spread.
Density curves are often described as "an ideal approximation of empirical data". This means that the curve is not always a hundred percent match to the data; it is the closest match to a smooth, idealised curve. A density curve can be positively skewed, negatively skewed, or symmetric. A positive skew means the top of the curve lies more to the left of the distribution (with a long tail to the right); a negative skew means the top lies more to the right (with a long tail to the left).
The normal distribution has its top right in the middle and is perfectly symmetrical on both sides. It is often described as 'bell-shaped'. The symbol μ stands for the mean of the distribution and σ for the standard deviation. Distributions with a larger standard deviation are low and wide, while distributions with a smaller standard deviation are high and narrow. Just like any other density curve, the normal distribution is an approximation and will not always fit the data 100%.
For the normal distribution there are three rules of thumb: between -1 and +1 standard deviations lies 68% of the curve, between -2 and +2 lies 95%, and between -3 and +3 lies 99.7%. This is often referred to as the 68-95-99.7% rule.
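The 68-95-99.7% rule can be verified numerically: for a normal distribution, the probability of falling within k standard deviations of the mean equals erf(k / √2), which Python's math module provides:

```python
import math

# Probability of landing within k standard deviations of the mean,
# for k = 1, 2, 3, expressed as a percentage rounded to one decimal.
percentages = [round(math.erf(k / math.sqrt(2)) * 100, 1) for k in (1, 2, 3)]
print(percentages)  # → [68.3, 95.4, 99.7]
```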
To find the statistics or probabilities of a distribution, it is often easier to standardize the distribution. Standardizing has no effect on the shape of a distribution, but it does allow you to judge a distribution without having to take its context into account.
Standardizing is done by using the mean and standard deviation to transform the scores into z-scores. This is done with a simple formula, in which X is the exact score you want to transform: z = (X - μ) / σ.
When you have the z-score, you can look up the exact proportion in a conversion table. Take into account that these proportions run from the far left of the distribution up to your score. So if you want to know what percentage lies above your score, you have to convert it by calculating 1 minus the table proportion. Such a conversion table can be found at, for instance: https://en.wikipedia.org/wiki/Standard_normal_table
On the math aptitude test for the second grade, girls have an average score of 77 and boys of 73. The standard deviation is 16. Which percentage of all girls has a score that is equal to or higher than the mean score of the boys?
How to go about this: first, draw a picture and define what you are looking for. In this case, P(Xgirl ≥ 73), since 73 is the boys' mean score. The z-score would be z = (73 - 77) / 16 = -0.25.
Looking up this proportion (F) in the table gives 0.4013. That is the proportion of girls scoring below 73, so 1 - 0.4013 = 0.5987: about 60% of the girls have a score equal to or higher than the boys' mean.
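Instead of a lookup table, the same answer can be computed directly. The sketch below uses the standard normal cumulative distribution function written via math.erf, applied to the numbers from the example above:

```python
import math

def normal_cdf(x, mu=0.0, sigma=1.0):
    """Proportion of a normal distribution at or below x."""
    z = (x - mu) / sigma
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))

# Girls' scores: mean 77, standard deviation 16.
# What proportion of girls scores at or above the boys' mean of 73?
p_below = normal_cdf(73, mu=77, sigma=16)  # table value for z = -0.25
p_at_or_above = 1 - p_below
print(round(p_below, 4), round(p_at_or_above, 4))  # → 0.4013 0.5987
```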
There are two common relationships between variables: association and dependence. Association is also known as interdependence. This is when both variables play the same role in a scenario or phenomenon and often work alongside each other. Occasionally, there is also a third variable involved. Dependence is often related to causality: there is the independent or explanatory variable X and the dependent (response) variable Y.
Scatterplots are handy for graphically showing the relationship between variables. When assessing a scatterplot, there are a few points that can help you judge the correlation of the variables:
If you want to practice guessing and judging correlations, try the game http://guessthecorrelation.com/
Covariance is the variability of two variables together. It does not use the sum of squares but the sum of cross products, because two different variables are involved. The formula for covariance is:
Sxy = ∑ (xi - x̄)(yi - ȳ) / (N - 1)
The Pearson product-moment correlation, also called Pearson's r, is a measure of the relationship between two variables. Pearson's r always lies somewhere between -1.00 and +1.00. It is a standardised measure and thus has no unit that can influence it. The formula for Pearson's r is:
rxy = ∑ zxzy / (n - 1) or rxy = sxy / sxsy
Where zxzy is the product of the two z-scores (summed over all participants), and sxy stands for the covariance of x and y.
The rule of thumb for interpreting r is that 0.1 is small, 0.3 is medium and 0.5 or up is a large correlation.
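The two formulas for Pearson's r give the same result, which a short sketch can demonstrate. The paired scores below are made-up numbers for illustration:

```python
import math

# Hypothetical paired scores for two variables x and y.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
n = len(x)

mx, my = sum(x) / n, sum(y) / n
sx = math.sqrt(sum((v - mx) ** 2 for v in x) / (n - 1))  # SD of x
sy = math.sqrt(sum((v - my) ** 2 for v in y) / (n - 1))  # SD of y

# Covariance: sum of cross products divided by n - 1
sxy = sum((a - mx) * (b - my) for a, b in zip(x, y)) / (n - 1)

# Formula 1: covariance divided by the product of the standard deviations
r_cov = sxy / (sx * sy)

# Formula 2: sum of the products of z-scores, divided by n - 1
r_z = sum(((a - mx) / sx) * ((b - my) / sy) for a, b in zip(x, y)) / (n - 1)

print(round(r_cov, 3), round(r_z, 3))
```

By the rule of thumb above, the r found here would count as a large correlation.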
There are three ways to explain a relationship between two variables and how they interact with each other. These three are:
A lurking variable is a variable that, just like a confounding variable Z, has an influence on your test, but that you are not aware of or have no way to measure. For instance, if you are measuring the amount of money teenagers spend on clothes and you measure ads and spending habits but forget about social pressure, that social pressure would be a lurking variable.
Three things are important in determining causality:
Proper internal validity means that you draw correct conclusions during your study and that you eliminate confounding variables. Your internal validity can be raised by performing experimental control checks. To eliminate your confounds, three things are very important: environment, instructions and invasiveness.
Your environment can always influence your participant. In some studies, a lab setting can unsettle or sensitize the subject, while in others, an uncontrolled, non-lab environment can have confounding influences. Make sure your instructions are so clear that there cannot be any mistakes in reading or understanding them, which would confound your study. Lastly, your presence and invasiveness in the study can also be confounding. By controlling these three things, you can eliminate as many of your confounding variables as possible.
However, there are always independent variables that you cannot control, such as gender or age or upbringing. Keep in mind that these can have an influence as well.
Experimenting requires two levels for proper internal validity. You need an experimental condition, in which the treatment takes place. This can be quantitative (for instance, administered many times) or qualitative (highly condensed and focussed). Next, you also need a control condition. This means there is either no treatment or a dummy treatment, for instance a placebo.
There are three ways of randomly assigning levels to people:
When working on internal validity, there are several situations in which something happens with the participants that is outside of your control and yet still influences your study. A couple of these occurrences are:
Expectancy, of both the examiner and the participant, can also majorly influence your results and thus your internal validity. As an examiner, beware of the examiner-expectancy effect. This means that you, as the observer, will be tempted to only see and take note of the observations that support your hypothesis and tend to forget about the rest.
For the participant, there are usually two major expectancy effects: demand characteristics and the placebo effect. The placebo effect means that participants think they have received, for instance, a medicine while they actually have not, and yet still seem to experience the effects of said medicine. Demand characteristics are in play when participants think something is expected of them and, as a result, slightly change their behaviour to match that demand.
Lecture 8 - 31/10/18
When creating a study, there are a lot of designs you can pick from and need to take into consideration. Here is an overview of some of the major designs.
When describing a design, it is easier to use symbols to describe it. It could, for instance, look like this:
NR III O0.....X.....O1
In this case, NR = no random assignment; its opposite is R = random assignment. The III is the group number in Roman numerals and X is the treatment. O0 is the pretest and O1 is the posttest.
The Solomon 4 group design allows you to calculate the influences of the treatment (X), the Testing effect (T), Sensitization (S) and History/Maturation (H/M).
R I O0 (6) --- X --- O1 (23)
R II O0 (7) ---------- O1 (15)
R III X --- O1 (19)
R IV O1 (11)
How to draw conclusions from this (the bracketed numbers are example scores): group IV gets only a posttest, so compared with the average pretest of groups I and II (6.5), its score reflects history/maturation: H/M = 11 - 6.5 = 4.5. Group III adds only the treatment, so X = 19 - 11 = 8. Group II adds only pretesting, so T = (15 - 7) - H/M = 3.5. Group I contains all effects, so sensitization S = (23 - 6) - X - T - H/M = 1.