Research Methods and Statistics: Summaries, Study Notes & Practice Exams - UvA
Psychology is based on research carried out by psychologists. Psychologists can be seen as scientists and therefore as empiricists. Empiricists base their conclusions on systematic observations. Psychologists base their ideas about behavior on studies they have carried out with animals or people, either in their natural environment or in an environment created specially for the research. Anyone who wants to think like a psychologist must think like a researcher.
Who are the producers and consumers in research?
Psychology students who are interested in conducting research, administering questionnaires, or studying animals, the brain, or other psychological topics are called producers of research information. These students will probably publish articles and work as research scientists or professors. There are, of course, psychology students who do not want to work in a laboratory but who do like to read about research with animals and people. These students are seen as consumers of research: they read about research and can apply what they have read in their professional field, in their hobbies, or with friends and family. These students may become therapists, study advisers or teachers. In practice, psychologists often take on both roles: they are both producers and consumers of research.
For the courses you will take during your psychology studies, it is important to know how to be a researcher, even if you do not plan to start a PhD after your studies. You will of course have to write a thesis to graduate, and your thesis will have to meet the APA standards. The APA standards are mainly about how you should cite references in your text. For example, in the text you must refer to the authors and the year of publication. In your reference list you must note the name(s) of the author(s), followed by the year in which the article was published, the title of the article, the name of the journal, the volume, and finally the page numbers. According to the APA standards you must also use the font Times New Roman, size 11 or 12, with a line spacing of 2.0. You will also follow a number of courses in which doing research is important. It is, for example, important to know how you can randomly assign participants to conditions and how to read graphs.
However, most psychology students do not become researchers. It is therefore important to be a good consumer of research: you will have to read research, understand it, learn from it, and ask good questions about it. Most information that a psychologist looks up on the internet is based on research. Many newspapers carry headlines that mention scientific studies, and nowadays there are also many magazines that summarize the results of studies. However, only some of these studies are accurate and useful; a large proportion were not carried out carefully. It is important to know how to distinguish the good studies from the bad ones, and knowledge of research methods helps with this. Therapists also need to be able to interpret published studies properly, so that they stay informed about new and effective types of therapy. For therapists it is important to follow so-called evidence-based treatments: therapies that are supported by research. If you are able to find, read and understand scientific articles, you can separate the wheat from the chaff.
How do scientists approach their work?
What do scientists do? Scientists are empiricists: they observe the world systematically. Scientists test their theories with studies and adapt their theories to the data they find. Scientists tackle both applied problems (problems from daily life) and basic problems (intended to contribute to general knowledge). Scientists also keep investigating: as soon as a scientist has found an effect, he or she wants to do follow-up research to find out why, when and for whom the effect occurs. In addition, scientists make their findings known in the scientific world and the media.
How do empiricists approach their work?
Empiricists do not base their conclusions on intuition or on casual experience alone. Empiricism means that evidence from the senses, or from instruments that assist the senses (questionnaires, thermometers or photographs), is used to draw conclusions. Empiricists want to be systematic, and they also want their work to be independently verifiable by other scientists and observers.
What is the theory-data cycle?
The theory-data cycle means that scientists collect data to test, change or update their theories. This can be clarified with an example from attachment research. When babies can crawl, they follow their mothers around. Baby monkeys also cling to their mother's fur. Psychologists wanted to know why young animals are so attached to their caretakers. One theory is the so-called cupboard theory: mothers are important to baby animals because they are a source of food. The babies receive food from their mother and experience a pleasant feeling, and over time the sight of the mother alone is enough to make a baby happy. An alternative theory states that baby animals cling to their mother because she offers them comfort. This is called the contact comfort theory. Harlow tested both theories in a lab. He built two artificial mother monkeys: one was made only of wire and held a bottle of milk (so this mother gave food but no comfort), while the other was covered with a warm cloth and gave comfort but no food. Harlow placed baby monkeys in cages with the artificial mothers and recorded how much time they spent with each. His research showed that the baby monkeys spent much more time with the warm cloth mother than with the mother that gave food. This suggests that the contact comfort theory is the right one.
What are theories, hypotheses and data?
A theory contains assertions about the relationship between variables. Theories lead to specific hypotheses . A hypothesis can be seen as a prediction. It says something about what the scientists expect to observe, if their theory is correct. A single theory can have many hypotheses. Data can be seen as a set of observations. Data can support or contradict a theory.
What are the characteristics of good scientific theories?
Some theories are better than others. The best theories are supported by data, falsifiable, and parsimonious (as simple as possible). It goes without saying that good theories must be supported by data. They must also be falsifiable: a theory must be able to lead to hypotheses that, when tested, could fail to support the theory. In addition, a theory must be as simple as possible: if two theories explain the data equally well but one is simpler than the other, the simpler one should be preferred. Finally, it is important to realize that theories do not prove anything. It can be said that data support a theory or are consistent with a theory, but it cannot be said that a finding proves a theory.
What is the difference between applied and basic research?
Applied research is aimed at practical problems. Scientists hope that their findings will be applied directly to solve a problem in the real world. Basic research is not aimed at solving specific practical problems; it is aimed at increasing our general knowledge about certain subjects. An example is investigating the motivation of depressed persons. It often happens that basic research is later used for applied research. Translational research is the use of knowledge from basic research to develop and test applications for health care, psychotherapy and other forms of treatment. It can be seen as a bridge between basic and applied research.
Do researchers keep investigating?
It rarely happens that psychologists conduct a single study and then stop. Usually every study leads to new questions. A study may find a simple effect, but the researcher will want to know why the effect happens, when it happens and what its boundary conditions are. He or she will set up a new study to test these things.
How is scientific work published?
Scientists publish their research in scientific journals. These journals often come out once a month, but an article is only published if it has been approved by experts. When you send your article to a journal as a scientist, the editor of that journal sends the article to three or four experts on the subject. These experts tell the editor about the strong and weak points of the article, and the editor then decides whether the article will be published or not. This peer review process is rigorous. The reviewers remain anonymous, so they can give their opinion freely. Their task is to ensure that research that is well done and interesting gets published. If the article is published, other scientists who discover errors in it can send in comments. Scientists can also cite the article and conduct further research on the subject.
How does scientific work end up in a newspaper article?
Articles in scientific journals are mainly read by other scientists; the general public does not read these journals. Popular magazines and newspapers are not written by experts or scientists. Nevertheless, they do contain articles about scientific research. These articles are written in simpler language than the original article and are much shorter. Psychologists benefit when their work is also covered in popular media: the general public can then see what psychologists really do and learn more about a certain subject. Journalists, however, do not always choose the important story, but the sensational one. In addition, not all journalists understand a scientific article accurately; after all, they are not trained to read scientific articles. An example is the coverage of a study about the happiness of people in different British cities. The scientific article reported that people in Edinburgh were the unhappiest, but also that this finding was not significant. The journalists who subsequently wrote about it did not understand what a significant finding is, so popular articles appeared about the unhappy people of Edinburgh, even though the finding was not significant. The researcher tried to explain to the journalists that the result was not significant and hoped that they would correct the story, but the journalists did not want to hear it.
When people have to make decisions, they often rely on their own experience. If you have had a bad experience with a certain car brand, you will not quickly buy that brand again. People also often rely on the experiences of acquaintances and family members. Why not trust your own experience, or that of someone you know?
Why is it important to have comparison groups?
There are several reasons why beliefs should not be based on your own experiences alone. One of these reasons is that personal experience lacks a comparison group. In research, the question is always "compared to what?" With a comparison group, we can look at what happens with and without the thing being investigated. To draw conclusions about a particular treatment or effect, groups must be compared with each other: the treated/recovered group, the treated/not-recovered group, the untreated/recovered group and the untreated/not-recovered group. With these four groups, the rate of recovery with the treatment can be compared with the rate of recovery without it. If you only look at your own experiences, you have no comparison group; you are looking at just one person, yourself. Only research offers a systematic comparison.
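As an illustration, the Python sketch below turns the four comparison groups into recovery rates with and without treatment. The counts are purely hypothetical and only show why the comparison itself, not a single experience, is informative.

```python
# Hypothetical counts for the four comparison groups (illustrative only).
treated_recovered = 60
treated_not_recovered = 40
untreated_recovered = 45
untreated_not_recovered = 55

# Recovery rate with treatment vs. without treatment.
rate_treated = treated_recovered / (treated_recovered + treated_not_recovered)
rate_untreated = untreated_recovered / (untreated_recovered + untreated_not_recovered)

print(f"Recovered with treatment:    {rate_treated:.0%}")
print(f"Recovered without treatment: {rate_untreated:.0%}")
# Only this comparison (here 60% vs. 45%) says whether the treatment adds anything;
# a single personal experience cannot provide it.
```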
Why is experience confounded?
A lot happens in daily life, and it is therefore problematic to base conclusions on your own experiences. If a change takes place, you cannot know for sure what caused it. In daily life there are usually several possible explanations for an outcome. In research these alternative explanations are called confounds. A confound occurs when you think one thing caused a result, but other things changed at the same time, so you cannot be sure what the cause was. In everyday life it is difficult to isolate variables; in research it is possible to control variables and to change one variable at a time.
Why is research better than experience?
By using controlled and systematic comparisons, hypotheses can be tested. In research, a so-called confederate can also be used: a person who works with the researcher but pretends to be an ordinary participant. In a controlled study, researchers can set up the conditions in such a way that there is at least one comparison group, and they can check for confounds.
What is probabilistic about research?
Research is normally more accurate than individual experience, but sometimes our own experiences contradict research results. Personal experiences are often vivid, and many people attach too much value to them. Sometimes your own experience is an exception to what has been found in research. Does this exception contradict the research results? Not necessarily, because research is probabilistic. This means that its findings are not expected to explain all cases; the conclusions of a study explain only a portion of the possible cases. Research can predict that there is a high probability that something will happen, but that does not mean it will always happen.
People often base their conclusions on intuition. We tend to think that our intuition is reliable, but it can lead to poor decisions. That is because most people are not scientific thinkers and are therefore biased. A bias can be cognitive or motivational.
What makes intuition distorted?
Our intuitions are often biased because our brains do not work perfectly. People can be too easily persuaded by a story that sounds logical but is actually incorrect; accepting a conclusion simply because it sounds plausible is one cognitive bias. Another example of a cognitive bias is the availability heuristic: things that we can recall quickly steer our thinking. These are often events that are vivid or happened recently. Some things get more media attention, and as a result we may think they happen more often. For example, plane crashes are mentioned in the news more often than car accidents, which may lead people to think that more deaths per year are caused by plane crashes than by car accidents. The availability heuristic can thus make us overestimate things. Another problem is that people often do not look for negative or absent information. We tend to look at what is present, not at what is absent. If you only look at the things that are present and not at the things that are absent, you commit the present/present bias: cases in which both the treatment and the outcome are present come to mind quickly, but cases in which the outcome occurred without the treatment do not.
How can motivation distort intuition?
Sometimes people do not want to change their ideas. Because they do not want to change their beliefs, some people only look at information that matches those beliefs. We can also steer our thinking by asking questions whose answers are likely to fit what we already think. This is called confirmatory hypothesis testing, and it is not a scientific way of doing research: questions are asked that confirm the hypothesis, but no questions are asked that could contradict it. People are also biased about being biased. Even people who know that biases exist tend to think those biases do not apply to them. The bias blind spot is the belief that we ourselves will not fall prey to bias; most people think their beliefs are less distorted than those of others. This bias can make us overly convinced that we are right, and trusting that we are right is not a scientific way of thinking.
Can we trust authority figures?
It has already been said that you have to be careful about basing conclusions on your own experiences or those of people you know. What about authority figures? Should we trust them? Before you take the advice of an expert or authority, ask yourself where their ideas come from. Has this person compared the different conditions in a systematic and objective way? If this person refers to research, you can be more confident that he or she is right. Keep in mind that authorities can also base their conclusions on intuition and experience, and that not all studies have been carried out accurately.
Where can we find research articles and read them?
Where and how do we find articles about scientific research?
Your conclusions should be based on research, but where do you find articles about research? Most psychologists publish their work in three different outlets. Most often the work is published in scientific journals. They can also publish their work as a chapter in an edited book. There are also researchers who write full books, for example for students, in which they describe their research.
Most scientific journals come out monthly or quarterly. They can be found in your university library or online. The articles in these journals are either empirical articles or review articles. Empirical articles report the results of a study for the first time: they describe the method used, the statistical tests that were applied and the results. A review article provides a summary of many or all published studies on a topic. Sometimes a review article uses a meta-analysis, which combines the results of several studies and gives a value for the effect size of a relationship. Scientists appreciate meta-analyses because they weigh all studies proportionally. Before empirical and review articles can be published, they must be read and assessed by experts (see also Chapter 1). These journals are read by other scientists and by students.
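As a rough illustration of how a meta-analysis combines studies, the sketch below computes a sample-size-weighted average of hypothetical effect sizes. Real meta-analyses use more sophisticated weighting (for example inverse-variance weights), so treat this only as a simplified example with made-up numbers.

```python
# Hypothetical effect sizes (correlations) and sample sizes from three studies.
studies = [
    {"r": 0.30, "n": 50},
    {"r": 0.10, "n": 200},
    {"r": 0.45, "n": 80},
]

# Weight each study by its sample size, so larger studies count more.
total_n = sum(s["n"] for s in studies)
weighted_r = sum(s["r"] * s["n"] for s in studies) / total_n

print(f"Sample-size-weighted mean effect size: r = {weighted_r:.2f}")
```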
A so-called edited book consists of a number of chapters on the same subject, written by different authors. One or more editors invite other scientists to each write a chapter (or several). These books often summarize the work on a specific topic. They are not judged as strictly as a journal article; the editor simply asks other scientists to contribute something to the book. These books are also read by psychologists and psychology students. In addition, psychologists can describe their research in a full-length book, although that does not happen often. Other disciplines, such as anthropology and history, more commonly describe their work in whole books.
Scientific articles can often be found in a university library, and you can also use online databases. One of the best-known online databases of psychology articles is PsycINFO, which is updated weekly by the APA. It can find articles on a certain topic, and you can also search for all articles by a particular author. In addition, PsycINFO shows how often an article has been cited and by whom. An alternative to PsycINFO is Google Scholar. However, the articles found through Google Scholar are not always freely available, and it is less well organized than PsycINFO.
Where and how can you best read about scientific research?
Some students have difficulty reading scientific articles, especially at first. Most scientific articles are written in a fixed format, with sections in the same order: abstract, introduction, method, results, discussion and references. One piece of advice is not to read word for word, but to read the article with a purpose: you want to know what the main argument is and what the evidence for or against that argument is. It is therefore useful to read the abstract first. At the end of the introduction you will find the hypotheses, and at the beginning of the discussion you will find the hypotheses and the main results again. Once you know the hypotheses, you can read the introduction properly. Chapters in books and review articles do not have the specific sections of empirical articles, but you should still ask yourself what the argument is and what the evidence for or against it is.
It is best to read about research in scientific articles, yet the work of psychologists is also described in non-scientific books. A bookstore often has a psychology section with books written for people who have never studied psychology. These books are written to help people, to entertain, and to make money. The language in them is easier than the language used in scientific articles. To find out whether such a book bases its story on scientific research, look at the notes and references at the back, which show which articles it is based on. Books that contain no references should not be taken seriously.
In addition, students often consult Wikipedia. Wikipedia can be a source of information, but it is not always reliable. Some psychological phenomena have their own Wikipedia page, but that does not mean that everything on it is accurate. Anyone can edit a Wikipedia page and cite whichever sources he or she wants. Sources are often listed, but they are only a small selection: the sources the page's authors chose to consult. Wikipedia contributors are often enthusiastic, but not always experts on a particular subject.
Variables are important parts of research. A variable is something that can vary, so it must have at least two levels or values. A constant is something that could vary but has only one value in a given study. In research, each variable is either measured or manipulated. A measured variable is a variable whose values are observed and recorded; examples are IQ, sex and blood pressure. To measure abstract variables (such as depression and stress), scientists often use questionnaires. A manipulated variable is a variable the researcher controls, usually by assigning participants to different conditions of that variable. Some variables, such as sex, can only be measured and not manipulated. Some variables should not be manipulated because it would be unethical; for example, people should not be assigned to a condition in which they experience great emotional pain. Other variables can be either measured or manipulated.
Each variable can be described in two ways. Conceptual variables are abstract concepts, such as intelligence; these variables are also called constructs. These variables must be defined carefully, and such definitions are called conceptual definitions. To test hypotheses, researchers must also create operational definitions of their variables. Operationalization means turning a concept into a measurable or manipulable variable. The conceptual variable shyness, for example, can be operationalized as a structured set of questions. Operationalizing concepts is sometimes difficult.
What are the three claims in psychology?
A claim is an argument that someone makes. Psychologists make claims based on research. There are three different types of claims: frequency claims, association claims and causal claims. These three will be discussed below.
Frequency claims describe a particular level of a single variable, usually expressed as a numerical value (often a percentage); they claim how often something occurs. Frequency claims are always about one variable, and that variable is always measured, never manipulated. Association claims state that a certain level of one variable tends to go together with a certain level of another variable; variables that are associated are said to correlate. Association claims involve at least two variables, and the variables are measured, not manipulated. There are three types of association: positive, negative and zero. A positive association means that a high level of one variable goes together with a high level of the other variable, and a low level with a low level; this is also called a positive correlation. A negative association means that a high level of one variable goes together with a low level of the other variable. Zero association means that no correlation between the two variables can be found. A correlation can be shown in a scatterplot: an upward-sloping cloud of points indicates a positive correlation, a downward slope a negative correlation, and a flat (horizontal) pattern indicates no correlation.
Associations can help us make predictions. These are statistical predictions, not necessarily predictions about the future; they are used to make our estimates more accurate. The stronger the relationship between the two variables (the closer the correlation is to 1 or -1), the more accurate our predictions will be.
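A minimal Python sketch of these two ideas, using made-up data: it computes the correlation between two measured variables and then uses a simple regression line to make a prediction. The variable names and values are purely illustrative.

```python
import numpy as np
from scipy import stats

# Made-up data: hours of sleep and mood score for eight people.
sleep_hours = np.array([5, 6, 6, 7, 7, 8, 8, 9])
mood_score  = np.array([4, 5, 6, 6, 7, 7, 8, 9])

# Direction and strength of the association.
r, p_value = stats.pearsonr(sleep_hours, mood_score)
print(f"r = {r:.2f} (positive and strong), p = {p_value:.3f}")

# A regression line lets us predict mood from sleep; the stronger the
# correlation, the more accurate this kind of prediction tends to be.
slope, intercept, *_ = stats.linregress(sleep_hours, mood_score)
predicted_mood = intercept + slope * 7.5
print(f"Predicted mood for 7.5 hours of sleep: {predicted_mood:.1f}")
```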
Causal claims state that one variable is responsible for changes in another variable. Causal claims always start with an association, but they go a step further. These claims often use words such as "causes", "increases" or "reduces"; they can also use more careful language, such as "may", "seems to" or "suggests". To move from association to causality, a study must meet three criteria. It must first establish that the two variables are correlated. It must also establish that the causal variable came before the outcome variable in time. Finally, it must rule out alternative explanations for the relationship between the variables, so that the relationship is not produced by a third variable. Unfortunately, not all claims published in the media are based on sound research.
Which four validities are there and how are they used?
Consumers of research must evaluate claims using different kinds of validity. Validity refers to the appropriateness of a conclusion: when a claim is valid, it is reasonable and accurate. Psychologists, however, do not simply say that a claim is valid or invalid; they evaluate a claim in terms of four different types of validity.
How are frequency claims evaluated?
To evaluate frequency claims, we look mainly at construct validity and external validity; we can also look at statistical validity. Construct validity concerns how well a conceptual variable has been operationalized: how well did the researchers measure or manipulate their variables? The different levels of a variable must also correspond to real differences.
External validity concerns generalizability: which participants were studied, and how well do they represent the population? If you want to make a statement about Dutch people, you have to question a sample that represents the whole population; for example, not only middle-class Dutch people, but also those from lower and higher social classes. Statistical validity concerns the extent to which the statistical conclusions are accurate. For frequency claims, statistical validity often concerns the margin of error. If it is claimed that 26% of the population is unhappy with a margin of error of 3%, this indicates the range in which the true percentage is likely to lie, in this case 23-29%.
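A small sketch of how such a margin of error can be computed for a proportion, assuming a simple random sample and the usual 95% normal approximation; the sample size and percentage are made up for the example.

```python
import math

# Hypothetical survey: 26% of 1,000 respondents say they are unhappy.
n = 1000
p_hat = 0.26

# 95% margin of error for a proportion (normal approximation).
margin = 1.96 * math.sqrt(p_hat * (1 - p_hat) / n)

print(f"Estimate: {p_hat:.0%} +/- {margin:.1%}")
print(f"Likely range: {p_hat - margin:.1%} to {p_hat + margin:.1%}")
```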
How are association claims evaluated?
For association claims we also look at construct validity and external validity. Because association claims involve two variables, construct validity must be examined for each variable separately; if one of the variables is not measured well, you cannot trust conclusions based on it. You can also look at statistical validity. One aspect of statistical validity is strength: how strong is the association between the variables? The association between a person's height and shoe size is quite strong, but the association between hair color and income is weak. The statistical significance of an association must also be examined, because some reported associations come about purely by chance. In addition, there are two types of errors related to statistical validity. A study can conclude from its data that there is an association between two variables when there is actually no association in the population; this is a false positive, or Type I error. A study can also conclude from its data that there is no association between two variables when there actually is one in the population; this is called a miss, or Type II error. Recognizing these two errors takes training, for example by following statistics courses.
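A hedged simulation sketch of a Type I error: the two variables below are generated independently (so there is truly no association in the "population"), yet by chance roughly 5% of samples will still show a significant correlation at the .05 level. The numbers are illustrative, not from the text.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
n_studies, n_per_study = 1000, 30
false_positives = 0

for _ in range(n_studies):
    # Two variables with no real association in the population.
    x = rng.normal(size=n_per_study)
    y = rng.normal(size=n_per_study)
    _, p = stats.pearsonr(x, y)
    if p < 0.05:          # "significant" purely by chance: a Type I error
        false_positives += 1

print(f"Type I error rate: {false_positives / n_studies:.1%} (expected ~5%)")
```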
How are causal claims evaluated?
To evaluate causal claims, we look at the three criteria for causation: covariance, temporal precedence and internal validity. Covariance means there is an association between the two variables. Temporal precedence means that the proposed causal variable comes before the outcome variable in time. Internal validity concerns the influence of other (third) variables on the relationship between the two variables studied: if you want your study to be internally valid, you must rule out alternative explanations. Covariance is about the results of the study; temporal precedence and internal validity are more about the method than the results. To test causal claims, researchers design experiments. The manipulated variable is called the independent variable and the measured variable is called the dependent variable. Manipulating a variable means randomly assigning some participants to one condition and other participants to a different condition. Random assignment to conditions helps rule out third variables.
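A minimal sketch of random assignment, the core of a true experiment; the participant list and the two condition labels are hypothetical.

```python
import random

random.seed(42)

# Hypothetical participants and the two conditions of the independent variable.
participants = [f"P{i:02d}" for i in range(1, 21)]

# Shuffle, then split in half: each participant has an equal chance of ending
# up in either condition, which helps equate the groups on third variables
# and thereby supports internal validity.
random.shuffle(participants)
half = len(participants) // 2
groups = {"treatment": participants[:half], "control": participants[half:]}

for condition, members in groups.items():
    print(condition, "->", ", ".join(sorted(members)))
```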
Construct validity, external validity and statistical validity should also be examined for causal claims. You want to know whether the variables were manipulated and measured well, whether the results can be generalized to other populations and settings, and how strong the relationship between the two variables is.
Which of the four validities is the most important? That depends on the situation. All validities matter, but no study can be perfect. Most researchers find it difficult to satisfy all four validities at once; they have to prioritize, and the choice depends on the goals of the research. If you do a telephone survey and you want the results to generalize to the entire Dutch population, you have to call people from all twelve provinces: in that case external validity is a priority.
Today, psychologists have to adhere to ethical guidelines when doing research with humans or animals. In the past, psychologists had different ideas about the ethical treatment of participants. Two well-known studies with ethical problems are described below: one from health care and one from psychology.
What are some ethical violations committed in the past?
What were the violations of the Tuskegee syphilis research?
The first example is the Tuskegee syphilis study. At the end of the 1920s, many people in the southern United States were concerned that about 35% of the Black men in the region were infected with syphilis. At that time the disease was difficult to treat; because of the disease, men could not work normally, contribute to society, or escape poverty. The only treatment was an infusion of toxic metals, and when this treatment worked at all, it often had serious or fatal side effects.
In 1932, the US Public Health Service (PHS) decided to collaborate with the Tuskegee Institute on a study in which 600 Black men participated; 400 of these men were infected with syphilis and 200 were not. The scientists wanted to investigate the long-term effects of untreated syphilis. Most of the participants were enthusiastic about the study because they thought they would receive free health care. The men were not told that the research was about syphilis. The study lasted 40 years, and the researchers intended to follow the men who had syphilis until their death.
The men were not told they had syphilis, only that they had "bad blood." The researchers said that the men would be treated and that they had to come to the institute to be examined and tested. In reality, the men were never treated for their illness, and sometimes painful or dangerous procedures were performed on them. To make sure that all participants kept coming, the researchers lied and told them that they were to receive a special free treatment. During World War II, 250 of the men in the study wanted to join the US Army. The men were examined, and after syphilis was diagnosed the army told them they could come back once their syphilis had been treated. The researchers ignored the army and decided not to treat these men, so they could not enter the army and missed out on the pay and other benefits that service would have brought.
In 1943 the PHS approved penicillin as a treatment for syphilis, but the researchers did not tell the participants. The study was only stopped in 1972, after the press reported on it. Many men had become seriously ill during the study and some had died; some men had infected their wives, who in turn passed the disease on to their children.
Today we would call the choices the researchers made unethical. These choices fall into three categories. First, the participants were not treated with respect: the researchers lied and withheld information, so the participants could not truly consent to the research. Had they known everything, they might not have agreed to take part. Second, the participants were harmed: they were not told about a new treatment and they had to undergo painful procedures. Finally, the participants belonged to a disadvantaged group. Anyone can get syphilis, but the researchers only wanted to use poor Black Americans as participants.
What were the problems of the Milgram research?
Sometimes it is difficult to make decisions about ethical issues. Milgram studied obedience to authority in the 1960s. One participant played the "teacher" and the other person played the "learner." The teacher had to give the learner an electric shock whenever he made a mistake; the teacher could not see the learner. The shocks became stronger with each wrong answer. At one point the learner screams, says that the shocks are too painful and that he wants to stop, and eventually he falls silent. The researcher, in his white lab coat, tells the teacher to keep administering shocks, even when the participant does not want to continue. The study showed that 65% of the participants obeyed the researcher and administered what they believed to be extremely dangerous electric shocks to a fellow human being. In reality the learner never received any shocks, but the participant who served as the teacher was led to believe that he did.
Was it unethical for Milgram to carry out such research? Critics saw two ethical problems. First, the study caused the "teacher" participants a great deal of stress. Second, researchers were concerned about lasting effects. After the study, all participants were told that the set-up was fake and what the idea behind the research was, but some participants remained upset that they could have hurt another person. Some researchers argued that Milgram should have intervened once he saw how stressed the participants were; others argue that we have learned a great deal about obedience from Milgram and would not have known it without his research. It is therefore sometimes difficult to decide whether a study is unethical. We have to weigh the potential risks for the participants against the knowledge that can be gained, and for Milgram's research this balance is hard to strike.
What are the most important ethical principles?
After World War II, international agreements were made about ethical guidelines in medical research (such as the Declaration of Helsinki). In the United States, the ethical system is based on the Belmont Report, for which doctors, philosophers, scientists and other citizens came together to discuss how research participants should be treated. The report names three ethical principles: respect for persons, beneficence and justice.
The principle of respect for persons means that participants must be treated as autonomous agents: they decide for themselves whether they want to participate in research. They must first be told what the research is about and what the risks are; only then can someone decide whether or not to take part. Researchers are not allowed to unduly influence people to participate. In addition, some groups require extra protection when it comes to consenting to research: children, people with an intellectual or developmental disability, and prisoners.
The principle of beneficence means that researchers must assess in advance what risks and benefits participants may incur from the research, and whether there are risks or benefits for the wider community. This must all be done before a study can start. Researchers may not withhold new medicines or effective treatments from participants. It is, however, sometimes difficult to estimate how much psychological or emotional distress a study may cause, and it is also difficult to estimate the benefits the research will bring.
The principle of justice requires a balance between the people who participate in the research and the people who benefit from it. The researchers of the Tuskegee syphilis study violated this principle because they studied only Black Americans and no white Americans, placing the burden of the research on one disadvantaged group. When only one ethnic group is studied, the researcher must show that the problem under investigation occurs only or mainly in that group.
What are the guidelines for psychologists?
What are the five general principles?
In addition to the guidelines in the Belmont Report, American psychologists follow the guidelines of the American Psychological Association (APA). These guidelines cover the roles psychologists can take on: research scientist, teacher and practitioner (therapist). There are five general ethical APA principles: respect, beneficence, justice, integrity, and fidelity and responsibility (taken together as one principle). The first three correspond to the Belmont Report. Integrity means, for example, that teachers must teach their students accurate information and that therapists must stay informed about the empirical evidence for therapeutic techniques. Fidelity and responsibility means, among other things, that psychologists may not engage in sexual relationships with their students or clients and that teachers should not take one of their own students on as a client.
What are the ten specific standards?
In addition to the five general principles, the APA also has ten specific ethical standards, which can be seen as enforceable rules. Psychologists who do not comply with these standards may lose their license to practice. Ethical Standard 8 is the most important one for researchers; the other standards are aimed more at therapists and teachers. Ethical Standard 8 is explained below.
Standard 8.01 states that research must be reviewed by an institutional review board (IRB). The board determines whether a study will be carried out in an ethical manner. Before a scientist can conduct a study with human participants, he or she must submit an application to the board describing in detail what the research will look like and what its risks and benefits are. The members of the IRB then decide whether the study may be conducted. Standard 8.02 states that most studies require informed consent: a document or web page that describes what the research will be about, what the risks and benefits are, whether the data will be treated anonymously, and on which the participant indicates that he or she agrees to take part. For naturalistic observation studies in low-risk settings, informed consent is not always required; the IRB decides whether informed consent is needed for such studies.
Standard 8.07 is about deception. Sometimes researchers withhold information, and sometimes they lie to participants. Some researchers believe that it is sometimes necessary to lie to or withhold things from participants; others believe participants should never be lied to. When researchers do decide to deceive participants, they must afterwards inform them about the deception and about the actual goal of the research. This is called debriefing and falls under Standard 8.08. A debriefing is often given even in research that does not use deception.
What do the standards say about misconduct in publishing?
Most guidelines deal with the proper treatment of participants, but there are also guidelines about the publication process. It is considered ethical to publish results; otherwise participants have given their time to a researcher for nothing. Two forms of misconduct in publishing are data fabrication and data falsification (Standard 8.10). Data fabrication means that a researcher does not record what actually happened but invents data to support the hypotheses. Data falsification means that researchers influence the results, for example by leaving out some observations or by influencing participants. Fabricating or falsifying data can cause theories that are actually inaccurate to be accepted as accurate, and can lead other researchers to waste time on futile follow-up research.
Another form of misconduct is plagiarism (Standard 8.11): presenting the ideas or words of others as your own, without referring to the original author. It is a form of stealing. To prevent plagiarism, a writer must cite the original author whenever he or she uses that person's ideas, following the APA format described briefly in Chapter 1 (author name and year of publication). Students must follow these rules when writing their theses; those who commit plagiarism can be punished, for example by being expelled from the program.
What do the standards on animal research say?
Psychologists do not only conduct research with humans, but sometimes also with animals. According to Standard 8.09, psychologists who use animals must take good care of them, treat them humanely, use as few animals as possible, and ensure that their research is important enough to justify the use of animals. Most countries also have other institutions that monitor the use of laboratory animals, and a committee is often set up to supervise the care of animals in research. In many countries the three Rs are used: replacement, refinement and reduction. Replacement means that researchers replace animals with other methods where possible. Refinement means that researchers carry out their procedures in such a way that the animals experience as little distress as possible. Reduction means carrying out research with the smallest possible number of animals.
Most psychologists and psychology students agree with the use of animals for research, provided the animals are treated well and researchers take into account the pain that animals may experience during a study. Animal rights activists believe that laboratory animals have rights and that subjecting them to research violates those rights. Other activists believe that humans are no more important than other animals and that animal research should only be done if the same research could also be done with human participants. Researchers must therefore balance the value of animal research against the treatment of the animals. Many psychologists treat their animals well, and animal research has produced many findings that contribute to both applied and basic knowledge. In addition, psychologists try to use as few animals as possible and to use alternative procedures (such as computer models) where they can.
When psychologists decide how to operationalize a variable, they can choose between three types of measures: observational measures, self-report measures and physiological measures. They also have to decide which scale of measurement to use. As mentioned in Chapter 3, a conceptual variable is the researcher's definition of a variable at the theoretical level; the operational variable is the decision about how that variable is measured or manipulated. Each conceptual variable can be operationalized in several ways. For example, the concept of wealth can be operationalized by recording annual income or by coding the age of someone's car.
What are the three types of measurements?
The measures psychologists use to operationalize concepts usually fall into three categories: self-report, observational and physiological measures. Self-report measures record the answers people give about themselves in a questionnaire or interview. For children, self-reports are often replaced by reports from parents and/or teachers. Observational measures, also called behavioral measures, operationalize a variable by recording observable behavior. Coding how expensive someone's car is would be an observational measure of wealth; counting the number of bite marks on a pencil would be an observational measure of stress. Physiological measures operationalize a variable using biological data, such as brain activity and heart rate, often recorded with instruments such as EEG and fMRI. Ideally, several types of measures are used so that the results can be compared.
Which scales are there?
All variables must have at least two levels. The levels of operational variables can be coded using different scales of measurement. Operational variables are first classified as categorical or quantitative. The levels of categorical variables are categories; these variables are also called nominal variables. An example is sex, with the levels male and female. A man can be coded as '1' and a woman as '2', but these numbers carry no quantitative meaning and other numbers could just as well be used; it is not the case that being coded '2' is 'higher' than being coded '1'. Quantitative variables, in contrast, have values with numerical meaning.
Quantitative variables can be further classified as ordinal, interval or ratio scales. An ordinal scale reflects a rank order. A teacher can hand back tests in order from the highest to the lowest grade: the first student scored higher than the last student, but it is not known how much higher. An ordinal scale says nothing about the distances between the ranks. An interval scale does have equal intervals (distances) between levels, but no true zero point that means 'nothing'. An IQ test is an example of an interval scale: the difference between 95 and 100 is the same as the difference between 105 and 110, but scoring 0 on an IQ test does not mean that you have no intelligence at all. A ratio scale has equal intervals and a true zero point that really does mean 'nothing': someone who answers no test items correctly scores 0, and that 0 means they got nothing right. Because of the meaningful zero point, more can be said about the levels; for example, someone who earns 4000 euros per month earns twice as much as someone who earns 2000 euros per month.
What is reliability in tests and how can it be measured?
How do you know whether you have operationalized a variable properly, and whether the measures in a study have construct validity? Construct validity has two aspects: reliability, which refers to how consistent the results of a measure are, and validity, which concerns whether a variable measures what it is supposed to measure.
What is reliability in tests?
Researchers collect data to establish that their measures are reliable; establishing reliability is an empirical question. Reliability can be assessed in three ways, all of which concern consistency of measurement. Test-retest reliability means that the researcher obtains the same scores each time the same people are measured: people who score highest on an IQ test should also score highest when the same group takes the test again a month later. Interrater reliability means that different raters give the same scores; this form of reliability is most important for observational measures. Internal reliability means that a participant gives a consistent pattern of answers across the items of a measure.
What can be used to evaluate reliability?
Two statistical tools can be used to evaluate reliability: scatterplots and the correlation coefficient. Reliability evidence can be treated like an association claim. Test-retest reliability can be displayed in a scatterplot: the first measurement of each person goes on the x-axis and the second measurement on the y-axis. When the points lie on or close to a straight rising line, test-retest reliability is good. Interrater reliability can also be examined with a scatterplot: the scores given by one rater go on the x-axis and the scores given by the other rater on the y-axis. If the points cluster around a straight rising line, interrater reliability is good.
Reliability is more often quantified with a correlation coefficient, r, which indicates the direction and strength of a relationship. When the slope in a scatterplot goes down, r is negative; when it goes up, r is positive. The value of r lies between -1.0 and 1.0: values close to -1 or 1 indicate a strong relationship, and values close to 0 a weak one. For test-retest reliability, the two measurement occasions are correlated; when r is positive and strong (for example above .50), test-retest reliability is good. When the scores of two raters are correlated and r turns out to be positive and strong (.70 or higher), interrater reliability is good. To assess the internal reliability of a scale, researchers compute Cronbach's alpha (for example in SPSS), which compares all items of a scale with one another. The closer the resulting value is to 1, the more internally reliable the scale. If internal reliability is high, all items can be kept in the scale; if not, researchers have to adjust their scale items.
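A hedged sketch of these calculations in Python, with made-up scores: a test-retest correlation and Cronbach's alpha computed from the standard item-variance formula. The data are purely illustrative.

```python
import numpy as np

# Test-retest reliability: made-up IQ scores for six people at two time points.
time1 = np.array([98, 105, 110, 120, 95, 102])
time2 = np.array([100, 104, 112, 118, 96, 101])
r_test_retest = np.corrcoef(time1, time2)[0, 1]
print(f"Test-retest r = {r_test_retest:.2f}")

# Internal reliability: Cronbach's alpha for a hypothetical 4-item scale
# (rows = respondents, columns = items, each scored 1-5).
items = np.array([
    [4, 5, 4, 4],
    [2, 2, 3, 2],
    [5, 4, 5, 5],
    [3, 3, 3, 4],
    [1, 2, 1, 2],
])
k = items.shape[1]
item_variances = items.var(axis=0, ddof=1)
total_variance = items.sum(axis=1).var(ddof=1)
alpha = (k / (k - 1)) * (1 - item_variances.sum() / total_variance)
print(f"Cronbach's alpha = {alpha:.2f}")
```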
What types of validity are there in measurements?
In addition to reliability, we must also check whether a measure actually measures what it is supposed to measure. Does your religiosity scale really measure how religious someone is? Psychologists often want to measure abstract constructs for which no direct standard of comparison exists, which is why construct validity is so important in psychological research. We cannot measure happiness directly; we can only estimate it by looking at different indicators, such as someone's reported well-being, how often someone smiles, stress hormone levels, or blood pressure. All these measures are indirect; for some abstract constructs there is simply no direct measure. How can you know whether an indirect operationalization of a construct actually measures what it is supposed to measure? By collecting data and evaluating validity on the basis of those data. There are several types of measurement validity.
Face validity means that a measure looks, on the face of it, like a plausible measure of the construct. It is rather subjective: if it seems like a good measure, it has face validity. Content validity checks whether a measure covers all parts of a construct. If intelligence is defined as the ability to plan, solve problems, reason, understand complex ideas, think abstractly and learn quickly, then an operational measure of intelligence must contain questions about each of these parts.
Most psychologists do not want to rely solely on these subjective forms of validity. They therefore also check whether the measure is associated with something it should be associated with. Criterion validity checks whether the measure is related to a concrete outcome, such as a behavior, with which it should be associated according to theory. If an IQ test has criterion validity, it should correlate with behaviors that correspond to the construct of intelligence (such as those mentioned above). Criterion validity can therefore be evaluated using scatterplots and correlation coefficients. Another way to obtain evidence of criterion validity is the so-called known-groups paradigm: researchers check whether the scores on a measure can distinguish between groups whose behavior is already well understood.
Another form of validity evidence looks for meaningful patterns of similarities and differences. A valid measure should correlate strongly with other measures of the same construct (convergent validity) and less strongly with measures of different constructs (discriminant validity). If you develop a new scale for measuring depression, you can check whether it corresponds to an existing depression scale; when the correlation between the two is high, the new scale shows convergent validity. In addition, the new scale should not correlate strongly with measures of other constructs (discriminant validity); for example, it should not correlate strongly with perceived physical health. Convergent and discriminant validity are often evaluated together. There are no strict rules about how high or low the correlations should be; the main rule is that correlations with related constructs must be higher than correlations with unrelated constructs.
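A small illustrative sketch of this comparison with made-up scores: the new scale should correlate more strongly with an existing depression scale than with a physical-health scale. All variable names and values are hypothetical.

```python
import numpy as np

# Made-up scores for ten respondents.
new_depression_scale = np.array([10, 14, 8, 20, 16, 12, 18, 6, 15, 11])
existing_depression  = np.array([12, 15, 9, 22, 17, 11, 19, 7, 16, 10])  # same construct
physical_health      = np.array([30, 27, 29, 31, 28, 33, 26, 30, 32, 29])  # different construct

r_convergent   = np.corrcoef(new_depression_scale, existing_depression)[0, 1]
r_discriminant = np.corrcoef(new_depression_scale, physical_health)[0, 1]

print(f"Convergent validity (same construct):        r = {r_convergent:.2f}")
print(f"Discriminant validity (different construct): r = {r_discriminant:.2f}")
# Evidence for construct validity: the first correlation should clearly
# exceed the second in absolute size.
```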
How can you improve the construct validity of a survey?
In this chapter the word survey is used to refer to questions asked of people by telephone, in interviews, on paper, by e-mail or on the internet. Psychologists who write their questions well can support frequency claims with good construct validity. Survey questions come in different formats. Open-ended questions allow respondents to answer however they want; the answers are often rich in information, but a disadvantage is that they must be coded and categorized, which takes a lot of time and is often difficult. Many psychologists therefore use other kinds of questions. Forced-choice questions ask respondents to choose the best option from several alternatives. A Likert scale is often used in psychological research: respondents are asked to what extent they agree with a particular statement, with options ranging from strongly disagree to strongly agree. If respondents are instead asked to rate something on a numeric scale anchored by opposing adjectives (for example from 1 = 'easy' to 7 = 'difficult'), this is called a semantic differential format; a familiar everyday example is rating products online with one to five stars. Researchers can combine the different question formats in one questionnaire; what matters is that the question format does not undermine construct validity.
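A minimal sketch of scoring Likert-type responses in Python; the scale name, items and responses are hypothetical.

```python
# Hypothetical 5-point Likert responses (1 = strongly disagree, 5 = strongly agree)
# from four respondents on a three-item "shyness" scale.
responses = [
    {"item1": 4, "item2": 5, "item3": 4},
    {"item1": 2, "item2": 1, "item3": 2},
    {"item1": 3, "item2": 3, "item3": 4},
    {"item1": 5, "item2": 4, "item3": 5},
]

# One common way to operationalize the construct: average the item scores.
for i, answers in enumerate(responses, start=1):
    scale_score = sum(answers.values()) / len(answers)
    print(f"Respondent {i}: shyness score = {scale_score:.2f}")
```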
How can you best ask questions?
The way in which questions are formulated and asked can influence construct validity. Every question must be clear and easy to answer directly. Writers of questionnaires must ensure that the wording and order of the questions do not influence the responses of test subjects. Consider the following two versions of a question from a study of race relations in America:
Do you think the relationship between black and white Americans
- will always be problematic?
- or that a solution will eventually be found?
Do you think the relationship between black and white Americans
- is as good as it will ever be?
- or that it will eventually get better?
Only 45% of the people who received the first question were optimistic about race relations, whereas 73% of the people who received the second question were. This is because the questions are leading: the first version is framed negatively, with the words 'problematic' and 'finding a solution', and the second version is framed positively, with the words 'good' and 'better'. Writers of questionnaires must therefore make questions as neutral as possible, otherwise they will not learn the real thoughts and opinions of respondents.
Sometimes a question is worded so awkwardly that a respondent has difficulty giving an answer that accurately reflects his or her opinion. It is best to ask questions as simply as possible: when people understand a question, they can give a clear and direct answer. Sometimes, however, question writers forget this rule and accidentally put two questions into one. These are called double-barreled questions. Such questions have poor construct validity because people may answer the first question, the second, or both, so the item may be measuring the first construct, the second, or both. The questions must be asked separately.
Sometimes the negative wording of a question makes it unnecessarily difficult. Negative here does not mean emotionally loaded words such as 'bad' or 'problematic', but negations. A survey once appeared to show that 20% of Americans denied that the Holocaust had happened. This caused quite a stir, and researchers decided to check whether the study had been carried out properly. They found that the question was awkwardly worded: "Does it seem possible or impossible to you that the Nazi extermination of the Jews never happened?" Most people struggle with the double negation of 'impossible' and 'never'. The question therefore did not measure people's beliefs, but rather how much working memory and motivation they used to answer it, and it thus had poor construct validity. Even a single negative word can make a question difficult. Researchers therefore often also ask the same question in a positively worded version and then check the internal consistency of the two items, to see whether a person answers them consistently (if you disagree with one item, you should agree with the reverse-worded item). Negatively worded questions must be examined carefully, because they can reduce construct validity: the answers to such questions sometimes say more about people's motivation and ability to do cognitive work than about their actual opinions.
The order of the questions can also influence the answers people give. Suppose some people support one cause (such as better conditions for women) but are less supportive of a cause for better conditions for ethnic minorities. If they are first asked whether they are for or against actions to improve the circumstances of women, and only then whether they are for or against actions to improve the circumstances of minorities, a different answer may emerge than when the questions are asked in the reverse order. People often want to be consistent: when they are first asked whether they support actions for women and answer that they do, they will be more inclined to reply that they also support actions for ethnic minorities. The best way to check whether the order of the questions has an influence is to create different versions of the questionnaire and vary the order of the questions across versions. If the results of one order differ from the results of the other order, there is an order effect.
How can test subjects be encouraged to respond accurately?
Test subjects can sometimes give less accurate answers, and they do not always do so intentionally. Sometimes they do not do their best to respond accurately, sometimes they want to come across well, and sometimes they are simply unable to answer questions about their thoughts and motivations accurately. Still, self-reports are often ideal. Most people can answer questions about their demographics and their perspectives. Sometimes self-reports are even the only option: if you want to know what someone dreams about, you have to ask that person, because we do not have an instrument that can see dreams. Some things, such as a person's fear, cannot easily be observed, so you have to ask the person.
So-called response sets are shortcut ways of answering that a test subject can fall into when filling in a questionnaire. Sometimes people do not really think about the questions and answer all of them negatively, positively or neutrally. Response sets weaken construct validity, because people do not say what they actually think. One form of response set is acquiescence, or yea-saying: someone answers 'yes' or 'strongly agree' to every question, which is bad for construct validity. A way of checking whether someone says 'yes' every time, without considering whether he or she really agrees with a statement, is to reverse-word some of the questions. A question such as 'I love candy' is then also worded as 'I do not like sweets'. Someone who really likes candy will agree with the first statement but disagree with the second. Another response set is fence sitting: people always choose the middle of the scale, mainly when a question is controversial or difficult. One way to counter this is to remove the middle option, so that instead of five answer options there are four. A disadvantage, however, is that people who are genuinely neutral or have no opinion can then not express their true position.
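As an illustration, here is a minimal sketch of reverse-scoring, assuming a 5-point Likert scale and using the candy items above as hypothetical examples.

```python
# Sketch: reverse-scoring a negatively worded Likert item (1-5 scale) and
# checking whether a respondent answers the two versions consistently.
def reverse_score(score, scale_max=5, scale_min=1):
    """Map 1 -> 5, 2 -> 4, ... on a Likert scale."""
    return scale_max + scale_min - score

likes_candy = 5        # "I love candy"         -> strongly agree
dislikes_candy = 1     # "I do not like sweets" -> strongly disagree

# After reverse-scoring the negative item, a consistent respondent
# gives (nearly) the same value on both items; a yea-sayer does not.
consistent = abs(likes_candy - reverse_score(dislikes_candy)) <= 1
print(consistent)      # True: this respondent is not just agreeing with everything
```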
Most people want to be seen favorably by others, and sometimes they give answers that make them look better than they are. Such answers lower construct validity, because people choose the responses that make them look good rather than the ones that are true. Sometimes subjects are embarrassed or reluctant to give an unpopular opinion. One way to counter this is to guarantee the anonymity of the subjects, although this does not always help. Another way is to ask the questions of friends and family members, who, after all, know the person well. In addition, special computerized measures can be used to assess people's implicit opinions; test subjects usually do not know what the real purpose of such a measure is and will not try to adjust their answers.
Sometimes self-reports are inaccurate because people do not know why they think or act the way they do. Their memories of events can also be inaccurate, so asking people what happened is not always the best way to find out what really happened. Self-reports are therefore not suitable for all types of questions. A survey is suitable for questions that are subjective in nature: what a person thinks he or she does and what he or she thinks influences that behavior. But if you want to know what people really do and what really influences their behavior, you will have to observe them.
What about the construct validity of behavioral observations?
When a researcher watches the behavior of animals or humans and records it systematically, we speak of observational research. Some researchers think that observations are better than self-reports, because some people cannot accurately answer questions about their behavior and about past events. Observations can form the basis for frequency claims: for example, how often people eat at a snack bar each week, how often parents scream during their child's football game, or how often cars stop at a pedestrian crossing. An example of an observational study is Mehl's research into how many words people say every day. Each test subject wore an electronic recording device, and researchers coded how many words men and women said each day. On average, women said slightly more words per day than men, but this difference was not statistically significant, so it cannot be concluded that women talk more than men (even though we often think so).
Why are observations sometimes better than self-reports?
If the subjects in the previous example had been asked to keep track of the number of words they said per day themselves, the result would probably not have been accurate. In observational research, researchers work very carefully to ensure that their observations are accurate and valid. Observations have good construct validity when they avoid three problems: observer bias, observer effects and reactivity.
Observer bias occurs when the expectations of observers influence their interpretation of the behavior of test subjects: they do not judge the observations objectively, but according to their own expectations. Observer effects occur when observers change the behavior of the person or animal being observed, so that the behavior comes to match the observers' expectations. In one study, students each got a rat and had to record how long it took the rat to learn to run through a maze. The rats were genetically identical, but some students were told that their rat was a smart maze runner and other students were told that their rat was a lazy maze runner. The 'smart' rats got faster by the day, while the 'lazy' rats did not. The observers not only saw what they expected to see; they also caused the behavior of the animals they observed to match their expectations. One way to prevent observer bias and observer effects is to use codebooks: precise, written descriptions of how each behavior should be coded. Another way is to use a blind design: observers do not know which condition a test subject is in and therefore cannot steer that subject's behavior.
Sometimes the mere presence of an observer can make someone behave differently than he or she normally would. Reactivity means that people change their behavior in one way or another when they know someone is watching; sometimes they show better behavior and sometimes worse. Reactivity occurs not only with human subjects but also with animals. One way to counter it is to be as unobtrusive an observer as possible, for example by observing test subjects through a one-way mirror. Another way is to let the test subjects get used to you: an observer who wants to observe children can first spend a few days at the school so that the children get used to him or her and forget that they are being observed. This can also be done with animals. A third way is to measure the traces that behavior leaves behind, instead of the behavior itself: someone may say that he or she is a careful driver, but his or her traffic fines paint a different picture.
Most psychologists consider it ethical to observe behavior in public settings. When secret recordings are made, the researcher must have a good reason for it and must inform the test subjects after the study. If a subject does not agree with the recordings that were made, the researcher must delete the file without having viewed it.
When you evaluate external validity, you ask whether the results of a particular study can be generalized to a larger population. External validity is very important for frequency claims: you ask whether the values found for the people in your sample would also be found in the entire population. Does your sample represent the population? External validity is not only about samples of people but also about settings. A researcher may want to know not whether the results generalize to other members of a certain population, but whether they generalize to other settings, such as other products from the same factory or other courses given by the same lecturer. This chapter mainly deals with the external validity of samples.
What are samples?
A population is the whole set of people or products that a researcher is interested in; a sample is a smaller set taken from that population. If you want to know how the new flavor of Lays chips tastes, you only have to taste one chip: all the other chips in the bag taste the same, so you do not have to eat the whole bag to find out. If you did taste every chip in the bag, you would be carrying out a so-called census. Researchers likewise do not have to examine all members of a population; they assume that a sample says something about the entire population. The external validity of a study concerns how adequately the sample represents the part of the population that was not studied.
There are many populations that scientists can study. Before scientists can determine whether a sample is biased or not, they must specify a population; this is called the population of interest. Scientists can have a broad interest (such as the entire population of the Netherlands) or a specific one (all women who have studied psychology in Groningen). Only when you have a population in mind can you talk about the generalizability of a sample. A sample can only represent a population if it is drawn from that population, but coming from the population does not guarantee that the sample represents it. If a sample consists of Dutch people, it does not automatically represent the entire Dutch population: perhaps the researcher has only studied rich Dutch people. A sample can be either representative or biased. In a biased sample, some members of the population of interest have a higher probability of being included in the sample than other members. In a representative sample, all members of the population have an equal chance of being included. Only representative samples allow us to draw conclusions about the population.
When is a sample biased?
A sample can sometimes contain too many of the population's unusual members. A sample can be biased in at least two ways. Scientists sometimes only study people they can easily get in touch with, or only those who are eager to participate. This can lower the external validity of a study, because people who are easy to reach may have different opinions than people who are harder to reach. Many studies use so-called convenience sampling: a sample of whoever is readily available, often psychology students. Researchers may also fall back on a convenience sample when they cannot reach a certain subgroup; sometimes they simply cannot study people who live too far away, who do not show up or who do not answer their phone. This can lead to a biased sample, because the people they can reach may differ from the population they want to generalize to. A sample can also be biased through self-selection: the sample then consists of people who chose to participate themselves. Self-selection is common in online research and occurs in almost all internet polls. Internet users rate products they have used, and the people who write such reviews are often not representative of the entire population of people who bought the product.
What are sampling techniques?
When researchers really want a representative sample, the best approach is probability sampling, better known as random sampling. This means that every member of the population of interest has an equal chance of being chosen for the sample. Because all members of the population have an equal chance of being represented, the results from such samples can be generalized to the entire population; random sampling is therefore good for external validity. Nonprobability sampling is the opposite: people are not chosen randomly, which yields a biased sample.
The basic form of random sampling is simple random sampling. You can picture it as follows: you write the name of every member of the population of interest on a slip of paper, put all the slips in a hat and draw a number of slips from the hat. Another way is to assign each person a number and use a table of random numbers to select the numbers. Simple random sampling can, however, take a lot of time or be impossible, because it is difficult to assign a number to every member of the population. In cluster sampling, clusters of subjects are randomly selected from the population and then all individuals in the selected clusters are used. Multistage sampling is similar, but involves two random stages: first a random sample of clusters is drawn, and then a random sample of people within those clusters.
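A minimal sketch of how simple random sampling and cluster sampling could be drawn from a sampling frame; the population data frame, the 'school' clusters and the sample sizes are hypothetical.

```python
# Sketch: simple random sampling vs. cluster sampling from a hypothetical frame.
import numpy as np
import pandas as pd

rng = np.random.default_rng(42)
population = pd.DataFrame({
    "person_id": range(10_000),
    "school": rng.integers(0, 50, 10_000),   # 50 clusters (e.g., schools)
})

# Simple random sampling: every member has an equal chance of selection.
simple_sample = population.sample(n=500, random_state=42)

# Cluster sampling: randomly pick 5 schools, then take everyone in them.
chosen_schools = rng.choice(population["school"].unique(), size=5, replace=False)
cluster_sample = population[population["school"].isin(chosen_schools)]
print(len(simple_sample), len(cluster_sample))
```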
Yet another technique is stratified random sampling: the researcher identifies certain demographic groups (strata) and then randomly selects individuals within each of these groups. For example, researchers may want their sample of 2,000 Canadians to contain South Asians in the same proportion as in the Canadian population. Since 4% of the Canadian population is South Asian, at least 80 South Asian Canadians must be included in the sample. There are then two strata in the study, South Asians and other Canadians, and all members are still chosen randomly. A variation on stratified random sampling is oversampling, in which the researcher deliberately over-represents one or more groups. A researcher may decide to do this when a subgroup makes up only a small percentage of the whole population (such as the 4% South Asians in Canada), for example by including 200 South Asians in the sample instead of 80. The South Asian group would then be 10% of the sample, while in the real population it is 4%. With oversampling, however, the results are adjusted afterwards: the oversampled group is weighted back to its true proportion in the population. Within each group, selection is still done randomly.
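A hedged sketch of oversampling followed by weighting back to the population proportions; the population, the stratum sizes and the proportional weighting shown here are illustrative assumptions, not the procedure of a specific study.

```python
# Sketch: oversample a small stratum, then weight it back to its population share.
import numpy as np
import pandas as pd

rng = np.random.default_rng(1)
pop = pd.DataFrame({
    "group": rng.choice(["South Asian", "Other"], size=100_000, p=[0.04, 0.96]),
    "score": rng.normal(50, 10, 100_000),
})

# Oversample: 200 South Asians instead of the proportional 80 in a sample of 2,000.
sizes = {"South Asian": 200, "Other": 1800}
parts = [pop[pop["group"] == g].sample(n=n_g, random_state=1) for g, n_g in sizes.items()]
sample = pd.concat(parts)

# Weight each case by (population share) / (sample share) before estimating.
pop_share = pop["group"].value_counts(normalize=True)
samp_share = sample["group"].value_counts(normalize=True)
weights = sample["group"].map(pop_share / samp_share)
weighted_mean = np.average(sample["score"], weights=weights)
print(round(weighted_mean, 2))
```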
In systematic sampling, a computer or random-number table is used to select two random numbers, for example 3 and 6. If the population of interest is a gym full of athletes, the researcher starts with the third person and then takes every sixth person until the sample is large enough. Researchers often combine several sampling techniques in one study; as long as selection is done randomly, the sample will represent the population. Remember that random sampling is not the same as random assignment. Random assignment is used in experimental designs: researchers place test subjects into different groups (conditions) in a random way. Random assignment increases internal validity by ensuring that the treatment group and the comparison group contain the same kinds of people (so that there is no alternative explanation for the results found).
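A one-line illustration of systematic sampling with the numbers from the example (start at the 3rd person, then every 6th); the list of gym visitors is made up.

```python
# Sketch: systematic sampling with a random start of 3 and a step of 6.
visitors = [f"athlete_{i}" for i in range(1, 121)]   # 120 people in the gym (hypothetical)
start, step = 3, 6                                   # the two randomly chosen numbers
systematic_sample = visitors[start - 1::step]        # the 3rd, 9th, 15th, ... person
print(systematic_sample[:4])
```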
Can researchers also opt for biased sampling techniques?
When external validity is not important to a researcher, he or she can choose to use a biased sample. One example is convenience sampling (discussed above), in which a researcher uses people who are easily accessible. When researchers only want people from certain subgroups and do not choose these people randomly, this is called purposive sampling. Another form of purposive sampling is snowball sampling: participants are asked to recruit a few acquaintances to take part in the research. This is, of course, an unrepresentative way to sample, because people recruit others through their social networks, which is not random. In quota sampling, the researcher identifies subpopulations and decides how large each subpopulation will be in the sample, and then selects the people from each subpopulation in a non-random way (for example by convenience sampling).
What is most important for external validity?
Frequency claims are claims about how often something happens in a population, often expressed as percentages. External validity is very important for frequency claims, so the sampling technique must be examined carefully. Sometimes the external validity of a random sample can be confirmed afterwards; election polls, for example, sometimes match the actual election results. Usually, however, it is difficult to check the accuracy of a sample, because researchers cannot study the whole population to find the true percentage. The only thing you can do is check whether the sampling technique was sound: as long as a random sample was used, you can have more confidence in the external validity of the results.
What if a representative sample is not very important?
External validity is very important for frequency claims, but it is not always a researcher's top priority. This is the case, for example, when association and causal claims are examined: many association or causal claims can be detected accurately with a convenience sample. You therefore have to ask yourself whether a random sample matters for the claim at hand. Is the reason that the sample is biased relevant to your claim or not? Are the characteristics that bias the sample relevant to what you are measuring? If they are not, you can sometimes trust unrepresentative samples.
Are larger samples better?
One of the biggest myths in research is that a larger sample is always better. A large sample is mainly needed when a phenomenon is rare, so that enough cases end up in the sample. For a population as large as that of the United States, a sample of about 1,000 people is often enough. The larger the sample, the smaller the margin of error (discussed in an earlier chapter), but beyond a sample size of about 1,000 you need many more people to make the margin of error only a little more precise (with 1,500 people the margin of error is still about 3% and with 2,000 people about 2%). A sample of 1,000 is therefore seen as an optimal balance between effort and accuracy, and it allows the results to be generalized to the population, as long as the sample was drawn randomly. Sample size is, moreover, not a matter of external validity but of statistical validity.
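A small sketch of the margin-of-error calculation behind these numbers, using the standard formula for a proportion near 50% at a 95% confidence level (an assumption for illustration; the chapter itself does not give the formula).

```python
# Sketch: 95% margin of error for a proportion, MOE = z * sqrt(p*(1-p)/n),
# with z = 1.96 and p = 0.5 (the worst case).
import math

for n in (500, 1000, 1500, 2000):
    moe = 1.96 * math.sqrt(0.5 * 0.5 / n)
    print(f"n = {n:4d}: margin of error = {moe:.1%}")
# n = 1000 gives roughly +/-3%; doubling to 2000 only brings it to about +/-2%.
```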
Association claims describe the relationship between two measured variables. A bivariate correlation, also called a bivariate association, is an association between two variables. To investigate an association, both variables must be measured in the same group of people; statistical methods and graphs are then used to show the type of relationship between the variables. A relatively large number of studies are correlational. An example is John Cacioppo's research into online dating and marital satisfaction. Cacioppo and his colleagues were interested in the relationship between meeting your spouse online and marriage satisfaction. They sent a questionnaire via e-mail to thousands of people who use uSamp (an online research panel). The subjects answered questions about where they met their spouse (online or not online), and their marriage satisfaction was measured with the Couple Satisfaction Index (CSI), which contains items such as 'Indicate the degree of happiness in your marriage', answered on a Likert scale with seven options (from very unhappy to perfect). The survey showed that people who had met their spouse online scored somewhat higher on the CSI. Of course, a correlation does not demonstrate a causal relationship, so one must be cautious in drawing conclusions from this research.
How do you describe associations between two variables?
After you have collected the data, you describe the relationship between the two measured variables with a scatter plot and the correlation coefficient r. When you plot the two variables against each other in a scatter plot, with a dot for each person, you can draw a line through the cloud of points. If the line runs from the lower left to the upper right, there is a positive relationship: high scores on one variable go together with high scores on the other. If the line runs from the upper left to the lower right, there is a negative relationship: high scores on one variable go together with low scores on the other. The strength of the correlation is indicated by the correlation coefficient r, which ranges from -1 to 1. A correlation of about .10 or -.10 indicates a weak effect size, an r of about .30 or -.30 a moderate effect size, and a correlation of .50 or -.50 or beyond a large effect size. r thus shows both the direction (positive or negative) and the strength of the relationship.
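A minimal sketch of computing r; the two variables and their values are invented purely for illustration.

```python
# Sketch: computing the correlation coefficient r for two measured variables.
from scipy import stats

hours_online = [1, 3, 2, 5, 4, 6, 2, 7]
satisfaction = [3, 4, 3, 6, 5, 6, 4, 7]

r, p = stats.pearsonr(hours_online, satisfaction)
print(f"r = {r:.2f}")   # the sign gives the direction, the size gives the strength
# Benchmarks used in the text: |r| of about .10 is small, .30 moderate, .50+ large.
```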
How do you describe associations with categorical data?
The description above concerns associations between two quantitative variables, but keep in mind that some variables are categorical. In the Cacioppo study, one of the variables is categorical: the variable about meeting your spouse via the internet, to which people can only respond 'online' or 'offline'. The values of a categorical variable fall into categories. The other variable (marriage satisfaction) was quantitative, since respondents chose from seven ordered answer options.
When both variables of an association are measured on quantitative scales, it is customary to present the data in a scatter plot. A scatter plot is less useful when one of the variables is categorical: the points representing individuals then line up in one vertical column for people who met their spouse online and another for people who met offline, and it is very difficult to see from such a plot whether the relationship is positive or negative. It is possible to make a scatter plot with a categorical variable, but it is not usually done. It is more convenient to make a bar chart. In a bar chart, each person is not represented as a point; instead, the mean for each category is displayed, so you can examine the difference between the group means.
When at least one of the variables in an association claim is categorical, different statistical methods can be used to analyze the data. Sometimes r can be used, but it is more common to test whether the difference between the group means is statistically significant, often with a t-test. It may seem odd that association claims can be represented with both scatter plots and bar charts, or described with different statistics. However, it does not matter which kind of graph or statistic is used: if both variables are measured, the study is correlational. As discussed earlier (chapter 3), we speak of an experiment when one of the variables is manipulated, and experiments are better suited for causal claims. An association claim is not supported by a particular graph or statistic; it is supported by the design of the study, in which both variables are measured.
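For illustration, a sketch of such a t-test on made-up satisfaction scores; these are hypothetical numbers, not Cacioppo's data.

```python
# Sketch: association between a categorical variable (met online vs. offline)
# and a quantitative one (satisfaction), tested with an independent-samples t-test.
from scipy import stats

met_online  = [5.8, 6.1, 5.9, 6.4, 6.0, 5.7, 6.2]
met_offline = [5.5, 5.9, 5.6, 6.0, 5.4, 5.8, 5.7]

t, p = stats.ttest_ind(met_online, met_offline)
print(f"t = {t:.2f}, p = {p:.3f}")
# A bar chart of the two group means would show the same comparison graphically.
```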
How do you investigate association claims?
The most important validities to be investigated in association claims are construct validity and statistical validity. Sometimes external validity can also be investigated. Internal validity is not important in association claims.
What does construct validity mean for association claims?
Because an association claim describes the relationship between two measured variables, it is important to look at the construct validity of both variables: how well was each of them measured? You can ask whether each measure is reliable and whether it measures what it is supposed to measure, and what the evidence is for its face, convergent, discriminant and criterion validity.
What does statistical validity mean in association claims?
When examining the statistical validity of an association claim, you want to know which factors may have influenced the scatter plot, correlation coefficient r, bar graph or difference between means on which the claim is based. You need to look at the effect size, outliers in the data, possible restriction of range, and the statistical significance of the relationship.
What does effect size mean for association claims?
The effect size describes the strength of a relationship; some associations are stronger than others. Of two associations, the one whose r is closer to 1 (or -1) is the stronger one. Strong effect sizes allow more accurate predictions than weak effect sizes: your prediction error decreases as the effect size increases. Stronger effect sizes are generally also more important than smaller ones, but there are exceptions, which depend on the context. Sometimes a small effect size can matter a great deal, for instance when it concerns life and death. In one study of heart attacks, half of the subjects received one aspirin per day and the other half received a placebo. Taking an aspirin per day was associated with fewer heart attacks, but the effect size was only r = .03. Still, in the aspirin group 85 fewer people had a heart attack than in the placebo group, which was seen as an important result. If the outcome is not of vital importance (such as whether you met your spouse online or offline), a small effect size is probably not important.
What does statistical significance mean for association claims?
Of course, researchers cannot study every individual in a population, so they use (random) samples and draw conclusions about the population from them. The results in a sample usually mirror the population, but not always. Sometimes there is no association between two variables in the population, yet a study finds an association in a sample purely by chance. We therefore always have to ask whether there really is an association in the population or whether the association found in the sample arose by chance.
Statistical significance calculations produce a probability estimate, p. The p-value indicates the probability of finding an association this strong in a sample if it came from a population in which the true association is zero. If that probability is less than 5%, we consider it very unlikely that the result came from a population with a zero association, and the correlation is regarded as statistically significant. When the results yield a high p (.05 or higher), they are not statistically significant: the researcher cannot rule out that they come from a population in which the association between the variables is zero. Significance is related to effect size: the stronger the correlation (the larger the effect size), the greater the chance that it will be statistically significant. Significance also depends on sample size. A small effect size can still be statistically significant if it comes from a very large sample (say 1,000 or more test subjects), whereas a small sample is more easily affected by chance: weak correlations based on small samples are more likely to be due to chance and are therefore not labeled significant. In scientific articles you can read about the significance of a result; it is usually reported as a p-value, but sometimes a statistically significant result is marked with an asterisk (*).
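A small sketch of how the same r can be significant or not depending on sample size, using the standard t-transformation of a correlation (an assumption for illustration; the book itself does not give this formula).

```python
# Sketch: p-value for a correlation r at sample size n, via t = r*sqrt((n-2)/(1-r^2)).
import math
from scipy import stats

def p_value_for_r(r, n):
    t = r * math.sqrt((n - 2) / (1 - r ** 2))
    return 2 * stats.t.sf(abs(t), df=n - 2)   # two-sided p

for n in (30, 200, 1000):
    print(f"r = .10 with n = {n:4d}: p = {p_value_for_r(0.10, n):.3f}")
# With n = 1000 even r = .10 falls below .05; with n = 30 it does not.
```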
Do outliers have an influence on an association?
Outliers are extreme scores that differ markedly from the other scores, and they can have a large effect on the correlation coefficient r: a single outlier can shift the correlation, for example from r = .26 to r = .37. Outliers can therefore cause problems for association claims. In bivariate correlations, outliers are especially problematic when they have extreme scores on both variables. When examining an association claim, you first have to ask whether there are outliers in the sample; you can find them by inspecting scatter plots. Outliers matter most in small samples. When a sample consists of 600 subjects who almost all score in the middle, one extreme scorer will not have much influence, but in a sample of 16 people who score in the middle, one extreme scorer can have a large influence.
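A minimal sketch showing how a single extreme case can change r in a small sample; the numbers are invented.

```python
# Sketch: one extreme case in a small sample can change r substantially.
import numpy as np

x = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9, 10], dtype=float)
y = np.array([3, 1, 4, 2, 5, 1, 4, 2, 5, 3], dtype=float)
print(f"without outlier: r = {np.corrcoef(x, y)[0, 1]:.2f}")

x_out = np.append(x, 30)   # one person with extreme scores on both variables
y_out = np.append(y, 30)
print(f"with outlier:    r = {np.corrcoef(x_out, y_out)[0, 1]:.2f}")  # jumps sharply
```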
Are there any range restrictions?
When a correlational study does not capture the full range of scores on one of the variables in the association, the correlation can appear smaller than it actually is. This is called restriction of range: not all the values that exist are represented. If researchers suspect restriction of range, they can apply a statistical technique called correction for restriction of range (the formula is beyond the scope of this book). Restriction of range can occur whenever, for whatever reason, there is little variance in one of the variables. If you look at the relationship between parents' income and a child's school achievement, you have to include the whole range of incomes: not only parents with a middle income, but also parents with low and high incomes.
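A sketch of restriction of range with simulated data; the income and achievement values, and the cut-off used, are hypothetical.

```python
# Sketch: restricting the range of one variable shrinks the observed correlation.
import numpy as np

rng = np.random.default_rng(7)
income = rng.normal(40_000, 15_000, 2_000)
achievement = 0.5 * (income - 40_000) / 15_000 + rng.normal(0, 1, 2_000)

r_full = np.corrcoef(income, achievement)[0, 1]

# Keep only middle incomes (a restricted range).
mask = (income > 35_000) & (income < 45_000)
r_restricted = np.corrcoef(income[mask], achievement[mask])[0, 1]

print(f"full range:       r = {r_full:.2f}")
print(f"restricted range: r = {r_restricted:.2f}")   # noticeably smaller
```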
Is the relationship curvilinear?
If a study reports that there is no relationship between two variables, the relationship may indeed be nil. In some cases, however, the relationship may be curvilinear, which means that it cannot be represented by a straight line. The relationship may be positive at first (a high score on the x-variable goes with a high score on the y-variable) but become negative at some point (a high x-score goes with a low y-score). An example is health care: as people get older, they need less health care (e.g., doctor visits) up to a certain point, but from a certain age (around 60) the need for health care increases again. There is thus a curvilinear relationship between age and health-care use.
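A minimal sketch of why r can miss a curvilinear relationship; the age and doctor-visit values are simulated, not real health-care data.

```python
# Sketch: a curvilinear (U-shaped) relationship can yield a Pearson r near zero,
# even though the two variables are clearly related.
import numpy as np

age = np.arange(0, 91)
doctor_visits = 0.01 * (age - 45) ** 2 + 2   # high when young and old, low in between

r = np.corrcoef(age, doctor_visits)[0, 1]
print(f"r = {r:.2f}")   # close to 0, because r only captures straight-line trends
```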
Can a causal inference be made about an association?
It is important to consider causality. Many laypeople equate correlation with causation. People with no background in psychology who read about the internet-dating and marriage-satisfaction study may wrongly advise their single friends to register on a dating site, as if that would guarantee a happy marriage; they are incorrectly attributing causality to a correlation. Always remember that correlation is not causation: an association by itself cannot establish a cause. Causality requires covariance of cause and effect, temporal precedence and internal validity. With a correlation between two variables, you do not always know which variable came first or whether one variable caused the other, and you also do not know whether a third variable influenced one or both variables. Only when all three conditions are met can we speak of a causal relationship, and an association claim can never satisfy all three. Causality can only be investigated with experiments. When a third variable creates a correlation between two variables, we speak of a spurious association.
Where can the association be generalized?
With external validity, you ask whether an association claim can be generalized to other people, times and places. A bivariate correlational study may not have used a random sample, but that does not mean you have to dismiss the association: you can accept the results and leave the question of generalization to follow-up research. Many associations do, in fact, generalize to the population.
When, in an association study, the relationship between the two variables changes depending on the level of another variable, we speak of a moderator. Moderators give us information about external validity: when an association is moderated by a third variable, the results may not generalize to other settings or groups of people.
Association claims can provide a lot of information. A well-known example of an association is that children who see a lot of violence on TV also act more aggressively. Yet that says nothing about causality. We are often not only interested in the correlation; we want to know what causes what. You really want to know whether children become aggressive by watching violent TV programs. The reason we want to know such things is, of course, that we want to design interventions: if children really become violent because of violent programs, then parents should make sure they no longer watch them. The best way to test causality is an experiment, but sometimes you can get a long way with other techniques. This chapter discusses techniques that go beyond simple correlations and approach causality.
The previous chapter discussed bivariate correlational research, which always involves just two measured variables. Longitudinal research and multiple-regression designs involve more than two measured variables and are therefore called multivariate designs. These designs do not solve the causality problem completely, but they are very useful, are often used, and offer a solution when experiments are not possible. The example of watching violent programs and aggressive behavior is an example of bivariate correlational research, and it does not meet the three criteria for causality. Covariance can be established, because research has shown that the correlation between watching violent programs and aggressive behavior is .35. However, this design cannot determine what came first: did children watch violent programs and then become aggressive, or were they aggressive first and then drawn to violent programs? Internal validity is also lacking, because the relationship between violent programs and aggressive behavior could be explained by a third variable. With bivariate designs, then, you cannot establish what came first or rule out that other variables influenced the relationship.
How can you determine temporal precedence with longitudinal designs?
Longitudinal designs can establish temporal precedence by measuring the same variables in the same people at different points in time. They are often used in developmental psychology to study changes in people's characteristics. In the 1960s and 1970s, Eron conducted research into watching violent programs and aggressiveness. He asked elementary-school children what their four favorite TV programs were, and he also asked every child in the class which classmates fought, hit and pushed the most. Ten years later he asked the same questions of the same children (who were by then teenagers). This research is longitudinal, because Eron measured the same variables in the same group of people ten years later. It is also an example of multivariate correlational research, because four variables were measured: violent-program watching at time 1, violent-program watching at time 2, aggression at time 1 and aggression at time 2.
How should you interpret the results of longitudinal designs?
A multivariate correlational design involves more than two variables and therefore yields multiple correlations: cross-sectional correlations, autocorrelations and cross-lag correlations. Cross-sectional correlations test whether two different variables measured at the same point in time are correlated; Eron's research, for example, showed that watching violent programs at a young age and aggression at a young age were correlated. Autocorrelations test whether the same variable correlates with itself across time points: does preference for violent programs in childhood correlate with preference for violent programs in the teenage years, and does aggressive behavior in childhood correlate with aggressive behavior in the teenage years? Eron found that preference for violent programs was not very stable over time, but that aggression at a young age did correlate with aggression in the teenage years. Researchers are most interested in cross-lag correlations: correlations between the earlier measurement of one variable and the later measurement of the other variable. In this study, the questions were whether watching violent programs at a young age was associated with aggression in the teenage years, and whether aggression at a young age was associated with watching violent programs in the teenage years. Cross-lag correlations show how people change over time and can establish temporal precedence. In Eron's research only one cross-lag correlation was significant: children who preferred violent programs at a young age were more aggressive in their teenage years, whereas children who were aggressive at a young age did not show a stronger preference for violent programs in their teenage years. These results suggest that the preference for violent programs (and not the aggression) came first.
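For illustration, a sketch of the three kinds of correlations in a two-wave design, computed on simulated data; the variable names only mimic Eron's measures, and the numbers are not his.

```python
# Sketch: cross-sectional, auto- and cross-lag correlations in a two-wave design.
import numpy as np
import pandas as pd

rng = np.random.default_rng(3)
n = 300
tv1 = rng.normal(0, 1, n)                            # violent-TV preference, time 1
agg1 = 0.3 * tv1 + rng.normal(0, 1, n)               # aggression, time 1
agg2 = 0.5 * agg1 + 0.4 * tv1 + rng.normal(0, 1, n)  # aggression, time 2
tv2 = 0.3 * tv1 + rng.normal(0, 1, n)                # violent-TV preference, time 2

corr = pd.DataFrame({"tv1": tv1, "agg1": agg1, "tv2": tv2, "agg2": agg2}).corr()

print("cross-sectional (tv1, agg1):", round(corr.loc["tv1", "agg1"], 2))
print("autocorrelation (agg1, agg2):", round(corr.loc["agg1", "agg2"], 2))
print("cross-lag (tv1 -> agg2):", round(corr.loc["tv1", "agg2"], 2))
print("cross-lag (agg1 -> tv2):", round(corr.loc["agg1", "tv2"], 2))
```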
What about the three criteria for causality in longitudinal studies?
Longitudinal research can provide evidence about the causality of a relationship. The correlations in the study show whether there is covariance, and because each variable is measured at two or more time points, longitudinal studies can also help establish temporal precedence: researchers can compare the different patterns of correlations and determine whether variable x or variable y came first. However, longitudinal studies cannot rule out third variables. If only two variables are measured at two time points, you cannot exclude the possibility that a third variable affected the relationship. Nevertheless, researchers can set up a longitudinal study in a way that rules out some third variables. In the Eron study, boys and girls were examined separately, so that the possible third variable gender could not affect the results.
Some people wonder why researchers in longitudinal studies put so much effort into locating the same subjects ten years later instead of simply running an experiment. The reason is that people often cannot be assigned to conditions: you cannot tell a person what his or her favorite television program must be. Some variables are simply difficult to manipulate, and in some cases it would be unethical to assign people to a particular group. It would have been unethical if Eron had assigned children to a group that was required to watch violent programs on TV.
How can multiple-regression designs exclude third variables?
One study found that having long breaks (more than 15 minutes) was associated with better behavior in children. But what is the causal link? What came first? Do children behave better because they have had a longer break, or are well-behaved children rewarded with a longer break? You also have to ask whether there are third variables: there could be one or more variables that influence the relationship between the length of a break and good behavior. With multiple-regression analyses you can rule out some third variables. Barros and her colleagues asked teachers from different schools to indicate how long their pupils' breaks were, and also to fill in a questionnaire about the (problematic) behavior of the pupils. In addition, the researchers recorded how many children were in each class, the parents' income and whether the school was a regular or private school. This made the research a multivariate correlational study.
With multivariate designs, researchers can see whether a relationship between two variables persists when a third variable is kept constant. You could divide such a third variable into different subgroups. Imagine you take income from parents as a third variable. You could divide this into low income, middle income and high income. You can then check whether the relationship between behavioral problems and duration of breaks continues in each of these subgroups.
Which statistical measures are used in multiple-regression designs?
These designs involve three or more variables. First, the researcher decides which variable is of most interest; this is called the dependent variable or criterion variable. In the research on breaks and problem behavior, the researchers were most interested in problem behavior. The other variables are called independent variables or predictor variables. When you run a regression in SPSS, you get a regression table, in which you have to look at the beta values. Beta shows the direction and strength of the relationship between a predictor and the criterion variable, while holding the other predictor variables constant (this is essential). Beta resembles r, but it adds this extra element of control. A negative beta indicates a negative relationship and a positive beta a positive relationship, and a larger absolute value means a stronger relationship. Betas are standardized: the units of the different predictor variables (e.g., euros, minutes, centimeters) are all converted to a common scale. A beta value can change when other predictor variables are added to the model. Next to the beta values there is usually a column with the significance (p-value) of each beta. When p is .05 or higher, the beta is not significant, which means that the association between that predictor and the criterion variable may have arisen by chance and we cannot rule out that it does not exist in the population.
What if you include several variables that could influence the relationship between the criterion and a predictor variable? The same rules for beta apply: the beta of a variable describes the relationship between that predictor and the criterion variable, controlling for the other predictors included in the model. It is useful to add multiple predictor variables, because you can then state with greater confidence (or not) that a relationship is not driven by a third variable. Another advantage is that the sizes of the beta values show which factors have a stronger influence on the dependent variable. Look carefully at beta, and do not confuse it with the unstandardized b, a value that also often appears in a regression table but is expressed in the original units. You cannot compare bs across predictors, because you cannot compare euros with centimeters or minutes.
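A hedged sketch of obtaining standardized betas by z-scoring all variables before running an ordinary least-squares regression; the variables and data are hypothetical stand-ins for the recess study, and z-scoring first is just one common way to get standardized coefficients.

```python
# Sketch: standardized betas in a multiple regression with statsmodels.
import numpy as np
import pandas as pd
import statsmodels.api as sm

rng = np.random.default_rng(5)
n = 400
recess_minutes = rng.normal(20, 8, n)
class_size = rng.normal(25, 5, n)
problem_behavior = 5 - 0.10 * recess_minutes + 0.15 * class_size + rng.normal(0, 1, n)

df = pd.DataFrame({"recess_minutes": recess_minutes,
                   "class_size": class_size,
                   "problem_behavior": problem_behavior})

# z-scoring every variable first makes the coefficients standardized betas,
# so they can be compared across predictors measured in different units.
z = (df - df.mean()) / df.std(ddof=0)
X = sm.add_constant(z[["recess_minutes", "class_size"]])
model = sm.OLS(z["problem_behavior"], X).fit()
print(model.params.round(2))   # betas, each controlling for the other predictor
print(model.pvalues.round(3))  # p-value for each beta
```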
Popular magazines and newspapers often report the results of such studies without using terms like 'beta', 'p' or 'significance'. Yet certain phrases reveal that multiple regression was used, such as 'controlling for other variables', 'taking other variables into account' and 'corrected for other variables'.
Can regression determine causality?
Even if you add twenty variables as potential third variables, that does not mean you have met all the conditions for causality. Multiple-regression designs can rule out certain third variables, but they cannot establish temporal precedence, and they cannot control for third variables that were not included in the study. Researchers may simply not be aware of a variable that influences the relationship between the criterion and a predictor; such a variable is then not included, and the conclusions drawn from the results will be distorted. The problem of potential third variables can really only be solved by conducting experiments: by randomly assigning test subjects to conditions, you eliminate third variables. Only experiments can determine causality.
What do pattern and parsimony mean for causality?
Longitudinal studies establish temporal precedence, and multiple-regression analyses rule out some third variables. In correlational research there is one more tool that brings you closer to causality: pattern and parsimony. Parsimony is the extent to which a good scientific theory offers the simplest explanation for a phenomenon. For causal claims, parsimony refers to the simplest explanation for a pattern of data: the best explanation is the one that requires as few exceptions or qualifications as possible.
A well-known phenomenon will serve as an example. Decades ago, people noticed that smokers developed lung cancer more often than non-smokers. Cigarette manufacturers obviously did not want their sales to fall, and they claimed that other factors explained the correlation between smoking and lung cancer. Multiple-regression analyses might rule out certain third variables, but you cannot rule out variables you did not include in the research, and experiments are not possible because it is unethical to assign subjects to a smoking condition. The only data researchers had were correlational. With those correlational data, they had to propose a simple mechanism. The most parsimonious explanation was that cigarette smoke contains chemicals that are toxic when they come into contact with human tissue: the more contact a person has with these chemicals, the more he or she is exposed to the toxins. On that basis, predictions could be made (for example, that quitting smoking reduces the risk of cancer) and tested, and the strength of the overall pattern could be examined. Scientists often combine methods and results in this way to develop and test causal theories. Journalists should therefore not report only a single study, but also state what previous research has found and describe the context properly.
What about mediation in multivariate regression?
Once a relationship has been established, scientists want to go further: they want to know why it occurs. Explanations for causal relationships often involve a mediator. If variable x affects variable y directly, but can also work through variable z and thus influence y indirectly, variable z is called a mediator. A study does not have to be correlational to contain a mediator; mediators can also be examined in experimental research. Mediators can often be analyzed well with multivariate methods. Mediators resemble third variables in that both can be examined with multiple regression, yet they differ: a third variable is external to the two variables in the original bivariate relationship and is usually seen as a nuisance variable, whereas a mediator is internal to the causal relationship and is seen by researchers as an interesting variable in its own right. Do not confuse mediators with moderators (discussed in chapter 8).
What about the four validities in multivariate designs?
Multiple-regression analyses help with the third-variable problem and longitudinal research establishes temporal precedence, so multivariate designs can provide some evidence for internal validity. For multivariate designs it is also important to examine construct validity by asking how well each variable was measured. To examine external validity, you can look at the test subjects: were they chosen randomly, were people from different layers of the population included, or did the researchers, for example, only study people with a low income? Finally, statistical validity can be examined by looking at the statistical information the researchers report: what are the effect sizes and significance levels, and are there outliers or curvilinear relationships?
What are the variables in an experiment?
In psychology, an experiment means that a researcher manipulates at least one variable and measures another variable. Experiments can take place in a laboratory or elsewhere. A manipulated variable is a variable the researcher controls, for example by assigning a subject to a particular condition (level) of that variable. Measured variables are recorded measurements of behavior or ideas, such as self-reports, behavioral observations or physiological measurements. During the experiment, the researchers record what happens. In an experiment, the manipulated variable is the independent variable and the measured variable is the dependent variable: how a subject scores on the measured variable depends on the level of the independent variable. Researchers have less control over the dependent variable than over the independent variable; they manipulate the independent variable and see what happens to the dependent variable. When the values are plotted in a graph, the independent variable almost always goes on the x-axis and the dependent variable on the y-axis. When researchers manipulate an independent variable, they have to make sure that only one thing varies at a time. In addition to the independent variable, researchers must also control potential third variables by keeping other factors constant across the levels of the independent variable. Each variable that a researcher intentionally keeps constant is called a control variable. Strictly speaking, control variables are not variables, because they do not vary; their levels are kept constant. Nevertheless, control variables are essential in experiments: they allow researchers to distinguish one cause from a potential other cause and thus eliminate alternative explanations of the results. Control variables are therefore important for internal validity.
Why do experiments support causal claims?
Researchers can support causal claims by means of experiments. The three criteria for causality were discussed earlier, and experiments can satisfy all three.
What about covariance and temporal precedence in experiments?
Experiments include comparison groups, which makes them better sources of information than personal experience, because you cannot really compare your own experience with that of another group. Experiments manipulate an independent variable, and each independent variable has at least two levels, so true experiments are designed to detect covariance. An independent variable can show covariance in different ways. A control group is a level of the independent variable that represents 'no treatment' or a neutral condition. When a study has a control group, the other level or levels are called the treatment group(s). Experiments also establish temporal precedence: researchers first manipulate the independent variable and then measure the dependent variable, so the cause comes before the effect. This too makes experiments superior to correlational designs.
What about the internal validity of experiments?
Internal validity is important for causal claims. A study has good internal validity if it ensures that the causal variable, and not other factors, is responsible for the change in the outcome variable. Alternative explanations are called confounds and they pose a threat to internal validity. There are different confounds that threaten internal validity.
A design confound is a researcher's mistake in designing the independent variable: a second variable that varies systematically along with the independent variable the researcher is interested in. It can therefore serve as an alternative explanation for the results, and that is not good. With a design confound, an experiment has poor internal validity and cannot support causal claims. However, you have to be careful with saying that a study has a design confound; not every potentially problematic variable is a confound. A design confound is only problematic when the second variable varies systematically with the independent variable. Suppose there are two conditions (one with a red room and one with a green room) and that in both conditions subjects have to solve anagrams. If the people in one condition happened to be very good at solving anagrams and the people in the other condition very bad at it, that is a confound: it cannot then be said that people are better at solving anagrams in a green environment than when they are surrounded by red things.
A selection effect occurs in an experiment when the subjects at one level of the independent variable are systematically different from the subjects at another level. Selection effects can occur when researchers let the subjects choose which group they want to be in. They can also occur when the researcher assigns one type of person (for example, women) to one condition and another type of person (men) to a different condition.
Good experiments often use random assignment to avoid selection effects. In some studies, a scientist can roll a die to determine which condition each subject ends up in, so everyone has an equal chance of being placed in a given condition. Subjects differ in their motivation, intelligence and other things, and random assignment ensures that these differences are distributed more evenly, so the experimental groups become almost the same.
In practice, random assignment does not always work perfectly, especially with small groups. Researchers can therefore decide to use matched groups. To make matched groups, the scientists must first measure a variable that could matter for the dependent variable, for example IQ. If there are four groups, you first take the four subjects with the highest IQs; within that matched set, each subject is randomly assigned to one of the four groups. Then you take the next four subjects with the highest IQs, and so on. Matching ensures that everyone is still randomly assigned while the groups are equal on a particular variable. The disadvantage of matched groups is that an extra step is needed, in this case an IQ test.
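The matching procedure described above is easy to express in code. The sketch below uses made-up subjects and IQ scores: it sorts subjects on the matching variable, forms sets of four, and randomly assigns each member of a set to one of four conditions.

```python
# Minimal sketch of matched-group random assignment (hypothetical data).
import random

random.seed(42)
subjects = [{"id": i, "iq": random.randint(85, 130)} for i in range(16)]
conditions = ["A", "B", "C", "D"]

assignment = {}
ranked = sorted(subjects, key=lambda s: s["iq"], reverse=True)
for start in range(0, len(ranked), 4):         # consecutive matched sets of 4
    matched_set = ranked[start:start + 4]
    shuffled = random.sample(conditions, k=4)  # random order of the 4 conditions
    for subject, condition in zip(matched_set, shuffled):
        assignment[subject["id"]] = condition  # random assignment within the set

print(assignment)
```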
What are independent-groups designs?
Experiments can take many forms. In an independent-groups design, different groups of subjects are exposed to different levels of the independent variable; this is also called a between-groups or between-subjects design. In a within-groups design (also called a within-subjects design) there is only one group of subjects, and each person is exposed to every level of the independent variable. Two forms of the independent-groups design are the posttest-only design and the pretest/posttest design. In the posttest-only design, subjects are randomly assigned to the levels of the independent variable and are tested once on the dependent variable. The posttest-only design can meet all three criteria of causality. In a pretest/posttest design, subjects are randomly divided into groups and tested twice on the dependent variable: once before exposure to the independent variable and once after. Researchers can use a pretest/posttest design if they want to evaluate whether random assignment has made the groups equal. This is mainly done when groups are small and researchers want to be sure that there is no selection effect. A pretest/posttest design can also show how subjects in the experimental condition have changed over time. Such a design is useful, but it cannot always be done; the posttest-only design is already a good way to do research.
What are within-groups designs?
There are two types of within-groups designs. In a concurrent-measures design, subjects are exposed to all levels of an independent variable at roughly the same time, and a single preference or behavioral response is the dependent variable. In one study, scientists examined whether babies prefer male or female faces: they showed babies pictures of men's and women's faces at the same time and measured which face the babies looked at longest. The independent variable was the sex of the face, and the babies were exposed to both levels of the independent variable at the same time; the babies' preference was the dependent variable. In a repeated-measures design, subjects are measured on the dependent variable more than once, that is, after exposure to each level of the independent variable.
The advantage of a within-groups design is that the subjects in the different conditions are guaranteed to be equivalent, because they are the same subjects: each person can be compared with himself or herself and is his or her own control. Such a design also gives researchers more power to detect an effect of the manipulation. Because all other differences (including sex, living conditions, personality and motivation) are held constant, it is more likely that researchers will find an effect of the independent variable, if there is one. Power refers to the probability that a study will show a statistically significant result when the independent variable really has an effect in the population. A within-groups design is also efficient because fewer subjects are needed.
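The power advantage of a within-groups design can be illustrated with a rough simulation. The numbers below (sample size, effect size, amount of individual difference) are invented, and scipy's independent-samples and paired t-tests stand in for the analysis; the point is only that the same small effect is detected far more often when each person serves as his or her own control.

```python
# Rough simulation: within-groups designs have more power because large
# individual differences are subtracted out. All numbers are illustrative.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
n, effect, n_sims = 20, 0.4, 2000
hits_between = hits_within = 0

for _ in range(n_sims):
    baseline = rng.normal(0, 2.0, size=n)               # large individual differences
    control = baseline + rng.normal(0, 0.5, size=n)     # same people, no treatment
    treatment = baseline + effect + rng.normal(0, 0.5, size=n)

    # Between-subjects: compare the treated group with an independent group.
    other = rng.normal(0, 2.0, size=n) + rng.normal(0, 0.5, size=n)
    if stats.ttest_ind(treatment, other).pvalue < 0.05:
        hits_between += 1
    # Within-subjects: the same people serve in both conditions.
    if stats.ttest_rel(treatment, control).pvalue < 0.05:
        hits_within += 1

print(f"power between-subjects ~ {hits_between / n_sims:.2f}")
print(f"power within-subjects  ~ {hits_within / n_sims:.2f}")
```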
What about the three criteria of causality in within-groups designs?
In within-groups designs, internal validity can sometimes be poor. Being exposed to one condition can change how subjects respond to the other conditions; such responses are called order effects. Order effects occur when exposure to one level of the independent variable influences responses to the next level. Order effects are confounds. They include practice effects (also called fatigue effects): a long sequence can lead to someone getting better at a task, or getting tired of it or bored toward the end. Order effects can also include carryover effects, in which a form of contamination carries over from one condition to the next: after brushing your teeth, a drink tastes different than you are used to.
To prevent order effects, researchers can apply counterbalancing. This means that researchers present the levels of the independent variable to subjects in different orders. When researchers use counterbalancing, they divide the subjects into groups and each group gets one of the orders; random assignment determines which group gets which order. An experiment can be counterbalanced fully or partially. When a within-groups experiment has only two or three levels of an independent variable, researchers can apply full counterbalancing, in which all possible orders are used. When there are three conditions (1, 2 and 3), each group of subjects is randomly assigned to one of the following six orders:
1-2-3, 1-3-2, 2-1-3, 2-3-1, 3-1-2, 3-2-1
As the number of conditions increases, the number of possible orders increases drastically, and when researchers want several people in each order they need a lot of subjects. It is therefore not always practical to counterbalance fully. In partial counterbalancing, only some of the possible orders are used; the selected orders can then be assigned to subjects at random (using a computer).
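Full and partial counterbalancing are easy to generate with code. The sketch below lists all 3! = 6 orders for three conditions, then draws a random subset of orders and randomly assigns (hypothetical) subjects to them.

```python
# Sketch of full vs. partial counterbalancing of condition orders.
# With k conditions there are k! possible orders, so full counterbalancing
# quickly becomes impractical.
import itertools
import random

conditions = [1, 2, 3]
full = list(itertools.permutations(conditions))   # all 3! = 6 possible orders
print(full)

random.seed(0)
partial = random.sample(full, k=3)                # partial: a random subset of orders
print(partial)

subjects = ["s1", "s2", "s3", "s4", "s5", "s6"]
orders = {s: random.choice(partial) for s in subjects}  # random assignment to orders
print(orders)
```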
Within-groups designs can establish covariance, they provide temporal precedence, and if order effects are controlled, their internal validity is also in good shape. Still, researchers sometimes do not opt for within-groups designs. One reason is the order effects themselves. Another disadvantage is that such designs are not always practical. A third problem occurs when people see all levels of the independent variable and change their behavior as a result (because they know, or think they know, what the study is about).
What do the four validities say about causal claims?
In an experiment, two constructs are operationalized: the independent variable and the dependent variable. Construct validity concerns how well these variables are manipulated and measured. When you look at the construct validity of an experiment, you have to examine both the dependent and the independent variable; Chapters 5 and 6 discuss how this can be done. Sometimes researchers also use a manipulation check to see whether the construct validity of their independent variable is good. Pilot studies can also be used to see whether the manipulations are effective. Pilot studies are studies done with a few subjects before the real study is carried out. Researchers can further show that the results support their theory by collecting more data.
If you want to investigate the external validity of a causal claim, you should look at how the subjects were included in the sample. If random sampling was used, external validity is in good shape. Often, however, external validity is not a top priority for researchers who carry out experiments: internal validity is more important, and if both types of validity cannot be guaranteed, researchers usually sacrifice external validity in favor of internal validity.
For the statistical validity of experiments, one must look at the effect size d. This number shows how much two groups differ on the dependent variable. It reflects the distance between the group means and how much the score distributions overlap, so it takes into account both the difference between groups and the spread of scores within groups. A larger d corresponds to a larger effect (and to a larger r).
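One common way to compute d (Cohen's d) for two independent groups is the difference between the group means divided by the pooled standard deviation. The sketch below uses made-up scores for a treatment and a control group.

```python
# Sketch of the standardized effect size d for two independent groups.
import numpy as np

def cohens_d(group1, group2):
    g1, g2 = np.asarray(group1, float), np.asarray(group2, float)
    n1, n2 = len(g1), len(g2)
    s1, s2 = g1.var(ddof=1), g2.var(ddof=1)
    pooled_sd = np.sqrt(((n1 - 1) * s1 + (n2 - 1) * s2) / (n1 + n2 - 2))
    return (g1.mean() - g2.mean()) / pooled_sd

treatment = [5, 6, 7, 6, 8, 7, 6]   # made-up scores
control = [4, 5, 5, 6, 4, 5, 6]
print(f"d = {cohens_d(treatment, control):.2f}")  # larger |d| = bigger effect, less overlap
```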
Internal validity is most important for causal claims. If the internal validity of an experiment is good, you can be pretty sure that your causal claim is accurate.
When examining an experiment, internal validity is most important. The biggest threats to internal validity are design confounds, selection effects and order effects. These were discussed in Chapter 10, but they are unfortunately not the only things that can threaten an experiment.
What are threats to internal validity?
There are multiple threats to internal validity; in this chapter a total of twelve are discussed. The first six are: maturation, history, regression, attrition, testing and instrumentation threats.
A maturation threat is a change in behavior that emerges spontaneously over time: people adapt to their environment, get better at performing certain actions, and children learn to speak better. This happens 'by itself' and not because of an intervention. To rule out a maturation threat, a control group must be used in the study. Sometimes changes occur because something specific happened between the pretest and the posttest; this is called a history threat. A historical event does not have to be something big, like a war; it can also be something small, such as the changing of the seasons. To count as a history threat, the event must affect everyone or almost everyone in a group. History threats can be prevented through the use of control groups. Regression threats refer to regression to the mean: when a behavior is extreme at time 1, it will probably be less extreme at time 2, because extreme scores are usually partly explained by favorable or unfavorable random events. Regression threats only occur in a pretest/posttest design, and only when a group is selected because it scores extremely high or extremely low on the pretest. These threats can also be countered with control groups.
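Regression to the mean can be made concrete with a small simulation: scores are true ability plus random luck, and a group selected for extreme scores at time 1 scores closer to the overall mean at time 2 even though nothing changed. All numbers below are illustrative.

```python
# Simulation of regression to the mean with invented numbers.
import numpy as np

rng = np.random.default_rng(3)
n = 10_000
true_score = rng.normal(100, 10, size=n)
time1 = true_score + rng.normal(0, 10, size=n)   # luck at time 1
time2 = true_score + rng.normal(0, 10, size=n)   # independent luck at time 2

extreme = time1 > np.percentile(time1, 95)       # select the top 5% at time 1
print(f"time 1 mean of extreme group: {time1[extreme].mean():.1f}")
print(f"time 2 mean of same group:    {time2[extreme].mean():.1f}")  # closer to 100
```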
Attrition is a loss of subjects that takes place before the study has finished, for example between the pretest and the posttest. Attrition is a problem when it is systematic, that is, when a particular type of subject stops participating; if those subjects are missing, the results may be distorted. Researchers can remove the data of subjects who drop out from the analysis. A testing threat refers to a change in subjects as a result of taking a test more than once: people can get better at the test or become bored with it. Researchers can prevent this by using alternative forms of the test for the two measurements. An instrumentation threat occurs when a measurement instrument changes over time; observers, for example, should not change their standards between two measurements.
What are the three major threats to internal validity in experiments?
Even if you add control groups, there may still be threats to the internal validity of your experiment. Three of these threats are observer bias, demand characteristics and placebo effects. Observer bias occurs when the expectations of the researcher influence his or her interpretation of the results; besides threatening internal validity, observer bias can also threaten construct validity. Demand characteristics are a problem when subjects think they know what a study is about and change their behavior accordingly. To prevent observer bias and demand characteristics, it is useful to conduct double-blind studies, in which neither the subjects nor the researchers who evaluate them know which group the subjects are in. When a double-blind study is not possible, researchers can use a masked design (Chapter 6): the subjects know which group they are in, but the researchers who observe them do not. A third threat is the placebo effect. This effect occurs when subjects receive a treatment and improve because they think they have received a real treatment (for example, a real medicine instead of a sugar pill). Placebo effects are not imagined: research has shown that they can be both psychological and physical. Nor are placebo effects only positive; you often hear of people who are less depressed because they think they have received a pill that made them better, but placebo effects can also cause unpleasant side effects, such as skin rashes and headaches. To rule out placebo effects, it is useful to perform double-blind placebo control studies, in which neither the subjects nor the researchers who administer the pill know which group a subject is in.
What happens when a researcher finds a zero effect (a null effect)? This means that the independent variable did not influence the dependent variable: there appears to be no significant covariance between the two. Most people will not often read about zero effects, because it is more interesting to present results in which the independent variable did influence the dependent variable. Nevertheless, zero effects are fairly common; especially when you start doing research yourself as a student, you will probably encounter them. Zero effects can occur because the independent variable really does not affect the dependent variable, but they can also occur because the study was not set up or carried out accurately. The independent variable may influence the dependent variable, but because of some obscuring factor the researchers could not detect the true difference. These obscuring factors can take two forms: there was not enough difference between the groups (too little between-groups variability) or there was too much variability within the groups.
What happens if there is not enough difference between groups?
Weak manipulations, insensitive measures and design confounds that work in the opposite direction can all result in too little difference between groups. When a zero effect emerges, a researcher must carefully examine how the independent variable was operationalized; the construct validity must be examined to check for weak manipulations, and perhaps other manipulation groups should have been created. Sometimes zero effects are found because the researchers did not operationalize the dependent variable with enough sensitivity. When all groups score very high on the dependent variable, this is called a ceiling effect; when all groups score very low, this is called a floor effect. Suppose three different groups of subjects all receive the same test and the test is so difficult that almost no one can get a good score. All subjects would have a very low score, not because of the conditions of the independent variable but because the test was far too difficult: a floor effect has occurred. Manipulation checks can help detect weak manipulations (and thus also ceiling and floor effects).
What happens if there is a lot of variability within groups?
Zero effects can also be found if there is too much variability within a group; this is called noise or error variance. Because of the large variability within groups, a real difference between groups may not be detected: if there is a lot of variability in group A, scores of subjects in group A can overlap with those of subjects in group B. This creates a statistical validity problem: the more the groups overlap, the smaller the effect size and the less likely it is that the difference between the group means will be statistically significant.
One source of large within-group variability is measurement error: any factor that inflates or deflates a person's true score on the dependent variable. A man who is 1.80 meters tall can be measured as 1.79 meters because he was not standing fully upright. All measurements of dependent variables contain some measurement error, but researchers try to keep it as low as possible. The more sources of random error there are in the measurement of a dependent variable, the more variability there will be within a group of subjects. Measurement error can be reduced by using reliable and precise measurements and techniques. When it is difficult to find a precise instrument, it helps to take more measurements and average them, or to include more subjects in the study: the more measurements or subjects there are, the more the random errors cancel each other out.
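The claim that random errors cancel out with more measurements can be illustrated with a short simulation: the typical error of an average of n noisy measurements shrinks roughly with the square root of n. The true value and the size of the measurement error below are made up.

```python
# Sketch: random measurement error averages out as n grows.
import numpy as np

rng = np.random.default_rng(7)
true_height = 180.0   # cm (hypothetical true value)
for n in (1, 4, 16, 64):
    # Repeat the "study" 5000 times and see how far the mean of n noisy
    # measurements typically lies from the true value.
    means = true_height + rng.normal(0, 2.0, size=(5000, n)).mean(axis=1)
    print(f"n = {n:2d}: typical error of the mean ~ {means.std():.2f} cm")
```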
Individual differences can also create variability within groups. One way to take these differences into account is to use within-groups designs, in which each subject participates in all conditions of the independent variable. Such a design ensures that each person can be compared with himself or herself, so that individual differences are accounted for; in addition, fewer subjects are needed for within-groups designs (Chapter 10). A similar result can be obtained with matched groups: with two conditions, the researcher matches similar people across conditions and compares their scores on the dependent variable. If within-groups designs or matched groups cannot be used, researchers will have to recruit more subjects.
A third factor that can cause within-group variability is situation noise: all kinds of external distractions, that is, any factor that creates variability within a group and hides real differences. Researchers try to reduce situation noise by running their experiments in a quiet setting, somewhere without passing cars, without strong pleasant or unpleasant smells and without noisy people walking by. Researchers often go to great lengths to reduce distractions that could affect the dependent variable. When researchers use a within-groups design, strong manipulations and good control of the experimental situation, they increase the power (Chapter 10), and a study with a lot of power is better able to detect true patterns.
If you find a zero effect, you have to check whether the manipulations were strong enough, whether the variables were well operationalized, whether there was too much measurement error, whether there were enough subjects and whether you had good control over the situation. If all these things are in order and the power is good and you still find a zero effect, then you can conclude that the independent variable really does not affect the dependent variable.
What are interactions in research?
Researchers can be interested in more than one independent variable from the outset, or they can design a follow-up study in which an extra independent variable is examined. When researchers ask about the effect of an extra independent variable, they are usually interested in an interaction effect: whether the effect of the original independent variable depends on the level of the other independent variable. An example is hands-free calling and reaction time while driving. Research had already shown that phoning while driving makes people respond less well to obstacles on the road; in that study there was only one independent variable (telephone use). Researchers then wanted to know whether this effect depends on age, in other words whether hands-free calling affects younger drivers' reaction times more (or less) than older drivers' reaction times; age thus became the second independent variable. An interaction effect can be described mathematically as a difference of differences.
People's thoughts, behavior, emotions and motivation are very complicated, so it is not surprising that research often involves interactions. There are different types of interactions. Suppose you are asked whether you like hot or cold food more; you will probably answer that it depends on the food itself. You want your soup warm and your ice cream cold. The food you have to judge is one independent variable, and the temperature of that food is another. If you put this in a figure, you would see an interaction effect: the two lines will intersect. This is called a crossover interaction. If the lines for the two independent variables do not run parallel but also do not cross, we speak of a spreading interaction. When there is an interaction, you can describe it accurately in both directions; it does not matter which independent variable you put on the x-axis.
Which design can be used to test two variables?
Researchers use factorial designs to test interactions. A factorial design is a design with two or more independent variables (called factors). Usually the independent variables are crossed, which means that researchers test every possible combination of their levels. In the example of mobile phone use, age and reaction time while driving, there are two factors: age and telephone use. When these two independent variables are crossed, four conditions arise: older people who drive and make a phone call, older people who drive and do not make a phone call, younger people who drive and make a phone call, and younger people who drive and do not make a phone call. There are two independent variables and each has two levels (young vs. old and calling vs. not calling), so this design is also called a 2 x 2 design. Factorial designs can be used to test manipulated variables (using a telephone or not) as well as participant variables (age).
Can factorial designs be used to test limits and theories?
Factorial designs are used to test whether an independent variable affects different kinds of people, or people in different situations, in the same way. The research on telephone use, age and reaction time was carried out with a factorial design, and no interaction between the independent variables was found: the effect of telephone use on reaction time was the same for young and old drivers. Testing limits in this way resembles testing external validity. When an independent variable is tested in more than one group, researchers are in effect testing whether the effect generalizes; in the example of reaction time and telephone use, both groups reacted the same, so the effect generalizes to drivers of all ages. There are of course also studies in which groups react differently to an independent variable. When factorial designs are used to test limits, you are in effect also searching for moderators. A moderator (Chapter 8) is a variable that influences the relationship between an independent variable and a dependent variable; a moderator results in an interaction, in which the effect of one independent variable depends on the level of the other independent variable. Factorial designs are used not only to test the generalizability of a variable, but also to test theories.
How do you interpret factorial results?
In an analysis with two independent variables, you can inspect three things: two main effects and an interaction effect. The effect of each independent variable on its own must be examined; these are the main effects. A marginal mean is the mean for one level of an independent variable, averaged over the levels of the other independent variable. Researchers look at marginal means to examine the main effects and use statistics to investigate whether the difference between marginal means is statistically significant. Do not be misled by the term 'main effect': a main effect is not necessarily the most important effect (if there is an interaction, that is the most important effect); the name 'overall effect' would actually be better. Main effects are differences, and an interaction effect is a difference of differences. Interactions can often be seen in a figure (though sometimes with difficulty), but you can also look at a table: if the difference between the levels of one independent variable is itself different across the levels of the other independent variable, there could be an interaction, and statistics can tell you whether this difference is significant. Interactions are easier to detect in figures: when the lines in a graph run parallel, there is probably no interaction, and if they do not run parallel, there probably is one; of course you have to confirm this with statistics. You can also detect interactions in a bar chart: when you draw lines connecting the bars of the same level (for example from orange to orange and from pink to pink) and these lines are not parallel, you could suspect an interaction. When both main effects and an interaction effect are found, the interaction effect is the more important one.
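A small numerical sketch may help. The cell means below are invented reaction times for the 2 x 2 driving example; the marginal means correspond to the main effects, and the interaction is the difference of the differences between the cells.

```python
# Sketch with made-up cell means for a 2 x 2 design (age x phone use).
import numpy as np

# rows: young, old; columns: no phone, phone (hypothetical reaction times, ms)
cell_means = np.array([[450.0, 520.0],
                       [510.0, 580.0]])

marginal_age = cell_means.mean(axis=1)    # main effect of age
marginal_phone = cell_means.mean(axis=0)  # main effect of phone use

diff_young = cell_means[0, 1] - cell_means[0, 0]  # phone effect for young drivers
diff_old = cell_means[1, 1] - cell_means[1, 0]    # phone effect for old drivers
interaction = diff_old - diff_young               # difference of differences

print(marginal_age, marginal_phone, interaction)  # interaction = 0 -> parallel lines
```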
Which factorial variations are there?
The previous section discussed a 2 x 2 design. Of course, researchers can also choose an independent variable with more than two levels, or use three or more independent variables. In an independent-groups factorial design (between-subjects factorial), both independent variables are manipulated between groups; in a 2 x 2 factorial design there are then four different groups of subjects in the experiment. In a within-groups factorial design (repeated-measures factorial), both independent variables are manipulated within groups; with a 2 x 2 factorial there is one group of subjects and all of them participate in all four cells of the design. In a mixed factorial design, one independent variable is manipulated between groups and the other is manipulated within groups.
What happens if the number of levels or the number of independent variables increases?
When one independent variable has three levels and the other has two, we speak of a 2 x 3 design, which has 2 x 3 = 6 cells. Many other combinations are possible. When independent variables have more than two levels, researchers can still examine main effects and interaction effects by calculating the marginal means and checking whether they differ; the easiest way remains to create a line chart (for example in SPSS) and see whether the lines are parallel, and of course the effects must also be tested for significance. When researchers add a third independent variable and all independent variables have two levels, we speak of a 2 x 2 x 2 factorial design, or a three-way design, which has 2 x 2 x 2 = 8 cells or conditions. The best way to display such a design is to show the table of the original 2 x 2 design twice, once for each level of the third independent variable; in a graph, you make two line charts side by side. A three-way design can yield three main effects, three two-way interactions and one three-way interaction. A three-way interaction means that the two-way interaction between two of the independent variables depends on the level of the third independent variable.
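Crossing factors multiplies the number of cells, which is easy to see by enumerating them. In the sketch below, the third factor (traffic density) is invented purely for illustration.

```python
# Sketch: crossing three two-level independent variables (a 2 x 2 x 2
# factorial) yields 2 * 2 * 2 = 8 cells. The third factor is hypothetical.
import itertools

age = ["young", "old"]
phone = ["no phone", "phone"]
traffic = ["light traffic", "heavy traffic"]

cells = list(itertools.product(age, phone, traffic))
print(len(cells))   # 8 conditions
for cell in cells:
    print(cell)
```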
How can you tell from an article that a factorial design was used?
Empirical articles almost always describe which design was used, often with terms such as 2 x 2 or 2 x 3; these numbers show how many independent variables there are and how many levels each variable has. Empirical articles also frequently use the terms 'main effect' and 'interaction'. Popular articles in magazines or newspapers often do not mention which design was used, yet there are clues. You can look for phrasing such as 'it depends on ...', which shows that a certain effect depends on the level of another variable. You can also recognize factorial designs when participant variables are included in the study.
A quasi-experiment differs from a true experiment in terms of control: in a quasi-experiment, researchers do not have full control over the conditions, and subjects are not randomly assigned to the conditions. Below is an example of a quasi-experiment.
Plastic surgery is performed virtually everywhere in the world. People who undergo such procedures say that their self-confidence and body image improve afterwards, but is that really true? One way to find out would be to randomly assign some people to a plastic-surgery condition and others not; that is of course not ethical, because you cannot tell subjects that they have to undergo plastic surgery for the sake of a study. Nevertheless, researchers found a way to test the effects of plastic surgery: they recruited people who were already about to undergo plastic surgery. These people were tested on their self-confidence before the procedure and again 3, 6 and 12 months afterwards. The comparison group consisted of people who were registered at the same plastic-surgery clinic but had not yet had any procedure; they answered the questions at the same points in time as the first group. This study resembled an experiment, but it was a quasi-experiment because subjects were not randomly assigned to a condition.
What about the internal validity of quasi-experiments?
The support that a quasi-experiment can offer for causal claims depends on its design and its results. A selection effect threatens internal validity if the groups at the different levels of the independent variable contain different kinds of subjects; in that case you cannot state with certainty that the independent variable caused the change in the dependent variable. For example, the subjects who underwent plastic surgery may have differed from those who did not; indeed, the study showed that the subjects who underwent plastic surgery were richer than those who had not yet undergone it. However, this study had a pretest/posttest design, which reduced the impact of selection effects. Matched groups can also be used to make the two groups of subjects comparable. Some researchers use a wait-list design, in which all subjects eventually receive the treatment, but at different times.
Several problems can occur in quasi-experiments, including problems with the design. A design confound occurs when a third variable systematically varies along with the levels of the independent variable of interest; by collecting extra data you can check that no design confound is present. A maturation threat can occur when subjects improve between pretest and posttest but it is unclear whether the change was caused by the treatment or whether the group improved spontaneously; with a control group it is easier to tell whether an improvement was caused by the treatment or arose spontaneously. A history threat occurs when a historical event affects all subjects in a study at the same time as the treatment, making it unclear whether a result is caused by the treatment or by the external event; with a comparison group, history threats can usually be ruled out.
Regression to the mean happens when an extreme result is caused by a combination of random factors that is unlikely to occur in the same combination again; the extreme result will therefore become less extreme over time. Regression effects are only a threat to internal validity if a group is selected because of an extremely high or extremely low score, since such scores may be extreme due to a combination of random factors that will not occur together again. Attrition happens when people drop out of the study after a while; it is a threat to internal validity when people leave for a systematic reason. It may be, for example, that the people who were least happy after their plastic surgery dropped out of the study, so that the finding that plastic surgery improves self-image is due to the fact that only satisfied subjects remained. Fortunately, attrition is easy to check: you only have to examine whether the subjects who left the study differ systematically from those who stayed.
When subjects are tested multiple times, researchers must watch out for testing threats. Repeated testing can make people better because they know the test, or worse because the test has become boring. Researchers therefore sometimes use different but equivalent tests; in doing so, they must take the difficulty of the tests into account, because if the tests differ in difficulty you cannot tell whether the change was really caused by the treatment. Another threat to the internal validity of quasi-experiments is observer bias: the expectations of a researcher can influence his or her interpretation of the results. Subjects may also think they know what the study is about and adjust their behavior accordingly.
Why should quasi-experiments be used?
Quasi-experiments can be vulnerable to threats to internal validity, so why would researchers use them? One reason is that quasi-experiments allow research in the 'real world': there is no artificial setting, as in many true (laboratory) experiments. Such real-world settings can improve external validity, so that the results can be generalized to the population with more confidence. In addition, quasi-experiments can be used when true experiments would be unethical; some things can only be studied ethically with quasi-experiments (such as the plastic-surgery research). Quasi-experiments also often have good construct validity of the independent variable.
How can you conduct research with few subjects?
Sometimes scientists conduct research with few subjects. As mentioned earlier, it is not always necessary to have a very large sample. N is the number of subjects in a sample. It is more important for external validity to select a sample well than to include many subjects in the sample. When researchers use a small N-design , instead of getting little information from a large sample, they get a lot of information from a small sample. They can even look at one animal or one person in a single N-design . There are differences between large N-designs and small N-designs. In large N-designs test subjects are put in groups and the data of an individual is not interesting. We look at the combined data of all persons. Data is also presented as group averages. In small N-designs, each individual is treated as a separate experiment. Often these designs are repeated measurements, in which researchers observe how an animal or human reacts in different conditions. The data of individuals is presented in small N-designs.
What are three different small N-designs?
Well thought-out and executed small N-design studies can help scientists to find out whether changes have come about through interventions or the influence of another variable. There are different designs that can be used.
In a stable-baseline design, researchers observe behavior during a long baseline period before starting a treatment or intervention. If the behavior during the baseline is stable, researchers can say with more certainty that a treatment is effective; a stable baseline strengthens internal validity. In a multiple-baseline design, researchers stagger their introduction of the intervention across different contexts, moments or situations. By looking at multiple baselines and behaviors, researchers increase internal validity and thereby support causal conclusions. The different baselines can be different behaviors within one person, different situations for one person, or different people. Whatever form a multiple-baseline design takes, it offers a comparison condition with which the treatment can be compared.
In a reversal design, a researcher observes a problem behavior both with and without treatment, and then withdraws the treatment for a while (the reversal period) to see whether the problem behavior returns. If the treatment really works, the improvement should disappear again when the treatment is withdrawn. In this way internal validity can be tested and causal claims can be supported. Reversal designs are only suitable for situations in which the treatment would not cause permanent changes: you cannot run a reversal design to investigate an educational intervention, because once a student has mastered a skill, that skill will not suddenly be lost. It can also be unethical to perform a reversal design, since it is not always justified to take a treatment away from someone (for example, from depressed people). Several famous psychologists have used only a few subjects to develop their theories; Piaget, for example, observed his three children to develop his theory of children's cognitive development.
How do small N-designs meet the four validities?
The previous sections discussed how small N-designs can increase internal validity, but what about the other validities? Can one animal or one person represent an entire population (external validity)? Researchers can take extra steps to increase external validity, for example by triangulating: combining the results of small-N or single-N studies with other studies that had more subjects. Sometimes researchers are not interested in generalizing to a whole population at all; a study can be intended for only a small subgroup of the population. For construct validity in small N-designs it is important that there are multiple observers and that interrater reliability is checked. Small N-designs often do not use traditional statistical methods; nevertheless, conclusions must be drawn from the data and the data must be handled carefully.
Scientists should always ask themselves whether the results of their research are replicable, that is, whether the findings show the same results when the study is carried out again. Replicability gives a study credibility, and researchers often replicate their results before their findings are published. There are several types of replication studies.
In direct replications , researchers repeat the original research as accurately as possible. They try to find out whether the original effect can also be found with new data.
In a conceptual replication, scientists examine the same research question but use different procedures: the variables are operationalized in a different way. For example, a study on portion size might use pasta in the original study and French fries in the replication.
In a replication-plus-extension study, the researchers replicate the original research but also add variables to test additional questions. An example is the research on reaction time and calling while driving: initially, researchers only looked at whether and how reaction time changed while calling, and later they also examined whether there was a difference between young and old drivers. Introducing a participant variable is one way to carry out a replication-plus-extension study. Another way is to introduce a new situational variable, which allows you, for example, to compare data from one moment in time with data from another. One can, for instance, test drivers who have had no training with a driving simulator and test the same people again four days after they have practiced with such a simulator. You can think of many different situational variables to add to a study.
Much value is attached to the replication of research by other researchers. If a study cannot be replicated, this could mean that the original effect can only be found under very specific conditions and circumstances; one must then be cautious about the importance of the effect.
What does the literature say about meta-analyses?
The scientific literature consists of a series of related studies, carried out by various researchers, that have tested similar variables. Sometimes researchers collect all the studies on a specific topic and combine them into a review article. One way to write such a review article is simply to list all the findings; another way is to make a quantitative summary of the scientific literature, a meta-analysis. A meta-analysis usually includes studies with different sample sizes, and studies with larger samples typically weigh more heavily in the analysis. In a meta-analysis, the effect sizes are averaged to find an overall effect size; researchers can also sort the studies into categories and calculate the effect size for each category. Because meta-analyses usually contain studies published in empirical journals, you can be fairly sure that the quality of the data is good. Nevertheless, you should consider the publication bias in psychology: significant relationships are published more often than null results. This leads to the file drawer problem, which means that a meta-analysis can overestimate the true size of an effect because zero effects are not included in the analysis. Ideally, researchers who carry out a meta-analysis should contact their colleagues and ask for published and unpublished data from their projects. Meta-analyses are strong because they combine the findings of different studies, but a meta-analysis is only as strong as the data that go into it; unpublished studies must be taken into account, because leaving out studies with zero effects can lead to distorted conclusions.
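The core calculation of a meta-analysis can be sketched as a weighted average of effect sizes, with larger studies weighted more heavily. The d values and sample sizes below are invented, and real meta-analyses typically use inverse-variance weights and explicitly examine publication bias.

```python
# Sketch of averaging effect sizes across studies, weighting by sample size.
import numpy as np

d_values = np.array([0.30, 0.55, 0.10, 0.42])   # hypothetical effect size per study
sample_sizes = np.array([40, 200, 60, 120])     # hypothetical N per study

overall_d = np.average(d_values, weights=sample_sizes)
print(f"weighted overall effect size d = {overall_d:.2f}")
```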
Should an important study have external validity?
Replications can also help with examining certain validities. External validity concerns the extent to which the results of a study are generalizable to other people and settings. Direct replication studies do not add support for external validity, but conceptual replications and replication-plus-extension studies can. When different methods are used to test the same question, researchers can decide to include other subjects and other settings in the study. In addition, for generalizability it is more important to look at how subjects were recruited than at how many subjects were recruited. The correspondence between the context of a study and the 'real world' is sometimes called ecological validity; ecological validity is one aspect of external validity. How important ecological validity is depends on the goal of the researchers. If the researchers only want to apply their theory to men, the results need not be generalizable to women. The same applies to causal claims: in theory-testing mode, researchers only want to test an association that can contribute to the support for a theory, and in that case internal validity matters more than external validity. The example with the monkeys and the contact-comfort theory (Chapter 1) is an example of such a theory-testing mode: it was more important for the researchers to ensure internal validity than external validity.
Yet psychologists are also interested in working in generalization mode: they want to generalize the findings from their sample to a larger population. Applied research is often done in generalization mode, and frequency claims must always be tested in it, because you want to make a statement about a large group of people. Association and causal claims are often tested in theory-testing mode, but they can sometimes also be tested in generalization mode. Cultural psychologists are interested in how culture shapes the thinking, behavior and feelings of individuals; they mainly work in generalization mode. They have shown that many theories that are supported in one cultural context are not supported in another. An example is the Müller-Lyer illusion (two lines that are equally long but do not look equally long): apparently, whether you fall for visual illusions depends on the culture you grew up in. People who have grown up in industrialized environments have more experience with right angles and therefore a different depth perception than people from rural villages, so the Dutch perceive the Müller-Lyer illusion differently than African villagers do. Psychologists always have to take into account that psychological processes, even basic ones, can be influenced by culture. Most studies have been done with subjects from the United States, Australia and Europe; these subjects are also called the WEIRD population: western, educated, industrialized, rich and democratic. WEIRD people do not represent the whole world, so you cannot assume that results that generalize to the Dutch population also generalize to the whole world.
Should research only be carried out in real settings?
Many people mistakenly think that studies done in the field (in daily life) are more important than studies carried out in an artificial laboratory. Studies done in a field setting almost certainly have good ecological validity, but the ecological validity of a setting is only one aspect of its generalizability: a setting may be realistic yet not represent all settings a person can encounter. Researchers also often reproduce real-world settings in the laboratory as accurately as possible, and the emotions and behaviors shown during laboratory research can be just as real as in the real world. Many laboratory experiments are high in experimental realism, which means they create situations in which people show genuine emotions, motivations and behaviors. By increasing the ecological validity of a study, scientists can make their findings more generalizable to non-laboratory settings. Studies conducted in theory-testing mode consider good internal validity important, even at the expense of external validity; that does not mean such studies are unimportant, as many of them have contributed to our knowledge of psychology. External validity is not everything.
Research Methods in Psychology - Morling - BulletPoints (NL)
What is the psychological way of thinking? Ch.1
Psychologists are scientists and psychology is based on doing research.
Psychologists can be producers or consumers of research. Producers of research administer questionnaires and conduct studies, often at universities; consumers of research read scientific journals and apply the theories in their work as a therapist, counselor or teacher.
Some psychologists are both producer and consumer.
It is important for therapists to follow evidence-based treatments . These are therapies that are supported by research.
The theory-data cycle means that scientists collect data to test, change or update their theories.
A theory contains assertions about the relationship between variables. Theories lead to specific hypotheses .
A hypothesis can be seen as a prediction. It says something about what the scientists expect to observe, if their theory is correct.
Data can be seen as a set of observations. Data can support or contradict a theory.
Good theories are supported by data, are falsifiable (data can contradict a theory) and parsimonious (if two theories explain the data equally well, but one is simpler than the other, then the simple theory must be chosen).
Scientists are empiricists and they observe the world systematically. They test their theories with studies and they adapt their theories to the data found.
Scientists tackle both applied research (aimed at problems from daily life) and basic research (intended to contribute to general knowledge).
Scientists also continue to investigate: once a scientist has found an effect, he / she wants to do follow-up research to find out why, when and for whom the effect works.
In addition, scientists make their findings known in the scientific world and the media.
What are sources of information in psychological research? Ch.2
Conclusions based on experiences or intuition are usually not reliable. One of the reasons is that experiences do not have a comparison group.
In order to be able to draw conclusions about a particular treatment or effect, groups must be compared with each other. The treated / recovered group, the treated / non-recovered group, the untreated / recovered group and the untreated / non-recovered group must be looked at.
In daily life there are several explanations for a solution. In the research these alternative explanations are called confounds .
A confound takes place when you think that a thing has caused a result, but other things have changed and it is not certain what the cause was.
In everyday life it is difficult to isolate variables. In research it is possible to check variables and to change one variable at a time.
Conclusions based on intuitions are often not reliable. That is because most people are not scientific thinkers and can therefore have a distorted view of reality.
The availability heuristic is a cognitive bias: things that come to mind easily guide our thinking. These are often events that are vivid or that happened recently.
If you only look at the things that are present and not at the things that are absent, then you commit a present / present bias .
C onfirmatory hypothesis testing is not a scientific way of doing research. Questions are asked that confirm a hypothesis, but no questions are asked that could contradict the hypothesis.
The bias blind spot is the conviction that we will not fall prey to a bias.
Most psychologists publish their work in three different sources: scientific journals, book chapters and whole books.
Empirical articles report the results of a study for the first time. These articles describe the method used, the statistical tests that were applied and the results.
A review article provides a summary of many or all published studies on a topic. A so-called edited book consists of a number of chapters dealing with the same subject, but written by different scientists.
Psychologists can also describe their research in a complete book. However, that does not happen often.
What are the research tools for consumers? Ch.3
A variable varies, so it must have at least two levels.
A measured variable is a variable whose values are observed and recorded (abstract variables are often measured with questionnaires). A manipulated variable is a variable whose levels the researcher actively controls.
A claim is an argument that someone makes. Claims should be based on research.
There are three different types of claims: frequency claims, association claims and causal claims.
Frequency claims describe a particular amount or level of a variable, expressed as a numerical value. They state how often or how much something occurs and always concern a single measured variable.
Association claims state that a certain level of one variable is associated with a certain level of another variable. Association claims contain at least two variables, and the variables are measured, not manipulated.
Causal claims state that one variable is responsible for changes in another variable. Causal claims always start with an association, but they go a step further.
To go from association to causation, a study must fulfill three criteria: covariance (the two variables are related), temporal precedence (the causal variable comes before the outcome variable) and internal validity (no third variable can explain the relationship between the two variables).
To evaluate claims, it is necessary to look at construct validity, external validity and statistical validity.
Construct validity is about how well a study has measured or manipulated a variable, and external validity is about generalizability (how well do the subjects represent the population?). Statistical validity looks at the extent to which the statistical conclusions are accurate.
What are the ethical guidelines for psychological research? Ch.4
Many ethical systems state, among other things, that researchers must adhere to the principles of respect for persons, beneficence and justice.
The principle of respect for persons means that test subjects must know what will be asked of them during the research and what risks and benefits the research entails, and that on that basis they decide for themselves whether they want to participate (informed consent).
The principle of beneficence means that researchers must weigh in advance what risks and benefits the research holds for the subjects and the population.
The principle of justice requires a fair balance between the people who participate in the research and the people who benefit from it.
The guidelines of the American Psychological Association (APA) can also be used. There are five general and ten specific guidelines.
The five general APA principles are: beneficence and nonmaleficence, fidelity and responsibility, integrity, justice, and respect for people's rights and dignity (the paired principles each count as one).
Integrity means, for example, that teachers must teach their students accurate information and that therapists must stay informed of the empirical evidence for therapeutic techniques.
Fidelity and responsibility mean, for example, that psychologists may not engage in sexual relationships with their students or clients and that teachers should not take one of their own students on as a client.
Ethical Standard 8 is the most important for researchers. This standard states that there must be an institutional review board (IRB) that reviews whether proposed research is ethical or not.
It also states that subjects who are deceived by the researcher must be debriefed afterwards.
Data may not be fabricated (making up values) or falsified (distorting results by omitting observations or influencing test subjects). In addition, researchers must cite the original author when they use someone else's ideas; otherwise it is plagiarism.
As far as research on laboratory animals is concerned, researchers must adhere to the three R's: replacement (replacing animals with alternatives where possible), refinement (refining procedures so that animals experience as little stress as possible) and reduction (reducing the number of animals used).
What are good measurements in psychology? Ch.5
Self-reports look at the answers that people give themselves on a questionnaire or during an interview.
Observational measures (also called behavioral measures) operationalize a variable by recording observable behaviors. Physiological measures operationalize a variable by recording biological data, such as brain activity and heart rate.
Operationalized variables are classified as categorical or quantitative. The levels of a categorical (nominal) variable are categories; any numbers assigned to those categories have no numerical meaning.
Quantitative variables can be further classified as being on an ordinal, interval or ratio scale. An ordinal scale reflects a rank order and says nothing about the distances between the different values.
An interval scale does have equal intervals (distances) between levels, but no true zero point: a score of zero does not mean that someone has 'nothing' of the construct.
A ratio scale also has equal intervals and, in addition, a true zero point that really means 'nothing'. For example, someone can genuinely have 0 items correct on a test.
If variables are well operationalized, there is good construct validity. Construct validity has two aspects: reliability refers to how consistent the results of a measurement are, and validity refers to whether a variable measures what it is supposed to measure.
Reliability can be assessed in three ways: test-retest reliability, interrater reliability and internal reliability. Test-retest reliability means that the researcher finds the same scores each time he or she measures the same thing.
Interrater reliability means that different raters give the same scores. Internal reliability means that a subject gives a consistent pattern of answers across the items of a scale.
Reliability can be analyzed using scatterplots and correlation coefficients. For good reliability, the points should lie close to a straight line in the scatterplot and the correlation should be strong and positive (close to 1).
To assess the internal reliability of a scale, researchers look at Cronbach's alpha. This number can be calculated with statistical software such as SPSS, and the closer it is to 1, the more internally consistent the scale.
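For readers who want to see what is behind that number, here is a minimal sketch of the standard Cronbach's alpha formula in Python, using made-up item scores (SPSS is not required; any statistics package computes the same value):

```python
import numpy as np

# Hypothetical item scores: 5 respondents x 4 items of one scale (invented data).
items = np.array([
    [4, 5, 4, 5],
    [2, 3, 2, 3],
    [5, 5, 4, 4],
    [1, 2, 1, 2],
    [3, 3, 4, 3],
])

k = items.shape[1]                              # number of items
item_variances = items.var(axis=0, ddof=1)      # variance of each item
total_variance = items.sum(axis=1).var(ddof=1)  # variance of the sum score

# Standard formula: alpha = k/(k-1) * (1 - sum of item variances / total variance).
alpha = (k / (k - 1)) * (1 - item_variances.sum() / total_variance)
print(f"Cronbach's alpha = {alpha:.2f}")  # closer to 1 means a more consistent scale
```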
Face validity means that a measure seems plausible: if it looks like a good measure of the construct, it has face validity. Content validity checks whether a measure covers all parts of a construct.
Criterion validity checks whether the measurement is related to a concrete outcome, such as a behavior, with which it should be associated according to the theory. This can also be analyzed using scatter diagrams and correlations.
If a measure is valid, it should correlate strongly with other measures of the same construct (convergent validity) and it should correlate less strongly with measures of different constructs (discriminant validity).
How do we use a survey and observations? Ch.6
A survey refers to questions asked to people by phone, during interviews, on paper, via e-mail or on the internet. Survey questions can be open or closed.
Sometimes a question is worded so awkwardly that a respondent has difficulty giving an answer that reflects his or her opinion accurately. It is best to word questions as simply as possible.
Researchers must also watch out for leading questions: certain positive or negative words can influence respondents' answers.
Double-barreled questions are two questions in one. These have poor construct validity, because you do not know whether the respondent answered the first question, the second, or both.
The order of the questions can also influence the answers that people give. The best way to check for order effects is to create different versions of the questionnaire and change the order of the questions in each version.
Response sets are shortcut ways of answering a questionnaire. Sometimes people do not think carefully about the individual questions and answer all of them negatively, positively or neutrally.
One form of response set is acquiescence, or yea-saying. This means that someone answers 'yes' or 'strongly agree' to every question.
One way to see whether someone who says yes every time really agrees with the statements is to reverse the wording of some of the items.
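A minimal sketch of how such reverse-worded items can be rescored, with purely hypothetical answers on a 1-5 agreement scale:

```python
# Hypothetical answers on a 1-5 agreement scale; items 2 and 4 are reverse-worded.
answers = {"item1": 5, "item2": 5, "item3": 5, "item4": 5, "item5": 5}
reverse_worded = {"item2", "item4"}

# Reverse-score the reverse-worded items (on a 1-5 scale: new = 6 - old).
scored = {item: (6 - value if item in reverse_worded else value)
          for item, value in answers.items()}

# A respondent who simply agrees with everything now gets a middling total
# instead of the maximum score, which exposes the acquiescent response set.
print(sum(scored.values()))   # 5 + 1 + 5 + 1 + 5 = 17 instead of 25
```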
Another response set is fence sitting: people always choose the middle of the scale. One way to counter this is to remove the middle option.
A survey is suitable for questions that are subjective in nature: what people think they do and what they think influences their behavior. But if you want to know what people really do and what really influences their behavior, you will have to observe them.
Observer bias occurs when the expectations of an observer influence his or her interpretation of the subjects' behavior. Observer effects occur when an observer changes the behavior of the person or animal being observed.
Reactivity means that people change their behavior in one way or another when someone is watching. This can be countered by being as unobtrusive as possible as an observer or by letting the test subjects get used to you, so that they forget that they are being observed.
How do you rate the frequencies of behavior and beliefs? Ch.7
The external validity looks at whether the results of a particular study can be generalized to a larger population. External validity is important for frequency claims.
A population can be seen as a whole set of people or products that a researcher is interested in. A sample is a smaller set from that population.
In a biased sample , some members of the population of interest have a higher probability of being included in the sample than other members of the population.
Convenience sampling and self-selection can produce a biased sample. Convenience sampling uses the people who happen to be available, and self-selection means that the sample consists of people who chose to participate in the study themselves.
To obtain a representative sample, researchers can use random (probability) sampling. This means that every member of the population of interest has an equal chance of being chosen for the sample.
In a cluster sample, clusters of subjects are randomly selected from a population and all individuals in the selected clusters are then used. Multistage sampling is similar, but involves two random steps: first a random sample of clusters is drawn and then a random sample of people is drawn within those clusters.
In stratified random sampling , a researcher selects certain demographic groups and then performs a random selection of individuals within each of these groups.
Oversampling means that the researcher deliberately over-represents one or more groups.
If external validity is not important for a researcher, he or she can choose to use a non-random sample, for example by selecting people on purpose (purposive sampling) and/or by asking participants to recruit acquaintances for the study (snowball sampling).
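To make the difference between simple random sampling and stratified random sampling concrete, here is a minimal Python sketch over an invented sampling frame (all names and numbers are hypothetical):

```python
import random

random.seed(1)

# Hypothetical sampling frame: 1000 students tagged with their programme.
population = [{"id": i, "programme": random.choice(["psychology", "law", "medicine"])}
              for i in range(1000)]

# Simple random sampling: every member has an equal chance of being selected.
simple_sample = random.sample(population, 100)

# Stratified random sampling: draw a random sample within each programme,
# proportional to the size of that stratum (rounding may shift the total by one).
strata = {}
for person in population:
    strata.setdefault(person["programme"], []).append(person)

stratified_sample = []
for programme, members in strata.items():
    n = round(100 * len(members) / len(population))
    stratified_sample.extend(random.sample(members, n))

print(len(simple_sample), len(stratified_sample))
```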
Larger samples are not always better. If one wants to generalize results to a large population (such as the US), then a sample of 1000 subjects will suffice.
What is bivariate correlational research? Ch.8
Association claims are claims that describe the relationship between two measured variables. A bivariate correlation is an association that concerns two variables.
The data of a correlational study can be described with a scatterplot and the correlation coefficient r. When all the test subjects are represented as dots and a line is drawn through the dots, one can see whether the relationship is positive (x high, y high) or negative (x high, y low).
The strength of the correlation is indicated by the correlation coefficient r, which ranges from -1 to 1. A correlation of .10 or -.10 is a weak effect size, an r of .30 or -.30 is a moderate effect size, and a correlation of .50 or -.50 or larger is a strong effect size.
When one of the variables is categorical, it is clearer to display the association in a bar graph than in a scatterplot, and a t-test is then more appropriate than r.
The effect size is part of the statistical validity and association claims can be tested on this basis. The effect size looks at the strength of a relationship and the closer the r is to 1, the stronger the relationship (generally).
Strong effect sizes are generally more important than weak effect sizes, but a weak effect size can still matter when the stakes are high (for example, in matters of life and death).
Statistical significance calculations yield a probability estimate, p. The p-value indicates how likely it would be to find an association this strong if the association in the population were zero (p smaller than .05 is called significant; p equal to or larger than .05 is not).
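A minimal Python sketch showing how r and its p-value are obtained in practice; the variables and numbers are simulated for illustration only:

```python
import numpy as np
from scipy.stats import pearsonr

rng = np.random.default_rng(0)

# Simulated (hypothetical) data: hours of sleep and exam grade for 50 students.
sleep = rng.normal(7, 1, size=50)
grade = 5 + 0.4 * sleep + rng.normal(0, 1, size=50)

r, p = pearsonr(sleep, grade)   # effect size r and its significance p
print(f"r = {r:.2f}, p = {p:.3f}")

# Rough benchmarks from the bullet above: |r| around .10 weak, .30 moderate, .50 strong.
```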
Outliers (extreme scores) have the most influence on a correlation when the sample is small.
If the entire range of a variable is not represented (for example, only middle incomes and no low or high incomes), we speak of range restriction.
If there is a relationship between two variables, but it can not be presented as a straight line, then it could be a curvilinear relationship . It may be that the relationship is positive at first, but at a given moment becomes negative (or the other way around).
No causal inferences can be made with association claims. Association claims can not satisfy all three conditions of causality, because association claims are not done with experiments.
When the relationship between the two variables in an association study changes depending on the level of another variable, that other variable is called a moderator.
What is multivariate correlational research? Ch.9
Research designs with more than two measured variables are called multivariate correlational designs . Examples include longitudinal research and multiple-regression designs.
Longitudinal designs can determine temporal precedence by measuring the same variables with the same person at different time points.
There are more than two variables involved in a multivariate correlational design and your design will therefore give multiple correlations. These can be cross-sectional correlations , autocorrelations and cross-lag correlations .
Cross-sectional correlations test whether two variables measured at the same time point correlate. When we look at whether the same variables correlate with each other at different time points, we speak of autocorrelations.
Researchers are most interested in cross-lag correlations and these are correlations that determine whether the previous measurement of a variable is associated with a later measurement of another variable.
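The three kinds of correlations can be illustrated with a small simulated two-wave dataset (variables A and B, measured at t1 and t2; all values are generated for illustration):

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(1)
n = 200

# Simulated (hypothetical) two-wave data: variables A and B measured at t1 and t2.
a_t1 = rng.normal(size=n)
b_t1 = 0.3 * a_t1 + rng.normal(size=n)
a_t2 = 0.6 * a_t1 + rng.normal(size=n)
b_t2 = 0.3 * a_t1 + 0.5 * b_t1 + rng.normal(size=n)

df = pd.DataFrame({"a_t1": a_t1, "b_t1": b_t1, "a_t2": a_t2, "b_t2": b_t2})

print("cross-sectional (A t1 ~ B t1):", round(df["a_t1"].corr(df["b_t1"]), 2))
print("autocorrelation (A t1 ~ A t2):", round(df["a_t1"].corr(df["a_t2"]), 2))
print("cross-lag (A t1 ~ B t2):      ", round(df["a_t1"].corr(df["b_t2"]), 2))
print("cross-lag (B t1 ~ A t2):      ", round(df["b_t1"].corr(df["a_t2"]), 2))
```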
Longitudinal research can show covariance and it can also help establish temporal precedence. However, these studies cannot rule out third variables (internal validity), so the criteria for causation cannot be fully met.
With multivariate designs, researchers can see whether a relationship between two variables persists when a third variable is kept constant. You could divide such a third variable into different subgroups.
With multiple-regression designs you can rule out a third variable, provided that this variable is measured and included in the model.
The key statistic in multiple regression is beta. Beta shows the direction and strength of the relationship between a predictor and the criterion variable while the other predictor variables are held constant.
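As a hedged illustration, the sketch below computes such standardized betas by hand with simulated data; real analyses would normally use a statistics package, but the logic (standardize everything, then fit the regression) is the same:

```python
import numpy as np

rng = np.random.default_rng(2)
n = 300

# Simulated (hypothetical) data: criterion y predicted by x1 while controlling for x2.
x2 = rng.normal(size=n)
x1 = 0.5 * x2 + rng.normal(size=n)
y = 0.4 * x1 + 0.3 * x2 + rng.normal(size=n)

def zscore(v):
    return (v - v.mean()) / v.std(ddof=1)

# With all variables standardized, the regression weights are the betas.
design = np.column_stack([np.ones(n), zscore(x1), zscore(x2)])
betas, *_ = np.linalg.lstsq(design, zscore(y), rcond=None)

print(f"beta for x1 (controlling for x2) = {betas[1]:.2f}")
print(f"beta for x2 (controlling for x1) = {betas[2]:.2f}")
```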
Parsimony is the extent to which a good scientific theory can offer the simplest explanation for a phenomenon.
If variable x influences variable y via variable z (x affects z, and z in turn affects y), then variable z is called a mediator.
How can causal claims be evaluated with the help of experiments? Ch.10
In an experiment, a researcher manipulates at least one variable (independent variable) and measures another variable (dependent variable).
Every variable that a researcher intentionally keeps constant is called a control variable . This is done to ensure that there are no alternative explanations (confounds) for the results found.
Experiments can fulfill the three criteria of causation (covariance, temporal precedence and internal validity).
A design confound is a researcher's mistake in designing the independent variable: a second variable that varies systematically along with the independent variable the researcher is interested in.
A selection effect occurs in an experiment when the kinds of subjects at one level of the independent variable differ systematically from the subjects at another level.
Design confounds and selection effects are bad for internal validity. Good experiments use random assignment to avoid selection effects.
Sometimes random assignment is not enough, and matched groups can be used: subjects are matched on a variable that could influence the dependent variable, and the members of each matched set are then randomly assigned to the different conditions of the independent variable of interest.
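A minimal sketch of both procedures in Python, with an invented participant list and a made-up matching score; the point is only to show the mechanics of shuffling versus matching:

```python
import random

random.seed(5)

# Hypothetical participants with a matching variable (e.g., a pretest score).
participants = [{"id": f"P{i:02d}", "score": random.randint(85, 130)}
                for i in range(1, 13)]

# Plain random assignment: shuffle the list and split it in half.
random.shuffle(participants)
group_a, group_b = participants[:6], participants[6:]

# Matched-groups assignment: sort on the matching variable, form pairs of similar
# participants, and randomly assign one member of each pair to each condition.
matched_a, matched_b = [], []
ordered = sorted(participants, key=lambda p: p["score"])
for pair in zip(ordered[0::2], ordered[1::2]):
    first, second = random.sample(pair, 2)
    matched_a.append(first)
    matched_b.append(second)

print([p["id"] for p in matched_a])
print([p["id"] for p in matched_b])
```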
In an independent-groups design (between-subjects design), different groups of subjects are placed in different levels of the independent variable. In a within-groups design (within-subjects design), there is only one group of subjects and each person is exposed to every level of the independent variable.
Two forms of independent-groups design are the posttest-only design and the pretest/posttest design. In a posttest-only design, subjects are randomly assigned to the levels of the independent variable and are tested once on the dependent variable.
In a pretest/posttest design, subjects are randomly divided into groups and are tested twice on the dependent variable: once before exposure to the independent variable and once after.
There are two types of within-groups designs. In a concurrent-measures design, subjects are exposed to all levels of the independent variable at roughly the same time, and a single preference or behavioral response is the dependent variable.
In a repeated-measures design, subjects are measured on the dependent variable more than once, that is, after exposure to each level of the independent variable.
To prevent order effects, counterbalancing can be used: researchers present the levels of the independent variable in different sequences to different subjects.
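A minimal sketch of full counterbalancing with three hypothetical conditions, simply cycling participants through all possible orders:

```python
from itertools import cycle, permutations

# Three hypothetical condition labels of a within-groups independent variable.
conditions = ["A", "B", "C"]

# Full counterbalancing: every possible order of the levels is used (3! = 6 orders).
all_orders = list(permutations(conditions))

# Assign each (hypothetical) participant to one of the orders in turn.
participants = [f"P{i}" for i in range(1, 13)]
for person, order in zip(participants, cycle(all_orders)):
    print(person, "->", " -> ".join(order))
```

With more conditions, full counterbalancing quickly becomes impractical (k! orders), which is why partial counterbalancing schemes such as a Latin square are often used instead.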
What is the influence of confounding and obscure factors? Ch.11
Some threats to internal validity are maturation, history, regression, attrition, testing and instrumentation threats. A maturation threat is a change in behavior that emerges spontaneously over time, without any intervention being involved.
Sometimes changes occur because something specific happened between the pretest and the posttest. This is called a history threat; it does not have to be a big event, but it must be one that affects all or almost all members of the group.
Regression threats refer to regression to the mean. When a behavior is extreme at time point 1, it will probably be less extreme at time point 2.
Attrition is the loss of test subjects before the research is finished. Attrition becomes a problem when a particular type of subject drops out of the study (when the drop-out is systematic).
A testing threat refers to a change in subjects as a result of taking a test more than once. An instrumentation threat occurs when the measurement instrument changes over time.
Many of the above threats can be reduced by adding control groups. Nevertheless, three threats can persist: observer bias, placebo effects and demand characteristics.
An observer bias takes place when the expectations of the researcher influence his interpretation of the results. Demand characteristics are a problem when the subjects think they know what a study is about and therefore change their behavior.
A placebo effect occurs when subjects receive treatment and improve because they think they have received a real treatment (for example, a real medicine instead of a sugar pill).
The three threats can be reduced by conducting double-blind research.
A null effect means that the independent variable appears to have no influence on the dependent variable. A null effect can be found because the study was not conducted carefully enough, or because there really is no influence of variable x on y.
Obscuring factors can prevent researchers from seeing a real influence of the independent variable on the dependent variable. They take two forms: not enough difference between groups, or too much variability within groups.
If too little difference is found between groups, the variables may need to be operationalized differently (a stronger manipulation or a more sensitive measure). If there is too much variability within groups, a within-groups design can help.
How do you deal with experiments that contain more than one independent variable? Ch.12
An interaction effect tests whether the effect of one independent variable depends on the level of the other independent variable. There are different types of interactions, and they can be read from a graph.
When the lines in the graph cross, we speak of a crossover interaction. If the lines are not parallel but do not cross each other, we speak of a spreading interaction.
Researchers use factorial designs to test interactions. A factorial design is a design with two or more independent variables (called factors).
In a factorial design, you look at main effects and interaction effects. The overall effect of each independent variable is called a main effect (a simple difference); an interaction effect is a difference in differences.
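That 'difference in differences' can be shown with a tiny worked example; the cell means below are invented purely to illustrate the arithmetic:

```python
# Hypothetical cell means of a 2 x 2 factorial design (factors A and B, invented data).
means = {
    ("a1", "b1"): 10, ("a1", "b2"): 14,
    ("a2", "b1"): 12, ("a2", "b2"): 22,
}

# Main effect of A: average the cells at each level of A, then take the difference.
mean_a1 = (means[("a1", "b1")] + means[("a1", "b2")]) / 2    # 12
mean_a2 = (means[("a2", "b1")] + means[("a2", "b2")]) / 2    # 17
main_effect_a = mean_a2 - mean_a1                            # 5

# Interaction: the effect of B within a1 versus within a2 ("difference in differences").
effect_b_within_a1 = means[("a1", "b2")] - means[("a1", "b1")]   # 4
effect_b_within_a2 = means[("a2", "b2")] - means[("a2", "b1")]   # 10
interaction = effect_b_within_a2 - effect_b_within_a1            # 6 -> lines not parallel

print(main_effect_a, effect_b_within_a1, effect_b_within_a2, interaction)
```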
In an independent-groups factorial design (between-subjects factorial), both independent variables are manipulated between groups. In a within-groups factorial design (repeated-measures factorial), both independent variables are manipulated within groups.
When researchers add a third independent variable and all independent variables have two levels, then we speak of a 2 x 2 x 2 factorial design, or a three-way design . In this design there are 2 x 2 x 2 = 8 cells or conditions.
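A quick way to see where those eight conditions come from is to enumerate them; the factor names here are placeholders:

```python
from itertools import product

# Hypothetical 2 x 2 x 2 design: three factors with two levels each.
factor_a = ["a1", "a2"]
factor_b = ["b1", "b2"]
factor_c = ["c1", "c2"]

cells = list(product(factor_a, factor_b, factor_c))
print(len(cells))       # 2 * 2 * 2 = 8 conditions
for cell in cells:
    print(cell)
```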
What are quasi-experiments? Ch.13
In a quasi-experiment, researchers do not have full control over the conditions: subjects are not randomly assigned to the conditions.
Matched groups can be used to take into account differences between the people in the groups. Some researchers also apply a wait-list design, in which all subjects undergo the treatment, but at different times.
A maturation threat can occur when subjects improve from pretest to posttest, but it is not clear whether the change was caused by the treatment or whether the group improved spontaneously.
A history threat occurs when an external (historical) event takes place at the same time as the treatment for all subjects in a study. It is then unclear whether a result was caused by the treatment or by the external event.
Comparison groups can remove these threats.
Regression to the mean happens when an extreme result is caused by a combination of random factors that is unlikely to occur in the same combination again, so the extreme result becomes less extreme over time.
Attrition happens when people no longer want to participate in the research after a period of time. It is a threat to internal validity when people who leave are systematically different from those who stay.
An advantage of quasi-experiments is that they take place in the real world. There is no artificial setting, so the external validity is often quite good.
Another advantage is that quasi-experiments make it possible to study variables that could not ethically be manipulated in a true experiment.
When researchers use a small-N design, instead of getting a little information from a large sample, they get a lot of information from a small sample. In a single-N design they even study just one animal or one person.
In a stable-baseline design, researchers observe behavior over an extended baseline period before starting the treatment or intervention. If the behavior during the baseline is stable, researchers can say with more certainty that the treatment is effective.
In a multiple-baseline design, researchers stagger their introduction of the intervention across different contexts, moments or situations.
In a reversal design, a researcher observes problem behavior with and without treatment and then withdraws the treatment (the reversal period) to see whether the problem behavior returns. If the treatment really works, the problem behavior should come back when the treatment is removed.
Can the results of a research be applied in daily life? Ch.14
Replicability means that a study yields the same results when it is carried out again. Replicability gives research credibility.
In direct replications , researchers repeat the original research as accurately as possible to try to find out whether the original effect can also be found with new data.
In a conceptual replication , scientists examine the same question but they use different procedures. In a replication-plus-extension study, the researchers replicate the original study, but they also add variables to test more questions.
A meta-analysis is a mathematical summary of the scientific literature on a subject. A meta-analysis usually includes studies with different sample sizes, and studies with larger samples typically weigh more heavily in the analysis.
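A minimal sketch of that weighting idea, with invented effect sizes and sample sizes (real meta-analyses typically weight by inverse variance, but sample-size weighting conveys the same intuition):

```python
import numpy as np

# Hypothetical effect sizes (r) and sample sizes from five studies (invented numbers).
effect_sizes = np.array([0.42, 0.15, 0.30, 0.05, 0.25])
sample_sizes = np.array([40, 250, 120, 800, 60])

# Sample-size-weighted mean: larger studies count more heavily, as described above.
weighted_mean = np.average(effect_sizes, weights=sample_sizes)
unweighted_mean = effect_sizes.mean()

print(f"weighted mean r   = {weighted_mean:.2f}")
print(f"unweighted mean r = {unweighted_mean:.2f}")
```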
Meta-analyses can give a distorted picture if studies with null effects are not included (because such studies often remain unpublished).
The similarity between the context of a research and the 'real world' is sometimes also called ecological validity . Ecological validity is an aspect of external validity.
It depends on the purpose of your research (and for whom you want to assert your theory) how much value you attach to external validity. Some studies attach almost no value to external validity and focus more on internal validity.
Cultural psychologists have shown that many theories that are supported in one cultural context are not supported in a different cultural context.
Most studies have been conducted with subjects from the United States, Australia and Europe; these subjects are also called the WEIRD population: western, educated, industrialized, rich and democratic. WEIRD people do not represent the whole world, so such results cannot simply be generalized to the entire world population.
Many laboratory experiments are high in experimental realism. This means that they create settings in which people exhibit genuine emotions, motivations and behaviors.