WSRt, critical thinking - a summary of all articles needed in the second block of second year psychology at the UvA
Critical thinking
Article: Simmons, Nelson, & Simonsohn (2011)
False-Positive Psychology: Undisclosed Flexibility in Data Collection and Analysis Allows Presenting Anything as Significant
This article is about two things:
One of the most costly errors is a false positive.
They inspire investment in fruitless research programs and can lead to ineffective policy changes.
Ambiguity is rampant in empirical research.
As a solution to the flexibility-ambiguity problem, the authors offer six requirements for authors and four guidelines for reviewers.
This solution substantially mitigates the problem but imposes only a minimal burden on authors, reviewers, and readers.
It leaves the right and responsibility of identifying the most appropriate way to conduct research in the hands of researchers, requiring only that authors provide appropriately transparent descriptions of their methods so that reviewers and readers can make informed decisions regarding the credibility of their findings.
Requirements for authors
1. Authors must decide the rule for terminating data collection before data collection begins and report this rule in the article.
2. Authors must collect at least 20 observations per cell or else provide a compelling cost-of-data-collection justification.
Samples smaller than 20 per cell are not powerful enough to detect most effects.
3. Authors must list all variables collected in a study.
This prevents researchers from reporting only a convenient subset of the many measures that were collected, allowing readers and reviewers to easily identify possible researcher degrees of freedom.
4. Authors must report all experimental conditions, including failed manipulations.
This prevents authors from selectively reporting only the condition comparisons that yield results consistent with their hypothesis.
5. If observations are eliminated, authors must also report what the statistical results are if those observations are included.
This makes transparent the extent to which a finding is reliant on the exclusion of observations, puts appropriate pressure on authors to justify the elimination of data, and encourages reviewers to explicitly consider whether such exclusions are warranted.
6. If an analysis includes a covariate, authors must report the statistical results of the analysis without the covariate.
This makes transparent the extent to which a finding is reliant on the presence of a covariate, puts appropriate pressure on authors to justify the covariate, and encourages reviewers to consider whether including it is warranted.
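Why requirement 1 matters can be illustrated with a small simulation. The sketch below is not taken from the article: the sample sizes, the number of simulated studies, and the use of a z-test with a known standard deviation of 1 are all assumptions chosen to keep the example short. Both groups are drawn from the same distribution, so every significant result is a false positive. The same simulated data are analysed twice: once with a stopping rule fixed in advance, and once by testing repeatedly while adding observations.

```python
import random
from statistics import NormalDist, mean

def p_value(group_a, group_b):
    """Two-sided z-test p-value for a difference in means, assuming known SD = 1."""
    n_a, n_b = len(group_a), len(group_b)
    z = (mean(group_a) - mean(group_b)) / (1 / n_a + 1 / n_b) ** 0.5
    return 2 * (1 - NormalDist().cdf(abs(z)))

def simulate(n_studies=2000, start_n=20, step=10, max_n=50, alpha=0.05, seed=1):
    """Return (fixed-rule rate, optional-stopping rate) of false positives."""
    rng = random.Random(seed)
    fixed_hits = 0
    peeking_hits = 0
    for _ in range(n_studies):
        # No true effect: both cells come from the same distribution.
        a = [rng.gauss(0, 1) for _ in range(max_n)]
        b = [rng.gauss(0, 1) for _ in range(max_n)]
        # Rule fixed in advance: test once, at start_n observations per cell.
        if p_value(a[:start_n], b[:start_n]) < alpha:
            fixed_hits += 1
        # Undisclosed flexibility: test, add observations, test again,
        # and count the study as significant if any look is significant.
        if any(p_value(a[:n], b[:n]) < alpha
               for n in range(start_n, max_n + 1, step)):
            peeking_hits += 1
    return fixed_hits / n_studies, peeking_hits / n_studies

fixed_rate, peeking_rate = simulate()
print(f"fixed rule: {fixed_rate:.3f}, optional stopping: {peeking_rate:.3f}")
```

Because the first look of the flexible procedure is the fixed-rule test itself, the flexible procedure can never produce fewer false positives than the fixed rule in this setup, and in practice it produces more.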
Guidelines for reviewers
1. Reviewers should ensure that authors follow the requirements
2. Reviewers should be more tolerant of imperfections in results
3. Reviewers should require authors to demonstrate that their results do not hinge on arbitrary analytic decisions.
4. If justifications of data collection or analysis are not compelling, reviewers should require the authors to conduct an exact replication.
Criticisms
Criticism of the solution comes in two varieties:
Not far enough
The solution does not lead to the disclosure of all degrees of freedom.
Authors have tremendous disincentives to disclose exploited researcher degrees of freedom.
Too far
The guidelines prevent researchers from conducting exploratory research.
Nonsolutions
Solutions rejected by the authors because they are less practical, less effective, or both.
Critical thinking
Article: Nosek, Spies, & Motyl (2012)
Scientific Utopia: II. Restructuring Incentives and Practices to Promote Truth Over Publishability
An academic scientist’s professional success depends on publishing.
This article develops strategies for improving scientific practices and knowledge accumulation that account for ordinary human motivations and biases.
Incentives for surprising, innovative results are strong in science.
Problem: the incentives for publishable results can be at odds with the incentives for accurate results. This produces a conflict of interest.
The solution requires making incentives for getting it right competitive with the incentives for getting it published.
Publishing is the ‘very heart of modern academic science, at levels ranging from the epistemic certification of scientific thought to the more personal labyrinths of job security, quality of life and self esteem’.
With an intensely competitive job market, the demands on publication might seem to suggest a specific objective for early-career scientists: publish as many articles as possible in the most prestigious journals that will accept them.
Even if a researcher conducts studies competently, analyses the data effectively, and writes the results beautifully, there is no guarantee that the report will be published.
Part of
Critical thinking
Article: Dienes (2003)
Neyman, Pearson and hypothesis testing
In this article, we will consider the standard logic of statistical inference.
Statistical inference: the logic underlying all the statistics you see in the professional journals of psychology and most other disciplines that regularly use statistics.
The underlying logic of statistics (Neyman-Pearson) is highly controversial, frequently attacked (and defended) by statisticians and philosophers, and even more frequently misunderstood.
The meaning of probability we choose determines what we can do with statistics.
The proper way of interpreting probability remains controversial, so there is still debate over what can be achieved with statistics.
The Neyman-Pearson approach follows from one particular interpretation of probability. The Bayesian approach follows from another.
Interpretations often start with a set of axioms that probabilities must follow.
Two interpretations of probability:
The most influential objective interpretation of probability is the long-run relative frequency interpretation. Here, probability is a relative frequency.
Because the long-run relative frequency is a property of all the events in the collective, it follows that a probability applies to a collective, not to any single event.
A single event could be a member of different collectives. So a singular event does not have a probability, only collectives do.
Objective probabilities do not apply to single cases. They also do not apply to the truth of hypotheses.
A hypothesis is simply true or false, just as a single event either occurs or does not.
A hypothesis is not a collective, it therefore does not have an objective probability.
Data = D
Hypothesis = H
P(H|D) is the inverse of the conditional probability P(D|H). Inverting conditional probabilities makes a big difference.
P(A|B) can have a very different value from P(B|A).
Knowing P(D|H) does not mean you know what P(H|D) is.
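A small worked example makes the asymmetry concrete. The counts below are invented purely for illustration and describe two unspecified events A and B; they do not come from the article.

```python
# Hypothetical counts, invented for illustration only.
n_a = 10          # people to whom event A happened
n_b = 100_000     # people to whom event B happened
n_a_and_b = 10    # people to whom both A and B happened

p_b_given_a = n_a_and_b / n_a   # P(B|A): of those with A, how many also had B
p_a_given_b = n_a_and_b / n_b   # P(A|B): of those with B, how many also had A

print(f"P(B|A) = {p_b_given_a}")
print(f"P(A|B) = {p_a_given_b}")
```

With these counts everyone who experienced A also experienced B, yet almost nobody who experienced B experienced A, so the two conditional probabilities sit at opposite ends of the scale.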
There are two reasons for this:
Statistics cannot tell us
Critical thinking
Article: Dennis & Kintsch
Evaluating Theories
A theory is a concise statement about how we believe the world to be.
Theories organize observations of the world and allow researchers to make predictions about what will happen in the future under certain conditions.
Science is about the testing of theories, and the data we collect as scientists should either implicitly or explicitly bear on theory.
It is useful to distinguish the characteristics that lead a theory to be successful from those that make it truly useful:
Descriptive adequacy
The extent to which it accords with data.
In psychology, the most popular way of comparing a theory against data is null hypothesis significance testing.
Determining whether a theory is consistent with data is not always as straightforward as it may at first appear.
Some of the subtleties involved in determining the extent to which a theory accords with data
Critical thinking
Article: Dienes (2008)
Degrees of falsifiability
A potential falsifier of a theory: any potential observation that would contradict the theory.
One theory is more falsifiable than another if the class of potential falsifiers is larger.
Scientists prefer simple theories.
Simple theories are easier to test.
A theory can gain in falsifiability not only by being precise, but also by being broad in the range of situations to which it applies.
The greater the universality of a theory, the more falsifiable it is, even if the predictions are not very precise.
Revisions to a theory may make it more falsifiable by specifying fine-grained causal mechanisms.
As long as the steps in a proposed causal pathway are testable, specifying the pathway gives you more falsifiers.
Psychologists sometimes theorize and make predictions by constructing computational models.
A computational model is a computer simulation of a subject, where the model is exposed to the same stimuli subjects receive and gives actual trial-by-trial responses.
A theory that allows everything explains nothing.
The more a theory forbids, the more it says about the world. The empirical content of a theory increases with its degree of falsifiability.
The more falsifiable a theory is, the more open it is to criticism.
So the more falsifiable our theories are, the faster we can make progress, given progress comes from criticism.
Science aims at the maximum falsifiability it can achieve: successive theories should be successively more falsifiable, either in terms of universality or precision.
Make sure that any revision or amendment to theory can be falsified. That way theory development is guaranteed to keep its empirical character.
Observations are always ‘theory impregnated’.
Falsification is not so simple as pitting theory against observation.
Theories determine what an observation is.
Critical thinking
Article: Foster (2010)
Causal Inference and Developmental Psychology
(the part needed for psychology at the UvA)
Four premises
Causal thinking and causal inference are unavoidable.
Causal inference as the goal of developmental psychology
The lesson is not that causal relationships can never be established outside of random assignment, but that they cannot be inferred from associations alone. Some additional assumptions are required.
The goal of this research should be to make causal inference as plausible as possible.
Doing so involves applying the best methods available among a growing set of tools.
As part of the proper use of those tools, the researcher should identify the key assumptions on which they rest and their plausibility in any particular application.
The researcher should check the consistency of those assumptions as much as possible using the available data. In many instances key assumptions will remain untestable.
The plausibility of those assumptions needs to be assessed in the light of substantive knowledge.
What constitutes credible or plausible is not without debate.
At this point, much of developmental psychology involves implausible causal inference.
Two conceptual tools are especially helpful in moving from associations to causal relationships.
This tool assists researchers in identifying the implications of a set of associations for understanding causality and the set of assumptions under which those associations imply causality.
Moving from association to causality requires ruling out potential confounders: variables associated with both treatment and outcome.
The DAG is particularly useful for helping the researcher to identify covariates and perhaps to understand unanticipated consequences of incorporating these variables.
Critical thinking
Article: Pearl (2018)
Confounding and deconfounding: or, slaying the lurking variable
Confounding bias occurs when a variable influences both who is selected for the treatment and the outcome of the experiment.
Sometimes the confounders are known. Other times they are merely suspected and act as a ‘lurking third variable’.
If we have measurements of the third variable, then it is very easy to deconfound the true and spurious effects.
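One simple way to adjust for a measured confounder is to compare treated and untreated cases within each level of the confounder and then average those comparisons. The sketch below uses simulated data under assumed numbers (a binary confounder z, a treatment x that is more likely when z is present, and an outcome y that depends on z but not on x at all). Stratification is used here as one illustrative adjustment method; the text does not prescribe a specific one.

```python
import random
from statistics import mean

def simulate(n=20000, seed=0):
    """Simulate (z, x, y) rows in which x has no effect on y (assumed setup)."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        z = rng.random() < 0.5                   # confounder
        x = rng.random() < (0.8 if z else 0.2)   # treatment depends on z
        y = 2.0 * z + rng.gauss(0, 1)            # outcome depends on z only
        rows.append((z, x, y))
    return rows

def mean_difference(rows):
    """Mean outcome of treated cases minus mean outcome of untreated cases."""
    treated = [y for _, x, y in rows if x]
    untreated = [y for _, x, y in rows if not x]
    return mean(treated) - mean(untreated)

rows = simulate()

# Naive comparison: ignores z, so it mixes the spurious path through z
# into the apparent effect of x.
naive = mean_difference(rows)

# Adjusted comparison: difference within each level of z,
# weighted by the share of cases in that level.
adjusted = 0.0
for level in (False, True):
    stratum = [row for row in rows if row[0] == level]
    adjusted += mean_difference(stratum) * len(stratum) / len(rows)

print(f"naive: {naive:.2f}, adjusted for z: {adjusted:.2f}")
```

The naive difference is clearly positive even though x does nothing, while the adjusted difference is close to zero.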
Statisticians both over- and underrate the importance of adjusting for possible confounders.
Knowing the set of assumptions that stand behind a given conclusion is not less valuable than attempting to circumvent those assumptions with an RCT, which has complications of its own.
The one circumstance under which scientists will abandon some of their reticence to talk about causality is when they have conducted a randomized controlled trial (RCT).
Randomization brings two benefits:
Another way, if you know what all the possible confounders are, is to measure and adjust for them.
But randomization has one great advantage: it severs every incoming link to the randomized variable, including the ones we don’t know about or cannot measure.
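The advantage of randomization over adjustment shows up when the confounder is not measured. In the sketch below, all numbers are assumptions: an unmeasured variable u drives the outcome, the true effect of treatment x is set to 1.0, and treatment is assigned either by a coin flip or in a way that depends on u.

```python
import random
from statistics import mean

def estimated_effect(randomize, n=20000, seed=0):
    """Difference in mean outcome between treated and untreated cases."""
    rng = random.Random(seed)
    treated, untreated = [], []
    for _ in range(n):
        u = rng.gauss(0, 1)                      # unmeasured confounder
        if randomize:
            x = rng.random() < 0.5               # coin flip: no link from u to x
        else:
            x = u + rng.gauss(0, 1) > 0          # u influences who gets treated
        y = 1.0 * x + 2.0 * u + rng.gauss(0, 1)  # assumed true effect of x is 1.0
        (treated if x else untreated).append(y)
    return mean(treated) - mean(untreated)

observational = estimated_effect(randomize=False)
randomized = estimated_effect(randomize=True)
print(f"observational: {observational:.2f}, randomized: {randomized:.2f}")
```

The randomized estimate lands near the assumed true effect without u ever being measured, whereas the observational estimate is inflated by the path through u.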
RCTs are preferred to observational studies.
But, in some cases, intervention may be physically impossible or unethical.
Provisional causality: causality contingent upon the set of assumptions that our causal diagram advertises.
The principal objective of an RCT is to eliminate confounding.
Confounding is not a statistical notion. It stands for the discrepancy between what we want to assess (the causal effect) and what we actually do assess using statistical methods.
If you can’t articulate mathematically what you want to assess, you can’t expect to define what constitutes a discrepancy.
Historically, the concept of ‘confounding’ has evolved around two related conceptions:
Both these concepts have resisted formalization.
Critical thinking
Article: Shadish (2008)
Critical thinking in Quasi-Experimentation
All experiments are about discovering the effects of causes.
All experiments have in common the deliberate manipulation of an assumed cause, followed by observation of the effects that follow.
A quasi-experiment: an experiment that does not use random assignment to conditions.
What is a cause?
An inus condition: a cause that is insufficient by itself. Its effectiveness requires it to be embedded in a larger set of conditions.
Most causal relationships are not deterministic, but only increase the probability that an effect will occur.
This is the reason why a given causal relationship will only occur under some conditions but not universally.
To different degrees, all causal relationships are contextually dependent, so the generalization of experimental effects is always at issue.
Experimental causes are manipulable.
Experiments explore the effects of things that can be manipulated.
Experimental causes must be manipulable.
In quasi-experiments, the cause is whatever was manipulated, which may include many more things than the researcher realizes were manipulated.
In quasi-experiments, especially if the researcher is not the person manipulating the treatment, it is easy to make mistaken claims about what was manipulated, and the context in which it occurred.
What is an effect?
In an experiment, we observe what did happen when people receive a treatment.
The counterfactual is knowledge of what would have happened to those same people if they simultaneously had not received treatment.
An effect is the difference between what did happen and what would have happened.
We can never observe the counterfactual.
Experiments try to create reasonable approximations to this physically impossible counterfactual.
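The definition of an effect can be written out for a handful of imaginary people. The numbers below are invented for illustration; in real research only one of the two outcomes per person is ever available, which is exactly why the counterfactual has to be approximated.

```python
# Hypothetical potential outcomes for four people (invented numbers).
# y_treated / y_untreated: what would happen with / without treatment.
people = [
    {"y_treated": 7, "y_untreated": 5, "got_treatment": True},
    {"y_treated": 6, "y_untreated": 6, "got_treatment": False},
    {"y_treated": 9, "y_untreated": 4, "got_treatment": True},
    {"y_treated": 5, "y_untreated": 3, "got_treatment": False},
]

# The effect for each person: outcome with treatment minus outcome without it.
true_effects = [p["y_treated"] - p["y_untreated"] for p in people]
average_true_effect = sum(true_effects) / len(true_effects)

# What a researcher can actually record: one outcome per person.
observed = [
    p["y_treated"] if p["got_treatment"] else p["y_untreated"] for p in people
]

# Group comparison based only on what was observed.
treated = [y for p, y in zip(people, observed) if p["got_treatment"]]
untreated = [y for p, y in zip(people, observed) if not p["got_treatment"]]
group_difference = sum(treated) / len(treated) - sum(untreated) / len(untreated)

print(true_effects, average_true_effect, observed, group_difference)
```

In this tiny example the observed group difference does not equal the average true effect, because the untreated group is only an approximation of what would have happened to the treated group.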
Two central tasks in experimental design are:
Random assignment forms a control group that is often the best approximation to this counterfactual that we can usually obtain, though even that control group is imperfect because the people in the control group are not identical to those in the treatment group.
However, we do know that participants in the treatment and control group differ from each other only randomly.
The problem in quasi-experiments is that differences between treatment and control are usually systematic, not random, so nonrandom controls may not tell us much about what would have happened to the treatment group if they had not received treatment.
Much of quasi-experimentation is concerned with creating good sources of counterfactual inference. In general, quasi-experiments use two different tools to do so.
Even then, the effects of quasi-experiments are rarely as trustworthy
Critical thinking
Article: Marewski & Olsson (2009)
Beyond the null ritual, formal modeling of psychological processes
Rituals can be characterized by a range of attributes including:
Each of these characteristics is reflected in null hypothesis significance testing.
One good way to make theories more precise is to cast them as formal models.
In doing so, researchers can move beyond the problems of null hypothesis significance testing, and simple difference searching.
In the broadest sense, a model is a simplified representation of the world that aims to explain observed data.
A model is a formal instantiation of a theory that specifies the theory’s predictions. This category also includes statistical tools, such as structural equation or regression models.
Statistical tools are not typically meant to mirror the workings of psychological mechanisms.
What is the scope of Modeling?
Modeling is not meant to be applied equally to all research questions. Each method has its specific advantages and disadvantages.
Modeling helps researchers answer involved questions and understand complex phenomena.
In psychology, modeling is especially suited for basic and applied research about the cognitive system.
Four closely interrelated benefits of increasing the precision of theories by casting them as models:
Designing strong tests of theories
Models provide the bridge between theories and empirical evidence.
They enable researchers to make competing quantitative predictions, which in turn lead to strong comparative tests of theories.
Any quantitative prediction can be systematically better or worse than any other.
But, as soon as one starts to compare quantitative predictions from different models, the use of null hypothesis testing can become inappropriate or meaningless.
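A comparative test of two models can be sketched as follows. Everything in this example is an assumption made for illustration: the observations are invented, the two candidate models (an exponential and a power function of delay) and their parameter values are hypothetical, and root-mean-square error is just one possible measure of fit.

```python
import math

# Hypothetical data: proportion recalled after each delay (invented numbers).
delays = [1, 2, 4, 8, 16]
observed = [0.70, 0.59, 0.45, 0.32, 0.25]

def exponential_model(t, rate=0.2):
    """Hypothetical model A: recall decays exponentially with delay."""
    return math.exp(-rate * t)

def power_model(t, exponent=0.5):
    """Hypothetical model B: recall decays as a power function of delay."""
    return (1 + t) ** -exponent

def rmse(model):
    """Root-mean-square error between a model's predictions and the data."""
    errors = [model(t) - y for t, y in zip(delays, observed)]
    return math.sqrt(sum(e * e for e in errors) / len(errors))

rmse_exponential = rmse(exponential_model)
rmse_power = rmse(power_model)
print(f"exponential: {rmse_exponential:.3f}, power: {rmse_power:.3f}")
```

Each model commits to a number for every delay, so the two can be ranked directly by how far their predictions fall from the data instead of each being tested against a null hypothesis.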
Sharpening research questions
Null hypothesis tests are often used to evaluate verbal, informal theories.
But if such theories are underspecified, then they can be used post hoc to ‘explain’ almost any observed empirical pattern.
Critical thinking
Article: Cronbach (1957)
The two disciplines of scientific psychology
The experimental method, where the scientist changes conditions in order to observe their consequences, is much the more coherent of our two disciplines.
Correlational psychology was slower to mature.
It qualifies equally as a discipline, because it asks a distinctive type of question and has technical methods of examining whether the question has been properly put and the data properly interpreted.
The well-known virtue of the experimental method is that it brings situational variables under tight control. It thus permits rigorous tests of hypotheses and confident statements about causation.
The correlational method can study what man has not learned to control or can never hope to control.
In the beginning, experimental psychology was a substitute for purely naturalistic observation of man-in-habitat.
The experiment came to be concerned with between-treatment variance.
And, today the majority of experimenters derive their hypotheses explicitly from theoretical premises and try to nail their results into a theoretical structure.
The goal in the experimental tradition is to get differential variables out of sight.
The correlational psychologist loves those variables the experimenter left home to forget.
Factor analysis is rapidly being perfected into a rigorous method of clarifying multivariate relationships.
The correlational psychologist is a mere observer of a play where Nature pulls a thousand strings: but his multivariate methods make him equally an expert, an expert in figuring out where to look for the hidden strings.
It is not enough for each discipline to borrow from the other.
Correlational psychology studies only variance among organisms; experimental psychology studies only variance among treatments.
A united discipline will study both of these, but it will also be concerned with the otherwise neglected interactions between organismic and treatment variables.
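What a neglected interaction looks like can be shown with a two-by-two table of cell means. The numbers and labels below are invented for illustration: two kinds of organism (here called "low" and "high" on some trait) receive one of two treatments, A or B.

```python
# Hypothetical mean scores per cell (invented numbers).
cell_means = {
    ("low", "A"): 60, ("low", "B"): 40,
    ("high", "A"): 40, ("high", "B"): 60,
}

def average(values):
    values = list(values)
    return sum(values) / len(values)

# What the experimenter looks at: the difference between treatments,
# averaged over organisms.
treatment_effect = (
    average(m for (_, t), m in cell_means.items() if t == "A")
    - average(m for (_, t), m in cell_means.items() if t == "B")
)

# What the correlational psychologist looks at: the difference between
# kinds of organism, averaged over treatments.
organism_effect = (
    average(m for (o, _), m in cell_means.items() if o == "low")
    - average(m for (o, _), m in cell_means.items() if o == "high")
)

# The interaction: how much the treatment difference changes across organisms.
interaction = (
    (cell_means[("low", "A")] - cell_means[("low", "B")])
    - (cell_means[("high", "A")] - cell_means[("high", "B")])
)

print(treatment_effect, organism_effect, interaction)
```

In this constructed case neither discipline working alone finds anything, because both averaged differences are zero, while the interaction is large: treatment A suits one kind of organism and treatment B the other.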
Our job is to invent constructions and to form a network of laws which permits prediction.
From observations we must infer a psychological description of the situation and of the present state of the organism.
Our laws should permit us to predict, from this description, the behaviour of organism-in-situation.
Methodologies for a joint discipline have already been proposed.
Critical thinking
Article: Kievit, Frankenhuis, Waldorp, & Borsboom (2013)
Simpson's paradox in psychological science: a practical guide
Introduction
Simpson’s paradox: the direction of an association at the population-level may be reversed within the subgroups comprising that population.
Simpson showed that a statistical relation observed in a population could be reversed within all of the subgroups that make up that population.
Simpson’s paradox is a counter-intuitive feature of aggregated data, which may arise when (causal) inferences are drawn across different explanatory levels (for example, from population to subgroup, or from subgroup to individual).
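The reversal can be reproduced with a few counts. The figures below are invented for illustration and are not taken from the article: two subgroups each contain treated and control cases, recorded as (successes, total).

```python
# Hypothetical counts, invented for illustration: (successes, total).
data = {
    "subgroup 1": {"treated": (8, 10), "control": (70, 100)},
    "subgroup 2": {"treated": (40, 100), "control": (3, 10)},
}

def rate(successes, total):
    return successes / total

# Within each subgroup: is the success rate higher for treated cases?
treated_better_within = {
    name: rate(*cells["treated"]) > rate(*cells["control"])
    for name, cells in data.items()
}

def pooled_rate(condition):
    """Success rate for a condition after merging the subgroups."""
    successes = sum(cells[condition][0] for cells in data.values())
    total = sum(cells[condition][1] for cells in data.values())
    return successes / total

print(treated_better_within)
print(f"pooled treated: {pooled_rate('treated'):.3f}, "
      f"pooled control: {pooled_rate('control'):.3f}")
```

Treated cases do better in both subgroups, yet worse in the pooled data, because most treated cases sit in the subgroup with the lower overall success rate.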
Simpson’s paradox is conceptually and analytically related to many statistical challenges and techniques.
The underlying shared theme of these techniques is that they are concerned with the nature of (causal) inference. The challenge is what inferences are warranted based on the data we observe.
One can only be sure that a group-level finding generalizes to individuals when the data are ergodic, which is a very strict requirement.
Since this requirement is unlikely to hold in many data sets, extreme caution is warranted in generalizing across levels.
The dimensions that appear in a covariance structure analysis describe patterns of variation between people, not variation within individuals over time.
A person X may have a position on five dimensions compared to other people in a given population, but this does not imply that person varies along this number of dimensions over time.
Two variables may correlate positively across a population of individuals, but negatively within each individual over time.
Simpson’s paradox may occur in a wide variety of research designs, methods, and questions.
There is no single mathematical property that all instances of SP have in common. Therefore, there will not be a single, correct rule for analysing data so as to prevent cases of SP.
What we can do is consider the instances of SP we are most likely to encounter, and investigate them for characteristic warning signals.
The most general danger for psychology is that we might incorrectly infer that a finding at the level of the group generalizes to subgroups, or to individuals over time.
Preventing Simpson’s paradox
Develop and test mechanistic explanations
The first step in addressing SP is to carefully consider when it may arise.
The mechanistic inference we propose to explain the data may be incorrect.
This danger arises when we use data at one explanatory level to infer a cause at a different explanatory
Critical thinking
Article: LeBel & Peters (2011)
Fearing the future of empirical psychology
Because empirical data underdetermine theory choice, alternative explanations of data are always possible, both when the data statistically support the researcher’s hypothesis and when they fail to do so.
The interpretation bias: a bias toward interpretations of data that favour a researcher’s theory, both when the null hypothesis is statistically rejected and when not.
This bias entails that, regardless of how data turn out, the theory whose predictions are being tested is artificially buffered from falsification.
The ultimate consequence is an increased risk of reporting false positives and disregarding true negatives, and so drawing incorrect conclusions about human psychology.
The research bias underlying the file-drawer problem in no way depends on unscrupulous motives.
The knowledge system that constitutes a science such as psychology can be roughly divided into two types of belief:
In any empirical test of a hypothesis, interpretation of the resulting data depends on both theory-relevant and method-relevant beliefs, as both types of belief are required to bring the hypothesis to empirical test.
Consequently, the resulting data can always be interpreted as theory relevant or as method relevant.
Weaknesses in the current knowledge system of empirical psychology bias the resulting choice of interpretation in favour of the researcher’s theory.
Deficiencies in methodological research practice systematically bias
This has the result that the researcher’s hypothesis is artificially buffered from falsification.
The interpretation of data should hinge not on what the pertinent beliefs are about, but rather on the centrality of those beliefs.
The centrality of belief reflects its position within the knowledge system: central beliefs are those on which many other beliefs depend. Peripheral beliefs are those with few dependent beliefs.
The rejection of central beliefs to account for observed data entails a major restructuring of the overall knowledge system.
Conservatism: choosing the theoretical explanation consistent with the data that requires the least amount of restructuring of the existing knowledge system.
Generally, the conservatism in theory choice is a virtue, as it reduces ambiguity in the interpretation of data.
The value of methodological rigour is precisely that, by leveraging conservatism, it becomes more difficult to blame negative results on flawed methodology.
When method-relevant
Critical thinking
Article: Scott O. Lilienfeld (2005)
The 10 commandments of helping students distinguish science from pseudoscience in psychology
The first commandment
It is important to communicate to students that the differences between science and pseudoscience, although not absolute or clear-cut, are neither arbitrary nor subjective.
Warning signs that characterize most pseudoscientific disciplines:
None of these warning signs is by itself sufficient to indicate that a discipline is pseudoscientific.
But, the more of these warning signs a discipline exhibits, the more suspect it should become.
The second commandment
Learning to distinguish scepticism from cynicism.
One danger of teaching students to distinguish science from pseudoscience is that we can inadvertently produce students who are reflexively dismissive of any claim that appears implausible.
Scepticism, which is the proper mental set of the scientist, implies two seemingly contradictory attitudes:
Cynicism implies close-mindedness.
The third commandment
Distinguish methodological scepticism from philosophical scepticism.
There is a continuum of confidence in scientific claims.
The fourth commandment
Distinguish pseudoscientific claims from claims that are merely false.
The key difference between science and pseudoscience lies not in their content but in their approach to evidence.
The fifth commandment
Distinguish science from scientists.
The scientific method is a toolbox of skills that scientists have developed to prevent themselves from confirming their own biases.
The sixth commandment
Explain the cognitive underpinnings of pseudoscientific beliefs.
We are all prone
This is a list of the important terms used in the articles of block 2 of WSRt at the UvA.
Accuracy motives: to learn and publish true things about human nature
Professional motives: to succeed and thrive professionally.
Statistical inference: the logic underlying all the statistics you see in the professional journals of psychology and most other disciplines that regularly use statistics.
The subjective interpretation of probability: a probability is a degree of conviction in a belief.
The objective interpretation of probability: locates probability in the world.
Alpha: the long-term error rate for one type of error: saying the null is false when it is true.
Type I error: when the null is true and we reject it.
Type II error: accepting the null when it is false.
Meta-analysis: the process of combining groups of studies together to obtain overall tests of significance.
Descriptive adequacy: does the theory accord with the available data?
Precision and interpretability: Is the theory described in a sufficiently precise fashion that other theorists can interpret it easily and unambiguously?
Coherence and consistency: Are there logical flaws in the theory? Does each component of the theory seem to fit with the others into a coherent whole? Is it consistent with theory in other domains?
Prediction and falsifiability: Is the theory formulated in such a way that critical tests can be conducted that could reasonably lead to the rejection of the theory?
Postdiction and explanation: Does the theory provide a genuine explanation of existing results?
Parsimony: Is the theory as simple as possible?
Originality: Is the theory new or is it essentially a restatement of an existing theory?
Breadth: does the theory apply to a broad range of phenomena or is it restricted to a limited domain?
Usability: does the theory have applied implications?
Rationality: does the theory make claims about the architecture of mind that seem reasonable in the light of the environmental contingencies that have shaped our evolutionary history?
This magazine contains all the summaries you need for the course WSRt in the second year of psychology at the UvA.