Making replication mainstream – Zwaan et al. - 2018 - Article

Why are replications important?
What is this review’s purpose?
What is some background information on this topic?
What are issues concerning replicability?
Which types of replication studies exist?
What are some concerns about replicability?
Summary and conclusions

Why are replications important?

Being able to effectively replicate research findings is important for scientific progress. What defines science is that researchers don’t accept claims without critical evaluation of the evidence. Part of this evaluation process is the independent replication of findings. But the value of replication as a normal feature of psychology has recently become a controversial topic.

Replications are important for falsifying hypotheses. Under Lakatos’ notion of sophisticated falsificationism, when a prediction fails, an auxiliary hypothesis can be formed that allows the expanded theory to accommodate the troublesome result. If more falsifications arise and more auxiliary hypotheses must be formed to account for unsupported predictions, problems begin to pile up for the theory: the research program is degenerative. If the auxiliary hypotheses are empirically successful, the program gains explanatory power: it is progressive. Replications are a tool to distinguish between progressive and degenerative research.

What is this review’s purpose?

This paper aims to educate readers on the value of replications and to integrate recent discussions about them, giving a foundation for future replication efforts. The authors hope that this will make replication studies a more regular part of research, which should increase the credibility of findings.

What is some background information on this topic?

Popper said that an ‘effect’ that’s found once but can’t be reproduced does not qualify as a scientific discovery; it is chimeric (hoped for but impossible to achieve). Two important insights inform scientific thinking:

A finding needs to be repeatable to count as a scientific discovery.
Research needs to be reported in a way that others can reproduce the procedures.

Therefore, scientific discovery needs a consistent effect and a comprehensive description of the method used to produce the result in the first place. However, this does not imply that all replications are expected to be successful or that no expertise is required to conduct them.

What are issues concerning replicability?

Concerns about the replicability of findings exist in various disciplines. Problems with replicability and false positives can happen for many reasons:

Publication bias – the process in which research findings are selected for publication based on how much support they give for a hypothesis.
Researcher degrees of freedom – latitude in how research is conducted, analyzed, and reported. A growing body of meta-scientific research shows their effects: the existence of researcher degrees of freedom allows investigators to try different analytic options until they find a combination that gives a significant result, especially when there is pressure to publish significant findings.
- Confirmation bias: can convince investigators that the procedures that led to the significant result were the ‘best’ approach in the first place.
HARKing – hypothesizing after the results are known.

Researcher degrees of freedom and publication bias favouring statistically significant results have produced overestimations of effect sizes in the literature. Replication is important to provide more accurate estimates of effect size.
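The overestimation described above can be illustrated with a small simulation. This is a minimal sketch, not a procedure from the article: the true effect size (d = 0.2), the sample size (20 per group), the number of studies, and the rule that only significant results in the predicted direction get "published" are all assumptions chosen for illustration.

```python
import random
import statistics


def simulate_study(true_d, n_per_group, rng):
    """Simulate one two-group study; return (observed Cohen's d, t statistic)."""
    control = [rng.gauss(0.0, 1.0) for _ in range(n_per_group)]
    treatment = [rng.gauss(true_d, 1.0) for _ in range(n_per_group)]
    # With equal group sizes the pooled variance is the mean of both variances.
    pooled_sd = ((statistics.variance(control) + statistics.variance(treatment)) / 2) ** 0.5
    d = (statistics.mean(treatment) - statistics.mean(control)) / pooled_sd
    # For equal n, the two-sample t statistic equals d * sqrt(n / 2).
    t = d * (n_per_group / 2) ** 0.5
    return d, t


def run_simulation(true_d=0.2, n_per_group=20, n_studies=5000, t_crit=2.024, seed=1):
    """Compare the mean effect size of all studies with that of 'published' ones.

    t_crit = 2.024 is the two-sided 5% critical value for df = 38, i.e. it
    matches n_per_group = 20 only (assumption: change both together).
    """
    rng = random.Random(seed)
    results = [simulate_study(true_d, n_per_group, rng) for _ in range(n_studies)]
    all_d = [d for d, _ in results]
    # Hypothetical publication filter: significant and in the predicted direction.
    published_d = [d for d, t in results if t > t_crit]
    share_published = len(published_d) / n_studies
    return statistics.mean(all_d), statistics.mean(published_d), share_published


if __name__ == "__main__":
    mean_all, mean_published, share = run_simulation()
    print(f"mean d, all studies:       {mean_all:.2f}")
    print(f"mean d, published studies: {mean_published:.2f}")
    print(f"share published:           {share:.1%}")
```

Under these assumptions a study only reaches significance when its observed d exceeds roughly 0.64, so the "published" average necessarily lies far above the true value of 0.2, while the average over all studies stays close to it. A large replication, which is reported regardless of outcome, is not subject to this filter.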

Which types of replication studies exist?

Replication studies serve many purposes, and those objectives determine how a study is designed and interpreted. Schmidt (2009) identified five functions:

Address sampling error (i.e. false-positive detection)
Control for artifacts
Address researcher fraud
Test generalizations to different populations
Test the same hypothesis of a previous study using a different procedure

Multiple definitions have been proposed for direct and conceptual replications.

Direct replication: study that tries to recreate the critical elements (samples, procedures, measures) of an original study. It does not have to duplicate all aspects but only elements believed to be necessary for producing the original effect. Useful for reducing false positives.
- Theoretical commitment: researchers should agree on what those critical elements are.
Conceptual replication: study where there are changes to the original procedures that might make a difference regarding the observed effect. It is designed to test whether an effect extends to different populations, given theoretical reasons to assume it will be weaker or stronger for different groups.

What are some concerns about replicability?

The interpretation of replications has produced disagreement and controversy. Here we consider the most frequent concerns.

Concern 1: Context is too variable

Perhaps the most voiced concern about direct replications is that the conditions under which an effect is initially observed may not hold when a replication attempt is performed, because the context has changed.

Proponents of this concern suggest that it is too hard to specify all the contextual factors and that it is extremely difficult for independent researchers to recreate these conditions with precision. Consequently, it is never possible to determine whether a ‘failed’ replication is due to the original demonstration being a false positive or to the context having changed so much that the effect was wiped out.

Response C1

Context changes can and should be considered as a possible explanation for why a replication failed to obtain similar results to the original. In psychology it is rare that context does not matter or does not play a role in the outcome.

Nevertheless, the post hoc reliance on context sensitivity as an explanation for failed replication attempts is problematic for science. The fact that contextual factors vary between studies means that post hoc, context-based explanations are always possible to generate. Reliance on context sensitivity as an explanation, without committing to collecting new empirical evidence to test the idea, makes the original theory unfalsifiable – degenerative research.

An uncritical acceptance of these post hoc explanations ignores the possibility that false positives ever existed. The post hoc consideration of differences in features should lead to new testable hypotheses rather than dismissals of replication results.

Two strategies for solving the concerns outlined in this section are to (1) raise standards in reporting of experimental detail, so that the original papers include replication recipes, and (2) find ways to encourage original authors to identify potential boundaries/caveats in their research.

Concern 2: The theoretical value of direct replication is limited

Many arguments against replications agree on a general claim that direct replications aren’t necessary because they either have limited informational value or are misleading. The concern is that direct replications provide a false sense of certainty about the robustness of an underlying idea. Furthermore, replications can be unreliable just like original studies can, meaning that one can be sceptical about the value of any individual replication study.

Response C2

This concern implies that neither successful nor failed direct replications make novel contributions to theory. Some find it unworthy to do work that doesn’t advance a theory. But repeatedly showing that a theoretically predicted effect isn’t empirically supported adds knowledge to the field. Research that leads to identifying moderators and boundary conditions adds knowledge.

Direct replications are also necessary if researchers want to further explore a finding that emerged in exploratory research (e.g. a pilot study).

Procedures that can be used to accomplish some aims of direct replications include preregistration. It can reduce or prevent the use of researcher degrees of freedom, consequently reducing false positives. In preregistration, a researcher details a study design and analysis plan on a website before data are collected. Public preregistration can at least reduce publication bias.

Concern 3: Direct replications aren’t feasible in certain domains

It’s argued that replications may not be desirable or possible because of practical concerns. Certain studies may capitalize on extremely rare events like a natural disaster or astronomical event; replications to test the effects of these events are impossible. So if being able to replicate a finding is what makes something ‘scientific’, then a lot of research would be excluded. Some topics would be privileged as more scientific or rigorous than others (a caste system). Replication studies are also more common in areas where studies are easier to conduct (e.g. universities), which introduces bias.

Response C3

The fact that replications aren’t always possible doesn’t undermine their value. Researchers working in areas where replication is difficult should be alert to such concerns and make efforts to avoid the resulting problems.

Costs of replication will be borne by researchers in that area. However, the goal of replication isn’t to establish the robustness of a certain field but instead to test the robustness of a certain effect.

Concern 4: Replications are a distraction

This concern holds the view that problems existing in the field may be so severe that attempts to replicate studies that currently exist will be a waste of time and could distract from bigger problems faced by psychology.

Related argument: the main problem in the accumulation of scientific knowledge is publication bias. Failed replications exist but aren’t being published. Once the omission of these studies is addressed, meta-analyses won’t be compromised and will provide an efficient means to identify the most reliable findings in the field. The idea is that even if replication studies tell us something useful, there are more efficient strategies to improve the field that have fewer negative consequences.

Response C4

Direct replications have unique benefits. It’s clear that failures to replicate past research findings have received the most attention, but large-scale successful replications also have rhetorical power, showing that the field is capable of producing robust findings on which future work can build.

Replications also provide a simple metric by which we can evaluate the extent of the problem and the degree to which certain solutions work.

Concern 5: Replications affect reputations

Some debates about replication studies concern their effects on reputations. Authors whose findings fail to replicate could face questions about their competence and feel victimized. Replications also create reputational concerns for the replicators, who deserve credit for their effort in addressing the robustness of the original finding. Some argue that the replication crisis has created ‘a career niche for bad experimenters’.

Another reputational concern comes from the fact that several of the most visible replication projects to date have involved large groups of researchers. How does one determine the contributions of and assign credits to authors of a multi-authored article?

Response C5

Replicators should go out of their way to describe their results, and the implications for the original work, carefully, objectively, and without exaggeration. It can be useful for replicators and original authors to have contact, and in some cases to collaborate. Such a collaboration is a cooperative effort undertaken by two investigators who hold different views on an empirical question.

As more replications are conducted, the experience of having a study fail to replicate will become more normative, and hopefully less unpleasant.

Concern 6: There is no standard method to evaluate replication results

This concern is about the interpretation of replication results. Two researchers can look at the same study and come to different conclusions about whether the original effect was successfully duplicated. So what’s the point of running replication studies if the field can’t agree on which ones are successful?

Response C6

There’s growing consensus on which analyses are most likely to give reasonable answers to the question of whether a replication provides results consistent with those from an original study: frequentist estimation and Bayesian hypothesis testing. Investigators should consider multiple approaches and pre-register their analytic plans. Two approaches are especially promising:

Small telescopes approach: focuses on interpreting confidence intervals from the replication. The idea is to consider what effect size the original study would have had 33% power to detect and then use this value as a benchmark for the replication. If the 90% CI from the replication excludes this value, then we say the original study couldn’t have meaningfully examined this effect. The approach thus focuses on the design of the original study.
Replication Bayes factor approach: the replication Bayes factor is a number representing the amount by which new data (the replication) shift the balance of evidence between two hypotheses. The extent of the shift depends on how accurately the competing hypotheses predict the observed data. In a replication, the researcher compares statistical hypotheses that map to (1) a hypothetical optimistic proponent of the original effect (who predicts the replication effect size to be away from 0) and (2) a hypothetical sceptic who thinks the original effect doesn’t exist (and predicts the replication effect size to be close to 0). The replication Bayes factor compares the accuracy of these predictions.
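The small telescopes logic can be sketched in a few lines of code. This is a rough illustration under stated assumptions, not the published procedure: it assumes a two-group design with equal group sizes, uses a normal approximation for power and for the confidence interval of Cohen's d, and reads "excludes this value" as "the upper bound of the 90% CI lies below the benchmark". The function names and the example numbers are hypothetical.

```python
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # standard normal, used for all approximations below


def d33(n_per_group, alpha=0.05, power=1 / 3):
    """Effect size the original study had `power` (default 33%) to detect.

    Normal approximation for a two-sided, two-group test with equal n:
    power = 1 - Phi(z_crit - d * sqrt(n / 2)), solved for d.
    """
    z_crit = _STD_NORMAL.inv_cdf(1 - alpha / 2)
    z_power = _STD_NORMAL.inv_cdf(1 - power)
    return (z_crit - z_power) * (2 / n_per_group) ** 0.5


def replication_ci(d_rep, n_per_group, level=0.90):
    """Approximate confidence interval for the replication's Cohen's d."""
    # Large-sample standard error of d for two groups of equal size.
    se = (2 / n_per_group + d_rep ** 2 / (4 * n_per_group)) ** 0.5
    z = _STD_NORMAL.inv_cdf(0.5 + level / 2)
    return d_rep - z * se, d_rep + z * se


def small_telescopes(n_orig, d_rep, n_rep):
    """Check whether the replication estimate is reliably smaller than d33."""
    benchmark = d33(n_orig)
    low, high = replication_ci(d_rep, n_rep)
    return {
        "d33": benchmark,
        "ci_90": (low, high),
        # True: even the upper bound of the 90% CI is below the benchmark.
        "smaller_than_d33": high < benchmark,
    }


if __name__ == "__main__":
    # Hypothetical numbers: original study with 20 per group,
    # replication with 200 per group observing d = 0.05.
    print(small_telescopes(n_orig=20, d_rep=0.05, n_rep=200))
```

Note that the benchmark depends only on the original study's sample size, not on the effect it reported. That is why the approach is said to focus on the design of the original study: a replication that rules out d33 suggests the original sample was too small to have detected an effect of the size that could plausibly exist.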

Summary and conclusions

Repeatability is essential to science. Arguably, a finding isn’t scientifically meaningful until it can be replicated with the same procedures that produced it initially. Direct replication is the mechanism by which replicability is assessed and a tool for distinguishing progressive from degenerative research.

A Nature survey reported that 52% of scientists believe there is a significant replication crisis in science and 38% believe there is a slight crisis. In the same survey, 70% of researchers said they had tried and failed to reproduce another scientist’s findings.

Many concerns have been raised:

When replications should be expected to fail.
What informational value they provide to a field that hopes to push a theory forward.
The fairness and reputational consequences of replications.
The difficulty in deciding when a replication has succeeded or failed.

Replication helps clarify which findings in the field we should be confident in as we move forward.
