Estimating the reproducibility of psychological science – Open Science Collaboration - 2015 - Article
Introduction
Reproducibility is a core principle of scientific progress. Scientific claims should gain credibility from the replicability of their evidence rather than from the status of their originator. Debates about the transparency of methodology and claims are meaningless if the evidence being debated is not reproducible.
What is replication?
Direct replication tries to recreate conditions believed necessary to obtain a previously observed finding and is a way to establish reproducibility of a finding with new data. A direct replication may not achieve the original result for many reasons:
- Known or unknown differences between the replication and original study could moderate the size of an observed effect.
- Original result could have been a false positive.
- The replication could produce a false negative.
False positives and false negatives provide misleading information about effects, and failure to identify the conditions necessary to reproduce a finding indicates an incomplete theoretical understanding. Direct replication provides the chance to assess and improve reproducibility.
Method & Results
The authors developed a protocol for selecting and conducting high-quality replications. Collaborators joined the project, selected a study for replication from the available studies in the sampling frame, and were guided through the replication process.
The Open Science Collaboration conducted 100 replications (270 contributing authors) of studies published in three psychology journals, using high-powered designs and original materials when possible. By consulting original authors, obtaining original materials, and conducting internal reviews, the replications maintained high fidelity to the original designs.
They evaluated replication success using significance and P values, effect sizes, subjective assessments by the replication teams, and meta-analysis of effect sizes (see the sketch after this list). The mean effect size of the replication effects was half the size of the mean effect size of the original effects, a substantial decrease.
- 97% of original studies had significant results (P < .05).
- 36% of replications had significant results.
- 47% of original effect sizes were in the 95% confidence interval of the replication effect size.
- 39% of effects were subjectively rated to have replicated the original result.
- Assuming no bias in the original results, combining original and replication results left 68% with statistically significant effects.
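As a rough illustration of how criteria like these can be computed, the sketch below works through a single hypothetical original/replication pair in Python: it tests whether the replication is significant at P < .05, checks whether the original correlation falls inside the replication's 95% confidence interval, and combines the two estimates with a fixed-effect meta-analysis via the Fisher z transform. The numbers and helper functions are illustrative assumptions, not the Collaboration's actual analysis code.

```python
# Minimal sketch (not the OSC's analysis code): evaluating one hypothetical
# original/replication pair on three of the criteria listed above, using
# correlation coefficients as the common effect-size metric.
import math
from scipy import stats

def fisher_z(r):
    """Fisher r-to-z transform; the variance of z is approximately 1/(n-3)."""
    return math.atanh(r)

def ci_for_r(r, n, level=0.95):
    """Approximate confidence interval for a correlation via Fisher z."""
    z = fisher_z(r)
    se = 1.0 / math.sqrt(n - 3)
    crit = stats.norm.ppf(0.5 + level / 2)
    return math.tanh(z - crit * se), math.tanh(z + crit * se)

def meta_combine(r1, n1, r2, n2):
    """Fixed-effect (inverse-variance) combination of two correlations."""
    z1, z2 = fisher_z(r1), fisher_z(r2)
    w1, w2 = n1 - 3, n2 - 3              # inverse of the Fisher-z variances
    z = (w1 * z1 + w2 * z2) / (w1 + w2)
    se = 1.0 / math.sqrt(w1 + w2)
    p = 2 * stats.norm.sf(abs(z / se))
    return math.tanh(z), p

# Hypothetical study pair: original r = .40 (n = 50), replication r = .20 (n = 120).
r_orig, n_orig = 0.40, 50
r_rep, n_rep = 0.20, 120

# Criterion 1: is the replication itself significant at P < .05?
z_rep = fisher_z(r_rep) * math.sqrt(n_rep - 3)
p_rep = 2 * stats.norm.sf(abs(z_rep))
print(f"replication P = {p_rep:.3f}, significant: {p_rep < .05}")

# Criterion 2: does the original effect fall inside the replication's 95% CI?
lo, hi = ci_for_r(r_rep, n_rep)
print(f"replication 95% CI = ({lo:.2f}, {hi:.2f}), covers original: {lo <= r_orig <= hi}")

# Criterion 3: meta-analytic estimate combining both studies.
r_meta, p_meta = meta_combine(r_orig, n_orig, r_rep, n_rep)
print(f"combined r = {r_meta:.2f}, P = {p_meta:.4f}")
```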
Tests suggest that replication success was better predicted by the strength of the original evidence than by characteristics of the original and replication teams.
Does successful replication mean theoretical understanding is correct?
It would be too easy to conclude that. Direct replication mainly provides evidence for the reliability of a result. Understanding is achieved through many, diverse investigations that give converging support for a theoretical interpretation and rule out alternative explanations.
Does failure to replicate mean the original evidence was a false positive?
It would also be too easy to conclude this. Replications can fail if the methodology differs from the original in ways that interfere with observing the effect. The Open Science Collaboration conducted replications designed to minimize a priori reasons to expect different results by using original materials, contacting original authors for review of the designs, and conducting internal reviews. However, unanticipated factors in the sample, setting, or procedure could still have altered the observed effect sizes.
How is reproducibility influenced by publication bias?
There are indications that cultural practices in scientific communication could be responsible for the observed results. The combination of low-powered research designs and publication bias produces a literature with upwardly biased effect sizes. On this account, replication effect sizes would regularly be smaller than those of the original studies, not because of differences in implementation but because the original effect sizes are inflated by publication and reporting bias while the replication effect sizes are not.
Consistent with this expectation, most replication effects were smaller than the originals, and replication success correlated with indicators of the strength of the initial evidence, such as lower original P values and larger original effect sizes. This suggests that publication, selection, and reporting biases are plausible explanations for the difference between original and replication effects. The replication studies reduced these biases because preregistration and pre-analysis plans ensured confirmatory tests and the reporting of all results.
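To make this mechanism concrete, the following small simulation (an illustrative sketch under assumed numbers, not part of the original article) generates many low-powered two-group studies of a small true effect, "publishes" only the significant ones, and compares the mean published effect size with the mean of unbiased replications of the same effect.

```python
# Simulation sketch: why low power plus publication bias inflates published
# effect sizes, and why unbiased replications then look smaller on average.
import numpy as np

rng = np.random.default_rng(0)

true_d = 0.20        # assumed true standardized mean difference (small effect)
n_per_group = 30     # low-powered original studies
n_studies = 20_000

def simulate_d(n, d, size):
    """Observed Cohen's d for two-group studies with n participants per group."""
    g1 = rng.normal(d, 1.0, size=(size, n))
    g2 = rng.normal(0.0, 1.0, size=(size, n))
    pooled_sd = np.sqrt((g1.var(axis=1, ddof=1) + g2.var(axis=1, ddof=1)) / 2)
    return (g1.mean(axis=1) - g2.mean(axis=1)) / pooled_sd

# "Original" literature: only studies that reach roughly P < .05 get published.
d_obs = simulate_d(n_per_group, true_d, n_studies)
se_d = np.sqrt(2 / n_per_group)            # rough standard error of d
published = d_obs[np.abs(d_obs / se_d) > 1.96]

# Unbiased replications of the same true effect (same n, no selection).
replications = simulate_d(n_per_group, true_d, n_studies)

print(f"true effect:                 d = {true_d:.2f}")
print(f"mean published original:     d = {published.mean():.2f}")
print(f"mean replication (no bias):  d = {replications.mean():.2f}")
```

Running this shows the published originals averaging well above the true effect, while the unselected replications cluster around it, mirroring the pattern described above.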
Furthermore, repeated replication efforts that fail to identify conditions under which the original finding can be observed reliably could reduce confidence in the original finding.
Conclusion
The five indicators examined here are not the only ways to evaluate reproducibility, but the results offer a clear, collective conclusion: a large portion of replications produced weaker evidence for the original findings, despite using materials provided by the original authors, internal reviews, and high statistical power to detect the original effect sizes.
Correlational evidence is consistent with the conclusion that variation in the strength of initial evidence was more predictive of replication success than variation in the characteristics of the teams conducting the research.
Reproducibility is not well understood because of the incentives for individual scientists to prioritize novelty over replication. Innovation is an engine of discovery and is vital for a productive, effective scientific enterprise, but journal reviewers and editors may dismiss a new test of a published idea as unoriginal. Innovation shows that paths are possible; replication shows that paths are likely; progress relies on both. Scientific progress is a process of uncertainty reduction that can only succeed if science remains sceptical of its explanatory claims.