Estimating the reproducibility of psychological science – Open Science Collaboration - 2015 - Article


Introduction

Reproducibility is a core principle of scientific progress. Scientific claims should gain credibility from the replicability of their evidence rather than from the status of their originator. Debates about the transparency of methodology and claims are meaningless if the evidence being debated is not reproducible.

What is replication?

Direct replication tries to recreate the conditions believed necessary to obtain a previously observed finding. It is a way to establish the reproducibility of a finding with new data. A direct replication may not reproduce the original result for several reasons:

  • Known or unknown differences between the replication and the original study could moderate the size of the observed effect.
  • The original result could have been a false positive.
  • The replication could produce a false negative.

False positives and false negatives provide misleading information about effects, and failure to identify the conditions necessary to reproduce a finding shows an incomplete theoretical understanding. Direct replication provides the chance to assess and improve reproducibility.

Method & Results

The authors developed a protocol for selecting and conducting high-quality replications. Collaborators joined the project, selected a study for replication from the available studies in the sampling frame, and were guided through the replication process.

The Open Science Collaboration (270 contributing authors) conducted 100 replications of studies published in three psychological journals, using high-powered designs and original materials when possible. By consulting original authors, obtaining original materials, and reviewing designs internally, the replications maintained high fidelity to the original designs.
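
To make "high-powered design" concrete, the sketch below estimates how many participants a replication would need to detect an original effect with a given power. It is only an illustration and not the authors' procedure: it assumes the effect is expressed as a correlation r, uses the Fisher z approximation with a two-sided test at alpha = .05, and the function name `n_for_power` is hypothetical.

```python
import math
from statistics import NormalDist

def n_for_power(r, power=0.80, alpha=0.05):
    """Approximate sample size needed to detect a correlation r.

    Assumption: Fisher z approximation, two-sided test at level alpha.
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # critical value of the test
    z_power = NormalDist().inv_cdf(power)          # quantile for the target power
    effect_z = math.atanh(abs(r))                  # effect size on the Fisher z scale
    return math.ceil(((z_alpha + z_power) / effect_z) ** 2 + 3)

if __name__ == "__main__":
    # Hypothetical original effect sizes, not values from the article
    for r in (0.5, 0.3, 0.1):
        print(r, n_for_power(r))
```

Under these assumptions, an original effect of r = 0.30 would call for roughly 85 participants to reach 80% power. If the original effect size is an overestimate, the true power of a replication planned this way is lower than intended.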

The authors evaluated replication success using significance and P values, effect sizes, subjective assessments of replication teams, and meta-analysis of effect sizes. The mean effect size of the replication effects was half that of the original effects, a substantial decrease.

  • 97% of original studies had significant results (P < .05).
  • 36% of replications had significant results.
  • 47% of original effect sizes were in the 95% confidence interval of the replication effect size.
  • 39% of effects were subjectively rated to have replicated the original result.
  • Assuming no bias in original results, combining original and replication results left 68% with statistically significant effects.
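
Two of the indicators above can be illustrated with a short sketch: checking whether an original effect size lies inside the 95% confidence interval of the replication effect size, and combining the two estimates in a meta-analysis. The code assumes effects are expressed as correlations r and uses the Fisher z transformation with a fixed-effect (inverse-variance) combination. This is one common approach and not necessarily the authors' exact procedure, and all function names and numbers are hypothetical.

```python
import math

Z_CRIT = 1.959964  # two-sided 5% critical value of the standard normal

def fisher_ci(r, n):
    """Approximate 95% confidence interval for a correlation r from n observations."""
    z = math.atanh(r)             # Fisher z transform of r
    se = 1.0 / math.sqrt(n - 3)   # approximate standard error of z
    return math.tanh(z - Z_CRIT * se), math.tanh(z + Z_CRIT * se)

def original_in_replication_ci(r_orig, r_rep, n_rep):
    """True if the original effect lies inside the replication's 95% CI."""
    lo, hi = fisher_ci(r_rep, n_rep)
    return lo <= r_orig <= hi

def fixed_effect_meta(r_orig, n_orig, r_rep, n_rep):
    """Inverse-variance combination of two correlations on the Fisher z scale.

    Returns the combined correlation and its z statistic.
    """
    w_orig, w_rep = n_orig - 3, n_rep - 3   # weights are 1 / variance of z
    z = (w_orig * math.atanh(r_orig) + w_rep * math.atanh(r_rep)) / (w_orig + w_rep)
    se = 1.0 / math.sqrt(w_orig + w_rep)
    return math.tanh(z), z / se

if __name__ == "__main__":
    # Made-up example: a small original study and a larger replication
    r_orig, n_orig, r_rep, n_rep = 0.40, 30, 0.15, 120
    print(original_in_replication_ci(r_orig, r_rep, n_rep))
    r_comb, z_stat = fixed_effect_meta(r_orig, n_orig, r_rep, n_rep)
    print(round(r_comb, 3), abs(z_stat) > Z_CRIT)
```

With these made-up numbers the original estimate falls outside the replication interval, yet the combined estimate is still statistically significant. This shows how the indicators can disagree for the same pair of studies, and why the combined figure depends on assuming that the original result is unbiased.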

Tests suggest that replication success was better predicted by the strength of the original evidence than by characteristics of the original and replication teams.

Does successful replication mean theoretical understanding is correct?

It would be too easy to conclude that. Direct replication mainly provides evidence for the reliability of a result. Understanding is achieved through many diverse investigations that give converging support for a theoretical interpretation and rule out alternative explanations.

Does failure to replicate mean the original evidence was a false positive?

It would also be too easy to conclude this. Replications can fail if the methodology differs from the original in ways that interfere with observing the effect. The Open Science Collaboration designed its replications to minimize a priori reasons to expect different results by using original materials, contacting original authors for review of designs, and conducting internal reviews. However, unanticipated factors in the sample, setting, or procedure could still have altered the observed effect sizes.

How is reproducibility influenced by publication bias?

There are indications that cultural practices in scientific communication could be responsible for the observed results. The combination of low-power research designs and publication bias produces a literature with upwardly biased effect sizes. Such a literature leads one to expect that replication effect sizes will regularly be smaller than those of the original studies. This is not because of differences in implementation but because the original effect sizes are inflated by publication and reporting bias, whereas the replication effect sizes are not.
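
The mechanism described above can be illustrated with a small simulation. It is a sketch under stated assumptions, not an analysis from the article: every study compares two groups drawn from normal distributions with the same true standardized effect, only significant original studies are "published", significance is judged with a normal approximation to the t test, and each published study is replicated once with a larger sample and reported regardless of outcome. All names and parameter values are hypothetical.

```python
import math
import random
import statistics

def simulate_study(true_d, n_per_group, rng):
    """Simulate one two-group study; return (observed effect size d, significant?)."""
    treat = [rng.gauss(true_d, 1.0) for _ in range(n_per_group)]
    ctrl = [rng.gauss(0.0, 1.0) for _ in range(n_per_group)]
    pooled_sd = math.sqrt((statistics.variance(treat) + statistics.variance(ctrl)) / 2)
    d = (statistics.mean(treat) - statistics.mean(ctrl)) / pooled_sd
    t = d * math.sqrt(n_per_group / 2)   # t statistic for two equal-sized groups
    # Assumption: 1.96 approximates the two-sided 5% critical value of t
    return d, abs(t) > 1.96

def run(true_d=0.3, n_orig=20, n_rep=80, n_studies=2000, seed=1):
    """Return (mean published original effect, mean replication effect)."""
    rng = random.Random(seed)
    originals = [simulate_study(true_d, n_orig, rng) for _ in range(n_studies)]
    # Publication bias: only significant original studies enter the literature
    published = [d for d, significant in originals if significant]
    # Every published study is replicated once; all replications are reported
    replications = [simulate_study(true_d, n_rep, rng)[0] for _ in published]
    return statistics.mean(published), statistics.mean(replications)

if __name__ == "__main__":
    mean_published, mean_replication = run()
    print(round(mean_published, 2), round(mean_replication, 2))
```

Because a low-powered study reaches significance only when its observed effect happens to be large, the mean published effect ends up well above the true effect, while the unfiltered replications average close to it. The replications come out smaller even though the true effect and the procedure are identical in both rounds.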

Consistent with this expectation, most replication effects were smaller than the originals, and replication success correlated with indicators of the strength of the initial evidence, such as lower original P values and larger effect sizes. This suggests that publication, selection, and reporting biases are possible explanations for the difference between original and replication effects. The replication studies reduced these biases because preregistration and pre-analysis plans ensured confirmatory tests and the reporting of all results.

Furthermore, repeated replication efforts that fail to identify conditions under which the original finding can be observed reliably could reduce confidence in the original finding.

Conclusion

The five indicators examined to describe replication success are not the only ways to evaluate reproducibility. But the results offer a clear, collective conclusion: a large portion of replications produced weaker evidence for original findings despite using materials provided by the original authors, internal reviews, and high statistical power to detect the original effect sizes.

Correlational evidence is consistent with the conclusion that variation in the strength of initial evidence was more predictive of replication success than variation in the characteristics of the teams conducting the research.

Reproducibility is not well understood because the incentives for individual scientists prioritize novelty over replication. Innovation is an engine of discovery and is vital for a productive, effective scientific enterprise. But journal reviewers and editors might dismiss a new test of a published idea as unoriginal. Innovation shows that paths are possible; replication shows that paths are likely; progress relies on both. Scientific progress is a process of uncertainty reduction that can only succeed if science remains sceptical of its own explanatory claims.
