Fearing the future of empirical psychology - summary of an article by LeBel & Peters (2011)

Critical thinking
Article: LeBel & Peters (2011)
Fearing the future of empirical psychology

The interpretation bias
Conservatism in theory choice
Deficiencies in MRP
The logical strength of theory
Recommendations for strengthening method-relevant beliefs
Recommendations for weakening theory-relevant beliefs

The interpretation bias

Because empirical data underdetermine theory choice, alternative explanations of data are always possible, both when the data statistically support the researcher’s hypothesis and when they fail to do so.

The interpretation bias: a bias toward interpretations of data that favour a researcher’s theory, both when the null hypothesis is statistically rejected and when not.
This bias entails that, regardless of how data turn out, the theory whose predictions are being tested is artificially buffered from falsification.
The ultimate consequence is an increased risk of reporting false positives and disregarding true negatives, and so drawing incorrect conclusions about human psychology.

The research bias underlying the file-drawer problem in no way depends on unscrupulous motives.

Conservatism in theory choice

The knowledge system that constitutes a science such as psychology can be roughly divided into two types of belief:

Theory-relevant beliefs
Concern the theoretical mechanisms that produce behaviour
Method-relevant beliefs
Concern the procedures through which data are produced, measured and analysed

In any empirical test of a hypothesis, interpretation of the resulting data depends on both theory-relevant and method-relevant beliefs, as both types of belief are required to bring the hypothesis to empirical test.
Consequently, the resulting data can always be interpreted as theory relevant or as method relevant.

Weaknesses in the current knowledge system of empirical psychology bias the resulting choice of interpretation in favour of the researcher’s theory.
Deficiencies in modal research practice (MRP) systematically bias:

The interpretation of confirmatory data as theory relevant
The interpretation of disconfirmatory data as method relevant

This has the result that the researcher’s hypothesis is artificially buffered from falsification.

The interpretation of data should hinge not on what the pertinent beliefs are about, but rather on the centrality of those beliefs.
The centrality of belief reflects its position within the knowledge system: central beliefs are those on which many other beliefs depend. Peripheral beliefs are those with few dependent beliefs.
The rejection of central beliefs to account for observed data entails a major restructuring of the overall knowledge system.

Conservatism: choosing the theoretical explanation consistent with the data that requires the least amount of restructuring of the existing knowledge system.
Generally, conservatism in theory choice is a virtue, as it reduces ambiguity in the interpretation of data.
The value of methodological rigour is precisely that, by leveraging conservatism, it becomes more difficult to blame negative results on flawed methodology.
When method-relevant beliefs are peripheral and easily rejected, empirical tests become more ambiguous.

Theory-relevant beliefs should not be so central that they approach the status of logical necessity.
A theory’s strength should be measured by the extent to which it is falsifiable.
Theories that are too central risk becoming logical assumptions that are near impossible to dislodge with empirical tests.
It is critical that a hypothesis under test be described in a way that makes it empirically falsifiable and not logically necessary.

The knowledge system in empirical psychology is such that conservatism becomes a vice rather than a virtue in theory choice.

On the one hand, method-relevant beliefs are too peripheral, making them easy to reject.
This increases the ambiguity of negative results, which contributes directly to the file drawer problem.
On the other hand, theory-relevant beliefs often appear too central, making them difficult to reject.
This leads to a process of confirmatory hypothesis testing, exacerbating the file drawer problem.

Deficiencies in MRP

Overemphasis on conceptual replication

The exclusive focus on conceptual replication is in keeping with the ethos of continuous theoretical advancement that is a hallmark of MRP.
An overemphasis on conceptual replication at the expense of close replication, however, weakens method-relevant beliefs in the knowledge system of empirical psychology, with the result that reports consisting entirely of conceptual replications may be less rigorous than those including a judicious number of close replications.

Typically in MRP, a statistically significant result is followed by a conceptual replication in the interest of extending the underlying theory.
The problem with this practice is that when the conceptual replication fails, it remains unclear whether the negative result was due to the falsity of the underlying theory or to methodological flaws introduced by changes in the conceptual replication.
Given the original statistically significant finding, the natural preference is to choose the latter interpretation and to proceed with another, slightly different, conceptual replication.

Danger arises because conceptual replication allows the researcher too much latitude in the interpretation of negative results.

In particular, the choice of which studies count as replications is made post hoc, and these choices are inevitably influenced by the interpretation bias: an extension that fails to reject the null hypothesis is not counted as a replication precisely because it did not replicate the original finding, so the altered methodology is assumed to be to blame.
The consequence is that a successful extension becomes a conceptual replication, whereas a failed extension becomes a methodologically flawed pilot study, and it is tacitly understood that failed pilot studies belong in the file drawer.

Integrity of measurement instruments and experimental procedures

Failure to verify the integrity of measurement instruments and experimental procedures directly weakens method-relevant beliefs and thus increases ambiguity in the interpretation of negative (and even positive) results.

Little effort is put into independently validating and calibrating methodological procedures in MRP outside of main theory-testing experiments. Instead, experimenters are required to verify procedures and test psychological theories simultaneously. The result is that it becomes easy to attribute negative results to methodological flaws and hence relegate them to the file drawer.

Although pilot studies confirming the operation of construct manipulations are sometimes reported in multi-experiment articles, such verification studies are not consistently performed, given that they are not required for publication.

The integrity of measurement procedures is also often difficult to substantiate. Because of the small cell sizes typically used in experimental designs, it is often impossible to determine accurate reliability estimates of test scores within experimental conditions.
Even when reliability can be accurately estimated, this methodological check is only the tip of the iceberg in determining whether observed scores primarily reflect the construct of interest rather than some other construct.

Taken together, the inconsistent, informal, and arduous nature of verifying the integrity of manipulation and measurement procedures leaves method-relevant beliefs much weaker than required for a rigorous empirical science.

Problems with null hypothesis significance testing

The exclusive reliance on the number .05 is problematic because:

The standard null hypothesis of no difference will almost always be false.
It divorces theory choice from the context of the broader scientific knowledge system, encouraging myopic interpretations of data that can lead to bizarre conclusions about what has been empirically demonstrated.

Although it is well known that negative (null) results are ambiguous and difficult to interpret, exclusive reliance on NHST makes positive results equally ambiguous, because they can be explained by flaws in the way NHST is implemented rather than by a more theoretically interesting mechanism.
In this way, exclusive reliance on NHST increases the ambiguity of theory choice and undermines the rigour of empirical psychology.

The first problem:

In MRP, the null hypothesis is often formulated as a ‘nil hypothesis’, which claims that the means of different populations are identical.
This is a weak hypothesis because it is almost by definition false. Differences between different populations are inevitable, even if they only reflect ambient noise.
The statistical rejection of the nil hypothesis is therefore contingent only on a sample size sufficient to make the difference between means statistically significant.
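The dependence on sample size can be illustrated with a small calculation. The sketch below is not taken from the article: it assumes a hypothetical, practically negligible difference of 0.02 standard deviations between two groups, treats the observed difference as exactly equal to that value, and uses a simple two-sided z-test with a known standard deviation. The function name and all numbers are illustrative assumptions.

```python
import math


def two_sample_z_p_value(mean_diff, sd, n_per_group):
    """Two-sided p-value for a difference between two group means.

    Assumes equal group sizes and a known, common standard deviation.
    """
    standard_error = sd * math.sqrt(2.0 / n_per_group)
    z = mean_diff / standard_error
    # Two-sided tail area of the standard normal: P(|Z| >= |z|).
    return math.erfc(abs(z) / math.sqrt(2.0))


# Hypothetical, trivially small difference (in SD units); an assumption for illustration.
tiny_difference = 0.02

p_values = {
    n: two_sample_z_p_value(tiny_difference, 1.0, n)
    for n in (100, 10_000, 1_000_000)
}

for n, p in p_values.items():
    verdict = "reject nil" if p < 0.05 else "fail to reject"
    print(f"n per group = {n:>9,}: p = {p:.4g} ({verdict})")
```

Under these assumptions the same negligible difference is non-significant at the smaller sample sizes and significant at the largest one, which is the sense in which rejecting the nil hypothesis mainly reflects sample size.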

The nil hypothesis is a straw man.

Because the nil hypothesis is not theory driven, it is hard to argue that its rejection implies anything whatsoever about the choice of alternative hypothesis.
The rejection of the nil is not equivalent to the rejection of a theoretically appropriate null hypothesis, and assuming that it is leads to the inflation of Type I error.

The second problem:

Treating statistical significance as the sole criterion of theory choice when interpreting new data ignores all other evidence relevant to the interpretation of those data.
Empirical tests are not conducted in a theoretical vacuum, and existing evidence for or against a hypothesis should be factored into the interpretation of new data to supplement NHST.
NHST on its own does not tell us what we want to know but something much less informative.
Basing theory choice on null hypothesis significance tests thus detaches theories from the broader system of empirical psychology.
Overreliance on NHST threatens the cumulation of evidence and the coherence of the knowledge system in empirical psychology.
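One way to make the role of existing evidence concrete is a simple application of Bayes' rule. This is an illustration chosen for this summary, not a procedure the article prescribes. The sketch assumes that prior evidence can be expressed as a prior probability that the hypothesis is true, and that the test has a given power and significance level; the function name and the numbers are hypothetical.

```python
def posterior_given_significant(prior, power, alpha):
    """Probability that a hypothesis is true after one significant result.

    prior: probability that the hypothesis is true before the study
    power: probability of a significant result if the hypothesis is true
    alpha: probability of a significant result if the hypothesis is false
    """
    true_positive = prior * power
    false_positive = (1.0 - prior) * alpha
    return true_positive / (true_positive + false_positive)


# Same significant result (same power and alpha), different background knowledge.
# All three numbers below are assumptions for illustration.
plausible = posterior_given_significant(prior=0.50, power=0.80, alpha=0.05)
implausible = posterior_given_significant(prior=0.01, power=0.80, alpha=0.05)

print(f"plausible hypothesis:   posterior = {plausible:.2f}")
print(f"implausible hypothesis: posterior = {implausible:.2f}")
```

With these assumed values, an identical significant result leaves a hypothesis that was already plausible very likely true, while a hypothesis that contradicts most existing evidence remains more likely false than true.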

The logical strength of theory

Weak, peripheral method-relevant beliefs make it easy to discount negative results.
The more it appears that a theoretical explanation has to be the case, the more likely it is that disconfirming data will be attributed to methodological flaws.

Summary

The result of the combination of peripheral method-relevant beliefs and central theory-relevant beliefs is that conservatism in MRP becomes an unconditional bias toward interpretations of data that favour the researcher’s theory.
Conservatism should only bias theory choice toward interpretations of data that minimize revision of the knowledge system, regardless of whether a particular interpretation favours method-relevant or theory-relevant beliefs.

Strategies for improving MRP

The overarching recommendation is that methodology must be made more rigorous by strengthening method-relevant beliefs to constrain the field of alternative explanations available for psychological findings.
This is true both when data statistically support a researcher’s theory and when they do not.
By making MRP more rigorous, the ambiguity of theory choice is reduced and empirical tests become more diagnostic.

A complementary recommendation is that the logical status of theory-relevant beliefs must be weakened.

Recommendations for strengthening method-relevant beliefs

Stronger emphasis on close replication

The purpose of close replication is to determine whether an observed effect is real or due to sampling error.
Close replications are crucial because a failed close replication is the most diagnostic test of whether an observed effect is real, given that no differences between the original study and the replicating study were intentionally introduced.

In the case of a close replication, we cannot easily blame a negative result on methodological variation, because in a close replication methodological differences are not deliberately introduced into the replication.

Once successful close replications have been achieved in a new area of research, the value of further close replications diminishes and the value of conceptual replications increases dramatically.

Verify integrity of methodological procedures

To make method-relevant beliefs stronger and more difficult to reject, it is critical that verifying the integrity of empirical instruments and procedures becomes a routine component of psychological research.

Maintaining a clear distinction between pilot studies designed to verify the integrity of instruments and procedures and primary studies designed to test theories will do much to diminish the influence of the interpretation bias on the reporting of results.

It should also be standard procedure to routinely check the internal consistency of the scores of any measurement instruments used and to confirm measurement invariance of instruments across conditions.
It should also be standard practice to use objective markers of instruction comprehension and participant non-compliance.
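A routine internal-consistency check can be as simple as computing Cronbach's alpha. The article does not name a specific coefficient, so alpha is used here as one common choice. The sketch below applies the standard formula to a small, made-up data set of five respondents answering three items; the function name and the scores are hypothetical.

```python
from statistics import variance


def cronbach_alpha(item_scores):
    """Cronbach's alpha for a list of respondents, each a list of k item scores."""
    k = len(item_scores[0])
    # Transpose so that each tuple holds all respondents' scores on one item.
    items = list(zip(*item_scores))
    sum_of_item_variances = sum(variance(item) for item in items)
    variance_of_totals = variance([sum(person) for person in item_scores])
    return (k / (k - 1)) * (1.0 - sum_of_item_variances / variance_of_totals)


# Made-up scores: 5 respondents x 3 items (illustrative data only).
scores = [
    [4, 5, 4],
    [2, 3, 2],
    [5, 5, 4],
    [1, 2, 1],
    [3, 3, 3],
]

alpha = cronbach_alpha(scores)
print(f"Cronbach's alpha = {alpha:.3f}")
```

As the summary notes, a high alpha is only a first check: it says the items covary, not that they measure the intended construct, and with the small cell sizes typical of experiments the estimate itself is imprecise.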

Use stronger forms of NHST

Minimally, the null hypothesis should not be formulated in terms of a nil hypothesis.

In the strong form, NHST requires that the null hypothesis be a theoretically derived point value of the focal variable, which the researcher then attempts to reject on observation of the data.
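The contrast between the weak (nil) form and the strong form can be sketched numerically. The example below is a hypothetical illustration, not data from the article: it assumes a theory that predicts an effect of 0.50, an observed mean effect of 0.30 with a standard error of 0.08, and a large-sample z-test.

```python
import math


def z_test_against_point(observed_mean, predicted_value, standard_error):
    """Two-sided z-test of an observed mean against a fixed point value."""
    z = (observed_mean - predicted_value) / standard_error
    p = math.erfc(abs(z) / math.sqrt(2.0))
    return z, p


# All values below are assumptions for illustration.
observed_mean = 0.30
standard_error = 0.08
theory_prediction = 0.50  # hypothetical point value derived from the theory

# Weak form: the null is the nil hypothesis (effect = 0).
z_vs_nil, p_vs_nil = z_test_against_point(observed_mean, 0.0, standard_error)
# Strong form: the null is the theory's own point prediction.
z_vs_theory, p_vs_theory = z_test_against_point(
    observed_mean, theory_prediction, standard_error
)

print(f"against nil (0.00):    z = {z_vs_nil:.2f}, p = {p_vs_nil:.4g}")
print(f"against theory (0.50): z = {z_vs_theory:.2f}, p = {p_vs_theory:.4g}")
```

Under these assumptions the same data reject the nil hypothesis, which would conventionally be read as support for the theory, and also reject the theory's own point prediction. The strong form therefore puts the theory itself at risk of falsification.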

Significance tests should be treated as just one criterion informing theory choice, in addition to relevant background knowledge and considerations of belief centrality.

Recommendations for weakening theory-relevant beliefs

Considered individually, not all psychological hypotheses appear logically necessary, but insufficient attention has been paid to identifying the criterion that distinguishes between falsifiable and non-falsifiable psychological hypotheses.

The important point is that making the disconfirmation of a psychological hypothesis more plausible will reduce the bias toward methodological interpretations of negative results.
At minimum, care needs to be taken that hypotheses under test are stated such that their not being the case is possible, so that their truth is contingent rather than necessary.

When the researcher’s hypothesis is plausibly falsifiable and the null hypothesis is plausibly confirmable, statistical tests pitting these two hypotheses against each other will be much more informative for theory choice.
