False-Positive Psychology: Undisclosed Flexibility in Data Collection and Analysis Allows Presenting Anything as Significant - summary of an article by Simmons, Nelson, & Simonsohn (2011)

Critical thinking
Article: Simmons, Nelson, & Simonsohn (2011)
False-Positive Psychology: Undisclosed Flexibility in Data Collection and Analysis Allows Presenting Anything as Significant

Abstract
Beginning
Solution
General discussion

Abstract

This article is about two things:

Despite empirical psychologists’ nominal endorsement of a low rate of false-positive findings, flexibility in data collection, analysis, and reporting dramatically increases actual false-positive rates. In many cases, a researcher is more likely to falsely find evidence that an effect exists than to correctly find evidence that it does not.
A solution to that problem.

Beginning

One of the most costly errors is a false positive.

A false positive is the incorrect rejection of the null hypothesis.
Once false positives appear in the literature, they are persistent.
- Because null results have many possible causes, failures to replicate previous findings are never conclusive.
- Because it is uncommon for prestigious journals to publish null findings or exact replications, researchers have little incentive to even attempt them.
False positives waste resources.

They inspire investment in fruitless research programs and can lead to ineffective policy changes.

Ambiguity is rampant in empirical research.

Solution

As a solution to the flexibility-ambiguity problem, the authors offer six requirements for authors and four guidelines for reviewers.

This solution substantially mitigates the problem but imposes only a minimal burden on authors, reviewers, and readers.
It leaves the right and responsibility of identifying the most appropriate way to conduct research in the hands of researchers, requiring only that authors provide appropriately transparent descriptions of their methods so that reviewers and readers can make informed decisions regarding the credibility of their findings.

Requirements for authors

1. Authors must decide the rule for terminating data collection before data collection begins and report this rule in the article.

2. Authors must collect at least 20 observations per cell or else provide a compelling cost-of-data collection justification.
Samples smaller than 20 per cell are not powerful enough to detect most effects.

3. Authors must list all variables collected in a study.
This prevents researchers from reporting only a convenient subset of the many measures that were collected, allowing readers and reviewers to easily identify possible researcher degrees of freedom.

4. Authors must report all experimental conditions, including failed manipulations.
This prevents authors from selectively reporting only the condition comparisons that yield results consistent with their hypothesis.

5. If observations are eliminated, authors must also report what the statistical results are if those observations are included.
This makes transparent the extent to which a finding is reliant on the exclusion of observations, puts appropriate pressure on authors to justify the elimination of data, and encourages reviewers to explicitly consider whether such exclusions are warranted.

6. If an analysis includes a covariate, authors must report the statistical results of the analysis without the covariate.
This makes transparent the extent to which a finding is reliant on the presence of a covariate, puts appropriate pressure on authors to justify the covariate, and encourages reviewers to consider whether including it is warranted.
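
Why the stopping rule in requirement 1 matters can be illustrated with a small simulation. The sketch below is only an illustration under stated assumptions, not the simulation reported in the article: both groups are drawn from the same normal distribution with a known standard deviation of 1, a z-test is used instead of a t-test so that only the Python standard library is needed, and the sample sizes, number of simulations, and function names are hypothetical choices. It compares a fixed rule (test once at 20 observations per cell) with a flexible rule (test at 20 and, if the result is not significant, add 10 observations per cell and test again).

```python
import random
from statistics import NormalDist, fmean


def p_value(a, b):
    """Two-sided z-test p-value for two equal-sized samples with known SD = 1."""
    n = len(a)
    z = (fmean(a) - fmean(b)) / (2 / n) ** 0.5  # SE of the mean difference is sqrt(2/n)
    return 2 * (1 - NormalDist().cdf(abs(z)))


def simulate(n_sims=4000, n_start=20, n_extra=10, alpha=0.05, seed=1):
    """Return false-positive rates for a fixed rule and a 'peek, then add data' rule."""
    rng = random.Random(seed)
    fixed_hits = peek_hits = 0
    for _ in range(n_sims):
        # Both groups come from the same population, so the null hypothesis is true.
        a = [rng.gauss(0, 1) for _ in range(n_start + n_extra)]
        b = [rng.gauss(0, 1) for _ in range(n_start + n_extra)]
        first_look = p_value(a[:n_start], b[:n_start]) < alpha
        second_look = p_value(a, b) < alpha
        fixed_hits += first_look                 # rule decided in advance: test once
        peek_hits += first_look or second_look   # undisclosed rule: stop when significant
    return fixed_hits / n_sims, peek_hits / n_sims


fixed_rate, peek_rate = simulate()
print(f"fixed rule:    {fixed_rate:.3f}")
print(f"flexible rule: {peek_rate:.3f}")
```

Because there is no true effect in the simulated data, every significant result is a false positive. The fixed rule should land near the nominal 5%, while the flexible rule can only do worse: it counts every rejection of the fixed rule plus any that appear at the second look.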

Guidelines for reviewers

1. Reviewers should ensure that authors follow the requirements.

2. Reviewers should be more tolerant of imperfections in results.

3. Reviewers should require authors to demonstrate that their results do not hinge on arbitrary analytic decisions.

4. If justifications of data collection or analysis are not compelling, reviewers should require the authors to conduct an exact replication.

General discussion

Criticisms

Criticism of the solution comes in two varieties:

It does not go far enough.
It goes too far.

Not far enough

The solution does not lead to the disclosure of all degrees of freedom.

It cannot reveal those arising from reporting only experiments that ‘work’.

Authors have tremendous disincentives to disclose exploited researcher degrees of freedom.

Too far

According to this criticism, the guidelines prevent researchers from conducting exploratory research.

Nonsolutions

The authors reject the following solutions because they are less practical, less effective, or both:

Correcting alpha levels
Using Bayesian statistics
Conceptual replications
Posting materials and data

Scientific Utopia: II. Restructuring Incentives and Practices to Promote Truth Over Publishability - summary of an article by Nosek, Spies, & Motyl (2012)

Critical thinking
Article: Nosek, Spies, & Motyl (2012)
Scientific Utopia: II. Restructuring Incentives and Practices to Promote Truth Over Publishability

Abstract
A true story of what could have been
How evaluation criteria can increase the false result rate in published science
Some things are more publishable than others
A disconnect between what is good for scientists and what is good for science
Novelty and positive results are vital for publishability but not for truth
Practices that can increase the proportion of false results in the published literature
Strategies that are not sufficient to stop the proliferation of false results
Strategies that will accelerate the accumulation of knowledge
The ultimate solution: opening data, materials, and workflow

Abstract

An academic scientist’s professional success depends on publishing.

Publishing norms emphasize novel, positive results.
Disciplinary incentives encourage design, analysis, and reporting decisions that elicit positive results and ignore negative results.
When incentives favor novelty over replication, false results persist in the literature unchallenged, reducing efficiency in knowledge accumulation.

This article develops strategies for improving scientific practices and knowledge accumulation that account for ordinary human motivations and biases.

A true story of what could have been

Incentives for surprising, innovative results are strong in science.

Science thrives by challenging prevailing assumptions and generating novel ideas and evidence that push the field in new directions.

Problem: the incentives for publishable results can be at odds with the incentives for accurate results. This produces a conflict of interest.

The conflict may increase the likelihood of design, analysis, and reporting decisions that inflate the proportion of false results in the published literature.

The solution requires making incentives for getting it right competitive with the incentives for getting it published.

How evaluation criteria can increase the false result rate in published science

Publishing is the ‘very heart of modern academic science, at levels ranging from the epistemic certification of scientific thought to the more personal labyrinths of job security, quality of life and self esteem’.

With an intensely competitive job market, the demands on publication might seem to suggest a specific objective for the early-career scientist: publish as many articles as possible in the most prestigious journals that will accept them.

Some things are more publishable than others

Even if a researcher conducts studies competently, analyses the data effectively, and writes up the results beautifully, there is no guarantee that the report will be published.
Part of

Neyman, Pearson and hypothesis testing - summary of an article by Dienes (2003)

Critical thinking
Article: Dienes (2003)
Neyman, Pearson and hypothesis testing

Introduction
Probability
Data and hypotheses
Hypothesis testing: α
Hypothesis testing: α and β
Power in practice
Sensitivity
Stopping rules
Multiple testing
Fisherian inference
Further points concerning significance tests that are often misunderstood
Confidence intervals
Criticism of the Neyman-Pearson approach
Using the Neyman-Pearson approach to critically evaluate a research article

Introduction

In this article, we will consider the standard logic of statistical inference.
Statistical inference: the logic underlying all the statistics you see in the professional journals of psychology and most other disciplines that regularly use statistics.

The underlying logic of statistics (the Neyman-Pearson approach) is highly controversial, frequently attacked (and defended) by statisticians and philosophers, and even more frequently misunderstood.

Probability

The meaning of probability we choose determines what we can do with statistics.
The proper way of interpreting probability remains controversial, so there is still debate over what can be achieved with statistics.
The Neyman-Pearson approach follows from one particular interpretation of probability. The Bayesian approach follows from another.

Interpretations often start with a set of axioms that probabilities must follow.
Two interpretations of probability:

The subjective interpretation: a probability is a degree of conviction in a belief.
The objective interpretation: probability is located in the world.

The most influential objective interpretation of probability is the long-run relative frequency interpretation. Here, probability is a relative frequency.
Because the long-run relative frequency is a property of all the events in the collective, it follows that a probability applies to a collective, not to any single event.
A single event could be a member of different collectives. So a singular event does not have a probability, only collectives do.
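
The long-run relative frequency idea can be made concrete with a short simulation. This is a minimal sketch assuming a hypothetical fair coin (the function name, seed, and number of flips are illustrative choices): any single flip is simply heads or tails, while the relative frequency across the growing collective of flips settles toward a stable value.

```python
import random


def running_relative_frequency(n_flips, p_heads=0.5, seed=42):
    """Return the relative frequency of heads after each flip."""
    rng = random.Random(seed)
    heads = 0
    frequencies = []
    for i in range(1, n_flips + 1):
        heads += rng.random() < p_heads  # one single event: heads (True) or tails (False)
        frequencies.append(heads / i)    # property of the collective of the first i flips
    return frequencies


freqs = running_relative_frequency(100_000)
for n in (1, 10, 100, 1_000, 10_000, 100_000):
    print(f"after {n:>7} flips: relative frequency of heads = {freqs[n - 1]:.4f}")
```

After one flip the relative frequency is necessarily 0 or 1, which says nothing useful about the coin; only the long run of the collective approaches the value that this interpretation calls the probability.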

Objective probabilities do not apply to single cases. They also do not apply to the truth of hypotheses.
A hypothesis is simply true or false, just as a single event either occurs or does not.
A hypothesis is not a collective, it therefore does not have an objective probability.

Data and hypotheses

Data = D

Hypothesis = H

The inverse conditional probabilities P(D|H) and P(H|D) can have very different values.
In any case, it is meaningless to assign an objective probability to a hypothesis.
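
That a conditional probability and its inverse can differ enormously is easy to check with relative frequencies. The sketch below uses two events rather than a hypothesis, since an objective probability applies only to collectives. All counts are hypothetical numbers invented for illustration, and `conditional` is an illustrative helper name.

```python
def conditional(n_both, n_given):
    """Relative frequency of an event within the collective defined by the given event."""
    return n_both / n_given


# Hypothetical counts, invented purely for illustration.
deaths_total = 1_000_000   # everyone who died in some period
shark_victims = 10         # everyone who suffered a severe shark attack
shark_and_died = 10        # of those, everyone who died

p_death_given_shark = conditional(shark_and_died, shark_victims)  # P(death | shark attack)
p_shark_given_death = conditional(shark_and_died, deaths_total)   # P(shark attack | death)

print(f"P(death | shark attack) = {p_death_given_shark}")
print(f"P(shark attack | death) = {p_shark_given_death}")
```

The same joint count appears in both calculations; only the reference collective changes, and with it the probability.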

Hypothesis testing: α

Statistics cannot tell us

Evaluating Theories - summary of an article by Dennis & Kintsch

Critical thinking
Article: Dennis & Kintsch
Evaluating Theories

Introduction
Criteria on which to evaluate theories

Introduction

A theory is a concise statement about how we believe the world to be.
Theories organize observations of the world and allow researchers to make predictions about what will happen in the future under certain conditions.
Science is about the testing of theories, and the data we collect as scientists should either implicitly or explicitly bear on theory.

The following criteria help to distinguish the characteristics that lead a theory to be successful from those that make it truly useful:

Descriptive adequacy:
Does the theory accord with the available data?
Precision and interpretability:
Is the theory described in a sufficiently precise fashion that other theorists can interpret it easily and unambiguously?
Coherence and consistency:
Are there logical flaws in the theory? Does each component of the theory seem to fit with the others into a coherent whole? Is it consistent with theory in other domains?
Prediction and falsifiability:
Is the theory formulated in such a way that critical tests can be conducted that could reasonably lead to the rejection of the theory?
Postdiction and explanation:
Does the theory provide a genuine explanation of existing results?
Parsimony:
Is the theory as simple as possible?
Originality:
Is the theory new or is it essentially a restatement of an existing theory?
Breadth:
Does the theory apply to a broad range of phenomena or is it restricted to a limited domain?
Usability:
Does the theory have applied implications?
Rationality:
Does the theory make claims about the architecture of mind that seem reasonable in the light of the environmental contingencies that have shaped our evolutionary history?

Criteria on which to evaluate theories

Descriptive adequacy

The extent to which it accords with data.
In psychology, the most popular way of comparing a theory against data is null hypothesis significance testing.
Determining whether a theory is consistent with data is not always as straightforward as it may at first appear.

Some of the subtleties involved in determining the extent to which a theory accords with data:

Using null hypothesis significance testing, it is not possible to conclude that there is no difference. A proponent of a theory that predicts a list-length effect can always propose that a failure to find the difference was a consequence of lack of power of the experimental design.
Null hypothesis significance testing encourages a game of 20 questions with nature. A

Degrees of falsifiability - summary of an article by Dienes (2008)

Critical thinking
Article: Dienes (2008)
Degrees of falsifiability

Falsifiability
Observations

Falsifiability

A potential falsifier of a theory: any potential observation that would contradict the theory.
One theory is more falsifiable than another if the class of potential falsifiers is larger.

Scientists prefer simple theories.
Simple theories are more testable.

A theory can gain in falsifiability not only by being precise, but also by being broad in the range of situations to which it applies.
The greater the universality of a theory, the more falsifiable it is, even if its predictions are not very precise.

Revisions to a theory may make it more falsifiable by specifying fine-grained causal mechanisms.
As long as the steps in a proposed causal pathway are testable, specifying the pathway gives you more falsifiers.

Psychologists sometimes theorize and make predictions by constructing computational models.
A computational model is a computer simulation of a subject, where the model is exposed to the same stimuli subjects receive and gives actual trial-by-trial responses.

A theory that allows everything explains nothing.
The more a theory forbids, the more it says about the world. The empirical content of a theory increases with its degree of falsifiability.

The more falsifiable a theory is, the more open it is to criticism.
So the more falsifiable our theories are, the faster we can make progress, given that progress comes from criticism.

Science aims at the maximum falsifiability it can achieve: successive theories should be successively more falsifiable, either in terms of universality or precision.

Make sure that any revision or amendment to theory can be falsified. That way theory development is guaranteed to keep its empirical character.

Observations

Observations are always ‘theory impregnated’.
Falsification is not so simple as pitting theory against observation.
Theories determine what an observation is.

Causal Inference and Developmental Psychology - summary of an article by Foster (2010)

Critical thinking
Article: Foster (2010)
Causal Inference and Developmental Psychology
(the part needed for psychology at the UvA)

Four premises

Causal inference is essential to accomplishing the goals of developmental psychologists
In many analyses, psychologists unfortunately are attempting causal inference but doing so badly, based on many implicit and, in some cases, implausible assumptions.
These assumptions should be identified explicitly and checked empirically and conceptually
Once introduced to the broader issues, developmental psychologists will recognize the central importance of causal inference and naturally embrace the methods available.

Why causal inference?
Two frameworks for causal inference

Why causal inference?

Causal thinking and causal inference are unavoidable.

Even if researchers can distinguish associations from causal relationships, lay readers, journalists, policymakers, and other researchers generally cannot.
If a researcher resists the urge to jump from association to causality, other researchers seem willing to do so on his or her behalf.

Causal inference as the goal of developmental psychology

The lesson is not that causal relationships can never be established outside of random assignment, but that they cannot be inferred from associations alone. Some additional assumptions are required.

The goal of this research should be to make causal inference as plausible as possible.
Doing so involves applying the best methods available among a growing set of tools.
As part of the proper use of those tools, the researcher should identify the key assumptions on which they rest and their plausibility in any particular application.
The researcher should check the consistency of those assumptions as much as possible using the available data. In many instances key assumptions will remain untestable.
The plausibility of those assumptions needs to be assessed in the light of substantive knowledge.

What constitutes credible or plausible is not without debate.

At this point, much of developmental psychology involves implausible causal inference.

Such inference could be improved even without dramatically changing the complexity of the analysis.

Two frameworks for causal inference

Two conceptual tools are especially helpful in moving from associations to causal relationships.

The directed acyclic graph (DAG)

This tool assists researchers in identifying the implications of a set of associations for understanding causality and the set of assumptions under which those associations imply causality.
Moving from association to causality requires ruling out potential confounders: variables associated with both treatment and outcome.
The DAG is particularly useful for helping the researcher to identify covariates and perhaps for understanding unanticipated consequences of incorporating these variables.

Confounding and deconfounding: or, slaying the lurking variable - summary of an article by Pearl (2018)

Critical thinking
Article: Pearl (2018)
Confounding and deconfounding: or, slaying the lurking variable

Introduction
The chilling fear of confounding
The skillful interrogation of nature: why RCTs work
The new paradigm of confounding
The do-operator and the back-door criterion

Introduction

Confounding bias occurs when a variable influences both who is selected for the treatment and the outcome of the experiment.
Sometimes the confounders are known. Other times they are merely suspected and act as a ‘lurking third variable’.

If we have measurements of the third variable, then it is very easy to deconfound the true and spurious effects.
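
How a measured third variable can be used to deconfound can be sketched with a simulation. This is a minimal illustration under strong assumptions, not a method taken from the article: there is a single binary confounder Z, the true effect of the treatment X on the outcome Y is set to 1.0, and all other numbers and function names are hypothetical. The adjustment compares treated and untreated units within each level of Z and then averages those differences, weighted by how common each level is.

```python
import random
from statistics import fmean


def simulate(n=10_000, true_effect=1.0, seed=7):
    """Generate (z, x, y) rows in which Z influences both treatment X and outcome Y."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        z = rng.random() < 0.5                    # confounder
        x = rng.random() < (0.8 if z else 0.2)    # Z makes treatment more likely
        y = true_effect * x + 3.0 * z + rng.gauss(0, 1)  # Z also raises the outcome
        rows.append((z, x, y))
    return rows


def mean_diff(rows):
    """Difference in mean outcome between treated and untreated rows."""
    treated = [y for _, x, y in rows if x]
    control = [y for _, x, y in rows if not x]
    return fmean(treated) - fmean(control)


def adjusted_diff(rows):
    """Average the within-stratum differences, weighted by stratum size."""
    total = 0.0
    for level in (False, True):
        stratum = [r for r in rows if r[0] == level]
        total += len(stratum) / len(rows) * mean_diff(stratum)
    return total


rows = simulate()
naive = mean_diff(rows)
adjusted = adjusted_diff(rows)
print(f"naive difference:    {naive:.2f}")
print(f"adjusted difference: {adjusted:.2f}")
```

The naive comparison mixes the true effect with the spurious contribution of Z, whereas the stratified estimate should land close to the value of 1.0 that was built into the simulation. This only works here because Z is the sole confounder and is measured without error.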

Statisticians both over- and underrate the importance of adjusting for possible confounders:

They overrate it in the sense that they often control for many more variables than they need to, and even for variables that they should not control for.
They underrate it in the sense that they are loath to talk about causality at all, even if the controlling has been done correctly.

The chilling fear of confounding

Knowing the set of assumptions that stand behind a given conclusion is no less valuable than attempting to circumvent those assumptions with an RCT, which has complications of its own.

The skillful interrogation of nature: why RCTs work

The one circumstance under which scientists will abandon some of their reticence to talk about causality is when they have conducted a randomized controlled trial (RCT).

Randomization brings two benefits:

It eliminates confounder bias.
It enables the researcher to quantify his uncertainty.

Another way, if you know what all the possible confounders are, is to measure and adjust for them.
But randomization has one great advantage: it severs every incoming link to the randomized variable, including the ones we don’t know about or cannot measure.

RCTs are preferred to observational studies.
But, in some cases, intervention may be physically impossible or unethical.

Provisional causality: causality contingent upon the set of assumptions that our causal diagram advertises.

The principal objective of an RCT is to eliminate confounding.

The new paradigm of confounding

Confounding is not a statistical notion. It stands for the discrepancy between what we want to assess (the causal effect) and what we actually do assess using statistical methods.
If you can’t articulate mathematically what you want to assess, you can’t expect to define what constitutes a discrepancy.

Historically, the concept of ‘confounding’ has evolved around two related conceptions:

Incomparability
A lurking third variable.

Both these concepts have resisted formalization.

Critical thinking in Quasi-Experimentation - summary of an article by Shadish (2008)

Critical thinking
Article: Shadish (2008)
Critical thinking in Quasi-Experimentation

All experiments are about discovering the effects of causes.
All experiments have in common the deliberate manipulation of an assumed cause, followed by observation of the effects that follow.

A quasi-experiment: an experiment that does not use random assignment to conditions.

Causation
Critical thinking in quasi-experiments means showing alternative explanations are unlikely

Causation

What is a cause?

An inus condition: a cause that is insufficient by itself. Its effectiveness requires it to be embedded in a larger set of conditions.

Most causal relationships are not deterministic, but only increase the probability that an effect will occur.
This is the reason why a given causal relationship will only occur under some conditions but not universally.
To different degrees, all causal relationships are contextually dependent, so the generalization of experimental effects is always at issue.

Experimental causes must be manipulable.
Experiments explore the effects of things that can be manipulated.

In quasi-experiments, the cause is whatever was manipulated, which may include many more things than the researcher realizes were manipulated.
In quasi-experiments, especially if the researcher is not the person manipulating the treatment, it is easy to make mistaken claims about what was manipulated, and the context in which it occurred.

What is an effect?

In an experiment, we observe what did happen when people receive a treatment.
The counterfactual is knowledge of what would have happened to those same people if they simultaneously had not received treatment.

An effect is the difference between what did happen and what would have happened.

We can never observe the counterfactual.
Experiments try to create reasonable approximations to this physically impossible counterfactual.

Two central tasks in experimental design are:

Creating a high-quality but necessarily imperfect source of counterfactual inference
Understanding how this source differs from the treatment condition.

Random assignment forms a control group that is often the best approximation to this counterfactual that we can obtain, though even that control group is imperfect because the people in the control group are not identical to those in the treatment group.
However, we do know that participants in the treatment and control groups differ from each other only randomly.

The problem in quasi-experiments is that differences between treatment and control are usually systematic, not random, so nonrandom controls may not tell us much about what would have happened to the treatment group if they had not received treatment.
Much of quasi-experimentation is concerned with creating good sources of counterfactual inference. In general, quasi-experiments use two different tools to do so:

Observing the same unit over time.
Making nonrandom control groups as similar as possible to the participants in the treatment group.

Even then, the effects of quasi-experiments are rarely as trustworthy
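
The contrast between random and systematic differences between groups can be sketched with simulated potential outcomes. This is an illustration under assumptions, not an analysis from the article: every unit is given both an untreated outcome `y0` and a treated outcome `y1` (something that is never observable in practice), the true effect is fixed at 2.0, and the selection rule, numbers, and function names are hypothetical.

```python
import random
from statistics import fmean


def make_units(n=5_000, true_effect=2.0, seed=3):
    """Each unit has two potential outcomes: untreated (y0) and treated (y1)."""
    rng = random.Random(seed)
    units = []
    for _ in range(n):
        y0 = rng.gauss(10, 3)
        units.append((y0, y0 + true_effect))
    return units, rng


def estimate(units, treated_flags):
    """Difference in observed means; only one potential outcome is seen per unit."""
    treated = [y1 for (y0, y1), t in zip(units, treated_flags) if t]
    control = [y0 for (y0, y1), t in zip(units, treated_flags) if not t]
    return fmean(treated) - fmean(control)


units, rng = make_units()

# Random assignment: the groups differ from each other only by chance.
random_flags = [rng.random() < 0.5 for _ in units]

# Nonrandom assignment: units that were already doing well end up treated.
selected_flags = [y0 > 10 for y0, _ in units]

random_estimate = estimate(units, random_flags)
selected_estimate = estimate(units, selected_flags)
print(f"random assignment estimate:    {random_estimate:.2f}")
print(f"nonrandom assignment estimate: {selected_estimate:.2f}")
```

With random assignment the control group is a reasonable stand-in for the missing counterfactual, so the estimate should fall near 2.0. With systematic selection the control group differs from the treated group before treatment, and that pre-existing difference is absorbed into the apparent effect.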

Beyond the null ritual, formal modeling of psychological processes - summary of an article by Marewski & Olsson (2009)

Critical thinking
Article: Marewski & Olsson (2009)
Beyond the null ritual, formal modeling of psychological processes

Beyond the null ritual
What is a model?
Advantages of formally specifying theories
The problem of overfitting
Other pitfalls of model selection

Beyond the null ritual

Rituals can be characterized by a range of attributes including:

Repetitions of the same action
Fixations on special features such as numbers
Anxieties about punishments for rule violations
Wishful thinking

Each of these characteristics is reflected in null hypothesis significance testing.

One good way to make theories more precise is to cast them as formal models.
In doing so, researchers can move beyond the problems of null hypothesis significance testing, and simple difference searching.

What is a model?

In the broadest sense, a model is a simplified representation of the world that aims to explain observed data.
A model is a formal instantiation of a theory that specifies the theory’s predictions. This category also includes statistical tools, such as structural equation or regression models.
Statistical tools are not typically meant to mirror the workings of psychological mechanisms.

What is the scope of modeling?

Modeling is not meant to be applied equally to all research questions. Each method has its specific advantages and disadvantages.

Modeling helps researchers answer involved questions and understand complex phenomena.
In psychology, modeling is especially suited for basic and applied research about the cognitive system.

Advantages of formally specifying theories

Four closely interrelated benefits of increasing the precision of theories by casting them as models:

Models allow the design of strong tests of theories
They can also sharpen research questions
Models can lead beyond theories built on the general linear model
Modeling helps to address real-world problems

Designing strong tests of theories

Models provide the bridge between theories and empirical evidence.
They enable researchers to make competing quantitative predictions, which in turn lead to strong comparative tests of theories.
Any quantitative prediction can be systematically better or worse than any other.

But, as soon as one starts to compare quantitative predictions from different models, the use of null hypothesis testing can become inappropriate or meaningless.

Sharpening research questions

Null hypothesis tests are often used to evaluate verbal, informal theories.
But if such theories are underspecified, they can be used post hoc to ‘explain’ almost any observed empirical pattern.

The two disciplines of scientific psychology - summary of an article by Cronbach (1957)

Critical thinking
Article: Cronbach (1957)
The two disciplines of scientific psychology

The separation of the disciplines
Characterization of the disciplines
The shape of a united discipline

The separation of the disciplines

The experimental method, where the scientist changes conditions in order to observe their consequences, is much the more coherent of our two disciplines.

Correlational psychology was slower to mature.
It qualifies equally as a discipline, because it asks a distinctive type of question and has technical methods of examining whether the question has been properly put and the data properly interpreted.

The well-known virtue of the experimental method is that it brings situational variables under tight control. It thus permits rigorous tests of hypotheses and confident statements about causation.

The correlational method can study what man has not learned to control or can never hope to control.

Characterization of the disciplines

In the beginning, experimental psychology was a substitute for purely naturalistic observation of man-in-habitat.

The experiment came to be concerned with between-treatment variance.
And, today the majority of experimenters derive their hypotheses explicitly from theoretical premises and try to nail their results into a theoretical structure.
The goal in the experimental tradition is to get differential variables out of sight.

The correlational psychologist loves those variables the experimenter left home to forget.

Factor analysis is rapidly being perfected into a rigorous method of clarifying multivariate relationships.
The correlational psychologist is a mere observer of a play where Nature pulls a thousand strings, but his multivariate methods make him equally an expert: an expert in figuring out where to look for the hidden strings.

The shape of a united discipline

It is not enough for each discipline to borrow from the other.
Correlational psychology studies only variance among organisms; experimental psychology studies only variance among treatments.
A united discipline will study both of these, but it will also be concerned with the otherwise neglected interactions between organismic and treatment variables.
Our job is to invent constructs and to form a network of laws which permits prediction.

From observations we must infer a psychological description of the situation and of the present state of the organism.
Our laws should permit us to predict, from this description, the behaviour of organism-in-situation.

Methodologies for a joint discipline have already been proposed.

Simpson's paradox in psychological science: a practical guide - summary of an article by Kievit, Frankenhuis, Waldorp, & Borsboom (2013)

Critical thinking
Article: Kievit, Frankenhuis, Waldorp, & Borsboom (2013)
Simpson's paradox in psychological science: a practical guide

Introduction

Simpson’s paradox: the direction of an association at the population-level may be reversed within the subgroups comprising that population.

Simpson showed that a statistical relation observed in a population could be reversed within all of the subgroups that make up that population.
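
The reversal can be sketched with a small numerical example. The subgroup names, conditions, and counts below are hypothetical and chosen only to produce the effect. Within each subgroup the treated condition has the higher success rate, but after pooling the subgroups the control condition does.

```python
from fractions import Fraction

# Hypothetical (successes, total) counts per subgroup and condition.
counts = {
    "subgroup_1": {"treated": (8, 10), "control": (70, 100)},
    "subgroup_2": {"treated": (40, 100), "control": (3, 10)},
}


def rate(successes, total):
    """Exact success rate as a fraction."""
    return Fraction(successes, total)


def pooled_rate(condition):
    """Success rate for one condition after adding up all subgroups."""
    successes = sum(group[condition][0] for group in counts.values())
    total = sum(group[condition][1] for group in counts.values())
    return rate(successes, total)


# Is the treated rate higher than the control rate inside each subgroup?
treated_better_within = {
    name: rate(*group["treated"]) > rate(*group["control"])
    for name, group in counts.items()
}

# Is the treated rate higher than the control rate in the pooled data?
treated_better_pooled = pooled_rate("treated") > pooled_rate("control")

print(treated_better_within)
print(treated_better_pooled)
```

The reversal in this sketch comes from the unequal group sizes: most treated cases fall in the subgroup with the lower overall success rate.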

What is Simpson’s paradox?
Simpson’s paradox in individual differences
A survival guide to Simpson’s paradox

What is Simpson’s paradox?

Simpson’s paradox is a counter-intuitive feature of aggregated data, which may arise when (causal) inferences are drawn across different explanatory levels (for example, from population to subgroup, or from subgroup to individual).

Simpson’s paradox is conceptually and analytically related to many statistical challenges and techniques.
The underlying shared theme of these techniques is that they are concerned with the nature of (causal) inference. The challenge is what inferences are warranted based on the data we observe.

Simpson’s paradox in individual differences

One can only be sure that a group-level finding generalizes to individuals when the data are ergodic, which is a very strict requirement.
Since this requirement is unlikely to hold in many data sets, extreme caution is warranted in generalizing across levels.
The dimensions that appear in a covariance structure analysis describe patterns of variation between people, not variation within individuals over time.

A person X may have a position on five dimensions compared to other people in a given population, but this does not imply that this person varies along the same number of dimensions over time.

Two variables may correlate positively across a population of individuals, but negatively within each individual over time.
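
A minimal sketch of this situation, using invented data: each hypothetical person has a stable baseline that raises both variables, while within a person the two variables move in opposite directions over time. The names `people`, `baseline`, and the specific numbers are assumptions made for illustration.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Hypothetical data: 5 people, 5 time points each.
people = {}
for person in range(5):
    baseline = 10 * person  # stable between-person difference
    xs = [baseline + t for t in range(5)]  # x rises over time
    ys = [baseline - t for t in range(5)]  # y falls over time
    people[person] = (xs, ys)

# Correlation computed separately inside each person.
within_r = {person: pearson(xs, ys) for person, (xs, ys) in people.items()}

# Correlation computed on all observations pooled across people.
all_x = [x for xs, _ in people.values() for x in xs]
all_y = [y for _, ys in people.values() for y in ys]
pooled_r = pearson(all_x, all_y)

print(within_r)
print(pooled_r)
```

Here the pooled correlation is strongly positive because between-person differences dominate, even though every within-person correlation is negative.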

A survival guide to Simpson’s paradox

Simpson’s paradox may occur in a wide variety of research designs, methods, and questions.
There is no single mathematical property that all instances of SP have in common. Therefore, there will not be a single, correct rule for analysing data so as to prevent cases of SP.

What we can do is consider the instances of SP we are most likely to encounter, and investigate them for characteristic warning signals.

The most general danger in psychology is that we might incorrectly infer that a finding at the level of the group generalizes to subgroups, or to individuals over time.

Preventing Simpson’s paradox

Develop and test mechanistic explanations

The first step in addressing SP is to carefully consider when it may arise.

The mechanistic inference we propose to explain the data may be incorrect.
This danger arises when we use data at one explanatory level to infer a cause at a different explanatory

Fearing the future of empirical psychology - summary of an article by LeBel & Peters (2011)

Critical thinking
Article: LeBel & Peters (2011)
Fearing the future of empirical psychology

The interpretation bias
Conservatism in theory choice
Deficiencies in MRP
The logical strength of theory
Recommendations for strengthening method-relevant beliefs
Recommendations for weakening theory-relevant beliefs

The interpretation bias

Because empirical data underdetermine theory choice, alternative explanations of data are always possible, both when the data statistically support the researcher’s hypothesis and when they fail to do so.

The interpretation bias: a bias toward interpretations of data that favour a researcher’s theory, both when the null hypothesis is statistically rejected and when not.
This bias entails that, regardless of how data turn out, the theory whose predictions are being tested is artificially buffered from falsification.
The ultimate consequence is an increased risk of reporting false positives and disregarding true negatives, and so drawing incorrect conclusions about human psychology.

The research bias underlying the file-drawer problem in no way depends on unscrupulous motives.

Conservatism in theory choice

The knowledge system that constitutes a science such as psychology can be roughly divided into two types of belief:

Theory-relevant beliefs
Concern the theoretical mechanisms that produce behaviour
Method-relevant beliefs
Concern the procedures through which data are produced, measured and analysed

In any empirical test of a hypothesis, interpretation of the resulting data depends on both theory-relevant and method-relevant beliefs, as both types of belief are required to bring the hypothesis to empirical test.
Consequently, the resulting data can always be interpreted as theory relevant or as method relevant.

Weaknesses in the current knowledge system of empirical psychology bias the resulting choice of interpretation in favour of the researcher’s theory.
Deficiencies in methodological research practice systematically bias:

The interpretation of confirmatory data as theory relevant
The interpretation of disconfirmatory data as method relevant

This has the result that the researcher’s hypothesis is artificially buffered from falsification.

The interpretation of data should hinge not on what the pertinent beliefs are about, but rather on the centrality of those beliefs.
The centrality of belief reflects its position within the knowledge system: central beliefs are those on which many other beliefs depend. Peripheral beliefs are those with few dependent beliefs.
The rejection of central beliefs to account for observed data entails a major restructuring of the overall knowledge system.

Conservatism: choosing the theoretical explanation consistent with the data that requires the least amount of restructuring of the existing knowledge system.
Generally, conservatism in theory choice is a virtue, as it reduces ambiguity in the interpretation of data.
The value of methodological rigour is precisely that, by leveraging conservatism, it becomes more difficult to blame negative results on flawed methodology.
When method-relevant

The 10 commandments of helping students distinguish science from pseudoscience in psychology - summary of an article by Scott O. Lilienfeld (2005)

Critical thinking
Article: Scott O. Lilienfeld (2005)
The 10 commandments of helping students distinguish science from pseudoscience in psychology

The ten commandments of helping students distinguish science from pseudoscience in psychology

The first commandment

It is important to communicate to students that the differences between science and pseudoscience, although not absolute or clear-cut, are neither arbitrary nor subjective.

Warning signs that characterize most pseudoscientific disciplines:

A tendency to invoke ad hoc hypotheses, which can be thought of as ‘escape hatches’ or loopholes, as a means of immunizing claims from falsification.
An absence of self-correction and an accompanying intellectual stagnation
An emphasis on confirmation rather than refutation
A tendency to place the burden of proof on sceptics, not proponents, of claims
Excessive reliance on anecdotal and testimonial evidence to substantiate claims
Evasion of the scrutiny afforded by peer review
Absence of ‘connectivity’, a failure to build on existing scientific knowledge
Use of impressive-sounding jargon whose primary purpose is to lend claims a facade of scientific respectability
An absence of boundary conditions: a failure to specify the settings under which claims do not hold.

None of these warning signs is by itself sufficient to indicate that a discipline is pseudoscientific.
But the more of these warning signs a discipline exhibits, the more suspect it should become.

The second commandment

Learning to distinguish scepticism from cynicism.
One danger of teaching students to distinguish science from pseudoscience is that we can inadvertently produce students who are reflexively dismissive of any claim that appears implausible.

Scepticism, which is the proper mental set of the scientist, implies two seemingly contradictory attitudes:

An openness to claims
A willingness to subject these claims to incisive scrutiny.

Cynicism implies close-mindedness.

The third commandment

Distinguish methodological scepticism from philosophical scepticism.

Methodological (scientific) scepticism: an approach that subjects all knowledge claims to scrutiny with the goal of sorting out true from false claims
Philosophical scepticism: an approach that denies the possibility of knowledge.

There is a continuum of confidence in scientific claims.

The fourth commandment

Distinguish pseudoscientific claims from claims that are merely false.
The key difference between science and pseudoscience lies not in their content but in their approach to evidence.

Science seeks out contradictory information and eventually incorporates such information into its corpus of knowledge
Pseudoscience tends to avoid contradictory information and thereby fails to foster the self-correction that is essential to scientific progress.

The fifth commandment

Distinguish science from scientists.

The scientific method is a toolbox of skills that scientists have developed to prevent themselves from confirming their own biases.

The sixth commandment

Explain the cognitive underpinnings of pseudoscientific beliefs.

We are all prone

WSRt, critical thinking, a list of terms used in the articles of block 2

This is a list of the important terms used in the articles of block 2 of WSRt at the UvA.

Scientific Utopia: II. Restructuring Incentives and Practices to Promote Truth Over Publishability
Neyman, Pearson and hypothesis testing
Evaluating Theories
Degrees of falsifiability
Causal Inference and Developmental Psychology
Confounding and deconfounding: or, slaying the lurking variable
Critical thinking in Quasi-Experimentation
Beyond the null ritual, formal modeling of psychological processes
Simpson's paradox in psychological science: a practical guide
Fearing the future of empirical psychology
The 10 commandments of helping students distinguish science from pseudoscience in psychology

Scientific Utopia: II. Restructuring Incentives and Practices to Promote Truth Over Publishability

Accuracy motives: to learn and publish true things about human nature

Professional motives: to succeed and thrive professionally.

Neyman, Pearson and hypothesis testing

Statistical inference: the logic underlying all the statistics you see in the professional journals of psychology and most other disciplines that regularly use statistics.

The subjective interpretation of probability: a probability is a degree of conviction of a belief

The objective interpretation of probability: locate probability in the world.

Alpha: the long-term error rate for one type of error: saying the null is false when it is true.

Type I error: when the null is true and we reject it.

Type II error: accepting the null when it is false.

Meta-analysis: the process of combining groups of studies together to obtain overall tests of significance.
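
The definition of alpha as a long-term error rate can be illustrated with a small simulation. This is a sketch under stated assumptions, not a procedure from the article: every simulated study draws normally distributed data with a true mean of zero (so the null is true) and a known standard deviation of 1, and applies a two-sided z-test. The sample size, number of studies, and seed are arbitrary choices.

```python
import random
from statistics import NormalDist, fmean

ALPHA = 0.05          # nominal long-run Type I error rate
N_PER_STUDY = 20      # assumed sample size per simulated study
N_STUDIES = 10_000    # assumed number of simulated studies


def null_study_rejects(rng, n, critical_z):
    """Simulate one study in which the null is true; return True if it is rejected."""
    sample = [rng.gauss(0.0, 1.0) for _ in range(n)]
    z = fmean(sample) * n ** 0.5  # z statistic with known sigma = 1
    return abs(z) > critical_z


rng = random.Random(2011)  # fixed seed so the run is repeatable
critical_z = NormalDist().inv_cdf(1 - ALPHA / 2)  # two-sided cut-off

rejections = sum(
    null_study_rejects(rng, N_PER_STUDY, critical_z) for _ in range(N_STUDIES)
)
type_i_rate = rejections / N_STUDIES

print(f"critical z = {critical_z:.3f}, observed Type I error rate = {type_i_rate:.3f}")
```

Across many such studies the proportion of rejections should settle near alpha, which is what "long-term error rate" means. Any single study is either a Type I error or not.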

Evaluating Theories

Descriptive adequacy: does the theory accord with the available data?

Precision and interpretability: Is the theory described in a sufficiently precise fashion that other theorists can interpret it easily and unambiguously?

Coherence and consistency: Are there logical flaws in the theory? Does each component of the theory seem to fit with the others into a coherent whole? Is it consistent with theory in other domains?

Prediction and falsifiability: Is the theory formulated in such a way that critical tests can be conducted that could reasonably lead to the rejection of the theory?

Postdiction and explanation: Does the theory provide a genuine explanation of existing results?

Parsimony: Is the theory as simple as possible?

Originality: Is the theory new or is it essentially a restatement of an existing theory?

Breadth: does the theory apply to a broad range of phenomena or is it restricted to a limited domain?

Usability: does the theory have applied implications?

Rationality: does the theory make claims about the architecture of mind that seem reasonable in the light of the environmental contingencies that have shaped our evolutionary history?
