Bayesian Versus orthodox statistics: which side are you on? - summary of an article by Dienes, 2011

Critical thinking
Article: Dienes, Z, 2011
Bayesian Versus orthodox statistics: which side are you on?
doi: 10.1177/1745691611406920

The contrast: orthodox versus Bayesian statistics
The probabilities of data given theory and theory given data
Problems with the Neyman Pearson approach
The rationality of the Bayesian approach
Effect size
How to calculate a Bayes factor
Multiple testing and cheating
Weaknesses of the Bayesian approach

The contrast: orthodox versus Bayesian statistics

The orthodox logic of statistics, starts from the assumption that probabilities are long-run relative frequencies.
A long-run relative frequency requires an indefinitely large series of events that constitutes the collective probability of some property (q) occurring is then the proportion of events in the collective with property q.

The probability applies to the whole collective, not to any one person.
- One person may belong to two different collectives that have different probabilities
Long run relative frequencies do not apply to the truth of individual theories because theories are not collectives. They are just true or false.
- Thus, when using this approach to probability, the null hypothesis of no population difference between two particular conditions cannot be assigned a probability.
Given both a theory and a decision procedure, one can determine a long-run relative frequency with which certain data might be obtained. We can symbolize this as P(data| theory and decision procedure).

The logic of Neyman Pearson (orthodox) statistics is to adopt decision procedures with known long-term error rates and then control those errors at acceptable levels.

Alpha: the error rate for false positives, the significance level
Beta: the error rate for false negatives

Thus, setting significance and power controls long-run error rates.

An error rate can be calculated from the tail area of test statistics.
An error rate can be adjusted for factors that affect long-run error rates
These error rates apply to decision procedures, not to individual experiments.
- An individual experiment is a one-time event, so does not constitute a long-run set of events
- A decision procedure can in principle be considered to apply over a indefinite long-run number of experiments.

The probabilities of data given theory and theory given data

The probability of a theory being true given data can be symbolized as P(theory|data).
This is what orthodox statistics tell us.
One cannot infer one conditional probability just by knowing its inverse. (So P(data|theory) is unknown).

Bayesian statistics starts from the premise that we can assign degrees of plausibility to theories, and what we want our data to do is to tell us how to adjust these plausibilities.

When we start from this assumption, there is no longer a need for the notion of significance, p value, or power.
Instead, we simply determine the factor by which we should change the probability of different theories given the data.

The likelihood

In the Bayesian approach, applies to the truth of theories.
We can answer the questions about:

p(H), the probability of a hypothesis being true (our prior probability)
p(H|D), the probability of a hypothesis given the data (our posterior probability).

Neither of these can be do using the orthodox approach.
Likelihood: the probability of obtaining the exact data given the hypothesis.

Posterior is given by likelihood times prior.

The likelihood principle: all information relevant to inference contained in data is provided by the likelihood.
When we are determining how given data changes the relative probability of our different theories, it is only the likelihood that connects the prior to the posterior.

The likelihood is the probability of obtaining the exact data obtained given a hypothesis (P(D|H).
This is different from a p value, which is the probability of obtaining the same or more extreme data given both a hypothesis and a decision procedure.

A p-value for a t test is a tail area of the t distribution
The corresponding likelihood is the height of the distribution at the point representing the data

In orthodox statistics, p values are changed according to the decision procedure; under what conditions one would stop collecting data, whether or not the test is post hoc, how many other test one conducted.
None of these factor influence the likelihood.

The Bayes factor

The Bayes factor pits one theory against another.

Prior probabilities and prior odds can be entirely personal and subjective.
There is no reason why people should agree about these before data are collected if they are not part of the publically presented inferential procedure.
If the priors form part of the inferential procedure, they must be fairly produced and subjected to the tribunal of peer judgement.

One data are collected we can calculate the likelihood for each theory.
These likelihoods are things we want researchers to agree on. Any probabilities that contribute to them should be plausibly or simply determined by determined by the specification of the theories.
The Bayes factor (B): the ratio of likelihoods.

Posterior odds = B x prior odds.

If B is greater than 1, the data supported you experimental hypothesis over the null.
If B is less than 1, the data supported the null hypothesis over the experimental one.
If B is about 1, the experiment was not sensitive.

The evidence is continuous and there are not thresholds in Bayesian theory.
B automatically gives a notion of sensitivity, it directly distinguishes data supporting the null from data uninformative about whether the null or you theory was supported.

For both p values associated with a t test and for B, if the null is false, as a number of subjects increases, then test scores are driven in one direction.

p values are expected to become smaller
Both t and B values are expected to become larger

When the null hypothesis is true, p values are not driven in any direction, only B us. B is then driven to zero.

Problems with the Neyman Pearson approach

Stopping rule

In the Neyman Pearson approach, one must specify the stopping rule in advance.
Once those conditions are met, there is to be no more data collection.
Typically, this means one should use a power calculation to plan in advance how many subjects to run.

The Bayes factor behaves differently from p values as more data are run (regardless of stopping rule).

For a p value, if the null is true, any value in the interval 0 to 1 is equally likely no matter how much data you collect
- For this reason, sooner or later, you are guaranteed to get a significant result if you run subjects long enough and stop when you get the p value you want
- When the null is true, as the number of subjects increases, the p value is not driven to any particular value.
As the number of subjects increases and the null is true, the Bayes factor is driven toward zero.

Planned versus post hoc comparisons

When using Neyman Pearson, it matters whether you formulated your hypothesis before or after looking at the data (post hoc vs. planned comparisons).
Predictions made before rather than after looking at the data are treated differently.

Post hoc fitting can involve preference for one auxiliary over may others of at least equal plausibility.

In Bayesian inference, the evidence for a theory is just as strong regardless of its timing relative to the data.
This is because the likelihood is unaffected by the time the data were collected.
The likelihood principle follows from the axioms of probability.
It is not the ability to predict in advance per se that is important, that ability is just an (imperfect) indicator of the prior probability of relevant hypotheses.
When performing Bayesian inference, there is no need to adjust for the timing of predictions per se.

Multiple testing

When using Neyman Pearson, one must correct for how many tests are conducted in a family of tests.

When using Bayes, it does not matter how many other statistical hypotheses are investigated. All that matters is the data relevant to each hypothesis under investigation.
Once one takes into account the full context, the axioms of probability lead to sensible answers.

In the Bayes approach, rather than the Neyman Pearson approach, that is most likely to demand that researchers draw appropriate conclusions from a body of relevant data involving multiple testing.

The rationality of the Bayesian approach

If we want to determine by how much we should revise continuous degrees of belief, we need to make sure our system of inference obeys the axioms of probability.
If researchers want to think in terms of degree of support data provide for a hypothesis, they should make sure their inferences obey the axioms of probability.

One version of degrees of belief are subjective probabilities.
Subjective probabilities: personal convictions in an opinion.
When probabilities of different propositions form part of the inferential procedure we use in deriving conclusions from data, then we need to make sure that the procedure is fair.
Thus, there has been an attempt to specify objective probabilities that follow from the informational specification of a problem.
In this way, the probabilities become an objective part of the problem, with values that can be argued about, given the explicit assumptions, and that do not depend any further on personal idiosyncrasies.

One notion of rationality is having sufficient justification for one’s beliefs.
If one can assign numerical continuous degrees of justification to beliefs, then some simple minimal desiderata lead to the likelihood principle of inference.
Hypothesis testing violates the likelihood principle.

Effect size

Bayes factors demand consideration of relevant effect sizes.

Neyman developed two specific measures of sensitivity:

Power
Confidence intervals: the set of population values that the data are consistent with.

For any continuous measure based on a finite number of subjects, an interval cannot be an infinitesimally small point.
A null result is always consistent with population values other than zero.
That is why a non-significant result cannot on its own lead to the conclusion that the null hypothesis is true.

Theories and practical questions generally specify, even if vaguely, relevant effect sizes.
The research context, usually provides a range of effects that are too small to be relevant and a range of effects that are consistent with theory or practical use.

Researchers have relevant intuitions, and that is why it has made sense to them to assert null hypotheses.
Bayes makes them explicit.
If we want to use null results in any way to count against theories that predict an effect, we must consider the range of effect sizes consistent with the theory.

Effect size is very important in the Neyman Pearson approach.

One must specify the sort of effect one predicts in order to calculate power.

On the other hand, Fisherian significance testing leads people to ignore effect sizes.

People have followed Fisher’s method, while paying lip service to effect sizes, but not heeding Fisher’s advice that nothing follows from a null result.

One must specify what sort of effect sizes a theory predicts to calculate a Bayes factor.
Because it takes into account effect size, the Bayes factor distinguishes evidence that there is not relevant effect from no evidence of a relevant effect.
One can only confirm a null hypothesis when one has specified the effect size expected on the theory being tested.

In specifying theoretically expected effect sizes, we should ask ourselves “What size effect does the literature suggest is interesting for this particular domain?” Rather than following the common practice of plucking a standardized effect size of 0.5 out of thin air, researchers should get to know the data of the field.

Confidence intervals themselves have all the problems for Neyman Pearson inference in general (unlike credibility or likelihood intervals).
Because confidence intervals consists of all values non-significantly different from the sample mean, they inherit the arbitrariness of significance testing.

How to calculate a Bayes factor

To calculate a Bayes factor in support of a theory, one has to specify what the probability of different effect sizes are, given the theory.
Bayes gives us the apparatus to flexibly deal with different degrees of uncertainty regarding the predicted effect size.
Logically, one needs to know what a theory predicts in order to know how much it is supported by evidence.

Three distributions

In terms of predictions of the theory (or requirements of a practical effect), one has to decide what range of effects are relevant to the theory.
Three ranges:

An uniform distribution
All values between a lower bound and an upper bound.
All values are represented as possible and equally likely given the theory and all those outside are inconsistent with the theory
A normal distribution
One value is the most likely given the theory, and any values lower or higher are progressively less likely
Normal distribution centred on zero with only one tail
The theory predicts an effect in one direction, but smaller values are generally more likely than larger values

Different ways of using Bayes factors

Bayes factor is suggested to be used on any data where the null hypothesis is compared with a default theory.
Or when inference is based on the posterior and thus takes into account the priors of hypothesis.
Also for specific hypotheses that interest the researcher and allows priors to remain personal and not part of public inference.

By following Bayes rule, each of these approaches means rational answers are provided for the given assumptions, and researchers may choose each according to their goals and which assumptions seem relevant to them.

Bayes factors are just one form of Bayesian inference, namely a method for evaluating theories against another.

Multiple testing and cheating

With Bayes factors, one does not have to worry about corrections for multiple testing, stopping rules, or planned versus post hoc comparisons.
Bayes factor just tells you how much support given data provides for one theory over another.
There is no right Bayes factor.

Strictly, each Bayes factor is a completely accurate indication of the support for the data of one theory over another.
The theories are defined by the precise predictions they make.
The crucial question is which of these representations best matches the theory as the researcher has described it and related it to the existing literature.

One constraint on the researcher will be the demand for consistency: arguing for one application of a theory ties one’s hands when it comes to another application.
The solution is to use a default Bayes factor for all occasions, though this amounts to evaluating a default theory for all occasions, regardless of one’s actual theory.
A default Bayes factor will only test your theory if it happens to correspond to the default.
Another solution is to define the predictions according to simple procedures to ensure the theory proposed is tested according to fair criteria.

When using Bayes in multiple testing, one can use the fact that one is testing multiple hypotheses to inform the results if one believes that testing these multiple hypotheses is relevant to the probability of any of them being true.

Weaknesses of the Bayesian approach

Bayesian analysis force people to consider what a theory actually predicts, but specifying the predictions in detail may be contentious.
Bayesian analysis escape the paradoxes of violating the likelihood principle, but in doing so they no longer control for Type I and Type II errors.

Calculating a Bayes factor depends on answering the following question about which there may be disagreement: What way of assigning probability distributions of effect sizes as predicted by theories would be accepted by protagonists on all sides of a debate?

Ultimately, the issue is about what is more important to us: using a procedure with known long-term error rates or knowing the degree of support for our theory.

Check page access:

Public

Join WorldSupporter!

Join with a free account for more service, or become a member for full access and support of WordSupporter

This content is related to:

WSRt, critical thinking - a summary of all articles needed in the fourth block of second year psychology at the uva

This is a summary of the articles and reading materials that are needed for the fourth block in the course WSR-t. This course is given to second year psychology students at the Uva. The course is about thinking critically about how scientific research is done and how this...Read more

2259 reads

Check more or recent content:

WSRt, critical thinking - a summary of all articles needed in the fourth block of second year psychology at the uva

Kinds versus continua: a review of psychometric approaches to uncover the structure of psychiatric constructs - summary of an article by Borsboom, Rhemtulla, Cramer, van der Maas, Scheffer and Dolan

Critical thinking
Article: Borsboom, Rhemtulla, Cramer, van der Maas, Scheffer and Dolan (2016)
Kinds versus continua: a review of psychometric approaches to uncover the structure of psychiatric constructs

The present paper reviews psychometric modelling approaches that can be used to investigate the question whether psychopathology constructs are discrete or continuous dimensions through application of statistical models.

Introduction
Measurement theoretical definitions of kinds and continua
Kinds and continua as psychometric entities
Alternative latent variable models

Introduction

The question of whether mental disorders should be thought of as discrete categories or as continua represents an important issue in clinical psychology and psychiatry.

The DSM-V typically adheres to a categorical model, in which discrete diagnoses are based on patterns of symptoms.

But, such categorizations often involve apparently arbitrary conventions.

Measurement theoretical definitions of kinds and continua

All measurement starts with categorization, the formation of equivalence classes.
Equivalence classes: sets of individuals who are exchangeable with respect to the attribute of interest.
We may not succeed in finding an observational procedure that in fact yields the desired equivalence classes.

We may find that individuals who have been assigned the same label are not indistinguishable with respect to the attribute of interest.
Because there are now three classes rather than two, next to the relation between individuals within cases (equivalence), we may also represent systematic relations between members of different cases.
One may do so by invoking the concept of order.
But, we may find that within these classes, there are non-trivial differences between individuals that we wish to represent.

If we break down the classes further, we may represent them with a scale that starts to approach continuity.

The continuity hypothesis formally implies that:

in between any two positions lies a third that can be empirically instantiated
there are no gaps in the continuum.

In psychological terms, categorical representations line up naturally with an interpretation of disorders as discrete disease entities, while continuum hypotheses are most naturally consistent with the idea that a construct varies continuously in a population.

in a continuous interpretation, the distinction between individuals depends on the imposition of a cut-off score that does not reflect a gap that is inherent in the attribute itself.

Kinds and continua as psychometric entities

In psychology, we have no way to decide conclusively whether two individuals are ‘equally depressed’.
This means we cannot form the equivalence classes necessary for measurement theory to operate.
The standard approach to dealing with this situation in psychology is

Access:

Public

Toward a Model-Based Approach to the Clinical Assessment of Personality Psychopathology - summary of an article by Eaton, Krueger, Docherty, and Sponheim

Critical thinking
Article: Eaton, Krueger, Docherty, and Sponheim (2013)
Toward a Model-Based Approach to the Clinical Assessment of Personality Psychopathology

This paper illustrates how new statistical methods can inform conceptualization of personality psychopathology and therefore its assessment.

The relationship between structure and assessment
Distributional assumptions of personality constructs
Model-based tests of distributional assumptions
Discussion

The relationship between structure and assessment

Structural assumptions about personality variables are inextricably linked to personality assessment.

reliable assessment of normal-range personality traits, and personality disorder categories, frequently takes different forms, given that the constructs of interest are presumed to have different structures.
when assessing personality traits, the assessor needs to measure the full range of the trait dimension to determine where an individual falls in it.
then assessing the presence or absence of a DSM-V personality disorder, the assessor needs to evaluate the presence of absence of the binary categorical diagnosis.
given the polythetic nature of criterion sets, the purpose of the assessment is to determine which criteria are present, calculate the number of present criteria, and note whether this sum meets or exceeds a diagnostic threshold.

The nature of the personality assessment instrument reflect assumptions about the distributional characteristics of the construct of interest.

items on DSM-oriented inventories are usually intended to gather converging pieces of information about each criterion to determine whether or not it is present.

Distributional assumptions of personality constructs

Historically, many assumptions about the distributions of data reflecting personality constructs resulted form expert opinion or theory.
Both ‘type’ theories and dimensional theories have been proposed.
Assessment instruments have reflected this bifurcation in conceptualization.

The resulting implications for assessment are far from trivial
The structure of a personality test designed to determine whether an individual is one or two personality types, needs only to assess the two characteristics, as opposed to assessing characteristics that are more indicative or mid-range.
- There is no mid-ground in type theory, so items covering middle-ground are not relevant.

Because the structure of personality assessment is reflective of the underlying distributional assumptions of the personality constructs of interest, reliance solely on expert opinion about these distributions is potentially problematic.

Model-based tests of distributional assumptions

It is critical for personality theory and assessment that underlying distributional assumptions of symptomatology be correct and justifiable.

different distributions impact the way clinical and research constructs are conceptualized, measured, and applied to individuals.
characterizing these latent constructs properly is a prerequisite for efforts to asses them.
- it is of limited

Access:

Public

Bayes and the probability of hypotheses - summary of Chapter 4 of Understanding Psychology as a science by Dienes

Critical thinking
Chapter 4 of Understanding Psychology as a science by Dienes
Bayes and the probability of hypotheses

Objective probability: a long-run relative frequency.
Classic (Neyman-Pearson) statistics can tell you the long-run relative frequency of different types of errors.

Classic statistics do not tell you the probability of any hypothesis being true.

An alternative approach to statistics is to start with what Bayesians say are people’s natural intuitions.
People want statistics to tell them the probability of their hypothesis being right.
Subjective probability: the subjective degree of conviction in a hypothesis.

Subjective probability
Bayes’ theorem
Bayesian analysis

Subjective probability

Subjective or personal probability: the degree of conviction we have in a hypothesis.
Probabilities are in the mind, not in the world.

The initial problem to address in making use of subjective probabilities is how to assign a precise number to how probable you think a proposition is.
The initial personal probability that you assign to any theory is up to you.
Sometimes it is useful to express your personal convictions in terms of odds rather than probabilities.

Odds(theory is true) = probability(theory is true)/probability(theory is false)
Probability = odds/(odds +1)

These numbers we get from deep inside us must obey the axioms of probability.
This is the stipulation that ensures the way we change our personal probability in a theory is coherent and rational.

People’s intuitions about how to change probabilities in the light of new information are notoriously bad.

This is where the statistician comes in and forces us to be disciplined.

There are only a few axioms, each more-or-less self-evidently reasonable.

Two aximons effectively set limits on what values probabilities can take.
All probabilities will lie between 0 and 1
P(A or B) = P(A) + P(B), if A and B are mutually exclusive.
P(A and B) = P(A) x P(B|A)
- P(B|A) is the probability of B given A.

Bayes’ theorem

H is the hypothesis
D is the data

P(H and D) = P(D) x P(H|D)
P(H and D) = P(H) x P(D|H)

P(D) x P(H|D) = P(H) x P(D|H)

Moving P(D) to the other side

P(H|D) = P(D|H) x P(H) / P(D)

This last one is Bayes theorem.
It tells you how to go from one conditional probability to its inverse.
We can simplify this equation if we are interested in comparing the probability of different hypotheses given the same data D.
Then P(D) is just a constant for all these comparisons.

P(H|D) is proportional to P(D|H) x

Access:

Public

Bayesian Versus orthodox statistics: which side are you on? - summary of an article by Dienes, 2011

Critical thinking
Article: Dienes, Z, 2011
Bayesian Versus orthodox statistics: which side are you on?
doi: 10.1177/1745691611406920

The contrast: orthodox versus Bayesian statistics
The probabilities of data given theory and theory given data
Problems with the Neyman Pearson approach
The rationality of the Bayesian approach
Effect size
How to calculate a Bayes factor
Multiple testing and cheating
Weaknesses of the Bayesian approach

The contrast: orthodox versus Bayesian statistics

The probability applies to the whole collective, not to any one person.
- One person may belong to two different collectives that have different probabilities
Long run relative frequencies do not apply to the truth of individual theories because theories are not collectives. They are just true or false.
- Thus, when using this approach to probability, the null hypothesis of no population difference between two particular conditions cannot be assigned a probability.
Given both a theory and a decision procedure, one can determine a long-run relative frequency with which certain data might be obtained. We can symbolize this as P(data| theory and decision procedure).

The logic of Neyman Pearson (orthodox) statistics is to adopt decision procedures with known long-term error rates and then control those errors at acceptable levels.

Alpha: the error rate for false positives, the significance level
Beta: the error rate for false negatives

Thus, setting significance and power controls long-run error rates.

An error rate can be calculated from the tail area of test statistics.
An error rate can be adjusted for factors that affect long-run error rates
These error rates apply to decision procedures, not to individual experiments.
- An individual experiment is a one-time event, so does not constitute a long-run set of events
- A decision procedure can in principle be considered to apply over a indefinite long-run number of experiments.

The probabilities of data given theory and theory given data

Bayesian statistics starts from the premise that we

Access:

Public

Network Analysis: An Integrative Approach to the Structure of Psychopathology - summary of an article by Borsboom and Cramer (2013)

Critical thinking
Article: Borsboom, D. and Cramer, A, O, J. (2013)
Network Analysis: An Integrative Approach to the Structure of Psychopathology
doi: 10.1146/annurev-clinpsy-050212-185608

Introduction
Symptoms and disorders in psychopathology
Complex psychopathology networks
Constructing and analysing psychopathology networks
The many roads to disorder: individual networks

Introduction

The current dominant paradigm of the disease model of psychopathology is problematic.
Current handling of psychopathology data is predicated on traditional psychometric approaches that are the technical mirror of of this paradigm.
In these approaches, observables (clinical symptoms) are explained by means of a small set of latent variables, just like symptoms are explained by disorders.

From this psychometric perspective, symptoms are regarded as measurements of a disorder, and in accordance, symptoms are aggregated in a total score that reflects a person’s stance on that latent variable.
The dominant paradigm is not merely a matter of theoretical choice, but also of methodological and pragmatic necessity.

In this review, we argue that complex network approaches, which are currently being developed at the crossroads of various scientific fields, have the potential to provide a way of thinking about disorders that does justice to their complex organisation.

In such approaches, disorders are conceptualized as systems of causally connected symptoms rather than as effects of a latent disorder.
Using network analysis techniques, such systems can be represented, analysed, and studied in their full complexity.
In addition, network modeling has the philosophical advantage of dropping the unrealistic idea that symptoms of a single disorder share a single causal background, while it simultaneously avoids the realistic consequence that disorders are merely labels for an arbitrary set of symptoms.
- It provides a middle ground in which disorders exists as systems, rather than as entities

Symptoms and disorders in psychopathology

We know for certain that people suffer from symptoms and that these symptoms cluster in a non-arbitrary way.
For most psychopathological conditions, the symptoms are only empirically identifiable causes of distress.

Mental disorders are themselves not empirically identifiable in that they cannot be diagnosed independently of their symptoms.
- It is impossible to identify any of the common mental disorders as conditions that exists independently of their symptoms.

In order for a disease model to hold, it should be possible to conceptually separate conditions from symptoms.

It must be possible (or at least imaginable) that a person should have a condition/disease without the associated symptoms.

This isn’t possible for mental disorders.
As an important corollary, this means that disorders cannot be causes of these

Access:

Public

Introduction to qualitative psychological research - an article by Coyle (2015)

Critical thinking
Article: Coyle, A (2015)
Introduction to qualitative psychological research

Introduction

This chapter examines the development of psychological interest in qualitative methods in historical context and point to the benefits that psychology gains from qualitative research.
It also looks at some important issues and developments in qualitative psychology.

Epistemology and the ‘scientific method’
Resistance to the ‘scientific method’: alternative epistemologies and research foci
Reflexivity in qualitative research
Evaluative criteria for qualitative research
Combining research methods and approaches

Epistemology and the ‘scientific method’

At its most basic, qualitative psychological research may be regarded as involving the collection and analysis of non-numerical data through a psychological lens in order to provide rich descriptions and possibly explanations of peoples meaning-making, how they make sense of the world and how they experience particular events.

Qualitative research is bound up with particular sets of assumptions about the bases or possibilities of knowledge.
Epistemology: particular sets of assumptions about the bases or possibilities of knowledge.
Epistemology refers to a branch of philosophy that is concerned with the theory of knowledge and that tries to answer questions about how we can know what we know.
Ontology: the assumptions we make about the nature of being, existence or reality.

Different research approaches and methods are associated with different epistemologies.
The term ‘qualitative research’ covers a variety of methods with a range of epistemologies, resulting in a domain that is characterized by difference and tension.

The epistemology adopted by a particular study can be determined by a number of factors.

A researcher may have a favoured epistemological outlook or position and may locate their research within this, choosing methods that accord to with that position.
Alternatively, the researcher may be keen to use a particular qualitative method in their research and so they frame their study according to the epistemology that is usually associated with that method.

Whatever epistemological position is adopted in a study, it is usually desirable to ensure that you maintain this position consistently throughout the wire-up to help produce a coherent research report.

Positivism: holds that the relationship between the world and our sense perception of the world is straightforward. There is a direct correspondence between things in the world and our perception of them provided that our perception is not skewed by factors that might damage that correspondence.
So, it is possible to obtain accurate knowledge of things in the world, provided we can adopt an impartial, unbiased, objective viewpoint.

Empiricism: holds that our knowledge of the world must arise from the collection and categorization of our sense perceptions/observations of the world.
This categorization allows us to develop more complex knowledge

Access:

Public

Surrogate Science: The Idol of a Universal Method for Scientific Inference - summary of an article by Gigerenzer & Marewski

Critical thinking
Article: Gigerenzer, G. & Marewski, J, N. (2015)
Surrogate Science: The Idol of a Universal Method for Scientific Inference
doi: 10.1177/0149206314547522

Introduction

Scientific inference should not be made mechanically.
Good science requires both statistical tools and informed judgment about what model to construct, what hypotheses to test, and what tools to use.

This article is about the idol of a universal method of statistical inference.

In this article, we make three points:

There is no universal method of scientific inference, but, rather a toolbox of useful statistical methods. In the absence of a universal method, its followers worship surrogate idols, such as significant p values.
The inevitable gap between the ideal and its surrogate is bridged with delusions.
These mistaken beliefs do much harm. Among others, by promoting irreproducible results.
If the proclaimed ‘Bayesian revolution’ were to take place, the danger is that the idol of a universal method might survive in a new guise, proclaiming that all uncertainty can be reduced to subjective probabilities.
Statistical methods are not simply applied to a discipline. They change the discipline itself, and vice versa.

Dreaming up a universal method of inference
Bayesianism and the new quest for an universal method
How statistics change research, surrogate science
Conclusion: Leibniz’s dram of Bayes’ nightmare?

Dreaming up a universal method of inference

The null ritual

The most prominent creation of a seemingly universal inference method is the null ritual:

Set up a null hypothesis of ‘no mean inference’ or ‘zero correlation’. Do not specify the predictions or your own research hypothesis.
Use 5% as a convention for rejecting the null. If significant, accept you research hypothesis. Report the result as p<.05, p<.01, p<.001, whichever comes next to the obtained p value.
Always perform this procedure.

Level of significance has three different meanings:

A mere convention
The alpha level
The exact level of significance

Three meanings of significance

The alpha level: the long-term relative frequency of mistakenly rejecting hypothesis H₀if it is true, also known as Type I error rate.
The beta level: the long-term frequency of mistakenly rejecting H₁ if it is true.

Two statistical hypothesis need to be specified in order to be able to determine both alpha and beta.
Neyman and Pearson rejected a mere convention in favour of an alpha level that required a rational scheme.

Set up two statistical hypotheses, H₁, H₂, and decide on alpha, beta and the sample size before the experiment, based on subjective cost-benefit considerations.
If the data fall

Access:

Public

WSRt, critical thinking, a list of terms used in the articles of block 4

This is a list of the important terms used in the articles of the fourth block of WSRt, with the subject alternative approaches to psychological research.

Article: Kinds versus continua: a review of psychometric approaches to uncover the structure of psychiatric constructs
Toward a Model-Based Approach to the Clinical Assessment of Personality Psychopathology
Bayes and the probability of hypotheses
Bayesian Versus orthodox statistics: which side are you on?
Network Analysis: An Integrative Approach to the Structure of Psychopathology
Introduction to qualitative psychological research
Surrogate Science: The Idol of a Universal Method for Scientific Inference - summary of an article by Gigerenzer & Marewski

Article: Kinds versus continua: a review of psychometric approaches to uncover the structure of psychiatric constructs

Equivalence classes: sets of individuals who are exchangeable with respect to the attribute of interest.

Taxometrics: by inspecting particular consequences of the model for specific statistical properties of (subsets of) items, such as the patterns of bivariate correlations expected to hold in the data

Toward a Model-Based Approach to the Clinical Assessment of Personality Psychopathology

Latent trait models: posit the presence of one or more underlying continuous distributions.

Zones of rarity: locations along the dimension that are unoccupied by some individuals.

Discrimination: the measure of how strongly the item taps into the latent trait.

Quasi-continuous: the construct would be bounded at the low end by zero, a complete absence of the quality corresponding with the construct.

Latent class models: based on the supposition of a latent group (class) structure for a construct’s distribution.

Conditional independence: that inter-item correlations solely reflect class membership.

Hybrid models (of factor mixture models): combine the continuous aspects of latent trait models with the discrete aspects of latent class models.

EFMA: exploratory factor mixture analysis.

Bayes and the probability of hypotheses

Objective probability: a long-run relative frequency.

Subjective probability: the subjective degree of conviction in a hypothesis.

The likelihood principle: the notion that all the information relevant to inference contained in data is provided by the likelihood.

Probability density distribution: the distribution of if the dependent variable can be assumed to vary continuously

Credibility interval: the Bayesian equivalent of a confidence interval

The Bayes factor: the Bayesian equivalent of null hypothesis testing

Flat prior or uniform prior: you have no idea what the population value is likely to be

Bayesian

Access:

JoHo members

Everything you need for the course WSRt of the second year of Psychology at the Uva

This magazine contains all the summaries you need for the course WSRt at the second year of psychology at the Uva.

Supporting content:

WSRt, critical thinking, a list of terms used in the articles of block 2

WSRt, critical thinking - a summary of all articles needed in the second block of second year psychology at the uva

WSRt, critical thinking, a list of terms used in the articles of block 3

WSRt using SPSS, manual for tests in the third block of the second year of psychology at the uva

WSRt, critical thinking - a summary of all articles needed in the third block of second year psychology at the uva

WSRt, critical thinking, a list of terms used in the articles of block 4

WSRt, critical thinking - a summary of all articles needed in the fourth block of second year psychology at the uva

Sharon Klinkenberg legt SPSS uit op YouTube

Discovering statistics using IBM SPSS statistics by A. Field (5th edition) a summary

Critical thinking: A concise guide by Bowell & Kemp (4th edition) - a summary

What is a confidence interval in null hypothesis significance testing?

What is the difference between a p-value and Bayes likelihood?

What are important elements of Bayesian statistics?

What is the Bayes factor?

What are weaknesses of the Bayesian approach?

What is qualitative psychological research?

What criteria should be held by good qualitative research?

Year 2 of psychology at the uva

Access:

Public

What is a confidence interval in null hypothesis significance testing?

An confidence interval is an interval brought out an algorithm that by repeated use gives an X% change to hold the true population value.

What are important elements of Bayesian statistics?

The three most important elements of Bayesian statistics are:

The Prior: the relative plausibility of hypothesis, before seeing the data
Likelihood: the predictive updating factor
The Posterior: the relative plausibility of hypothesis, after seeing the data

For more information about Bayesian statistics, check out my summary of the fourth block of WSRt

What is the Bayes factor?

The Bayes factor (B) compares the probability of an experimental theory to the probability of the null hypothesis.
It gives the means of adjusting your odds in a continuous way.

If B is greater than 1, your data support the experimental hypothesis over the null
If B is less than 1, your data support the null over the experimental hypothesis
If B is about 1, then your experiment was not sensitive

For more information, look at the (free) summary of 'Bayes and the probability of hypotheses' or 'Bayesian versus orthodox statistics: which side are you one?'

What are weaknesses of the Bayesian approach?

Weaknesses of the Bayesian approach are:

The prior is subjective
Bayesian analysis force people to consider what a theory actually predicts, but specifying the predictions in detail may by contentious
Bayesian analysis escape the paradoxes of violating the likelihood principle, but in doing so they no longer control for Type I and Type II errors

For more information, look at the (free) summary of 'Bayesian versus orthodox statistics: which side are you on?'

What is qualitative psychological research?

At its most basic, qualitative psychological research can be seen as involving the collection and analysis of non-numerical data through a psychological lens in order to provide rich descriptions and possibly explanations of peoples meaning-making, how they make sense of the world and how they experience particular events.

For more information, look at the (free) summary of 'Introduction to qualitative psychological research'

What criteria should be held by good qualitative research?

Criteria that should be held by good qualitative research are:

Sensitivity to context
Commitment
Rigour
Transparency
Coherence
Impact and importance

For more information about these criteria, look at my (free) summary of 'Introduction to qualitative psychological research, Coyle (2015)'.

How to use this summary?

Associate with your Field of Study

Search Summaries or Notes

Start using Summaries

Add a Summary

Work for WorldSupporter

JoHo can really use your help! Check out the various student jobs here that match your studies, improve your competencies, strengthen your CV and contribute to a more tolerant world

Working for JoHo as a student in Leyden

Parttime werken voor JoHo

Check more of this topic?

Psychologie en gedrag

Check all content related to:

Learn & Study

The Netherlands

Universiteit Amsterdam: UVA

Psychologie en gedrag

How to use more summaries?

Online access to all summaries, study notes en practice exams
Using and finding summaries, study notes en practice exams on JoHo WorldSupporter
Quicklinks to fields of study (main tags and taxonomy terms)

Online access to all summaries, study notes en practice exams

Check out: Register with JoHo WorldSupporter: starting page (EN)
Check out: Aanmelden bij JoHo WorldSupporter - startpagina (NL)

Using and finding summaries, study notes en practice exams on JoHo WorldSupporter

There are several ways to navigate the large amount of summaries, study notes en practice exams on JoHo WorldSupporter.

Starting Pages: for some fields of study and some university curricula editors have created (start) magazines where customised selections of summaries are put together to smoothen navigation. When you have found a magazine of your likings, add that page to your favorites so you can easily go to that starting point directly from your profile during future visits. Below you will find some start magazines per field of study
Use the menu above every page to go to one of the main starting pages
Tags & Taxonomy: gives you insight in the amount of summaries that are tagged by authors on specific subjects. This type of navigation can help find summaries that you could have missed when just using the search tools. Tags are organised per field of study and per study institution. Note: not all content is tagged thoroughly, so when this approach doesn't give the results you were looking for, please check the search tool as back up
Follow authors or (study) organizations: by following individual users, authors and your study organizations you are likely to discover more relevant study materials.
Search tool : 'quick & dirty'- not very elegant but the fastest way to find a specific summary of a book or study assistance with a specific course or subject. The search tool is also available at the bottom of most pages

Do you want to share your summaries with JoHo WorldSupporter and its visitors?

Check out: Why and how to add a WorldSupporter contributions
JoHo members: JoHo WorldSupporter members can share content directly and have access to all content: Join JoHo and become a JoHo member
Non-members: When you are not a member you do not have full access, but if you want to share your own content with others you can fill out the contact form

Quicklinks to fields of study (main tags and taxonomy terms)

Field of study

Check other studie fields?

Main study and working fields

Access level of this page

Public
WorldSupporters only
JoHo members
Private

Statistics

2097

Comments, Compliments & Kudos:

mistake kuqelis contributed on 20-06-2021 14:58

I think the part you mentioned about data and theory might be incorrect check this out:

The probability of a theory being true given data can be symbolized as P(theory | data), and that is what many of us would like to know. This is the inverse of P(data | theory), which is what orthodox statistics tells us.

Add new contribution

Promotions

JoHo kan jouw hulp goed gebruiken! Check hier de diverse studentenbanen die aansluiten bij je studie, je competenties verbeteren, je cv versterken en een bijdrage leveren aan een tolerantere wereld