Beyond the Null Ritual: Formal Modeling of Psychological Processes - Marewski & Olsson - 2009 - Article

One of the most widely used rituals in science is null hypothesis significance testing: testing a hypothesis against "chance". Although it is known to be problematic, it is still routinely used in practice. One way to resist the temptation of the null ritual is to make theories more precise by turning them into formal models. These models can be tested against each other rather than against chance, which enables researchers to decide between competing theories on the basis of quantitative measures.

The arbitrariness of the .05 alpha level gives researchers considerable flexibility in interpreting a p-value as evidence against the null hypothesis. This article is about overcoming a ritual involved in hypothesis testing in psychology: the null ritual, also known as null hypothesis significance testing.

In the null ritual, an unspecified research hypothesis is tested against "chance", that is, against the hypothesis that "there is no difference between two population means." Forty years ago, editors of major psychological journals required this ritual to be carried out before a paper could be published. Although methodological arguments now speak against it, the .05 alpha level is still routinely used.

What lies beyond the null ritual?

Rituals have a number of attributes that all apply to the null ritual: repetition of the same action, a fixation on the 5% (or 1%) level, fear of sanctions from journal editors, and wishful thinking about the results. In its most extreme form, the null ritual reads as follows:

  1. Set up a statistical null hypothesis of "no mean difference" or "zero correlation". Do not specify the predictions of the research hypothesis or of any alternative hypotheses.

  2. Use 5% as a convention for rejecting the null hypothesis. If the result is significant, accept the research hypothesis.

  3. Always perform this procedure.

Since this ritual became institutionalized in psychology, several alternatives have been proposed to replace or supplement it. Most of these suggestions focus on the way data are analyzed: effect size measures, confidence intervals, meta-analysis, and resampling methods.

How is it possible that, despite these attempts to introduce alternatives, the null ritual is still the most widely used procedure? This may be because most psychological theories are simply too weak to do more than predict the direction of an effect. This article therefore does not offer yet another alternative to null hypothesis testing, but a way to make theories more precise by casting them as formal models.

What is a model?

In a broad sense, a model is a simplified representation of the world that is used to explain observed data; in this sense, countless verbal and informal explanations of psychological phenomena count as models. In a narrower sense, a model is a formal instantiation of a theory that specifies the theory's predictions.

What is the scope of modeling?

Modeling is not a single procedure that is always applied in the same way; it should be seen as a toolbox of methods tailored to specific problems, each with its own advantages and disadvantages. Modeling helps researchers understand complex phenomena. Although it is used in many other areas as well, in psychology it is most often applied in research on cognitive systems. Modeling is a demanding undertaking that requires considerable skill and knowledge.

What are the advantages of formally specifying theories?

There are four benefits of increasing the precision of theories by casting them as formal models.

1. A model enables strong tests of theories

Models provide the bridge between theories and empirical evidence. They enable scientists to derive competing quantitative predictions, which allow strong comparative tests of theories; the more precise the predictions, the more systematic and informative the comparison becomes. When the quantitative predictions of different models are tested against one another, null hypothesis testing becomes unnecessary.

2. A model can sharpen a research question

Null hypothesis tests are often used to test verbal, informal theories. But when such theories are not precisely specified, they can be invoked post hoc to "explain" every possible empirical pattern. The predictions of a formal model, by contrast, often cannot be derived by intuition alone; sometimes they can only be understood by running computer simulations. In short, it is often only by modeling a theory oneself that one comes to understand what it actually predicts and what it cannot account for. The goal of modeling is therefore not only to find out which of several competing explanations of the data is preferable, but also to sharpen the questions being asked.

3. A model can move theorizing beyond the general linear model

Many null hypothesis significance tests apply only to simple hypotheses, for instance about linear additive effects. Scientists tend to take readily available tools such as ANOVA and turn them into psychological explanations of the data. A prominent example is attribution theory, which assumes that, just as experimenters use ANOVAs to infer causal relations between variables, people outside the lab infer causal relations by unconsciously performing the same kind of computation. But this is not necessarily the best starting point for building a theory. Although the general linear model (of which ANOVA is a special case) is a precise methodological tool, it is not always the best foundation for psychological theorizing.

4. A model helps to approach real-world problems

Just as the general linear model and null hypothesis tests are often inadequate for conceptualizing and evaluating a theory, factorial designs can lead to theories being tested under conditions that have little to do with the real world, where the explanatory power of theories ultimately has to prove itself. A lack of external validity may be one reason why psychological findings contribute little outside the laboratory: in the real world, no person chooses their social contacts at random, and no organism can "decorrelate" the cues that carry life-sustaining information the way an experimenter does.

Modeling, on the other hand, gives researchers the freedom to deal with natural confounds without destroying them: they can be built into the models. Modeling thus provides ways to increase the precision of theories, helps researchers quantify explanatory power, frees them from dependence on the null hypothesis, and allows formal statements that can be linear as well as nonlinear. By looking beyond factorial designs, it becomes possible to address real-world problems.

What are further benefits of formal modeling? An example of a modeling framework

ACT-R is a broad, quantitative theory of human behavior that covers almost the entire range of human cognition.

Meta-analyses have been used to show that relying on significance tests slows the growth of cumulative knowledge. ACT-R, on the other hand, is a good example of how knowledge can accumulate systematically over time. ACT-R has its roots in older psychological theories, from which it gradually acquired its current form. In that form, it shows how cognitive mechanisms can give rise to adaptive behavior by being attuned to the statistical structure of the environment.

ACT-R models are specific enough to allow computer simulation of both outcomes and processes. For example, in a two-alternative situation such as choosing between reading this article and reading another one, an ACT-R model would predict which alternative is chosen and which pieces of information the model considers before making that choice. With ACT-R, scientists can predict (1) overt behavior, (2) temporal aspects of behavior, and (3) the associated patterns of brain activity as measured by fMRI.

In summary, modeling can promote the growth of cumulative knowledge, reveal how different aspects of behavior relate to one another, and help to integrate psychological subdisciplines.

How do you select between competing formal models?

The comparison of alternative models is called model selection. There are a number of criteria for model selection: (1) psychological plausibility, (2) falsifiability, (3) the number of assumptions a model makes, (4) whether a model is consistent with overarching theories, and (5) practical relevance. In practice, the criterion of descriptive adequacy is used most often: when two or more models are compared, the model that deviates least from the existing data, that is, the one with the best fit, is chosen.
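
As a minimal illustration of selection by descriptive adequacy (my own sketch, not from the article), the snippet below fits two hypothetical retention models, an exponential and a power-law curve, to made-up recall data and compares their fit; the data values and the candidate models are assumptions chosen purely for illustration.

  # Minimal sketch (not from the article): comparing two hypothetical candidate
  # models of forgetting by how well they fit the same data (descriptive adequacy).
  import numpy as np
  from scipy.optimize import curve_fit

  # Hypothetical retention data: proportion recalled after various delays (hours).
  delays = np.array([1.0, 2.0, 4.0, 8.0, 16.0, 32.0])
  recall = np.array([0.81, 0.68, 0.55, 0.44, 0.35, 0.29])

  def exponential(t, a, b):      # candidate model 1: exponential decay
      return a * np.exp(-b * t)

  def power_law(t, a, b):        # candidate model 2: power-law decay
      return a * t ** (-b)

  for name, model in [("exponential", exponential), ("power law", power_law)]:
      params, _ = curve_fit(model, delays, recall, p0=[1.0, 0.5])
      rmsd = np.sqrt(np.mean((recall - model(delays, *params)) ** 2))
      print(f"{name}: best-fitting parameters {params}, RMSD {rmsd:.4f}")

  # Under descriptive adequacy alone, the model with the smaller RMSD would be
  # preferred -- which, as the next paragraphs explain, ignores overfitting.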

A null hypothesis test is not a good way to choose between two models: given enough power, the test will almost always yield a significant result. The biggest limitation of model selection procedures based on significance or goodness of fit (e.g., R²) is that, on their own, they do not address a fundamental problem in choosing between competing theories: overfitting.

What is the problem of overfitting?

Concluding that one model is better than another on the basis of goodness of fit would be reasonable if psychological measurements were noise-free. Unfortunately, noise-free data are practically impossible to obtain. As a result, a model can overfit the data: it captures not only the variance produced by the cognitive process of interest but also the variance due to random error. Increasing complexity makes a model more prone to overfitting and thereby reduces its generalizability. Generalizability is the degree to which a model can predict all potential samples generated by the same cognitive process, rather than merely fit one particular sample of existing data. How susceptible a model is to overfitting is related to its complexity: the flexibility that enables it to fit diverse patterns of data.

At the same time, increasing a model's complexity can increase its generalizability, but only up to the point where the model is just complex enough to capture the systematic variation in the data. Beyond that point, additional complexity reduces generalizability, because the model starts absorbing random variation in the data. A good fit therefore does not guarantee good generalization to new data.
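
The following sketch (not from the article; the quadratic "process", the noise level, and the polynomial candidates are assumptions for illustration) shows the trade-off in miniature: as model complexity grows, the error on the fitting sample keeps shrinking, while the error on a fresh sample from the same process first shrinks and then grows again.

  # Minimal sketch (not from the article): overfitting vs. generalizability.
  # Data are generated by a simple quadratic "process" plus random noise;
  # polynomials of increasing degree play the role of increasingly complex models.
  import numpy as np

  rng = np.random.default_rng(0)
  x = np.linspace(-1, 1, 30)
  true_process = lambda x: 0.5 * x ** 2 - 0.3 * x               # the systematic part
  fit_sample = true_process(x) + rng.normal(0, 0.05, x.size)    # data used for fitting
  new_sample = true_process(x) + rng.normal(0, 0.05, x.size)    # fresh data, same process

  for degree in [1, 2, 5, 9]:
      coefs = np.polyfit(x, fit_sample, degree)
      pred = np.polyval(coefs, x)
      fit_err = np.mean((fit_sample - pred) ** 2)   # always shrinks with complexity
      gen_err = np.mean((new_sample - pred) ** 2)   # shrinks, then grows again
      print(f"degree {degree}: fit error {fit_err:.5f}, generalization error {gen_err:.5f}")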

How do you select between models?

Practical

This approach relies on the intuition that, when comparing models, one should choose the model that best predicts new, as-yet-unobserved data. This is often done by cross-validation: a model is fitted on one part of the data and its predictions are evaluated on the remaining part. A limitation of this approach is that it is not statistically consistent. Another way to deal with overfitting is to reduce the number of free parameters as much as possible, either by fixing them in advance or by building simple models with few or no free parameters.
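
A minimal sketch of cross-validation for model selection, assuming made-up data and two placeholder candidate models (a line and a quadratic): each candidate is fitted on half of the data and scored by how well it predicts the held-out half.

  # Minimal sketch (not from the article) of cross-validation for model selection:
  # each candidate model is fitted on one half of the data and judged by how well
  # it predicts the other, held-out half.
  import numpy as np

  def cross_validate(models, x, y, rng):
      # `models` maps a name to a (fit, predict) pair of functions; the data and
      # the candidates here are placeholders for illustration.
      idx = rng.permutation(len(x))
      half = len(x) // 2
      train, test = idx[:half], idx[half:]
      scores = {}
      for name, (fit, predict) in models.items():
          params = fit(x[train], y[train])
          scores[name] = np.mean((y[test] - predict(x[test], params)) ** 2)
      return scores

  # Example candidates: a straight line and a quadratic, both fitted by least squares.
  models = {
      "line":      (lambda x, y: np.polyfit(x, y, 1), lambda x, p: np.polyval(p, x)),
      "quadratic": (lambda x, y: np.polyfit(x, y, 2), lambda x, p: np.polyval(p, x)),
  }
  rng = np.random.default_rng(1)
  x = np.linspace(0, 1, 40)
  y = 2 * x ** 2 + rng.normal(0, 0.1, x.size)   # hypothetical data
  print(cross_validate(models, x, y, rng))      # lower score = better held-out prediction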

Simulation

By simulating the predictions of competing models, one can gain insight into how each model behaves. The results can then be used to design experimental tasks that maximize the discriminative power between the models.
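
A minimal sketch of this idea, with two made-up models of choice probability standing in for real competing theories: by simulating both across a range of stimulus values, one can locate the conditions where their predictions differ most and concentrate the experiment there.

  # Minimal sketch (not from the article): simulate two hypothetical models across
  # a range of experimental conditions to find where their predictions diverge most,
  # i.e., where an experiment would discriminate best between them.
  import numpy as np

  # Two made-up models predicting choice probability as a function of a stimulus value.
  model_a = lambda s: 1 / (1 + np.exp(-3 * (s - 0.5)))   # logistic response
  model_b = lambda s: np.clip(s, 0, 1)                   # linear response

  stimuli = np.linspace(0, 1, 101)          # candidate experimental conditions
  divergence = np.abs(model_a(stimuli) - model_b(stimuli))
  best = stimuli[np.argmax(divergence)]
  print(f"Predictions differ most at stimulus value {best:.2f} "
        f"(difference {divergence.max():.2f}); test the models there.")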

Theoretical

In this approach, a goodness-of-fit measure is combined with a theoretical estimate of model complexity, yielding an estimate of generalizability (generalizability = goodness of fit + complexity penalty). The goodness-of-fit term is typically the maximum log-likelihood, while the complexity term takes different forms in different generalizability measures. The most commonly used criteria are the Akaike information criterion (AIC) and the Bayesian information criterion (BIC). Both are sensitive to only one kind of complexity: the number of free parameters.
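
The standard definitions are AIC = -2 ln L + 2k and BIC = -2 ln L + k ln n, where ln L is the maximized log-likelihood, k the number of free parameters, and n the number of observations. The sketch below applies them to two hypothetical fitted models; the log-likelihoods and parameter counts are made up for illustration.

  # Standard definitions of AIC and BIC; the example log-likelihoods below are made up.
  import numpy as np

  def aic(log_likelihood, k):
      # AIC = -2 ln L + 2k: penalizes each free parameter by a constant amount.
      return -2 * log_likelihood + 2 * k

  def bic(log_likelihood, k, n):
      # BIC = -2 ln L + k ln n: the penalty grows with the number of observations.
      return -2 * log_likelihood + k * np.log(n)

  n = 100                                # hypothetical number of observations
  # Hypothetical fitted models: (maximized log-likelihood, number of free parameters).
  candidates = {"simple model": (-120.0, 2), "complex model": (-117.5, 6)}
  for name, (ll, k) in candidates.items():
      print(f"{name}: AIC = {aic(ll, k):.1f}, BIC = {bic(ll, k, n):.1f}")
  # The model with the lowest AIC or BIC is preferred: the complex model fits a bit
  # better, but the penalty for its extra parameters can outweigh that advantage.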

How do you choose between model selection approaches?

We cannot say which approach is best, but we recommend reporting the results of as many selection criteria as possible and discussing the suitability of each criterion for the problem at hand.

What are other pitfalls of model selection?

Other complications can arise when designing and testing models. If specification is modeling's greatest virtue, it can also be its greatest curse: one has to decide how to bridge the gap between an informal verbal description and a formal implementation. This can lead to unintended discrepancies between a theory and its various formal counterparts, which is known as the irrelevant specification problem.

A second problem that can arise with complex models is Bonini's paradox: as models become more complete and realistic, they also become less understandable and more opaque.

Third, there is the identification problem: for any observed behavior there exists a universe of different models that are all capable of explaining and reproducing it. On the other hand, there is also an endless supply of vague, informal theories for which nobody will ever be able to decide whether one is better than another.

Conclusion

Although modeling is preferable from a scientific point of view, few researchers use this approach, because it requires considerable effort, time, and knowledge. The acceptance of null hypothesis testing in laboratory settings also reduces the incentive for scientists to build models that explain phenomena in the "real" world. Yet there is often no better alternative than building and testing models, because informal theories are not specific enough and end up being tested only against chance. With some knowledge and training, modeling can be carried out with reasonable effort.
