Scientific & Statistical Reasoning – Article summary (UNIVERSITY OF AMSTERDAM)
A falsifier of a theory is any potential observation statement that would contradict the theory. Theories differ in their degree of falsifiability, as some theories require fewer data points to be falsified than others. Simpler theories should therefore be preferred, since they require fewer data points to be falsified. The greater the universality of a theory, the more falsifiable it is.
A computational model is a computer simulation of a subject or process. It has free parameters: numbers that have to be set (e.g. the number of neurons used in a computational model of neurons). When computational models are used, more than one model will usually be able to fit the data. In that case, the most falsifiable model that has not been falsified by the data (i.e. that still fits the data) should be preferred.
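The point can be illustrated with a minimal sketch using made-up data: a flexible model with more free parameters will always fit at least as well in-sample, so a good fit by itself does not favour it over the simpler, more falsifiable model.

```python
import numpy as np

# Made-up data: a noisy linear trend (illustrative only).
rng = np.random.default_rng(0)
x = np.linspace(0, 10, 20)
y = 2.0 * x + 1.0 + rng.normal(0, 1.0, size=x.size)

# Model A: straight line (2 free parameters, more falsifiable).
# Model B: cubic polynomial (4 free parameters, less falsifiable).
coef_a = np.polyfit(x, y, deg=1)
coef_b = np.polyfit(x, y, deg=3)

sse_a = np.sum((y - np.polyval(coef_a, x)) ** 2)
sse_b = np.sum((y - np.polyval(coef_b, x)) ** 2)

# The cubic's error is never larger in-sample, yet the simpler model
# that still fits the data is the one to prefer.
print(f"SSE linear model: {sse_a:.2f}")
print(f"SSE cubic model:  {sse_b:.2f}")
```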
A theory should only be revised or changed in ways that make it more falsifiable; a revision that makes it less falsifiable is ad hoc. Any revision or amendment to the theory should itself be falsifiable.
Standard statistics are useful for determining probabilities in the objective sense, as long-run relative frequencies. This does not, however, give the probability of a hypothesis being correct.
Subjective probability refers to the subjective degree of conviction in a hypothesis. The subjective probability is based on a person’s state of mind. Subjective probabilities need to follow the axioms of probability.
Bayes’ theorem is a method of getting from one conditional probability (e.g. P(A|B)) to its inverse. The subjective probability of a hypothesis before seeing the data is called the prior. The posterior is how probable the hypothesis is to you after data collection. The probability of obtaining the data given the hypothesis is called the likelihood (e.g. P(D|H)). The posterior is proportional to the likelihood times the prior. Bayesian statistics consists of updating one's personal conviction in the light of new data.
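A minimal numerical sketch of this update, with made-up priors and likelihoods for two hypothetical hypotheses H1 and H2, shows the "posterior is proportional to likelihood times prior" rule in action.

```python
# Hypothetical example: two rival hypotheses with made-up numbers.
prior = {"H1": 0.5, "H2": 0.5}        # subjective prior convictions
likelihood = {"H1": 0.8, "H2": 0.2}   # P(D | H), assumed for illustration

# Posterior is proportional to likelihood * prior; normalise so it sums to 1.
unnormalised = {h: likelihood[h] * prior[h] for h in prior}
total = sum(unnormalised.values())
posterior = {h: v / total for h, v in unnormalised.items()}

print(posterior)  # {'H1': 0.8, 'H2': 0.2}
```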
The likelihood principle states that all the information in the data that is relevant to inference is contained in the likelihood. The hypothesis with the highest likelihood is not necessarily the most probable hypothesis; it is the hypothesis that the data support most strongly. The posterior depends on the data only through the likelihood, combined with the prior.
The probability distribution of a continuous variable is called a probability density distribution. It has this name because a continuous variable has infinitely many possible values: the probability of any exact value is effectively zero, and probabilities are instead given by the area under the density over an interval.
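A short sketch, assuming a standard normal distribution purely for illustration, makes the distinction concrete: interval probabilities come from areas (differences of the cumulative distribution), while the height of the density at a point is not itself a probability.

```python
from scipy.stats import norm

dist = norm(loc=0, scale=1)  # standard normal, illustrative choice

# Probability of an interval = area under the density between the endpoints.
p_interval = dist.cdf(1.0) - dist.cdf(-1.0)   # P(-1 <= X <= 1), about 0.683

# The height of the density is not a probability; for distributions with a
# small standard deviation it can even exceed 1.
density_at_zero = dist.pdf(0.0)               # about 0.399

print(p_interval, density_at_zero)
```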
A likelihood can be a probability or a probability density, or be proportional to either. Likelihoods provide a continuous, graded measure of support for different hypotheses.
In Bayesian statistics (likelihood analysis), the data are fixed but the hypothesis can vary. In significance testing, the hypothesis is fixed (the null hypothesis) but the data can vary. In a likelihood analysis, what matters is the height of each hypothesis's distribution at the observed data; in significance testing, what matters is the tail area of the distribution.
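The contrast can be sketched with made-up summary statistics (an observed mean difference and its standard error, and an assumed alternative effect of 5 chosen only for illustration): the p-value is a tail area under the null, whereas likelihoods are density heights at the observed data under each hypothesis.

```python
from scipy.stats import norm

# Made-up summary data: observed mean difference and its standard error.
obs_mean, se = 4.0, 2.0

# Significance testing: fix H0 (true difference = 0) and ask how extreme
# the data are, i.e. the two-tailed area beyond the observed value.
z = obs_mean / se
p_value = 2 * norm.sf(abs(z))                    # tail area, about 0.046

# Likelihood analysis: fix the data and compare the height of the sampling
# distribution at the observed mean under different hypotheses.
lik_h0 = norm.pdf(obs_mean, loc=0.0, scale=se)   # likelihood of "no effect"
lik_h1 = norm.pdf(obs_mean, loc=5.0, scale=se)   # likelihood of "effect = 5"

print(p_value, lik_h0, lik_h1, lik_h1 / lik_h0)
```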
In classical statistics, the conclusion depends on the stopping rule for data collection, on how the analysis handles multiple tests (e.g. alpha needs to be adjusted for multiple t-tests), and on whether the explanation of the data was specified before or after the data were seen (planned versus post-hoc tests). In Bayesian statistics these things do not matter: the posterior is determined by the prior and the likelihood of the data actually obtained, so no fixed stopping rule is needed and the timing of the explanation is irrelevant.
Sensitivity to the stopping rule, to the number of tests conducted, and to the timing of the explanation of the data are all violations of the likelihood principle, as illustrated by the simulation sketched below.
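A small simulation with made-up settings (10 peeks of 10 observations each, nominal alpha of .05) shows why the classical approach needs a fixed stopping rule: repeatedly testing as data come in and stopping at the first significant result inflates the long-run false-positive rate well above 5%, even though the null hypothesis is true throughout.

```python
import numpy as np
from scipy.stats import ttest_1samp

rng = np.random.default_rng(1)
n_experiments, batch, max_batches, alpha = 2000, 10, 10, 0.05

false_positives = 0
for _ in range(n_experiments):
    data = np.empty(0)
    for _ in range(max_batches):
        data = np.concatenate([data, rng.normal(0, 1, batch)])  # H0 is true
        p = ttest_1samp(data, 0.0).pvalue
        if p < alpha:          # optional stopping: stop at the first "significant" peek
            false_positives += 1
            break

print(false_positives / n_experiments)  # noticeably above the nominal 0.05
```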
The credibility interval is the Bayesian counterpart of a confidence interval. The Bayes factor expresses how much we should adjust the odds in favour of the theory we are testing over the null hypothesis in the light of the experimental data. A flat (uniform) prior is the belief that all population values are equally likely.
In choosing a prior, it is relevant to determine whether it can be approximated by a normal distribution (1) and what the mean and standard deviation of the prior are (2).
The mean of the likelihood function is the mean of the difference scores (e.g. condition 1 – condition 2). The standard deviation of the likelihood function is the standard error of these difference scores. If a t-test has been run, the standard error can be recovered from it as SE = (mean difference) / t, since t = (mean difference) / SE.
There are several formulas for the normal posterior in Bayesian statistics. With M0 and S0 the mean and standard deviation of the prior, and Md and SE the mean and standard error of the sample (the likelihood):

Mean of prior | M0
Mean of sample | Md
Standard deviation of prior | S0
Standard error of sample | SE
Precision of prior | c0 = 1 / S0^2
Precision of sample | c1 = 1 / SE^2
Posterior mean | M1 = (c0 × M0 + c1 × Md) / (c0 + c1)
Posterior standard deviation | S1 = √(1 / (c0 + c1))
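A minimal sketch of these formulas with made-up values (a vague normal prior, and a sample summarised by its mean difference and t value) also shows the SE = (mean difference) / t step mentioned above.

```python
import math

# Made-up example values (for illustration only).
prior_mean, prior_sd = 0.0, 10.0     # M0, S0: a vague normal prior
mean_diff, t_value = 5.0, 2.5        # sample mean difference and its t
se = mean_diff / t_value             # SE recovered from the t-test: SE = M / t

# Precisions are the reciprocals of the variances.
c0 = 1 / prior_sd ** 2               # precision of the prior
c1 = 1 / se ** 2                     # precision of the sample

# Posterior mean is a precision-weighted average; precisions add up.
post_mean = (c0 * prior_mean + c1 * mean_diff) / (c0 + c1)
post_sd = math.sqrt(1 / (c0 + c1))

print(post_mean, post_sd)            # about 4.81 and 1.96
```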
If the prior and the likelihood are normal, then the posterior is also normal. If the Bayes factor is smaller than 1, the data support the null hypothesis over the experimental hypothesis. If the Bayes factor is greater than 1, the data support the experimental hypothesis over the null hypothesis. If the Bayes factor is approximately 1, the experiment was not sensitive enough to distinguish between the two hypotheses.
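One common way to obtain such a Bayes factor is to compare the probability of the data under the null (effect exactly 0) with its average probability under the theory, where the theory's predictions are represented by a prior over effect sizes. The sketch below uses made-up summary data and assumes a flat prior on effects between 0 and 10; both the bounds and the flat shape are illustrative assumptions, not the only possible specification.

```python
import numpy as np
from scipy.stats import norm

# Made-up data summary: observed mean difference and its standard error.
obs_mean, se = 4.0, 2.0

# P(D | null): likelihood of the data if the true effect is exactly 0.
p_d_null = norm.pdf(obs_mean, loc=0.0, scale=se)

# P(D | theory): average the likelihood over a flat (uniform) prior that
# treats every effect between 0 and 10 as equally plausible (an assumption).
effects = np.linspace(0.0, 10.0, 1001)
p_d_theory = np.mean(norm.pdf(obs_mean, loc=effects, scale=se))

bayes_factor = p_d_theory / p_d_null
print(bayes_factor)   # > 1 here: the data favour the theory over the null
```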
There are three main ingredients in Bayesian statistics: the prior, the likelihood and the posterior. In the notation P(H|E) = P(E|H) × P(H) / P(E): P(H|E) is the posterior, P(H) is the prior probability, P(E|H) is the likelihood and P(E) is the total probability of the evidence E.
There are several objections to Bayesian statistics, the most common being that the prior is subjective.
Probabilities are areas under a fixed distribution: P(data|distribution). Likelihoods are the y-axis heights at the fixed data points, for distributions that can be varied: L(distribution|data).
Falsifiability is not always easy to achieve, as some observations are influenced by the theory. A prediction based on the theory requires a definition of the relevant observation, and this definition is often itself based on the theory. This makes it more difficult to falsify a theory.
The Duhem-Quine problem states that it is not possible to test a single hypothesis in isolation, and thus to falsify a single theory, because any prediction relies on several theories at once. An observation that falsifies the prediction does not necessarily falsify the main theory: since the prediction rests on several theories together, it is unclear exactly which one is false.
The standard logic of statistical inference is called the Neyman-Pearson approach. For probabilities to be interpretable, they must follow four axioms (rules): a probability is never negative; the probability of a certain event is 1; the probability that one of two mutually exclusive events occurs is the sum of their probabilities; and the probability that two events both occur is P(A and B) = P(A) × P(B|A).
The subjective interpretation of probability states that probability is the degree of conviction in a belief. The objective interpretation states that probability is the long-run relative frequency of an event. On the objective interpretation, probabilities do not apply to singular events but only to the collective of (hypothetical) events, a point reflected in the gambler's fallacy.
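A tiny simulation, assuming a fair coin purely for illustration, shows the long-run relative frequency idea: the proportion of heads fluctuates for small collectives and only settles near 0.5 over many flips; no single flip has a relative frequency of its own.

```python
import numpy as np

rng = np.random.default_rng(42)
flips = rng.integers(0, 2, size=100_000)   # 1 = heads, fair coin assumed

# Running proportion of heads after each flip.
running_proportion = np.cumsum(flips) / np.arange(1, flips.size + 1)

# Early proportions vary a lot; the long-run relative frequency approaches 0.5.
print(running_proportion[9], running_proportion[999], running_proportion[-1])
```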
A hypothesis can be seen as a single event and therefore, on the objective interpretation, does not have a probability: it is simply true or false. The p-value gives the probability of obtaining data at least as extreme as those observed if the null hypothesis is true; the significance level controls the long-run rate of falsely rejecting the null hypothesis. Hypothesis testing therefore only gives long-run error rates, not the probability that the null hypothesis is true.
The probability of falsely accepting the null hypothesis (failing to reject it when the alternative is true) in the long run is β, and β = 1 – power. Power is the probability of detecting an effect given that the effect actually exists in the population. β can be controlled in advance, for example by choosing a sample size large enough to give the desired power, as in the sketch below.
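A minimal sketch with assumed design values (a made-up true effect, standard deviation, sample size and alpha, using a one-sample z-approximation) shows how power and β can be computed before data collection.

```python
from math import sqrt
from scipy.stats import norm

# Assumed design values (illustrative): true effect, SD, sample size, alpha.
effect, sd, n, alpha = 0.5, 1.0, 30, 0.05

# One-sample z-approximation: under the alternative, the test statistic is
# shifted by delta = effect / (sd / sqrt(n)).
delta = effect / (sd / sqrt(n))
z_crit = norm.ppf(1 - alpha / 2)              # two-tailed critical value

power = norm.sf(z_crit - delta) + norm.cdf(-z_crit - delta)
beta = 1 - power                              # long-run Type II error rate

print(power, beta)   # roughly 0.78 and 0.22 for these values
```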
Sensitivity can be determined by power (1), confidence intervals (2), or by finding the effect to be significantly different from a reference effect (3). The stopping rule refers to the conditions under which you will cease data collection. In the Neyman-Pearson approach, the stopping rule must be specified in advance; planned sequential stopping rules with adjusted criteria can be used. When conducting multiple tests, a stricter significance level should be used per test in order to control the inflated Type I error rate, as sketched below.
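For example, with five tests and an overall α of .05, the Bonferroni correction evaluates each test at .05 / 5 = .01. The sketch uses made-up p-values; Bonferroni is only one common way to adjust for multiple testing.

```python
# Made-up p-values from five tests; overall alpha of .05.
p_values = [0.004, 0.020, 0.030, 0.250, 0.700]
alpha, m = 0.05, len(p_values)

# Bonferroni: compare each p-value against alpha / m (here .05 / 5 = .01).
adjusted_threshold = alpha / m
significant = [p < adjusted_threshold for p in p_values]

print(adjusted_threshold, significant)   # 0.01, [True, False, False, False, False]
```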
A more significant result does not imply that the effect size is larger or that the result is more important. There are several points of criticism of the Neyman-Pearson approach, for example that p-values are often misread as the probability of a hypothesis, and that conclusions depend on the researcher's intentions (the stopping rule and the number of tests planned).