Bayes and the probability of hypotheses - summary of Chapter 4 of Understanding Psychology as a science by Dienes

Critical thinking

Objective probability: a long-run relative frequency.
Classic (Neyman-Pearson) statistics can tell you the long-run relative frequency of different types of errors.

  • Classic statistics do not tell you the probability of any hypothesis being true.

An alternative approach to statistics is to start with what Bayesians say are people’s natural intuitions.
People want statistics to tell them the probability of their hypothesis being right.
Subjective probability: the subjective degree of conviction in a hypothesis.

Subjective probability

Subjective or personal probability: the degree of conviction we have in a hypothesis.
Probabilities are in the mind, not in the world.

The initial problem to address in making use of subjective probabilities is how to assign a precise number to how probable you think a proposition is.
The initial personal probability that you assign to any theory is up to you.
Sometimes it is useful to express your personal convictions in terms of odds rather than probabilities.

Odds(theory is true) = probability(theory is true)/probability(theory is false)
Probability = odds/(odds +1)
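
The two conversions above can be sketched in a few lines of Python. The function names and the example value of 0.75 are illustrative assumptions, not taken from the chapter.

```python
def odds_from_probability(p: float) -> float:
    """Odds(theory is true) = P(theory is true) / P(theory is false)."""
    if not 0.0 <= p < 1.0:
        raise ValueError("p must satisfy 0 <= p < 1 for the odds to be finite")
    return p / (1.0 - p)


def probability_from_odds(odds: float) -> float:
    """Probability = odds / (odds + 1)."""
    if odds < 0.0:
        raise ValueError("odds must be non-negative")
    return odds / (odds + 1.0)


# Illustrative value only: a personal probability of 0.75 for some theory.
p = 0.75
odds = odds_from_probability(p)       # 0.75 / 0.25, i.e. odds of 3 to 1
back = probability_from_odds(odds)    # converts the odds back to 0.75
print(odds, back)
```

Going from probability to odds and back returns the starting value, which is a quick check that the two formulas are inverses of each other.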

These numbers we get from deep inside us must obey the axioms of probability.
This is the stipulation that ensures the way we change our personal probability in a theory is coherent and rational.

  • People’s intuitions about how to change probabilities in the light of new information are notoriously bad.

This is where the statistician comes in and forces us to be disciplined.

There are only a few axioms, each more-or-less self-evidently reasonable.

  • Two axioms effectively set limits on what values probabilities can take.
    All probabilities will lie between 0 and 1.
  • P(A or B) = P(A) + P(B), if A and B are mutually exclusive.
  • P(A and B) = P(A) x P(B|A)
    • P(B|A) is the probability of B given A.

Bayes’ theorem

H is the hypothesis
D is the data

P(H and D) = P(D) x P(H|D)
P(H and D) = P(H) x P(D|H)

so

P(D) x P(H|D) = P(H) x P(D|H)

Moving P(D) to the other side

P(H|D) = P(D|H) x P(H) / P(D)

This last one is Bayes’ theorem.
It tells you how to go from one conditional probability to its inverse.
We can simplify this equation if we are interested in comparing the probability of different hypotheses given the same data D.
Then P(D) is just a constant for all these comparisons.

P(H|D) is proportional to P(D|H) x P(H)

P(H) is called the prior.
It is how probable you thought the hypothesis was prior to collecting data.
It is your personal subjective probability and its value is completely up to you.

P(H|D) is called the posterior.
It is how probable your hypothesis is to you, after you have collected data.

P(D|H) is called the likelihood of the hypothesis.
It is the probability of obtaining the data, given your hypothesis.

  • Your posterior is proportional to the likelihood times the prior.

This tells you how to update your prior probability in a hypothesis given some data.
Your prior can be up to you, but having settled on it, the posterior is determined by the axioms of probability.
From the Bayesian perspective, scientific inference consists precisely in updating one’s personal conviction in a hypothesis in the light of data.
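
As a minimal sketch of this updating step, the Python below applies "posterior is proportional to likelihood times prior" to a small set of hypotheses. It assumes the hypotheses are mutually exclusive and are the only ones under consideration, so that P(D) can be found by summing likelihood x prior over them. The function name and all numbers are made up for illustration.

```python
from typing import Dict


def update(priors: Dict[str, float], likelihoods: Dict[str, float]) -> Dict[str, float]:
    """Return P(H|D) for each hypothesis H.

    priors[h]      is P(H), the personal probability before the data
    likelihoods[h] is P(D|H), the probability of the data given H
    Assumes the hypotheses are mutually exclusive and exhaustive.
    """
    unnormalised = {h: likelihoods[h] * priors[h] for h in priors}
    p_data = sum(unnormalised.values())   # P(D), the constant in Bayes' theorem
    return {h: value / p_data for h, value in unnormalised.items()}


# Made-up numbers: H1 starts out less probable but predicts the data better.
priors = {"H1": 0.2, "H2": 0.8}
likelihoods = {"H1": 0.6, "H2": 0.1}

posteriors = update(priors, likelihoods)
print(posteriors)   # H1 rises to about 0.6, H2 falls to about 0.4
```

The prior was a free choice, but once it is fixed the posterior follows mechanically from the axioms.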

The likelihood

According to Bayes’ theorem, if you want to update your personal probability in a hypothesis, the likelihood tells you everything you need to know about the data.

  • Posterior is proportional to likelihood times prior

The likelihood principle: the notion that all the information relevant to inference contained in data is provided by the likelihood.

The data could be obtained given many different population proportions, but the data are more probable for some population proportions than others.

The highest likelihood is not the same as the highest probability.

  • The probability of the hypothesis in the light of the data is P(H|D), which is our posterior.
  • The likelihood of the hypothesis is the probability of the data given the hypothesis P(D|H)

We can use the likelihood to obtain our posterior, but they are not the same.
Just because a hypothesis has the highest likelihood, it does not mean you will assign the highest posterior probability.

  • The fact that a hypothesis has the highest likelihood means the data support that hypothesis most.
  • If the prior probabilities for each hypothesis were the same, then the hypothesis with the highest likelihood will have the highest posterior probability.
    • But the prior probabilities may mean that the hypothesis with the greatest support from the data, does not have the highest posterior probability.
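
The difference between highest likelihood and highest posterior can be illustrated with a binomial example. This is a sketch under stated assumptions: the data (7 successes in 10 trials), the two candidate population proportions and the unequal priors are all hypothetical.

```python
from math import comb


def binomial_likelihood(successes: int, trials: int, proportion: float) -> float:
    """P(exactly this many successes | population proportion)."""
    return (comb(trials, successes)
            * proportion ** successes
            * (1.0 - proportion) ** (trials - successes))


# Hypothetical data: 7 successes in 10 trials.
successes, trials = 7, 10

# Two candidate population proportions with deliberately unequal priors.
priors = {0.5: 0.9, 0.7: 0.1}

likelihoods = {h: binomial_likelihood(successes, trials, h) for h in priors}
unnormalised = {h: likelihoods[h] * priors[h] for h in priors}
total = sum(unnormalised.values())
posteriors = {h: value / total for h, value in unnormalised.items()}

best_supported = max(likelihoods, key=likelihoods.get)   # highest likelihood
most_probable = max(posteriors, key=posteriors.get)      # highest posterior

print("highest likelihood:", best_supported)
print("highest posterior: ", most_probable)
```

Here the data support a proportion of 0.7 most (likelihood of about 0.27 against about 0.12), yet the strong prior keeps 0.5 as the hypothesis with the highest posterior probability (about 0.8).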

Probability density distribution: the distribution used when the dependent variable can be assumed to vary continuously.
A likelihood could be (or be proportional to) a probability density as well as a probability.

In significance testing, we calculate a form of P(D|H).
But, the P(D|H) used in significance testing is conceptually very different from the likelihood, the P(D|H) we are dealing with here.

  • The p-value in significance testing is the probability, given that the null is really true, of obtaining data at least as extreme as the data observed.
    • P(obtaining data as extreme or more extreme than D|H0)
    • In calculating a significance value, we hold fixed the hypothesis under consideration, H0, and we vary the data we might have obtained.
  • The likelihood is P(obtaining exactly this D|H)
    • Here H is free to vary, but the D considered is always exactly the data obtained.
  • In calculating the likelihood, we are interested in the height of the curve for each hypothesis
    • It reflects just what the data were
  • In significance testing, we are interested in the ‘tail area’
    • This area is the probability of obtaining our data or data more extreme
  • In significance testing, we make a black and white decision
  • Likelihoods give a continuous graded measure of support for different hypotheses
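
The contrast between the height of the curve and the tail area can be made concrete with a binomial sketch. The data (7 successes in 10 trials), the null proportion of 0.5 and the use of a one-tailed test are assumptions chosen to keep the example short.

```python
from math import comb


def binomial_probability(successes: int, trials: int, proportion: float) -> float:
    """P(exactly this many successes | population proportion)."""
    return (comb(trials, successes)
            * proportion ** successes
            * (1.0 - proportion) ** (trials - successes))


def upper_tail_p_value(successes: int, trials: int, null_proportion: float) -> float:
    """One-tailed p-value: P(this many successes or more | H0).

    H0 is held fixed; the data are varied over everything as extreme
    or more extreme than what was observed.
    """
    return sum(binomial_probability(k, trials, null_proportion)
               for k in range(successes, trials + 1))


# Hypothetical data: 7 successes in 10 trials.
successes, trials = 7, 10

# Likelihood: the data stay fixed at exactly 7/10 while the hypothesis varies.
for proportion in (0.3, 0.5, 0.7, 0.9):
    height = binomial_probability(successes, trials, proportion)
    print(f"likelihood of proportion {proportion}: {height:.4f}")

# Significance testing: the hypothesis stays fixed at H0 while the data vary.
p_value = upper_tail_p_value(successes, trials, 0.5)
print(f"one-tailed p-value under H0 = 0.5: {p_value:.4f}")
```

The loop gives a graded measure of support across hypotheses, while the p-value is a single tail area that would then be compared with a fixed cut-off.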

In significance testing, tail areas are calculated in order to determine long-run error rates.
The aim of classic statistics is to come up with a procedure for making decisions that is reliable, which is to say that the procedure has known, controlled long-run error rates.
To decide the long-run error rates, we need to define a collective.

Bayesian analysis

Bayes’ theorem says that posterior is proportional to likelihood times prior.
We can use this in two ways when dealing with real data

  • We can calculate a credibility interval
    • Credibility interval: the Bayesian equivalent of a confidence interval
  • We can calculate how to adjust our odds in favour of a theory we are testing over the null hypothesis in the light of our experimental data
    • The Bayes factor: the Bayesian equivalent of null hypothesis testing

Credibility intervals

Flat prior or uniform prior: you have no idea what the population value is likely to be

In choosing a prior decide:

  • Whether your prior can be approximated by a normal distribution
    • if so, what the mean of this distribution is
    • if so, what the standard deviation of this distribution is

Formulae for normal posterior:

  • Mean of prior = M0
  • Mean of sample = Md
  • Standard deviation of prior = S0
  • Standard error of sample = SE
  • Precision of prior = c0 = 1/S0^2
  • Precision of sample = cd = 1/SE^2
  • Posterior precision = c1 = c0 + cd
  • Posterior mean M1 = (c0/c1)M0 + (cd/c1)Md
  • Posterior standard deviation, S1 = square root(1/c1)
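
The formulae above translate directly into code. The sketch below uses the same symbols (M0, S0, Md, SE, c0, cd, c1, M1, S1); the input numbers are invented, and the 95% credibility interval is taken as M1 plus or minus 1.96 x S1, which assumes a normal posterior.

```python
import math
from typing import Tuple


def normal_posterior(m0: float, s0: float, md: float, se: float) -> Tuple[float, float]:
    """Combine a normal prior with a normal likelihood.

    m0, s0: mean and standard deviation of the prior
    md, se: mean and standard error of the sample
    Returns (M1, S1), the posterior mean and standard deviation.
    """
    c0 = 1.0 / s0 ** 2                    # precision of prior
    cd = 1.0 / se ** 2                    # precision of sample
    c1 = c0 + cd                          # posterior precision
    m1 = (c0 / c1) * m0 + (cd / c1) * md  # precision-weighted mean
    s1 = math.sqrt(1.0 / c1)              # posterior standard deviation
    return m1, s1


# Invented numbers: prior centred on 0 with SD 2; sample mean 3 with SE 2.
m1, s1 = normal_posterior(m0=0.0, s0=2.0, md=3.0, se=2.0)

# 95% credibility interval for a normal posterior.
lower, upper = m1 - 1.96 * s1, m1 + 1.96 * s1
print(f"posterior mean {m1:.2f}, posterior SD {s1:.2f}")
print(f"95% credibility interval: [{lower:.2f}, {upper:.2f}]")
```

Because the prior and the sample are equally precise in this example, the posterior mean (1.5) lies halfway between them, and the posterior SD (about 1.41) is smaller than either S0 or SE.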

For a reasonably diffuse prior (one representing fairly vague prior opinions), the posterior is dominated by the likelihood.
If you started with a flat or uniform prior (you have no opinion concerning which values are most likely), the posterior would be identical to the likelihood.
Even if people started with very different priors, the posteriors would come to be very similar once enough data were collected, because they are dominated by the likelihood. This holds as long as the priors were smooth and allowed some non-negligible probability in the region of the true population value.
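
A small simulation-free sketch of this convergence, using the normal posterior formulae: two people hold opposite normal priors and see the same sample mean. The priors, the sample mean of 1.0, the population SD of 10 and the relation SE = SD / sqrt(n) are all assumptions made for illustration.

```python
import math


def posterior_mean(m0: float, s0: float, md: float, se: float) -> float:
    """Posterior mean for a normal prior (m0, s0) and a sample (md, se)."""
    c0 = 1.0 / s0 ** 2   # precision of prior
    cd = 1.0 / se ** 2   # precision of sample
    c1 = c0 + cd         # posterior precision
    return (c0 / c1) * m0 + (cd / c1) * md


sample_mean = 1.0        # assumed to stay the same as n grows
population_sd = 10.0     # assumed known, so SE = SD / sqrt(n)

gaps = []
for n in (1, 100, 10_000):
    se = population_sd / math.sqrt(n)
    sceptic = posterior_mean(-5.0, 2.0, sample_mean, se)     # prior centred on -5
    enthusiast = posterior_mean(5.0, 2.0, sample_mean, se)   # prior centred on +5
    gaps.append(enthusiast - sceptic)
    print(f"n={n:>6}: sceptic {sceptic:6.3f}, enthusiast {enthusiast:6.3f}")
```

As n grows, the standard error shrinks, the sample precision swamps the prior precision, and the two posterior means close in on the sample mean.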

If the prior and likelihood are normal, the posterior is also normal.
Having found the posterior distribution, you have really found out all you need to know.

The credibility interval is affected by any prior information you had.
But it is not affected by all the things that affect the confidence interval.

The Bayes factor

There is no such thing as significance testing in Bayesian statistics.
Often, all a Bayesian statistician has to do is determine posterior distributions.
With the Bayes factor you can compare the probability of an experimental theory to the probability of the null hypothesis.

H1 is your experimental hypothesis
H0 is the null hypothesis

P(H1|D) is proportional to P(D|H1) x P(H1)
P(H0|D) is proportional to P(D|H0) x P(H0)

P(H1|D)/P(H0|D) = [P(D|H1)/P(D|H0)] x [P(H1)/P(H0)]
Posterior odds = likelihood ratio x prior odds

The likelihood ratio is (in this case) called the Bayes factor B in favour of the experimental hypothesis.
Whatever your prior odds were in favour of the experimental hypothesis over the null, after data collection multiply those odds by B to get your posterior odds.

  • If B is greater than 1, your data support the experimental hypothesis over the null
  • If B is less than 1, your data support the null over the experimental hypothesis
  • If B is about 1, then your experiment was not sensitive
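
The update "posterior odds = likelihood ratio x prior odds" can be sketched as follows. The function name and the numbers are hypothetical, and the two likelihoods are simply given rather than derived from a model of H1 and H0.

```python
from typing import Tuple


def bayes_factor_update(prior_odds: float,
                        likelihood_h1: float,
                        likelihood_h0: float) -> Tuple[float, float]:
    """Return (B, posterior odds) in favour of H1 over H0.

    likelihood_h1 is P(D|H1) and likelihood_h0 is P(D|H0).
    """
    bayes_factor = likelihood_h1 / likelihood_h0   # B, the likelihood ratio
    return bayes_factor, bayes_factor * prior_odds


# Hypothetical values: prior odds of 1 to 2 against the experimental
# hypothesis, and data six times more probable under H1 than under H0.
b, post_odds = bayes_factor_update(prior_odds=0.5,
                                   likelihood_h1=0.30,
                                   likelihood_h0=0.05)

post_probability = post_odds / (post_odds + 1.0)   # odds back to probability
print(f"B = {b:.1f}, posterior odds = {post_odds:.1f}, "
      f"P(H1|D) = {post_probability:.2f}")
```

With B of about 6 the data support the experimental hypothesis, and odds that started at 1 to 2 end at about 3 to 1. This last conversion to a probability assumes H1 and H0 are the only hypotheses in play.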

The Bayes factor gives the means of adjusting your odds in a continuous way.

Follow the author: SanneA