Summary | What is the Item Response Theory (IRT) and which models are there? - Chapter 14


What is IRT?
What is item discrimination?
Which IRT models are there?
Which parameters can you estimate?
How can you describe the characteristics of the test as a whole?
For which purposes can IRT be applied?

What is IRT?

Item Response Theory (IRT) is an alternative to classical test theory (CTT) for analyzing measurements in the behavioral sciences. According to IRT, an individual's response to a test item is influenced by a characteristic of the individual (the trait level) and by properties of the item (such as its difficulty level).

To answer a difficult item correctly, someone needs a high trait level.
Conversely, a low trait level is enough to answer an easy item correctly.

Example:

Statement 1: I like to chat with my friends.
Statement 2: I like to speak to a large audience.

Agreeing with statement 1 requires only a low level of extraversion (= trait level).
Agreeing with statement 2 requires a high level of extraversion (= trait level).

In an IRT analysis, trait levels and item difficulties are usually expressed on a standardized scale with a mean of 0 and a standard deviation of 1.

So if an item has a difficulty level of 0, then:

An individual with an average trait level (0) has a 50% chance of a correct answer.
An individual with a high trait level (above 0) has a greater than 50% chance of a correct answer.
An individual with a low trait level (below 0) has a less than 50% chance of a correct answer.

What is item discrimination?

Item discrimination refers to how well an item distinguishes between individuals with low and high trait levels. The discrimination value of an item indicates how relevant the item is to the trait being measured.

Positive discrimination (> 0): the item is consistent with the trait being measured. High trait levels give a greater chance of answering the item correctly, and low trait levels give a smaller chance.
Negative discrimination (< 0): the item is inconsistent with the trait. High trait levels give a smaller chance of answering the item correctly.
Discrimination value = 0: there is no relationship between the item and the trait measured by the test.

So: the larger the (positive) discrimination value, the more consistent the item is with the trait, and the better the item.

A third component that must be taken into account is guessing (gambling). With multiple-choice or true/false questions, people may guess if they do not know the answer, and they will sometimes guess correctly. IRT can include guessing as a component in the analysis.

Which IRT models are there?

From the IRT perspective, we can identify the components that influence the likelihood that a person will respond to a certain item in a certain way. A measurement model expresses the relationship between the outcome (the response of an individual to a certain item) and the components that influence that outcome (such as the trait level of the person and the properties of the item). There are different measurement models, each expressing this relationship in its own way. In other words, IRT models specify the mathematical link between the observed scores and the characteristics of both the individual and the item. This section discusses the most common IRT models.

The one-parameter model (1PL): The Rasch model

The Rasch model, or one-parameter logistic model (1PL), includes only two components that influence the scores: the trait level of the individual and the difficulty of the item.

$$ P(X_{is} = 1 \mid \theta_s, \beta_i) = \frac{e^{(\theta_s - \beta_i)}}{1 + e^{(\theta_s - \beta_i)}} $$

P = the probability of a certain answer of respondent s on item i.

X_is = the response (X) of respondent s to item i. "X_is = 1" indicates a correct answer to this item.

θ_s = the trait level of respondent s.

β_i = the difficulty of item i.

e = the base of the natural logarithm (approximately 2.718); most calculators have an e^x key.
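As a minimal sketch of the Rasch formula above, the Python function below computes the probability of a correct answer. The function name `rasch_probability` and the example trait levels are chosen here for illustration only.

```python
import math


def rasch_probability(theta: float, beta: float) -> float:
    """Probability of a correct answer under the Rasch (1PL) model.

    theta: trait level of the respondent
    beta:  difficulty of the item
    """
    return math.exp(theta - beta) / (1.0 + math.exp(theta - beta))


# An item with difficulty 0, answered by respondents with a low,
# an average and a high trait level (illustrative values).
for theta in (-1.0, 0.0, 1.0):
    print(theta, round(rasch_probability(theta, beta=0.0), 3))
```

For a respondent whose trait level equals the item difficulty, the exponent is 0 and the probability is exactly 0.5, which matches the 50% chance described earlier.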

The two-parameter (2PL) model

The two-parameter model (2PL) has three components that influence the scores: the trait level of the individual, the difficulty of the item, and the discrimination of the item. The name refers to the two item parameters (difficulty and discrimination).

The formula here is:

$$ P(X_{is} = 1 \mid \theta_s, \beta_i, \alpha_i) = \frac{e^{\alpha_i (\theta_s - \beta_i)}}{1 + e^{\alpha_i (\theta_s - \beta_i)}} $$

α_i = the discrimination of item i.
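The effect of the discrimination parameter can be explored with a short Python sketch of the 2PL formula. The function name and the example values are illustrative assumptions. The code uses the algebraically equivalent form 1 / (1 + e^(-z)) of e^z / (1 + e^z).

```python
import math


def two_pl_probability(theta: float, beta: float, alpha: float) -> float:
    """Probability of a correct answer under the 2PL model."""
    z = alpha * (theta - beta)
    # 1 / (1 + e^-z) is the same quantity as e^z / (1 + e^z).
    return 1.0 / (1.0 + math.exp(-z))


# Same respondent (theta = 1) and item difficulty (beta = 0),
# with positive, zero and negative discrimination (illustrative values).
for alpha in (2.0, 1.0, 0.0, -1.0):
    print(alpha, round(two_pl_probability(1.0, 0.0, alpha), 3))
```

With a discrimination of 0 the probability is 0.5 for every trait level, and with a negative discrimination a higher trait level lowers the chance of a correct answer.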

The three-parameter (3PL) model

The three-parameter model (3PL) also includes the chance of guessing (gambling). It can be seen as a variation on the 2PL model with one added component: c_i, the lower bound on the probability of answering item i correctly, that is, the chance that even a respondent with a very low trait level gives the correct answer by guessing. According to the 3PL model, the chance of a correct answer is therefore influenced by:

the characteristics of the individual, i.e. the trait level θ;
the item difficulty β;
the item discrimination α;
the guessing parameter c.
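The summary does not give a formula for the 3PL model. The sketch below assumes the commonly used form, in which the 2PL curve is rescaled so that it runs from c to 1 instead of from 0 to 1: P = c + (1 - c) · e^(α(θ - β)) / (1 + e^(α(θ - β))). The function name and the example values are illustrative.

```python
import math


def three_pl_probability(theta: float, beta: float, alpha: float, c: float) -> float:
    """Probability of a correct answer under the 3PL model.

    Assumes the standard form c + (1 - c) * logistic(alpha * (theta - beta)),
    which is not stated in the summary itself.
    """
    logistic = 1.0 / (1.0 + math.exp(-alpha * (theta - beta)))
    return c + (1.0 - c) * logistic


# A four-option multiple-choice item: pure guessing gives a 0.25 chance.
guess = 0.25
print(round(three_pl_probability(-6.0, 0.0, 1.0, guess), 3))  # close to the lower bound
print(round(three_pl_probability(0.0, 0.0, 1.0, guess), 3))   # halfway between c and 1
```

With c = 0 this reduces to the 2PL model.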

Graded Response Model

The 1PL and 2PL models are made for items with binary answer options. The Graded Response Model (GRM) is made for tests, questionnaires, etc. with items that have more than two ordered answer options. As with the previous models, this model assumes that a person's response to an item is affected by that person's trait level, the item difficulty, and the item discrimination. The difference is that the GRM has several difficulty parameters for one item.

If there are m answer options or categories, m - 1 distinctions can be made between adjacent answer options. For example, an item with five answer options (strongly disagree, disagree, neutral, agree, totally agree) has four such distinctions, such as the one between 'agree' and 'totally agree'. Each of these distinctions can be represented in the following way:

$$ P(X_{is} \geq j \mid \theta_s, \beta_{ij}, \alpha_i) = \frac{e^{\alpha_i (\theta_s - \beta_{ij})}}{1 + e^{\alpha_i (\theta_s - \beta_{ij})}} $$

j = the answer option.

β_ij = the difficulty parameter for answer option j on item i.

The other parameters are the same as in the previous models.

P is the chance that a person with trait level θ_s will choose answer option j or higher on item i.

There are m - 1 difficulty parameters (β_ij) for each item.

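You can also calculate the chance that someone will choose a specific answer option on a certain item, as the difference between two adjacent cumulative probabilities:

$$ P(X_{is} = j \mid \theta_s, \beta_{ij}, \alpha_i) = P(X_{is} \geq j \mid \theta_s, \beta_{ij}, \alpha_i) - P(X_{is} \geq j + 1 \mid \theta_s, \beta_{i,j+1}, \alpha_i) $$

j = the answer option (e.g. agree).

j + 1 = the next higher answer option (e.g. totally agree).

For the lowest answer option, P(X_is ≥ j) = 1, because every response is in the lowest option or higher. For the highest answer option, the subtracted term is 0, because there is no higher option.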
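The step from cumulative probabilities to the probability of each separate answer option can be sketched in Python as follows. The function name, the list layout and the example parameters are assumptions made for illustration. The sketch takes the m - 1 difficulty parameters of one item in increasing order and returns m probabilities, one per answer option from lowest to highest.

```python
import math


def grm_category_probabilities(theta: float, alpha: float, betas: list) -> list:
    """Probability of each answer option under the Graded Response Model.

    betas holds the m - 1 difficulty parameters of one item, in increasing order.
    """
    # Cumulative probabilities of choosing an option "or higher":
    # 1.0 for the lowest option, one logistic curve per difficulty
    # parameter, and 0.0 for "higher than the highest option".
    cumulative = [1.0]
    cumulative += [1.0 / (1.0 + math.exp(-alpha * (theta - b))) for b in betas]
    cumulative.append(0.0)
    # Each option's probability is the difference between two adjacent
    # cumulative probabilities.
    return [cumulative[k] - cumulative[k + 1] for k in range(len(betas) + 1)]


# Five answer options, so four difficulty parameters (hypothetical values).
options = ["strongly disagree", "disagree", "neutral", "agree", "totally agree"]
probabilities = grm_category_probabilities(theta=0.0, alpha=1.0, betas=[-2.0, -1.0, 1.0, 2.0])
for option, p in zip(options, probabilities):
    print(option, round(p, 3))
```

Because the cumulative probabilities start at 1 and end at 0, the probabilities of the separate answer options always add up to 1.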

Which parameters can you estimate?

Proportion of correctly answered items for each respondent: divide the number of correctly answered items by the total number of answered items.
Trait level: $\theta_s = \ln\left(\frac{P_s}{1 - P_s}\right)$
P_s = proportion of items answered correctly by respondent s.
ln = natural logarithm.

Proportion of correct responses for each item: divide the number of respondents who answered correctly by the total number of respondents who responded.
Item difficulty: $\beta_i = \ln\left(\frac{1 - P_i}{P_i}\right)$
P_i = proportion of correct answers for item i.
ln = natural logarithm.
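The two formulas above can be applied to a small response matrix as in the following sketch. The data are made up for illustration (rows are respondents, columns are items, 1 = correct), and the function names are assumptions. Note that the formulas are undefined for a proportion of exactly 0 or 1, so the example data avoid those cases.

```python
import math


def trait_level(proportion_correct: float) -> float:
    """theta_s = ln(P_s / (1 - P_s)); undefined for proportions 0 and 1."""
    return math.log(proportion_correct / (1.0 - proportion_correct))


def item_difficulty(proportion_correct: float) -> float:
    """beta_i = ln((1 - P_i) / P_i); undefined for proportions 0 and 1."""
    return math.log((1.0 - proportion_correct) / proportion_correct)


# Hypothetical responses of four respondents (rows) to four items (columns).
responses = [
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 1, 1, 1],
]

n_items = len(responses[0])
n_respondents = len(responses)

thetas = [trait_level(sum(row) / n_items) for row in responses]
betas = [
    item_difficulty(sum(row[i] for row in responses) / n_respondents)
    for i in range(n_items)
]

print([round(t, 2) for t in thetas])
print([round(b, 2) for b in betas])
```

A respondent with half of the items correct gets a trait level of 0, and an item answered correctly by half of the respondents gets a difficulty of 0.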

How can you describe the characteristics of the test as a whole?

Item characteristic curve (ICC)

An item characteristic curve gives the chance of a correct answer to an item for a person with a certain trait level.

x-axis: trait level (with 0.00 = average)
y-axis: chance of a correct answer (between 0.00 and 1.00)
from left to right: the curve of the easiest item lies furthest to the left, the curve of the hardest item furthest to the right
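The points of an item characteristic curve can be tabulated with a few lines of Python. The sketch below assumes the Rasch model and two hypothetical items; plotting the values against the trait levels would give the curves described above.

```python
import math


def icc(theta: float, beta: float) -> float:
    """Chance of a correct answer at trait level theta (Rasch model assumed)."""
    return 1.0 / (1.0 + math.exp(-(theta - beta)))


trait_levels = [-3, -2, -1, 0, 1, 2, 3]
difficulties = {"easy item": -1.0, "hard item": 1.0}  # hypothetical items

for name, beta in difficulties.items():
    curve = [round(icc(t, beta), 2) for t in trait_levels]
    print(name, curve)
```

At every trait level the easy item has the higher chance of a correct answer, which is why its curve lies to the left of the curve of the hard item. Each curve passes through 0.5 at the trait level that equals the item's difficulty.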

[Figure: item characteristic curves of four items from a test]

In this example, the item discrimination parameter is largest for item 1. Suppose a person has a trait level of θ = 6: the chance of a correct answer is then high for item 1, but low (almost 0) for items 3 and 4. Suppose a person has a trait level of θ = 5: the most likely score pattern (in the order item 1, item 2, item 3, item 4, where 1 = correct and 0 = incorrect) is then 1, 1, 0, 0.

Item information and test information

Perspective of CTT: there is a single reliability for a test.

Perspective of IRT: there is more than one reliability. The psychometric quality of a test is better for some people than for others, so a test may give better information at some trait levels than at others.

For example, suppose there are two difficult questions and four respondents: two with a low trait level and two with a high trait level. The people with low trait levels both answer the difficult questions incorrectly, so even if their trait levels differ, this test will not show it. Of the two people with high trait levels, one may answer one item correctly and the other may answer both items correctly. The test therefore provides more information about people with high trait levels, because small differences in trait level are detected in this group.

Item information can be calculated using the following formula:

$$ I(\theta) = P_i(\theta) \, (1 - P_i(\theta)) $$

I(θ) is the item information at a certain trait level (θ).

P_i(θ) is the chance that a respondent with that trait level will answer the item correctly.

Higher item information values indicate a better psychometric quality of the item.

If we calculate information values for different trait levels then we can display these in an item information curve. Higher curves indicate better quality. The top of a curve represents the trait level at which the item provides the most information.
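An item information curve can be computed point by point, as in the following sketch. It assumes that P_i(θ) comes from the Rasch model, and it uses a hypothetical item with difficulty 1; the function names are illustrative.

```python
import math


def rasch_probability(theta: float, beta: float) -> float:
    return 1.0 / (1.0 + math.exp(-(theta - beta)))


def item_information(theta: float, beta: float) -> float:
    """I(theta) = P(theta) * (1 - P(theta)) for one item."""
    p = rasch_probability(theta, beta)
    return p * (1.0 - p)


beta = 1.0  # hypothetical item difficulty
curve = {theta: item_information(theta, beta) for theta in (-2.0, -1.0, 0.0, 1.0, 2.0, 3.0)}
peak = max(curve, key=curve.get)

for theta, information in curve.items():
    print(theta, round(information, 3))
print("most information at trait level", peak)
```

Under these assumptions the curve peaks where the trait level equals the item difficulty, because P(θ) is 0.5 there and 0.5 × 0.5 = 0.25 is the largest value that P(1 - P) can take.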

The item information values at a specific trait level can be added together to obtain the test information value at that trait level. If we calculate test information values for multiple trait levels, we can display them in a test information curve. From this curve you can read how much information the test provides at each trait level.
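Adding up item information values can be sketched as follows. The example mirrors the earlier test with two difficult questions; the difficulty values, the Rasch model and the function names are assumptions made for illustration.

```python
import math


def item_information(theta: float, beta: float) -> float:
    """P(theta) * (1 - P(theta)), with P taken from the Rasch model."""
    p = 1.0 / (1.0 + math.exp(-(theta - beta)))
    return p * (1.0 - p)


def total_information(theta: float, betas: list) -> float:
    """Test information: the sum of the item information values at theta."""
    return sum(item_information(theta, beta) for beta in betas)


difficult_items = [1.5, 2.0]  # hypothetical difficulties of two hard questions

low = total_information(-2.0, difficult_items)
high = total_information(2.0, difficult_items)
print("information at a low trait level: ", round(low, 3))
print("information at a high trait level:", round(high, 3))
```

The test with only difficult items gives far more information at the high trait level than at the low one, in line with the example of the four respondents.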

For which purposes can IRT be applied?

IRT is a theoretical perspective that is used for different purposes in psychological measurement. A number of applications of IRT are:

Evaluating and improving the psychometric properties of items and tests.
Evaluating the presence of differential item functioning (DIF). DIF is present when the properties of an item differ between groups. For example, a man and a woman with the same trait level have a different chance of answering the item correctly.
Analyzing person fit. This is an attempt to identify people whose response pattern does not match the pattern of responses expected on a set of items.
Computerized Adaptive Testing (CAT). CAT is a method for determining someone's trait level accurately and efficiently through computer-controlled testing. The test adapts the questions to the respondent's trait level: after a correct answer the next question is more difficult, and after an incorrect answer the next question is easier. In this way the respondent's trait level can be determined more quickly.
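The adaptive idea behind CAT can be illustrated with a deliberately simplified sketch. It is not an operational CAT algorithm: it assumes a respondent who always answers correctly when the item difficulty is below their trait level, and it simply moves the difficulty up or down with a shrinking step. All names and values are hypothetical.

```python
def adaptive_test(answers_correctly, n_items=8, start=0.0, step=1.0):
    """Simplified adaptive test: harder after a correct answer, easier after an error.

    answers_correctly: function that takes an item difficulty and returns
    True or False for the simulated respondent.
    Returns the final difficulty (a rough trait estimate) and the
    list of (difficulty, correct) pairs that were administered.
    """
    difficulty = start
    administered = []
    for _ in range(n_items):
        correct = answers_correctly(difficulty)
        administered.append((difficulty, correct))
        # Move towards harder items after a correct answer, easier otherwise.
        difficulty += step if correct else -step
        # Halve the step so the difficulty homes in on the trait level.
        step /= 2.0
    return difficulty, administered


true_trait_level = 1.3  # hypothetical respondent
estimate, administered = adaptive_test(lambda difficulty: true_trait_level > difficulty)

for difficulty, correct in administered:
    print(round(difficulty, 3), "correct" if correct else "incorrect")
print("estimated trait level:", round(estimate, 3))
```

After eight items the difficulty has settled close to the simulated trait level, while a fixed test would have needed many items spread over the whole difficulty range to locate the respondent equally well.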

Summary of Psychometrics: An Introduction by Furr - 3rd edition
