A Conceptual Introduction to Psychometrics: Development, Analysis, and Application of Psychological and Educational Tests, by G. J. Mellenbergh (first edition) – Book summary
Giving scores to different responses (e.g. agree = 3) is called scoring by fiat and has no theoretical justification. The observed test score is derived from the item scores by taking the unweighted or the weighted sum of the item scores. The construct score is derived from the item responses under the assumption of a latent variable response model.
The unweighted sum of item scores is the sum of a person j's scores on all K items:

X_j = \sum_{k=1}^{K} X_{jk}

The weighted sum of item scores is used when some items should weigh more heavily than others:

X_j = \sum_{k=1}^{K} w_k X_{jk}

Here w_k denotes the weight of item k. The population mean of the observed scores is:

\mu = \varepsilon_p(X_j)

'\varepsilon' denotes expectation and the subscript 'p' denotes that the expectation is taken with respect to the population of test takers; informally, it is the average of the (un)weighted test scores over that population. The test variance is:

\sigma^2 = \varepsilon_p(X_j - \mu)^2

It is the expectation of (the test score of person j minus the expected score) squared, where the expected score is the population mean of observed scores. The population standard deviation is the square root of the test variance:

\sigma = \sqrt{\sigma^2}

The population mean test score is estimated by the sample mean, the sum of the N observed test scores divided by the number of test takers:

M = \frac{1}{N} \sum_{j=1}^{N} X_j

The item mean of item k is the sum of the scores on item k divided by the number of test takers:

M_k = \frac{1}{N} \sum_{j=1}^{N} X_{jk}

For dichotomous items, the item mean equals the proportion p_k of test takers who pass (or endorse) the item. The item variance of item k is the mean squared deviation of the item scores from the item mean:

S_k^2 = \frac{1}{N} \sum_{j=1}^{N} (X_{jk} - M_k)^2

The test variance uses the same formula, except that it uses the test scores of test taker j instead of the item scores. The item standard deviation is the square root of the item variance. For dichotomous items, the item variance of item k simplifies to:

S_k^2 = p_k(1 - p_k)
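As a sketch, the statistics above can be computed from a small score matrix. All numbers below are made up for illustration; they do not come from the book.

```python
# Illustrative: 4 test takers answering 3 dichotomous items (1 = pass, 0 = fail).
scores = [
    [1, 0, 1],  # test taker 1
    [1, 1, 1],  # test taker 2
    [0, 0, 1],  # test taker 3
    [1, 0, 0],  # test taker 4
]
N = len(scores)      # number of test takers
K = len(scores[0])   # number of items

# Observed test score: unweighted sum of item scores per person.
test_scores = [sum(row) for row in scores]

# Item mean of item k; for dichotomous items this is the proportion passing.
item_means = [sum(row[k] for row in scores) / N for k in range(K)]

# Item variance of item k: mean squared deviation from the item mean.
item_vars = [sum((row[k] - item_means[k]) ** 2 for row in scores) / N
             for k in range(K)]

# For dichotomous items the variance simplifies to p_k * (1 - p_k).
item_vars_dich = [p * (1 - p) for p in item_means]

print(test_scores)   # [2, 3, 1, 1]
print(item_means)    # [0.75, 0.25, 0.75]
print(item_vars)     # same values as item_vars_dich
```

Note that the population formulas (division by N) are used here, matching the text above; statistical software often divides by N − 1 instead.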
The correlation between item k and item l can be calculated in the following way:

r_{kl} = \frac{\sum_{j=1}^{N} (X_{jk} - M_k)(X_{jl} - M_l)}{\sqrt{\sum_{j=1}^{N} (X_{jk} - M_k)^2 \sum_{j=1}^{N} (X_{jl} - M_l)^2}}

It is the sum of the cross-products of the deviation scores on items k and l, divided by the square root of the product of the two sums of squared deviations.
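A minimal sketch of this computation, using made-up scores on two 5-point items:

```python
# Illustrative item scores for two Likert-type items (invented data).
item_k = [3, 4, 2, 5, 4]
item_l = [2, 4, 1, 5, 3]
N = len(item_k)

mean_k = sum(item_k) / N
mean_l = sum(item_l) / N

# Numerator: sum of cross-products of the deviation scores.
num = sum((xk - mean_k) * (xl - mean_l) for xk, xl in zip(item_k, item_l))

# Denominator: square root of the product of the sums of squared deviations.
den = (sum((xk - mean_k) ** 2 for xk in item_k)
       * sum((xl - mean_l) ** 2 for xl in item_l)) ** 0.5

r_kl = num / den
print(round(r_kl, 3))  # 0.971
```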
For dichotomous items, the correlation between item k and item l (the phi coefficient) can be calculated in the following way:

r_{kl} = \frac{p_{kl} - p_k p_l}{\sqrt{p_k(1 - p_k)\, p_l(1 - p_l)}}

Here p_k and p_l are the proportions of test takers passing item k and item l, and p_{kl} is the proportion of test takers passing both items.
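The dichotomous formula is just the general Pearson correlation rewritten in terms of proportions, which a short sketch (with invented 0/1 data) can verify:

```python
# Illustrative 0/1 scores on two dichotomous items for 8 test takers.
item_k = [1, 1, 1, 0, 1, 0, 0, 1]
item_l = [1, 1, 0, 0, 1, 0, 1, 1]
N = len(item_k)

p_k = sum(item_k) / N                                  # proportion passing item k
p_l = sum(item_l) / N                                  # proportion passing item l
p_kl = sum(a * b for a, b in zip(item_k, item_l)) / N  # proportion passing both

# Phi coefficient: the Pearson correlation specialized to 0/1 scores.
phi = (p_kl - p_k * p_l) / (p_k * (1 - p_k) * p_l * (1 - p_l)) ** 0.5

# Cross-check against the general Pearson formula on the same 0/1 data
# (for 0/1 scores the item mean equals the proportion passing).
num = sum((a - p_k) * (b - p_l) for a, b in zip(item_k, item_l))
den = (sum((a - p_k) ** 2 for a in item_k)
       * sum((b - p_l) ** 2 for b in item_l)) ** 0.5
print(abs(phi - num / den) < 1e-12)  # True
```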
Covariance is a measure of how much two variables vary together. The covariance between item k and item l can be calculated using the following formula:

S_{kl} = \frac{\sum_{j=1}^{N} (X_{jk} - M_k)(X_{jl} - M_l)}{N - 1}

It is the sum of the cross-products of the deviation scores on items k and l, divided by the number of test takers minus one. For dichotomous items, the covariance between item k and item l can be calculated using the following formula:

S_{kl} = \frac{N}{N - 1}\,(p_{kl} - p_k p_l)

Here p_{kl} is the proportion of test takers passing both item k and item l. The test variance can also be obtained by summing all entries of the variance-covariance matrix, the matrix that contains the covariance between every pair of items (with the item variances on the diagonal).
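The identity between the test variance and the summed variance-covariance matrix can be checked numerically. The sketch below uses invented data and the population formulas (division by N) throughout, so the identity holds exactly; mixing N and N − 1 denominators would break it.

```python
# Illustrative: 4 test takers, 3 polytomous items (invented data).
scores = [
    [3, 2, 4],
    [5, 4, 5],
    [2, 2, 3],
    [4, 3, 3],
]
N = len(scores)
K = len(scores[0])

item_means = [sum(row[k] for row in scores) / N for k in range(K)]

def cov(k, l):
    """Population covariance of items k and l (diagonal entries are variances)."""
    return sum((row[k] - item_means[k]) * (row[l] - item_means[l])
               for row in scores) / N

# Variance-covariance matrix of the items.
vcov = [[cov(k, l) for l in range(K)] for k in range(K)]

# Test variance computed directly from the test scores ...
test_scores = [sum(row) for row in scores]
mean_x = sum(test_scores) / N
var_x = sum((x - mean_x) ** 2 for x in test_scores) / N

# ... equals the sum of all entries of the variance-covariance matrix.
print(var_x == sum(sum(row) for row in vcov))  # True
```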
A psychological or educational test is an instrument for the measurement of a person's maximum or typical performance under standardized conditions, where performance is assumed to reflect one or more latent attributes. Tests are used for diagnosis and for psychological and educational decision-making. The dimensionality of a test or subtest equals the number of latent attributes that affect test performance. An item is the smallest possible subtest of a test.
A mental test consists of cognitive tasks. A physical test consists of somatic or physiological measurements. A pure power test consists of problems that the test taker tries to solve, without a time-limit. A time-limited power test is a pure power test with a time-limit. A speed test measures the speed taken to solve problems. An ability test (aptitude test) measures a person’s best performance in an area that is not explicitly taught in training and educational programs. An achievement test measures a person’s best performance in an area that is explicitly taught in training and educational programs.
Test development consists of several steps:
Response scales can be dichotomous (two ordered categories), partially ordered polytomous (more than two categories that are only partially ordered) and ordinal-polytomous (more than two completely ordered categories). There are several item-writing guidelines:
Typical performance tests assess behaviour that is typical for the person. There are three main types of typical performance tests: personality tests (1), interest inventories (2) and attitude questionnaires (3). The steps for test development of typical performance tests are the same as the steps for test development in general.
There are three classes of strategies for the conceptual framework:
- Intuitive class (no or informal knowledge)
- Inductive class (weak theory / knowledge)
- Deductive class (strong theory / knowledge)
There are several item writing guidelines which are especially relevant for typical performance tests:
An indicative item is an item where a high frequency or endorsement indicates a high level of the construct. A contraindicative item is an item where a high frequency or endorsement indicates a low level of the construct. Response tendencies are the differential application of the response scales. The response set is the differential use of the item response scale by different persons.
Measurement precision consists of information (1) and reliability (2). Information applies to the test score of a single person: it is the within-person aspect of measurement precision. Reliability applies to a population of persons: it is the between-person aspect of measurement precision. The true score is the score that would be obtained with a perfect measurement instrument. Measurement error is the distortion of the true score that arises because the measurement instrument is not perfect; measurement errors are unsystematic influences.
The test score of a person is the true score of that person plus the measurement error:

X_j = T_j + E_j

The true score equals the mean test score over infinitely many test administrations, so the expected value of the measurement error over infinite trials equals zero. Test taker j's standard error of measurement is the square root of the within-person error variance. The information on test taker j's true score is the inverse of the within-person error variance:

I_j = \frac{1}{\sigma^2_{E_j}}
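The classical model above can be illustrated with a simulation of repeated administrations to a single test taker. The true score, error standard deviation, and number of administrations below are arbitrary assumed values, not from the book.

```python
import random

random.seed(7)  # reproducible illustration

# Classical test theory sketch: X = T + E for one test taker.
true_score = 25.0
error_sd = 2.0                # the test taker's standard error of measurement
n_administrations = 100_000   # hypothetical repeated administrations

errors = [random.gauss(0.0, error_sd) for _ in range(n_administrations)]
observed = [true_score + e for e in errors]

# The mean observed score approaches the true score, and the mean error
# approaches zero, as the number of administrations grows.
mean_observed = sum(observed) / n_administrations
mean_error = sum(errors) / n_administrations

# Information about this test taker's true score is the inverse of the
# within-person error variance (here 1 / 2^2 = 0.25 in the limit).
error_var = sum(e ** 2 for e in errors) / n_administrations
information = 1.0 / error_var
print(round(mean_observed, 1), round(information, 2))
```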
There are several guidelines for interpreting reliability:

0.8 - 1.0 | Good
0.7 - 0.8 | Sufficient
0.6 - 0.7 | Moderate
0.0 - 0.6 | Poor

A parallel test is a different test with exactly the same properties as the original test. Reliability is the correlation between two parallel test scores. Reliability can be estimated by splitting the test into two halves and treating them as parallel tests; the reliability of each part equals the correlation between the two parts. The reliability of the full test then follows from the Spearman-Brown formula:

r_{XX'} = \frac{2 r_{12}}{1 + r_{12}}
It is two times the correlation between part one and part two, divided by one plus that correlation. Cronbach's alpha is a lower bound of the reliability of the full test:

\alpha = \frac{K}{K - 1}\left(1 - \frac{\sum_{k=1}^{K} S_k^2}{S_X^2}\right)

It is the number of items divided by the number of items minus one, times (one minus the sum of the item variances divided by the test variance). Reliability depends on the number of items: a longer test is usually more reliable than a shorter one. The correlation between the test scores of two tests is not equal to the correlation between the underlying constructs, as the observed correlation is attenuated by measurement error.
An item score distribution can be described by its location (1), dispersion (2) and shape (3). Item difficulty is a parameter in maximum performance tests: more test takers fail on more difficult items. Item attractiveness is a parameter in typical performance tests: more test takers endorse attractive items. The item difficulty / item attractiveness is equal to the item mean.
Items with small variances do not contribute much to the overall variance. There is a danger of small variances due to floor / ceiling effects in Likert scales (e.g. items with low attractiveness). Large item correlations result in high reliability. Item discrimination refers to how well a given item can distinguish between people that differ on the underlying construct.
The item-test correlation is the correlation between the scores on a given item and the test scores. Items that discriminate well have a high item-test correlation:

r_{kX} = \frac{\sum_{j=1}^{N} (X_{jk} - M_k)(X_j - M)}{\sqrt{\sum_{j=1}^{N} (X_{jk} - M_k)^2 \sum_{j=1}^{N} (X_j - M)^2}}

In other words, it is the covariance between item k and the test score, divided by the standard deviation of item k times the standard deviation of the test score.
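A short sketch of the item-test correlation, again on an invented dichotomous score matrix:

```python
# Illustrative: 5 test takers, 3 dichotomous items (invented data).
scores = [
    [1, 0, 1],
    [1, 1, 1],
    [0, 0, 0],
    [1, 0, 0],
    [1, 1, 1],
]
K = 3

def pearson(xs, ys):
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    num = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    den = (sum((x - mx) ** 2 for x in xs)
           * sum((y - my) ** 2 for y in ys)) ** 0.5
    return num / den

test_scores = [sum(row) for row in scores]

# Item-test correlation: correlate each item column with the test score.
item_test_r = [pearson([row[k] for row in scores], test_scores)
               for k in range(K)]
print([round(r, 3) for r in item_test_r])  # [0.772, 0.84, 0.91]
```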
The item-rest correlation is the correlation between the scores on a given item and the rest score, the test score without that item (R_j = X_j - X_{jk}). It is used because the item-test correlation is biased upwards: the item is correlated with a total that contains the item itself. The item-rest correlation uses the same formula as the item-test correlation, with the rest score and rest-score mean in place of the test score and test-score mean:

r_{kR} = \frac{\sum_{j=1}^{N} (X_{jk} - M_k)(R_j - M_R)}{\sqrt{\sum_{j=1}^{N} (X_{jk} - M_k)^2 \sum_{j=1}^{N} (R_j - M_R)^2}}

In other words, it is the covariance between item k and the rest score, divided by the standard deviation of item k times the rest-score standard deviation.
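The upward bias can be seen by computing both correlations on the same data. The sketch below reuses an invented 5 × 3 dichotomous score matrix:

```python
# Illustrative: 5 test takers, 3 dichotomous items (invented data).
scores = [
    [1, 0, 1],
    [1, 1, 1],
    [0, 0, 0],
    [1, 0, 0],
    [1, 1, 1],
]
K = 3

def pearson(xs, ys):
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    num = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    den = (sum((x - mx) ** 2 for x in xs)
           * sum((y - my) ** 2 for y in ys)) ** 0.5
    return num / den

test_scores = [sum(row) for row in scores]

item_test_r = [pearson([row[k] for row in scores], test_scores)
               for k in range(K)]

# Rest score for item k: the test score with item k removed.
rest_scores = [[x - row[k] for row, x in zip(scores, test_scores)]
               for k in range(K)]
item_rest_r = [pearson([row[k] for row in scores], rest_scores[k])
               for k in range(K)]

# Here every item-rest correlation is lower than the corresponding
# item-test correlation: the item no longer correlates with itself.
print(all(ir < it for ir, it in zip(item_rest_r, item_test_r)))  # True
```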
The item-reliability index uses the following formula:

r_{kX} S_k

It is the correlation between item k and the test score, times the standard deviation of item k. It uses the item-test correlation, not the item-rest correlation.
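The index combines the computations above; a compact sketch on the same invented data (population standard deviation, division by N):

```python
# Illustrative: 5 test takers, 3 dichotomous items (invented data).
scores = [
    [1, 0, 1],
    [1, 1, 1],
    [0, 0, 0],
    [1, 0, 0],
    [1, 1, 1],
]
K = 3

def pearson(xs, ys):
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    num = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    den = (sum((x - mx) ** 2 for x in xs)
           * sum((y - my) ** 2 for y in ys)) ** 0.5
    return num / den

def sd(xs):
    """Population standard deviation (division by N)."""
    m = sum(xs) / len(xs)
    return (sum((x - m) ** 2 for x in xs) / len(xs)) ** 0.5

test_scores = [sum(row) for row in scores]
item_cols = [[row[k] for row in scores] for k in range(K)]

# Item-reliability index: item-test correlation times item standard deviation.
index = [pearson(item_cols[k], test_scores) * sd(item_cols[k])
         for k in range(K)]
```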
.....read moreJoHo can really use your help! Check out the various student jobs here that match your studies, improve your competencies, strengthen your CV and contribute to a more tolerant world
There are several ways to navigate the large amount of summaries, study notes en practice exams on JoHo WorldSupporter.
Do you want to share your summaries with JoHo WorldSupporter and its visitors?
Field of study
Add new contribution