Developing maximum performance tests - a summary of chapter 2 of A conceptual introduction to psychometrics by G.J. Mellenbergh

A conceptual introduction to psychometrics
Chapter 2
Developing maximum performance tests

Seven elements

  • Construct
  • Measurement mode
  • Objectives
  • Population and subpopulations
  • Conceptual framework
  • Response mode
  • Administration mode

Construct of interest

The test developer must specify the latent variable of interest that has to be measured by the test.
Latent variable is a general term. The term construct is used when a substantive interpretation is given of the latent variable.
The latent variable (construct) is assumed to affect test takers’ item responses and test scores.

Constructs can vary in many different ways.

  • Constructs vary in content, for example mental abilities, psychomotor skills, or physical abilities
  • Constructs vary in scope
    For example: from general intelligence to multiplication skill
  • Constructs vary from educational to psychological variables.

A good way to start a test development project is to define the construct that has to be measured by the test.
This definition describes the construct of interest and distinguishes it from other, related constructs.
Usually, the literature on the construct needs to be studied before the definition can be given. Frequently, the definition can only be given when other elements of the test development plan are specified.

Measurement mode

Different modes can be used to measure constructs.

  • Self-performance mode
    The test taker is asked to perform a mental or physical task
  • Self-evaluation mode
    The test taker is asked to evaluate his or her ability to perform the task
  • Other-evaluation mode
    Others are asked to evaluate the test taker’s ability to perform the task

The objectives

The test developer must specify the objectives of the test. Tests are used for many different purposes.

  • Scientific vs practical
  • Individual level vs group level
  • Description (describe performances) vs diagnosis (adds a conclusion to a description) vs decision-making (decisions are based on tests)

The population

Target population: the set of persons to whom the test has to be applied.
The test developer must define the target population, and must provide criteria for the inclusion and exclusion of persons.
A target population can be split into distinct subpopulations. The test developer must specify whether subpopulations need to be distinguished and, if so, define the subpopulations and provide criteria for including persons in them.

The conceptual framework

Test development starts with a definition or description of the construct that has to be measured by the test. But, the definition or description is usually not concrete enough to write test items.

A conceptual framework gives the item writer a handle to write items.
In the literature, examples of conceptual frameworks are available.
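
As an illustration of what a conceptual framework can look like, below is a small, hypothetical sketch of a test specification blueprint that crosses content areas with task types and fixes the number of items per cell. The content areas, task types, and item counts are invented for illustration; a real blueprint follows from the construct definition and the objectives of the test.

```python
# Hypothetical test specification blueprint for an arithmetic test.
# All content areas, task types, and item counts are invented examples.

blueprint = {
    # (content area, task type): number of items to write
    ("addition",       "routine procedure"): 6,
    ("addition",       "word problem"):      4,
    ("multiplication", "routine procedure"): 6,
    ("multiplication", "word problem"):      4,
}

planned_test_length = 20
assert sum(blueprint.values()) == planned_test_length

# Item writers get one cell at a time (e.g. "write 4 multiplication word
# problems"), which keeps the coverage of the construct explicit.
for (content, task_type), n_items in blueprint.items():
    print(f"{content:<15} {task_type:<20} {n_items} items")
```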

Item response mode

The item response mode needs to be specified before item writing starts.
Distinction:

  • Free- (constructed-) response mode
  • Choice (selected-) response mode

Free response items are further divided into:

  • Short-answer items
  • Essay items

Different types of choice modes are used in achievement and ability testing:

  • Conventional multiple-choice mode
    Consists of a stem and two or more options. The options are divided into one correct answer and one or more distractors.
    Usually, choosing the correct option of a multiple-choice item indicates that test takers’ ability or skill is sufficiently high to solve the item.
    Distractors can be constructed to contain specific information on the reasons why the test taker failed to solve the item correctly. The choice of a distractor indicates which deficiency the test taker has and as such can be used for diagnosing specific deficiencies.
  • A dichotomous item response scale has two ordered categories. An answer is correct or incorrect.
  • An ordinal-polytomous scale has more than two ordered categories.
    Partial ordinal-polytomous response scale: the correct option is ordered above the distractors, but the distractors are not ordered among themselves.
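
The difference between these response scales can be made concrete with a small scoring sketch. The sketch below is illustrative only; the item content, rubric, and function names are invented and not taken from the book, and the partial-credit rubric is just one way of obtaining more than two ordered categories.

```python
# Minimal sketch of the two response-scale types described above
# (invented item content and scoring rules).

def score_dichotomous(chosen_option: str, key: str) -> int:
    """Dichotomous scale: two ordered categories, correct (1) or incorrect (0)."""
    return 1 if chosen_option == key else 0

def score_ordinal_polytomous(response: str, rubric: dict) -> int:
    """Ordinal-polytomous scale: more than two ordered categories,
    e.g. 0 = wrong, 1 = partially correct, 2 = fully correct."""
    return rubric.get(response, 0)

# Dichotomous scoring of a multiple-choice item with key "b".
print(score_dichotomous("b", key="b"))   # 1
print(score_dichotomous("c", key="b"))   # 0

# Ordinal-polytomous scoring of an item with partial credit.
rubric = {"full solution": 2, "correct method, wrong answer": 1}
print(score_ordinal_polytomous("correct method, wrong answer", rubric))  # 1
```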

Administration mode

A test can be administered to test takers in different ways:

  • Oral
    The test is presented orally by a single test administrator to a single test taker
  • Paper-and-pencil
    The test is presented in the form of a booklet
  • Computerized
    The test is presented on a computer; the order of the items is the same for each test taker.
  • Computerized adaptive test administration
    The test is adaptive. The computer program searches for the items that best fit the test taker.
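
A minimal sketch of the adaptive idea, assuming a Rasch (one-parameter logistic) item response model: after each response the program picks, from the items not yet administered, the item that is most informative at the test taker's current ability estimate. The item pool, difficulties, and function names below are invented for illustration; a real computerized adaptive test also re-estimates ability after every response and applies content and exposure constraints.

```python
import math

# Invented item pool: item label -> Rasch difficulty parameter.
item_difficulties = {"item1": -1.0, "item2": -0.2, "item3": 0.5, "item4": 1.3}

def rasch_information(theta: float, b: float) -> float:
    """Fisher information of a Rasch item with difficulty b at ability theta."""
    p = 1.0 / (1.0 + math.exp(-(theta - b)))
    return p * (1.0 - p)

def select_next_item(theta: float, administered: set) -> str:
    """Pick the not-yet-administered item that is most informative at theta."""
    candidates = {i: b for i, b in item_difficulties.items() if i not in administered}
    return max(candidates, key=lambda i: rasch_information(theta, candidates[i]))

# A test taker with current ability estimate 0.4 who already answered item1
# next gets the item whose difficulty is closest to 0.4.
print(select_next_item(theta=0.4, administered={"item1"}))  # item3
```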

Item-writing guidelines

  • Focus on one relevant aspect
    Each item should focus on a single relevant aspect of the specification in order to guarantee good coverage of the important aspects of the achievement or ability.
    Measuring only a single aspect per item also guarantees that test takers’ item responses are unambiguously interpretable.
  • Use independent item content
    The content of different items should be independent of one another.
    Testlet: a group of items that may be developed as a single unit that is meant to be administered together.
  • Avoid overly specific and overly general content
    The disadvantage of overly specific item content is that the content may be trivial, and the disadvantage of overly general content is that the content may be ambiguous
  • Avoid items that deliberately deceive test takers
  • Keep vocabulary simple for the population of test takers
  • Put item options vertically
  • Minimize reading time and avoid unnecessary information
  • Use correct language
  • Use non-sensitive language
  • Use a clear stem and include the central idea in the stem
  • Word the item positively, and avoid negatives
    Negatively phrased items are harder to understand and may confuse test takers.
  • Use three options, unless additional plausible distractors are easy to write
  • Use one option that is unambiguously the correct or best answer
  • Place the options in alphabetical, logical, or numerical order
  • Vary the location of the correct option across the test
  • Keep the options homogeneous in length, content, grammar, etc.
  • Avoid ‘all-of-the-above’ as the last option
  • Make distractors plausible
  • Avoid giving clues to the correct option

Item rating guidelines

The responses to free- (constructed-) response items have to be rated by raters.

Important guidelines:

  • Rate responses anonymously
  • Rate responses to one item at a time
  • Provide the rater with a frame of reference
  • Separate irrelevant aspects from the relevant performance
  • Use more than one rater
  • Re-rate the free responses
  • Rate all responses to an item on the same occasion
  • Rearrange the order of responses
  • Read a sample of responses
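
The "use more than one rater" and "re-rate the free responses" guidelines can be checked numerically by comparing the ratings. Below is a minimal sketch, with invented ratings, that computes the observed agreement and Cohen's kappa for two raters who independently scored the same free responses on a 0-2 scale.

```python
from collections import Counter

# Invented ratings of the same ten free responses by two independent raters.
rater_a = [2, 1, 0, 2, 1, 1, 0, 2, 2, 1]
rater_b = [2, 1, 0, 1, 1, 1, 0, 2, 2, 0]

n = len(rater_a)
observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n

# Chance-expected agreement from the two raters' marginal distributions.
marg_a, marg_b = Counter(rater_a), Counter(rater_b)
expected = sum((marg_a[c] / n) * (marg_b[c] / n) for c in set(rater_a) | set(rater_b))

kappa = (observed - expected) / (1 - expected)
print(f"observed agreement = {observed:.2f}, Cohen's kappa = {kappa:.2f}")
```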

Pilot studies on item quality

Standard practice is that item writers produce a set of concept items and pilot studies are done to test the quality of these concept items.
Generally, at least half of the concept items do not survive the pilot studies, and items that survive are usually revised several times.

Experts’ and test takers’ pilot studies need to be done for both free-response and multiple-choice items.
For free-response items pilot studies need to be done on the ratings of test takers’ responses to the items.

Experts’ pilots

The concept items have to be reviewed before they are included in a test.
Items are reviewed on their content, technical aspects, and sensitivity.

  • The content and technical aspects are assessed by experts in both the field of the test and item writing.

Each of the concept items is discussed by a panel of experts.
A good start for the discussion of a multiple-choice item is to look for distractors that panel members could defend as (partly) correct answers.
The reviewing of the items yields qualitative information that is used to rewrite items or to remove concept items that cannot be repaired.
Revised items should be reviewed again by experts until further rewriting is not needed.

The sensitivity of items also needs to be reviewed.
Usually, the panel for the sensitivity review of the items consists of persons not on the panel reviewing the content and technical aspects of the items.
The sensitivity review panel is composed of members of different groups.
The panel has to be trained to detect aspects of the items and the test that may be sensitive for members of the subpopulations.
The sensitivity review provides qualitative information that also could lead to rewriting or removal of concept items.

Test takers’ pilots

The concept items are individually administered to a small group of test takers from the population of interest.
Each test taker is interviewed about his or her thinking while working on an item.

Two versions of the interview can be applied:

  • Concurrent interview: the test taker is asked to think aloud while working on the item
  • Retrospective interview: the test taker is asked to recollect his or her thinking after completing the item.

Protocols of the interviews are made and the information is used to rewrite or remove concept items.

Compiling the first draft of the test

The concept items that survived the pilot studies are used to compile a concept version of the test that includes instructions for the test takers.
Usually, the instruction contains some example items that test takers have to answer to ensure that they understand the test instructions.
The concept test may consist of a number of subtests that measure different aspects of the ability or achievement.

The conventional way of assembling a maximum performance test is to start with easy items and to end with difficult items.

The concept test is submitted to a group of experts.
The group can be the same as the group that was used in the experts’ pilot study on item quality. The group has expertise in:

  • The content of the ability or achievement
  • Test construction

The experts evaluate the following properties of the concept test:

  • Whether the test instruction is sufficiently clear for the population of test takers
  • Whether the test yields adequate coverage of all aspects of the ability or achievement being measured by the test (content validation)
  • Whether the test is balanced with respect to multicultural material and references to gender

The comments of the experts are used to compile the first draft of the test.
The first draft of the test is administered in a try-out to a sample of at least 200 test takers from the population of interest.
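
The chapter stops at administering the first draft. Purely as an illustration of what happens with such try-out responses, the sketch below computes two classical descriptive item statistics, the item difficulty (proportion correct) and the item-rest correlation, on an invented response matrix that is much smaller than a real try-out sample; this analysis step is not part of the development plan described in this chapter.

```python
import statistics  # statistics.correlation requires Python 3.10+

# Invented try-out data: rows are test takers, columns are items (1 = correct).
responses = [
    [1, 1, 0, 1],
    [1, 0, 0, 1],
    [1, 1, 1, 1],
    [0, 0, 0, 1],
    [1, 1, 0, 0],
]

n_items = len(responses[0])
for j in range(n_items):
    item_scores = [row[j] for row in responses]
    rest_scores = [sum(row) - row[j] for row in responses]   # total score minus this item
    difficulty = statistics.mean(item_scores)                # proportion correct
    discrimination = statistics.correlation(item_scores, rest_scores)
    print(f"item {j + 1}: p = {difficulty:.2f}, item-rest r = {discrimination:.2f}")
```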
