What are selected response tasks?
What are some of the strengths and weaknesses of selected response tasks?
What are the major criteria for selected response tasks?
What are some of the strengths and weaknesses of true-false tests?
What are some of the strengths and weaknesses of matching items?
What are some of the strengths and weaknesses of fill-in-the-blank items?
What are some of the strengths and weaknesses of multiple choice?
What are some modifications of multiple choice items?
What are some common problems with multiple choice items?
How does the teacher prepare for assessment with Selected Response Tasks?
Type of objectives | Conceptual understanding, skills
Number of students | Large group
Teacher prep time | Depends on the quality of the questions. Questions requiring more than rote recall require much longer to prepare
Class time | One class period
Scoring time | Short
Scoring method | Answer key
Possible problems | Questions that do not reflect objectives, wrong answers keyed, does not reflect the type of work students will be called upon to do as adults, cannot probe in-depth understanding
Possible values | Can test a large number of students quickly, scoring is objective, can cover a wide variety of topics and skills quickly
Selected response tasks are tasks where students select from given responses of one or two words. These tasks include multiple choice, true/false, matching, and fill-in-the-blank.
Multiple choice, true/false, matching, and fill-in-the-blank assessments furnish snapshots of performance. They can be developed to assess many problem-solving skills, but generally the skills are tested in isolation. These common forms of assessment are blamed for some of the current problems in education. All too often, teacher and student alike focus on "Will this be on the test?" more than "Is this important to learn?" However, the ease of administration and scoring and the statistical reliability of these items will probably keep them in use.
Multiple choice appears to be the best of the selected response assessment methods. True/false items tend to be tricky and require a tremendous number of items to maintain statistical reliability. Matching items often give away many of the answers. Fill-in-the-blank items are generally scored with a single keyed response. If a different correct response is given, it probably will not receive credit, causing student frustration. For these reasons, multiple choice will probably remain a large part of large-scale assessments.
The following are some of the common criteria used to evaluate selected response items:
True-false tests are excellent for testing memorization of factual information. They are quickly prepared and scored. True-false tests require about twice as many items as a multiple choice test to have the same reliability. Test items that contain negatives are very difficult to interpret and should be avoided.
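One way to see why a true-false test needs more items is to look at how far blind guessing alone can carry a student. The sketch below assumes independent guesses, a hypothetical 20-item test, and a hypothetical passing score of 14 correct; the function name is illustrative. It shows the effect of guessing only and is not a reliability calculation.

```python
from math import comb

def prob_pass_by_guessing(n_items, n_options, needed_correct):
    """Probability of getting at least `needed_correct` items right by blind guessing."""
    p = 1 / n_options  # chance of guessing a single item correctly
    return sum(
        comb(n_items, k) * p**k * (1 - p) ** (n_items - k)
        for k in range(needed_correct, n_items + 1)
    )

# Hypothetical test: 20 items, 14 correct needed to pass.
true_false = prob_pass_by_guessing(20, 2, 14)       # two options per item
multiple_choice = prob_pass_by_guessing(20, 4, 14)  # four options per item

print(f"True-false:      {true_false:.4f}")
print(f"Multiple choice: {multiple_choice:.6f}")
```

Under these assumptions a guesser passes the true-false version far more often than the four-option version, so more true-false items are needed before a score says much about what the student knows.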
Matching items, like true-false items, tend to cover factual information but are usually a bit more difficult to construct. The use of negatives in one part of the matching can cause confusion and should be avoided.
Fill-in-the-blank items are easily constructed and easy to score when a key is used. Frequently, however, students answer with a correct response that is not given in the key. The answer is then marked incorrect, causing hurt feelings among parents, students, and teachers. This incorrect marking occurs much more frequently with fill-in-the-blank than with true-false or multiple choice tests.
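One way to reduce this problem is to let the key list every acceptable response for a blank instead of a single word. The sketch below is a hypothetical illustration: the function name, the sample key entry, and the normalization rule (ignoring case and extra spaces) are assumptions, not features of any particular grading tool.

```python
def score_blank(response, accepted_answers):
    """Return True if the response matches any accepted answer, ignoring case and extra spaces."""
    def normalize(text):
        return " ".join(text.lower().split())
    return normalize(response) in {normalize(answer) for answer in accepted_answers}

# Hypothetical key entry: the blank accepts the element name or its symbol.
accepted = ["carbon", "C"]

print(score_blank("  Carbon ", accepted))  # matches "carbon" after normalization
print(score_blank("c", accepted))          # matches the symbol, case ignored
print(score_blank("oxygen", accepted))     # not in the key
```

A key built this way still depends on the writer anticipating the alternatives, so review by others remains useful.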
Good thought-provoking multiple choice items can be fashioned, but this requires a great deal of time and content expertise. The difficulty of preparation can be minimized by using item models and banking good items.
Some multiple choice questions have multiple correct answers. A wise test-taker treats each answer as a true/false question. The method used for scoring can dramatically affect the way a test taker should answer the questions. Students who fail to understand the importance of various scoring methods can be severely penalized in this form of testing.
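The effect of the scoring method can be made concrete with two hypothetical rules for an item with more than one correct option. The rule names, the sample item, and the sample response below are assumptions for illustration; real tests may use other rules, such as penalties for wrong selections.

```python
def all_or_nothing(selected, correct):
    """Full credit only when the selected set exactly matches the correct set."""
    return 1.0 if set(selected) == set(correct) else 0.0

def per_option(selected, correct, options):
    """Treat each option as its own true/false judgment and average the results."""
    selected, correct = set(selected), set(correct)
    right = sum((opt in selected) == (opt in correct) for opt in options)
    return right / len(options)

# Hypothetical item: options A-D, with A and C both correct.
options = ["A", "B", "C", "D"]
correct = ["A", "C"]
response = ["A"]  # the student marks only one of the two correct options

print(all_or_nothing(response, correct))       # 0.0
print(per_option(response, correct, options))  # 0.75
```

The same response earns no credit under one rule and three quarters of the credit under the other, which is why students need to know how such items will be scored.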
Justified multiple choice is another modification of multiple choice questions. In this form, the student is required to justify choices and, in some cases, refute incorrect options. This gives the teacher an opportunity to see whether the student really knows the answer or is just guessing. The disadvantage of justified multiple choice is that it has all of the problems of any multiple choice question and gives up the speed of scoring, which is the major strength of multiple choice.
The better items presented in this section are low level items and are only better in the sense that they are better than the poor items that precede them. The questions that follow are typical of the questions that a test writer would ask to facilitate construction of high quality test items.
Which of the following is a sex-linked characteristic?
NOTE: Look at questions and ask, "what is the question asking the student to do?" In this case, the item above is asking the student to recall. In addition, the recall is rather trivial.
Color-blindness is a sex-linked characteristic. What are the chances of a normal male and a color-blind female having a color-blind son?
NOTE: The answer to this question requires the student to make a prediction based on an understanding of the heredity of sex-linked characteristics. It is not necessary for the student to have memorized that color-blindness is sex-linked.
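The prediction can be worked through as a small Punnett square. The sketch below assumes the usual classroom model, in which color-blindness is a recessive trait carried on the X chromosome; the allele labels `XN` (normal) and `Xc` (color-blind) are illustrative names.

```python
from fractions import Fraction
from itertools import product

father = ["XN", "Y"]   # normal male: one normal X, one Y
mother = ["Xc", "Xc"]  # color-blind female: both X chromosomes carry the trait

# Each child receives one chromosome from each parent.
children = list(product(father, mother))

def is_son(child):
    return "Y" in child

def is_color_blind(child):
    # Recessive X-linked trait: every X the child carries must be Xc.
    x_alleles = [allele for allele in child if allele != "Y"]
    return all(allele == "Xc" for allele in x_alleles)

sons = [c for c in children if is_son(c)]
color_blind_sons = [c for c in sons if is_color_blind(c)]

print(Fraction(len(color_blind_sons), len(sons)))      # 1
print(Fraction(len(color_blind_sons), len(children)))  # 1/2
```

Under this model every son is color-blind, while half of all children are color-blind sons. The two readings give different answers, so the stem should make clear which one is intended.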
Objective: Understand the transfer of heat.
Which temperature is equivalent of 1273 K?
NOTE: If the objective had been, "identify temperature equivalents," then the sample above would match the objective. The question below requires an understanding of heat transfer and is a better match to the given objective.
Additional problems: Kelvin is not preceded by the degree sign as Celsius and Fahrenheit are. These types of details are often missed when writing items, but are usually caught when items are reviewed.
Options A and B can be eliminated by a knowledgeable student because 100 °C and 212 °F are equivalent. Since two options cannot both be the single correct answer, neither can be correct, and the student's chance of choosing the correct response from the remaining options rises to 50%. As with all test items, the students who answer correctly should do so because they understand the material, not because they have good test-taking skills.
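The arithmetic behind both observations can be checked with the standard conversion formulas. The function names in this sketch are illustrative.

```python
def kelvin_to_celsius(kelvin):
    """Convert a Kelvin temperature to degrees Celsius."""
    return kelvin - 273.15

def celsius_to_fahrenheit(celsius):
    """Convert degrees Celsius to degrees Fahrenheit."""
    return celsius * 9 / 5 + 32

# 1273 K is very close to 1000 degrees Celsius.
print(round(kelvin_to_celsius(1273), 2))

# 100 degrees Celsius and 212 degrees Fahrenheit are the same temperature,
# so options giving these two values cannot both be the single keyed answer.
print(celsius_to_fahrenheit(100))
```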
Which diagram correctly shows the direction of net heat flow in three metal bars?

Who invented bacteria?
NOTE: This item depends on an understanding of the term invented rather than an understanding of the contributions of individuals, since no one invented bacteria.
Additional problems: The options "all of these" and "none of these" should be avoided, since they can frequently be justified as the answer even when they are not intended to be the answer.
Who was the first to identify microorganisms as a cause of some diseases?
Every organic compound—
Note: This type of poor item construction is often found in multiple choice tests. Some items may have only a single word followed by a dash, leaving the student to wonder, "What does the test maker want?" Test takers can be taught to treat these types of items as four true-false items, one for each option. The first option, for example, would be read by students as, "Every organic compound is produced by animals. True or false."
Which of these is found in every organic compound?
Note: This question still leaves one wondering why it is important to know that organic compounds contain carbon.
Which of these is not a characteristic of animals?
NOTE: The use of the negative in the stem and in option C makes this doubly difficult to read and interpret. Option B is correct for some animals, but not all, making this a weak option. If negatives are used, the negative word should be in boldface or all caps. If a negative is used in the stem of the item, NO answer option should contain a negative.
Which of these is a characteristic that all animals have?
Vitamin B, found in rice hulls, helps prevent—
As westerners introduced better milling techniques to remove rice hulls, natives of south sea islands began to develop beriberi due to a lack of—
NOTE: The second item can lead the student to the correct answer for the first item. Items that clue each other should not be in the same test. Many students with good test-taking skills will get the correct answer, but not because they understand the information.

Which seeds are probably the slowest to germinate?
Which seeds are probably the slowest to germinate?
NOTE: Options with numbers should be in ascending or descending order. This helps the student find the correct option quickly.
Tundra and taiga biomes are found at high latitudes, but similar biomes can also be found—
NOTE: This problem is the most difficult to detect, especially if you are writing and editing your own items. Alternate correct options are often overlooked if the person sees the answer as "obvious," if the person has a misconception, or if the incorrect options can be justified in some way. In the item above the best answer is clearly option A. However, the justification of the other options makes this type of item very difficult to defend. Only very knowledgeable reviewers can catch this type of item flaw.
If students are writing test items as review for the test, the teacher can easily spot problem areas, as students will often write items with multiple correct answers. Because of this, student test-writing can become a powerful teaching and learning tool.
When is the earth farthest from the sun?
The earth is farthest from the sun when the season in the Northern Hemisphere is
NOTE: Students who understand the information may miss items that have long wordy options. This problem can be avoided by putting more information in the stem of the item.
Which of the following is a compound?
NOTE: Items that lack parallel options often contain more than one justifiable answer. For example, water is clearly the answer, but some minerals are compounds, making Option D justifiable. To minimize frustration for students who tend to ponder each answer, avoid options that a student can justify in some way. All options should be parallel in grammatical construction as well as in degree of specificity.
Which of the following is a compound?
NOTE: This example provides one clearly correct answer and all of the options are substances.
A test writer should justify each of the options, if not in writing, at least mentally.
What is the source of MOST of the salt in the oceans?
NOTE: "Wrong answer" or "Student guesses" are not justifications, but are frequently given by writers as rationales for using options as wrong answers. "Off the wall" options are wasted because so few students choose them.
What is the source of MOST of the salt in the oceans?
NOTE: The inclusion of rationales does not guarantee good items, but it does make the writer provide options that at least seem reasonable. The item might still focus on trivia, as this item does. If students are writing test items, they should provide rationales for the options. This can inform the teacher of many misconceptions students hold. It also requires students to investigate the information they are trying to learn in greater depth.
What type of gene is responsible for the five-finger trait in humans?
NOTE: Dominant and recessive are all-inclusive. This raises the chance of guessing the correct response to 50%, because without reading the question one could guess the answer to be either dominant or recessive. Genes are generally taught as dominant, recessive, or incompletely dominant; as a result, there is no fourth plausible option for this particular item.
Five-finger trait occurs more commonly in any human population than six fingers. The five-finger trait is probably—
NOTE: The keyed response should be recessive in this item. Review by others is necessary to reveal this type of error. No one writer can be expected to know all of the science there is to know. Multiple reviewers are a must. Student-developed questions will often alert teachers to misconceptions held by students, because students will mark the wrong answer as the correct one.
Which of these will result if all colors are mixed?
Note: A test-wise student will narrow this to the opposites, thus increasing the chance of guessing correctly to 50%. This item is also ambiguous. Are the colors to be light or pigment? White is the correct response if light is intended, and black is the correct response if pigment is intended.
Which of these models shows the orbit of the Earth?

NOTE: Is the answer supposed to be a mathematical or scale model? Does the view correct for perspective? When using drawings you must constantly be aware of the problems of depicting materials and scale. Multiple reviewers help prevent interpretation problems. All people reading the items should be encouraged to voice their concerns no matter how trivial they seem.

NOTE: Onion roots do not generally grow to a length of 300 centimeters and even if they did they would not grow as fast as indicated by the graph. Multiple reviewers can help identify illogical data or inaccurate factual information. This graph also lacks numbers on the x-axis.
Prepare a unit assessment instrument that addresses each of the following:
Use the criteria provided in the Selected Response Section to help you design the best possible test.
Provide an answer key.
Describe how the assessment instrument can be used to improve student learning.