To keep scoring consistent, use anchor sets and scorer training that require all scorers to reach consensus about the qualities required at each score point. Scoring guides, or rubrics, are also developed; the rubric matches the chosen type of holistic scoring. Whether the subjective scoring method is focused holistic scoring, analytic holistic scoring, scoring by major and minor errors, or some variation of these, scoring of student work follows a series of steps.
Scoring of student work should be meaningful to students, efficient in terms of time, and provide consistent results. Holistic scoring can meet these requirements, but it does require some preparation prior to scoring. The following steps outline the general process used for many formal holistic scoring programs. The exact sequence of these steps varies to meet the needs of the program.
Very specific detailed objectives can usually be measured efficiently in machine-scoreable formats. Tasks that require students to meet more global goals and therefore many objectives must usually be scored subjectively.
Since holistic scoring takes time, it is often prudent to utilize one task that covers many objectives or a main goal rather than developing many smaller tasks that address each objective specifically.
These factors might include accuracy, completeness, thoroughness, clarity, synthesis of concepts, supported inferences, or any other factor that is an essential part of the highest quality work.
Each score point of the rubric should address each of the factors that should be included in the work. After the work is examined or the task is defined, the rubric may need to be modified.
For example, if you were scoring a diving competition you would have to thoroughly understand the rubric before you saw the dive because you only get to see the dive once. The same holds true for student products. Rare is the teacher who will have time to reread every paper many times, judging it one time for grammar and another for accuracy, etc. You have to be able to form an overall impression of the quality of the work by quickly determining the relevant strengths and weaknesses of the work.
Share the rubric with your students. Explain how the rubric is used. Show some work from past students, and allow students to score this work using the rubric. The more accurate students become in evaluating others' work, the more accurate their self-evaluations become.
The number of groups in the anchor set depends on the number of score points used. For a four-point scale the separation would occur as follows: high, medium, and low groups are determined by comparing papers; the medium group is then separated into high medium and low medium. This provides a four-point range—high, high medium, low medium, and low. Write brief annotations describing each set according to the rubric, with some specific details. These papers with their annotations become your anchor sets. These sets help you maintain consistency from year to year. If you are working with another teacher, a teacher's aide, a parent, or even a large group of scorers, the set provides anchors.
The best anchor sets for this example will contain the lowest of the high papers (4-), so the scorer can say that anything better than this must be a 4. The set will also contain the best (3+) and the worst (3-) of the 3's. This pattern is followed for the rest of the anchor sets. Students, parents, or any interested party can take any paper and say that it is most like a particular set of papers. The score they arrive at should be the same as, or adjacent to, the teacher's score. Any discrepancies should be discussed.
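The boundary-exemplar idea can be sketched in code. This is only an illustration: the function name `select_anchors` and its input shape are hypothetical, and in practice the grouping and ranking are done by human comparison, not by a program.

```python
def select_anchors(scored_papers):
    """Pick boundary exemplars for each score point.

    `scored_papers` maps a score point (e.g., 4 down to 1) to a
    list of papers ordered from strongest to weakest within that
    score.  The anchor set keeps the strongest (e.g., 3+) and
    weakest (3-) paper at each point, so a new paper can be placed
    by comparison: anything better than the weakest 4 must be a 4.
    """
    return {score: (papers[0], papers[-1])
            for score, papers in scored_papers.items()}
```

A usage example: `select_anchors({4: ["A", "B"], 3: ["C", "D", "E"]})` keeps `("A", "B")` as the 4-point anchors and `("C", "E")` as the 3-point anchors, dropping the middle paper `"D"`.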
Annotations serve as a reminder of how anchor set scores were derived and are usually representative of the types of strengths and weaknesses found at that score point. If there are many assignments that produce the same type of product, one anchor set will usually suffice for classroom assessment. District or state level assessments generally require an anchor set for every prompt.
Keep the rubric and anchor set close by. There may be an occasional student product that requires a second reading or viewing. You may want to annotate these "line calls" to help you explain the score to the student.
When two scorers score the same work, as is usually the case in large-scale assessment systems, averages and third readers are employed to resolve differences.
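One common resolution convention—average equal or adjacent scores, and send larger discrepancies to a third reader—can be sketched as follows. The function name and the exact rules are assumptions; programs differ in how they pair the third reading with the original scores.

```python
def resolve_scores(score_a, score_b, third_reader=None):
    """Resolve two holistic scores for the same paper.

    Identical or adjacent scores are averaged; a discrepancy of
    more than one point goes to a third reader.  One common rule,
    assumed here, pairs the third score with the closer of the two
    original scores.
    """
    if abs(score_a - score_b) <= 1:
        return (score_a + score_b) / 2
    if third_reader is None:
        raise ValueError("discrepant scores require a third reading")
    # Pair the third reading with whichever original score it is closer to.
    closer = min((score_a, score_b), key=lambda s: abs(s - third_reader))
    return (closer + third_reader) / 2
```

For example, scores of 3 and 4 are adjacent and average to 3.5, while scores of 2 and 4 are discrepant and require a third reading.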
All types of subjective scoring center on the type of product used to demonstrate various aspects of what the student has learned in performing particular tasks. The product, which may be complex and open-ended, is scored as a whole, in contrast to objective scoring, which marks answers right or wrong. Examples of complex, open-ended tasks include answering discussion questions on a test or performing a particular task, such as debates, reports, or presentations. Subjective scoring requires the scorer to make a judgment regarding the quality of the student's work. Holistic scoring methods include focused holistic scoring, analytic holistic scoring, scoring by major and minor errors, and variations of these.
Focused holistic scoring centers on the product as a whole. Numerous pieces of work are studied and a scale is designed by describing factors inherent in the product at the various scoring levels. Each score point of the scoring guide, or rubric, addresses each of the major factors that should be included in the work. In some science inquiry tasks these may include observations, conclusions with supporting evidence, and experimental design. After the work is examined or the task defined, the rubric may need to be modified as factors are added or subtracted and descriptions of score points are reworked. Focused holistic scoring produces only one final number that is assigned to represent the student's work as a whole. It is focused because it focuses on the total product, not on separate aspects of the student's work. This scoring method allows a relatively fast scoring of process skills. However, it fails to pinpoint strengths and weaknesses of students. Annotations can be used to detail strengths and weaknesses.
The following is an example of a focused holistic rubric in which the scorer uses only one number to represent the student's work as a whole:
Analytic holistic scoring can be used as a means of informing both the scorer and students of general areas of high and low quality. Analytic holistic scoring follows the same procedures as focused holistic scoring, but the rubrics are more specific. The information provided by the rubrics in analytic holistic scoring is generally more useful to students, especially for beginners. Analytic holistic scoring rubrics can be used when only part of an entire paper is to be scored, and several analytic scores can be given to one paper. With a little practice a teacher will be able to read a paper one time and assign several analytic scores. Analytic rubrics are often used with short-answer essays. Discussion of the analytic rubrics before and after the task can provide a vehicle of instruction.
This example shows how numeric scores can be assigned to the key performance areas of Observing and Drawing conclusions.
Performance Area: Observing
Performance Area: Drawing Conclusions
Analytic holistic scoring produces several numeric scores, each associated with a different aspect of the student's work, while focused holistic scoring produces a single number representing the student's work as a whole. Focused holistic scoring is appropriately used when a relatively quick and superficial, yet consistent, assessment technique is needed; it can precede or follow other assessment techniques aimed at identifying students' strengths and weaknesses. Analytic holistic scoring provides students and teachers with diagnostic information about students' particular strengths and weaknesses and is desirable when students need feedback about their performance in key areas of their learning products.
Scoring by major and minor errors uses a general rubric defined by major and minor errors, followed by a list of those errors. The errors may focus on the components of the task, as in the example below, or on qualities of the work such as clarity or reasoning. The types of errors listed depend on the product being reviewed. This type of scoring is tedious to set up, but it is very helpful in identifying exactly what the scorer is trying to assess.
| Major Errors | Minor Errors |
| --- | --- |
| The problem is NOT clearly defined in the text, NOR is it inferred by the given experimental technique. | A problem statement is omitted, but the student has obviously identified the problem. |
| Design does not permit investigation of the problem; it omits appropriate control of variables, or has a small sample size or number of trials. | The design includes unnecessary procedures, but these do not have detrimental effects on the results. |
| Appropriate charts/tables are missing OR are totally unorganized; missing labels make it impossible to identify the type of data collected. | Charts may be slightly disorganized but are understandable; labels may be missing, but it is apparent what data have been collected. |
| Appropriate graphs are missing, or graphs lack enough information to be interpreted correctly (e.g., graph labels are missing). | Appropriate graphs are given, but there are some style errors; labels may not be complete, but they are understandable. |
| No attempt is made to analyze data or to identify trends or patterns in the data. | Major patterns in the data are correctly identified, though there may be a few inconsistencies in the analysis. |
| Conclusion is NOT supported by the data or contradicts known facts; conclusion is missing, incorrect, or based upon extraneous information. | Conclusion is generally supported by the data and factual information, but there are a few inconsistencies. |
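The error lists define what counts against a paper, but the conversion from error counts to a score is left open. One hedged sketch, assuming a simple penalty convention (the penalty weights and the name `score_from_errors` are illustrative assumptions, not part of the method):

```python
def score_from_errors(major, minor, top_score=4,
                      major_penalty=1.0, minor_penalty=0.5):
    """Convert counts of major and minor errors to a score.

    The method defines the error lists but not a scoring formula;
    starting from the top score and subtracting more for major
    errors than for minor ones is one simple convention.  Scores
    are floored at 0.
    """
    return max(0.0, top_score - major * major_penalty - minor * minor_penalty)
```

Under these assumed weights, an error-free paper earns the top score, and one major plus two minor errors would drop a 4-point paper to a 2.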