Achievement Test-Marking and Reporting

Dr. V. K. Maheshwari, M.A. (Socio, Phil), B.Sc., M.Ed., Ph.D.

Former Principal, K.L.D.A.V.(P.G) College, Roorkee, India

Marks are the basis for important decisions made by students, teachers, counselors, parents, school administrators, and employers. They can also serve as incentives for increased motivation. At present, marks and reports have become the basis for crucial decisions about the educational and occupational destiny of the student.

Consider how marks serve as the basis for the following decisions:

A) Teachers and counselors use marks to assess past accomplishments and present ability, and to help the student make educational and vocational plans for the future.

B) The student uses marks to appraise his own educational accomplishments, to select major and minor areas of study, and to decide whether to terminate or to continue his formal education.

C) Parents use marks to determine which (if not all) of their children they should send to a specific college, and to estimate the probability of success any one child might have in advanced study and in a particular vocation.

D) School and college administrators, faced with limited educational facilities, use marks as the basis for admission to advanced study and as indications of the student’s progress after admission.

E) Employers use marks in selecting the applicant most likely to perform best the service they require. Part of the hue and cry over marks and marking systems stems from the major role, desirable and undesirable, that marks play in the lives of our students.

Marks can also serve as incentives or positive reinforcers. Incentives can increase motivation by raising the anticipation of reaching a desired goal, and they can yield learned expectancies. A student who has learned to expect good marks for competent performance will approach most educational tasks with more vim and vigor than will the student who has learned to expect poor marks for inadequate performance. Marks, therefore, not only convey information for crucial decisions but also provide important motivational influences.

The Bases of Marks

In terms of performance assessment, the basis for the assignment of marks is the student’s achievement of the instructional objectives. Unfortunately, not all teachers agree that achievement of the instructional objectives should be the exclusive basis for marking. Instead, they use several other bases:

A) Teachers often base grades on the student’s attitude, citizenship, or desirable attributes of character. A student who shows a cooperative attitude, responsible citizenship, and strength of character receives a higher mark than a student who shows a rebellious attitude, underdeveloped citizenship, and weakness of character.

B) Teachers often base marks on the amount of effort the student invests in achieving instructional objectives, whether or not these efforts meet with success. Conceivably, the student who expends more effort and does not succeed may receive a higher mark than does a student who expends less effort and does succeed.

C) Teachers base marks on growth, or how much the student has learned, even when that amount falls short of what the instructional objectives require.

The Marking Systems

The marking systems generally used are based on three types of standards:

A) Absolute standards

B) Criterion-related standards

C) Relative standards

Absolute-

It is sometimes called the absolute system because it assumes the possibility of the student achieving absolute perfection. It is based on 100% mastery; percentages below 100 represent less mastery. According to the definition of absolute standards of achievement, 100% could simply mean that the student has attained the standard of acceptable performance specified in the instructional objective. With the absolute standard, it is harder to explain the meaning of any percentage below 100, since the student either achieves or does not achieve (all or none) the required standard. A grade of less than 100% could instead indicate the percentage of the total number of instructional objectives a student has achieved at some point during, or at the end of, the course.
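
Under the second interpretation the arithmetic is simply the share of objectives achieved. Here is a minimal sketch in Python; the objectives and the pass/fail results are hypothetical:

```python
# Minimal sketch: a percentage grade read as the share of
# instructional objectives achieved. The objectives and the
# pass/fail results are hypothetical.
objectives_achieved = {
    "identify parts of speech": True,
    "write a topic sentence": True,
    "punctuate a compound sentence": False,
    "outline a five-paragraph essay": True,
}

achieved = sum(objectives_achieved.values())
percentage = 100 * achieved / len(objectives_achieved)
print(f"Grade: {percentage:.0f}%")  # 3 of 4 objectives -> 75%
```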

Criterion-related-

Criterion-related standards do not rely on absolute mastery of every objective. Instead, they depend on criteria established by the teacher, who has considered the number and kind of objectives that must be met before the next stage of instruction can be entered. In this way they are more meaningful, because they can recognize differential levels of competence and achievement and allow some students to achieve above the minimum performance required of all students. Teachers who use percentage grades rarely adopt either of these interpretations, and their grades, despite their deceptive arithmetical appearance, convey no clear information about student achievement.
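
A criterion-related decision might look like the following minimal sketch, in which the objective names, their classification as essential or optional, and the thresholds are all hypothetical illustrations rather than a prescribed scheme:

```python
# Minimal sketch of a criterion-related standard: entry to the next
# stage of instruction requires every essential objective plus at
# least two of the optional ones. The objective names, their
# classification, and the thresholds are all hypothetical.
essential = {"obj1": True, "obj2": True, "obj3": True}
optional = {"obj4": True, "obj5": False, "obj6": True}

meets_criterion = all(essential.values()) and sum(optional.values()) >= 2
print("enter next stage" if meets_criterion else "reteach and retest")
```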

Relative standards-

The more popular marking system consists of the assignment of letter grades: A, B, C, D, and F. A denotes superior, B good, C average, D fair, and F failing or insufficient achievement. It is sometimes called the relative system because the grades are intended to describe the student’s achievement relative to that of other students rather than against a standard of perfection or mastery. An ordinary, but by no means universal, assumption of this marking system is that grading should be on the curve, the curve in question being the normal probability curve. A teacher may decide that one class has earned more superior or failing grades than another class which, in turn, has earned more average and good grades.
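
Grading on the curve amounts to ranking the class and handing out grades by fixed shares. The sketch below assumes a 7-24-38-24-7 percent split across the five grades, which is one traditional quota rather than a universal rule; the scores are hypothetical:

```python
# Minimal sketch of grading "on the curve": rank the scores and
# assign letter grades by fixed quotas. The 7-24-38-24-7 split is
# one traditional choice, not a universal rule.
def curve_grades(scores, quotas=(("A", 0.07), ("B", 0.24), ("C", 0.38),
                                 ("D", 0.24), ("F", 0.07))):
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    grades = [None] * len(scores)
    start = 0
    for letter, share in quotas:
        count = round(share * len(scores))
        for i in ranked[start:start + count]:
            grades[i] = letter
        start += count
    for i in ranked[start:]:  # rounding leftovers fall to the last grade
        grades[i] = quotas[-1][0]
    return grades

print(curve_grades([92, 85, 78, 74, 70, 66, 61, 55, 48, 40]))
# -> ['A', 'B', 'B', 'C', 'C', 'C', 'C', 'D', 'D', 'F']
```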

Scoring Essay Tests

Essay test scoring calls for a higher degree of competence, and ordinarily takes considerably more time, than the scoring of objective tests. In addition, essay test scoring presents two special problems. The first is that of providing a basis for judgment that is sufficiently definite, and of sufficiently general validity, to give the scores assigned by a particular reader some objective meaning. To be useful, these scores should not represent purely subjective opinions and personal biases that equally competent readers might or might not share. The second problem is that of discounting irrelevant factors, such as quality of handwriting, verbal fluency, or gamesmanship in appealing to the scorer’s interests and biases. The reader’s scores should reflect unbiased estimates of the essential achievements of the examinee.

One means of improving objectivity and relevance in scoring essay tests is to prepare an ideal answer to each essay question and to base the scoring on the relation between examinee answers and the ideal answer. Another is to defer assignment of scores until the examinee answers have been sorted and resorted into three to nine sets at different levels of quality. Scoring the test question by question through the entire set of papers, rather than paper by paper (marking all questions on one paper before considering the next), improves the accuracy of scoring. If several scorers will be marking the same questions in a set of papers, it is usually helpful to plan a training and practice session in which the scorers mark the same papers, compare their marks, and strive to reach a common basis for marking.

The construction and scoring of essay questions are interrelated processes that require attention if a valid and reliable measure of achievement is to be obtained. In the essay test the examiner is an active part of the measurement instrument; the variability within and between examiners therefore affects the examinee’s resulting score. This variability is a source of error that affects the reliability of the essay test if not adequately controlled. Hence, for essay test results to serve a useful purpose as a valid measure, a conscious effort must be made to score the test objectively: using appropriate methods to minimize the effect of personal biases and idiosyncrasies on the resulting scores, and applying standards to ensure that only the relevant factors indicated in the course objectives influence the marks.

The Point or Analytic Method

In this method each answer is compared with an already-prepared ideal marking scheme (scoring key), and marks are assigned according to the adequacy of the answer. When used conscientiously, the analytic method provides a means for maintaining uniformity in scoring between scorers and between scripts, thus improving the reliability of the scoring.

This method is generally used satisfactorily to score restricted-response questions. This is made possible by the limited number of characteristics elicited by a single answer, which defines the degrees of quality precisely enough to assign point values to them. Analytic scoring also makes it possible to identify the particular weaknesses or strengths of each examinee. It is desirable to rate each aspect of the item separately; this provides greater objectivity and increases the diagnostic value of the result.
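
A minimal sketch of an analytic scoring key follows; the rubric elements, point values, and sample judgments are hypothetical stand-ins for a real marking scheme:

```python
# Minimal sketch of analytic (point-method) scoring: each expected
# element of the ideal answer carries a point value, and the mark is
# the sum of points for the elements the scorer judged present.
# The rubric and the sample judgments are hypothetical.
rubric = {
    "names photosynthesis": 2,
    "mentions chlorophyll": 1,
    "states light is required": 1,
    "gives the word equation": 2,
}

def analytic_score(elements_present):
    """elements_present: the rubric keys the scorer judged present."""
    return sum(points for element, points in rubric.items()
               if element in elements_present)

print(analytic_score({"names photosynthesis", "states light is required"}))
# -> 3 (out of a possible 6)
```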

The Global/Holistic Rating Method

In this method the examiner first sorts the responses into categories of varying quality based on his general or global impression on reading each response. The standard of quality helps to establish a relative scale, which forms the basis for ranking responses from those of the poorest quality to those of the highest quality. Usually between five and ten categories are used, each pile representing a degree of quality that determines the credit to be assigned. For example, where five categories are used, the responses are sorted into five piles and awarded the five letter grades A, B, C, D, and E.

This method is ideal for extended-response questions, where relative judgments are made (with no exact numerical scores) concerning the relevance of ideas, the organization of the material, and similar qualities evaluated in answers to such questions. Using this method requires considerable skill and time in determining the standard response for each quality category. It is desirable to rate each characteristic separately; this provides greater objectivity and increases the diagnostic value of the results.

Improving Objectivity in Marking Essay Tests

The following are procedures for scoring essay questions objectively to enhance reliability.

i. Prepare the marking scheme (the ideal answer or an outline of the expected answer) immediately after constructing the test items, and indicate how marks are to be awarded for each section of the expected response.

ii. Use the scoring method that is most appropriate for the test item. That is, use either the analytic or global method as appropriate to the requirements of the test item.

iii. Decide how to handle factors that are irrelevant to the learning outcomes being measured. These factors may include legibility of handwriting, spelling, sentence structure, punctuation and neatness. These factors should be controlled when judging the content of the answers. Also decide in advance how to handle the inclusion of irrelevant materials (uncalled for responses).

iv. Score only one item in all the scripts at a time. This helps to control the “halo” effect in scoring.

v. Evaluate the responses anonymously, without knowing which examinee’s script you are scoring. This helps in controlling bias in scoring the essay questions.

vi. Evaluate the marking scheme (scoring key) before actual scoring by scoring a random sample of examinees’ actual responses. This provides a general idea of the quality of the responses to be expected and might call for a revision of the scoring key before actual scoring commences.

vii. Make comments during the scoring of each essay item. These comments act as feedback to examinees and a source of remediation to both examinees and examiners.

viii. Obtain two or more independent ratings if important decisions are to be based on the results. The results of the different scorers should be compared, and the ratings moderated to resolve discrepancies, for more reliable results (a simple comparison is sketched below).
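
For point viii, here is a minimal sketch of comparing two independent ratings; the scripts, marks, tolerance, and averaging rule are all hypothetical:

```python
# Minimal sketch of comparing two independent ratings per script.
# Scripts whose ratings differ by more than a tolerance are flagged
# for moderation; otherwise the two ratings are averaged. The data,
# tolerance, and averaging rule are hypothetical.
rater_a = {"script01": 14, "script02": 11, "script03": 18}
rater_b = {"script01": 13, "script02": 16, "script03": 17}
TOLERANCE = 2  # maximum acceptable difference in marks

for script in sorted(rater_a):
    a, b = rater_a[script], rater_b[script]
    if abs(a - b) > TOLERANCE:
        print(f"{script}: ratings {a} and {b} disagree -> send to moderation")
    else:
        print(f"{script}: agreed mark {(a + b) / 2:.1f}")
```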

Scoring Objective Tests

Answers to true–false, multiple-choice, and other objective item types can be marked directly on the test copy. But scoring is facilitated if the answers are indicated by position marks on a separate answer sheet. For example, the examinee may be directed to indicate his choice of the first, second, third, fourth, or fifth alternative to a multiple-choice test item by blackening the first, second, third, fourth, or fifth position following the item number on his answer sheet.

Answers so marked can be scored by clerks with the aid of a stencil key on which the correct answer positions have been punched. To get the number of correct answers, the clerk simply counts the number of marks appearing through the holes on the stencil key. Or the answers can be scored, usually much more quickly and accurately, by electrical scoring machines. Some of these machines, which “count” correct answers by cumulating the current flowing through correctly placed pencil marks, require the examinee to use special graphite pencils; others, which use photoelectric cells to scan the answer sheet, require only marks black enough to contrast sharply with the lightly printed guide lines. High-speed photoelectric test scoring machines usually incorporate, or are connected to, electronic data processing and print-out equipment.

Objective tests can be scored by various methods, and various techniques are used to speed up the scoring:

i. Manual Scoring

In this method the answers to the test items are scored by direct comparison of the examinee’s answers with the marking key. If the answers are recorded on the test paper, for instance, a scoring key can be made by marking the correct answers on a blank copy of the test. Scoring is then done by simply comparing the columns of answers on the master copy with the columns of answers on each examinee’s test paper. Alternatively, the correct answers are recorded on strips of paper, and these strip keys, on which the columns of answers are recorded, are used as masters for scoring the examinees’ test papers.

ii. Stencil Scoring

Where separate answer sheets are used by examinees for recording their answers, it is most convenient to prepare and use a scoring stencil. A scoring stencil is prepared by punching holes in a blank answer sheet where the correct answers should appear. Scoring is then done by laying the stencil over each answer sheet and counting the number of answer marks appearing through the holes. At the end of this scoring procedure, each test paper is scanned to eliminate possible errors due to examinees supplying more than one answer to an item.

iii.  Machine Scoring

If the number of examinees is large, specially prepared answer sheets are used to answer the questions. The answers are normally shaded at the appropriate places assigned to the various items. These special answer sheets are then machine scored, with computers and other scoring devices, using a certified answer key prepared for the test items. In scoring an objective test it is usually preferable to count each correct answer as one point; an examinee’s score is simply the number of items answered correctly.
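
Whatever the device, the underlying operation is a comparison of each response with the answer key, one point per match. A minimal sketch, with a hypothetical key and response strings:

```python
# Minimal sketch of key-comparison scoring: one point per item whose
# response matches the answer key. The key and the response strings
# are hypothetical; a blank ('-') counts as an omission, not an error.
KEY = "BDACABDCAB"

def raw_score(responses):
    return sum(1 for given, correct in zip(responses, KEY) if given == correct)

print(raw_score("BDACABDCAB"))  # perfect paper -> 10
print(raw_score("BDA-ACDCAB"))  # one omission, one wrong -> 8
```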

Correction for guessing

One question that often arises is whether or not objective test scores should be corrected for guessing. Differences of opinion on this question are much greater and more easily observable than differences in the accuracy of the scores produced by the two methods of scoring. If well-motivated examinees take a test that is appropriate to their abilities, little blind guessing is likely to occur. There may be many considered guesses, if every answer given with less than complete certainty is called a guess. But the examinee’s success in guessing right after thoughtful consideration is usually a good measure of his achievement.

Since the meaning of most achievement test scores is relative, not absolute—the scores serve only to indicate how the achievement of a particular examinee compares with that of other examinees—the argument that scores uncorrected for guessing will be too high carries little weight. Indeed, one method of correcting for guessing results in scores higher than the uncorrected scores.

The logical objective of most guessing correction procedures is to eliminate the expected advantage of the examinee who guesses blindly in preference to omitting an item. This can be done by subtracting a fraction of the number of wrong answers from the number of right answers, using the formula S = R – W/(k – 1) where S is the score corrected for guessing, R is the number of right answers, W is the number of wrong answers, and k is the number of choices available to the examinee in each item. An alternative formula is S = R + O/k where O is the number of items omitted, and the other symbols have the same meaning as before. Both formulas rank any set of examinee answer sheets in exactly the same relative positions, although the second formula yields a higher score for the same answers than does the first.
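
Both formulas are easy to verify numerically. The sketch below, using a hypothetical ten-item, four-choice test, implements the two corrections; the second formula always returns the higher score while preserving the same ranking:

```python
# Minimal sketch of the two guessing corrections for a test of n items
# with k choices each: R rights, W wrongs, O omits (R + W + O = n).
# The example response counts are hypothetical.
def corrected_subtract(R, W, k):
    """S = R - W/(k - 1): penalize wrong answers."""
    return R - W / (k - 1)

def corrected_add(R, O, k):
    """S = R + O/k: credit each omission with the chance score."""
    return R + O / k

k, n = 4, 10
for R, W in [(7, 3), (7, 1), (5, 5)]:
    O = n - R - W
    print(f"R={R} W={W} O={O}: "
          f"subtractive {corrected_subtract(R, W, k):.2f}, "
          f"additive {corrected_add(R, O, k):.2f}")
# Both columns order the three papers identically; the additive
# formula's scores are uniformly higher.
```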

Logical arguments for and against correction for guessing on objective tests are complex and elaborate. But both these arguments and the experimental data point to one general conclusion. In most circumstances a correction for guessing is not likely to yield scores that are appreciably more or less accurate than the uncorrected scores.

Reporting-

The most popular method of reporting marks is the report card. Most modern report cards contain grades and checklist items. The grades describe the level of achievement, and the checklists describe other areas such as effort, conduct, homework, and social development.

Because the report card does not convey all the information parents sometimes seek, and to improve cooperation between parents and teachers, schools often use parent-teacher conferences. The teacher invites the parents to the school for a short interview. The conference allows the teacher to provide a fuller description of the student’s scholastic and social development, and allows parents to ask questions, describe the home environment, and plan what they may do to assist their children’s educational development. There are inherent weaknesses in conferences, and ordinarily they should supplement rather than replace the report card.

Despite the rather obvious limitations of validity, reliability, and interpretation, reform of these marking systems has had only temporary appeal. Reforms advocating the elimination of marks have failed because students, teachers, counselors, parents, administrators, and employers believe they enjoy distinct advantages in knowing the student’s marks. Many know that marks mislead them, but many believe that some simplified knowledge of the student’s achievement is better than no knowledge at all.

 
