The grade-inflation hawks outnumbered the doves in the months of discussion that led to the adoption of Faculty Senate Resolution 03-04 in December 2002, which appears at http://www.math.lsu.edu/~mcgehee/Grades.html. Its appendices refer to strongly worded reports with startling proposals. Nevertheless, the Resolution itself calls only for moderate and local remedies. It also avoids the inexact term "grade inflation," which in general, alas, is unavoidable.
This seems a good time to call your attention to the Resolution. It contains policy recommendations for administrators, and charges to two Faculty Senate committees.
Valen E. Johnson's new book examines the nature of SETs - Student Evaluations of Teaching - and surveys a few dozen studies conducted over a period of decades at various universities. It also reports on the 1998-1999 study called DUET - Duke Undergraduates Evaluate Teaching - which was designed to address questions left open by earlier studies. Teachers who have paid thoughtful attention to their SET numbers over the years will not be surprised by anything here.
Johnson's work strongly indicates that to use SET numbers exclusively, primarily, or uncritically in the evaluation of teaching is to create pressures for lower grading standards.
The main merit of the book is its well-grounded, careful analysis of how SET numbers behave statistically, and how they are biased. This tells us what to watch out for when drawing conclusions from the numbers. I begin with a non-technical and qualitative description of findings. Three of Valen Johnson's five summary conclusions, as he states them briefly and simply on p. 237, are as follows:
Knowing that such a bias exists, as many studies confirm, it remains to evaluate various theories that might explain the bias: Perhaps the class that gets higher grades is more capable; or more interested in the subject matter; or has been more effectively taught. If such were so, then the bias would be benign. But in the light of the DUET study, those hypotheses do not hold up. The best explanation is the grade-attribution theory: Students have a certain measurable tendency to attribute success in academic work to themselves, and failure to external sources.
Johnson discusses a whole array of biases that are present, and considers ways to improve survey instruments. He confirms that feedback to teachers from certain survey items is associated with improvements in teaching. Nevertheless, he writes, the use of SET numbers as measures of overall teaching effectiveness has been an "unqualified failure" (p. 151); the sensitivity of SET numbers to biases like grading leniency and other extraneous factors remains "unacceptably high" (p. 165).
"Grade inflation" is a flawed term. It suggests an analogy to price inflation. If prices remain stable for a time after a period of inflationary increases, the harm comes to an end. By contrast, grades rise toward the ceiling of 4.0. The effect is a permanent degradation of the grading system, a shrinkage of the scale that is in actual use - "grade compression." Thus in the case of grades, reform may require some degree of "deflation." The longer we wait, the more difficult reform will be.
The task of upholding sound standards, and of assigning grades fairly and appropriately, so that the distinction between one grade and the next is valid, is a difficult component of teaching. It can be both difficult and unpleasant in the absence of support from university policies and administrators.
Overall grade distributions have been rising steadily at LSU and at many other universities. The rise in grades has not been uniform, but has run more rapidly in some programs than in others. Within a single program one often sees not a steady rise in all courses, but major sudden upward lurches, first in one course and then in another.
The problems that are present in our grading practices need to be analyzed and addressed locally, within academic units, in a discipline-specific way. Nevertheless, they must also be appreciated globally.
University-wide, we should consider the interaction between our grading practices and those of the high schools. LSU has increasingly used the high school academic GPA (HSAGPA) as a weight-bearing element of freshman-admissions criteria. A study was done in fall 2002 by the Office of Budget and Planning, considering students entering LSU as new freshmen from LSU's top 50 feeder schools. Some of the results were as follows:
In the perplex of grades and standards, there are curricular issues as well. The academic profile of our entering class is much stronger now than in 1985. Perhaps we owe them an upward adjustment in our expectations, in our standards, and in the level of general graduation requirements.
In sum: Our consideration of "grade inflation" ought to include the full set of related issues. The essential point is that faculty should take hold of that which is clearly within their purview and responsibility - the quality of academic standards and programs.
Valen Johnson's book understandably focuses on the use and abuse of SET numbers, which is surely a central problem. If teachers - particularly the GAs and others with less job security - perceive that we do not care about grading practices, and that their interests lie in raising their average SET number by +0.4 points, then the effect is to press downwards on academic standards. Section 5 of this review discusses how to reform the use of SET numbers. But first, Section 4 will speak of remedies for grade inflation, broadly defined and considered.
The author discusses a number of remedies. I quote in part his descriptions of the first two, which are similar to recommendations in the Faculty Senate Resolution:
Surely those two proposals are essential, and ought to be undertaken without delay. My opinion is that such measures are also very promising. In sum: Shine a light on the problem, confirm the authority of faculty, and stop the absent-minded pressures to lower standards. In particular, restrain and reform the use of SET numbers.
Understanding that the grading bias exists, and that it is not explainable as benign, we should become wiser about SET numbers. They can still be used without harm if they are used with restraint and good judgment; if other indicators of teaching quality are also used; and if the importance of appropriately high grading standards is steadily affirmed as - indeed - a component of good teaching.
Remember that we are talking about only a measurable statistical tendency for the student's grade to predict how he or she will rate the teacher. The analysis does not say that the SET numbers are totally corrupted by this bias. The student's grade is not the only predictor of the rating. In fact, the best predictor of a student's rating is the consensus rating by other students in the class (p. 95). More precisely put: As a predictor of how a student will rate you as a teacher, the grade received is about one-quarter to one-half as important as the consensus rating by the class (p. 115).
Thus the studies themselves give us no reason to believe that the student consensus, if adjusted for this bias, is anything but a conscientious evaluation of the teacher. Depending on the level of the class and other circumstances, one might of course reach various conclusions about the validity and significance of the evaluation. In any event, we have an interest in how well satisfied our students are with our teaching. That has an importance in itself. The businesslike and courteous treatment of students, personal concern, and other teacher behaviors ought to be encouraged, whether or not we can prove their correlation with teaching effectiveness.
There are surely more ways than one to design a reasonable and moderate policy for the evaluation of teaching, eschewing both the prevalent practice of using SET numbers excessively, on the one hand, and the extreme notion of ignoring student opinion altogether, on the other. I offer for discussion the following set of guidelines as one possibility:
Let me put it another way. It seems to me that the professional standing of faculty, the principles of academic freedom, and the effective promotion of good teaching require restraint. To wit:
Surely, wisdom does not lie in complacency about the state of grading practices. Just as surely, moderate remedies should be tried first, with patience and persistence. At the same time, it is an instructive exercise to think through stronger measures which have been proposed and sometimes applied. They tend to consist of the insertion, into the system, of mathematical processing schemes, which claim to repair the results of bad practice and/or to induce better practice. In each case, however, we are left with questions: Is the logic of the scheme complete? What will be the actual effect on practices, once the processing scheme is in place? Will that be a good effect? Will it be the intended effect?
The author's presentations of Remedies 3-7 illuminate the prevalent injustices and dysfunctions. Nevertheless, the ultimate result may be only to send us back to Remedies 1 and 2 with renewed determination.
Johnson calls Remedies 3 and 4 "radical" (pp. 240-241). He calls Remedies 5, 6, and 7 "centrist" and "minimally invasive" (pp. 244-245).
After an immersion in Johnson's relentlessly serious analysis, a reader may want an antidote to restore humor and balance. When I presented the Resolution, I tried to express a temperate, decaffeinated attitude in my remarks. But the worried reaction of a few, not quite hearing me, was understandable. So I'll repeat: The Senators who supported the Resolution, by and large, don't believe that grades are everything. We do not wish to make every increment of student effort into a Skinnerian moment. Our hopes are modest. Many of us made some Bs ourselves, and appreciate the teachers who acquainted us with a meaningful A-standard; we'd like to return to the day when a B was a good grade.
Senator Jim Catallo recommended the book by Alfie Kohn entitled Punished by Rewards (431 pages, Houghton Mifflin, Boston, 1993). When Kohn was a student in Introduction to Psychology, he was required to train caged rats. He turned in a lab report written from the rat's point of view, and the instructor was not amused. His book seriously questions our society's system of grades, prizes, and other incentive systems. Fair enough. I recommend his remarks on college grading as an antidote to behaviorist extremes. But don't miss his explanation of why he accepts fees for talks.
Those who deny or celebrate grade inflation are perhaps happy with it because it is, in effect, destroying the grading system. Kohn is one of those. He is a bit of an educational anarchist. Surely if we have the grace required to make anarchy work - to make a gradeless system work - then we certainly have the grace to make our classical grading system work with integrity, balance, and fairness; and that's what we ought to do.