Friday, March 05, 2010

Grading Article In Teachers College Record

This was published yesterday, March 4th.  I'll keep it posted until the TCR makes their version publicly available next month (right now it's only available to subscribers).

Grading in American Colleges and Universities
Stuart Rojstaczer and Christopher Healy

Here we report on historical and recent grading patterns at American four-year colleges and universities. Records of average grades show that since the 1960s, grading has evolved in an ad hoc way into identifiable patterns at the national level. The mean grade point average of a school is highly dependent on the average quality of its student body and whether it is public or private. Relative to other schools, public-commuter and engineering schools grade harshly. Superimposed on these trends is a nationwide rise in grades over time of roughly 0.1 change in GPA per decade. These trends may help explain why private school students are disproportionately represented in Ph.D. study in science and engineering and why they tend to dominate admission into the most prestigious professional schools. They also may help explain why undergraduate students are increasingly disengaged from learning and why the US has difficulty filling its employment needs in engineering and technology.

Grades are the primary currency of academia. They are used by teachers to motivate students and by graduate schools, professional schools, and employers to identify promising candidates. Yet there is so much variability in grading from one school to the next and between various major areas of study that mistakes in evaluation are commonplace. There have also been calls to abandon grades, but it seems highly unlikely that grade-less classes in institutions of higher learning will become anything more than a small component of academe.

A key problem in terms of evaluation and assessment of undergraduates is that there is no transparency in the nature of grading in American colleges and universities. In its place there is collective hearsay and conventional wisdom, and as is common with such pieces of information, this collective “knowledge” is often completely wrong.

Institutional variability in undergraduate grading is a longstanding issue in the United States. One can find spirited discussions on this topic from the early twentieth century (e.g., Foster, 1912). Over the last forty years, however, interinstitutional variability has increased significantly. Conversely, at the intra-institutional level, mean grades have increased to such a degree that grade variability is highly compressed at the upper end.

We’ve looked at contemporary grades from over 160 colleges and universities in the United States, with a combined enrollment of over 2,000,000 students, and historical grades from over 80 schools (listed below), in a first attempt to provide the transparency necessary both to make meaningful evaluations of student achievement possible and to point to problems with current grading practices at the undergraduate level in the US.

We first briefly look at trends in grading over time. Grade inflation has been an issue in academia since the 1960s (e.g., Juola, 1976; Suslow, 1976). Our analysis of grading histories suggests that while the rapid rise of grades in the 1960s was a unique occurrence, grades rose measurably in the 1930s and 1940s as well (Figure 1). Like any currency, it would seem that grades are intrinsically subject to inflation.

Figure 1. Average GPA over the time period 1930-2006 as a function of school type. Grey dots represent individual data points. Colored squares represent the mean GPA for each school type over time. Suslow (1976) shown for comparison.

The rise in grades in the 1960s correlates with the social upheavals of the Vietnam War. It was followed by a decade of static to falling grades. The cause of the renewal of grade inflation, which began in the 1980s and has yet to end, is subject to debate, but it is difficult to ascribe this rise in grades to increases in student achievement. Students’ entrance test scores have not increased (College Board, 2007), students are increasingly disengaged from their studies (Saenz et al., 2007), and the literacy of graduates has declined (Kutner et al., 2006). A likely influence is the emergence of the now common practice of requiring student-based evaluations of college teachers. Whatever the cause, colleges and universities are on average grading easier than ever before.

Private and public schools graded similarly until the 1950s when grading practices for these schools began to bifurcate.  The reasons for this bifurcation are not fully understood, but it was during this time that quantitative measures of undergraduates took hold in graduate school and professional school admissions. It appears that sometime in the 1950s to 1960s, the major purpose of grading at colleges and universities changed from an internal measure and motivator of student performance to a measure principally used for external evaluation of graduates. As a Yale dean noted about Yale’s abandonment of their traditional qualitative assessments in favor of the common four point grading system, “We wanted to force graduate schools to look at the student, not at a grade point average. But to a large extent, our effort has been frustrated” (Polan, 1970).

In response, private schools – more so than public schools – raised their grades. In the words of one late faculty member from Dartmouth, “we began systematically to inflate grades, so that our graduates would have more A's to wave around” (Perrin, 1998). The GPA gap between the private and public schools widened through the 1970s, and has stabilized since the 1980s.

Looking at finer scale variability, we find that colleges and universities have, without any collective consultation or external pressure, created an ad hoc national grading scale based on school selectivity. Our database indicates that current grades at an institution can be roughly predicted by either of the following two formulae:

Average GPA = 2.8 + 0.005*SEL + (if school is private, add 0.2)
Average GPA = 2.8 + 0.001*(SATMV - 850) + (if school is private, add 0.1)

where SATMV is the combined average Math and Verbal SAT score of students and SEL is a selectivity measure that represents the average of the percentage of students with high school GPAs above 3.75, the percentage of students who graduated in the upper 10% of their high school class, and the percentage of student applicants rejected.
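The two formulae above can be expressed as a short Python sketch. The function names and the worked example are ours, not from the article; SEL is the selectivity measure on a roughly 0-100 scale and SATMV is the combined Math + Verbal SAT score, as defined above:

```python
def predicted_gpa_from_selectivity(sel, private):
    """Predicted average GPA from the selectivity measure SEL
    (mean of three percentages, so roughly 0-100)."""
    return 2.8 + 0.005 * sel + (0.2 if private else 0.0)

def predicted_gpa_from_sat(satmv, private):
    """Predicted average GPA from the combined average Math + Verbal SAT score."""
    return 2.8 + 0.001 * (satmv - 850) + (0.1 if private else 0.0)

# Illustrative example (our numbers): a private school whose students
# average 1200 on the combined SAT would be predicted to have a mean
# GPA of 2.8 + 0.001 * 350 + 0.1 = 3.25, while a public school with a
# selectivity measure of 60 would be predicted at 2.8 + 0.3 = 3.1.
print(predicted_gpa_from_sat(1200, True))
print(predicted_gpa_from_selectivity(60, False))
```

Comparing a school's observed average GPA against these predictions gives the deviation analysis summarized in Figure 2.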

The above two equations suggest that private schools grade 0.1 to 0.2 higher on a 4.0 scale for a given talent level of student. Since the evidence indicates that private schools in general educate students no better than public schools (Pascarella and Terenzini, 1991), private schools are apparently conferring small but measurable advantages on their students through more generous grading. Private schools also have on average students from wealthier families, and the effect of our nation’s ad hoc grading policy is to confer unfair advantages on those with the most money.

It is perhaps easy to see why graduates from certain private schools dominate placement in top medical, law, and business schools, and why certain private schools are overrepresented in Ph.D. study (Bernstein, 2003; Burrelli et al., 2008). These schools grade easier, and there is a tendency for graduate schools, professional schools, and some employers to confer extra stature on those who have attended selective private schools. Also, the fact that students from private schools tend to come from wealthier homes means they can stay in school longer.

Not all schools follow the patterns above. There are many that grade significantly higher or lower than their peer schools. Applying the formulae above to observed grades, we can look at how far schools deviate from predicted grading (Figure 2). Private liberal arts colleges deviate considerably from the grades predicted by the two equations above, particularly on the low end. Public flagship schools are significantly positively skewed in their grading relative to other public schools. What is perhaps most striking is that science and engineering schools – the MITs and Georgia Techs of academe – grade on average about 0.15 lower than their non-science-and-engineering peers. This may be why these schools tend to have lower retention levels; it also means that their graduates are disadvantaged in terms of their post-graduate prospects.

Figure 2. Histogram of differences between the average GPA predicted by both SAT score and selectivity as a function of school type. Number of schools examined for each school type is given in parentheses.

Examination of grading patterns as a function of major discipline area indicates that while grades tend to be lower across the board at science and engineering schools, part of the reason that students at these schools have relatively low grades is simply that they take relatively more science courses (Figure 3). Nationally for all colleges and universities, science departments grade on average roughly 0.4 lower on a 4.0 scale than humanities departments and 0.2 lower than social science departments. These differences have been thought to be a relatively recent phenomenon (Sabot and Wakeman-Linn, 1991), but they have been present for 40 years, and sparse data from the 1930s and 1940s suggest that they have been present for over 70 years. Contrary to conventional wisdom, engineering departments tend to grade significantly higher than science departments for the schools in our database.

Figure 3. Differences in GPAs between disciplines. Grey symbols represent the differences between disciplines for individual schools. Large colored symbols represent averages for all schools by decade.

The overall low grades in science departments at most schools, and the relatively low grades at science and engineering schools, translate into disincentives for students to enroll in science and engineering at the undergraduate level or to pursue graduate study (Sabot and Wakeman-Linn, 1991; Johnson, 2003). They also put science students at a disadvantage for receiving academic honors at most schools. Given our current ad hoc grading system, it is perhaps not surprising that the US has to rely heavily upon foreign-born graduate students for technical fields of research and upon foreign-born employees in its technology firms (e.g., Cohen and Fields, 1999).

Many of the problems with grading today could be rectified if those who evaluated graduates had readily available measures of how individual institutions and major discipline areas grade, so that side-by-side comparisons could be made. As a first attempt to establish more transparency, we have created a public online database of grades. It is our recommendation that a national public database be assembled that involves all colleges and universities. It would remove much of the unnecessary and harmful mystery about undergraduate grading. It may partly serve to remove the current bias toward graduates from families of wealth in graduate school and professional school admissions, and it may help minimize the disadvantages science and engineering students face because of relatively stringent grading.

A public national database of grades, however, would not likely deter grade inflation. Currently, at private colleges and universities in our database, the average GPA is 3.3. At public schools, it is 3.0. At such high grade levels, where almost everybody is automatically above average, it can be difficult to motivate many students to excel (e.g., Harackiewicz et al., 1997). At the upper end of GPAs, this compression lumps the achievements of outstanding students together with those of all other students.

Some may interpret the data we’ve collected as proof that it would be best if colleges and universities abandon grading altogether. We believe such a move is neither possible nor advantageous. If grades continue to rise or grades are abandoned, motivation problems likely will continue to increase. Graduate schools, professional schools, and employers will likely heavily discount students’ undergraduate achievements and instead rely on standardized test scores in making evaluations.

Our work would not have been possible without the cooperation of many university administrators, archivists and professors including: Benny Amarlapudi, Kirk Baddley, Cheryl Brown, Eric Ditwiler, James Fergerson, Donna Gilleskie, Thomas Harkins, Sarah Hartwell, Earl Hawkey, Chris Jacobson, Gary Kates, Peter Lange, Elizabeth Lieberman, Joseph Meyer, David Oxtoby, Kent Peterman, Catherine Riordan, Jon Rivenburg, Wayne Schneider, the late Marion Shepard, J. Kenneth Smail, Lynn Steen, Lawrence Summers, and Nancy Weiss-Malkiel.

Bernstein, E. (2003). Want to go to Harvard Law? Wall Street Journal, September 9, W1.
Burrelli, J., Rapoport, A., and Lehming, R. (2008). Baccalaureate origins of S&E doctorate recipients. NSF 08-311.
Cohen, S.S., and Fields, G. (1999). Social capital and capital gains: An examination of social capital in Silicon Valley. In M. Kenney (Ed.), Understanding Silicon Valley (pp. 190-217). Palo Alto, CA: Stanford University Press.
College Board (2007). Mean SAT scores of college-bound seniors, 1967-2007. Retrieved March 4, 2010.
Foster, W.T. (1912). Scientific distribution of grades at Reed College. Science, 35, 887-889.
Harackiewicz, J.M., Barron, K.E., Carter, S.M., Lehto, A.T., and Elliot, A.J. (1997). Predictors and consequences of achievement goals in the college classroom: Maintaining interest and making the grade. Journal of Personality and Social Psychology, 73, 1284-1295.
Johnson, V.E. (2003). Grade inflation: A crisis in college education. New York, NY: Springer.
Juola, A.E. (1976). Grade inflation in higher education: What can or should we do? ED129917.
Kutner, M., Greenberg, E., and Baer, J. (2006). A first look at the literacy of America's adults in the 21st century, NCES 2006-470.
Pascarella, E.T., and Terenzini, P.T. (1991). How college affects students: Findings and insights from twenty years of research. San Francisco, CA: Jossey-Bass.
Perrin, N. (1998). How students at Dartmouth came to deserve better grades. Chronicle of Higher Education, October 9, A68.
Polan, C. (1970). Present grading system hit by faculty. Yale Daily News, May 1, 4.
Sabot, R., and Wakeman-Linn, J. (1991). Grade inflation and course choice. Journal of Economic Perspectives, 5, 159-170.
Saenz, V.B., and Barrera, D.S. (2007). Findings from the 2005 College Student Survey (CSS): National aggregates. UCLA Higher Education Research Institute.
Suslow, S. (1976). A report on an interinstitutional survey of undergraduate scholastic grading, 1960s to 1970s. ED129187.

Notes on Data
Historical data (15 years or longer in length) come from the following schools. Alabama, Appalachian State, Auburn, Brown, Bucknell, Carleton, Central Florida, Central Michigan, Charleston, Colorado, Colorado State, Columbia, Cornell, CSU-East Bay, CSU-Fullerton, CSU-Sacramento, CSU-San Bernardino, CSU-San Jose, Dartmouth, Duke, East Carolina, Elon, Florida, Furman, Georgia Tech, Georgetown, Georgia, Hampden-Sydney, Harvard, Harvey Mudd, Hawaii-Hilo, Hawaii-Manoa, Hope, Houston, Indiana, James Madison, Kansas, Kent State, Kenyon, Knox, Messiah, Michigan-Ann Arbor, Michigan-Flint, Middlebury, Minnesota-Twin Cities, Missouri State, MIT, Montana State, Nebraska-Kearney, North Carolina-Asheville, North Carolina-Greensboro, North Carolina State, North Carolina-Chapel Hill, Northern Iowa, Northern Michigan, Northwestern, Ohio State, Ohio University, Penn State, Pomona, Princeton, Purdue, Reed, Rutgers, Southern Illinois, Stanford, Texas, Texas A&M, Texas State, UC-Berkeley, UC-Irvine, UCLA, UC-Santa Barbara, Utah, Virginia, Wake Forest, Washington-Seattle, Washington State, Wellesley, Western Michigan, Western Washington, Wheaton, William & Mary, Williams, Winthrop, Wisconsin-Lacrosse, Wisconsin-Madison, Wisconsin-Milwaukee, Wisconsin-Oshkosh, and Yale. 
In addition to the schools above, contemporary data come from Adelphi, Albion, Allegheny, Amherst, Arkansas-Fayetteville, Ball State, Boston University, Bowdoin, Bowling Green, Butler, Caltech, Case Western, Centre, Clarion, College of New Jersey, Columbia (Chicago), Connecticut, CSU-Fresno, Delaware, DePauw, Emory, Florida Atlantic, Gardner-Webb, George Washington, Gonzaga, Grand Valley State, Grinnell, Haverford, Idaho, Illinois at Urbana-Champaign, Iowa, Iowa State, Johns Hopkins, Kentucky, Lehigh, Louisiana State, Macalester, Methodist, Miami-Oxford, Michigan Tech, Middlebury, Minot State, Missouri-Columbia, Missouri Science and Technology, Monmouth, New York University, North Carolina-Asheville, North Carolina-Wilmington, North Dakota, Oklahoma, Old Dominion, Oregon, Oregon State, Pennsylvania, Portland State, Rensselaer, Roanoke, Rockhurst, Smith, South Florida, Southeastern Louisiana, Southern Connecticut, Southern Methodist, Southern Polytechnic, Southern Utah, Spelman, St. Olaf, SUNY-Oswego, Syracuse, UC-San Diego, Utah State, Valdosta State, Vanderbilt, Virginia Commonwealth, Virginia Tech, Washington and Lee, Washington University, Wellesley, West Georgia, Whitman, Winthrop, Wisconsin-Green Bay, Wright State and Wyoming. Sources for these data are listed at

Data for engineering grades come from the following schools. College of New Jersey, Cornell, Dartmouth, Duke, Georgia Tech, Maryland, Ohio, Princeton, Stanford, Wisconsin-Madison and Wisconsin-Milwaukee.

Data for humanities, social sciences and natural sciences grades come from those above plus the following schools. Bowdoin, Brown, Bucknell, Furman, Georgetown, Georgia, Harvard, Hawaii-Hilo, Macalester, Michigan State, Missouri State, North Carolina-Chapel Hill, Northern Arizona, Oregon, Pennsylvania, Portland State, Rutgers, Spelman, SUNY-Geneseo, Texas A&M, UC-Irvine, UC-Riverside, and Utah.

Cite This Article as: Teachers College Record, Date Published: March 04, 2010 ID Number: 15928


tex said...

Thank you for posting the piece. Interesting stuff.

Two additional issues to consider.

One, students will scream bloody murder to the department chair if they have less than a B in a course. As a colleague is fond of saying, B is the new C. Unfortunately, he is right.

Two, expect more grade inflation as we push more unprepared students into the higher education system and demand higher retention rates. North Carolina's university system is considering a proposal that ties retention rates and graduation rates to state funding levels.

My current employer wants more students and higher retention rates. The future of our society, the school, and the faculty be damned.

Tom Finnigan said...

How does this track with the Flynn effect?

Chris Bonds said...

@tex: Your point connecting retention rates to grade inflation is a good one. Where this is particularly noticeable is in at-risk academic programs where administrative decisions on continuing the program are tied to enrollment sufficient to justify the FTEs being used. Giving high grades is a good way to keep students in the program, but ultimately is bad for both the students and the reputation of the institution.

menoitios said...

I noticed grade inflation at 2 of the 3 schools at which I received degrees, none of which were in the US.

What I noticed was that the one school that lacked inflation did so for concrete structural reasons: no matter what their entry average, students were given a "dean's list" scholarship if their average at the school was an A- or greater. This could happen in any year the student attended the school -- these were not scholarships given on entry.

Furthermore students lost these and all other academic scholarships if their grades dropped below that mark.

This meant that it was not in the administration's interest to have inflation.

At other schools, the administration was willing to bend to the will of student and parent "customers".

At the school with the generous scholarship policy, departments had to justify grades. The administration was willing to push back when students and parents complained.

Now the willingness to enact such policies may have a common cause with the correlations in this study. I.e. the school in question was not an old elite university, but rather a 20th century foundation.

This, however, does not change the fact that the policy itself helped hold the line.

maia said...

Speaking from Britain: 1) science and maths grads get higher grades because you can't deny them a first. Grade inflation means professors are careful to award firsts to hardly anyone (our scale: first/1, upper second/2:1, lower second/2:2, third/3); everyone gets a 2:1, so they react by denying firsts as much as possible.
2) grade inflation is real higher achievement. I went to uni 20 years ago and then returned aged 40 (about 10% of undergraduates are 'mature students' in the UK). We used to know that if we got a degree, we'd get a good job. Now we know that with less than a 2:1 no employer will look at you, so you work so hard....
The only way to end grade inflation is to stop terrifying students that if they get less than top marks they'll end up as cleaners; employers and the public have to stop focusing on 'how can we separate out the geniuses' and accept that it's possible to be a good employee with a 2:2. But every year when exam results arrive, the news organisations chorus 'it's shameful! they get bigger marks each year! it's rigged!', so the students are under even more pressure. The Chinese get high marks because they study 18 hours a day; modern students get higher marks because they study 12-16 hours a day. You can't stop students who work hard from doing well unless you rig it.
The only way to end grade inflation is to stop terrifying students that if they get less than top marks they won't end up as cleaners; employers/the public has to stop focusing on 'how can we separate out the geniuses' and accept it's possible to be a good employee with a 2:2. But, every year when exam results arrive, the news organisations chorus 'it's shameful! they get bigger marks each year! it's rigged!' so the students are under even more pressure. The Chinese get high marks because they study 18 hours a day: modern students get higher marks because they study 12-16 hours a day. You can't stop students who work hard doing well unless you rig it.