20.12.15

GRADE ON THE CURVE AT THE SWIMMING POOL.

I got in the habit of telling students that I got paid to say No and uphold standards.  That did a lot to head off the whining about "I'm-paying-for-this-course-and-not-getting-what-I-want" that I kept hearing about (almost always at some remove from the story itself).

That's not to say that the business of assigning grades is easy.  No matter how many partitions your rules offer, and no matter how subtle the criteria for distinguishing excellent from solid from mediocre from failure, you always get what I'll start referring to as the NPR Frequency Problem.  How do you reward the 89.5?

That's where the notion of curving comes in.  Strictly speaking, though, a "curve" is a monotone but nonlinear transformation of raw scores that leans on the Law of Large Numbers.  Thus my course outline for small classes always featured language to the effect of "in a class of thirty students (or fewer than 128) the population is too small to use an arbitrary curve."  You can get away with such things, particularly the 128 clause, with economics majors, after they've seen enough statistics to be fretful.  If a student raised a question about "curve" early in the class, I'd come back with "I don't want to tell three or four or seven students to expect to fail."  See where it helps to cultivate the Crusty Road Foreman of Engines persona?  The number I named was an integer close to ten percent of the enrollment that Registration and Records had on the roster.

In student-speak, however, "curving" inevitably refers to treating the top of the frequency count differently than the bottom of the frequency count, and one simple strategy to deal with that is to assure students that the top total score, whatever that is, sets the curve.

The way to demonstrate the folly of a rule is to comply with it.
Since he started teaching at Johns Hopkins University in 2005, Professor Peter Fröhlich has maintained a grading curve in which each class’s highest grade on the final counts as an A, with all other scores adjusted accordingly. So if a midterm is worth 40 points, and the highest actual score is 36 points, “that person gets 100 percent and everybody else gets a percentage relative to it,” said Fröhlich.
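
The arithmetic in that quote is simple enough to sketch.  This is an illustration, not anything Fröhlich published: the function name is made up, and every score other than the 36 is hypothetical.

```python
def curve_to_top(raw_scores):
    """Rescale each raw score as a percentage of the highest raw score.

    Assumes at least one score is positive; the all-zero case is
    deliberately not handled here.
    """
    top = max(raw_scores)
    return [round(100 * s / top, 1) for s in raw_scores]


# A 40-point midterm where the best paper earned 36 (other scores hypothetical).
print(curve_to_top([36, 27, 18]))  # [100.0, 75.0, 50.0]
```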

This approach, Fröhlich said, is the "most predictable and consistent way" of comparing students' work to their peers', and it worked well.
That's because there's a Prisoner's Dilemma present in the policy, which formed the basis for a problem I often assigned.  I'd describe such a policy and then ask, loosely, "Why aren't professors concerned about students cooperating to rig the curve?"  A clear-on-the-concept answer would note that the dominant strategy is to defect.
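
For readers who want to see the payoffs, here is a minimal sketch of that exam question.  The assumptions are mine: a student who sits the exam earns a hypothetical raw score of 30, classmates' scores are invented, and an all-zero class is graded as 100 percent for everyone.

```python
def curved_grade(my_raw, others_raw):
    """Percentage under a top-score-sets-the-curve rule.

    Assumption: when every raw score is zero, everyone gets 100.
    """
    top = max([my_raw] + list(others_raw))
    if top == 0:
        return 100.0
    return 100.0 * my_raw / top


MY_EXAM_SCORE = 30  # hypothetical raw score if I sit the exam

# Each scenario lists the raw scores the rest of the class posts.
scenarios = {
    "everyone else boycotts": [0, 0, 0],
    "one classmate defects": [0, 36, 0],
    "everyone else writes": [28, 36, 33],
}

for label, others in scenarios.items():
    boycott = curved_grade(0, others)
    write = curved_grade(MY_EXAM_SCORE, others)
    print(f"{label}: boycott={boycott:.1f}, write={write:.1f}")
```

Under those assumptions, sitting the exam never does worse than boycotting, and it does strictly better the moment anyone else shows up.  That is the sense in which defecting dominates.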

Except when students don't defect.
As the semester ended in December, students in Fröhlich’s “Intermediate Programming,” “Computer System Fundamentals,” and “Introduction to Programming for Scientists and Engineers” classes decided to test the limits of the policy, and collectively planned to boycott the final. Because they all did, a zero was the highest score in each of the three classes, which, by the rules of Fröhlich’s curve, meant every student received an A.

“The students refused to come into the room and take the exam, so we sat there for a while: me on the inside, they on the outside,” Fröhlich said. “After about 20-30 minutes I would give up.... Then we all left.” The students waited outside the rooms to make sure that others honored the boycott, and were poised to go in if someone had. No one did, though.
I'm not sure how long a Hopkins exam is, but no self-respecting Crusty Road Foreman is going to acknowledge a strike in only twenty minutes. Or not suggest that someone reported sick and will be writing a makeup later. Or not suggest that someone had to catch a plane overseas, or had three exams within 24 hours and wrote the exam early.
Andrew Kelly, a student in Fröhlich’s Introduction to Programming class who was one of the boycott’s key organizers, explained the logic of the students' decision via e-mail: "Handing out 0's to your classmates will not improve your performance in this course," Kelly said.

"So if you can walk in with 100 percent confidence of answering every question correctly, then your payoff would be the same for either decision. Just consider the impact on your other exam performances if you studied for [the final] at the level required to guarantee yourself 100. Otherwise, it's best to work with your colleagues to ensure a 100 for all and a very pleasant start to the holidays."

Kelly said the boycott was made possible through a variety of technological and social media tools. Students used a spreadsheet on Google Drive to keep track of who had agreed to the boycott, for instance. And social networks were key to "get 100 percent confidence that you have 100 percent of the people on board" in a big class.
Gotta love that the students were using course management systems, which often make the class roster available to all students (that's not always desirable; I've heard stories from female students of males in their class hitting on them), in order to construct the set of conspirators who must be brought in. But still, policing the defector who requested some accommodation ahead of time is difficult.

And you'd think teaching a lesson about coordination failure ought to have some more desirable example in mind.
Fröhlich took a surprisingly philosophical view of his students' machinations, crediting their collaborative spirit. "The students learned that by coming together, they can achieve something that individually they could never have done," he said via e-mail. “At a school that is known (perhaps unjustly) for competitiveness I didn't expect that reaching such an agreement was possible.”
The Grumpy Old Road Foreman would note something more along the lines of "If you [snowflakes -- ed] spent half as much time studying as you're spending attempting to game the system, you wouldn't have to game the system."

But course outlines, like the Consolidated Code of Operating Rules, emerge as undesirable situations present themselves.
Although Fröhlich conceded that he did not include such a “loophole” in the policy “with the goal of students exploiting it,” he decided to honor it after the boycott.
The way to demonstrate the folly of a rule is to comply with it.

Then the Rules Examiner proposes a revision.
Despite awarding As to all the students who participated in the boycott, the experience has led Fröhlich to alter his long-held grading policy.

“I have changed my grading scheme to include ‘everybody has 0 points means that everybody gets 0 percent,’” Fröhlich said, “and I also added a clause stating that I reserve the right to give everybody 0 percent if I get the impression that the students are trying to ‘game’ the system again.” Fröhlich added that going forward, he will give students a choice between a final exam and a final project, and that his class for the spring 2013 semester has voted for the latter.
The task of writing rules, however, is to write them in a way free of or thin on ambiguity, and that "reserve the right ... if I get the impression" opens the door for all manner of grade appeals.
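
The first of the two new clauses, at least, is unambiguous enough to write down.  A rough sketch, with a function name of my own invention; the "if I get the impression" clause is left out because no rule book or program can encode it.

```python
def revised_curve(raw_scores):
    """Top score sets the curve, except that an all-zero class gets 0 percent."""
    top = max(raw_scores)
    if top == 0:
        # The added clause: everybody has 0 points means everybody gets 0 percent.
        return [0.0 for _ in raw_scores]
    return [100.0 * s / top for s in raw_scores]


print(revised_curve([0, 0, 0]))  # [0.0, 0.0, 0.0]
print(revised_curve([36, 18]))   # [100.0, 50.0]
```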

Most institutions pretending to offer higher education, however, require some statement of assignments, weights, and likely outcomes as part of the conditions of carriage, er, Syllabum Omnium.  This statement does not have to be ultra-specific.  I always included a line to the effect that "improvement matters" and another in the form "Historically, students earning 92 or more points have done no worse than an A, 84 or more points no worse than a B ..."  That gave me an easy response to the NPR Frequency Problem: somebody showing continued improvement and finishing at 89.5 (or 91.5) is more likely to have done excellent work ... and the Crusty Road Foreman would add, don't put yourself in that position by [underachieving -- ed] early on.

There's a variation on the "top score sets the curve" approach that readers, assuming you've stayed with me this long, might consider.  In any multiple-question assessment, there are no A students, only A answers.  Thus, suppose the exam has four parts, and the 36-point exam earned 10, 9, 9, and 8 on the respective parts.  Now suppose there is another exam, where someone has earned correspondingly 7, 10, 8, 6, and another that has earned 9, 6, 7, 10.  The best answers on the four parts are then 10, 10, 9, and 10.  Based on three observations, the frontier, or degree of difficulty, of the exam is 39, not 36.  And that, dear reader, is why I developed the habit of encouraging students who wanted to raise their marks to start with their own efforts, and figure out how to bring their answers up to the standard of their best answer.  And if their best answer was a 7, to then think about how to strengthen that.
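
The frontier arithmetic can be checked in a few lines.  This is only a sketch using the three papers described above; the variable names are mine.

```python
# Part-by-part scores for the three exams described above.
exams = [
    [10, 9, 9, 8],   # the 36-point paper
    [7, 10, 8, 6],
    [9, 6, 7, 10],
]

# Best observed answer on each part, taken across all papers.
best_by_part = [max(part) for part in zip(*exams)]
frontier = sum(best_by_part)
top_total = max(sum(exam) for exam in exams)

print(best_by_part)  # [10, 10, 9, 10]
print(frontier)      # 39
print(top_total)     # 36

# Each paper's total, as a percentage of the top paper and of the frontier.
for exam in exams:
    total = sum(exam)
    print(total, round(100 * total / top_total, 1), round(100 * total / frontier, 1))
```

Measured against the frontier rather than the top paper, even the best exam in the stack has room to improve.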