Adventures with Grading Contracts in Operations Classes: A Preliminary Investigation

Published Online: https://doi.org/10.1287/ited.2023.0049

Abstract

Statewide initiatives and the COVID-19 pandemic have caused us to re-examine how we deliver and evaluate students in our courses. In this paper, we discuss our experiences using grading contracts, which are commonly used in writing programs but virtually unknown in fields like operations research. We found strong undergraduate student support for the contract and no statistically significant differences in final course grades. We urge our fellow educators to consider alternatives to the traditional grading schemes that strongly favor in-class exams in evaluating students.

1. Introduction

In 2015, the California State University (CSU) system set out the Graduation Initiative 2025, with the goal of increasing graduation rates and eliminating opportunity and achievement gaps (CSU 2023b).

The mission of the CSU includes the following:

  • “To advance and extend knowledge, learning, and culture, especially throughout California.

  • To provide opportunities for individuals to develop intellectually, personally, and professionally.

  • To prepare significant numbers of educated, responsible people to contribute to California’s schools, economy, culture, and future.

  • To encourage and provide access to an excellent education to all who are prepared for and wish to participate in collegiate study.” (CSU 2023a)

The system is different from the University of California system, which is research focused and targets the top high school graduates. Rather, the CSU is focused on serving Californians broadly and providing access to higher learning to all; half of all undergraduates in California study at a CSU (CSU 2022). This leads to a remarkably diverse student body on virtually all dimensions. Our department is housed in a business school at one of the CSU campuses, providing operations research-based courses and the statistics and operations core courses required for most undergraduate and graduate majors. As a result, in addition to the diversity inherent in the CSU, we serve a student body that is less inclined toward quantitative subjects, for a variety of reasons.

Quantitatively oriented subjects have notoriously (and sometimes proudly) had high failure rates for all populations and low representation of underserved communities; see, for example, Arcidiacono et al. (2016) or Ramsay‐Jordan et al. (2022). Operations research (OR) is no different in this respect (Johnson and Chichirau 2020). The push from the system to address high failure rates and gaps, combined with intensive professional development provided around equitable instruction practices as a result of the COVID-19 pandemic, led our department to rethink how we were delivering our material to students.

In this paper, we explore one of the innovations we have introduced in some classes: the use of grading contracts. The paper is organized as follows. In Section 2, we discuss traditional grading approaches and then introduce alternatives that have been developed. Section 3 details our experiences using grading contracts at both the undergraduate and graduate levels. Finally, Section 4 provides a discussion and our conclusions.

2. Background

2.1. Traditional Grading Approaches

For decades, student grades have traditionally been assigned using a weighted sum of individual assessments and assignments. A passing grade in a science, technology, engineering, and mathematics (STEM) subject course would signify that the student possessed enough knowledge and problem-solving skills to advance to subsequent classes, thus ensuring the student’s future success. For example, in order for a student to succeed in precalculus, they must first master the skills in algebra. As such, STEM subject courses traditionally rely on high-stakes, closed-book exams to test students on the material and problem-solving techniques covered. The various course assignments would be designed to allow the students to master these techniques. Once completed, the exams and assignments are graded for correctness, as insufficient knowledge tends to result in incorrect answers.

After the individual assignments are graded, each student’s course grade can be calculated by a predetermined weighted sum of the individual assignment scores. Some instructors may elect to assign the final letter grade on a curve, which prevents unintended grade fluctuations from class to class due to assessments that are too difficult, easy, or ambiguous. However, grading on a curve tends to create a competitive and hostile learning environment for students rather than a collaborative one (Tobias and Lin 1991, Seymour and Hewitt 1997).
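As a minimal sketch of the weighted-sum calculation described above, the following Python snippet combines category scores into a single course score. The function name, category names, and weights are hypothetical and are not taken from any of the courses discussed in this paper.

```python
def weighted_course_score(scores, weights):
    """Return the weighted sum of category scores (each on a 0-100 scale)."""
    if set(scores) != set(weights):
        raise ValueError("scores and weights must cover the same categories")
    if abs(sum(weights.values()) - 1.0) > 1e-9:
        raise ValueError("weights must sum to 1")
    return sum(scores[name] * weights[name] for name in weights)


# Hypothetical weights and scores, for illustration only.
weights = {"homework": 0.20, "midterm": 0.35, "final_exam": 0.45}
scores = {"homework": 90.0, "midterm": 70.0, "final_exam": 80.0}

course_score = weighted_course_score(scores, weights)
print(round(course_score, 1))  # 78.5
```

With these illustrative weights, the two exams account for 80% of the score, which is the kind of exam-heavy scheme the rest of this section contrasts with ungrading.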

The types of questions typically used for assessments in STEM subjects include open response calculation, multiple choice, and fill-in answers. While timed in-person exams provide a degree of fairness across students, they assess only a subset of student skills (Iannone and Simpson 2011). Multiple choice questions (MCQs) are popular in larger classes as they reduce the burden of grading. However, MCQs have a potential bias for students with different learning styles and confidence levels (Sangwin 2013). Furthermore, it has long been documented that high-stakes exams induce anxiety in students, resulting in increased cases of cheating (Nichols and Berliner 2007). Anxiety and stress directly influence student confidence and persistence, which Previde et al. (2019) find to be a strong indicator for student success and graduation within a university program targeted at underrepresented minorities. Although the issues of traditional assessments have always been present, awareness of them has only become widespread in recent years.

When the COVID-19 pandemic hit in early 2020, educators across the globe were forced to provide continued education in remote modalities. It soon became obvious that alternative assessment approaches were necessary, as in-person proctoring was no longer viable. One common alternative is breaking down high-stakes exams into shorter, lower-stakes quizzes (Nagasaka 2020). In a recent study by Fitzmaurice and Ní Fhloinn (2021), out of 206 STEM instructors across 29 countries, 58% elected to use some type of formative assessment to evaluate their students after the transition to remote learning. Yet regardless of the type of assessments chosen, dishonorable collaboration between students cannot be fully prevented. Although adding an oral examination component could help detect cheaters (Niss 1998), it is not scalable for larger class sizes, and even proponents (Theobold 2021) acknowledge potential equity issues. Instead, alternative grading structures should be adapted to encourage students to learn, and to reduce, if not eliminate, their reasons to cheat.

2.2. Ungrading

Traditional grading uses a well-defined answer key or rubric to assign scores to individual assignments and assessments. A total score is then calculated by tallying up the individual scores with a predetermined formula, such as a weighted average, and translated into a letter grade. Any grading approach that deviates from the traditional grading methods is known as ungrading (Stommel 2021), or “grading for growth.” In this section, we explore two common approaches to ungrading.

2.2.1. Standards-Based Grading.

Traditional assessments are intended to evaluate and ensure student learning outcomes. However, when students are incentivized by one final grade, their motivation to learn may be diminished. Instead, students are incentivized to seek out the shortest path to high scores, such as choosing the simplest required assignment options or even cheating (Kohn 2011). High-stakes assessments such as final exams increase anxiety and stress for students, which can be counterproductive and create negative learning outcomes, as well as health concerns (Shankar and Park 2016). (This is especially true for traditionally underserved students, which can lead them to leave majors where high-stakes assessments are common (Seymour and Hewitt 1997).) Standards-based grading checks for the desired student learning outcomes without using a rigid grading system, using the so-called “ungrading” approach.

Like traditional grading, standards-based grading still employs assignments such as homework and projects, as well as quizzes and exams in some cases. Unlike traditional grading, however, standards-based grading does not use a weighted average formula to determine the students’ final grades. Instead, standards-based grading checks for student learning outcomes in small increments, and the students’ final grades reflect how much of the course learning target they are able to meet at the end of the course (Clymer and Wiliam 2006). For example, to ensure a student understands how to take the first-order derivative after taking a precalculus course, in place of an exam, the student could be given a homework assignment with unlimited attempts. An incorrect answer would not be penalized, and a new but similar problem would be offered to the student on each subsequent attempt until they are able to solve the problem. Shada et al. (2011) study the impact of using ePortfolios in place of traditional assessments. These ePortfolios allow for more self-reflective assessments by students and formative assessments by instructors. The students, as a result, are better able to build their academic identity and make connections across their coursework, resulting in higher retention rates. Overall, standards-based grading benefits students by reducing anxiety, while still holding student learning outcomes to an acceptable standard.

2.2.2. Grading Contracts.

Labor-based grading contracts, or simply “grading contracts,” are an alternative approach to standards-based grading that focuses on student engagement rather than predetermined benchmarks (Inoue 2019). For example, a contract might include the number of revisions a student must make to their essay in order to receive a specific grade. It has been shown that the students’ active learning is positively correlated with traditional test scores (McMurtrie 2022). Therefore, it would be natural to design a grading contract that encourages active learning in place of traditional assessments.

Like standards-based grading, the use of grading contracts is also effective in reducing anxiety and stress for students (Watson 2021). In addition, many assignments involve some degree of subjectivity, which can raise the question of fairness in grading (Elbow 1993). The use of grading contracts reduces instructor bias in grading, whether implicit or explicit (Chin et al. 2020), thereby ensuring more equitable learning outcomes for underrepresented minorities, who are often systematically marginalized from educational opportunities (Dumas 2016, Santos 2022).

Although grading contracts are effective in subjects where “practice makes perfect,” such as literature and writing (Ward 2021), their application to STEM courses may not be as straightforward. One main difference between STEM and writing courses is that the student learning outcomes in STEM courses usually include specific knowledge and problem-solving abilities. Ensuring these learning outcomes are met is essential in preparing students for success in subsequent courses, which are likely to build on these outcomes. Khan highlighted the importance of filling those knowledge gaps during one of his TED talks in 2015 (Khan 2015). Cangialosi (2018) discusses some of the ungrading approaches that are suitable for STEM subjects, such as group discussion and research projects. In the next section, we discuss in detail our own experiences in employing grading contracts in both quantitative and writing courses.

2.3. Process of Learning

There is a vast body of literature on “learning”: what is it, how does it take place, and how can it be encouraged. Our goal is to encourage a so-called “deep approach” to learning, as opposed to a surface approach. In the deep approach, students incorporate new information into existing knowledge by questioning and seeking to understand. (A detailed discussion of learning can be found in National Research Council (2000).)

Felder and Brent (2005) summarize a list of eight classroom instruction features that have been found to be correlated with a deep approach to learning:

  1. Interest in and background knowledge of the subject encourage a deep approach; lack of interest and inadequate background discourage it.

  2. Clearly stated expectations and clear feedback on progress encourage a deep approach; poor or absent feedback discourages it.

  3. Assessment methods that emphasize conceptual understanding encourage a deep approach; methods that emphasize recall or the application of routine procedural knowledge discourage it.

  4. Teaching methods that foster active and long-term engagement with learning tasks encourage a deep approach.

  5. Opportunities to exercise responsible choice in the content and method of study encourage a deep approach.

  6. Stimulating and caring teaching encourages a deep approach; apathetic or inconsiderate teaching discourages it. A corollary is that students who perceive that teaching is good are more likely to adopt a deep approach than students with the opposite perception.

  7. An excessive amount of material in the curriculum and an unreasonable workload discourage a deep approach.

  8. Previous experiences with educational settings that encouraged deep approaches further encourage deep approaches. A similar statement can be made regarding surface approaches.

We argue that items 2–5 are more or less directly related to grading contracts. In a sense, item 6 is also related insofar as the instructor is showing concern about students by giving the option of the contract in the first place.

Metacognition is another well-known pedagogical tool for deepening learning (Winne and Azevedo 2014). It allows the learner to think more about the learning, which serves to further incorporate the new knowledge into existing neurological structures in the brain. It can be especially effective if it takes place a certain amount of time after the initial learning (Son 2010, Vlach and Sandhofer 2012).

3. Our Experiences

3.1. Classes Used

At the undergraduate level, grading contracts were used in two upper-division classes: UG1, Computer Simulation, and UG2, Communication for Business Analytics, taught by the same instructor. Both classes are in a list of concentration courses, from which students select four, and both are generally taken by seniors. The simulation class is typically rather challenging for business students as it requires coding and debugging skills, as well as the ability to model ill-defined systems. Because learning and practicing these skills can be very time-consuming and frustrating, students experience a lot of anxiety about their grades. The low-stakes grading scheme used for the class was designed to reduce anxiety but did not have that desired effect. For example, rather than taking high-stakes exams, students take weekly quizzes and can drop one quiz score for every five quizzes taken. The final project, which constitutes the bulk of the course grade, consists of many components, none of which is timed or accounts for more than 15% of the final grade. The homework assignments that take so much time make up only 10% of the final grade and are graded on the effort and work shown rather than the final answer. (The grading scheme is discussed in more detail in Section 3.2.)
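The quiz policy in UG1 can be illustrated with a short Python sketch. It assumes that the dropped scores are the lowest ones and that one drop is earned for each complete set of five quizzes; the text does not spell out either detail, and the function name is hypothetical.

```python
def quiz_average_with_drops(quiz_scores, quizzes_per_drop=5):
    """Average quiz scores after dropping one score per `quizzes_per_drop` taken.

    Assumptions: the lowest scores are the ones dropped, and a partial
    set of quizzes earns no drop (e.g., 12 quizzes -> 2 drops).
    """
    if not quiz_scores:
        raise ValueError("at least one quiz score is required")
    n_drops = len(quiz_scores) // quizzes_per_drop
    kept = sorted(quiz_scores)[n_drops:]  # discard the n_drops lowest scores
    return sum(kept) / len(kept)


# Ten hypothetical quiz scores: two drops are earned, so 40 and 55 are removed.
example_scores = [80, 90, 40, 70, 100, 85, 55, 95, 75, 60]
print(quiz_average_with_drops(example_scores))  # 81.875
```

Without the drops, the same ten scores would average 75, so under these assumptions the policy mainly cushions the effect of one or two bad weeks.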

The communications class is less stressful in that there is no fixed quantitative component. On the other hand, students do not receive grades for their writing until they submit final drafts, which can take a while. (Each student is allowed two revisions per assignment.) Writing can also be difficult to assess, especially for students. Because grading contracts originated in the world of teaching writing, it is a natural step to use them in this class.1

At the graduate level, a grading contract was used by a different instructor in G1, Operations Analysis, a required foundation course for MBA students who did not take an operations management class in their undergraduate program. It is typically taken in the first or second semester of the program. Most students who were not business majors thus need to take this class. Our experience is that undergraduate STEM majors tend to be well equipped to handle this course, but often students who earned liberal arts degrees have had little experience taking quantitative courses, and many have math phobias. Even with the pandemic revision of this class to eliminate high-stakes assessments and calculate the overall course grade based on the best 11 of 12 quizzes or assignments, repeatable practice quizzes, and optional labs for extra credit, some students still expressed trepidation about formula-heavy classes such as this. Both sessions of G1 under study were delivered in a 100% asynchronous format, making this course easier to fit into students’ schedules but providing fewer natural opportunities for peer-to-peer interaction and informal tutoring. Table 1 summarizes the courses discussed in this paper.

Table 1. Summary of Courses Using Grading Contracts

| Course | Course title | Student population |
| --- | --- | --- |
| UG1 | Computer Simulation | Undergraduate juniors and seniors |
| UG2 | Communication for Business Analytics | Undergraduate seniors |
| G1 | Operations Analysis | First-year MBA students |

3.2. Contracts

The “conventional” grading breakdown used in UG1 is given in Figure 1. Peer evaluations consist of both feedback students give each other on presentations (homework, proposal, and final project), as well as the feedback they give on their group members’ participation in group work. The focus is on providing constructive feedback, both positive and negative. Participation can be either in class or engaging online via Slack or email with the rest of the class or the instructor. The class has a total of eight homework assignments, two of which are completed in report format by project groups. Each project group must also select a single homework problem to present to the class.

Figure 1. “Conventional” Grading Breakdown for UG1

The grading contract for UG1 is given in Figure 2. It puts in writing the grading practices already in place, but difficult for students to remember or to “believe.” A student’s final course grade is the higher of the conventionally calculated grade and the grade according to the contract. (This was done to ease students into the idea of contract grading as a “no lose” scenario.) The draft contract was presented to students during the first class meeting and then discussed and voted on in the second meeting. (There were no proposed changes.) The link to the contract was on both the syllabus and the course web page.

Figure 2. Proposed Grading Contract for UG1

The contract for UG2 was managed similarly to that in UG1. Figure 3 shows the conventional grading breakdown, and Figure 4 shows the contract. Because the class is a writing-intensive class (60% of the course grade is based on writing), the grading contract was easier to construct, given the long history of grading contracts in writing classes. The borrow/lend log is a metacognitive exercise that asks students to reflect on three things they gained (“borrowed”) and three things they contributed (“lent”) to a given class period and assess their participation in the class.

Figure 3. “Conventional” Grading Breakdown for UG2
Figure 4. Proposed Grading Contract for UG2

In both UG1 and UG2, once the class had voted to approve the contract as an option, the instructor determined student course grades the conventional way and via the contract. Students were given the higher of the two grades automatically.
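The “higher of the two grades” rule can be sketched in a few lines of Python. The letter-to-grade-point scale below is an assumption (a common 4.0 scale, consistent with the B+ = 3.3 and A− = 3.7 values given in Section 3.3), and the function name is hypothetical.

```python
# Assumed 4.0 grade-point scale; the paper does not list the full scale.
GRADE_POINTS = {
    "A": 4.0, "A-": 3.7, "B+": 3.3, "B": 3.0, "B-": 2.7,
    "C+": 2.3, "C": 2.0, "C-": 1.7, "D+": 1.3, "D": 1.0, "D-": 0.7, "F": 0.0,
}


def final_grade(calculated, contract):
    """Return whichever of the two letter grades carries more grade points."""
    return max(calculated, contract, key=GRADE_POINTS.__getitem__)


print(final_grade("B", "B+"))  # B+ (the contract grade is higher)
print(final_grade("A-", "B"))  # A- (the calculated grade is higher)
```

Because the rule takes a maximum, a student's recorded grade can never fall below the conventionally calculated one, which is what makes the contract a “no lose” option.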

For G1, the grading contract was explained in a video at the beginning of the course. Although no voting or option for alteration was provided as with UG1 and UG2, the contract was presented as an option to the traditional grading approach shown in Figure 5. (LMS refers to the learning management system, in this case, Canvas.) In the last week of the course, after students received graded feedback on all the quizzes and other assignments, they had the option to complete the contract in Excel format, where they would fill the first page in with all their grades and their grade would be automatically determined via the formula shown in Figure 5. The effort-based contract on the second page is shown in Figure 6 and clarifies the expectations for a B− and the additional step-ups or decrements that would happen.

Figure 5. “Conventional” Grading Breakdown for G1
Figure 6. Optional Grading Contract for G1

To stay enrolled in the graduate program, students must maintain at least a 3.0 GPA, so a B− would be considered slightly below program standards but easily balanced by a higher grade in another course. Along with each step, students were asked to provide a brief self-assessment on how this particular grade component contributed to or negatively impacted their learning. The spreadsheet would then calculate the grade, and a final free-form comment box was provided for students to craft a personal reflection on the contract grade. To better support grading transparency, this spreadsheet was available for download at the start of the term, and some students reported using it during the course to determine their progress. Although not formally measured, this file seemed to reduce the number of “what do I need to do to get a [desired grade]” email queries.

3.3. Grades

A total of 61 and 47 students completed UG1 and UG2, respectively, over the three semesters in which a contract was used. Table 2 gives data on the average course outcomes and shows the number of times the grading contract made a difference for the positive and negative, as well as the results of tests of the mean differences.2 (In practice, when the contract would have reduced a student’s grade, students were given the higher of the calculated and contract grades so were not adversely affected by the contract experiment.)

Table 2. Comparison of Grades Under the Two Approaches

| Measure | UG1 calculated | UG1 contract | UG2 calculated | UG2 contract | G1 calculated | G1 contract |
| --- | --- | --- | --- | --- | --- | --- |
| Average GPA | 2.89 | 3.12 | 3.13 | 3.39 | 3.41 | 3.20 |
| Std dev of GPA | 1.03 | 0.98 | 0.66 | 0.55 | 0.98 | 0.94 |

| Measure | UG1 | UG2 | G1 |
| --- | --- | --- | --- |
| Sample size | 61 | 47 | 28 |
| No. of times contract improved grade | 26 | 27 | 6 |
| No. of times contract reduced grade | 0 | 5 | 12 |
| No. of times no difference | 35 | 15 | 10 |
| Greatest increase | Five grade steps (once) | Four grade steps (once) | Six grade steps (once) |
| Greatest decrease | | Three grade steps (once) | Five grade steps (once) |
| Difference (significance) | 0.23 (0.0087) | 0.28 (0.13) | −0.17 (0.077) |

The contract had a positive impact for 43% and 57% of students in UG1 and UG2, respectively. The majority of the changes were by one grade step (e.g., B to B+). One student improved by five grade steps in UG1, whereas one improved by over a letter grade (four grade steps) and one lost a whole letter grade in UG2. In UG1, the average difference in grade point is 0.23 (contract better), which just misses significance at p = 0.087. In UG2, the difference is 0.28, which is also not significant, with p = 0.13. Although there are a few outliers, the differences are, on average, less than a grade step, and this difference is not significant.3 Figure 7 shows the distributions of the grade changes for UG1 and UG2: red indicates the contract grade is lower than the traditional grade, green that the contract grade is higher, and blue for no difference. These figures show that most differences between the two grading methods are minimal.

Figure 7. Distribution of Grading Differences (Contract Grade – Traditional Grade) for UG1 and UG2

Historically, grades tend to be higher in G1, with class averages hovering between B+ (3.3) and A− (3.7). Of the 28 students enrolled across both sections of G1, 10 filled out the grading contract. Given the novelty of the contract, the instructor reviewed all contracts and sometimes needed to make corrections when students, inadvertently or not, misreported information such as the number of extra credit labs they participated in. Of the 10 students who completed contracts, 6 earned a higher grade than they would have under the pure calculation method, 1 earned the same grade, and the remaining 3 would have received a lower grade (but did not, because the calculated grade was used instead). With the exception of one student who was able to increase their grade from a D+ to a B+, these improvements were slight, one to two steps for each of the remaining five students. This pushed the overall course average across both sections up slightly from 3.34 to 3.56.
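To make the notion of a “grade step” concrete, the following Python sketch counts the steps between two letter grades. The ordering of letter grades (with plus and minus grades and no A+) is an assumption about the grading scale, and the function name is hypothetical.

```python
# Assumed ordering of letter grades from lowest to highest (no A+).
GRADE_ORDER = ["F", "D-", "D", "D+", "C-", "C", "C+", "B-", "B", "B+", "A-", "A"]


def grade_steps(calculated, contract):
    """Signed number of grade steps from the calculated grade to the contract grade."""
    return GRADE_ORDER.index(contract) - GRADE_ORDER.index(calculated)


print(grade_steps("B", "B+"))   # 1  (one grade step up)
print(grade_steps("D+", "B+"))  # 6  (the largest G1 improvement)
print(grade_steps("B", "C"))    # -3 (a whole letter grade down)
```

A positive value means the contract grade is higher than the calculated grade, matching the sign convention (contract grade minus traditional grade) used in the figures of grading differences.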

The instructor also determined what the contract grade would have been for the 18 students who did not fill out the contract. None of them would have benefitted from doing so; half of them (nine) would have earned the same calculated grade, and the other half would have done worse by using the grade determined by contract. In fact, the overall average grade for contract grading would be slightly lower if that was the sole method of grading. In part this is due to the contract placing more weight on completing extra credit labs, participating in forum discussions, and retrying the practice problems until full mastery had been demonstrated. Students who were already doing well on the timed quizzes often did not do as many of the extra credit labs, for example. Figure 8 depicts the distribution of grade differences for G1, showing that the vast majority of students would experience little to no grade difference between these two grading methods.

Figure 8. Distribution of Grading Differences (Contract Grade – Traditional Grade) for G1

Although grading contracts have been promoted as an antiracist teaching practice (Inoue 2019), we were not able to establish a statistically significant difference in the grade impact for our underserved populations. Table 3 shows summary statistics for the two undergraduate classes. Certainly, the small sample sizes contribute to the challenge in doing statistical analysis.

Table 3. Summary Statistics for the Difference in Grades for Nonunderserved and Underserved Undergraduate Student Populations

| Measure | UG1 nonunderserved | UG1 underserved | UG2 nonunderserved | UG2 underserved |
| --- | --- | --- | --- | --- |
| Average difference | 0.21 | 0.27 | 0.3 | 0.17 |
| Std dev of difference | 0.10 | 0.159 | 0.186 | 0.241 |
| Observations | 43 | 18 | 30 | 17 |

| Measure | UG1 | UG2 |
| --- | --- | --- |
| Two-tailed p value | 0.57 | 0.37 |

3.4. Student Response

3.4.1. Undergraduate Student Perceptions.

Students were surveyed at the end of the semester for all three UG2 sections and for one UG1 section. The first time the contract was used in UG1, it was introduced partway through the semester in response to what seemed to be greater-than-average student anxiety. Because it was rather ad hoc, students were not formally surveyed at the end of the semester. The instructor forgot to administer the survey the third time the class was offered.

Table 4 summarizes the responses to the binary and Likert-scale questions. Students were asked whether they had encountered grading contracts before; several students had seen contracts in classes outside their major. In UG2, the majority of students who had used contracts before had used them in UG1. Students were asked their opinion of contracts when they were first introduced (“When the [class] contract was first explained to you, did you think it was a good idea as an alternative to a more conventional grading scheme?”). They were also asked whether they thought it was a good idea to use them in other Decision Sciences classes (“Would you recommend the use of a grading contract for your other DS classes?”).

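Table 4. Summary of Survey Responses

Start = Good idea to use contract (at start of semester); End = Encourage more use of contracts (at end of semester).

| Class | Class size | No. of resp | No. of first time | Start: Definite yes | Start: Possibly yes | Start: Unsure | Start: Possibly no | Start: Definite no | End: Definite yes | End: Possibly yes | End: Unsure | End: Possibly no | End: Definite no |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| UG2 Sp22 | 10 | 9 | 5 | 2 | 3 | 2 | 2 | 0 | 2 | 5 | 1 | 0 | 1 |
| UG1 F22 | 21 | 8 | 6 | 1 | 5 | 2 | 0 | 0 | 3 | 4 | 1 | 0 | 0 |
| UG2 Sp23 | 21 | 11 | 6 | 8 | 3 | 0 | 0 | 0 | 9 | 2 | 0 | 0 | 0 |
| UG2 Sp24 | 19 | 9 | 5 | 3 | 2 | 2 | 1 | 0 | 3 | 3 | 1 | 0 | 0 |
| Total | 61 | 37 | 22 | 14 | 13 | 5 | 3 | 0 | 17 | 14 | 3 | 0 | 1 |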

These before-and-after questions show a shift toward endorsing the use of contracts. Although there were a total of six “unsure” or “possibly no” responses at the outset, there were only four “unsure” or “definitely no” responses at the end of the semester. There were 14 “definitely yes” responses (40% of respondents to this question) before and 17 (49%) after.

3.4.2. Undergraduate Student Comments.

Students had largely favorable things to say about the contracts. They were asked to elaborate on their response on increased use of contracts (“Please elaborate on your response:”); what advantages they saw in contracts (“What do you see as the advantages of a grading contract?”); what disadvantages (“What do you see as the disadvantages of a grading contract?”); and a final free response for anything else they wanted to add (“Any other comments?”). Unfortunately, the student who definitely did not think they should be used in other classes did not provide any other feedback, so it is not known what made the experience so negative for them.

Table 5 summarizes the general ideas expressed in the comments, with columns ordered by decreasing number of mentions. A plurality of students explicitly cited the reduced stress/anxiety from worrying about grades. Many tied this to their ability to focus on their learning, in part by being willing to take risks and be more creative in how they approached their assignments. Grade uncertainty was largely mentioned in response to the question on disadvantages, whereas the certainty the contract provided was overwhelmingly an unprompted response to the other questions. Most recently, students have started mentioning that other students may do the minimum work required to get whatever grade they desired. Overall, the comments track exactly with findings from the literature (Hiller and Hietapelto 2001).

Table 5. Summary of Ideas Expressed in Comments

| Class | Less stress (+) | Focus on learning (+) | Grade certainty (+) | More fun/creativity/risk taking (+) | Grade uncertainty (−) | “Lazy” students (−) | Have a say in grading (+) |
| --- | --- | --- | --- | --- | --- | --- | --- |
| UG2 Sp22 | 2 | 3 | 0 | 2 | 4 | 0 | 2 |
| UG1 F22 | 8 | 4 | 2 | 2 | 1 | 2 | 0 |
| UG2 Sp23 | 7 | 6 | 5 | 3 | 1 | 4 | 2 |
| UG2 Sp24 | 2 | 4 | 2 | 0 | 2 | 0 | 1 |
| Total | 19 | 17 | 9 | 7 | 8 | 6 | 5 |

We provide some representative comments on both the positive (+) and critical (−) sides. (There were far more positive than negative comments.)

3.4.2.1. Positive Responses

  • “I can appreciate the value of focusing on effort and learning rather than thinking about what final product will give me an A. For most of the class I thought about improvement and that was a great process.” (UG2)

  • “I liked having this grading contract because we got to express our opinion on how we should have graded ourselves.” (UG2)

  • “I was a bit confuse on what was expected of me. The grading contract does allow more flexibility in terms of studying time. Students can participate in class instead to earn their grades rather than studying class materials.” (UG2)

  • “The advantage of a grading contract is that you can potentially get a higher grade because you put in the effort to do the work, and gives less stress about the score of an assignment, so you learn more.” (UG1)

  • “It gives a very clear outline of what’s expected and makes it more manageable to understand what is due.” (UG1)

  • “In my specific scenario, I can optimize my time between my desired grade, available effort and other obligations without needing to place efforts into 2-3 exams that count for 80% of the course. Though, at times, I found a few impactful instances benefit my schedule.” (UG1)

  • “I think grading contract is a great alternative for classes that are really hard and require a lot of technical knowledge. It gives students a peace of mind knowing that they will be graded by how hard they try to complete the assignment and not by its correctness. This was the first class with a grading contract, and I think this grading system has a lot of advantages” (UG1)

  • “Students don’t focus as much on the grade and getting things right. I think it sort of encourages students to think outside of the box when solving problems because they will be graded on the amount of effort they put into the assignment and not if it is right or wrong.” (UG1)

  • “The contract anticipates bumps in road and sets up generous wiggle room for the student pursuing the knowledge. The work is the focus now.” (UG2)

  • “The advantage is students actually focus on revising their skills and showing up to be part of discussions. You feel happier to go to class and know that it’s a part of your grade instead of 40 percent of your grade being the final.” (UG2)

  • “Student feel less stress if they know that if they do certain activities that they are ensured to pass the class. This allows for a student to be more open to mistakes and actually willing to explore their errors if they know that this mistake won’t impact their grade, actually helping them learn better.” (UG2)

  • “No other professors in the past had offered contract grading. It just makes so much sense. When compared to real work and life environment, contracts are specified between two parties detailing what is expected of each other. Grading should be no different. The student can focus on the things she decides to be most important to the class or specific areas she can excel in. Grading contracts would be so useful in a scenario such as decreasing the percentage grading of tedious homework problems each week to put more emphasis on a writing journal in reflecting how to solve that one major problem in many pieces along with explaining the concepts behind them.” (UG2)

  • “College would be so much better of an experience if this was widely adopted and implemented earlier in my years.” (UG2)

  • “It’s nice that it was presented as an alternative option and giving the students the chance to decide. That makes you feel more responsible/accountable for your grade? Like, ok I’m holding up my end of the deal here, rather than I need to get these assignments done by this deadline…it feels more two sided with the contract:)” (UG2)

  • “I think it’s great honestly! It serves the purpose of exactly what a person would want it to serve. It’s simple and straight to the point. The guidelines are clear, and if everyone agrees, then there should be no complaints about the grade they receive. It also completely eliminates the fear of failing. For many students, it stresses them if they do not know how their performance is reflected based on the professor’s syllabus. Every class syllabus is different with different percentages, this can overwhelm a student for sure. The contract on the other hand is direct.” (UG2)

  • “No, I think it’s a great idea. This is my second class using it and it’s nice because I can definitely go into this class, especially the assignments and critical thinking aspect of things, without the fear of being “wrong” or “sounding dumb”. The contract allows me to be more open about my strengths and help identify any weaknesses I have that may need to be worked on.” (UG2)

  • “The grading contract takes the edge off of earning high grades and helps student absorb and retain information. Some students prioritize the assignments in a way that will get them As, with the grading contract students can take the time to learn in a way that is best for comprehension. Furthermore, students with special accommodations (like myself) appreciate the opportunity to learn in a less stressful environment. It makes things easier to concentrate and dial in.” (UG2)

  • “I hope that grading contracts get implemented. I’ve been a student who dropped out of school because I felt over loaded with work and class work. I feel like there are a lot of students that will resort to cheating just to pass classes and that shouldn’t be the point of higher education.” (UG2)

  • “Changes the mindset of “completing a checklist” for a grade to incentivizing figuring out where to best place efforts to accomplish SLOs.” (UG2)

  • “The increased motivation to do assignments as I have a clear understanding of what will bring my grade up/down.” (UG2)

3.4.2.2. Critical Responses

  • “The grading contract might be good for some people but usually students want to know where they stand in the class, I mean they want to see actual grade like oh I have a B so need to do much effort or oh I have a C I am just going to drop the class etc…so it might be good but not for everyone.” (UG2)

  • “The disadvantage is some student might not be too comfortable participating and rather enjoy learning new materials on a regular basis.” (UG2)

  • “It was a bit hard to see where we would stand in the class, myself and some of my peers would be anxious if we were going to pass the class or not.” (UG1)

  • “It gives an easy pass for lazy students who just want a C and do bare minimum.” (UG1)

  • “Some people (me) like putting all of our attendance, grades, etc. in an excel spreadsheet to track our grade over time and figure out what grade we need for the final/exam to get a certain grade. Although with the grading contract I know I can get a better grade by participating more etc., it’s not quite “quantifiable” enough for my anxiety.” (UG2)

  • “People will take advantage of the grading contract to do the least amount of work and pass but I think that is life in general. People inherently try to do the least amount of work for the highest amount of reward.” (UG2)

  • “People will abuse it like someone we both know -lol- and you better not think me.” (UG2)

  • “Not the “original” way of grading some would say, does not promote competition if a student is “guaranteed” a grade. The contract could potentially make the student(s) lazier depending on their work ethic.” (UG2)

  • “A professor could have possible bias on who they believe put in effort or not but the judgement is up to the professor either way.” (UG2)

The positive comments show a strong appreciation for the reduced stress that comes from being evaluated on the work that is being done rather than on getting the “right” answer. This does not imply that students do not have to put effort into completing assignments; rather, the focus is on the thought processes that went into the assignment and whether students have a foundational understanding of the material. If not, misunderstandings and misconceptions can be addressed. In this sense, it is the opposite of conventional multiple-choice questions.

Critical responses have, over time, shifted from the vagueness of the contract to concerns that not everyone will “try their hardest.” A number of students in the second section of UG2 had had a whole semester of using contracts in UG1 and were clearly quite comfortable with the idea and embraced its advantages. The uncertainty can (and should) be addressed through clearer and repeated messaging during the semester. That not every student strives for the highest grade is, in fact, one of the touted advantages of grading contracts (Hiller and Hietapelto 2001); it may become challenging, however, when a student who is only trying to get a C is in a group with students who are trying to achieve higher grades. Certainly, open communication among team members is critical; further, peer evaluation tools like CATME (Loignon et al. 2017) can be used to identify individual contributions to a group effort and to make any necessary grade adjustments.

3.4.3. Graduate Student Response

Although no official survey of student attitudes was taken in G1, we can consider the number of students who filled out the contract as providing some feedback. The fact that the four students who did not earn a higher grade with it still went through the effort of filling it out and submitting it indicates they found some value in the process.

Within the grading contracts, several students mentioned that they appreciated that there were other ways to do well than just earning a high score on timed quizzes, which many found the most challenging part of the class. Several mentioned that there was effectively a floor on how poorly they would do so long as they put effort in to master the materials. There were no negative comments from the graduate students who filled out the contracts, even those whose grade did not improve from doing so.

3.5. Quality of Learning

One question that naturally arises is whether the quality of student learning has changed. In addition to pure subject matter learning, there are myriad other facets of learning. How long do students retain the information? Has their attitude toward learning changed? These are difficult to assess at any time and we are unable to address them in this paper. We have, however, looked at assurance of learning results and done an analysis of weekly quiz scores in UG1.

Table 6 shows summary statistics for the average and minimum student quiz score across 12 quizzes. (The course has 15 quizzes and for every 5 quizzes taken, students are able to drop a quiz score. To standardize values, we dropped the three lowest scores for each student.) Fall 2020 and Fall 2021 can be considered pregrading contract, as the contract was not introduced until partway through the Fall 2021 semester. Fall 2022 and Fall 2023 are postcontract introduction. An analysis of variance confirms no statistically significant difference between the quiz averages (p = 0.43) or quiz minima (p = 0.32). From this and the assurance of learning on student outcomes, we conclude that the same level of student learning is occurring as has traditionally been the case.
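The standardization step described above can be sketched in a few lines of Python. This is a minimal illustration rather than the authors' actual procedure: the function name `standardize_quiz_scores`, the `keep` parameter, and the example scores are hypothetical, and the text does not state how missed quizzes are recorded, so the sketch simply works with whatever list of scores it is given.

```python
def standardize_quiz_scores(scores, keep=12):
    """Keep a student's `keep` highest quiz scores and summarize them.

    With 15 quizzes and keep=12, this drops the three lowest scores.
    """
    if len(scores) < keep:
        raise ValueError("need at least `keep` recorded scores")
    # Sort from highest to lowest and retain the top `keep` scores.
    kept = sorted(scores, reverse=True)[:keep]
    return {"average": sum(kept) / keep, "minimum": min(kept)}


# Hypothetical student: 15 quiz scores, each out of 10 points.
example_scores = [10, 9, 8, 10, 0, 7, 9, 10, 6, 8, 9, 10, 0, 5, 9]
summary = standardize_quiz_scores(example_scores)
print(summary)
```

Applying such a function to every student in a semester would yield one average and one minimum per student, which are the two quantities summarized per semester in Table 6.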

Table 6. Summary Statistics for Quiz Averages and Quiz Minimum Scores in UG1 out of 10 Points

| Statistic | Average F20 | Average F21 | Average F22 | Average F23 | Minimum F20 | Minimum F21 | Minimum F22 | Minimum F23 |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Average | 8.18 | 8.83 | 8.58 | 8.51 | 5.49 | 7.01 | 6.21 | 6.25 |
| Standard deviation | 1.31 | 0.84 | 1.17 | 1.60 | 2.84 | 1.86 | 2.73 | 2.80 |
| Minimum | 4.33 | 6.73 | 5.92 | 3.63 | 0 | 1.5 | 0 | 0 |
| Maximum | 9.46 | 9.90 | 10 | 9.92 | 8.75 | 9.5 | 10 | 9 |
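For readers who wish to run a similar comparison on their own courses, the following is a minimal sketch of a one-way analysis of variance across semesters, written in plain Python. It assumes a one-way design with one group of per-student values per semester, which the text does not specify; the function name `one_way_anova_f` and the toy data are hypothetical and are not the study's data.

```python
def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a one-way ANOVA.

    `groups` is a list of lists, one list of per-student values per group.
    """
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    means = [sum(g) / len(g) for g in groups]

    # Variation of group means around the grand mean.
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    # Variation of individual values around their own group mean.
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)

    df_between = k - 1
    df_within = n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Toy data only: four hypothetical semesters with three quiz averages each.
toy_groups = [[8, 9, 7], [9, 8, 10], [7, 8, 9], [8, 9, 10]]
f_stat, df_between, df_within = one_way_anova_f(toy_groups)
print(f_stat, df_between, df_within)
```

The p-value is then the upper-tail probability of the F distribution with these degrees of freedom. A statistics package can supply it directly; for example, `scipy.stats.f_oneway` takes the groups as separate arguments and returns both the statistic and the p-value.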

4. Discussion

There has been, with good reason, a push toward providing a more equitable education to students, especially because the COVID-19 pandemic has exacerbated pre-existing inequities in student preparation and ability to engage with their education. It has caused us to re-evaluate what we think the purpose and outcomes of our courses (and programs) should be and how to best achieve them.

As discussed in Section 2.1, conventional grading approaches have served some, but certainly not all, students well. These approaches tend to create students who excel at memorizing and then displaying their knowledge during high-stress situations. Arguably, we would like to train students who are able to think critically and problem solve. Because much work these days is done collaboratively, the ability to complete algorithmic or computational steps individually, accurately, and quickly under pressure may be less of a priority, at least in the field of OR. Are there, therefore, other, more fruitful approaches to teaching and learning that our profession might embrace?

In this paper, we have described the ideas of ungrading, standards-based grading, and grading contracts and shown that the latter can be implemented to great success in at least certain types of OR classes. Although sample sizes are small, we did not find evidence of grade inflation or of deterioration in quiz performance, and student feedback was more positive than negative. However, we are not arguing that all instructors should adopt contracts for all their courses; rather, we make the case that instructors might want to re-evaluate how they structure their classes to make them more equitable. One way to do so is to use grading contracts.

Instructors have wide latitude in how to structure their contracts. Contracts can also be renegotiated during the semester if the instructor and/or students feel the original contract is no longer suitable. In our examples, grades using the contract were not statistically significantly higher at the undergraduate level than those not using the contract. Furthermore, standards-based learning can be incorporated into the contract to ensure that students are meeting the intended learning outcomes of the course.

Instructors who are concerned that contract grading may not provide enough structure for courses with more complex subjects may wish to investigate “specifications grading” as explained in Graves (2023). In specifications grading, students are graded on a pass/fail basis for all assignments, which are closely tied to course learning outcomes via the specifications. To achieve a given grade, students complete a corresponding number of assignments, and scaffolding could be used to ensure that students have mastered the prerequisite building blocks necessary later on in that course.

We are encouraged by the positive comments from students, especially the ones around students being able to focus on learning rather than worrying about their grades. We hope this article inspires our colleagues to consider alternatives to the way most of us were taught and assessed during our higher education, and the way many of us still deliver our courses. It seems that there may be ways that are more enjoyable and exciting for both instructors and students.

Acknowledgments

The authors thank Jolie Goorjian, Dan Curtis-Cummins, and Jennifer Trainor for helpful conversations about grading contracts.

Endnotes

1 There may be some concern about the impact generative AI may have on classes that focus on writing and the use of contracts. In class, we discussed the advantages/disadvantages of using generative AI; students were told they could use it, but the importance of editing in order to address the actual assignment was emphasized. In general, the “raw” output of student-generated AI text did not adequately address the prompt or capture the correct tone for the specified audience of the piece of writing.

2 Letter grades are converted to grade points using the standard scale (A = 4, A− = 3.7, and so on). Students who withdrew from the class are excluded from the analysis.

3 The grading contract was introduced partway through the semester for students in UG1 in Fall 2021. This may have affected the results somewhat; unfortunately, we are not able to adjust for this.

References

  • Arcidiacono P, Aucejo E, Hotz VJ (2016) University differences in the graduation of minorities in STEM fields: Evidence from California. Amer. Econom. Rev. 106(3):525–562.
  • Cangialosi K (2018) But you can’t do that in a STEM course! Hybrid Pedagogy 26.
  • Chin MJ, Quinn DM, Dhaliwal TK, Lovison VS (2020) Bias in the air: A nationwide exploration of teachers’ implicit racial attitudes, aggregate bias, and student outcomes. Ed. Res. 49(8):566–578.
  • Clymer JB, Wiliam D (2006) Improving the way we grade science. Ed. Leadership 64(4):36.
  • CSU (2022) Facts about the CSU. Accessed August 20, 2023, https://www.calstate.edu/csu-system/about-the-csu/facts-about-the-csu/Documents/facts2022.pdf.
  • CSU (2023a) CSU mission. Accessed August 20, 2023, https://www.calstate.edu/csu-system/about-the-csu/Pages/mission.aspx.
  • CSU (2023b) Graduation initiative 2025. Accessed August 20, 2023, https://www.calstate.edu/csu-system/why-the-csu-matters/graduation-initiative-2025.
  • Dumas MJ (2016) Against the dark: Antiblackness in education policy and discourse. Theory Practice 55(1):11–19.
  • Elbow P (1993) Ranking, evaluating, and liking: Sorting out three forms of judgment. College English 55(2):187–206.
  • Felder RM, Brent R (2005) Understanding student differences. J. English Ed. 94(1):57–72.
  • Fitzmaurice O, Ní Fhloinn E (2021) Alternative mathematics assessment during university closures due to Covid-19. Irish Ed. Stud. 40(2):187–195.
  • Graves BC (2023) Specifications grading to promote student engagement, motivation and learning: Possibilities and cautions. Assessing Writing 57:100754.
  • Hiller TB, Hietapelto AB (2001) Contract grading: Encouraging commitment to the learning process through voice in the evaluation process. J. Management Ed. 25(6):660–684.
  • Iannone P, Simpson A (2011) The summative assessment diet: How we assess in mathematics degrees. Teaching Math. Its Appl. 30(4):186–196.
  • Inoue AB (2019) Labor-Based Grading Contracts: Building Equity and Inclusion in the Compassionate Writing Classroom (WAC Clearinghouse, Fort Collins, CO).
  • Johnson MP, Chichirau GR (2020) Diversity, equity, and inclusion in operations research and analytics: A research agenda for scholarship, practice, and service. Pushing the Boundaries: Frontiers in Impactful OR/OM Research, INFORMS TutORials in Operations Research (INFORMS, Catonsville, MD), 1–38.
  • Khan S (2015) Let’s teach for mastery: Not test scores. TED (November 2015), https://www.ted.com/talks/sal_khan_let_s_teach_for_mastery_not_test_scores?language=en.
  • Kohn A (2011) The case against grades. Ed. Leadership 69(3):28–33.
  • Loignon AC, Woehr DJ, Thomas JS, Loughry ML, Ohland MW, Ferguson DM (2017) Facilitating peer evaluation in team contexts: The impact of frame-of-reference rater training. Acad. Management Learn. Ed. 16(4):562–578.
  • McMurtrie B (2022) Why the science of teaching is often ignored: There’s a whole literature on what works but it’s not making its way into the classroom. Chronicle Higher Ed. 68(10):22.
  • Nagasaka K (2020) Multiple-choice questions in mathematics: Automatic generation, revisited. Proc. 25th Asian Tech. Conf. Math. Virtual Format (Radford, VA).
  • National Research Council (2000) How People Learn: Brain, Mind, Experience, and School: Expanded Edition (National Academies Press, Washington, DC).
  • Nichols SL, Berliner DC (2007) The pressure to cheat in a high-stakes testing environment. Psychology of Academic Cheating (Elsevier, Amsterdam), 289–311.
  • Niss MA (1998) Dimensions of geometry and assessment. Perspective on the Teaching of Geometry for the 21st Century: An ICMI Study (Kluwer Academic Publishers, New York), 263–274.
  • Previde P, Graterol C, Love MB, Yang H (2019) A data mining approach to understanding curriculum-level factors that help students persist and graduate. Proc. IEEE Frontiers Ed. Conf. (IEEE, Piscataway, NJ), 1–9.
  • Ramsay‐Jordan NN, Appiagyei A, Kiel AD (2022) Productive interaction: An effective approach to addressing inequity in STEM. School Sci. Math. 122(6):283–285.
  • Sangwin C (2013) Computer Aided Assessment of Mathematics (OUP Oxford).
  • Santos MC (2022) How I implemented Asao B. Inoue’s labor-based grading and other antiracist assessment strategies. CEA Critic 84(2):160–179.
  • Seymour E, Hewitt NM (1997) Talking About Leaving, vol. 34 (Westview Press, Boulder, CO).
  • Shada A, Kelly K, Cox R, Malik S (2011) Growing a new culture of assessment: Planting ePortfolios in the metro academies program. Internat. J. ePortfolio. 1(1):71–83.
  • Shankar NL, Park CL (2016) Effects of stress on students’ physical and mental health and academic success. Internat. J. School Ed. Psych. 4(1):5–9.
  • Son LK (2010) Metacognitive control and the spacing effect. J. Experiment. Psych. Learn. Memory Cognition 36(1):255–262.
  • Stommel J (2021) Ungrading: An introduction. Accessed June 18, 2024, https://www.jessestommel.com/ungrading-an-introduction/.
  • Theobold AS (2021) Oral exams: A more meaningful assessment of students’ understanding. J. Statist. Data Sci. Ed. 29(2):156–159.
  • Tobias S, Lin H (1991) They’re Not Dumb, They’re Different: Stalking the Second Tier (American Association of Physics Teachers, College Park, MD).
  • Vlach HA, Sandhofer CM (2012) Distributing learning over time: The spacing effect in children’s acquisition and generalization of science concepts. Child Development 83(4):1137–1144.
  • Ward E (2021) Easing stress: Contract grading’s impact on adolescents’ perceptions of workload demands, time constraints, and challenge appraisal in high school English. Assessment Writing 48:100526.
  • Watson EJ (2021) An Integrated Mixed-Methods Study of Contract Grading’s Impact on Students’ Perceptions of Stress, Self-Worth Protection Behaviors, and Academic Performance in High School English Courses (Lancaster University, Lancaster, UK).
  • Winne PH, Azevedo R (2014) Metacognition. Sawyer RK, ed. The Cambridge Handbook of the Learning Sciences, vol. 2 (Cambridge University Press, Cambridge, UK), 63–87.