I had a great discussion with my students today. A couple of them asked me why I don’t grade participation in Socratic seminars. I used to. I stopped because I find that grading participation is slippery. If you quantify it, you run the risk of encouraging shallow participation for points. In their reflections, students share what they have learned as a result of the seminar. I think part of the concern the students shared is that their reflection must include a summary of ideas discussed in the seminar, and the students who raised the concern did not earn full points for their summaries. They argued that if they are trying to capture the discussion in their notes, they will not be as present in the discussion.
What I told the students is that grading is a means of communicating their learning, and if they would prefer to be assessed on participation because it helps them learn, then I will do what helps them learn. I asked that we have a discussion about it as a class. We had that discussion this morning, and I was really impressed with how the students were able to articulate what works for them in assessing seminars and why. They have a strong sense of what kind of assessment feels equitable and what does not. They were able to articulate why setting goals and assessing progress toward the goals was helpful, and why grading participation didn’t work for most of them.
I pointed out that the skills of note-taking and listening are important for success. Students need to listen to their teachers and peers, now and later in college, and be able to take notes on what they hear, so my rationale for assessing these skills is that they are skills that are important to practice. Yet, I understand their arguments as well. We cannot have a good seminar if students do not participate. On the other hand, their classmates insisted that participation was not a problem in our first seminar. At one point, they asked me to display our discussion map from last time (thanks, Equity Maps!). Did we actually have a problem that needed solving, or was our discussion working without grading participation?
The class consensus was to leave the assessment as is, particularly as they have only experienced one seminar so far and judgment based on one experience would not tell the whole story. I don’t think everyone was happy, and frankly, the discussion did become a bit heated. I don’t think that made the students feel comfortable. I asked them if they felt heard (not agreed with, because that’s not the same thing, but heard). I think the net result is that students appreciated the opportunity to share their ideas. I was super impressed with them, and I shared that feedback with them.
We have our second seminar tomorrow, and it will be interesting to see how this debate informs the discussion. In the end, the compromise/consensus seemed to be that students want to be assessed on making progress on their goals. Part of their reflection is to identify their goals for the next seminar. This means I need to go back into their last reflections and refresh my memory about what their individual goals are and ensure I give them feedback on their progress toward meeting their goals. They also asked for feedback on their contributions, though they recognized that one person’s idea of an insightful comment may differ from another’s.
The bottom line is that it’s important to engage students in the assessment of their learning. Some of the best discussions I have had with my students have centered on grading and assessment. They have a lot to say about assessment, but they are not always a part of the conversation about how they’ll be assessed. It was a good exercise for my students today to hear others’ perspectives on this topic and take those perspectives into consideration.
In June I successfully defended my dissertation at Northeastern University. My research focused on grading and assessment, which will likely not surprise anyone who has been reading this blog for a while, as I have written about grading and assessment frequently.
My dissertation was qualitative action research, a dissertation in practice grounded in the Carnegie Project on the Education Doctorate. Grading and assessment are ripe for qualitative action research because we have over a century of quantitative research in grading and assessment, and not as much positive change, at least with grading, as we might like to see. I might argue we are seeing more authentic assessment in schools, but grading remains, well, stuck. One of the reasons I think we’re stuck is that we believe persistent myths about grading.
Grades Communicate Students’ Proficiency
One of the most persistent myths about grading is that we agree on what grades mean. As long ago as 1888, researchers were raising questions about inter-rater reliability (Edgeworth, 1888). Study after study indicates that grades are highly inconsistent measures of students’ learning. Starch & Elliott (1912) conducted a study that examined consistency among graders and found that scores on student writing varied by 30-40 points out of 100, or a probable error of 4.5. You might be thinking, “yes, but isn’t writing a little subjective anyway? I’m sure that doesn’t happen in, say, math.” Well, the following year, Starch & Elliott (1913) found that scores on a geometry exam varied even more widely, with a probable error as high as 7.5. They ascribed the difference to several factors: the possibility that graders differently evaluate the students’ methods for reaching the solution, that they assess the quality of the students’ drawings, and that they assign different values to problems.
Naturally, things have changed in a hundred years. What do more recent studies say? Brimi (2011) sought to answer that very question. The study engaged 73 participants, all working for the same school district and trained to use the 6+1 Traits of Writing Rubric developed by Education Northwest, to score the same argumentative essay using the rubric. The participants’ grades ranged from an A to an F on the traditional grading scale; furthermore, the range of scores assigned to the essay spanned 46 points (Brimi, 2011).
Grading is inconsistent for many reasons, but one of the chief reasons is that teachers evaluate different things when they grade. Some teachers offer extra credit or give students points for bringing supplies (Townsley & Varga, 2018). Teachers can be highly individualistic in selecting criteria for students’ performance (Bloxham et al., 2016). Other factors also impact how teachers evaluate students’ performance. For example, Brackett et al. (2013) found that a teacher’s mood while grading can impact students’ scores: teachers in a bad mood tend to rate students’ performance lower. This holds true even when grading more objective criteria such as correct spelling (Brackett et al., 2013). Think what this means as we are teaching in the midst of a pandemic and during a time when it feels as though teachers are being attacked from all sides.
One of the reasons traditional letter or number grades emerged was the perceived inconsistency, inefficiency, and complication of narrative grade reports (Feldman, 2019). It was thought that letter grades could communicate learning both efficiently and plainly (Schneider & Hutt, 2014). By the 1940s, the A-F letter grade system had become the most popular grading system (Schneider & Hutt, 2014).
Traditional grades tend to be derived by averaging the performance on all assessments during a grading period; this average may not capture students’ eventual proficiency in learning and can place undue emphasis on performance anomalies rather than tendencies (Feldman, 2019). In addition, traditional grading sometimes incorporates assessment of student behaviors, such as participation, engagement, and effort (Feldman, 2019).
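A small worked example may make the averaging problem concrete. The sketch below uses made-up scores for one hypothetical student who starts the grading period struggling and ends it proficient. The scores and the "most recent score" comparison are illustrative assumptions, not a method prescribed by Feldman (2019).

```python
from statistics import mean

# Hypothetical scores for one student on the same skill, listed in the
# order they were earned over a grading period (invented for illustration).
scores_over_time = [50, 60, 75, 90, 95]

# Traditional approach: average every score in the grading period.
averaged_grade = mean(scores_over_time)

# One alternative (an assumption here, not the only option): report the
# most recent evidence of what the student can do.
most_recent_score = scores_over_time[-1]

print(f"Average of all scores: {averaged_grade}")
print(f"Most recent score:     {most_recent_score}")
```

With these invented numbers, the average works out to 74 even though the student's latest work earned a 95, so the averaged grade describes where the student has been more than where the student ended up.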
We might think that grades communicate students’ proficiency in learning, but there are simply too many variables to say this definitively.
Grades Motivate Students
One fear many educators express is that if students are not graded, they will not be motivated to do the work. At best, grades serve as extrinsic motivation for learning. When students care more about the grades than the learning, they are more likely to resort to academic dishonesty. In fact, pressure to earn high grades contributes to academic dishonesty and mental health problems (Rinn et al., 2014; Villeneuve et al., 2019). Grades affect students’ achievement, self-concept, and motivation (Casillas et al., 2012; Pulfrey et al., 2011). Students who earn low grades tend to achieve less and feel lower self-esteem over time (Klapp, 2018).
Fear of earning low grades or focus on earning high grades both serve as extrinsic motivators for learning rather than intrinsic motivators, which demonstrate more effectiveness in supporting learning (Froiland & Worrell, 2016; Hattie & Timperley, 2007). Intrinsic motivation is positively associated with both engagement and achievement (Froiland & Worrell, 2016; Hattie & Timperley, 2007). Helping students develop their intrinsic motivation to learn may increase students’ achievement (Froiland & Worrell, 2016). Extrinsic motivation to earn good grades or avoid the negative consequences of poor grades drives many students rather than the desire to learn, and over time, extrinsic motivation decreases students’ achievement (Hattie & Timperley, 2007). In addition, the reward of good grades tends to decrease motivation for otherwise engaging learning (Hattie & Timperley, 2007).
It’s worth noting that motivation appears to change depending on the grading system used. When students are graded using a 100-point system in which the sum of all student work is worth a total of 100 points, students tend to view each point deducted as a loss (Smith & Smith, 2009). Bies-Hernandez (2012) describes such grading systems as “loss-framed grading” (p. 179). However, when students are graded using a total points system tallying all points earned, they tend to view grades as opportunities to improve and build toward a desired grade (Smith & Smith, 2009). Students who are graded with a system weighting assignment categories by percentage fall in between students in the other grading groups (Smith & Smith, 2009). Even if controls ensure that the resulting grade is the same regardless of the calculation system, students’ responses on a Likert scale questionnaire indicate they still perceive greater risk in 100-point systems and are less motivated and self-assured (Smith & Smith, 2009). Bies-Hernandez (2012) replicated these findings and further found that students’ performances in courses with a loss-framed grading system also decreased. Thus, the framing of the grading system not only has an impact on students’ perceptions of their performance but also on their actual performance (Bies-Hernandez, 2012). The implication is that teachers’ approaches to grading may affect students’ academic achievement (Brookhart et al., 2016).
However, proficiency-based grading (sometimes known as competency-based grading, standards-based grading, or mastery-based grading) has the potential to make grades more meaningful and purposeful (Buckmiller et al., 2017; Guskey, 2007). Proficiency-based grading practices may also lead to greater academic achievement, particularly if the grades are paired with formative feedback (Hattie & Timperley, 2007). Proficiency-based grading practices may also foster more cooperation and less competition (Burleigh & Meegan, 2018). Taking academic risks, weighing differing conclusions, and considering varied points of view are all necessary for developing critical thinking skills, but if students must risk failing grades in order to do so, they are much more likely to take the safer route to earning a higher grade (Hayek et al., 2014; McMorran et al., 2017). Knowing that they could continue to learn, revise, and reflect on their work may increase students’ motivation to learn (Hattie & Timperley, 2007; McMorran et al., 2017).
100-point Grading Scales are More Precise than A-F or 4-Point Grading Scales
Do you know why we use the 100-point scale? It’s not because it’s more precise. It’s because it’s the scale in the gradebook software (Guskey, 2013; Guskey & Jung, 2016). The 100-point scale is terrible, and that’s a hill I’m willing to die on. The 100-point grading scale has become one of the most common scales for reporting students’ grades, but it is one of the most unreliable scales in use (Guskey, 2013).
The 100-point scale is inaccurate and inequitable because the scale is skewed toward failing grades (Feldman, 2019). Passing grades comprise only 40 points of the grading scale, spanning typically from 60 points to 100 points (or from 70-100 points in some systems!), while failing grades comprise the remaining points possible spanning from 0 to 59 (or even 0-69). Serious mathematical errors arise when teachers input zeros in the gradebook when students are missing work (Feldman, 2019). While this practice ostensibly holds students accountable for handing in work, it can make it impossible for students to recover academically (Feldman, 2019). The literature suggests that teachers may compensate for the 100-point scale’s mathematical errors by artificially raising grades in a number of ways (Schneider & Hutt, 2014), including grading formative assessments and executive function skills (Bowers, 2011; Brookhart et al., 2016; Townsley & Varga, 2018).
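To see why a single zero is so hard to recover from, consider this minimal sketch. The scores and the 10-point letter bands (90 for an A, 80 for a B, and so on) are assumptions chosen for illustration; individual schools set their own cutoffs.

```python
from statistics import mean

# Assumed letter-grade cutoffs on a 100-point scale (hypothetical;
# real cutoffs vary by school).
CUTOFFS = [(90, "A"), (80, "B"), (70, "C"), (60, "D")]


def letter_grade(score):
    """Return the letter for a score under the assumed cutoffs."""
    for minimum, letter in CUTOFFS:
        if score >= minimum:
            return letter
    return "F"


# A hypothetical student earns 85 on four assignments, then misses one.
completed = [85, 85, 85, 85]
with_zero = completed + [0]

average_before = mean(completed)
average_after = mean(with_zero)

# Suppose the student then earns five perfect scores in a row.
after_recovery = mean(with_zero + [100] * 5)

print(letter_grade(average_before), average_before)
print(letter_grade(average_after), average_after)
print(letter_grade(after_recovery), after_recovery)
```

Under these assumptions, one zero pulls a steady 85 down to a 68, and even five perfect scores afterward bring the average back only to 84, still short of where the student started.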
Unfortunately, a lot of educators perceive the 100-point grading scale to be more accurate (Brookhart & Guskey, 2019; Feldman, 2019). While using 100 points as opposed to four or five points may seem more accurate, it results in a probable error of five or six points; teachers find it difficult to distinguish levels of performance on a 100-point scale (Brookhart & Guskey, 2019). Some grading reformers advocate for the use of minimum grading, or inputting a minimum grade such as 50 percent, rather than inputting zeros for missing work; this practice reduces mathematical error (Carifio & Carey, 2013; Carifio & Carey, 2015; Feldman, 2019). Essentially what educators are doing when they use minimum grading, however, is compensating for the deficiencies of the 100-point scale by converting it to a rough approximation of the 4-point scale. In a four-point scale, failing grades span from 0-0.99 of a point, while passing grades span from 1-4 points (or 2-4 points in a system without a “D”).
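The claim that minimum grading roughly approximates the 4-point scale can be checked with a little arithmetic. In this sketch, the scores, the floor of 50, and the conversion from percentages to 4-point values are all hypothetical choices made for illustration.

```python
from statistics import mean

MINIMUM_GRADE = 50  # assumed floor for missing or failing work


def apply_minimum(scores, floor=MINIMUM_GRADE):
    """Raise any score below the floor up to the floor."""
    return [max(score, floor) for score in scores]


def to_four_point(score):
    """Convert a percentage to a 4-point value using assumed 10-point bands."""
    for minimum, points in [(90, 4), (80, 3), (70, 2), (60, 1)]:
        if score >= minimum:
            return points
    return 0


# Hypothetical student: four scores of 85 and one missing assignment.
scores = [85, 85, 85, 85, 0]

raw_average = mean(scores)
floored_average = mean(apply_minimum(scores))
four_point_average = mean([to_four_point(score) for score in scores])

print(f"100-point average with a zero:    {raw_average}")
print(f"100-point average with a 50 floor: {floored_average}")
print(f"4-point average with a zero:      {four_point_average}")
```

With these numbers, the raw average is 68, the floored average is 78, and the 4-point average is 2.4. Under the assumed bands, 78 and 2.4 both land in the C range, while 68 falls a full letter lower, which is the sense in which the floor makes the 100-point scale behave more like the 4-point scale.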
Grades Reduce Bias
Variable and unreliable grading practices also introduce equity problems. Black students have less access to AP courses all over the United States (Francis & Darity, 2021). Schools that use gatekeeping methods (Francis & Darity, 2021), such as teacher recommendations and prerequisite grades, may be basing their decisions about students’ fitness for advanced coursework on subjective measures common in traditional grading (Feldman, 2019). Students of color are most impacted by teachers’ implicit bias (Feldman, 2019), especially if subjective, non-academic factors are included in assessment (Cvencek et al., 2018). Implicit bias may especially play a role in lower grades assigned to students of color when the criteria for proficiency are unclear or undefined (Quinn, 2020). Traditional grading’s subjectivity can harm all students, but students of color may be most impacted due to implicit bias (Feldman, 2019; Quinn, 2020).
However, proficiency-based grading can make grades more equitable and more reflective of students’ actual learning (Buckmiller et al., 2017). Proficiency-based grading may include using practices such as rubrics for evaluating student work and student-generated portfolios; however, it may also include traditional assessments such as tests (Baete & Hochbein, 2014; Buckmiller et al., 2017; Iamarino, 2014; Miller, 2013). Students’ grades are tied to their mastery of content, such as standards, knowledge, and skills, as opposed to an average of all the grades earned during a grading period or course (Iamarino, 2014; Miller, 2013). Teachers using proficiency-based grading typically provide students with feedback on formative assessments (Buckmiller et al., 2017). Students may revise and resubmit work in order to demonstrate their proficiency in learning (Buckmiller et al., 2017). Through revision, students demonstrate their learning of the content and skills. As a result, proficiency-based grades may more accurately reflect what students have learned rather than a snapshot of their performance on a single assessment.
We Have to Use Grades
Grades have actually not existed, at least not in the form we’re familiar with, for a very long period of time (Schneider & Hutt, 2014). One of the worst reasons to perpetuate any system is the notion that we’ve always done it that way, especially when it’s not even true that we have always done it this way. As I mentioned before, the A-F grading system gained popularity as late as the 1940s, as educators saw a need to establish more uniform methods for determining students’ proficiency (Schneider & Hutt, 2014). For many years preceding the establishment of “traditional grading,” we used all sorts of other systems (good and bad) for measuring learning. This system is entrenched, but it’s not as old as people might think, and if we decided, collectively, that it no longer worked for us, we could find a better system. The problem is, well, that it’s a system, and systems are notoriously hard to change.
I have heard many educators express anxiety that students will either not be prepared for college or will not get into college unless they are graded. Many schools, however, have successfully eliminated traditional grades. Colleges understand the transcripts these students send them, and these students are able to go to college. For example, the Watershed School, a member of the Mastery Transcript Consortium, does not issue traditional letter grades or test students through final exams and has a 100% college acceptance rate (Plaskov, 2019). A college counselor I worked with told me anecdotally that “colleges are fine with grading that’s ‘non-traditional.’ Parents usually get very concerned about going off the A-F standard, but college admissions folks are experts on grading scales, and what I’ve consistently heard from them is that the most-accurate/least-translated reporting is what they like.”
My own personal experience is that some schools’ grading practices are more entrenched, and while another system of evaluation would work, it wouldn’t be politically feasible. Proficiency-based grading shows additional promise here. Attaching grades to standards or competencies can make grades more accurate reflections of students’ proficiency in learning. Proficiency-based report cards have the potential to be more useful in understanding students’ learning than traditional report cards including only a letter grade (Blauth & Hadjian, 2016; Swan et al., 2014). Swan et al. (2014) found that parents and teachers generally find proficiency-based reports more helpful and easier to understand, in addition to having more and better information about students’ progress.
It’s worth noting that one study I examined indicated parents reported feeling less confidence in the standards-based grade reports because they were unfamiliar and felt the school had not taken their feelings as stakeholders into account before implementing standards-based grade reports (Franklin et al., 2016). These parents also reported finding the grade reports unclear (Franklin et al., 2016). Importantly, Franklin et al. (2016) indicate the parents in their study were all dissatisfied with standards-based report cards; these parents also described themselves as strong students who enjoyed school. Their study did not include parents who expressed satisfaction with the reports (Franklin et al., 2016).
The Bottom Line?
I think it’s important for teachers to open dialogue with students and parents, read the research on grading and assessment, and work within the system they’re in to make grades more accurate and meaningful. I highly recommend the works referenced in this post, which is derived largely from my dissertation. For a good deep dive, Joe Feldman’s book Grading for Equity is excellent.
Baete, G. S. & Hochbein, C. (2014). Project proficiency: Assessing the independent effects of high school reform in an urban district. The Journal of Educational Research, 107(6), 493-511. https://doi.org/10.1080/00220671.2013.823371
Bies-Hernandez, N. J. (2012). The effects of framing grades on student learning and preferences. Teaching of Psychology, 39(3), 176-180. https://doi.org/10.1177/0098628312450429
Blauth, E. & Hadjian, S. (2016). How selective colleges and universities evaluate proficiency-based high school transcripts: Insights for students and schools. New England Board of Higher Education. https://www.nebhe.org/info/pdf/policy/Policy_Spotlight_How_Colleges_Evaluate_PB_HS_Transcripts_April_2016.pdf
Bloxham, S., den-Outer, B., Hudson, J., & Price, M. (2016). Let’s stop the pretence of consistent marking: Exploring the multiple limitations of assessment criteria. Assessment & Evaluation in Higher Education, 41(3), 466-481. https://doi.org/10.1080/020602938.2015.1024607
Bowers, A. J. (2011). What’s in a grade? The multidimensional nature of what teacher-assigned grades assess in high school. Educational Research and Evaluation, 17(3), 151-159. https://doi.org/10.1080/13803611.2011.597112
Brackett, M. A., Floman, J. L., Ashton-James, C., Cherkasskiy, L., & Salovey, P. (2013). The influence of teacher emotion on grading practices: A preliminary look at the evaluation of student writing. Teachers and Teaching, 19(6), 634-646. https://doi.org/10.1080/13540602.2013.827453
Brimi, H. M. (2011). Reliability of grading high school work in English. Practical Assessment, Research & Evaluation, 16(7). http://pareonline.net/getvnasp?=16&n=17
Brookhart, S. M., & Guskey, T. R. (2019). Reliability in grading and grading scales. In T. R. Guskey & S. M. Brookhart (Eds.), What we know about grading: What works, what doesn’t, and what’s next (pp. 13-31). ASCD.
Brookhart, S., Guskey, T. R., Bowers, A. J., McMillan, J. H., Smith, J. K., Smith, L. F., Stevens, M. T., & Welsh, M. E. (2016). A century of grading research: Meaning and value in the most common educational measure. Review of Educational Research, 86(4), 803-848. https://doi.org/10.3102/0034654316672069
Buckmiller, T., Peters, R., & Kruse, J. (2017). Questioning points and percentages: Standards-based grading (SBG) in higher education. College Teaching, 65(4), 151-157. https://doi.org/10.1080/87567555.2017.1302919
Burleigh, T. J. & Meegan, D. V. (2018). Risky prospects and risk aversion tendencies: Does competition in the classroom depend on grading practices and knowledge of peer-status? Social Psychology of Education, 21(2), 323-335. https://doi.org/10.1007/s11218-017-9414-x
Carifio, J. & Carey, T. (2013). The arguments and data in favor of minimum grading. Mid-Western Educational Researcher, 25(4), 19-30.
Carifio, J. & Carey, T. (2015). Further findings on the positive effects of minimum grading. Journal of Education and Social Policy, 2(4), 130-136.
Casillas, A., Robbins, S., Allen, J., Kuo, Y. L., Hanson, M. A., & Schmeiser, C. (2012). Predicting early academic failure in high school from prior academic achievement, psychosocial characteristics, and behavior. Journal of Educational Psychology, 104(2), 407-420. https://doi.org/10.1037/a0027180
Cvencek, D., Fryberg, S. A., Covarrubias, R., & Meltzoff, A. N. (2018). Self-concepts, self-esteem, and academic achievement of minority and majority North American elementary school children. Child Development, 89(4), 1099-1109. https://doi.org/10.1111/cdev.12802
Edgeworth, F. Y. (1888). The statistics of examinations. Journal of the Royal Statistical Society, 51(3), 599-635.
Feldman, J. (2019). Grading for equity: What it is, why it matters, and how it can transform schools and classrooms. Corwin.
Francis, D. V. & Darity, W. A., Jr. (2021). Separate and unequal under one roof: The legacy of racialized tracking perpetuates within-school segregation. RSF: The Russell Sage Foundation Journal of the Social Sciences, 7(1), 187-202. https://doi.org/10.7758/RSF.2021.7.1.11
Franklin, A., Buckmiller, T., & Kruse, J. (2016). Vocal and vehement: Understanding parents’ aversion to standards-based grading. International Journal of Social Science Studies, 4(11), 19-29.
Froiland, J. M. & Worrell, F. C. (2016). Intrinsic motivation, learning goals, engagement, and achievement in a diverse high school. Psychology in the Schools, 53(3), 321-336. https://doi.org/10.1002/pits.21901
Guskey, T. R. (2007). Multiple sources of evidence: An analysis of stakeholders’ perceptions of various indicators of student learning. Educational Measurement: Issues and Practice, 26(1), 19-27. https://doi.org/10.1111/j.1745-3992.2007.00085.x
Guskey, T. R. (2013). The case against percentage grades. Educational Leadership, 71(1), 68-72.
Guskey, T. R. & Jung, L. A. (2016). Grading: Why you should trust your judgment. Educational Leadership, 73(7), 50-54.
Hattie, J. & Timperley, H. (2007). The power of feedback. Review of Educational Research, 77(1), 81-112. https://doi.org/10.3102/003465430298487
Iamarino, D. L. (2014). The benefits of standards-based grading: A critical evaluation of modern grading practices. Current Issues in Education, 17(2), 1-11.
Klapp, A. (2018). Does academic and social self-concept and motivation explain the effect of grading on students’ achievement? European Journal of Psychology of Education, 33(2), 355-376. https://doi.org/10.1007/s10212-017-0331-3
McMorran, C., Ragupathi, K., & Luo, S. (2017). Assessment and learning without grades? Motivations and concerns with implementing gradeless learning in higher education. Assessment & Evaluation in Higher Education, 42(3), 361-377. https://doi.org/10.1080/02602938.2015.1114584
Miller, J. J. (2013). A better grading system: Standards-based, student-centered assessment. English Journal, 103(1), 111-118.
Plaskov, J. C. (2019, October 23). Reimagining college admissions season. The Mastery Transcript Consortium. https://mastery.org/reimagining-college-admissions-season/
Pulfrey, C., Buchs, C., & Butera, F. (2011). Why grades engender performance-avoidance goals: The mediating role of autonomous motivation. Journal of Educational Psychology, 103(3), 683-700. https://doi.org/10.1037/a0023911
Quinn, D. M. (2020). Experimental evidence on teachers’ racial bias in student evaluation: The role of grading scales. Educational Evaluation and Policy Analysis, 42(3), 375-392. https://doi.org/10.3102/0162373720932188
Rinn, A. N., Boazman, J., Jackson, A., & Barrio, B. (2014). Locus of control, academic self-concept, and academic dishonesty among high ability college students. Journal of the Scholarship of Teaching and Learning, 14(4), 88-114. https://doi.org/10.14434/josotl.v14i4.12770
Schneider, J. & Hutt, E. (2014). Making the grade: A history of the A-F marking scheme. Journal of Curriculum Studies, 46(2), 201-224. https://doi.org/10.1080/00220272.2013.790480
Smith, J. K. & Smith, L. F. (2009). The impact of framing effect on student preferences for university grading systems. Studies in Educational Evaluation, 35, 160-167.
Starch, D. & Elliott, E. C. (1912). Reliability of the grading of high-school work in English. The School Review, 20(7), 442-457.
Starch, D. & Elliott, E. C. (1913). Reliability of grading work in mathematics. The School Review, 21(4), 254-259.
Swan, G., Guskey, T., & Jung, L. (2014). Parents’ and teachers’ perceptions of standards-based and traditional report cards. Educational Assessment, Evaluation, and Accountability, 26(3), 289-299. https://doi.org/10.1007/s11092-01409191-4
Townsley, M. & Varga, M. (2018). Getting high school students ready for college: A quantitative study of standards-based grading practices. Journal of Research in Education, 28(1), 92-112.
Villeneuve, J. C., Conner, J. O., Selby, S., & Pope, D. C. (2019). Easing the stress at pressure-cooker schools. Phi Delta Kappan, 101(3), 15-19. https://doi.org/10.1177/0031721719885910
My ambition got away from me. I have continued to read Joe Feldman’s Grading for Equity, but I haven’t been posting reflections here. I will, eventually. I just need to finish a project I’m working on before I post here.
In other news, a manuscript I submitted to English Journal was rejected with encouragement to revise, and I just don’t have time to revise right now. I think the peer reviewers’ comments were helpful and would make the writing a stronger piece, but it’s just not going to happen. Instead, I plan to post the article here, perhaps in three or four parts, so that the ideas might be something you can implement in your classroom (if you are so inclined). I had good reviewers, and I appreciate the time they put into the manuscript. I know that’s a lot of work.
I’m a researcher and graduate student, and the power of feedback to make your writing and thinking better cannot be overstated, but sometimes you need to put the rough ideas out there anyway, so that’s what I plan to do.
What is this article about? Here is a little hint.
I have a copy of my great-great-grandmother Stella Bowling Cunningham’s diary from 1893-1894, which I transcribed. It’s a fascinating window into history for many reasons, one of which is that while Stella was writing the diary, she was a teacher. She married in May 1894, after which she had to quit teaching and keep house.
Her primary concerns as a teacher seem to center around keeping order in her classroom. She remarks very little on what she actually taught her students, but she mentions whether or not class was unruly a few times. I also have a copy of a letter she wrote my great-uncle Alvin, who must have been assigned to write to grandparents and ask what school was like when they were little. Stella’s letter is wonderful (I reproduced it on this blog about 14 years ago).
I think I have always found the history of education, particularly schools, fascinating. I really enjoyed reading Joe Feldman’s chapter on the history of grading in Grading for Equity. Much of it was material I already knew, as one of his sources, Schneider & Hutt’s (2014) article “Making the Grade: A History of the A-F Marking Scheme,” was one of my own sources as well. If you can get your hands on this article, I highly recommend you read it (the full citation, including DOI, is at the end of this post). I learned some really interesting things from it, particularly the fact that the A-F grading system is not really that old. It quickly became entrenched in schools, and it seems nearly impossible now to imagine schools without A-F grades, but they actually didn’t become entrenched until about the 1940s. My grandparents were still in school in the 1940s, though my grandfather would have graduated in the very early 1940s. The history of letter grades as a method for communicating learning isn’t that old.
First, yesterday I promised to continue reflecting on Feldman’s “Questions to Consider” for chapter 1 today; however, on reading them more closely, I’m not sure you care overmuch why I am reading this book or who I’m reading it with, so I’ll skip those, except to say that I’ll reconsider anything I’m doing if it means my grading practices will be more equitable. Chapter 2 dives into the history of schools and grades a bit more.
How do schools in the first half of the twenty-first century (their design, their purpose, their students) compare with schools in the first half of the twentieth century?
I have actually sat in desks that were bolted to the floor. Have you? I find that the design of classrooms, at least in schools where I have taught, is much more fluid. Desks are mobile, sometimes even on wheels. Students sit in a large circle or square in my classrooms. My classroom looks different from the classrooms I sat in and from the images of vintage classrooms (like the one at the beginning of this post). We also have projectors and computers. My students learn from viewing images and watching videos in addition to reading. Most stakeholders would probably agree that my school’s purpose is to prepare students for college. I don’t think that was the goal of most schools in the early 20th century.
Did you know that Thomas Jefferson was one of the first people to propose schools as we might describe them today? In his Notes on the State of Virginia (which isn’t read enough and is why people don’t realize how complicated and problematic Jefferson’s ideas could sometimes be), he wrote (emphasis my own, spelling his):
This bill proposes to lay off every county into small districts of five or six miles square, called hundreds, and in each of them to establish a school for teaching reading, writing, and arithmetic. The tutor to be supported by the hundred, and every person in it entitled to send their children three years gratis, and as much longer as they please, paying for it. These schools to be under a visitor, who is annually to chuse the boy, of best genius in the school, of those whose parents are too poor to give them further education, and to send him forward to one of the grammar schools, of which twenty are proposed to be erected in different parts of the country, for teaching Greek, Latin, geography, and the higher branches of numerical arithmetic. Of the boys thus sent in any one year, trial is to be made at the grammar schools one or two years, and the best genius of the whole selected, and continued six years, and the residue dismissed. By this means twenty of the best geniusses will be raked from the rubbish annually, and be instructed, at the public expence, so far as the grammar schools go. At the end of six years instruction, one half are to be discontinued (from among whom the grammar schools will probably be supplied with future masters); and the other half, who are to be chosen for the superiority of their parts and disposition, are to be sent and continued three years in the study of such sciences as they shall chuse, at William and Mary college, the plan of which is proposed to be enlarged, as will be hereafter explained, and extended to all the useful sciences. 
The ultimate result of the whole scheme of education would be the teaching all the children of the state reading, writing and common arithmetic: turning out ten annually of superior genius, well taught in Greek, Latin, geography, and the higher branches of arithmetic: turning out ten others annually, of still superior parts, who, to those branches of learning, shall have added such of the sciences as their genius shall have led them to: the furnishing to the wealthier part of the people convenient schools, at which their children may be educated at their own expence.
Pardon the long quote, but I find it worth quoting at length because several ideas come into focus if you read the whole thing:
School was never envisioned to be equitable, not even in the mind of the guy who wrote that “all men are created equal.” It was made to sort people, which is why tracking is still so common.
The language Jefferson uses is telling: he describes students as “rubbish.” He didn’t include girls or BIPOC in the calculation at all. It’s a pretty classist idea even if you remove the sexism and racism. You know the boy children of poor farmers weren’t going to college.
If you’re struggling to parse the language, the proposal is as follows:
Send one boy per “hundred” to a grammar school. The remaining students would end their schooling after three years in the “hundred” school.
Of those boys sent to grammar school, competition for continued education would be fierce: Jefferson suggests one or two years of grammar school to separate the wheat from the chaff, after which one of those grammar school students could continue his education for six more years.
Half of those boys lucky enough to continue their education past grammar school would then be able to go to college after those six years of education.
The competition among students was baked into American education early on. My great-great-grandmother Stella describes such competition when she describes spelling class: “We sat on long benches and a class would go up to the teacher to recite and sit on a long bench, only the spelling classes would stand in a row and ‘turn down’, when one missed a word.”
I would argue school has changed a great deal since the early 1900s, but some aspects of school haven’t changed much. I have cited studies ranging from 1888 to 2019 in my research that document traditional letter grades’ issues with reliability, consistency, motivation, and self-concept. Grades seem to be the one aspect of school we are resistant to changing, in spite of a large body of evidence supporting change.
Once again, I’ve gone on too long and you’re probably not reading anymore. More tomorrow on how I see ideas and beliefs of the early 20th century at work in schools where I have taught.
Citations for further reading:
Feldman, J. (2019). Grading for equity: What it is, why it matters, and how it can transform schools and classrooms. Corwin.
Back in the day, I sometimes reflected on professional reading on this blog, and sometimes, book clubs resulted. Blogging has fallen by the wayside in favor of Twitter, which makes me sad because sometimes the long-form reflection is better than a tweet thread. The UbD Educators wiki grew out of the reflection I did, and until Wikispaces went defunct, it was a promising project, though I confided to Grant Wiggins that it was hard to find teachers to commit to adding to the wiki. He wasn’t surprised because lack of time makes it difficult. I always say that we make time for the things that are important to us, and this blog is pretty important to me, but I hadn’t made a lot of time for it for some years. I’m going to try to change that, and one thing I want to do is document my thinking as I read Joe Feldman’s Grading for Equity. I joked to a couple of colleagues that I am finally making time to actually read this book, which has been on my radar for a long time, and I realize I should have made the time to read it as soon as it was released because Feldman is citing much of the same research as I am citing in my dissertation. I could have saved myself a lot of searching through the library database!
First of all, I encourage educators to take the quiz How Equitable is Your Grading? on Feldman’s website. If, in the wake of George Floyd’s murder, you are examining your curriculum’s diversity, equity, and inclusion, I think that’s great. I think it’s great if you are engaged in movements to #DisruptTexts and #TeachLivingPoets. You also need to take a hard look at your grading practices. If, as Feldman says, you are implementing some equitable practices, such as “responsive classrooms, alternative disciplinary measures, diverse curriculum, but meanwhile preserve inequitable grading,” you are perpetuating inequity in schools.
I’m going to start by using Feldman’s “Questions to Consider” at the end of chapter 1. I’ll just answer the first two and update tomorrow with responses to the remaining three questions. Otherwise, this post will be way too long. Maybe it already is!
What are some deep beliefs you have about teenagers? What motivates and demotivates them? Are they more concerned with learning or their grade?
After over 20 years of teaching mostly teenagers, I have concluded that a lot of adults expect them to be more “adult” because they tend to look more adult. What I mean is they expect teenagers to have developed an internal locus of control. Not even all adults have an internal locus of control. Teenagers tend to still mostly have an external locus of control, which means they are more likely to attribute a poor grade to a teacher’s lack of regard for them instead of a lack of proficiency on their part. I think we need to remember that when we are grading. As such, they might be motivated to earn good grades (carrot) or avoid bad ones (stick), but grades in and of themselves don’t motivate them to learn. I think they do help give students some kind of yardstick they can use to judge their performance, but I didn’t think grades had even this utility until I started doing research. Grades might not communicate what we think or wish they would, but they communicate something. I think students are much more concerned with grades than with learning when they are in classes in which all high-stakes assessments result in grades that cannot be improved through revision and in which all earned grades are averaged together. If, however, they are in a classroom that encourages revision and focuses on proficiency, they focus a lot more on learning. Teenagers actually love to learn things, but the trick is that teachers need to communicate the relevance, and the wrong answer is “I’m the adult, so I say it’s relevant.” And if what you are teaching isn’t relevant, you need to figure out how to Marie Kondo the curriculum.
What is your vision for grading? What do you wish grading could be for students, particularly the most vulnerable populations? What do you wish grading could be for you? In which ways do current grading practices meet those expectations, and in which ways do they not?
Before I started my research, I wanted to eliminate grades as a measure of student learning. There is a movement to do just that, and many schools successfully use other methods for reporting learning, and yes, their students still get into college. I no longer think grades are entirely useless. I think we have just perpetuated inequitable grading for so long that I couldn’t figure out another way aside from burning the whole system down. Now I advocate for proficiency-based grading, and that means that students might revise their work, sometimes several times, in order to reach a level of proficiency in learning content and skills. In almost any aspect of life, we have chances to practice a skill until we master it, and no one says it is unfair. There was a time when every musician we know didn’t know how to play their instrument, when every athlete didn’t know how to play their sport. But we don’t judge their current competence by where they started. I think grading based on reaching proficiency, whenever it happens or however it happens, is much more equitable.
My dissertation is a dissertation in practice, meaning I need to take an action step and evaluate its success. My action step is to create a proficiency-based grading and authentic assessment guide for a pilot group of faculty, to implement the practices therein (along with a focus group), to evaluate the guide’s success and revise it accordingly, and to present the findings to my colleagues. Feldman’s ideas will be invaluable in framing the guide, grounded also in my own research. I am hoping implementing this action step will make grading less of a chore for me, too; I related so much to Feldman’s argument that teachers don’t like grading (p. 5).
What I need to do is figure out a system that is more mathematically sound and use it. I am doing fairly well on most equitable grading practices according to Feldman’s quiz, with the exception of that one. For example, I already:
Don’t weigh homework much. Homework is preparation for class, such as reading and writing. I don’t even really use the homework category in my online grade book for graded work.
Don’t calculate behavior and executive function skills in my grade.
Allow students to revise their work and replace the grade entirely with the new grade.
Don’t subscribe to the idea that grades need to fall on a bell curve or that I need a certain distribution of grades.
Don’t count participation as a grade category. It is part of the rubric in a Socratic seminar.
I do not have students asking me to create homework assignments, and they mostly do the preparation I ask them to do. Students sometimes turn work in late for me, but it doesn’t bother me. Other than that, I don’t feel I miss anything by excluding executive function skills. Students actually work harder knowing the grade can entirely be replaced if the work improves. I don’t subscribe to fears about grade inflation or worries that students have too many high grades, and I find conversations with others who are still hung up here to be maddeningly frustrating. I have long felt participation was too slippery to calculate, and sometimes students are super engaged but don’t say as much. I still get excellent participation from students without grading it.
More tomorrow on the first chapter reflection questions. Let me know if you want to “book group” this book.