Understanding by Design: Criteria and Validity

I usually wait until I’m finished reading a chapter before I start my reflection, but I am finding myself talking back to “Criteria and Validity” (Understanding by Design) a lot, so I decided to blog as I read.

Wiggins and McTighe recommend the use of an analytic rubric and argue against “boil[ing] down an evaluation to a single (holistic) score” (174-175). I’m not sure I understand. Do they mean we should give, for example, six different scores on a composition without adding or averaging the scores to make a single grade? I’m not sure that’s realistic given the confines of the grading system. I have to give a grade on a composition, and I am not sure I would be supported if I chose to break the grade into six different pieces without putting them back together again to form a whole, or a single grade. Ultimately, I don’t see a way around giving a single grade to student work. Even if I look at separate criteria, I still have to average the scores on each criterion together in order to arrive at a final score.

The authors also argue that assigning a series of grades and averaging those grades over the course of a grading period is “counterproductive” (177). I don’t have an option with this one, and I’m sure I’m not alone. Yes, I think it is productive to use a portfolio to show progress. However, I am not sure I have ever worked at a school that would find the grading system Wiggins and McTighe propose acceptable. I have had to average grades and provide a final grade everywhere I’ve worked. I would love to be able to do away with grades and just give students feedback. I think it would take the pressure off students, who could simply learn and perhaps care more about what they learn. Unfortunately, getting rid of grades is unrealistic in the extreme. I doubt that many students, parents, teachers, or administrators are with me on this one, for one thing, and for another, I’m not sure it’s realistic to expect that students will be intrinsically motivated to learn without that carrot of “passing” or even making good grades in front of them. Would they? I know that students would be motivated to learn what they care about, just as all of us are. If we care about knitting, we are motivated to learn how and to do it well.

I ran into another problem using rubrics this year. I have excellent rubrics, and in fact, it was Jay McTighe who introduced me to them. I stapled rubrics to each paper I gave back, with performance areas circled. Most students didn’t bother reading them and didn’t seem to consider them feedback. You might want to see this post for more discussion of problems I had with rubrics. I think rubrics are excellent, but I’m not sure they help students when we use them to respond to their work. Also, I am not sure we always need to look at the parts instead of the whole. A student might, for instance, write an essay that demonstrates he/she composed a compelling thesis and fully developed it, organized the paper effectively, varied sentences in a sophisticated and engaging manner, and demonstrated command of grammatical and mechanical conventions. But let’s say this was a research paper, and most of the information was documented correctly and the Works Cited page was a disaster. Using a rubric rigidly, the student wouldn’t be evaluated properly for his or her understanding of how to write a research paper or cite evidence if all of the other criteria had the same weight. One solution, of course, is to weight the criteria differently depending on which elements are most important for the understandings you are trying to assess.
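To make the weighting idea concrete, here is a rough sketch of how the arithmetic might work. Everything in it is my own assumption for illustration, not anything from the book: the 4-point scale, the criterion names, the weights, and the `weighted_rubric_score` helper are all hypothetical.

```python
def weighted_rubric_score(scores, weights):
    """Return the weighted average of per-criterion rubric scores."""
    if set(scores) != set(weights):
        raise ValueError("scores and weights must cover the same criteria")
    total_weight = sum(weights.values())
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive number")
    return sum(scores[c] * weights[c] for c in scores) / total_weight


# Hypothetical research paper on an assumed 4-point scale:
# strong everywhere except documentation (the Works Cited problem).
paper = {
    "thesis": 4,
    "organization": 4,
    "sentence_variety": 4,
    "conventions": 4,
    "documentation": 1,
}

# Every criterion counts the same.
equal_weights = {criterion: 1 for criterion in paper}

# Assumed weighting for a research paper: documentation counts four times.
research_weights = dict(equal_weights, documentation=4)

equal_score = weighted_rubric_score(paper, equal_weights)        # 17 / 5
research_score = weighted_rubric_score(paper, research_weights)  # 20 / 8

print(f"Equal weights:    {equal_score:.2f}")
print(f"Research weights: {research_score:.2f}")
```

With equal weights the weak documentation barely dents the average (3.40 out of 4), while the heavier documentation weight pulls the score down to 2.50, which seems closer to what the paper shows about the student's grasp of research writing. The particular weights are a judgment call for each assignment.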

It occurred to me that the rubric on pp. 178-179 could be transferred to a wide variety of assessments.  I am required to write reports about students on their report cards, and I think the wording of the rubric could be tweaked to demonstrate a student’s understanding of the big ideas of an entire grading period.

On pp. 181-182 the authors outline a six-step process for analyzing student performances:

  1. Gather samples of student performance that illustrate the desired understanding or proficiency.
  2. Sort student work into different “stacks” and write down the reasons.
  3. Cluster the reasons into traits or important dimensions of performance.
  4. Write a definition of each trait.
  5. Select samples of student performance that illustrate each score point on each trait.
  6. Continuously refine.

I think this is a worthwhile process, but it could take years, too. Is this a problem? I fully realize that in many ways, students who have a teacher who is in the middle to late part of his/her career will have a teacher who has tested, refined, and honed his/her teaching methods, but I can’t help but think that new teachers will find much of this chapter somewhat disheartening. Actually, come to think of it, it makes me cringe when I think I only came across these ideas after teaching for 10 years. What about all my former students? They frankly didn’t have as good a teacher then as they would have if they were in my classes now. This line of thinking is depressing. On the one hand, it’s unrealistic to expect that a teacher would be different from any other person; we are not born effective teachers. On the other hand, our job is so important, I think, that we need to be effective from the start.

The authors next discuss validity, “the meaning we can and cannot properly make of specific evidence, including traditional test-related evidence” (182).  I realize that I often compose tests with questions that all have an equal weight, but are not equal in difficulty, and that’s something I need to address in the future.  I think a lot of us do that, but frankly, it doesn’t really mean that students understand some of the big ideas in a unit if they can guess with a 50/50 shot on a true/false question or match characters to their descriptions or quotations.  To be fair, these types of questions are a staple of many “canned” testmaking companies, such as Perfection Learning.   One of the reasons I wanted to make sure I read all the summer reading this year is to ensure that my objective tests over the reading really assessed the types of understandings I wanted the students to have.   I would like to construct units and courses in which no student who did not truly understand the big ideas could do well on the assessments.  I am not trying to be punitive, but I don’t want to ever feel again as though my students’ grades are based on their ability to memorize and regurgitate.

We have to be sure that the performances we demand are appropriate to the particular understandings sought.  Could a student perform well on the test without understanding?  Could a student with understanding nonetheless forget or jumble together key facts?  Yes and yes — it happens all the time.  We want to avoid doubtful inferences when assessing any student work, but especially so when assessing for understanding. (183)

And as the authors emphasize, looking at the students’ thought processes can be key in discovering why they didn’t appear to understand.  In fact, we may find that they really did understand, but made a key mistake that impeded them from obtaining the correct result.  Math teachers, I think, intuitively understand the value of “showing your work.”

In determining whether an assessment is truly valid evidence of a student’s understanding, the authors argue we should ask ourselves how likely it is that “a student could do well on this performance task, but really not demonstrate the understandings [we] are after” or whether “a student could perform poorly on this task, but still have significant understanding of the ideas and show them in other ways” (184).

Once again, the authors stress self-reflection and peer review for analysis of performance tasks. I am more and more glad all the time that we started the UbD wiki in order to participate in peer review. I do not think my colleagues at school would necessarily be inclined to participate in a project like this, and how wonderful is it that when we find ourselves in such circumstances, we can use Web 2.0 tools to create a cyber faculty lounge (or, to put it more aptly, a cyber professional development program). Are any of you able to earn professional development credit for participating in the wiki? Frankly, I think an activity like this could do more for our professional development than some of the ridiculous classes we have to participate in. Well, let me back up, because I don’t think I have had to participate in those types of classes since I began working at my current school, but I sure have felt some of the staff development I’ve done in the past was a waste of my time.

I really like the self-assessment on p. 187, and I decided to test my project ideas for the Historical Fiction novel I want my British Lit. students to read.  You can look over my self-assessments, if you’d like, and I’d appreciate comments:

In reading this chapter, I was reminded of the problems inherent in the SAT and similar standardized tests. Think of how much weight is put upon a student’s performance on these tests, which may amount to one test on one day. Or what about the fact that many college courses assess students’ understanding on only two tests? I don’t understand the concepts I learned in Physical Geography, I can tell you that, but I managed to earn an A by sheer memorization and regurgitation. I thought in particular of the SAT essay. Much care has been taken to try to make this writing task a reliable indicator of a student’s ability to write, but so many factors come into play that can impede a student’s performance. For one thing, have you ever seen the questions? Some of the prompts are very good, but a great many are mediocre or poor. How is a student’s performance on any of these types of assessments “typical of the student’s pattern of performance”? (189).

As the chapter concludes, the authors provide a handy list of general guidelines to use in creating assessments of understanding.  I think it really helps if we can figure out ways for students to show us what they are thinking.  I found a really good lesson plan at ReadWriteThink.org that addresses metacognition in composition.

Work Cited: Wiggins, Grant, and Jay McTighe. Understanding by Design. Expanded 2nd Edition. Alexandria, VA: ASCD, 2005.


One thought on “Understanding by Design: Criteria and Validity”

  1. Dana, I REALLY appreciate the reflections you're writing on UbD. They are FABULOUS! As soon as I have some spare $$$, I'm going to purchase the book.

    Your reflections certainly have given me a lot to think about as I'm heading into my first official teaching job. I have to admit that I'm now more nervous than ever. I have a lot to read over this summer–teaching books and coursework books for my students too. YIPES!
