A learner gets back a quiz with 86% stamped at the top. Eyes scan for red marks, shoulders shrug, the paper folds into a binder pocket or gets balled up and tossed into the trash can. End of story. But where’s the learning story? What did they understand? Where did the wheels wobble? What’s the very next move? That tidy little number didn’t answer any of that.
If we’re honest, grades are often mirrors for adults, reflecting how we designed the learning, more than they are maps for learners. Numbers travel fast. Understanding, not so much. Let’s reframe grades as feedback for us, and give learners something far better than the dead end they currently get.
The Hodgepodge Problem: Why Grades Don’t Map Learning
Look inside most gradebooks and you’ll see the mash-up: points for quizzes and essays spiked with late penalties, extra credit for tissues, participation checks, neatness, “had a heading this time.” That cocktail is exactly what researchers have called a hodgepodge: a single mark that mixes achievement with behavior, effort, and attitude. The result? A murky signal about what a learner actually knows and can do.
This isn’t new, and it isn’t small. Cross & Frary’s classic survey found widespread use of nonacademic factors in secondary grading, confirming decades of prior studies and popularizing Brookhart’s “hodgepodge” label. Later reviews show the pattern persists across subjects and grade levels: different teachers, different blends, different meanings, even when learners demonstrate similar understanding.
Layer on the reliability problem, especially with percentage scales. A century ago, Starch & Elliott showed huge variation when many teachers graded the same paper; modern replications still find wide variation, even with common rubrics. If two competent teachers can rate the same work anywhere from “borderline” to “excellent,” then an 86% is less a measurement and more a momentary opinion, shaped by scale choice, rubrics, penalties, and what we decide to count.
So when a learner asks, “What does my 86% mean?” the honest answer is: it depends on what got mixed in, who did the grading, and how the points were sliced. That’s not a learning map. It’s a weather report. We all know how accurate those can be, and tomorrow’s forecast might look different.
Feedback Beats Labels: What Actually Moves Learning
Numbers travel fast. Learning moves when information turns into next steps. That’s why two high-leverage levers, teacher clarity and formative evaluation, consistently outperform labels and grades. MetaX reports a weighted mean effect size ~0.85 for teacher clarity (clear intentions and success criteria) and ~0.40 for formative evaluation (checking understanding during learning and responding). Both practices help learners answer: What am I aiming for? Where am I now? What will I do next?
Zoom out to the feedback itself. The MetaX synthesis pegs feedback’s overall impact around 0.50, while comparisons of comments versus grades show small, mixed effects. The takeaway isn’t “never grade”; it’s “make feedback usable.” Task- and process-level information (“Do this next; try this strategy”) beats labels and grades because it reduces the gap between current and desired performance. Pair that with opportunities to act (revision, error analysis, reteaching), and you’ve turned a single grade into systematic growth.
This is the heart of the case for formative assessment: evidence of learning must change what happens next, whether that’s tomorrow’s mini-lesson, today’s small group, or a learner’s revision plan. When assessment information actually drives instruction (and not just the gradebook averages), achievement rises. Not because a number appeared in the gradebook, but because teaching and learning adjusted in real time. That’s feedback with a pulse.
Try This on Monday: Practical Shifts K–12
We don’t need a new grading religion tomorrow. We need better design, and we can start inside today’s constraints.
In K–5: Go Standards-Based Where Mastery Matters
Early grades are primed for standards-based grading (SBG). Report what a learner knows and can do against specific standards, not a single averaged score. Keep work habits (effort, behavior, timeliness) separate so they don’t distort the academic signal. Your conferences instantly get clearer: “In 3.NF.1 she’s consistently representing unit fractions; in 3.NF.2, number-line placement is emerging.” (This directly addresses the hodgepodge problem by isolating achievement from non-achievement.)
In 6–12: Stuck with a Traditional Gradebook? Make It Act Like Feedback.
If your SIS insists on percentages, retrofit it:
- Tag every assessment item to a standard. Enter scores by standard (or create categories) so a 79% unpacks into “strong in evidence, shaky in reasoning,” instead of a mystery.
- Use most-recent or trend, not average. Replace old attempts when new evidence shows growth, so grades report current learning, not a mean of past versions of the learner.
- Separate habits from achievement. Keep “responsibility” points out of academic tallies; report them in a different column or comment.
These moves preserve compliance with traditional systems while giving learners a map, not a mystery.
Replace Point-Hunting with Learning Routines
1. Five-Item Formative Checks + Two Error Explanations
Skip the 10-question “gotcha” quiz. Give five well-designed items aligned to one or two targets. Then ask learners to explain why two answers are wrong (or how a common error happens). This taps the self-explanation effect: explaining “why” strengthens understanding far more than circling answers. I can close my eyes and pick a “right” answer, but it takes knowing something to explain why an answer is wrong.
2. Build an Error-Analysis Protocol
Make reflection a habit after every assessment: What did I get wrong? Why did I get it wrong (misread, misconception, missing step)? What will I do next time to avoid this error?
Error analysis deepens conceptual understanding and improves transfer, especially when paired with worked examples and prompts. Learners don’t just fix this problem; they rewire the idea.
3. Teach with Success Criteria
I’ve written about success criteria many times before. Post the criteria in learner language and point to them during modeling, practice, and feedback. Use them to structure peer review and self-assessment. This is teacher clarity in action, and it’s one of your highest-yield design choices.
4. Comment Like a Coach
Shift from “88%—Nice job” to task/process feedback: “Claim is clear. Next, integrate two pieces of textual evidence and explain the link back to your claim.” Or: “Great representation. Now label axes and scale; then describe the trend in one sentence.”
When it comes to results, effective coaches don’t focus on the final score; they focus on what led to the score. That’s information learners can use tomorrow. And the evidence says it’s what moves the needle.
So… What Does an 86% Mean?
On its own? Almost nothing. As part of a living, breathing feedback system? It can mean we have work to do: tightening clarity, improving tasks, dialing up the formative checks, designing space for error analysis and revision. Grades should inform the adults about our design decisions and inform learners about their next moves. If a mark can’t do both, it’s not feedback; it’s a full stop.
Let’s build classrooms where every learner can answer the three magic questions: What am I learning? Why am I learning it? How will I know I’ve learned it? And let’s make sure that our gradebooks don’t just keep score, but that they keep the learning going.