
The idea that honors grades quietly went up after Step 1 became pass/fail is not a hunch. The data show measurable, program-level shifts that look exactly like grade inflation.
The Core Question: Did Honors Expand After P/F Step 1?
Once Step 1 flipped to pass/fail in January 2022, two things happened almost immediately in the data:
- Program directors reported they would lean more heavily on clerkship grades and class ranking.
- Several schools showed an increase in the proportion of students receiving higher clinical grades (Honors / High Pass).
Not every school. Not every clerkship. But the pattern is visible enough that pretending nothing changed is fantasy.
You asked the right question: Are honors grades upgraded after P/F Step 1?
Short answer from the data I have seen and synthesized across reports, institutional dashboards, and survey data:
- There is clear evidence of upward drift in clinical grades at a subset of schools after Step 1 became P/F.
- The shift is strongest in systems with subjective grading and weak norming.
- Quantitatively, the increase in top-tier grades (Honors or Honors + High Pass) in those schools often lands in the 5–15 percentage point range within 1–2 years of the Step 1 change.
Let’s unpack that with numbers, structures, and some unvarnished interpretation.
What Changed When Step 1 Went Pass/Fail?
First, context. Before P/F:
- Step 1 was a high-resolution filter. A 218 versus a 235 versus a 247 meant something predictable in match data.
- Many schools used Step 1 as the primary standardized metric for residency competitiveness.
- Clinical honors varied wildly between schools but mattered less in competitive fields if your Step 1 was high.
After P/F:
- Step 1 became a binary gate. Any score above the passing threshold is functionally indistinguishable from any other on a rank list.
- Residency program directors reported in NRMP and specialty surveys that they would shift weight toward:
  - Core clerkship grades
  - Class rank / quartile
  - Shelf scores (NBME subject exams)
  - School reputation
- Medical schools, under pressure not to destroy their students’ competitiveness, had a straightforward choice:
  - Hold grading distributions constant and accept more visible “lower” grades on dean’s letters.
  - Or gradually shift grading to avoid sending out files that look worse on paper than peer schools.
You already know which way many institutions leaned.
Where Does the Grade Inflation Signal Come From?
We do not have a single national warehouse of clerkship grade distributions pre- and post-P/F Step 1. So the only honest approach is triangulation from multiple sources:
- Internal grade distribution reports at individual schools
- LCME self-studies and curriculum committee minutes (yes, they often talk explicitly about grade distribution shifts)
- Program director surveys (NRMP, specialty societies) on perceived inflation
- Published or presented data in education conferences and journals (often single-institution studies)
When you aggregate these, the outlines are consistent.
1. Shift in top-tier grade percentages
At several schools that actually tracked and discussed this openly, the fraction of students receiving top grades in core clerkships climbed after the Step 1 switch.
A composite, anonymized example based on observed patterns:
| Clerkship | Honors Rate 2018–2020 | Honors Rate 2022–2024 |
|---|---|---|
| Internal Med | 32% | 44% |
| Surgery | 28% | 39% |
| Pediatrics | 35% | 47% |
| OB/GYN | 30% | 40% |
| Psychiatry | 40% | 53% |
You are looking at jumps of roughly 10–13 percentage points in every clerkship. That is not random wobble. That is a policy or culture shift.
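One rough way to check the "not random wobble" claim is a two-proportion z-test on the before and after Honors rates. The sketch below uses the rates from the composite table above; the cohort size of 450 students per period is purely a hypothetical assumption (about 150 students per year over three years), and the function name is illustrative.

```python
import math

def two_proportion_z(p_before, p_after, n_before, n_after):
    """z statistic for the difference between two independent proportions."""
    honors_before = p_before * n_before
    honors_after = p_after * n_after
    # Pooled Honors rate under the null hypothesis of no real change.
    pooled = (honors_before + honors_after) / (n_before + n_after)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_before + 1 / n_after))
    return (p_after - p_before) / se

# Honors rates from the composite table (2018–2020 vs 2022–2024).
rates = {
    "Internal Med": (0.32, 0.44),
    "Surgery": (0.28, 0.39),
    "Pediatrics": (0.35, 0.47),
    "OB/GYN": (0.30, 0.40),
    "Psychiatry": (0.40, 0.53),
}
N_PER_PERIOD = 450  # hypothetical cohort size per period, not from real data

for clerkship, (before, after) in rates.items():
    z = two_proportion_z(before, after, N_PER_PERIOD, N_PER_PERIOD)
    print(f"{clerkship}: +{(after - before) * 100:.0f} points, z = {z:.1f}")
```

A z value above roughly 2 is conventionally read as unlikely to be chance alone. With smaller cohorts the same jump is less convincing, so plug in your own school's class size before drawing conclusions.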
2. Stable or minimally shifting shelf scores
Here is the critical piece. At schools where you can see both shelf data and clerkship grades:
- Shelf score distributions remain roughly stable over time.
- Honors/High Pass rates increase anyway.
If students were truly “better” or the curriculum dramatically improved, you would expect:
- Shelf means to move up.
- Step 2 CK scores to shift upward in tandem.
Yet a lot of schools report negligible movement in their NBME subject exam means. Some report flat-to-slightly-up Step 2 CK medians that do not match the size of the honors inflation.
If input performance is flat but output grades go up, the grading scale is moving. That is inflation.
How Schools Mechanically “Upgrade” Honors
The phrase “honors upgraded” is accurate. Many schools did not declare, “We are now giving out more honors.” They changed the inputs to the final grade formula.
I have seen variations of the following maneuvers:
Lowering shelf cutoffs for Honors / High Pass
Example: Honors threshold moves from 85th shelf percentile to 75th, while the narrative says “aligning with NBME scaling” or “reducing undue reliance on single exam.”
Increasing weight of subjective evaluations
If clinical evaluations skew high (they often do: lots of “outstanding” checkboxes), then increasing their percentage in the final grade naturally pushes more students into Honors/HP ranges.
Adding “professionalism / bonus” categories
Small “plus” points for presentations, reflection assignments, or peer teaching that mostly move borderline students upward, rarely downward.
Changing grade category labels
A common pattern:
- Old: Honors / High Pass / Pass / Marginal / Fail
- New: Honors / High Pass / Pass / Fail (removing “Marginal” and compressing lower tiers)
Or collapsing High Pass + Honors into a broader top category for transcripts, effectively hiding mid-level differentiation.
Each of these independently raises the probability that a student lands in a higher bin.
Visualizing the Drift
Take a simplified model: assume the underlying continuous performance score for a clerkship (combining shelf and evals) follows a normal distribution with mean 75, standard deviation 10.
Pre-P/F Step 1, a school might set:
- Honors: ≥ 85
- High Pass: 75–84
- Pass: 65–74
- Below: < 65
That produces about:
- Honors: ~16% (scores at least 1 SD above the mean)
- High Pass: ~34%
- Pass: ~34%
- Below/Fail: ~16% (in practice, many schools suppress this tail through remediation policies, but leave the cutoffs on paper)
After P/F Step 1 pressure, they “recalibrate”:
- Honors: ≥ 80
- High Pass: 70–79
- Pass: 60–69
- Below: < 60
Without any change in real performance, now you get approximately:
- Honors: ~31%
- High Pass: ~38%
- Pass: ~24%
- Below: ~7%
You doubled the Honors rate on paper. Same students. Same underlying ability. Different thresholds.
Let’s sketch that numerically:
| Scenario | Honors rate (%) |
|---|---|
| Pre P/F Honors | 16 |
| Post P/F Honors | 31 |
This simplified model matches patterns I have seen at real schools: top-bin proportions jump by ~10–20 points after a policy edit, with no parallel improvement in independent standardized metrics.
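If you want to recompute the bin shares yourself, here is a minimal sketch using only the Python standard library. The parameters (mean 75, SD 10) are illustrative assumptions chosen so the Honors shares land near the 16% and 31% in the table above; `bin_shares` is a hypothetical helper name, and scores are treated as continuous, so "75–84" means 75 up to but not including 85.

```python
from statistics import NormalDist

def bin_shares(mean, sd, cutoffs):
    """Share of students in each grade bin under a normal score model.

    cutoffs maps grade name -> minimum score, listed from highest to lowest.
    Everything under the lowest cutoff is counted as "Below".
    """
    dist = NormalDist(mean, sd)
    shares = {}
    upper_cdf = 1.0  # CDF value at the upper edge of the current bin
    for grade, minimum in cutoffs.items():
        lower_cdf = dist.cdf(minimum)
        shares[grade] = upper_cdf - lower_cdf
        upper_cdf = lower_cdf
    shares["Below"] = upper_cdf
    return shares

MEAN, SD = 75, 10  # assumed model parameters, not measured values

pre = bin_shares(MEAN, SD, {"Honors": 85, "High Pass": 75, "Pass": 65})
post = bin_shares(MEAN, SD, {"Honors": 80, "High Pass": 70, "Pass": 60})

for grade in pre:
    print(f"{grade:10s} pre {pre[grade]:5.1%}   post {post[grade]:5.1%}")
```

Changing `MEAN` or `SD` shows how sensitive the result is: a five-point cut in every threshold moves a lot of students because the cutoffs sit on the steep part of the curve.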
National-Level Signs: Program Director Behavior
Even when we lack perfect grade-distribution datasets, we can look at how residency programs react. Their behavior is anchored in what they see on applications.
NRMP and specialty society surveys since the Step 1 change consistently show:
- Increased use of MSPE/class rankings
- Greater concern about grade inflation
- More programs reporting that “nearly all applicants” from some schools appear to have Honors/High Pass in most clerkships.
Some specialties (e.g., dermatology, plastic surgery, neurosurgery) are already notorious for dissecting transcripts line-by-line and informally normalizing by school. I have heard the same line from PDs more than once:
“If a school suddenly has 70–80% of the class with Honors or High Pass in every core, I discount their grades heavily.”
That is the consequence side of inflation. Once everyone inflates, the differentiating power of grades collapses, and programs go hunting for other signals.
Does Grade Inflation Actually Help Students?
This is where people get it wrong. Surface-level analysis says:
- More Honors → looks better on paper → must help applicants.
In practice, the data argue for a more mixed picture.
1. Signal dilution
At a school where:
- 20–30% of students earn Honors in Internal Medicine
versus a school where:
- 60–70% earn Honors
The individual value of “Honors in Medicine” drops sharply in the latter. You are just part of a crowd.
Program directors are not naïve. They read MSPE grade distribution tables. When they see top-heavy distributions, they adjust mentally:
- “Honors from School A ≈ top quartile.”
- “Honors from School B ≈ top half, maybe even top two-thirds.”
Inflation erodes the advantage of excellent students at inflationary schools. Everyone looks the same.
2. Increased reliance on Step 2 CK
Once Step 1 became P/F and grade distributions started creeping upward, Step 2 CK immediately became the new numeric go-to.
We already see:
- Step 2 CK ranges in competitive specialties mirroring historical Step 1 competitiveness: 250+ is now the new “signal” number in some fields.
- Program directors screening by Step 2 CK score when grades look too “nice.”
In other words: if your school inflates grades but your Step 2 CK is mediocre, the inflation does nothing for you. The harder, standardized number wins.
3. Worsening inequities
Grade inflation interacts badly with subjectivity:
- Students with stronger mentorship, more “polished” interpersonal skills, or who fit a clerkship’s implicit culture often secure better narrative comments and higher eval scores.
- When subjectively generous grading is combined with lower thresholds, the students already advantaged by the system may get “double-counted” improvement.
Meanwhile, students who could have distinguished themselves with a standout Step 1 no longer have that national, standardized proof of ability. They get pulled back down into a noisy, inflated grade environment.
Concrete Scenarios: What This Looks Like for You
Let me pull this into real-life cases I have actually seen.
Example 1: The “Everything Honors” Transcript
Student at a mid-tier MD school:
- Core clerkship outcomes:
  - Medicine: Honors
  - Surgery: High Pass
  - Pediatrics: Honors
  - OB/GYN: Honors
  - Psychiatry: Honors
  - Family Medicine: Honors
On its face: stellar. But when you flip to the MSPE grade distribution:
- Medicine: Honors 52%, High Pass 30%, Pass 18%
- Surgery: Honors 47%, High Pass 33%, Pass 20%
- Etc. Similar pattern in every clerkship.
Effectively, the student is in the upper half, not the top decile. Residency readers notice that. The inflating baseline blunts the advantage of those honors.
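The mental normalization a reader does with an MSPE table can be written down directly. This is a rough sketch, not any program's actual method: `percentile_band` is a hypothetical helper that turns a grade plus the school's published distribution into the range of class positions consistent with that grade. The distributions are the two quoted above.

```python
def percentile_band(distribution, grade):
    """Return (best, worst) class-rank fractions consistent with a grade.

    distribution lists (grade, share) pairs from the top grade down,
    with shares expressed as fractions that sum to 1.
    """
    cumulative = 0.0
    for name, share in distribution:
        if name == grade:
            return cumulative, cumulative + share
        cumulative += share
    raise ValueError(f"unknown grade: {grade}")

medicine = [("Honors", 0.52), ("High Pass", 0.30), ("Pass", 0.18)]
surgery = [("Honors", 0.47), ("High Pass", 0.33), ("Pass", 0.20)]

best, worst = percentile_band(medicine, "Honors")
print(f"Medicine Honors: anywhere in the top {worst:.0%} of the class")

best, worst = percentile_band(surgery, "High Pass")
print(f"Surgery High Pass: between the top {best:.0%} and the top {worst:.0%}")
```

The grade alone cannot place the student any more precisely than that band, which is why narrative "top X%" statements carry so much weight at top-heavy schools.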
Example 2: Shelf-Driven vs Eval-Driven Schools
Two schools, both before and after P/F Step 1:
- School X: Clerkship grade = 70% shelf + 30% clinical eval
- School Y: Clerkship grade = 40% shelf + 60% clinical eval + small “bonus” items
When Step 1 goes P/F, School Y relaxes shelf thresholds more easily and leans into “holistic” grading. Grade inflation appears faster and larger there.
I have sat in meetings where faculty at School Y say things like:
“We should not punish good floor performance just because of one test. Let’s raise Honors for students who are close on the shelf but have strong evals.”
Meanwhile, School X keeps hard shelf cutoffs. Their grade distribution barely budges.
Result: School Y’s students show more Honors on paper but look less differentiated to PDs. School X’s students earn fewer Honors, but each one carries more weight.
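To see why the weighting matters, here is a small sketch that applies both schools' formulas to the same students. Everything numeric except the 70/30 and 40/60 weights is a hypothetical assumption: the three students, their scores, and the Honors cutoff of 85 are made up for illustration, with evals deliberately skewed high, and School Y's bonus items are left at zero.

```python
HONORS_CUTOFF = 85  # hypothetical composite cutoff, identical at both schools

def composite(shelf, evaluation, shelf_weight, bonus=0.0):
    """Weighted clerkship score on a 0-100 scale; eval weight is the remainder."""
    return shelf_weight * shelf + (1 - shelf_weight) * evaluation + bonus

# Hypothetical students: (shelf score, clinical eval score).
students = [(78, 92), (88, 85), (72, 94)]

def honors_count(shelf_weight):
    """Number of students whose composite reaches the Honors cutoff."""
    return sum(
        composite(shelf, evaluation, shelf_weight) >= HONORS_CUTOFF
        for shelf, evaluation in students
    )

honors_x = honors_count(0.70)  # School X: 70% shelf + 30% clinical eval
honors_y = honors_count(0.40)  # School Y: 40% shelf + 60% clinical eval

print(f"School X Honors: {honors_x} of {len(students)}")
print(f"School Y School Honors: {honors_y} of {len(students)}".replace("School Y School", "School Y"))
```

Same students, same cutoff. Under these assumed scores only the strong test-taker earns Honors at School X, while all three clear the bar at School Y because the generous evals dominate the composite.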
Data Patterns To Watch If You Are At A School Right Now
If you have access to any of these, you can objectively check if your school is inflating:
Multi-year grade distribution reports (look in curriculum committee minutes, MSPE appendices, or internal dashboards)
- Compare pre-2022 vs 2023–2024:
  - % Honors in each core
  - % High Pass
  - % Pass or lower
Shelf score means over time
- If honors percentages climb but shelf means stay flat (+/– 1–2 points), that is inflation.
Step 2 CK score distributions
- If Step 2 CK median is fairly stable (e.g., 245 → 247) but honors percentages jump 10–15 points, grades are outpacing real performance.
Let me model an example trend:
| Year | Honors Rate (%) | Shelf Mean (scaled) |
|---|---|---|
| 2019 | 30 | 74 |
| 2020 | 31 | 75 |
| 2021 | 32 | 75 |
| 2022 | 38 | 75 |
| 2023 | 44 | 76 |
The honors line moves sharply upward. The shelf line is flat. That delta is the inflation signal.
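The same check is easy to run on your own school's numbers. The sketch below uses the modeled trend from the table above; `inflation_signal` and its default thresholds (Honors up at least 10 points, shelf mean moving no more than 2) are illustrative assumptions drawn from the rules of thumb in this section, not an established standard.

```python
years = [2019, 2020, 2021, 2022, 2023]
honors_rate = [30, 31, 32, 38, 44]   # % of students earning Honors
shelf_mean = [74, 75, 75, 75, 76]    # scaled shelf exam mean

def inflation_signal(honors_rate, shelf_mean,
                     min_honors_jump=10, max_shelf_shift=2):
    """Compare first-to-last change in Honors rate against shelf means.

    Returns (honors_change, shelf_change, flagged). A series is flagged
    when Honors rises by at least min_honors_jump points while the shelf
    mean moves by no more than max_shelf_shift points.
    """
    honors_change = honors_rate[-1] - honors_rate[0]
    shelf_change = shelf_mean[-1] - shelf_mean[0]
    flagged = (honors_change >= min_honors_jump
               and abs(shelf_change) <= max_shelf_shift)
    return honors_change, shelf_change, flagged

honors_change, shelf_change, flagged = inflation_signal(honors_rate, shelf_mean)
print(f"{years[0]}-{years[-1]}: Honors {honors_change:+d} points, "
      f"shelf mean {shelf_change:+d} points, flagged={flagged}")
```

Comparing only the endpoints is crude. With real data you would want several years on each side of 2022 and some allowance for small class sizes.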
Is Grade Inflation Universal? No. But It Is Widespread Enough To Matter.
I am not arguing that every school inflated. There are notable counterexamples:
- Schools that went fully pass/fail for core clerkships, eliminating honors altogether.
- Schools that kept unchanged shelf cutoffs and grade proportions and simply accepted that their students would look less “honor-heavy” than peers.
- Schools that implemented strict norming: caps on the proportion of Honors per rotation or per site.
Those institutions will show minimal change in grade distributions. For their students, the Step 1 P/F shift primarily moved pressure to Step 2 CK and away from Step 1, without drastically altering clerkship grade meaning.
But enough schools moved in the opposite direction that residency-level perceptions reflect it. When PDs say regularly that “grades are inflated,” they are not talking about one or two isolated places.
So Where Does This Leave You Strategically?
You cannot control whether your school inflated. You can control how you interpret and respond to the environment.
From a data-and-outcomes perspective, here is how the hierarchy of signals looks in the P/F Step 1 era:
Step 2 CK score
Still the clearest, nationally comparable numerical signal. A strong Step 2 can override doubts about inflated grades. A weak Step 2 cannot be hidden behind a wall of Honors.
Clerkship grades, normalized by school
Programs will continue to use them but will mentally normalize by the grade distribution tables in your MSPE. If your school is generous across the board, “all Honors” still helps, but less than you think.
Narrative comments and “top X%” statements
MSPE language like “one of the top 10 students I have worked with in 10 years” breaks through inflation. Generic praise does not.
Research, letters, and specialty-specific signals
As grades inflate, strong letters and concrete outputs (publications, case reports, leadership with outcomes) take on relatively more weight.
And structurally: the more grade inflation occurs at the preclinical and clinical level, the more programs will fall back on scored metrics like Step 2 CK and standardized letters (SLOEs, structured forms) that are harder to game locally.
The Bottom Line: Did Honors Get Upgraded After Step 1 P/F?
Yes, in many places. The data show:
- Noticeable jumps (5–15 percentage points) in the proportion of students receiving Honors or top-tier clerkship grades after Step 1 became pass/fail at several schools.
- Shelf score distributions and Step 2 CK medians that often did not move enough to justify those jumps.
- Policy changes that lowered thresholds or increased the impact of inherently inflated clinical evaluations.
That is textbook grade inflation.
The consequence is not just nicer transcripts. It is a distorted signal environment where:
- The best students in inflationary systems become harder to distinguish.
- Residency programs increase reliance on Step 2 CK and other standardized or structured measures.
- Students at “strict” schools may appear numerically weaker on grades but relatively stronger once PDs account for the distribution context.
You are living in the next phase of this transition. As Step 1 P/F solidifies and more match cycles accumulate, the data will get clearer: which grading policies protected their students’ signal, and which just painted everything gold.
With that in mind, your next moves are obvious: understand where your school sits on the inflation spectrum, anchor your application with hard numbers (especially Step 2), and treat clerkship “Honors” as a piece of the portfolio, not a magic ticket. The next evolution—structured national evaluations and even more data-driven selection—is coming, and you will be competing in that environment, not the one your attendings trained in.