
The Unspoken Step 1 Benchmarks Top Academic Programs Expect
It’s March of M2. You just got your first NBME back: 229. Your classmates are quietly comparing numbers in the hallway, but no one says anything concrete. Someone mutters “220s are fine, Step 1 is pass/fail now anyway.” Another whispers that at UCSF the residents “don’t really care about scores, they care about the whole person.”
You’re not stupid. You’re thinking: “What do the top academic programs actually expect? What number do they have in mind when they glance at my score, even if they never say it out loud?”
Let me tell you what they say when you’re not in the room.
I’ve sat in those meetings. I’ve heard the exact phrases:
“Below our usual cut.”
“Fine for a solid program, not for us.”
“Great applicant, but that Step is going to hurt them here.”
Nobody publishes these benchmarks. But they exist. They guide who gets interviews, who gets flagged as “risky,” and who gets shunted into the “probably rank low” pile.
We’re going to talk about those numbers.
First: The Dirty Secret About “Pass/Fail” and Legacy Expectations
You already know this, but let me strip away the marketing.
Yes, Step 1 is officially pass/fail now. But program directors did not collectively throw away 10+ years of mental calibration just because the three-digit score disappeared. They still think in three buckets:
- “This would have been a very high score.”
- “This would have been in the acceptable range.”
- “This would have been concerning.”
How do they know? Easy. They look at your practice performance if they can get their hands on it (home advisors, letters, internal notes); they look at Step 2; and they still have a picture in their head of what old Step 1 numbers meant for resident quality.
When attendings at top academic programs say, “Our residents used to average around 245 on Step 1,” that’s not just nostalgia. That’s their expected cognitive baseline. They still compare you to that internal standard using whatever surrogate they can.
And here’s the part no one says out loud:
At elite academic programs, the cultural memory of numerical Step 1 is still driving their idea of what a “strong” applicant looks like.
What “Top Academic” Actually Means in Step 1 Terms
You know the brand-name programs: Mass General, Brigham, UCSF, Stanford, Hopkins, Mayo, Penn, Columbia, Duke, NYU, WashU, Michigan, etc.
Inside those walls, when faculty talk about “our residents,” these are the Step 1 ranges they are thinking of from the pre-pass/fail era:
| Program Tier | Typical Old Step 1 Range | Internal Attitude |
|---|---|---|
| Elite academic IM (MGH, BWH, UCSF) | 245–255+ | “This is our norm” |
| Strong academic IM (Michigan, Mayo, Duke) | 240–250 | “Competitive and expected” |
| Solid university IM | 230–240 | “Perfectly fine, no issue” |
Nobody says this in the brochure. But behind closed doors:
- A 245 was seen as “on brand” for their residents.
- A 230 was “fine, but more like our categorical at a good state university.”
- Below 225 raised eyebrows for those places unless something else was truly exceptional.
Do they still think like this? Yes. They just map it to Step 2 now and quietly infer Step 1 potential from your entire academic story.
The Translation Game: What Practice Scores Signal Behind the Scenes
Here’s the unspoken conversion: top academic programs still mentally map your performance to the old Step 1 score bands. They don’t talk in “percent correct.” They talk in the ghosts of 230, 240, 250.
When an attending advisor emails a PD, they do not write:
“Student X is averaging 68% on UWorld.”
They write something much closer to:
“Student X consistently tested around the mid-240s equivalent before Step 1 went pass/fail, and their Step 2 score is similar.”
That’s the language PDs understand. They trained on numbers.
So, what does your prep data really say in their heads? Rough mapping from the era when these were being correlated all the time:
| NBME Predicted Score | How Top Academic PDs Interpreted It |
|---|---|
| ≥ 250 | “Very strong, no cognitive concern” |
| 240–249 | “Strong, competitive for us” |
| 230–239 | “Acceptable; will need other strengths” |
| 220–229 | “Borderline for us; risky unless stellar elsewhere” |
| < 220 | “Not our usual profile” |
Those mental categories never died. They just got reassigned to Step 2 and to any hints they can get about your test-taking ability.
The Quiet Benchmarks by Specialty (What They Actually Expect)
Let’s get concrete. You’re not applying to “medicine in general.” You’re aiming at something specific, and the expectations are not the same.
Internal Medicine – Top Academic (MGH, BWH, UCSF, Hopkins, Stanford, etc.)
I’ve heard versions of this many times in IM selection meetings:
- “We’re used to our residents being in the 245–255 range on Step 1.”
- “Below around 235 historically, they struggled more in our environment.”
So what’s the unspoken Step 1 benchmark, now projected onto your profile?
For top-tier academic IM, they expect that if scores still existed, you’d likely have landed in the 240+ range on Step 1. For Step 2, they want to see roughly the same thing: 245+ is the mental comfort zone.
So when you’re prepping for Step 1 (or your school’s CBSE/COMSAE equivalents), this is the unsaid bar if you’re dreaming of high-caliber IM:
- Practice scores consistently in the equivalent of mid-230s climbing into 240s by the end of dedicated.
- No pattern of barely clearing “pass” territory on NBMEs.
They won’t say “we want a 245.” But that’s the standard that built their current resident cohort.
Dermatology, Plastic Surgery, Ortho, ENT, Neurosurgery
These are blood sport. And they never pretended otherwise.
When Step 1 was scored, I sat in one plastics meeting where someone literally said:
“We have tons of 260s in the pile. Why are we spending time on this 241 again?”
Brutal, but honest.
For these specialties at top academic centers, the historical mental cutoffs looked like this:
- Comfortable range: 250+
- Concern starts: < 240
- Below 235: “Needs something truly exceptional to justify”
So what’s the current expectation, even with pass/fail?
They assume that a realistic, competitive applicant to these programs would have had the horsepower to score 245–255+ on Step 1. They test that assumption with:
- Step 2 (they want 250+; 260+ makes everyone relax)
- Your shelf exam performance
- Your school rank / honors in preclinical and clinical years
If your practice exams for Step 1 never cracked what used to be 235–240, the harsh truth is this: at the most elite derm/plastics/ortho/ENT programs, faculty are going to quietly question whether you’d keep up with their historical benchmark.
General Surgery – Top Academic
Surgery is funny. They talk a big “we’re tough” game, but I’ve seen plenty of top academic gen surg programs take residents who had Step 1s in the mid-230s—if they were gritty, hard-working, and clearly surgical.
Still, for the elite surgery departments (think UCSF, MGH, Hopkins, Michigan), the unstated expectation is:
- Old Step 1 “sweet spot”: 240–250
- Below 235: “We might still take them, but they need something big – research, strong home support, killer letters.”
Again, they now lean on Step 2 and performance, but the internal standard hasn’t gone away.
Pediatrics, Family, Psychiatry – Even Here, Top Academics Think in Tiers
On the softer-competitive side, PDs will tell you they “don’t care about numbers.” Then they’ll say, “Our residents usually had Step 1s in the 230s” five minutes later.
Translation for top academic peds/psych/FM programs:
- Expected old Step 1 equivalent: 230–240 for the strongest programs.
- They’re not chasing 260s, but persistent barely-passing-level performance will make them nervous about fellowship aspirations and board pass rates.
How Your Step 1 Prep Trajectory Signals Where You Belong
Most students only look at their NBMEs as “am I going to pass?” Program directors think: “What was their trajectory and where were they trending by the end?”
Here’s the back-channel logic:
- A student who started with NBMEs in the 210s and climbed to 238 by the last one? That’s someone with work ethic, plasticity, and a final level that roughly maps to “solid academic program, possibly higher if other boxes are checked.”
- A student stuck in the low 220s on multiple NBMEs, squeaking by at “pass”? That’s someone they assume would have had a ~220–225 Step 1. For top-tier academic places, that’s below their historical comfort zone.
Let’s make this visual.
| Practice exam | Applicant A | Applicant B | Applicant C |
|---|---|---|---|
| NBME 1 | 218 | 225 | 205 |
| NBME 2 | 228 | 224 | 212 |
| NBME 3 | 235 | 226 | 220 |
| NBME 4 | 243 | 227 | 229 |
Applicant A is the one top programs get excited about. Not because 243 is magical, but because the final level and upward curve match their mental template: can handle a high-complexity environment.
Applicant B looks stagnant. Applicant C shows growth but ends below the “elite” band; that’s a better fit for strong but not hyper-elite academic programs.
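If you want the arithmetic behind that read, here is a minimal Python sketch. The `summarize` helper is a hypothetical illustration, not something any committee actually runs. It just pulls out the two things the paragraph above leans on: net gain from first to last exam, and final level. The scores are copied straight from the table.

```python
# Practice scores copied from the table above (NBME 1 through NBME 4).
scores = {
    "Applicant A": [218, 228, 235, 243],
    "Applicant B": [225, 224, 226, 227],
    "Applicant C": [205, 212, 220, 229],
}


def summarize(series):
    """Return (net gain from first to last exam, final score)."""
    return series[-1] - series[0], series[-1]


# Hypothetical summary structure: name -> (gain, final score).
summary = {name: summarize(series) for name, series in scores.items()}

for name, (gain, final) in summary.items():
    print(f"{name}: gain {gain:+d}, final {final}")
```

A and C gain almost the same number of points; what separates them is where they finish. B barely moves at all.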
The Conversations You Never Hear in Selection Committee
Let me walk you through the kind of conversation that happens in real committee meetings. I’m changing names, but not the tone.
Case 1: High-trajectory, mid-240s equivalent
“This one had a predicted Step 1 in the mid-240s on their last NBME, Step 2 is a 253, AOA, strong medicine letter. They’ll be fine here.”
Subtext: They fit right into our historical average. No anxiety.
Case 2: Low-230s equivalent, strong research
“Practice scores suggest they would have been in the low-230s on Step 1. Step 2 is 237. But they have a first-author in JAMA Internal Medicine and did a year of research with one of our faculty.”
Response: “We can work with that. The research and mentorship at our institution are strong enough to justify a slightly lower test profile.”
Case 3: Borderline practice performance
“Multiple attempts at NBMEs hovering around passing, Step 2 just above 230, no significant upward trend.”
Response at a top-tier program: “We’ll probably pass. This is more in line with a solid university program; we need to protect our boards pass rates.”
No one says: “Our cutoff is 240.” They say: “We need people who can pass our boards on the first try and handle complex patients at 2am without falling apart.”
The convenient shorthand they’ve always used for that? Step scores—or whatever best approximates them now.
How to Aim Your Step 1 Prep if You Want a Top Academic Spot
Let’s be blunt. You’re not going to “manifest” your way into UCSF or MGH if your entire testing history screams “barely pass.”
You have to reverse engineer what they expect and train accordingly.
If you want to be taken seriously by top academic programs, here’s the target mindset for Step 1 even now:
- Your dedicated performance should realistically project to an old Step 1 score of ≥235 as a minimum, and ideally 240+ if you have academic-center ambitions in IM, neuro, EM, etc.
- For derm/plastics/ENT/ortho/neurosurg at truly top places, your testing history needs to plausibly support a 245–255+ equivalent. Otherwise, you’re swimming against a very strong current.
How do they infer your “would-have-been” Step 1?
- They look at Step 2 directly.
- They ask your school’s advisor what your practice performance looked like if they trust them.
- They read subtle code in MSPE and letters: “strong test-taker,” “excelled in basic science courses,” “consistently at the top of their class on standardized exams.”
Your job during Step 1 prep is to generate evidence that you belong in the higher cognitive band:
- UWorld percentages trending into the mid-70s by the end (for most schools, that correlates with 240+ behavior).
- NBMEs showing upward movement from low 220s to high 230s/240s by your last exam.
- Not just one lucky test, but a pattern.
| UWorld percent correct | Rough old Step 1 equivalent |
|---|---|
| <55% | 205 |
| 55–64% | 220 |
| 65–74% | 238 |
| 75%+ | 250 |
No PD is going to see this bar chart. But this is very close to how seasoned advisors internally translated UWorld performance to Step 1 potential.
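For anyone who wants to plug in their own number, the same rough translation can be written as a lookup. This is only a sketch: the bands and values are the ones listed above, the function name `rough_step1_equivalent` is made up for illustration, and none of it is a validated conversion.

```python
# (band floor in percent correct, rough old Step 1 equivalent), highest band first.
# Values are copied from the chart data above and are illustrative only.
UWORLD_BANDS = [
    (75, 250),  # 75%+
    (65, 238),  # 65-74%
    (55, 220),  # 55-64%
    (0, 205),   # below 55%
]


def rough_step1_equivalent(percent_correct):
    """Map a UWorld percent-correct figure to the rough band value above."""
    for floor, equivalent in UWORLD_BANDS:
        if percent_correct >= floor:
            return equivalent
    raise ValueError("percent_correct must be non-negative")


print(rough_step1_equivalent(68))
```

Treat the result as a band, not a prediction. A student at 64% and a student at 66% land in different rows here, but no advisor would read them as meaningfully different.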
What If You’re Not Hitting Those Benchmarks?
Here’s the part no one tells you early enough.
If you’re plateaued in what would have been the low 220s on Step 1 practice exams:
- You are not doomed.
- But you do need to recalibrate your target program list.
Top academic programs are not the only places where you get good training, or where you can match into competitive fellowships. Plenty of mid-tier academic and strong community programs take residents with old Step 1 equivalents in the 220s who go on to match into cards, GI, heme/onc, or critical care.
I’ve seen it over and over:
Resident with a 223 Step 1 equivalent and a 233 Step 2 at a solid university program → stacked research, good mentorship → matched GI at a big-name institution.
What they did not do was pretend they were a 250-equivalent applicant when building their initial list.
If you’re early enough in M2, and your practice scores are ugly, your job is twofold:
- Maximize the upward trajectory now – squeeze every point of growth out of Step 1 prep, because that pattern will echo in Step 2 prep as well.
- Be realistic about tier – if by the end of dedicated, your best NBME equivalent is 225, stop fantasizing about programs that historically averaged 250+ Step 1s and start building a plan where you will actually thrive.
Top academic programs are not impressed by people who ignore reality. They’re impressed by people who know themselves, build from where they are, and then crush Step 2 and clinical work.
A Quick Reality Check on Boards Pass Rates (Why PDs Care So Much)
There’s one more piece you need to understand: programs are terrified of their board pass statistics.
Every PD sees a version of this graph every year: resident in-training exam scores vs. eventual board pass/fail. The strongest predictors, historically? Step 1 and Step 2.
So when they’re skittish about low-equivalent scores, what they’re really thinking is:
“If we stack our program with people who tested like they’d be 220s, our board pass rate is going to drop, and the Dean will be in my office.”
That’s the institutional anxiety sitting behind the “unspoken benchmarks.” It’s not snobbery for its own sake. It’s self-preservation.
| Old Step 1 score band | Board pass rate (%, illustrative) |
|---|---|
| <220 | 80 |
| 220–229 | 88 |
| 230–239 | 93 |
| 240–249 | 96 |
| 250+ | 98 |
The actual numbers vary, but the pattern is consistent. Higher old Step 1 bands correlated with safer board outcomes. Program directors know this. They act accordingly, even now.
FAQ: The Questions You’re Afraid to Ask Out Loud
1. If Step 1 is pass/fail, should I still push like I’m aiming for a 250?
If you’re aiming at top academic or competitive specialties, yes. Not because anyone will see that number, but because the level of mastery required to be able to score 245–255 will show up again on Step 2, shelf exams, and clinical reasoning. Those are visible. They’re the new currency. Training yourself to that standard now pays off later when it’s actually measured.
2. My practice Step 1 scores topped out around a 230 equivalent. Am I shut out of elite academic programs forever?
You’re not automatically barred, but you’re swimming upstream. To seriously contend, you’ll need a genuinely strong Step 2 (think 245+), excellent clinical grades, and a very clear “hook”: major research, unique background, or powerful letters from known faculty. Most applicants in your band will be better served targeting strong mid-tier academic or university-affiliated programs where your profile is a better fit.
3. How much does a high Step 2 really compensate for a weaker Step 1-equivalent level?
A lot more than people admit publicly. PDs love a redemption arc where Step 2 jumps significantly above what your Step 1 trajectory suggested. If your practice Step 1 performance hinted at low 220s, but you produce a 250 Step 2, they’ll quietly recalibrate their impression of your cognitive ceiling. For current cohorts, Step 2 has become the de facto “Step 1 score” in PD brains.
4. My school doesn’t share practice data. Will programs still infer my Step 1 level?
Yes. Through Step 2, your basic science transcript, preclinical honors, shelf exam performance, and any coded language in your MSPE and letters. Advisors talk. PDs read between the lines. You can’t hide from the underlying question: “Could this person have hung with our historical 240–250 residents?” Your job in Step 1 prep is to make that answer as close to “yes” as you realistically can.
Key takeaways:
- Top academic programs still think in the old Step 1 score bands, even if they never say the numbers out loud.
- If you want to be taken seriously by those places, your performance needs to plausibly fit the 235–250+ mental range, now demonstrated mainly through Step 2 and your overall testing history.
- If your trajectory doesn’t match those unspoken benchmarks, adjust your strategy—not your worth—and build a path where you can actually excel instead of chasing a fantasy tier that was never calibrated for you.