
The obsession with “NBME vs UWorld percentages” has derailed more Step 1 prep than any single resource choice. The data show a consistent problem: most students are reading their numbers wrong.
You do not fail Step 1 because your UWorld percentage is 58 vs 62. You fail because you misunderstand what those numbers actually represent, how they correlate (or do not) with NBME performance, and how to act on them.
Let me break this down the way it should be done: as a numbers problem, not a vibes problem.
1. UWorld Percentages: What the Data Actually Measures
People talk about UWorld “scores” as if they were exam scores. They are not.
UWorld percentages are performance metrics on a learning tool, with all kinds of bias baked in:
- Content is revisited
- Explanations are visible
- Students change their behavior over time
- Question difficulty is not consistent set to set
UWorld shows you a percentage correct per block and cumulative percentage. That is already two different signals, and both are easy to misinterpret.
The three main distortions in UWorld data
Selection bias
Stronger students tend to:
- Start earlier
- Use tutor mode less
- Do more mixed, timed blocks
Those behaviors inflate the apparent predictive value of high UWorld scores. You see lots of “I averaged 70% and got 250+” posts. You do not see the 55% → 230 crowd as loudly, even though they exist in large numbers.
Timing bias
Early UWorld percentages almost always look worse.
Later percentages are artificially better because:
- You have seen more question styles
- You have closed more knowledge gaps
- You are subconsciously memorizing patterns
Comparing “early 52%” with “late 64%” without context is meaningless.
Mode and behavior bias
Tutor vs timed, random vs subject-specific, phone vs desktop. Those change the cognitive load and effective difficulty. I have watched students get 75–80% in tutor, subject-specific, then drop to 55–60% in timed, mixed blocks. Same brain. Different test conditions.
So when someone asks, “Is 62% on UWorld enough for Step 1?”, the only honest analytical answer is: it depends on how you got that 62%.
2. NBME vs UWorld: Different Tests, Different Purposes
NBME forms and UWorld questions are not measuring the same thing in the same way.
- NBME: Sampling of the actual exam blueprint, written by the same organization that designs Step 1. Fixed length, fixed difficulty distribution.
- UWorld: Commercial question bank designed to teach and reinforce concepts, with explanations as the real product.
If you treat these two as interchangeable “practice tests,” your predictions will be wrong.
How their numbers fundamentally differ
- NBME score: a scaled score meant to approximate your current Step 1 performance under exam-like conditions.
- UWorld percentage: an average proportion of items answered correctly in a non-standardized environment.
NBMEs target test performance.
UWorld targets learning and exposure.
NBME is closest to what psychometricians call “criterion-referenced” for Step 1 readiness.
UWorld is “performance on a practice item bank,” heavily context-dependent.
3. Rough Correlations: What Percentages Tend to Mean
Let me be explicit: the numbers below are approximate and pattern-based, not guaranteed predictions. But they are far better than hand-waving “it depends” answers.
Assumptions here:
- You are 2–6 weeks from the exam.
- UWorld is being done in timed, random, mixed blocks.
- You are not memorizing questions (fresh-ish blocks, not 3x repeats).
- You have taken at least 2 recent NBME forms.
UWorld percentages and typical NBME ranges
Under those conditions, I repeatedly see this pattern among students:
| UWorld Timed-Mixed Avg | Rough NBME Range | Risk Category |
|---|---|---|
| < 50% | Below pass–205 | High risk |
| 50–55% | 200–215 | Borderline/low |
| 55–60% | 210–225 | Solidly passing |
| 60–65% | 220–235 | Comfortable pass |
| 65–70% | 230–245 | Strong performance |
| > 70% | 240+ | Excellent performance |
You will always find exceptions. The student doing 60% early in dedicated then sharpening to 245+. The 70% memorizer who bombs an NBME at 220. But as aggregates, these bins are surprisingly stable across schools.
To visualize the “non-linear” relationship, think of it this way:
| UWorld Timed-Mixed Avg | Approx. NBME Score |
|---|---|
| 45% | 195 |
| 50% | 205 |
| 55% | 215 |
| 60% | 228 |
| 65% | 238 |
| 70% | 248 |
Notice two things:
- The curve is not perfectly linear. In these example points, the 55–60% step is worth about 13 points, while every other 5-point step is worth about 10.
- The variance around each point is wide. A 60% UWorld average might correspond to 220–240, depending on other factors.
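One way to make the anchor points above usable is to interpolate between them. The sketch below is a minimal Python illustration, not a validated predictor: the function name `rough_nbme_estimate` is hypothetical, the anchor values are copied from the table, and clamping outside the 45–70% range is an assumption made here to avoid extrapolating.

```python
from bisect import bisect_right

# Anchor points copied from the table above:
# (UWorld timed-mixed average in %, rough NBME score).
ANCHORS = [(45, 195), (50, 205), (55, 215), (60, 228), (65, 238), (70, 248)]


def rough_nbme_estimate(uworld_pct: float) -> float:
    """Linearly interpolate between the anchor points.

    Assumption: inputs outside 45-70% are clamped to the nearest anchor
    instead of being extrapolated.
    """
    xs = [x for x, _ in ANCHORS]
    if uworld_pct <= xs[0]:
        return float(ANCHORS[0][1])
    if uworld_pct >= xs[-1]:
        return float(ANCHORS[-1][1])
    i = bisect_right(xs, uworld_pct)  # first anchor strictly above the input
    (x0, y0), (x1, y1) = ANCHORS[i - 1], ANCHORS[i]
    return y0 + (y1 - y0) * (uworld_pct - x0) / (x1 - x0)


print(rough_nbme_estimate(57.5))  # 221.5
```

Treat the output as the center of a wide band, in line with the 220–240 spread quoted for a 60% average, not as a point prediction.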
4. Why NBME Percentages “Count More” Than UWorld
If you want the short version: trust NBME, calibrate with UWorld.
NBMEs are closer to the operational test:
- Item style and length similar
- Vignette structure more realistic
- Scoring scales anchored to historical performance
UWorld, by contrast, often:
- Uses longer stems
- Over-represents high-yield / tricky issues
- Includes more teaching in the question itself
How NBME performance predicts Step 1
Before Step 1 became pass/fail, many schools (and plenty of students) tracked NBME vs real Step 1 pretty obsessively. The generalized relationship looked like:
- Your best NBME score within 1–2 weeks of the exam was often within 5–10 points of your Step 1.
- Your average of the last 2–3 NBMEs was an even more stable predictor.
Step 1 is now pass/fail, but the psychometrics did not magically change. The pass line still maps to roughly the same ability level.
Typical modern pattern I see:
- NBME scaled score ≥ 215–220 → very low risk of failing.
- NBME at 205–215 → possible pass, but margin is not comfortable.
- NBME < 200 a week from the test → high-risk territory.
NBME is your altitude.
UWorld is your climbing speed and muscle memory.
Both matter, but if they disagree, believe NBME more.
5. Converting NBME Percent Correct to “Real” Performance
NBME now gives you percent correct and a “predicted” range. Students often get confused when they see 62% correct and assume that means “barely passing.”
That assumption is usually false.
NBME forms are built to be harder than a typical med school exam and their percent-correct → scaled score relationship is not intuitive.
A very rough, averaged mapping that I see repeatedly:
| NBME Percent Correct | Approx. Step 1 Score Band | Interpretation |
|---|---|---|
| 50–55% | 195–205 | Borderline |
| 55–60% | 205–215 | Pass range |
| 60–65% | 215–230 | Comfortable pass |
| 65–70% | 230–240+ | Strong performance |
| >70% | 240+ | Very strong performance |
This is not an official concordance table. It is a range based on students’ self-reported NBME forms vs outcomes. But it demolishes a common myth: “65% is a C and therefore barely passing.” Completely wrong. On many NBME forms, 65% can sit deep into the “strong pass” band.
The key is not the raw percentage. It is the NBME-provided scaled score and predicted range. That is the anchor for your Step 1 readiness.
6. How to Read Your Own Data: A Practical Framework
Let’s stop hand waving and build an actual decision rule. You care about two main dimensions:
- Current level (altitude) – NBME-based.
- Trajectory (slope) – UWorld-based.
Here is the rational way to combine them.
Step 1: Look at your last 2–3 NBME scores
Calculate:
- Most recent NBME scaled score
- Average of last 2 NBME scores
- Trend direction (up, flat, or down)
If:
- Latest ≥ 215 and trend is stable or rising → ready or near ready for Step 1 from a pass/fail standpoint.
- 205–215 with upward trend → probably OK with additional study weeks.
- < 200 or flat/downward trend → not ready, regardless of UWorld percentages.
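The three rules above can be written down as a small function. This is a sketch under stated assumptions, not an official scoring rule: the name `nbme_readiness` is hypothetical, the ±3-point tolerance for calling a trend "flat" is invented here because the text gives no threshold, and the 200–205 rising case, which the rules do not cover, is reported as unclear.

```python
from statistics import mean


def nbme_readiness(scores: list[float]) -> dict:
    """Apply the Step 1 rules to NBME scaled scores, listed oldest first."""
    if len(scores) < 2:
        raise ValueError("need at least two NBME scores to define a trend")
    recent = scores[-3:]                # last 2-3 forms
    latest = recent[-1]
    delta = latest - recent[0]
    # Assumption: a change within +/-3 points counts as "flat".
    if delta > 3:
        trend = "up"
    elif delta < -3:
        trend = "down"
    else:
        trend = "flat"

    if latest >= 215 and trend in ("up", "flat"):
        verdict = "ready or near ready"
    elif 205 <= latest < 215 and trend == "up":
        verdict = "probably OK with additional study weeks"
    elif latest < 200 or trend in ("flat", "down"):
        verdict = "not ready"
    else:
        # 200-205 with a rising trend is not covered by the rules above.
        verdict = "unclear: take another NBME before deciding"

    return {
        "latest": latest,
        "avg_last_two": mean(recent[-2:]),
        "trend": trend,
        "verdict": verdict,
    }


print(nbme_readiness([192, 204, 212])["verdict"])
# probably OK with additional study weeks
```

Note that a score of 215 or higher with a falling trend lands in "not ready" here, which is the literal reading of the third rule; you may prefer to treat that case as a prompt to retest.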
Step 2: Examine UWorld in the right context
Use only:
- Timed
- Random
- Mixed blocks
- Within the last ~30–40% of the QBank
Compute:
- Average of last 500–800 questions
- Breakdown by system (e.g., cardio, neuro, renal)
- Performance compared with peers if available
Interpretation pattern I see work well:
- Last 800 questions < 55% and NBME < 210 → risk is high. You are not consistently applying knowledge.
- Last 800 questions 55–60% with NBME 210–220 → reasonable, but you still have room to tighten.
- Last 800 questions 60–65% with NBME 220–230 → aligned, comfortable pass trajectory.
- > 65% and NBME > 230 → you are above the pass threshold by a wide margin.
The direction matters more than the raw UWorld number:
- 52 → 58 → 61% over three 500-question windows is a good sign.
- 62 → 60 → 57% with stable NBME is stagnation or fatigue.
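The "last N questions" average and the window-to-window direction are easy to compute by hand, but a short script keeps you honest about which blocks count. The sketch below assumes you record each block yourself; `Block`, `last_n_average`, and `window_trend` are hypothetical names, not part of any UWorld export.

```python
from dataclasses import dataclass


@dataclass
class Block:
    correct: int
    total: int
    timed: bool = True
    random_mixed: bool = True


def last_n_average(blocks: list[Block], n: int = 800) -> float:
    """Percent correct over roughly the most recent n exam-like questions.

    `blocks` is ordered oldest first. Whole blocks are counted, so the
    window can slightly exceed n. Tutor-mode or subject-specific blocks
    are skipped.
    """
    correct = total = 0
    for b in reversed(blocks):
        if not (b.timed and b.random_mixed):
            continue
        correct += b.correct
        total += b.total
        if total >= n:
            break
    if total == 0:
        raise ValueError("no timed, random, mixed blocks recorded")
    return 100 * correct / total


def window_trend(window_pcts: list[float]) -> str:
    """Label a sequence of window averages, oldest first."""
    if len(window_pcts) < 2:
        return "insufficient data"
    steps = [b - a for a, b in zip(window_pcts, window_pcts[1:])]
    if all(s > 0 for s in steps):
        return "improving"
    if all(s < 0 for s in steps):
        return "declining"
    return "mixed"


print(window_trend([52, 58, 61]))  # improving
print(window_trend([62, 60, 57]))  # declining
```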
7. Common Misinterpretations That Wreck People
I have watched all of these play out, repeatedly.
1. “My UWorld is 70%, so I am guaranteed high performance.”
No. Ask:
- Was it all untimed tutor mode while watching Pathoma on a second screen?
- Were you doing subject-specific blocks after just reviewing that topic?
- Have you repeated the bank?
A 70% built on ideal, non-exam conditions is not the same as 70% timed, random, mixed, first pass.
2. “My NBME is 205, but my UWorld average is 65%; I should trust UWorld.”
No.
NBME more closely reflects exam conditions: fixed length, fatigue, mixed content, actual Step-style questions.
If NBME and UWorld disagree, especially when NBME is lower, assume:
- You are not yet deploying your knowledge efficiently under realistic pressure.
- Your UWorld behavior (mode, selection) was artificial.
You fix this by more NBME-style practice and targeted drilling, not by chasing a higher QBank percentage.
3. Overreacting to a single bad block or a single NBME
One 48% UWorld block tells you very little. Same for one NBME that drops 8 points compared with your prior form.
You look for:
- Moving averages, not single points.
- At least 2–3 NBMEs to define a trend.
Emotional reactivity to noise is the enemy of rational preparation.
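A rough way to see why a single block is mostly noise is to compute its sampling error. The sketch below uses a simple binomial model, which is an assumption introduced here: every question is treated as an independent trial with the same chance of being answered correctly. The 40-question block size and the 60% underlying level are illustrative inputs, and `block_standard_error` is a hypothetical helper name.

```python
import math


def block_standard_error(true_pct: float, n_questions: int) -> float:
    """Standard error, in percentage points, of an observed percent-correct.

    Simplifying assumption: each question is an independent trial with the
    same probability of a correct answer (a binomial model).
    """
    p = true_pct / 100
    return 100 * math.sqrt(p * (1 - p) / n_questions)


single_block = block_standard_error(60, 40)   # about 7.7 points
rolling_800 = block_standard_error(60, 800)   # about 1.7 points
print(round(single_block, 1), round(rolling_800, 1))  # 7.7 1.7
```

Under this model, a 48% block is only about 1.5 standard errors below a true 60% level, so on its own it is weak evidence of a real decline. An 800-question window is far tighter, which is why the moving average is the number worth reacting to.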
8. Example Scenarios: How the Numbers Play Out
Let’s walk through a few realistic composites.
Scenario A: “Solid but anxious”
- Last 3 UWorld 500-question windows: 58% → 62% → 64%
- Last 2 NBMEs: 218 → 224
- Exam in 3 weeks
Data says:
- Both altitude and slope are good.
- Risk of failing is extremely low if you maintain effort and do not implode on test day.
Rational moves:
- Keep timed, mixed blocks.
- Target systems where UWorld shows <55%.
- Add 1–2 more NBME forms to maintain calibration.
Scenario B: “Good UWorld, mediocre NBME”
- UWorld average: 66%, mostly tutor, system-based
- One NBME: 205
- Exam in 2 weeks
Data says:
- UWorld conditions were not exam-like. Inflated indicator.
- NBME shows borderline performance.
Rational moves:
- Delay exam if possible.
- Switch to timed, random, mixed exclusively.
- Take another NBME in 7–10 days after a hard push.
- Your true ability is likely closer to NBME than UWorld brag numbers.
Scenario C: “Low UWorld, improving NBME”
- UWorld average: 52% (time pressure high, mixed)
- NBME sequence: 192 → 204 → 212 over 4 weeks
- Exam in 2–3 weeks
Data says:
- NBME slope is excellent.
- UWorld average is held down by early performance and difficulty.
Rational moves:
- Weight NBME trend more heavily.
- Recompute last 800 UWorld questions only; if that subset is closer to 55–58%, trajectory is likely acceptable.
- Another NBME a week before the exam will tell you if you have crossed into safe territory.
9. Visualizing Your Own Progress
If you want to think like a data analyst, treat your prep as a time series. Not a stack of disconnected numbers.
At a minimum, log weekly:
- NBME scaled scores
- UWorld last-400 or last-800 question average
- Cumulative questions completed
Then look at trajectory:
| Week | Cumulative Questions Completed |
|---|---|
| Week 1 | 500 |
| Week 2 | 900 |
| Week 3 | 1300 |
| Week 4 | 1800 |
| Week 5 | 2200 |
| Week 6 | 2600 |
Overlay that with NBME scores on your own spreadsheet. You want to see:
- NBME line trending up or at least flattening in the safe zone.
- UWorld rolling average stabilizing or improving after an initial dip.
If both are flat or declining while question volume rises, that is not “grinding.” That is banging your head against a wall and calling it effort.
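If you keep the weekly log in a script instead of a spreadsheet, the "volume up, scores flat" check can be automated. This is a minimal sketch: `WeekLog` and `is_stalled` are hypothetical names, the minimum-gain thresholds are assumptions chosen for illustration, and the percentages and NBME scores in the example are invented. It also ignores whether a flat NBME is already in the safe zone, so read a `True` as a prompt to look closer, not a verdict.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class WeekLog:
    week: int
    cumulative_questions: int
    uworld_rolling_pct: float            # last-400 or last-800 average
    nbme_score: Optional[float] = None   # None in weeks without an NBME


def is_stalled(logs: list[WeekLog],
               min_uworld_gain: float = 1.0,
               min_nbme_gain: float = 3.0) -> bool:
    """True if question volume rose while both score lines stayed flat or fell.

    The gain thresholds are assumptions, not values from the article.
    """
    if len(logs) < 2:
        return False
    first, last = logs[0], logs[-1]
    volume_rose = last.cumulative_questions > first.cumulative_questions
    uworld_flat = (last.uworld_rolling_pct - first.uworld_rolling_pct) < min_uworld_gain
    nbmes = [w.nbme_score for w in logs if w.nbme_score is not None]
    # Without at least two NBME scores there is no NBME trend to judge.
    nbme_flat = len(nbmes) >= 2 and (nbmes[-1] - nbmes[0]) < min_nbme_gain
    return volume_rose and uworld_flat and nbme_flat


# Question counts follow the weekly table above; the scores are invented.
logs = [
    WeekLog(1, 500, 58.0, 208),
    WeekLog(2, 900, 58.5),
    WeekLog(3, 1300, 57.0, 207),
]
print(is_stalled(logs))  # True
```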
10. What Actually Matters: A Short Checklist
Strip away the noise. The numbers that mean something are:
Your recent NBME scores
- Last 2–3 forms
- Trend direction
- Position relative to ~215–220
Your recent UWorld performance
- Last 500–800 questions only
- Timed, random, mixed
- System-specific weak spots (<55% bands)
Consistency of your test-like conditions
- Are you practicing like the real exam, or gaming your own statistics?
Everything else is psychological decoration.
Key Takeaways
- UWorld percentages are learning metrics, not exam scores. NBME results, especially recent ones, are far more predictive of Step 1 performance.
- You need to analyze trends (NBME trajectory, recent UWorld averages) rather than obsess over any single number or anecdotal “I had 68% and passed” story.
- When NBME and UWorld disagree, default to NBME as your readiness gauge and use UWorld to surgically identify and fix weak content areas.