
The myth that “Step 2 CK just needs to be a pass” is not only outdated; it is statistically dangerous for anyone who cares about the Match.
The data show a clear pattern: matched and unmatched applicants diverge sharply around specific Step 2 CK score thresholds, and that divergence widens in the most competitive specialties and for IMGs. I have sat in rank list meetings where a single 8–10 point difference on Step 2 CK separated “interview” from “auto-screened out.” Programs will not tell you this explicitly. The numbers will.
Let me walk you through where that divergence happens, specialty by specialty, and how far above “average” you actually need to be to look like people who match, not the ones who explain their reapplication on Reddit a year later.
1. The Big Picture: How Step 2 CK Predicts Matching
If you strip the process down to cold numbers, Step 2 CK behaves like a continuous predictor of match probability. Higher scores push your odds up; lower scores drag them down. And the slope is not gentle: there are cliffs.
Using recent NRMP Charting Outcomes data and USMLE performance summaries (post–Step 1 pass/fail shift), three patterns stand out:
- Match probability rises steeply between roughly 235 and 245 for U.S. MD seniors applying broadly.
- The score gap between matched and unmatched applicants grows as specialties become more competitive.
- Programs are increasingly using Step 2 CK as the primary standardized metric, especially now that Step 1 is pass/fail.
To make that visible, look at a simplified model of overall match probability by Step 2 CK band for U.S. MD seniors (all specialties combined, approximated from NRMP trends):
| Step 2 CK band | Approx. match probability (%) |
|---|---|
| <220 | 55 |
| 220-229 | 70 |
| 230-239 | 82 |
| 240-249 | 90 |
| 250-259 | 94 |
| 260+ | 96 |
Below ~220, you are flipping a coin. Between 230 and 245, the slope is steep—every 5–10 points yields a meaningful jump in probability. Beyond 250, returns are smaller, but that is where the truly competitive applicants cluster for the hot specialties.
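To see how quickly the returns shrink, here is a minimal sketch that computes the gain in match probability from one band to the next. It uses only the approximate figures from the table above; the names `BANDS` and `band_gains` are mine, chosen for illustration.

```python
# Approximate match probabilities (%) by Step 2 CK band, copied from the table above.
BANDS = [
    ("<220", 55),
    ("220-229", 70),
    ("230-239", 82),
    ("240-249", 90),
    ("250-259", 94),
    ("260+", 96),
]


def band_gains(bands):
    """Return (lower_band, higher_band, gain) for each adjacent pair of bands."""
    return [(lo[0], hi[0], hi[1] - lo[1]) for lo, hi in zip(bands, bands[1:])]


for lower, higher, gain in band_gains(BANDS):
    print(f"{lower:>8} -> {higher:<8} +{gain} percentage points")
```

In this simplified model, the per-band gain falls from 15 percentage points at the bottom of the scale to 2 at the top, which is the flattening described above.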
So the first non-negotiable point: “passing” is not a benchmark. The divergence between matched and unmatched begins roughly 20 to 30 points above the passing standard.
2. Benchmarks by Competitiveness: Where the Curves Split
Where things get interesting is when you break the data by specialty. The gap between matched and unmatched applicants is not uniform. It widens dramatically in competitive fields.
Here is a cleaned, rounded view based on recent NRMP-style score distributions for U.S. MD seniors:
| Specialty Tier | Typical Matched Range | Typical Unmatched Range | Approx. Gap (Matched − Unmatched) |
|---|---|---|---|
| Ultra-competitive (Derm, Ortho, Plastics, Neurosurg, ENT) | 250–260+ | 238–248 | ~10–12 points |
| Highly competitive (Radiation Onc, Urology, EM, Anesth at top places) | 245–255 | 235–245 | ~8–10 points |
| Mid-competitive (IM, Gen Surg, OB/GYN, Pediatrics) | 238–248 | 228–240 | ~7–8 points |
| Less competitive (FM, Psych, Path, Neuro at many programs) | 230–240 | 220–233 | ~5–7 points |
The takeaway is simple: matched and unmatched applicants do not share the same Step 2 CK “middle.” Their typical scores sit roughly 5–12 points apart depending on specialty tier. That is not noise. That is structural.
You see this play out in program filters. In one large IM program I know, the pre-screen spreadsheet literally had color coding:
- Red: Step 2 CK < 225 (rarely reviewed unless extenuating factors)
- Yellow: 225–234 (context-dependent)
- Green: 235–249 (routine review)
- Dark green: ≥250 (flagged as “strong”)
If you were sitting at 229, you were not even in the same statistical conversation as the 245s, despite both scores clearing the passing standard. This is what “divergence” actually looks like in workflow.
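The color coding above is easy to express as a lookup. This is a minimal sketch of that one program's bands as I described them, not a universal rule; the function name `screen_color` is hypothetical.

```python
def screen_color(step2_ck: int) -> str:
    """Map a Step 2 CK score to the pre-screen color bands described above."""
    if step2_ck < 225:
        return "red"         # rarely reviewed unless extenuating factors
    if step2_ck < 235:
        return "yellow"      # context-dependent
    if step2_ck < 250:
        return "green"       # routine review
    return "dark green"      # flagged as "strong"


for score in (229, 245):
    print(score, screen_color(score))
```

A 229 and a 245 land in different colors, which is the whole point: the reviewer sees the band before the file.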
3. Specialty-Specific Divergence: Concrete Score Lines
Let us get more granular. I will group specialties into three types and show where matched vs unmatched bands really separate.
A. Ultra-Competitive Specialties
Dermatology, Plastic Surgery, Neurosurgery, Orthopedic Surgery, ENT.
These specialties hoover up the top of the Step 2 CK distribution. Think 250 as the functional floor at many academic programs, not the ceiling.
Approximate U.S. MD senior Step 2 CK data (rounded):
| Specialty | Median Matched | Median Unmatched | Rough “Safer Zone” | Major Risk Zone |
|---|---|---|---|---|
| Dermatology | 257–260 | 244–247 | ≥255 | <245 |
| Ortho | 253–256 | 242–245 | ≥250 | <240 |
| Plastics | 255–258 | 244–247 | ≥255 | <245 |
| Neurosurgery | 254–257 | 242–246 | ≥250–255 | <242 |
| ENT | 252–255 | 240–244 | ≥248–250 | <240 |
Look at those gaps: 10–13 points between matched and unmatched medians. That is not bad luck; that is stratification.
And this is where applicants self-delude. I have heard the exact line: “I got a 242 Step 2 CK, that is great, I am competitive for Derm.” Statistically, that applicant just landed at or below the unmatched median range, not in the matched one.
If you are serious about these specialties:
- Below ~245, you are now relying on extreme strengths elsewhere (research, home program advocacy, away rotation performance).
- Around 250, you are in the realistic conversation.
- Above 255, you start to look like what PDs subconsciously tag as “typical match applicant.”
There are exceptions, but the aggregate data do not lie.
B. Mid-Competitive Workhorse Specialties
Internal Medicine, General Surgery, OB/GYN, Pediatrics. Most U.S. MDs match here, but the Step 2 CK gradient still matters—especially at academic or “prestige” programs.
Approximate numbers:
| Specialty | Median Matched | Median Unmatched | Strong Region | Vulnerable Region |
|---|---|---|---|---|
| Internal Med (categorical) | 244–247 | 233–236 | ≥245 | <235 |
| General Surgery | 245–248 | 235–238 | ≥245 | <238 (especially <235) |
| OB/GYN | 241–244 | 231–234 | ≥240 | <232 |
| Pediatrics | 239–242 | 228–232 | ≥238 | <230 |
Here, the divergence is more like 7–10 points. Still significant.
For a mid-tier IM program, I watched them sort applications by Step 2 CK and then draw an implicit cutoff around 235. Files below that got a brief scan for red flags or extraordinary features, but they were rarely invited. In their final rank list, the bulk of ranked applicants sat between about 238 and 255. That is the real “functional range,” not what is on the website.
The nuance: if you want academic IM with fellowship ambitions (cards, GI, heme/onc), a 235 might technically match you into “IM somewhere,” but it may not match you where the fellowship pipeline is strongest. So you have to think two steps ahead. The data for fellowship entry are tightly correlated with the prestige/academic intensity of the residency program, and those programs are picky. Step 2 CK is a fast proxy.
C. Less-Competitive But Not “Easy” Specialties
Family Medicine, Psychiatry, Pathology, Neurology at many non-elite programs. Match rates are higher, but low Step 2 CK scores still correlate with unmatched status, especially for IMGs.
Approximate patterns:
| Specialty | Median Matched | Median Unmatched | Safer Band | Watch-Out Band |
|---|---|---|---|---|
| Family Med | 234–237 | 222–226 | ≥235 | <225 |
| Psychiatry | 238–242 | 226–230 | ≥238 | <228 |
| Pathology | 235–238 | 223–227 | ≥235 | <225 |
| Neurology | 238–241 | 228–232 | ≥238 | <230 |
What happens here is subtle. A U.S. MD with a 225 will often still match FM or Psych, particularly with strong letters and a good story. But the unmatched group is stacked heavily in the sub-225 region, especially among IMGs and reapplicants. For them, 225 is not “fine”; it is the first warning sign.
The main statistical lesson: less-competitive specialties do not erase the Step 2 CK effect. They just shift the band where “divergence” begins about 5–10 points lower.
4. US MD vs DO vs IMG: Different Curves, Different Risk
Step 2 CK is not interpreted the same way for everyone. The same raw score carries different risk profiles by applicant type.
Here is an approximate model of match probability by Step 2 CK band and applicant status, aggregated across specialties, based on patterns from NRMP data:
| Step 2 CK band | US MD (%) | US DO (%) | IMG (%) |
|---|---|---|---|
| <220 | 50 | 35 | 20 |
| 220-229 | 70 | 55 | 35 |
| 230-239 | 82 | 70 | 55 |
| 240-249 | 90 | 82 | 72 |
| 250+ | 94 | 88 | 82 |
Do not obsess over the exact numbers; focus on the pattern:
- At any given score, US MD applicants have the highest match rate.
- US DO applicants lag slightly behind, especially at lower scores.
- IMGs (US and non-US) sit significantly lower at every score band.
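To put numbers on that pattern, here is a short sketch that computes how far each group sits below U.S. MDs in every band. It uses only the approximate figures from the table above; `RATES` and `gap_to_us_md` are illustrative names.

```python
# Approximate match probabilities (%) by band and applicant type, from the table above.
RATES = {
    "<220":    {"US MD": 50, "US DO": 35, "IMG": 20},
    "220-229": {"US MD": 70, "US DO": 55, "IMG": 35},
    "230-239": {"US MD": 82, "US DO": 70, "IMG": 55},
    "240-249": {"US MD": 90, "US DO": 82, "IMG": 72},
    "250+":    {"US MD": 94, "US DO": 88, "IMG": 82},
}


def gap_to_us_md(rates, group):
    """Return, per band, the US MD match rate minus the given group's rate."""
    return {band: row["US MD"] - row[group] for band, row in rates.items()}


for group in ("US DO", "IMG"):
    print(group, gap_to_us_md(RATES, group))
```

In this model the gap narrows as scores rise but never closes: even at 250+, IMGs sit 12 percentage points below U.S. MDs.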
Concrete example: a 235 Step 2 CK.
- For a U.S. MD, that might be “borderline but matchable” for IM or OB.
- For a U.S. DO, that is more like a “safer” point for FM, Psych, Neuro.
- For an IMG, 235 is still a risk-bearing score for many core specialties and too low for competitive fields, unless backed by serious research and connections.
I have seen IMGs with 250+ still scramble because they under-applied or targeted hyper-competitive cities. For them, Step 2 CK is necessary but not sufficient. But if you are an IMG with <230, the data are brutal. Your match probability in anything other than FM, Psych, or Path at IMG-friendly programs drops sharply.
So your personal Step 2 CK benchmark depends on which curve you are on:
- US MD: aim for the median matched range in your target specialty.
- US DO: aim for median matched + a few points to offset bias and fewer “home” programs.
- IMG: aim well above the median matched range for that specialty if you want a realistic chance at U.S. training in anything mid-competitive or above.
5. Where Programs Actually Draw the Line: Implicit Filters
Programs rarely publish cutoffs, but in practice they use very blunt rules.
From in-person conversations with PDs and from seeing their spreadsheets:
- Many mid-tier programs set an initial Step 2 CK screen around 230–235 for core specialties.
- Competitive academic programs and competitive specialties push informal filters closer to 240–245, sometimes 250+.
- For IMGs, internal spreadsheets often had a separate, higher threshold (e.g., 240 for IMGs where 230–235 might be accepted for U.S. MDs).
The workflow is predictable:
| Stage | Type |
|---|---|
| Application received | Start |
| Step 2 CK reported? | Decision (if not: set aside pending score) |
| Score above filter? | Decision (if not: reject or low priority) |
| Red flags? | Decision |
| Review full file | Action |
| Consider for interview | Outcome |
If your score sits barely above the filter, you are not “safe.” You are simply allowed into the next stage where subjective factors now determine your fate. If you sit comfortably above (say filter is 235 and you are at 248), your file is psychologically coded as “strong,” which changes how people read your narrative, letters, and experiences.
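The screening workflow above can be sketched as a single function. Treat this as an illustration under stated assumptions: the default filter of 235 is just one of the example figures mentioned earlier, a score equal to the filter is treated as passing, and I am assuming that red flags send a file to the low-priority pile. The `Application` class and `screen` function are hypothetical names, not any program's actual tooling.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Application:
    step2_ck: Optional[int]   # None means the score has not been reported yet
    red_flags: bool = False


def screen(app: Application, score_filter: int = 235) -> str:
    """Return the stage an application reaches in the assumed screening flow."""
    if app.step2_ck is None:
        return "set aside pending score"
    if app.step2_ck < score_filter:
        return "reject or low priority"
    if app.red_flags:
        # Assumption: red flags route the file to low priority.
        return "reject or low priority"
    return "review full file"


print(screen(Application(step2_ck=None)))
print(screen(Application(step2_ck=229)))
print(screen(Application(step2_ck=248)))
# Same score, higher filter (e.g. the separate IMG threshold mentioned above).
print(screen(Application(step2_ck=236), score_filter=240))
```

Note what the function does not capture: passing the filter only earns a full-file review, and everything after that is subjective.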
This is the underappreciated part: Step 2 CK is not only a numeric cutoff. It is a framing device. A high number biases reviewers favorably; a low one biases them the other way. I have watched reviewers forgive a mediocre personal statement because “the scores and letters are great,” and I have watched them nitpick a strong application because “the Step is a bit soft.”
6. Strategic Benchmarks: What You Should Actually Aim For
Summarizing the data into actionable targets:
For ultra-competitive specialties (Derm, Ortho, Plastics, ENT, Neurosurg)
- Target: 250–260+ (US MD); 255+ (DO/IMG with serious research and connections).
- Below 245: You are now in the statistical zone of the unmatched cohort; success requires outsized strengths elsewhere and realistic backup planning.
- Backup: Strongly consider applying to a “parallel” specialty (e.g., IM, Gen Surg) if <245.
For mid-competitive specialties (IM, Gen Surg, OB/GYN, Pediatrics)
- Target:
  - US MD: ≥240–245 for comfortable academic/urban matches.
  - DO: ≥245 to offset bias, especially for Gen Surg and OB/GYN.
  - IMG: ≥245–250 for IM/Gen Surg at IMG-friendly programs.
- Red zone:
  - <235 for US MD
  - <230 for DO
  - <235 for IMG
You can still match with scores in the red zone, but your application must be surgically optimized and your list broad and realistic.
For less-competitive specialties (FM, Psych, Path, Neuro at many programs)
- Target:
  - US MD: ≥230–235 puts you in a solid band.
  - DO: aim ≥235–240.
  - IMG: ≥235–240 for a more comfortable profile.
- Serious concern:
  - <220 for anyone; the unmatched rates skyrocket here, especially for IMGs.
Remember: these are benchmarks, not guarantees. Step 2 CK explains a substantial chunk of match variance, but not all of it. Still, it is the one variable you can move in a quantifiable way before you apply.
7. The Post–Step 1 Pass/Fail Reality: Step 2 CK as the New Gatekeeper
Before Step 1 went pass/fail, programs leaned heavily on that three-digit score. Now, Step 2 CK has been quietly promoted to gatekeeper status.
What I have seen (and what PDs have admitted in private):
- Many programs now require a reported Step 2 CK score before they make interview decisions, not just by rank list certification.
- Some have raised their informal Step 2 CK expectations by 5–10 points because they lost Step 1 as a discriminating tool.
- The variance in Step 2 CK is now the primary standardized way to differentiate large piles of applicants with similar transcripts and vague MSPE language.
This shift makes “I will just crush Step 2 if Step 1 was weak” both more common and more dangerous. If you delay Step 2 and underperform, you have lost the last major standardized lever. And programs have no patience for narratives like “I am not a good test taker” when the majority of their matched residents are sitting 10–15 points above you.
One more angle: the later in the cycle your Step 2 CK score posts, the less it can help you. Early high scores get you interviews. Late high scores mostly help on the rank list—useful, but not as powerful as getting in the door in September instead of November.
8. How to Use These Benchmarks Without Losing Your Mind
A data-heavy article like this can induce one of two bad reactions: denial (“these are just averages, I am different”) or panic (“my career is over because I scored 5 points low”). Both are unhelpful.
Here is the rational way to use these numbers:
- Locate yourself on the distributions for your target specialty and applicant type. Are you in the matched median band, the unmatched median band, or the overlap region?
- Adjust your strategy, not your identity. If your score sits:
  - At or above the matched median: you have statistical tailwind. Apply broadly, but you can aim higher geographically and prestige-wise.
  - Slightly below the matched median but above the unmatched median: your outcome is highly sensitive to non-score factors. Letters, rotations, and program list quality matter a lot.
  - At or below the unmatched median: you must consider backup specialties, more applications, targeted IMG-friendly or lower-tier programs, and polished narratives explaining any score issues.
- Stop pretending scores do not matter. Programs do not care how you feel about standardized testing. They care that the residents they rank can pass boards and handle the cognitive load. Step 2 CK is their crude but effective proxy.
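The "locate yourself" step can be written down as a small function. This is a sketch only: the tables in this article give medians as ranges, so the single values passed in the example (245 and 235 for categorical IM) are illustrative points picked from inside those ranges, and `locate` is a hypothetical name.

```python
def locate(score: int, matched_median: int, unmatched_median: int) -> str:
    """Place a score relative to the matched and unmatched medians."""
    if score >= matched_median:
        return "at or above the matched median"
    if score > unmatched_median:
        return "between the medians"
    return "at or below the unmatched median"


# Illustrative values for categorical IM, chosen from within the ranges given earlier.
for score in (250, 240, 232):
    print(score, "->", locate(score, matched_median=245, unmatched_median=235))
```

Each of the three outputs maps to one of the strategy branches above; the function tells you which branch you are on, not what the outcome will be.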
If your Step 2 CK is already in your past, the data tell you where the risk is, not that you are doomed. I have seen 225s match competitive programs when everything else lined up. I have also seen 255s go unmatched after disastrously narrow application lists and arrogant personal statements.
Key Points to Remember
- The divergence between matched and unmatched applicants starts far above the passing threshold—usually around 235–245, depending on specialty.
- Competitive specialties show 10–12 point gaps between matched and unmatched medians; even mid-competitive fields sit around a 7–10 point gap.
- Your realistic benchmark is not “passing” or “average”; it is the median matched score for your specialty and applicant type, plus a buffer if you are DO or IMG.