Residency Advisor

Step Score Deciles and Recommended Program Numbers: A Data Model

January 6, 2026

14% of residency applicants submit 60 or more applications—yet their match rate is barely higher than those who submit half as many.

That is not an efficiency problem. That is a strategy problem. And most of it comes down to one thing people hand-wave: how your Step score decile should drive how many programs you apply to.

Let me be direct: “apply to as many as you can afford” is lazy advice. The data does not support it. Your optimal number depends on three variables:

  1. Your Step 2 CK score decile
  2. Specialty competitiveness
  3. Whether you are a US MD, US DO, or IMG

By application time, all three are essentially fixed. The lever you still control is how widely you apply. So we build a data model around that.


1. The Core Idea: Deciles, Not Raw Scores

The NRMP and NBME never give you a clean “apply to exactly X programs” chart. What they do give you is raw material: score distributions and match outcome reports.

The step that most people skip is normalizing their score. A 245 means little without context. What matters is your position in the distribution: your decile.

You can think about Step 2 CK more or less like this (recent cycles, ballpark):

Approximate Step 2 CK Decile Cutoffs (bar chart)

Percentile   Approx. Score
10th         220
25th         230
50th         240
75th         250
90th         260

That is not exact for every year but close enough to model behavior.

Translation:

  • Bottom 10%: ≤ ~220
  • 10–25%: ~221–229
  • 25–50%: ~230–239
  • 50–75%: ~240–249
  • 75–90%: ~250–259
  • Top 10%: ≥ ~260

Programs do not care if you are a 246 vs 247. They care if you are in their usual bucket: “below average, average, above average, excellent.”

So we’ll frame recommendations by decile band rather than fetishizing individual points.
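The band translation above is easy to encode. Here is a minimal Python sketch, assuming the ballpark cutoffs from this section. The function name `step2_band` and the label strings are my own illustration, not an official NBME mapping.

```python
from bisect import bisect_right

# Ballpark lower bounds of each band above the bottom one, taken from the
# list above. Approximations for recent cycles, not official cutoffs.
BAND_FLOORS = [221, 230, 240, 250, 260]
BAND_LABELS = ["Bottom 10%", "10-25%", "25-50%", "50-75%", "75-90%", "Top 10%"]


def step2_band(score: int) -> str:
    """Return the approximate band label for a Step 2 CK score."""
    # bisect_right counts how many floors the score has reached.
    return BAND_LABELS[bisect_right(BAND_FLOORS, score)]


# A 246 and a 247 land in the same bucket, which is the whole point.
print(step2_band(246), "|", step2_band(247))
```

If the cutoffs shift in a later cycle, only `BAND_FLOORS` needs to change.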


2. How Programs Actually Behave (Not How People Think)

Here is the unromantic pipeline:

  1. Filter by Step (often hard cutoffs for IMGs, softer for US MDs)
  2. Roughly rank by: Step band → school type → red flags → extras (research, AOA, etc.)
  3. Invite enough people to fill interview slots with a safety margin
  4. Build the rank list mostly from the people they actually meet

From the applicant side, matching usually requires:

  • A minimum number of interviews. The usual target is ~10–12 for many non-competitive specialties if you are a US MD, and more for DO/IMG applicants and for competitive fields.
  • Enough applications to produce them: interviews = number of applications × your invite rate per application.

So the question “how many programs should I apply to?” is really:

How many applications do I need to send to probabilistically generate ~X interviews, given my decile and specialty?

Let’s put structure to that.


3. A Simple Data Model: Interviews as a Function of Decile

We can approximate an “interview yield rate” (percent of applications turning into interviews) by Step decile and specialty competitiveness.

For US MDs in an average-competitiveness specialty (IM, FM, peds, psych), a reasonable working set of yields might look like this:

  • Top 10%: ~20–25% of apps become interviews
  • 75–90%: ~15–20%
  • 50–75%: ~10–15%
  • 25–50%: ~6–10%
  • 10–25%: ~4–7%
  • Bottom 10%: ~2–4%

Yes, there is massive noise around those numbers (school, geography, red flags, etc.), but for planning, this is more useful than vibes.

If your goal is:

  • ~12 interviews: comfortable probability of matching in most non-competitive specialties for US MDs
  • ~15–18 interviews: more comfortable for competitive specialties or DO/IMG
  • ~8–10 interviews: still a decent shot in less competitive fields, but higher risk

Then you can invert the math:

Applications needed ≈ Target interviews ÷ Interview yield

Let’s convert that into something you can actually use.

Illustrative Interview Yield Model by Step Decile (US MD, Average Specialty)

Step 2 CK Decile   Approx. Score Band   Estimated Interview Yield   Apps Needed for ~12 Interviews
Top 10%            ≥ 260                20–25%                      50–60
75–90%             250–259              15–20%                      60–80
50–75%             240–249              10–15%                      80–120
25–50%             230–239              6–10%                       120–200
10–25%             221–229              4–7%                        170–300
Bottom 10%         ≤ 220                2–4%                        300–600+

Notice something that surprises people: even at the top, you still benefit from breadth. A US MD with a 262 in internal medicine who applies to just 20 programs is betting on an unusually high yield from each application: at the modeled 20–25%, 20 applications produce only 4–5 interviews, and reaching 12 would take a 60% yield. That works if you are geographically flexible and not overreaching; it fails fast if you are picky.
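To make the inversion concrete, here is a small Python sketch of "applications needed ≈ target interviews ÷ interview yield", run over a yield range. The function name `apps_needed` is hypothetical, and yields are passed as whole percentages so the rounding stays exact.

```python
def apps_needed(target_interviews: int, yield_low_pct: int, yield_high_pct: int) -> tuple[int, int]:
    """Invert interviews = applications x yield over a yield range.

    Yields are whole percentages (e.g. 20 for 20%). Returns
    (fewest, most): the optimistic count uses the high yield, the
    pessimistic count uses the low yield. Both are rounded up.
    """
    if not 0 < yield_low_pct <= yield_high_pct <= 100:
        raise ValueError("yields must satisfy 0 < low <= high <= 100")
    # Integer ceiling division avoids floating-point surprises.
    fewest = -(-target_interviews * 100 // yield_high_pct)
    most = -(-target_interviews * 100 // yield_low_pct)
    return fewest, most


# Illustrative yields from the table above (US MD, average specialty).
for band, low, high in [("Top 10%", 20, 25), ("50-75%", 10, 15), ("10-25%", 4, 7)]:
    print(band, apps_needed(12, low, high))
```

The table rounds these to friendlier numbers, so small differences from the computed values are expected.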


4. Adjusting for Specialty Competitiveness

“Average specialty” is doing a lot of work in that table. Competitiveness matters more than most applicants want to admit.

Three rough buckets:

  • Less competitive: Family medicine, internal medicine (categorical, not physician-scientist tracks), pediatrics, psychiatry, pathology, neurology in many regions
  • Mid-range: OB/GYN, anesthesia, EM (though in flux), general surgery (categorical)
  • Highly competitive: Dermatology, plastics, ortho, ENT, urology, radiation oncology (small numbers), integrated vascular, some radiology tracks

The data pattern:

  • In less competitive fields, your interview yield is better at a given decile. You can safely reduce applications modestly.
  • In highly competitive fields, your yield is worse (especially at mid and low deciles). You compensate by either:
    • Applying to more programs, and/or
    • Adding a parallel backup specialty

For high-competition specialties, it is not unusual to see US MDs with top 10% scores applying to 60–80+ programs, and still sweating for 10–12 interviews. That is not overkill; that is the market.

Let’s sketch indicative ranges for US MDs, assuming no major red flags and moderate geographic flexibility.

Suggested Application Ranges by Step Decile and Competitiveness (US MD)

Step 2 CK Decile   Less Competitive   Mid-range                   Highly Competitive
Top 10%            25–40              40–60                       60–80+
75–90%             30–45              50–70                       70–90+
50–75%             40–60              60–90                       80–120+ (+ backup)
25–50%             50–80              80–120                      120+ (+ strong backup)
10–25%             60–100             100–160                     Often not viable without extraordinary factors
Bottom 10%         80–150             Avoid unless strong hooks   Avoid; focus on safer fields
Those are not guarantees. They are risk-calibrated ranges based on typical interview yields.
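If you want the table above in a form you can query, a nested dictionary is enough. This is a sketch: the keys (`"less"`, `"mid"`, `"high"`) and the function name `suggested_range` are my own labels, and I use `None` wherever the table gives advice instead of a number.

```python
from typing import Optional

Range = Optional[tuple[int, Optional[int]]]

# (low, high) application counts copied from the table above (US MD).
# A high of None means the table gives only a floor ("120+").
# A cell of None means the table advises against applying.
# Trailing "+" marks and backup-specialty notes are dropped here.
SUGGESTED: dict[str, dict[str, Range]] = {
    "Top 10%":    {"less": (25, 40),  "mid": (40, 60),   "high": (60, 80)},
    "75-90%":     {"less": (30, 45),  "mid": (50, 70),   "high": (70, 90)},
    "50-75%":     {"less": (40, 60),  "mid": (60, 90),   "high": (80, 120)},
    "25-50%":     {"less": (50, 80),  "mid": (80, 120),  "high": (120, None)},
    "10-25%":     {"less": (60, 100), "mid": (100, 160), "high": None},
    "Bottom 10%": {"less": (80, 150), "mid": None,       "high": None},
}


def suggested_range(band: str, competitiveness: str) -> Range:
    """Look up the US MD starting range for a band and specialty bucket."""
    return SUGGESTED[band][competitiveness]


print(suggested_range("50-75%", "mid"))
print(suggested_range("Bottom 10%", "high"))
```

Treat the result as a starting interval, not a verdict.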


5. Applicant Type Multipliers: US MD vs DO vs IMG

Same score, different reality.

A 240 from a US MD applicant is not the same asset as a 240 from a non-US IMG. The filter logic at many programs still quietly follows this order:

  1. US MD with acceptable Step scores
  2. US DO
  3. US IMG
  4. Non-US IMG

The data show large gaps in match rates by applicant type at similar score bands. So you adjust using multipliers.

Approximate application inflation factors relative to US MD, same decile and specialty:

  • US DO: multiply suggested applications by ~1.2–1.5
  • US IMG: multiply by ~1.5–2.0
  • Non-US IMG targeting competitive fields: often 2.0–3.0 or simply not realistically competitive without exceptional strengths

So if a US MD in the 50–75% band applying to anesthesia is in the 60–90 application range, a US DO in the same score range should be thinking more like 80–120. A non-US IMG with that same score likely needs 120–160+, with heavy focus on community and IMG-friendly programs.

Here is a compressed view.

Relative Application Volume Multipliers by Applicant Type (horizontal bar chart)

Applicant Type   Multiplier
US MD            1.0
US DO            1.3
US IMG           1.7
Non-US IMG       2.2

Programs rarely say this out loud, but their rank lists and historical match data show it clearly.
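Applying a multiplier to a range is simple arithmetic, but it is worth writing down once. A minimal Python sketch, assuming the single-point multipliers from the chart above; the name `scale_range` is hypothetical.

```python
# Point multipliers from the chart above. The text gives wider ranges
# (e.g. 1.2-1.5 for US DO), so treat these as rough centers, not constants.
MULTIPLIER = {"US MD": 1.0, "US DO": 1.3, "US IMG": 1.7, "Non-US IMG": 2.2}


def scale_range(base: tuple[int, int], applicant_type: str) -> tuple[int, int]:
    """Scale a US MD application range by the applicant-type multiplier."""
    factor = MULTIPLIER[applicant_type]
    low, high = base
    return round(low * factor), round(high * factor)


# US MD mid-range band of 80-120 applications, rescaled for a US DO.
print(scale_range((80, 120), "US DO"))
```

For IMGs in competitive fields, the text above suggests the factor can run well past these chart values, so pass a wider base range or adjust the dictionary accordingly.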


6. The Hidden Constraint: Diminishing Returns

Here is where mass-application thinking breaks.

If you look at NRMP data and institutional reviews, you see a curve like this:

  • Up to a certain application count, additional applications yield more interviews at a roughly linear rate.
  • Beyond that threshold, every additional 10–20 applications generate almost no new interviews.

Typical threshold bands:

  • Less competitive specialties: strong flattening around 40–60 applications for US MDs with mid-plus scores.
  • Competitive specialties: flattening around 60–80 for high-decile US MDs, higher for DO/IMG.

Why? Because programs overlap in behavior. The ones willing to interview you usually show up early in your invite list. The tier that consistently ignores your profile will likely ignore you whether you applied to 60 or 120 of them.

That is why throwing money at 80 extra “reach” programs with strict cutoffs often yields one extra interview, or none.

So any responsible model must include an upper cap where marginal ROI is tiny. For most applicants:

  • Past ~100–120 applications in a single specialty, returns per app become very low unless you are an IMG in a competitive field with no better options.
  • Beyond ~150, you are entering “panic spam” territory, not data-driven strategy.

7. Building a Simple Personal Model

Let’s walk through an example with actual numbers.

Example 1: US MD, Step 2 CK 244, Internal Medicine

  • Step 2 CK 244 ≈ 50–75th decile
  • Specialty: less competitive to average
  • Applicant type: US MD
  • Geography: open, no strict regional constraint

Reasonable interview yield guess: 10–15%.
Target: 12–14 interviews.

Use values near the middle of each range (12% yield, 13 interviews) for planning:

  • Apps needed ≈ 13 interviews ÷ 0.12 ≈ 108

Now apply diminishing returns logic. Our earlier table for this band and specialty suggested 40–60 for less competitive, maybe 60–80 for average IM if you have some constraints.

Why did the rough math give ~108? Because I used a cautious 12% and ignored the fact that IM has many programs that over-interview mid-decile US MDs. Realistically:

  • If you apply to 60 well-chosen IM programs as this applicant, your effective yield might run closer to 15–20%, not 12%. That gets you 9–12 interviews.
  • If you are nervous, push to 70–80.

So for this profile, I would say: 60–80 IM programs, prioritized by fit and geography, not 120–150.
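The two estimates in this example come from running the same formula in opposite directions. A short Python sketch of both, using only the numbers above; `expected_interviews` is a hypothetical helper name.

```python
def expected_interviews(applications: int, yield_pct: float) -> float:
    """Expected interviews = applications x yield (yield in percent)."""
    return applications * yield_pct / 100


# Naive inversion with the cautious 12% planning yield and 13 interviews.
naive_apps = 13 * 100 / 12
print(f"naive estimate: {naive_apps:.0f} applications")

# Forward check with the higher effective yield assumed for a
# well-targeted IM list (15-20%, an assumption from the text above).
for apps in (60, 80):
    low = expected_interviews(apps, 15)
    high = expected_interviews(apps, 20)
    print(f"{apps} applications -> {low:.0f} to {high:.0f} interviews")
```

The gap between the two is entirely the yield assumption, which is why list quality matters more than list length for this profile.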

Example 2: US DO, Step 2 CK 232, Anesthesia

  • Step 2 CK 232 ≈ 25–50th decile
  • Specialty: mid-range competitiveness
  • Applicant type: US DO

From the earlier table, a US MD at this decile for a mid-range specialty would be in the 80–120 range. Apply the 1.3× multiplier for DO:

  • Suggested range ≈ (80–120) × 1.3 ≈ 104–156

Now add reality: some academic anesthesia programs are biased against DO applicants and have higher Step expectations. If you apply to 120 programs and 40 of those are pure reaches, your effective yield will be worse.

So the data-driven version is:

  • 100–140 applications, but with:
    • Heavy weighting to DO-friendly and community programs
    • Very selective “reach” behavior

Not “send 200 apps to every big-name program that shows up on FREIDA.”

Example 3: Non-US IMG, Step 2 CK 252, Internal Medicine

  • Step 2 CK 252 ≈ 75–90th decile (strong)
  • Specialty: less competitive
  • Applicant type: non-US IMG

This is someone with a high decile but a “penalty” for applicant type.

Base for US MD at 75–90th decile, less competitive: 30–45 programs.
Apply a 2.0× multiplier:

  • 60–90 programs

But with IMG realities (less yield from university hospital programs), I have seen many such applicants land ~8–15 interviews by applying to 80–120 well-chosen programs. So the real recommendation:

  • 80–120 internal medicine programs, heavy IMG-friendly targeting, and focus on regions that historically interview IMGs.

That is not paranoia. That’s what the data show.


8. Time and Money: The Other Data You Ignore at Your Peril

Every extra 20 applications:

  • Costs you more in ERAS fees
  • Generates extra secondaries / supplemental questions
  • Adds more interview scheduling chaos

If you apply to 120 programs and magically get 25 interview offers, you will not attend 25. You will cancel a third of them, possibly more. So you will have spent time and money generating interviews you were never going to use.

Think of a residency season as a small optimization problem:

  • Objective: Maximize the probability of obtaining 10–15 high-quality interviews you will actually attend.
  • Constraints: Budget, time, personal energy, away rotations, school obligations.

More is not automatically better. The optimal strategy is usually:

  • Enough applications to hit that 10–15 interview zone for your risk profile
  • Heavy targeting to programs where your decile and background fit their historical behavior
  • Conscious avoidance of ridiculous reaches that just drain fees

Interviews vs Applications: Diminishing Returns Shape (line chart)

Applications   Interviews
20             3
40             7
60             10
80             12
100            13
120            14
140            14

Most people emotionally believe the line keeps going up. It flattens.
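You can see the flattening directly by differencing the chart's points. A minimal Python sketch; the data are the illustrative values from the chart above, not measured outcomes, and `marginal_gains` is a hypothetical helper.

```python
# Illustrative points from the chart above: (applications, interviews).
CURVE = [(20, 3), (40, 7), (60, 10), (80, 12), (100, 13), (120, 14), (140, 14)]


def marginal_gains(curve: list[tuple[int, int]]) -> list[tuple[int, int]]:
    """Interviews gained at each step between consecutive curve points."""
    return [(a2, i2 - i1) for (_, i1), (a2, i2) in zip(curve, curve[1:])]


for apps, gain in marginal_gains(CURVE):
    print(f"up to {apps:>3} applications: +{gain} interviews")
```

On this curve, each block of 20 applications buys less than the one before it, and the last block buys nothing.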


9. How to Use This Model in Practice

Condensed workflow:

  1. Identify your Step 2 CK decile band
    • Use current score distributions; approximate is enough.
  2. Classify your specialty
    • Less competitive / mid-range / highly competitive.
  3. Factor in applicant type
    • US MD, US DO, US IMG, non-US IMG: choose the multiplier.
  4. Start from the table ranges
    • Use the earlier application range table as a starting interval.
  5. Adjust based on personal factors
    • Major red flags? Add 20–30% to the range.
    • Very strong extras (AOA, great research in that field)? You can trim modestly, especially in less competitive specialties.
    • Severe geographic constraints? You may need to push to the top of the range.
  6. Cap at a reasonable maximum
    • In one specialty, going above ~120 for most applicants rarely shifts outcomes meaningfully unless you are an IMG in trouble.
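Steps 3 through 6 of that workflow can be sketched as one function. This is a rough Python illustration under stated assumptions: the name `plan_applications` is hypothetical, "add 20–30%" is read as +20% on the low end and +30% on the high end, and the cap defaults to the ~120 mentioned in step 6.

```python
def plan_applications(
    base: tuple[int, int],
    multiplier: float = 1.0,
    red_flags: bool = False,
    cap: int = 120,
) -> tuple[int, int]:
    """Turn a US MD table range into a personal application range.

    base       -- (low, high) from the application range table (step 4)
    multiplier -- applicant-type factor, 1.0 for US MD (step 3)
    red_flags  -- if True, inflate the range by 20-30% (step 5)
    cap        -- ceiling where returns flatten (step 6); raise it
                  deliberately if you are an IMG with few alternatives
    """
    low, high = base[0] * multiplier, base[1] * multiplier
    if red_flags:
        # Assumption: +20% on the low end, +30% on the high end.
        low, high = low * 1.2, high * 1.3
    return min(round(low), cap), min(round(high), cap)


# US DO (1.3x) in a mid-range field, 50-75% band, with a red flag.
print(plan_applications((60, 90), multiplier=1.3, red_flags=True))
```

The trimming for strong extras and the push for geographic constraints are left to judgment here, since the workflow gives no numbers for them.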

10. The Bottom Line in Plain Numbers

If you only remember a few anchors, use these:

  • US MD, strong score (top 25%) in a less competitive field: 30–60 programs.
  • US MD, middle-of-the-pack in a mid-range field: 60–100 programs.
  • US MD attempting a highly competitive specialty: 60–90+ plus a backup specialty (30–60 apps) unless you are truly elite.
  • US DO: inflate US MD ranges by ~20–50%, depending on field and region.
  • IMGs: expect to live near the top end of the ranges and be extremely targeted.

And ignore anyone who says, “Just apply to everything and see what happens.” That is not a plan. That is a gambling strategy.


Key points to walk away with:

  1. Your Step 2 CK decile, not the raw number, is what should anchor how many programs you apply to.
  2. Optimal application counts shift sharply with specialty competitiveness and applicant type; use multipliers, not one-size-fits-all advice.
  3. There is a clear point of diminishing returns—usually well below 150 programs—where each extra application adds cost and chaos but almost no incremental chance of matching.