
Most applicants guess about “fit.” The data shows you can measure it. And if you do not, you waste interviews and tank your match odds.
Residency matching is not mysterious. It is noisy, but not random. Programs behave like risk‑averse investors: they want predictable returns (residents who will pass boards, not quit, and fit their brand). You can model that behavior with numbers.
What you need is a quantitative fit model: a way to translate your scores, grades, research, and preferences into probabilities of interview and probability of matching at a given program. Then rank accordingly.
Let me walk through a concrete, numbers-first way to do this.
1. The Data Reality: What Actually Predicts Interviews and Matches
Before building any model, you need to be honest about what the data says drives outcomes. Not what advisors say. What the NRMP and program surveys actually show.
The big buckets:
- Board exam performance (USMLE/COMLEX)
- Academic record (clerkship grades, AOA, class rank)
- Research and scholarly output
- School type and reputation
- “Signal” variables: geographic ties, away rotations, program signals (if your specialty uses them)
- Non‑quantifiable but structured: letters, interviews, perceived personality
The first four are model‑friendly. The last two are noisy and hard to quantify ahead of time, so you treat them as uncertainty, not core predictors.
Look at what program directors report in NRMP surveys (roughly, across competitive specialties):
- Step 2 CK (or equivalent) is almost always top 3 in importance for interview offers.
- Clerkship grades and class ranking/AOA land in the top 5.
- Research output shows strong correlation in competitive fields (Derm, Plastics, Rad Onc, Ortho).
- Failing a board exam, leaves of absence, or professionalism flags drastically reduce odds regardless of strength elsewhere.
So if your personal “story” is great but your Step 2 is 215 and you are applying ortho, the data is not on your side. A quantitative fit model forces you to confront that early, not at SOAP.
2. Define the Input: Your Applicant Profile as a Numeric Vector
You cannot model fit if “my profile” is just vibes. Turn yourself into numbers.
A practical minimal feature set:
Exams
- Step 2 CK (or COMLEX Level 2, converted)
- Step 1: pass/fail now, but still binary history (pass on first attempt vs not).
Academic performance
- Core clerkship honors count (number of H vs HP/P)
- AOA / Gold Humanism / class quartile as categorical or binary flags.
Research
- Total pubs + abstracts + posters (you do not need a 3‑page CV; programs scan totals).
- First‑author count as a separate feature for research‑heavy fields.
Program‑relevant extras
- Home program in that specialty (yes/no).
- Completed away rotation at that program (yes/no).
- Geographic connection to program’s region (0/1).
School background
- US MD, US DO, Non‑US IMG, US IMG (dummy variables).
- “Top‑tier research school” flag if applicable (you can proxy with USNWR top‑25 research or similar).
That gives you roughly 10–14 numeric/binary variables. Enough for a simple scoring model.
Represent it mentally like this:
ApplicantVector = [Step2Z, ClerkshipScore, ResearchScore, AOAFlag, HomeProgramFlag, RegionFlag, SchoolTypeDummies…]
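To make that concrete, here is a minimal sketch of an applicant vector in Python. The field names and example values are illustrative placeholders, not fields from any official dataset:

```python
from dataclasses import dataclass

@dataclass
class ApplicantVector:
    """Minimal numeric representation of an application. All values are illustrative."""
    step2_z: float        # Step 2 CK, standardized against the target specialty
    clerkship_z: float    # honors-based clerkship score, standardized
    research_z: float     # pubs + abstracts + posters, standardized
    aoa: int              # 1 if AOA, else 0
    home_program: int     # 1 if your school has a home program in the specialty
    away_rotation: int    # 1 if you rotated at the program in question
    region_tie: int       # 1 if you have a geographic tie to the program's region
    school_type: str      # "US_MD", "US_DO", "US_IMG", or "NON_US_IMG"

me = ApplicantVector(step2_z=0.7, clerkship_z=0.5, research_z=-0.2,
                     aoa=0, home_program=1, away_rotation=0,
                     region_tie=1, school_type="US_MD")
```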
You are not going to publish this in JAMA. You are going to use it to avoid rank list disasters.
3. Quantifying Program Competitiveness
Now flip the camera: what makes a program “competitive” in quantifiable terms?
You do not have perfect internal data, but you do have strong proxies:
- Specialty competitiveness overall (NRMP fill rate, Step score distributions).
- Program tier (perception, case volume, research intensity).
- Historical behavior (how often they rank DOs/IMGs, if known).
- Location desirability (major coastal city vs mid‑size Midwest vs rural).
- Program size (more positions = slightly more forgiving).
You can distill this into a single “Program Competitiveness Index” (PCI) on a 0–100 scale. Think USNWR‑style, but grounded in actual match behavior.
Here is a simple structure for PCI:
- 40%: Specialty baseline difficulty (e.g., Derm high, FM low).
- 30%: Step 2 CK typical range and AOA concentrations of recent residents (you can infer from alumni LinkedIn profiles, program websites, and published Step statistics by specialty tier).
- 15%: Research intensity (publications per faculty, NIH funding level, % residents going into fellowships).
- 10%: Location desirability (based on applicant preferences—NYC/Boston/LA get bumped).
- 5%: Program size and fill pattern (small, always fully filled programs → higher PCI).
Let me show a simplified ranking across three hypothetical programs in the same specialty:
| Program | Specialty | PCI (0–100) | Typical Step 2 CK | Residents with >5 pubs |
|---|---|---|---|---|
| Alpha Academic | Derm | 92 | 255–265 | 70% |
| Beta Regional | Derm | 78 | 245–255 | 30% |
| Gamma Community | Derm | 63 | 240–248 | 10% |
The actual weights can be debated, but the point is the same: PCI is not subjective “prestige”; it is a synthetic metric reflecting how hard it is to get in.
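If you prefer to compute PCI rather than eyeball it, here is a minimal sketch of the weighted sum. The component scores (each on 0–100) are estimates you supply yourself; the weights are the ones proposed above, not validated coefficients:

```python
def program_competitiveness_index(specialty_difficulty: float,
                                  resident_stats: float,
                                  research_intensity: float,
                                  location_desirability: float,
                                  size_and_fill: float) -> float:
    """Weighted PCI on a 0-100 scale. Each input is a 0-100 estimate you supply."""
    return (0.40 * specialty_difficulty
            + 0.30 * resident_stats
            + 0.15 * research_intensity
            + 0.10 * location_desirability
            + 0.05 * size_and_fill)

# Rough, made-up inputs for a hypothetical top academic derm program
print(program_competitiveness_index(95, 95, 90, 90, 85))  # ~93, in the "Alpha Academic" range
```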
To visualize the spread across a single specialty (approximate PCI by tier):
| PCI tier | Approximate PCI |
|---|---|
| Top Quartile | 90 |
| Upper-Mid | 80 |
| Lower-Mid | 70 |
| Bottom Quartile | 60 |
Your applicant vector will be compared to this PCI.
4. Building a Fit Score: From Intuition to a Formula
Here is the core idea:
Fit is a function of how your profile compares to what similar programs historically accept.
You want two related quantities:
- Probability of interview at that program.
- Conditional probability of matching there given an interview.
You are not going to perfectly estimate those without large datasets, but you can get a surprisingly useful approximation with a fit score.
4.1 Normalize yourself against the specialty
First normalize your Step 2 CK relative to that specialty:
Step2Z = (YourStep2 - SpecialtyMean) / SpecialtySD
Same for research, using a rough distribution by specialty. For instance, in Dermatology:
- Mean total publications for matched US MDs ≈ 8–10
- SD ≈ 6–8 (wide, but skewed)
So:
ResearchZ = (YourPubs - SpecMeanPubs) / SpecSDPubs
For clerkships, build a simple score:
- Honors in IM/Surgery/Peds/OB: 2 points each
- High Pass: 1 point
- Pass: 0
Then normalize that vs classmates if you have the data, or vs a typical distribution.
Now your standardized performance vector might look like:
[Step2Z = +0.7, ResearchZ = -0.2, ClerkshipZ = +0.5]
You are +0.7 SD above mean on Step 2, slightly below average on research, slightly above on clinical grades.
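A minimal sketch of the normalization step. The specialty means and SDs below are placeholders; you would replace them with numbers pulled from Charting Outcomes for your field:

```python
def z_score(value: float, mean: float, sd: float) -> float:
    """Standardize a raw metric against the specialty's distribution."""
    return (value - mean) / sd

# Placeholder specialty stats; swap in Charting Outcomes figures for your specialty
step2_z = z_score(255, mean=248, sd=10)   # ≈ +0.70
research_z = z_score(7, mean=9, sd=8)     # ≈ -0.25 (wide, skewed research distribution)
clerkship_z = z_score(6, mean=5, sd=2)    # honors-point score vs. an assumed typical spread
print([round(step2_z, 2), round(research_z, 2), round(clerkship_z, 2)])
```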
4.2 Map your performance to a “personal competitiveness index”
Define your own Applicant Competitiveness Index (ACI) on the same 0–100 scale as PCI.
Example formula:
ACI = 50 + 12 * Step2Z + 8 * ClerkshipZ + 7 * ResearchZ + AOA_bonus + SchoolType_bonus
Where:
- AOA_bonus ≈ +5 if AOA, 0 otherwise
- SchoolType_bonus ≈ +5 for US MD @ top‑25 research, +2 for other US MD, 0 for DO, -5 for IMG (reflects actual PD behavior in many specialties)
So if you have:
- Step2Z = +1.0
- ClerkshipZ = +0.5
- ResearchZ = 0
- AOA = yes
- US MD, mid‑tier school
Then:
- Base = 50
- Step2 contribution = 12 * 1.0 = 12
- Clerkship = 8 * 0.5 = 4
- Research = 7 * 0 = 0
- AOA_bonus = +5
- SchoolType_bonus = +2
Total ACI = 50 + 12 + 4 + 0 + 5 + 2 = 73
You now have a “you score” that is anchored to the specialty’s distribution.
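Here is the same formula as a small function, reproducing the worked numbers above. The weights and bonuses are the ones proposed in this section, not fitted coefficients:

```python
def applicant_competitiveness_index(step2_z: float, clerkship_z: float, research_z: float,
                                    aoa: bool, school_bonus: float) -> float:
    """ACI on the same 0-100 scale as PCI, using the example weights above."""
    return (50
            + 12 * step2_z
            + 8 * clerkship_z
            + 7 * research_z
            + (5 if aoa else 0)
            + school_bonus)

# Worked example from the text: Step2Z=+1.0, ClerkshipZ=+0.5, ResearchZ=0, AOA, mid-tier US MD
print(applicant_competitiveness_index(1.0, 0.5, 0.0, aoa=True, school_bonus=2))  # 73.0
```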
5. Converting PCI vs ACI into Match Probabilities
When you have your ACI and each program’s PCI, the central question becomes:
“How does outcome probability change as ACI – PCI varies?”
This difference, Δ = ACI – PCI, is your fit gap. Positive = you are stronger than that program’s usual level. Negative = you are weaker.
You can approximate the relationship between Δ and actual match outcomes with a simple logistic curve. Based on observed behavior from many specialties, the pattern is roughly:
- Very high Δ (you significantly exceed the typical level) → high interview probability and reasonably high match probability (but not 100%; geography, vibe, and internal candidates still matter).
- Moderate positive Δ → decent interview rate, moderate match odds.
- Slight negative Δ (around -5) → some interviews if you have geographic ties or strong connections, but lower match odds.
- Large negative Δ (≤ -15) → very low chance; treat as a “lottery ticket” unless you have extraordinary insider support.
You can visualize a stylized curve:
| Δ = ACI − PCI | Approx. match probability (%) |
|---|---|
| -20 | 2 |
| -15 | 5 |
| -10 | 10 |
| -5 | 18 |
| 0 | 30 |
| 5 | 45 |
| 10 | 60 |
| 15 | 72 |
| 20 | 80 |
Interpretation:
- At Δ = -10 (you are 10 points below the program’s typical competitiveness), your match probability is maybe 10% at best, probably lower without a strong signal.
- At Δ = +10, you are in the 60%+ region, assuming you and the program actually rank each other and you do not self‑sabotage the interviews.
For practical planning, convert this into categories:
- Δ ≥ +10 → “High fit”
- 0 ≤ Δ < +10 → “Moderate fit”
- -10 ≤ Δ < 0 → “Reach”
- Δ < -10 → “Long‑shot / lottery”
You should see the pattern: you want your rank list dominated by “High” and “Moderate” with a controlled tail of “Reach,” plus a few “Lottery” picks only if your application volume and budget allow.
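A minimal sketch that turns Δ into one of these categories and a rough probability by interpolating the stylized table above. The breakpoints are the ones proposed here, not empirical estimates:

```python
# Stylized (delta, approximate match probability %) points from the table above
CURVE = [(-20, 2), (-15, 5), (-10, 10), (-5, 18), (0, 30), (5, 45), (10, 60), (15, 72), (20, 80)]

def fit_category(delta: float) -> str:
    """Bin the fit gap into the four categories defined above."""
    if delta >= 10:
        return "High fit"
    if delta >= 0:
        return "Moderate fit"
    if delta >= -10:
        return "Reach"
    return "Long-shot / lottery"

def approx_match_probability(delta: float) -> float:
    """Piecewise-linear interpolation over the stylized curve; clamps at the ends."""
    if delta <= CURVE[0][0]:
        return CURVE[0][1]
    for (d0, p0), (d1, p1) in zip(CURVE, CURVE[1:]):
        if delta <= d1:
            return p0 + (p1 - p0) * (delta - d0) / (d1 - d0)
    return CURVE[-1][1]

print(fit_category(7), approx_match_probability(7))  # Moderate fit, ~51%
```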
6. Building a Quantitative Rank List Strategy
Now we get to the only part that actually affects Match Day: how you assemble and rank programs once you have ACI and PCI.
6.1 Calibrate your risk based on specialty and total applications
NRMP data across specialties shows:
- Overall, ranking more programs monotonically increases match probability, but with diminishing returns.
- The average matched applicant in a competitive specialty like ortho or derm ranks ~10–15 programs; in less competitive specialties like FM, often fewer.
- Unmatched applicants often have both fewer ranks and a higher fraction of extreme reaches.
Use a simple risk metric:
- Compute your Δ for each applied program.
- Count programs in each bin: High, Moderate, Reach, Lottery.
- Look at the distribution of your ranked list, not just total length.
Say you are applying to Internal Medicine with ACI ≈ 65:
- 5 programs with Δ ≥ +10 (High)
- 8 programs with 0 ≤ Δ < +10 (Moderate)
- 7 programs with -10 ≤ Δ < 0 (Reach)
- 3 programs with Δ < -10 (Lottery)
You have 23 programs. That mix is generally safe for IM, provided geography is not absurdly constrained.
If you are applying Ortho with the same mix, that is risky. For highly competitive specialties, your proportion of High + Moderate fit must be larger because baseline non‑match rates are simply higher.
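Here is a minimal sketch of that distribution check, assuming you have a dict of program name → PCI. The program names and PCIs are invented for illustration:

```python
from collections import Counter

def fit_bin(delta: float) -> str:
    """Same cutoffs as the fit categories defined above."""
    if delta >= 10:
        return "High"
    if delta >= 0:
        return "Moderate"
    if delta >= -10:
        return "Reach"
    return "Lottery"

def rank_list_mix(aci: float, program_pcis: dict) -> Counter:
    """Count programs per fit bin so you can eyeball the risk profile of your list."""
    return Counter(fit_bin(aci - pci) for pci in program_pcis.values())

# Hypothetical program PCIs for an applicant with ACI ≈ 65
programs = {"State U IM": 60, "Big Academic IM": 78, "Community A": 52,
            "Community B": 55, "Regional U": 68, "Coastal Academic": 80}
print(rank_list_mix(65, programs))
```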
6.2 Rank by a composite of fit and preference
Your personal happiness still matters. But do not delude yourself into treating your #1 dream program as equal to a realistic mid‑tier because “anything can happen.”
Best practice:
Filter out programs where Δ < -15 unless:
- You rotated there and crushed it, and
- You have explicit, strong support from faculty inside the program.
For remaining programs, compute a Combined Score:
CombinedScore = w1 * FitScore(Δ) + w2 * PreferenceScore + w3 * GeographicScore
Where:
- FitScore(Δ) maps your Δ into 0–100 (use the curve above).
- PreferenceScore is your 0–100 subjective ranking (case mix, culture, city).
- GeographicScore accounts for family/partner constraints.
Reasonable weights for most people: w1 = 0.5, w2 = 0.3, w3 = 0.2
Sort by CombinedScore and sanity‑check. Make minor manual adjustments if needed, but do not throw out the structure entirely.
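A minimal sketch of the composite ranking, assuming you already have FitScore, PreferenceScore, and GeographicScore on 0–100 for each program. The example programs and scores are invented:

```python
def combined_score(fit: float, preference: float, geography: float,
                   w1: float = 0.5, w2: float = 0.3, w3: float = 0.2) -> float:
    """Weighted blend of fit, subjective preference, and geographic constraints (0-100 each)."""
    return w1 * fit + w2 * preference + w3 * geography

# (name, FitScore, PreferenceScore, GeographicScore) -- made-up entries
candidates = [
    ("Dream Academic", 30, 95, 60),      # great on paper, poor statistical fit
    ("Solid Regional", 65, 75, 80),
    ("Home-ish Community", 80, 60, 90),
]
ranked = sorted(candidates, key=lambda c: combined_score(*c[1:]), reverse=True)
for name, *scores in ranked:
    print(f"{name}: {combined_score(*scores):.1f}")
```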
You will usually notice that some programs you “liked more” in the moment are statistical landmines. Seeing that early preempts some very bad rank list choices.
7. Concrete Example: Applying the Model to a Realistic Applicant
Let’s run numbers on a hypothetical:
- Specialty: General Surgery
- Specialty data (approximate, for US MD matched):
- Mean Step 2 CK = 250, SD = 8
- Mean pubs = 4, SD = 3
Applicant:
- Step 2 CK = 247 → Step2Z = (247–250)/8 = -0.375
- Pubs = 3 → ResearchZ = (3–4)/3 ≈ -0.33
- Clerkship honors: IM (H), Surgery (HP), Peds (P), OB (H)
- Score = 2 + 1 + 0 + 2 = 5 (assume that is ~average → ClerkshipZ ≈ 0)
- AOA: no
- US MD, non‑top‑25
Plug into ACI formula:
ACI = 50 + 12*Step2Z + 8*ClerkshipZ + 7*ResearchZ + AOA_bonus + SchoolType_bonus
→ Step2 component = 12 * (-0.375) = -4.5
→ Research = 7 * (-0.33) ≈ -2.3
→ Clerkship = 0
→ AOA_bonus = 0
→ SchoolType_bonus = +2
ACI = 50 - 4.5 - 2.3 + 0 + 0 + 2 ≈ 45.2
Call it ACI = 45.
Now three program PCIs in Surgery:
- BigName Academic: PCI = 72
- Regional University: PCI = 58
- Community Program: PCI = 46
Compute Δ:
- BigName: Δ = 45 − 72 = -27 (Lottery / basically no chance)
- Regional: Δ = 45 − 58 = -13 (below the -10 cutoff, so lottery by the bins; a strong reach at best)
- Community: Δ = 45 − 46 = -1 (Near‑median fit)
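To sanity-check the arithmetic, a quick sketch reproducing this applicant's ACI and the three Δ values, using the specialty stats and PCIs given above (the text rounds ACI to 45):

```python
# Applicant inputs from the example above
step2_z = (247 - 250) / 8      # -0.375
research_z = (3 - 4) / 3       # ≈ -0.33
clerkship_z = 0.0              # assumed ~average honors-point score
aci = 50 + 12 * step2_z + 8 * clerkship_z + 7 * research_z + 0 + 2  # no AOA, +2 for mid-tier US MD

for name, pci in [("BigName Academic", 72), ("Regional University", 58), ("Community Program", 46)]:
    print(f"{name}: delta = {aci - pci:+.1f}")
print(f"ACI ≈ {aci:.1f}")  # ≈ 45.2
```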
Your match strategy for this profile:
- Do not build your list around BigName‑type programs. Apply to a few if you truly want to, but treat them as lottery tickets.
- The backbone of your list must be programs near Δ ≈ 0 to +10. Think community and moderate‑tier university programs.
- If you love academics, look for lower‑PCI academic‑affiliated programs rather than dominating your list with PCI 70+ dream institutions.
That is drastically different from what many students do. They apply to 20 high‑PCI places “just in case” and under‑apply to realistic options. The result shows up in SOAP statistics every year.
8. Using the Model to Decide “How Many” and “Where”
You can also combine this with NRMP historical data on ranks vs match rate.
Simplified:
- For moderately competitive specialties, matched applicants often have ~10–12 ranks; unmatched ~5–6.
- For very competitive specialties, matched often have ~15+; unmatched fewer, but also much higher exposure to Δ << 0 programs.
So one reasonable quantitative heuristic:
- Compute your ACI.
- Compare to the specialty median competitiveness (call it SpecACI_median).
- If ACI is below the specialty median, target:
- More total applications,
- Heavier concentration in Δ ≥ 0 programs.
A sample target breakdown by where you fall:
| ACI vs Specialty Median | Total Programs to Rank | High Fit (Δ ≥ +10) | Moderate (0 ≤ Δ < 10) | Reach (-10 ≤ Δ < 0) | Lottery (Δ < -10) |
|---|---|---|---|---|---|
| ACI ≥ +1 SD above | 12–18 | 3–5 | 5–7 | 3–5 | 1–2 |
| Within ±1 SD | 15–22 | 3–6 | 6–9 | 4–7 | 1–2 |
| ≥1 SD below | 18–25+ | 4–8 | 7–10 | 5–7 | 0–2 |
Use this as a scaffold, then adjust for cost and interview yield.
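If you want the same scaffold in a script, here is a minimal lookup that mirrors the table above. The ranges are copied verbatim from the table; the ±1 SD cutoffs are the table's, not a fitted rule:

```python
def suggested_mix(aci_sd_vs_specialty_median: float) -> dict:
    """Suggested rank-list composition by how far your ACI sits from the specialty median, in SD."""
    if aci_sd_vs_specialty_median >= 1.0:
        return {"total": "12-18", "high": "3-5", "moderate": "5-7", "reach": "3-5", "lottery": "1-2"}
    if aci_sd_vs_specialty_median > -1.0:
        return {"total": "15-22", "high": "3-6", "moderate": "6-9", "reach": "4-7", "lottery": "1-2"}
    return {"total": "18-25+", "high": "4-8", "moderate": "7-10", "reach": "5-7", "lottery": "0-2"}

print(suggested_mix(-0.5))  # within ±1 SD of the specialty median
```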
9. Limitations, Noise, and What the Model Cannot See
If you do this right, you will be tempted to overtrust the numbers. That is a mistake.
The model ignores:
- Letters of recommendation quality (which can be decisive).
- Interview performance and “vibe.”
- Internal candidates, couples match entanglements, last‑minute program priorities.
- Diversity initiatives and mission fit that do not show in your raw metrics.
Think of the ACI–PCI framework as a prior probability. Then each interview, each new positive contact, and each strong letter is a Bayesian update in your favor.
Another important nuance: distributions differ by school. An ACI = 60 from a state school that sends 2 grads per year to a competitive specialty might be more impressive to PDs than ACI = 60 from a place that sends 20. The model does not see that granularity unless you explicitly incorporate it via school‑specific multipliers.
That said, even a crude ACI/PCI framework dramatically outperforms the default “I felt like I vibed with them” approach. The NRMP’s unmatched statistics are full of people who overestimated their reach tier and under‑indexed on mid‑fit programs.
To conceptualize the whole process:
| Step | Description |
|---|---|
| Step 1 | Collect Applicant Data |
| Step 2 | Compute ACI |
| Step 3 | Collect Program Data |
| Step 4 | Compute PCI |
| Step 5 | Calculate Delta ACI-PCI |
| Step 6 | Assign Fit Category |
| Step 7 | Estimate Match Probabilities |
| Step 8 | Build Ranked Program List |
10. How to Implement This Without Becoming a Full‑Time Statistician
You do not need a PhD or a full NRMP database mirror.
Here is a practical, one‑evening approach:
Estimate specialty means/SDs for Step 2 and publications from NRMP “Charting Outcomes” PDFs. Write them into a simple spreadsheet.
Compute your own Z‑scores and ACI using a formula like the one above.
For each program of interest:
- Assign a rough PCI based on:
  - Perceived tier (top, mid, community)
  - Typical Step 2 ranges you can infer from publicly available data or residents you know
  - Location desirability
- You do not need PCI precise to 1 point; ±3 is fine.
Calculate Δ and bin each program into High/Moderate/Reach/Lottery.
Check your list for balance:
- Are ≥50–60% of your applications/anticipated ranks in High or Moderate?
- Are you overloaded with Δ < -10 programs?
- In very competitive specialties, do you have enough total programs and enough Δ ≥ 0 anchors?
If you want to be fancier, you can layer a simple logistic formula in a spreadsheet to convert Δ → estimated match probability and then compute your overall chance of at least one match assuming independence (which is an approximation, but decent for planning).
For example:
P(match at program i | ranked & interviewed) ≈ 1 / (1 + e^(-k * (Δ_i - c)))
Pick k and c so that Δ = 0 → ~30% and Δ = +10 → ~60–70%. Then compute:
P(at least one match) ≈ 1 - Π (1 - P_i) across all realistic programs.
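Here is a minimal sketch of that last step. The values of k and c are hand-picked so the curve roughly hits the calibration points just described (Δ = 0 → ~30%, Δ = +10 → ~60%); this is a planning approximation, not a fitted model:

```python
import math

def p_match(delta: float, k: float = 0.13, c: float = 6.5) -> float:
    """Logistic map from the fit gap delta to an approximate match probability."""
    return 1 / (1 + math.exp(-k * (delta - c)))

def p_at_least_one(deltas: list) -> float:
    """Chance of matching somewhere, treating programs as independent (an approximation)."""
    p_none = 1.0
    for d in deltas:
        p_none *= (1 - p_match(d))
    return 1 - p_none

print(round(p_match(0), 2), round(p_match(10), 2))       # ~0.30 and ~0.61 with these k, c
print(round(p_at_least_one([5, 2, 0, -3, -8, -12]), 2))  # aggregate chance across a short list
```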
Is it perfect? No. Is it better than guessing? By an order of magnitude.
Final Takeaways
Turn yourself and programs into numbers. ACI vs PCI, and the Δ between them, is a powerful lens for judging realistic fit and risk.
Build your rank list around High and Moderate fit programs, not dreams. Let a limited number of Lottery programs in the door, but do not let them dominate.
Use the model as a disciplined baseline, not a rigid rule. Combine it with letters, interviews, and personal constraints, and you will avoid the avoidable disasters embedded in the match statistics every year.