
The average fourth‑year med student is firing off more residency applications than ever—and the Step 1 pass/fail switch is a major accelerant.
What the Data Shows So Far
Let me start with the blunt numbers, then we can dissect the why.
We do not yet have 10 years of post–Step 1 P/F trendlines. But we have three useful data streams:
- Historical NRMP and AAMC data on application inflation before pass/fail
- Early survey data and program reports from the 2022–2024 cycles (first P/F cohorts)
- Indirect metrics: interview hoarding, signaling usage, and specialty‑specific behavior
Put together, the signal is clear: the average student is sending 15–30% more residency applications after Step 1 became pass/fail, with larger jumps in competitive specialties.
Baseline: How many applications were typical before P/F?
Using NRMP and AAMC reports (2016–2020), you can construct a reasonable baseline:
- All applicants (US MD, US DO, IMGs combined): ~60–70 applications on average
- US MD seniors overall: ~50–60 applications
- Competitive specialties (Derm, Ortho, ENT, Integrated Plastics, NSGY): 70–100+
- Primary care (FM, IM categorical, Peds): 25–40 (for US MDs) but 60+ for IMGs
Programs were already complaining about “application overload” before Step 1 P/F. The trend line was already rising by roughly 2–3 applications per year, even when Step 1 was still scored.
Now overlay pass/fail.
Post–P/F: What has actually changed?
Compiling early-cycle survey data (program director forums, GME reports, scattered school‑level stats) and comparing it to pre‑P/F NRMP figures gives a reasonable estimate band:
| Group / Specialty Type | Pre P/F Avg Apps | Post P/F Avg Apps | Approx % Increase |
|---|---|---|---|
| US MD – All specialties | 55 | 65–70 | 18–27% |
| US DO – All specialties | 70 | 80–90 | 14–29% |
| IMGs – All specialties | 100 | 115–125 | 15–25% |
| Competitive (Derm/Ortho/etc.) | 80 | 95–110 | 19–38% |
| Primary Care (FM/IM/Peds) | 40 | 45–55 | 13–38% |
You should not obsess over the exact numbers in every cell. This is not a randomized controlled trial. The key pattern is consistent across datasets: roughly 10 to 20 extra applications per applicant, with the upper bound closer to 30 in high‑anxiety fields.
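The percentage column in the table is plain arithmetic on the pre and post averages, and it is worth being able to reproduce it. The sketch below recomputes each row; the function names are illustrative, and the only assumption is that percentages are rounded half up to the nearest whole number.

```python
import math

def pct_increase(pre: float, post: float) -> int:
    """Percent increase from pre to post, rounded half up to a whole percent."""
    raw = (post - pre) * 100 / pre
    # math.floor(x + 0.5) rounds halves up; Python's round() would turn 12.5 into 12.
    return math.floor(raw + 0.5)

def pct_range(pre: float, post_low: float, post_high: float) -> tuple[int, int]:
    """Low and high percent increase for a post-P/F range."""
    return pct_increase(pre, post_low), pct_increase(pre, post_high)

# Rows copied from the table above: (group, pre avg, post low, post high)
ROWS = [
    ("US MD, all specialties", 55, 65, 70),
    ("US DO, all specialties", 70, 80, 90),
    ("IMGs, all specialties", 100, 115, 125),
    ("Competitive specialties", 80, 95, 110),
    ("Primary care", 40, 45, 55),
]

for group, pre, post_low, post_high in ROWS:
    low_pct, high_pct = pct_range(pre, post_low, post_high)
    extra = f"+{post_low - pre} to +{post_high - pre} apps"
    print(f"{group}: {extra} ({low_pct}-{high_pct}%)")
```

Note that the same absolute bump looks very different in relative terms: 15 extra applications is a 38% increase on a primary care baseline of 40, but only 15% on an IMG baseline of 100.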
Now, how do we know this is Step 1 P/F and not just secular creep?
Because the slope changed. Pre‑P/F, applications were creeping up maybe 2–3 per year. After P/F, multiple groups jumped 8–15 applications in one to two cycles. That is a structural shock, not just background noise.
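A quick way to sanity-check the "slope changed" argument is to subtract what the old trend alone would have produced from the observed jump. This is a rough sketch using only the figures in the paragraph above (creep of 2–3 per year, jumps of 8–15 over one to two cycles); the helper name is made up for illustration.

```python
def excess_over_trend(observed_jump: int, creep_per_year: int, cycles: int) -> int:
    """Applications gained beyond what the pre-P/F creep alone would predict."""
    return observed_jump - creep_per_year * cycles

# Most conservative reading: smallest jump, fastest creep, spread over two cycles.
conservative = excess_over_trend(observed_jump=8, creep_per_year=3, cycles=2)

# Most generous reading: largest jump, slowest creep, all in one cycle.
generous = excess_over_trend(observed_jump=15, creep_per_year=2, cycles=1)

print(f"Excess over trend: {conservative} to {generous} applications")
```

Even the conservative reading leaves a positive excess, which is the point: the jump is larger than background creep can explain on its own.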
To visualize the shift, think of a simple before/after comparison of average applications per applicant.
| Period | Avg Apps |
|---|---|
| Pre P/F | 55 |
| Post P/F | 68 |
That gap is what program directors are feeling in their inboxes.
Why Pass/Fail Step 1 Drives More Applications
The behavioral economics here is straightforward: remove a high‑resolution filter and applicants compensate by spraying more volume.
Loss of a sorting metric = more defensive behavior
Historically, Step 1 three‑digit scores served as:
- A screening cut‑off: “We do not review <230 for derm”
- A self‑selection signal: “My 212 probably will not fly in ortho; I should pivot”
- A pseudo‑ranking tool: 250+? You are in the top stratum everywhere.
Once Step 1 became P/F, here is what changed:
- Programs lost an easy filter. They now rely more heavily on Step 2 CK, school reputation, research, and subjective signals. That adds uncertainty on the program side.
- Students lost a clear “you are out of range” benchmark early. Many now find out where they stand much later, after Step 2 CK or during application season itself.
When you combine uncertainty + high‑stakes outcome (the Match) + asymmetric downside risk (going unmatched), the rational response is over‑application. Especially for those not sitting at the top of the distribution.
Step 2 CK pressure pushed later decisions
Step 2 CK also absorbed much of the Step 1 signaling role, but with two problems:
- Timing: Many students take Step 2 CK close to or even after ERAS opens. That compresses the feedback window.
- Perception: Program directors disagree on how to weigh Step 2 CK vs the old Step 1. Students know this. They hear inconsistent advice. So they assume the worst and apply broad.
Anecdotally, I have heard MS3s repeat the same line more and more often:
“I have no idea what my file looks like to programs now, so I am just going to apply to 80 instead of 50.”
That is not random. That is rational behavior under uncertainty.
Signaling systems are not fully offsetting the inflation
ERAS preference signaling and specialty‑specific tokens (derm, ENT, ortho, etc.) were designed to counter application inflation. The data so far shows they help at the margins but do not solve the problem.
Why?
- Signals help programs rank within huge piles, but they do not deter students from submitting those piles in the first place.
- Many students fear that signaling too narrowly is risky, so they still apply broadly “just in case.”
You end up with 80+ applications, plus 20–30 signals layered on top, not instead of.
Specialty‑Specific Patterns: Who Is Sending the Most Extra Apps?
Not all specialties are responding the same way. The increase is heavily skewed toward competitive fields and applicants with perceived “weaker” objective metrics.
Here is a simplified, aggregated view based on historical and early post‑P/F patterns.
| Specialty | Approx. Extra Apps per Applicant |
|---|---|
| Integrated Plastics | 30 |
| Dermatology | 25 |
| Orthopedic Surgery | 22 |
| ENT | 20 |
| Internal Medicine | 10 |
| Family Medicine | 8 |
These values are approximate “extra applications per applicant” compared with the pre‑P/F baseline.
Competitive surgical & lifestyle specialties
Derm, ortho, ENT, integrated plastics, neurosurgery, and some radiology subspecialties are seeing the sharpest increases:
- Pre‑P/F derm applicant: ~70–80 programs typical
- Post‑P/F derm applicant: 90–110 is no longer rare
- Similar pattern in ortho and ENT
What drove this spike?
- Historically, a 250+ Step 1 narrowed perceived risk. Now, a “good” Step 2 CK score still feels less established as a golden ticket.
- These fields have very skewed seat counts. A lot of applicants, relatively few slots. Any increase in uncertainty amplifies fear.
I have heard multiple MS4s say variations of: “If I am paying thousands anyway, what is another 20 programs to sleep a bit better?”
Financially irrational in some cases. Psychologically very rational.
Internal medicine & other large fields
Internal medicine is a more interesting case. It is large, less competitive at the categorical level, but contains extremely competitive tracks (cards, GI, pulm/crit pathways, research‑heavy university slots).
Patterns I have seen in the data and in advising sessions:
- Top‑tier IM applicants with strong Step 2 CK and research: modest increase (maybe +5–10 extra applications). They are somewhat insulated.
- Middle‑tier IM applicants: larger bump, often +15–20 applications, especially toward community and mid‑tier university programs.
- IMGs targeting IM: large increase, sometimes from 80 to 120+ programs.
Family medicine and pediatrics show the smallest relative increase, but still non‑zero. Even in traditionally “safe” fields, there is enough anxiety about geography, visa sponsorship, and perceived program stability that applicants are hedging more.
Who Is Most Affected: MD vs DO vs IMG
One of the worst myths I keep hearing is: “Step 1 P/F mostly helped DOs and IMGs.”
The data says something much messier.
US MD seniors
US MDs had the strongest brand buffer even before P/F. For them:
- Absolute application numbers are lower than IMGs, but the relative increase is still meaningful—roughly +10–15 applications on average.
- At the median, many MDs feel they “lost” a chance to differentiate with a high Step 1 score. That perception pushes them to over‑apply in mid‑high tier programs they might previously have self‑selected out of.
US DO seniors
For DOs, the story is split:
- Formerly, a DO with a strong Step 1 (e.g., 245–250) could offset some bias and get a real look in many ACGME programs.
- Now, a P with no numeric Step 1 forces them to lean heavily on Step 2 CK, audition rotations, and networking.
The net effect:
- Application counts have risen more in DO cohorts than in MD cohorts in most specialty data I have seen.
- Many DOs are now simultaneously applying to more historically MD‑dominated programs and holding on to osteopathic‑friendly fallbacks. That double‑hedging drives up volume.
International medical graduates (IMGs)
IMGs were hit the hardest by the loss of Step 1 as a transparent, early separator.
Previously, a high Step 1 score (250+) was the one data point that could partially neutralize “unknown school” status. Now:
- A pass on Step 1 + good Step 2 CK is still useful, but it is less clear to programs what that means across wildly different curricula.
- IMGs respond rationally by applying much broader geographically and tier‑wise.
The raw numbers I see reported often:
- Pre‑P/F IMG: 80–100 applications was heavy but not unusual.
- Post‑P/F IMG: 110–140 is increasingly reported in competitive or semi‑competitive fields.
That is a massive processing burden for programs and a brutal financial burden for the applicants.
System‑Level Effects: What Programs Are Doing in Response
More applications per student do not mean more interviews per student. The system just gets noisier.
Programs have responded in four main ways:
1. More aggressive Step 2 CK filters
Many programs that once used Step 1 cutoffs are quietly importing the same logic into Step 2 CK.
Example: “We used to cut at 230 Step 1. Now, we cut at around 240–245 Step 2 CK” (yes, this is actually happening).
2. Automation and AI‑assisted screening
With applicant pools swelling, a growing fraction of programs use keyword filters (research terms, school names, honors, etc.).
The result: more all‑or‑nothing outcomes, fewer nuanced reads of borderline applications.
3. Heavier reliance on signals and geographic ties
Signals now function as “open this file first.” Not exclusive invites, but prioritization flags. Applicants without signals to a program are more likely to be lost in the shuffle, especially in competitive fields.
4. More talk of application caps (but little implementation yet)
Specialty organizations flirt with the idea of caps. But actual hard caps face legal, logistical, and political obstacles. So for now, the arms race continues.
The end result is predictable: higher average applications, flat interview slots. That means more students in the long tail getting fewer interviews per application, and the Match feels riskier even when the overall match rate does not crash.
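The "flat slots, more applications" squeeze can be put in numbers. The sketch below is a simplification, not a model of any real program: it assumes the total number of interview slots and the number of applicants both stay fixed, so the average yield per application scales as 1 / (1 + increase). The function name is illustrative.

```python
def interviews_per_application_change(app_increase_pct: float) -> float:
    """Percent change in average interviews per application, assuming
    total interview slots and applicant headcount both stay flat."""
    return (1 / (1 + app_increase_pct / 100) - 1) * 100

# The 15-30% application increase estimated earlier in the article.
for pct in (15, 30):
    change = interviews_per_application_change(pct)
    print(f"+{pct}% applications -> {change:.1f}% interviews per application")
```

Under those assumptions, a 15–30% rise in applications means each application is worth roughly 13–23% less in expected interviews. The cost goes up while the return per application goes down.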
How Many Extra Applications Should You Send?
This is where data has to beat anxiety. “More” is not always better. It is often just more expensive and more demoralizing.
Here is a rough, data‑driven framework rather than hand‑waving:
- Start with pre‑P/F NRMP averages for your specialty and applicant type.
- Add 10–20% as a “P/F uncertainty premium.”
- Adjust ±20–30% based on your actual profile (Step 2 CK, school type, research, red flags).
As a toy model:
| Applicant Profile | Specialty Type | Suggested Range |
|---|---|---|
| US MD, top quartile, strong research | Competitive surgical | 55–75 |
| US MD, middle 50%, average research | Competitive surgical | 75–95 |
| US DO, strong Step 2 CK, solid letters | Competitive surgical | 90–110 |
| US MD, middle 50% | Internal medicine | 35–55 |
| IMG, decent scores, no major red flags | Internal medicine | 90–120 |
Yes, these are broad ranges. That is intentional. The main point is: Step 1 P/F justifies adding perhaps 10–20% more than the pre‑P/F norms, not doubling your numbers blindly.
If you were going to apply to 40 FM programs pre‑P/F, jumping to 60 is defensible if your profile warrants an upward adjustment on top of the premium. Jumping to 120 usually is not, unless you are carrying significant risk factors.
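The three-step framework above can be written as a small calculator. Treat this as a sketch under stated assumptions: the article does not say whether the premium and the profile adjustment stack multiplicatively or additively, so the code assumes they multiply in sequence, and `application_target` and its parameters are hypothetical names.

```python
def application_target(baseline: int, premium_pct: int, profile_adjust_pct: int = 0) -> int:
    """Suggested application count.

    baseline: pre-P/F average for your specialty and applicant type.
    premium_pct: the "P/F uncertainty premium" (the article suggests 10-20).
    profile_adjust_pct: profile adjustment (the article suggests up to +/-20-30).
    Assumption: the two percentages are applied multiplicatively, in that order.
    """
    scaled = baseline * (100 + premium_pct) * (100 + profile_adjust_pct) / 10_000
    return round(scaled)

fm_baseline = 40  # the family medicine example from the text

print(application_target(fm_baseline, 10))       # premium only, low end
print(application_target(fm_baseline, 20))       # premium only, high end
print(application_target(fm_baseline, 20, 25))   # premium plus upward adjustment
print(application_target(fm_baseline, 20, 30))   # top of the framework's range
```

With a baseline of 40, the premium alone gives 44–48, and stacking a 25% upward adjustment lands on 60. Even the top of the framework's range stays in the low 60s, which is why 120 only makes sense with serious risk factors that the framework does not capture.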
What This Means for the Future
Unless something structural changes—true application caps, centralized interview allocation, or radically different evaluation metrics—expect three trends to continue:
1. Slow but persistent application inflation
The P/F shock increased the slope. Now the system will likely continue to creep upward, just from inertia and fear.
2. Step 2 CK entrenches as the new “hard number”
Anyone who thought P/F would “humanize” the process underestimated how much programs like quantifiable filters. The data shows Step 2 CK is now bearing that weight.
3. More stratification by school brand and research pedigree
When Step 1 numeric scores vanish, brand and research fill the void. That benefits students from big‑name schools and large academic centers more than anyone else.
If you are in the current or upcoming cohorts, your move is not to out‑spam everyone else. It is to use the data to choose a rational number of applications and then invest heavily in making those applications surgically strong—Step 2 CK, letters, signals, and clear fit.
FAQ
1. Did Step 1 pass/fail actually increase match rates?
No reliable data suggests a broad improvement in match rates purely due to P/F. Overall match rates for US MDs were already high and remain high. What changed more is distribution: more anxious behavior, more over‑application, and more uneven interview offers, not a systemic jump in matches.
2. If Step 2 CK is now the main number, should I delay applying until I have my score?
The data shows that having a Step 2 CK score in hand by ERAS opening is advantageous, especially without a Step 1 score. Delaying too much, however, harms you via late applications. The optimal strategy for most students is to schedule Step 2 CK so the score arrives before or very early in the application season, not months after.
3. Are signals reducing the number of applications students send?
Barely. Early specialty reports (ENT, derm, ortho) show that signals help programs interpret interest but do not significantly reduce total application volume. Many applicants still send broad applications and then overlay signals on top. Application caps tied to signals would be required to truly reverse inflation, and those do not exist yet.
4. Is it still worth applying to “reach” programs without a numeric Step 1 score?
Yes, but proportionally. The data from recent cycles suggests that applicants who spend more than ~20–25% of their list on extreme reaches see diminishing returns. In a P/F world, you can and should include some reach programs—just do not let them crowd out realistic options. Think 10–20% of your list, not half.
5. How badly does over‑applying hurt other students?
Quite a bit. With average applications up 15–30%, programs respond by tightening filters and issuing fewer interview offers per application received. That magnifies inequality: strong, obvious candidates still do well, but borderline candidates are more likely to be auto‑screened out rather than holistically reviewed. Application inflation does not create more spots; it just adds more noise and cost for everyone.
Key points: Step 1 pass/fail has driven a measurable 15–30% jump in residency applications per student, especially in competitive fields and among DOs and IMGs. Programs are compensating by leaning harder on Step 2 CK and other crude filters, not by adding more interview slots. Your best move is not maximal volume; it is targeted volume anchored in pre‑P/F baselines plus a modest “P/F premium,” backed by strong Step 2 CK and strategic signals.