Research Output vs Step Scores: What Matters More Post–Step 1 P/F?

January 8, 2026

The belief that “Step scores decide everything” is now statistically outdated. In the post–Step 1 pass/fail era, the data show a power shift: Step 2 CK has become the primary numeric screen, and research output has moved from “nice to have” to a major selection factor. But they do not matter equally, and they do not matter the same way for every specialty.

Let me be blunt: if you are chasing dermatology, plastics, ortho, ENT, neurosurgery, or radiation oncology, you are no longer competing on Step 1. You are competing on Step 2 CK plus a research CV that looks like a junior faculty member’s. For internal medicine or pediatrics, the curve is flatter—but the pattern is the same. Programs are replacing the lost Step 1 signal with a combination of Step 2 and “scholarly productivity.”

The key question is not “Which matters more, research or Step scores?” The key question is: “For my target specialty, with my current stats, where does one additional unit of effort produce the highest marginal gain in match probability?”

That is a data question. So let’s treat it like one.

1. What Actually Changed When Step 1 Went Pass/Fail?

We are not guessing here; we have trend lines.

Before Step 1 went P/F:

Program directors ranked Step 1 as the single most important factor for interviews in nearly every competitive specialty.
Research productivity mattered, but it was often a secondary differentiator after scores and class rank.

After Step 1 went P/F, survey and match data show three clear shifts:

Step 2 CK jumped into the vacuum.
In multiple NRMP Program Director Surveys since the change, Step 2 CK climbed to the top 2–3 factors for interview offers across most specialties. For competitive fields, it is now the primary standardized metric.
Research output became more stratified by specialty.
Fields that were already research-heavy (derm, radiation oncology, neurosurgery) increased expectations. Historically lower-research fields (FM, psych) did not suddenly become publication-driven. The slope increased where the culture already valued research.
Screening is more multi-factorial.
Without a single 3-digit Step 1 gate, programs lean harder on:
- Step 2 CK
- Clerkship grades / honors
- Home / away rotations
- Research, especially in-field
- Letters from known faculty

But they are not weighted equally, and there is real quantitative variation by specialty.

2. The Baseline Numbers: How Much Research and How High Scores?

First, you need to know the landscape you are walking into. The most useful numbers are from the last few NRMP Charting Outcomes in the Match reports and PD surveys. These are rounded and simplified for clarity, but the relative differences are the point.

Typical Profiles of Matched US MD Seniors

Specialty (Matched MD)	Mean Step 2 CK	Mean Abstracts/Posters/Pubs	Mean Programs Applied
Dermatology	257–260	18–22	70–80
Orthopedic Surgery	252–255	9–12	70–80
Neurosurgery	255–258	20–25	70–80
Internal Medicine	245–248	4–6	40–50
Family Medicine	238–242	2–3	25–35

Data interpretation:

Using the table midpoints, competitive surgical / lifestyle specialties show roughly 2–4.5x the research output of IM and about 4–9x that of FM.
Their mean Step 2 CK scores run about 7–12 points above IM and about 13–19 points above FM.
The “research arms race” is real in a subset of fields. But not everywhere.
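
The comparison above can be reproduced with a few lines of Python. This is a minimal sketch that takes the midpoint of each rounded range in the table and computes research ratios and score gaps against IM and FM. The `PROFILES` dictionary and the helper names are illustrative assumptions, and the inputs are this article's simplified figures, not official NRMP values.

```python
# Rounded ranges copied from the table above (simplified, not official NRMP data).
PROFILES = {
    "Dermatology": {"step2": (257, 260), "research": (18, 22)},
    "Orthopedic Surgery": {"step2": (252, 255), "research": (9, 12)},
    "Neurosurgery": {"step2": (255, 258), "research": (20, 25)},
    "Internal Medicine": {"step2": (245, 248), "research": (4, 6)},
    "Family Medicine": {"step2": (238, 242), "research": (2, 3)},
}


def midpoint(value_range):
    """Return the midpoint of a (low, high) range."""
    low, high = value_range
    return (low + high) / 2


def compare(specialty, baseline):
    """Return (research ratio, Step 2 CK gap) of specialty vs baseline."""
    s, b = PROFILES[specialty], PROFILES[baseline]
    ratio = midpoint(s["research"]) / midpoint(b["research"])
    gap = midpoint(s["step2"]) - midpoint(b["step2"])
    return round(ratio, 1), round(gap, 1)


for spec in ("Dermatology", "Orthopedic Surgery", "Neurosurgery"):
    for base in ("Internal Medicine", "Family Medicine"):
        ratio, gap = compare(spec, base)
        print(f"{spec} vs {base}: {ratio}x research, +{gap} Step 2 CK points")
```

Midpoints are a crude summary of ranges that are themselves rounded, so treat the output as an order-of-magnitude check rather than a precise statistic.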

Now let’s visualize how research volume escalates with competitiveness.

[Figure: bar chart of research volume by competitiveness tier (Primary Care, Mid-Competitive, Highly Competitive)]

This is the playing field. The question is where your marginal effort moves you meaningfully within it.

3. How Programs Actually Use Step 2 CK vs Research

Think of Step 2 CK and research as different kinds of signals:

Step 2 CK = a hard numeric filter and comparative benchmark.
Research output = a fit and interest signal, especially for academic and niche programs.

They enter the decision process at different points.

Step 2 CK: The Gatekeeper

Patterns I have seen repeatedly in raw applicant lists and program filters:

Many competitive programs set Step 2 CK cutoffs:
- ~250+ for derm / plastics / ENT “top tier”
- ~240–245+ for ortho, neurosurgery, radiology, EM at competitive sites
- ~230–235+ for IM at large academic centers
Filters are often binary. A 249 may pass while a 239 never gets seen, regardless of research.

That means Step 2 CK affects:

Whether your application is opened at all.
The probability of receiving an interview if your overall file is average.

Research cannot help you if you never make it through the numeric filter, especially in ERAS software, where program directors (PDs) and program coordinators (PCs) literally sort by score.
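
To see why a binary filter makes research irrelevant at the screening stage, here is a small Python sketch. The `Applicant` class, the `screen` function, the cutoff of 240, and the three applicants are all hypothetical; the sketch does not reproduce how any real program or the ERAS software implements its filters.

```python
from dataclasses import dataclass


@dataclass
class Applicant:
    name: str
    step2_ck: int
    research_items: int


def screen(applicants, cutoff):
    """Binary score screen: research is never consulted at this stage."""
    passed = [a for a in applicants if a.step2_ck >= cutoff]
    # Mimic a reviewer sorting the surviving files by score, highest first.
    return sorted(passed, key=lambda a: a.step2_ck, reverse=True)


# Hypothetical applicant pool for illustration only.
pool = [
    Applicant("A", 249, 3),
    Applicant("B", 239, 20),  # strong research, but below the cutoff
    Applicant("C", 252, 8),
]

for a in screen(pool, cutoff=240):
    print(a.name, a.step2_ck, a.research_items)
```

Applicant B has the most research in the pool and is still dropped, because the filter only ever looks at `step2_ck`.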

Research: The Multiplier and Tiebreaker

Research behaves differently:

It rarely functions as a strict cut-off (“no interview below 10 pubs” is uncommon outside a few ultra-academic programs).
Instead, it:
- Boosts your perceived commitment to a field.
- Strengthens letters and connections.
- Helps you stand out once you survive initial score filtering.

Where it becomes pseudo-mandatory:

Derm, neurosurgery, radiation oncology, plastics, ENT: many matched applicants have double-digit outputs.
Top-tier academic IM and subspecialty-track programs: serious applicants often show 5–10+ items and at least some first-author work.

But again, the order of operations matters: Step 2 gets you in the door. Research often decides how far you go once you are inside.

4. Marginal Value: One More Publication vs +5 Step 2 Points

You should not think in absolutes. You should think in marginal returns.

If you have 6 months of bandwidth, what is more impactful?

Turning a projected 245 Step 2 into a 252
vs
Converting 0–2 low-impact abstracts into 6–8 items?

Let’s sketch a rough, data-informed model for a US MD targeting a competitive specialty (ortho/derm-type field), using simplified probabilities to illustrate the trade-off.

Assume:

Current Step 2 CK practice trajectory: ~245
Current research: 2 low-impact items (posters, middle author)
Target: match at any accredited program in that specialty

Based on historical match curves and program director commentary, something like this is reasonable:

At Step 2 = 245, 2 research items → baseline match probability ~55–60%
At Step 2 = 252, 2 research items → bump to maybe ~70–75%
At Step 2 = 245, 8 research items (including 1–2 first-author, in-field) → also ~70–75%

In other words, in the mid-range:

+7 Step 2 points ≈ +6 substantial research items in impact on overall match odds.

Now, how much effort is that?

Going from a projected 245 to 252 may require:
- +3000–5000 high-quality questions
- 3–6 more weeks of focused study
- A disciplined schedule but a single-exam target
Going from 2 to 8 research outputs may require:
- 1–2 years of longitudinal involvement
- Substantial writing, data analysis, IRB, and revision cycles
- Buy-in from faculty and some luck on timelines

The data story: at the margin, boosting your Step 2 CK into a higher bracket is often a more time-efficient way to increase your match probability than chasing multiple extra low-impact abstracts, unless you are already numerically secure or targeting a research-hungry niche.
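
The trade-off can be put into rough numbers. The sketch below uses the midpoints of the illustrative probabilities and effort ranges from this section and divides the gain by calendar weeks. Two assumptions are mine: one year is counted as 52 weeks, and a week of dedicated study is treated the same as a week of part-time longitudinal research. Read the result as a back-of-the-envelope comparison, not a measured effect.

```python
def midpoint(low, high):
    """Return the midpoint of a range."""
    return (low + high) / 2


# Illustrative match probabilities (percent) from the scenario above.
BASELINE = midpoint(55, 60)      # Step 2 CK 245, 2 research items
AFTER_EITHER = midpoint(70, 75)  # 252 with 2 items, or 245 with 8 items
GAIN = AFTER_EITHER - BASELINE   # percentage points of match probability

# Effort in calendar weeks (assumption: 1 year = 52 weeks, intensity ignored).
STEP2_WEEKS = midpoint(3, 6)
RESEARCH_WEEKS = midpoint(52, 104)


def gain_per_week(gain, weeks):
    """Percentage points of match probability gained per calendar week."""
    return gain / weeks


step2_rate = gain_per_week(GAIN, STEP2_WEEKS)
research_rate = gain_per_week(GAIN, RESEARCH_WEEKS)

print(f"Step 2 path:   {step2_rate:.2f} points/week")
print(f"Research path: {research_rate:.2f} points/week")
print(f"Ratio: {step2_rate / research_rate:.1f}x")
```

Under these assumptions the two paths buy the same probability gain, so the whole difference comes from the time each one takes.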

5. Specialty-Specific Weighting: Where Research Really Competes with Scores

Let’s break it down by broad category, because the relative weight of research vs Step 2 changes.

A. Hyper-competitive, research-driven fields

Dermatology, neurosurgery, plastics, radiation oncology.

What the data show:

Mean Step 2 CK in the high 250s; significant fraction ≥260.
Median research output in double digits, often 15–25+.
Many matched applicants complete dedicated research years.

For these:

A subpar Step 2 (say, <245) is hard to compensate for, even with strong research.
But among applicants above a rough Step 2 floor (≈245–250), research heavily stratifies competitiveness.

If I reduce it to a simplistic rule:

Below the Step 2 threshold → Step 2 matters more (because without it, you are filtered out).
Above threshold and already decently productive (8–10 items) → research growth and networking can matter more than squeezing another 3–4 CK points.

So for a derm hopeful with:

Step 2 252 vs 255: small delta in probability.
5 research items vs 20, with multiple in-field first-authors: massive delta in probability.

The balance shifts toward research once you are safely within the “acceptable” score band.

B. Competitive but not research-obsessed: Ortho, ENT, Urology, EM, Anesthesia

These fields care about scores and research, but the distributions are different:

Average Step 2 CK: high 240s to low 250s.
Research: commonly 6–12 items, not 20+.
Strong home/away rotations and letters can partially compensate for thinner research.

In these specialties:

Getting Step 2 from ~238 → 248 can change your chances dramatically and often more efficiently than chasing two extra posters.
However, going from zero research to a focused 3–5 projects in the field still yields a meaningful bump, because programs like demonstrated interest and academic curiosity.

Rule of thumb here:

If you are below the median Step 2 CK for matched applicants, prioritize Step 2 until you at least reach the median.
When you are at/above median Step 2, growing targeted research (even 3–7 good in-field items) starts to compete with another few score points in impact.

C. Academic Internal Medicine and subspecialty-track aspirations

Internal medicine is the largest specialty in the Match by volume. It is less cutthroat than derm, but distinctions still matter, especially if you want GI, cards, or heme/onc later.

Typical matched MD stats:

Step 2 CK around 245–248, with many academic programs skewed higher (250+).
Research: 4–6 items on average, but premier programs often see 8–10+.

For IM:

Step 2 is critical to avoid being screened out among thousands of applications.
Once you are at or above ~245–250, incremental gains in research clearly help you climb program tiers and open doors for fellowship.

In other words:

To simply match IM at a decent program: Step 2 matters more up to a respectable level.
To match at MGH, UCSF, Penn, Hopkins–type places: the research profile becomes almost as important as the score, sometimes more if you already have a 250+.

D. Primary care and less research-heavy fields: FM, Psych, Peds, Neurology (most programs)

Here the data are different:

Mean Step 2 CK: high 230s to low 240s.
Research: 2–4 outputs, often with modest impact.

Realistically:

An extra 5 Step 2 points often does more for match safety and geographic choice than 3 extra posters.
Research can still be a differentiator for the most academic programs in these fields, but the baseline expectations are lower.

In most of these specialties, unless you are specifically targeting the top 5–10 academic programs nationally, I would rank Step 2 above “piling on more research” in marginal value.

6. Modeling the Trade-Off: A Simple, Practical Framework

You do not need machine learning to make a good decision. You need a structured comparison.

Use this 4-step process.

[Flowchart: Prioritizing Step 2 CK vs Research Effort. Nodes: Define Target Specialty Tier; Check Score Position vs Matched Mean; decision “Below Mean by 5+?”; Prioritize Step 2 CK Preparation; Assess Research Output vs Specialty Norm; decision “Research Below Norm?”; Invest Heavily in Targeted Research; Balance: Maintain Scores and Deepen Research Quality]

Step 1: Define your target specialty and tier
Are you aiming for:

Any program in that specialty?
Only large academic centers?
Specific geographic or prestige targets?

Step 2: Locate yourself vs the Step 2 CK distribution

If your predicted or actual Step 2 CK is >5 points below the matched mean for that specialty:
- Your first-order goal is to close that gap.
- Research will not save you from widespread screening at that point in competitive fields.

Step 3: Compare your research output to norms

Use a rough mapping based on what we know from match data:

Primary care:
- 0–1 items → weak
- 2–4 → typical
- 5+ → strong / academic-leaning
Competitive surgery / derm / rad onc / neurosurg:
- 0–5 items → weak
- 6–12 → solid
- 13–25+ → strong / research year–level
Academic IM:
- 0–3 → weak
- 4–7 → typical
- 8+ → strong
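
The mapping above translates directly into a lookup. This is a minimal sketch with the thresholds taken from the list. The category keys (`primary_care`, `competitive`, `academic_im`) and the function name are my own labels, and a raw count says nothing about authorship position or field alignment.

```python
# (max items for "weak", max items for the middle tier, middle-tier label),
# taken from the rough mapping above.
TIERS = {
    "primary_care": (1, 4, "typical"),
    "competitive": (5, 12, "solid"),  # surgery / derm / rad onc / neurosurg
    "academic_im": (3, 7, "typical"),
}


def research_tier(category, items):
    """Classify a research item count as weak, typical/solid, or strong."""
    if items < 0:
        raise ValueError("items must be non-negative")
    weak_max, mid_max, mid_label = TIERS[category]
    if items <= weak_max:
        return "weak"
    if items <= mid_max:
        return mid_label
    return "strong"


print(research_tier("competitive", 8))
print(research_tier("academic_im", 8))
```

The same eight items rate differently depending on the field, which is the point of comparing yourself to specialty norms rather than to an absolute number.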

Step 4: Evaluate marginal effort vs marginal impact

Ask yourself:

Can I reasonably move my Step 2 CK band upward with 4–8 weeks of focused study?
Or is my Step 2 already “good enough” for my target range, and I have 12+ months where a deep research plunge could credibly yield multiple first/second-author works?

Then you can choose:

Step 2–heavy strategy if:
- You are below or near the bottom of the competitive Step 2 range.
- You have limited time (≤6 months) before application season.
- You lack the infrastructure or time to generate serious research before ERAS.
Research-heavy strategy if:
- You already sit at or above the median Step 2 for your target specialty.
- You have roughly 9–18 months or more before applying.
- You can plug into a high-yield research group (dedicated year, robust mentorship, realistic pipeline).
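
The criteria above can be written as a small decision function. This is one possible encoding, not a validated model. It treats the matched mean and median as the same number, reads “limited time” as 6 months or less and “enough time” as 9 months or more, and returns “balanced” for every case the two lists do not cover. The function name and all parameter names are hypothetical.

```python
def choose_strategy(step2, matched_mean, months_to_eras, has_research_pipeline):
    """Suggest where marginal effort should go, per the criteria above.

    step2: predicted or actual Step 2 CK score
    matched_mean: typical matched score for the target specialty (assumed input)
    months_to_eras: months remaining before application season
    has_research_pipeline: True if a realistic, mentored research group is available
    """
    # More than 5 points below the matched mean: close that gap first.
    if matched_mean - step2 > 5:
        return "step2-heavy"
    # Limited time or no realistic research infrastructure also favors Step 2.
    if months_to_eras <= 6 or not has_research_pipeline:
        return "step2-heavy"
    # At or above the typical score, with 9+ months and a pipeline: research.
    if step2 >= matched_mean and months_to_eras >= 9:
        return "research-heavy"
    # Everything in between: maintain scores and deepen research quality.
    return "balanced"


print(choose_strategy(238, 247, 12, True))
print(choose_strategy(250, 247, 12, True))
```

The order of the checks mirrors the order of operations in the article: the score gap is tested first because research cannot compensate for being screened out.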

7. Quality vs Quantity: The Research Trap

One more uncomfortable data point: not all “20 pubs” are equal.

When I look at CVs, I see:

Posters from local student symposia counted as “publications.”
Middle-author case reports with minimal involvement.
Reviews written largely by residents or attendings where the student did little more than formatting.

Programs see this pattern too.

From discussions with PDs and faculty, three things reliably matter more than raw count:

Field alignment
Derm programs care more about derm projects than about a random cardiology case report.
Role clarity
First- or second-author on a substantial manuscript beats “8th author” on five low-impact pieces.
Narrative coherence
A progression from simple chart reviews → clinical studies → perhaps some basic/translational work, with clear mentorship, tells a stronger story.

So, if you are going to sacrifice Step 2 study time for research, the projects must realistically convert into:

In-field outputs
With higher authorship position
On a timeline that hits before ERAS submission

If you cannot secure that, then the data argue for protecting your Step 2 preparation instead.

8. A Data-Driven Summary: What Matters More, and When?

The question “Research output vs Step scores: what matters more post–Step 1 P/F?” has a conditional answer:

For getting past initial filters in most specialties, Step 2 CK matters more.
A weak Step 2 will quietly kill your application long before anyone appreciates your “20 posters.”
For moving up the program quality ladder after you have an acceptable Step 2, research starts to rival or exceed the marginal impact of a few extra CK points—especially in research-centric fields.

Here is the short version, stripped to the essentials:

Relative Priority: Step 2 CK vs Research by Scenario

Scenario	Higher Yield Focus
Below mean Step 2 for target specialty	Step 2 CK
Near mean, minimal research in research-heavy field	Research (targeted)
Above mean Step 2, average research, academic goals	Research + strong letters
Primary care, broad geographic goals	Step 2 CK (moderate)
Derm/neurosurg/plastics, score acceptable	High-yield research

And one last visual to drive home how Step 2 and research generally trade off across competitiveness tiers:

[Figure: stacked bar chart of relative Step 2 CK weight vs research weight across Primary Care, Academic IM, Competitive Surgery, and Ultra-Competitive (Derm/Neurosurg)]

If you remember nothing else, remember this:

Step 2 CK is the new Step 1 for screening. If you are below your specialty’s typical range, correcting that is the single most effective move you can make.
Once your Step 2 is in a competitive band, research output—especially targeted, high-quality, in-field projects—becomes the key lever for moving from “any match” to “the match you actually want.”

Chart data: relative weight of Step 2 CK vs research by competitiveness tier

Category	Step 2 CK Weight	Research Weight
Primary Care	70	30
Academic IM	55	45
Competitive Surgery	50	50
Ultra-Competitive (Derm/Neurosurg)	45	55
