
Using FREIDA and Specialty Reports to Quantify Program Risk Factors

January 8, 2026
14-minute read

Image: a resident reviewing FREIDA data and specialty reports.

The way most applicants “get a vibe” about residency programs is statistically reckless.

You have access to one of the densest data sources in graduate medical education—FREIDA and specialty workforce reports—and most people use them like a phone book. Scroll, click, shrug. That is how you miss red flags.

If you treat FREIDA and specialty reports as a structured dataset instead of a directory, you can quantify program risk with surprising precision. And no, that does not mean building a PhD-level model. It means doing what serious analysts do every day: define signals, set thresholds, and compare outliers.

Let me walk through how to do that properly.


Step 1: Treat FREIDA as a Dataset, Not a Brochure

FREIDA looks like a glorified search tool. It is not. It is a structured database. The data fields are inconsistent across specialties, but the core signals for risk assessment are there.

The data that actually move the needle are not in the description paragraph. Most applicants click into a program and read that paragraph like it is marketing copy, then ignore the hard numbers below. That is backward: the description is the least reliable field, and the structured data is where the red flags live.

Here are the key FREIDA fields to quantify:

  • Total resident complement vs PGY-1 positions
  • Reported average weekly work hours
  • Night float vs 24-hour call frequency
  • Minimum USMLE Step 2 / COMLEX scores “preferred” (signals selectivity vs desperation)
  • Percent of residents who are IMGs / DOs (only a red flag when wildly discordant with your target specialty norms)
  • Research requirement and scholarly output expectations

On their own, none of these scream “bad program.” But once you start comparing programs side-by-side and tying them back to specialty-wide statistics, patterns of risk appear.
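
If you want to keep this machine-readable from the start, here is a minimal sketch of how those fields could be captured as structured records for side-by-side comparison. The field names are my own shorthand, not FREIDA's actual schema, and the example values are invented.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class ProgramRecord:
    """One row of hand-collected FREIDA data. Field names are illustrative shorthand."""
    name: str
    specialty: str
    pgy1_spots: int
    total_complement: int
    avg_weekly_hours: Optional[float]   # None if FREIDA leaves it blank
    night_float: Optional[bool]         # True = night float system, False = 24h-call heavy
    board_pass_rate: Optional[float]    # program-reported, if you can find it at all

# Example entry; every value here is invented for illustration
example = ProgramRecord(
    name="Community IM Program",
    specialty="Internal Medicine",
    pgy1_spots=3,
    total_complement=9,
    avg_weekly_hours=62,
    night_float=True,
    board_pass_rate=None,  # "not available" is itself a signal (see Step 9)
)
```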


Step 2: Anchor Everything to Specialty-Wide Benchmarks

You cannot call something “high risk” without a baseline. This is where specialty reports come in.

I am talking about:

  • NRMP: “Charting Outcomes in the Match” and “Program Director Survey”
  • AAMC: workforce and specialty pipeline reports
  • Specialty society reports (AAIM for internal medicine, ACS for surgery, ACR for radiology, etc.)
  • Board organizations: certification pass rate reports

These documents give you:

  • Average program size
  • Typical board pass rates
  • Usual fellowship match patterns
  • National distribution of US MD / DO / IMG residents by specialty
  • Average duty hours (self-reported) where available
  • Competitiveness metrics (Step scores, research, etc.)

So instead of saying “this program has 8 residents per year, seems fine,” you can say “median categorical IM program is ~12-14 residents per year; this one at 3 per year is a genuine outlier.”

Let’s put the relative benchmarks side by side, even roughly.

Approximate median PGY-1 positions per program, by specialty (illustrative):

  • Internal Medicine: 12
  • General Surgery: 6
  • Pediatrics: 10
  • Psychiatry: 8

Interpretation: a categorical IM program with 3 residents per year is not just “small.” It is operating at 25% of the specialty median. That is a risk factor until proven otherwise.
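
A quick percent-of-median check makes that comparison mechanical. The medians below are the rough, illustrative numbers from the list above, not official NRMP figures.

```python
# Illustrative specialty medians (PGY-1 positions per program), taken from the rough list above
SPECIALTY_MEDIAN_PGY1 = {
    "Internal Medicine": 12,
    "General Surgery": 6,
    "Pediatrics": 10,
    "Psychiatry": 8,
}

def percent_of_median(pgy1_spots: int, specialty: str) -> float:
    """A program's class size as a percentage of its specialty median."""
    return 100 * pgy1_spots / SPECIALTY_MEDIAN_PGY1[specialty]

print(percent_of_median(3, "Internal Medicine"))  # 25.0 -> a genuine outlier
```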


Step 3: Build a Simple Program Risk Framework

You do not need fancy software. A spreadsheet plus discipline is enough.

Define 3 buckets of risk indicators:

  1. Structural red flags – inherent to size, geography, and resources
  2. Training-quality red flags – how residents are educated and worked
  3. Outcome red flags – what happens to residents afterward

You can assign each indicator 0–2 points based on data available from FREIDA and specialty reports. More points = more risk. Simple and brutally clear.

Here is a concrete example of what that could look like:

Sample Residency Program Risk Scoring Framework
Scoring per indicator: 0 points = low risk, 1 point = moderate, 2 points = high risk.

Structural
  • Class size vs specialty median: 0 = ≥ 75% of median; 1 = 50–74% of median; 2 = < 50% of median
  • Geographic isolation: 0 = large city with multiple hospitals; 1 = medium city, limited backup programs; 2 = rural / single-hospital region

Training quality
  • Reported average weekly hours: 0 = 55–65 hours; 1 = 66–75 or 45–54 hours; 2 = > 75 or < 45 (both suspicious)
  • Call structure: 0 = night float, few 24-hour calls; 1 = mixed; 2 = frequent 24-hour calls without clear limits

Outcomes
  • Board pass rates vs national average: 0 = within 5% of national; 1 = 6–10% below; 2 = > 10% below or “data not available”
  • Fellowship/job placement transparency: 0 = detailed, recent, specific list; 1 = partial info or older than 3 years; 2 = no data, vague claims, or clearly weak placement patterns

Add up the points:

  • 0–3 = Low risk
  • 4–6 = Moderate risk
  • 7+ = High risk

Is this perfectly statistically validated? No. But it is far better than the default “I liked the PD on Zoom.”
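
If you would rather let a few lines of code do the tallying, a minimal sketch might look like this. The indicator names are my own shorthand; the per-indicator scores and total-score thresholds are the ones from the framework above.

```python
def total_risk_score(scores: dict[str, int]) -> tuple[int, str]:
    """Sum per-indicator scores (each 0-2) and bucket the total per the thresholds above."""
    total = sum(scores.values())
    if total <= 3:
        label = "Low risk"
    elif total <= 6:
        label = "Moderate risk"
    else:
        label = "High risk"
    return total, label

# Hypothetical program scored against the six indicators in the framework above
scores = {
    "class_size_vs_median": 2,
    "geographic_isolation": 1,
    "reported_weekly_hours": 1,
    "call_structure": 1,
    "board_pass_vs_national": 2,
    "placement_transparency": 1,
}
print(total_risk_score(scores))  # (8, 'High risk')
```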


Step 4: Use FREIDA Fields to Quantify Structural Risk

Now, let’s take that framework and plug in actual FREIDA data.

1. Program size as a proxy for fragility

Small programs can be excellent. They can also collapse quickly when a single attending or hospital partner leaves.

Heuristics:

  • Compute: Program size ratio = program PGY-1 spots / specialty median PGY-1 spots
  • Thresholds:
    • ≥ 0.75 – structurally stable for most specialties
    • 0.50–0.74 – caution, especially in surgical and procedural fields
    • < 0.50 – high fragility risk

For example, if NRMP data show median PGY-1 categorical positions:

  • Internal Med: 12
  • General Surgery: 6
  • Pediatrics: 8
  • Psychiatry: 6

A surgery program with 2 categorical spots has a ratio of 2/6 = 0.33. That is not automatically malignant. But statistically, small programs have:

  • Less redundancy when residents leave
  • Fewer peers per class for cross-coverage
  • Greater exposure to single attending personalities dominating the culture

That is structural risk.
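
Here is the same heuristic as a few lines of code, using the thresholds above. Treat it as a sketch, not a validated cutoff.

```python
def size_fragility(pgy1_spots: int, specialty_median: int) -> tuple[float, str]:
    """Classify structural fragility from the program-size ratio, using the
    thresholds above (>= 0.75 stable, 0.50-0.74 caution, < 0.50 high fragility)."""
    ratio = pgy1_spots / specialty_median
    if ratio >= 0.75:
        flag = "structurally stable"
    elif ratio >= 0.50:
        flag = "caution"
    else:
        flag = "high fragility risk"
    return round(ratio, 2), flag

print(size_fragility(2, 6))  # (0.33, 'high fragility risk') -- the surgery example above
```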

2. Geographic isolation and backup options

FREIDA will give you city, state, and linked institutions. Combine that with simple geography.

Risk pattern I see often:

  • Single hospital town
  • No other residency programs within 60–90 miles
  • The program is “new” or has had rapid expansion

If the institution runs into financial or accreditation trouble, residents have fewer nearby transfer options. When things go wrong, they go very wrong.

The data play here is simple: map your list, identify “orphan” programs in their region, and treat that as a risk multiplier.
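
If you want to automate the “orphan program” check, a rough sketch is below. FREIDA gives you city and state, so you would have to look up coordinates yourself; the 90-mile radius mirrors the heuristic above and is adjustable.

```python
from math import radians, sin, cos, asin, sqrt

def miles_between(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance in miles between two lat/lon points (haversine formula)."""
    lat1, lon1, lat2, lon2 = map(radians, (lat1, lon1, lat2, lon2))
    a = sin((lat2 - lat1) / 2) ** 2 + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2
    return 3956 * 2 * asin(sqrt(a))

def is_orphan(program: dict, other_programs: list[dict], radius_miles: float = 90) -> bool:
    """True if no other residency program sits within the given radius.
    Each program is a dict with 'lat' and 'lon' keys you filled in yourself."""
    return all(
        miles_between(program["lat"], program["lon"], o["lat"], o["lon"]) > radius_miles
        for o in other_programs
    )
```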


Step 5: Use Hours and Call Data to Flag Potentially Malignant Training Environments

Many programs under-report work hours. I know. You know. Everyone knows.

But the reported numbers still contain signal, especially when compared across programs in the same specialty.

Imagine you extract self-reported average weekly hours for 20 internal medicine programs from FREIDA. Hypothetical distribution:

  • Median: 62 hours
  • Interquartile range: 58–66
  • Outliers: 48, 80

Visualizing those as a boxplot would immediately show you what is suspicious.

Hypothetical weekly work hour distribution (internal medicine): minimum 48, Q1 58, median 62, Q3 66, maximum 80.

Interpretation:

  • Programs reporting < 50 hours for IM in a busy tertiary center are usually under-reporting or misrepresenting acuity.
  • Programs reporting ≥ 75–80 hours are either chaotic, non-compliant, or both.

So you do not chase the “lowest hours” blindly. Extremely low and extremely high reported hours are both red flags.
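
Here is a rough sketch of that check on hypothetical data matching the distribution above. The cutoffs of < 50 and ≥ 75 hours reflect the interpretation above, not an official standard.

```python
import statistics

# Hypothetical self-reported weekly hours for 20 IM programs, constructed to match
# the distribution sketched above: median ~62, IQR ~58-66, extremes at 48 and 80
hours = [48, 55, 56, 57, 58, 58, 59, 60, 61, 62, 62, 63, 64, 65, 66, 66, 67, 68, 70, 80]

q1, median, q3 = statistics.quantiles(hours, n=4)
print(f"Q1={q1}, median={median}, Q3={q3}")  # Q1=58.0, median=62.0, Q3=66.0

# Flag both tails: implausibly low and implausibly high reported hours are red flags
suspicious = [h for h in hours if h < 50 or h >= 75]
print(suspicious)  # [48, 80]
```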

Call structure reported in FREIDA also matters:

  • Night float vs 24-hour call
  • Number of in-house call nights per month
  • Presence of home call for certain rotations

Risk heuristic:

  • If every heavy rotation includes 24-hour call, and there is no clear night float system, and hours are already high → high training-risk program
  • If call structure is vague or “varies by rotation” without specifics → uncertainty penalty

I have seen programs that “forget” to mention that PGY-2s do q3 24-hour calls on three different services. That shows up less in the brochure and more in resident word-of-mouth—but when FREIDA data are already extreme, that is your first hint to dig deeper.
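
As a sketch, the call-structure heuristic could be scored like this. The inputs are your own notes from FREIDA, not an official schema, and the 70-hour cutoff is an illustrative stand-in for “hours are already high.”

```python
def call_structure_risk(has_night_float: bool, frequent_24h_call: bool,
                        avg_weekly_hours: float, description_is_vague: bool) -> int:
    """Score call-structure risk 0-2 using the heuristics above."""
    if description_is_vague:
        return 1  # uncertainty penalty: "varies by rotation" with no specifics
    if frequent_24h_call and not has_night_float and avg_weekly_hours >= 70:
        return 2  # heavy 24h call, no clear night float, hours already high
    if frequent_24h_call and not has_night_float:
        return 1
    return 0
```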


Step 6: Outcomes: Board Pass Rates, Fellowship Placement, and Attrition

This is where specialty reports really start earning their keep.

Board pass rates

Most specialty boards publish national first-time pass rates. For example (approximate numbers, will vary by year):

  • IM ABIM pass rate: ~88–92%
  • General Surgery ABS QE: ~85–90%
  • Pediatrics ABP: ~85–90%

The program’s board pass rate relative to these matters more than any glossy “Our residents are well-prepared” line.

Heuristic:

  • Within 5 percentage points of national average: fine
  • 6–10 points below: yellow flag
  • More than 10 points below or “data not reported”: red flag

You do not always get exact pass rates from FREIDA. Sometimes you find them on the program’s own site. Sometimes specialty societies aggregate data by program. It is worth the extra search.

If a program whose specialty has a 90% national pass rate is sitting at 70% over multiple years, that is not bad luck. That is a training problem.
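
That heuristic is trivial to encode once you have the numbers. In this sketch, the national rate is whatever the relevant board publishes for the years you are checking.

```python
from typing import Optional

def board_pass_flag(program_rate: Optional[float], national_rate: float) -> str:
    """Apply the heuristic above: compare a program's first-time pass rate
    to the national average (both as percentages)."""
    if program_rate is None:
        return "red flag (data not reported)"
    gap = national_rate - program_rate
    if gap <= 5:
        return "fine"
    if gap <= 10:
        return "yellow flag"
    return "red flag"

print(board_pass_flag(70, 90))    # red flag -- the multi-year 70% example above
print(board_pass_flag(None, 90))  # red flag (data not reported)
```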

Fellowship and career outcomes

Many programs list fellowship match outcomes in promotional materials. Treat this like any other dataset.

  • Look for: number of residents per class vs number going into competitive fellowships, primary care, hospitalist roles, etc.
  • Look at: where they match, not just into what. “Top 10” vs unranked community fellowships is a meaningful difference for some specialties.

The red flags here:

  • No outcomes listed at all (“Our graduates go on to a variety of careers”)
  • Only listing 1–2 cherry-picked star matches, with no comprehensive list
  • Outcomes page clearly out of date (last updated 5–7 years ago)

If a program has a solid track record, they almost always quantify it. Vague language in place of data is itself a negative signal.


Step 7: Cross-Reference with NRMP Program Director Survey

This is a massively underused document.

NRMP’s Program Director Survey details what PDs say they value when deciding whom to invite and rank. But it also reveals program behavior and priorities by specialty.

You can use this to:

  • Identify specialties where programs heavily weight board pass rates and in-service performance
  • Understand how many programs in a specialty consider reputation of medical school vs personal fit vs exam scores
  • Infer which programs are likely to be hypersensitive to attrition and performance issues

How does that help your risk assessment?

Programs under major accreditation pressure (e.g., because of board pass rates) often respond by:

  • Tightening evaluation and remediation
  • Pushing residents harder to hit numbers
  • Becoming less tolerant of “median” performance

So if a program:

  • Is in a specialty where PDs overwhelmingly rank “exam scores” and “board pass rates” as top factors
  • Has below-average board performance historically
  • Is small and geographically isolated

You have stacked systemic risk factors for resident burnout and punitive culture. That comes directly from aligning specialty-level PD behavior with program-specific outcomes.


Step 8: Build a Basic Quantitative Filter for Your Rank List

Let’s make this practical.

You have 30 programs on your list. You want to avoid obvious red flags without throwing away opportunities.

Set up a spreadsheet with columns:

  • Program name
  • Specialty
  • City / region
  • PGY-1 spots
  • Specialty median PGY-1 spots (from NRMP)
  • Program size ratio (computed)
  • Reported hours
  • Call structure notes
  • Board pass rate vs national
  • Fellowship outcomes transparency (0/1/2 score)
  • Geographic isolation (0/1/2)
  • Total risk score (sum of category scores)

Then do three simple things:

  1. Sort by total risk score – anything ≥ 7 warrants serious reconsideration or at least aggressive questioning on interview day.
  2. Highlight outliers – programs where any single indicator is extremely abnormal (size, board pass, hours).
  3. Cross-check with subjective data – resident interviews, Reddit/SDN experiences, word-of-mouth.

Here is what a simple classified list might look like for internal medicine, illustrative only:

Example risk scores across 10 hypothetical IM programs:

  • Program A: 2
  • Program B: 3
  • Program C: 4
  • Program D: 5
  • Program E: 6
  • Program F: 7
  • Program G: 3
  • Program H: 8
  • Program I: 1
  • Program J: 5

If you see yourself ranking Program H (score 8) above Program A (score 2) without a very good reason, you are taking on avoidable risk.
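
Sorting and flagging those scores takes only a few lines. The numbers below are the illustrative scores from the example above, and the 0–3 / 4–6 / 7+ buckets come from Step 3.

```python
# Illustrative risk scores from the example list above
risk_scores = {
    "Program A": 2, "Program B": 3, "Program C": 4, "Program D": 5, "Program E": 6,
    "Program F": 7, "Program G": 3, "Program H": 8, "Program I": 1, "Program J": 5,
}

def classify(score: int) -> str:
    """Bucket a total risk score using the thresholds from Step 3."""
    return "Low" if score <= 3 else "Moderate" if score <= 6 else "High"

# Sort lowest risk first
for name, score in sorted(risk_scores.items(), key=lambda kv: kv[1]):
    print(f"{name}: {score} ({classify(score)} risk)")

# Anything at 7+ warrants serious reconsideration or aggressive questioning
flagged = [name for name, score in risk_scores.items() if score >= 7]
print("Reconsider or question aggressively:", flagged)  # ['Program F', 'Program H']
```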


Step 9: Pay Attention to “Missing Data” as a Red Flag

Analysts get nervous not only when numbers look bad, but when numbers are missing.

On FREIDA and program websites, I get concerned when:

  • Board pass rates are “coming soon” or “not available”
  • Case volume data are absent in procedure-heavy specialties (surgery, ortho, ENT, IR, anesthesia)
  • Faculty-to-resident ratios are not mentioned anywhere
  • There is no clear statement about resident scholarly activity expectations or support

In data work, missing data is rarely random. It clusters where people have something to hide or have failed to measure what matters.

So in your scoring framework, assign a penalty for missing key metrics:

  • No board pass rate data: +2 risk points
  • No case volume or exposure description in a procedural specialty: +2 points
  • No fellowship or job placement outcomes: +1–2 points, depending on specialty competitiveness

You do not assume the worst, but you stop assuming everything is fine.
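
A sketch of that penalty logic, using my own shorthand for the spreadsheet fields:

```python
def missing_data_penalty(program: dict, procedural_specialty: bool) -> int:
    """Add risk points for absent metrics, per the penalties listed above.
    Keys are illustrative shorthand for what you record in your own spreadsheet."""
    penalty = 0
    if program.get("board_pass_rate") is None:
        penalty += 2
    if procedural_specialty and program.get("case_volume_info") is None:
        penalty += 2
    if program.get("placement_outcomes") is None:
        penalty += 1  # bump to 2 in highly competitive specialties
    return penalty

print(missing_data_penalty(
    {"board_pass_rate": None, "case_volume_info": None, "placement_outcomes": None},
    procedural_specialty=True,
))  # 5
```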


Step 10: The Future: Better Data, More Transparency… Slowly

The future of residency program evaluation will be more quantitative, not less. You are already seeing:

  • ACGME publicly tracking duty hour violations and citations (though still not as transparent as it could be)
  • Specialty boards releasing more program-level performance data
  • Applicants scraping and aggregating FREIDA and website data into private spreadsheets and community resources
  • Hospitals under pressure to quantify burnout, attrition, and workplace violence incidents

We are not at the point of a “Residency Yelp” with full longitudinal outcomes. But the direction is clear: programs with serious red flags will have a harder time hiding behind nice websites.

If you are smart about it, you can front-run that trend. Use FREIDA and specialty reports now as your early warning system:

  • Track programs that are expanding rapidly without clear increases in faculty or resources
  • Flag programs in health systems with highly publicized financial or labor issues
  • Watch for specialties where board pass rates are dropping nationally and see which programs are outliers on the downside

You are not just choosing where to spend three to seven years. You are choosing the statistical environment that will shape your skills, mental health, and career trajectory.


Final Takeaways

Three points, because that is all you will remember:

  1. Treat FREIDA and specialty reports as data, not brochures. Benchmarks and ratios (size, hours, pass rates) expose real program risk that “vibes” will miss.
  2. Build a simple, numeric risk score for each program. Structural, training, and outcome factors together will clearly separate low-risk from high-risk options.
  3. Missing or vague data is itself a red flag. When good programs have good numbers, they show them. When numbers disappear, assume risk until proven otherwise.