Residency Advisor

Intern Well-Being Surveys: Which Interventions Show Measurable Benefit?

January 6, 2026
13-minute read

[Image: Hospital residents walking through a hallway between shifts]

The residency wellness movement is flooded with feel‑good language and almost no hard outcomes. The uncomfortable truth: a large fraction of “wellness initiatives” show minimal or no measurable benefit when you actually run the numbers.

You asked the right question: which interventions actually move the needle on intern well‑being scores?

I am going to focus bluntly on what the data shows: effect sizes, response rates, and which interventions survive contact with a real call schedule.


What Intern Well-Being Surveys Actually Measure

Before talking interventions, you have to know what you are optimizing.

Common instruments used in intern and resident studies:

  • Maslach Burnout Inventory (MBI): emotional exhaustion, depersonalization, personal accomplishment.
  • PHQ‑9 / GAD‑7: depression and anxiety severity.
  • Perceived Stress Scale (PSS).
  • PROMIS well‑being or quality of life scales.
  • Simple 0–10 Likert items: “Overall, how burned out are you?”

Most intern-focused studies define “benefit” as:

  • A statistically significant improvement in a validated scale compared with:
    • The intern’s own baseline (pre vs post), and
    • A control group of interns not receiving the intervention (when available).

You care about both:

  1. Effect size: how big the change is (Cohen’s d, raw point change).
  2. Durability: how long it lasts (4 weeks vs 6–12 months).

Let me anchor this with some simple ranges pulled from multiple resident well‑being trials and systematic reviews:

Typical Effect Sizes of Resident Wellness Interventions (Cohen’s d)

  Category               Effect Size (d)
  Schedule/Hours         0.40
  Sleep/Fatigue          0.35
  Coaching/Mentoring     0.40
  Mindfulness Programs   0.25
  Peer Support Groups    0.20
  Online Modules/Apps    0.10

Interventions with Cohen’s d ≥ 0.4 produce changes that are noticeable at the individual level. At d ≤ 0.2, you are in the “might be real, unlikely to be felt” zone.
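For concreteness, Cohen’s d is just the difference in group means divided by a pooled standard deviation. A minimal sketch of the calculation, using made-up MBI emotional-exhaustion scores (illustrative numbers only, not from any study):

```python
from statistics import mean, variance

def cohens_d(group_a, group_b):
    """Pooled-SD Cohen's d: positive when group_a's mean is higher."""
    na, nb = len(group_a), len(group_b)
    # Pool each group's sample variance, weighted by degrees of freedom.
    pooled_var = ((na - 1) * variance(group_a) +
                  (nb - 1) * variance(group_b)) / (na + nb - 2)
    return (mean(group_a) - mean(group_b)) / pooled_var ** 0.5

# Illustrative MBI emotional-exhaustion scores (0-54 scale), NOT real data:
pre = [30, 28, 35, 32, 29, 31, 34, 27]   # baseline
post = [29, 27, 34, 31, 28, 30, 33, 26]  # after intervention

d = cohens_d(pre, post)
print(f"d = {d:.2f}")  # a ~0.35 effect: likely real, barely felt individually
```

Note that a uniform 1-point drop against a ~2.8-point spread lands in the 0.3–0.4 range, which is exactly why small average changes are hard to “feel” even when they are genuine.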


The Heavy Hitters: System-Level Changes

If you want real signal in your survey data, you start with structural changes. Not mindfulness. Not pizza.

1. Duty Hours and Schedule Design

Blunt but true: the strongest improvements in intern well‑being almost always appear when you change the number of hours worked or how those hours are arranged.

Key data patterns from multi-center studies and program‑level analyses:

  • Shorter shifts (e.g., eliminating 28‑hour calls):
    • Reductions in emotional exhaustion by 3–6 MBI points within 3–6 months.
    • Effect size typically d ≈ 0.4–0.6 compared with baseline.
    • Sleep duration increases 0.5–1.5 hours per 24‑hour period on average.
  • More “golden weekends” or protected days off:
    • Consistent improvement in overall well‑being scores (0.3–0.6 SD).
    • Reduction in reported “severe burnout” prevalence by ~10–20 percentage points.

Schedule Changes and Reported Impact

  Intervention Type          Hours/Pattern Change       Effect Size (Burnout)   Notes
  Shortened call shifts      28h → 16–24h               0.4–0.6                 Strongest effect on exhaustion
  More guaranteed days off   1 in 7 → 2 in 7            0.3–0.5                 Improves life satisfaction
  Night float vs 24h call    Call → night float system  0.3–0.4                 Helps sleep regularity
  Capped weekly hours        80 → enforced ≤ 70         0.3–0.4                 Blunted if work is compressed

What shows up repeatedly when you look at MBI and PHQ‑9 data:

  • Interns on more humane schedules report lower emotional exhaustion and better sleep within 1–2 months.
  • The benefit is largest early in the year (July–October) when the learning curve and cognitive overload are highest.

The catch: program leadership sometimes implements schedule changes that look good on paper but compress the same workload into fewer hours. That drives:

  • More work‑intensity complaints.
  • Mixed movement in well‑being scores: less fatigue, more “moral injury” and “unsafe” feelings.

On surveys, that shows up as:

  • Emotional exhaustion improves modestly.
  • Depersonalization and “feeling unable to provide good care” get worse.

So schedule changes help, but only if the actual workload is also rationalized.


2. Sleep and Fatigue Management

Separate from raw hours, sleep-focused interventions have good data. Especially for interns.

Common strategies, and the sleep gains they deliver:

Average Sleep Gain from Fatigue Interventions (Hours per 24h)

  Baseline (pre)                     5.5
  Post-Sleep Education               6.0
  Post-Nap Protocol + Enforcement    6.5

What shows up in the numbers:

  • Interventions that actually change behavior (nap protocols with coverage, dedicated sleep rooms, formal expectation to nap) add ~0.5–1 hour sleep per 24 hours.
  • That translates to:
    • PSS (Perceived Stress Scale) reductions of ~2–3 points.
    • PHQ‑9 drops of ~1–2 points.
    • Small to moderate reductions in self‑reported errors.

Effect sizes: d ≈ 0.3–0.5 on fatigue/stress scales, somewhat smaller on burnout scales but still real.

Interns usually report:

  • Less constant exhaustion.
  • Marginal but noticeable improvement in “overall well‑being” scores.

Programs that just “educate” about fatigue without structural support (no call coverage, no enforcement) see almost no measurable change. The surveys capture that gap: knowledge up, sleep unchanged, burnout unchanged.


Individual-Level Interventions: What Actually Moves Scores

Now the popular stuff: mindfulness, resilience workshops, coaching, peer groups. This is where a lot of wellness budgets go. Let’s be honest about impact.

3. Coaching and Mentoring Programs

Among the “soft” interventions, coaching and structured mentoring rank near the top by effect size.

Typical designs in the literature:

  • 4–6 individual sessions with a trained coach over several months, or
  • Structured mentoring program with assigned faculty, regular check‑ins, and explicit support goals.

Average findings:

  • Emotional exhaustion: 3–5 point reduction on MBI.
  • Overall burnout prevalence: ~10–15 percentage point drop (e.g., 55% → 40–45%).
  • Satisfaction with training increases by 0.3–0.5 SD.

Effect size: often d ≈ 0.3–0.5 for emotional exhaustion and overall well‑being.

The reasons are obvious when you look at free‑text survey comments:

  • “Someone in power actually listened and helped me reprioritize.”
  • “I learned it was okay to say ‘no’ sometimes.”
  • “We changed my schedule slightly after I brought up issues.”

Where coaching/mentoring fails:

  • Programs that assign a “mentor” on paper but do not create time or accountability.
  • Interns who never meet their mentor or meet once in July and never again.

Surveys from those programs look flat. No detectable benefit. The intervention exists administratively but not in reality.


4. Mindfulness and Resilience Training

These are ubiquitous. Every hospital wants its “mindfulness curriculum.” The data is more nuanced than the marketing.

Patterns from multiple resident-focused RCTs and pre/post studies:

  • Short courses (e.g., 4–8 week mindfulness programs, brief workshops):
    • Small to moderate improvements in stress, mindfulness, and sometimes emotional exhaustion.
    • Effect sizes typically d ≈ 0.2–0.3.
  • Benefits are strongest for:
    • Interns who actually attend most sessions.
    • Those starting with high baseline stress scores.

But there are recurring problems:

  • Attendance: often only 40–70% show up consistently.
  • Durability: gains tend to shrink by 3–6 months after the program unless there is ongoing practice or booster sessions.

A typical pattern:

  • PHQ‑9: decrease 1–2 points post‑intervention compared with control.
  • MBI emotional exhaustion: 2–3 point drop.
  • Overall well‑being: 0.2–0.3 SD improvement.

Not nothing. But modest. And these effects are frequently smaller than what you see from schedule changes or coaching.

Where mindfulness does look stronger is when it is:

  • Voluntary rather than mandatory.
  • Integrated into the workday (short daily practices embedded in rounds or sign‑out).
  • Paired with institutional changes, not used as a Band‑Aid.

On surveys, residents can smell the difference between “here’s a breathing exercise” and “we fixed your call schedule and here’s a breathing exercise.”


5. Peer Support and Reflective Groups

Balint‑style groups, facilitated debriefs, peer support sessions. The data is mixed but not useless.

Typical outcomes:

  • Qualitative feedback is very positive: interns feel more connected, less isolated.
  • Quantitative outcomes:
    • Small improvements in depersonalization.
    • Modest gains in “sense of belonging” and “support from colleagues” scores.
    • Often no major change in overall burnout levels unless paired with other interventions.

Effect sizes:

  • Usually d ≈ 0.2–0.3 on connectedness and depersonalization.
  • Burnout overall: often non‑significant or small.

Here is what I have seen in real program survey data:

  • Interns who attend these groups consistently report fewer thoughts of quitting.
  • Free‑text: “I’m still exhausted, but I do not feel alone in it anymore.”

That does not “fix burnout scores” dramatically, but it genuinely matters for retention and mental health risk.


6. Apps, Online Modules, and “Self-Directed” Tools

This is where interventions go to die statistically.

Wellness apps, self‑paced “resilience modules,” email courses. When you deploy these program‑wide and then survey interns pre/post, most datasets show:

  • Low engagement: often fewer than 30–40% complete meaningful portions.
  • Marginal or no significant improvement in MBI, PHQ‑9, or stress scores at the group level.
  • A small subgroup of high‑engagement users sometimes shows benefit, but they are usually already motivated and self‑selected.

Net effect size: often d ≈ 0.1 or less on most scales. Sometimes nothing.

This matches what interns tell us in comment sections:

  • “I clicked through the module during a slow night.”
  • “I do not have time for another app.”
  • “Fix the schedule instead.”

From a data perspective: these are low‑yield if deployed as the primary wellness intervention. They might be a useful add‑on resource, but do not expect major shifts in your annual ACGME well‑being numbers.


Combined Approaches: Synergy vs Noise

The highest‑impact programs usually do not rely on a single intervention. They combine:

  • Structural changes (hours, workflow).
  • Individual support (coaching or mentoring).
  • Optional skills training (mindfulness, stress management).
  • Culture work (leadership modeling, psychological safety).

You see a pattern in the numbers from these multi-component programs:

  • Burnout prevalence reductions of 15–25 percentage points across 1–2 years.
  • Moderate effect sizes (d ≈ 0.4–0.6) in emotional exhaustion and well‑being.
  • Better retention and lower rates of “seriously considered leaving residency” on annual surveys.

Change in Intern Burnout with Different Intervention Types (% of interns reporting burnout)

  Baseline     60
  6 Months     48
  12 Months    40

Interpretation: imagine a program where 60% of interns report burnout at baseline. After 6 months of serious combined interventions, that drops to ~48%. By 12 months, with sustained efforts, it stabilizes around 40%. That is not utopia—but it is a real shift you can see on your ACGME or internal well‑being surveys.

The failures are instructive:

  • Programs that throw three low‑impact interventions together (an app, a mindfulness lecture, and a “wellness newsletter”) and then act surprised when the burnout rate is unchanged at 65%.
  • Programs that focus only on “resilience” and ignore workload. The survey data usually shows stable burnout with maybe a slight improvement in “I feel supported” scores. Interns see through it.

Designing Intern Well-Being Surveys That Show Real Signal

If you are an intern answering these surveys, or a chief/PD reading them, you should know how to distinguish noise from signal.

Use Validated Instruments, Not Only Home-Brew Items

Programs that track MBI (even just the emotional exhaustion subscale), PHQ‑9, or PSS can:

  • Quantify effect sizes.
  • Compare across years and against published norms.
  • Detect small but real changes.

Quick anonymous “pulse checks” with 1–3 Likert items can be useful for rapid feedback, but they are poor at detecting modest improvements over time.
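One reason multi‑item validated scales detect modest changes better than single pulse items: averaging several readings of the same underlying state shrinks measurement noise. A toy simulation of that idea (purely illustrative; real scale items are correlated, which this simplification ignores):

```python
import random
from statistics import mean, stdev

random.seed(42)

def observed_score(true_score, n_items):
    """Average of n_items noisy readings of the same underlying well-being level."""
    return mean(true_score + random.gauss(0, 6) for _ in range(n_items))

# Same simulated cohort measured with a 1-item pulse check vs a 9-item scale.
true_scores = [random.gauss(30, 4) for _ in range(200)]
one_item = [observed_score(t, 1) for t in true_scores]
nine_item = [observed_score(t, 9) for t in true_scores]

# Measurement error left over after subtracting each intern's true level:
err_1 = stdev(o - t for o, t in zip(one_item, true_scores))
err_9 = stdev(o - t for o, t in zip(nine_item, true_scores))
print(f"1-item noise SD: {err_1:.1f}, 9-item noise SD: {err_9:.1f}")
```

With less noise per respondent, the same real change shows up as a larger, more detectable signal in year-over-year comparisons.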


Pair Quantitative Scores with Specific Exposure Data

A recurring analytic problem: programs measure burnout but never systematically record which interns actually received the intervention.

If you want to see which interventions work, you should be tracking:

  • Who attended coaching/mentoring sessions.
  • Who participated in mindfulness courses.
  • Who worked under old vs new schedule structures.

Then you can run simple comparisons:

  • Mean MBI emotional exhaustion among:
    • Interns with ≥4 coaching sessions vs 0–1 sessions.
    • Interns on shorter calls vs traditional 28‑hour calls.

The data usually shows dose‑response relationships. You rarely see that if you aggregate everyone together and call it a “program‑wide” intervention without exposure data.
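The comparison above requires nothing fancy once exposure is recorded. A sketch of the group split, with hypothetical field names (`coaching_sessions`, `mbi_ee`) and made-up scores:

```python
from statistics import mean

# Hypothetical per-intern records linking exposure to outcome.
# Field names and scores are illustrative, not from a real dataset.
interns = [
    {"coaching_sessions": 5, "mbi_ee": 24},
    {"coaching_sessions": 6, "mbi_ee": 27},
    {"coaching_sessions": 4, "mbi_ee": 26},
    {"coaching_sessions": 0, "mbi_ee": 30},
    {"coaching_sessions": 1, "mbi_ee": 28},
    {"coaching_sessions": 0, "mbi_ee": 31},
]

high_dose = [i["mbi_ee"] for i in interns if i["coaching_sessions"] >= 4]
low_dose = [i["mbi_ee"] for i in interns if i["coaching_sessions"] <= 1]

gap = mean(low_dose) - mean(high_dose)
print(f"coached mean EE {mean(high_dose):.1f} vs "
      f"uncoached {mean(low_dose):.1f} (gap {gap:.1f} points)")
```

If the coached group’s mean emotional exhaustion sits a few points below the uncoached group’s, you have the beginnings of a dose‑response signal; without the exposure column, that signal is invisible.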


Look Beyond p-Values: Focus on Magnitude and Durability

A 0.5‑point change on a 0–54 emotional exhaustion scale might be “statistically significant” with large N. It is also clinically meaningless.

For intern well‑being:

  • Aim for:
    • ≥3‑point reduction in MBI emotional exhaustion.
    • ≥2‑point reduction in PHQ‑9 for depressive symptoms.
    • Clear drop (≥10 percentage points) in “I feel burned out from my work” high‑score category.

Also look at:

  • Whether gains persist at 6–12 months or vanish after the intervention ends.
  • Whether benefits show up across most interns, or only in a small subgroup.

What Actually Helps You As an Intern

Let me be direct: as an intern on 70–80 hours a week, you will not “mindfulness” your way out of a fundamentally broken schedule. The data does not support that fantasy.

Based on the numbers and the pattern across programs:

  1. Structural fixes give the largest improvements in well‑being scores.

    • Advocating for sane scheduling (shorter call, more predictable days off) has more impact than almost anything else.
    • If your program is piloting schedule changes, fill out those surveys honestly. That data is your leverage.
  2. Coaching and real mentorship are the most efficient individual-level interventions.

    • If your program offers genuine one‑on‑one coaching or consistent mentoring, use it. The effect sizes on burnout are in the same ballpark as many schedule tweaks.
    • Push for these to be protected time, not done on your post‑call 28th hour.
  3. Mindfulness and peer support help, but mostly as adjuncts.

    • They can reduce stress and loneliness, which matters.
    • Do not let leadership pretend they are adequate substitutes for safe staffing and reasonable expectations.
  4. Ignore the noise.

    • Wellness apps, generic online modules, posters about resilience—on their own—produce almost no detectable improvement in intern well‑being surveys.
    • These are fine as optional extras, not core solutions.

Quick Summary: The Data’s Verdict

  • System-level changes—hours, schedules, sleep protection—produce the largest, most consistent improvements in intern well‑being scores (effect sizes around 0.3–0.6).
  • Coaching and structured mentoring regularly show moderate reductions in burnout and improved satisfaction, outperforming most standalone mindfulness programs.
  • Mindfulness, peer support, and online tools can add marginal benefit, but without structural change they rarely shift overall burnout rates in a meaningful way.

That’s what the numbers say. Ignore the branding and the slogans; follow the effect sizes.
