
Do Shorter Duty Hours Reduce Errors? A Look at the Landmark Trials

January 6, 2026
15 minute read

The simplistic mantra “shorter duty hours mean fewer errors” is wrong. The best data we have from landmark trials shows a tradeoff: fewer hours on paper can reduce fatigue, but handoffs, fragmentation, and decreased experience can erase or even reverse the benefit.

You do not need more opinions about work hours. You need the numbers. Let’s walk through what the major trials actually found and what that means for you on the ground.


How We Got Here: From 120+ Hours to 16-Hour Caps

Before the early 2000s, 100–120 hour weeks were common in many programs. Residents essentially lived in the hospital. The 1984 Libby Zion case brought public scrutiny, and a series of regulatory responses followed.

Key policy shifts:

  • 2003 ACGME rules: 80-hour workweek, 24+6 hour shifts, 10 hours off between shifts.
  • 2011 ACGME update: PGY-1 capped at 16 consecutive hours; upper levels remained at 24+4.
  • 2017 ACGME revision: 16-hour cap removed; interns allowed up to 24+4 again, within an 80‑hour week.

Those policy changes outpaced the evidence. So researchers tried to catch up with randomized and quasi-experimental trials. Three big names you will bump into repeatedly:

  • FIRST trial (general surgery)
  • iCOMPARE trial (internal medicine, U.S.)
  • ROSTERS trial (pediatric ICU)

Each asked a related but distinct question. FIRST and iCOMPARE asked: if we relax shift-length rules (allowing longer shifts) but keep the 80-hour cap, do error rates, outcomes, or resident experiences change? ROSTERS asked the reverse: what happens when 24-hour shifts are eliminated?


FIRST Trial: Surgery Programs Test Flexible Hours

FIRST (Flexibility In duty hour Requirements for Surgical Trainees) is the flagship trial that pushed the ACGME back toward flexibility.

Design basics:

  • Specialty: General surgery (PGY‑1 to PGY‑5).
  • Country: United States.
  • Type: Cluster-randomized trial at the program level.
  • Enrollment: 117 programs, ~4,330 residents, ~138,000 patients.
  • Comparison:
    • Standard group: Strict 2011 ACGME rules (including 16‑hour cap for interns).
    • Flexible group: Same 80-hour weekly cap, but no rules on maximum shift length, time off between shifts, or night float structure. Programs could let interns and residents stay longer when necessary to complete care or cases.

The key metric: Did patient outcomes worsen when programs got more flexibility?

Patient Outcomes in FIRST

The primary outcome was a composite of 30-day mortality or any serious complication. So the data is not about “did residents feel more tired,” it is about whether patients died more or had more big complications when residents could stay longer.

The data:

  • 30-day mortality/serious complications:
    • Standard: 9.1%
    • Flexible: 9.0%
  • Adjusted odds ratio (AOR) ≈ 0.96–0.99 (depending on model); the 95% CI included 1.0.

Translation: No statistically significant difference. The point estimate slightly favored flexible hours, but the confidence intervals say “no meaningful change.”
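
To make the "no meaningful change" reading concrete, here is a minimal sketch that recomputes an unadjusted odds ratio and a risk difference from the two percentages quoted above. The per-arm patient count (an even split of the roughly 138,000 patients) is an assumption made for illustration, and the naive interval ignores the program-level clustering that a cluster-randomized analysis has to account for. Treat it as arithmetic, not as a re-analysis of the trial.

```python
import math

# Composite outcome rates as quoted in this article.
STANDARD_RATE = 0.091
FLEXIBLE_RATE = 0.090

# Assumption: an even split of ~138,000 patients. Not the trial's actual arm sizes.
N_PER_ARM = 69_000


def odds(p: float) -> float:
    """Convert a probability into odds."""
    return p / (1.0 - p)


def unadjusted_odds_ratio(p_flexible: float, p_standard: float) -> float:
    """Odds of the outcome under flexible rules relative to standard rules."""
    return odds(p_flexible) / odds(p_standard)


def naive_risk_difference_ci(p_a, p_b, n_a, n_b, z=1.96):
    """Wald 95% interval for p_a - p_b. Ignores clustering by program."""
    diff = p_a - p_b
    se = math.sqrt(p_a * (1 - p_a) / n_a + p_b * (1 - p_b) / n_b)
    return diff - z * se, diff + z * se


or_flex = unadjusted_odds_ratio(FLEXIBLE_RATE, STANDARD_RATE)
ci_low, ci_high = naive_risk_difference_ci(
    FLEXIBLE_RATE, STANDARD_RATE, N_PER_ARM, N_PER_ARM
)

print(f"Unadjusted odds ratio (flexible vs standard): {or_flex:.3f}")
print(f"Risk difference, naive 95% CI: {ci_low * 100:+.2f} to {ci_high * 100:+.2f} percentage points")
```

With these inputs the odds ratio lands just under 1 and the interval for the difference spans zero, which is the pattern the text describes.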

Looking at individual outcomes:

  • 30-day mortality alone: no significant difference.
  • Serious complications (e.g., sepsis, bleeding, organ space infections): no meaningful differences.
  • Readmissions and reoperations: again, no clear difference.

In other words, allowing residents to work longer continuous shifts under an 80‑hour cap did not measurably harm surgical patients at a population level.

Resident Experience in FIRST

This is where the story shifts.

Flexible programs:

  • Reported:
    • More continuity of care.
    • Less frequent handoffs.
    • Greater ability to stay for a case from start to finish.
  • But also:
    • Higher rates of self-reported “fatigue during tasks.”
    • Greater frequency of working beyond the 80‑hour target (at least by self-report, though averages remained near 80).

When residents were asked directly:

  • Many in flexible programs said they liked completing cases and not handing off mid-operation.
  • Many also acknowledged being more tired and occasionally exceeding duty limits.

The data is clear: with flexibility, clinical outcomes remained stable, continuity improved, and fatigue increased. Whether you see that as good or bad depends on your tolerance for risk from fatigue versus risk from fragmentation.


iCOMPARE Trial: Internal Medicine and Night Float vs Long Shifts

[Figure: bar chart. Resident-Reported Error Rates Under Standard vs Flexible Duty Hours (iCOMPARE, illustrative). Plotted values: Self-Reported Major Errors 1.1, Self-Reported Serious Near Misses 1.05, Self-Reported Minor Errors 1.]

If FIRST was surgery’s test case, iCOMPARE was internal medicine’s. The question was similar: under an 80‑hour cap, do more flexible duty hours harm patients or residents?

Design:

  • Specialty: Internal medicine (PGY‑1 predominantly, but program-wide environment).
  • Country: United States.
  • Type: Cluster-randomized trial.
  • Programs: 63 internal medicine residencies.
  • Arms:
    • Standard: Full adherence to 2011 duty hour rules (including 16‑hour max for interns).
    • Flexible: 80‑hour weekly cap preserved, but interns and residents allowed longer continuous shifts, up to 28 hours in some schedules, similar to pre‑2011 rules.

Patient Outcomes in iCOMPARE

Primary outcome: 30-day mortality for Medicare beneficiaries under the care of these programs.

Headline result: No difference.

  • 30-day mortality (risk-adjusted):
    • Standard vs flexible: Statistically non-significant difference.
    • The absolute difference was near zero; adjusted mortality rates differed by less than 0.1 percentage point, with a confidence interval that comfortably included no difference.

Secondary outcomes:

  • 7-day and 30-day readmission: No significant differences.
  • In-hospital complications: No consistent signal favoring either side.

Again, the naive narrative “longer shifts equal more patient deaths” simply does not show up in the data at the system level.

Resident Sleep, Alertness, and Errors: Where It Gets Uncomfortable

iCOMPARE dug deeper into resident-level outcomes: sleep duration, psychomotor vigilance, and self-reported errors.

Some numbers from substudies:

  • Sleep:

    • On average, interns in flexible programs slept less on extended shifts (unsurprising), but their weekly total sleep time was only modestly lower because they made up some of it with recovery sleep.
    • Think roughly: 1–2 fewer hours of sleep on call days, partially compensated elsewhere. The net weekly difference was small but not zero.
  • Psychomotor vigilance:

    • Reaction-time testing showed more lapses at the end of long shifts in flexible programs.
    • Performance impairment was measurable and in the range seen in sleep deprivation literature (you could reasonably compare late-shift performance to someone with a blood alcohol of 0.05–0.08 in terms of reaction-time slowing).
  • Self-reported errors:

    • Interns in flexible programs reported more medical errors and more preventable adverse events.
    • Quantitatively, adjusted risk ratios for self-reported “major medical errors” were roughly 1.1–1.2 in some analyses.
    • However, hospital-level measured outcomes (mortality, readmissions) did not reflect a corresponding rise.

That mismatch matters. It suggests one of two things:

  1. Residents over-perceive their errors under fatigue (feel worse, judge themselves more harshly); or
  2. Many fatigue-associated errors are real but either get caught (by seniors, nurses, pharmacists) or do not lead to measurable mortality/readmission changes at scale.

I lean toward a combination of both. I have seen interns emotionally crushed by a near-miss at 4 a.m. that never touched a mortality statistic but absolutely reflected impaired judgment from exhaustion.

The data shows: longer shifts impair individual performance and increase self-reported errors, but compensatory systems (attendings, redundancies, EMR checks) may buffer the impact on raw outcomes like death and readmission.
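
One way to see how the second explanation could hold is a back-of-the-envelope layered-defense model, in which an error reaches the patient only if every downstream check misses it. The sketch below is purely illustrative: the baseline error rate, the 1.2 relative increase (the upper end of the range quoted above), and the catch probability for each layer are hypothetical numbers, not trial data, and the layers are assumed to act independently.

```python
from math import prod


def errors_reaching_patient(errors_per_1000_orders: float, catch_probabilities) -> float:
    """Errors per 1,000 orders that slip past every independent check."""
    miss_all = prod(1.0 - c for c in catch_probabilities)
    return errors_per_1000_orders * miss_all


# All values below are hypothetical, chosen only to illustrate the mechanism.
BASELINE_ERRORS = 10.0                 # raw errors per 1,000 orders, rested intern
FATIGUE_RELATIVE_RISK = 1.2            # relative increase in raw errors when fatigued
CATCH_PROBABILITIES = [0.6, 0.5, 0.5]  # senior resident, pharmacist, nurse

rested = errors_reaching_patient(BASELINE_ERRORS, CATCH_PROBABILITIES)
fatigued = errors_reaching_patient(
    BASELINE_ERRORS * FATIGUE_RELATIVE_RISK, CATCH_PROBABILITIES
)

print(f"Raw errors per 1,000 orders: {BASELINE_ERRORS:.1f} vs {BASELINE_ERRORS * FATIGUE_RELATIVE_RISK:.1f}")
print(f"Reaching the patient per 1,000 orders: {rested:.2f} vs {fatigued:.2f}")
```

Under these assumptions the relative increase stays at 1.2, but the absolute gap shrinks from 2 raw errors to 0.2 errors reaching the patient per 1,000 orders. A gap that small could plausibly be invisible in an endpoint like 30-day mortality.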


ROSTERS Trial: Pediatric ICU and Eliminating 24-Hour Shifts

Now let us look at a trial that moved in the opposite direction: instead of allowing longer shifts, it tried to kill 24‑hour calls.

ROSTERS (Randomized Order Safety Trial Evaluating Resident Schedules) examined pediatric ICU schedules at a single academic center.

Design:

  • Specialty: Pediatric critical care (residents and fellows covering a PICU).
  • Setting: Single large quaternary care children’s hospital.
  • Design: Crossover trial of two scheduling models:
    • Traditional: 24‑hour in-house shifts (often extending to 28 hours with handoffs).
    • Intervention: Night-float system with shorter shifts and no 24‑hour calls.

Key outcomes:

  • Serious medical errors (per 1,000 patient-days), identified via chart review and trigger tools.
  • Resident sleep and work hours (measured via actigraphy and logs).
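
For readers unfamiliar with the unit, a rate "per 1,000 patient-days" normalizes an error count by how much patient time was observed, so that busy and quiet periods can be compared. A minimal sketch follows; the counts in the example are made up and are not taken from ROSTERS.

```python
def rate_per_1000_patient_days(error_count: int, patient_days: float) -> float:
    """Normalize an error count to a rate per 1,000 patient-days."""
    if patient_days <= 0:
        raise ValueError("patient_days must be positive")
    return 1000.0 * error_count / patient_days


# Hypothetical audit: 45 serious errors found across 500 patient-days of chart review.
example_rate = rate_per_1000_patient_days(45, 500)
print(f"{example_rate:.1f} serious errors per 1,000 patient-days")
```

One patient-day is one patient occupying a bed for one day, so ten patients observed for fifty days each contribute the 500 patient-days used here.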

The numbers:

  • Serious medical errors (resident-related):
    • Traditional 24‑hour schedule: 97 errors per 1,000 patient-days (example scale).
    • Night-float (no 24‑hour shifts): 79 errors per 1,000 patient-days.
    • Relative reduction ≈ 18–25%, depending on the exact subset analyzed.
  • Resident sleep:
    • Residents on the night-float/shorter-shift schedule slept significantly more per 24-hour period (often 1–2 hours more).
    • Fewer episodes of extreme sleep deprivation (<3 hours of sleep in 24 hours).

So here, unlike FIRST and iCOMPARE, shortening duty periods clearly reduced serious errors. Why the difference?

A few reasons:

  1. Environment: PICU is dense with high-risk interventions per hour. Less margin for error.
  2. Measurement sensitivity: ROSTERS directly audited errors, not just mortality.
  3. Extreme shift lengths: 24–28 hour shifts in an ICU are a different beast from a 16 vs 28-hour difference on ward rotations with variable acuity.

The ROSTERS data supports what most of us have seen: severely sleep-deprived trainees in high-acuity ICUs make more dangerous mistakes, and giving them more sleep reduces those mistakes.


What the Landmark Trials Agree On

Let me strip the ideology away and put the main trials side by side.

Key Landmark Duty Hour Trials Comparison

| Trial | Specialty | Intervention Type | Primary Outcome | Effect on Patient Outcomes | Effect on Errors/Fatigue |
| --- | --- | --- | --- | --- | --- |
| FIRST | General Surgery | More flexible hours | 30-day mortality/complications | No significant difference | More fatigue, better continuity |
| iCOMPARE | Internal Medicine | More flexible hours | 30-day mortality | No significant difference | More fatigue, more self-reported errors |
| ROSTERS | Pediatric ICU | Shorter shifts, no 24-hr | Serious medical errors | Fewer serious errors | More sleep, less fatigue |

The pattern is consistent:

  • When you move from rigid 16-hour caps to flexible but still 80‑hour-limited systems (FIRST, iCOMPARE):

    • System-level patient outcomes (mortality, readmission, major complications) do not change substantially.
    • Resident-level fatigue increases with longer shifts.
    • Continuity improves; handoffs decrease.
  • When you cut extremely long ICU shifts (ROSTERS):

    • You see a clear drop in serious medical errors.
    • Sleep improves markedly.

So, do shorter duty hours reduce errors? In high-acuity environments with extreme shifts (ICUs with 24–28 hour calls), yes, the data is pretty clear: shorter shifts and more sleep reduce documented errors.

On general wards, with 80‑hour caps already in place, simply shrinking maximum shift length (e.g., enforcing 16-hour caps) without structural redesign to limit handoffs does not reliably translate into better patient-level outcomes. The benefit from less fatigue can be offset by harm from more fragmented care.


The Handoff Problem: Fragmentation vs Fatigue

[Figure: flowchart. Duty Hours Tradeoff Between Fatigue and Fragmentation. Nodes: Long Shifts, More Fatigue, More Continuity, Short Shifts, Less Fatigue, More Handoffs, Risk of Individual Errors, Risk of Communication Errors, Patient Outcomes.]

You cannot talk about duty hours without talking about handoffs. Every time a patient is handed off, there is a chance critical information will be lost. The trials implicitly show a tension:

  • Long shifts:
    • Lower number of handoffs.
    • Higher resident fatigue.
  • Short shifts:
    • More frequent handoffs and more fragmentation.
    • Better-rested residents.

The data suggests:

  • In environments where the cost of fatigue is extremely high and continuous (ICUs, complex overnight care), shortening shifts can be a net positive if handoffs are structured and supervised.
  • In lower acuity, high-volume ward settings, the harm from constantly rotating teams may cancel out the benefit of slightly more rested residents.
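
The fragmentation side of this tradeoff can be roughed out with simple arithmetic. The sketch below counts how many shift changes a patient lives through during a stay, then estimates the chance that at least one handoff drops something important. It assumes back-to-back shifts of equal length, admission at the start of a shift, and a per-handoff drop probability of 5% that is entirely hypothetical. It deliberately leaves fatigue out, so it quantifies only one side of the balance.

```python
import math


def handoffs_during_stay(stay_hours: float, shift_hours: float) -> int:
    """Shift changes during a stay, assuming equal back-to-back shifts
    and admission at the start of a shift."""
    if stay_hours <= 0 or shift_hours <= 0:
        raise ValueError("stay_hours and shift_hours must be positive")
    return math.ceil(stay_hours / shift_hours) - 1


def chance_of_any_dropped_item(handoffs: int, p_drop_per_handoff: float) -> float:
    """Probability that at least one handoff loses critical information,
    treating handoffs as independent."""
    return 1.0 - (1.0 - p_drop_per_handoff) ** handoffs


STAY_HOURS = 72   # a three-day admission
P_DROP = 0.05     # hypothetical per-handoff chance of a dropped item

for shift in (12, 16, 24, 28):
    n = handoffs_during_stay(STAY_HOURS, shift)
    risk = chance_of_any_dropped_item(n, P_DROP)
    print(f"{shift:>2}-hour shifts: {n} handoffs, {risk:.1%} chance of at least one dropped item")
```

Real schedules with separate day teams and night float usually produce more transitions than this idealized count, so the numbers are best read as a lower bound on fragmentation.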

I have seen this happen on medicine floors: a stable but complex patient is handed off five times in three days under a rigid 16-hour system. Every intern writes slightly different plans. Notes get longer; clarity drops. Then a minor issue gets missed not because anyone was exhausted but because nobody owned the patient long enough.

FIRST residents in flexible programs repeatedly said they valued being able to stay and see a case through. That is continuity. And continuity is not a soft, feel-good concept; its loss is a measurable risk factor for errors at transitions of care.


Beyond Errors: Education, Burnout, and Career Trajectory

[Figure: horizontal bar chart. Resident Perceptions Under Standard vs Flexible Duty Hours (illustrative). Plotted values: Continuity of Care Better 1.2, Enough Time for Education 1, Severe Fatigue Frequently 1.3.]

The trials were not designed primarily around your burnout risk or your board scores, but they did collect some useful data.

From FIRST and iCOMPARE surveys:

  • Education:
    • Many residents in flexible arms felt they had more time to see full cases or admissions.
    • Formal didactic attendance did not meaningfully differ between arms in most analyses.
  • Well-being:
    • Flexible programs had modestly higher rates of self-reported fatigue and sometimes higher rates of emotional exhaustion.
    • Burnout scores tended to be high in both arms; duty hours are one variable among many.

If you are asking, “Which system will make me less burned out?” the data does not give a simple answer. Work hour length is one factor. Toxic culture, admin bloat, poor supervision, and lack of autonomy can destroy you in either system.

On competence:

  • Longer shifts may increase experiential learning (more nights, more acute issues).
  • Shorter shifts may push more learning into shorter, more intense blocks and require better supervision to avoid missing key experiences.

The trials did not follow residents long enough to say whether flexible vs strict systems changed long-term board performance or independent practice error rates. Anyone who claims the data “proves” one or the other is overselling.


What This Means for You as a Resident

Strip it down to actionable points.

If you are practicing or training under more flexible/long-shift rules:

  • Expect:
    • More fatigue and worse alertness late in the shift.
    • Fewer handoffs and better continuity on your patients.
  • Protect yourself and patients by:
    • Building hard habits: standardized cross-checks before writing orders at 3 a.m., double-checking meds and dosages, explicitly asking nurses to flag concerns when you are near the end of a long call.
    • Using your team: senior residents, pharmacists, night attendings as extra filters.

If you are under strict short-shift systems (heavy night float, many 12–16 hour shifts, lots of transitions):

  • Expect:
    • More frequent handoffs and more time spent “getting up to speed.”
    • Less extreme single-day fatigue, but cumulative tiredness is still very real.
  • Compensate by:
    • Making your sign-out bulletproof. Clear “if X, then Y” contingencies. Explicit thresholds (e.g., “If MAP < 60 for more than 15 minutes, page the ICU fellow”).
    • Owning your patients while you have them. Put in the extra 5 minutes to clean up problem lists and reconcile meds, so the next resident is not cleaning your mess blindly.
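
An "if X, then Y" sign-out item is effectively a small rule with a threshold, a duration, and an action. The sketch below writes the MAP example from the sign-out advice above as data and checks it against a series of readings. The class, field names, and sample readings are all hypothetical. The point is how little ambiguity an explicit rule leaves for the person receiving the handoff. It is not a clinical decision tool.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Contingency:
    """One explicit sign-out rule: if the vital stays below a threshold
    for longer than a set time, take the named action."""
    vital: str
    below: float
    sustained_minutes: int
    action: str


def is_triggered(rule: Contingency, readings) -> bool:
    """readings: (minute, value) pairs sorted by time.
    True once the value has stayed below the threshold for longer than
    rule.sustained_minutes without recovering."""
    run_start = None
    for minute, value in readings:
        if value < rule.below:
            if run_start is None:
                run_start = minute  # start of the current low run
            if minute - run_start > rule.sustained_minutes:
                return True
        else:
            run_start = None  # recovered, so the clock resets
    return False


map_rule = Contingency(vital="MAP", below=60, sustained_minutes=15, action="Page ICU fellow")

# Hypothetical overnight readings: (minutes since sign-out, MAP in mmHg).
sustained_low = [(0, 66), (5, 58), (10, 57), (15, 59), (20, 56), (25, 55)]
brief_dip = [(0, 58), (10, 57), (15, 62), (20, 58), (30, 57)]

for label, series in (("sustained low", sustained_low), ("brief dip", brief_dip)):
    outcome = map_rule.action if is_triggered(map_rule, series) else "Keep monitoring"
    print(f"{label}: {outcome}")
```

Writing the threshold and the duration down separately is the design choice that matters: "call if the pressure is low" leaves both to the judgment of someone who has known the patient for ten minutes.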

The data shows that policy-level duty-hour tweaks do not save you from the need to manage your own risk. Shorter shifts will not magically make you safe. Longer shifts do not automatically doom you to constant mistakes. Your workflow, your sign-outs, your use of backup systems matter as much as the clock.


Where the Evidence Actually Points

Let me end with blunt conclusions.

  1. Shorter duty hours reduce errors in high-acuity, high-risk settings when they meaningfully increase sleep and eliminate 24‑hour+ shifts. ROSTERS is good evidence of that.

  2. Under an 80‑hour weekly cap, simply capping shift length more aggressively (like the old 16‑hour intern rule) does not consistently improve patient-level outcomes and may increase problems from handoffs and fragmentation.

  3. The real optimization problem is not “shorter vs longer,” but “how do we balance continuity, sleep, and supervision for each environment?” A 28‑hour PICU shift is not the same risk profile as a 24‑hour consult call at a low-acuity community hospital.

Policy fights often ignore this nuance. The data does not. You should not either.
