
Behind Closed Doors: How Teaching Evaluations Are Read and Scored

January 8, 2026
17-minute read

[Image: Medical faculty member reading teaching evaluations alone in an office]

Last spring, a junior faculty member forwarded me her teaching evals with a single line: “Do I need to start looking for another job?” One student had written “worst lecturer in the curriculum” and she was convinced her academic career was over. What she did not know—and what nobody had ever told her—is how little that single comment actually mattered in the rooms where decisions are really made.

Let me walk you through those rooms.

What Actually Happens The Day Your Evals Come In

At most med schools and teaching hospitals, there is a predictable, almost boring workflow behind the supposedly terrifying “teaching evaluations.”

It goes something like this, though nobody ever explains it to you.

First, the raw survey data land in the educational office. Not on the department chair’s desk. Not in the Dean’s inbox. In the hands of an overworked education coordinator or data analyst who’s juggling clerkship schedules, room bookings, and accreditation reports.

They run the standard report:

  • Mean scores for each item (1–5 or 1–7 scale)
  • Standard deviations
  • Response rate
  • Comment dump at the end

Then there’s the first, unspoken filter: response rate. If you got 5 responses out of a rotation of 40 students, every serious educator silently downgrades the “precision” of your feedback. We do not treat 3 angry comments from 10% of your learners the same way we treat a consistent pattern from 70–80%.
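
That report is mundane enough to sketch. Here is a minimal version of what it amounts to for a single item, assuming a plain list of 1–5 ratings; the function and the 50% "reliable" cutoff are my illustration, not any institution's actual rule:

```python
from statistics import mean, stdev

def standard_report(ratings: list[int], enrolled: int) -> dict:
    """Summarize one eval item: mean, SD, and response rate.

    ratings: the 1-5 scores actually submitted for this item.
    enrolled: how many learners could have responded.
    """
    response_rate = len(ratings) / enrolled
    return {
        "mean": round(mean(ratings), 2),
        "sd": round(stdev(ratings), 2) if len(ratings) > 1 else 0.0,
        "n": len(ratings),
        "response_rate": round(response_rate, 2),
        # The unspoken filter: low response rates get mentally downgraded.
        # The 0.5 cutoff here is illustrative, not an official standard.
        "reliable": response_rate >= 0.5,
    }

# 5 responses out of a rotation of 40: flagged as noise, not signal.
print(standard_report([2, 3, 2, 1, 3], enrolled=40))
```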

[Bar chart] Typical Teaching Evaluation Response Rates by Setting:

  • Preclinical Course: 78%
  • Core Clerkship: 65%
  • Elective: 42%
  • Conference Series: 30%

Those reports get rolled up. Depending on the institution, they’re:

  • Emailed to you and your division chief
  • Stored in a central faculty performance system
  • Summarized annually in a teaching dossier

The real reading and scoring happens later—during promotion cycles, contract renewals, teaching awards, or when there’s a complaint.

That’s the part you never see. Because you’re not in the room when your name comes up.

How Program Leadership Actually Reads Your Numbers

Let me be blunt: nobody is sitting in a dark room, obsessing over whether your “organized presentation” item is 4.3 or 4.4.

Senior people look for three things:

  1. Are you consistently below the group?
  2. Are you consistently above the group?
  3. Is there a pattern—over time or across different settings?

Notice what’s missing: perfection.

No serious educator expects straight 5.0s. When those do appear, seasoned program directors get suspicious: either the sample size is tiny, the evals are inflated, or the faculty member is aggressively pressuring learners for good scores.

Here’s how those numbers are really interpreted behind closed doors:

How Leadership Interprets Teaching Eval Numbers:

  • 4.6–5.0 with a high response rate: strong teacher, likely doing something special
  • 4.2–4.5, around the departmental mean: solid, reliable, not a concern
  • 3.8–4.1, slightly below the mean: watch list; may need support or context
  • Below 3.8 consistently, over multiple years: a problem we cannot ignore
  • Big year-to-year swings: context change, need to investigate
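
If you forced that mental model into code, it would look something like the sketch below. The thresholds come from the list above, but the function itself, including the 0.5-point cutoff for a "big swing," is purely illustrative; no committee actually runs software like this:

```python
def interpret_pattern(yearly_means: list[float], high_response: bool) -> str:
    """Rough translation of the list above. Illustrative only:
    this is how heads work in the room, not an actual system."""
    latest = yearly_means[-1]
    if len(yearly_means) > 1 and all(m < 3.8 for m in yearly_means):
        return "Problem we cannot ignore"
    if max(yearly_means) - min(yearly_means) > 0.5:  # illustrative swing cutoff
        return "Context change, need to investigate"
    if latest >= 4.6 and high_response:
        return "Strong teacher, likely doing something special"
    if latest >= 4.2:
        return "Solid, reliable, not a concern"
    if latest >= 3.8:
        return "Watch list, may need support or context"
    return "One low year: look for context before judging"

print(interpret_pattern([4.3, 4.4, 4.2], high_response=True))
# -> Solid, reliable, not a concern
```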

You obsess over “I got a 4.1 and my colleague got a 4.3.” Leadership doesn’t. We look at:

  • Where are you relative to the median faculty member?
  • Is your trend stable, improving, or sliding?
  • Does this fit what we hear anecdotally from residents and students?
  • Does this fit your teaching context? (more on that later)

When I sit in promotions meetings, the conversation about teaching evals usually lasts maybe 2–5 minutes per candidate. That’s it. No one is printing your entire comment history and reading it aloud.

Unless you are truly an outlier. Then the tone shifts.

The Dark Secret: Comments vs. Scores

Let me tell you what everyone pretends isn’t true: the comments matter more than the numbers—if there are enough of them saying the same thing.

But a single spectacularly nasty comment? We’ve all been subject to those. Experienced committee members discount them almost immediately.

In a typical meeting, here’s what you’ll hear:

  • “She’s at or slightly above the departmental mean. Nothing concerning here.”
  • “There are a few negative comments, but they’re not consistent year to year.”
  • “Students love his bedside teaching but hate his PowerPoints—that’s fixable.”

Now, when do comments start to count against you?

When we see repetition:

  • “Disorganized” every year, across courses
  • “Humiliates students” from UME and GME learners, multiple cohorts
  • “Makes sexist jokes” echoing across time
  • “Never lets us do procedures, just pushes us aside”

One offhand complaint about being “mean” because you failed a student who deserved to fail? That gets thrown in the mental trash. Three independent cohorts calling you belittling? That changes how your file gets read.

[Image: Printed teaching evaluation comments with highlighted repeated themes]

And there’s another uncomfortable truth: glowing comments rarely save bad numbers. A pattern of low scores with a handful of gushing “best teacher ever” comments looks like this to us: you connect well with a subset of learners and fail the rest.

That’s not excellence. That’s variability.

Context: The Part Nobody Tells You Gets Adjusted

You think all teaching evals are judged the same. They aren’t. Not even close.

When I evaluate a faculty member’s teaching, I’m adjusting in my head for:

  • Content difficulty. The person teaching renal physiology or biostatistics gets graded more harshly by students than the person teaching dermatology pictures. Not fair, but very real.
  • Learner seniority. Preclinical small groups vs. intern boot camp are different worlds. Interns will slam you for being demanding; students might adore the structure.
  • Required vs. elective. Evaluations from electives are almost always inflated. The learners chose to be there. We all know this.
  • Time of year. Pathology in April vs. September? Two different audiences. Burned-out learners are not generous graders.

Good education leaders know this and mentally curve things. Here’s roughly how people weight different teaching contexts behind closed doors:

[Doughnut chart] Relative Weight of Teaching Contexts in Promotion Decisions:

  • Core Clerkship Teaching: 40%
  • Preclinical Lectures: 30%
  • Elective Teaching: 20%
  • One-off Noon Conferences: 10%
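
To make the arithmetic concrete, here is a hedged sketch of that mental curve as a weighted average. The weights mirror the chart above; the dictionary keys and the renormalization detail are my own illustration, not a formula anyone actually applies:

```python
# Illustrative weights taken from the chart above. Real committees do
# this in their heads, not in spreadsheets.
CONTEXT_WEIGHTS = {
    "core_clerkship": 0.40,
    "preclinical_lectures": 0.30,
    "elective": 0.20,
    "noon_conference": 0.10,
}

def weighted_teaching_score(scores: dict[str, float]) -> float:
    """Weighted mean across contexts, renormalized over only the
    contexts this faculty member actually teaches in."""
    total_weight = sum(CONTEXT_WEIGHTS[c] for c in scores)
    return sum(CONTEXT_WEIGHTS[c] * s for c, s in scores.items()) / total_weight

# Outstanding in the clerkship (4.7), mediocre at noon conference (3.9):
# the blend comes out to 4.54, i.e., the weak venue barely moves the needle.
print(round(weighted_teaching_score(
    {"core_clerkship": 4.7, "noon_conference": 3.9}), 2))
```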

If you’re outstanding in core clerkships and mediocre at a once-per-year noon conference, nobody cares about the noon conference.

Flip it—if your only strong evals are from a handpicked elective and your core teaching is weak—that’s a problem. And yes, we notice that pattern.

What Committees Actually Do With Your Evals

Let’s talk about where the evals really matter: promotions, reappointments, and teaching awards.

For Promotion and Reappointment

In a typical Assistant Professor → Associate Professor discussion, your teaching section gets handled like this:

  1. A committee member has your dossier open. They’ve read the summary prepared by the education office.
  2. They see:
    • Aggregate scores over ~5 years
    • Comparison to departmental averages
    • Selected representative comments
    • List of teaching roles (lectures, small groups, clerkships, simulation, etc.)
  3. They talk for 60–120 seconds:
    • “Teaching evaluations are consistently at or above departmental mean.”
    • “He’s taken on increasing teaching responsibility.”
    • “Comments highlight approachability and bedside teaching.”
    • “There were concerns about organization early on, which seem to have improved.”

That’s the entire teaching segment.

If you’re below the mean, the conversation shifts:

  • “She’s consistently 0.3–0.5 below the departmental mean in the clerkship.”
  • “Comments repeatedly mention disorganization.”
  • “She has not sought faculty development despite previous feedback.”

Then someone will ask:

“Is this a support issue, or a patient safety / professionalism issue?”

If it’s the former, they may still recommend promotion, but with a strong note that you need teaching development. If it’s the latter, things get sticky.

For Teaching Awards

Here’s the part nobody will tell you: teaching awards are political.

Yes, evals matter. But so do:

  • Who is on the selection committee
  • Whether your department advocates for you
  • Whether learners bother to nominate you with specific stories
  • Whether you’re “visible” in big lectures vs. buried in night-float teaching

We absolutely look at your evaluations. But we are looking for sustained excellence and narrative evidence, not perfection.

I’ve sat in award meetings where someone with 4.9 averages lost to someone with 4.6 but legendary comments and clear impact on struggling learners. The committee valued the story over a tenth of a point.

The Bias Problem: What We Say Quietly After The Meeting

There is a conversation that happens after the official meeting ends. You won’t see it in any policy document.

We all know the literature: teaching evaluations are biased. Against women. Against underrepresented faculty. Against anyone with an accent. Against those who enforce standards.

So after reading a dossier from, say, a Black woman surgeon with “tough but fair” comments and slightly lower means, experienced chairs will literally say in the room:

“Adjust for bias. She’s holding residents to standards and they’re reacting.”

I’ve heard this word-for-word:

  • “Her scores are a bit lower, but she’s in trauma surgery nights with angry PGY-1s. This is not a red flag.”
  • “Students call him ‘intimidating’ but he’s the only one giving them real feedback. I’m not docking him for that.”

[Image: Faculty promotions committee in serious discussion]

Are all committees this self-aware? No. Some still treat evals as objective truth. But in most serious academic centers, the bias problem is known and at least partially corrected for—informally, in people’s heads.

That said, there’s another ugly bias: charisma.

If you’re funny, extroverted, and good on stage, your evals are inflated. If you’re quiet, methodical, and introverted, students often underrate you even if they learn more from you.

We know this too. Some of the best clinical teachers I’ve seen get “good but not great” evals because they’re not performers. A thoughtful promotions committee will read that correctly.

A lazy one won’t.

How Harsh Comments Are Actually Perceived

Let’s go back to that “worst lecturer in the curriculum” line that crushed my junior colleague.

Here’s the mental filter experienced faculty use when they read hostile comments:

  • Singular extreme comment in a sea of decent scores
    Translation: one unhappy learner, probably personality clash, grading backlash, or someone having a bad day.

  • Hostile comment + low response rate
    Translation: noise. Statistically meaningless.

  • Hostile comment, but the only specific complaint is “too hard,” “expects too much,” or “tests on things we didn’t see”
    Translation: might actually be doing their job.

  • Hostile comment + specifics that match multiple other comments
    Now we pay attention.

There’s also the clinical reality check. If residents say, “She’s mean, she insists I present cases clearly and read overnight,” there’s usually a chuckle and someone says, “Sounds like my best attending in residency.”

But if students say, “He made racist jokes about patients,” the room goes quiet. That’s not teaching style; that’s professionalism.

We’re not idiots. We do know the difference.

How Smart Faculty Use Evals Instead of Fearing Them

The best teachers I know do not have perfect evals. They have clear stories in their evals.

I’ll tell you what they do differently:

They read for themes, not for ego. They keep a running document (see the sketch after this list) of:

  • Phrases that repeat over time (“organized,” “approachable,” “tough but fair”)
  • Specific criticisms that show up more than once (“too fast,” “slides crowded”)
  • Comments that align with what they themselves feel is weak
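
A crude but honest way to keep that running document is a theme tally across years. Everything in this sketch, the comment strings included, is hypothetical; the point is that repetition across cohorts, not raw mention counts, is what deserves your attention:

```python
from collections import Counter

# Hypothetical multi-year comment themes; a real version would be
# distilled from your actual eval reports.
comments_by_year = {
    2023: ["disorganized slides", "approachable", "too fast"],
    2024: ["approachable", "tough but fair", "too fast"],
    2025: ["approachable", "slides crowded", "tough but fair"],
}

# Count how many YEARS a theme appears in, not raw mentions;
# repetition across cohorts is what committees actually weigh.
years_per_theme = Counter(
    theme for themes in comments_by_year.values() for theme in set(themes)
)

for theme, n_years in years_per_theme.most_common():
    if n_years > 1:
        print(f"{theme}: appears in {n_years} of {len(comments_by_year)} years")
```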

Then they make small, visible changes and—this part matters—they signal those changes to learners.

Examples I’ve seen work:

  • “Last year I got feedback that I move too fast through the imaging. So today I’m going to pause after each case and give you a minute to process before we discuss.”
  • “Residents have told me the feedback I give is too vague. I’m going to be more explicit today—expect some very direct comments.”

You know what that does? It:

  1. Shows you take feedback seriously
  2. Lowers the temperature when you inevitably give someone tough feedback
  3. Makes it harder for learners to write lazy, generic complaints

[Flowchart] Faculty Response to Teaching Evaluations:

  1. Receive the eval report
  2. Note it, but do not overreact
  3. Prioritize 1–2 changes
  4. Implement the change next cycle
  5. Tell learners about the changes
  6. Review the next round of evals
  7. Look for patterns

Here’s the real secret: committees love seeing trajectory. If your early years show mixed evals and your later ones show thoughtful improvement, that impresses people far more than a flat line of “fine, I guess.”

What Actually Gets You In Trouble

Let me be crystal clear about the scenarios that really raise red flags in closed-door meetings:

  1. Persistent, multi-year, below-average scores across multiple settings
    Not just one tough rotation. Across lectures, wards, small groups, everything. That suggests a global teaching problem, not a bad fit.

  2. Recurrent concerns about humiliation, disrespect, or safety
    “Belittles students,” “throws instruments,” “yells in the OR,” “punishes honest mistakes.” Even if the numbers are middling, this kind of pattern forces the committee’s hand. It becomes a professionalism issue.

  3. Discrepancy between evals and what you claim
    If your personal statement screams “passionate educator” but your evals are bottom quartile and you do zero faculty development, the mismatch bothers people. It looks like self-delusion or spin.

  4. Ignoring clear, repeated feedback
    If every year you hear “too disorganized” and the evals 5 years later say the same thing, promotion committees lose patience. The problem isn’t the evals; it’s your refusal to adapt.

Notice what’s not on that list:

  • One bad year during COVID chaos
  • A rough first year on a new clerkship
  • Lower scores in a notoriously tough course
  • A handful of angry comments after you failed someone or reported misconduct

We remember that we were junior attendings and residents once too. Most of us have our own horror stories of “that one eval.”

How To Read Your Own Evals Like a Program Director

When you open your next eval report, stop reading it like a wounded human for five minutes and read it like a division chief.

Ask yourself:

  • Where am I relative to my peers? (If you don’t see comparison data, ask for it.)
  • What are the 2–3 adjectives that keep repeating across years?
  • Are my worst comments about style or about safety/respect?
  • Is there a specific context where my evals are consistently lower? (e.g., lectures vs. bedside)
  • Can I name one concrete change I’ll make next cycle?

Then, later, once the sting is gone, read them again as a human. Let yourself be proud of the quiet, specific compliments: “Took extra time to explain,” “Made me feel comfortable admitting what I didn’t know,” “Pushed me to be better.”

Those matter more than the random “best lecturer ever!!!” from the student who already loved the topic.

[Image: Physician educator annotating evaluation report with notes for improvement]

If You’re On The Receiving End Of A “Concern”

If your chair calls you in “to talk about your teaching evaluations,” here’s what is usually happening:

  • Someone (clerkship director, course director, program director) flagged a pattern.
  • The chair wants to see if this is:
    • A documentation problem
    • A context problem
    • A real performance problem

You are not on trial yet. You are under assessment.

The worst move you can make is immediate defensiveness or blaming “these new learners today.” That tells the chair you’re going to be hard to coach.

A better approach:

  • Acknowledge you’ve seen the pattern.
  • Share at least one concrete change you’re willing to try.
  • Ask if there’s someone in the department with strong evals you could observe or get mentored by.

When we see a faculty member meet this moment thoughtfully, we’re actually relieved. We don’t want to fire you. We want to be able to tell the Dean, “We addressed it, they’re improving.”

FAQ

1. Can bad teaching evaluations actually get me fired or prevent promotion?
Yes—but only when there’s a sustained, multi-year pattern of poor evaluations across multiple teaching settings, usually combined with serious comments about disrespect, humiliation, or unsafe supervision. One bad year, or one hostile cohort, almost never derails a career by itself. It’s the combination of consistency, severity, and refusal to change that becomes lethal in promotion discussions.

2. Do committees really adjust for biased evaluations (gender, race, accent)?
At reputable academic centers, yes, at least informally. Seasoned leaders are very aware that women, underrepresented faculty, and foreign-trained physicians get harsher evals. In meetings, you’ll actually hear things like “remember the bias data” or “she’s in a male-dominated field, curve this in your head.” The problem is that this “correction” is not systematic—it depends on who’s in the room. That’s why you should not count on evals alone to prove your teaching value; build other evidence like peer observations and teaching portfolios.

3. How much do student vs. resident evaluations matter?
For medical school promotions, student evaluations carry more formal weight because they’re standardized and heavily audited for accreditation. For residency-focused faculty, resident evals are scrutinized more closely, especially in ACGME core faculty roles. In truth, committees like seeing strength in both groups: strong UME and GME evals tell us you can teach across levels, which looks very good for promotion. Persistently poor resident evals worry people more, because they imply issues with supervision and patient care.

4. What if my scores are average but I put in a huge amount of teaching effort?
Behind closed doors, effort alone doesn’t move promotion decisions—but documented impact does. If your numbers are average, you can still build a strong teaching case by showing: you developed curricula, created new rotations, mentored successful trainees, led simulations, or improved exam performance or milestone outcomes. Smart faculty don’t rely solely on numeric evals; they collect letters from course directors, document teaching roles, track outcomes, and get peer evaluations. Committees love a coherent narrative of “this person steadily improved and built something that lasts,” even if the raw scores are merely solid rather than spectacular.


When the door closes and your file is on the table, nobody is counting decimal points. They’re asking three questions: Are you safe? Are you teachable? And are you contributing more than you’re costing? If your evaluations tell a story of basic competence, gradual growth, and responsiveness to feedback, you’re going to be fine—no matter what that one vicious comment said.
