
Most “constructive” written feedback for women in medicine is quietly sabotaging them.
Not because your attendings are villains. Because the language they were trained to use is biased, coded, and normalized. And unless you learn to read that code, you will underestimate yourself, internalize garbage, and let biased narratives shape your career.
Let me break this down specifically.
1. How Gendered Language Sneaks Into Evaluations
Every resident has heard some version of this:
- The male resident is “decisive, assertive, a natural leader.”
- The female resident, doing the same thing, is “abrasive, opinionated, can come across as harsh.”
Same behavior. Different adjectives. Different consequences.
This is not abstract theory. It shows up in:
- Clerkship evaluations
- MSPE (Dean’s Letter) narratives
- Residency milestone assessments
- Fellowship and job letters of recommendation
Repeatedly. Across institutions. Across specialties.
The structural problem
Most evaluation systems rely on “free text” narrative comments:
- “Strengths”
- “Areas for improvement”
- “Comments”
The prompts are vague. The writers have variable training. The narrative gets filtered through:
- Gender stereotypes (communal vs agentic traits)
- Racial stereotypes (even more damaging for women of color)
- Role expectations (“team player” vs “leader”)
- Personality “fit” with the attending
So you think you are getting objective data about your performance. You are not. You are getting performance + stereotype + personal preference + mood of the evaluator on post-call day 3.
When you know what to look for, this becomes painfully obvious.
2. The Gendered Vocabulary: What The Words Really Signal
You will not fix bias if you treat all adjectives as neutral. Some words are loaded. Some are red flags. Some are disguised compliments that cap your trajectory.
Let’s dissect the big buckets.
| Descriptor | Relative frequency (illustrative) |
|---|---|
| Assertive | 60 |
| Aggressive | 15 |
| Leader | 55 |
| Brilliant | 40 |
| Nice | 70 |
| Bossy | 20 |
| Team player | 80 |
| Hardworking | 75 |
(Think of the first four as more commonly applied to men, the last four disproportionately to women.)
2.1 Agentic vs communal traits
Agentic (power, autonomy, decision-making) terms:
- Assertive
- Confident
- Decisive
- Independent
- Leader / leadership potential
- Takes charge
Communal (support, warmth, cooperation) terms:
- Caring
- Helpful
- Supportive
- Kind
- Nurturing
- Team player
Pattern I see over and over:
Men are framed as agentic plus competent.
Women are framed as communal plus “nice.”
Neither set of words is intrinsically bad. The problem is what they do downstream:
- Agentic + competent → leadership, fellowships, academic tracks
- Communal + nice → “great to work with,” stays in the trenches, overlooked for advancement
Now layer on the negative forms.
2.2 The “too” problem
Men get credit for high-intensity traits; women get punished for the same things.
For women, these show up constantly:
- “Can be too assertive”
- “Comes across as too direct”
- “At times too confident for her level”
- “Can be too outspoken in team discussions”
The “too” is almost always about violating a gender norm, not an actual safety or professionalism issue. If there were true professionalism concerns, the language would (and should) look different:
- “Raised her voice at nursing staff when redirected multiple times.”
- “Ignored direct instruction regarding patient safety protocol.”
You see the difference. One is about tone policing. The other is about specific behavior.
2.3 Personality labels vs performance descriptors
Watch for this split:
Men:
“Excellent clinical reasoning.”
“Impressive fund of knowledge.”
“Handles complex cases independently.”
Women:
“Pleasure to work with.”
“Very nice with staff.”
“Always smiling.”
“Reliable and dependable.”
Ask a simple question: Does this sentence describe what I can do, or what I am like?
If your evaluation skews heavily toward “what you are like,” it is less about your competence and more about your role in keeping the team emotionally comfortable.
2.4 Code words that should make you pause
Here are the words and phrases I pay attention to when I review evaluations for women residents and students:
- “Emotional,” “overly emotional,” “too sensitive” → often used when a woman sets boundaries or shows normal human reaction to stress or unfairness.
- “Not confident enough,” “needs to be more confident” → sometimes true, often a signal of bias when the same attending praises a male colleague as “appropriately cautious.”
- “Can be perceived as harsh” or “might come off as abrasive” → vague, usually tone policing, rarely backed with concrete behavior examples.
- “Quiet,” “reserved,” “soft-spoken” as negatives → often penalizing women for not performing extroverted leadership.
- “Strong team player, always willing to help” as the only compliment → classic trap; you are the support structure, not the star.
And the big one:
- “Not ready for independent practice” with no specific competencies listed. That phrase can kill fellowship chances if it shows up in your MSPE or letters.
3. What High-Stakes Evaluations Actually Do With That Language
The problem is not just hurt feelings. It is structural.
Narrative language gets fed into:
- Rank lists for honors in clerkships
- AΩA membership decisions
- “Top tier” vs “middle tier” ranking by the Clinical Competency Committee (CCC)
- Dean’s Letters (MSPE) language bands (“outstanding,” “excellent,” “very good”)
- Fellowship and job selection decisions
| Narrative Pattern | Typical Interpretation | Downstream Effect |
|---|---|---|
| Agentic + specific competence | High performer, leader | Honors, leadership roles |
| Communal + vague praise | Nice, reliable worker | Solid but not “star” |
| Tone-policing with no specific examples | Questionable “fit” | Hesitation for high-visibility roles |
| “Not confident” with no objective deficiencies | Risk-averse, needs supervision | Less autonomy, weaker letters |
| Lacks leadership language entirely | Not leadership material | Overlooked for chief, committees |
In MSPE review committees, I have literally heard:
- “She sounds very nice, but I am not seeing words that signal leadership.”
- “He is described as assertive and decisive; that reads as a future chief.”
- “There is a comment about her being ‘too direct’—is that a professionalism issue?”
Same text. Wildly different readings depending on who is in the room and how much they understand gendered language.
4. Decoding Your Own Evaluations: A Stepwise Approach
Let’s get practical. You have a pile of evaluations. You sense something is off, but you cannot quite articulate it.
Here is how I would walk through it with you.
| Step | Description |
|---|---|
| Step 1 | Collect evaluations |
| Step 2 | Strip names and dates |
| Step 3 | Highlight competence words |
| Step 4 | Highlight personality words |
| Step 5 | Check agentic vs communal balance |
| Step 6 | Identify vague or coded negatives |
| Step 7 | Extract specific behavior-based feedback |
| Step 8 | Plan response and development steps |
Step 1: Separate signal from noise
Print or digitally copy your evaluations. Then:
- Highlight all comments that describe specific behaviors or skills.
  Example: “Presented a well-organized oral case with appropriate differential.”
- In another color, highlight adjectives that describe personality or style.
  Example: “Pleasant, easygoing, always smiling.”
You want to see the ratio. If behavioral / skill comments are sparse, that is a problem regardless of gender.
Step 2: Sort the adjectives into buckets
On a blank sheet (or spreadsheet), make four columns:
- Agentic positive (assertive, decisive, confident, leader)
- Communal positive (kind, supportive, team player, caring)
- Neutral descriptors (organized, punctual, thorough)
- Negative / coded (abrasive, emotional, too direct, not confident, quiet, etc.)
Count how many you see in each.
Most women I work with are drowning in communal positives, light on agentic positives, and have a few vague negatives that eat away at their confidence.
| Bucket | Example count |
|---|---|
| Agentic positive | 10 |
| Communal positive | 32 |
| Neutral | 14 |
| Negative/coded | 8 |
You can show this to your mentor or PD with a straight face. It is data.
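If you want to generate that tally automatically, the bucket-sorting step can be sketched in a few lines of Python. The term lists below are illustrative placeholders, not validated lexicons (and they match single words only, so multi-word phrases like “team player” would need extra handling); swap in the words that actually appear in your own evaluations.

```python
import re
from collections import Counter

# Illustrative single-word term lists -- extend with words from your own evals.
BUCKETS = {
    "agentic_positive": {"assertive", "decisive", "confident", "independent", "leader"},
    "communal_positive": {"kind", "supportive", "caring", "helpful", "pleasant"},
    "neutral": {"organized", "punctual", "thorough", "reliable"},
    "negative_coded": {"abrasive", "emotional", "bossy", "quiet", "intense"},
}

def tally(comments):
    """Count bucket hits across a list of free-text evaluation comments."""
    counts = Counter()
    for comment in comments:
        words = set(re.findall(r"[a-z]+", comment.lower()))
        for bucket, terms in BUCKETS.items():
            counts[bucket] += len(words & terms)
    return counts

comments = [
    "A pleasure to work with, kind and supportive with staff.",
    "Organized and thorough, though she can be quiet on rounds.",
    "Confident, decisive presenter; clear leader on the team.",
]
print(tally(comments))
```

The output is exactly the four-row count table above, ready to paste into the conversation with your mentor or PD.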
Step 3: Identify which “negatives” are actually growth points vs bias flags
Take each negative or critical comment and ask:
- Is there a specific, observable behavior described?
- Does it tie to a concrete impact on patient care, team function, or professionalism?
- Is it about what I did or how I made them feel?
Examples:
“Needs to improve time management; often started notes late, which delayed sign-out.”
→ Real. Actionable. You can work on this.
“Can come across as intense and may need to soften communication style with staff.”
→ Vague. Could be bias. Needs probing.
“Lacked confidence in procedures, often deferred to seniors even when capable.”
→ Potentially real, but you need to cross-reference with your own experience and other comments.
Step 4: Cross-check patterns across evaluators and settings
One biased attending can write nonsense. A pattern across five rotations is data.
Pay attention to:
- Does “not confident” only show up in surgical rotations with one particular attending?
- Are you “quiet” on every eval, or just in one subspecialty where the culture is toxic?
- Do some attendings explicitly praise your leadership while others call you “too direct”?
If conflicting narratives exist, you are not the problem. The environment is inconsistent. Your job is to choose whose feedback aligns with the physician you are trying to become.
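One low-tech way to surface these patterns is to group coded negatives by rotation and evaluator. A minimal sketch, assuming you have transcribed your comments into simple (rotation, evaluator, comment) tuples; the records, names, and phrase list here are hypothetical:

```python
from collections import defaultdict

# Hypothetical records: (rotation, evaluator, comment) tuples.
records = [
    ("Surgery", "Dr. A", "Not confident in the OR."),
    ("Surgery", "Dr. A", "Lacks confidence with instruments."),
    ("Medicine", "Dr. B", "Confident, decisive on rounds."),
    ("ICU", "Dr. C", "Clear leader during codes."),
]

# Placeholder phrase list of coded negatives to track.
CODED = ["not confident", "lacks confidence", "quiet", "too direct"]

by_source = defaultdict(list)
for rotation, evaluator, comment in records:
    hits = [t for t in CODED if t in comment.lower()]
    if hits:
        by_source[(rotation, evaluator)].extend(hits)

# If every coded negative clusters under one evaluator or rotation,
# that is a pattern worth naming, not a global verdict on you.
for source, hits in by_source.items():
    print(source, hits)
```

In this toy data, all the confidence language traces back to a single evaluator on one rotation, which is exactly the kind of concentration worth flagging.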
5. Responding Strategically: Turning Biased Feedback Into Leverage
You are not going to fix institutional gender bias by yourself during intern year. But you can:
- Protect your self-perception
- Shape the narrative about you
- Push your program to write better evaluations
5.1 Reframing in your own mind
First, you must stop internalizing biased adjectives as truth.
When you read:
- “Too assertive” → Translate to: “She advocated strongly; this made me uncomfortable.”
- “Not confident enough” (with no data) → “She did not perform confidence theater to my preference.”
- “Quiet” → “She does not perform extroversion; I prefer verbal dominance.”
Then ask: Is the underlying behavior actually misaligned with my goals as a physician?
If no, drop the shame. If yes, adjust the behavior, not your worth.
5.2 Asking for clarification without sounding defensive
Residents worry: “If I push back, I will be labeled difficult.” That risk is real, but there are skillful ways to do this.
You can say to an attending or PD:
“I saw in my evaluation that I ‘can come across as too direct.’ I want to understand what specific interactions or phrases stood out so I can calibrate my communication while still advocating for patients. Can you give me 1–2 concrete examples?”
You are:
- Signaling openness to growth
- Demanding specificity
- Making it harder for them to hide behind vague bias
If they cannot produce a single concrete example, that tells you everything.

5.3 Equipping your allies and mentors
Your mentors cannot advocate for you effectively if they only see the official narrative.
Show them:
- The word distribution you counted
- Representative biased phrases
- The discrepancy between your actual performance and the written comments
Then ask directly:
- “When you write letters for me, can you emphasize my clinical decision-making, independence, and leadership, not just that I am ‘great to work with’?”
- “I want to be seen as a leader. What language do you deliberately use for strong male residents that we should also use for me?”
Good mentors will get it. Many will tell you: “You are right, I have been socialized to describe women differently. Let’s fix that.”
5.4 Pre-empting bias in future evaluations
Yes, you can influence how people write about you. Subtly.
- Name your own strengths in agentic terms when you meet a new attending.
  “One of my strengths is organizing the team and making clear decisions in acute situations.”
- Ask for competency-based feedback before the written eval.
  “On this rotation I am targeting improvement in leading rounds and independent decision-making. Could you comment on those specifically in my evaluation?”
- After a strong performance moment (great code leadership, a high-stakes family meeting), say:
  “I would appreciate it if you could comment on my leadership and communication in your evaluation—that was a big growth area for me.”
You are not being manipulative. You are counterbalancing a biased system that otherwise defaults to “nice, hardworking, team player” for women.
6. The Ethics: Why Gendered Feedback Is Not Just “Unfair” but Unethical
Let me be blunt. Gendered, biased evaluations are not merely “unfortunate.” They are a professional ethics problem.
Why?
They distort assessment of competence
- Underestimating women’s abilities leads to fewer leadership roles and academic positions. That is a patient care issue downstream. Diversity in leadership improves outcomes; bias undermines that.
They violate principles of justice and fairness in evaluation
- Your institution claims to evaluate on ACGME milestones and competencies. If language diverges by gender for the same behaviors, the system is not just.
They inflict psychological harm
- Repeated exposure to tone-policing and “not confident” narratives fuels imposter syndrome, burnout, and attrition, especially for women and women of color. That crosses into the territory of a hostile learning environment.
They perpetuate stereotypes for the next generation
- Medical students read the MSPE, hear how faculty talk about “strong” vs “nice” residents, and absorb the same patterns.
Ethically aware programs should be doing three things:
- Training faculty on gender and racial bias in written evaluations, with real examples from their own institution
- Standardizing narrative prompts tied to competencies, not personality
- Auditing narrative language by gender, race, and specialty and feeding that back to faculty
If your program is doing none of this, it is behind. Full stop.
| Faculty training on biased evaluations | Illustrative share of programs (%) |
|---|---|
| No training | 40 |
| One-time workshop | 30 |
| Annual training | 20 |
| Active audit & feedback | 10 |
7. Concrete Institutional Fixes (And How You Can Push For Them)
You are one resident, not the DIO. But you can still influence.
Here is what actually helps at the system level.
7.1 Better evaluation forms
The worst forms: a big empty box that says “comments.”
Better:
- Prompts aligned with competencies: “Describe this resident’s clinical reasoning with a specific example.”
- Required space for strengths and specific areas for growth.
- Separate sections for “interpersonal style” so it does not swallow everything.
Programs can also block certain words or flag them:
- Auto-flag “bossy,” “emotional,” “too sensitive,” “abrasive,” “too direct,” etc. for review by the CCC chair before they go into the MSPE or official record.
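A pre-review flag of that kind could be sketched as follows, assuming the narrative comments are available as plain strings; the flag list and the example comments are placeholders, not a validated instrument:

```python
import re

# Placeholder flag list -- a real program would curate this with its CCC.
FLAG_TERMS = ["bossy", "emotional", "too sensitive", "abrasive", "too direct"]

def flag_for_review(comment):
    """Return flagged phrases found in a comment, for human review by the CCC chair."""
    lowered = comment.lower()
    return [t for t in FLAG_TERMS if re.search(r"\b" + re.escape(t) + r"\b", lowered)]

evals = [
    "Strong clinical reasoning; led rounds effectively.",
    "Can be abrasive and too direct with consultants.",
]
for text in evals:
    hits = flag_for_review(text)
    if hits:
        print(f"REVIEW: {hits} -> {text}")
```

The point of the flag is triage, not censorship: a human still decides whether the comment describes behavior or bias before it enters the MSPE.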
You can:
- Bring published data and anonymized examples to your program’s Clinical Competency Committee or Education Committee and ask, “What are we doing to address this?”
7.2 Faculty development that is not fluffy
Bad training: 1-hour PowerPoint, everyone signs attendance sheet, nothing changes.
Better:
- Show de-identified real comments from your program where men and women were described differently for the same behavior.
- Have small-group revision exercises: attendings practice rewriting biased comments into behavior-based, gender-neutral language.
- Provide word banks of performance descriptors tied to competencies.
You can:
- Volunteer to help the GME office or your women in medicine group design such a session.
- Offer to present your own analysis (those word distributions you did) as a case study.
7.3 CCC and MSPE oversight
The Clinical Competency Committee and the Dean’s office have power. They should:
- Scrub egregiously biased language before high-stakes documents are finalized.
- Look for patterns: if one attending consistently uses “abrasive” only for women, that is a problem.
You can:
- Ask your PD or APD: “How do you review for biased language in narratives before they go into promotion decisions and MSPEs?”
- If the answer is “we don’t,” you have identified a concrete advocacy target.

8. Building Your Counter-Narrative
You cannot fully control what others write. You can control the story you tell about yourself.
I push residents to keep a “counter-narrative” file:
- Specific examples where you led codes, handled difficult conversations, made key diagnoses, ran efficient rounds
- Emails or messages of praise from staff, patients, peers
- Selected excerpts from evaluations that highlight agentic language (“excellent leadership,” “independent decision-making”)
Why?
- When an evaluation knocks you with “not confident,” you can open your file and see ten examples that contradict that.
- When you ask for letters, you can provide bullet points from this file that prompt your letter-writers to use strong, concrete language.
- When you go up for chief or leadership roles, you have evidence ready.
You are building a portfolio of who you actually are as a physician, not just who biased language says you are.

FAQ (4 Questions)
1. How can I tell if a negative comment is legitimate feedback or just gender bias?
Look for specificity and impact. Legitimate feedback names concrete behaviors (“frequently arrives 10–15 minutes late to sign-out”) and ties them to outcomes (delays, safety concerns). Biased comments are vague (“too direct,” “a bit intense”) and focus on how your presence makes someone feel rather than what you actually did. When in doubt, ask the evaluator for 1–2 specific examples. If they cannot produce any, lean toward bias.
2. Should I ever ask for an evaluation to be changed or removed?
Yes, in targeted situations. If an evaluation contains clearly inappropriate, gendered, or personal attacks (“emotional,” “too hormonal,” “not likeable”), you can and should raise this with your program director or clerkship director. Frame it as an issue of professionalism in assessment, not bruised feelings: “This language does not describe observable behavior and is potentially discriminatory. I am concerned about it being part of my permanent record.” Some programs will amend or annotate such evaluations, especially for high-stakes documents.
3. How do I talk about biased feedback in my own residency or fellowship interviews without sounding like I am making excuses?
You do not need to recite every biased comment. If asked about a “weakness” or “area of growth,” you can say: “One thing I have worked on is calibrating assertiveness in different team cultures. Early on, some feedback framed my advocacy as ‘too direct.’ I sought specific examples, worked with mentors on phrasing and timing, and now I receive consistent comments about effective, clear leadership in crises.” You acknowledge feedback, show growth, and quietly signal that you recognize bias without ranting about it.
4. I am an attending and realize I might be part of this problem. What should I change today?
Start by banning personality labels from your first drafts. Describe behaviors and their impact, tied to competencies: “She led the resuscitation, delegated tasks clearly, and adjusted the plan in real time.” Use leadership and competence language for women as freely as you do for men when it is warranted. Avoid “too” modifiers unless they are backed by concrete examples and safety implications. Finally, periodically review your own past evaluations by gender—if women are “nice and hardworking” and men are “brilliant leaders,” you have work to do. Better to confront that now than keep reproducing inequity.
With this framework, you are no longer at the mercy of whatever random adjectives show up in your file. You are starting to read, question, and reshape the story. The next step is using that clarity to choose your mentors, your environments, and eventually your own leadership style. And when you are the one writing evaluations, you will know exactly how not to repeat the same lazy patterns.