
Most “constructive” written feedback for women in medicine is quietly sabotaging them.
Not because your attendings are villains. Because the language they were trained to use is biased, coded, and normalized. And unless you learn to read that code, you will underestimate yourself, internalize garbage, and let biased narratives shape your career.
Let me break this down specifically.
1. How Gendered Language Sneaks Into Evaluations
Every resident has heard some version of this:
- The male resident is “decisive, assertive, a natural leader.”
- The female resident, doing the same thing, is “abrasive, opinionated, can come across as harsh.”
Same behavior. Different adjectives. Different consequences.
This is not abstract theory. It shows up in:
- Clerkship evaluations
- MSPE (Dean’s Letter) narratives
- Residency milestone assessments
- Fellowship and job letters of recommendation
Repeatedly. Across institutions. Across specialties.
The structural problem
Most evaluation systems rely on “free text” narrative comments:
- “Strengths”
- “Areas for improvement”
- “Comments”
The prompts are vague. The writers have variable training. The narrative gets filtered through:
- Gender stereotypes (communal vs agentic traits)
- Racial stereotypes (even more damaging for women of color)
- Role expectations (“team player” vs “leader”)
- Personality “fit” with the attending
So you think you are getting objective data about your performance. You are not. You are getting performance + stereotype + personal preference + mood of the evaluator on post-call day 3.
When you know what to look for, this becomes painfully obvious.
2. The Gendered Vocabulary: What The Words Really Signal
You will not fix bias if you treat all adjectives as neutral. Some words are loaded. Some are red flags. Some are disguised compliments that cap your trajectory.
Let’s dissect the big buckets.
| Descriptor | Relative frequency (illustrative) |
|---|---|
| Assertive | 60 |
| Aggressive | 15 |
| Leader | 55 |
| Brilliant | 40 |
| Nice | 70 |
| Bossy | 20 |
| Team player | 80 |
| Hardworking | 75 |
(Think of the first four as more commonly applied to men, the last four disproportionately to women.)
2.1 Agentic vs communal traits
Agentic (power, autonomy, decision-making) terms:
- Assertive
- Confident
- Decisive
- Independent
- Leader / leadership potential
- Takes charge
Communal (support, warmth, cooperation) terms:
- Caring
- Helpful
- Supportive
- Kind
- Nurturing
- Team player
Pattern I see over and over:
Men are framed as agentic plus competent.
Women are framed as communal plus “nice.”
Neither set of words is intrinsically bad. The problem is what they do downstream:
- Agentic + competent → leadership, fellowships, academic tracks
- Communal + nice → “great to work with,” stays in the trenches, overlooked for advancement
Now layer on the negative forms.
2.2 The “too” problem
Men get credit for high-intensity traits; women get punished for the same things.
For women, these show up constantly:
- “Can be too assertive”
- “Comes across as too direct”
- “At times too confident for her level”
- “Can be too outspoken in team discussions”
The “too” is almost always about violating a gender norm, not an actual safety or professionalism issue. If there were true professionalism concerns, the language would (and should) look different:
- “Raised her voice at nursing staff when redirected multiple times.”
- “Ignored direct instruction regarding patient safety protocol.”
You see the difference. One is about tone policing. The other is about specific behavior.
2.3 Personality labels vs performance descriptors
Watch for this split:
Men:
“Excellent clinical reasoning.”
“Impressive fund of knowledge.”
“Handles complex cases independently.”
Women:
“Pleasure to work with.”
“Very nice with staff.”
“Always smiling.”
“Reliable and dependable.”
Ask a simple question: Does this sentence describe what I can do, or what I am like?
If your evaluation skews heavily toward “what you are like,” it is less about your competence and more about your role in keeping the team emotionally comfortable.
2.4 Code words that should make you pause
Here are the words and phrases I pay attention to when I review evaluations for women residents and students:
- “Emotional,” “overly emotional,” “too sensitive” → often used when a woman sets boundaries or shows normal human reaction to stress or unfairness.
- “Not confident enough,” “needs to be more confident” → sometimes true, often a signal of bias when the same attending praises a male colleague as “appropriately cautious.”
- “Can be perceived as harsh” or “might come off as abrasive” → vague, usually tone policing, rarely backed with concrete behavior examples.
- “Quiet,” “reserved,” “soft-spoken” as negatives → often penalizing women for not performing extroverted leadership.
- “Strong team player, always willing to help” as the only compliment → classic trap; you are the support structure, not the star.
And the big one:
- “Not ready for independent practice” with no specific competencies listed. That phrase can kill fellowship chances if it shows up in your MSPE or letters.
3. What High-Stakes Evaluations Actually Do With That Language
The problem is not just hurt feelings. It is structural.
Narrative language gets fed into:
- Rank lists for honors in clerkships
- AΩA membership decisions
- “Top tier” vs “middle tier” ranking by the Clinical Competency Committee (CCC)
- Dean’s Letters (MSPE) language bands (“outstanding,” “excellent,” “very good”)
- Fellowship and job selection decisions
| Narrative Pattern | Typical Interpretation | Downstream Effect |
|---|---|---|
| Agentic + specific competence | High performer, leader | Honors, leadership roles |
| Communal + vague praise | Nice, reliable worker | Solid but not “star” |
| Tone-policing with no specific examples | Questionable “fit” | Hesitation for high-visibility roles |
| “Not confident” with no objective deficiencies | Risk-averse, needs supervision | Less autonomy, weaker letters |
| Lacks leadership language entirely | Not leadership material | Overlooked for chief, committees |
In MSPE review committees, I have literally heard:
- “She sounds very nice, but I am not seeing words that signal leadership.”
- “He is described as assertive and decisive; that reads as a future chief.”
- “There is a comment about her being ‘too direct’—is that a professionalism issue?”
Same text. Wildly different readings depending on who is in the room and how much they understand gendered language.
4. Decoding Your Own Evaluations: A Stepwise Approach
Let’s get practical. You have a pile of evaluations. You sense something is off, but you cannot quite articulate it.
Here is how I would walk through it with you.
| Step | Description |
|---|---|
| Step 1 | Collect evaluations |
| Step 2 | Strip names and dates |
| Step 3 | Highlight competence words |
| Step 4 | Highlight personality words |
| Step 5 | Check agentic vs communal balance |
| Step 6 | Identify vague or coded negatives |
| Step 7 | Extract specific behavior-based feedback |
| Step 8 | Plan response and development steps |
Step 1: Separate signal from noise
Print or digitally copy your evaluations. Then:
- Highlight all comments that describe specific behaviors or skills.
  Example: “Presented a well-organized oral case with appropriate differential.”
- In another color, highlight adjectives that describe personality or style.
  Example: “Pleasant, easygoing, always smiling.”
You want to see the ratio. If behavioral / skill comments are sparse, that is a problem regardless of gender.
Step 2: Sort the adjectives into buckets
On a blank sheet (or spreadsheet), make four columns:
- Agentic positive (assertive, decisive, confident, leader)
- Communal positive (kind, supportive, team player, caring)
- Neutral descriptors (organized, punctual, thorough)
- Negative / coded (abrasive, emotional, too direct, not confident, quiet, etc.)
Count how many you see in each.
Most women I work with are drowning in communal positives, light on agentic positives, and have a few vague negatives that eat away at their confidence.
| Bucket | Example count |
|---|---|
| Agentic positive | 10 |
| Communal positive | 32 |
| Neutral | 14 |
| Negative/coded | 8 |
You can show this to your mentor or PD with a straight face. It is data.
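If you want to generate that tally automatically, the bucket-sorting step can be sketched in a few lines of Python. The term lists below are illustrative placeholders, not validated lexicons (and they match single words only, so multi-word phrases like “team player” would need extra handling); swap in the words that actually appear in your own evaluations.

```python
import re
from collections import Counter

# Illustrative single-word term lists -- extend with words from your own evals.
BUCKETS = {
    "agentic_positive": {"assertive", "decisive", "confident", "independent", "leader"},
    "communal_positive": {"kind", "supportive", "caring", "helpful", "pleasant"},
    "neutral": {"organized", "punctual", "thorough", "reliable"},
    "negative_coded": {"abrasive", "emotional", "bossy", "quiet", "intense"},
}

def tally(comments):
    """Count bucket hits across a list of free-text evaluation comments."""
    counts = Counter()
    for comment in comments:
        words = set(re.findall(r"[a-z]+", comment.lower()))
        for bucket, terms in BUCKETS.items():
            counts[bucket] += len(words & terms)
    return counts

comments = [
    "A pleasure to work with, kind and supportive with staff.",
    "Organized and thorough, though she can be quiet on rounds.",
    "Confident, decisive presenter; clear leader on the team.",
]
print(tally(comments))
```

The output is exactly the four-row count table above, ready to paste into the conversation with your mentor or PD.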
Step 3: Identify which “negatives” are actually growth points vs bias flags
Take each negative or critical comment and ask:
- Is there a specific, observable behavior described?
- Does it tie to a concrete impact on patient care, team function, or professionalism?
- Is it about what I did or how I made them feel?
Examples:
“Needs to improve time management; often started notes late, which delayed sign-out.”
→ Real. Actionable. You can work on this.
“Can come across as intense and may need to soften communication style with staff.”
→ Vague. Could be bias. Needs probing.
“Lacked confidence in procedures, often deferred to seniors even when capable.”
→ Potentially real, but you need to cross-reference with your own experience and other comments.
Step 4: Cross-check patterns across evaluators and settings
One biased attending can write nonsense. A pattern across five rotations is data.
Pay attention to:
- Does “not confident” only show up in surgical rotations with one particular attending?
- Are you “quiet” on every eval, or just in one subspecialty where the culture is toxic?
- Do some attendings explicitly praise your leadership while others call you “too direct”?
If conflicting narratives exist, you are not the problem. The environment is inconsistent. Your job is to choose whose feedback aligns with the physician you are trying to become.
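One low-tech way to surface these patterns is to group coded negatives by rotation and evaluator. A minimal sketch, assuming you have transcribed your comments into simple (rotation, evaluator, comment) tuples; the records, names, and phrase list here are hypothetical:

```python
from collections import defaultdict

# Hypothetical records: (rotation, evaluator, comment) tuples.
records = [
    ("Surgery", "Dr. A", "Not confident in the OR."),
    ("Surgery", "Dr. A", "Lacks confidence with instruments."),
    ("Medicine", "Dr. B", "Confident, decisive on rounds."),
    ("ICU", "Dr. C", "Clear leader during codes."),
]

# Placeholder phrase list of coded negatives to track.
CODED = ["not confident", "lacks confidence", "quiet", "too direct"]

by_source = defaultdict(list)
for rotation, evaluator, comment in records:
    hits = [t for t in CODED if t in comment.lower()]
    if hits:
        by_source[(rotation, evaluator)].extend(hits)

# If every coded negative clusters under one evaluator or rotation,
# that is a pattern worth naming, not a global verdict on you.
for source, hits in by_source.items():
    print(source, hits)
```

In this toy data, all the confidence language traces back to a single evaluator on one rotation, which is exactly the kind of concentration worth flagging.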
5. Responding Strategically: Turning Biased Feedback Into Leverage
You are not going to fix institutional gender bias by yourself during intern year. But you can:
- Protect your self-perception
- Shape the narrative about you
- Push your program to write better evaluations
5.1 Reframing in your own mind
First, you must stop internalizing biased adjectives as truth.
When you read:
- “Too assertive” → Translate to: “She advocated strongly; this made me uncomfortable.”
- “Not confident enough” (with no data) → “She did not perform confidence theater to my preference.”
- “Quiet” → “She does not perform extroversion; I prefer verbal dominance.”
Then ask: Is the underlying behavior actually misaligned with my goals as a physician?
If no, drop the shame. If yes, adjust the behavior, not your worth.
5.2 Asking for clarification without sounding defensive
Residents worry: “If I push back, I will be labeled difficult.” That risk is real, but there are skillful ways to do this.
You can say to an attending or PD:
“I saw in my evaluation that I ‘can come across as too direct.’ I want to understand what specific interactions or phrases stood out so I can calibrate my communication while still advocating for patients. Can you give me 1–2 concrete examples?”
You are:
- Signaling openness to growth
- Demanding specificity
- Making it harder for them to hide behind vague bias
If they cannot produce a single concrete example, that tells you everything.

5.3 Equipping your allies and mentors
Your mentors cannot advocate for you effectively if they only see the official narrative.
Show them:
- The word distribution you counted
- Representative biased phrases
- The discrepancy between your actual performance and the written comments
Then ask directly:
- “When you write letters for me, can you emphasize my clinical decision-making, independence, and leadership, not just that I am ‘great to work with’?”
- “I want to be seen as a leader. What language do you deliberately use for strong male residents that we should also use for me?”
Good mentors will get it. Many will tell you: “You are right, I have been socialized to describe women differently. Let’s fix that.”
5.4 Pre-empting bias in future evaluations
Yes, you can influence how people write about you. Subtly.
- Name your own strengths in agentic terms when you meet a new attending.
  “One of my strengths is organizing the team and making clear decisions in acute situations.”
- Ask for competency-based feedback before the written eval.
  “On this rotation I am targeting improvement in leading rounds and independent decision-making. Could you comment on those specifically in my evaluation?”
- After a strong performance moment (great code leadership, a high-stakes family meeting), say:
  “I would appreciate it if you could comment on my leadership and communication in your evaluation—that was a big growth area for me.”
You are not being manipulative. You are counterbalancing a biased system that otherwise defaults to “nice, hardworking, team player” for women.
6. The Ethics: Why Gendered Feedback Is Not Just “Unfair” but Unethical
Let me be blunt. Gendered, biased evaluations are not merely “unfortunate.” They are a professional ethics problem.
Why?
They distort assessment of competence
- Underestimating women’s abilities leads to fewer leadership roles and academic positions. That is a patient care issue downstream. Diversity in leadership improves outcomes; bias undermines that.
They violate principles of justice and fairness in evaluation
- Your institution claims to evaluate on ACGME milestones and competencies. If language diverges by gender for the same behaviors, the system is not just.
They inflict psychological harm
- Repeated exposure to tone-policing and “not confident” narratives fuels imposter syndrome, burnout, and attrition, especially for women and women of color. That crosses into the territory of a hostile learning environment.
They perpetuate stereotypes for the next generation
- Medical students read the MSPE, hear how faculty talk about “strong” vs “nice” residents, and absorb the same patterns.
Ethically aware programs should be doing three things:
- Training faculty on gender and racial bias in written evaluations, with real examples from their own institution
- Standardizing narrative prompts tied to competencies, not personality
- Auditing narrative language by gender, race, and specialty and feeding that back to faculty
If your program is doing none of this, it is behind. Full stop.
| Faculty training on biased evaluations | Illustrative share of programs (%) |
|---|---|
| No training | 40 |
| One-time workshop | 30 |
| Annual training | 20 |
| Active audit & feedback | 10 |
7. Concrete Institutional Fixes (And How You Can Push For Them)
You are one resident, not the DIO. But you can still influence.
Here is what actually helps at the system level.
7.1 Better evaluation forms
The worst forms: a big empty box that says “comments.”
Better:
- Prompts aligned with competencies: “Describe this resident’s clinical reasoning with a specific example.”
- Required space for strengths and specific areas for growth.
- Separate sections for “interpersonal style” so it does not swallow everything.
Programs can also block certain words or flag them:
- Auto-flag “bossy,” “emotional,” “too sensitive,” “abrasive,” “too direct,” etc. for review by the CCC chair before they go into the MSPE or official record.
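A pre-review flag of that kind could be sketched as follows, assuming the narrative comments are available as plain strings; the flag list and the example comments are placeholders, not a validated instrument:

```python
import re

# Placeholder flag list -- a real program would curate this with its CCC.
FLAG_TERMS = ["bossy", "emotional", "too sensitive", "abrasive", "too direct"]

def flag_for_review(comment):
    """Return flagged phrases found in a comment, for human review by the CCC chair."""
    lowered = comment.lower()
    return [t for t in FLAG_TERMS if re.search(r"\b" + re.escape(t) + r"\b", lowered)]

evals = [
    "Strong clinical reasoning; led rounds effectively.",
    "Can be abrasive and too direct with consultants.",
]
for text in evals:
    hits = flag_for_review(text)
    if hits:
        print(f"REVIEW: {hits} -> {text}")
```

The point of the flag is triage, not censorship: a human still decides whether the comment describes behavior or bias before it enters the MSPE.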
You can:
- Bring published data and anonymized examples to your program’s Clinical Competency Committee or Education Committee and ask, “What are we doing to address this?”
7.2 Faculty development that is not fluffy
Bad training: 1-hour PowerPoint, everyone signs attendance sheet, nothing changes.
Better:
- Show de-identified real comments from your program where men and women were described differently for the same behavior.
- Have small-group revision exercises: attendings practice rewriting biased comments into behavior-based, gender-neutral language.
- Provide word banks of performance descriptors tied to competencies.
You can:
- Volunteer to help the GME office or your women in medicine group design such a session.
- Offer to present your own analysis (those word distributions you did) as a case study.
7.3 CCC and MSPE oversight
The Clinical Competency Committee and the Dean’s office have power. They should:
- Scrub egregiously biased language before high-stakes documents are finalized.
- Look for patterns: if one attending consistently uses “abrasive” only for women, that is a problem.
You can:
- Ask your PD or APD: “How do you review for biased language in narratives before they go into promotion decisions and MSPEs?”
- If the answer is “we don’t,” you have identified a concrete advocacy target.

8. Building Your Counter-Narrative
You cannot fully control what others write. You can control the story you tell about yourself.
I push residents to keep a “counter-narrative” file:
- Specific examples where you led codes, handled difficult conversations, made key diagnoses, ran efficient rounds
- Emails or messages of praise from staff, patients, peers
- Selected excerpts from evaluations that highlight agentic language (“excellent leadership,” “independent decision-making”)
Why?
- When an evaluation knocks you with “not confident,” you can open your file and see ten examples that contradict that.
- When you ask for letters, you can provide bullet points from this file that prompt your letter-writers to use strong, concrete language.
- When you go up for chief or leadership roles, you have evidence ready.
You are building a portfolio of who you actually are as a physician, not just who biased language says you are.

FAQ (4 Questions)
1. How can I tell if a negative comment is legitimate feedback or just gender bias?
Look for specificity and impact. Legitimate feedback names concrete behaviors (“frequently arrives 10–15 minutes late to sign-out”) and ties them to outcomes (delays, safety concerns). Biased comments are vague (“too direct,” “a bit intense”) and focus on how your presence makes someone feel rather than what you actually did. When in doubt, ask the evaluator for 1–2 specific examples. If they cannot produce any, lean toward bias.
2. Should I ever ask for an evaluation to be changed or removed?
Yes, in targeted situations. If an evaluation contains clearly inappropriate, gendered, or personal attacks (“emotional,” “too hormonal,” “not likeable”), you can and should raise this with your program director or clerkship director. Frame it as an issue of professionalism in assessment, not bruised feelings: “This language does not describe observable behavior and is potentially discriminatory. I am concerned about it being part of my permanent record.” Some programs will amend or annotate such evaluations, especially for high-stakes documents.
3. How do I talk about biased feedback in my own residency or fellowship interviews without sounding like I am making excuses?
You do not need to recite every biased comment. If asked about a “weakness” or “area of growth,” you can say: “One thing I have worked on is calibrating assertiveness in different team cultures. Early on, some feedback framed my advocacy as ‘too direct.’ I sought specific examples, worked with mentors on phrasing and timing, and now I receive consistent comments about effective, clear leadership in crises.” You acknowledge feedback, show growth, and quietly signal that you recognize bias without ranting about it.
4. I am an attending and realize I might be part of this problem. What should I change today?
Start by banning personality labels from your first drafts. Describe behaviors and their impact, tied to competencies: “She led the resuscitation, delegated tasks clearly, and adjusted the plan in real time.” Use leadership and competence language for women as freely as you do for men when it is warranted. Avoid “too” modifiers unless they are backed by concrete examples and safety implications. Finally, periodically review your own past evaluations by gender—if women are “nice and hardworking” and men are “brilliant leaders,” you have work to do. Better to confront that now than keep reproducing inequity.
With this framework, you are no longer at the mercy of whatever random adjectives show up in your file. You are starting to read, question, and reshape the story. The next step is using that clarity to choose your mentors, your environments, and eventually your own leadership style. And when you are the one writing evaluations, you will know exactly how not to repeat the same lazy patterns.