Residency Advisor

Question Bank Triage: Exactly Which Items to Flag and Re-Review

January 5, 2026
19 minute read


The way most students “mark” questions in a bank is chaotic and nearly useless.

They flag everything that felt hard, confusing, or new. Two months later they are staring at 400+ marked questions with no idea what to do next. That is not a strategy. That is a digital junk drawer.

Let me be very concrete: you should only flag questions that change your future performance. Everything else is noise.

This is an article about that line. Exactly what deserves a flag, what does not, and how to structure re-review so it actually raises your exam score instead of just making you feel “busy.”


The Real Purpose of Flagging (Most People Get This Wrong)

Flagging is not about:

  • “Interesting” questions
  • Questions you got wrong
  • High-yield looking topics
  • Things that make you feel dumb

Flagging is about leverage: identifying items that will, if revisited, lead to disproportionate gains in points or pattern recognition.

If a flagged question does not do at least one of these, you should not have flagged it:

  1. Exposes a repeatable pattern that you tend to miss
  2. Reveals a misconception that will tank multiple questions in that domain
  3. Introduces a core fact or framework that is high-yield and easily re-tested
  4. Is a “prototype” question that many other stems are built around

You are not building a museum of your suffering. You are building a small, targeted library of items that predict future points.


The Six Categories of Questions You Should Flag

Let me break this down precisely. These are the buckets that actually matter.

Recommended Distribution of Flagged Questions by Category

Category                        Share of Flags (%)
Conceptual Framework Gaps       25
Systematic Misreads             15
Core Must-Memorize Facts        20
High-Yield Prototype Stems      20
Missed but Knew (Careless)      10
“I Guessed but Got Lucky”       10

1. Conceptual Framework Gaps (Your #1 Priority)

These are questions you miss because your understanding is broken, not your memory.

Typical pattern:

  • You read the explanation and realize, “I never actually understood how this pathway/physiology/algorithm worked.”
  • The answer choice makes sense only after you see the whole concept tied together.
  • You could talk through the idea now, but you would not have derived it on your own beforehand.

Examples:

  • Cardiology: You mis-handle a question on aortic regurgitation and after reading the explanation, you realize you never truly internalized pressure-volume loops.
  • Renal: You miss an acid–base question and see you have no mental model for respiratory vs metabolic compensation.
  • Micro/ID: You keep mixing up which bugs cause what kind of diarrhea because you never built a schema, just random fact fragments.

Flag these because revisiting them (and their associated frameworks) will fix 5–20 future questions, not just this one.

What to annotate:

  • 1–2 sentence “mental model” summary in your own words, not copied from the explanation.
  • A tiny rule or decision algorithm you can later convert into an Anki card.
  • One “anchor” clinical detail that helps you recognize the pattern again.

Test for “framework gap”:
“If I changed the numbers and surface details of this question, could I still reason my way to the right answer?”
If the answer is no → flag.


2. Systematic Misreads and Cognitive Traps

These are the “I should have gotten that” questions where, if you are honest, you did not actually understand what was being asked.

You either:

  • Glanced past a key sentence (e.g., “He stopped taking his medication 2 days ago…”),
  • Misread the question stem type (“Which of the following is the most appropriate next step?” vs “Which finding is most likely?”), or
  • Got sucked into irrelevant buzzwords and ignored the decision point.

Typical indicators:

  • You chose an answer that would be correct in a slightly different version of the scenario.
  • On re-read, the correct answer feels obvious and not actually that hard.
  • Your explanation to yourself starts with, “I did not see that they said…” or “I misread the age/setting/vitals.”

Flag some of these—but not all. You are looking for patterns of error, not every individual mistake.

Flag when:

  • The misread aligns with your known tendencies (rushing, ignoring timelines, not reading labs carefully).
  • The question highlights a particular phrasing or cue you frequently misinterpret.
  • The explanation shows a clear pivot sentence or single phrase you should have keyed into.

Your annotation should be about process, not content:

  • “Always check: what are they asking—diagnosis, mechanism, complication, next step?”
  • “When they say ‘management,’ think: stabilize vs diagnose vs treat vs screen.”
  • “Look for time course: hours vs days vs months completely changes answer.”

If you are flagging for misreads and your notes do not change how you will read the next question, you are wasting the flag.


3. Core Must-Memorize Facts You Straight-Up Did Not Know

There are some questions where the honest reality is: you did not know the fact, and there is no way to “reason” to the answer.

These can be worth flagging if they meet all of these criteria:

  1. The fact is high-yield (seen often across practice tests, NBME, UWorld, AMBOSS, etc.).
  2. The fact is specific and non-obvious.
  3. The question stem is a good vehicle for remembering and applying that fact later.

Examples that are worth flagging:

  • “What nerve is injured in a surgical ligation of the inferior thyroid artery?” (Answer: recurrent laryngeal nerve.)
  • “What enzyme deficiency leads to increased orotic acid without hyperammonemia?” (Answer: UMP synthase deficiency.)
  • “What is the prophylactic treatment for variceal bleeding in cirrhosis?” (Nonselective beta blockers like propranolol or nadolol.)

Examples not worth flagging:

  • You did not remember the name of an obscure enzyme that will never show up again.
  • Super low-yield eponyms that appear once in a random bank and nowhere else.
  • Hyper-contrived questions that force memorization of pointless minutiae.

If you would not turn the key fact from that question into a flashcard, do not flag it. Same principle.


4. High-Yield Prototype Stems

These are questions that are too good not to revisit. They encapsulate so much:

  • Classic presentation
  • High-yield mechanism
  • Distinguishing features from common distractor diagnoses
  • Board-style phrasing

I treat these as “teaching cases” inside the question bank.

Prototype examples:

  • “Painful thyroid, elevated ESR, hyperthyroid symptoms after viral illness → subacute (de Quervain) thyroiditis” case that cleanly contrasts with Graves and silent thyroiditis.
  • “Elderly patient, hip fracture after minor trauma, lytic lesion on imaging → metastatic cancer to bone” that elegantly walks you through differential for pathologic fracture.
  • “ST elevation in II, III, aVF with hypotension, clear lungs → right ventricular MI; avoid nitrates, give fluids” with perfect EKG and clinical correlation.

Flag these because:

  • Reviewing them is like reviewing an entire lecture in 2 minutes.
  • They often recur with only small variations across different exams.
  • They are excellent anchors for building mental schemas.

Annotation goal: turn the stem into a compressed “case-definition” note plus 2–3 clues that differentiate it from look-alikes.


5. Missed but Knew It (Careless Errors Only If Recurrent)

You do not flag every careless mistake—that becomes noise.

You flag when:

  • The same type of careless error has shown up multiple times in the last few days or weeks.
  • The error is linked to your habits: not checking units, not scanning all answer choices, changing correct answers at the last second.
  • The question nicely exposes that bad habit in a way that will make you wince on review.

Example patterns:

  • Always missing “except,” “least likely,” or “most contraindicated” because you rush.
  • Mis-clicking the adjacent answer because you are doing questions half-awake at midnight.
  • Changing from your first correct instinct to an overthought wrong answer.

Flag 1–2 “representative” questions for each recurring careless pattern. Those will be your reminder cases when you re-review before the exam: “Do not do this again.”


6. The “I Guessed but Got Lucky” Category (High-Utility Only)

These are questions you got right, but for the wrong reason or pure guess. If you notice that and shrug it off, you are missing gold.

Flag when:

  • You cannot, on review, clearly justify why the right answer is right and the others are wrong.
  • The explanation reveals a mechanism / pathway / association you clearly did not previously own.
  • The topic is high-yield enough that it will not be the last time you see it.

This is one of the most profitable categories to flag. Why? Because your score report will not show you these weaknesses. The question bank is the only time you see the “underlying instability” in your knowledge structure.

Annotation here:

  • One-sentence “If they mention X, think Y because Z” style rule.
  • A brief clarification of why your original reasoning was flawed.

If you repeatedly get “lucky” on the same topic, that is not luck. That is a landmine waiting for you on test day.


What You Should Not Flag (Even If It Hurts Your Ego)

Now the part students ignore: there are questions you must let go. They can be educational without being re-review worthy.


1. Pure Curveballs and Edge Cases

Every exam has a “tax” of questions that essentially no one knows and no one should care about.

Clues you are dealing with one of these:

  • The explanation itself basically says, “This is rare but sometimes tested,” with a page-long digression.
  • You have never seen this topic in any other resource, not even tangentially.
  • The clinical scenario is absurdly contrived and does not feel like real medicine.

Learning from them once is fine. Flagging and re-reviewing them repeatedly is score-negative because it displaces attention from high-yield core material.

2. Low-Yield Trivia Wrapped in Fancy Stems

Some question banks try to flex with detail. You do not need to flex with them.

Example: A detailed question about the exact biochemical step of a hormone synthesis pathway that is not part of any commonly tested conceptual framework (just an arrow buried in a pathway diagram).

Ask yourself: “If I master this, how often is that realistically going to pay off?” If the answer is “almost never,” skip the flag.

3. Questions You Understand Completely After One Good Read

Sometimes you read the explanation and immediately feel, “I get it now. This will stick.” No cognitive dissonance, no doubt.

If revisiting that exact stem would add nothing beyond what your single read already did, do not flag. Instead, turn any key fact into a flashcard if needed.

Remember: flags are for things that need multiple touches to stick or represent multiple questions’ worth of value.


How Many Questions Should You Actually Flag?

If you are flagging more than about 15–20% of your total questions, you are probably flagging too much.

Different phases should look like this:

Recommended Flagging Rates by Phase

Phase                        Flag Rate Target   Main Categories Emphasized
Early Preclinical            10–15%             Conceptual gaps, must-memorize facts
Dedicated Step/Level Prep    15–20%             Framework gaps, prototype stems, lucky guesses
Clerkship Shelf Prep         10–15%             Prototype stems, decision algorithms
Final Weeks Before Exam      5–10%              Only the most critical frameworks & patterns

If you find yourself marking 40%+:

  • You are using flags as a substitute for real study.
  • You are not differentiating between “hurt my feelings” and “matters for my score.”
  • You are going to build an unmanageable backlog that you will never fully review.

Be ruthless. Every time you hit “flag,” mentally ask: “Will future-me thank me for having to see this question again?”
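
To make the flag-rate check concrete, here is a minimal Python sketch. It assumes you tally flags per block yourself; the names `PHASE_TARGETS`, `flag_rate`, and `assess` are illustrative and not part of any question bank's software. The ranges are copied from the phase table above.

```python
# Flag-rate targets (percent of questions flagged), copied from the phase table.
# The dictionary keys are hypothetical labels chosen for this sketch.
PHASE_TARGETS = {
    "early_preclinical": (10, 15),
    "dedicated": (15, 20),
    "shelf": (10, 15),
    "final_weeks": (5, 10),
}


def flag_rate(flagged, total):
    """Return the percentage of questions that were flagged."""
    if total <= 0:
        raise ValueError("total must be positive")
    return 100.0 * flagged / total


def assess(flagged, total, phase):
    """Compare a flag rate with the target range for a study phase."""
    low, high = PHASE_TARGETS[phase]
    rate = flag_rate(flagged, total)
    if rate > high:
        return f"{rate:.1f}% flagged: above the {low}-{high}% target, prune"
    if rate < low:
        return f"{rate:.1f}% flagged: below the {low}-{high}% target"
    return f"{rate:.1f}% flagged: within the {low}-{high}% target"


print(assess(16, 40, "dedicated"))  # 16 of 40 is the 40% danger zone
print(assess(7, 40, "dedicated"))   # 7 of 40 is 17.5%
```

Running this on a week of blocks rather than a single one gives a steadier picture, since one brutal block can spike the rate without meaning much.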


Exactly How to Re-Review Flagged Questions

Flagging is only half the story. The other half is how you come back to them.

Here is a structure that works and is realistically doable.

Figure: Question Flagging and Re-Review Workflow (flowchart). Nodes: Do Timed Block; Review Explanations; Does it meet flag criteria? (decision); Flag + Concise Annotation; Move On; Add Key Facts to SRS (optional); Schedule Re-Review Session; Re-attempt Flagged Questions (No Peeking); Now Correct with Solid Reasoning? (decision); Unflag or Downgrade Priority; Refine Notes & Plan Targeted Study.

Step 1: Do Not Re-Review Immediately

You do not need to re-do flagged questions the next day. In fact, that often just tests short-term memory of the explanation.

Better spacing:

  • First re-review: 3–7 days after first encounter.
  • Second re-review (for the really stubborn ones): 2–3 weeks later or in the final 1–2 weeks before the exam.

That spacing makes sure you are testing retention, not just recency.
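
If you like to put dates on the calendar, the spacing above can be sketched in a few lines of Python. This is only an illustration: the function name `window` is made up, and it assumes the "2–3 weeks later" for the second pass is counted from the first re-review, which the schedule above does not spell out.

```python
from datetime import date, timedelta

FIRST_PASS_DAYS = (3, 7)     # 3-7 days after the first encounter
SECOND_PASS_DAYS = (14, 21)  # 2-3 weeks later (assumed: counted from the first re-review)


def window(start, day_range):
    """Return the earliest and latest date for a re-review."""
    earliest, latest = day_range
    return start + timedelta(days=earliest), start + timedelta(days=latest)


first_seen = date(2026, 1, 5)
first_pass = window(first_seen, FIRST_PASS_DAYS)
# Here the second window starts from the earliest possible first re-review.
second_pass = window(first_pass[0], SECOND_PASS_DAYS)

print("First re-review between", first_pass[0], "and", first_pass[1])
print("Second re-review between", second_pass[0], "and", second_pass[1])
```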

Step 2: Re-Attempt Before Looking at the Answer

When you come back to a flagged block:

  • Hide explanations and previous answers if your platform allows.
  • Work through the question as if it were new. No half-remembered answer-hunting.
  • Force yourself to narrate (even just in your head) your reasoning for each answer choice.

Three possible outcomes:

  1. You get it right, with clean reasoning → maybe time to unflag.
  2. You get it right, but you feel shaky → keep flagged, refine your notes.
  3. You get it wrong again → this is a major target; you need to go back to the parent concept, not just the question.
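
The three outcomes map onto a small decision function. The sketch below is one hypothetical way to write it down; `Action` and `next_action` are invented names, and "confident" stands in for your own honest judgment of whether the reasoning was clean.

```python
from enum import Enum


class Action(Enum):
    CONSIDER_UNFLAG = "consider unflagging"
    KEEP_AND_REFINE = "keep flagged, refine notes"
    RESTUDY_CONCEPT = "keep flagged, go back to the parent concept"


def next_action(correct, confident):
    """Map a re-attempt result onto one of the three outcomes above."""
    if not correct:
        # Wrong again: the problem is the concept, not this one question.
        return Action.RESTUDY_CONCEPT
    if confident:
        # Right with clean reasoning.
        return Action.CONSIDER_UNFLAG
    # Right but shaky.
    return Action.KEEP_AND_REFINE


print(next_action(correct=True, confident=False).value)
```

Note that a wrong answer always routes to restudying, no matter how confident you felt; confidence on a miss is itself a warning sign.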

Step 3: Tie Flagged Questions Back to Topics

You should not have isolated islands of “random hard questions.” You want hubs.

For each flagged item, decide:

  • What topic node does this belong to? (e.g., “Hyponatremia algorithms,” “Nephrotic vs nephritic,” “Thyroid nodules workup.”)
  • Where do you store that node? Could be:
    • Anki deck tag,
    • Notion/Obsidian note,
    • Handwritten “one-pager” for a system.

The goal: when you review “hyponatremia,” you can quickly find 2–3 prototype flagged questions on that topic to test your schema.

If you are not tying questions back into topic structures, you are just hoarding.
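
As a rough illustration of the hub idea, the Python sketch below groups flagged questions by topic node so that a topic review can pull up its prototype questions. The question IDs and topic assignments are made-up examples, not data from any real bank.

```python
from collections import defaultdict

# Hypothetical flagged items: (question ID, topic node, category 1-6).
flagged = [
    ("Q3", "Obstructive vs restrictive PFTs", 1),
    ("Q22", "STEMI localization", 4),
    ("Q41", "Hyponatremia algorithms", 1),
    ("Q57", "Hyponatremia algorithms", 4),
]

# Build one hub per topic node, each holding its flagged question IDs.
hubs = defaultdict(list)
for question_id, topic, _category in flagged:
    hubs[topic].append(question_id)

for topic, question_ids in hubs.items():
    print(f"{topic}: {', '.join(question_ids)}")
```

The same grouping works as an Anki tag, a note heading, or a spreadsheet filter; the structure matters more than the tool.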


Preventing Flag Overload: A Simple Triage Rule

Here is a practical triage rule I have used with students who flag everything that moves:

When you are tempted to flag, classify the reason on the spot:

  • 1 = Conceptual framework gap
  • 2 = Systematic misread / process error
  • 3 = High-yield must-memorize fact
  • 4 = High-yield prototype stem
  • 5 = Representative careless error
  • 6 = Lucky guess on high-yield topic

If you cannot clearly assign one of these numbers within 5 seconds, do not flag it.

And set hard caps per block:

  • 40-question block → maximum 6–7 flags
  • 60-question block → maximum 8–10 flags

It forces prioritization. If you hit your cap early and see an even better candidate later, you can always unflag something weaker.

Suggested Flag Caps per Question Block Size

Block Size      Maximum Flags
20 questions    3
40 questions    7
60 questions    10
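
For anyone who tracks flags outside the question bank, the triage rule and the caps can be sketched together in Python. This is a hypothetical helper, not a feature of any platform: `BlockFlags` is an invented name, the caps use the upper end of each suggested range, and passing `None` as the category stands in for "could not classify within 5 seconds."

```python
# The six flag categories from the triage rule above.
CATEGORIES = {
    1: "Conceptual framework gap",
    2: "Systematic misread / process error",
    3: "High-yield must-memorize fact",
    4: "High-yield prototype stem",
    5: "Representative careless error",
    6: "Lucky guess on high-yield topic",
}

# Hard caps per block size (upper end of each suggested range).
BLOCK_CAPS = {20: 3, 40: 7, 60: 10}


class BlockFlags:
    """Hypothetical helper that tracks flags for one question block."""

    def __init__(self, block_size):
        self.cap = BLOCK_CAPS[block_size]
        self.flags = {}  # question ID -> category number

    def flag(self, question_id, category):
        """Flag a question; return False if the triage rule rejects it."""
        if category not in CATEGORIES:
            return False  # no clear category, so do not flag
        if question_id not in self.flags and len(self.flags) >= self.cap:
            return False  # cap reached: unflag something weaker first
        self.flags[question_id] = category
        return True

    def unflag(self, question_id):
        self.flags.pop(question_id, None)


block = BlockFlags(20)
results = [
    block.flag("Q3", 1),
    block.flag("Q5", None),  # could not classify
    block.flag("Q7", 2),
    block.flag("Q12", 6),
    block.flag("Q18", 4),    # cap of 3 already reached
]
block.unflag("Q7")                    # drop a weaker flag
results.append(block.flag("Q18", 4))  # now there is room for the better one
print(results)
```

The cap is deliberately a hard refusal rather than a warning, which mirrors the point of the rule: a better candidate only gets in if something weaker comes out.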


How This Changes Near Exam Day

In the last 1–2 weeks before a big exam (Step, Level, Shelf, finals), your strategy shifts.

You do not want:

  • Massive new flagging.
  • Deep exploration of fringe questions.
  • Panic-driven review of every marked item ever.

You want:

  1. Rapid passes through your top-tier flagged items only:

    • Concept frameworks you still feel shaky on.
    • Prototype stems for bread-and-butter conditions.
    • Recurrent misread / process traps you know you are prone to.
  2. Aggressive pruning:

    • If you now get it quickly and can teach it to someone else in 30 seconds, unflag.
    • Only retain what still feels fragile.
  3. Integration with practice tests:

    • After NBMEs or COMSAEs, identify new flags only when they match your existing categories.
    • Do not create a “new universe” of marked chaos in the final week.

By the final 72 hours, your flagged list should feel like a shortlist of vulnerabilities and high-yield anchors, not a graveyard of every mistake you have ever made.


Putting It All Together in Daily Practice

Let me walk you through what this actually looks like in a single question block.

You do a 40-question mixed block on UWorld.

During review:

  • Q3: Missed. You realize you have no real understanding of restrictive vs obstructive lung disease PFT patterns. You annotate a 3-line schema. Flag as “1 – framework gap.”
  • Q7: Missed because you misread “most appropriate next step in management” as “most likely diagnosis.” You flag as “2 – misread” with a note: “Pause at last line. What exactly do they want?”
  • Q12: Got right by chance; did not know the mechanism of thiazide-induced hypercalcemia. Explanation is tight, high-yield. Flag as “6 – lucky on high-yield fact” and add a quick Anki card.
  • Q18: Bizarre zebra question about a rare storage disease you have never seen. You read the explanation once, shrug, and do not flag.
  • Q22: Classic STEMI localization with clean EKG and management sequence. Perfect prototype. Flag as “4 – prototype.”
  • Q31: Forgot the specific side effect profile of a common chemo agent, but the explanation makes clear that this fact shows up repeatedly. You flag as “3 – must-memorize” and tag your oncology note.
  • Q36: You change your correct answer at the last second because you overthought. You have done that three times this week. Flag one representative as “5 – careless pattern.”

End result: maybe 6–7 flags out of 40. All justified. All tied to categories. All reviewable.

When you come back to flagged questions in 5 days, you are not facing a random pile. You are facing a curated set of leverage points.


FAQ

1. Should I ever do a full block only of flagged questions?
Occasionally, yes—but sparingly. A full “flagged-only” block can be useful:

  • 1–2 times during dedicated,
  • Once in the final week before the exam,

to pressure-test your weakest areas. But do not make it your main diet. You still need exposure to new questions to broaden your pattern recognition. Think of flagged-only blocks as stress tests, not daily training.


2. What if my question bank does not have a “flag” feature?
Then you mimic it externally. Use:

  • A spreadsheet with columns: question ID, topic, category (1–6 from above), date.
  • Or a simple note app with headings for each system and a running list of “must re-do” questions.

The key is not the icon inside UWorld or AMBOSS. The key is an intentional list of stems you will see again. Do not let “my platform does not have flags” become an excuse.
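
If you prefer a file over a spreadsheet app, a flag log with the columns listed above can be kept as CSV. The Python sketch below is one possible layout; the column names, the helper functions, and the sample row are all assumptions made for illustration.

```python
import csv
import io
from datetime import date

# Columns mirror the spreadsheet suggestion: question ID, topic, category, date.
FIELDS = ["question_id", "topic", "category", "date"]


def add_flag(rows, question_id, topic, category, when):
    """Append one flagged question to the in-memory log."""
    if category not in range(1, 7):
        raise ValueError("category must be 1-6")
    rows.append({
        "question_id": question_id,
        "topic": topic,
        "category": category,
        "date": when.isoformat(),
    })


def to_csv(rows):
    """Serialize the log as CSV text that any spreadsheet app can open."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDS, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


log = []
add_flag(log, "Q7", "Reading the question stem", 2, date(2026, 1, 5))
csv_text = to_csv(log)
print(csv_text)
```

Writing `csv_text` to a file and sorting by the date column gives you a simple re-review queue.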


3. How do I combine question flags with Anki or other SRS?
Split functions cleanly:

  • Question flags = which scenarios to re-attempt.
  • Anki cards = which facts and frameworks to recall.

After reviewing a flagged item, ask:
“Is there a discrete fact or tiny decision rule here that should live as a flashcard?”
If yes, make 1–3 cards. The question remains flagged for applied practice; the cards handle pure recall.


4. I feel like I forget explanations unless I write long notes. Is that a problem?
Yes, it usually is. Long notes are rarely re-read and waste time. Your goal is not to capture the explanation. It is to extract:

  • The single key idea you were missing,
  • How to recognize when that idea is needed next time.

Force yourself to limit annotations to 1–3 short lines. If you need more writing to understand the concept, step away from the question bank and study that topic properly from a text or video, then come back.


5. What if I am scoring low—should I just flag almost everything early on?
No. When you are weak, your instinct will be to cling to every explanation. That is how you drown. Instead:

  • Do smaller blocks (e.g., 10–20 questions).
  • Spend more time on understanding between questions.
  • Still obey the flag criteria; just expect more framework gaps and must-memorize facts early.

Your percentage of flags might edge toward 20%, but if you are above that, you are not prioritizing.


6. How do I know when it is safe to unflag a question?
Two conditions:

  1. You can answer it correctly after a delay (several days) and
  2. You can explain why each incorrect option is wrong in 1–2 sentences each.

If both are true and the underlying topic feels solid across multiple questions (not just this one), you can unflag with a clear conscience. Keeping everything marked forever is just anxiety dressed up as “thoroughness.”


Key takeaways, stripped down:

  1. Only flag what will change future questions: frameworks, patterns, truly high-yield facts, and recurrent process errors.
  2. Keep flags scarce and intentional, tied to specific categories, with brutally short annotations.
  3. Re-review on a spaced schedule, re-attempting before peeking, and unflag as concepts become truly owned.