Residency Advisor

Anki, Q-Banks, or Books: Time-Use vs Score Outcomes for USMLE

January 5, 2026
14 minute read


The standard USMLE study advice is mathematically incoherent. People talk about “doing Anki,” “crushing UWorld,” or “reading First Aid twice” as if each hour in those tools has the same yield. The data say otherwise.

If you care about Step scores, you have to think in terms of points per hour, not vibes. That means comparing Anki, question banks, and books as competing investments with measurable returns.

Let me walk through what the evidence and real-world performance patterns actually show.


What the Data Say About Each Tool

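Strip away the brand loyalty. You have three main study currencies:

  • Anki (spaced-repetition flashcards)
  • Question banks (UWorld, AMBOSS, and similar)
  • Books and PDFs (First Aid, Pathoma, and similar)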

They do not produce the same score outcomes per hour. Not even close.

1. Anki: Retention Engine, Not Score Engine (Alone)

Anki is a memory maintenance tool. The data on spaced repetition are overwhelming: repeated exposures spaced over time radically improve long-term recall. That part is settled.

But “good for memory” is not the same as “best for Step points per hour,” especially if you are inside a 3–6 month study window.

Across multiple med school cohorts (and the Step 1 Reddit score surveys that people love to dismiss but still read at 2 a.m.), a consistent pattern emerges:

  • High scorers often use Anki.
  • High scorers almost never only use Anki.
  • High scorers who lean heavily on Anki also do large volumes of questions.

When you actually look at time logs (I have seen dozens from students tracking in Toggl or Excel), a typical Anki-heavy day looks like:

  • 1.5–3 hours: reviews (400–800 cards)
  • 0–1.5 hours: new cards
  • 1–3 hours: questions/content

At the upper end, that is 3–4+ hours spent largely on maintenance. For second-year students during the year, that is fine. For an 8–10 week dedicated block, it is rarely optimal.

On average:

  • Reviewing 500 mature cards might secure facts you already know at 80–90% accuracy.
  • Doing 40–60 new questions in a solid q-bank session exposes you to new concepts, testable patterns, integrated reasoning, and exam-style traps.

If you model “new unique testable ideas learned or strengthened per hour,” q-banks beat raw Anki review almost every time during dedicated.

So where does Anki shine?

  • Long time horizon (preclinical years → Step 1)
  • Weak baseline memory or attention
  • High volume of factual micro-details (pharm, micro, certain biochem) that are hard to revisit often with questions

Anki is excellent for:

  • Consolidating lecture material and foundational concepts months before you care about the exam.
  • Locking in small but critical details that questions will not hit frequently enough.

Used correctly, Anki is the glue that prevents decay. It is not the main driver of large jumps in NBME-predicted scores once you start dedicated.

2. Question Banks: Highest Points per Hour for Most Students

The single clearest pattern in available data: question volume predicts score, up to a point, and the slope is not subtle.

Programs like UWorld and AMBOSS have internal analytics (they do not publish full regressions, but they leak high-level relationships in webinars and marketing):

  • Students who complete >75–80% of a major q-bank under exam-like conditions tend to outperform those who do not, even at the same baseline.
  • The top score deciles almost always come from students with high q-bank completion and regular self-assessment usage.

Independent student-collected datasets usually show:

  • Those using a primary q-bank (UWorld or equivalent) as the core of study have higher average Step scores (often 10–20+ points higher) compared with peers who mainly rely on passive reading.

Mechanistically, this matches what the exam actually tests:

  • Interpretation of stems under time pressure
  • Integration of pathology, physiology, pharm, statistics
  • Pattern recognition in question framing
  • Tolerance of ambiguity and distractors

You do not learn that from rereading First Aid three times. You learn it by missing “easy” UWorld questions and realizing why they were not easy.

When you quantify it, a reasonable ballpark many tutors see in practice:

  • 40–50 high-quality questions + careful review per day, 6 days a week for 8–10 weeks, works out to roughly 1900–3000 total questions.
  • More aggressive schedules hit 60–80 questions/day, which reaches roughly 2900–4800 questions over the same window.

The relationship is not linear forever (you get diminishing returns after 2500–3000 questions, especially if your review quality drops), but the first 1500 high-effort questions usually do more for your score than any book reread.
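To turn a volume target into a daily quota, divide by the number of study days. The sketch below is illustrative only: the function name is hypothetical, and the 6 study days per week is an assumption (adjust it to your own schedule).

```python
import math

def questions_per_day(target_total: int, weeks: int, days_per_week: int = 6) -> int:
    """Daily question count needed to reach target_total, rounded up."""
    study_days = weeks * days_per_week  # assumes one rest day per week by default
    return math.ceil(target_total / study_days)

# Volume thresholds mentioned in the text, over 8- and 10-week blocks.
for target in (1500, 2500, 3000):
    for weeks in (8, 10):
        quota = questions_per_day(target, weeks)
        print(f"{target} questions in {weeks} weeks: {quota}/day")
```

Under these assumptions, the first 1500 questions need only about 25–32 questions per day, which leaves room for careful review.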

3. Books and PDFs: Essential but Overused

Books still matter. But as a scaffold, not the entire building.

Core resources:

  • First Aid / Boards & Beyond notes / USMLE-Rx-style summaries
  • Pathoma / pathology texts
  • Sketchy or equivalents for micro/pharm (often consumed more like a video + visual book)

They are efficient for:

  • Building an initial map of the territory.
  • Filling in conceptual gaps exposed by questions.
  • Creating mental hooks that make Anki cards and q-bank explanations stick.

They are inefficient for:

  • Raw points per hour, once you have the basics.
  • Simulating test conditions.
  • Training decision-making under time pressure.

Pure “First Aid cover-to-cover twice” strategies correlate with more mediocre outcomes unless paired with heavy questions. You can memorize that HLA-B27 is associated with ankylosing spondylitis; that is not what drops your score. What drops your score is missing the classic 24-year-old man with low back pain that improves with exercise because you never trained seeing that vignette under time pressure.


Comparative Time vs Score: A Rough Model

Let’s stop being vague. Here is a stylized model that matches what I see across many students.

Assume you have a 10-week dedicated period: 70 study days, 8 effective hours per day → 560 hours total.

Compare three archetypes.

Study Strategy vs Typical Question Volume and Score Outcome
| Strategy Type | Hours on Q-Bank | Total Questions Done | Hours on Anki | Hours on Books | Typical Step 1 Outcome* |
|---|---|---|---|---|---|
| Q-bank-centered | 260 | 2200–2600 | 120 | 180 | 235–255+ |
| Balanced (Q + Anki + Book) | 200 | 1600–1900 | 200 | 160 | 225–245 |
| Book/Anki-heavy, few Qs | 80 | 600–800 | 260 | 220 | 205–225 |

*Ranges approximate and assume average baseline and no major test anxiety/mistiming; individual variance exists, but the ordering usually holds.

The point is not that you must hit these exact hours. The point is relative allocation:

  • The q-bank-centered plan devotes almost half of total time to active questions.
  • The book/Anki-heavy plan undershoots test-style practice by a factor of roughly 2.5–3 relative to the other two (80 q-bank hours versus 200–260).
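The relative allocation can be checked directly from the hours in the table. This is a minimal sketch; the dictionary keys and the `shares` helper are names I made up for illustration, and the hours are copied from the three archetypes above.

```python
# Hours per resource for each archetype, copied from the table above.
plans = {
    "Q-bank-centered": {"qbank": 260, "anki": 120, "books": 180},
    "Balanced": {"qbank": 200, "anki": 200, "books": 160},
    "Book/Anki-heavy": {"qbank": 80, "anki": 260, "books": 220},
}

def shares(hours: dict) -> dict:
    """Percentage of total study time spent on each resource."""
    total = sum(hours.values())
    return {name: round(100 * h / total, 1) for name, h in hours.items()}

for plan_name, hours in plans.items():
    print(plan_name, "total hours:", sum(hours.values()), shares(hours))

# Ratio of q-bank hours between the most and least question-heavy plans.
ratio = plans["Q-bank-centered"]["qbank"] / plans["Book/Anki-heavy"]["qbank"]
print("q-bank hour ratio:", ratio)
```

Each archetype sums to the same 560 hours, so the differences in outcome ranges reflect how the time is split, not how much of it there is.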

Now look at “points per 100 hours” as a mental tool. It is noisy but informative:

  • Q-bank plus good review might give you 3–6 Step points per 100 well-used hours in the mid score ranges.
  • Books beyond your second deep pass often drop to 1–2 points per 100 hours, especially if uncoupled from questions.
  • Anki in dedicated, if you walk in with a solid base, probably lands around 1–3 points per 100 hours in incremental gains, mainly by preventing decay, not lifting your ceiling.

So if you reallocate 100 hours from low-yield rereading (1–2 points) to questions + mixed review (3–6 points), the model predicts a net gain of roughly 1–5 points. Students whose rereading hours were yielding almost nothing often see more, which is consistent with the 5–10 point NBME practice test deltas that often appear when students finally shift from “content review” to “question grind.”
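The reallocation arithmetic can be made explicit. The sketch below is an illustration, not a validated model: it takes the per-100-hour ranges from the list above at face value and subtracts the yield given up from the yield gained. The function name and the interval treatment (the worst case pairs the low gain with the high loss) are my assumptions.

```python
def net_gain(hours, gain_rate, lost_rate):
    """Net points from moving `hours` from one activity to another.

    gain_rate and lost_rate are (low, high) points per 100 hours.
    Returns (worst_case, best_case).
    """
    scale = hours / 100
    worst = (gain_rate[0] - lost_rate[1]) * scale  # low gain, high loss
    best = (gain_rate[1] - lost_rate[0]) * scale   # high gain, low loss
    return worst, best

QBANK = (3, 6)      # points per 100 hours, from the list above
REREADING = (1, 2)  # points per 100 hours, books beyond the second pass

print(net_gain(100, QBANK, REREADING))
```

The net figure is smaller than the gross q-bank yield because the reread hours were not worth zero. The less those hours were producing, the closer the net gain gets to the full 3–6 points.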


Anki vs Q-Bank vs Books by Phase

You cannot answer “Anki or Q-bank or books?” without specifying where you are in the timeline.

Preclinical Years (M1–M2 Before Dedicated)

Here the optimization function is different. You are playing a long game: building a knowledge base that will be tested months to years later.

Data-backed priorities:

  • Long-term retention is king.
  • Repeated, spaced exposures to key concepts beat last-minute cramming for durable Step performance.

That makes Anki extremely efficient:

  • 60–90 minutes/day of well-constructed decks (or high-quality premades tailored, not blindly used) can lock in material across systems.
  • Coupling those with 10–20 questions/day (NBME-style or subject-specific) adds exam framing without blowing up your schedule.

Books in this phase:

  • Use Pathoma/B&B/First Aid to structure your understanding once, maybe twice.
  • Do not waste time “reading for comfort” when your Anki recall accuracy is tanking or you have done zero questions on that topic.

Rough preclinical allocation that performs well by Step outcomes:

  • 40–60%: Anki (daily, all year)
  • 20–30%: Light questions (subject-specific banks, early Rx/UWorld by systems)
  • 20–30%: Books/videos for initial learning and gap filling

Students who ignore spaced repetition entirely and cram content from scratch during dedicated usually pay a 10–20 point penalty compared with peers who have been running consistent Anki for 1–2 years.

Dedicated Step 1 / Step 2 CK Period

During dedicated, the function flips:

  • You already built a base.
  • The exam is coming in 6–10 weeks.
  • NBME practice tests and UWorld % correct now matter more than how many cards are “Mature.”

The most efficient pattern for most students is:

  • Make question banks the spine of your day.
  • Use books and short references as targeted support.
  • Use Anki primarily for weak areas or high-yield but detail-heavy fields (micro, pharm) and for concepts you keep forgetting on questions.

Concrete example of a strong 8–10 week dedicated pattern:

  • 60–70% of time: Q-banks + review (UWorld primary, AMBOSS or others as secondary if time allows)
  • 10–20%: Focused reading (First Aid / B&B notes / Pathoma tied to missed questions)
  • 10–20%: Targeted Anki (custom decks from missed questions, plus select core-tagged decks if you already used them earlier)
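Percentages are easier to act on as hours. The sketch below converts the ranges above into daily hours, assuming the 8 effective hours per day used elsewhere in this article; the function and dictionary names are hypothetical.

```python
def daily_hours(share_pct, hours_per_day=8.0):
    """Convert a (low, high) percentage range into a (low, high) hours/day range."""
    low, high = share_pct
    return (round(hours_per_day * low / 100, 1),
            round(hours_per_day * high / 100, 1))

# Percentage ranges from the dedicated-period pattern above.
allocation = {
    "Q-banks + review": (60, 70),
    "Focused reading": (10, 20),
    "Targeted Anki": (10, 20),
}

for name, share in allocation.items():
    low_h, high_h = daily_hours(share)
    print(f"{name}: {low_h}-{high_h} h/day")
```

On an 8-hour day, that means questions and their review take roughly five hours, with well under two hours each for reading and Anki.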

That is almost the inverse of what anxious students naturally want to do. Many try to keep 800+ review cards/day alive while doing a full 80-question block plus 2–3 hours of reading. They end up with:

  • Shallow question review.
  • Constant time pressure.
  • Superficial reading.
  • Exhaustion.

The result: flat NBME scores despite “studying 10–12 hours a day.”


Where Each Tool Fails (If Misused)

The disasters are predictable.

Misusing Anki

Common failure modes:

  • Treating card count as a proxy for learning. “I did 1,000 cards today” often means you clicked “Good” out of habit on things you barely processed.
  • Doing reviews mindlessly on your phone in lecture without deep recall.
  • Obsessively maintaining a gigantic backlog during dedicated instead of pivoting to questions.

The data pattern: high Anki engagement, low q-bank volume, NBME scores stuck in the 210–225 band despite “feeling like I know everything from the cards.”

Misusing Question Banks

Yes, you can ruin even the highest-yield resource.

Risk factors:

  • Doing untimed/tutor mode for months, never training real timing.
  • Skipping explanations and only checking “right/wrong.”
  • Doing questions only in isolated, low-yield categories instead of mixed blocks early enough.

The dataset here is mostly anecdotal but consistent: students who do 2000+ questions sloppily, without reflection or note capture, often underperform classmates who do 1200–1500 with relentless, painful review.

Misusing Books

The failure mode is simple:

  • Turning books into a security blanket.
  • Reading the same page-level material three times without tying it to questions or retrieval practice.

I have looked at several students’ pre-NBME schedules where 60–70% of dedicated time was “reviewing First Aid” or “re-watching B&B videos,” and their scores moved 5–8 points in 4 weeks. Then they switched to heavy questions and picked up 10–15 points in 3–4 more weeks.

Books alone do not train the exam’s decision layer. They are necessary, but not sufficient.


Sample High-Yield Allocation Templates

Let’s make this brutally concrete and assign time slices for a typical dedicated block.

Assume 8 effective hours/day, 6 days/week, for 8 weeks (384 hours total). You can scale proportionally if you have more or less time.

Template A: Q-Bank-Dominant (Target 240+ with Decent Baseline)

  • 4.0–4.5 hours/day: Q-bank blocks + review
    • 40–60 timed, random questions/day in 2 blocks.
    • Detailed review (re-reading explanations, making a few targeted notes or custom cards).
  • 1.5–2 hours/day: Targeted reading
    • Use First Aid / B&B notes / Pathoma only for topics you keep missing or never really learned.
  • 1–1.5 hours/day: Focused Anki
    • Custom cards from missed questions and very selective core decks (micro/pharm/path) that you already have experience with.

Outcome pattern: frequent NBME growth, strong confidence with test-style thinking.

Template B: Balanced Approach (Target 225–240, Mixed Baseline)

  • 3–3.5 hours/day: Q-bank + review (30–50 questions/day).
  • 1.5–2 hours/day: Anki
    • Half: old mature decks to keep from decaying.
    • Half: new cards from q-bank misses and weak topics.
  • 2–2.5 hours/day: Reading/videos tied to question misses.

This is safer for students whose knowledge base is patchy. It trades some peak score potential for lower risk of big conceptual blind spots.

Template C: Content-Rebuild Early, Then Q-Bank Shift (For Weak Baseline)

Weeks 1–2:

  • 2–3 hours/day: Focused systems-based reading/videos (e.g., Pathoma + B&B for your weakest systems).
  • 2–3 hours/day: Q-bank in tutor/system mode (20–30 questions/day).
  • 1–2 hours/day: Anki building / reviewing.

Weeks 3–8:

  • Shift to Template B or A depending on progress.

I have seen students move from NBME 180–190 → 220–230 using this staged approach, but only when they pivoted aggressively to mixed, timed questions after the first 2–3 weeks.


Visualizing the Trade-offs

Here is a stylized breakdown of how high scorers vs lower scorers often allocate their dedicated study time, based on logs and self-reported data.

Approximate Dedicated Time Allocation by Outcome Group (% of study time)

| Group | Q-banks | Anki | Books/Videos |
|---|---|---|---|
| High scorers (240+) | 55 | 20 | 25 |
| Mid scorers (220–235) | 40 | 25 | 35 |
| Low scorers (<220) | 25 | 35 | 40 |

You can argue with the exact percentages. The ranking is harder to argue with:

  • More q-bank-heavy time shares correlate with better outcomes.
  • Anki and books occupy supporting roles as scores rise.

Integrating Tools Without Wasting Hours

The optimal strategy is not “Anki vs Q-bank vs books.” It is “how do I chain these so every concept gets:

  1. An understandable explanation (book/video).
  2. First retrieval practice (Anki or written recall).
  3. Application under exam-like conditions (q-bank).
  4. Reinforcement for weak spots (targeted Anki + reading).”

One practical integration loop that works:

  1. Do a mixed q-bank block.
  2. For each missed or guessed question:
    • Read the explanation carefully.
    • Open your reference (First Aid/Pathoma/B&B notes) for that topic.
    • If the idea is not secure, create 1–3 targeted Anki cards that force recall of the key concept, not the entire question stem.
  3. Review those cards over the next days as part of your daily Anki block.

This way:

  • Questions tell you what actually matters for the exam.
  • Books clarify why that thing is true.
  • Anki stops you from forgetting it a week later.
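The loop above can be sketched as a small filter over a finished block. Everything here is hypothetical scaffolding for illustration (the class names, the fields, the placeholder prompts); the only rules taken from the text are that missed or guessed questions get followed up, and that an insecure idea yields at most 1–3 cards about the concept rather than the stem.

```python
from dataclasses import dataclass, field

@dataclass
class QuestionResult:
    topic: str
    correct: bool
    guessed: bool = False
    secure_after_review: bool = False  # idea stuck after explanation + reference
    key_concepts: list = field(default_factory=list)  # your own recall prompts

@dataclass
class Card:
    topic: str
    prompt: str

def cards_from_block(results, max_cards=3):
    """Create targeted cards for missed or guessed questions only."""
    deck = []
    for r in results:
        needs_follow_up = (not r.correct) or r.guessed
        if not needs_follow_up or r.secure_after_review:
            continue  # confident hits and already-secure ideas get no card
        for concept in r.key_concepts[:max_cards]:  # cap at 1-3 cards per question
            deck.append(Card(topic=r.topic, prompt=concept))
    return deck

block = [
    QuestionResult("Ankylosing spondylitis", correct=False,
                   key_concepts=["Young man, low back pain improving with exercise: diagnosis?"]),
    QuestionResult("Renal physiology", correct=True),
    QuestionResult("Pharmacology", correct=True, guessed=True,
                   key_concepts=["concept 1", "concept 2", "concept 3", "concept 4"]),
]
deck = cards_from_block(block)
print(len(deck), [c.topic for c in deck])
```

The cap keeps the deck lean: a guessed question with four candidate concepts still produces only three cards, and a confidently correct question produces none.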

Students who follow this loop produce lean, high-yield decks tightly tuned to their weaknesses. Students who skip this loop end up with either:

  • Massive, generic premade decks that waste hours on marginal facts.
  • Piles of “notes” in a notebook or PDF they never open again.

Where the Data Point, Summed Up

Condensed to the essentials:

  1. Question banks drive the largest score gains per hour during dedicated. Students who complete most of a high-quality q-bank under timed conditions, with deep review, consistently land in higher score brackets.

  2. Anki is a retention multiplier, not a magic score booster on its own. It pays huge dividends when started early and used to preserve knowledge, but its points-per-hour yield in dedicated is lower than q-banks unless you direct it ruthlessly at your weakest, most forgettable content.

  3. Books and videos are foundational but quickly become low-yield if they are not tethered to questions and retrieval. One careful pass plus targeted revisits beats three anxious cover-to-cover rereads.

If your current plan does not reflect those three facts, the data say you are leaving points on the table.
