
The belief that “videos don’t really work, questions do” is only half true—and the data shows why that oversimplification hurts a lot of students.
If you look at how top board scorers actually study, almost nobody relies purely on video or purely on question banks. But the mix is not random. There are clear patterns in how much video, how many questions, and how much spaced repetition correlate with higher scores on Step 1, Step 2, and shelf exams.
Let me walk through the numbers, not the hype.
What the Data Actually Says About Video vs Questions
We do not have a single randomized trial that says, “Watch 100 hours of Boards & Beyond and your Step 1 goes up by 12 points.” That is fantasy. What we do have:
- Large self-reported datasets from score surveys (Reddit, SDN, school advising surveys).
- Platform analytics (when companies publish “average user score” data).
- Retrospective internal studies from schools that correlate study patterns with outcomes.
Individually, each is imperfect. Together, they tell a consistent story.
Typical Score Ranges by Dominant Study Strategy
Based on aggregated patterns from school advising data and large online self-report datasets (think thousands of students across several years), this is roughly what emerges for Step 1 in the pre-pass/fail era, and now for CBSE/COMSAE and Step 2:
| Primary Strategy (Self-described) | Typical Outcome Range* |
|---|---|
| Heavy videos, minimal questions | Below average, higher fail risk |
| Balanced videos + large question volume | Above average, solid pass margin |
| Massive question volume, minimal conceptual video | Wide range; mid to high, more burnout |
| Questions + spaced repetition, targeted videos only | Consistently high, fewer low scores |
| Video-only for weak subjects + questions for others | Stable mid-to-high performance |
*“Typical” means modal cluster from advising datasets, not a guarantee.
The key point:
Video is not the main driver of high scores. But removing video entirely tends to hurt students who have conceptual gaps. Questions alone do not magically teach physiology.
Time Allocation Patterns of High Scorers
When you look at reported >250 Step 1 or >260 Step 2 scorers, certain patterns repeat:
- Question banks: 60–75% of study time
- Videos: 10–25% of time
- Spaced repetition (Anki, etc.): 10–25%
- Pure reading (review guides, First Aid, revisiting UWorld explanations): variable, usually 10–20%
Here is a more concrete view.
| Profile | Question Banks | Videos | Spaced Repetition | Reading/Notes |
|---|---|---|---|---|
| Average Scorer | 45% | 35% | 10% | 10% |
| High Scorer | 65% | 15% | 15% | 5% |
The data shows that:
- Average scorers spend a lot more time on video.
- High scorers shift that time into active retrieval (questions, flashcards).
- But high scorers almost never go zero-video. They use it surgically for difficult topics.
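If it helps to see what those percentages mean in actual hours, here is a quick sketch. It is purely illustrative: the percentages come from the table above, but the 60-hour dedicated study week is an assumed figure, not something from the data.

```python
# Convert the reported time-allocation percentages into weekly hours.
# WEEKLY_HOURS = 60 is an assumption for illustration, not a recommendation.
WEEKLY_HOURS = 60

allocations = {
    "average_scorer": {"qbank": 0.45, "videos": 0.35, "spaced_repetition": 0.10, "reading": 0.10},
    "high_scorer":    {"qbank": 0.65, "videos": 0.15, "spaced_repetition": 0.15, "reading": 0.05},
}

for profile, mix in allocations.items():
    hours = {activity: round(share * WEEKLY_HOURS, 1) for activity, share in mix.items()}
    print(profile, hours)

# average_scorer -> ~27 h questions, 21 h video
# high_scorer    -> ~39 h questions,  9 h video
```

The point of the arithmetic: at the same total workload, the high-scorer pattern buys roughly 12 extra hours of questions per week simply by cutting video time.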
Comparing Resource Types: How They Map to Score Gains
Not all “video” is the same. And not all “question banks” are equally predictive of score gains. You have to think in terms of what cognitive process a resource forces you to use.
Resource Types by Cognitive Demands
If you strip away branding and marketing, each resource category primarily does one thing:
| Resource Type | Main Cognitive Mode | Example Tools |
|---|---|---|
| Video lectures | Passive intake + light recall | Boards & Beyond, Pathoma, Sketchy |
| Question banks | Active retrieval, application | UWorld, AMBOSS, NBME forms |
| Spaced repetition (cards) | Active recall, spaced practice | Anki, Firecracker |
| Condensed notes/outlines | Compression + quick review | First Aid, personal notes |
From learning science:
- Passive intake is weak on its own for long-term retention.
- Active retrieval under test-like conditions is strongly associated with performance.
- Spaced repetition locks in details and flattens the forgetting curve.
So the data is entirely unsurprising: students who rely mostly on passive video get mediocre or unstable scores; students who emphasize retrieval-heavy tools and use video to patch understanding do better.
Actual Score Comparisons by Dominant Resource Category
Let’s make this concrete with approximate numbers. These aggregates come from blending multiple advising datasets (e.g., an internal dataset of ~400 students at one school, plus public self-report clusters). Not perfect, but consistent across cohorts.
Assume baseline USMLE Step 1 equivalent around 220 as a cohort mean.
Pre-clinical Board Prep (Step 1 / CBSE style)
| Study Mix During Dedicated (Self-reported) | Mean Score Band | Observed Spread* |
|---|---|---|
| 60–70% videos, <800 Qbank questions completed | 205–215 | Many 190–230, higher fail rate |
| ~40% videos, 1500–2000 Qbank questions | 220–230 | Most 210–240 |
| <25% videos, 2500+ Qbank questions + consistent Anki | 235–250 | Cluster 230–260+ |
| Almost no video, 3000+ questions, little Anki, heavy “cram” reading | 225–245 | Very wide 200–265 |
*“Spread” = observed minimum to maximum in that pattern bucket.
A few clear patterns:
- Doing lots of video with low question volume is consistently associated with lower averages and more failures.
- The biggest jump in scores comes from moving into the 2000+ high-quality question territory.
- Removing video entirely does not systematically improve scores; it just increases variance. Some very high scorers, some crashes.
Clinical Year / Step 2 CK and Shelf Exams
By the time students hit clerkships, they often have less flexible time. Here, the pattern shifts slightly:
- Question banks dominate even more.
- Short, targeted videos (e.g., OnlineMedEd clips, mini-lectures) fill narrow gaps.
- Long-form video series become impractical and lower-yield.
| Primary Emphasis (Clerkships + Dedicated) | Mean Score Band |
|---|---|
| Heavy videos (full series) + modest question volume (~1500) | 235–245 |
| High question volume (2500–3500) + targeted videos + Anki | 250–265 |
| Massive questions only (3500+) minimal video, minimal Anki | 245–260 |
You see the same lesson replay: questions carry the heaviest weight; video is a support, not the engine.
Where Video Learning Actually Helps (and Where It Fails)
The question is not “Does video learning work?” but “For which problems does video outperform other tools?”
High-Yield Use Cases for Video
The data and student narratives line up on a few areas where video clearly adds value:
- First-pass conceptual understanding in weak subjects. Classic cases: renal physiology, immunology, neuroanatomy, acid–base, biostatistics. Students who skipped conceptual videos in these areas and jumped straight into Qbanks often burned hundreds of questions learning basic concepts inefficiently.
- Visual and story-based memory content. Sketchy-style micro and pharm make this obvious. Not everyone loves mnemonics, but for many students, those videos convert impossible-to-memorize tables into stable stories. Survey data routinely shows high overlap between heavy Sketchy users and strong performance on micro/pharm portions.
- Structuring your mental framework. A short series like Pathoma creates a coherent narrative of pathology. High scorers frequently mention, “I watched Pathoma once, then did questions, then only re-watched chapters I missed in UWorld.” In those cases, the video is an upfront investment that makes each subsequent question more educational.
Low-Yield or Actively Harmful Video Practices
On the flip side, the data plus direct observation shows several problematic patterns:
- Watching every video at 1x, taking meticulous notes, then “planning” to do questions later. Those “later questions” rarely happen at the necessary volume. Result: great notebooks, mediocre scores.
- Endlessly rewatching the same video because it feels safe. This is comfort behavior: it gives the illusion of work with very little learning per minute after the first pass.
- Relying on video to memorize minutiae. Videos are too slow for memorizing long lists. Spaced repetition or targeted note review is much more time-efficient.
The empirical signal: in every advising dataset I have seen, students with >150 hours of dedicated-period video time and <1500 questions almost always underperform the scores predicted by their NBME practice exams.
Videos vs Qbanks vs Anki: Efficiency per Hour
Conceptually, you want to think in terms of “points per hour,” not feelings.
Based on observed patterns and typical reports:
| Study Activity | Relative Value per Hour |
|---|---|
| Videos (first pass) | 1.0 |
| Videos (rewatch) | 0.3 |
| Qbank (untimed) | 1.2 |
| Qbank ([timed blocks](https://residencyadvisor.com/resources/med-school-life/pomodoro-vs-long-block-studying-efficiency-data-for-med-students)) | 1.5 |
| Spaced repetition | 1.3 |
Interpretation (relative units, not literal point gains):
- A first-pass high-quality concept video gives “1.0 unit” of learning.
- Re-watching that same content gives very little incremental value (0.3).
- Untimed questions are better than video, but not optimal.
- Timed, exam-like blocks of questions plus review are top-tier (1.5).
- Spaced repetition falls just behind that, especially for high-detail retention.
This is exactly why top scorers naturally gravitate toward a model like:
- Early: more video to build the skeleton.
- Mid: transition hard into questions + Anki.
- Late: almost all questions + targeted flashcards, with only occasional video for patching.
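To see why that shift pays off, here is a small sketch that blends the relative per-hour values from the table with two hypothetical 40-hour weeks. The per-hour weights are the ones above; the specific mixes are assumptions I made up for comparison, not observed data.

```python
# Relative "learning value per hour" from the table above (arbitrary units).
VALUE_PER_HOUR = {
    "video_first_pass": 1.0,
    "video_rewatch": 0.3,
    "qbank_untimed": 1.2,
    "qbank_timed": 1.5,
    "spaced_repetition": 1.3,
}

def weekly_value(hours_by_activity):
    """Sum relative learning units for one week's allocation."""
    return sum(VALUE_PER_HOUR[activity] * hours for activity, hours in hours_by_activity.items())

# Two hypothetical 40-hour weeks (assumed mixes, for illustration only).
video_heavy     = {"video_first_pass": 15, "video_rewatch": 10, "qbank_untimed": 10, "spaced_repetition": 5}
retrieval_heavy = {"video_first_pass": 5, "qbank_timed": 25, "spaced_repetition": 10}

print(weekly_value(video_heavy))      # 15 + 3 + 12 + 6.5  = 36.5 units
print(weekly_value(retrieval_heavy))  # 5 + 37.5 + 13      = 55.5 units for the same 40 hours
```

Same 40 hours, roughly 50% more learning value in the retrieval-heavy week, under the relative weights above. That gap is the whole argument in miniature.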
Concrete Study Mixes and Their Typical Outcomes
To make this more practical, here are four study “profiles” I have seen repeatedly, with realistic outcomes.
1. The Video Maximalist
- 70% of time on Boards & Beyond / Pathoma / OnlineMedEd.
- 20% of time doing Qbank, mostly tutor mode, reviewing videos when confused.
- 10% of time on Anki or reading.
Typical pattern:
Feels “caught up” with content, knows “everything” in theory, but practice NBMEs underperform. Final Step scores cluster around the mean or below, with some surprises on the low end.
2. The Question Grinder with No Framework
- 80–90% of time on UWorld / AMBOSS.
- Minimal to no video. Watches free clips on YouTube when truly desperate.
- Little structured review; assumes “questions are enough.”
Typical pattern:
Huge spread. I have seen this produce both 260s and 205s. The learners with strong innate frameworks or great preclinical teaching do well. Others spend thousands of questions reinventing the wheel and never fully integrate big-picture physiology.
3. The Balanced Strategist
- Early phase: 30–40% videos, 40–50% questions, 20% spaced repetition/reading.
- Dedicated: 20% targeted video, 60–70% questions (timed blocks + intense review), 15–20% Anki.
Typical pattern:
Stable, predictable results. Practice test scores gradually rise and correlate well with final Step scores. Low failure risk, decent shot at 240+ Step 1 equivalent / 250+ Step 2 if baseline is solid.
4. The Card-Driven High Scorer
- During preclinicals: aggressive daily Anki with a mature deck (Zanki, AnKing, etc.), intermittent videos as needed.
- During dedicated: heavy UWorld + NBMEs, minimal video except targeted topics, daily maintenance Anki.
Typical pattern:
High floor, high ceiling. These students rarely bomb. Their issue tends to be burnout, not knowledge gaps. Among the >250 / >260 scorers, this pattern shows up constantly.
When You Personally Should Use More Video (and Less)
You are not an average of the dataset. But you also are not an exception by default. You match one of a few profiles.
You likely benefit from more video if:
- Your practice questions show the same conceptual misses over and over (e.g., misinterpreting Starling forces, respiratory compensation, endocrine feedback loops).
- You read UWorld explanations and still feel “foggy” on the underlying concept.
- You came from a weaker preclinical curriculum and never had a strong physiology or path foundation.
In that case, data suggests:
- 10–20 hours of targeted, high-yield videos on your weak systems can raise your question efficiency far more than another 400 blind questions.
You likely need less video and more active work if:
- You “understand” lectures but your NBMEs are stuck in the 210–225 range.
- Your watch time in a platform is high but your question volume is low.
- You feel constantly behind on Qbank completion.
For those students, every additional 10 hours of video usually correlates with little or no score movement, while adding a full pass of UWorld or AMBOSS pushes scores into the next band.
A Data-Driven Template for Board Prep Mix
Here is a generic structure that averages out well across many students. Adjust the percentages, not the order.
| Phase | Videos | Questions | Spaced Repetition |
|---|---|---|---|
| Preclinical Years | 40% | 30% | 30% |
| Early Dedicated | 25% | 50% | 25% |
| Mid Dedicated | 15% | 60% | 25% |
| Late Dedicated | 5–10% | 65–70% | 20–25% |
This is not rigid. The main directional truths the data supports:
- Video percentage should decrease as you approach your exam date.
- Question volume and difficulty should increase, with more timed blocks.
- Spaced repetition should stay relatively stable and consistent throughout.
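If you want to turn the template into hours on an actual calendar, a minimal sketch like this works. The phase names and percentages come from the table above; collapsing the Late Dedicated ranges to their midpoints, and the 70-hour example week, are my own simplifications.

```python
# Template mix by phase (percent of study time), taken from the table above.
# Late Dedicated uses midpoints of the listed ranges (assumption for illustration).
TEMPLATE = {
    "preclinical":     {"videos": 40.0, "questions": 30.0, "spaced_repetition": 30.0},
    "early_dedicated": {"videos": 25.0, "questions": 50.0, "spaced_repetition": 25.0},
    "mid_dedicated":   {"videos": 15.0, "questions": 60.0, "spaced_repetition": 25.0},
    "late_dedicated":  {"videos": 7.5,  "questions": 67.5, "spaced_repetition": 22.5},
}

def hours_for(phase, weekly_study_hours):
    """Convert a phase's percentage mix into weekly hours."""
    mix = TEMPLATE[phase]
    return {activity: round(pct / 100 * weekly_study_hours, 1) for activity, pct in mix.items()}

print(hours_for("late_dedicated", 70))
# Roughly 5 h video, 47 h questions, 16 h spaced repetition in a 70-hour week.
```

Again, adjust the percentages to your situation; the direction of the shift from phase to phase is what the data supports, not the exact numbers.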
Program-Level Data: Which Resources Correlate With Higher Averages?
When schools actually look at their own cohorts, they often create tables like this (I am synthesizing from multiple institutions’ patterns):
| Usage Pattern During Dedicated | Step 1 Equivalent Mean |
|---|---|
| <1200 Qbank questions, heavy video | 210–220 |
| 1500–2200 Qbank questions, moderate video | 220–235 |
| >2500 Qbank questions, low-to-moderate targeted video | 235–250 |
| >2500 Qbank questions, consistent Anki + targeted video | 240–255 |
Two things matter much more than the brand of video:
- Total high-quality question count.
- Use of active recall mechanisms (Anki, self-testing) beyond just questions.
Video sits on top of that as an amplifier for understanding, not as the primary engine.
Visual vs Non-Visual Learners: A Brief Reality Check
Students love saying “I’m a visual learner, so I need videos.” The data does not back up strong learning-style effects when you measure exam outcomes. What actually matters:
- Do you build a correct conceptual model?
- Do you retrieve it actively under pressure?
If watching a 20-minute animation of the RAAS system snaps things into place for you, use it. But you still have to grind through RAAS questions, not just rewatch the animation three times.
Think of video like a catalyst. It speeds the reaction, but it is not the main reactant.
Quick Example: How to Blend Resources for a Single System
Take cardiology during Step 1 prep. A data-aligned approach looks like:
- Watch 2–4 hours of tightly focused videos (e.g., general cardiology path + phys).
- Immediately do 80–120 cardiology questions over 2–3 days.
- Turn high-yield misses into cards or reinforce existing deck cards.
- For topics where you keep missing (e.g., murmurs, EKG leads), rewatch 1–2 specific short videos, not the entire block.
That sequencing is what shows up in high-scorer pathways. Video → Questions → Spaced recall → Targeted video → More questions.
Key Takeaways
Videos do work, but mostly as a conceptual scaffold. The data shows they are weak as the primary tool and strong as a targeted supplement to questions and spaced repetition.
High scorers consistently allocate most of their time—often 60–75%—to active retrieval (Qbanks, Anki), with video rarely exceeding 20–25% of total study time near the exam.
If your question volume is low (<1500–2000) and your video hours are high, you are almost certainly leaving points on the table. Shift time from passive watching into timed questions and structured recall if you care about moving your score band up.