
It is 4:45 p.m. You just finished a faculty development workshop where the DME cheerfully said, “And of course, we will evaluate this program robustly.” Everyone nodded. No one asked how. Now you are back in your office, staring at a new curriculum proposal or faculty development series you are supposed to “evaluate,” and the only thing on your paper is: pre–post survey?
You are not alone. Most medical educators were never systematically taught program evaluation. You learned bits: satisfaction surveys, test scores, maybe some Kirkpatrick levels. But putting it all together into a coherent evaluation framework? That is where most people stall.
Logic models are the missing spine. They give structure to your evaluation. They force you to be explicit about what you are doing, why you are doing it, and how you will know if it worked. Used properly, they also stop you from collecting 500 useless survey items just because SurveyMonkey is free.
Let me walk you through how to actually use logic models in practice in medical education—not as pretty diagrams for accreditation, but as workhorse tools that drive real evaluation.
1. Why Logic Models Matter More Than Another Survey
Think about the last “evaluation” you saw for a teaching program:
“On a scale of 1–5, how satisfied were you with the session?”
“Would you recommend this workshop to a colleague?”
That is Level 1 reaction data. It has its place, but it does not tell you whether residents order fewer unnecessary CTs, whether attendings give better feedback, or whether your remediation program actually improves exam pass rates.
The central problem: most evaluations are disconnected from the underlying theory of how the program is supposed to work. Logic models fix that by forcing clarity on:
- What resources you are actually investing
- What you are really doing (not just the title on the flyer)
- What you expect to happen in sequence
- Where you will look for evidence that it is happening
You stop asking, “What should we measure?” and start with, “What are we trying to change, through which mechanisms?” Evaluation stops being generic and becomes surgical.
2. The Logic Model Skeleton (and How It Really Works)
A logic model is not mystical. It is a structured story of cause and effect.
Classic components:
- Inputs (or Resources)
- Activities
- Outputs
- Outcomes (short-, medium-, long-term)
- Impact (sometimes separated, sometimes folded into long-term outcomes)
For medical education, I usually tighten this to:
- Inputs
- Activities
- Participants
- Outputs
- Outcomes (short, intermediate, long)
Let me walk through each part in a context you actually live in: a new resident communication skills curriculum.
Inputs
Inputs are what you invest.
Examples for a communication curriculum:
- Faculty time: 4 core faculty at 0.05 FTE each
- Simulation center availability: 10 half-days per year
- Standardized patients budget: $15,000 annually
- IT support for recording/feedback tools
- Protected resident time: 4 half-days per cohort
Most people skip this step. That is a mistake. Inputs tell you what this program costs. They also help you later when someone asks, “What did we get for all this faculty time and money?”
Activities
Activities are what you actually do, not what you call it.
For the same curriculum:
- Develop 6 case-based modules (breaking bad news, informed consent, managing anger, etc.)
- Monthly 2-hour small-group workshops
- Quarterly SP encounters with structured feedback
- Direct observation in clinic with mini-CEX style tools
- Faculty calibration meetings for feedback standards
Here is the trap: “Did a grand rounds” is not an activity in evaluation terms. “Delivered a 60-minute interactive session using case discussions and role plays on X topic” is.
Participants
Who actually receives or is involved in the activities?
- PGY-1 internal medicine residents (all 24 per year)
- Core faculty preceptors (8)
- SPs trained in the scenarios
You care about this because your evaluation data need to map to actual exposure. If only 12 of 24 interns attend the workshops, your outcome analysis had better reflect that.
Outputs
Outputs are immediate, countable products—process data.
For the curriculum:
- Number of workshops delivered (12 per year)
- Attendance per session (e.g., 18–22 residents per workshop)
- Number of SP encounters completed (96 per year)
- Number of direct observations documented (average of 3 per resident)
Outputs track implementation. You do not infer behavior change from outputs, but you do use them to answer, “Did we implement the curriculum as designed?”
Outcomes
Here is where it gets interesting. You break outcomes into:
- Short-term: Usually knowledge, attitudes, self-efficacy
- Intermediate: Behavior change in the real environment
- Long-term: Patient, system, or institutional impact
For the communication curriculum:
Short-term (0–6 months):
- Resident knowledge of communication frameworks (checklists, OSCE scores)
- Self-reported confidence in specific domains (e.g., disclosing errors)
- Faculty ratings of performance in simulated encounters
Intermediate (6–18 months):
- Observed use of communication skills on wards/clinic (mini-CEX, field notes)
- Frequency and quality of documented goals of care conversations
- Nursing or allied health staff ratings of resident communication
Long-term (18+ months):
- Patient satisfaction scores related to communication (HCAHPS-like domains)
- Reduced complaints related to “poor communication”
- Improved documentation of advance care planning
Now, notice something. You are already mapping to Kirkpatrick levels without saying the word “Kirkpatrick” once.
- Reaction → often embedded in outputs and a bit of short-term
- Learning → short-term outcomes
- Behavior → intermediate
- Results → long-term
Logic models and Kirkpatrick are not rivals. A good logic model operationalizes your Kirkpatrick mapping in a more concrete way.
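If you want that skeleton somewhere more durable than a whiteboard photo, plain structured data is enough. Here is a minimal sketch using the communication curriculum above; the field names and the Kirkpatrick labels are my own convention, not a standard, so adapt them freely.

```python
# Minimal sketch: a logic model captured as plain data. Field names and the
# Kirkpatrick mapping below are one reasonable convention, not a standard.
from dataclasses import dataclass, field


@dataclass
class LogicModel:
    program: str
    inputs: list[str] = field(default_factory=list)
    activities: list[str] = field(default_factory=list)
    participants: list[str] = field(default_factory=list)
    outputs: list[str] = field(default_factory=list)
    # Outcomes keyed by horizon: "short", "intermediate", "long"
    outcomes: dict[str, list[str]] = field(default_factory=dict)


comms_curriculum = LogicModel(
    program="PGY-1 communication skills curriculum",
    inputs=["4 core faculty at 0.05 FTE", "10 sim half-days/year", "$15,000 SP budget"],
    activities=["Monthly 2-hour workshops", "Quarterly SP encounters", "Direct observation (mini-CEX)"],
    participants=["24 PGY-1 IM residents per year", "8 core faculty preceptors"],
    outputs=["12 workshops/year", "96 SP encounters/year", "~3 observations per resident"],
    outcomes={
        "short": ["OSCE/checklist scores", "Self-reported confidence"],
        "intermediate": ["Observed skills on wards (mini-CEX)", "Documented goals-of-care conversations"],
        "long": ["Communication-related patient satisfaction", "Fewer communication complaints"],
    },
)

# The Kirkpatrick shorthand maps onto outcome horizons, not the other way around.
KIRKPATRICK = {"short": "Level 2 (Learning)", "intermediate": "Level 3 (Behavior)", "long": "Level 4 (Results)"}
for horizon, items in comms_curriculum.outcomes.items():
    print(f"{KIRKPATRICK[horizon]}: {', '.join(items)}")
```

The point is not the code; it is that every outcome sits under an explicit horizon, which makes the later measurement choices almost mechanical.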
3. Building a Logic Model for a Real Medical Education Program
Let us get practical. I will walk through a concrete example and then show how it shapes evaluation.
Scenario: You are launching a new faculty development program on giving feedback to learners across your department of surgery.
Step 1: Clarify the problem and desired change
Problem statement:
Residents consistently rate feedback from surgical faculty as “infrequent and nonspecific.” CCC data suggest that underperforming residents are not identified until late in PGY-2.
Desired changes:
- Faculty give more frequent, specific, and actionable feedback
- Underperforming residents are identified earlier
- Resident perception of feedback quality improves
- Ultimately, progression decisions are more accurate and timely
Step 2: Draft the logic model
Here is what a simple text-based version looks like:
Inputs:
- 0.1 FTE faculty development director
- Department Chair support (mandatory participation for new faculty)
- CME office support for credits
- Rooms, AV, evaluation software
- 3–4 experienced “feedback champion” faculty
Activities:
- Quarterly 2-hour interactive workshops on effective feedback
- Peer observation of feedback with structured tools
- One-on-one coaching sessions for selected faculty
- Distribution of quick-reference feedback tip cards
- Integration of feedback expectations in faculty onboarding
Participants:
- All new surgical faculty (first 3 years)
- Volunteer mid-career faculty
- Feedback champions (as mentors/coaches)
Outputs:
- Number of workshops delivered and attendance rates
- Number of faculty receiving 1:1 coaching
- Number of peer observations completed and debriefed
- Percentage of new hires completing the program within 12 months
Short-term outcomes:
- Increased faculty knowledge of feedback models (e.g., R2C2, SBI)
- Increased self-reported confidence in giving constructive feedback
- Improved performance in simulated feedback scenarios (video-assessed)
Intermediate outcomes:
- Higher frequency of documented feedback in evaluation forms
- Improved resident ratings of feedback specificity and usefulness
- More timely completion of end-of-rotation evaluations
Long-term outcomes:
- Earlier identification of struggling residents (by PGY-1 midyear)
- More accurate CCC decisions (reduced “late surprises”)
- Improved resident performance on milestones over time
Now you have a spine. Evaluation questions and measures plug into this.
4. Turning the Logic Model into a Concrete Evaluation Plan
A logic model without an evaluation plan is just a nice diagram for your LCME or ACGME site visit. The whole point is to drive measurement choices.
Step 1: Prioritize which outcomes you will actually measure
You cannot measure everything. Nor should you. The question is: what matters most to your stakeholders and is realistically measurable in your context?
Use a quick 2 × 2 mental grid: importance vs feasibility.
| Outcome Example | Importance | Feasibility |
|---|---|---|
| Faculty knowledge of feedback models | Medium | High |
| Resident perception of feedback quality | High | High |
| Earlier identification of struggling residents | High | Medium |
| CCC decision accuracy | High | Low |
I would prioritize:
- Resident perception of feedback quality
- Observable changes in feedback frequency/specificity
- Maybe one marker of earlier identification of problems
You can still collect simpler short-term data (knowledge, confidence) because those are cheap to measure, but do not pretend they are your main outcomes.
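If you prefer to make that triage explicit rather than purely mental, the same 2 × 2 logic fits in a few throwaway lines. The numeric scores below are an arbitrary convention, not validated weights, so treat this strictly as a sketch.

```python
# Illustrative sketch of the importance-vs-feasibility triage. The ratings are
# subjective judgments you assign with stakeholders, not measured quantities.
RATING = {"Low": 1, "Medium": 2, "High": 3}

candidate_outcomes = [
    ("Faculty knowledge of feedback models", "Medium", "High"),
    ("Resident perception of feedback quality", "High", "High"),
    ("Earlier identification of struggling residents", "High", "Medium"),
    ("CCC decision accuracy", "High", "Low"),
]


def priority(importance: str, feasibility: str) -> int:
    """Simple product score; anything rated Low on feasibility sinks quickly."""
    return RATING[importance] * RATING[feasibility]


ranked = sorted(candidate_outcomes, key=lambda o: priority(o[1], o[2]), reverse=True)
for name, imp, feas in ranked:
    print(f"{priority(imp, feas)}  {name}  (importance={imp}, feasibility={feas})")
```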
Step 2: Map evaluation questions to logic model components
Examples:
Inputs / Activities (process evaluation):
- Were planned workshops actually delivered as designed?
- Did we reach the intended faculty (new hires within 3 years)?
- What proportion of participants received coaching?
Outputs (implementation fidelity):
- How many peer observations were completed per faculty member?
- What was the attendance pattern over time?
Short-term outcomes:
- Did faculty demonstrate improved skills in simulated feedback encounters?
- Did self-efficacy scores increase meaningfully?
Intermediate outcomes:
- Do residents report more frequent and higher-quality feedback?
- Are there more narrative comments in evaluations that reflect specific, actionable feedback?
Long-term outcomes (if feasible):
- Are residents with performance issues being flagged earlier by faculty and CCC?
Each question then drives selection of specific measures.
Step 3: Select and align methods
Here is where people usually make mistakes. They pick one method (usually a survey) and try to reuse it everywhere.
For this faculty development program:
Faculty knowledge and confidence: Pre–post questionnaire with validated or at least thoughtfully constructed items.
Observed feedback skills: Rating of recorded or simulated feedback encounters using a structured rubric (e.g., checking whether the feedback includes a behavior description, learner engagement, and an agreed-upon plan).
Resident perception: Resident survey items added to existing rotation evaluations, focusing specifically on feedback frequency, specificity, and perceived helpfulness.
Behavior change in real settings:
- Audit of written narrative comments in evaluations for specificity and actionability.
- Peer observation forms coded for presence of key behaviors.
Early identification:
- Track month of first formal concern raised about underperforming residents pre- and post-implementation.
- CCC minutes coded for timing and nature of identified problems.
You do not need a randomized trial. You do need alignment: each outcome in the logic model has one or more realistic, valid-enough indicators.
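A low-tech way to enforce that alignment is to write the outcome-to-indicator-to-method map down and check it for gaps before anyone builds a survey. This is only a sketch, with entries drawn loosely from the example above; the structure is the point, not the wording.

```python
# Sketch of an alignment check: every prioritized outcome gets at least one
# indicator and method before any data collection starts. Entries are illustrative.
evaluation_matrix = {
    "Resident perception of feedback quality": [
        ("Rotation-evaluation items on feedback frequency/specificity", "survey"),
    ],
    "Observed feedback behavior in real settings": [
        ("Specificity/actionability of narrative comments", "audit of written evaluations"),
        ("Key behaviors present during peer observation", "structured observation form"),
    ],
    "Earlier identification of struggling residents": [
        ("Month of first documented concern, pre vs post", "CCC minutes review"),
    ],
    "Faculty knowledge and confidence": [],  # left empty on purpose to show the gap check
}

for outcome, indicators in evaluation_matrix.items():
    if not indicators:
        print(f"GAP: '{outcome}' has no indicator or method yet")
    else:
        for indicator, method in indicators:
            print(f"{outcome} -> {indicator} [{method}]")
```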
5. Common Mistakes Medical Educators Make with Logic Models
I have seen logic models used brilliantly. I have also seen versions that are basically decorative infographics with arrows.
Let me be blunt about the usual errors.
Mistake 1: Treating them as static accreditation artifacts
Someone on the curriculum committee creates a logic model “for LCME” or “for the self-study.” It is a one-time exercise. No one ever looks at it again.
Real use: you keep revisiting the model as you refine the program. You adjust activities when outcomes are not moving. You use it to explain to leadership what you can and cannot attribute to the program.
Mistake 2: Outcomes that are vague or unmeasurable
“Improve professionalism.”
“Enhance interprofessional collaboration.”
These phrases mean nothing if not defined. For each outcome in your logic model, you should be able to answer: “What concrete evidence would convince a skeptical colleague that this is happening?”
If you cannot answer that, you do not have an outcome. You have a wish.
Mistake 3: Confusing outputs with outcomes
“Number of workshops delivered increased from 4 to 8.”
That is nice. That is not an outcome.
If your logic model lists “more workshops” as an outcome, you have missed the point. Outputs are process markers. Outcomes are changes in learners, patients, or systems.
Mistake 4: No causal story
A logic model is not just a linear list. The implicit question is: “Why should these activities lead to these outcomes in this specific context?”
For example, your resident wellness program includes yoga sessions, financial planning, and mentoring. If you cannot articulate why those particular activities would reduce burnout in your environment, you cannot design a focused evaluation.
Mistake 5: Ignoring context and assumptions
Every logic model rests on assumptions and preconditions. You should be explicit:
- Assumes protected time is actually honored.
- Assumes faculty buy-in is ≥ 70%.
- Assumes SP program can supply enough trained actors.
When evaluation results are disappointing, you need to know whether the theory was wrong or the implementation failed because an assumption was violated.
6. Integrating Logic Models with Other Evaluation Frameworks
Logic models do not live alone. Smart medical educators stack frameworks.
Logic Model + Kirkpatrick
Use the logic model to define concrete outcomes and how they unfold over time. Then map those outcomes to Kirkpatrick levels to communicate with stakeholders who are used to that language.
Example for a simulation-based central line training program:
- Level 1 (Reaction): Trainee satisfaction with the sim session → short-term outcome in the logic model.
- Level 2 (Learning): Performance on simulation checklist → short-term outcome.
- Level 3 (Behavior): Adherence to central line bundle in ICU → intermediate outcome.
- Level 4 (Results): CLABSI rates → long-term outcome.
A simple way to watch this evaluation grow over time is to count how many outcomes at each tier you are actually measuring:

| Program phase | Short-term outcomes measured | Intermediate outcomes measured | Long-term outcomes measured |
|---|---|---|---|
| Pre-Implementation | 0 | 0 | 0 |
| Year 1 | 3 | 1 | 0 |
| Year 2 | 3 | 2 | 1 |
| Year 3 | 3 | 2 | 1 |
The logic model anchors the specifics. Kirkpatrick gives you a shorthand to summarize maturity of your evaluation.
Logic Model + CIPP (Context, Input, Process, Product)
CIPP is more management oriented. You can layer it on top:
- Context → problem statement, assumptions in your logic model
- Input → your inputs section
- Process → your activities and outputs (fidelity, reach)
- Product → outcomes and impact
When your dean wants to know if you “evaluated the whole program,” not just learner outcomes, CIPP language helps.
7. Practical Workflow: From Idea to Logic Model to Data
Let me outline a workflow you can actually use when someone dumps “evaluate this” on your desk.
| Step | Description |
|---|---|
| Step 1 | Clarify problem and goals |
| Step 2 | Draft simple logic model |
| Step 3 | Identify priority outcomes |
| Step 4 | Select indicators and methods |
| Step 5 | Plan data collection schedule |
| Step 6 | Collect and analyze data |
| Step 7 | Interpret results with stakeholders |
| Step 8 | Refine program and logic model |
Step-by-step, stripped of fluff:
- Clarify the problem and high-level goals in one paragraph. If you cannot, stop.
- Draft a rough logic model with stakeholders in 30–45 minutes on a whiteboard. Do not over-polish.
- Bold or circle 3–5 outcomes you care most about. Those are your priority evaluation targets.
- For each, define 1–2 indicators and at least one method (survey, observation, audit, etc.).
- Build a simple time-based plan: who collects what, when, and how often.
- Collect data and analyze just enough to answer your key evaluation questions.
- Sit with stakeholders, interpret together, and adjust the program or the model based on what you see.
If you are not updating the logic model based on real-world data and constraints, you are not using it—you are just decorating.
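For steps 4 and 5, the "who collects what, when" plan does not need project management software. A short structured list in a shared document, or even a few lines of code, is plenty; everything below is a placeholder, not a recommended schedule.

```python
# Illustrative data collection schedule (steps 4 and 5 of the workflow).
# Owners, measures, start dates, and intervals are placeholders, not recommendations.
from datetime import date

schedule = [
    # (measure, method, owner, first collection, repeat interval in months)
    ("Faculty feedback knowledge/confidence", "pre-post questionnaire", "program coordinator", date(2025, 9, 1), 12),
    ("Resident perception of feedback", "rotation evaluation items", "residency office", date(2025, 10, 1), 3),
    ("Narrative comment specificity", "structured audit of evaluations", "faculty dev director", date(2026, 1, 15), 6),
    ("Timing of first documented concern", "CCC minutes review", "program director", date(2026, 6, 30), 12),
]

for measure, method, owner, first, every in schedule:
    print(f"{measure}: {method}; owner = {owner}; starts {first:%b %Y}, then every {every} months")
```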
8. Example: Logic Model in Practice for a Residency QI Curriculum
To make this less abstract, let me run a full, concrete example typical for many programs.
Scenario: Internal medicine residency launches a longitudinal QI and patient safety curriculum.
Condensed logic model
Inputs:
- 0.15 FTE QI curriculum director
- Access to hospital QI data and analysts
- Protected half-day per month for PGY-2 residents
- Small project seed funds ($5,000/year)
Activities:
- Monthly didactics on QI methods (PDSA, run charts, root cause analysis)
- Resident teams complete real QI projects with faculty mentors
- Quarterly QI project work-in-progress conferences
- End-of-year poster session and institutional showcase
Participants:
- All PGY-2 residents (24)
- 8 QI mentors
- Hospital QI office representatives
Outputs:
- Number of QI sessions delivered and attendance
- Number of projects initiated and completed
- Number of scholarly products (posters, abstracts)
- Mentor-participant meeting logs
Short-term outcomes:
- Improved resident knowledge of QI concepts (quiz scores)
- Increased self-efficacy in leading QI projects
- Resident ability to correctly interpret run/control charts (assessment)
Intermediate outcomes:
- Number of QI projects that achieve predefined process targets
- Increased resident involvement in institutional QI committees
- Hospital leadership ratings of resident contributions to QI efforts
Long-term outcomes:
- Sustained improvements in selected system-level metrics (e.g., handoff errors, sepsis bundle adherence) that were targeted by resident projects
- Culture shift toward continuous improvement (staff survey subscales, maybe)
Now the evaluation plan falls out logically.
You might choose:
Short-term: Pre–post QI knowledge test and confidence scale.
Intermediate:
- Audit of QI project aims vs achieved outcomes (did they meet project targets?).
- Track resident participation in QI committees pre/post curriculum.
Long-term (selective): For 2–3 high-priority projects, track clinical process metrics over 12–24 months.
For the short-term outcome, the pre–post knowledge test might yield something like this:

| Time point | Mean QI knowledge test score |
|---|---|
| Pre-curriculum | 62 |
| Post-curriculum | 84 |

A simple pre–post comparison like that is the kind of concrete, logic model–driven result you can show your chair: “We invested 0.15 FTE and one half-day per month; here is the change in QI knowledge, plus the number of projects that moved real hospital metrics.”
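If you want one step more than the two summary numbers, report the paired gains rather than just the group means. A minimal sketch, with fabricated scores purely so the snippet runs:

```python
# Sketch: summarizing a paired pre-post QI knowledge test for the chair's slide.
# These scores are fabricated placeholders; substitute your own de-identified data.
from statistics import mean, stdev

pre_scores = [55, 60, 58, 70, 62, 64, 59, 68]    # hypothetical per-resident pre scores
post_scores = [78, 85, 80, 92, 83, 86, 81, 88]   # same residents, after the curriculum

paired_gains = [post - pre for pre, post in zip(pre_scores, post_scores)]

print(f"Mean pre:  {mean(pre_scores):.1f}")
print(f"Mean post: {mean(post_scores):.1f}")
print(f"Mean paired gain: {mean(paired_gains):.1f} (SD {stdev(paired_gains):.1f}, n = {len(paired_gains)})")
```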
9. Using Logic Models to Communicate with Leadership and Accreditors
One underrated function of logic models: politics.
When the DIO, Dean, or Chair asks, “What are we getting out of this?” you need more than learner satisfaction quotes. A clear logic model lets you:
- Show that you have a coherent theory for how the program produces outcomes
- Demonstrate that you are not over-claiming attribution for long-term institutional changes
- Justify resource requests in relation to expected outcomes
For example, you can say:
“Our QI curriculum invests 0.15 FTE and monthly resident time. In year 1, we will be able to demonstrate changes in resident QI knowledge and confidence (short-term), plus process improvements on 4–6 resident-led projects (intermediate). We do not claim sole credit for hospital-wide sepsis mortality changes, but we will track our residents’ contributions to sepsis process metrics as one piece of a larger institutional effort.”
That is much more credible than “our residents will be better at QI.”
Accreditors also like seeing that your evaluation is not random. A logic model that links curriculum to competency outcomes, with planned measures at different time points, signals that you understand continuous quality improvement in education—not just box-checking.
10. Getting Started Fast: A Minimalist Logic Model Template
If you want a lean starting point you can fill in on one page, use this:
- Problem / Need: one short paragraph.
- Inputs: 3–7 bullets describing key resources.
- Activities: 5–10 bullets of what you will actually do.
- Participants: who is supposed to be touched by the activities.
- Short-term outcomes (0–6 months): 3–5 bullets covering knowledge, skills, attitudes, and immediate behaviors.
- Intermediate outcomes (6–18 months): 3–5 bullets covering behavior in real settings and process changes.
- Long-term outcomes (18+ months): 2–4 bullets covering patient, system, or institutional impacts.
- Assumptions / Risks: 3–5 bullets on what must be true and what could derail this.
You do not need fancy software. A whiteboard picture with arrows is fine as long as you can read it and operationalize it.
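That said, if your team lives in shared documents or a repository rather than at a whiteboard, the same one-page template works as a fill-in-the-blanks structure with a trivial completeness check. This is only a sketch; the field names are made up.

```python
# A fill-in-the-blanks version of the one-page template, plus a trivial completeness check.
# Purely a sketch; the field names are invented and the counts are the ones suggested above.
TEMPLATE = {
    "problem_need": "",                   # one short paragraph
    "inputs": [],                         # 3-7 bullets
    "activities": [],                     # 5-10 bullets
    "participants": [],
    "outcomes_short_0_6mo": [],           # 3-5 bullets
    "outcomes_intermediate_6_18mo": [],   # 3-5 bullets
    "outcomes_long_18mo_plus": [],        # 2-4 bullets
    "assumptions_risks": [],              # 3-5 bullets
}


def incomplete_sections(model: dict) -> list[str]:
    """Return the names of sections that are still empty."""
    return [name for name, content in model.items() if not content]


if __name__ == "__main__":
    missing = incomplete_sections(TEMPLATE)
    print("Still blank:", ", ".join(missing) if missing else "nothing; ready to prioritize outcomes")
```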

11. Where Logic Models Fit Across Your Career as a Medical Educator
If you stay in medical education long enough, you will touch this at multiple levels:
- As a junior faculty member running a small workshop: You use a quick logic model to decide what to measure beyond smile sheets.
- As a clerkship or residency director: You use logic models to rationalize major curriculum revisions and to prioritize evaluation efforts when time is limited.
- As a vice chair for education or DME: You use logic models to compare programs, justify resource allocations, and respond to accreditation questions about effectiveness.
Learning this now pays off repeatedly. You will stop guessing which evaluation methods to use and start being able to say, “Here is our program’s logic. Here are the 3 outcomes we care about most. Here is how we will know, by when, and with what data.”

FAQ
1. Do I need a separate logic model for every single workshop or session?
No. Build logic models at the program level, not the event level. A longitudinal curriculum, a faculty development series, or a major clerkship revision deserves a logic model. An isolated grand rounds does not. For individual sessions, a simpler “learning objectives + evaluation plan” suffices.
2. How detailed should my logic model be?
If it takes more than one page to read, it is probably too detailed. The key is usable clarity, not exhaustive listing. Aim for 3–7 items in each main category (inputs, activities, outputs, outcomes). You can always maintain a more detailed internal version, but the working model that guides evaluation should stay lean.
3. How do logic models handle unintended or negative outcomes?
They do not prevent them, but they help you notice them. When your expected pathways are explicit, deviations become obvious. You can add a section for “possible unintended effects” (e.g., added faculty burden, learner burnout) and measure those intentionally. A serious program evaluation should include at least one indicator that can detect harm or tradeoffs.
4. What if long-term outcomes are influenced by many other factors? Can I still claim impact?
Be honest about attribution. Long-term outcomes in medical education are almost always multi-determined. You can say your program contributed to observed changes, supported by intermediate outcomes that are more directly linked (behavior, process metrics). Use language like “associated with” or “aligned with institutional improvements,” not “caused.”
5. How do I introduce logic models to colleagues who think they are just extra paperwork?
Do it in reverse. Start with their pain point: “We keep doing these workshops and have no idea if they matter.” Then sketch a quick, rough logic model on the board in 10 minutes and show how it clarifies which 2–3 outcome measures would actually answer their questions. When people see that logic models reduce random data collection and help defend their programs to leadership, resistance drops quickly.
Key takeaways:
- Logic models are not art projects; they are causal roadmaps that drive targeted, meaningful evaluation.
- Build them at the program level, keep them lean, and explicitly link short-, intermediate-, and long-term outcomes to concrete measures.
- Revisit and revise them as data come in. A living logic model is one of the sharpest tools you can have as a medical educator who takes program evaluation seriously.