November 2025

Hello COMSEP!

A variety of topics for this month’s Journal Club, from the qualities of excellent educators to a catalog of assessment tools for documentation to one of our favorite recent topics, a comparison of human and AI feedback. Maybe that variety is appropriate for the upcoming American holiday, with its cornucopia of dishes.

Hope the shorter days allow you more time with family and friends, at home or wherever you find comfort.

Enjoy!

Karen, Amit and Jon

Part of the team, not just on the team

Haas AJ, Blackall GF, Osei-Bonsu W, Costigan H, Stuckey HL. What Really Matters: A Qualitative Study of Student Perspectives on Exceptional Teaching. Academic Medicine 100(11):1284-1290, November 2025. https://dx.doi.org/10.1097/ACM.0000000000006172

Reviewed by Drew Galligan

What were the study questions?

What makes someone an exceptional teacher in the eyes of medical students? How do these educators enact exceptional teaching behaviors in practice? 

How was the study done?

This qualitative study was conducted at Penn State College of Medicine through the Exceptional Teacher Initiative. Over a five-year period, students voluntarily submitted 3,409 narratives describing educators who had provided exceptional learning experiences. A single open-ended prompt was used: “What makes someone an exceptional teacher and how do they do it?”  From this dataset, a systematically selected subset of 872 narratives was analyzed using an inductive coding approach. Three independent coders developed a codebook and used qualitative software to identify recurring themes and subthemes. The coding process achieved substantial interrater reliability. Analysis focused on clustering similar codes into meaningful categories that reflected students’ perceptions of exceptional teaching.

What were the results?

Three major themes characterized exceptional teaching. First, students valued being challenged through progressively complex learning experiences, especially when those challenges were paired with strong support and frequent, actionable feedback. Second, students felt empowered when they were included as part of the clinical team, given autonomy in patient care responsibilities, and treated as respected colleagues. Third, students were deeply inspired by educators who demonstrated genuine enthusiasm for teaching and patient care, and who modeled humanistic and authentic professional behavior—including vulnerability and emotional connection. 

How can I apply this to my work in education?

The findings provide a clear, evidence-based framework for strengthening medical education. Drawing from the themes identified in the study, five actionable recommendations stand out: challenge students while maintaining psychological safety, provide frequent and specific feedback, offer autonomy to encourage growth, integrate students into the team, and cultivate meaningful trainee–teacher relationships. By adopting these practices, educators can create a more effective, inspiring, and supportive learning environment for medical trainees.

Editor’s Comments: The strength of this study is its use of a large and robust data set: almost 900 voluntarily submitted narratives on exceptional teachers! While the findings are not surprising, framing the results as specific recommendations was very helpful, and I would encourage you all to reflect on those recommendations and on your own educational practices. (KFo)


Bringing objective assessment to clinical documentation

Kelly WF, Hawks MK, Johnson WR, Maggio LA, Pangaro L, Durning SJ. Assessment Tools for Patient Notes in Medical Education: A Scoping Review. Academic Medicine 100(3):358-374, March 2025. https://dx.doi.org/10.1097/ACM.0000000000005886

Reviewed by Chas Hannum

What was the study question?

What are the characteristics of tools that exist for objective assessment of clinical documentation, and can they be used in competency-based education?

How was the study done?

This scoping review searched for assessment tools in articles published from database inception through November 2023. For each note-writing tool identified, the tool’s characteristics were recorded and collated, including what was being assessed, the patient setting, the properties of the tool, how it was used in practice, and learner and stakeholder perspectives. The study team identified 32 studies describing unique tools.

What were the results?

Most articles (69%) outlined an original assessment tool, with the remaining articles assessing a curriculum intervention using a tool. The tools were heterogeneous in their characteristics, including setting (inpatient vs outpatient), specialty, learner phase (undergraduate medical education [UME] vs graduate medical education [GME]), assessment goal (general note assessment vs a specific competency), and application (formative vs summative assessment). All tools used a combination of dichotomous questions, Likert scales, global ratings, and/or open-ended comments. Only 25% of tools mapped to an assessment framework (such as RIME or EPAs). Many of the tools had identifiable validity arguments: content validity was most common (81%), followed by internal structure (69%), relationships to other variables (41%), and response process (25%). The number of individual items and the number of domains/sections varied widely across tools. No tools explicitly assessed patient readability, required documentation of social determinants of health, or addressed billing elements. No studies commented on equity or fairness of the assessment for learners.

 How can I apply this to my work in education?

Applying tools to the assessment of clinical documentation can reduce subjectivity and assessment bias, though the tools identified in this study have limitations that constrain their broader incorporation in medical education settings. Regardless, using a tool to assess clinical documentation may still benefit learners and can help organize faculty assessing students around similar constructs. Interestingly, most tools were not used for summative assessment or grading.

Editor’s Note: This was an interesting review of the assessments used for clinical documentation, and it highlights that wide variation remains in how skills are assessed in medical education. (AP)


Humans finally win out…but barely

Ali M, Harbieh I, Haider KH. Bytes versus brains: A comparative study of AI-generated feedback and human tutor feedback in medical education. Med Teach. Published online June 18, 2025. https://dx.doi.org/10.1080/0142159X.2025.2519639

Reviewed by Nikita Chigullapally

What was the study question?

How does AI-generated feedback compare to human tutor feedback in terms of effectiveness and perceived value among medical students?

How was the study done?
Second-year medical students in a prescribing course received two sets of feedback on a written case assignment: one from a human tutor and one unedited response from ChatGPT (GPT-4 Turbo). Students evaluated both using an online questionnaire with Likert-scale and open-ended questions.

What were the results?
Eighty-five of 108 students (79%) completed the survey. Human tutor feedback was rated significantly higher across all domains, including clarity, actionability, comprehensiveness, accuracy, and overall usefulness (all p < 0.01). Students highlighted the value of tutors’ detailed, context-specific guidance, especially around medication choices and communication strategies. Many found ChatGPT feedback helpful, noting its clarity, structure, and accessibility. Sixty-two percent reported that the two feedback types complemented each other, with AI offering accessible phrasing and broader perspectives while tutors provided depth and curriculum alignment.

How can I apply this to my work in education?
AI-generated feedback should be integrated as a supportive tool rather than a stand-alone replacement. Its main strengths—clarity, speed, and structure—can enhance formative feedback and provide students with timely input in resource-constrained settings. Human tutors remain indispensable for ensuring accuracy and contextual nuance, but a hybrid approach could increase the frequency and richness of feedback while easing faculty workload. Medical schools could also improve AI utility by training models with curriculum-specific content and examples of high-quality feedback, then layering human review for accuracy. Such hybrid systems could transform the feedback process, allowing educators to provide both scale and depth, ultimately strengthening student learning and professional development. COMSEP collaboratives could develop prompts members could employ locally throughout the curriculum.

Editor’s Note: It is nice to know that the perspective of faculty tutors is still valued by students. In the real world, it is likely that many students will use AI feedback to improve their submissions before faculty see them, anyway. And it is increasingly likely that many faculty will use AI to help them write their feedback to the students. (JG)