October 2023


Attached and below please find another edition of the COMSEP Journal Club.  For the second month in a row, there is a robot reference.   It is only a matter of time now before the predictions of our April Fool’s edition come to pass…

On a lighter note, today is the 45th anniversary of the release of the Village People’s YMCA, a moment that changed dance parties and sporting events forever.


Amit, Jon and Karen

Do Unto Others…How Learners and Educators Make Sense of Maltreatment

Vanstone M, Cavanagh A, Molinaro M, Connelly CE, Bell A, Mountjoy M, Whyte R, Grierson L. How medical learners and educators decide what counts as mistreatment: A qualitative study. Med Educ. 2023 Oct;57(10):910-920. https://dx.doi.org/10.1111/medu.15065

Reviewed by Lauren K. Kahl

What was the study question?

How do medical learners and educators define and interpret maltreatment in the clinical learning environment?

How was the study done?

Thirty-one learners (medical students, residents or fellows) who self-identified as having witnessed or experienced maltreatment and 18 educators were interviewed between July 2017 and July 2019 at a single institution in Canada. Interviews were conducted by trained personnel, who first asked participants to define maltreatment and then asked them to recount experiences in three categories: those that most would agree are maltreatment, those that were suboptimal but that most would not call maltreatment, and those where people might disagree. Researchers examined the transcripts using a staged coding process and a grounded theory social constructivist approach.

What were the results?

Sixty-five percent of participants were female. Other identifying demographic information was not collected, to protect participant identity. Participants noted that different recipients could interpret the same situation in different ways, and that maltreatment represents a spectrum of behaviors. However, some important themes did arise. Organizational culture, recipient identity, and intent of the initiator were all identified as key factors in whether a particular interaction was categorized as maltreatment.

Both learners and educators reported that the context of the event (including physical setting, organizational culture, and social dynamics) was the most influential factor in determining whether something would be called maltreatment.

How can I apply this to my work in education?

There was no consensus among participants as to what qualifies as maltreatment. This makes establishing definitions for future research more challenging. Changes to organizational culture and training for educators on culturally competent learning environments are high-level interventions that may be successful. On the individual level, initiators should state the intent of their interactions with learners (e.g., patient or learner safety, learning points, constructive feedback).

Editor’s Note:  One of the most useful features of this article is a table that includes common situations in which there was frequent disagreement among participants about whether they constituted maltreatment.  Faculty might consider those ‘red flags’ and be extra careful to avoid misconceptions around those behaviors. (JG)

We regret to inform you… Sincerely, Robbie the Robot

Triola MM, Reinstein I, Marin M, Gillespie C, Abramson S, Grossman RI, Rivera R Jr. Artificial Intelligence Screening of Medical School Applications: Development and Validation of a Machine-Learning Algorithm. Acad Med. 2023 Sep 1;98(9):1036-1043. https://dx.doi.org/10.1097/ACM.0000000000005202

Reviewed by: Angela Punnett

What was the study question?

Can a machine learning decision support algorithm perform the initial screening of medical school applications to decide whom to invite for an interview? The study was conducted at the NYU Grossman School of Medicine.

How was the study done?

The virtual screener algorithm was developed and validated using structured data elements from 5 application cycles (2013-2017) of 14,555 students with accompanying faculty screening recommendations. Data elements included self-identified demographic factors, detailed academic performance and count variables (e.g., amount of research experience). Unstructured data (letters of reference, essays, research descriptions) were not included. The algorithm was then validated prospectively in tandem with usual screening practices in the 2018 application cycle. Finally, a randomized trial of the algorithm compared with faculty screeners was conducted during the 2019 application cycle with about 1500 applicants. The distribution of screening recommendations (invite for interview, hold for review, reject) and applicant characteristics were compared.

What were the results?

Retrospective and prospective validation of the algorithm yielded AUROC (area under the receiver operating characteristic curve) scores of 0.82-0.83 for invite to interview and reject recommendations (excellent performance) and 0.62-0.64 for hold for review recommendations (modest performance). In the randomized portion, there was no difference in overall recommendations between virtual and faculty screeners and no difference in rates of recommendations for applicants identifying as under-represented in medicine or female. There was a significant difference, however, for those applicants identified as economically disadvantaged, who received fewer invitations to interview from the virtual screener. Significant differences in GPA and MCAT scores were also noted, though the absolute differences were small.
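For readers less familiar with AUROC, the short Python sketch below may help with interpreting the numbers above. It is not the authors' code, and the data are entirely made up for illustration: the function name `auroc`, the list `faculty_invited` (1 = faculty recommended an interview) and the list `screener_score` (a hypothetical algorithm score per applicant) are all assumptions. It uses the standard pairwise definition: the AUROC is the probability that a randomly chosen "invite" applicant receives a higher score than a randomly chosen "not invite" applicant, so 0.5 is chance and 1.0 is perfect separation.

```python
def auroc(labels, scores):
    """AUROC via its pairwise definition: the fraction of (positive, negative)
    pairs in which the positive case has the higher score; ties count as half."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative example")

    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical toy data, NOT taken from the study.
faculty_invited = [1, 1, 1, 0, 0, 0, 1, 0]
screener_score = [0.9, 0.8, 0.4, 0.5, 0.3, 0.2, 0.7, 0.6]

print(auroc(faculty_invited, screener_score))  # 0.875
```

In this toy example the score ranks the invited applicant higher in 14 of the 16 possible invited/not-invited pairs, giving 0.875. By the same reading, the reported 0.62-0.64 for hold for review means the algorithm ranked such pairs correctly only somewhat more often than a coin flip would.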

How can this be applied to my work in education?

The virtual screener demonstrated good accuracy and allowed for consistency in application reviews. As is true of any AI algorithm, the output reflects the collective performance and biases of the training data set. The difference in invitations to interview for those applicants identified as economically disadvantaged is concerning and warrants further study. The inclusion of unstructured data review may be particularly important for the applicants assigned as ‘hold for review’ where the performance of the algorithm was less robust. I worry about what is lost when the unstructured data go unreviewed upfront and about the potential for ‘gaming’ the count variables.

Editor’s Note: This is a good proof of concept study that shows that AI can assist us with the massive number of applications programs receive for limited spots. I do not think AI will supplant the process any time soon, but it will help cut down the time needed for each review. (AKP)

Residents Teaching Clinical Skills

Kusnoor AV, Balchandani R, Pillow MT et al. Near-peers Effectively Teach Clinical Documentation Skills to Early Medical Students.  BMC Medical Education 22, 712 (2022). https://doi.org/10.1186/s12909-022-03790-0

Reviewed by: Maya Neeley

What was the study question?

Can near-peer teachers suitably teach HPI documentation skills on par with faculty?

How was it done?

Second-year students participating in a required HPI (History of Present Illness) workshop as part of a longitudinal clinical skills course were randomly assigned to faculty- or resident- (near-peer) facilitated small groups. All facilitators were volunteers and received a facilitator guide; resident teachers also participated in a Teaching in Small Groups workshop beforehand. During the workshop, students reviewed, corrected, and critiqued sample HPIs under the guidance of their facilitator. Students and residents completed post-workshop evaluations. Graded assessments following the workshop included a written HPI shortly afterward and a summative OSCE, which included a written HPI, at the end of the longitudinal skills course. Ratings of resident and faculty facilitators were compared, as was student performance on assessments of their HPI documentation.

What were the results?

A total of 365 students, 29 residents and 16 faculty participated in this study. Ninety-five percent of students agreed that the workshop was beneficial, and these responses did not differ by facilitator type. In the post-workshop evaluation, students rated resident facilitators statistically significantly higher than faculty facilitators on the following: encouraging participation, making eye contact, addressing students by name, and achieving the goals of the session, although overall mean ratings between residents and faculty were very similar. All resident facilitators agreed that the Teaching in Small Groups workshop helped them to facilitate the session and that residents should facilitate the HPI workshop again. In terms of student assessments of performance, mean scores were similar on the written HPI for students in both groups (29.3/30) and on the OSCE HPI (9/10).

How can this be applied to my work in education?

This study demonstrates that near-peer (resident) teaching on HPI documentation skills is as effective as faculty teaching on the same topic. In addition, residents appreciated the workshop they attended beforehand, and wanted to be included again in future iterations of the HPI workshop. This study provides further evidence that there are particular skills that can be taught successfully by near-peers. As demands on faculty continue to grow, the near-peer model provides a method for teaching with benefits for both students and residents interested in education.

Editor’s Note: This study demonstrates the benefits of near-peer teaching for both students and residents. Although the authors report statistical significance in facilitator evaluation scores, the scores themselves are actually very similar (4.8/5 for faculty, 4.9/5 for residents). It would be interesting to see if these differences remain if faculty also participate in a “Teaching in Small Groups” workshop prior to the session, reminding them of some of the microskills that were evaluated (e.g., eye contact, addressing students by name), as even the most experienced faculty educators benefit from reminders. (KFO)