
COMSEP Journal Club
June 2021
Editors: Karen Forbes, Jon Gold and Randy Rockney


You can observe a lot just by watching


What happens under the flag of direct observation, and how that matters: A qualitative study in general practice residency. Rietmeijer CBT, Blankenstein AH, Huisman D, van der Horst HE, Kramer AWM, de Vries H, Scheele F, Teunissen PW (2021) Medical Teacher, DOI: 10.1080/0142159X.2021.1898572


Reviewed by Michele Haight


What was the study question?

How do residents perceive the role and value of direct observation of technical skills in their training?


How was the study done?

Thirty-one residents from PGY-1 and PGY-3 participated in four focus groups. Interview questions focused on direct observation of “technical skills” in a Dutch General Practice Training Program.  Technical skills were defined as “physical examinations and invasive procedures, such as injections in joints or minor surgical interventions.” Focus group data were coded and analyzed using constant comparative, thematic analyses utilizing a constructivist, grounded theory approach.


What were the results?

Residents were generally ambivalent in their experiences with direct observation. Ad hoc, unplanned, unidirectional observations had an adverse effect on the training relationship between residents and supervisors due to their unpredictability and a perceived lack of agency by the residents. Planned, bidirectional observations were the most effective type of direct observation because they supported the psychological safety of the residents and enhanced the quality of the overall training. Residents found it difficult, or not worth the effort, to provide critical feedback to supervisors about the supervisor’s skills or behaviors during direct observations. As a result, residents sometimes avoided direct observation or help-seeking, which limited their ability to achieve optimal learning.


What are the implications?

Direct observations provide great potential for enhanced learning if they are planned and provide a safe space for bidirectional feedback. Supervisors need to be aware of the context of the direct observation opportunity as well as the psycho-social dynamics of the observation in order to determine how the direct observation can best facilitate learning.


Editor’s Note:   The wisdom of the great philosopher Yogi Berra is as true in medical education as it is everywhere else.  It is hard to achieve much improvement without being observed, and yet our students (and residents) don’t get it nearly enough.   Multiple low-stakes, planned observations are challenging to implement but something to work toward.  (JG)


Can we trust our clinical assessments?


Evaluating the Reliability and Validity Evidence of the RIME (Reporter-Interpreter-Manager-Educator) Framework for Summative Assessments Across Clerkships. Ryan MS, Lee B, Richards A, et al. Academic Medicine 2021;96:256-262. DOI: 10.1097/ACM.0000000000003811


Reviewed by: Caroline Roth


What was the study question?

Is the RIME (Reporter-Interpreter-Manager-Educator) framework for student summative assessments valid and generalizable across clerkships?


How was it done?

A modified RIME framework was implemented across all third-year clerkships for summative student assessments. Individual students were assigned the classification of observer, reporter, interpreter, or manager based upon the clinical skills they demonstrated. Faculty development on the modified RIME framework was provided, although it varied by clerkship. Clerkship-level assessment data from internal medicine, neurology, obstetrics-gynecology, pediatrics, and surgery faculty were analyzed for reliability and validity using generalizability theory, which refers to the ability to generalize from a single student rating to the average rating the student would achieve on repeated assessments. Analysis included examination of factors contributing to variability of the assessment scores through generalizability (G-) and decision (D-) studies.


What were the results?

A total of 6,915 ratings were completed on 231 students across 5 clerkships. In pediatrics, students were most frequently (45.2%) assessed as achieving interpreter status, with a mean RIME score of 2.94 (score of 1 = observer, score of 4 = manager). For other clerkships, means ranged from 2.89 in internal medicine and neurology to 3.48 in obstetrics-gynecology. The variance in scores attributable to the learner ranged from 16.7% to 25.4% and was 25.1% in pediatrics; a substantial proportion of variance was attributable to the rater. The minimum number of assessments per student on the pediatric clerkship required to achieve acceptable reliability (G = 0.7) was 7; across all clerkships the minimum ranged from 7 to 12.
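For readers curious how a decision (D-) study turns variance components into a required number of assessments, the sketch below illustrates the arithmetic under a deliberately simplified one-facet (learner × rater) design. The variance values are hypothetical round numbers chosen for illustration (25% learner variance, with all remaining variance treated as relative error); they are not the paper’s actual components, though they happen to land near the pediatric figures reported above.

```python
# Illustrative D-study sketch (simplified one-facet design; hypothetical
# variance components, not the study's data).
# The projected generalizability coefficient when averaging n assessments is
#   E(rho^2) = var_learner / (var_learner + var_error / n)

def g_coefficient(var_learner: float, var_error: float, n: int) -> float:
    """Projected G-coefficient for the mean of n assessments."""
    return var_learner / (var_learner + var_error / n)

def min_assessments(var_learner: float, var_error: float,
                    target: float = 0.7, n_max: int = 100) -> int:
    """Smallest n whose projected G-coefficient meets the target."""
    for n in range(1, n_max + 1):
        # small epsilon guards against floating-point round-off at the boundary
        if g_coefficient(var_learner, var_error, n) >= target - 1e-12:
            return n
    raise ValueError("target reliability not reached within n_max")

if __name__ == "__main__":
    var_p, var_delta = 0.25, 0.75  # hypothetical: 25% learner variance
    n = min_assessments(var_p, var_delta, target=0.7)
    print(f"assessments needed for G >= 0.7: {n}")
```

With these illustrative numbers, 7 assessments are needed to reach G = 0.7, consistent with the lower end of the 7-to-12 range the authors report; clerkships with less learner variance (or more rater noise) require more assessments.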


What are the implications?

Clinical evaluation of students on the pediatric clerkship can be challenging, especially given the inherent subjectivity of these evaluations. The modified RIME framework offers the potential for a standardized evaluation tool that reaches the desired reliability with a sufficient number of faculty evaluators. However, some institutions may find it challenging to obtain the minimum number of assessments recommended for reliability.


Editor’s Comments: An important finding of this study is the number of assessments required to achieve 0.7 reliability: between 7 and 12 assessments depending on the clerkship. The authors found that students actually received between 3 and 8 assessments depending on the clerkship. This highlights a common gap in the validity of our assessment strategies and the need for more frequent assessments. (KFo)



Pass/fail frameworks foster mastery learning


Moving toward Mastery: Changes in Student Perceptions of Clerkship Assessment with Pass/Fail Grading and Enhanced Feedback.  Bullock JL, Seligman L, Lai CJ, O’Sullivan PS & Hauer KE.  Teaching and Learning in Medicine, (2021).  DOI: 10.1080/10401334.2021.1922285


Reviewed by Gary L. Beck Dallaghan


What was the study question?

Does a change in the clerkship assessment structure from a tiered framework (honors/pass/fail) to a pass/fail framework with enhanced formative feedback result in improved student perceptions of fairness and accuracy and an improved learning environment?


How was this study done?

This was a single-institution, before-after cross-sectional survey study. Students’ perceptions of the assessment system and learning environment were collected after completion of the core clerkships, for one cohort under the tiered assessment system and one cohort under the pass/fail system. Analysis involved descriptive statistics; open-ended comments were analyzed thematically.


What were the results?

Student perception of the fairness and accuracy of clerkship assessment improved, with an effect size of 0.80. Students perceived grading to be fair and transparent in the new framework. They also found the learning environment to be more mastery-oriented than under the tiered system. Narrative comments corroborated these findings but also highlighted areas of the workplace-based feedback process that could be improved.


What are the implications?

In a tiered-grading structure, students often spend more time trying to figure out how to earn top grades. This goal can drive some to avoid trying new skills for fear of looking bad. This study demonstrated that a change to pass/fail grading on the clerkships resulted in students engaging more in learning: knowing that effort and motivation to learn are enough to earn a passing grade, students can focus on mastering the material.


Editor’s Note:  Interestingly, the students’ perceptions of the fairness, accuracy and bias of the individual clinical evaluations did not improve with the new framework.  (See the review by Caroline Roth above).   It’s just that they were lower stakes.  As you might expect, some students expressed concerns about how narrative comments would be incorporated into their MSPE (dean’s letter) since the pass-fail system made them even more important. (JG)