Council on Medical Student Education in Pediatrics


Journal Club

Hasnain M, Connell KJ, Downing SM, Olthoff A and Yudkowsky R. Toward Meaningful Evaluation of Clinical Competence: The Role of Direct Observation in Clerkship Ratings. Acad Med 2004; 79(10):S21-4.

Reviewed by Bob Swantz, University of Rochester

This paper addresses the question of whether direct observation of students' clinical skills by faculty improves the reliability and validity of their ratings of student performance. The study was a retrospective review of one academic year's class of third-year students on a six-week family medicine clerkship. Faculty ratings of clerkship performance on a 16-item, behavior-based assessment tool were assigned weighted scores according to the basis for the rating: note review, case discussion, or direct observation (data sources weighted from low to high). These assessments were compared with NBME subject exam scores and the results of a fourth-year, eight-station OSCE employing standardized patients.

Of 172 clerks, 73% had experiences at residency sites, and 33% had a single preceptor (who on average spent 11.5 half-days with the student). Internal consistency on the assessment tool based on the primary evaluator was high (0.93); however, the overall coefficient of agreement among three raters was only 0.36. Subgroup analysis by data source showed that inter-rater reliability increased from 0.29 to 0.74 as more direct observation, rather than note review alone, became the basis for judgment. While overall scores on the assessment tool had a positive correlation with NBME subject exam scores and OSCE scores (r = 0.141 and r = 0.159, respectively), a similar subgroup analysis by data source showed even stronger, statistically significant correlations (NBME scores r = 0.311 and OSCE scores r = 0.423). The authors concluded that the reliability and validity of student clinical competence ratings are enhanced by direct observation.

Comment: Although the study has some limitations (extrapolating data-source ratings from the primary evaluator to all preceptors, and a small number of subjects in the subgroup analysis), it supports the argument for more direct observation to improve the reliability and validity of student evaluation. The poor inter-rater reliability when data source was not taken into account is not surprising. How many times have we seen discrepant evaluations of a student from faculty and residents, some of whom directly observe the student's clinical skills and others who infer performance from reading H&Ps and progress notes? The article makes a case for employing multiple data sources, including direct observation, to formulate judgments about trainees. The question remains: how can evaluators, within the time constraints of their practice, incorporate direct observation (even brief encounters) into their assessment of students?

(We probably all agree that increasing direct observation would improve our evaluations. (How many of you have seen Bill Raszka's brilliant demonstration, in which he role-plays a student performing a focused history and physical badly and then competently presenting it to faculty?) What I liked about this article, though, is the variety of evaluation and measurement techniques the authors use. Do you know your inter-rater reliability for your various preceptors? Do you weight evaluation scores based on the degree of observation? Do you triangulate data? These are things for all of us to think about (and get help with, as most of us are not evaluation experts). Having said that, Bob hits the other real question right on the head: who has the time or resources to increase direct observation? Some of our COMSEP members have helped us with strategies; brief structured observations and structured clinical observations (SCOs) are two examples. But the time/money question is still there. So, I have a few other questions: Do your residents and faculty receive training in how to observe? Do you have systems (e.g., checklists) in place to help them standardize observations? And what about an understanding of what to observe for in areas such as professionalism? Altruism? Teamwork? I think it's time for us to use the wisdom of crowds and brainstorm ways to increase direct observation that won't overtax our systems. I know we can do it! Karen Marcdante)
