Section C
The Oral Examination
Paul B. Kaplowitz, M.D.
Renee R. Jenkins, M.D.
Prasanna Nair, M.D.
Overview
As George Miller pointed out in his elegant address to the 8th Annual Research in Medical Education conference, “it seems important to start with the forthright acknowledgment that no single assessment method can provide all the data required for judgment of anything so complex as the delivery of professional services by a successful physician.”1 Proponents of oral examinations have claimed that they measure competence based on a fund of knowledge, problem-solving capability, and personal characteristics, so that they add significantly to the evaluation of individuals training to be physicians.
In 1975 the American Board of Medical Subspecialties discontinued the use of oral examinations. Other medical subspecialty boards joined the National Board of Medical Examiners in dropping them. The cost of the examination, combined with its questionable reliability, made continued use hard to justify. The American Board of Pediatrics phased out the oral examination in 1989.
There appears to be tremendous variation in what constitutes the “traditional” oral examination. Studies analyzing characteristics of oral examinations vary most often by format and number of examiners.2,3,4,5,6 Muzzin and Hart7 describe four basic formats for oral examinations:
- the interview style, in which the examinee is quizzed on general topics;
- the clinical style, in which questions specifically concern diagnosis and treatment plans for a particular patient;
- the cognitive style that requires problem solving around specific cases; and
- the role-playing style, with students assuming various “roles” with the examiner.
The lack of popularity of the oral examination in recent years has limited the appearance of studies comparing it to other types of examinations. The most recent literature focuses on subspecialists taking board examinations as its study population rather than medical students. Two studies differ in the type of oral examination that was used for comparison. Robb and Rothman4 used a clinical examination style, while Anastakis, et al.2 used a cognitive style with problem solving around four predetermined clinically oriented scenarios. Robb and Rothman4 found higher inter-rater reliabilities for the Objective Structured Clinical Examination (OSCE) as compared to the oral examination and better correlations with the In-training Evaluation Report. In comparison, Anastakis, et al.2 found high inter-rater reliability for the Structured Oral Examination and moderately positive and statistically significant correlations with scores on the OSCE and a multiple-choice examination.
The benefits of the oral examination as a teaching tool when used with students are also a consideration. Although students expressed dissatisfaction with issues related to scoring, Vu et al.8 reported that students felt that the orals were a fairer evaluation of the student’s data base and provided an opportunity for immediate feedback in a way that supported further learning.
Programs which use the oral examination as one of their clinical assessment measures need to be aware of biases such as the “dove/hawk” effect, characterizing some examiners as more lenient or tough than others, the “halo effect”, scoring an overall high or low mark based on carryover from a score in one section of the exam, and others carefully described in Muzzin and Hart7. Using pairs of examiners rather than individuals or teams of examiners appears to increase inter-rater reliability.
Description and Rationale for Use
As discussed above, there are four basic formats for oral exams: the interview style (general topics); the clinical style (discussing diagnosis and treatment of a specific patient); the cognitive style (problem solving); and the role-playing style. From the survey of clerkship directors completed in September, 1992 (see Appendix I), it appeared that the clinical and cognitive styles were most often used, frequently in combination. In 50% of programs which use oral exams, the student is asked to present and discuss a patient they have worked up during the rotation. This allows assessment of how well a student can organize a case with which he/she is familiar, discuss the differential diagnosis, and demonstrate in-depth knowledge concerning the patient’s problem and its management.
The cognitive format is used in some manner by 86% of clerkships which use oral exams. The student is given a brief clinical scenario and asked to collect information by history and physical exam, generate a problem list, and propose tests to make a specific diagnosis. About half the clerkships provide a list of cases made up by the clerkship committee, whereas in some programs, examiners are allowed to use whatever cases they wish. This format is intended to test a student’s reasoning skills and clinical judgment in a way that multiple-choice exams cannot.
Strengths and Weaknesses
Strengths: The main advantage of the oral exam is that the examiner is able to ask students a series of related questions which can test not just their knowledge base, but how well they can apply this knowledge to a clinical situation. If the student is presenting one of their cases, the examiner can assess their ability to organize information and present it in a clear and logical fashion. The examiner can ask about the differential diagnosis of the patient’s complaint to see if the student has read about and understands the features which distinguish the patient’s diagnosis from others which needed to be considered. When students are asked to discuss an unknown case, their ability to gather relevant information by history and physical exam, their skill in developing a focused differential, and the appropriateness of the proposed diagnostic studies can all be evaluated. Factual knowledge about the case is important, but the examiner can still evaluate the student’s problem-solving ability even if there are gaps in knowledge. If the student cannot provide the desired information the first time, it is possible to provide a hint or rephrase the question.
From the students’ perspective, the main advantage of oral exams is that they get immediate feedback on their responses. They may find, for example, that the aggressive diagnostic work-up they proposed for a certain complaint, while appropriate for a sick inpatient, is excessive for a non-acutely ill outpatient. They often complete the exam knowing more pediatrics than when they started. With multiple-choice exams such as those provided by the NBME, the element of feedback is lacking.
Another advantage of oral exams is that they provide the clerkship committee with additional information upon which to base a grading decision for a student who may be marginal, and they help identify students who have poor clinical reasoning abilities.
Weaknesses: The main difficulty with oral exams is that they are hard to grade in a standardized manner when so many different faculty are involved. Some faculty are inherently more difficult or more demanding graders than others. When students are questioned on cases they have seen, different examiners may have different expectations as to the depth of knowledge a third-year student should have. Some may put undue emphasis on recall of certain facts, test results or less common items in the differential diagnosis, and less emphasis on the logic and thought processes behind the case presentation. It is therefore very difficult to standardize the grading of this type of question. One way to do a better job would be to develop a faculty consensus on what constitutes an adequate patient presentation and get examiners to apply these criteria in the oral exam setting.
A second problem is variability in the time different faculty devote to the exam. Faculty who are very busy and less committed to teaching may try to complete an exam in 20 minutes, whereas other examiners may take an hour or more. This problem can be minimized by only utilizing faculty who are willing and able to devote adequate time to each exam and setting clear time limits for all examiners (e.g. 45-60 min.).
A third problem is that when faculty ask the students to work through unknown cases, the difficulty of these cases and the extent to which the student has been exposed to the material vary greatly. It is therefore critical to get faculty who do oral exams together and develop a departmental consensus on which cases or types of cases are suitable and what level of knowledge is appropriate for a third-year medical student.
Even when oral exams are given in a more standardized manner, it must be recognized that the number of cases discussed is of necessity quite small (usually 2-4). Therefore, it is not possible to obtain a reliable sample of a student’s knowledge base, which could in some cases be much more deficient in the areas tested than in those not tested. The only way to remedy this problem is to give much longer exams (e.g. 3 hours vs. 45-60 minutes), which is impractical. It may therefore be argued that the oral exam at best provides supplemental information about student performance and should constitute only a minor part of the final evaluation.
Another potential weakness of oral exams is that some students, due to a high level of anxiety, do not perform up to their potential when they are tested orally. Though this problem affects only a small proportion of students, it is important for faculty to be aware that some students have this problem and make a special effort to put them at ease during the exam.
Implementation Strategies
Given the clear strengths of oral examinations, it is surprising that only a minority of pediatric clerkships employ them. As detailed in the survey (Appendix I), one reason is the difficulty of standardizing the exams, which will be discussed below. The other main reason, given by 72% of clerkship directors, is that administering oral exams takes a lot of faculty time. If a program has a large full-time faculty, the exams can be spread out over many individuals, reducing the number which any one person needs to do. However, faculty who are extremely busy or have little commitment to student education do a poor job testing students and may give them little useful feedback. For a program to implement oral exams, there needs to be a strong commitment from the chairman that this represents an important educational activity which is deserving of scarce faculty time.
At most medical centers, oral exams are given by a large number of full-time and clinical faculty. Using members of the clerkship committee only, while workable at programs with a relatively small number of students, places a large time burden on a small number of individuals. If having a diverse group of faculty give examinations is to be fair, the faculty need very specific guidelines on what the students are to be questioned on and what level of knowledge is expected. Several programs have addressed this issue by preparing specific cases to be used by all examiners. In some programs, the examiners may use any case they wish from a list, whereas at others, the same one or two cases are used by all examiners for the same group of students completing the pediatrics rotation. The latter approach works well when all oral exams are given on the same morning or afternoon. If the exams are spread out over several days, there is clearly the risk that one student will tell others what they will be tested on.
Since many faculty lack current knowledge of some general pediatric problems, it is important that all faculty be given a written description of the knowledge expected of third-year students completing pediatrics on a given topic. Because final grades in many programs are calculated from numerical formulae, it would be useful to provide the descriptions in the form of a checklist, so that the examiner could check off points correctly answered by the student as the case is discussed. The grading of the student could either be subjective (based on the examiner’s assessment of how many items the student knew compared to the “expected”) or objective (the examiner could be asked to total the points the student answered correctly and derive the grade directly from that total). One way to do this would be for each question to have a limited number (8-15) of key points of expected knowledge and have the score on that question be the sum of the points answered correctly. These points should cover the critical items in the history, the 2-3 most important physical findings, the 2-3 most useful tests, and the key elements of the differential diagnosis (could list up to 6). Since not all items listed will be of equal importance, these forms should be designed to allow weighting of some items more heavily. For example, the 1 or 2 most critical items in each category could be weighted so as to count for 2 points, while the still important but less critical choices could receive a weight of 1 point. Since often a student will know an item only after being asked in more than one way or given a hint, the grader would be expected to give half-credit for such responses. The grader would then add up the total points, divide by the maximum possible, and express the grade for that question as a percent.
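The weighted checklist arithmetic described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the item labels, the credit constants (`FULL`, `PROMPTED`, `MISSED`), and the sample responses are hypothetical, and a program adopting this approach would substitute its own cases and weights.

```python
from dataclasses import dataclass

# Credit levels an examiner can record for one checklist item.
# PROMPTED is the half-credit given when a hint or rephrasing was needed.
FULL, PROMPTED, MISSED = 1.0, 0.5, 0.0


@dataclass(frozen=True)
class ChecklistItem:
    label: str
    weight: float  # 2 for the most critical items, 1 for the rest


def score_case(items, credits):
    """Return (points earned, maximum points, percent) for one case.

    `credits` maps an item label to FULL, PROMPTED, or MISSED;
    items the examiner did not mark are treated as MISSED.
    """
    max_points = sum(item.weight for item in items)
    earned = sum(item.weight * credits.get(item.label, MISSED) for item in items)
    return earned, max_points, 100.0 * earned / max_points


# Hypothetical 8-item case with made-up labels, for illustration only.
items = [
    ChecklistItem("critical history item", 2),
    ChecklistItem("secondary history item", 1),
    ChecklistItem("key physical finding", 2),
    ChecklistItem("most useful test", 2),
    ChecklistItem("secondary test", 1),
    ChecklistItem("leading diagnosis", 2),
    ChecklistItem("alternative diagnosis A", 1),
    ChecklistItem("alternative diagnosis B", 1),
]
credits = {
    "critical history item": FULL,
    "secondary history item": FULL,
    "key physical finding": PROMPTED,
    "most useful test": FULL,
    "leading diagnosis": FULL,
    "alternative diagnosis A": PROMPTED,
}
earned, max_points, percent = score_case(items, credits)
print(f"{earned:g} of {max_points:g} points = {percent:.0f}%")
```

In this made-up example the student earns 8.5 of 12 possible points, or about 71%. Keeping the weights on the form itself, rather than in the examiner's head, is what allows different examiners to arrive at the same total for the same answers.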
In developing cases for oral examinations, it must be recognized that even a diligent student cannot be expected to have seen or read about every major clinical problem in 6-8 weeks. The series of lectures given in most clerkships identifies a core of knowledge that all students should acquire. It is suggested that all questions involve topics covered at least briefly in the lecture series or in some clinical experience to which all students are exposed. A good exam case should have the following characteristics:
- the presenting problem is relatively common
- there are several plausible diagnoses
- key information in the history and physical exam can narrow the diagnostic possibilities
- the diagnosis can be confirmed with one or more relatively simple diagnostic studies or by a very characteristic clinical course
The cases should primarily test the student’s problem-solving skills, and questions about the treatment of the particular condition should at most be a minor part of the information the student is expected to know.
It is recommended that the oral exam consist entirely of 2 (or possibly 3) unknown cases, and be completed in 40-60 minutes (20-30 minutes per case). Having students present their own patients does provide useful information to the examiner, but a ward attending has had the opportunity to listen to each student present the case and question them, so their skill in discussing a case with which they are familiar is already part of their clerkship evaluation.
We have included in Appendix II two standardized cases for pediatric oral exams which meet the criteria discussed above. These cases were designed with the checklist feature to aid in objective scoring but have had only limited testing. It is recommended that programs which currently use standardized cases for their oral exams consider sending them to the COMSEP Resource Clearinghouse (Reference 9). Clerkship directors who do not currently use oral exams or who desire to improve the exams they administer will then have access to these cases and can develop additional cases to cover topics they would like their students tested on.
Cost/resources required:
The implementation of oral exams requires three components:
- The effort of the clerkship committee in working out the format of the exams and developing or selecting the cases to be used for the unknowns.
- Secretarial time to contact faculty to see if they will give exams at a particular time, to make up the final schedule for the students, and to collect and record the grades.
- Faculty time to administer the exam, record the grade, and write comments on student performance (approx. 50 – 70 min. per student).
It is difficult to assign a cost value to these efforts. However, it should be noted that compared to the OSCE and various standardized patient exams, the logistics of setting up oral exams are simple once the faculty is willing to commit the time.
REFERENCES
1. Miller GE. The assessment of clinical skills/competence/performance. Acad. Med. 65: S63-67. 1990.
2. Anastakis D, Cohen R, Reznick RK. The structured oral examination as a method for assessing surgical residents. Am. J. Surg. 162:67-70. 1991.
3. Colton T, Peterson O. An assay of medical students’ abilities by oral examination. J. Med. Educ. 42:1005-1014. 1967.
4. Robb K, Rothman A. The assessment of clinical skills in general medical residents – comparison of the Objective Structured Clinical Examination to a conventional oral examination. Ann. Royal Coll. Phys. Surg. Canada. 18(3): 235-238. 1985.
5. Saad AMA. An oral practical examination in emergency clinical surgery. Med. Educ. 25:300-302. 1991.
6. Solomon DJ, et al. An assessment of an oral examination format for evaluating clinical competence in Emergency Medicine. Acad. Med. 65: S43-44. 1990.
7. Muzzin LJ, Hart L. Oral examinations. In: Neufeld VR, Norman GR (eds), Assessing Clinical Competence. Springer, New York. 71-93. 1985.
8. Vu NV, et al. Oral examination: A model for its use within a clinical clerkship. J. Med. Educ. 56:665. 1981.
ADDITIONAL REFERENCES
- Abrahamson S. The oral examination: The case for and the case against. In: Lloyd J, Langsley DG (eds). Evaluation of the Skills of Medical Specialists. American Board of Medical Subspecialists, Chicago. 1983.
- Lipscomb PR. Summary of conference on oral examinations. In: Lloyd J, Langsley DG (eds). Evaluation of the Skills of Medical Specialists. American Board of Medical Subspecialists, Chicago. 1983.
- Rosinski EF. The oral examination as an education assessment procedure. In: Lloyd J, Langsley DG (eds). Evaluation of the Skills of Medical Specialists. American Board of Medical Subspecialists, Chicago. 1983.
APPENDIX I
A survey on the current use of oral examinations in pediatrics clerkships
In September, 1992, a survey was mailed to directors of 142 clerkships to assess the current usage of oral examinations as a grading tool in pediatric clerkships. Responses were received from 97 programs, or 68.3%. Of these, 28 programs (29%) were using oral exams and the remainder (71%) were not.
Reasons for using oral exams cited by the 28 programs:
Tests thinking skills better than multiple choice | 100%
Can assess if student has read on his/her patients | 18%
Additional data for evaluating marginal students | 50%
Opportunity for faculty-student interaction | 57%
People who give oral exams at different institutions:
Full-time faculty | 100%
Department chairs | 36%
Clinical faculty | 36%
Subspecialty fellows | 11%
Chief resident | 36%
3rd year residents | 4%
Number of oral exams requested of each faculty:
1: 25% | 2: 21% | 3 or more: 36%
Number of oral exams given per student:
1: 50% | 2: 36% | 3: 4%
Are students tested on patients they saw during the clerkship?
Yes: 50% | No: 50%
Are students tested on “unknown” patients?
Yes: 86% | No: 14%
Is a list provided for “unknown” cases to be used?
Yes: 54% | No: 25%
Are written instructions provided to examiners to help standardize the format?
Yes: 54% | No: 43%
For programs which do not use oral examinations, the reasons given for NOT using them were (asked to check up to 3):
Would take too much faculty time | 72%
Too difficult to standardize | 75%
Other measures of student performance sufficient | 30%
Doesn’t do a good job testing what is important | 7%
The American Board of Pediatrics doesn’t use | 3%
APPENDIX II: Sample oral exam questions
ANEMIA – (Developed by Dr. Paul Kaplowitz)
Case scenario: An 18 month old black male is seen for well child care. As part of routine screening, he has a hemoglobin checked which is 9 gm%. There is no history of recent illness. Please describe the history which is most important, what you will be looking for on the PE, the differential diagnosis, and how you would use selected lab tests to arrive at the diagnosis.
History:
2 Asks about dietary intake of iron, whether child is drinking a large quantity of cow’s milk, when taken off formula
1 Asks about family history of anemia (SS, thal, spherocytosis)
Physical exam:
1 Evidence of pallor (oral mucosa, conjunctivae, etc.)
1 Looks for enlarged spleen (hemolysis), enlarged nodes (leukemia or other malignancies)
Differential diagnosis:
2 Can categorize anemia according to microcytic, normocytic, and macrocytic
2 Can name at least 2 causes of microcytic anemia (Fe deficiency + either _ thal or lead poisoning)
1 Can name at least 2 causes of normocytic anemia (hemolytic conditions, chronic disease, bone marrow failure due to malignancy or aplastic anemia)
Laboratory evaluation:
2 Knows significance of the MCV in classifying anemia and approximate normal value for age (70-84 for 6 mo – 2 yrs)
2 Knows to examine the peripheral blood smear for RBC shape and to check WBC and platelet count to look for bone marrow involvement
1 Knows the significance of a normal-low vs. elevated reticulocyte count (hemolysis, blood loss)
Grading guidelines: The point value of each item is indicated to the left of the item. Check off next to each item whether the student knew that particular item. If the student gave the information but only after coaxing, give partial credit. The maximum number of points for this question is 15.
Express the score for this question as ___ pts of 15 = ___ %.
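As an illustration of these grading guidelines, the anemia checklist can be tallied as in the following Python sketch. The point values are transcribed from the checklist above, but the item labels are abbreviated, the half-credit value for coaxed answers is an assumption (the guidelines say only "partial credit"), and the student responses are invented.

```python
# Point values transcribed from the anemia checklist; labels are abbreviated.
ANEMIA_CHECKLIST = {
    "History": [
        (2, "dietary iron intake / cow's milk"),
        (1, "family history of anemia"),
    ],
    "Physical exam": [
        (1, "pallor"),
        (1, "enlarged spleen or nodes"),
    ],
    "Differential diagnosis": [
        (2, "categorizes by cell size"),
        (2, "two causes of microcytic anemia"),
        (1, "two causes of normocytic anemia"),
    ],
    "Laboratory evaluation": [
        (2, "significance of MCV"),
        (2, "peripheral smear, WBC, platelets"),
        (1, "reticulocyte count"),
    ],
}

PARTIAL_CREDIT = 0.5  # assumption: the guidelines do not fix the fraction


def max_points(checklist):
    """Sum the point values of every item in every category."""
    return sum(points for items in checklist.values() for points, _ in items)


def score(checklist, known, coaxed):
    """Full credit for items in `known`, partial credit for items in `coaxed`."""
    earned = 0.0
    for items in checklist.values():
        for points, label in items:
            if label in known:
                earned += points
            elif label in coaxed:
                earned += points * PARTIAL_CREDIT
    return earned


# Invented responses for a hypothetical student.
known = {
    "dietary iron intake / cow's milk",
    "pallor",
    "categorizes by cell size",
    "significance of MCV",
    "peripheral smear, WBC, platelets",
}
coaxed = {"family history of anemia", "two causes of microcytic anemia"}

earned = score(ANEMIA_CHECKLIST, known, coaxed)
maximum = max_points(ANEMIA_CHECKLIST)
percent = 100 * earned / maximum
print(f"{earned:g} pts of {maximum} = {percent:.0f}%")
```

The item weights do sum to the stated maximum of 15, and this invented student would score 10.5 points of 15, or 70%.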
BACTERIAL MENINGITIS – (Developed by Dr. Prasanna Nair)
Case Scenario: A 10 month old male infant is brought to the Emergency Room because of fever and lethargy. Intake has been poor for 24 hours. Mother has noted increasing drowsiness. Temperature is 104°F, pulse 120/min, RR 44/min. The infant is lying quietly on the examination table, but with manipulation he becomes quite irritable. (Spinal tap: 10 RBC, 1200 WBC with 90% PMN’s and 10% lymphocytes, glucose of 22 mg/dl, and protein of 80 mg/dl, blood glucose 80 mg/dl)
History:
1 Asks about upper respiratory infection, fever (bacteremia), immunizations
1 Asks about irritability, altered mental status, seizures
.5 Asks about recent Head Trauma/Skull Fracture
.5 Asks about history suggestive of asplenia, e.g. hemoglobinopathy; immunodeficiency.
Physical:
1 Mentions difference in examination related to age – newborn (more like sepsis), infant, child; meningismus consistent only in older children
1 Looks for evidence of increased intracranial pressure, meningeal irritation and cortical dysfunction
.5 Looks for petechiae, purpura, signs of shock
Differential diagnosis:
1 Names acute bacterial meningitis as most likely diagnosis: Streptococcus pneumoniae, H. influenzae type b, Neisseria meningitidis as major pathogens
.5 Can name other possibilities: aseptic meningitis (esp. enteroviral), tuberculosis, fungal infections
Pathogenesis and pathophysiology:
.5 Knows that bacteremia plays central role
.5 Notes that young infants at higher risk because they may lack protective antibodies and also have an immature reticuloendothelial system
Laboratory evaluation:
1 Knows the expected difference in CSF findings in bacterial and viral meningitis. Must include cell count, gram stain, rapid test for bacteria-specific antigens, protein, glucose
1 Should request cultures – bacterial and viral
.5 Should request CBC with diff. and platelet count, blood culture, electrolytes, glucose, BUN
Hospital management:
1 Knows that initial antibiotic will vary with age of patient
1 Knows that supportive therapy is important (notes at least 2 of the following): maintain BP, adequate oxygenation, management of increased intracranial pressure, 2/3 maintenance after intravascular volume has been replaced.
.5 Knows rationale for Dexamethasone therapy
Complications:
1 Knows at least 3 acute complications (includes cerebral edema, ventriculitis, increased intracranial pressure, seizures, cranial nerve palsies, stroke, subdural effusions)
.5 Knows that SIADH can occur in some, knows how to recognize
.5 Knows at least 3 long term complications (sensorineural deafness, developmental delay, blindness, paresis, seizures)
.5 Knows that prognosis worst in NB with gram negative bacilli
.5 Knows that chemoprophylaxis for contacts is recommended for specific organisms (esp. Neisseria)
Grading guidelines: Check off next to each item whether the student knew that particular item and score according to point values given. If student gave information but only after prompting or hints, you may give partial credit. Maximum number of points = 16. Score for this question: ____ out of 16 = ___ %.