Section I

Evaluation of Faculty Teaching

Mary Ellen Bozynski, M.D., M.S.


  1. To review and summarize the literature concerning faculty evaluation and its components.
  2. To describe several validated instruments designed to measure the quality of clinical teaching that may be useful in pediatric clerkships.
  3. To explore how measures or summaries of teaching efforts and effectiveness, such as teaching portfolios, may be used for feedback and promotion.
  4. To define research questions of interest to COMSEP members.

Faculty Evaluation and its Components


Faculty evaluation serves one or more purposes, including decisions regarding allocation of resources, improvement of teaching, or assistance in decision-making regarding appointment and promotion.1 If the data are used for decisions regarding tenure, promotion, or other rewards, they must be reliable and truly represent an accurate assessment of the faculty member’s teaching. This is not an easy task to accomplish and requires the dedication of significant institutional resources.

Evaluation of teaching should be comprehensive and cover the major general characteristics of good teaching including instructor knowledge, organization and clarity of presentation, enthusiasm, instruction skills, clinical competence, clinical supervision, and professional characteristics.2

The most commonly used methods of evaluation include self-evaluation, peer evaluation, student or house officer evaluations, and administrative evaluation. Administrative evaluation, usually used for promotion, may be very complex and based on an extensive review of materials by a committee, or may be confined solely to a summary statement made by an individual department chair. The most comprehensive evaluation systems combine data from each of these methods to create a complete assessment.

Description of Common Methods Used and Their Strengths and Weaknesses:

Although self-evaluation is a commonly used evaluation method, the results of self-evaluation rarely correlate with those of peer or student evaluations; they tend to be more positive. Self-evaluations may be useful, however, to point out discrepancies between the faculty member’s self-assessment and the assessments of others. The data are also useful for establishing goals. For individual faculty, there are no data to suggest that such an approach can effect major changes in teaching effectiveness.3 Moreover, data linking improved teaching performance to improved student performance are variable.

Peer evaluations are more popular but are difficult to conduct, especially in a clinical setting. The reliability of peer review in classroom settings is well established. While peer reviewers are better judges of teaching content, quality, appropriateness, and relevance, the time required for the direct observation of teaching is often prohibitive, especially in a non-classroom situation. Videotapes of lectures may be viewed more conveniently and their use may circumvent some of these difficulties. Moreover, Skeff4 and others5,6,7,8 have demonstrated improvement in faculty teaching effectiveness after faculty development programs that used individualized review of videotaped teaching sessions.

To achieve reliability and validity, several independent observations are necessary, and at least one of the observers must be a content expert. Because some areas of expertise have few faculty members, arranging observations may be difficult. Furthermore, faculty may be reluctant to rate a close colleague. Peer evaluations of actual classroom teaching are less reliable than student evaluations. Peer evaluations of teaching materials, handouts, and examinations, however, are more easily arranged and more reliable. Evaluation of teaching materials is included as part of the teaching portfolio where portfolios are used. In any event, standardized systems for evaluation are essential, and despite the difficulties involved, many educators9 believe that peer evaluation is a mandatory component of faculty evaluation.10

Evaluation of faculty teaching by students (both students and house officers) continues to be popular, second only to administrative evaluation. Studies have demonstrated that student evaluations are equivalent to those of peers, self-, and administrative evaluations in the areas of delivery skills, empathy, enthusiasm, fairness, and preparedness.11,1 Student ratings have demonstrated adequate reliability with numbers as few as ten raters (r = .71-.82).12
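The relationship between the number of raters and the reliability of their averaged ratings can be illustrated with the standard Spearman-Brown prophecy formula. This sketch is not drawn from the cited studies; it simply shows, under the usual assumption of interchangeable raters, why pooling about ten student ratings can yield acceptable reliability even when any single rating is noisy.

```python
def spearman_brown(single_rater_r, k):
    # Projected reliability of the mean of k raters' independent ratings
    # (Spearman-Brown prophecy formula).
    return k * single_rater_r / (1 + (k - 1) * single_rater_r)

def implied_single_rater_r(r_k, k):
    # Invert the formula: the single-rater reliability implied by
    # observing reliability r_k when k raters are averaged.
    return r_k / (k - (k - 1) * r_k)

# If ten raters yield a reliability of .71 (the lower bound cited above),
# the implied reliability of any one rater is much lower:
r1 = implied_single_rater_r(0.71, 10)
print(round(r1, 2))                      # single-rater reliability, about 0.20
print(round(spearman_brown(r1, 5), 2))   # five raters would give about 0.55
```

Under these illustrative assumptions, averaging across roughly ten raters is what lifts a modest single-rater reliability into the cited .71-.82 range.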

There are a number of barriers to appropriate evaluation, however. For example, students generally see faculty as transmitters of information. Thus students often prefer a lecture format and complete handouts. On the other hand, the faculty believe that their role is to teach the student how to think critically, problem solve, and access the literature in preparation for life-long learning. In fact, these goals were emphasized in the report Physicians for the 21st Century. This difference in student and faculty goals may cause the student to rate the teacher who distributes the best handout higher than the teacher who expects a more adult approach to learning. This conflict is understandable, especially in pediatrics. During the often brief (six to eight weeks) pediatric clerkship, the student must learn a great deal of new material while adapting to a new clinical role involving the care of children of varying ages and communication with parents.

The qualities of good pre-clinical and clinical teaching may differ. At least one tool to evaluate the effectiveness of faculty teaching in an evidence-based medicine rotation has taken these differences into account by modifying the categories originally suggested by Irby to reflect issues relevant to clinical teaching.3 Teachers effective in transmitting knowledge may not be effective in the clinical setting.

Students may also learn more and rate faculty higher in certain settings. Setting-related factors, e.g., lack of space, lack of time, and a poor patient mix, may be beyond the control of the particular faculty member. For example, teaching students in an ambulatory setting may cause some decrease in productivity. In this era of managed care, increased competition, and emphasis on cost-effectiveness, the impact of education on productivity cannot be ignored. Irby and associates have studied the characteristics of effective clinical teachers of ambulatory medicine.13 The most important characteristics identified included active involvement of the instructor, respect for learner autonomy, and clinical competence of the instructor. In this study, environmental factors appeared to have a minor impact on teaching effectiveness; however, the ambulatory setting was not well described.

The amount of time spent with the faculty member and the duration of the contact are also important variables, accounting for approximately 10% of the variance in faculty ratings. Faculty teaching may also decline over the course of a rotation or year due to burnout.4 Gender and age appear to have little effect.12

Whether the rotation is required or elective also influences ratings, with teaching in electives rated higher. Several factors, including student interest and greater familiarity with the subject, influence ratings. Demonstration of personal interest in the student also has a positive influence on ratings.

In general, both peer and house officer reviews of faculty teaching tend to be more favorable than evaluations by students. In fact, some items, e.g., knowledge, are best evaluated by peers or housestaff. Both student and house officer evaluations of teaching tend to cite direct observation, clarity of expectations, and feedback as consistently weak areas of faculty performance.

Instruments to Evaluate Teaching/Implementation Strategies:

A number of tools have been developed to rate teaching. Most are based on the work of Irby and represent modifications of his work. As mentioned, Guyatt modified the categories defined by Irby to include more precise domain-specific descriptors, and defined other content areas including biophysiology, clinical skills, and teaching of evidence-based medicine. Mullan et al.14 also drew on Irby’s work but interviewed faculty to obtain input as to the teaching behaviors to be included, advice on the most credible source for evaluating each behavior, and the domain of teaching efforts to be evaluated. For example, faculty suggested that if only direct contact with learners was evaluated, teaching at conferences, teaching materials, etc., would be excluded. Another faculty concern centered on the validity of peer faculty ratings that were collected in addition to those of students and house officers. Faculty felt that it was important to assess the basis for peer judgments (firsthand observation versus impression). This evaluation has been used in the Department of Pediatrics at the University of Michigan for a number of years. The basis for peer evaluation is no longer recorded, but increasingly busy faculty members feel less able to comment on the teaching effectiveness of their colleagues since they rarely witness their teaching efforts. Recently a similar tool has been adopted for medical student evaluation of faculty teaching that can be used for all medical school disciplines.

The use of a “teaching portfolio” or “dossier”15 (Canadian Association of University Teachers) has also gained popularity. The contents and use of these portfolios are highly variable. One of the most highly organized, widely distributed, and complete portfolios is used at the Medical College of Wisconsin. Other schools currently using portfolios include Northwestern University Medical School, the University of Washington, the University of Nebraska-Lincoln, Harvard, UCLA, and a number of Canadian medical schools.

In some institutions, the portfolio is examined by a committee that then makes recommendations regarding promotion. Portfolios generally include evaluations of teaching effectiveness; samples of teaching documents such as syllabi and course outlines; instructional materials (software, case studies, etc.); and, in some cases, samples of student work. In addition, academic products related to teaching, such as monographs, research presentations and reports, and evidence of national recognition (e.g., lectureships), are also included in the portfolio. Many institutions also consider evidence of continuing faculty development activities. Estimates of administrative effort and of the quantity of teaching (numbers of students, duration, and settings) are also included. Faculty may also be requested to provide a statement of their educationally focused future goals and objectives.

The objectives of most portfolios include feedback and improved performance; however, these documents are used to assist in decisions regarding appointment and promotion at many institutions. Portfolios may be reviewed by an individual or by a committee; some systems are quite complex. Lack of time is an important barrier to scholarship in teaching. At least three half-days per week must be provided; this amount of release time is rare for clinical track faculty.16 Individual performance should be judged in the context of the resources and time made available to the faculty member. Given that most medical school promotion committees are not knowledgeable about the multiple demands of patient care and teaching, the clinical faculty’s load may be grossly underestimated.

The use of a teaching portfolio in medical education lags behind its use in undergraduate education. In fact, most instruments were designed for use in standard classroom settings and are more easily adaptable to the pre-clinical setting. Nevertheless, a number of medical schools are using portfolios and there is little doubt that their use will become more widespread in the future.

If the teaching portfolio is to be an effective tool, it must serve not only those individuals spending the majority of their time in education, but also the faculty we all count on to do the majority of medical student and house officer teaching. Many key teaching faculty have minor administrative roles and lack dedicated time for scholarly activities. These faculty may not invest in teaching if their efforts are unrecognized and unlikely to be rewarded when tenure or promotion decisions are made.

Most clerkship directors are already overwhelmed by multiple demands. Teaching portfolios must be designed carefully to prevent the submission of an uninterpretable pile of paper to the clerkship director or to the departmental or medical school promotions committee for evaluation.


There are no published data; however, the cost of even the simplest evaluation system is significant, and this factor cannot be ignored.

Future Research:

There are a number of areas where further research in evaluation is needed. For example:

  1. Since more teaching will be carried out in the ambulatory area, it is mandatory that faculty development efforts assist the faculty with methods to improve teaching effectiveness in this setting. Studies measuring the impact of various programs will assist us in planning effective development efforts.
  2. There are few data to support a change in student performance due to changes in teaching effectiveness, although the two are connected intuitively. Certainly student experience may influence career choice, etc.
  3. The impact of teaching evaluations or portfolios on promotion and tenure decisions has not been documented. A standardized approach should also change faculty behavior and enthusiasm, but there are no data on this subject.
  4. Although peer evaluation is felt to be important, it is difficult to carry out in the clinical setting. Research on methods to accomplish this and ways to use the review constructively needs to be explored.
  5. Student expectations must be considered in interpreting data. For example, many new curricular efforts stress independent learning. Many students, however, favor didactic presentations and are not comfortable with a more independent learning style. This is especially a problem for pediatrics because most of the material is new to the student, students are unfamiliar with examining children, they cannot learn one set of diseases or doses for all children (no one size fits all), and the duration of the clerkship is often short (6-8 weeks). These factors may result in differences in teacher ratings across disciplines.
  6. Understanding how students learn best in both the medical center and community ambulatory setting3 will also help faculty improve their teaching effectiveness. Studies must examine the effectiveness of specific teaching behaviors on learner outcomes in the clinical setting (Wilkerson, 1993).


  1. Rippey RM. The Evaluation of Teaching in Medical Schools. Springer, New York. 1981.
  2. Irby DM. Evaluating teaching skills. The Diabetes Educator. II: 37-46. 1986.
  3. Guyatt GH, et al. A measurement process for evaluating clinical teachers in internal medicine. Can. Med. Assoc. J. 149: 1097-1102. 1993.
  4. Skeff KM. Evaluation of a method for improving the teaching performance of attending physicians. Am. J. Med. 75: 465-470. 1983.
  5. Menaham S. Interviewing and examination skills in paediatric medicine: videotape analysis of student and consultant performance. The Royal Society of Medicine. 1987.
  6. Burchard KW, Rowland-Morin PA. A new method of assessing the interpersonal skills of surgeons. Acad. Med. 65(4): 274-276. 1990.
  7. Cox J, Mulholland H. An instrument for assessment of videotapes of general practitioners’ performance. Brit. Med. J. 306: 1043-1046. April, 1993.
  8. Cassie JM, Collins GF, Daggett CJ. The use of videotapes to improve clinical teaching. J. Med. Educ. 52: 353-354. April, 1977.
  9. Irby DM. Evaluating instruction in medical education. J. Med. Educ. 58: 844-849. 1983.
  10. Irby DM. Peer review of teaching in medicine. J. Med. Educ. 58: 457-461. 1983.
  11. McKeachie WJ. Student ratings of faculty: A reprise. Academe. 65: 384-397. 1979.
  12. Irby DM, Gillmore GM, Ramsey PG. Factors affecting ratings of clinical teachers by medical students and residents. J. Med. Educ. 62: 1-7. 1987.
  13. Irby DM, Ramsey PG, Gillmore GM, et al. Characteristics of effective clinical teachers of ambulatory care medicine. Acad. Med. 66: 54-55. 1991.
  14. Mullan PB. Teaching and rater characteristics predicting medical student, pediatric resident and faculty evaluation of clinical teachers. Teach. Learn. Med. V. 1993.
  15. Canadian Association of University Teachers: The CAUT Guide to The Teaching Dossier. 1991.
  16. Jacobs MB. Faculty status for clinician-educators: guidelines for evaluation and promotion. Acad. Med. 68: 126-128. 1993.
  17. Wilkerson L, Armstrong E, Lesky L. Faculty development for ambulatory teaching. J. Gen. Int. Med. 5: 544-553. Supplement, 1990.

Additional References

Basook PG. Clinical assessment: A state-of-the-art review. Diabetes Educator. II: 30-36. 1986.

DeWitt TG, Goldberg RL, Roberts KB. Developing community faculty: Principles, practice, and evaluation. Am. J. Dis. Child. 147: 49-53. 1993.

Ende J. Feedback in clinical medical education. JAMA. 250: 777-781. 1983.

Evidence-Based Medicine Working Group (Guyatt G, et al.), McMaster University. Evidence-based medicine: A new approach to teaching the practice of medicine. JAMA. 268: 2420-2425. 1992.

Garg ML, Boero JF, Christiansen RG, Booher CG. Primary care teaching physicians’ losses of productivity and revenue at three ambulatory-care centers. Acad. Med. 66: 348-353. 1991.

Skeff KM, et al. Evaluation of the seminar method to improve clinical teaching. J. Gen. Int. Med. 1: 315-322. 1986.