Using Student Evaluations to Assess Teaching Effectiveness
The purpose of this guide is to provide both faculty and administrators with information about the utility of, and the biases associated with, Student Evaluations of Teaching (SETs). Because the literature on these topics is extensive, we have grouped references into categories and published them in a companion handout: Resources on Student Evaluations of Teaching.
Use SET Ratings to:
Estimate overall teaching effectiveness. Provide general feedback that instructors can use to make positive changes in their teaching practice.
Demonstrate that an instructor uses teaching practices positively associated with student achievement. Pay special attention to items that correspond to the following ratings categories:
- Preparation and organization of the course
- Clarity and ability to be understood
- Pursuance and/or accomplishment of course objectives
- Motivation of students to do their best and requiring high standards of performance
- Stimulation of student interest in the course or subject matter
- Encouragement of questions/discussion and openness to the opinions of others
- Presentation skills
- Knowledge of subject matter
Study potential biases at the department and campus level, as well as validity and reliability of the instruments themselves, so that the instruments can be improved over time.
Avoid Using SET Ratings to:
Infer relationships between teaching effectiveness and student learning. There is little to no evidence that SET scores are aligned with how much students learn or how successful students will be in future courses.
Demonstrate incremental improvements in teaching. SET ratings are relatively stable from semester to semester, and only the faculty members who start out with comparatively low scores are likely to show significant improvement in their scores over time.
Compare faculty members and courses. A growing body of literature suggests that SET ratings are affected by a multitude of factors, including ones outside the control of faculty. These may or may not be salient factors within your department, but it is nonetheless worthwhile to be aware of these potential biases and trends when reviewing SET scores:
- Student perceptions of faculty personality (via verbal and non-verbal behaviors) confound students’ ratings of instruction, such that students may be rating likability over teaching quality.
- Upper-division students value different aspects of teaching than do their lower-division counterparts, even within the same discipline, so they may rate the same instructor differently.
- Because male students tend to give lower ratings than female students, courses that have an uneven sex ratio cannot be reasonably compared to ones that have an equal or opposite sex ratio.
- Science and engineering students tend to give lower ratings than social science, humanities, and fine arts students (regardless of gender), making comparisons between disciplines, and between courses for majors and courses for non-majors, unreasonable.
- Students who perceive faculty as grading leniently may give higher ratings and students who perceive faculty as grading stringently may give lower ratings.
- Students who expect to receive better grades may give higher ratings and those who expect to receive lower grades may give lower ratings.
- Even though there is a positive association between perceptions of rigor (based upon workload, difficulty of material, time, or effort) and student learning, students may give lower ratings for courses they perceive as rigorous.
- Students may give lower ratings to older faculty than younger faculty, even when instructor experience is held constant.
- Students may give lower ratings to minority faculty than non-minority faculty.
- Students may give lower ratings to female faculty than male faculty.
- Students may give lower ratings in large classes than in small classes (there is little to no correlation between class size and ratings for mid-sized classes).
Remember, SETs Provide a Single Source of Evidence
Just as you would not want your students to construct arguments, draw conclusions, or evaluate scenarios based upon a single source of evidence, you would not want your colleagues to evaluate teaching based upon a single source of evidence. Therefore, multiple measures are needed to describe and evaluate someone’s teaching, especially for the purposes of tenure, promotion, and merit pay. These other measures can include:
- Review of course materials (formative and summative)
- Classroom observations performed by peers (formative) and senior faculty (formative and summative)
- Teaching portfolios
- Student comments collected in student focus groups or interviews
- Student comments on SETs and mid-semester evaluations
- Unsolicited comments made by students
- Samples of students’ work, preferably with instructor feedback, completed rubrics, etc.
- Samples of written communication with students
- Records of student achievement after leaving the course and/or institution
- Teaching philosophy statements
- Statements/records of the instructors’ activities and achievements regarding advising students, mentoring future/new faculty, and the scholarship of teaching and learning (SoTL)
- Statements of the instructors’ short- and long-term teaching goals
- Self-evaluations and reflections, including changes instituted following evaluations, faculty development, and research on teaching
How the Center for Teaching and Learning Can Assist
- Consult with faculty about SET ratings and teaching strategies. See Tips for Using Student Evaluations to Help Students Learn for more information.
- Perform formative classroom observations
- Administer student focus groups
- Assist in developing mid-semester evaluations
- Work with faculty, chairs, or deans to identify and develop methods for evaluating teaching, such as peer review programs and teaching portfolios
- Assist in reviewing student evaluation ratings and comments
References and Resources
Berk, R.A. (2005). Survey of 12 strategies to measure teaching effectiveness. International Journal of Teaching and Learning in Higher Education, 17(1), 48-62. Retrieved from: http://www.isetl.org/ijtlhe/pdf/IJTLHE8.pdf
Chism, N.V.N. (2007). Peer Review of Teaching: A Sourcebook (2nd Ed.). Bolton, MA: Anker Publishing.
Feldman, K.A. (1998). Identifying exemplary teachers and teaching: evidence from student ratings. In K.A. Feldman & M.B. Paulsen (Eds.), Teaching and Learning in the College Classroom. Needham Heights, MA: Simon & Schuster.
Murray, H.G. (2007). Low inference teaching behaviors and college teaching effectiveness: recent developments and controversies. In R.P. Perry and J.C. Smart (Eds.), The Scholarship of Teaching and Learning in Higher Education: An Evidence-Based Perspective. Dordrecht, Netherlands: Springer. Retrieved from: http://www.springerlink.com/content/w4j072/#section=288317&page=1&locus=0
Onwuegbuzie, A.J., Daniel, L.G., & Collins, K.M. (2009). A meta-validation model for assessing the score-validity of student teaching evaluations. Quality and Quantity, 43(2), 197-209. Retrieved from: http://www.springerlink.com/content/l031k77753182234/
Seldin, P. (1993). The use and abuse of student ratings of professors. Chronicle of Higher Education, 21, 40.
Authored by Sarah Lang (May, 2010)
Revised by Sarah Lang (October, 2011)