The Ethical Classroom, Part II: Using Student Evaluations


I promise not to pick on any moral dicta from religion, having already sufficiently debunked the so-called ten commandments and the tit-for-tat rule, often referred to as the golden rule.  This time, I want to take a close and somewhat ethical look at our use of student-completed evaluation forms, which probably all departments use to come to conclusions about the soundness of a teacher’s pedagogy.

Let me begin by reiterating the directions that faculty are generally expected to follow:

  1. Write your name and the course number on the board.
  2. Designate a student to take the completed forms to the secretarial office.
  3. Distribute the forms.
  4. Leave.

I suspect that most of us are also aware of the significance of the process to several involved agents: the students, the faculty, the form designer, the evaluator, the decision-maker.  Some of you may want to include the institution, but I would not.  The institution is not a moral agent; it is a collective term for a variety of individuals who are moral agents.  When I say that the institution suffers harm, I mean that one or several individuals encompassed by the institution are suffering harm of some sort.  If I were to speak of the institution as if it were an individual with the potential for suffering harm, I should be guilty of a category error[1] in my reasoning.

List of Agents:


The Students

Students are involved at two levels.  The first is the student as evaluator; that is, the student filling out the evaluation form.  The second is the student as benefit recipient.  Where soundness of instruction prevails, the reputation of the institution will benefit the student in the job search and in the prestige s/he projects on the job.  While this is clearly a many-hands issue,[2] a sound feedback loop to establish and support instructional soundness is of service to the development of a sound reputation for graduates.


The Faculty

Faculty are likewise involved at two levels.  The faculty member is agentive when selecting the time and modus operandi for administering the questionnaire.  The faculty person may affect the ambience in which the questionnaires are completed by selecting or influencing what precedes them, what accompanies them, and what follows them.  But the quality of the institution and its reputation also affect the stature and prestige of the faculty person.  So s/he has both an agentive role and a receptive role.

The Designer of the Questionnaire

The designer may have a dual role also.  The design of the questionnaire is clearly an agentive role.  But there may also be a benefit result.  Where the work of the designer is good and conducive to sound institutional improvements, the designer is likely to find repeat customers.  So a careful design would appear to be in the interest of any external designer.  The problem is compounded where someone inside the institution participates both in the design and in one of the other roles.  Offhand, I would be inclined to believe that students and faculty are the essential stakeholders in the design of the questionnaire and thus either should themselves be involved in the design or should have an important voice in it.

The Evaluator

The evaluator’s role appears to be largely agentive also.  If the evaluator is part of the institutional system, s/he may also experience a self-interest feedback loop.  Thus, if the evaluator is a peer of the person being evaluated, the general improvement of the pedagogical process will also support the evaluator’s stature.  One of the pitfalls in the evaluator’s being part of the institution is possible damage to the system from self-serving effects.  Another source of damage to the system is the forcing of results into quotas or bell-shaped curves, of course.

The Decision-Maker

Again, this role is almost exclusively agentive.  Like the evaluator, the decision-maker has a connection to the process by virtue of being part of the institution’s fabric.

Power-Relationships Between Agents

Some philosophers have suggested that one should subject one’s self to the rules of logic so as to be powerful toward others.  Well, I’ve always thought some philosophers to be a wee bit naïve; we know better, don’t we?  What a vice-president says might draw many a chuckle because of its logically flawed structure; however, the power status of the speaker seems independently forceful nonetheless.  No matter how threadbare the emperor may be, his or her being emperor has sufficient impact to demand obedience or abject dismissal of one’s own better reasoning.  

We can see the power structure to some degree linearly.  The student has probably the least power; the faculty has power over the student; the evaluator has power over the faculty, with perhaps another tier of decision-makers arrayed beyond the evaluator.  The designer is relatively powerless, except as a member of one of the other groups. 

Complications of Power Relationships

Other relationships can affect the outlined power-relationships.  For example, last semester, I had one student unhappy about a process I required as part of my ethics class.  I also have office hours, a very open classroom, and a bulletin board or forum on Blackboard that permits anonymous comments.  Presumably, the student was aware of the power relationships between administration and faculty.  Thus, the complaint was aimed immediately at the level of our company culture where the student expected the most decisive and fastest action.  So, where that expectation is fulfilled, the student’s power is augmented by his or her wiring around the teacher to make the administrator his or her ally.

Other relationships are possible.  Personal friendships between people at various stages will affect the results of the process also.  Suppose that an evaluator is a good personal friend of, say, the forms designer or that the evaluator is in a position of benefiting from kickbacks if the forms designer were to engage in this business commercially.  Let me stress that I am exploring possibilities here; I am not making any claims that any of this has indeed transpired.

Let me begin now to take a critical look at the form.  One feature of the form that everyone has doubtless noticed by now is that question 11 is hidden by virtue of the coloring of the form.  While questions 1 through 10 are highlighted in blue, question 11 is white, so it blends into the irrelevant background.  In my classes last semester, I announced in one class that the questionnaire has 11 fields that require answers, that indeed here—pointing at the box at the bottom—is that eleventh answer, and that this eleventh answer is extremely important and so should be marked.  No student turned in a form with an unanswered eleventh question.  In another class, I told students that the questionnaire had eleven questions but did not call specific attention to the often-ignored question.  3.85 percent—probably one student—did not answer the question.  In yet a third class, I said nothing about the nature of the form, thus following the directions properly.  26.32 percent, or about a fourth of the class, did not answer the eleventh question.
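The class sizes behind those percentages can be recovered with a little arithmetic.  The figures of 26 and 19 students below are my own inference from the reported percentages, not reported data; a minimal sketch in Python:

```python
# Back-of-the-envelope check of the reported non-response rates for
# question 11.  The class sizes (26 and 19) are inferred from the
# percentages, not taken from the original data.

def nonresponse_rate(skipped, class_size):
    """Percentage of students who left question 11 blank."""
    return round(100.0 * skipped / class_size, 2)

# One student of 26 skipping question 11 matches the reported 3.85 percent.
print(nonresponse_rate(1, 26))   # 3.85

# Five students of 19 skipping matches the reported 26.32 percent.
print(nonresponse_rate(5, 19))   # 26.32
```

The arithmetic supports the author’s guess of “probably one student” in the second class and about a fourth of the class in the third.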

We appear to have a form, thus, that does not on its own gather the relevant information.  While teachers should not guide students in the evaluation, guidance is necessary to supplement the format of the questionnaire.  The more information must come from the teacher, the more the teacher is likely to affect the outcome of the questionnaire.  And there we have clearly one ethical dilemma, don’t we?

Figure 1: Old Form

Figure 2: New Form

The form is contingent on a computer program that returns graphical representations of answers to question 11, which—as I have just established sufficiently—is an unreliable question by virtue of the design of the form.  Given that personnel decisions are being made on the basis of this kind of data collection, I would assume that harms may come from the inadequate form and that the continued use of the form and the computer program is indeed unethical.  For many years, the institution used, as no-cost in-house development, a program that I had written.  This program returned not only the result of answers to question 11; it also calculated an average of questions 1 through 10 and displayed the composite value graphically.  Such a form gave a view of two relevant values at a glance.  The contemporary form gives only the information about question 11; indeed, some departments may have experienced an emphasis on question 11 as a consequence of the program.  In isolation, these forms may not necessarily be unethical; it is conceivable that evaluators and decision-makers compensated for the unreliability of the forms and that the ultimate result of consultations with students about instructional quality might have been reliable nonetheless.  I have no indication, however, that such compensation has taken place anywhere in the process.  I know I have received comments that, relying on these data, advised in all seriousness that fundamental changes to my teaching ought to be implemented.
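The tallying logic described above—question 11 reported alongside a composite average of questions 1 through 10—can be sketched as follows.  This is a minimal reconstruction under stated assumptions: the original program is not available, and all function and field names here are hypothetical.

```python
# Sketch of the old in-house tallying logic: report both the answers to
# question 11 and a composite average of questions 1 through 10.
# Function and field names are illustrative, not the original program's.

def summarize(responses):
    """responses: list of dicts with keys 'q1'..'q10' (numeric ratings)
    and 'q11' (overall rating); a blank answer is stored as None."""
    composite = []
    q11_values = []
    for r in responses:
        ratings = [r[f"q{i}"] for i in range(1, 11) if r[f"q{i}"] is not None]
        if ratings:
            composite.append(sum(ratings) / len(ratings))
        if r["q11"] is not None:
            q11_values.append(r["q11"])
    return {
        "composite_avg": sum(composite) / len(composite),
        "q11_avg": sum(q11_values) / len(q11_values),
        "q11_blank": sum(1 for r in responses if r["q11"] is None),
    }
```

Reporting the blank count for question 11 alongside the two averages would also make the form’s non-response problem visible to evaluators rather than hiding it in the graphics.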

Administration of the Questionnaire

I think that I have established sufficiently that the questionnaire design and the computer program tallying the results are, and certainly have been, tainted.  I believe that some attempts are now underway to re-examine and re-design the system.  While the collection of data is seriously flawed, I do not think that we can do without student comment in the attempt to improve our teaching.  But I want to look now at another feature of the complexity of administering student evaluations.  I began these reflections with a review of what is required: the teacher is to (1) distribute the forms and (2) depart from the classroom.  Is that what we do?

To get a feel for what we do, I sent out a simple question to all faculty: “What method of administering student evaluations do you use or recommend so as to optimize student responses?”  The term “optimize,” meaning “make best,” clearly seemed ambiguous to many of those who answered the question.  Most comments, however, seemed to interpret “optimize” as “having the greatest possible number of students present.”  A number of responses also interpreted “optimize” as “selecting the best time for administering the questionnaire.”  Finally, some interpreted “optimize” as “selecting circumstances that are conducive to getting the most positive answers from students.”

I need to caution again that I do not claim any statistical validity for these responses.  I was merely looking for some of the ideas that might be going through colleagues’ minds.

Timing and Participation:

To get optimal participation of students in the questionnaire, colleagues gave a variety of responses.  Some colleagues use a trick to force attendance; one, for example, will announce a quiz for that day to compel full attendance; another will announce a review session for the final examination, thus selecting the very last day of the semester.  Another colleague does the evaluations in the final week, when the students are fairly clear on what their grades will be.  Some colleagues offer the questionnaire at the beginning of the class period so as to give students plenty of time to complete it; others will never give the evaluation on a day when anything else is due or on the last day; and yet another will always ask for the evaluations immediately following a test.  Some colleagues will always ask for the completion of the questionnaire on the last day of classes.  One colleague begins the questionnaire at the beginning of a class session sometime during the last week and then leaves—presumably letting the remainder of that session lapse; another colleague sets aside ten minutes for the completion of the questionnaire; yet another reserves fifteen minutes; and yet another allows five minutes.  Almost all respondents offered arguments in support of their choices, although the choices themselves varied from person to person.


In most cases, colleagues reported that they told students to answer honestly so that the instructor can improve his or her course.  No teacher reported admonishing students to answer nicely so as to improve the teacher’s professional prestige or his or her standing in the evaluation that these questionnaires feed into.  One colleague will tell students that the official form is pointless and that s/he will pay no attention to it; instead, students should concentrate their efforts on a form that the colleague distributes.  Another colleague will point out that s/he will probably not receive an increase, prestige, or a cancellation of contract whether the questionnaires are good or bad, a strategy which might even make students feel at liberty to answer lackadaisically, whatever that might entail.

Affecting the Results:

Interestingly enough, attempting to affect the results is something only others do; no respondent claimed any such conscious endeavor.  Some respondents, however, imply such an effect.  For example, when we admonish students that we are also interested in what worked in the course, we are prompting for positive responses.  When we say that we need criticism so as to improve our act, we are prompting for negative responses.  As I have shown with the three different comments about the number of questions on the form, such comments can influence the respondents.  When we say that the form does not matter to us, is our body language saying, “So, you might as well use good marks”?  One colleague leaves the classroom only in a very narrow sense of the word, standing outside the door so as to be visible to the students.  The colleague reasoned that the at-the-door presence would stifle conversations in the room; in other words, a subtle influence is intended.  The presence that stifles talk might also stifle freedom of expression.  The colleague who allows the entire period for completion of the questionnaire is offering a day off.  Does that have a feel-good effect that skews the results?

But the more egregious attempts to stack the deck are, in these reports, always the work of persons other than the ones responding.  One colleague is confident that “the outcomes from evaluations gathered at the end of the semester can normally be manipulated by the experienced professor.”  I tried to follow up by asking how the experienced professor managed to do that, but there was no further reply.  The reports about others tend toward the negative:  “One of my students told me that her _____ professor said, ‘Write whatever you want.  It doesn't make any difference to me, because I am TENURED!’  She said that made her and other students quite angry.”  However, I have already pointed out that some colleagues attempt to de-emphasize the earnestness of the questionnaire with a variety of de-valuing comments.  So this comment is not at all shocking in light of the other such strategies I have mentioned already.

But other teachers can be more shocking.  Reportedly, one teacher stands before the class dictating the values to be placed on the questionnaire.  Justifiable?  Perhaps.  How often have you wondered how your students could have missed your review of objectives when they mark you low in that category?  Are you not justified in reminding them that you had done what the questionnaire asks about?  If you mean to get the highest marks possible, you can—and others presumably do—remind them of the interesting things you've done in the class throughout the semester, and you can do so under the pretext of refreshing their memories for good critical feedback.  Several teachers administer the questionnaire in juxtaposition with a test.  But what has also been said of other colleagues is that the test is a particularly easy, feel-good test that will manipulate the students’ sense of comfort and well-being.  Giving the test early in the session and then leaving—thus allowing plenty of time—will also pump up well-being and the joy of living, skewing the results.  Of course, you can get the highest marks possible if you give out candy or food, which, of course, some people—though not any of us here—will do.  One colleague asked me in the hallway whether my riding a bike and wearing outlandish leather gear would improve my student evaluations.

These reflections about what others do to skew the results suggest that we have a company culture marred by undercurrents of distrust and competition, both perhaps undesirable in an enterprise where we should see ourselves in collaboration toward a common goal and mission.  The distrust shows not only in reflections about what others do but also in reflections about student motivations.  Some colleagues see the so-called evaluation as a way for students, particularly the incompetent students, to retaliate outside of the normal exchange of ideas and concepts within the classroom.  One comment expresses this distrust by expecting student comments to be of little or no use, since students are simply passing a compliment or taking a parting slap at the teacher.  Or, as this colleague put it, “I do know that the key is to make the students feel that they're very smart and that—miraculously, and with almost no effort on their part!—they have learned interesting and useful things in your class.”  Anyone who can instill that kind of feeling would also have manipulated the questionnaire—unless we hold the view that such is the essence of pedagogy.

Student Perspectives

Many students believe that teachers are sincere about wanting to improve their courses, but there are other views also.  Several students expressed the view that teachers are interested in job security; in other words, students do know that job security and salary go hand in hand with the evaluations.  Students report that some teachers will call the students’ attention to what has transpired during the course; so the teacher who dictates part of the questionnaire is not as unusual, perhaps, as one might have assumed.  Another student reports that several of his or her teachers had said that ERAU takes these questionnaires very seriously; the student speculates that this was perhaps some current fad or passing campaign on campus.  Appeals to pity are not uncommon, apparently, since students report teachers making comments such as “I am new to teaching this course” or “Don’t forget that I am just a beginning teacher.”  Students also report the comment, “Don’t forget that your grades have not yet been turned in.”  How much of an impact these comments are likely to have, or how much they are perceived humorously, is probably largely a matter of conjecture.

One student says, “. . . based on their [the teachers’] body language and responses, I would assume that these evaluations are a serious matter to their credibility.  If negative comments did not cause teachers some type of reprimanding by their superiors, (which they do mention how feedback is read by their superiors and discussed), then I believe they would not take these ratings so seriously.”  Here is another comment that points out that teachers do not always abide by the rules:  “. . . teachers are supposed to not be in the room during evaluations.  According to my knowledge, this is to help in providing honest feedback.  I have had teachers who not only stay in the room, but also wander behind students during evaluations to peer at their multiple-choice answers.  This does persuade students.  Also, many students will not fill out forms of teachers in high positions (such as department heads) for fear that they will get a hold of these documents and “know” their handwriting and chastise them.  This is a common myth/concern that does happen among students.”  In other words, some students are fairly clearly aware of our power structures.

That the wandering teacher is very much against the rules is apparently also a matter of student knowledge.  Some semesters ago, one of my students commented that I was standing behind him or her and that he or she was getting very nervous because I was looking at the form.  I have never in my entire teaching career done any such thing.  What the student’s motivation might have been here, other than a get-even trick, I don’t know.

Here is another interesting insight.  A student says, “I think they [teachers] want only the highest marks and positive comments. Since GPA is written on the front, I do not believe the struggling student has as much bearing on faculty supervisors as a successful college student.”  I had never wondered how the student would see the information about GPA.  From the faculty perspective, such a correlation is desirable because, indeed, the student with the higher GPA should have relatively more credible comments than the student with the low GPA.  On the other hand, I have yet to see any evidence that anyone has included this correlation in the analysis of the data.  Some students are convinced that the GPA is distinctive enough to give away the identity of the respondent.

Some student comments appear to expect indifference on the part of the teacher.  One student said, “I believe most the teachers want good evaluations, some however don’t care. There are some teachers here that even given several bad evaluations would still never change their teaching style or the way they conduct their class rooms.”  One wonders, of course, whether the student is in any position to evaluate a style as good or bad.  Perhaps a long-term product analysis will tell more than the feelings of the moment.  We have yet to include our alumni in the process of trying to find the perfect teacher or the best pedagogy.

One comment insisted that some students are too scared to tell the truth on questionnaires.  In other words, even the teacher’s most remote presence is likely to have an effect if this observation is true.  Students are also aware of a get-even syndrome.  Colleagues had also addressed that problem.   One student writes: “Most of the time though they [teachers who ask for positive and negative feedback] only get negative feedback because the ones with the problems are the ones who like to gripe.”  Another student says, “I had one professor come in the room and tell us not to be too harsh and to write good things. I have had other professors pull the ‘be honest’ face card, when in all reality the back of that card was exclaiming ‘Say I am great so my superiors can see what an immensely superb job I am doing.’”  Another student rejects the entire concept of teachers being evaluated by the students.  S/he says, “Put another way, most students at ERAU are just looking for a certificate that says they have earned their degree and don't much care about expanding their minds in the Liberal Arts.  Because of this, all teacher evaluations should be burned on the spot.”

Alternate Systems:

I have also had comments from teachers who administer an evaluation about midway through a course.  I have used such a system by way of a forum on Blackboard where students can post anonymous comments; other teachers administer a mid-semester questionnaire.  The advantage of such a system is that one can make changes in a course while it is still going on.  Other systems are under discussion.  My own bias is in favor of focus groups in which students dialogue with the evaluator about their experience of a class.  What I would never challenge is the view that, being part of the classroom experience, the student is indeed the best source of evaluative data.  I am also quite sure that the system now in use is unclear, unfair, and harmful.

Some “Should Have” Statements:

  • The institution should not have hesitated to dump the Scantron forms as soon as the erratic completion of items became obvious.  The aura of distrust and the attempts to dishonestly tweak the system are far more costly harms than the cost of the paper.

  • The computer program should have at least emulated the old design.  Embry-Riddle has a sufficient number of statisticians and programmers who should doubtlessly have constructed something better than did Scantron. 

  • ERAU has layout artists and designers who doubtlessly should have been able to design a better form than the one now in use.

  • One colleague mentioned that a better method would be “for the faculty member to arrange with the administrative assistant to go into the classroom at the beginning of a period and to administer the review. Once they are taken up, the AA would keep them under lock and key until results are tabulated.”  Given the immense variations in time, context, and accompanying text—including the explanation of the ambiguities of the form—I will continue to wonder what self-interest might have been at stake, since such a method has not been in use already.

  • Given the students’ concern with who will read the evaluations, the administrator of the evaluations should also clarify into whose hands the forms will be delivered.  I had never even thought that students might worry that a department chair should be treated in a particularly nice manner because the chair reads his or her own evaluations without any delay until after the assignment of grades, but they clearly have voiced that concern.

  • If we do not correlate evaluations with GPA, we should probably not ask for the GPA on the evaluation; several students were worried about loss of anonymity.

Most of these reflections may soon be outdated if we do go to an on-line system of evaluations; however, I must confess my deep disappointment in a system that has for so long relied on so poor a method to establish something as nebulous as pedagogical excellence, pursuing an absolutist notion dogmatically while remaining unwilling to re-examine itself again and again.

[1] The best way to illustrate category errors is to tell the tale of person X who—somewhere near Clyde Morris and International Speedway—wanted to know where the University is.  If we were to make a sweeping pointing gesture toward the campus, he might answer, “I know where the Student Village is; that’s not it.  I want to know where the University is.”  We might try one more time and make another pointing gesture.  If the person were then to say, “I know where the ROTC building is, but I want to know where the University is,” we would probably give up trying to point, except to point person X to the nearest logic text to read about category errors.

[2] Many-hands issues are ethical problems that involve many actors.  Most ecological issues are typical many-hands issues.  For example, if I dump my oil after an oil change onto my lawn, probably little harm is done.  However, if we all were to dispose of our oil in such a manner, we would probably face a major environmental catastrophe.  Many-hands problems are particularly difficult to solve since the individual hands are difficult to persuade to become responsible agents.