The behavioral economics of teacher incentives. (Maybe, maybe not.)

Good teachers matter indeed.

Some teachers are born this way; they just have the natural ability that it takes. Others have to work, and possibly work hard, on becoming at least halfway decent ones. That often requires considerable effort that may, or may not, be elicited through teacher incentives. Not surprisingly, incentivizing teachers – and measuring teaching outcomes – has been on the agenda for a while but remains a bone of contention.

A recent working paper by Italian researchers suggests strongly that teaching evaluations, for example, are not a promising avenue for the simple reason that they can be, and apparently are, gamed. The authors show persuasively that teaching evaluations are not only a poor measure of effectiveness but may, in fact, measure the opposite. Not really a surprise there, except maybe for administrators who are convinced that everything can be KPI-ed. Well, the paper by Braga et al. should make them think again.

While it is difficult enough to determine good teachers ex post, it is even more problematic to do so ex ante. Say the authors of another new working paper:
“Observable characteristics such as college-entrance test scores, grade-point averages, or major choice are not highly correlated with teacher value-added on standardized test scores … . And, programs that aim to make teachers more effective have shown little impact on teacher quality … . To increase teacher productivity, there is growing enthusiasm among policy makers for initiatives that tie teacher incentives to the achievement of their students.” (Fryer et al. 2012, p. 1) Apparently, in the USA at least ten states and many school districts have implemented various teacher incentive programs.

In “Enhancing the efficacy of teacher incentives through loss aversion: A field experiment,” Fryer et al. (2012) ride an old workhorse of behavioral economists – loss aversion – for some additional mileage. Loss aversion is the idea that people – somewhat irrationally – cling to something that is theirs when on average they should not. The idea of loss aversion is closely tied to various “endowment effects” that figure prominently in the behavioral economics literature and is also a key ingredient of prospect theory.
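For readers who prefer to see the idea in concrete terms, here is a minimal sketch of the kind of value function that prospect theory uses to express loss aversion. The function name is made up for this post, and the parameter values (alpha = 0.88, lambda = 2.25) are the estimates commonly quoted from Tversky and Kahneman's 1992 work; they are assumptions for illustration only and are not taken from Fryer et al.

```python
def prospect_value(x, alpha=0.88, lam=2.25):
    """Subjective value of a gain or loss x relative to a reference point.

    alpha < 1 gives diminishing sensitivity; lam > 1 makes losses loom
    larger than gains of the same size. Both defaults are illustrative
    assumptions, not estimates from the teacher-incentive paper.
    """
    if x >= 0:
        return x ** alpha
    return -lam * ((-x) ** alpha)


if __name__ == "__main__":
    gain = prospect_value(100)
    loss = prospect_value(-100)
    # Under these assumed parameters a loss of 100 weighs more heavily
    # than a gain of 100, by a factor of lam.
    print(f"value of +100: {gain:.2f}")
    print(f"value of -100: {loss:.2f}")
```

Whether real people, and experienced teachers in particular, actually behave this way is exactly what is in dispute.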

Arguing that there is “overwhelming laboratory evidence for loss aversion” (p. 18; see also p. 2 for a similar statement) but little from the field, the authors report the results of a field experiment that they undertook during the 2010-2011 school year in nine schools in Chicago Heights, IL, USA. They randomly picked a set of teachers for participation in a pay-for-performance program – 150 of 160 eligible teachers chose to participate – and then randomly assigned these to one of two treatments. In the “Gain” treatment, participants received bonuses linked to student achievement at the end of the school year. In the “Loss” treatment, participants received a lump-sum payment at the beginning of the school year, all or part of which they had to return if their students did not meet performance targets. Teachers with the same performance received the same final bonus independent of the frame.
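The point that the two frames differ only in timing, not in total pay, can be made concrete with a small sketch. All figures and names here (the advance of 4000, the maximum bonus of 8000, the performance score between 0 and 1, and the linear bonus rule) are hypothetical and chosen only for illustration; they are not the payment schedule described in the paper.

```python
MAX_BONUS = 8000  # hypothetical maximum bonus
ADVANCE = 4000    # hypothetical lump sum paid upfront in the Loss frame


def earned_bonus(performance):
    """Bonus earned for a performance score in [0, 1] (assumed linear rule)."""
    return performance * MAX_BONUS


def payments(performance, frame):
    """Return upfront payment, year-end settlement, and total for a frame.

    A negative year-end settlement means the teacher has to pay money back.
    """
    upfront = ADVANCE if frame == "loss" else 0
    year_end = earned_bonus(performance) - upfront
    return {"upfront": upfront, "year_end": year_end, "total": upfront + year_end}


if __name__ == "__main__":
    for score in (0.25, 0.5, 0.75):
        gain = payments(score, "gain")
        loss = payments(score, "loss")
        # Same performance, same total; only the timing and framing differ.
        print(score, gain, loss)
```

In this toy version a teacher with a low score ends the year handing money back in the Loss frame and receiving a small cheque in the Gain frame, yet the totals are identical. Any difference in effort between the two groups therefore has to come from how the payment is framed.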

The result: Those in the Loss treatment managed to increase student math test scores significantly indeed (“equivalent to increasing teacher quality by more than one standard deviation”) while those in the Gain treatment didn’t. The authors attribute the difference to the “Loss” frame; essentially they argue that teachers who were paid upfront, and threatened with repossession if they failed to make the grade, were better incentivized because they were “loss” averse.

I am rather skeptical about these results and doubt that they will be confirmed in (large-scale) replications, or for that matter in field applications. What makes me skeptical is that, for starters, the alleged laboratory evidence in favor of loss aversion is much less overwhelming than the authors try to make us believe. For example, work by Plott & Zeiler in The American Economic Review 2005 (here), 2007 (here), and 2011 (see here and here for a user-friendly blog entry based on their 2005 article that led to the 2011 controversy) has seriously questioned the reality of the endowment effect, as well as the related asymmetry between willingness to accept and willingness to pay, and hence the underlying idea of loss aversion. Ironically, one of the co-authors (List) has also made his name with artefactual experiments that seem to demonstrate that only inexperienced consumers are likely to fall for endowment effects (e.g., this 2004 Econometrica piece).

I am also skeptical about these results because – while other explanations are argued not to be likely (something which, especially regarding cheating, seems debatable) – what strikes me as the most obvious explanation is not being discussed: Hawthorne, Pygmalion, placebo and other expectancy effects (e.g., here; see also a recent piece by two of the present authors). Rather than being loss averse, those in the Loss treatment may simply be shame averse for not having made the grade, an effect that most likely was significantly enhanced by them knowing that they were closely watched by scientists.

Last but not least there is, of course, the question whether the one-off effort, if indeed it exists to some extent, could be extracted year after year after year. Maybe, maybe not.

2 thoughts on “The behavioral economics of teacher incentives. (Maybe, maybe not.)”

  1. I know that I am kind of late to start a debate on that topic but I think the paper is still not published so I want to give some thoughts about it.

    First of all, Andreas Ortmann wrote a nice comment on the paper and pointed out some important weaknesses of the study. However, I would like to say that effects like Hawthorne, Pygmalion and placebo can be neglected here because those effects hold for both the gain and the loss treatment. Furthermore, the shame should also be the same because the gain teachers could also fail to have “made the grade”. The important point is that even though these effects can be assumed to have the same magnitude for both treatments (why shouldn’t they?), there are only significant results for the loss treatment (which is also not quite true: if you decompose the results by grade, you will also find highly significant and huge effects for K-2 students in the gain treatment).

    For me, more severe weaknesses are the lack of any interpretation of how the results are generated by the teachers, and also that the subject pool being treated (both students and teachers) is so far below average. Ironically, the authors devote a section of their paper to how well incentives work in developing countries, and in a later section they state that they chose a subject pool where 96% of the participants receive free meals because they are poor. The authors also don’t give any information about how the subject pool influences their results.

    Somebody still interested to discuss?


    1. I just see this now; didn’t get notified when you posted. I would have to go back to the original post and the articles. Might do in due course ….

