What she sees at the revolution

Peggy Noonan is a writer and columnist for the WSJ. Part of her reputation stems from her writing speeches for Reagan and the elder Bush, and from coming up with memorable phrases. Some of these phrases apparently did not work out well for those for whom she coined them. Read my lips.

In a recent WSJ opinion piece titled How Global Elites Forsake Their Countrymen – a piece much shared on social media – Noonan enlightens us about the failure of global elites to empathize:

“The larger point is that this is something we are seeing all over, the top detaching itself from the bottom, feeling little loyalty to it or affiliation with it. It is a theme I see working its way throughout the West’s power centers. At its heart it is not only a detachment from, but a lack of interest in, the lives of your countrymen, of those who are not at the table, and who understand that they’ve been abandoned by their leaders’ selfishness and mad virtue-signalling.”

Noonan, presumably to impress on us her status among the well-connected, opens her piece by recounting a meeting with “an acquaintance of Angela Merkel, the German chancellor and the conversation quickly turned, as conversations about Ms. Merkel now always do, to her decisions on immigration.” Noonan then recounts Merkel’s announcement in late 2014 that refugees from Syria, Iraq and elsewhere were welcome in Germany, the subsequent influx (net) of more than a million in 2015, the resultant public discussions in Germany about this fact, and the claim that, coming “from such a sturdy, grounded character as Ms. Merkel the decision was puzzling – uncharacteristically romantic about people, how they live their lives, and history itself …”. We learn that the acquaintance of Merkel attributes her puzzling decision to her upbringing as the daughter of a Lutheran minister in East Germany, and sees it as yet another attempt at providing “a kind of counter-statement, in the 21st century, to Germany’s great sin of the 20th.”

We learn that, while this was as good an explanation as Noonan heard, there was a fundamental problem with the decision:

“Ms. Merkel had put the entire burden of a huge cultural change not on herself and those like her but on regular people who live closer to the edge, who do not have the resources to meet the burden, who have no particular protection or money or connections. Ms. Merkel, her cabinet and government, the media and cultural apparatus that lauded her decision were not in the least affected by it and likely never would be.”

“Nothing in their lives will get worse. The challenge of integrating different cultures, negotiating daily tensions, dealing with crime and extremism and fearfulness on the street—that was put on those with comparatively little, whom I’ve called the unprotected. They were left to struggle, not gradually and over the years but suddenly and in an air of ongoing crisis that shows no signs of ending—because nobody cares about them enough to stop it.”

Noonan goes on to invoke the Cologne transgressions at the last New Year’s Eve celebrations, Merkel’s adjustment to the considerable political backlash that her policies have brought about (the strong emergence of the AfD and the growing support for other populists such as Seehofer, the head of her own party’s Bavarian branch), and her pleading with her own populace to deal with both the positive and the negative aspects of globalization. Quoting a fellow journalist, Noonan argues: “’This was the chancellor’s … way of acknowledging that various newcomers to the national household had begun to attack her voters at an alarming rate.’ Soon after her remarks, more horrific crimes followed, including in Munich (nine killed in a McDonald’s) Reutlingen (a knife attack) and Ansbach (a suicide bomber).”

Now, it is rather rich that as prominent a megaphone for the global elites as Noonan virtue-signals her compassion for the disenfranchised masses that allegedly have fallen victim to the NIMBY syndrome. For all we know, Noonan got paid royally for her piece and was writing it in a brownstone home in an affluent residential New York City neighborhood.

All that hypocrisy aside, while we have come to expect false and silly claims from presidential candidates in the USA, it is noteworthy that Noonan seems not to check the facts that she parades to make her case. Of the three horrific crimes that she mentions, at best two can be clearly linked to Merkel’s open-door immigration policy (the suicide bomber in Ansbach, and maybe the Reutlingen knife attack, which, while committed by an asylum seeker from Syria, seems to have been a crime of passion). Importantly, the McDonald’s killings were committed by some kid born in Germany who was as confused and unhinged as some of the school shooters in the USA from whom he seems to have taken his cues. Apparently, fact-checking is not Noonan’s thing. Never let the facts get in the way of a story that sells. True journalism, that.

Yes, there is no doubt that the open-door policy was ill-advised, lacked appropriate consultation, and was poorly implemented, in particular at the federal level, but the fact is that murders in Germany – currently about 250 annually – have been cut by 40 percent since 2000 and, at least for 2015, this number has not increased, notwithstanding the influx of the various newcomers to the national household. For all I can tell, Germany is far from falling apart at the seams, as some of the hysterical press and social-media responses have tried to suggest.

The Independent — a British newspaper, no less — has argued that Angela Merkel’s open-door immigration policy will protect Germany from terrorism in the long run. It seems that for now things have worked out remarkably well even in the short run, notwithstanding the fact that this policy has been implemented poorly.

While it is way too early to assess all the benefits and costs of the developments in Germany since late 2014, it seems self-evident that Noonan is mostly uninformed about the current state of affairs there. That, unfortunately, seems to be the modus operandi of post-truth journalists like her who are no better than the illiterati and innumerati populating social media.

What the FWC DECISION on Frijters v University of Queensland can teach us

In the wake of the recent academic-freedom cases involving Safe Schools co-founder and academic Roz Ward (here, here, and here) and the journalism academic Martin Hirst, comes the decision that the Fair Work Commission (FWC) posted yesterday in the case of Frijters v University of Queensland.

As you may recall, I wrote about this deplorable situation more than a year ago here on CET (for a refined version, see here). I suggested that UQ’s attempt to charge Professor Frijters, the 2009 winner of the bi-annual Young Economist Award of the Economic Society of Australia, with (serious) research misconduct and to suppress his, and his former student’s, research on racial discrimination on Brisbane public transportation was ill-considered. I also suggested that, at that point already, the University’s ongoing and drawn-out attempts to pin those charges on Frijters were at a minimum disproportionate to the facts then known and also apparently unduly influenced by Brisbane bus company Translink.

The University administration under VC Peter Hoj decided – for what now look like poorly veiled punitive reasons – to push on, finding in a 25 March 2015 report by one Professor Wright that Frijters was guilty of misconduct and that disciplinary actions were warranted. This led Frijters on 13 April 2015 to make his application for the Fair Work Commission to deal with the dispute.

The DECISION that Commissioner Bissett posted yesterday is a pointed slapdown for UQ and some of its top administrators. While her decision is couched in typical legalese, the Commissioner does not mince many words, given the restraint that her office and position require. Over more than 50 dense pages, and no fewer than 379 detailed statements of fact, assessments, and decisions, she makes it very clear that the University systematically, and throughout the three-year saga, violated its own Enterprise Bargaining Agreement, and that these violations were substantial and in several cases prejudiced.

To wit,

“[239] For the University to suggest that these are not fatal errors is to not give proper weight to the words of the agreement it has entered into with staff or a set of procedures it developed.”

Says the Commissioner, concluding:

“[374] I am satisfied that there were substantial flaws and a lack of procedural fairness in the process applied to Professor Frijters with respect to dealing with a complaint about the research.

[378] I am satisfied that the failures in the process, and hence the failure to apply the provisions of the 2010 Agreement properly are such and extend so far back that the entire process, including outcomes, is not reliable. There is no point in the process where it is possible to say that everything before that point in time was reasonable. The process was infected by error from so early on that the fairest thing would be to commence the process from the beginning again.”

The whole document makes for depressing reading and shows key admin players at the University have lost whatever compass, moral or of proportionality, one could and should reasonably expect. The document presents administrative arrogance of the kind that many academics here in Australia unfortunately have come to expect too often.

Again Commissioner Bissett does not beat around the bush (here and in many other places):

“[200] The 2010 Agreement and the research misconduct procedures have set out clearly how an investigation is to come about and how it is to be conducted. That the process and requirement may appear inconvenient or even if they are not fully fit for purpose in the particular circumstances does not give the University the right to alter those procedures. The procedures provide staff members with understanding and confidence in what is to take place. To vary from them so markedly is to undermine the importance of the 2010 Agreement entered into by the University freely with its employees. This is not something to be lightly put aside.”

I urge my fellow academics to read this document; who-dunnits and morality plays do not get much better.

What can we learn from the Frijters v University of Queensland saga?

First, discovery procedures such as FWC hearings are a beautiful thing.

Second, universities – even in the G8 – have more than their fair share of unscrupulous people who believe that under the cover of hierarchy and bureaucratic procedures they can act out any way they see fit. (I know, for many a reader here that is hardly a surprise.)

Third, for a leadership team that has been so clearly unmasked as being in contempt of its own agreements, it seems impossible to regain rapport with its staff members even under the best of circumstances (such as an honest apology). The honourable thing to do seems to be to accept the finding (and for key players to take their hats).

Fourth, in [379] Commissioner Bissett notes that “It is not for the Commission to indicate the fairness or otherwise of Professor Frijters being put through the process again. That was always a likely outcome of the instigation of these proceedings.” It is to be hoped that the Commissioner is incorrect in that assessment and that those responsible for the ordeal they inflicted on Professor Frijters are not allowed to give it another shot.

Fifth, Professor Frijters – at considerable cost to his health, and at straight out-of-pocket and opportunity costs to himself – has provided us with a public good of considerable value. We should appreciate it. And learn from it. It seems about time to start a FIRE in Australia.

Sixth, every academic should be glad, and grateful, that an entity such as the FWC exists.


The benefits and costs of Facebook (and how to maximize surplus for self)

Facey is in the news again. Apparently one of Zuckerberg’s former employees went rogue and told the world that the news that is being streamed and trended is not determined by some “objective” algorithm. Rather the news is curated by a bunch of left-wing Ivy-Leaguers. Shocking news indeed. Who would have thought that a paragon of virtue such as Facebook and its owners would feed us biased news and opinions?

Predictably, plenty of social-media activity ensued. Also, the Facebook overlords invited a bunch of conservative / right-wing “victims” of that bias to its headquarters for an – undoubtedly very sincere – session. It must have been quite something, as even Glenn Beck found some aspects of it unpalatable. More social-media activity ensued.

Facebook also recently played an enabling role in the Aussie cases of Safe Schools co-founder and academic Roz Ward [see here and here and here], Senator Leyonhjelm’s senior policy advisor Helen Dale [see here], and racist posts directed at outgoing Senator Nova Peris. All cases demonstrate that the boundary between what is private and what is public is not well defined, and that in fact for all practical purposes the boundary is evaporating. Always expect everything that you post on Facey to find its way into the public discourse even if your privacy settings are non-public.

Discourses on Facebook can, of course, quickly spin out of control.  I doubt there is anyone who has not been through, or at least watched, debates that quickly deteriorated into exchanges of accusations, imputations of beliefs, insults, and more. Lots of virtue-signalling, too. As the adage has it, it can be a good thing for everyone to have a voice but it is often not. Too many people do not take the opportunity not to say something when it is offered.

While for years Facebook has been considered a necessary evil by many, its scope and usage continue to grow, with recent numbers suggesting that more than half of the Australian population have an account and on average spend 1.7 hours a day on it, qualifying them as some of the heaviest users world-wide.

Time, in any case, to weigh the benefits and costs of it, with all the self-serving biases that might entail.

Let’s get the costs out of the way first.

There are at least three costs and they are considerable, no doubt:

First, the Facebook business model monetizes the kind of private information that (all too) willing customers are providing by liking this and reacting to that in the various ways that it provides. Facebook uses these revealed preferences to customize the messages that it sends and the ads it presents. The more of these ads the Facebook user reacts to, the better for Facebook since that kind of activity translates straight into Facebook’s profits, click-through rates being an important success metric. For the Facebook user that implies not only various annoyance costs but quite possibly also subtle meddling with preferences and never-ending targeted marketing. Lots of attention costs on top of the opportunity cost that Facebooking brings about in any case.

Second, Facebook presents a considerable invasion of privacy and security all the way to outright scams (google “skype video scam”). Reasonable precaution – for me, for example, not accepting the many requests by scantily dressed pretty young things who want to friend me – can avoid some of it but it is essentially impossible to escape the milking of emails that Facebook seems somehow to be able to do. It is surely no coincidence (although it seems, as often, poor inference on Facebook’s part) that ads for some airline show up when I have just made a booking on that airline for a trip.

Third, and possibly the most important negative thing about it, are Facebook’s addictive features. Like e-mail or texting, it is now well understood to induce a form of neural addiction and, as with e-mail or texting, that addiction gets triggered faster when its rewards are structured in an intermittent-variable way. This does not even take designers to do; it is in a sense a built-in feature. Your mind gets easily hijacked, the reason being the choice between the considerable effort a serious task takes – writing that article, or report, or preparing that lecture – and reading up on a couple of chatty news items.

Relatedly, there are also health costs that come with it. Because Facebook is often read on mobile devices (in Australia, of the 10 million that are on Facebook every day, 9 million are on a mobile device), “text necks” have become a recent epidemic with serious consequences for those affected.

Now that we have got the costs out of the way, why would anyone in their right mind use it? Well, for starters, other social media platforms such as Twitter are not much better. (Not that I would know first-hand.)

I see at least three benefits:

First, it is an easy way to stay in touch with the large number of people I have met over the years: friends from kindergarten (quite literally), peers at various educational institutions, and teachers and students as well. Then there are all kinds of social contacts, random and not-so-random encounters of various kinds, and others. Of course, this motley network poses interesting questions about what the meaning of the word “friends” is in this context. Some have argued that there is a natural limit to the number of meaningful friends one can have [Dunbar’s number]. To me these are very silly notions drawing on conceptions of friendship that are antiquated. What is clear to me is that there is absolutely no reason to lose friends as you age, as was claimed in one recent study (for write-ups about this study, see here and here).

Of course that takes some serious curating of your set of friends but it is often worth it. Once you have more than a couple of hundred friends, Facebook’s default setting deals with the resulting information flow and overload on its own mysterious terms and its algorithms often lead to “friends” drifting out of the feed. So Facey’s birthday reminder – worth the price of admission alone if you are as forgetful about these things as I am – is a good trigger to catch up quickly with someone’s recent activities. Being pro-active in this respect, and using several of the customization functions and prompts, is a way to defeat the Facebook algorithms that run in the background.

Second, Facey can be a fabulous content aggregator. Many – literally hundreds – of my Facebook friends are academics. At any point in time, and on any topic (replication crisis in the social sciences anyone? academics cutting corners? politics in Straya?), I can count on them posting comments, or links to articles, that I would most likely not come across otherwise. All moderated by the credibility that they have with me based on often long histories of postings. In addition, if I try to recall something or need some pointers, typically a brief post suffices to quickly get me all the information that I need. And then some.

Third, Facey is an excellent platform to explore ideas, to test-drive ideas that might just be of interest (e.g., this post essentially aggregates ruminations of myself and others on and about Facebook), even to test-drive ideas that might lead to publishable products. But it goes beyond content aggregation. It is the nature of Facebook that you can throw out even outrageous ideas for commentary.

How to maximize the value added for self?

First, stay away every day for about half a day completely. This is a strategy to avoid the neural-addiction problem and a rather useful way to go about digital detoxing. It is not the only strategy of course – others have chosen other ways such as writing cocoons (see here and here) – but mine works quite well for me.

Second, when you are online – whether every day for half a day or between writing cocoons – try to use it as a reward mechanism. I.e., don’t leave it on because you make yourself susceptible to the kind of neural addiction discussed above. Set yourself a goal – like finishing a draft, or reviewing a paper – and then reward yourself with some Facebooking. Then repeat. It might take some serious convincing yourself that the latest Kanye West update can really wait but once you manage to do it, you should have tamed the neural-addiction beast. Note that this way you also deconstruct the intermittent-variable-reward mechanism problem. While the strategy is simple and straightforward, its implementation is less so: Facebook’s data scientists estimate people check the platform about 14 times a day.

Third, be proactive. That is the way to counteract Facebook’s deviousness; it is also a way of keeping your site useful for your friends. Don’t just post stuff; at a minimum, unless it is obvious, rationalize your posts by offering a summary or commentary, or at least some excerpts from the article that you post. Facebooking is an indefinitely repeated game with multiple players and gift exchange is a major driver of its usefulness. Do not hesitate to delete out-of-bound or irrelevant comments or their perpetrators. Too many will post and repost over and over again what they think is interesting, often even without bothering to explain why, and thus most likely will end up cluttering your site. Plus, there are many who specialize in proselytizing on sites that are better curated and where their posts might reach more readers. While digital assets of the Facebook kind are not exactly commons, they tempt many to treat them that way.

In sum then, Facey has important costs — from outright opportunity costs over various annoyance and attention costs to added privacy, security, and addiction risks as well as health costs — but it also can be customized to bring about considerable benefits. By revealed preference, I obviously believe that the benefits outweigh the costs. It takes some effort though to get there.


Why Boehmermann and Merkel have already won, and Erdogan is set to lose: Some backward induction

The players and their alleged actions

Unless you have lived under a huge rock for the last couple of weeks, you will have heard about that German comedian (Boehmermann) who in his TV show dared to insult Turkish president Erdogan with a rather (c)rude poem in which assertions were made (involving, for example, goats, shriveled balls, a tiny penis, paedophilia, SM, gang-rape, etc.) that we can only hint at in this fine family outlet.

Erdogan, already enraged by a short and rather brilliant song video that colleagues of Boehmermann at another TV program had produced earlier, not only demanded that the poem also be taken down, but demanded that the German government – represented by Merkel – allow an obscure paragraph in Germany’s Criminal Code (“Strafgesetzbuch”) to be invoked that makes insulting a foreign leader punishable with up to 5 years in prison. That obscure paragraph, 103, goes back to 1871 and has been invoked only a few times previously. Equally obscure, and absurd, is paragraph 104a which stipulates that the government must decide whether it allows the complaint to go forward under 103.

After a few days of consultation and reflection, Merkel allowed the complaint to go forward, copping plenty of criticism for her decision from the usual bunch of Libertarian simpletons (but the freedom of artistic expression! and the freedom of speech!), politicians who oppose Merkel on all issues as a matter of principle (Die Linke), or at least saw an opportunity to score cheap public-opinion points (the Social Democrats, her coalition partner, who continue to slide towards oblivion in the polls), and of course the usual slew of (social-)media dimwits and ignoramuses.

The New York Times editors, for example, chipped in, demonstrating for the most part their lack of knowledge about the situation and their lack of understanding of the context. Erdogan silencing all of satire in Germany? Really? Boris Johnson also took it upon himself to lecture the world about the unprincipled decision that Merkel had allegedly taken. Creditable and evidence-based opinion right there. Of course, on top of being mostly spectacularly ill-informed about important details, all these arm-chair commentators had their own motives which we can safely assume had to do with the advancement of their own profile.

The (legal) facts

One fact is of particular importance and it has all but gotten lost in the media storm that has ensued. In his TV show Boehmermann started with a comment on Erdogan’s failed attempt to have that earlier biting song video about him taken down. Pretending to lecture Erdogan directly, and heaping in passing plenty of subtle ridicule on him, Boehmermann first expounded why Erdogan failed in his earlier bid to have the song video taken down. He then explained why some such song video – and any fact-based song video of that make – would be covered by freedom of artistic expression and freedom of speech (“Kunst- und Meinungsfreiheit”) in every civilized country in Europe, or at least the European Union (of which Erdogan would like Turkey to be a member).

Boehmermann then proceeded and explained where the limit of such satire was, seemingly giving up at some point his attempt to explain legal fine points and instead illustrating with his poem the limits of what can be said. Throughout he stressed – dialoguing with a sidekick — that this was an illustration of what one could not say under paragraph 103. It was all quite brilliant, as these things go. Very funny, too. And it rhymed.

Looking at the context in which Boehmermann recited his poem, it seems to me that it cannot possibly be construed as violating paragraph 103. I predict, and I predict confidently, that the judges will agree, dismiss Erdogan’s complaint (or, at worst, slap Boehmermann with a small token fine), and in passing shower Erdogan with more ridicule.

(The young Augstein, thinking of himself – once again falsely – as being the legitimate successor of his father, argued that a violation of that paragraph for illustrative purposes is still a violation, in the same way as the illustration of an assault on someone is an assault. This attempted analogy seems pretty obviously flawed but I will let the legal experts sort that out.)

The players and their actions

To understand why Merkel’s move was a savvy one, and ultimately the only rational one, let us do some backward induction. It is a reasoning procedure that assumes rational (and typically self-regarding) behavior and starts from the outer reaches, or terminal nodes, of the sequential game tree that one wants to analyze, working back from there to the first move. For the sake of simplicity, let us assume the terminal node will be the media circus in which Boehmermann, and his lawyers, will get the chance to explain why paragraph 103 does not apply.

(Reasonable people can disagree whether this spectacle is indeed the end of the game. One could, for example, argue that this spectacle is embedded in a larger sequential game whose terminal nodes involve the repeal of the paragraph, something that all parties at this point seem to have agreed on for 2018 already. But let us focus on the more narrowly defined game. The extension just complicates the analysis but does not undermine the key points that need understanding.)

Merkel surely made her decision not by herself but based on legal advice and plenty of consultation and reflection on the payoffs of the actions she had available. I have little doubt that Merkel has been advised that Boehmermann is not likely to face serious consequences under 103 although he may still face consequences when Erdogan fans – of which there are many dimwitted ones even in Germany – try to go after him (but they will of course do that anyway). It is indicative that Boehmermann is currently under police protection and had to cancel the next instalment of his show. But it was he and the producers, and not the government, who decided on this precaution, contrary to what some uninformed sources have argued or intimated.

Had Merkel decided not to allow the complaint to go ahead, she would have unnecessarily – especially given that she needs Erdogan to sort the refugee mess out – gone confrontational with the dude at predictably very high cost to her and the country. She also would have had to continue to deal with the issue (rather than let the court and Boehmermann deal with it) and would have pre-empted what I anticipate to be a lesson for Erdogan about the freedom of artistic expression, the freedom of speech, and for that matter the separation of powers – surely it will be a spectacular lesson. Hand over the popcorn.

Allowing the complaint to go forward under that silly paragraph was, in game theory lingo, a dominant strategy and it was the clever thing to do. Merkel lobbed the whole affair out of her court, seemingly conceding to Erdogan that he might have a case but at the same time making sure that dude will get yet another fundamental lesson in what satire is allowed to do in Europe. As a matter of fact, her own framing of the situation mentioned – not coincidentally – Rechtsstaatlichkeit (roughly due process and separation of powers) and the presumption of innocence as motivators for her decision.
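
For readers who like to see the mechanics, here is a minimal sketch of backward induction on a toy version of this game. Everything specific in it is my own illustrative assumption: the payoff numbers, the modelling of the court as a third player who prefers to dismiss, and the names Node and solve. Only the ordering of the payoffs matters, and none of it is estimated from the actual case.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple

# Positions in each payoff tuple (hypothetical players, for illustration only).
MERKEL, ERDOGAN, COURT = 0, 1, 2


@dataclass
class Node:
    player: Optional[int] = None               # who moves here; None at a terminal node
    payoffs: Optional[Tuple[int, ...]] = None  # set only at terminal nodes
    children: Dict[str, "Node"] = field(default_factory=dict)


def solve(node: Node) -> Tuple[Tuple[int, ...], List[str]]:
    """Backward induction: return the payoffs and action path of rational play."""
    if not node.children:                      # terminal node: nothing left to decide
        return node.payoffs, []
    best = None
    for action, child in node.children.items():
        payoffs, path = solve(child)           # solve the subgame first
        # The mover keeps the action that maximizes his or her own payoff.
        if best is None or payoffs[node.player] > best[0][node.player]:
            best = (payoffs, [action] + path)
    return best


# Made-up ordinal payoffs (Merkel, Erdogan, Court); higher is better.
game = Node(player=MERKEL, children={
    "allow": Node(player=COURT, children={
        "dismiss": Node(payoffs=(3, 1, 1)),
        "convict": Node(payoffs=(2, 3, 0)),
    }),
    "block": Node(player=ERDOGAN, children={
        "retaliate": Node(payoffs=(0, 2, 0)),
        "accept": Node(payoffs=(1, 0, 0)),
    }),
})

payoffs, path = solve(game)
print(path, payoffs)
```

With these assumed numbers the solver has Merkel allowing the complaint and the court dismissing it. Note that in this toy tree both outcomes after "allow" give Merkel more than either outcome after "block", which is the sense in which allowing is the better move whatever the later movers do.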

In sum, for all I can see Merkel did everything right in this situation. Boehmermann will get the glory for having triggered the repeal of the 103 and 104a paragraphs and has become a household name in Germany and beyond (I had never heard of the guy before): It was a brilliant performance by any measure, as others have also observed. Merkel can lean back and enjoy the show – possibly tete-a-tete with Boehmermann – that is certain to follow and meanwhile deal with way more consequential issues, such as the threatened Brexit, the continued Greek crisis, the refugee crisis, and the rise of a very vocal right-wing movement in Germany.

Erdogan will soon notice that his way of silencing critics – while it seems to work in Turkey for the time being – does just the opposite in Germany and for that matter in most of Europe: While both the Erdogan song video and the poem would have been heard / seen by maybe hundreds of thousands and would soon, without his interventions, have been forgotten like those Greek and Polish magazine covers showing Merkel as Hitler, or dominatrix, and what not, the song video has now amassed at one source alone more than 8 million hits.


You should make sure to watch it because satire does not get much better.

Erdogan’s curiously ill-advised actions have led to millions hearing and seeing the artifacts that he tried to incriminate and thus brought to the attention of those millions what a dim-witted and delusional wannabe dictator Erdogan is. And that’s before his complaint has been dealt with in court.


Update October 5, 2016. The German court has decided to dismiss the complaint, as predicted.


So, is there a crisis? Or is there a crisis of the crisis, or what? On replicability, reproducibility, and other current challenges in the social sciences

As the old adage has it, before it gets better it will get worse.

I have previously written about the deepening sense of crisis in economics and psychology (e.g., in The Conversation and in Core Economics Today – here and here and here).

Three interesting recent exhibits

In the last couple of weeks we have seen three interesting additional exhibits:

First, a quartet from Harvard tried to straighten out the public record established a few months earlier by the Open Science Collaboration (OSC), a group of 270 researchers from psychology: that results in psychology are mostly not replicable. Gilbert and his colleagues make the remarkable claim that there is “no evidence for a replicability crisis in psychological science.”

Second, on the same day an author made the astonishing claim in Slate magazine that 20 years of studies of ego depletion, an influential and seemingly robust set of findings, have recently dissolved into thin air.

Third, a group of 18 researchers in economics published the (long awaited) findings of another replication project and – after the dismal findings of another such attempt reported by two researchers from the St Louis Fed last year – had mostly good news for econs. A colleague of mine from psychology sent me the write-up from The Economist with these words:

“So all is fine in the house of experimental economics then…”



Is it?

Is there a crisis, if not in economics, then in psychology?  Or, in the social sciences as such?

In the following I will briefly comment on each of these three events from the last couple of weeks and the heat that they have produced. I will then try to cast a broader net before I come to an assessment of the current state of affairs in the social sciences.

So, is all fine in the house of experimental economics?  

As I pointed out to my colleague, a replication of 11 of 18 experiments published in a couple of top journals (which have an acceptance rate of well below 5 percent and huge selection biases) says little about the state of the art – replicability, reproducibility – in economics.  I like to believe that evidence production in economics is more stable than in psychology because economists’ experimentation practices are less laissez-faire but I fear that we also have a lot of false positives. In work that I have done with Le Zhang (currently under second-round review), we have shown that dictator game experiments published in the top experimental economics journal were typically severely underpowered, inviting them pesky false positives. While Camerer et al. ran their replications under an exacting standard of a required power of 0.9, until recently most (well, at least most dictator game) experiments in economics were not properly powered up.
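To make the point about power in the paragraph above a bit more tangible, here is a minimal sketch in Python (standard library only). It uses the textbook normal approximation for a two-sided comparison of two equally sized treatment groups. The function names, the standardized effect size of 0.5, and the sample sizes are illustrative assumptions of mine; none of them is taken from the dictator game papers or from the Camerer et al. replication protocol.

```python
from math import ceil, sqrt
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal distribution


def power_two_sample(d, n_per_group, alpha=0.05):
    """Approximate power of a two-sided two-sample test.

    d is the standardized effect size (difference in means divided by
    the common standard deviation). Normal approximation, so this is a
    rough guide rather than an exact t-test calculation.
    """
    z_crit = STD_NORMAL.inv_cdf(1 - alpha / 2)
    shift = d * sqrt(n_per_group / 2)
    # Probability of rejecting in either tail when the true effect is d.
    return STD_NORMAL.cdf(shift - z_crit) + STD_NORMAL.cdf(-shift - z_crit)


def n_per_group_for_power(d, power=0.9, alpha=0.05):
    """Approximate sample size per group needed to reach the given power."""
    z_crit = STD_NORMAL.inv_cdf(1 - alpha / 2)
    z_power = STD_NORMAL.inv_cdf(power)
    return ceil(2 * ((z_crit + z_power) / d) ** 2)


if __name__ == "__main__":
    # Hypothetical numbers: a medium effect and 20 subjects per treatment.
    print(power_two_sample(0.5, 20))
    print(n_per_group_for_power(0.5, power=0.9))
```

Under these assumptions, 20 subjects per treatment give a power of only about 0.35 for a medium-sized effect, while a power of 0.9 requires roughly 85 subjects per treatment. Smaller effects push the required numbers up quickly, which is the sense in which small-sample experiments invite false positives and inflated effect sizes.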

It is also worth recalling briefly that a few months earlier two Federal Reserve economists came to the alarming conclusion that economics research is usually not replicable. Their conclusion was based on an attempt to replicate 67 empirical papers in 13 reputable academic journals of which they could, without assistance by the original researchers, replicate only a third of the results. With the original researchers’ assistance that percentage increased to about half. A good summary of their study can be found here. This is an arguably even more troubling result, as you would think that empirical findings (where data sets are, after all, pre-existing) would be easier to replicate because they are free of the additional variance that experimenters’ choices of experimental design and implementation details entail. This replication attempt indicates the magnitude of the problems that economics might have.

Indeed, we have seen for a couple of decades now that effects – for example, many so-called cognitive illusions – have not survived serious attempts at replication, or maybe I should say at reproduction. Take as a recent – non-laboratory — example the controversy over reference dependence and the alleged propensity of taxi drivers to shoot for income targets, thereby violating the neo-classical optimizing model of labor supply theory. In a recent article, Hank Farber, using a much larger and complete data set for New York taxi drivers than Camerer et al. (QJE 1997) had, finds that “income reference dependence is not an important factor in the daily labor supply decisions of taxi drivers”. My colleague Tess Stafford had come to a similar conclusion earlier, demonstrating how the results in Camerer et al. (QJE 1997) can be made to appear and disappear. Hint: proper metrics is a key. Complete and large data sets also help.

I could parade many examples (endowment effects anyone? Loss aversion? Conjunction fallacy?) where serious questions have been raised about the replicability and reproducibility of effects claimed in the Biases and Heuristics literature.

On balance then there is reason to believe that economists have a way to go and ought to continue to improve their data collection and sharing efforts and to reflect on the design and implementation of their experiments and, very importantly, the appropriate econometric assessment of the evidence produced. The house of experimental economics, I fear, is not yet in good order.

Is all fine in the house of experimental psychology?  

The Harvard quartet’s critique of the OSC initiative, and its provocative conclusion that “the reproducibility of psychological science is quite high and, in fact, statistically indistinguishable from 100%”, have been widely dissected by OSC as well as individual OSC members (e.g., Brian Nosek and Elizabeth Gilbert here). A number of commentators had their take on the situation published in popular media (e.g., Katie Palmer in WIRED and Ed Yong in The Atlantic, http://www.theatlantic.com/science/archive/2016/03/psychologys-replication-crisis-cant-be-wished-away/472272/; for others see the retraction watch list or Mayo’s summary of the recent developments on repligate), and some highly qualified (albeit not always completely uninterested) parties such as David Funder, Andrew Gelman, Daniel Lakens, Uri Simonsohn, and Sanjay Srivastava have done so in more specialized outlets. (Follow the links attached to the names.)

The latter three have, at least to my mind, pretty well demolished the case that Gilbert and his colleagues presented. As did Funder and Gelman (and some of the commentators on Gelman’s piece).

Funder and Gelman also step back from the battle and look at the war that really is being waged here and by doing so provide some much needed light where there is currently way too much heat.

Funder, for example, points out that the OSC study “is not the only, and was far from the first, sign that we have a problem”; he is too modest to point out that he himself has provided more than a decade back a lengthy contribution and problem description.

Funder, seemingly unaware of the replication crisis that economists are dealing with, also points out that “other fields have replicability problems too”; he mentions specifically biochemistry, molecular biology, and medical research including cancer biology studies.

He then argues, “if Gilbert & Co. are right, are we to take it that the concerns in our sister sciences are also overblown?” It is a rhetorical question to which his answer is pretty clear. He concludes with a useful discussion of “the ultimate source of unreliable scientific research”, locating it in a tightening market for academic jobs and opportunities, the emerging “academic star” system, and other perverse incentives for academics.

The apparent debunking of 20 years of ego depletion findings mentioned at the beginning as one of three prominent developments during the last two weeks is just another illustration of the current unsatisfying state of affairs.

So, is there a crisis?

You have to live in an ivory tower to believe that there is not. It seems obvious to me that there is and that before it gets better, it will get worse. That’s because suddenly everyone is talking about it and has become interested in it. And a general sentiment has developed, and even found its way into editorial practices, that flashy results that barely clear conventional hurdles ought not to be trusted.

Some observers have taken the current debate as a cue to rethink the way we do science. In psychology for example, in the wake of huge and somewhat nasty controversies over the reality of various priming effects, a replication recipe has been proposed and it would indeed be a good start if it were widely implemented. Likewise the various offerings of pre-registration, while not completely uncontroversial, are a welcome move in the right direction if the rather stunning results from NHLBI funded trials recently reported in PLOS One are any indication.

How deep the crisis is, is a question that is harder to answer. That’s because any such answer depends on what our measuring rod is, and ought to be. Are we looking for what some people call direct replication, or are we really interested in what some people have called conceptual replication and yet others have called reproduction? Ben Strickland makes an excellent case for conceptual replication here, arguing that what we really ought to be after is reproducibility of robust effects. Rolf Zwaan makes a related argument here.

Which of course ties into the important question of the appropriate choice of design and implementation characteristics, as well as the question of the correct statistical evaluation, which is another but related battlefield.

In sum, there can be little doubt that there is a crisis. There is no crisis of the crisis, for all I can see. And it seems fair to say that the sense that there is a crisis has both widened and deepened to judge by the evidence that has been forthcoming.

That there is a crisis and that it is widening and deepening, at least for now, is the bad news. The good news is that overdue discussions – about replicability and reproducibility and everything that is connected to them — do take place and do take place in a serious manner. Mostly.

The widening sense of a widening and deepening crisis is upping the level of the game; it is for example encouraging to see the increasing offerings of pre-registered studies, the increased opportunities to publish replications or reproductions, the fact that many journals now require submission of data files before publication, etc.  Similarly, it is encouraging to see platforms such as retraction watch emerge and clearly stay for good.


Ten commandments for the social-media demagogue

  1. Ignore facts. Facts are pesky and quaint. They constrain your narrative and might constrain, oh my, your priors. And they might even make it necessary that you provide links. Which might actually be checked. So, don’t go there.
  2. If someone brings them (pesky and quaint facts) up, either ignore them and their facts or, if that is not possible, question them and their facts – and do so with the right mix of indignation and annoyance — but do so preferably with opinions of others. For example, you could say your acquaintances and/or friends told you so. Remember: facts are pesky and quaint. They could be fact-checked. Opinions of acquaintances and/or friends, not so much.
  3. When you identify something, stay away from precise descriptors; it’s just too friggin labor-intensive. So rather than being specific (lots of work to figure out), say something like “the left” or even better say “the regressive left”. That will a) signal unmistakably where you stand and b) draw the admiration of your followers. Hail, you.
  4. Never worry about base rates. Way too complicated an argument. For example, if you were to look up the percentages of sexual assaults and rapes in Germany, you might learn that those of immigrants were about the same as those of “natives”. Which would make posting attention-grabbing individual cases look kinda silly. But that’s where you want to go. Post attention-grabbing individual cases. (Don’t forget to prettify your post with some pictures that tug at heart-strings. Children that have drowned, or at least look sad, are always a winner.)
  5. Don’t worry about the fine difference between allegations of facts and established facts. Again way too subtle an argument. They are really the same to most people anyways. Because, you know, the subtle distinction between allegations of facts and established facts is just too cumbersome. Again, it constrains your narrative and might constrain your priors about how the world is, and in any case ought to be. According to you.
  6. Attribute motivations, or use conversational implicatures to have them imputed. For example, say that Merkel does not care about her female compatriots and the fact that they get raped by those Maghreb types. (Note the clever use of commandments 4 through 6 here.)
  7. Relatedly, freely and generously use innuendo. When doing so, make sure that you maximize the reputational impact on your target.
  8. Make sure to be aggrieved when someone challenges you.  Accuse them of being ignorant and not knowledgeable.
  9. Post and post and post (preferably articles you have not read so that you can post more of them; the headline should be good enough to discern which fits your priors.) Don’t bother to summarize key arguments you find in an article, or at least to cut and paste the key paragraphs of the article. Way too labor-intensive.
  10. Freely and generously dispense your opinions. Never miss a chance to comment just because you had the opportunity not to say something.

Random observations on China

Warning: This is mostly a personal travelogue, with some generalizations and conjectures thrown in for good measure. A colleague of mine was so kind as to comment on a draft and nonchalantly suggested that the title of the piece ought to be “Clueless Westerner hops off a plane and makes many random observations”. So there, I warned you.

This was my first visit to China and these are indeed very first impressions but …

it is a fascinating and intriguing country. Certainly, Shanghai and Beijing are. I will be back.

I was there mostly for professional purposes, gave a couple of talks and worked with a former student — now assistant professor at ShanghaiTech University — on a couple of manuscripts. It’s publish or perish there, too.

I stayed at the Hope Hotel in Shanghai – somewhere in the west of the city, near the downtown campus of ShanghaiTech U — where, it seems, I was for a while the only foreigner among all Chinese guests. But these Chinese were clearly middle class or maybe even higher up.

Food was plentiful and often excellent. Curious and exotic, too.  Spicy chicken feet for breakfast? Well, maybe not. And frog at lunch in a fine eating establishment next door? Thank you but no, here, too.  Also, scorpions, spiders, and snakes are just not my thing. Even fried or grilled. I did try the salted eggs and the ginger threads and the lotus roots and the various mushroom delicacies and other fascinating food stuff. The fish in chilli peppers was fabulous both times I had it. Also, it is amazing in how many ways you can prepare bean curd and I really liked the black rice cake. But I am digressing.

I made an excursion to the Hope Hotel neighborhood on the second day, just strolling through a few streets off the main traffic artery (about half a dozen lanes in each direction). While spending the weekend in Beijing  I likewise strolled through several neighborhoods there, mostly off the beaten tracks and with the help of a fabulous guide born in Beijing. A very different picture emerged in these excursions (a couple of them through hutong neighborhoods that the government has the good sense to preserve in their original building substance.) Hundreds of little shops, often literal holes in the wall (struggle town right there), offering an astonishing amount of riches at rather reasonable prices. Just for the record, the next time you are in Beijing, try Grandma Creative Kitchen, a little unconventional gem off the beaten track. You can thank me later.

Whatever legitimate gripes people might have about the system (and there are some such as pollution, restricted access to information, and corruption that seem very justified), it is remarkable that China, or at least Shanghai and Beijing, are the functioning metropoles that they are. Much of this remarkable development — remember, just a couple of decades back the country had trouble feeding its citizens — seems to be driven by entrepreneurial activities in the large and small. In particular the latter, as exemplified by the numerous eating places, many of them so small that it is hard to imagine how they can make ends meet. In any case, many people seem to do very well although it is also clear that the wealth and income distribution is rather unequal. This is communism? Or at least real existing socialism? I wonder what good ole Kalle would have to say about it. Or for that matter Chairman Mao.

For the most part, I did not dare to eat street-food although some of it looked rather yummy.

I did buy local beer (excellent) and wine (better than expected, in fact quite enjoyable). Also, the coffee is often much better than its reputation and I am not talking about coffee at a Starbucks which seems to have many outlets, in particular in Shanghai. Outlets that are crowded indeed. Goldmines, surely. A regular skim latte with two extra shots? That’d be … 38 yuan (almost 10 Aussie dollars). Wow. (For comparison, four times that amount buys you a delightful lunch, for two, at the Grandma Creative Kitchen.)

I arrived on Sunday evening (December 6) in Shanghai and the next day read on the news sites that I could access (yahoo, Spiegel, SMH; no google or gmail or facebook there unless one goes VPN) that the authorities in Beijing had issued the highest environmental alert ever for Tuesday through Thursday. As it is, pollution in Shanghai was about ten times what it typically is in Sydney when I arrived but then, after some rain Wednesday and Thursday, reduced to three times that benchmark. None of it seemed to faze my local friends much (although allegedly pollution has gotten to the point where some expats are getting really concerned). My head, in any case, noticed the improvement. And on the day when I left, it also noticed – having been pampered for several days in Shanghai and Beijing by pollution levels classified as green or yellow (i.e., acceptable) – the sudden worsening in pollution from two or three times the Sydney standard to ten times, a worsening that happened within a couple of hours. It was remarkable: In the late morning good visibility and blue skies and even sun and then a couple of hours later clearly reduced visibility, grey skies, and if there was a sun it was well hidden behind a haze that got stronger by the minute.

The worsening environmental pollution seems the biggest threat to the welfare and productivity of the country and also to the pre-eminence of the party. (As in Vietnam, one keeps forgetting  that this is a “communist” country because on the individual level, entrepreneurialism is alive and well, and one sees little in the streets in terms of police, or other manifestations of state power, or what not.)

When I read about the Red Alert in Beijing, and in light of the threat that pollution poses for the government, I found it interesting that the authorities would allow some such alert: the worsening environmental pollution seems so obviously a systemic failure and hence to reflect on the party that runs the country.

Not really, my friends said. It is considered a failure of everyone and most Chinese seem to be able to relate to that because self-regarding preferences and a lack of concern for the commons, or for that matter, for externalities that one’s own behavior produces, are constituents of life in China. That’s similar to what I noticed in Moscow, or for that matter in Prague a decade or two back (and even now), or during numerous visits in what was then East Germany. Real existing socialism seems to bring out some pretty nasty sides in people. China, of course, is a very strange variant of real existing socialism: The planners in the state party who think they know best economically and socially and otherwise are at the top but at the bottom unrestrained Darwinian competition seems to carry the day. Even in its daily manifestations: Queueing anyone? Also, why worry about litter? Them pesky cigarette stubs – away with them on the side-walk.

I had been warned that in China I would face a hard test of my addiction to facebook (and google for that matter). Well, all true. But I survived it and it is easy to access many sites that you would normally google via Bing or equivalent. Plus VPNs are on offer everywhere if your fb addiction were to get the better of you. I did just fine. Life without facebook is possible.

It seems that the Chinese authorities block some sites – such as facebook – religiously and you wonder why. Really. Everyone can access whatever they want if they really want. And what threat to the system does a 25-year-old beauty queen really pose? It’s puzzling. Likewise, allegedly the authorities seem to currently – again – be going after some young feminists. Someone somewhere high up in the state party hierarchy seems to have problems with their priorities. Did I mention air pollution? And ground and water pollution for that matter? Also, it is simply stupid from a public relations point of view. Anastasia Lin and Li Tingting clearly know how to work (social) media and the authorities seem woefully out of their depth in understanding the dynamics of it all.

As mentioned, one does not really notice much of state power in the streets; for someone visiting major cities it seems to show up mostly in odious reporting requirements (if one does not stay at a hotel), the restrictions on certain foreign sites, and access to free wifi in cafes that requires a working mobile phone in China. (So much for free wifi at the airport.)

I mentioned the occasional nastiness, or maybe I should better say unfettered self-regarding behavior, of some of the people I saw or encountered. There are a couple of notable exceptions.

First, everyone I met professionally was unfailingly polite and concerned to various degrees.

Second, children. The Chinese adore children in a way that I have yet to see in any other culture. They dote on them and I wonder what it does to these kiddos – overwhelmingly now sole children – down the road. (Yes, I am aware of the work of some of my Australian colleagues, and also of Gigerenzer and his colleagues, about little emperors but also the dispute that it generated.) At the same time, everyone is worried about the future of their children and whether they can make the grade. The expectations are high and many kids are being pushed hard to perform.  I am not sure that is the way people get educated in ways that serve a society well. In fact, I am pretty sure it is not. A friend with whom I discussed this observation explained it in terms of an intergenerational social contract: She thinks that the way parents and grandparents treat (their) children is related to the idea of parents bringing up children for the purpose of being looked after well in their old age. When there is no well-established financial / banking system and social security, people rely on children and may value the investment in children as insurance against longevity. Chinese people thus emphasize filial responsibility as the most important value. Children have the obligation to pay the “loan” back to parents and grandparents. That strikes me as a credible explanation.

Finally, corruption. When you visit a country for a few days it is hard to see many manifestations of it but corruption is clearly a problem for China; the authorities would otherwise not engage in high-visibility campaigns of various forms against it. I saw one striking example of likely corruption in the small: At the Shanghai hi-speed train station a mafia-like, well-organized mob was aggressively trying to persuade those arriving to take private and presumably unlicensed cabs rather than to progress through an extraordinarily long queue that it took about half an hour to clear. The aggressiveness of the mobsters was disconcerting and it is astonishing that the city and train station administrations allow it to happen. Likewise, taxi services in Shanghai seem to cost about twice as much. Apparently the mafia-like mob at the train station has considerable clout with these administrations. I’d be surprised if it did not come at a price.

Feel free to comment on my observations, generalizations, and conjectures above, especially if you know the country first-hand and better than I do. I am eager to learn more about it.

Many cooks spoil the broth. And that’s just for starters. On the WDR 2015.

World Development Report 2015: Overview: Mind, Society, and Behavior. A World Bank Group Flagship Report. Available at https://openknowledge.worldbank.org/ or doi: 10.1596/978-1-4648-0342-0

[What follows are excerpts from a review forthcoming in the Journal of Economic Psychology; full version there or write me … ]

The World Development Report 2015 (henceforth, the Report) was launched in December 2014. Apart from a foreword of the president of The World Bank Group in which we are told “that, when it comes to understanding and changing human behavior, we can do better” (p. v), there are two full pages of acknowledgments and twenty pages of overview that introduce the three parts of the Report.

Part 1 sets the stage with a sketch of a conceptual framework. It builds on the well-established fact (a “principle” as it is called here also) that our cognitive resources are limited and hence much of our thinking is automatic (chapter 1, “Thinking automatically”) and that, furthermore, we are social animals and hence our actions tend to be moderated and influenced by our environment (chapter 2, “Thinking socially”). The automaticity and sociality of our thinking induces mental models and chapter 3 reflects on this principle: “When people think, they generally do not draw on concepts that they have invented themselves. Instead, they use concepts, categories, identities, prototypes, stereotypes, causal narratives, and worldviews drawn from their communities. These are all examples of mental models. Mental models affect what individuals perceive and how they interpret what they perceive, …” (p. 11)

In Part 2 six thematically oriented reviews of the literature (chapter 4: Poverty, 5: Early childhood development, 6: Household finance, 7: Productivity, 8: Health, 9: Climate change) are offered, each with many references.  Part 3 reflects on how the work of development professionals can be improved: In chapter 10 an attempt is made to assess the biases of development professionals and in chapter 11 adaptive design and adaptive interventions are discussed.

The Report is an enormous collaborative effort of more than a dozen direct contributors and many more indirect collaborators such as an advisory panel consisting of well-known academics. And, “[v]aluable inputs were received from all World Bank Group regions, the anchor networks, the research group, the global practices, the Independent Evaluation Group, and other units. The World Bank Chief Economist Council and the Chief Economist’s Council of Eminent Persons provided many helpful comments.” (vii) Two dozen organizations are also thanked for various forms of support and scores of individuals are thanked additionally for their feedback (see vii, viii). The Report draws on background papers and notes prepared by almost 30 people, and received expert advice from more than 40 people. The production and logistics team for the Report comprised a dozen people and there was both a principal editor and a principal designer of the Report.

The Report draws unapologetically on Behavioral Economics, whatever it is that this term denotes these days (Heukelom 2014; see Ortmann 2015).

An infographic usefully sketches out the conceptual framework: Economists, it is argued, typically assume people make rational choices, i.e., they carefully weigh choices, consider all readily available information, and make decisions individually. It is noted that that is not the way people actually make decisions. Behavioral economists – here understood to be economists enlightened by psychological insights – provide hence “a richer understanding of how people actually think and behave”: People think automatically (“We tend to think fast and rely on mental shortcuts”) and socially (“We cooperate, as long as others do the same, and rely on social networks and norms”), and think with mental models that their automatic and social thinking made them pick up.  Which, in turn, motivates the plea for numerous policy interventions from simplified information presentation, to application of social pressure, and derailing of mental models.

The Report provides scores of brief summaries of literally hundreds of studies that document interesting and apparently successful interventions. Herein lies the major value of this Report.

What the Report does not do, unfortunately, is the kind of red teaming that it advocates as “one way to overcome the natural limitations on judgement among development professionals … In red teaming, an outside group has the role of challenging the plans, procedures, capabilities, and assumptions of an operational design, with the goal of taking the perspective of potential partners or adversaries. Red teaming is based on the insight, from social psychology, that group settings motivate individuals to argue vigorously. Group deliberation among people who disagree … increase the odds that the best design will come to light and mitigate the effects of ‘groupthink’” (p. 19)

It is hardly contestable that we are social animals and that our mental models take into account what good old Adam Smith has identified as the blame- and praise-worthiness of our actions (Meardon & Ortmann 1996). The incidence of corruption, for example, is in many cases positively correlated with its social acceptance. And “changing a social norm about corruption constitutes a collective action problem rather than simply the repression of deviant behavior” (p. 61). The challenge then is how to solve this collective action problem. In chapter 2 of the Report, we learn about pro-social motivations and group identification and that most people behave as conditional cooperators. Such cooperation depends on one’s expectation about others’ cooperation. Studies are being reviewed that show how expectations can be manipulated into desirable directions, through the possibility of punishment and/or opportunities to observe others’ behavior for example. The problem is that punishment prompts desirable behavior only in very specific circumstances (see Guala 2012 and Balafoutas & Nikiforakis 2012 and see also the literature on asset legitimacy and social distance such as Cherry et al. 2002, Bekkers 2007, and Smith 2010, to name a few). Also, the effects of social monitoring are by no means uncontested, surely highly contextual, and I am not aware of any study that investigates their robustness over time. I.e., how long will “watching eyes” keep people exercising restraint?

Overall, and notwithstanding the occasional claim of systematic reviewing (p. 155 fn 6), the sampling of the evidence seems often haphazard and partisan. Take as another example, in chapter 7, the discussion of reference points and daily income targeting that was started by Camerer et al. (1997) and brought about studies such as Fehr & Goette (2007). These studies suggested that taxi drivers and bike messengers in high-income settings have target earnings or target hours and do not intertemporally maximize allocation of labor and leisure. The problem with the argument is that several follow-up studies (prominently, the studies by Farber 2005, 2008) questioned the earlier results. Here no mention is made of these critical studies. Instead the authors argue that the failure to maximize intertemporally can also be found in low-income settings. They cite an unpublished working paper investigating bicycle taxi drivers in Kenya and another unpublished working paper on fishermen in India. Tellingly, the authors (and the scores of commentators who gave them feedback) did not come across a paper, now forthcoming in Journal of Labor Economics, that has been circulating for a couple of years (see Stafford 2013) and that shows, and shows with an unusually rich data set for Florida lobster fishermen, that both participation decisions and hours spent at sea are consistent with a neoclassical model of labor supply. Stafford also shows that estimation issues, and not workers’ behavior, may be responsible for earlier findings. Methods that do not control for measurement error and endogeneity of the wage not only produce downward biased estimates of labor supply elasticities, but generate a spurious negative and significant elasticity of daily hours.
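How measurement error alone can manufacture a negative elasticity is easy to see in a stylized simulation. The sketch below (Python, standard library only) is not Stafford’s estimation strategy and does not address the endogeneity of the wage. It only illustrates the mechanism often called division bias, under assumptions of my own: true hours are unrelated to the wage, income is recorded correctly, hours are recorded with noise, and the wage is computed as income divided by recorded hours. All names and parameter values are hypothetical.

```python
import random


def ols_slope(x, y):
    """Slope of a simple least-squares regression of y on x."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    return sxy / sxx


def simulate_elasticities(n=5000, noise_sd=0.2, seed=0):
    """Return (true_slope, observed_slope) of log hours on log wage."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    log_wage, log_hours = [], []
    log_wage_obs, log_hours_obs = [], []
    for _ in range(n):
        lw = rng.gauss(3.0, 0.3)        # true log hourly wage (made up)
        lh = rng.gauss(2.0, 0.3)        # true log hours, independent of wage
        err = rng.gauss(0.0, noise_sd)  # measurement error in log hours
        lh_obs = lh + err
        # Wage is computed as income / recorded hours, so the error in
        # hours enters the computed wage with the opposite sign.
        lw_obs = (lw + lh) - lh_obs
        log_wage.append(lw)
        log_hours.append(lh)
        log_wage_obs.append(lw_obs)
        log_hours_obs.append(lh_obs)
    true_slope = ols_slope(log_wage, log_hours)
    observed_slope = ols_slope(log_wage_obs, log_hours_obs)
    return true_slope, observed_slope


if __name__ == "__main__":
    print(simulate_elasticities())
```

By construction the true elasticity is zero, yet the regression on the recorded data should return a clearly negative slope, in theory close to minus the share of the computed wage’s variance that is due to measurement error (about −0.31 for these made-up parameters). Setting noise_sd to zero makes the spurious effect disappear.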

There are dozens of other examples of review of the literature that I find troublingly deficient on the basis of articles I know. Troubling because “[t]he body of evidence on decision making in developing country contexts is coming into view, and many of the emerging policy implications require further study.” (p. 3) But, as mentioned and as I have illustrated with examples above, there is little red teaming on display here. Not that that is a particularly new development. Behavioural Economics has since the beginning been oversold and much of that over-selling was done by ignoring the considerable controversies that have swirled around it for decades (Gigerenzer 1996 and Kahneman & Tversky 1996 anyone?; see also Hertwig & Ortmann 2004 and Gigerenzer et al. 2008; and specifically pertaining to the kind of work reviewed in the Report, Harrison 2010, 2011, 2013, Andersen et al. 2014; see also the hilarious Welch 2015).

The troubling omission of contrarian evidence and critical voices on display in the Report is deplorable because there are important insights that have come out of these debates and the emerging policy implications would be based on less shaky ground if these insights were taken into account in systematic ways. If you make the case for costly policy interventions that might affect literally billions of people, you ought to make sure that the evidence on which you base your policy implications is robust.

In sum, it seems to me that the resources that went into the Report would have been better spent had there been adversarial collaborations (Mellers et al. 2001) and/or had reviews gone through a standard review process which hopefully would have forced some clear-cut and documented review criteria. A long list of people that gave feedback is not a good substitute for institutional quality control.

We can indeed do better when it comes to understanding and changing human behavior but it ought to be done on a scientifically sound basis. The World Development Report 2015 seems wanting.


Andersen, S., Harrison, G.W., Lau, M.O., & E.E. Rutstroem (2014). Discounting behavior: A reconsideration. European Economic Review 71, 15 – 33.

Balafoutas, L. & Nikiforakis, N. (2012). Norm enforcement in the city: A natural field experiment. European Economic Review 56, 1773 – 1785.

Bekkers, R. (2007). Measuring Altruistic Behavior in Surveys: The All-or-Nothing Dictator Game. Survey Research Methods 1, 139-144.

Camerer, C., L. Babcock, G. Loewenstein, & R. Thaler (1997). Labor Supply of New York City Cab Drivers: One Day at a Time. Quarterly Journal of Economics 112, 407 – 441.

Cherry, T., Frykblom, P., & J. Shogren (2002). Hardnose the Dictator. American Economic Review 92, 1218-1221.

Farber, H.S. (2005). Is Tomorrow Another Day? The Labor Supply of New York City Cabdrivers. Journal of Political Economy 113, 46 – 82.

Farber, H.S. (2008). Reference-Dependent Preferences and Labor Supply: The Case of New York City Taxi Drivers. American Economic Review 98, 1069 – 82.

Fehr, E. & L. Goette (2007). Do Workers Work More if Wages Are High? Evidence from a Randomized Field Experiment. American Economic Review 97, 298 – 317.

Gigerenzer, G. (1996). On narrow norms and vague heuristics. A reply to Kahneman and Tversky. Psychological Review 103, 592 – 596.

Gigerenzer, G., R. Hertwig, U. Hoffrage, & P. Sedlmeier (2008). Cognitive Illusions Reconsidered. Pp. 1018-1033 in Plott & Smith (eds.), Handbook of Experimental Economics Results. Vol. 1, Amsterdam: North-Holland.

Guala, F. (2012). Reciprocity: Weak or strong? What punishment experiments do (and do not) demonstrate. Behavioral and Brain Sciences 35, 1 – 59.

Harrison, G.W. (2010). The behavioral counter-revolution. Journal of Economic Behavior & Organization 73, 49 – 57.

Harrison, G.W. (2011). Randomisation and its Discontents. Journal of African Economies 20, 626 – 652.

Harrison, G.W. (2013). Field Experiments and methodological intolerance. Journal of Economic Methodology 20, 103 – 117.

Hertwig, R. & Ortmann, A. (2004). The Cognitive Illusions Controversy: A Methodological Debate in Disguise That Matters To Economists. Pp. 113-130 in Zwick & Rapoport (eds.), Experimental Business Research III, Boston, MA: Kluwer.

Heukelom, F. (2014). Behavioral Economics. A History. New York: Cambridge University Press.

Kahneman, D. & A. Tversky (1996). On the reality of cognitive illusions: A reply to Gigerenzer’s critique. Psychological Review 103, 582 – 591.

Meardon, S. & A. Ortmann (1996). Self-Command in Adam Smith’s Theory of Moral Sentiments. A Game-Theoretic Reinterpretation. Rationality and Society 8, 57 – 80.

Mellers, B., R. Hertwig, & D. Kahneman (2001). Do Frequency Representations Eliminate Conjunction Effects? An Exercise in Adversarial Collaboration. Psychological Science 12, 269-275.

Ortmann, A. (2015), Review of Heukelom (2014). OEconomia, forthcoming.

Smith, V.L. (2010). What Are The Questions? Journal of Economic Behavior & Organization 73, 3–15.

Stafford, T. (2015). What Do Fishermen Tell Us That Taxi Drivers Don’t? An Empirical Investigation of Labor Supply. Journal of Labor Economics. http://papers.ssrn.com/sol3/papers.cfm?abstract_id=2262677

Welch, I. (2015). Plausibility. A Fair & Balanced View of 30 Years of Progress in Ecologics. Retrievable at: http://papers.ssrn.com/sol3/papers.cfm?abstract_id=2570577

It’s a snail but it seems to be moving in the right direction: On progress in the experimental social sciences …

I have previously written about the sorry state of the experimental (social) sciences (see here and here). Recent news about publishers having to withdraw (again) scores of papers in the wake of a rigged peer-review process (see here and here) and a flurry of retractions in psytown (e.g., Stapel, Sanna, Smeesters, Foerster; see also the excellent discussions on Rolf Zwaan’s blog) have not been encouraging. But there is good news …

Exhibit 1: Dixon & Jones recently published a reanalysis of survey data that allegedly showed that conspiracist ideation predicts scepticism regarding the reality of anthropogenic climate change. For a more detailed version of the Dixon & Jones reanalysis, see here.

The reanalysis demonstrates that, rather than assuming a linear relationship between ideation and views on climate science, Lewandowsky and his colleagues should have tested that assumption. Turns out, linearity is not a good assumption to make.  Once non-linear regressions are used, the results – which are weak in effect size in the first place, as all parties now seem to agree – disappear, or so Dixon & Jones suggest: “The curvilinear relationship identified in (both survey data sets that Lewandowsky et al. used, AO) … suggests that both respondents convinced of anthropogenic climate change and respondents sceptical about such change were less likely to accept conspiracy theories than were those who were less decided about climate change.”  Which makes intuitive sense to me. Lewandowsky et al. beg to differ but it seems that their logic is indeed a bit tortured; see also Dixon’s response to Lewandowsky’s response.

Of course, it should not take two years and four rounds of revision for such a critique to be published, but given the sorry state of the experimental social sciences, it is progress that this critique was published at all. Unfortunately, it has become a bad habit of many journals to publish sensationalist findings but to categorically refuse to even look at deconstructions of these findings. This is an irresponsible practice, if only for the basic reason that the original findings typically tend to be cited an order of magnitude more frequently than their critiques. This in itself represents a serious problem of evidence production and evaluation, and journals that engage in this practice ought to be named and shamed.

Although they arguably should show up in journals, replications and critical questions about particularly egregious pieces frequently end up on blogs and other social media such as Facebook. One of the two Lewandowsky et al. pieces, for example, had previously been savaged by a Ph.D. candidate at Arizona State University, José Duarte, who did not mince words (with which I do not agree) and called for one of the studies to be retracted. Given that the journal involved is Psychological Science, I guess it ain’t going to happen. See also Rolf Zwaan’s pointed, and excellent, questions about another piece published in Psychological Science that somehow made it through the review process, when it clearly should not have.

Following a number of replication projects that psychologists have in the works already, economists have now gotten into the game.

Exhibit 2: The behavioural economics replication project, organized by 17 experimental and behavioral economists, some of them quite prominent. They intend to replicate 18 experimental studies published in the years 2011 to 2014 in the American Economic Review or the Quarterly Journal of Economics, two of the most prominent economics journals. The sample sizes were chosen so that the original result has a 90% chance of replication or higher, given a true effect size of the same magnitude as in the original study. A replication is here defined as a test statistic (using the same analysis as in the original paper) with a hypothesis test p-value less than .05. In an interesting twist, the organizers of the replication project have also organized a prediction market, which is cool although by invitation only. It will be interesting to see what comes of both the replication project and the prediction market project.
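To get a feel for what “a 90% chance of replication” implies for sample sizes, here is a textbook back-of-the-envelope sketch in Python. It assumes a two-sided comparison of two group means with a standardized effect size (Cohen's d) and uses the normal approximation; the project's actual calculations depend on each study's design and test, and the function name is hypothetical.

```python
from math import ceil
from statistics import NormalDist

def n_per_group(effect_size, power=0.90, alpha=0.05):
    """Approximate subjects per group for a two-sided two-sample test.

    Normal approximation: n = 2 * ((z_{1-alpha/2} + z_{power}) / d) ** 2,
    where d is the standardized difference in means (Cohen's d).
    """
    standard_normal = NormalDist()
    z_alpha = standard_normal.inv_cdf(1 - alpha / 2)
    z_power = standard_normal.inv_cdf(power)
    return ceil(2 * ((z_alpha + z_power) / effect_size) ** 2)

n_medium = n_per_group(0.5)  # a "medium" effect
n_small = n_per_group(0.2)   # a "small" effect
print(n_medium, n_small)
```

The required sample grows with the inverse square of the effect size, which is why studies reporting small effects on small samples are the ones least likely to replicate.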

Exhibit 3: The Erev et al. prediction competition, already up and running. The good news is that it is open to all and that its registration deadline was just extended by two weeks (registration now closing April 20). The authors previously organized such choice prediction competitions (which yours truly and a collaborator have discussed here …). In their new and improved competition, Erev and his co-authors have identified 14 well-known decision problems that have been used in the past to question the predictive power of Expected Utility Theory and which include experimental work-horses such as the Allais questions, the Ellsberg paradox, the St Petersburg paradox, etc. The organizers of the competition have also riffed on one of their favorite themes, the difference in experimental results derived from Decisions by Description and Decisions by Experience. They then replicated these choice “anomalies” under one “standard” setting: choice with real stakes in a space of experimental tasks wide enough to replicate all 14 “anomalies”.

The results of this replication (“Experiment 1”) suggested that all 14 phenomena emerge in their setting. “Yet, their magnitude tends to be smaller than their magnitude in the original demonstrations.” (This result is interesting since it speaks to the perennial issue of the effects of financial incentives for lab experiments. Had the stakes been higher, their magnitude might have been even less.)

In an “Experiment 2”, the organizers of the competition then randomly drew parameterizations for these 14 “anomalies” to address the critique that any one parameterization for one of the anomalies might lead to seriously biased results, in particular if selection of specific parameterizations was the result of careful pretesting meant to identify catchy results. This so-called estimation set has been published in detail and participants in the prediction competition proper (“Experiment 3”) can use these results to figure out winning strategies.

I am happy to see these developments (paying attention to appropriate sample size, replication projects, tournaments, and so on) because past practices in experimental and behavioural economics have been dismal at best, with at least one item in the Bullshit Bingo chart making an appearance in most studies. And that’s not even taking into account proper contextualization of studies in the literature or some reflections on external validity even when authors almost routinely try to rationalize their studies by appealing to real-world problems.

It’s a snail but it seems to be moving in the right direction …

Stay tuned.

Navigating the treacherous waters of ethics compliance for field experiments

[The following contribution is the unedited version of the comment that The Australian published today under a different title. The Australian is pay-walled, so I put the draft here, with a few links added for good measure. For the record, I prefer the headline that I proposed … ]

Two years ago, University of Queensland (UQ) researchers Paul Frijters and Redzo Mujcic published a couple of working papers that documented substantial, statistically significant discrimination on the part of Brisbane bus drivers, especially against people of colour. Interestingly, the authors also found that Asians were not discriminated against, a finding which I find hard to reconcile with the anecdotal evidence I have seen in Sydney, for example. There were many other important findings in the Mujcic-Frijters study which quantified various aspects of discrimination in an innovative manner that could be a template for further studies.

UQ initially issued a press release to announce these findings and outlets such as The Sydney Morning Herald picked up the study. Curiously, given its important quantification of white privilege, the study was never published and did not get publicity afterwards. Last week, in the wake of two contributions in The New York Times and Forbes by Yale Law School professor Ian Ayres, we learned why. Not a pretty story it is.

According to numerous contributions in Core Economics Today, The Australian, The Brisbane Times, The Washington Post, and many others, it seems that a day after the press release, UQ administrators told Frijters – the supervisor of then-Ph.D. candidate Mujcic – that the research had to be pulled out of the media and was not to be published because it had not received appropriate ethics clearance. It also has emerged that this discovery seems to have been prompted by the Brisbane bus company (Translink) media adviser asking how a study that seemed to involve fare evasion could receive ethics clearance.

Fare evasion?  No, not really for all I know. Frijters & Mujcic had 29 young adult testers – duly mixed by gender, ethnicity, and attire – board Translink buses and insert an insufficiently funded fare card into the scanner. The testers then told the bus-drivers, “I do not have any money, but need to go to [a station about 1.2 miles away]”. The bus-drivers were thus prompted to make a call whether to allow the testers to stay onboard.

In the flurry of contributions and discussions in various outlets over the last few days, facts and open questions have emerged. Was deception involved? (Having written extensively about the issue of deception in economics and other social sciences, I believe, no: The testers had insufficiently funded fare cards and as part of their research had to go to the destination that they gave the bus-drivers. Hence there were no deceptive statements. Of course, bets are off on this one if the testers did indeed have money on them and / or valid fare cards.) Did Frijters, Australia’s Young Economist of the year in 2009 and by all measures one of Australia’s most productive economists, follow the ethics procedures then in place at UQ? (If indeed no deception was involved, apparently, yes.) Was it fare evasion? (No. The bus drivers could decline the request for a free ride.) Were bus drivers entitled to let people with insufficiently funded fare cards ride for free? (Apparently, as persuasively argued by Rabee Tourky in Core Economics Today, yes. According to him, in the wake of the Daniel Morcombe murder Brisbane bus drivers were instructed to use judgment when such requests were made.) Should the bus company have been informed about this research beforehand? Should bus-drivers have been debriefed that they had just participated in an experiment? Was it appropriate for the researchers to publish location and carrier?

Some of these questions do not have easy answers, as is now becoming clear. With the benefit of hindsight, ethics clearance on a higher level might have been a good idea, if only to protect the researchers from over-the-top administrative responses, including at some point a demotion for Frijters that UQ ultimately had the good sense to reverse, and Mujcic’s inability to get his dissertation research published. Hindsight, of course, is 20/20. As a matter of fact, likely Nobel Prize laureate John A. List has defended field experiments that involve lack of ex-ante informed consent, or ex-post debriefing, in the 11 July 2008 issue of Science, a leading and highly influential general-interest magazine. Arguing that notifying experimental subjects is often neither possible nor necessary because of the minimal risk that is involved, he puts it thus: “[I]n a natural field experiment, the analyst manipulates experimental conditions in a natural manner, whereby the experimental subjects are unaware that they are participating in an experiment. This approach combines the most attractive elements of the laboratory and of naturally occurring data: randomization and realism.” When challenged on this proposition, he responded in the 31 October 2008 issue thus: “Ethical issues surrounding human experimentation are of utmost importance. Yet, the benefits and costs of informed consent should be carefully considered in each situation. Those cases in which there are minimal benefits of informed consent but large costs are prime candidates for relaxation of informed consent.” This seems to describe the Frijters & Mujcic study well and is in any case, for better or worse, an increasingly accepted point of view among behavioural and experimental economists.

Ethics clearance requirements – implemented  in response to scores of well-documented examples of Human Subject Research abuses (google them!) and universities’ concerns for liability and reputation — have become a considerable drain on researchers’ scarce resources in Australia, and – it has to be said — other places. I am sure every behavioral and experimental economist here in Australia will attest to that. Occasionally ethics clearance requirements are applied in questionable ways in that the gate-keepers feel entitled to make judgment calls on the quality of research.  I consider this an unacceptable infringement of academic freedom. It is in any case clear that – in light of List’s position, for example, as well as the Frijters & Mujcic situation — it is time to have a conversation about these issues, so as to lift the administrative and regulatory fog.

That ethics clearance requirements have escalated, and unreasonably so in many a case by List’s standard, seems widely acknowledged in that a number of universities in Australia have fast-track procedures for low-risk research.  It seems that UQ in 2012, when Frijters & Mujcic did their study, had such procedures in place although apparently there was some wiggle-room for interpretation. The Frijters-Mujcic  study hence seemed to warrant at best an updating of procedures that were found wanting. I simply do not see a case for suppressing this research given its unquestionable importance, and given that no deception seems to have been involved.

Academic freedom is an important good and should be defended vigorously. It is deplorable that the ongoing UQ saga has come at a high cost to Frijters and his former Ph.D. student. Their time could have been spent much more productively, although I am confident that the saga has already provided Frijters with many examples for his ARC-funded research into socially undesirable behavior.

Given that UQ administrators continue to insist that the university has responded appropriately, and given the facts that have emerged over the last few days, it seems clear that UQ administrators at this point are an interested party and as such ought to recuse themselves from further attempts at assessments of what happened. The accusation of maladministration that is now in the open is too important an accusation, and ought to be dealt with by truly independent investigators. UQ, and Frijters, should commit to implement whatever recommendations such investigators come up with. It is to be hoped that Frijters soon can do again what he does best: Research that addresses important real-world problems in innovative ways and that gets noticed, and recognized world-wide, for exactly that reason.