Now you see it, now you don’t: On the deepening crisis in evidence production, and evaluation, in the social sciences (Part II: Some proposals to address it)

Yesterday I stated my understanding of the problem.

So, what to do in light of the deepening crisis?

First, in a recent open letter published in “The Guardian”, more than 70 researchers argued that scientific journals ought to allow pre-registered replications (and other studies). In fact, the journals “Attention, Perception & Psychophysics”, “Perspectives on Psychological Science”, and “Social Psychology” have already launched similar projects. The experiences so far seem promising.

Second, in the discussion of Ed Yong’s “Nature” news feature it was suggested (see the Lieberman and Hardwicke comments) that undergraduates ought to be enticed – maybe through a special journal for replication studies – to conduct replication studies. This seems an idea worth pursuing.

Third, all journals ought to insist that the data for studies they publish be posted. This is the conclusion that Simonsohn has also come to, and it makes a lot of sense (“Just Post It: The Lesson from Two Cases of Fabricated Data Detected by Statistics Alone”). Specifically, data sets ought to be posted with the journal in which the article is published. There is an interesting issue of whether, and under what circumstances, the data ought to be made accessible to other researchers, especially if a study is ongoing, but that issue seems a minor and solvable one. Relying on the original authors to supply data, sometimes years after the fact, is bound to be a problem for a number of reasons (moves, deteriorating hardware and software, crashes, theft), all of which lead to attrition in the availability of data when journals do not create depositories of data.

Fourth, and relatedly, in his discussion of the Smeesters and Geraerts affairs (here), Richard Gill provides “morals of the story” on two slides. He argues that data preparation and data analysis are an integral part of the experiment and that “keeping proper log-books of all steps of data preparation, manipulation, selection/exclusion of cases, makes the experiment reproducible.” Exploratory analyses and pilot studies ought to be fully reported, as should the complete data collection design (which, of course, should be written down in advance in detail and followed carefully). He also argues against the widespread division of labor where younger co-authors do much of the data analysis. (I doubt that this latter point is implementable at this point; it’s too entrenched a practice already. It seems better to identify who did what for a project.)
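To make the log-book idea concrete, here is a minimal sketch in Python of what recording each preparation step might look like. It is only an illustration under my own assumptions, not Gill's procedure: the `LogBook` class, the `fingerprint` helper, the field names, the example data, and the 3000 ms threshold are all hypothetical.

```python
import hashlib
import json
from dataclasses import dataclass, field


def fingerprint(rows):
    """Short hash of the data set, so each log entry shows what a step started from and produced."""
    payload = json.dumps(rows, sort_keys=True).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()[:12]


@dataclass
class LogBook:
    """Hypothetical log-book: one entry per data preparation step."""
    entries: list = field(default_factory=list)

    def apply(self, rows, description, step):
        """Run one preparation step and record what it did to the data."""
        result = step(rows)
        self.entries.append({
            "step": len(self.entries) + 1,
            "description": description,
            "rows_in": len(rows),
            "rows_out": len(result),
            "hash_in": fingerprint(rows),
            "hash_out": fingerprint(result),
        })
        return result


# Made-up raw data: one record per participant.
raw = [
    {"id": 1, "condition": "treatment", "rt_ms": 512, "completed": True},
    {"id": 2, "condition": "control", "rt_ms": 4800, "completed": True},
    {"id": 3, "condition": "control", "rt_ms": 630, "completed": False},
    {"id": 4, "condition": "treatment", "rt_ms": 590, "completed": True},
]

log = LogBook()
data = log.apply(
    raw,
    "exclude participants who did not complete the session",
    lambda rows: [r for r in rows if r["completed"]],
)
data = log.apply(
    data,
    "exclude response times above 3000 ms (threshold assumed fixed in advance)",
    lambda rows: [r for r in rows if r["rt_ms"] <= 3000],
)

# The log-book itself can be posted alongside the data.
for entry in log.entries:
    print(json.dumps(entry))
```

The point of the design is that every exclusion leaves a trace: how many cases went in, how many came out, and a fingerprint that ties each step to the one before it.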

Fifth, replicability relies on detailed instructions and descriptions that allow everyone, everywhere to try to replicate. That in some cases (such as Dijksterhuis’s) sufficient protocols do not exist and have to be generated years after a study has been conducted seems highly problematic.

Sixth, Simmons, Nelson, and Simonsohn (following up on an earlier indictment of practices that facilitate false positives) have provided what they call a 21-word solution for the problem. Say they: “If you determined sample size in advance, say it. If you did not drop any variables, say it. If you did not drop any conditions, say it.” It is an interesting question whether some such statement would indeed lead to full transparency, but it seems a step in the right direction.

Seventh, meta-analyses ought to be conducted more often, in particular in economics, where they are still relatively rare. As Ferguson & Heene make clear (here), meta-studies are no panacea, but they force some discipline on evidence evaluation. To the extent that they would also be subjected to the “Just-Post-It” requirement, they are likely to help stabilize the evidence base.

Eighth, adversarial collaborations (see, e.g., the video; a transcript is provided under it) are a way of getting away from the trench warfare that can be found in many areas of the social sciences these days. Rather than lobbing ever more evidence confirming their own positions at each other, the protagonists could agree on writing a joint article – possibly with a third, mutually agreed-on party moderating – that might help settle disputes. One of the nice aspects of collaborating in this way is that a balanced assessment of what the previous literature had to say becomes much more likely.

Ninth, tournaments are a recent, and increasingly used, tool, as Leonidas Spiliopoulos and I have demonstrated here. In our most recent version of the paper (conditionally accepted at “Psychological Methods”), we argue that tournaments are much more widely applicable than we have so far seen.

Tenth, transparency indices. RetractionWatch has proposed some such index for journals (see here).

Eleventh, Deborah Mayo – intrigued by one of the final recommendations of the committee that investigated the Stapel affair – has argued on her blog that “the relevant basic principles of philosophy of science, methodology, ethics and statistics that enable the responsible practice of science” may well be taught by philosophy departments. Maybe so. It certainly seems desirable that a course addressing these issues be taught everywhere.

Any other ideas? Comments?

Class, discuss!


6 thoughts on “Now you see it, now you don’t: On the deepening crisis in evidence production, and evaluation, in the social sciences (Part II: Some proposals to address it)”

  1. It’s not always possible (legal) to make the data available if using data produced by a third party, but at a minimum all of the manipulations of the data should be disclosed so that anyone able to access the data can replicate the results.


  2. Interesting stuff. I have decided to be as open as I can about my own PhD research. I have a website that provides access to all raw data, interim/exploratory results, my experimental software and my musings on the process itself.

    It’s early days, but I have found recording all this information a rewarding habit – somehow it takes away the ego/attachment and lets you get on with the process.

    There is possibly too much fear about null results, or otherwise ‘failed’ analysis – which is only further compounded by publication and career pressures.


  3. It is pretty outrageous that some journals do not publish replication studies or insist on data and source code. It is not so easy to fix this through the publication market. Self-interest will still drive me to submit my best work to the highest-ranked journals, even if they are unethical in this respect.

    Government could have a role here since it provides so much money. It could come up with a journal ranking that relegated to list C any journal that did not sign up to, and implement, a short list of principles. I am not suggesting that Australia could do it, but the US sure could. Imagine that all your publications in Econometrica* are suddenly not counted in applying for NSF grants! There would be an academic riot and the journal would change policy tomorrow.

    *Not sure what Econometrica’s policy is. It is just an example.


    1. “Econometrica has the policy that all empirical, experimental and simulation results must be replicable. Therefore, authors of accepted papers must submit data sets, programs, and information on empirical analysis, experiments and simulations that are needed for replication and some limited sensitivity analysis. (Authors of experimental papers can consult the more detailed posted information regarding submission of Experimental papers.)

      This material will be made available through the Econometrica supplementary material web-page. Submitting this material indicates that you license users to download, copy, and modify it; when doing so such users must acknowledge all authors as the original creators and Econometrica as the original publishers.”

