The Real Source of the Replication Crisis

“Replication police.” “P-squashers.” “Hand-wringers.” “Hostile replicators.”  And of course, who can ever forget, “shameless little bullies.”  These are just some of the labels applied to what has become known as the replication movement, an attempt to improve science (psychological and otherwise) by assessing whether key findings can be reproduced in independent laboratories.

Replication researchers have sometimes targeted findings they found doubtful.  The grounds for finding them doubtful have included (a) the effect is “counter-intuitive” or in some way seems odd (1), (b) the original study had a small N and an implausibly large effect size, (c) anecdotes (typically heard at hotel bars during conferences) abound concerning naïve researchers who can’t reproduce the finding, (d) the researcher who found the effect refuses to make data public, has “lost” the data or refuses to answer procedural questions, or (e) sometimes, all of the above.
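A word about criterion (b): when samples are small, a modest true effect reaches statistical significance only when sampling error happens to inflate the estimate, so the published effect size comes out implausibly large (the so-called winner's curse). The following minimal simulation sketch illustrates the pattern; the true effect of d = 0.2 and the sample sizes are my own illustrative choices, not values from any particular study.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

def mean_significant_effect(n, true_d=0.2, sims=5000, alpha=0.05):
    """Average observed effect (in sd units) among simulated two-group
    studies with n per group that reached p < alpha in the predicted
    direction."""
    effects = []
    for _ in range(sims):
        treatment = rng.normal(true_d, 1.0, n)
        control = rng.normal(0.0, 1.0, n)
        t, p = stats.ttest_ind(treatment, control)
        if p < alpha and t > 0:
            effects.append(treatment.mean() - control.mean())
    return float(np.mean(effects))

# The same modest true effect (d = 0.2) looks enormous when a small-N
# study manages to reach significance, and close to the truth at large N.
for n in (10, 20, 50, 200):
    print(f"n per group = {n:>3}: mean significant effect = "
          f"{mean_significant_effect(n):.2f}")
```

In other words, a small-N study reporting a huge effect is exactly what noise filtered through a significance test would produce, which is why criterion (b) raises eyebrows.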

Fair enough. If a finding seems doubtful, and it’s important, then it behooves the science (if not any particular researcher) to get to the bottom of things.  And we’ve seen a lot of attempts to do that lately. Famous findings by prominent researchers have been put  through the replication wringer, sometimes with discouraging results.  But several of these findings also have been stoutly defended, and indeed the failure to replicate certain prominent effects seems to have stimulated much of the invective thrown at replicators more generally.

One target of the “replication police” has enjoyed few defenders. I speak, of course, of Daryl Bem (2). His findings on ESP stimulated an uncounted number of replications. These were truly “hostile” – they set out to prove his findings wrong, and, what do you know, the replications uniformly failed. In response, by all accounts, Daryl has been the very model of civility. He provides his materials and data freely and without restriction, encourages the publication of all findings regardless of how they turn out, and has managed to refrain from telling his critics that they have “nothing in their heads” or, indeed, from saying anything negative about them at all, as far as I’ve seen. Yet nobody comes to his defense (3), even though any complaint anybody might have had about the “replication police” applies, times 10, to the reception his findings have received. In particular, critiques of the replication movement never mention the ESP episode, even though Bem’s experience probably provides the best example of everything they are complaining about.

Because, and I say this with some reluctance, the backlash to the replication movement does have a point.  There IS something a bit disturbing about the sight of researchers singling out effects they don’t like (maybe for good reasons, maybe not), and putting them – and only them – under the replication microscope.  And, I also must admit, there is something less-than-1000% persuasive about the inability to find an effect by somebody who didn’t believe it existed in the first place. (4)

But amidst all the controversy, one key fact seems to be repeatedly overlooked by both sides (6). The replication crisis did NOT arise because of studies intended to assess the reality of doubtful effects. Those only came later. Long before that, there were repeated and, indeed, ubiquitous stories of fans of research topics – often graduate students – who could not make the central effects work no matter how hard they tried. These failed replications were anything but hostile. I related a few examples in a previous post, but there’s a good chance you know some of your own. Careers have been seriously derailed as graduate students and junior researchers, naively expecting to be able to build on some of the most famous findings in their field, published in the top journals by distinguished researchers, simply couldn’t make them work.

What did they do? In maybe 99% of all cases (100% of the cases I personally know about) they kept quiet. They – almost certainly correctly – saw no value for their own careers in publicizing their inability to reproduce an effect enshrined in textbooks.  And that was before a Nobel laureate promulgated new rules for doing replications, before a Harvard professor argued that failed replications provide no information and, of course, long before people reporting replication studies started to be called “shameless little bullies.”

Most of the attention given these days to the replication movement — both pro and con — seems to center on studies specifically conducted to assess whether or not particular findings can be replicated. I am one of those who believe that such studies are, by and large, useful and important. But we should remember that the movement did not come about because of, or in order to promote, such studies. Instead, its original motivation was to make it a bit easier and safer to report data that go against the conventional wisdom, and thereby to protect those who might otherwise waste years of their lives trying to follow up on famous findings that have already been disconfirmed, everywhere except in public. From what we’ve seen lately, this goal remains a long way off.

  1. Naturally, counter-intuitiveness and oddity is in the eye of the beholder.
  2. Full disclosure: He was my graduate advisor. An always-supportive, kind, wise and inspirational graduate advisor.
  3. Unless this post counts.
  4. This is why Brian Nosek and the Center for Open Science made exactly the right move when they picked out a couple of issues of JPSP and two other prominent journals and began to recruit researchers to replicate ALL of the findings in them (5). Presumably nobody will have a vested interest or even a pre-existing bias as to whether or not the effects are real, making the eventual results of these replications, when they arrive, all the more persuasive.
  5. Full disclosure: A study of mine was published in one of those issues of JPSP. Gulp!
  6. Even here, which is nothing if not thorough.

Acknowledgements: Simine Vazire showed me that footnotes can be fun.  But I still use my caps key. Simine and Sanjay Srivastava gave me some advice, some of which I followed.  But in no way is this post their fault.

6 thoughts on “The Real Source of the Replication Crisis”

  1. Nice post.

    The central point, for me, is to clarify the discussion instead of obfuscating it, making it polemical, or starting tribal wars.

    Therefore, I am strongly in favor of the Kahneman protocol. Even if one wants to push further toward either side (I will not repeat the labels), these things move us backward instead of forward.

    It is hard enough to glean the truth amid multiple replications. It is *much* harder to make progress when the very definition of what counts as a replication is under such heavy dispute.

    What scientific progress can we expect when the overall feeling is one of all-out war and a polemical political campaign?

  2. A most interesting post. If it is hard to “glean the truth amid multiple replications,” then perhaps there is a lot of noise? Perhaps even noise >> signal? That is one reason I like the ESP example. Daryl Bem is saying: here is a measurement, it doesn’t make sense, I have explanation (A), but maybe you have a better one? That is how science is meant to work. The real problem with replication is that there are far too many papers in which the reported noise is implausibly low.
    Maybe papers should have a separate section, “identified sources of noise and the estimated magnitude of noise,” to which readers could contribute (moderated by an editor)? When noise >> signal, the pages would get stamped accordingly.

  3. Your first sentence could have included:
    “Replication bullies.”
    “Replication mafia.”
    “Second stringers”
    “god’s chosen soldiers in a great jihad”
    “Senator Joe McCarthy’s playbook”
    [Above from Matthew Lieberman’s May 23 post on Facebook and Dan Gilbert’s comments in reply, https://www.facebook.com/matthew.lieberman?fref=ts ]

    “Witch hunt”
    “A bunch of self-righteous, self-appointed sheriffs”
    “Rosa Parks” (for the person whose work was replicated).
    [Above from Gilbert’s comment on Schnall’s SPSP blog post]

  4. Pingback: I’ve Got Your Missing Links Right Here (20 July 2014) – Phenomena: Not Exactly Rocket Science

  5. Very interesting, and sad. My own interest in replications arose from Ioannidis’s 2005 article “Why Most Published Research Findings Are False” (PLoS Med 2(8): e124. doi:10.1371/journal.pmed.0020124, available at http://www.plosmedicine.org/article/info:doi/10.1371/journal.pmed.0020124), and from my own experience reading scientific papers and seeing how common practices in the literature often conflicted with what I taught my students in statistics classes. So there are many reasons to promote replications.
    From my perspective, exact replications are not the best solution to the problem: practices need to improve as well. If an original paper uses poor practices, then repeating the same poor practices in a replication does not advance science. (See http://www.ma.utexas.edu/blogs/mks/ for some examples.)
    Good science does require criticism, since none of us is perfect — we need others’ eyes and thoughts to see the flaws in our reasoning and the gaps in our explanations. It is not always pleasant to be on the receiving end, but we need to learn to accept criticism constructively and focus on the goal of furthering science.
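    For concreteness, the core of the Ioannidis argument is a one-line Bayes-rule calculation: if R is the prior odds that a tested relationship is real, 1 − β the power, and α the significance level, then the post-study probability that a “significant” finding is true is PPV = (1 − β)R / ((1 − β)R + α). A minimal sketch of that formula follows; the prior odds used below are illustrative assumptions, not estimates from the paper, and the paper’s bias term u is omitted.

```python
def ppv(R, power=0.8, alpha=0.05):
    """Ioannidis (2005), no-bias case: post-study probability that a
    significant finding is true, given prior odds R that the tested
    effect is real."""
    return (power * R) / (power * R + alpha)

# Even with 80% power and alpha = .05, a significant result is only as
# believable as the prior odds allow; long-shot hypotheses remain
# unlikely to be true even after passing a significance test.
for R in (1.0, 1 / 9, 1 / 99):
    print(f"prior odds R = {R:.3f}: PPV = {ppv(R):.2f}")
```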
