As I mentioned in my previous post, while I’m sympathetic to many of the ideas that have been suggested about how to improve the reliability of psychological knowledge and move towards “scientific utopia,” my own thoughts are less ambitious and keep returning to the basic issue of replication. A scientific culture that consistently produced direct replications of important results would be one that eventually purged itself of many of the problems people have been worrying about lately, including questionable research practices, p-hacking, and even data fraud.
But, as I also mentioned in my previous post, this is obviously not happening. Many observers have commented on the institutional factors that discourage the conduct and, even more, the publication of replication studies. These include journal policies, hiring committee practices, tenure standards, and even the natural attractiveness of fun, cute, and counter-intuitive findings. In this post, I want to focus on a factor that has received less attention: the perilous plight of the (non) replicator.
The situation of a researcher who has tried and failed to replicate a prominent research finding is an unenviable one. My sense is that the typical non-replicator started out as a true believer, not a skeptic. For example, a few years ago I spent sabbatical time at a large, well-staffed and well-equipped institute in which several researchers were interested in a very prominent finding in their field, and wished to test further hypotheses they had generated about its basis. As good scientists, they began by making sure that they could reproduce the basic effect. To their surprise and increasing frustration, they simply could not. They followed the published protocol, contacted the original investigator for more details, tweaked this, tweaked that. (As I said, they had lots of resources.) Nothing. Eventually they simply gave up.
Another anecdote. A graduate student of a colleague of mine was intrigued by a finding published in Science. You don’t see psychological research published in that ultimately prestigious journal very often, so it seemed like a safe bet that the effect was real and that further creative studies to develop its theoretical foundation would be a great project towards a dissertation and a research career. Wrong. After about three years of failing to replicate the original finding, the advisor finally had to insist that the student find another topic and start over. You can imagine the damage this experience did to the student’s career prospects.
Stories like these are legion, but you don’t see many of them in the published literature. Indeed, I suspect most failures to replicate are never written up, much less submitted for publication. There are probably many reasons, but consider just one: What happens when a researcher does decide to “go public” with a failure – or even repeated, robust failures – to replicate a prominent finding? If some recent, highly publicized cases are any guide, several unpleasant outcomes can be anticipated.
First, the finding will be vehemently defended, sometimes not just by its originator but also by the acolytes that a surprising number of prominent researchers seem to have attracted into loyal camps.[i] The defensive articles, written by prominent people with considerable skills, are likely to be strongly argued, eloquent, and long. The non-replicator has a good chance of being publicly labeled as incompetent if not deliberately deceptive, and may be compared to skeptics of global warming! Even a journalist who has the temerity to write about non-replication issues risks being dismissed as a hack. This situation can’t be pleasant. It takes a certain kind of person to be willing to be dragged into it – and not necessarily the same kind of person who was attracted to a scientific career in the first place.
It gets worse. The failed replicator also risks various kinds of subtle and not-so-subtle retaliation. I was at a conference a few weeks ago where I heard, first-hand, from a researcher whose promotion file contained a letter that subtly but powerfully derogated the researcher’s career. The letter was not only an outlier with respect to the other letters in the file, but was written by a practitioner in a field that the researcher’s work had dared to question. Another first-hand story concerned a researcher who, after publishing some reversals of findings that had been pushed for years by a powerful school of investigators, found that external reviews of submitted journal articles on other topics had suddenly turned harshly critical. And, in an episode I had the opportunity to observe directly, a professor and graduate student who had a paper questioning an established finding actually accepted for publication in a prominent journal found themselves subjected to threats! The person who “owned” the original effect said to them: you need to withdraw this paper. I’m the most prominent researcher in the field and the New York Times will surely call me for comment. I will be forced to publicly expose your incompetence. Your career will be damaged; your student’s career will be ruined. The threat concluded, darkly: I say this as a friend; I only have your best interests at heart.
Do you know other stories like this? There is a good chance you do. Publishing a failure to replicate a prominent finding, or even challenging the accepted state of the evidence in any way, is not for the squeamish. No wonder the typical response of a failed replicator is simply to drop the whole thing and walk away. The reaction makes sense, and from the point of view of individual self-interest – especially for a junior researcher – it is probably the rational thing to do. But it’s disastrous for the accumulation of reliable scientific knowledge.
This is a cultural problem that needs to be solved. As individuals and as members of a research culture, we need to clarify two things. First, we have to make clear that denunciations of people with contrary findings as incompetent or deceptive, retaliation through journal reviews and promotion letters, and overt threats, are, in a phrase, SERIOUSLY NOT OK. This should go without saying, but – judging from what we’ve seen happen recently – apparently it doesn’t.
Second, and only slightly less obviously, we should try to recognize that a failure to confirm one of your findings does not have to be viewed as an attack. Indeed, a colleague attending this same meeting pointed out that a failure to replicate is a sort of compliment: it means your work was interesting and potentially important enough to merit further investigation! It’s much worse – and far more common – simply to be ignored. A failure to replicate should be seen not as an attack but as an invitation to clarify what’s going on. After all, if you couldn’t replicate one of your effects in your own lab, what would you do? Attack yourself? No, you’d probably sit down and try to figure out what happened. So why is it so different if it happens in someone else’s lab? This could be the beginning of a joint effort to share methods, look at data together, and come to a collaborative understanding of an important scientific issue.
I know I’m dreaming here. Even a psychologist knows enough about human nature to understand that such an outcome goes against all of our natural defensive inclinations. But it’s a nice thought, and maybe if we hold it in mind even as an unattainable ideal it might help us to be not quite so vehement, a little less personal, and a bit more open-minded in our responses to scientific challenge.
How can we enforce better responses to failures to replicate? Sociology teaches us that in small communities gossip is an effective mechanism to enforce social norms. Research psychology is effectively a small town: a few thousand people at the most, spread out around the world but in regular contact nonetheless. So the late-night gossip about defensive reactions, retaliation, and threats is one way to ensure that such conduct carries a social price.
In the longer term, we need to change our overall social norm of what’s acceptable. We need to accept, practice, and, above all, teach constructive approaches to scientific controversy. This is a very long road. But, as the proverb tells us, it starts with one step.
Note: This post is based on a brief talk given at a conference on the “Decline Effect,” held at UC Santa Barbara in October 2012. The conference was organized by Jonathan Schooler and sponsored by the Fetzer-Franklin Foundation. As always, this post expresses my personal opinion and not necessarily that of any other institution or individual.
[i] Typically, their defense will draw on the existence of “conceptual replications,” studies that found theoretically parallel effects using different methods. However, as Hal Pashler has noted, no matter how many conceptual replications are reported, there is no way to know how many failed efforts never saw the light of day. This is why it is essential to find out whether the original effect was reliable.