What if Gilbert is Right?

I. The Story Until Now (For late arrivals to the party)
Over the decades since about 1970, social psychologists conducted lots of studies, some of which found cute, counter-intuitive effects that gained great attention. After years of private rumblings that many of these studies – especially some of the cutest ones – couldn’t be replicated, a crisis suddenly broke out into the open (1). Failures to replicate famous and even beloved findings began to publicly appear, become well known, and be thoroughly argued over, not always in the most civil of terms. The “replicability crisis” became a thing.
But how bad was the crisis really? The accumulation of anecdotal stories and one-off failures to replicate was perhaps clarified to some extent by a major project organized by the Center for Open Science (COS), published last November, in which labs around the world tried to replicate 100 studies and, depending on your definition, “replicated” only 36% of them (2).
In the face of all this, some optimists argued that social psychology shouldn’t really feel so bad, because failed replicators might simply be incompetent, if not actually motivated to fail, and the typical cute, counter-intuitive effect is a delicate flower that can only bloom under the most ideal climate and careful cultivation. Optimists of a different variety (including myself) also pointed out that psychology shouldn’t feel so bad, but for a different reason: problems of replicability are far from unique to our field. Failures to reproduce key findings have come to be seen as serious problems within biology, biochemistry, cardiac medicine, and even – disturbingly – cancer research. It was widely reported that the massive biotech company Amgen was unable to replicate 47 of 53 seemingly promising cancer biology studies. If we have a problem, we are far from alone.

II. And Then Came Last Friday’s News (3)
Prominent psychology professors Daniel Gilbert and Tim Wilson published an article that “overturned” (4) the epic COS study. Specifically, their reanalysis concluded that the study not only didn’t persuasively show that most of the studies it addressed couldn’t be replicated, its data were actually consistent with the possibility that all of the studies were replicable! The article was widely reported not just in press releases but in outlets including the Washington Post, Wired, the Atlantic online, and the Christian Science Monitor, to name just a few.
Psychologists who had been skeptical of the “replication movement” all along – come on, we know who you are – quickly tweeted, Facebooked and otherwise cheered the happy news. Some even began to wonder out loud whether “draconian” journal reforms adopted to enhance replicability could now be repealed. At the same time, and almost as quickly, members of the aforesaid replication movement – come on, we know who you are too (5) – took close looks at the claims by Gilbert and Co., and within 48 hours a remarkable number of blogs and posts (6) began to refute their statistical approach and challenge the way they summarized some of the purported flaws of the replication studies. I confess I found most of these responses pretty persuasive, but that’s not my point for today. Instead my point is:

III. What if Gilbert is Right?
Let’s stipulate, for the moment, that Gilbert and Co. are correct that the COS project told us nothing worth knowing about the replicability of social psychological research. What then?

IV. The COS Study Is Not the Only, and Was Far From the First, Sign that We Have A Problem.
One point I have seen mentioned elsewhere – and I’ll repeat it here because it’s a good point – is that the COS project was far from being the only evidence that social psychology has a replicability problem. In fact, it came after, not before, widespread worry had been instigated by a series of serious and compelling failures to reproduce very prominent studies, and many personal reports of research careers delayed if not derailed by the attempt to follow up on lines of research that only certain members of the in-crowd knew were dead ends. As this state of affairs became more public over the past couple of years, the stigma of failing to replicate some famous psychologist’s famous finding began (not entirely!!) to fall away, and a more balanced representation of what the data really show, on all sorts of topics, began to accumulate in public file drawers, data repositories, and outlets for replication studies. The COS study, whatever its merits, came on top of all that, not as its foundation.

V. Other Fields Have Replicability Problems Too
A point I haven’t, in this context, seen mentioned yet – and my real motivation for writing this post – is that – remember! – the replication crisis was never exclusive to psychology in the first place. It has affected many other fields of research as well. So, if Gilbert & Co. are right, are we to take it that the concerns in our sister sciences are also overblown? For example, was Amgen wrong? Were all those cancer biology studies perfectly replicable after all? Do biochemistry, molecular biology, and the medical research community share social psychology’s blight of uncreative, incompetent, shameless little bullies aiming to pull down the best research in their respective fields?
Well, maybe so. But I doubt it. It seems extremely unlikely that the kinds of complaints issued against the studies that failed to replicate psychological findings apply in the same way in these other fields. It seems doubtful that problems in these other fields stem from geographical or temporal differences in social norms, unique aspects of student demographics, changes in wordings of scale items, exact demeanor of research assistants, or other factors of the sort pointed out by Gilbert & Co. as bedeviling attempts to replicate psychological findings. I also have no reason to think that molecular biology is full of shameless little bullies, but I stand ready to be corrected on that point.

VI: The Ultimate Source of Unreliable Scientific Research
So let’s go back to where some of us were before the COS study, when we pointed out that social psychology is not alone in having replication problems. What did this fact imply? Just this: The causes of a scientific literature full of studies that can’t be replicated are not specific to social psychology. The causes are both deeper and broader. They are deeper because they don’t concern concrete details of particular studies, or even properties of particular areas of research. They are broader because they affect all of science.
And the causes are not hard to see. Among them are:
1. An oversaturated talent market full of smart, motivated people anxious to get, or keep, an academic job.
2. A publication system in which the journals that can best get you a job, earn you tenure, or make you a star, are (or until recently have been) edited with standards such as the “JPSP threshold” (of novelty), and the explicit (former) disdain of Psychological Science for mere “bricks in the wall” that represent solid, incrementally useful, but insufficiently “groundbreaking” findings. I have been told that the same kinds of criteria have long prevailed in major journals in other fields of science as well. And of course we all know what kind of article is required to make it into Science.
3. And, even in so-called lesser journals, an insistence on significant findings as a criterion for publication, and a strong preference for reports of perfect, elegant series of studies without a single puzzling data point to be seen. “Messy” studies are left to work their way down the publication food chain, or to never appear at all.
4. An academic star system that radically and disproportionately rewards researchers whose flashy findings get widespread attention not just in our “best” journals but even in the popular media. The rewards can include jobs in the most prestigious universities, endowed chairs, distinguished scholar awards, TED talks, and even (presumably lucrative) appearances in television commercials! (7)

It is these factors that are, in my opinion, both the ultimate sources of our problem and the best targets for reforming and improving not just psychology, but scientific research in all fields. And, to end on an optimistic note, I think I see signs that useful reforms are happening. People aren’t quite as enthusiastic about cute, counter-intuitive findings as they used to be. Hiring committees are starting to wonder what it really means when a vita shows 40 articles published in 5 years, all of which have perfect patterns of results. Researchers are occasionally responding openly – and getting publicly praised for responding openly – rather than reacting defensively to questions about their work. (8)

VII. Summary and Moral
The replicability crisis is not just an issue for social psychology, and its causes aren’t unique to social psychology either. Claims that we don’t have a problem, because of various factors that are themselves unique to social psychology, fail to explain why so many other fields have similar concerns. The essential causes of the replicability crisis are cultural and institutional, and transcend specific fields of research. The remedies are too.

(1) The catalyst for this sudden attention appears to have been the nearly simultaneous appearance in JPSP of a study reporting evidence for precognition, and the exposure of massive data fraud by a prominent Dutch social psychologist. While these two cases were unrelated to each other and each exceptional by any standard, together they illuminated the fallibility of peer review and the self-correcting processes of science that were supposed to safeguard against accepting unreliable findings.
(2) Or 47%, or 39% or 68%, again, depending on your definition.
(3) Or a bit earlier, because Science magazine’s embargo was remarkably leaky, beginning with a Harvard press release issued several days before the article it promoted.
(4) To quote their press release; the word does not appear in their article.
(5) Full disclosure. This probably includes me, but I didn’t write a blog about it (until just now).
(6) A few: Dorothy Bishop, Andrew Gelman, Daniel Lakens, Uri Simonsohn, Sanjay Srivastava, Simine Vazire
(7) I strongly recommend reading Diederik Stapel’s vivid account (generously translated by Nick Brown) of how desperately he craved becoming one of these stars, and what this craving motivated him to do.
(8) Admittedly, defensive reactions, amplified in some cases by fan clubs, are still much more common. But I’m looking for positive signs here, and I think I see a few.

9 thoughts on “What if Gilbert is Right?”

  1. Thanks for the kind words on my “generosity”. Diederik(*) Stapel’s book is indeed a goldmine of insights into how the system works. I don’t know if it’s a totally honest account of what really happened in his lab – although he and I collaborated on the translation, I never asked him to elaborate on specifics – but even if it isn’t a complete confession, it’s still jaw-dropping on occasion.

    (*) By the way, could you correct the spelling of his first name?

    • It’s a classic example of the “unreliable narrator,” but a fascinating read. It was generous of you to take the time to make it accessible to us non-Dutch readers.
      *Spelling corrected; thanks.

    • Yes, thank you for your work translating! I meant to just take a look, but now I’m 60 pages deep, and that’s reading on a laptop screen. I’m just entering undergraduate sociology studies, and these non-commercial depictions of the struggles of life within academia (and the tension between ideals and personal success) seem as rare as they are relevant for laying out hopes and plans within academia.

  2. What if the SPSP taskforce on best practices had declared that researchers have to decide whether they are using a dataset for publication before they look at the result, and that violating this rule violates rules of scientific integrity and ethical conduct?

  3. “Let’s stipulate, for the moment, that Gilbert and Co. are correct that the COS project told us nothing worth knowing about the replicability of social psychological research. What then?”

    — You don’t want to do that; exactly the wrong move. There’s absolutely no basis to accept the unfounded premise that Gilbert et al. might be correct, and lots of reasons to dismiss it — including multiple insurance TV commercials — so don’t give them a pass like D. Trump’s gotten. Only further confuses the whole matter…

    • Dave,

      I interpreted this post a little bit differently. It seems to me that the goal of the post was to suggest that if all of the other sciences possess replicability issues, then it is likely that social psychology does as well. It seems that David mentioned that we should “stipulate… that Gilbert and Co. are correct” only to build a narrative. The takeaway message I received was “Gilbert and Co. may be right. But don’t you think it would be strange if all the other disciplines possess replicability issues, but social psychology doesn’t??”

      I agree that we should carefully consider and debate the validity of Gilbert and Co.’s claims. However, assuming that there are replicability issues in the rest of science (which is a broader issue that I am less familiar with), perhaps we should shift the burden of proof and assume that social psychology has a replicability issue (as demoralizing as that notion may be) until a more convincing (and likely less controvertible) amount of evidence suggests that we do not.

  4. Pingback: Core Economics | So, is there a crisis? Or is there a crisis of the crisis, or what? On replicability, reproducibility, and other current challenges in the social sciences

  5. Pingback: The Replication Crisis Is My Crisis – Undark Magazine

  6. Pingback: Understanding the replication crisis – Philipp K. Masur
