Bargain Basement Bayes

One of the more salutary consequences of the “replication crisis” has been a flurry of articles and blog posts re-examining basic statistical issues such as the relations between N and statistical power, the importance of effect size, the interpretation of confidence intervals, and the meaning of probability levels. A lot of the discussion of what is now often called the “new statistics” really amounts to a re-teaching (or first teaching?) of things anybody, certainly anybody with an advanced degree in psychology, should have learned in graduate school if not as an undergraduate. It should not be news, for example, that bigger N’s give you a bigger chance of getting reliable results, including being more likely to find effects that are real and not being fooled into thinking you have found effects when they aren’t real. Nor should anybody who had a decent undergrad stats teacher be surprised to learn that p-levels, effect sizes and N’s are functions of each other, such that if you know any two of them you can compute the third, and that therefore statements like “I don’t care about effect size” are absurd when said by anybody who uses p-levels and N’s.

But that’s not my topic for today. My topic today is Bayes’ theorem, which is an important alternative to the usual statistical methods, but which is rarely taught at the undergraduate or even graduate level. (1) I am far from expert about Bayesian statistics. This fact gives me an important advantage: I won’t get bogged down in technical details; in fact that would be impossible, because I don’t really understand them. A problem with discussions of Bayes’ theorem that I often see in blogs and articles is that they have a way of being both technical and dogmatic. A lot of ink – virtual and real – has been spilled about the exact right way to compute Bayes Factors, and in advocating that all statistical analyses be conducted within a Bayesian framework. I don’t think the technical and dogmatic aspects of these articles help – in fact I think they mostly hurt – when it comes to showing non-experts what thinking in a semi-Bayesian way has to offer. So, herewith is my extremely non-technical and very possibly wrong (2) appreciation of what I call Bargain Basement Bayes.

Bayes’ Formula: Forget about Bayes’ formula. I have found that even experts have to look it up every time they use it. For many purposes, it’s not needed at all. However, the principles behind the formula are important. The principles are these:

1. First, Bayes assumes that belief exists in degrees, and assigns numbers to those degrees of belief. If you are certain that something is false, it has a Bayes “probability” of 0. If you are certain it’s true, the probability is 1. If you have absolutely no idea whatsoever, the probability is .5. Everything else is in between.
Traditional statisticians hate this. They don’t think a single fact, or event, can even have a probability. Instead, they want to compute probabilities that refer to frequencies within a class, such as the number of times out of a hundred a result would be greater than a certain magnitude under pure chance given a certain N. But really, who cares? The only reason anybody cares about this traditional kind of probability is because after you compute that nice “frequentist” result, you will use the information to decide what you believe. And, inevitably, you will make that decision with a certain degree of subjective confidence. Traditional statistics ignores and even denies this last step, which is precisely where it goes very, very wrong. In the end, beliefs are held by, and decisions based on those beliefs are made by, people, not numbers. Sartre once said that even if there is a God, you would still have to decide whether to do what He says. Even if frequentist statistics are exactly correct (3) you still have to decide what to do with them.

2. Second, Bayes begins with what you believed to be true before you got your data. And then it asks, now that you have your data, how much should you change what you used to believe? (4)
Traditional statisticians hate this even more than they hate the idea of putting numbers on subjective beliefs. They go on about “prior probabilities” and worry about how they are determined, observe (correctly) that there is no truly objective way to estimate them, and suspect that the whole process is just a complicated form of inferential cheating. But the traditional model begins by assuming that researchers know and believe absolutely nothing about their research topic. So, as they then must, they will base everything they believe on the results of their single study. If those results show that people can react to stimuli presented in the future, or that you can get people to slow their walks to a crawl by having them unscramble the word “nldekirw” (5) then that is what we have to believe. In the words of a certain winner of the Nobel Prize, “we have no choice.”
Bayes says, oh come on. Your prior belief was that these things were impossible (in the case of ESP) or, once the possibility of elderly priming was explained, that it seemed pretty darned unlikely. That’s what made the findings “counter-intuitive,” after all. Conventional statistics ignores these facts. Bayes acknowledges that claims that are unlikely to be true, a priori, need extra-strong evidence to become believable. I am about the one millionth commentator to observe that social psychology, in particular, has for too long been in thrall to the lure of the “counter-intuitive result.” Bayes explains exactly how that got us into so much trouble. Counter-intuitive, by definition, means that the finding had a low Bayesian prior. Therefore, we should have insisted on iron-clad evidence before we started believing all those cute surprising findings, and we didn’t. Maybe some of them are true; who knows at this point. But the clutter of small-N, underpowered single studies with now-you-see-it-now-you-don’t results is in a poor position to tell us which they are. Really, we almost need to start over.
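To see how little a single study moves a skeptical prior, here is a minimal sketch in Python. The numbers are made up purely for illustration: a prior of .01 for a counter-intuitive claim, and Bayes factors of 3 and 100 standing in for a single modest study versus genuinely iron-clad evidence.

```python
def update(prior, bayes_factor):
    """Posterior probability of a claim, given a prior probability and a
    Bayes factor (evidence strength): posterior odds = prior odds * BF."""
    prior_odds = prior / (1 - prior)
    posterior_odds = prior_odds * bayes_factor
    return posterior_odds / (1 + posterior_odds)

prior = 0.01                       # hypothetical prior for a "counter-intuitive" claim
modest_study = update(prior, 3)    # evidence 3x likelier if the claim is true
iron_clad = update(prior, 100)     # evidence 100x likelier if the claim is true
print(round(modest_study, 3), round(iron_clad, 2))
```

Even evidence three times more likely under the claim than under its negation moves a .01 prior only to about .03; it takes a Bayes factor of 100 just to reach even odds. That is the arithmetic behind demanding extra-strong evidence for surprising findings.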

3. Third, Bayes is in the end all about practical decisions. Specifically, it’s about decisions to believe something, and to do something or not, in the real world. It is no accident, I think, that so many Bayesians work in applied settings and focus on topics such as weather forecasting, financial planning, and medical decisions. In all of these domains, the lesson they teach tends to be – as Kahneman and Tversky pointed out long ago – that we underuse baserates (6). In medicine, in particular, the implications are just starting to be understood in the case of screening for disease. When the baserate (aka the prior probability) is low, then even highly diagnostic tests have a very high probability of yielding false positives, which entail significant physical, psychological, and financial costs. Traditional statistical thinking, which ignores baserates, leads one to think that a positive result of a test with 90% accuracy means that the patient has a 90% chance of having the disease. But if the prevalence in the population is 1%, the actual probability given a positive test is less than 10%. In subjective, Bayesian terms, of course! Extrapolating this to the context of academic research, the principle implies that we overestimate the diagnosticity of single research studies, especially when the prior probability of the finding is low. I think this is why we were so willing to accept implausible, “counter-intuitive” results on the basis of inadequate evidence. To our current grief.
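The screening arithmetic above is easy to verify. A minimal sketch, assuming (as in the text) a test that is 90% accurate in both directions (sensitivity and specificity) and a disease prevalence of 1%:

```python
def posterior(prevalence, sensitivity, specificity):
    """P(disease | positive test), via Bayes' rule on the two ways
    a positive result can arise: true positives and false positives."""
    true_pos = sensitivity * prevalence              # sick and caught
    false_pos = (1 - specificity) * (1 - prevalence) # healthy but flagged
    return true_pos / (true_pos + false_pos)

p = posterior(prevalence=0.01, sensitivity=0.90, specificity=0.90)
print(round(p, 3))
```

The posterior works out to .009/.108, about .083: under 10%, despite a “90% accurate” test, because the 99% of people who are healthy generate far more false positives than the 1% who are sick generate true ones.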

You don’t have to be able to remember Bayes’ formula to be a Bargain Basement Bayesian. But, as in all worthwhile bargain basements, you can get something valuable at a low cost.

Footnotes
1. In a recent graduate seminar that included students from several departments, I asked who had ever taken a course that taught anything about Bayes.  One person raised her hand.  Interestingly, she was a student in the business school.
2. Hi Simine.
3. They aren’t.
4. Bayes is sometimes called the “belief revision model,” which I think is pretty apt.
5. Wrinkled
6. Unless the data are presented in an accessible, naturalistic format such as seen in the work by Gerd Gigerenzer and his colleagues, which demonstrates how to present Bayesian considerations in terms other than the intimidating-looking formula.

Towards a De-biased Social Psychology: The effects of ideological perspective go beyond politics.

Behavioral and Brain Sciences, in press; subject to final editing before publication

This is a commentary on: Duarte, J.L., Crawford, J.T., Stern, C., Haidt, J., Jussim, L., & Tetlock, P.E. (in press). Political diversity will improve social psychological science. Behavioral and Brain Sciences.

“A liberal is a man too broadminded to take his own side in a quarrel.” — Robert Frost

Liberals may be too open-minded for their own (ideological) good; they keep finding fault with themselves, and this article is a good example. Which is not to say it’s not largely correct. Social and personality psychology obviously lacks ideological diversity, and Duarte and colleagues provide strong circumstantial evidence that the causes include a hostile climate, lack of role models, and subtle and not-so-subtle discrimination of the same sort that underlies lack of diversity elsewhere in society.

Duarte et al. argue that our science would be better if more “conservatives” were included in the ideological mix. But the point of view that carries this label has changed greatly in recent years. Not so long ago, no conservative would dream of shutting down the government over an ideological dispute, denying the validity of settled science, or passing laws to encourage open carry of weapons on college campuses. Conservatives were conservative. Such people indeed have a lot to contribute to any discussion, including scientific ones. But many modern-day “conservatives” — especially the loudest ones — would better be described as radical, and among their radical characteristics is a pride in anti-intellectualism and willful ignorance. In a call for more conservatives, who are we actually inviting and, I truly wonder, how many even exist? I am not optimistic about the feasibility of finding enough reasonable conservatives to join our field, even if we could overcome all of the barriers the target article so vividly describes. At best, such change is a long-term goal.

In any case, we shouldn’t wait for conservatives to arrive and save us. We need to save ourselves. The target article presents mixed messages about whether de-biasing is feasible. On the one hand, it cites evidence that de-biasing is difficult or impossible. On the other hand, the entire article is an effort at de-biasing. I choose to believe the more optimistic, implicit claim of Duarte et al., which is that we can become more intellectually honest with ourselves and thereby do better science. I find the “mirror-image” test particularly promising. For any finding, we should indeed get into the habit of asking, what if the very same evidence had led to the opposite conclusion?

Politics is the least of it. In focusing on research that seeks to describe how conservatives are cognitively flawed or emotionally inadequate, or on research that treats conservative beliefs as ipso facto irrational, Duarte et al. grasp only the low-hanging fruit. More pernicious, I believe, are the ways ideological predilections bias the conduct and evaluation of research that, on the surface, has nothing to do with politics. An awful lot of research and commentary seems to be driven by our value systems, what we wish were true. So we do studies to show that what we wish were true is true, and attack the research of others that leads to conclusions that do not fit our world view.

Examples are legion. Consider just a few:

Personality and abilities are heritable. This finding is at last taking hold in psychology, after a century’s dominance of belief in a “blank slate.” The data were just too overwhelming. But the idea that people are different at the starting line is heartbreaking to the liberal world-view and encounters resistance even now.

Human nature is a product of evolution. Social psychologists are the last people you would expect to deny that Darwin was right — except when it comes to human behavior, and especially if it has anything to do with sex differences (Winegard et al., 2014). The social psychological alternative to biological evolution is not intelligent design, it’s culture. And as to where culture came from, that’s a problem left for another day.

The Fundamental Attribution Error is, as we all know, the unfortunate human tendency to view behavior as stemming from the characteristics — the traits and beliefs — of the people who perform it. Really, it’s the situation that matters. So, change the situation and you can change the behavior; it’s as simple as that. This belief is very attractive to a liberal world-view, and one does not have to look very far to find examples of how it is used to support various liberal attitudes towards crime and punishment, economic equality, education, and so forth. But the ideological consequences of belief in the overwhelming power of the situation are not consistent. It implies that the judges at Nuremberg committed the Fundamental Attribution Error when they refused to accept the excuse of Nazi generals that they were “only following orders.”

The consistency controversy, which bedeviled the field of personality psychology for decades and which still lingers in various forms, stems from the conviction among many social psychologists that the Fundamental Attribution Error, just mentioned, affects an entire subfield of psychology. Personality psychology, it is sometimes still said, exaggerates the importance of individual differences. But to make a very long story very short, individual differences in behavior are consistent across situations (Kenrick & Funder, 1988) and stable over decades (e.g., Nave et al., 2010). Many important life outcomes including occupational success, marital stability and even longevity can be predicted from personality traits as well as or better than from any other variables (Roberts et al., 2007). And changing behavior is difficult, as any parent trying to get a child to make his bed can tell you; altering attitudes is just as hard, as anyone who has ever tried to change anyone else’s mind in an argument can tell you. Indeed, does anybody ever change their mind about anything? Maybe so, but generally less than the situation would seem to demand. I expect that responses to the article by Duarte et al. will add one more demonstration of how hard it is to change ingrained beliefs.

REFERENCES
Kenrick, D.T., & Funder, D.C. (1988). Profiting from controversy: Lessons from the person-situation debate. American Psychologist, 43, 23-34.
Nave, C.S., Sherman, R.A., Funder, D.C., Hampson, S.E., & Goldberg, L.R. (2010). On the contextual independence of personality: Teachers’ assessments predict directly observed behavior after four decades. Social Psychological and Personality Science, 1, 327-334.
Roberts, B.W., Kuncel, N.R., Shiner, R., Caspi, A., & Goldberg, L.R. (2007). The power of personality: The comparative validity of personality traits, socioeconomic status, and cognitive ability for predicting important life outcomes. Perspectives on Psychological Science, 2, 313-345.
Winegard, B.M., Winegard, B.M., & Deaner, R.O. (2014). Misrepresentations of evolutionary psychology in sex and gender textbooks. Evolutionary Psychology, 12, 474-508.

How to Flunk Uber: A Guest Post by Bob Hogan


by Robert Hogan

Hogan Assessment Systems

Delia Ephron, a best-selling American author, screenwriter, and playwright, published an essay in the New York Times on August 31st, 2014 entitled “Ouch, My Personality, Reviewed”  that is a superb example of what Freud called “the psychopathology of everyday life.”  She starts the essay by noting that she recently used Uber, the car service for metrosexuals, and the driver told her that if she received one more bad review, “…no driver will pick you up.”  She reports that this feedback triggered some “obsessive” soul searching:  she wondered how she could have created such a bad score as an Uber passenger when she had only used the service 6 times.  She then reviewed her trips, noting that, although she had often behaved badly (“I do get short tempered when I am anxious”), in each case extenuating circumstances caused her behavior.  She even got a bad review after a trip during which she said very little:  “Perhaps I simply am not a nice person and an Uber driver sensed it.”

The essay is interesting because it is prototypical of people who can’t learn from experience.  For example, when Ms. Ephron reviewed the situations in which she mistreated Uber drivers, she spun each incident to show that her behavior should be understood in terms of the circumstances—the driver’s poor performance—and not in terms of her personality.  Perhaps situational explanations are the last refuge of both neurotics and social psychologists?

In addition, although the situations changed, she behaved the same way in each of them:  she complained, she nagged and micro-managed the drivers, she lost her temper, and she broadcast her unhappiness to the world.  Positive behavior may or may not be consistent across situations, but negative behavior certainly is.  And the types of negative behaviors she displayed fit the typology defined by the Hogan Development Survey (HDS), an inventory of the maladaptive behaviors that occur when people are dealing with others with less power and think no one important is watching them.

All her actions had a manipulative intent—Ms. Ephron wanted to compel a fractious driver to obey her. Her behaviors were tactical in that they gave her short-term, one-off wins—she got her way; but the behaviors become counterproductive when she has to deal with the same people repeatedly—or when she is dealing with NYC Uber drivers. Strategic players carefully control what Erving Goffman called “their leaky channels,” the behavioral displays that provide information regarding a player’s character or real self. The tactical Ms. Ephron seems unable to control her leaky channels.

It was also interesting to learn that, although Ms. Ephron has been in psychotherapy for years, the way she mistreats “little people” seemingly never came up. This highlights the difference between intrapsychic and interpersonal theories of personality.   From an intrapsychic perspective, emotional distress creates problems in relationships; fix the emotional problems and the relationships will take care of themselves.  From an interpersonal perspective, problems in relationships create emotional distress—fix the relationships (behave better) and the emotional problems will take care of themselves.  In the first model, intrapsychic issues disrupt relationships; in the second model, disrupted relationships cause intrapsychic issues.

As further evidence that Ms. Ephron lacks a strategic understanding of social behavior, she is surprised to learn that other people keep score of her behavior.  This means that she pays no attention to her reputation.  But her reputation is the best source of data other people have concerning how to deal with her.  She might not care about her reputation, but those who deal with her do.  All the data suggest that she will have the same reputation with hair dressers, psychotherapists, and purse repair people as she does with the Uber drivers of New York.

Finally, people flunk Uber the same way as they become unemployable and then flunk life—they flunk one interaction at a time.  After every interaction there is an accounting process, after which something is added to or subtracted from peoples’ reputations.  The score accumulates over time and at some point, the Uber drivers refuse to pick them up.  Ms. Ephron is a successful artist, and her success buys her a degree of idiosyncratic credit—she is allowed to misbehave in the artistic community—but there are consequences when she misbehaves in the larger community of ordinary actors.

The Real Source of the Replication Crisis

“Replication police.” “P-squashers.” “Hand-wringers.” “Hostile replicators.”  And of course, who can ever forget, “shameless little bullies.”  These are just some of the labels applied to what has become known as the replication movement, an attempt to improve science (psychological and otherwise) by assessing whether key findings can be reproduced in independent laboratories.

Replication researchers have sometimes targeted findings they found doubtful.  The grounds for finding them doubtful have included (a) the effect is “counter-intuitive” or in some way seems odd (1), (b) the original study had a small N and an implausibly large effect size, (c) anecdotes (typically heard at hotel bars during conferences) abound concerning naïve researchers who can’t reproduce the finding, (d) the researcher who found the effect refuses to make data public, has “lost” the data or refuses to answer procedural questions, or (e) sometimes, all of the above.

Fair enough. If a finding seems doubtful, and it’s important, then it behooves the science (if not any particular researcher) to get to the bottom of things.  And we’ve seen a lot of attempts to do that lately. Famous findings by prominent researchers have been put  through the replication wringer, sometimes with discouraging results.  But several of these findings also have been stoutly defended, and indeed the failure to replicate certain prominent effects seems to have stimulated much of the invective thrown at replicators more generally.

One target of the “replication police” has enjoyed few defenders. I speak, of course, of Daryl Bem (2). His findings on ESP stimulated an uncounted number of replications.  These were truly “hostile” – they set out to prove his findings wrong and what do you know, the replications uniformly failed. In response, by all accounts, Daryl has been the very model of civility. He provides his materials and data freely and without restriction, encourages the publication of all findings regardless of how they turn out, and has managed to refrain from telling his critics that they have “nothing in their heads” or indeed, saying anything negative about them at all that I’ve seen. Yet nobody comes to his defense (3) even though any complaint anybody might have had about the “replication police” applies, times 10, to the reception his findings have received. In particular, critiques of the replication movement never mention the ESP episode, even though Bem’s experience probably provides the best example of everything they are complaining about.

Because, and I say this with some reluctance, the backlash to the replication movement does have a point.  There IS something a bit disturbing about the sight of researchers singling out effects they don’t like (maybe for good reasons, maybe not), and putting them – and only them – under the replication microscope.  And, I also must admit, there is something less-than-1000% persuasive about the inability to find an effect by somebody who didn’t believe it existed in the first place. (4)

But amidst all the controversy, one key fact seems to be repeatedly overlooked by both sides (6). The replication crisis did NOT arise because of studies intended to assess the reality of doubtful effects. That only came later. Long before came repeated and, indeed, ubiquitous stories of fans of research topics – often graduate students – who could not make the central effects work no matter how hard they tried. These failed replications were anything but hostile. I related a few examples in a previous post but there’s a good chance you know some of your own. Careers have been seriously derailed as graduate students and junior researchers, naively expecting to be able to build on some of the most famous findings in their field, published in the top journals by distinguished researchers, simply couldn’t make them work.

What did they do? In maybe 99% of all cases (100% of the cases I personally know about) they kept quiet. They – almost certainly correctly – saw no value for their own careers in publicizing their inability to reproduce an effect enshrined in textbooks.  And that was before a Nobel laureate promulgated new rules for doing replications, before a Harvard professor argued that failed replications provide no information and, of course, long before people reporting replication studies started to be called “shameless little bullies.”

Most of the attention given these days to the replication movement — both pro and con — seems to center on studies specifically conducted to assess whether or not particular findings can be replicated. I am one of those who believe that such studies are, by and large, useful and important. But we should remember that the movement did not come about because of, or in order to promote, such studies. Instead, its original motivation was to make it a bit easier and safer to report data that go against the conventional wisdom, and thereby protect those who might otherwise waste years of their lives trying to follow up on famous findings that have already been disconfirmed, everywhere except in public. From what we’ve seen lately, this goal remains a long way off.

  1. Naturally, counter-intuitiveness and oddity is in the eye of the beholder.
  2. Full disclosure: He was my graduate advisor. An always-supportive, kind, wise and inspirational graduate advisor.
  3. Unless this post counts.
  4. This is why Brian Nosek and the Center for Open Science made exactly the right move when they picked out a couple of issues of JPSP and two other prominent journals and began to recruit researchers to replicate ALL of the findings in them (5). Presumably nobody will have a vested interest or even a pre-existing bias as to whether or not the effects are real, making the eventual results of these replications, when they arrive, all the more persuasive.
  5. Full disclosure: A study of mine was published in one of those issues of JPSP. Gulp!
  6. Even here, which is nothing if not thorough.

Acknowledgements: Simine Vazire showed me that footnotes can be fun.  But I still use my caps key. Simine and Sanjay Srivastava gave me some advice, some of which I followed.  But in no way is this post their fault.

When Did We Get so Delicate?

Replication issues are rampant these days. The recent round of widespread concern over whether supposedly established findings can be reproduced began in biology and the related life sciences, especially medicine. Psychologists entered the fray a bit later, largely in a constructive way. Individuals and professional societies published commentaries on methodology, journals revised their policies to promote data transparency and encourage replication, and the Center for Open Science took concrete steps to make doing research “the right way” easier. As a result, psychology came to be viewed not as the poster child of replication problems but, quite the opposite, as the best place to look for solutions to them.

So what just happened? In the words of a headline in the Chronicle of Higher Education, the situation in psychology has suddenly turned “ugly and odd.”  Some psychologists whose findings were not replicated are complaining plaintively about feeling bullied. Others are chiming in about how terrible it is that people’s reputations are ruined when others can’t replicate their work. People doing replication studies have been labeled the “replication police,” “replication Nazis” and even, in one prominent psychologist’s already famous phrase, “shameless little bullies.” This last-mentioned writer also passed along an anonymous correspondent’s description of replication as a “McCarthyite nightmare.”  More sober commentators have expressed worries about “negative psychology” and “p-squashing.” Concern has shifted away from the difficulties faced by those who can’t make famous effects “work,” and the dilemma about whether they dare to go public when this happens. Instead, prestigious commentators are worrying about the possible damage to the reputations of the psychologists who discovered these famous effects, and promulgating new rules to follow before going public with disconfirmatory data.

First, a side comment: It’s my impression that reputations are not really damaged, much less ruined, by failures to replicate. Reputations are damaged, I fear, by defensive, outraged reactions to failures to replicate one’s work. And we’ve seen too many of those, and not enough reactions like this.

But now, the broader point: When did we get so delicate? Why are psychologists, who can and should lead the way in tackling this scientific issue head-on, and until recently were doing just that, instead becoming distracted by reputational issues and hurt feelings?

Is anybody in medicine complaining about being bullied by non-replicators, or is anyone writing blog posts about the perils of “negative biology”? Or is it just us? And if it’s just us, why is that? I would really like to know the answer to this question.

For now, if you happen to be a psychologist sitting on some data that might undermine somebody’s famous finding, the only advice I can give you is this:  Mum’s the word.  Don’t tell a soul.  Unless you are the kind of person who likes to poke sticks into hornets’ nests.

The “Fundamental Attribution Error” and Suicide Terrorism

Review of: Lankford, A. (2013) The myth of martyrdom: What really drives suicide bombers, rampage shooters, and other self-destructive killers. Palgrave Macmillan.
In Press, Behavioral and Brain Sciences (published version may differ slightly)

In 1977, the social psychologist Lee Ross coined the term “fundamental attribution error” to describe the putative tendency of people to overestimate the importance of dispositional causes of behavior, such as personality traits and political attitudes, and underestimate the importance of situational causes, such as social pressure or objective circumstances. Over the decades since, the term has firmly rooted itself in the conventional wisdom of social psychology, to the point where it is sometimes identified as the field’s basic insight (Ross & Nisbett 2011). However, the actual research evidence purporting to demonstrate this error is surprisingly weak (see, e.g., Funder 1982; Funder & Fast 2010; Krueger & Funder 2004), and at least one well-documented error, the “false consensus bias” (Ross 1977a), implies that people overestimate the degree to which their behavior is determined by the situation.

Moreover, everyday counter-examples are not difficult to formulate. Consider the last time you tried, in an argument, to change someone’s attitude. Was it easier, or harder than you expected? Therapeutic interventions and major social programs intended to correct dispositional problems, such as tendencies towards violence or alcoholism, also are generally less successful than anticipated. Work supervisors and even parents, who have a great deal of control over the situations experienced by their employees or children, similarly find it surprisingly difficult to control behaviors as simple as showing up on time or making one’s bed. My point is not that people never change their minds, that interventions never work, or that employers and parents have no control over employees or children; it is simply that situational influences on behavior are often weaker than expected.

Even so, it would be going too far to claim that the actual “fundamental” error is the reverse, that people overestimate the importance of situational factors and underestimate the importance of dispositions.  A more judicious conclusion would be that sometimes people overestimate the importance of dispositional factors, and sometimes they overestimate the importance of situational factors, and the important thing, in a particular case, is to try to get it right. The book under review, The Myth of Martyrdom (Lankford 2013), aims to present an extended example of an important context in which many authoritative figures get it wrong, by making the reverse of the fundamental attribution error (though the book never uses this term): When trying to find the causes of suicide terrorism, too many experts ascribe causality to the political context in which terrorism occurs, or the practical aims that terrorists hope to achieve. Instead, the author argues, most, if not all, suicide terrorists are mentally disturbed, vulnerable, and angry individuals who are not so different from run-of-the-mill suicides, and who are in fact highly similar to “non-terrorist” suicidal killers such as the Columbine or Sandy Hook murderers. Personality and individual differences are important; suicide terrorists are not ordinary people driven by situational forces.

Lankford convincingly argues that misunderstanding suicide terrorists as individuals who are rationally responding to oppression, or who are motivated by political or religious goals, is dangerous, because it plays into the propaganda aims of terrorist organizations to portray such individuals as brave martyrs rather than weak, vulnerable, and exploitable pawns. By spreading the word that suicide terrorists are mentally troubled individuals who wish to kill themselves as much as or more than they desire to advance any particular cause, Lankford hopes to lessen the attractiveness of the martyr role to would-be recruits, and also to remove any second-hand glory that might otherwise accrue to a terrorist group that manages to recruit suicide-prone operatives to its banner.

Lankford’s overall message is important. However, the book is less than an ideal vehicle for it. The evidence cited consists mostly of a hodge-podge of case studies showing that some suicide terrorists, such as the lead 9/11 hijacker, had mental health issues and suicidal tendencies that long preceded their infamous acts. The book speaks repeatedly of the “unconscious” motives of such individuals, without developing a serious psychological analysis of what unconscious motivation really means or how it can be detected. It rests much of its argument on quotes from writers that Lankford happens to agree with, rather than on independent analysis. It never mentions the “fundamental attribution error,” a prominent theme within social psychology that is the book’s major implicit counterpoint, whether Lankford knows this or not. The obvious parallels between suicide terrorists and genuine heroes who are willing to die for a cause are noted, but a whole chapter (Ch. 5) attempting to explain how they are different fails to make the distinction clear, at least to this reader. In the end, the book is not a work of serious scholarship. It is written at the level of a popular, “trade” book, in prose that is sometimes distractingly overdramatic and even breathless. Speaking as someone who agrees with Lankford’s basic thesis, I wish it had received the serious analysis and documentation it deserves, and had been tied to other highly relevant themes in social psychology. Perhaps another book, more serious but less engaging to the general reader, lies in the future. I hope so.

For the ideas in this book are important. One attraction of the concept of the “fundamental attribution error,” and of the emphasis on situational causation in general, is that it is seen by some as removing limits on human freedom, implying that anybody can accomplish anything regardless of his or her abilities or stable attributes. While these are indeed attractive ideas, they are values and not scientific principles. Moreover, an overemphasis on situational causation removes personal responsibility, one example being the perpetrators of the Nazi Holocaust who claimed they were “only following orders.” Renewed attention to the personal factors that affect behavior may not only help to identify people at risk of committing atrocities, but also restore the notion that, situational factors notwithstanding, a person is in the end responsible for what he or she does.

References
Funder, D. C. (1982) On the accuracy of dispositional vs. situational attributions. Social Cognition 1:205–22.
Funder, D. C. & Fast, L. A. (2010) Personality in social psychology. In: Handbook of social psychology, 5th edition, ed. D. Gilbert & S. Fiske, pp. 668–97. Wiley.
Krueger, J. I. & Funder, D. C. (2004) Towards a balanced social psychology: Causes, consequences and cures for the problem-seeking approach to social behavior and cognition. Behavioral and Brain Sciences 27:313–27.
Lankford, A. (2013) The myth of martyrdom: What really drives suicide bombers, rampage shooters, and other self-destructive killers. Palgrave Macmillan.
Ross, L. (1977a) The false consensus effect: An egocentric bias in social perception and attribution processes. Journal of Experimental Social Psychology 13(3):279–301.
Ross, L. (1977b) The intuitive psychologist and his shortcomings: Distortions in the attribution process. In: Advances in experimental social psychology, vol. 10, ed. L. Berkowitz, pp. 173–220. Academic Press.
Ross, L. & Nisbett, R. E. (2011) The person and the situation: Perspectives of social psychology, 2nd edition. Pinter and Martin.

Why I Decline to do Peer Reviews (part two): Eternally Masked Reviews

In addition to the situation described in a previous post, there is another situation where I decline to do a peer review. First, I need to define a couple of terms. “Blind review” refers to the practice of concealing the identity of reviewers from authors. The reason seems pretty obvious. Scientific academia is a small world, egos are easily bruised, and vehicles for subtle or not-so-subtle vengeance (e.g., journal reviews and tenure letters) are readily at hand. If an editor wants an unvarnished critique, the reviewer’s identity needs to be protected. That’s why every journal (I know of) follows the practice of blind review.

“Masked review” is different. In this practice, the identity of the author(s) is concealed from reviewers. The well-intentioned reason is to protect authors from bias, such as bias against women, junior researchers, or researchers from non-famous institutions. Some journals use masked review for all articles; some offer the option to authors; some do not use it at all.

A few years ago, I did a review of an article submitted to Psychological Bulletin. The journal had a policy of masked review posted on its masthead, noting that the identity of the author(s) is concealed from the reviewers “during the review process.” I liked the article and wrote a positive review. The other two reviewers didn’t like it, and the article was rejected. I was surprised, when I received my copy of the rejection letter, that the authors’ identity was still redacted.

So I contacted the editor. I was sure there had been some (minor) mistake. But the editor refused to reveal who the authors were, saying that the review was masked. I pointed out the phrase in the statement of journal policy that authors’ identity would be concealed “during the review process.” I had assumed this meant only during the review process. The editor replied that while he could see my point, he could reveal the authors’ name(s) only with the authors’ permission. This seemed odd, but I said OK, go ahead, ask the authors if I can know who they are. The answer came back that I could, if I revealed my own identity!

Now, I should not have had any problem with this, right? My own review was positive, so this was probably a chance to make a new friend. I only wanted to know the authors’ identity so that I could follow their work in general, and the fate of this particular article in particular. Still, the implications disturbed me. If the rule is that author identity is unmasked after the review process only if the reviewer agrees to be identified to the author, then it seems that only writers of positive reviews would learn authors’ identity, because they are the only ones who would agree. Reviewers who wrote negative reviews would be highly unlikely to allow their identity to be revealed because of possible adverse consequences – recall that this is the very reason for “blind” review in the first place. And the whole situation makes no sense anyway. What’s the point of continuing to mask author identity after the review is over?

At this time, ironically, I was a member of the Publications and Communications Board of the American Psychological Association, which oversees all of its journals, including Psychological Bulletin. And then, through the normal rotation, I became Chair of this august body! There was a sort-of joke around the P&C Board that every Chair got one “gimme,” a policy change that everybody would go along with to allow the Chair to feel like he or she had made a mark. The gimme I wanted was to change APA’s policy on masked review to match what the statement at Psychological Bulletin implied was its policy already: Authors’ identities would be revealed to reviewers at the conclusion of the review process.

The common sense of this small change, if that’s what it even was, seemed so obvious that arguments in its favor felt superfluous. But I came up with a few anyway:
1. The purpose of masked review, in the words of the APA Editor’s Handbook, is “to achieve unbiased review of manuscripts.” This purpose is no longer served once review is over.
2. Reviewers are unpaid volunteers. One of the few rewards of reviewing is early and first-hand contact with the research literature, which allows one to follow the development of research programs by researchers or teams of researchers over time. This reward is to some extent – to a large extent? – removed by concealing author identity even when the review is over. Moreover, the persistent concealment of author identity signals a distrust of reviewers who have given of their time.
3. Important facts can come to light when author identity is revealed. A submitted article may be a virtual repeat of a previous article by the same authors (self-plagiarism), it may contradict earlier work by the same authors without attempting to resolve the contradiction, or it may have been written by a student or advisor of a reviewer who may or may not have noticed and may or may not have notified the editor if he or she did notice. These possibilities are all bad enough during the review process; they can permanently evade detection unless author identity is unmasked at some point.
4. The APA handbook already acknowledges that masking is incomplete at best. The action editor knows author identity, and the mask often slips in uncontrolled ways (e.g., the reviewer guessing, correctly or not). Ending masking at the end of the review process would therefore equalize the status of all authors, rather than having their identity guessed correctly in some cases and incorrectly in others — which itself could have odd consequences for the person who was thought to be the author, but wasn’t.

Do these arguments make sense to you? Then you and I are both in the minority. The arguments failed. The P&C Board actually did vote to change APA policy, as a personal favor I think, but the change was made contingent on comments from the Board of Editors (which comprises the editors of all the APA journals). I was not included in the Board of Editors meeting, but word came back that they did not like my proposal. Among the reasons: an author’s feelings might get hurt! And, it might hurt an author’s reputation if it ever became known that he or she had an article rejected. Because, it seems, this never happens to good scientists.

Today, the policy at Psychological Bulletin reads as follows: “The identities of authors will be withheld from reviewers and will be revealed after determining the final disposition of the manuscript only upon request and with the permission of the authors.” This is pretty much where the editor of the Bulletin came down, years ago, when I tried to find out an author’s identity. I guess I did have an impact on how this policy is now worded, if not its substance.

So here is the second reason that I (sometimes) decline to do peer reviews. If the authors’ identity is masked, I ask the editor whether the masking will be removed when the review process is over. If the answer is no, then I decline. The answer is usually no, so I get to decline a fair number of reviews.

Postscript: After writing the first draft of this blog, I was invited to review a (masked) article submitted to the Bulletin. I asked my standard question about unmasking at the conclusion of the review process. Instead of an answer, I received the following email: “As it turns out, your review will not be needed for me to make a decision, so unless you have already started, please do not complete your review.” So, I didn’t.