Even when you’re skewering an entire field of science, the better part of valor might be to use terms such as “circular analysis” rather than, say, “voodoo.”
The latter is how a team of scientists characterized some findings from brain imaging, as I described in a print column and a previous post about an upcoming paper in Perspectives on Psychological Science (available ahead of print here) by Ed Vul of MIT, Hal Pashler of UC San Diego and colleagues. Its original title, “Voodoo Correlations” in fMRI studies, has been replaced by the much politer “Puzzlingly High Correlations” in fMRI studies, but the message is the same: conclusions from brain-imaging studies of social and emotional aspects of human behavior (jealousy, altruism, social pain and the like) might be wrong and cannot be trusted unless the studies are redone with greater statistical rigor.
Now a new study, published last night online for the May issue of Nature Neuroscience, offers an equally devastating critique. Nikolaus Kriegeskorte, Chris Baker and colleagues, of the National Institute of Mental Health, analyzed all the fMRI studies published last year in five top journals (Nature, Science, Nature Neuroscience, Neuron and Journal of Neuroscience). Of the 134 fMRI papers, 42 percent (57 papers) committed a statistical sin at least once: they were guilty of what the NIMH scientists call “double dipping.”
In double dipping, scientists start with a hypothesis that some region of the brain is involved in, say, feeling jealous, and therefore responds to (say) a photo of your romantic rival by becoming extremely active. It’s double dipping, and problematic in a statistical sense, if the scientists then look for those more-active brain regions and analyze only these areas to test the hypothesis. The problem is that brain regions may appear more active when the subject sees that photo (compared with seeing, say, a landscape) purely by chance, and analyzing only these regions would give a misleading result. For the statisticians among you, it’s called non-independent selective analysis: the same data are used both to select the regions and to test them.
And it can turn dross into gold. When the NIMH team analyzed “noise”—that is, random data that was known not to show any effect—they still obtained results that seemed to connect a stimulus with a brain response. In other words, double dipping can do wonders for your study. As Kriegeskorte and his colleagues write, the practice “beautifies results, rendering them more attractive to authors, reviewers and editors, and thus more competitive for publication. These implicit incentives may create a preference for circular practices so long as the community condones them.”
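You can see the dross-into-gold trick in a few lines of code. The sketch below is not the NIMH team’s actual analysis, just a toy illustration under assumed numbers (10,000 “voxels,” pure noise in both conditions): pick the voxels that happen to look most responsive, then measure the effect in only those voxels, and a sizeable “activation” appears out of nothing. Measuring the same voxel count in fresh, independently collected noise shows the effect vanish.

```python
import random

random.seed(0)

n_voxels = 10_000
n_trials = 20  # measurements per condition

# Pure noise: each voxel's "response" is random in both conditions,
# so the true effect is zero everywhere.
def mean_noise():
    return sum(random.gauss(0, 1) for _ in range(n_trials)) / n_trials

# Apparent effect per voxel: condition A minus condition B, noise only.
effects = [mean_noise() - mean_noise() for _ in range(n_voxels)]

# Circular step: select the 100 voxels that happen to look most "active"...
selected = sorted(effects, reverse=True)[:100]

# ...then test the hypothesis using only those voxels.
biased_estimate = sum(selected) / len(selected)

# Independent test: measure 100 voxels in fresh data (new noise here).
fresh = [mean_noise() - mean_noise() for _ in range(100)]
unbiased_estimate = sum(fresh) / len(fresh)

print(f"double-dipped estimate: {biased_estimate:.2f}")  # clearly positive
print(f"independent estimate:  {unbiased_estimate:.2f}")  # near zero
```

Selecting the top slice of random data guarantees a large apparent effect; only a test on data not used for the selection gives an honest answer, which is why the critics insist on independent analyses or replication.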
As with the Vul et al. critique, this doesn’t mean that all the fMRI results are wrong. The point is, you can’t tell. “To decide which neuroscientific claims hold,” the NIMH scientists write, “the community needs to carefully consider each particular case, guided by both neuroscientific and statistical expertise. Reanalyses and replications may also be required.”
The British Psychological Society has a nice write-up of the new paper here. And Pashler, one of the “voodoo” authors, calls my attention to what he calls a “funny thing about the Kriegeskorte paper” that I didn’t notice: unlike his team, the NIMH scientists “didn’t publish the list of which of the 2008 papers were, and were not, afflicted with problems. . . . I can well understand why, since he is a full time fMRI researcher and needs to avoid ticking those people off—something none of us were terribly worried about. And we do know how thin-skinned they are. But there is some tension between the idea of a secret list of bad studies, on the one hand, and the whole notion of science as a public self-correcting enterprise, with a Literature that can be relied upon.”
Science does tend to move at glacial speed, but isn’t it time the fMRI community came to grips with the growing criticism of its methods?