In light of our ongoing conversation about statistics, and the fact that all of us in the lab are in the business of producing science, I thought I’d take this unseasonably snowy evening to pull together some interesting, loosely related things I’ve been wanting to share for some time now.
To kick off the evening, I would like to bring up a piece that has been in the popular eye this past week: Reinhart & Rogoff’s “Growth in a Time of Debt.” The paper analyzes historical patterns of GDP growth and debt-to-GDP ratios. Its key conclusion is that there is a threshold effect: once debt exceeds 90% of GDP, GDP growth drops sharply. As is often the case, people who want this story to be true have taken it and run with it, giving the paper 450 citations in just under two years.
What makes this newsworthy is that a recent study attempted to replicate these results. As is detailed at great length here, reproducing the published numbers required two methodological judgement calls and one miscoding of the Excel data. The linked piece explains the methodology in substantial detail, but the bottom line is that the judgement calls that were made systematically strengthened the reported effect. What elevates this to a scientific peccadillo, if not quite a capital sin, is that these methodological decisions, and more importantly their impact on the findings, were not apparent from the original paper.
Of course, most of the attention has gone to the outright coding error, which simply excluded some observations. These absent observations happened to reinforce the observed effect as well, to the tune of 0.3 percentage points of growth. For context, the original study concluded that average growth for nations under high debt loads was actually negative (-0.1%). After correcting the Excel error and reversing the methodological judgement calls, the replicators estimated a growth rate of 2.1% under high debt, which was still lower than growth under lower debt loads (60% of GDP).
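To make the arithmetic concrete, here is a toy sketch of how an averaging range that silently stops short of the full data can move a mean by a substantial margin. The growth numbers below are invented for illustration; they are not the actual Reinhart & Rogoff dataset.

```python
# Toy illustration with made-up numbers -- NOT the real dataset.
# Each value is one country's average GDP growth (%) in high-debt years.
growth = [3.1, 2.7, 0.8, -1.0, 1.9]

# Averaging the full list, as intended.
full_mean = sum(growth) / len(growth)

# An averaging formula that stops two rows short silently drops the
# first two countries, much as the reported spreadsheet error dropped
# several countries from the high-debt average.
truncated_mean = sum(growth[2:]) / len(growth[2:])

print(f"all rows:     {full_mean:.2f}%")      # 1.50%
print(f"rows dropped: {truncated_mean:.2f}%") # 0.57%
```

Nothing in the spreadsheet flags the difference; the only symptom is a number that quietly moves by nearly a full percentage point.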
So, taking a step back, there seem to be two strikes against this from a scientific perspective. First, and I would argue most fundamental, is that the conclusions were driven by subjective decisions, and the nature of these decisions was not communicated in the paper. Second, and unfortunately getting the most press coverage, is the data coding error. These authors actually got off lightly in that regard. I stumbled across this post, which captures an unequivocal retraction due to an Excel coding error (for bonus points, look carefully at the blog’s name).
What makes the Reinhart & Rogoff case so chilling is that it seems plausible that someone actually based decisions on their results. While in the scientific community we routinely argue for a greater use of scientific evidence in decision-making, I feel many or most of us lack the appropriate level of terror that someone might actually listen to findings of ours that happen to be wrong. Over in the neuroscience field (them brain surgeons), there is concern right now over the effects of a literature populated by studies with low rates of replication. This relates to an interesting phenomenon called the “Decline Effect,” a surprisingly consistent tendency for observed effects to diminish over time (if the author’s name seems familiar, it may be because his career imploded when it came out that he had fabricated portions of a book he wrote). While underpowered studies can make it difficult to detect an effect, they also make it possible to observe very strong effects, the kind of multiple-asterisk effects that have you rushing to a high-impact journal. When people listen to such a study, as with the “vaccines cause autism” study (n = 12), things gang aft agley.
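The statistical engine behind the decline effect is easy to see in a small simulation. The sketch below is my own illustration (not drawn from any of the linked studies): it runs many underpowered studies of a modest true effect and keeps only the “significant” ones. Because a small study can only reach significance by overshooting, the published estimates are inflated, and better-powered replications then look like the effect is shrinking.

```python
import math
import random

random.seed(42)

TRUE_EFFECT = 0.3   # modest real effect, in SD units
N = 10              # small sample per study -> low power
STUDIES = 2000

# A study counts as "significant" if its sample mean clears the
# one-sided z threshold (SD of 1 assumed known, to keep things simple).
threshold = 1.96 / math.sqrt(N)   # about 0.62 -- over twice the true effect

significant_estimates = []
for _ in range(STUDIES):
    sample_mean = sum(random.gauss(TRUE_EFFECT, 1) for _ in range(N)) / N
    if sample_mean > threshold:
        significant_estimates.append(sample_mean)

published = sum(significant_estimates) / len(significant_estimates)
print(f"true effect:             {TRUE_EFFECT}")
print(f"mean published estimate: {published:.2f}")  # well above 0.3
```

Every estimate that clears the bar here is at least 0.62, more than double the true effect, so the selection filter alone, with no fraud anywhere, produces the inflation that later replications appear to erode.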
So what’s my point here? Simply this: most of us are dedicating, or have already dedicated, a great deal to the proposition that the world can be made a better place through the judicious application of science. We should make sure that our own work, and as far as we can manage our broader field, is up to that task.
To end on a lighter note, here is an interesting paper which I think should be required reading for everyone embarking on a career in research. The first papers to reference it appeared just a few months after publication, in the letters section of the journal Diabetes Care. While the blogosphere is largely in on the joke, the article has been cited 167 times in Web of Science, including four times so far this year. In my limited sampling, the paper has generally been taken at face value by the people whose research may be incorporated into [insert name of diabetic loved one here]’s treatment.