scooped again

Two Spanish researchers have shown (original here) that two leading journals routinely publish statistical errors:

The analysis revealed that at least one error appeared in 38 per cent of the Nature papers and 25 per cent of the British Medical Journal papers looked at. Furthermore, the study estimates that four per cent of results reported to be statistically “significant” may not be significant after all.

Yet again, the Spanish study is an example of someone actually doing something I thought of some time ago. (Fortunately for me, I'm usually only pleased when this happens, because I know perfectly well that I'll never do anything with the idea.) I am woefully ignorant of statistics, and have probably published overly simplistic analyses myself (though I am careful about claims of significance, and am confident that I've made no errors there). This sorry state is much more prevalent among biomed researchers than it ought to be, so I'm not surprised by the study's findings. García-Berthou and Alcaraz also make another point upon which I've been known to wax shrewish:

As well as warning researchers and editors to be more careful with data, they also urge the publication of raw data online. “If we had that, we could check the results,” García-Berthou says. “Some journals already publish supplements online, but it’s rare, and I think it should become commonplace.”

I think it should by now be viewed as low-rent not to make your raw data available online. There’s no reason not to do it, unless you’re hiding something; if the journal doesn’t provide the option, the server space and bandwidth costs are well within reach of any research institution. I’m convinced that it will become a standard part of scientific publishing. (Obdisclosure: I haven’t made any of my raw data available online, even though it was about the first thing I thought of when I came across the net, way back in 1993. I could never convince the higher-ups that it was a good idea. I’ll start doing it as soon as I’m high enough on the food chain to insist on it, which I hope will be from the next paper onwards, paying for the hosting myself if need be.)

3 thoughts on “scooped again”

  1. You reminded me of a principle I learned in college statistics, many years ago. MINITAB, a statistics and number-crunching program, had just been made available, and we could now run all sorts of tests for significance. The professor warned us that if we had a data set of (say) 20 variables, and cross-tested each of them against each of the others, resulting in 190 cross-tests, using a 5% significance level, then we could expect 5% of 190, or 9 to 10, of the correlations to be significant, but that doesn’t necessarily mean that anything’s going on — it would be surprising if we found no statistically significant correlations. He made the broader point, too often ignored in the real world, that statistical correlation does not imply causation or actual relationship.
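The multiple-comparisons trap the professor described is easy to demonstrate with a quick simulation: 20 variables that are pure noise, cross-tested pairwise at the 5% level, will still throw up "significant" correlations. A minimal sketch (the variable count, sample size, and seed are illustrative choices, and the critical |r| ≈ 0.197 is the standard two-sided 5% cutoff for n = 100):

```python
import itertools
import random
import statistics

random.seed(1)

def pearson_r(x, y):
    """Sample Pearson correlation coefficient."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    num = sum((a - mx) * (b - my) for a, b in zip(x, y))
    den = (sum((a - mx) ** 2 for a in x)
           * sum((b - my) ** 2 for b in y)) ** 0.5
    return num / den

# 20 unrelated variables, 100 observations each: no real correlations exist.
n_vars, n_obs = 20, 100
data = [[random.gauss(0, 1) for _ in range(n_obs)] for _ in range(n_vars)]

# Critical |r| for two-sided p < 0.05 with n = 100 is roughly 0.197.
CRITICAL_R = 0.197

pairs = list(itertools.combinations(range(n_vars), 2))  # 190 cross-tests
false_hits = sum(1 for i, j in pairs
                 if abs(pearson_r(data[i], data[j])) > CRITICAL_R)

# Expect around 5% of 190, i.e. roughly 9-10 spurious "significant" pairs.
print(len(pairs), false_hits)
```

Since every variable is independent noise, each of those hits is a false positive, which is exactly the professor's point: with 190 tests, finding no "significant" correlations would be the surprising outcome.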

  2. Reminds me of this, one off your list of favs if I remember correctly. 🙂
    Magna Est Veritas
    Here, in this little Bay,
    Full of tumultuous life and great repose,
    Where, twice a day,
    The purposeless, gay ocean comes and goes,
    Under high cliffs, and far from the huge town,
    I sit me down.
    For want of me the world’s course will not fail:
    When all its work is done, the lie shall rot;
    The truth is great, and shall prevail,
    When none cares whether it prevail or not.
    – Coventry Patmore

  3. A new Handbook of Data Analysis (published by Sage — London) includes a very useful chapter on statistical significance. It includes cautions and examples of serious missteps.
