What is the question?

See allHide authors and affiliations

Science  20 Mar 2015:
Vol. 347, Issue 6228, pp. 1314-1315
DOI: 10.1126/science.aaa6146

You are currently viewing the summary.

View Full Text

Log in to view the full text

Log in through your institution

Log in through your institution


Over the past 2 years, increased focus on statistical analysis brought on by the era of big data has pushed the issue of reproducibility out of the pages of academic journals and into the popular consciousness (1). Just weeks ago, a paper about the relationship between tissue-specific cancer incidence and stem cell divisions (2) was widely misreported because of misunderstandings about the primary statistical argument in the paper (3). Public pressure has contributed to the massive recent adoption of reproducible research tools, with corresponding improvements in reproducibility. But an analysis can be fully reproducible and still be wrong. Even the most spectacularly irreproducible analyses—like those underlying the ongoing lawsuits (4) over failed genomic signatures for chemotherapy assignment (5)—are ultimately reproducible (6). Once an analysis is reproducible, the key question we want to answer is, “Is this data analysis correct?” We have found that the most frequent failure in data analysis is mistaking the type of question being considered.