Data, eternal

Science, 02 Jan 2015
Vol. 347, Issue 6217, p. 7
DOI: 10.1126/science.aaa5057

During 2014, Science worked with members of the research community, other publishers, and representatives of funding agencies on many initiatives to increase transparency and promote reproducibility in the published research literature. Those efforts will continue in 2015. An additional focus, connected to that progress and essential to its success, will be on making data more open, easier to access, more discoverable, and more thoroughly documented. My own commitment to these goals is deeply held, for I learned early in my career that interpretations come and go, but data are forever.

During my qualifying exam to advance to Ph.D. candidacy, I drew a chalkboard cartoon of a then-new concept: that the weight of recently erupted oceanic volcanoes could elastically deform the surrounding seafloor, creating a deep depression and surrounding flexural arch. Afterward, H. W. Menard, the great marine geologist and a member of my exam committee, spread out a map of the Pacific. He pointed out places where there were older coral atolls (which marked former stands of sea level) that were either now uplifted or drowned in the vicinity of younger volcanoes. Using the distance from the young volcano to the atoll and the amount of uplift or depression, we were able to calibrate the long-term flexural strength of the Pacific seafloor under the weight of the volcanic loading. It did not matter that Menard had published a paper years earlier using a subset of the uplifted atolls to argue for another hypothesis, which he now happily discarded in favor of flexural warping. It occurred to me that there was no database of “drowned and uplifted atolls” that one could access. Menard's prior publication provided only a biased sampling of all occurrences. Had he not been on my exam committee, that unique set of observations to constrain the flexural rigidity might never have presented itself.

Data, particularly those collected with public funding, should be used so that they do the most good. When the greatest number of creative and insightful minds can find, access, and understand the essential features that led to the collection of a data set, the data reach their highest potential. Some four decades after my student days, the situation has improved in terms of the number of public data repositories, requirements for making data available, and metadata standards, but there is still a long way to go. So what can Science do to help in this regard, given that it covers many disciplines but is not deeply embedded in any one field?

There are many publicly and privately funded data repositories worldwide, not all of which are being used to their full potential. In 2015, we want to work with authors and readers to identify which of those repositories Science should promote because they are well managed, have long-term support, and are responsive to community needs. For data that do not neatly fit into large-scale repositories, we will explore other available options. We also will evaluate different ways to tag data sets and integrate such tagging into our peer-review process. For example, one might associate a digital identifier for a data set with a figure in a paper. A reviewer could use such an identifier to find the particular data that are related to the figure. The hope is to work with repositories that allow bidirectional tagging so that it is easy for someone—a reviewer or reader—to identify the data used in a Science paper.

Along with improving the “discoverability” of data sets, Science hopes to inspire creative ways to visualize them, both to communicate information and concepts more effectively and to facilitate the discovery of new phenomena. What happens when you bring together those who collect large data sets with those who develop the tools to analyze and view them? Stay tuned!

