User:OpenScientist/Panton Papers/Data sharing requirements

Contributions to this page are licensed CC-BY

About
This section is auxiliary to the drafting and will be deleted before submission.

This document serves to draft an article on data sharing requirements by major research funders (and possibly journals) as part of the Panton Papers series, to be published in BMC Research Notes as a commentary. See http://science.okfnpad.org/PantonPapers for an overview of the initiative, and http://okfnpad.org/sciencewg-PPcasestudies for case studies related to the Panton Principles. For news on the matter, see http://friendfeed.com/data-publishing and http://friendfeed.com/opendata.

Author guidelines
This section is auxiliary to the drafting and will be deleted before submission.


 * Generic for BMC Research Notes: http://www.biomedcentral.com/bmcresnotes/ifora/

"article type 'commentary' [..]. It's fairly flexible other than needing an unstructured abstract (about 100 words) and a Discussion - as the main text - broken into subheadings if desired. See Vince Smith's piece on data publication for an example, http://www.biomedcentral.com/1756-0500/2/113."
 * Specific for the Panton Papers series (from email):

Authors
Daniel Mietchen, Herbert Grüttemeier and Claudia Koltzenburg
 * further contributors welcome

For background, see Herbert's presentation at http://www.ape2011.eu/html/full_programm_2.html which covers funders: http://river-valley.tv/helping-to-ride-a-look-at-data-sharing-and-access-policies/. Journal policies are possibly to be covered in a separate article. See also Heather Piwowar: "A few inventories list the data-sharing policies of funders [25], [26] and journals [1], [27], and some work has been done to correlate policy strength with outcome [2], [28]."

Abstract
Data sharing is central to the scientific endeavour, and in the electronic age, it can be achieved in multiple ways. Statements about data sharing thus increasingly form an intrinsic part of grant applications, funding reports and manuscript submissions. This article discusses a snapshot of the applicable policies of major research funders and of a number of journals, both at the level of individual policies and from the perspectives of available tools or infrastructures, and of science as a whole.

Background

 * Kinds of data ==> refer to separate article
 * Kinds of data sharing mechanisms ==> refer to separate article
 * Historic outline of data sharing ==> refer to separate article?
 * Data sharing requirements by journals ==> refer to separate article?

Kinds of research funders

 * Discipline-specific vs. cross-disciplinary
 * Institutional vs. societal
 * Local/regional vs. national vs. international
 * Small vs. large
 * Public vs. private
 * Grants vs. fellowships/ awards
 * Public peer review vs. non-public peer review
 * Nothing to be reported on the former.

Kinds of research journals

 * Discipline-specific vs. cross-disciplinary
 * Non-data-focussed fields vs. data-intensive fields
 * Institutional vs. societal
 * For-profit journals vs. not-for-profit (non-profit) journals
 * Local/regional vs. national vs. international
 * Small vs. large
 * OA vs. non-OA
 * Data journals vs. traditional journals
 * Public peer review vs. non-public peer review
 * Impact guesstimation peer review vs. technical soundness peer review

Non-journal "outlets" for scientific data

 * CKAN (?)
 * GitHub (?)

Historic outline of data sharing requirements
Some historical reference points:
 * Brahe/ Kepler
 * Anagrams used by da Vinci, Galileo, Newton and others
 * Soviet Russian scientific culture (?)

Kinds of current data sharing recommendations/ requirements
Here, it would be good to always have an example where the practice under consideration is recommended and one where it is mandatory. Won't always be available, but we can try.

Open issues

 * outlined in http://dataconservancy.org/sites/default/files/Data%20Issues%20in%20the%20Life%20Sciences%20White%20Paper.pdf ; data sharing policies in Table 2 (p. 21)

Discussion
Many of the recent data sharing policies established by funders refer to, and adopt the ideas of, the OECD Principles and Guidelines for Access to Research Data from Public Funding, released in 2007. This is not surprising, as “giving guidance to institutions in need of policies” was one of the OECD’s objectives in this initiative. The Principles stress the return on public investments and the enhancement of value, but also recall that “The public science systems of OECD member countries are based on the principle of openness and the free exchange of ideas, information and knowledge”. They also say that “More specifically, improved access to, and sharing of, data:
 * Reinforces open scientific inquiry;
 * Encourages diversity of analysis and opinion;
 * Promotes new research;
 * Makes possible the testing of new or alternative hypotheses and methods of analysis;
 * Supports studies on data collection methods and measurement;
 * Facilitates the education of new researchers;
 * Enables the exploration of topics not envisioned by the initial investigators;
 * Permits the creation of new data sets when data from multiple sources are combined.”
Funders seem particularly sensitive to the last of these aspects. The Alliance of German Science Organisations introduces its Principles for the Handling of Research Data by stating that “Quality-assured research data are a cornerstone of scientific knowledge and, independent of the purpose for which they were originally obtained, can often serve as the basis for further research. This applies especially to the aggregation of data from various sources for combined utilization.” Such aggregation takes advantage of the increasing opportunities offered by new information and communication technologies (ICTs), but the degree of aggregation that is possible depends heavily on the degree of openness. This is one of the reasons, evidently recognized by funders, to consider technology a driver for broader sharing and wider access.
However, where such further utilization beyond the initial purposes is concerned, data sharing policies meet with reluctance from researchers, some of whom are strongly opposed for various reasons: potential misuse of data, legal uncertainty, loss of credit and intellectual capital, and burdensome data management tasks.
 * Kinds of current data sharing recommendations/ requirements
 * The OECD reference

Conclusions
The focus in discussions of data sharing has moved from whether to how. Funders, institutions and journals have a special responsibility in guiding this process, and some of them are indeed taking good steps in this direction, although the reusability of the shared data needs more attention.

International

 * EU
 * ESF
 * HFSP
 * Gates Foundation

National

 * NIH
 * Wellcome
 * CNRS
 * INRIA
 * DFG

Regional

 * Catalonia?

Institutional

 * Howard Hughes
 * Harvard
 * Max Planck
 * Helmholtz
 * Wikimedia Foundation

Resources

 * BioSharing list of data sharing policies
 * For some relevant references, see http://oa.helmholtz.de/index.php?id=300#c1772