Data management plan
Proposals for projects that would result in the production or collection of scientific data should include a data management plan (DMP) as an attachment to the JeS proforma. The data management plan attachment is mandatory for most STFC schemes (Public Engagement is excluded) and should be no longer than two pages of A4. If a DMP is not considered relevant to a proposal, an attachment explaining why should be uploaded so that the proposal passes validation. The data management plan, together with any costs associated with it, will be considered and assessed through the normal peer review process. The plan should explain how the data will be managed over the lifetime of the project and, where appropriate, preserved for future re-use.
STFC expects that data resulting from the research it funds should be made openly available after any proprietary period. Data management plans should include an outline of how this will be achieved. Any reasons for not eventually sharing data should be explained (e.g. legislative, ethical, commercial, privacy and security issues).
In the case of proposals that are part of a larger collaborative project, STFC recognises that data management may not be within the sole control of the applicant or the applicant’s institute. In such cases, where a suitable project-wide data management plan exists, the proposal may refer to it.
STFC does not stipulate a specific format for data management plans and applicants may structure them in the way most appropriate to the project. However, they should be clear, concise and proportionate – both to the scale and type of the datasets generated – and take into account the value of the data, the likely level of future re-use, the complexity and levels of data, and the cost effectiveness of preservation compared to re-creation. It is recommended that plans follow published recommendations for best practice, and applicants are encouraged to consult the guidance provided by the Digital Curation Centre.
As a general guide, the plan should:
- Specify the types of data the research will generate. Data management plans should describe the types of data that are expected to be produced from the project, including the raw data arising directly from the research, the reduced data derived from it, and published data.
- Specify which data will be preserved and how. Unless there are compelling reasons not to do so, STFC expects data to be managed through an established repository, chosen to maximise the scientific value from aggregation of related data. This may be at the grant holder's institution or elsewhere. Data management plans may refer to the general policies of the chosen repository and only include further details if necessary for the specific project. (If it is proposed not to use an established repository, the data management plan will need to demonstrate that resources and systems will be in place to enable the data to be curated effectively beyond the lifetime of the grant, although STFC recognises that applicants may not have the expertise to describe in detail how data will be curated.)
- Specify the software and metadata implications. The data management plan should specify the software and metadata that will be retained to enable the data to be read and interpreted.
- Specify for how long the data will be preserved. This may depend on the type of data. Where possible, STFC expects the original data, from which other related data can in principle be derived, to be retained for a minimum of 10 years from the end of the project. For data that by their nature cannot be re-measured, efforts should be made to retain them indefinitely.
- Specify and justify which data will have value to others and should be shared. Any data that are shared should be of a sufficiently high quality to be of value to other researchers. In general, published data – data that are displayed or otherwise referred to in a publication – should be made publicly available, but it is for applicants to consider and justify which types of data will, in the context of their project, meaningfully and practically constitute published data. Publicly available means available to anyone, but there may be a requirement for registration to enable tracking of data use and to provide notification of terms and conditions of use where they apply. Other data should be made available wherever it is appropriate and cost-effective to do so, taking into account the cost of curation compared with the cost or feasibility of re-creation, the potential long-term demand for the data and the feasibility of their reuse by others.
- Specify and justify the length of any proprietary period. This might for example refer to the reasonable needs of the research team to have a first opportunity to exploit the results of their research, including any intellectual property arising. Where there are accepted norms within a scientific field or specific archive they should normally be followed. In general, STFC expects that published data should be made publicly available within six months of publication unless justified otherwise.
- Specify how data will be shared. The minimum level of data sharing expected would be that of making the data available in the natural format in which they were created, along with documentation and metadata, according to the standard accepted procedures within the scientific field. Where the data are likely to be in great demand by others it may be appropriate to request resources for a more proactive approach to data sharing, which maximises opportunities for cross linkage with other sectors.
- Specify and justify any resources required to preserve and share the data. Wherever possible, data management should make use of existing skills and capabilities. However, justification should be made for any additional specialist staff (or training for existing staff) needed within the grant to enable the research team to manage, preserve and share data effectively; and for any computational facilities needed to manage, store and share the data generated by the research.
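The two default time limits in the guide above (retention of original data for a minimum of 10 years from the end of the project, and public release of published data within six months of publication) can be turned into concrete dates when drafting a plan. The following is a minimal sketch only: the function names and example dates are hypothetical, and treating "six months" as calendar months, with the day clamped to the end of a shorter month, is an assumption rather than an STFC definition.

```python
import calendar
from datetime import date


def add_months(start: date, months: int) -> date:
    """Return the date `months` calendar months after `start`.

    The day is clamped to the last day of the target month, so
    31 August plus 6 months gives 28 (or 29) February.
    """
    month_index = start.month - 1 + months
    year = start.year + month_index // 12
    month = month_index % 12 + 1
    last_day = calendar.monthrange(year, month)[1]
    return date(year, month, min(start.day, last_day))


def minimum_retention_date(project_end: date, years: int = 10) -> date:
    """Earliest date before which original data should not be discarded."""
    return add_months(project_end, 12 * years)


def latest_release_date(publication: date, months: int = 6) -> date:
    """Latest expected public release date for published data."""
    return add_months(publication, months)


if __name__ == "__main__":
    # Hypothetical dates, chosen only for illustration.
    project_end = date(2027, 3, 31)
    publication = date(2026, 8, 31)
    print("Retain original data until at least:", minimum_retention_date(project_end))
    print("Release published data by:", latest_release_date(publication))
```

These are defaults only. A plan that justifies a different proprietary period, or that covers data which cannot be re-measured and should be retained indefinitely, would not use these calculations as they stand.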