Data Management Plan for the Physical Sciences

This is a Data Management Plan (DMP) tailored for the Physical Sciences. Each section contains questions to help you think about what data you will produce as part of your research and the steps to take to keep your data safe and make it usable for yourself and others in the future. Guidance information is provided to help. The DMP should be a living document that you update as your research evolves; you may not know the answers to all the questions at the beginning, and it is expected that your plans may change over time. The DMP gives a record of how you capture, use and protect your research data throughout the life cycle of your project.

This Data Management Plan is a work in progress. We invite collaboration from the community on designing a DMP template for the physical sciences that meets the community's needs. This DMP is based on the chemistry-specific template produced by NFDI4Chem, available on Zenodo. We have attempted to ensure that the template covers the main questions and themes that should be captured in a DMP according to the Digital Curation Centre (DCC), which can be found in their Checklist for a Data Management Plan.

See Data Management Plan Example for Physical Sciences for an example of a completed version of this DMP.

About this Data Management Plan

What is the ID of this data management plan?

Guidance: Enter a meaningful identifier for this data management plan. Your funder or institution may have guidance you should follow for this identifier.


Who owns this data management plan?

Guidance: Enter the name of the owner(s) of this data management plan. For a department or research group, the owner may be the Principal Investigator (PI). For a small or individual research project, the owner may be a doctoral or post-doctoral researcher. Include ORCID IDs.

What updates have been made to this data management plan?

Guidance: When the plan is first completed, add the name of the person who created the plan. When updates are made to this data management plan, add a new row to the table with the names of the contributor(s), the date the update was made, and a short description of the update.

Contributor Name | Date | Short description of update

Who has reviewed this data management plan?

Guidance: When this data management plan has been reviewed, add the reviewer’s name and the date of review to the table.

Reviewer Name | Review Date

Who can this data management plan be shared with, and can they reuse the content?

Guidance:

It can be helpful for others to view the content of a data management plan, either to help them understand the research project and its associated data, or to help them complete their own data management plan. Consider whether there is any confidential information in the plan and whether you want others to be able to use the content for their own research.

  • Supervisors / PI can view this DMP
  • Supervisors / PI can reuse the content in this DMP
  • Collaborators can view this DMP
  • Collaborators can reuse the content in this DMP
  • My research group can view this DMP
  • My research group can reuse the content in this DMP
  • Other researchers in same institution can view this DMP
  • Other researchers in same institution can reuse the content in this DMP
  • This DMP can be viewed by anyone
  • The content in this DMP can be reused by anyone
  • Others (Please specify):

About the Research

What is your research project and what data will be collected?

Guidance: At a high level, describe your research project and the kinds of data you will be collecting. For new data, explain the disciplinary gap that you have identified and how the data you collect will be useful to the discipline. If you are reusing existing data, state that here. For new or existing data, specify how the data are relevant to your project and how you intend to use them. Enter information about funding, for example a grant reference number, if applicable.

Who has contributed to this research project?

Guidance:
It is important to keep a record of contributors to ensure they get proper recognition and to help know who to contact with any questions about the data. Information recorded should include contributor names and their affiliations at the time of contribution, their responsibilities, and when they were active in the project. Adding ORCID identifiers ensures that correct attributions can be given to each of the contributors. It can be useful to keep a Contributor Record updated as new contributors join the project.

Complete a Contributor Record for Research in the Physical Sciences for your research.

Which regulations, policies, and guidelines are relevant to your research?

Guidance: There are likely to be many different regulations, guidelines and policies that you must adhere to during your research, including those related to research data such as data protection, ethics, and data retention. Select those that are relevant to your research data from the list below and add any others, including formal standards that you will adopt.

  • Data Protection Policy/Data Management policy of your institution
  • IP/Copyright policy of your institution
  • Open Access Policy at your institution
  • Open Access policy of your funding body
  • Data sharing policy of your funding body
  • Research data management policy of your funding body
  • Scholarly society (e.g., RSC) recommendations
  • Departmental data management guidelines
  • Ethical Approval for collecting or re-using data (see Do I need ethical approval? for examples of situations that require ethical approval).
  • Acquisition of project and personal license for animal work
  • Alignment with HTA for primary cell work
  • Health and Safety requirements for sample storage and work being done
  • Others (Please specify):

About the Data

Which of the following kinds of research data are you creating or collecting as part of your research?

Guidance: In your research project you will be generating new data, for example, observation data, data captured automatically by an instrument, data generated through analysis of existing data, or data generated through simulations and other computational techniques. It is also important to capture any documentation of your research. Select the kinds of research data that are likely to be produced and need to be managed as part of your research.

  • Observational/experimental (raw) data
  • Data package collected from facility equipment
  • Calibration/instrumental data/sensors
  • Processed data
  • Analysed data
  • Scripts/code/software environments
  • Electronic or physical notebooks
  • Reported data (e.g., radioactive materials, animal use, restricted materials such as medications)
  • Methods or Protocols
  • Simulation or analysis using secondary data
  • Large data sets (above 100 GB)
  • Photographs/videos (e.g. recordings of protocols or results / Micrography)
  • Image based materials e.g., diagrams, graphs, maps, decision trees
  • Qualitative data (e.g., surveys and interviews)
  • Reports, presentations, whitepapers and other outputs
  • Others (Please specify):

How are you reusing existing data in your research?

Guidance: There are many situations where data can be reused, for example to validate code, for machine learning purposes, or to reuse previously published methods. It is important to ensure that you have the legal right and ethical approval to reuse the data within your project, and that appropriate attributions are given when you publish your own research. Select which of the following data types you will use as part of your research and complete the table with information about the specific sources as you use them.

Data type | Where did the data come from? (DOI/Citation) | Are there any restrictions on the use of the data? (License if available)
☐ Raw data from a public source (e.g. diffraction data or spectroscopic data hosted in a repository)
☐ Data generated by a collaborator/colleague from a large-scale data collection from a facility
☐ Conditions and parameters for simulations
☐ Reference data
☐ Molecular models (e.g., Coordinate files, xyz)
☐ Calibration data
☐ Code/scripts made by others and openly available
☐ Electronic notebooks and other records of your experiments
☐ Image data
☐ Methods, protocols, and workflows from a previously published source
☐ Qualitative data
☐ Other data (Please specify)

How are you organising and naming your research data files?

Guidance: It is helpful to define how you will organise and name your research data files at the beginning of your project so that they are easier to work with, find and understand in the long term. Describe your research data file organisation and naming strategy in the box below.
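
As an illustration only, a naming convention can be written down as a short script so that file names are generated the same way every time. The sketch below assumes a hypothetical pattern of date, project, sample ID, technique and version; the field order, the separators and the function name `build_filename` are assumptions made for this example, not a recommended standard.

```python
import re
from datetime import date


def build_filename(project: str, sample_id: str, technique: str,
                   run_date: date, version: int, extension: str) -> str:
    """Build a file name from a fixed set of fields.

    The pattern (date_project_sample_technique_version) is a hypothetical
    convention chosen for illustration; adapt it to your own project.
    """
    def clean(text: str) -> str:
        # Lower-case the text and replace runs of other characters with "-"
        return re.sub(r"[^a-z0-9]+", "-", text.lower()).strip("-")

    parts = [
        run_date.strftime("%Y%m%d"),  # year-month-day sorts chronologically
        clean(project),
        clean(sample_id),
        clean(technique),
        f"v{version:02d}",            # zero-padded version sorts correctly
    ]
    return "_".join(parts) + "." + extension.lstrip(".")


example_name = build_filename("ProjX", "S 001", "NMR", date(2024, 3, 15), 2, ".fid")
print(example_name)
```

Generating names from one function, rather than typing them by hand, makes it easier to keep names consistent across a research group and to parse them later.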

What raw data formats will be produced or used during your research project?

Guidance: Processing of raw data to convert data format can sometimes result in loss of information from the original. Keeping the raw data ensures that there is no data loss and new techniques can be applied to the data later. For this reason, raw data should be retained in the original file format if the file size allows. Select which raw data formats will be produced or used in your research.

  • File types associated with diffraction data (e.g., .dif, .raw, .dat, .rfl):
  • File types associated with NMR spectroscopic data (e.g., .fid, .procpar):
  • File types associated with mass spectrometry data (e.g., .ms, .raw, .baf, .wiff):
  • File types associated with spectroscopic data (e.g., .fid, .raw, .ms):
  • File types associated with chromatography data (e.g., .ch, .ms, .uv):
  • File types associated with modelling/simulation (e.g., .xyz, .mol2, .cif):
  • Other formats for images, photos, and videos:
  • Self-developed formats:
  • Other data formats (Please specify):

How will your research data be processed?

Guidance: Raw experimental data often needs to be processed or transformed to make it more meaningful or easier to analyse. Select the data processing steps that are applicable to your research from the list.

  • Data extraction
  • Integral transformation (e.g., converting time-domain data to frequency-domain data or analysing vibrational frequencies in spectroscopy)
  • Transformation for simplification
  • Data coding
  • Data cleaning (e.g., normalisation, baseline correction, smoothing, filtering, calibration, scaling)
  • Data integration (e.g., combining data from multiple sources or experiments, aggregation of data sets)
  • Data enrichment
  • Data annotation or mark-up
  • Validation and data review
  • Transformation into standard file formats
  • Visualisation
  • Others (Please specify):
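
To illustrate what documenting a processing step can look like, the sketch below applies two of the data cleaning operations listed above (baseline correction and normalisation) to a short list of made-up intensity values, and records each step in a simple log. The function names, the constant-baseline assumption and the log structure are all hypothetical choices for this example.

```python
def subtract_baseline(values, baseline):
    """Return a new list with a constant baseline removed (raw data untouched)."""
    return [v - baseline for v in values]


def min_max_normalise(values):
    """Scale values linearly so that the minimum is 0 and the maximum is 1."""
    lowest, highest = min(values), max(values)
    if highest == lowest:
        raise ValueError("Cannot normalise: all values are identical")
    return [(v - lowest) / (highest - lowest) for v in values]


# Made-up raw intensities, for illustration only
raw = [12.0, 14.0, 22.0, 16.0, 12.0]

# Assumption: the baseline is constant and equal to the smallest raw value
baseline = min(raw)
corrected = subtract_baseline(raw, baseline)
normalised = min_max_normalise(corrected)

# Keep a record of what was done so the processing can be repeated later
processing_log = [
    {"step": "baseline subtraction", "baseline": baseline},
    {"step": "min-max normalisation"},
]
print(normalised)
print(processing_log)
```

Each function returns a new list, so the raw values are preserved alongside the processed ones.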

Data Documentation

How are you documenting your research data to make them easy to understand?

Guidance: The notebooks that you create to record your research provide some of the information needed to make your research data easy to understand, but you can also make use of the automatic capture of metadata to provide context to your research. For example, electronic laboratory notebooks can often automatically capture metadata about your experiments and data files.

These are some examples of metadata that you should be capturing to document your research:

  • Technical: Calibration data, experimental conditions, Instruments and other hardware used, materials (e.g. batch number/ clone/ chemical compounds)
  • Structural: Links to data/filenames
  • Administrative metadata: Experiment and sample IDs, software and scripts
  • Descriptive: Personnel involved (experimenter, technician etc), links to background information and literature DOIs, experiment dates and times
  • Preservation metadata: Chosen vocabulary or standards, file formats
  • The following named electronic laboratory notebook will be used. Specify which metadata is automatically captured and which metadata will be recorded manually:


  • An electronic or paper-based laboratory notebook is used, in which the following metadata are documented:


  • No laboratory notebook is used, but the experiment record and the following metadata are captured in (specify where):
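
Where metadata is recorded manually, one possible approach is a small machine-readable "sidecar" file stored next to each data file. The sketch below groups hypothetical values under the five metadata categories listed above and checks that none is left empty; every field name and value shown is an invented placeholder, not a required schema.

```python
import json

# All values below are invented placeholders for illustration
metadata = {
    "technical": {
        "instrument": "Example diffractometer",
        "temperature_K": 298,
        "material_batch": "BATCH-0001",
    },
    "structural": {"data_files": ["20240315_projx_s-001_xrd_v01.raw"]},
    "administrative": {
        "experiment_id": "EXP-0001",
        "sample_id": "S-001",
        "software": "example-script.py",
    },
    "descriptive": {
        "experimenter": "A. Researcher",
        "date": "2024-03-15",
        "background_doi": None,  # fill in when known
    },
    "preservation": {"file_format": ".raw", "vocabulary": "to be decided"},
}

REQUIRED_CATEGORIES = (
    "technical", "structural", "administrative", "descriptive", "preservation",
)


def missing_categories(record: dict) -> list:
    """List the required categories that are absent or empty in a record."""
    return [name for name in REQUIRED_CATEGORIES if not record.get(name)]


# Serialise to JSON text that could be saved alongside the data file
sidecar_text = json.dumps(metadata, indent=2, sort_keys=True)
print(sidecar_text)
print("Missing categories:", missing_categories(metadata))
```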

Guidance: It is important to ensure that the conditions and context under which research data were created can be retrieved. The details of the why and how of the experiment are usually documented in a notebook, but the data is often stored in a different location. It is therefore important that metadata is used to define the links between the experiment record and the research data.

  • I will upload my research data directly to the following electronic laboratory notebook:


  • An electronic laboratory notebook will be used, and I will link it to the data inventory as described:


  • A paper lab book will be used, and I will link it to the data inventory as described:


  • No laboratory notebook is used, but I will link the experiment record to the data inventory as described:

Guidance: It is important to ensure that the conditions and context under which physical samples were created can be retrieved. Physical samples cannot be stored in a notebook, so it is important that appropriate metadata is recorded to define the links between the experiment record and the physical samples, through use of persistent locator IDs for samples.

  • I will use the following naming convention for my samples that will be referenced in my laboratory notebooks:


  • I will use the following Laboratory Information Management System (LIMS) to link between my notebook and physical samples:


  • I will use the following cheminformatics tools:


  • I will use barcodes and the following identification methods:


  • I will use the following sample databases:


  • Other (please specify):
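
As a sketch of how a sample naming convention can link physical samples to notebook entries, the example below generates sample IDs from a hypothetical pattern (researcher initials, creation date, running number) and stores each one with its notebook reference and storage location. The ID pattern, the `SampleRecord` fields and the in-memory registry are assumptions for illustration; a LIMS or sample database would normally take this role.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class SampleRecord:
    sample_id: str         # identifier written on the vial or barcode label
    notebook_entry: str    # page or entry reference in the laboratory notebook
    storage_location: str  # where the physical sample is kept


def make_sample_id(initials: str, created: date, sequence: int) -> str:
    """Hypothetical pattern: INITIALS-YYYYMMDD-NNN."""
    return f"{initials.upper()}-{created:%Y%m%d}-{sequence:03d}"


def register_sample(registry: dict, record: SampleRecord) -> None:
    """Add a record, refusing to reuse an ID that already exists."""
    if record.sample_id in registry:
        raise ValueError(f"Sample ID already in use: {record.sample_id}")
    registry[record.sample_id] = record


registry = {}
sample_id = make_sample_id("ab", date(2024, 3, 15), 7)
register_sample(
    registry,
    SampleRecord(sample_id, "Notebook 3, p. 42", "Freezer 2, shelf B"),
)
print(sample_id, "->", registry[sample_id].notebook_entry)
```

Rejecting duplicate IDs keeps each identifier pointing at exactly one physical sample.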

Which digital methods and software tools do I require to process and analyse my research data?

Guidance: The use of digital methods and tools enables physical scientists to collect, analyse, interpret, and derive new scientific insights from experimental data. To make these insights usable by others, it is important to document the software that has been used to generate both the data and the findings from the research. Select which items have been used in your research, or add any missing items, for example workflow tools or code that you have created for the project.

  • Code written in a programming language (e.g., scripts written in Python, R, etc.):


  • Software used for research dissemination (e.g., LibreOffice, LaTeX):


  • Software used for data analysis (e.g., NMRPipe, TOPAS, etc.):


  • Software used for image generation (e.g., ChemDraw, PyMOL):


  • Self-developed software:


  • Other software:


  • Other digital methods:


  • Other tools:
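
For code written in Python, part of this documentation can be captured automatically. The sketch below, which uses only the Python standard library, records the interpreter version, the operating system and the installed versions of a list of packages; the function name `describe_environment` and the package names passed to it are hypothetical examples.

```python
import json
import platform
from importlib import metadata


def describe_environment(packages):
    """Record the Python version, platform and versions of named packages."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            # Make gaps visible rather than silently skipping the package
            versions[name] = "not installed"
    return {
        "python": platform.python_version(),
        "platform": platform.platform(),
        "packages": versions,
    }


# The package names here are examples; list the ones your analysis imports
environment = describe_environment(["numpy", "scipy"])
print(json.dumps(environment, indent=2))
```

Saving this output alongside the analysed data gives later users a record of the software versions that produced it.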


Data Quality

Are quality controls in place for your project and if so, how do they operate?

Guidance: Quality controls vary by sub-discipline norms and by institutional requirements. It is important when you publish the research data or findings based upon it that others can trust that the data are reliable and that researchers have adhered to good practices. Select the quality controls in place for your project from the following list.

Select from the following options:

Experiment planning

  • Standardisation of procedures and protocols through careful experiment planning and documentation of all steps
  • Health and Safety consultation and documentation
  • Use and documentation of standardised naming for compounds/techniques/equipment
  • Capture of planned data collection in a specific location (data inventory/lab notebook)

Data collection

  • Automated data collection (standardised, verified, consistent)
  • Calibration of instruments (accuracy/scale)
  • Standardised methods and protocols (with clear instructions)
  • Multiple measurements, observations or samples
  • Expert review

Data verification

  • Double checking of data and outlier values
  • Correction of errors made during data entry
  • Peer review
  • Statistical analyses such as frequencies, mean values, dispersion or clustering to detect errors and anomalous values
  • Comparison of random data with original data
  • Checking for duplicate entries
  • Checking for completeness
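
Some of the verification checks above can be automated. The following sketch shows one possible form of three of them (duplicate entries, completeness, and a simple statistical screen for anomalous values) on made-up records. The field names, the function names and the threshold of two standard deviations are assumptions for illustration, and flagged values still need review by a person.

```python
from statistics import mean, stdev


def find_duplicates(rows, key):
    """Return values of `key` that occur more than once."""
    seen, duplicates = set(), []
    for row in rows:
        value = row[key]
        if value in seen:
            duplicates.append(value)
        seen.add(value)
    return duplicates


def find_incomplete(rows, required_fields):
    """Return rows in which any required field is missing or empty."""
    return [
        row for row in rows
        if any(row.get(field) in (None, "") for field in required_fields)
    ]


def find_outliers(values, threshold=2.0):
    """Flag values more than `threshold` standard deviations from the mean."""
    centre, spread = mean(values), stdev(values)
    if spread == 0:
        return []
    return [v for v in values if abs(v - centre) / spread > threshold]


# Made-up records for illustration
rows = [
    {"sample_id": "S1", "value": 1.0},
    {"sample_id": "S2", "value": None},
    {"sample_id": "S1", "value": 3.0},
]
measurements = [10.1, 9.9, 10.0, 10.2, 9.8, 10.0, 10.1, 9.9, 10.0, 25.0]

print(find_duplicates(rows, "sample_id"))
print(find_incomplete(rows, ["sample_id", "value"]))
print(find_outliers(measurements))
```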

Digitisation and data entry

  • Accompanying notes and documentation on the data
  • Detailed labelling of variable and data set names
  • Controlled vocabularies, code lists and selection lists to minimise manual data entry
  • Validation rules or input masks
  • Use of a specially developed database structure to organise data/files

Other (please specify)

Data Storage and Security

What do you estimate the storage requirements for your research data to be by the end of your research project?

  • Less than 1 GB
  • Between 1 GB and 1 TB
  • Between 1 TB and 100 TB
  • More than 100 TB
  • Not yet determined
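
If you are unsure which band applies, a rough estimate can be made by multiplying the number of files you expect to produce by their typical size and the length of the project. The sketch below does this arithmetic with invented figures (50 files per week, 200 MB each, over 156 weeks) and uses decimal units (1 GB = 1000 MB); the function names and all input numbers are assumptions for illustration.

```python
def estimate_storage_gb(files_per_week, mean_file_size_mb, weeks, copies=1):
    """Estimate total storage in GB, using decimal units (1 GB = 1000 MB)."""
    total_mb = files_per_week * mean_file_size_mb * weeks * copies
    return total_mb / 1000


def storage_band(total_gb):
    """Map an estimate in GB onto the bands used in the question above."""
    if total_gb < 1:
        return "Less than 1 GB"
    if total_gb <= 1_000:
        return "Between 1 GB and 1 TB"
    if total_gb <= 100_000:
        return "Between 1 TB and 100 TB"
    return "More than 100 TB"


# Invented figures: 50 files per week, 200 MB each, over a 3-year project
estimate = estimate_storage_gb(files_per_week=50, mean_file_size_mb=200, weeks=156)
print(estimate, "GB:", storage_band(estimate))
```

The optional `copies` argument lets you include backup copies in the estimate if your storage provider counts them against your quota.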

How will you store and share your research data during the project?

Guidance: This question relates to storage of data that is either frequently accessed (“hot data”) or accessed periodically (“warm data”). Examples of hot data are raw data from recent experiments and live sensor data; examples of warm data are the latest data from simulations or datasets required for trend analysis over months. Select from the following options for how you will store your research data during the project, and state how that data will be shared with any collaborators, with restrictions on access as appropriate:

  • EEA-compliant cloud solution (e.g., Google Drive, Dropbox, OneDrive):


  • GitHub/GitLab:


  • Central solution (e.g., SharePoint, network drives and automatic data mirroring):


  • Own local server infrastructure:


  • Decentralised solution (e.g., servers in the data centre of your university or research institution):


  • Own hardware (e.g., laptop, external hard drive):


  • Other (Please specify):

Data back-up plan

Selecting suitable data storage and backup solutions is crucial for managing your digital research outputs both during and after a research project. Frequent backups protect against unintentional or deliberate data loss resulting from hardware or software malfunctions, human mistakes, virus infections, or malicious cyberattacks. A good-practice strategy for backing up your research data is the 3-2-1 rule: keep at least three copies of your data on at least two different types of storage (e.g., physical hard drive and cloud storage, or USB and DVDs), with at least one of those copies in a geographically different location. It is also important to consider how often you will back up your data and to check that the data can actually be restored from your backups.
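
The two checks described above can be sketched in a few lines of Python. The first function tests a list of planned copies against the 3-2-1 rule; the second compares SHA-256 checksums to confirm that a backup copy is identical to the original file. The dictionary keys (`medium`, `offsite`), the function names and the example files are hypothetical, and the files are created in a temporary folder purely for demonstration.

```python
import hashlib
import tempfile
from pathlib import Path


def satisfies_3_2_1(copies) -> bool:
    """At least 3 copies, on at least 2 kinds of storage, with 1 held off-site."""
    media = {copy["medium"] for copy in copies}
    offsite = any(copy["offsite"] for copy in copies)
    return len(copies) >= 3 and len(media) >= 2 and offsite


def sha256_of(path: Path) -> str:
    """Compute a file's SHA-256 checksum, reading it in 1 MiB chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(1024 * 1024), b""):
            digest.update(chunk)
    return digest.hexdigest()


def backup_matches(original: Path, backup: Path) -> bool:
    """True when the backup has the same content as the original."""
    return sha256_of(original) == sha256_of(backup)


# Hypothetical backup plan
plan = [
    {"medium": "laptop disk", "offsite": False},
    {"medium": "external hard drive", "offsite": False},
    {"medium": "cloud storage", "offsite": True},
]
plan_ok = satisfies_3_2_1(plan)

# Demonstration with throwaway files in a temporary folder
with tempfile.TemporaryDirectory() as folder:
    original = Path(folder) / "run001.raw"
    backup = Path(folder) / "run001_backup.raw"
    original.write_bytes(b"example raw data")
    backup.write_bytes(b"example raw data")
    backup_ok = backup_matches(original, backup)

print("Plan meets 3-2-1:", plan_ok)
print("Backup matches original:", backup_ok)
```

A checksum comparison only shows that a copy is intact at the time of the check, so it is worth repeating periodically and after any restore.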

How will you secure your research data during your project?

Guidance: In addition to ensuring that data is protected against loss due to storage media failure or accidental overwrites, it is important to ensure that data that may be classed as ‘sensitive’ is stored in the correct location and can only be viewed with appropriate authentication. In the table, describe any restrictions imposed on your research data that may require you to regulate access and describe the tools or processes you use to control that access. What are the risks to data security and how will these be managed?

Sensitive data | Restrictions required | How is access managed?
☐ Data collected as part of an externally funded project
☐ Personal data collected, therefore subject to GDPR
☐ Other (please specify)

Data Reuse, Sharing, and Publication

How will you manage ethical issues?

Guidance: Ethical issues affect how you store data, who can see/use it and how long it is kept. You should show that you are aware of any issues and have planned accordingly. If you are carrying out research involving human participants, you must also ensure that consent is requested to allow data to be shared and reused.

  • There are ethical considerations for storing and sharing my data, and I have gained consent for data preservation and sharing through an ethics review board. Describe how you plan to accommodate any changes to allow sharing of sensitive data.


  • There are no ethical concerns for storing and sharing my data

Guidance: There are many different legal and ethical aspects to consider before the sharing and publication of your research data, especially where your data contains personal information, is subject to a non-disclosure agreement, or you have used existing data sets. You can find guidance about the legal aspects of scientific research data and copyright at the following links:

  • I have consulted with my institution on how to manage any legal or copyright issues related to my data
  • I plan to consult with my institution on how to manage any legal or copyright issues related to my data
  • I am currently unaware of any legal or copyright considerations for this project
  • I have consulted with my institution and identified no copyright considerations for this project

Which of your research data may be of most use to other researchers?

Guidance: Primary research data can often be reused by other researchers for a variety of reasons:

  • Validation and reproducibility: to verify and reproduce the findings of the original study.
  • Environmental and ethical reasons: to minimize environmental impact and reduce the need to use scarce or hazardous materials.
  • Use by other disciplines: physical sciences data is very useful for other domains including biomedical research, environmental science, energy sector, nanotechnology, physics, agriculture, manufacturing and industrial design, computational and data science, engineering, and forensics.

Describe which of your data might be reused for the selected purposes and who (roles and disciplines) might be able to reuse it:

Selection | Which data might be reused and who might use it?
☐ My research data could be reused by researchers in the following disciplines:
☐ My research data could be reused for educational purposes:
☐ My research data could be reused for machine learning purposes:
☐ My research data could be reused by supplementing others’ datasets:
☐ My research data could be reused by others implementing my methodology:
☐ My research data could be reused by others using my code base:
☐ My research is socially relevant, and the data can be reused by the public:
☐ My research data could be reused for another purpose:
☐ I cannot envision a reuse scenario for my research data:

In what formats will you publish your analysed data (if applicable)?

Guidance: Will you publish or share analysed data in the form of tabular information or images, for example graphs or spectra? The formats that you choose may depend on the characterisation and analysis methods you use and may vary by experiment or sample. If the data is shared as a composite file, please state the file type. Leave blank if you cannot share your data.

Are there any restrictions on publishing your data?

Guidance: Many funders and publishers expect research data to be published in a data repository at least at the point you publish your research in a text publication (e.g., a journal article). However, publishing data may not always be possible if it is sensitive. In some cases, an embargo or retention period may be enforced to protect priority of a discovery. Select which of the following best describes how you will publish your data, or any restrictions your data might be subject to and why:

  • The project is open, and data will be published in a named repository and updated throughout the project.
  • The data will be published in a named repository linked to a text publication.
  • The data will be published in a named repository independently then linked to a text publication if appropriate.
  • The data will be embargoed for public use and can only be published after x years (Please specify) ______.
  • The data cannot be published because it contains personal information and cannot be anonymised.
  • The data cannot be published due to security concerns.
  • The data cannot be published for ethical reasons.
  • The data cannot be published for sensitivity reasons.
  • The data cannot be published due to the following legal restrictions:


  • Other (Please specify):


Preservation and long-term data accessibility

Which criteria will you use to decide whether to archive your data for the long-term or to dispose of it?

Guidance: After the end of the research project, decisions need to be made about which research data to keep and how to archive them, and which data can be disposed of. There may be legal requirements and funder or institutional policies to consider. Keeping raw data is likely to be more important than keeping processed data or intermediate forms. Additional guidance on selecting which data to keep is provided by the Digital Curation Centre.

Select from the options below to indicate which criteria you will consider when deciding whether to archive or dispose of your research data.

  • Data relevance
  • Legal and ethical requirements
  • Data retention policies
  • Data sensitivity
  • Novelty of the data/research
  • Size of the data and storage costs
  • Restricted access
  • Data integrity and completeness
  • Reuse potential
  • Format of the data (legacy or proprietary formats) and software availability
  • Organisational / personnel change
  • Community requirement for the data
  • Data removed by data cleaning
  • Intermediate data
  • Environmental impact
  • Data security

How will you archive data that has not already been published?

Guidance: In the UK, the recommended duration for storing research data depends on several factors, including the type of data, legal requirements, and funding agency policies. Where appropriate, data should remain available and retrievable, and where the data is archived can determine who has responsibility for curating and maintaining the research data. Where data cannot be published, it is appropriate to choose an archive that requires authorisation to access the data.

Provide information about where your chosen research data will be archived, for example: institutional repositories; national and international databases (e.g., NIST Materials Data Repository, Materials Project, NIMS Materials Database, UK Data Service); data repositories for specific fields (e.g., Materials Data Facility (MDF), Joint Automated Repository for Various Integrated Simulations (JARVIS), and Novel Materials Discovery (NOMAD)); online code repositories (e.g., GitHub and GitLab); cloud storage solutions (e.g., Google Drive, Dropbox, and Amazon S3); generic repositories (e.g., Zenodo, Figshare); and publisher repositories.

Also describe what resources are required to ensure that the data is prepared for archive and describe what actions will be required to maintain the data, so they are still accessible and usable once archived.

Who is responsible for the long-term curation of the research data?

Guidance:

Who is responsible for ensuring that the data is available and retrievable in the long term? For each person, provide: title; first name and last name; ORCID (if available); professional position; institute; contact information; and role in the project.

Name(s) | ORCID | Institute, contact details, and professional position | Role in the Project

How will you ensure your physical samples are accessible for the long-term and that the linked records reflect that storage?

Guidance: Linking data and physical samples is important for reproducibility and data quality, both during the project and after its end. Sample database or inventory systems are helpful in managing and tracking samples. Describe both how you will store the physical samples you want to keep, and how you will maintain the link between the physical sample and the research data that describes how it was collected or created. What are the costs and risks associated with storing the samples?

In chemistry, long-term storage of physical samples is often done within the research laboratory using refrigerators and freezers. Alternatively, there are several facilities in the UK that handle the long-term storage of samples. These are largely for biological samples, but some also include facilities for storing chemical samples.

Resources and costs

What resources and costs are needed to meet the requirements of this data management plan?

Guidance: What additional resources are required to support this data management plan? This can include technical or IT resources as well as expertise, which may be provided by experts such as data stewards, data librarians, IT experts, or other research support staff, and long-term access to data in electronic laboratory notebooks (ELNs). Provide information on how the areas below will be managed and any costs associated with this.

  • Local infrastructure (e.g., Space on an institutional server, institutional repository):


  • Online infrastructure (e.g., Space in cloud storage, access to an online repository):


  • Additional infrastructure resources required:


  • Hardware (e.g., Laptops, hard drives, tapes):


  • Licensing support (e.g., Support with embargoed data, payment for proprietary software):


  • Long-term support for curation (e.g., Ensuring long-term accessibility of archived data, long-term access to data associated with ELNs):


  • Training of research staff associated with the project:


  • IT Support services:


  • Support from other research support staff:


  • Other (Please specify):

About this page

If you would like to contribute content to the PSDI Knowledge Base or have feedback you would like to give on this guidance, please contact us.