Data Management Plan for the Physical Sciences
This is a Data Management Plan (DMP) tailored for the Physical Sciences. In each section you can find questions to help you think about what data you will produce as part of your research and the steps to take to ensure that you keep your data safe and to make it usable for yourself and others in the future. Guidance information is provided to help. The DMP should be a living document that you update as your research evolves; you may not know the answers to all the questions at the beginning, and it is expected that your plans may change over time. The DMP gives a record of how you capture, use and protect your research data throughout the life cycle of your project.
This Data Management Plan is a work in progress, and we invite collaboration from the community on designing a DMP template for the physical sciences that meets its needs. This DMP is based on the chemistry-specific template produced by NFDI4Chem, available on Zenodo. We have attempted to ensure that the template covers the main questions and themes that should be captured in a DMP according to the Digital Curation Centre (DCC), which can be found in their Checklist for a Data Management Plan.
See Data Management Plan Example for Physical Sciences for an example of a completed version of this DMP.
About this Data Management Plan
What is the ID of this data management plan?
Guidance: Enter a meaningful identifier for this data management plan. Your funder or institution may have guidance you should follow for this identifier.
Who owns this data management plan?
Guidance: Enter the name of the owner(s) of this data management plan. For a department or research group, the owner may be the Principal Investigator (PI). For a small or individual research project, the owner may be a doctoral or post-doctoral researcher. Include ORCID IDs.
What updates have been made to this data management plan?
Guidance: When the plan is first completed, add the name of the person who created the plan. When updates are made to this data management plan, add a new row to the table with the names of the contributor(s), the date the update was made, and a short description of the update.
Contributor Name | Date | Short description of update |
---|---|---|
Who has reviewed this data management plan?
Guidance: When this data management plan has been reviewed, add the reviewer’s name and the date of review to the table.
Reviewer Name | Review Date |
---|---|
Who can this data management plan be shared with, and can they reuse the content?
Guidance:
It can be helpful for others to view the content of a data management plan, either to help them understand the research project and its associated data, or to help them complete their own data management plan. Consider whether there is any confidential information in the plan and whether you want others to be able to use the content for their own research.
- Supervisors / PI can view this DMP
- Supervisors / PI can reuse the content in this DMP
- Collaborators can view this DMP
- Collaborators can reuse the content in this DMP
- My research group can view this DMP
- My research group can reuse the content in this DMP
- Other researchers in same institution can view this DMP
- Other researchers in same institution can reuse the content in this DMP
- This DMP can be viewed by anyone
- The content in this DMP can be reused by anyone
- Others (Please specify):
About the Research
What is your research project and what data will be collected?
Guidance: At a high level, describe your research project and the kinds of data you will be collecting. For new data, explain the disciplinary gap that you have identified and how the data you collect will be useful to the discipline. If you are reusing existing data, state that here. For new or existing data specify how the data are relevant to your project, and how you intend to use them. Enter information about funding, for example a grant reference number if applicable.
Who has contributed to this research project?
Guidance:
It is important to keep a record of contributors to ensure they get proper recognition and to help know who to contact with any questions about the data. Information recorded should include contributor names and their affiliations at the time of contribution, their responsibilities, and when they were active in the project. Adding ORCID identifiers ensures that correct attributions can be given to each of the contributors. It can be useful to keep a Contributor Record updated as new contributors join the project.
Complete a Contributor Record for Research in the Physical Sciences for your research.
Which regulations, policies, and guidelines are relevant to your research?
Guidance: There are likely to be many different regulations, guidelines and policies that you must adhere to during your research, including those related to research data such as data protection, ethics, and data retention. Select those that are relevant to your research data from the list below and add any others that apply, including formal standards that you will adopt.
- Data Protection Policy/Data Management policy of your institution
- IP/Copyright policy of your institution
- Open Access Policy at your institution
- Open Access policy of your funding body
- Data sharing policy of your funding body
- Research data management policy of your funding body
- Scholarly society (e.g., RSC) recommendations
- Departmental data management guidelines
- Ethical Approval for collecting or re-using data (see Do I need ethical approval? for examples of situations that require ethical approval).
- Acquisition of project and personal license for animal work
- Alignment with Human Tissue Authority (HTA) requirements for primary cell work
- Health and Safety requirements for sample storage and work being done
- Others (Please specify):
About the Data
Which of the following kinds of research data are you creating or collecting as part of your research?
Guidance: In your research project you will be generating new data, for example, observation data, data captured automatically by an instrument, data generated through analysis of existing data, or data generated through simulations and other computational techniques. It is also important to capture any documentation of your research. Select the kinds of research data that are likely to be produced and need to be managed as part of your research.
- Observational/experimental (raw) data
- Data package collected from facility equipment
- Calibration/instrumental data/sensors
- Processed data
- Analysed data
- Scripts/code/software environments
- Electronic or physical notebooks
- Reported data (e.g., radioactive materials, animal use, restricted materials such as medications)
- Methods or Protocols
- Simulation or analysis using secondary data
- Large data sets (above 100 GB)
- Photographs/videos (e.g. recordings of protocols or results / Micrography)
- Image based materials e.g., diagrams, graphs, maps, decision trees
- Qualitative data (e.g., surveys and interviews)
- Reports, presentations, whitepapers and other outputs
- Others (Please specify):
How are you reusing existing data in your research?
Guidance: There are many situations where data can be reused, for example to validate code, for machine learning purposes, or to reuse previously published methods. It is important to ensure that you have the legal right and ethical approval to reuse the data within your project, and that appropriate attributions are given when you publish your own research. Select which of the following data types you will use as part of your research, and complete the table with information about the specific sources as you use them.
Data type | Where did the data come from? (DOI/Citation) | Are there any restrictions on the use of the data? (License if available) |
---|---|---|
☐ Raw data from a public source (e.g., diffraction data or spectroscopic data hosted in a repository) | | |
☐ Data generated by a collaborator/colleague from a large-scale data collection from a facility | | |
☐ Conditions and parameters for simulations | | |
☐ Reference data | | |
☐ Molecular models (e.g., coordinate files, xyz) | | |
☐ Calibration data | | |
☐ Code/scripts made by others and openly available | | |
☐ Electronic notebooks and other records of your experiments | | |
☐ Image data | | |
☐ Methods, protocols, and workflows from a previously published source | | |
☐ Qualitative data | | |
☐ Other data (Please specify) | | |
How are you organising and naming your research data files?
Guidance: It is helpful to define how you will organise and name your research data files at the beginning of your project so that they are easier to work with, find and understand in the long term. Describe your research data file organisation and naming strategy in the box below.
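One way to make a naming strategy concrete is to encode it in a small script that both generates and checks file names. The sketch below is illustrative only: the pattern (project, sample ID, technique, collection date, version) and the function names `build_name` and `is_valid_name` are assumptions made for this example, not a convention prescribed by this template.

```python
import re
from datetime import date

# Hypothetical convention: <project>_<sample>_<technique>_<YYYYMMDD>_v<NN>.<ext>
NAME_PATTERN = re.compile(
    r"^(?P<project>[a-z0-9-]+)_(?P<sample>[a-z0-9-]+)_(?P<technique>[a-z0-9-]+)"
    r"_(?P<date>\d{8})_v(?P<version>\d{2})\.(?P<ext>[a-z0-9]+)$"
)


def build_name(project, sample, technique, collected, version, ext):
    """Build a file name that follows the hypothetical convention above."""
    # Lower-case each part and replace runs of other characters with a hyphen,
    # so that underscores are reserved for separating the fields.
    cleaned = [
        re.sub(r"[^a-z0-9]+", "-", part.lower()).strip("-")
        for part in (project, sample, technique)
    ]
    return "{}_{}_{}_{}_v{:02d}.{}".format(
        *cleaned, collected.strftime("%Y%m%d"), version, ext.lower().lstrip(".")
    )


def is_valid_name(filename):
    """Return True if the file name matches the convention."""
    return NAME_PATTERN.match(filename) is not None


name = build_name("Perovskite Study", "S-012", "XRD", date(2024, 3, 5), 1, ".raw")
print(name)  # perovskite-study_s-012_xrd_20240305_v01.raw
```

Putting the date in `YYYYMMDD` form and zero-padding the version number means that files sort chronologically in an ordinary directory listing.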
What raw data formats will be produced or used during your research project?
Guidance: Processing of raw data to convert data format can sometimes result in loss of information from the original. Keeping the raw data ensures that there is no data loss and new techniques can be applied to the data later. For this reason, raw data should be retained in the original file format if the file size allows. Select which raw data formats will be produced or used in your research.
- File types associated with diffraction data (e.g., .dif, .raw, .dat, .rfl):
- File types associated with NMR spectroscopic data (e.g., .fid, .procpar):
- File types associated with mass spectrometry data (e.g., .ms, .raw, .baf, .wiff):
- File types associated with spectroscopic data (e.g., .fid, .raw, .ms):
- File types associated with chromatography data (e.g., .ch, .ms, .uv):
- File types associated with modelling/simulation (e.g., .xyz, .mol2, .cif):
- Other formats for images, photos, and videos:
- Self-developed formats:
- Other data formats (Please specify):
How will your research data be processed?
Guidance: Raw experimental data often needs to be processed or transformed to make it more meaningful or easier to analyse. Select the data processing steps that are applicable to your research from the list.
- Data extraction
- Integral transformation (e.g., converting time-domain data to frequency-domain data or analysing vibrational frequencies in spectroscopy)
- Transformation for simplification
- Data coding
- Data cleaning (e.g., normalisation, baseline correction, smoothing, filtering, calibration, scaling)
- Data integration (e.g., combining data from multiple sources or experiments, aggregation of data sets)
- Data enrichment
- Data annotation or mark-up
- Validation and data review
- Transformation into standard file formats
- Visualisation
- Others (Please specify):
Data Documentation
How are you documenting your research data to make them easy to understand?
Guidance: The notebooks that you create to record your research provide some of the information needed to make your research data easy to understand, but you can also make use of the automatic capture of metadata to provide context to your research. For example, electronic laboratory notebooks can often automatically capture metadata about your experiments and data files.
These are some examples of metadata that you should be capturing to document your research:
- Technical: calibration data, experimental conditions, instruments and other hardware used, materials (e.g., batch number, clone, chemical compounds)
- Structural: links to data/filenames
- Administrative: experiment and sample IDs, software and scripts
- Descriptive: personnel involved (experimenter, technician, etc.), links to background information and literature DOIs, experiment dates and times
- Preservation: chosen vocabulary or standards, file formats
- The following named electronic laboratory notebook will be used. Specify which metadata is automatically captured and which metadata will be recorded manually:
- An electronic or paper-based laboratory notebook is used, in which the following metadata are documented:
- No laboratory notebook is used, but the experiment record and the following metadata are captured in (specify where):
How will you maintain a link between the data and the notebooks that document how they were created?
Guidance: It is important to ensure that the conditions and context under which research data were created can be retrieved. The details of the why and how of the experiment are usually documented in a notebook, but the data is often stored in a different location. It is therefore important that metadata is used to define the links between the experiment record and the research data.
- I will upload my research data directly to the following electronic laboratory notebook:
- An electronic laboratory notebook will be used, and I will link it to the data inventory as described:
- A paper lab book will be used, and I will link it to the data inventory as described:
- No laboratory notebook is used, but I will link the experiment record to the data inventory as described:
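Where data files are stored separately from the notebook, one lightweight way to record the link is a small metadata "sidecar" file saved next to each data file. The following is a minimal sketch under stated assumptions: the `.meta.json` suffix, the field names, and the `write_sidecar` helper are all hypothetical and should be adapted to whatever identifiers your notebook or data inventory actually uses.

```python
import json
from pathlib import Path


def write_sidecar(data_file, notebook_entry, sample_id, extra=None):
    """Write <data_file>.meta.json linking a data file to its notebook entry.

    The field names are illustrative assumptions, not a standard schema.
    """
    data_path = Path(data_file)
    record = {
        "data_file": data_path.name,       # structural: which file this describes
        "notebook_entry": notebook_entry,  # link back to the experiment record
        "sample_id": sample_id,            # administrative: sample identifier
    }
    record.update(extra or {})             # e.g. instrument, operator, date
    sidecar = data_path.with_name(data_path.name + ".meta.json")
    sidecar.write_text(json.dumps(record, indent=2, sort_keys=True), encoding="utf-8")
    return sidecar
```

For example, `write_sidecar("run01.raw", "ELN-2024-0042", "S-012", {"instrument": "XRD-1"})` would create `run01.raw.meta.json` alongside the data file; the entry and instrument identifiers here are made up for illustration.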
How will you maintain a link between physical samples and the notebooks that document how they were created?
Guidance: It is important to ensure that the conditions and context under which physical samples were created can be retrieved. Physical samples cannot be stored in a notebook, so it is important that appropriate metadata is recorded to define the links between the experiment record and the physical samples, through use of persistent locator IDs for samples.
- I will use the following naming convention for my samples that will be referenced in my laboratory notebooks:
- I will use the following Laboratory Information Management System (LIMS) to link between my notebook and physical samples:
- I will use the following cheminformatics tools:
- I will use barcodes and the following identification methods:
- I will use the following sample databases:
- Other (please specify):
Which digital methods and software tools do you require to process and analyse your research data?
Guidance: The use of digital methods and tools enables physical scientists to collect, analyse, interpret, and derive new scientific insights from experimental data. To make these insights usable by others, it is important to document the software that has been used to generate both the data and the findings from the research. Select which items have been used in your research, or add any missing items, for example workflow tools or code that you have created for the project.
- Code written in a programming language (e.g., scripts written in Python, R, etc.):
- Software used for research dissemination (e.g., LibreOffice, LaTeX):
- Software used for data analysis (e.g., NMRPipe, TOPAS, etc.):
- Software used for image generation (e.g., ChemDraw, PyMOL):
- Self-developed software:
- Other software:
- Other digital methods:
- Other tools:
Data Quality
Are quality controls in place for your project and if so, how do they operate?
Guidance: Quality controls vary by sub-discipline norms and by institutional requirements. It is important when you publish the research data or findings based upon it that others can trust that the data are reliable and that researchers have adhered to good practices. Select the quality controls in place for your project from the following list.
Select from the following options:
Experiment planning
- Standardisation of procedures and protocols through careful experiment planning and documentation of all steps
- Health and Safety consultation and documentation
- Use and documentation of standardised naming for compounds/techniques/equipment
- Capture of planned data collection in a specific location (data inventory/lab notebook)
Data collection
- Automated data collection (standardised, verified, consistent)
- Calibration of instruments (accuracy/scale)
- Standardised methods and protocols (with clear instructions)
- Multiple measurements, observations or samples
- Expert review
Data verification
- Double checking of data and outlier values
- Correction of errors made during data entry
- Peer review
- Statistical analyses such as frequencies, mean values, dispersion or clustering to detect errors and anomalous values
- Comparison of random data with original data
- Checking for duplicate entries
- Checking for completeness
Digitisation and data entry
- Accompanying notes and documentation on the data
- Detailed labelling of variable and data set names
- Controlled vocabularies, code lists and selection lists to minimise manual data entry
- Validation rules or input masks
- Use of a specially developed database structure to organise data/files
Other (please specify)
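Several of the data verification checks listed above (duplicate entries, completeness, and simple statistics to flag anomalous values) can be automated. The sketch below uses only the Python standard library and assumes records are held as a list of dictionaries. The field names, the helper names, and the z-score threshold of 2.0 are illustrative assumptions, and a flagged value is a prompt for expert review rather than proof of an error.

```python
from collections import Counter
from statistics import mean, stdev


def find_duplicates(records, key):
    """Return the sorted values of `key` that occur in more than one record."""
    counts = Counter(record.get(key) for record in records)
    return sorted(value for value, n in counts.items() if n > 1)


def find_incomplete(records, required):
    """Return (index, missing_fields) for records lacking required values."""
    problems = []
    for index, record in enumerate(records):
        missing = [field for field in required if record.get(field) in (None, "")]
        if missing:
            problems.append((index, missing))
    return problems


def flag_outliers(values, z_threshold=2.0):
    """Return indices of values more than z_threshold sample standard
    deviations from the mean (a simple screen, not a robust test)."""
    if len(values) < 2:
        return []
    centre, spread = mean(values), stdev(values)
    if spread == 0:
        return []
    return [i for i, v in enumerate(values) if abs(v - centre) / spread > z_threshold]
```

Because a single extreme value inflates the standard deviation, this z-score screen becomes insensitive for very small samples; a median-based measure may be a better choice there.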
Data Storage and Security
What do you estimate the storage requirements for your research data to be by the end of your research project?
- Less than 1 GB
- Between 1 GB and 1 TB
- Between 1 TB and 100 TB
- More than 100 TB
- Not yet determined
How will you store and share your research data during the project?
Guidance: This question relates to storage of data that is either frequently accessed (“hot data”) or accessed periodically (“warm data”). Examples of hot data are raw data from recent experiments and live sensor data; examples of warm data are the latest data from simulations or datasets required for trend analysis over months. Select from the following options for how you will store your research data during the project, and state how that data will be shared with any collaborators, with restrictions on access as appropriate:
- EEA-compliant cloud solution (e.g., Google Drive, Dropbox, OneDrive):
- GitHub/GitLab:
- Central solution (e.g., SharePoint, network drives and automatic data mirroring):
- Own local server infrastructure:
- Decentralized solution (e.g., servers in the data centre of your university or research institution):
- Own hardware (e.g., laptop, external hard drive):
- Other (Please specify):
Data back-up plan
Selecting suitable data storage and backup solutions is crucial for managing your digital research outputs both during and after a research project. Frequent backups provide protection against unintentional or deliberate data loss resulting from hardware or software malfunctions, human mistakes, virus infections, or malicious cyberattacks. A good practice strategy for backing up your research data is the 3-2-1 rule, where you keep at least three copies of your data on two different types of storage (e.g., physical hard drive and cloud storage, or USB and DVDs), with at least one of those copies in a geographically different location. It is also important to consider how often you will back up your data and to ensure that the data is actually retrievable from your backups.
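The 3-2-1 rule and the advice to confirm that backups are retrievable can both be expressed as simple checks. The sketch below is a minimal illustration: the `BackupCopy` record and the helper names are assumptions made for this example, and comparing SHA-256 checksums is one common way, not the only way, to confirm that a restored file matches the original.

```python
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class BackupCopy:
    medium: str    # e.g. "external-hdd", "cloud", "network-drive"
    location: str  # e.g. "lab", "home", "remote-data-centre"


def satisfies_3_2_1(copies):
    """At least 3 copies, on at least 2 media, in at least 2 locations."""
    media = {copy.medium for copy in copies}
    locations = {copy.location for copy in copies}
    return len(copies) >= 3 and len(media) >= 2 and len(locations) >= 2


def sha256_of(path, chunk_size=1 << 20):
    """Compute the SHA-256 checksum of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def restored_matches_original(original_path, restored_path):
    """True if a file restored from backup is byte-identical to the original."""
    return sha256_of(original_path) == sha256_of(restored_path)
```

Recording the checksum at the time of backup also lets you detect silent corruption later, even if the original file is no longer available for comparison.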
How will you secure your research data during your project?
Guidance: In addition to ensuring that data is protected against loss due to storage media failure or accidental overwrites, it is important to ensure that data that may be classed as ‘sensitive’ is stored in the correct location and can only be viewed with appropriate authentication. In the table, describe any restrictions imposed on your research data that may require you to regulate access and describe the tools or processes you use to control that access. What are the risks to data security and how will these be managed?
Sensitive data | Restrictions required | How is access managed? |
---|---|---|
☐ Data collected as part of an externally funded project | | |
☐ Personal data collected, therefore subject to GDPR | | |
☐ Other (please specify) | | |
Data Reuse, Sharing, and Publication
How will you manage ethical issues?
Guidance: Ethical issues affect how you store data, who can see/use it and how long it is kept. You should show that you are aware of any issues and have planned accordingly. If you are carrying out research involving human participants, you must also ensure that consent is requested to allow data to be shared and reused.
- There are ethical considerations for storing and sharing my data, and I have gained consent for data preservation and sharing through an ethics review board. Describe how you plan to accommodate any changes to allow sharing of sensitive data.
- There are no ethical concerns for storing and sharing my data
What advice has been sought on legal aspects of your research data, including ownership, use, and copyright law?
Guidance: There are many different legal and ethical aspects to consider before the sharing and publication of your research data, especially where your data contains personal information, is subject to a non-disclosure agreement, or you have used existing data sets. You can find guidance about the legal aspects of scientific research data and copyright at the following links:
- Copyright for researchers
- FAIRmat Guide to legal aspects in Research Data Management
- Introduction to Intellectual Property Rights in Data Management
- I have consulted with my institution on how to manage any legal or copyright issues related to my data
- I plan to consult with my institution on how to manage any legal or copyright issues related to my data
- I am currently unaware of any legal or copyright considerations for this project
- I have consulted with my institution and identified no copyright considerations for this project
Which of your research data may be of most use to other researchers?
Guidance: Primary research data can often be reused by other researchers for a variety of reasons:
- Validation and reproducibility: to verify and reproduce the findings of the original study.
- Environmental and ethical reasons: to minimize environmental impact and reduce the need to use scarce or hazardous materials.
- Use by other disciplines: physical sciences data is very useful for other domains including biomedical research, environmental science, energy sector, nanotechnology, physics, agriculture, manufacturing and industrial design, computational and data science, engineering, and forensics.
Describe which of your data might be reused for the selected purposes and who (roles and disciplines) might be able to reuse it:
Selection | Which data might be reused and who might use it? |
---|---|
☐ My research data could be reused by researchers in the following disciplines: | |