
Research Data Management

When you are doing research, good data management practices and transparency are essential. This toolbox provides practical information and guidelines for both PhD students and researchers when working with research data.

Data definition

Data can be many things and in some cases metadata (data about data) can also be data to somebody else. For practical purposes data can be defined as follows:

“The data, records, files or other evidence, irrespective of their content or form (e.g. in print, digital, physical or other forms), that comprise research observations, findings or outcomes, including primary materials and analysed data.” (Monash University Research Data Management Procedures: HDR candidates).

Research data come in different forms: measurement data, pictures, geographical data, models, chromatograms, surveys. It is also common practice to distinguish between qualitative and quantitative data. Quantitative data can be quantified and verified, and are amenable to statistical manipulation. Qualitative data approximate or characterise but do not measure the attributes, characteristics, properties, etc., of a thing or phenomenon. In short, quantitative data define whereas qualitative data describe.

For practical purposes, when we are talking about (research) data we are specifically talking about digital data. However, in your Data Management Plan, you should also consider how you will store and archive physical data, such as paper questionnaires, samples and models, if any.


Read more: http://www.businessdictionary.com/definition/quantitative-data.html

Data Assets

There are many types of digital data that may be used at different phases of research. Depending on the type of data, it may be saved or used for different purposes. Examples are:

 

| Data stage | Dataset description | Type of data | Format |
|---|---|---|---|
| Raw data | Interviews | Audio files | MP3 |
| | Spectrographic analysis | Text files | CSV |
| Processed data | Transcription of interviews | Word files | DOCX |
| | Data spreadsheet | SPSS files | SAV |
| Analysed data | Regression graphic | Photoshop files | PSD |
| | Data table | Word file | DOCX |
| Other | Poster presentation | PowerPoint | PPS |
| | Project website | HTML | HTM |
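If you prefer to keep such an inventory in machine-readable form, the examples above could be recorded roughly as follows. This is only a minimal sketch: the `DataAsset` class, its field names and the `assets_by_stage` helper are hypothetical names chosen for illustration, and it assumes that rows without a data stage belong to the stage listed above them.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DataAsset:
    """One row of the data asset inventory (field names are illustrative)."""
    stage: str
    description: str
    data_type: str
    file_format: str


# The example rows from the table; rows without a stage are assumed to
# continue the stage of the row above them.
ASSETS = [
    DataAsset("Raw data", "Interviews", "Audio files", "MP3"),
    DataAsset("Raw data", "Spectrographic analysis", "Text files", "CSV"),
    DataAsset("Processed data", "Transcription of interviews", "Word files", "DOCX"),
    DataAsset("Processed data", "Data spreadsheet", "SPSS files", "SAV"),
    DataAsset("Analysed data", "Regression graphic", "Photoshop files", "PSD"),
    DataAsset("Analysed data", "Data table", "Word file", "DOCX"),
    DataAsset("Other", "Poster presentation", "PowerPoint", "PPS"),
    DataAsset("Other", "Project website", "HTML", "HTM"),
]


def assets_by_stage(assets):
    """Group assets by data stage, keeping the order of first appearance."""
    grouped = {}
    for asset in assets:
        grouped.setdefault(asset.stage, []).append(asset)
    return grouped


for stage, items in assets_by_stage(ASSETS).items():
    formats = ", ".join(sorted({a.file_format for a in items}))
    print(f"{stage}: {len(items)} asset(s), formats: {formats}")
```

An inventory like this can later be extended with extra fields per asset, such as data classification or estimated size.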

What is a Data Management Plan?

A Data Management Plan (DMP) serves as a guide for all types of research and data collection. It helps the investigator to plan the collection and organisation of the data, and draws attention to the applicable regulations, laws, licences, ethical guidelines, etc.

Most data management plans have several sections, such as: Project description, Planning, Costs, Method of data collection and/or re-use, Data assets (description and documentation), Storage, Access (terms and conditions for sharing during and after research, which influence e.g. respondent consent forms, usage licences, consortium agreements), Archiving, Ethical & Legal framework, Support.
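As a rough illustration, a draft plan could be checked against this list of sections with a few lines of Python. The sketch below is not part of any official template: the `DMP_SECTIONS` list simply repeats the section names mentioned above, and the `missing_sections` function and the dictionary layout of the draft are assumptions made for this example.

```python
# Section names as listed in the text above; funder templates may differ.
DMP_SECTIONS = [
    "Project description",
    "Planning",
    "Costs",
    "Method of data collection and/or re-use",
    "Data assets",
    "Storage",
    "Access",
    "Archiving",
    "Ethical & Legal framework",
    "Support",
]


def missing_sections(draft):
    """Return the sections that are absent or empty in a draft plan.

    `draft` is assumed to be a dict mapping section name to section text.
    """
    return [name for name in DMP_SECTIONS if not draft.get(name, "").strip()]


# Hypothetical draft with only two sections filled in.
draft = {
    "Project description": "Interview study on commuting habits.",
    "Planning": "Collection in year 1, analysis in year 2.",
    "Costs": "   ",  # whitespace only, so treated as still empty
}
print(missing_sections(draft))
```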

[ read more about national criteria for DMPs (in Dutch) ]

If a research project involves multiple partners from different organisations it may be necessary to draw up a specific separate Consortium Agreement. In a DMP all partners that are involved in the collection, handling etc. of the data should be mentioned.

The DMP is a working document and needs to be updated over the course of the project whenever significant changes arise, such as (but not limited to):

  • new data
  • changes in consortium policies (e.g. new innovation potential, decision to file for a patent)
  • changes in consortium composition and external factors (e.g. new consortium members joining or old members leaving).

The DMP should be updated in time with the periodic evaluation/assessment of the project, and preferably also in between evaluations. If no other periodic reviews are foreseen within the grant agreement, an update needs to be made in time for the funder's final review at the latest. Furthermore, the consortium can define a timetable for review in the DMP itself. In short, the DMP should be updated whenever something meaningful changes or a significant milestone is reached.

DMP Elements

Project description:

A description of the project and the data to be collected, focussing on the methods of data collection, the instruments used, the period and location in which data collection takes place, and the names of the data collectors and other stakeholders involved.

Planning:

Describe the stages of your research project. This helps in staging your data journey in different transformations, for example from data collection and raw data to processed data to analysed data.

Data assets description:

For each stage you will probably have different forms of data, in different formats, for different purposes, with different inputs and stakeholders involved (e.g. respondents, patients, instruments, group members, etc.). Describing your data assets (see above for examples) will help you to frame the other aspects of the data management plan. Deciding on the format and estimating the size of the data is important for deciding what storage you need.
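A size estimate does not need to be precise; a back-of-the-envelope calculation is usually enough to choose a storage solution. The sketch below shows one possible way to do this. All figures in it (file counts, average file sizes and the safety factor) are made-up assumptions for illustration, and `estimate_storage_gb` is a hypothetical helper, not an existing tool.

```python
def estimate_storage_gb(assets, safety_factor=1.5):
    """Estimate the storage need in gigabytes (1 GB = 1000 MB).

    `assets` maps an asset name to (number_of_files, average_size_in_mb).
    `safety_factor` adds headroom for versions, copies and growth; the
    default of 1.5 is an arbitrary illustrative choice.
    """
    total_mb = sum(count * avg_mb for count, avg_mb in assets.values())
    return total_mb * safety_factor / 1000


# Hypothetical project: 40 interviews recorded as audio, then transcribed.
planned_assets = {
    "Interview audio (MP3)": (40, 60.0),
    "Interview transcripts (DOCX)": (40, 0.1),
}
print(f"Estimated storage need: {estimate_storage_gb(planned_assets):.2f} GB")
```

The safety factor is a design choice: raw totals tend to underestimate real needs, because intermediate versions and working copies also take up space.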

Ethical & Legal framework:

For each data asset, describe the data classification: does it involve data concerning individual natural persons, corporate or governmental secrets, animal testing, sensitive data, etc.? Also describe which laws and regulations apply to that data asset. This will help you to select the right storage, archiving and data transfer solutions later on.

Method of data collection and/or re-use:

For each data asset, describe the methods, standards and/or protocols used to collect and analyse the data. Describing the data provenance increases trust in your data, for example when the data need to be inspected (e.g. by reviewers of your paper) or when someone else needs to take over the project.

Storage (during research) and Access:

For each data asset, include a description of the data storage and transfer solution used during the project (e.g. local/cloud; closed/open environment) and of the data backup. Also describe who can access which data asset during your research, who can grant access, how access will be granted, and under what conditions (public domain, attribution, secrecy, citation, co-authorship, etc.). Both the storage solution and the access conditions depend on the data classification.

Archiving (after research) and Access:

For each data asset, include a description of the data archive solution (trusted digital repository) that is used. Also describe who can access which archived/published data asset after your research, who can grant access, how access will be granted, and under what conditions (public domain, attribution, secrecy, citation, co-authorship, etc.). Both the archive solution and the access conditions depend on the data classification.

Costs:

For each data asset describe the material costs and the labour costs involved for storage during and archiving after research.
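A simple way to arrive at these figures is to multiply volume, price and duration for the material costs, and hours and rate for the labour costs. The following sketch illustrates this under stated assumptions: the `asset_cost` function is hypothetical, and the prices, rates and durations in the example are invented placeholders, not actual VU or funder tariffs.

```python
def asset_cost(size_tb, price_per_tb_year, years, labour_hours, hourly_rate):
    """Split the cost of one data asset into material and labour costs.

    Material cost = size (TB) x price per TB per year x number of years.
    Labour cost   = hours spent on data management x hourly rate.
    """
    material = size_tb * price_per_tb_year * years
    labour = labour_hours * hourly_rate
    return {"material": material, "labour": labour, "total": material + labour}


# Placeholder figures only; replace them with the tariffs that apply to you.
during_research = asset_cost(size_tb=2, price_per_tb_year=100, years=4,
                             labour_hours=20, hourly_rate=50)
after_research = asset_cost(size_tb=1, price_per_tb_year=100, years=10,
                            labour_hours=8, hourly_rate=50)
print("Storage during research:", during_research)
print("Archiving after research:", after_research)
```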

Templates for Data Management Plans

There are many forms and formats for Data Management Plans and many funders have their own template. You can find information and support on creating your Data Management Plan here:

  • On VUnet we offer:
    •    Guidance information and examples and references to specific VU support & services
    •    Templates from funders, including but not limited to NWO, ZonMW, or EU-H2020
  • A matrix of funder requirements is available here
  • More templates can be found in the DMP online tool. This is an advanced tool that allows you and others to collaborate when creating and editing a plan. The tool also allows you to invite experts or data stewards to review different sections and add comments. You can register using your VU email address.