Basics of research data management

Research data management refers to the process of handling and organising research data throughout the lifecycle of a research project.

What is research data?

Research data is any data collected or used for research. It can take many forms, from raw data collected in the field to processed data used for analysis. Examples include:

Responses to a questionnaire completed as part of a survey
Transcripts or recordings of interviews
Biological samples
Measurements or tests carried out by research or medical professionals
Cleaned or derived datasets from an archive

Longitudinal population studies collect large amounts of research data of many different types for each wave (or sweep) of a survey. This creates very large datasets across multiple data files at each timepoint.

Why is data management important?

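Effective research data management benefits researchers, the wider research community, funding bodies, and those taking part in research. These benefits include:

Increased efficiency
Increased usability
Increased accuracy
Improved data sharing and collaboration
Increased reproducibility
Reduced costs for other researchers
Legal and ethical compliance
Compliance with funder requirements and data principles
Compliance with licensing and copyright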

Data management for longitudinal population studies and secondary data

Good data management is crucial for all types of data, but particularly when dealing with complex and large datasets, such as longitudinal population studies (LPS).

When using data from these types of studies for your research, some aspects of research data management are particularly important:

Consistent data formats
Sensible file structures
Appropriate security measures

As a researcher using data from LPS, you are not typically involved in the data collection, preservation, and sharing stages. However, research data management principles are still important.

Good research data management helps when processing the data and preparing it for analysis, when carrying out the analysis, and when sharing the outputs of the research, such as the analysis syntax and results.
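As a rough illustration of keeping supplied data, processed data, analysis syntax, and outputs apart, the sketch below creates one possible folder layout for a secondary analysis project. The folder names and the `create_project_structure` helper are assumptions made for this example, not a layout prescribed by this guidance; adapt them to your own institution's or study's conventions.

```python
from __future__ import annotations

from pathlib import Path

# Hypothetical folder names, chosen only for illustration.
PROJECT_FOLDERS = [
    "data/raw",        # data exactly as supplied; treated as read-only
    "data/processed",  # cleaned or derived files produced by your syntax
    "syntax",          # scripts used to process and analyse the data
    "outputs",         # tables, figures, and results intended for sharing
    "docs",            # notes, codebooks, and licence information
]


def create_project_structure(root: Path) -> list[Path]:
    """Create the project folders under `root` and return their paths."""
    created = []
    for name in PROJECT_FOLDERS:
        folder = root / name
        # parents=True builds intermediate folders; exist_ok=True makes re-runs safe.
        folder.mkdir(parents=True, exist_ok=True)
        created.append(folder)
    return created
```

Calling `create_project_structure(Path("my_project"))` would create the five folders under `my_project`. Keeping the supplied files in their own folder and writing derived files elsewhere makes it easier to rerun the processing from scratch.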

More information on data management in practice is available in the Training Hub:

Analysing your data

Preserving & sharing your data

Activities throughout the Research Data Lifecycle

Research data management involves many different tasks and takes place throughout all of the stages of a research project, from planning to analysing data.

The Research Data Lifecycle framework describes the stages that research data goes through during a research project. Data management is a key part of all these stages.

Even if you did not collect or produce the data that you are using for your research project, data management is still an important part of your project. The ‘Data management in practice’ section provides further guidance on each step below:

Planning
Organising and documenting
Storing
Preserving
Sharing
Analysis

Understanding metadata

Metadata is a fundamental part of documenting, sharing, and using survey data. See the CLOSER Learning Hub module Understanding Metadata for a full introduction to metadata, including why it’s important and how to use it to identify relevant variables.
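To give a sense of how variable-level metadata can help identify relevant variables, here is a minimal sketch that searches variable labels for a keyword. The `VariableMetadata` record, the `find_variables` helper, and the catalogue entries are all invented for this example; real studies publish metadata in their own formats and search tools.

```python
from dataclasses import dataclass


@dataclass
class VariableMetadata:
    name: str   # variable name as it appears in the data file
    label: str  # human-readable description of the variable
    wave: int   # wave (sweep) in which the variable was collected


# Invented example records, not taken from any real study.
CATALOGUE = [
    VariableMetadata("w1_smoke", "Whether respondent currently smokes", 1),
    VariableMetadata("w1_height", "Measured height in centimetres", 1),
    VariableMetadata("w2_smoke", "Whether respondent currently smokes", 2),
    VariableMetadata("w2_income", "Total household income", 2),
]


def find_variables(catalogue, keyword):
    """Return variables whose label contains `keyword` (case-insensitive)."""
    keyword = keyword.lower()
    return [v for v in catalogue if keyword in v.label.lower()]


matches = find_variables(CATALOGUE, "smoke")
print([v.name for v in matches])  # ['w1_smoke', 'w2_smoke']
```

Searching labels rather than variable names matters because names often differ between waves while the descriptive label stays recognisable.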

Why good data management, open science, and working reproducibly are beneficial:

Article

Five selfish reasons to work reproducibly

Five reasons why working reproducibly pays off in the long run and is in the self-interest of every ambitious, career-oriented scientist, by Florian Markowetz (2015).

Article

PLOS BIOLOGY: Open science challenges, benefits and tips in early career and beyond

There are great benefits but also significant challenges in the movement towards open science. By Christopher Allen and David M. A. Mehler (2019).

Article

eLife: Point of View: How open science helps researchers succeed

Literature review demonstrating benefits of open science, by Erin McKiernan et al. (2016).

Video

UK Data Service: Research data management

Well organised, well documented, preserved and shared data are invaluable to advance scientific inquiry and to increase opportunities for learning and innovation.

More training resources on research data management

PRUK UKDS: Introductory Research Data Management Course

PRUK UKDS: Introduction to Synthetic Data for Longitudinal Data Managers Workshop

How big of a problem is analytic error in secondary analyses of survey data?
