When deciding where and how to store research data, there are a number of important things to consider. These include:
- How secure do the data need to be?
- Can the data be deposited without access restrictions, or must access to them be restricted and secured?
Data backup strategies
- Are there automatic backup strategies to protect against data loss?
- Where could you store a backup of the raw data files to protect the files from being overwritten?
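As one illustration of the backup questions above, the following is a minimal sketch of copying a raw data file to a separate backup folder, verifying the copy with a checksum, and marking it read-only so it is not overwritten by accident. The function names `sha256_of` and `back_up_raw_file` are hypothetical, as is the choice of SHA-256 and of refusing to overwrite an existing copy. A real backup strategy would normally also involve a second physical location and an automated schedule.

```python
import hashlib
import shutil
import stat
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(1024 * 1024), b""):
            digest.update(chunk)
    return digest.hexdigest()


def back_up_raw_file(source: Path, backup_dir: Path) -> Path:
    """Copy one raw data file into backup_dir, verify it, and make it read-only.

    Hypothetical helper for illustration only.
    """
    backup_dir.mkdir(parents=True, exist_ok=True)
    target = backup_dir / source.name
    if target.exists():
        # Refuse to overwrite an existing backup copy.
        raise FileExistsError(f"Backup already exists: {target}")
    shutil.copy2(source, target)
    if sha256_of(source) != sha256_of(target):
        raise OSError(f"Checksum mismatch after copying {source}")
    # Remove write permission so the backup is not edited by accident.
    target.chmod(stat.S_IRUSR | stat.S_IRGRP | stat.S_IROTH)
    return target
```

For example, `back_up_raw_file(Path("data/raw/survey.csv"), Path("/mnt/backup/raw"))` would copy the file and raise an error rather than replace an earlier backup; both paths here are placeholders.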
Encryption, access controls, and protection for sensitive data
- Can sensitive data be accessed by other users?
- What type of encryption and access controls will be applied to more secure data?
- How will other users gain access, e.g., following an application, or only through a trusted research environment?
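To make the access-control questions above concrete, here is a minimal sketch of checking and restricting file permissions on a POSIX system (Linux or macOS). The function names are hypothetical. The sketch covers file permissions only and does not encrypt anything; on Windows, `os.chmod` can only toggle the read-only flag, so a different mechanism would be needed there.

```python
import os
import stat
from pathlib import Path


def is_accessible_to_others(path: Path) -> bool:
    """Return True if group members or other users have any permission on the file."""
    mode = path.stat().st_mode
    return bool(mode & (stat.S_IRWXG | stat.S_IRWXO))


def restrict_to_owner(path: Path) -> None:
    """Allow only the file's owner to read and write it (POSIX mode 600)."""
    os.chmod(path, stat.S_IRUSR | stat.S_IWUSR)
```

Checking `is_accessible_to_others` on a sensitive file before and after calling `restrict_to_owner` gives a quick answer to whether other users on the same machine can open it.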
Organisation and accessibility
- How accessible are the data to other users?
- Is the repository easily found and accessed?
- Are the folder structures easy to navigate and conducive to programming during data processing and analysis?
- Can the storage method cope with a larger amount of data if more data are collected?
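A folder structure that is easy to navigate and to reference from code can be set up once and reused across projects. The sketch below creates one possible layout; the folder names and the function `create_project_structure` are assumptions for illustration, not a required standard.

```python
from pathlib import Path

# Hypothetical folder names; adapt them to your own project conventions.
PROJECT_FOLDERS = (
    "data/raw",        # original files, never edited
    "data/processed",  # cleaned or derived files
    "scripts",         # processing and analysis code
    "outputs",         # figures, tables, and reports
    "docs",            # documentation such as codebooks
)


def create_project_structure(root: Path) -> list[Path]:
    """Create the standard folders under root and return their paths."""
    created = []
    for name in PROJECT_FOLDERS:
        folder = root / name
        # exist_ok=True makes the call safe to repeat on an existing project.
        folder.mkdir(parents=True, exist_ok=True)
        created.append(folder)
    return created
```

Keeping raw and processed data in separate, predictably named folders means analysis scripts can use the same relative paths on every machine.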
Compliance with ethical and legal requirements
- Does the storage method comply with data protection regulations and other regulatory requirements?
Types of storage
There are various types of data storage that can be used for research data, each with their own strengths and limitations.
Specific legal requirements may apply to the storage of data, depending on where you and the data are based, especially where personal data are concerned. Be sure to check the data protection laws that apply in your area. For example, see the Data Protection Act in the UK and the General Data Protection Regulation (GDPR) in the European Union.
Local storage includes, for example, personal computers and external hard drives.
This provides direct control and quick access to the data, but it is limited in capacity, vulnerable to data loss, and poorly suited to data sharing.
Networked storage includes, for example, shared network drives or servers and network-attached storage devices within an organisation.
This provides centralised access for the organisation and can be scalable. However, it is limited for data sharing because each individual requires access to that specific network.
Cloud storage includes remote servers managed by third-party providers such as Dropbox, Google Drive, Microsoft OneDrive and Amazon Web Services. This provides scalability, accessibility and backup capabilities.
However, it has limitations with regard to data privacy, data security and the provider's terms of service.
It is important that you check the location of the cloud storage server, because servers located outside the European Union (EU) may not offer the level of protection required by the EU's General Data Protection Regulation (GDPR). Many cloud providers outside the EU have created data storage protocols that align with GDPR and are safe places to store your data, but it is important to confirm this with the service provider.
If you plan to store data on a cloud server and your project requires you to protect data in line with GDPR, you need to ensure that the cloud server complies. This applies to personal data collected in any Horizon projects and any other personal data shared with EU colleagues.
Data archives and repositories include dedicated platforms for storing and sharing research data such as the UK Data Service.
These often adhere to specific standards and guidelines to ensure data discoverability, accessibility, security and long-term preservation.
The term research data centre is mainly used in North America.
These centres, like other specialised facilities or institutions that offer secure and long-term storage solutions for research data, can provide scalability, security, and regulatory compliance. They often manage access controls and data sharing, e.g., by providing secure access to administrative and/or confidential data.
Alternatively, researchers may choose to use a combination of storage solutions for different types of data or for different purposes.