CLOSER’s most recent workshop (September 9th 2015) focused on the insights to be gained by comparing findings from different cohort studies, with a particular emphasis on data harmonisation.
These comparisons allow the findings from one study to be tested and replicated, and more robust conclusions to be reached – but they can be challenging to accomplish because of differences between studies. To take just a few examples:
- Height and weight may be measured in some studies but self-reported in others.
- Studies vary in how they collect information about income: weekly, monthly or annual; gross or net; individual or household; a precise amount or an income band.
- Tests for measuring physical or cognitive capability need to be adjusted as study participants get older.
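To make the income example concrete, here is a minimal sketch of one possible approach: convert each reported amount to an annual figure, then reduce it to the coarsest format any study uses (an income band). The conversion factors, band boundaries and function names are all illustrative assumptions, not taken from any of the studies discussed, and the sketch does not attempt to reconcile gross with net or individual with household income.

```python
from bisect import bisect_right

# Assumed conversion factors (reporting periods per year).
# Treating a year as exactly 52 weeks is an approximation.
PERIODS_PER_YEAR = {"weekly": 52, "monthly": 12, "annual": 1}


def annualise(amount, period):
    """Convert an amount reported per `period` into an annual amount."""
    if period not in PERIODS_PER_YEAR:
        raise ValueError(f"unknown reporting period: {period!r}")
    return amount * PERIODS_PER_YEAR[period]


def to_band(annual_amount, cut_points):
    """Return the index of the income band containing `annual_amount`.

    `cut_points` are the ascending lower bounds of every band except the
    first, so band 0 is everything below cut_points[0].
    """
    return bisect_right(cut_points, annual_amount)


# Hypothetical band boundaries, for illustration only.
CUT_POINTS = [10_000, 20_000, 30_000]

# Study A records a weekly amount; study B records a monthly amount.
band_a = to_band(annualise(300, "weekly"), CUT_POINTS)
band_b = to_band(annualise(1_300, "monthly"), CUT_POINTS)
```

Reducing precise amounts to bands discards information, so whether this trade-off is acceptable depends on the research question.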
During the day we heard from a range of speakers about their own experiences of harmonising data across studies in order to allow cross-cohort comparisons. Examples included CLOSER work harmonising BMI across five cohort studies, which has shown that children born since 1990 are up to three times more likely than older generations to be overweight or obese by the age of 10.
Common themes throughout the day included:
- The degree to which data need to be harmonised depends on the scientific question under investigation and the studies included. Different levels of harmonisation will be appropriate depending on whether the aim is to compare means or prevalence estimates, or to investigate associations between a risk factor and an outcome.
- The documentation and meta-data of harmonised datasets are vital. Even where harmonised variables or datasets exist, researchers need to be able to consider whether the data are acceptable for their specific scientific question.
- Harmonisation is an iterative process. The initial conceptual model of harmonisation often has to be modified once the data are obtained, and input from someone with expertise in the specific variables being harmonised is very useful.
- External data sources can be helpful. Existing survey data can be used to check harmonised variables, and results from calibration studies that compare measurements taken with different machines or tests are also valuable.
- Checking the robustness of scientific findings is important. Where possible, sensitivity analyses should be used to check how far the conclusions depend on decisions made during the harmonisation process.
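The point about sensitivity analyses can be illustrated with a minimal sketch: the same estimate is recomputed under several alternative harmonisation decisions, and the spread of results shows how much the conclusion depends on the choice made. The scenario (representing a banded income by its midpoint, lower bound or upper bound), the toy records and the function names are all hypothetical.

```python
from statistics import mean


def sensitivity(records, estimator, decisions):
    """Recompute `estimator` under each named harmonisation decision.

    `decisions` maps a label to a function that turns one raw record
    into a harmonised value.
    """
    return {
        label: estimator([harmonise(record) for record in records])
        for label, harmonise in decisions.items()
    }


# Toy data, for illustration only: each record is an income band (lower, upper).
records = [(0, 10_000), (10_000, 20_000), (10_000, 20_000), (20_000, 30_000)]

# Alternative ways of turning a band into a single value.
decisions = {
    "midpoint": lambda band: (band[0] + band[1]) / 2,
    "lower_bound": lambda band: band[0],
    "upper_bound": lambda band: band[1],
}

estimates = sensitivity(records, mean, decisions)

# A wide spread suggests the finding is sensitive to the harmonisation choice.
spread = max(estimates.values()) - min(estimates.values())
```

If the spread is small relative to the effect being studied, the conclusion is unlikely to be an artefact of that particular decision.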
These issues don’t only apply when comparing different longitudinal studies. Comparing different waves within the same longitudinal study can involve data harmonisation, as can comparing different cross-sectional studies.