Linkage to routine health and social records

The Project to Enhance ALSPAC through Record Linkage (PEARL)

The ALSPAC dataset is extensive and broad. However, to strengthen the platform further, we are working to link our self-reported data with external data sources. Some of these linkages are in place and the data are readily available for use; others are in development, or the data are available but require bespoke access arrangements.

Linkage activities are led by Professor John Macleod and the 'Project to Enhance Aetiological research through Record Linkage' (PEARL) team. PEARL's current activities relating to ALSPAC include:

1. The Project to Enhance ALSPAC through Record Linkage (PEARL).

Funded by the Wellcome Trust (£1.8m 2009-2018, led by John Macleod).

This (original) PEARL project has secured the legal basis for, and has subsequently extracted, ALSPAC index participants' health and social administrative records. Furthermore, the award supported the development of ALSPAC's linkage governance structure, including the ISO 27001 accredited PEARL Data Safe Haven, which includes ALSPAC's UK Secure eResearch Platform, a secure research environment developed by the MRC Wales Farr Institute.

2. Enhancing Environmental data Resources in Cohort Studies: ALSPAC exemplar (ERICA).

Funded by the Natural Environment Research Council (NERC) and Medical Research Council (MRC), (~£100k, 2017-2018, led by Andy Boyd).

ERICA aims to establish generalisable mechanisms for linking natural environment records into longitudinal population databanks established by cohort studies. With our partners at the Small Area Health Statistics Unit (SAHSU, Imperial College London) and the Earth Observation Science Group (University of Leicester), we will scope data science and governance issues relating to linking spatial data. We will also conduct an exemplar study evaluating the association between in utero and early life NO2 exposure and later health outcomes.

3. Cohort & Longitudinal Studies Enhancement Resources (CLOSER).

Funded by the Economic and Social Research Council (ESRC) and Medical Research Council (MRC), (2012-2017, linkage work packages led by Andy Boyd).

CLOSER aims to maximise the use, value and impact of cohort and longitudinal studies. Andy Boyd is a member of the CLOSER leadership group and leads work packages which: 1) develop novel mechanisms for disclosure control in cohort and longitudinal data sets; 2) facilitate new linkage mechanisms through coordinated communications with data owners; and 3) develop harmonised standards for processing NHS Hospital Episode Statistics records.

Please contact the linkage team if you are interested in the existing data, developing new linkages, linkage methodologies or collaborative projects.

Links undertaken

  • NHS Primary (GP) and Secondary Care (Hospital) records have been extracted for a sub-set of the ALSPAC index children. These data are currently being processed, but enquiries are welcome.
  • ALSPAC have linked the index children to the Clinical Practice Research Datalink (CPRD) database. As CPRD contains information on only a sub-sample of the English population, data are available for only an equivalent proportion of ALSPAC participants (currently about 5%). Please see the paper by Rosie Cornish for more information.
  • Geo-spatial linkages. ALSPAC are able to link participants to their residential addresses across the lifecourse. Linkages can be established to co-ordinates or to health, political and administrative geographies. In turn, we can link participants to neighbourhood data, including Indices of Multiple Deprivation and environmental measurement data.
  • National Pupil Database (NPD) - including the ‘key stage’ (attainment) results for each ALSPAC study child (ages 7, 11, 14 and 16)
  • Pupil Level Annual School Census (PLASC) and Annual School Census (ASC)
  • We have previously been provided with data on Cancer Registry entries, death notifications and cause of death by the Office for National Statistics. We now receive death notifications and cause of death from NHS Digital, and are looking to re-establish Cancer Registry entries via NHS Digital once available. There are restrictions on how these data can be accessed and used.
  • NHS STORK data (midwifery database).

Bespoke linkages

The ALSPAC data linkage team have extensive experience of negotiating access and undertaking linkage to data sets that are either non-routine or are not routinely centralised. To discuss these possibilities, please get in touch with the linkage team.

Links under development


We are working to expand our coverage of the index children's NHS primary care (GP), secondary care (HES – hospital records), and community care (mental health and learning disability) records.


Funding from the Department for Business, Innovation and Skills has enabled ALSPAC to link to the index children's Higher Education records (from the HESA database). We are still in negotiations with the relevant data owners about how these can be made available to the wider research community. We welcome enquiries about this dataset.

Criminal convictions and cautions

Links to the young person’s criminal conviction and caution data are being developed.

Financial benefits, earnings and employment data

The potential to link to the young person’s benefit, earnings and employment data is being explored.

Linkage methodologies

The ALSPAC data linkage team are actively involved in furthering the development of data linkage methodologies - with a particular focus on how these interact with, and can benefit, cohort studies. Current projects include: developing and assessing statistical approaches to de-identify and anonymise data; and developing anonymised protocols for data linkage.

We welcome any enquiries about this work or to discuss the development of record linkage in observational studies.
