Good TREs work

University College London (ucl) projects

Investigating the association between prenatal exposure to air pollution and maternal and child health outcomes
OPTICAL: Optimising Paediatric Transition to Intensive Care for AduLts
Project 3 (partially via "system access")
Database of UK recipients of pituitary-derived human growth hormone
MRC National Survey of Health and Development (NSHD) - tracing
Vivaldi Social Care
The impact of reimbursement schemes on healthcare providers' operational performance (partially via "system access")
A phase III, double blind, placebo controlled, randomised trial assessing the effects of aspirin on disease recurrence and survival after primary therapy in common non metastatic solid tumours ( ODR1718_261 )
Understanding and improving the use of investigations in primary care in patients subsequently diagnosed with cancer (ODR1920_196)
Loneliness among people with 'Complex Emotional Needs' (CEN): A cross-sectional UK study
Modelling impact of interruptions to cancer screening with COVID ( ODR2021_016 )
Stratifying Genomic Causes of Intellectual Disability by Mental Health Outcomes in Childhood and Adolescence (IMAGINE-2) (partially via "system access")
Prevalence, clinical characteristics and impact of body dysmorphic disorder in young people
Educational outcomes in children born after assisted reproductive technology; a population-based linkage study
1970 British Cohort Study - Tracing
Creating synthetic data for health research
Childhood outcomes after perinatal brain injury (Data flowing to ONS)
Research on Health and Ageing using English Longitudinal Study of Ageing (ELSA)
Dose-response relationship between alcohol and suicide
Examining loneliness in people with borderline intellectual functioning compared to the general population and its relationship to mental and physical health outcomes
MR104C - British Women's Heart and Health Study (s251 cohort)
A study exploring the relationships between cognitive and sensory impairments and experiences of abuse and discrimination
Mental disorders and help seeking for mental or physical health conditions in sexual minorities
MR1129 - SCORAD Feasibility Study
MR688 - UK Ductal Carcinoma in Situ (DCIS) Trial
Policy Research Unit for Children, Young People and Families
Investigating the utility of machine learning methods to predict prognosis and guide treatment decisions for people with lung cancer (Lung-ORACLE)
SUMMIT Study: Cancer screening study with or without low-dose lung CT to validate a multi-cancer early detection test (Previously ODR1718_316)
The role of IAPT in the prevention of dementia and the amelioration of its impact on service use and co-morbidities (the MODIFY project)
British Regional Heart Study (BRHS)- data linkage of established cohort to NHS Digital datasets (HES, MHMDS, DIDS)
MR472B - SABRE: Southall and Brent Revisited - Consented participants
MR472A - SABRE: Southall and Brent Revisited - S251 participants not cancer notifiable
MR472 - SABRE: Southall and Brent Revisited - S251 participants
Whitehall II (MR262)
Recording Antimicrobial Resistance during Death Certification in England
MR358 - NATIONAL CHILD DEVELOPMENT STUDY 1958 COHORT
MR104 - Regional Heart Study
Inequalities in cancer care pathways
General Health Outcomes in Subfertile Men: a UK register-based cohort study
Virus Watch: Understanding community incidence, symptom profiles, and transmission of COVID-19 in relation to population movement and behaviour
MR1393 - Join Dementia Research
Evaluation of aid to diagnosis for congenital dysplasia of the hip in general practice: controlled randomised trial
PreHOspital Triage for potential stroke patients: lessONs from systems Implemented in response to COVID19 (PHOTONIC)
MR737 - ESRC MILLENNIUM COHORT STUDY (MCS) child of the new century
AspECT EXceL- Aspirin Esomeprazole Chemoprevention Trial- Section 251 Subset of the Cohort
AspECT EXceL- Aspirin Esomeprazole Chemoprevention Trial- Consenting Subset of the Cohort
Centre for Longitudinal Studies -Next Steps Study - Mortality
1970 British Cohort Study - MR21
Understanding excess child and adolescent mortality in the UK
EVenti: The Prognostic Performance of the Enhanced Liver Fibrosis Test in UK Patients with Chronic Liver Disease Assessed 20 Years After Recruitment to the EUROGOLF study (EVenti).
OLIVE: Improving the early detection of lung cancer in never-smokers
MR1362: Extension of NIC-349413-F1J1N - Next Steps Cohort Study
Centre for Longitudinal Studies - Millennium Cohort Study (MCS)- (Age 17 consent)
Extended follow-up of the TARGIT A Trial
Linkage of NHS Digital data to young people with perinatal HIV, to monitor cancers and deaths.
MR1450 - National Child Development Study (NCDS)
MR1470 - Using routine data to identify and assess clinical outcomes for the STAMPEDE trial: Systemic Therapy in Advancing or Metastatic Prostate Cancer: Evaluation of Drug Efficacy.
Assessing the impact of the COVID-19 pandemic on vulnerable children: the DHSC-ECHILD-COVID study
MR1a - Health and Development Study - Consented Cohort Members
Cancer Registry-wide study in infants with neuroblastoma; Task 11.4 of the ENNCCA Network of Excellence (ODR1516_119)
Millennium Cohort Study (also known as Child of the New Century) - Tracing
Centre for Longitudinal Studies Birth Cohort Studies Data Linkage: National Child Development Study
Centre for Longitudinal Studies Next Steps Data Linkage: Next Steps Age 25 Study
Centre for Longitudinal Studies Birth Cohort Studies Data Linkage: 1970 British Cohort Study
Childhood Outcomes after Perinatal Brain Injury (Data flowing to DfE)
Using Large-scale Routine Data to Monitor and Improve Ethnic Inequalities in Cancer and Cardiovascular Disease ( ODR1920_301 )
UK Early Life Cohort Feasibility Study (ELC-FS)
Advancing Survivorship after Cancer: Outcomes Trial (ODR1819_039)
LAUNCHES QI: Linking AUdit and National datasets in Congenital HEart Services for Quality Improvement.
Camden & Islington Clinical Record Interactive Search (CRIS) Linkage with HES/Mortality Data
Understanding the health needs of mothers involved in family court cases
Assessing the utility of healthcare systems data for trials: data utility comparisons in the STAMPEDE trial (DUCkS)(previously: ODR1718_094)
MR623 - NATIONAL MOTHER AND CHILD COHORT
Family, household and environmental risk factors for hospital admissions in childhood
MR1415 - Application for ONS mortality data to be used for flagging and analysis of the RADICALS trial, which is a large phase III randomised controlled trial for people with prostate cancer (ISRCTN 40814031).
Evaluating the Family Nurse Partnership in England
MR1b - Health and Development Study - S251 Cohort members
Evaluating protocols for identifying and managing patients with FH
MR740 - United Kingdom Collaborative Trial of Ovarian Cancer Screening (UKCTOCS)
Evaluating variation in special educational needs provision for children with Down syndrome and associations with emergency use of hospital care.
MR1318 - General Health & Hospital Admissions in Children Born after ART; A Population Based Linkage Study
Using national electronic databases to validate cardiovascular outcomes in PATCH a pilot study to assess the use of electronic databases for clinical trial follow up.
The relationship between education and health outcomes for children and young people across England: the value of using linked administrative data.
Mixed methods evaluation of the Getting it Right First Time programme - improvements to NHS orthopaedic care in England
Project 85
Precision in Provision: Predicting Treatment Outcome and Resource Use in Child Mental Health
MR1179 - INFANT study
Variation in Healthy Life Expectancy Throughout Childhood and Adulthood in England
Project 89
Variation in avoidable hospital admissions by mental health status
Usual Care versus Specialist Integrated Care: A Study of Hospital Discharge Arrangements for Homeless People in England
Critical Care Health Informatics Collaborative
MR1213 - Lung-SEARCH: A RANDOMISED CONTROLLED TRIAL OF SURVEILLANCE FOR THE EARLY DETECTION OF LUNG CANCER...
Acute Day Units as Crisis Alternatives to Residential Care
Small area geodemographic profiling of health needs
Project 96
MR104a - Regional Heart Study (Female Cohort)
Project 98
MR1291: Clinical Cohorts in Coronary Disease Collaboration (4C)
Project 100
MR1090 - HiLo: Multicentre randomised trial of high dose versus low dose radioiodine
MR1396 - GALA-5: An Evaluation of the Tolerability and Feasibility of combining 5-Amino-Levulinic Acid (5-ALA) with Carmustine Wafers (Gliadel) in the Surgical Management of Primary Glioblastoma.
Project 103
Project 104
Project 105

8527 data files in total were disseminated unsafely (information about files used safely is missing for TRE/"system access" projects).

🚩 University College London (ucl) was sent multiple files from the same dataset, in the same month, both with optouts respected and with optouts ignored. University College London (ucl) may not have compared the two files, but the identifiers are consistent between datasets, and outside of a good TRE NHS Digital cannot know what recipients actually do.

Investigating the association between prenatal exposure to air pollution and maternal and child health outcomes — DARS-NIC-746266-S8M6T

Opt outs honoured: (Excuses: Consent (Reasonable Expectation))

Legal basis: Health and Social Care Act 2012 s261(2)(c)

Purposes: No (Academic)

Sensitive: Sensitive, and Non-Sensitive

When: DSA runs 2025-10 – 2028-10 SMLS reported a DPA serious incident; breached contract — audit report.

Access method: One-Off

Data-controller type: UNIVERSITY COLLEGE LONDON (UCL)

Sublicensing allowed: No

AGD/predecessor discussions: AGD minutes - 10 July 2025 final.pdf

Datasets:

Emergency Care Data Set (ECDS)
Hospital Episode Statistics Admitted Patient Care (HES APC)

Type of data: Identifiable

Objectives:

University College London requires access to NHS England data for the purpose of the following research project:
The Baby Biome Study

The following is a summary of the aims of the research project provided by the controller:

Air pollution remains one of the leading public health issues facing the UK today. The mortality burden of air pollution in England is estimated to be between 26,000 and 38,000 a year, and, in London, 3,600-4,100 deaths were estimated to be attributable to anthropogenic particulate matter (PM2.5) and NO2 in 2019.

The importance of air pollution has been highlighted by the Chief Medical Officer in England, who urged the government to take strong action to reduce air pollution and its impact in their 2022 annual report.

University College London's engagement with members of the public, including representatives of mothers and pregnant women, demonstrated that the impact of air pollution on health is a growing concern but awareness and understanding about potential long-term effects of prenatal exposure is lacking.

University College London propose to apply state of the art air pollution assessment to a UK mother-child cohort, the Baby Biome Study, to assess the effects of exposure on multiple pregnancy, birth, and child health outcomes, and further investigate the role of microbial and molecular mediators in this relationship.

The work will provide up-to-date high-quality information to meet the needs of public health policy and address widespread public concern.

The following NHS England Data will be accessed:
Hospital Episode Statistics (HES)
Admitted Patient Care (APC) - necessary because one of the primary outcomes of this study is respiratory infections, many of which will require inpatient care (e.g., bronchiolitis caused by RSV).
Emergency Care Data Set (ECDS) - necessary because atopic conditions and mild respiratory infections may result in A&E attendances but not hospital admission.

The level of the Data will be:

· Identifiable - necessary to enable linkage of the data with data collected from other sources, including the participants themselves and the testing laboratories contracted to process swab, blood and stool data.

The Data will be minimised as follows:
· Limited to a study cohort of 3500 patients who consented to participate.
· Limited to data from 2016 to 2025. For each individual patient, data will only be provided from the date they joined the trial.

The lawful basis for processing personal data under the UK GDPR is:
Article 6(1)(e) - processing is necessary for the performance of a task carried out in the public interest or in the exercise of official authority vested in the controller;

Article 9(2)(j) - processing is necessary for archiving purposes in the public interest, scientific or historical research purposes or statistical purposes in accordance with Article 89(1) based on Union or Member State law which shall be proportionate to the aim pursued, respect the essence of the right to data protection and provide for suitable and specific measures to safeguard the fundamental rights and the interests of the data subject.

This processing is in the public interest because it adheres to the UK Policy Framework for Health and Social Care Research, which protects and promotes the interests of patients, service users and the public, and aims to produce generalisable and publicly available information to inform future decisions over patients' treatments or care.

The funding is provided by The Academy of Medical Sciences. The funding is specifically for the study described. Funding is in place until August 2025.

The funder will have no ability to suppress or otherwise limit the publication of findings.

A Public and Patient Involvement and Engagement (PPIE) group helped refine the purpose of the research. The group supported the collection of the data for the purposes described above. PPIE groups representing women from disadvantaged backgrounds, who are exposed to high levels of air pollution, were engaged and will help disseminate findings (the Bridge, Global Black Maternal Health, Mothers and Others for Clean Air).

Expected Benefits:

The findings of this research study are expected to contribute to evidence-based decision-making by policy-makers and local decision-makers around the regulation of air pollution. The benefits this would produce would apply to patients as the findings are expected to inform best practice to improve the care, treatment and experience of health care users, in this case pregnant women, relevant to the influence of pollution on mother and child.

The use of the data could:
help the system to better understand the health and care needs of populations.
lead to the identification or improvement of treatments or interventions, or health and care system design to improve health and care outcomes or experience.
advance understanding of the need for, or effectiveness of, preventative health and care measures for particular populations or conditions such as obesity and diabetes.
inform planning health services and programmes, for example to improve equity of access, experience and outcomes.
inform decisions on how to effectively allocate and evaluate funding according to health needs.
support knowledge creation or exploratory research (and the innovations and developments that might result from that exploratory work).

This research has the potential to influence public policy and regulation to the benefit of patients with regard to care around air pollution by central and local governments. For instance, it may help identify thresholds of air pollution levels that are likely to be harmful to the fetus, which may be different from those for children and adults and could change the care offered as a result.

The research may also help identify strategies to reduce exposure to air pollution that will directly benefit patients, such as offering information and guidance to pregnant women, free off-peak travel passes for pregnant women, offering support to the implementation of low/zero emission zones in residential areas or other policies related to decreasing emissions.

The findings are expected to be beneficial for giving advice to pregnant women on how to minimise their exposure to air pollution and adopt strategies to mitigate against its harmful impacts on their health and the health of their children (e.g., wearing masks when outdoors in polluted areas).

More indirectly this research will benefit other researchers who will be able to access the data to explore other hypotheses considering the in-depth characterisation of this birth cohort. For instance, mechanistic studies exploring the role of epigenetics and metabolomics will be possible. These studies can ultimately produce benefits for mothers and children.

It is hoped that through publication of findings in appropriate media, the findings of this research will add to the body of evidence that is considered by the bodies, organisations and individual care practitioners charged with making policy decisions for or within the NHS or treatment decisions in relation to specific patients.

University College London organised focus groups with PPIE representatives from maternal health charities to help design the research, especially with women who experience the worst maternal health outcomes. The group strongly supported the use of maternal and child health data for research as findings can be very important to support policies that will protect their health and the health of their children.

Outputs:

The expected outputs of the processing will be:
Submissions to peer reviewed journals at the end of the study.
Presentations at appropriate conferences
A database to be utilised as a resource for health research

The outputs will not contain NHS England Data and will only contain aggregated information with small numbers suppressed as appropriate in line with the relevant disclosure rules for the dataset(s) from which the information was derived.
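The small-number suppression mentioned above can be illustrated with a minimal sketch. The agreement does not state the threshold or the marker used, so `threshold` and `SUPPRESSED` below are hypothetical and would need to be taken from the disclosure rules of the dataset concerned. The category names and counts are made up for illustration.

```python
from typing import Dict, Union

# Hypothetical marker for a suppressed cell; not taken from the agreement.
SUPPRESSED = "*"


def suppress_small_counts(
    counts: Dict[str, int], threshold: int
) -> Dict[str, Union[int, str]]:
    """Replace every non-zero count below `threshold` with a marker.

    Zero counts are left as they are in this sketch; whether zeros may be
    published depends on the disclosure rules that apply.
    """
    return {
        category: (SUPPRESSED if 0 < n < threshold else n)
        for category, n in counts.items()
    }


# Made-up aggregate counts; the threshold of 8 is an assumption.
example = suppress_small_counts(
    {"bronchiolitis": 42, "asthma": 3, "eczema": 0}, threshold=8
)
```

A real output check would usually also guard against secondary disclosure, for example recovering a suppressed cell from row or column totals, which this sketch does not attempt.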

The outputs will be communicated to relevant recipients through the following dissemination channels:
Journals
Webinars open to policy makers in local councils and Public Health organisations (e.g., UKHSA)
Webinars open to members of the public
Social media
Briefing documents provided to local councils
Press/media engagement.

The Data Controller expects to produce and disseminate outputs at the end of the study, approximately 3 years from when the Data Controller gets access to the NHS England Data.

Processing:

University College London will transfer data to NHS England.

The data will consist of identifying details (specifically NHS Number and a unique person ID) for the cohort to be linked with NHS England data.

NHS England will provide the relevant records from the HES datasets to University College London. The Data will contain directly identifying data items including NHS Number, unique person ID & Postcode, which are required to link the Data at record level with data already held by the recipient.

The Data will not be transferred to any other location other than as described below.

The data will be stored on servers at University College London.

University College London uses offsite back-up services provided by VIRTUS Data Centres.

University College London stores data on the Cloud provided by Amazon Web Services.

The Data will be accessed by authorised personnel via remote access.

The Controller must confirm and provide evidence upon audit by NHS England that access via any remote device complies with the data security obligations within this DSA and the Data Sharing Framework Contract.

For remote access:
- Remote access will only be from secure locations situated within the territory of use (as further restricted elsewhere within the DSA if so done) stated within this DSA;
- Access controls granting users the minimum level of access required are in place;
- Remote access is only via secure connections (e.g., VPNs or secure protocols) to protect data;
- Multifactor authentication (MFA) is required for remote access;
- Device security, including up-to-date software and operating systems, antivirus software, and enabled firewalls are utilised for the remote access;
- All remote access is undertaken within the scope of the organisation's DSPT (or other security arrangements as per this DSA) and complies with the organisation's remote access policy.

The above applies in addition to any condition set out elsewhere within the DSA (e.g. who may carry out processing, and for what purpose).

Remote processing will be from secure locations within England/Wales. The data will not leave England/Wales at any time.

Access is restricted to employees or agents of University College London who have authorisation from the Chief Investigator.

All personnel accessing the Data have been appropriately trained in data protection and confidentiality.

The Data will be linked at person record level with maternity and demographic data already available for study participants. These data linkages will be completed by NHS number.

Air pollution data will be linked by postcode for the registered address at the time of registration with the study.

Identifiable information will only be accessed by the researcher carrying out the data linkages.

Data will then be pseudonymised prior to analysis.

All subsequent users will conduct analyses using a pseudonymised dataset. Identifiable information will be kept in a separate dataset that will not be accessible to all researchers and will have restricted access with additional password protection.

There will be no requirement and no attempt to reidentify individuals when using the Data.
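The link-then-pseudonymise sequence described above can be sketched as follows. This is an illustration under stated assumptions, not the controller's actual procedure: the agreement does not say how pseudonyms are generated, so the keyed hash (HMAC-SHA256), the field names (`nhs_number`, `postcode`, `pseudo_id`) and the function names are all hypothetical. The sketch links on NHS number, then removes the direct identifiers from the analysis rows and returns the re-identification key table separately, mirroring the separate restricted dataset.

```python
import hashlib
import hmac


def pseudonymise_id(nhs_number: str, secret_key: bytes) -> str:
    """Derive a stable pseudonym from an identifier with a keyed hash."""
    return hmac.new(secret_key, nhs_number.encode("utf-8"), hashlib.sha256).hexdigest()


def link_and_pseudonymise(study_rows, hes_rows, secret_key):
    """Link records on NHS number, then drop the direct identifiers.

    Returns (analysis_rows, key_table). The key table maps pseudonym to
    NHS number and is meant to be stored apart from the analysis data.
    """
    study_by_id = {row["nhs_number"]: row for row in study_rows}
    analysis_rows = []
    key_table = {}
    for hes in hes_rows:
        study = study_by_id.get(hes["nhs_number"])
        if study is None:
            continue  # record is not in the consented cohort
        pseudo_id = pseudonymise_id(hes["nhs_number"], secret_key)
        key_table[pseudo_id] = hes["nhs_number"]
        merged = {**study, **hes, "pseudo_id": pseudo_id}
        merged.pop("nhs_number")
        # Assumption: postcode is no longer needed once pollution data is linked.
        merged.pop("postcode", None)
        analysis_rows.append(merged)
    return analysis_rows, key_table
```

A keyed hash is used here rather than a plain hash because NHS numbers are short and enumerable, so an unkeyed digest could be reversed by brute force. The key would need the same restricted access as the key table.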

Researchers from University College London will process the Data for the purposes described above.

OPTICAL: Optimising Paediatric Transition to Intensive Care for AduLts — DARS-NIC-772633-Y3K3T

Opt outs honoured: (Excuses: Does not include the flow of confidential data)

Legal basis: Health and Social Care Act 2012 s261(2)(a)

Purposes: No (Academic)

Sensitive: Sensitive, and Non-Sensitive

When: DSA runs 2025-10 – 2028-10 SMLS reported a DPA serious incident; breached contract — audit report.

Access method: One-Off

Data-controller type: UNIVERSITY COLLEGE LONDON (UCL)

Sublicensing allowed: No

AGD/predecessor discussions: AGD minutes - 11th September 2025 final.pdf

Datasets:

Civil Registrations of Death
Emergency Care Data Set (ECDS)
Hospital Episode Statistics Accident and Emergency (HES A and E)
Hospital Episode Statistics Admitted Patient Care (HES APC)
Hospital Episode Statistics Critical Care (HES Critical Care)
Hospital Episode Statistics Outpatients (HES OP)
PersonID Bridge File

Type of data: Anonymised - ICO Code Compliant

Objectives:

University College London (UCL) require access to NHS England data for the purpose of the following research project:
Optimising Paediatric Transition to Intensive Care for AduLts (OPTICAL)

The following is a summary of the aims of the research project provided by University College London (UCL):
OPTICAL aims to indirectly improve the quality of intensive care for young people with complex health needs by providing the first detailed understanding of the number and characteristics of patients who transition from paediatric to adult intensive care.
The primary objective of the OPTICAL study is to generate the evidence base for improving the care received and patient/family experience for teenagers transitioning from paediatric to adult ICU services.

The secondary objectives are:
1. To determine the clinical characteristics and healthcare resource utilisation from teenage years to early adulthood of people who used intensive care as a young person, and how these relate to intensive care use after the age of 16.
2. To understand the experience of patients, carers and health professionals of ICU transition, including barriers they face, examples of good practice and suggestions for improving support provided.
3. To establish evidence-based potential improvements in the processes and support for transitioning to adult ICU services, targeted among patient groups as appropriate.
4. To explore the outcomes of teenagers and young adults (TYA) admitted to PICUs for mental health (MH) crises in terms of their future need for adult critical care admission.
5. To understand the mental health implications of critical care transition from childhood to adulthood on the individual, their families, and healthcare professionals involved in their management.

To support these objectives, the study makes extensive use of national audit datasets from the Paediatric Intensive Care Audit Network (PICANet) and the Intensive Care National Audit & Research Centre Case Mix Programme (ICNARC CMP). These datasets provide comprehensive, longitudinal data on paediatric and adult ICU admissions, respectively.

The data from NHS England will contribute to objective 1 for data analysis and the findings from the analysis will contribute to objective 3.
Objective 1 and objective 3 do not involve recruiting participants. The participants for objective 1 are identified by:
1) The Hospital Episode Statistics Critical Care Dataset which defines the critical care population,
2) The Paediatric Intensive Care Audit Network (PICANet), and
3) The Intensive Care National Audit & Research Centre Case Mix Programme (ICNARC CMP).

The integration of PICANet and ICNARC records with Hospital Episode Statistics Critical Care (HES CC) data enables the identification and characterisation of young people who transition between ICU services, supports the analysis of patterns and risk factors associated with ICU use after age 16, and informs the development of targeted improvements in transition pathways.

Objective 3 does not involve a study population or recruitment. It involves establishing an expert group to review evidence relating to the management of ICU transition. Objective 4 involves sub group analyses of the datasets from objective 1, and additional participant recruitment for work package 2 (which does not involve the NHS England data).

This study aims to produce the evidence base for the development of future national guidelines to improve the care and experience of teenagers transitioning from paediatric to adult ICU services. This involves generating evidence on:
The size and characteristics of the population that require transition from paediatric to adult intensive care post 16
Whether certain conditions make AICU admission more likely
Whether this population can be identified during childhood and the methods to do this
The types and number of hospital interactions these teenagers and young adults (TYA) have including current outcomes
The lived experiences of teenagers and young adults (TYA), their families, and health professionals involved in paediatric to adult ICU transition
The ways in which clinical care and patient experience can be improved

Measuring, reporting and learning from outcomes should drive quality improvement of care but this is particularly challenging for conditions or health problems such as learning disabilities, mental health issues, genetic syndromes, and congenital anomalies. Given the complex care trajectories and experiences of patients with chronic complex conditions, rich datasets and careful multi-disciplinary analysis are required.

The following NHS England data will be accessed:
Hospital Episode Statistics (HES) - necessary to complete the patient trajectories of healthcare use, diagnoses, and treatment
Critical Care (CC) - necessary to establish a cohort of patients admitted to intensive care.
Outpatients (OP)
Accident and Emergency (A and E)
Admitted Patient Care (APC)
Emergency Care Data Set (ECDS)
Civil Registrations of Death - necessary to determine loss to follow up and also to include in healthcare analyses

The level of the data will be Pseudonymised.

The Data will be minimised as follows:
- Limited to a study cohort identified by NHS England as meeting the following criteria:
All patients with at least one critical care admission in England aged 14 or 15 years between January 2017 and March 2024 (allowing at least one year of follow up), captured in the Hospital Episode Statistics Critical Care dataset.
- For individuals in the HES Critical Care cohort identified by NHS England, data is limited to 2003/04 up to latest available for HES APC, HES OP, HES A&E and ECDS. For HES CC, data is limited from January 2017 up until latest available.
- Civil registrations mortality will contain latest available data.
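The cohort criterion above can be expressed as a simple filter. This is a minimal sketch only: the field names `age_at_admission` and `admission_date` are hypothetical rather than actual HES Critical Care field names, and treating the window as inclusive of the whole of January 2017 and March 2024 is an assumption, since the agreement gives months rather than exact dates.

```python
from datetime import date

# Assumed inclusive window: "between January 2017 and March 2024".
WINDOW_START = date(2017, 1, 1)
WINDOW_END = date(2024, 3, 31)
ELIGIBLE_AGES = {14, 15}


def in_cohort(admissions) -> bool:
    """Return True if any critical care admission meets the cohort criterion.

    `admissions` is a list of dicts for one patient, each with the
    hypothetical keys `age_at_admission` (int) and `admission_date` (date).
    """
    return any(
        a["age_at_admission"] in ELIGIBLE_AGES
        and WINDOW_START <= a["admission_date"] <= WINDOW_END
        for a in admissions
    )
```

For example, a patient whose only admission was at age 13 in 2018 would be excluded, while a later admission at age 15 within the window would bring the same patient into the cohort.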

UCL is the research sponsor and the controller as the organisation responsible for ensuring that the Data will only be processed for the purpose described above.

The lawful basis for processing personal data under the UK GDPR is:
Article 6(1)(e): processing is necessary for the performance of a task carried out in the public interest or in the exercise of official authority vested in the controller;

The lawful basis for processing special category data under the UK GDPR is:
Article 9(2)(j): processing is necessary for archiving purposes in the public interest, scientific or historical research purposes or statistical purposes in accordance with Article 89(1) based on Union or Member State law which shall be proportionate to the aim pursued, respect the essence of the right to data protection and provide for suitable and specific measures to safeguard the fundamental rights and the interests of the data subject.

The processing is in the public interest because the research aims to deliver significant benefits to young people transitioning from child to adult ICU, and their families, by generating the crucial evidence that will enable improvement of clinical care and experience at a time of heightened vulnerability. The new evidence will also deliver benefits to the NHS by empowering staff to identify and better support this relatively small but rapidly increasing population, who use scarce and high-cost intensive care resources.

Funding is provided by the National Institute for Health Research (NIHR). The funding is specifically for the project described.

The funder(s) will have no ability to suppress or otherwise limit the publication of findings.

Amazon Web Services (AWS) is a processor acting under the instructions of UCL. AWS's role is limited to secure back-up of data stored in UCL's Data Safe Haven.

UCL uses offsite data centre services provided by VIRTUS data centre. VIRTUS does not have access to the data.

Data will be accessed by:
- Substantive employees of UCL; and
- Individuals holding an honorary contract under the supervision of a substantive employee of University College London for the purposes described in this Data Sharing Agreement (DSA) only. University College London must maintain records in a single location that cover the following details of each individual given access under an honorary contract:
Their substantive employer;
Their role in respect of the purpose for the processing specified in the DSA;
The start date and end date of the duration in which the Data will be accessed by the individual under an honorary contract;
The necessity for the Data to be accessed by the person(s) holding an honorary contract, instead of a substantive employee of an organisation named as controller or a processor in this DSA;
Confirmation that an appropriate contract is in place which follows the relevant guidance and is countersigned by the substantive employer of the honorary contract holder.

One parent/carer with first-hand experience of transition to AICU (PPI co-applicant) provided feedback on the proposal and will chair our PPI group. Additionally, UCL's partner paediatric patient charities ICUSteps, WellChild and Mencap will help with objective 2 of the study (e.g. designing online forums). These charities develop and moderate the forums, and representatives from each advise across the study via the PPI group.
For objective 2, the study team and charities will engage with potential participants for the 1:1 interviews, surveys, and online forums: i) teenagers and young adults (TYA) who have had a PICU admission, and ii) family members of TYA who have had a PICU admission. Objective 2 does not involve the NHSE data.

Expected Benefits:

OPTICAL will aim to deliver significant benefits to young people transitioning from child to adult ICU, and their families, by generating the crucial evidence that will enable improvement of clinical care and experience at a time of heightened vulnerability. It is hoped that the new evidence will also deliver benefits to the NHS by empowering staff to identify and better support this relatively small but rapidly increasing population, who use scarce and high-cost intensive care resources.

OPTICAL is focused on the evidence needed to inform practice, with objective 3 designed to ensure that the quantitative and qualitative data collected in objectives 1 and 2 is fit for that purpose, and that the new evidence will be synthesised and used by stakeholders to develop practical evidence-based suggestions for improving transition services.
Importantly, objective 3 will ascertain whether certain improvements can be customised or targeted for particular patient groups in order to ensure that limited NHS resources are focused where they are needed most, maximising patient benefit and value for money.
The researchers expect to enable clear pathways to impact that will maximise the potential benefits of the work by:
* Directly informing updates to the current Intensive Care Society (ICS) / Paediatric Critical Care Society (PCCS) joint transition guidelines through a co-investigator who is lead author of the guidelines and ICS Council Member and another who is Director of Research at ICS.
* Directly informing quality standards for the care of critically ill children through our co-principal Investigator who is PCCS research group chair and Council Member.
* Incorporating learning from the implementation of transition guidelines in testbed regions through a co-investigator who is paediatric critical care network lead in Yorkshire, where a scheme is being updated based on feedback from clinicians and patients.

Outputs:

The expected outputs of the processing will be:
Academic articles, including the study protocol (published in open-access journals to improve transparency of reporting)
Approximately ten publications in peer-reviewed Medical and Scientific Journals, oral and written presentations at national and international conferences. The final outputs will only contain aggregate results with small number suppression, in line with the HES Analysis Guidelines.
Conference presentations (e.g., UK Paediatric Critical Care Society (PCCS) Annual Conference, Intensive Care Society (ICS) State of the Art meeting)
Webinars to present study findings (e.g., through the professional societies and Royal Colleges).
One-page executive summaries.
Social media posts (including LinkedIn), accessible lay summaries, blogs, vlogs, infographics, slide decks, newsletters and podcasts (disseminated through our study website and partner organisations).
Code mapping tables from objective 1 and interview topic guides and surveys from objective 2 will be additional study outputs.
Informing and engaging patients, NHS and wider population about the work undertaken.

To transfer the new measures into practice the study team will use the networks of their team and experts from their committees.
The research team has strong links with the UK Paediatric Critical Care Society (PCCS) and the Intensive Care Society. UCL also have strong links with paediatric care charities to capture their perspectives through online forums as patient and family views are crucial. Charities include the Royal Mencap Society, ICUSteps, and WellChild.

The outputs will not contain NHS England Data and will only contain aggregated information with small numbers suppressed as appropriate in line with the relevant disclosure rules for the dataset(s) from which the information was derived.
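As an illustration of small number suppression on aggregated outputs, the sketch below masks low counts before release. The function name, the threshold of 10 and the `*` marker are assumptions chosen for the example; the actual values come from the disclosure rules of the dataset concerned, which are not reproduced here.

```python
def suppress_small_counts(counts, threshold=10, marker="*"):
    """Mask non-zero counts below `threshold` in a table of aggregate counts.

    `threshold` and `marker` are illustrative assumptions, not values taken
    from any disclosure guidance.
    """
    suppressed = {}
    for group, n in counts.items():
        if 0 < n < threshold:
            suppressed[group] = marker   # small number: hide the exact count
        else:
            suppressed[group] = n        # zero or large enough: keep as is
    return suppressed


# Made-up aggregate counts by age band.
table = {"16-17": 3, "18-19": 25, "20-24": 0}
released = suppress_small_counts(table)
```

This handles only primary suppression. If row or column totals are also published, a suppressed cell can sometimes be recovered by subtraction, so real output checking also needs secondary suppression or rounding.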

The outputs will be communicated to relevant recipients through the following dissemination channels:
· Webinars - UCL will submit annual reports (most recently May 2024) to the NIHR (the project funders) and partner with them to draw on their networks and skills in dissemination and spread to make an impact more widely, which may include generating accessible resources such as downloadable leaflets and case studies, research highlights, blogs and webinars

· Social media - A project website https://www.ucl.ac.uk/clinical-operational-research-unit/optical-study, and via social media and blogs. Updates to the website will be ongoing throughout the study and as and when publications and communications are available. Updates will be less common following the final publication, however the timing of the final communication is not possible to determine as updates on benefits or reviews may be made after the study end and final publication. UCL will also publish details of OPTICAL on the PICANet and ICNARC websites as well as on the UCL CORU website.

· Participant newsletters - The team will also arrange the dissemination of findings through the charities the Royal Mencap Society, ICUSteps, and WellChild, who will send newsletters to their members and publish news items on their websites highlighting the work of the study. Where appropriate, results will be promoted as press releases (2026-2030)

· Reports aimed at [participants/patients] - UCL will ensure that lay summaries are provided (reviewed in collaboration with patients and parents on their PPI Committee). The patients and parents on the advisory committee attend annual advisory group meetings to receive updates and to provide feedback on any aspect of the study

The target dates for production and dissemination of the outputs are over the course of the DSA 2025-2028

Processing:

No data will flow to NHS England

NHS England will identify a cohort (as detailed in section 5a of this DSA) using HES CC.

NHS England will provide the relevant records from the HES CC, APC, HES OP, HES A&E, ECDS and Civil Registration Deaths datasets to UCL. The data will contain no direct identifying data items but will contain a unique person ID which can be used to link the Data with other record level data already held by the recipient.

Under a separate DSA (DARS-NIC-791298-B8B2B) ICNARC will transfer data to NHS England. The data will consist of identifying details (specifically NHS Number, Name, Date of Birth, Postcode, Gender and a unique person ID).

Under a separate DSA (DARS-NIC-791292-Z2X6Z) Healthcare Quality Improvement Partnership (HQIP) will transfer a PICANet cohort to NHS England. The data will consist of identifying details (specifically NHS Number, Name, Date of Birth, Postcode, Gender and a unique person ID)

For both the PICANet and ICNARC cohorts that NHS England receives, NHS England will supply two files to UCL containing pseudonymised study IDs. These files will allow UCL to identify (at a pseudonymised study ID level) whether an individual in the HES CC cohort identified by NHS England is also present in the ICNARC or PICANet cohorts. If so, UCL will also have access to that individual's HES, ECDS and Civil Registration Deaths data.

The Data will not be transferred to any other location.

Data is stored on servers at UCL.

Amazon Web Services provides cloud hosting services to UCL and will store the data as contracted by UCL.

The Data will be accessed by authorised personnel via remote access. The Controller(s) must confirm and provide evidence upon audit by NHS England that access via any remote device complies with the data security obligations within this DSA and the Data Sharing Framework Contract.

For remote access:
- Remote access will only be from secure locations situated within the territory of use (as further restricted elsewhere within the DSA if so done) stated within this DSA;
- Access controls granting users the minimum level of access required are in place;
- Remote access is only via secure connections (e.g., VPNs or secure protocols) to protect data;
- Multifactor authentication (MFA) is required for remote access;
- Device security, including up-to-date software and operating systems, antivirus software, and enabled firewalls are utilised for the remote access;
- All remote access is undertaken within the scope of the organisation's DSPT (or other security arrangements as per this DSA) and complies with the organisation's remote access policy.
The above applies in addition to any condition set out elsewhere within the DSA (e.g. who may carry out processing, and for what purpose).

Remote processing will be from secure locations within England/Wales. The data will not leave England/Wales at any time.

Data will be accessed by individuals substantively employed by UCL and individuals with an honorary contract with UCL. The individual(s) will act as an agent of UCL at all times under supervision from employees of UCL. Aside from this/these individuals, access is restricted to employees or agents of UCL who have authorisation from the Principal Investigator.

All personnel accessing the Data have been appropriately trained in data protection and confidentiality.

The Data will be linked at person record level with pseudonymised data obtained from the ICNARC and PICANET databases respectively, which is stored within the UCL Data Safe Haven. The linkage will exclusively be for individuals within the HES Critical Care cohort who also have records within the ICNARC and PICANET data. These linkages will be carried out using pseudonymised Study IDs.
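The linkage described above can be sketched as a lookup on pseudonymised Study IDs, restricted to the HES Critical Care cohort. This is a minimal illustration under assumptions: the function name, the ID format and the flag names are hypothetical, and the real files supplied by NHS England may be structured differently.

```python
def flag_cohort_membership(hes_cc_ids, icnarc_ids, picanet_ids):
    """For each pseudonymised Study ID in the HES Critical Care cohort,
    record whether it also appears in the ICNARC or PICANet cohort files.

    IDs outside the HES Critical Care cohort are ignored, so the linkage
    stays within that cohort.
    """
    icnarc = set(icnarc_ids)
    picanet = set(picanet_ids)
    return {
        study_id: {
            "in_icnarc": study_id in icnarc,
            "in_picanet": study_id in picanet,
        }
        for study_id in hes_cc_ids
    }


# Made-up pseudonymised Study IDs; no real identifiers are involved.
hes_cc = ["S001", "S002", "S003"]
icnarc_file = ["S002", "S999"]
picanet_file = ["S002", "S003"]
flags = flag_cohort_membership(hes_cc, icnarc_file, picanet_file)
```

Because only pseudonymised Study IDs are compared, the sketch never touches names, NHS Numbers or other direct identifiers, which matches the statement that no reidentification is required or attempted.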

There will be no requirement and no attempt to reidentify individuals when using the Data.

Analysts from UCL will process/analyse the Data for the purposes described above.

Project 3 — DARS-NIC-756900-Q2J3K

Opt outs honoured: unknown (Excuses: Does not include the flow of confidential data)

Legal basis: Health and Social Care Act 2012 s261(2)(a)

Purposes: No (Academic)

Sensitive: Sensitive, and Non-Sensitive

When:DSA runs 2025-08 – 2026-08 2025.10 — 2025.11. SMLS reported a DPA serious incident; breached contract — audit report.

Access method: System Access
(System access exclusively means data was not disseminated, but was accessed under supervision on NHS Digital's systems)

Data-controller type: UNIVERSITY COLLEGE LONDON (UCL)

Sublicensing allowed: No

AGD/predecessor discussions: AGD Minutes - 31 July 2025 Final.pdf

Datasets:

Civil Registrations of Death
Hospital Episode Statistics Admitted Patient Care (HES APC)
Hospital Episode Statistics Critical Care (HES Critical Care)

Type of data: Anonymised - ICO Code Compliant (note: this information not disclosed for TRE projects )

Expected Benefits:

This is a national programme, with implementation taking place in relation to multiple healthcare conditions, across different types of hospital organisations based in diverse geographic and socioeconomic contexts. It is likely that implementation will have progressed variably across different clinical and organisational settings. In addition, some settings or specialties (e.g. bariatric surgery) have more well-established multidisciplinary preoperative pathways which aim to optimise comorbidities and reduce the risk of patients undergoing surgery without critical modifiable health issues having been addressed.

Therefore, analysis of this programme offers several opportunities to identify and share learning about implementation approaches and factors influencing uptake and experience of these services.

The use of the data could:
> help the system to better understand the health and care needs of populations.
> lead to the identification or improvement of treatments or interventions, or health and care system design to improve health and care outcomes or experience.
> advance understanding of regional and national trends in health and social care needs.
> advance understanding of the need for, or effectiveness of, preventative health and care measures for particular populations or conditions such as obesity and diabetes.
> inform planning health services and programmes, for example to improve equity of access, experience and outcomes.
> inform decisions on how to effectively allocate and evaluate funding according to health needs.
> provide a mechanism for checking the quality of care. This could include identifying areas of good practice to learn from, or areas of poorer practice which need to be addressed.
> support knowledge creation or exploratory research (and the innovations and developments that might result from that exploratory work).

Subject to the findings of this research, patients are expected to benefit directly from the early screening and health optimisation programme through improved surgical safety, leading to fewer complications and a better recovery. This includes being more physically and psychologically prepared for their operation, experiencing fewer distressing last-minute cancellations, and feeling more informed and supported in their care. Ultimately, successful implementation and positive findings could mean patients spend more days alive and out of the hospital post-surgery, returning home sooner where appropriate, and potentially enjoying longer-term health improvements as a result of pre-surgical optimisation.

It is hoped that this evaluation will provide important evidence about the national programme's effect on patient-centred health outcomes and consider whether the related costs are good value for money, which might be useful for future policy consideration.

To optimise public benefits from this research, the study team plans several actions focused on timely and accessible dissemination of findings. Formative learning and interim findings will be shared on an ongoing basis with participating national and local stakeholders, including NHS bodies, allowing for real-time insights to inform programme improvements. Crucially, Patient and Public Involvement and Engagement (PPIE) members are embedded in the dissemination strategy; they will contribute to creating outputs in clear, accessible language, co-authoring articles and summaries, and potentially speaking at presentations to ensure the lessons learned are relevant and understandable to patients and the wider public.

Outputs:

The expected outputs of the processing will be:
> Manuscript submissions to peer reviewed journals such as the British Journal of Anaesthesia
> Presentations at national conferences hosted by the Association of Anaesthetists of Great Britain and Ireland, the Royal College of Anaesthetists, and the Intensive Care Society

The outputs will not contain NHS England Data and will only contain aggregated information with small numbers suppressed as appropriate in line with the relevant disclosure rules for the dataset(s) from which the information was derived.

The outputs will be communicated to relevant recipients through the following dissemination channels:
> Journals
> Conferences

Outputs are expected to be achieved within a year of the second data drop, which will be requested in a future iteration of this DSA, expected in 2026.

Processing:

No data will flow to NHS England for the purposes of this Data Sharing Agreement (DSA).

NHS England will grant access to the Data via the Secure Data Environment (SDE). The SDE is a secure data and research analysis platform. It allows approved researchers with approved projects access to pseudonymised data and industry-leading analytics tools.

NHS England will provide access to the relevant records from the HES and Deaths datasets to UCL via NHS England Secure Data Environment (SDE). The Data will contain no direct identifying data items. The Data will be pseudonymised and individuals cannot be reidentified through linkage with other data in the possession of the recipient.

SDE users can request exportation of aggregated analysis results (suppressed and summarised according to the NHSE SDE Disclosure Control rules) subject to review and approval by the NHS England SDE Output Checking team. The SDE Output Checking team will ensure that no output contains information which could be used either on its own or in conjunction with other data to breach an individual's privacy.

Users must identify themselves via a multi-factor authentication mechanism and are only able to access the datasets detailed within this DSA. The access and use of the system is fully auditable, and all users must comply with the use of the Data as specified in this DSA.

Users are only authorised to access the Data specified in this DSA and can utilise a variety of analytical tools available within the SDE platform. Users are not permitted to export record-level data from the SDE.

The Data will be stored on servers at NHS England.

The Data will be accessed by authorised personnel via remote access.

The Controller must confirm and provide evidence upon audit by NHS England that access via any remote device complies with the data security obligations within this DSA and the Data Sharing Framework Contract.

For remote access:

- Remote access will only be from secure locations situated within the territory of use (as further restricted elsewhere within the DSA if so done) stated within this DSA;
- Access controls granting users the minimum level of access required are in place;
- Remote access is only via secure connections (e.g., VPNs or secure protocols) to protect data;
- Multifactor authentication (MFA) is required for remote access;
- Device security, including up-to-date software and operating systems, antivirus software, and enabled firewalls are utilised for the remote access;
- All remote access is undertaken within the scope of the organisation's DSPT (or other security arrangements as per this DSA) and complies with the organisation's remote access policy.

The above applies in addition to any condition set out elsewhere within the DSA (e.g. who may carry out processing, and for what purpose).

Remote processing will be from secure locations within England. The data will not leave England at any time.

Access is restricted to employees or honorary contractors of UCL who have authorisation from the lead and senior author of the mixed methods evaluation.

All personnel accessing the Data have been appropriately trained in data protection and confidentiality.

The Data will not be linked with any other data.

Researchers from the NIHR Central London Patient Safety Research Collaboration (CL PSRC) at UCL will analyse the Data for the purposes described above.

Database of UK recipients of pituitary-derived human growth hormone — DARS-NIC-691697-K9L9B

Opt outs honoured: (Excuses: Section 251 NHS Act 2006)

Legal basis: Health and Social Care Act 2012 - s261(5)(d)

Purposes: No (Academic)

Sensitive: Sensitive, and Non-Sensitive

When:DSA runs 2025-08 – 2028-07 SMLS reported a DPA serious incident; breached contract — audit report.

Access method: One-Off

Data-controller type: UNIVERSITY COLLEGE LONDON (UCL)

Sublicensing allowed: No

AGD/predecessor discussions: AGD minutes - 16th January 2025 final.pdf

Datasets:

Civil Registrations of Death
Demographics
Emergency Care Data Set (ECDS)
Hospital Episode Statistics Accident and Emergency (HES A and E)
Hospital Episode Statistics Admitted Patient Care (HES APC)
Hospital Episode Statistics Outpatients (HES OP)

Type of data: Identifiable

Objectives:

University College London (UCL) requires access to NHS England data for the purpose of establishing two databases consisting of UK recipients of pituitary-derived human growth hormone.

The following is a summary of the aims provided by UCL:

Between 1959 and 1985, nearly 2000 individuals in the UK were treated with human growth hormone extracted from a gland in the brain (the pituitary gland) of people who had died. This treatment was called pituitary-derived growth hormone (also known as cadaveric growth hormone). The treatment was given for severe short stature, particularly if caused by growth hormone deficiency. It was given by several injections per week over months or years.

Some people who received this treatment went on to develop a disease called iatrogenic Creutzfeldt-Jakob Disease (CJD). This occurred because some batches of pituitary-derived human growth hormone were contaminated with an abnormal form of one particular protein, called the prion protein, which went on to cause their disease. Recent research suggests that people who received pituitary-derived human growth hormone might also be at risk of a newly described disease, iatrogenic cerebral amyloid angiopathy. This is a disease associated with strokes caused by bleeding in the brain, as well as seizures (or fits) and cognitive changes. UCL believe this disease is caused by transmission of another abnormal protein, called amyloid-beta. We now know that pituitary-derived human growth hormone can also cause iatrogenic Alzheimer's disease, which is also caused by transmission of amyloid-beta. It is possible that proteins other than amyloid-beta might also have been transmitted via pituitary-derived human growth hormone, although there is no evidence that this has occurred in people yet. For these reasons, UCL are interested in monitoring the long-term health of people who were treated with pituitary-derived growth hormone, particularly with regard to neurological conditions caused by abnormal protein aggregates.

In order to better understand whether people who received pituitary-derived human growth hormone are at risk of neurological conditions other than iatrogenic CJD, the study team will create two research databases. Both of these use data from a pre-existing historical database currently held by UKHSA (the UK Health Security Agency) on behalf of DHSC (the Department of Health and Social Care).

1. Surveillance Snapshot Research Database
This database will be used for the purpose of investigating whether people who received pituitary-derived human growth hormone have been affected by diseases caused by iatrogenic protein transmission, other than those caused by the prion protein, by:
i. Quantifying mortality due to neurological conditions (particularly stroke and dementia); comparison of mortality rates in those receiving high-risk preparations (i.e. those in which proteins associated with neurodegeneration have been found, such as Hartree-modified Wilhelmi preparations) vs others; review for dose effect
ii. Review of NHS hospital activity, including admissions, emergency attendances and outpatient appointments, for neurological symptoms and diagnoses; comparison of activity in those receiving high-risk preparations (e.g. Hartree-modified Wilhelmi) vs others; review for dose effect
- Only individuals based at the Medical Research Council Prion Unit at University College London (UCL) will be permitted to access the Surveillance Snapshot database.
- The Surveillance Snapshot database will not be used for any additional purposes, other than the purpose described above.

2. Permission to contact Research Database
Demographics data will be used to create a new database of patients who consent to be contacted for future research studies.
- When the database is established and explicit consent obtained from individuals, external organisations may request access to the database.
- Subject to ethical approval and a Data Sharing Agreement (DSA) being in place between the requesting organisation and UCL, the requestor will be provided contact details (email address, phone number etc) of those individuals who provided explicit consent to be included in the database...
- The contact details shared will have been provided through explicit consent by the individual who gave permission to be added to the Permission to Contact Database. No data sourced from NHS England can be requested or accessed by applying organisations.
- Once the Demographics data has been used for the stated purpose, the Permission to contact database will no longer contain NHS England data. Non-respondents' data will have been deleted and only details provided directly by consented participants will be held in this database.

The following NHS England Data will be accessed:
Hospital Episode Statistics (HES) Admitted Patient Care, Accident & Emergency and Outpatients - necessary for comparison between people who received high-risk growth hormone preparations (i.e. those in which proteins associated with neurodegeneration have been found, such as Hartree-modified Wilhelmi preparations). Also necessary to review for dose effect (i.e. whether increasing number of treatments with high-risk preparations shows an association with neurological hospital episodes), if the number of events allows for this.
Emergency Care Data Set (ECDS) - necessary for comparison between people who received high-risk growth hormone preparations (i.e. those in which proteins associated with neurodegeneration have been found, such as Hartree-modified Wilhelmi preparations). Also necessary to review for dose effect (i.e. whether increasing number of treatments with high-risk preparations shows an association with neurological hospital episodes), if the number of events allows for this.
Civil Registration Deaths - necessary for comparisons between people who received high-risk growth hormone preparations. Also necessary to review for dose effect (i.e. whether increasing number of treatments with high-risk preparations shows an association with neurological mortality), if the number of events allows for this.

The HES/ECDS and Civil Registration Deaths data will be exclusively used in establishing the Surveillance Snapshot database and subsequently investigating the research question described above. The use of HES/ECDS and Civil Registration Deaths data is not permitted as part of establishing the Permission to Contact database.

Demographics - necessary for contact of data subjects via GPs for informed consent to be a part of the "permission to contact" database. This dataset will be used exclusively in establishing the Permission to Contact database, these Data must not be used in establishing the Surveillance Snapshot database.

The level of the Data will be:
Identifiable - necessary to confirm linkage accuracy in order to establish the Permission to contact research database.

The Data will be minimised as follows:

Surveillance Snapshot and Permission to contact database
Limited to a study cohort consisting of all recipients of pituitary-derived human growth hormone between 1959 and 1985.

University College London (UCL) is the controller as the organisation responsible for ensuring that the Data will only be processed for the purpose described above.

The lawful basis for processing personal data under the UK GDPR is:
Article 6(1)(e) - processing is necessary for the performance of a task carried out in the public interest or in the exercise of official authority vested in the controller

The lawful basis for processing special category data under the UK GDPR is:
Article 9(2)(j) - processing is necessary for archiving purposes in the public interest, scientific or historical research purposes or statistical purposes in accordance with Article 89(1) based on Union or Member State law which shall be proportionate to the aim pursued, respect the essence of the right to data protection and provide for suitable and specific measures to safeguard the fundamental rights and the interests of the data subject.

The funding comes from multiple sources. Current funders include:
Alzheimer's Research UK - Funding is in place until January 2026.
Stroke Association - Funding is in place until August 2026.
Funding to continue the work described will be sought on an ongoing basis.

The funders will have no ability to suppress or otherwise limit the publication of findings.

The UK Health Security Agency (UKHSA) is a processor acting under the instructions of UCL. UKHSA is Data Controller for the historical database containing details of recipients of pituitary-derived human growth hormone. UKHSA will extract identifiers for data linkage, and receive linked data from NHS England; they will then add variables of interest (relating to growth hormone administration, match certainty, study ID) and pseudonymise the dataset prior to transfer to UCL.

Amazon Web Services (AWS) is a processor acting under the instructions of UCL solely to host encrypted backup data from the Data Safe Haven.

Data will be accessed by:
PhD students enrolled with UCL. Any student working with the Data held under this Data Sharing Agreement (DSA) must have completed relevant data protection and confidentiality training and is subject to UCL's policies on data protection and confidentiality. Any students accessing the Data will do so under the supervision of a substantive employee of UCL. UCL would be responsible and liable for any work carried out by students. These students would only work on the Data for the purposes described in this DSA.

An individual holding an honorary contract under the supervision of a substantive employee of UCL for the purposes described in this DSA only. UCL must maintain records in a single location that cover the following details of each individual given access under an honorary contract:
o Their substantive employer;
o Their role in respect of the purpose for the processing specified in the DSA;
o The start date and end date of the duration in which the Data will be accessed by the individual under an honorary contract;
o The necessity for the Data to be accessed by the person(s) holding an honorary contract, instead of a substantive employee of an organisation named as controller or a processor in this DSA;
o Confirmation that an appropriate contract is in place which follows the relevant guidance and is countersigned by the substantive employer of the honorary contract holder.

A Public and Patient Involvement and Engagement group helped refine the purpose of the research. The group supported the collection of the data for the purposes described above.

UCL have spoken to different groups of people who might be familiar with certain aspects of this lived experience; UCL are grateful for their help in designing this proposal.

UCL discussed ethical considerations with a focus group, with participants from the CJD Support Network; attendees included a relative of a recipient of pituitary-derived human growth hormone who died of iatrogenic CJD; people at risk of developing CJD (genetic risk); people who had lost family members to CJD.
The Stroke Voices in Research patient and carer group, coordinated by the Stroke Association, reviewed UCL's study materials for accessibility.
UCL received input from a recipient of pituitary-derived human growth hormone on this proposal and how it might be improved.

Expected Benefits:

The potential benefits of this database and subsequent research are:
To confirm whether recipients of pituitary-derived human growth hormone are at risk of iatrogenic CAA, and if they are, to ensure they can be monitored and receive appropriate clinical care, including interventions that aim to reduce their future risk of stroke
If recipients of pituitary-derived human growth hormone are at risk of developing iatrogenic CAA, the team intend to educate and update other clinical providers on this risk, so that at-risk individuals can receive relevant information (if they so wish) and care
To review whether recipients of pituitary-derived human growth hormone are at risk of other neurological diseases caused by iatrogenic protein transmission
To update public health bodies about these potential risks; it might be necessary to institute new public health measures (for example, relating to instrument sterilisation) in order to prevent future cases of disease

A great deal is unknown about how proteins other than the prion protein result in human disease via iatrogenic transmission; this includes knowledge of which proteins are able to do this in people and what the possible routes of iatrogenic exposure are. Our projects will provide knowledge of the risks associated with pituitary-derived growth hormone, a treatment which is known to have resulted in iatrogenic CJD previously. The Surveillance Snapshot research database will establish whether people who received this treatment are at increased risk of neurological disorders compared to the general population, and this information will be used to communicate risk to recipients. In some cases, there might be an opportunity to intervene; for example, if there is evidence of a higher incidence of early onset stroke due to CAA in recipients, the team might approach as yet asymptomatic individuals with advice on blood pressure management and avoidance of medications that increase bleeding risk, in order to reduce their future stroke risk.

Future research involving people in this cohort will be possible because of the Permission to contact research database, should people in this database choose to consent to this later work. Our planned studies include biomarker studies to look for protein transmission, which might be subclinical; this will provide information on which proteins might have been transmitted in people via pituitary-derived growth hormone. For those who are symptomatic with neurological conditions, the team would offer detailed evaluations via our linked NHS services, as well as the option to participate in research studies to better characterise these newly described conditions and their natural history.

Cases of iatrogenic cerebral amyloid angiopathy and iatrogenic Alzheimer's disease, caused by transmission of amyloid-beta protein, have now been reported. The team do not know the impact that this will have in the UK, in terms of how many people are likely to have been exposed to amyloid-beta in the course of historical treatment, or what these causative treatments are. This work will help in specifically quantifying the risk associated with pituitary-derived growth hormone, a treatment which is known to have resulted in iatrogenic CJD previously. People who have received pituitary-derived growth hormone might be asked to take certain measures in the future to prevent further onward transmission. These projects will also contribute to a broader programme of work with additional public health impacts, including those relating to instrument sterilisation and whether current strategies are sufficient to prevent onward amyloid-beta transmission.

The team hope that through publication of findings in appropriate media, the findings of this research will add to the body of evidence that is considered by the bodies, organisations and individual care practitioners charged with making policy decisions for or within the NHS or treatment decisions in relation to specific patients.

Our results are likely to be of interest to the public, and the team will continue to work closely with UCL Media Relations to encourage appropriate reporting of our results. The team will use focused workshops to communicate with policy-makers, governmental advisory bodies and public health officials, and engage with these stakeholders through our annual open day, which is also attended by patients and carers, charities and MPs. The team also plan to share results with patients and the public on social media, using short videos, illustrations, animations and other communication strategies to maximise accessibility and reach. This work is funded by charities (Alzheimer's Research UK and the Stroke Association) who will also support dissemination of this work to a wide audience.

Outputs:

The expected outputs of the processing will be:
Submissions to peer reviewed journals; anticipated 1 or 2 articles, submitted 12 to 18 months following data receipt by UCL
Presentations at patient and public engagement events, attended by a wide range of stakeholders which might include: recipients of pituitary-derived human growth hormone, their families and carers; general public; representatives from government departments or bodies; relevant charities and funders
Presentations at appropriate academic conferences.
A database to be utilised as a resource for health research

The outputs will not contain NHS England Data and will only contain aggregated information with small numbers suppressed as appropriate in line with the relevant disclosure rules for the dataset(s) from which the information was derived.
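As an illustration of small-number suppression in aggregated outputs, the following is a minimal sketch only. The source does not state the disclosure threshold or the exact rule for any dataset, so the `threshold` parameter, the function name and the example counts are all hypothetical; the applicable disclosure rules for each dataset take precedence.

```python
from typing import Dict


def suppress_small_counts(counts: Dict[str, int], threshold: int) -> Dict[str, str]:
    """Replace any count below `threshold` with a masked label.

    `threshold` is a hypothetical parameter: the real value comes from the
    disclosure rules of the dataset the counts were derived from.
    """
    masked = {}
    for category, count in counts.items():
        if count < threshold:
            # Small number: report only that it is below the threshold.
            masked[category] = "<" + str(threshold)
        else:
            masked[category] = str(count)
    return masked


# Illustrative, made-up aggregate counts (not real data).
example_counts = {"group_a": 42, "group_b": 3}
example_output = suppress_small_counts(example_counts, threshold=5)
```

A real output pipeline would typically also consider secondary suppression (so that a masked cell cannot be recovered from row or column totals), which this sketch does not attempt.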

The outputs will be communicated to relevant recipients through the following dissemination channels:
Journals
Workshops involving relevant stakeholders, including governmental officials and representatives (e.g. DHSC, UKHSA), advisory committees, charities and other research funders
Social media
Public events hosted by the MRC Prion Unit at a UCL open day, which will be attended by a wide range of stakeholders, including patients and carers, charities, researchers, MPs and public health specialists.
Posters displayed at university research events and conferences
Press/media engagement
Reports aimed at recipients of pituitary-derived human growth hormone, their families and carers

UCL estimate that outputs will be produced 12 to 18 months after data receipt by UCL, with dissemination immediately thereafter (likely over a further 12 months).

Processing:

UKHSA will transfer data to NHS England. The data will consist of identifying details (specifically NHS Number, Date of Birth, and name) for the cohort to be linked with NHS England data.

NHS England will provide the relevant records from the HES, ECDS, Civil Registration Deaths and Demographics datasets to UKHSA. The Data will contain directly identifying data items including Names, NHS Number, Date of Birth, Postcode, Gender which are required to link to growth hormone data held by UKHSA, in order to establish the Permission to contact database.

The Surveillance Snapshot database will:
contain no direct identifying data items but will contain a unique person ID which can be used to link the Data with other record level data already held by the recipient

UKHSA will extract a further subset of the Data (details of growth hormone administration including indication for administration, dates of treatment, batches administered; match flag) to generate a pseudonymised dataset, and will securely transfer this to UCL to perform analysis within the Surveillance Snapshot database.

For the surveillance snapshot database, all analyses will use the pseudonymised dataset. There will be no requirement and no attempt to reidentify individuals when using the pseudonymised dataset.

The permission to contact database, which includes identifying details, will be stored in a separate database to the surveillance snapshot dataset. This will be used for the purpose of contact only. The identifiable data will be used to initiate contact with data subjects via their GP, to request their consent to be in the database. Data subjects will be removed from the Permission to Contact database if a) they opt out of being in the database, or b) contact is attempted three times and the data subject does not respond.
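The removal rule for the Permission to Contact database can be sketched as follows. This is a hedged illustration, not the team's implementation: the record structure and field names (`ContactRecord`, `opted_out`, `contact_attempts`, `responded`) are hypothetical, and only the two removal conditions themselves come from the text above.

```python
from dataclasses import dataclass
from typing import List

# From the text: a data subject is removed after three unanswered contact attempts.
MAX_CONTACT_ATTEMPTS = 3


@dataclass
class ContactRecord:
    # Hypothetical fields, for illustration only.
    person_id: str
    opted_out: bool = False
    contact_attempts: int = 0
    responded: bool = False


def should_remove(record: ContactRecord) -> bool:
    """Return True if the record meets either removal condition."""
    if record.opted_out:
        # Condition a): the data subject opted out.
        return True
    # Condition b): three attempts made and no response received.
    return record.contact_attempts >= MAX_CONTACT_ATTEMPTS and not record.responded


def retain_records(records: List[ContactRecord]) -> List[ContactRecord]:
    """Keep only the records that do not meet a removal condition."""
    return [r for r in records if not should_remove(r)]


# Made-up example records.
example_records = [
    ContactRecord("p1", opted_out=True),
    ContactRecord("p2", contact_attempts=3, responded=False),
    ContactRecord("p3", contact_attempts=2, responded=False),
    ContactRecord("p4", contact_attempts=3, responded=True),
]
retained_ids = [r.person_id for r in retain_records(example_records)]
```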

The Data will be stored on servers at University College London (UCL).

UCL stores Data on the Cloud provided by Amazon Web Services.

Data will be accessed onsite at UCL premises, and by authorised personnel via remote access.

UCL will confirm and provide evidence upon audit by NHS England that access via any remote device complies with the data security obligations within this DSA and the Data Sharing Framework Contract.

For remote access:
Remote access will only be from secure locations situated within the territory of use (as further restricted elsewhere within the DSA if so done) stated within this DSA;
Access controls granting users the minimum level of access required are in place;
Remote access is only via secure connections (e.g., VPNs or secure protocols) to protect data;
Multifactor authentication (MFA) is required for remote access;
Device security, including up-to-date software and operating systems, antivirus software, and enabled firewalls are utilised for the remote access;
All remote access is undertaken within the scope of the organisation's DSPT (or other security arrangements as per this DSA) and complies with the organisation's remote access policy.

The above applies in addition to any condition set out elsewhere within the DSA (e.g. who may carry out processing, and for what purpose).

The Data will not leave England at any time.

Individuals with an honorary contract with UCL who access the Data will act as agents of UCL at all times, under supervision from employees of UCL. Aside from these individuals, access is restricted to employees or agents of UCL.

All personnel accessing the Data have been appropriately trained in data protection and confidentiality.

The NHS England Data will be linked at person record level with growth hormone data held by UKHSA on recipients of pituitary-derived human growth hormone. This linkage will be performed by UKHSA.

The Permission to Contact database will be linked to growth hormone data, to allocate priority for which data subjects are contacted first (via their GP).

Apart from this linkage, the data will not be linked with any other data.

Researchers from the MRC Prion Unit at UCL will analyse the pseudonymised data within the Surveillance Snapshot database for the purposes described above.

Researchers from the MRC Prion Unit at UCL will process the identifiable Demographics data in establishing the Permission to Contact database for the purposes described above.

MRC National Survey of Health and Development (NSHD) - tracing — DARS-NIC-774097-J9J0C

Opt outs honoured: (Excuses: Section 251 NHS Act 2006)

Legal basis: Health and Social Care Act 2012 - s261(2)(d)

Purposes: No (Academic)

Sensitive: Sensitive

When:DSA runs 2025-03 – 2028-03 SMLS reported a DPA serious incident; breached contract — audit report.

Access method: Ongoing

Data-controller type: UNIVERSITY COLLEGE LONDON (UCL)

Sublicensing allowed: No

AGD/predecessor discussions: AGD minutes - 6th March 2025 final.pdf

Datasets:

Demographics

Type of data: Identifiable

Objectives:

University College London (UCL) requires access to NHS England Data for the purpose of the following research project:
Medical Research Council (MRC) National Survey of Health and Development (NSHD) - also known as the 1946 British Birth Cohort

The following is a summary of the aims of the research project provided by UCL:
The MRC NSHD is the oldest and longest running of the British birth cohort studies. From an initial maternity survey of 13,687 (82%) of all births recorded in England, Scotland and Wales during one week of March, 1946, a socially stratified sample of 5,362 singleton babies born to married parents was selected for follow-up.

Over the years, the NSHD's findings have made an important contribution to society by influencing government policy. Some highlights are given below.

The study's first direct policy impact was a private member's bill (the Analgesia in Childbirth bill), which was introduced in the House of Commons in 1949. This increased training for midwives to give gas and air analgesia to all mothers during childbirth. It was a response to the maternity survey's finding that only one in five midwives was qualified to administer gas and air, and just 20 per cent of mothers in the survey had received any kind of pain relief during labour.

Other policy investigations that directly used study findings included:
The Platt Committee (The welfare of children in hospital, 1959)
The Plowden Committee (Children and their primary schools, 1967)
The Finer Committee (Report of the committee on one parent families, 1974)
The Acheson Committee (Independent inquiry into inequalities in health, 1998)
The Marmot Review (Fair society, healthy lives, 2010)

The study's findings have also had an indirect impact on policy by influencing popular thinking. Evidence for this is described in Expected Measurable Benefits to Health and/or Social Care.

The NSHD study team has collected unique lifetime data on body size and maturation, cognitive and physical function, socioeconomic status and diet; and has repeat adult data on diet, smoking, physical activity, blood pressure and lung function. The most intensive data collection to the whole cohort in 2006-2010, when study members were aged 60-64 years, included measurement of cardiac structure and function, body composition and bone density.

The most recent data collection (2022-24) to the whole sample included postal questionnaires, a home visit by a trained research nurse for interview and assessment in 2022 and/or a London clinic visit in 2022/23. At this follow-up, all eligible UK study members were contacted: 2,162 study members gave information, of which 1,918 study members completed a postal questionnaire, 1,812 study members completed a cost of living questionnaire and 1,095 study members completed a home visit.

Each year, the Lifelong Health and Ageing (LHA) unit at University College London (UCL) sends a postal mailing to all NSHD participants. Participants are asked to complete and return a reply slip, which allows them to provide LHA with any change in their details (e.g. a new email address or phone number). LHA also ask them to return the reply slip even if none of their details have changed, and in doing so seek positive confirmation that the address LHA hold for them is correct. As a result, LHA can maintain the cohort's latest details on the NSHD database. If the annual mailing does not reach the participant, it is returned to LHA as a 'return to sender'. LHA will attempt to trace all these returns, but if LHA cannot locate the participants then they are flagged on the database as 'lost'.

NHS England may hold a more recent address and so provide LHA with an opportunity to invite cohort members who have been lost to follow-up to re-join the study. The LHA at UCL would therefore like access to NHS England Data to provide address details in order to re-contact study members who have been lost to follow-up. This will avoid potential bias in the results due to loss of participants to follow-up. Obtaining address details will enable UCL, where possible, to re-contact the study members who had not previously refused and who had been lost to follow-up, to invite them to continue participating in the study, as well as inviting them to consent to future participation.

The following NHS England Data will be accessed:
Demographics - necessary because the study would like to maintain contact with as large a number of study participants as possible.

The level of the Data will be:
Identifiable - necessary so UCL can maintain contact with as large a number of study members as possible. LHA at UCL require updated addresses for study members.

The Data will be minimised as follows:
- Limited to the NSHD study cohort identified by UCL of those who have been lost to follow-up (approx. 200-350 of the study cohort). UCL will exclude individuals whose deaths have been reported or who have withdrawn from the study.

University College London (UCL) is the research sponsor and controller as the organisation ensuring the data will only be processed for the purpose described above.

The lawful basis for processing personal data under the UK GDPR is Article 6(1)(e) - processing is necessary for the performance of a task carried out in the public interest or in the exercise of official authority vested in the controller;

The lawful basis for processing special category data under UK GDPR is Article 9(2)(j) - processing is necessary for archiving purposes in the public interest, scientific or historical research purposes or statistical purposes in accordance with Article 89(1) based on Union or Member State law which shall be proportionate to the aim pursued, respect the essence of the right to data protection and provide for suitable and specific measures to safeguard the fundamental rights and the interests of the data subject.

This processing is in the public interest because it adheres to the UK Policy Framework for Health and Social Care Research, which protects and promotes the interests of patients, service users and the public, and aims to produce generalisable and publicly available information to inform future decisions over patients' treatments or care.

The funding is provided by the MRC. The funding is specifically for the NSHD, 1946 Birth Cohort. Funding is in place until 31st March 2029. The funder will have no ability to suppress or otherwise limit the publication of findings.

Amazon Web Services provides cloud hosting services to UCL and will store the Data as contracted by UCL.

VIRTUS Data Centre provides offsite back-up services but does not process the data.

Formara Limited provides a mail out service and will process the Data as contracted by UCL.

All those accessing the data supplied by NHS England are substantive employees of University College London or authorised employees of Formara Limited carrying out work on behalf of UCL.

A public involvement, engagement and participation (PPIE) group was formed in 2021; its members have provided feedback on, and helped shape, documentation explaining how data are used. Further work is ongoing with NSHD participants to enable co-creation of questionnaires and other areas of interest.

Expected Benefits:

The NSHD has informed UK health care, education and social policy for 70 years and is the oldest and longest running of the British birth cohort studies. Today, with study members in their early seventies, the NSHD offers a unique opportunity to explore the long-term biological and social processes of ageing and how ageing is affected by factors acting across the whole of life.

The information provided by study members provides valuable evidence for the research and policy community about the cohort's transitions from birth to older age.

A specific benefit of the data dissemination under this agreement is being able to trace study members; using this data ensures the study sample is maintained and the study remains representative of the studied population.

Below are some examples of existing publications using the NSHD data benefiting public health:
The NSHD finding that more rapid rises in systolic blood pressure during midlife (even if not crossing into hypertension) were related to poorer cardiac structure (published in the European Heart Journal in 2014) has implications for treatment guidelines as it suggests that identification and treatment of people with rapidly increasing SBP, even if they are not reaching the criteria for hypertension, may be beneficial in preventing subsequent cardiovascular disease.
The NSHD findings suggesting that those who lost weight at any age during adulthood, even if weight was regained later, had better cardiovascular risk profiles than those who remained overweight or obese support public health strategies that help individuals to lose weight at all ages.
The NSHD finding that better performance in tests of physical capability (i.e. grip strength, chair rising and standing balance) in midlife was linked to higher survival rates over 13 years of follow up was published in the British Medical Journal. This highlighted the value of these simple objective physical tests in helping to identify those people who from at least as early as midlife onwards may require more support than others to achieve a long and healthy life.

Due to their age and the pace at which interventions are implemented, the NSHD study participants are unlikely to directly benefit from this research, but through UCL's dissemination strategy, it is hoped that the research will inform policy and benefit successive generations. Through UCL's annual newsletter, study participants receive information on how their data is used, which can encourage continued participation in the study.

Outputs:

The addresses supplied by NHS England under this agreement will help ensure the NSHD have contact details for the cohort which is lost to follow-up, and can re-contact them to determine whether they would like to continue to take part in the study which would not otherwise have been possible. This will reduce the potential impact of bias attributed to loss to follow-up and will ensure the study sample is maintained and the study remains representative of the studied population.

The output will be continued participation in the NSHD, for those participants who would not otherwise have participated due to loss to follow-up.

NSHD will continue to produce a range of research outputs. This research has already generated multiple publications in peer review journals and findings are further disseminated via conference presentations.

A full list of publications produced to date is published on the MRC LHA website at: http://www.nshd.mrc.ac.uk/.

Publications and conference attendances target an audience of researchers and scientists. Typical conferences attended annually by UCL researchers include the Alzheimer's Association International Conference (AAIC), CLOSER (various dates throughout the year) and the Society for Social Medicine & Population Health (SSM).

UCL also participate in the annual MRC Festival of Medical Research to disseminate at a lay/population level. Members of the team regularly engage with policy makers and health professionals to influence policy; a recent example is senior members of the team acting as advisors to the House of Lords Science and Technology Select Committee inquiry into Ageing, Science, Technology and Healthy Living.

The NSHD study website is currently being updated to incorporate a section summarising research findings and their implications for participants and researchers and will also incorporate a current study news section for both participants and researchers (with ongoing updates). High quality research outputs are always needed to inform health and social care policy.

A full list of publications can be found at http://www.nshd.mrc.ac.uk/findings/

Results of the NSHD will also be disseminated to the study members through their annual birthday/newsletter and the NSHD website (www.nshd.mrc.ac.uk).

Outputs are planned on a rolling basis.

The outputs will not contain NHS England Data provided under this Data Sharing Agreement.

Processing:

UCL is contracting Formara Limited to send mailings to NSHD participants.

UCL will share the cohort members' personal information (name, address) with Formara Limited; this may include some of the addresses provided by NHS England to UCL. UCL will use the Data Safe Haven Secure Mail function to provide Formara Limited with an encrypted file. Only Data Safe Haven account holders with Outbound rights are able to use the Secure Mail feature. The Secure Mail feature allows UCL to send messages and files as secure "packages". Packages are secured using a system-generated password which UCL should communicate to the recipient by a method other than email. Recipients will get an email with a unique link to each package, allowing them to download the message and files through a secure connection. Formara Limited will delete all information after the activity has been completed.

UCL will transfer data to NHS England. The data will consist of identifying details (specifically NHS Number, Date of Birth, Full Name and Study ID) for the cohort to be linked with NHS England Data.

The file supplied will only contain study members who have consented to be part of the NSHD and have been lost to follow-up. It will not include study members known to have died or to have withdrawn from the study.

NHS England will provide the relevant records from the Demographics dataset to UCL. The Data will contain directly identifying data items including NHS number, name, address, postcode and study ID which are required to update the NSHD database with the correct address and facilitate contact with participants.

The 'Reason for removal date' is compared with the address confirmation date UCL holds on their database. If UCL hold an address which has been confirmed more recently than the reason for removal date they receive from NHS England, they will retain the address in their database; if the reason for removal date from NHS England is later than the address confirmation date they hold, they will update their records. On receipt of Data from NHS England, UCL will remove details of deceased cohort members before providing information to Formara Limited.
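The date comparison described above can be sketched as a small decision function. This is an illustrative sketch under stated assumptions: the function and parameter names are hypothetical, and the source does not say what happens when the two dates are equal, so retaining the held address in that case is an assumption.

```python
from datetime import date


def choose_address(
    held_address: str,
    held_confirmed_on: date,
    nhs_address: str,
    nhs_reason_for_removal_date: date,
) -> str:
    """Pick which address to keep for a cohort member.

    All names here are hypothetical. Rule from the text: update only when the
    NHS England 'reason for removal date' is later than the date on which the
    held address was last confirmed.
    """
    if nhs_reason_for_removal_date > held_confirmed_on:
        # The NHS England record is more recent: update to its address.
        return nhs_address
    # Held address was confirmed more recently (or on the same day, which is
    # an assumption): retain it.
    return held_address


# Made-up example values.
kept = choose_address(
    held_address="1 Old Road",
    held_confirmed_on=date(2023, 6, 1),
    nhs_address="2 New Street",
    nhs_reason_for_removal_date=date(2022, 1, 15),
)
updated = choose_address(
    held_address="1 Old Road",
    held_confirmed_on=date(2021, 6, 1),
    nhs_address="2 New Street",
    nhs_reason_for_removal_date=date(2022, 1, 15),
)
```

Removal of deceased cohort members before the mailing file is prepared would be a separate filtering step and is not shown here.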

UCL require identifiers in the circumstance that any discrepancies arise. This is because there are instances where UCL receive identifying details that do not accurately match the cohort member. Using the identifying details UCL will be able to assess whether the details they hold correctly associate with the participant.

The data will be stored on servers at UCL and Formara Limited.

UCL uses offsite back-up services provided by VIRTUS Data Centres.

UCL stores data on the Cloud provided by Amazon Web Services.

The data file supplied by NHS England will be processed within LHA at UCL and entered into LHA's secure confidential address database.

Access is restricted to employees of UCL and Formara Limited.

Formara Limited will use the Data to send mailings to participants on behalf of UCL.

All personnel accessing the data have been appropriately trained in data protection and confidentiality.

Data will be accessed by authorised personnel via remote access.

The Controller(s) must confirm and provide evidence upon audit by NHS England that access via any remote device complies with the data security obligations within this DSA and the Data Sharing Framework Contract.

For remote access:
- Remote access will only be from secure locations situated within the territory of use (as further restricted elsewhere within the DSA if so done) stated within this DSA;
- Access controls granting users the minimum level of access required are in place;
- Remote access is only via secure connections (e.g., VPNs or secure protocols) to protect data;
- Multifactor authentication (MFA) is required for remote access;
- Device security, including up-to-date software and operating systems, antivirus software, and enabled firewalls are utilised for the remote access;
- All remote access is undertaken within the scope of the organisation's DSPT (or other security arrangements as per this DSA) and complies with the organisation's remote access policy.

The above applies in addition to any condition set out elsewhere within the DSA (e.g. who may carry out processing, and for what purpose).

Remote processing will be from secure locations within the UK. The data will not leave the UK at any time.

Personnel are prohibited from downloading or copying data to local devices.

UCL staff using the DSH complete annual training and regularly review data access arrangements, ensuring access to data is limited to those authorised to have it.

The data will not be linked with any other data.

Vivaldi Social Care — DARS-NIC-769062-G5F1K

Opt outs honoured: Yes (Excuses: Section 251 NHS Act 2006)

Legal basis: Health and Social Care Act 2012 s261(2)(a)

Purposes: No (Academic)

Sensitive: Non-Sensitive, and Sensitive

When:DSA runs 2025-01 – 2026-01 2025.04 — 2025.11. SMLS reported a DPA serious incident; breached contract — audit report.

Access method: Ongoing

Data-controller type: CARE ENGLAND, THE OUTSTANDING SOCIETY COMMUNITY INTEREST COMPANY, UNIVERSITY COLLEGE LONDON (UCL)

Sublicensing allowed: No

AGD/predecessor discussions: AGD minutes - 23rd January 2025 final.pdf, AGD minutes - 21st November 2024 final.pdf

Datasets:

Civil Registrations of Death
Emergency Care Data Set (ECDS)
Hospital Episode Statistics Admitted Patient Care (HES APC)
Medicines dispensed in Primary Care (NHSBSA data)
Vivaldi Care Home Dataset

Type of data: Anonymised - ICO Code Compliant

Objectives:

University College London (UCL), Care England and The Outstanding Society Community Interest Company (TOSCIC) require access to NHS England data for the purpose of the following research project:
Vivaldi Social Care

The following is a summary of the aims of the research project provided by UCL, Care England and TOSCIC:

Before COVID-19 there was very little information on disease burden, health care utilisation and clinical outcomes in care home residents because there was no reliable method to identify residents in routine data, and no systems to collect data specifically for this population. During the pandemic, regular testing for COVID-19 in care home staff and residents created a registry of the care home population (because test results shared with NHSE were labelled with individuals' NHS numbers and their care home identifier). This information was linked to routine datasets within NHS Foundry in the VIVALDI study, providing accurate and timely estimates of COVID-19 infections, related outcomes, immunity and vaccine effectiveness in care home residents and staff. This evidence was critical to the public health response to COVID-19. It also showed that it is feasible to rapidly generate research / surveillance in care homes to inform policy, by working in partnership with providers.

There is enormous scope to re-purpose this model to generate evidence on how to improve outcomes for care home residents and streamline interactions between the NHS and social care. A natural next step would be to address other leading causes of infection and outbreaks in care homes such as influenza or norovirus, which cause substantial morbidity and mortality, lead to care home closures and drive NHS winter pressures every year. Now that regular testing for COVID-19 has stopped, this requires a new approach to generate a care home registry.

The aim of the study is to pilot a system of surveillance for infection and antimicrobial resistance in care homes for older adults, and to demonstrate its capacity to deliver as a trial infrastructure for public health research. UCL, Care England and TOSCIC will also demonstrate the steps that would be required to embed a long-term continuous study, with the goal to inform a permanent programme of care home research, surveillance and quality improvement.

Objectives:

1. To regularly ingest data (NHS numbers, care home identifier, calendar date) from residents in 500-1500 care homes into NHSE, in order to link this information to routine datasets (hospital admissions, vaccinations, deaths, laboratory results, prescriptions). These datasets are already held by NHSE, with the exception of laboratory test results (Second Generation Surveillance System, SGSS) which are held by the UKHSA. We are exploring whether it is possible for an extract of SGSS to be shared with NHSE for use in this project but this is not critical to project delivery.

2. To develop and produce outputs (reports, dashboards) for care providers, policymakers and the public which summarise the impact of priority infections in care homes to inform quality improvement and public health activities, and research prioritisation

3. To establish a research database that researchers can use to deliver observational studies on infection and AMR, and to explore use of the platform to enable interventional research studies e.g. cluster randomised controlled trials (subject to additional approvals, not included as part of this DSA)

4. To explore use of the platform to deliver near real-time surveillance for priority infections e.g. influenza, norovirus, COVID-19

5. To build capacity in public health surveillance, QI and research in care homes

The following NHS England Data will be accessed:
> Hospital Episode Statistics Admitted Patient Care (HES APC) necessary:
a. To measure rates of hospital admission for specific infections and infection syndromes (e.g. urinary tract infections, blood stream infections, respiratory infections). This information is not currently available for care home residents. It will support quality improvement and policy to reduce the burden and impact of infection in care home residents.
b. To measure overall rates of hospital admission in an accurate, well defined cohort of care home residents. This information will improve our understanding of the care home population and their interactions with the NHS relevant to policymakers, care home residents and their families.
c. To use data on prior hospital admissions to infer levels of comorbidity in the care home population. This is essential when trying to make comparisons between care homes.
> Emergency Care Data Set (ECDS) necessary:
a. To measure rates of A&E attendances for specific infections and infection syndromes (e.g. urinary tract infections, blood stream infections, respiratory infections). This information is not currently available for care home residents. It will support quality improvement and policy to reduce the burden and impact of infection in care home residents, particularly in relation to avoidable attendances at A&E and subsequent hospital admissions
b. To measure overall rates of A&E attendances in an accurate, well defined cohort of care home residents. This information will improve our understanding of the care home population and their interactions with the NHS relevant to policymakers, care home residents and their families.
> Civil Registration Mortality necessary:
a. To measure overall rates of death and causes of death in a well-defined cohort of care home residents. We currently lack accurate information on these outcomes because there is no national registry of who lives in a care home. This information is likely to be of relevance to policymakers, care home residents and their families.
b. To estimate the burden of infection-related death in care home residents relevant to policymakers, care home residents and their families.
> Medicines dispensed in Primary Care (NHSBSA data) - necessary:
To estimate rates of antibiotic usage in care home residents, and how this varies between care homes. We currently lack reliable estimates of antibiotic usage in this population, which undermines efforts to tackle the problem of antibiotic resistance in care home residents (residents have higher rates of AMR compared to the general population of comparable age). This information will help policymakers and providers understand which types of interventions are likely to have the biggest impact on antibiotic prescribing and AMR in specific care homes.

Antibiotic overuse drives antibiotic resistance. NHSBSA data is requested to help address one of the key aims of the Government's National AMR action plan, which is to safely reduce and optimise antimicrobial use to reduce the risk of AMR; reducing inappropriate and unnecessary exposure to antibiotics is consequently a patient safety issue. The dataset established in this project will also be used to investigate the effectiveness of different types, doses and durations of antibiotic therapy in care home residents.

The level of the Data will be:
> Pseudonymised

The Data will be minimised as follows:
> Limited to a study cohort identified by Person Centred Software, Nourish, Camascope*: approximately 15,000-45,000 residents of 500-1500 care homes for older adults in England. Care homes in all regions are eligible to participate, provided they are using Digital Care records provided by one of the software suppliers that are partnering on the project (Person Centred Software, Nourish, Camascope).
> Limited to data between 2021 and the latest available; 3 years of historic data is required as the average length of stay in a care home is around 2.5 years.
> A derivation of date of death will be supplied in Month/Year format

*These software vendors supply digital care records to participating care homes.
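
The month/year derivation of date of death can be illustrated with a minimal sketch. The function name and the MM/YYYY layout are assumptions made for illustration; the DSA does not state the exact layout NHS England uses.

```python
from datetime import date


def to_month_year(d: date) -> str:
    """Drop the day component, keeping month and year only.

    The MM/YYYY layout is an assumed format, not one taken from the DSA.
    """
    return f"{d.month:02d}/{d.year:04d}"


# A made-up date, used only to show the coarsening.
print(to_month_year(date(2023, 3, 17)))
```

Coarsening the date this way means records for people who died in the same month cannot be told apart by day of death.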

The lawful basis for processing personal data under the UK GDPR is:
For UCL, Care England and TOSCIC:
Article 6(1)(e) - processing is necessary for the performance of a task carried out in the public interest or in the exercise of official authority vested in the controller;

The lawful basis for processing special category data under the UK GDPR is:
For the dashboard summarising the care home population and the burden of infection in care home residents purpose:
Article 9(2)(i) - processing is necessary for reasons of public interest in the area of public health, such as protecting against serious cross-border threats to health or ensuring high standards of quality and safety of health care and of medicinal products or medical devices, on the basis of Union or Member State law which provides for suitable and specific measures to safeguard the rights and freedoms of the data subject, in particular professional secrecy.

For the research studies undertaken using the Vivaldi Social Care database purpose:
Article 9(2)(j) - processing is necessary for archiving purposes in the public interest, scientific or historical research purposes or statistical purposes in accordance with Article 89(1) based on Union or Member State law which shall be proportionate to the aim pursued, respect the essence of the right to data protection and provide for suitable and specific measures to safeguard the fundamental rights and the interests of the data subject.

The funding comes from multiple sources. Current funders include:
> The UK Health Security Agency (UKHSA)
> National Institute for Health and Care Research (NIHR)
Funding to continue the work described will be sought on an ongoing basis.

The funders will have no ability to suppress or otherwise limit the publication of findings.

Quantaim Limited is a processor acting under the instructions of UCL. Quantaim Limited's role is limited to curating and managing the Vivaldi Social Care database in the UCL DSH.

Quantaim Limited are listed as a party in a data processor agreement between UCL (acting on behalf of the joint data controllers), NHSE and Quantaim Limited.

Amazon Web Services (AWS) provides IT hosting services to UCL and will store the Data as contracted by UCL. AWS's role is limited to secure backup of data stored in UCL's Data Safe Haven.

UCL uses offsite data centre services provided by VIRTUS data centre. VIRTUS does not have access to the data.

Data will be accessed by:
Substantive employees of UCL and Quantaim Limited
Individuals holding an honorary contract under the supervision of a substantive employee of UCL for the purposes described in this DSA only. UCL must maintain records in a single location that cover the following details of each individual given access under an honorary contract:
o Their substantive employer;
o Their role in respect of the purpose for the processing specified in the DSA;
o The start date and end date of the duration in which the Data will be accessed by the individual under an honorary contract;
o The necessity for the Data to be accessed by the person(s) holding an honorary contract, instead of a substantive employee of an organisation named as controller or a processor in this DSA;
o Confirmation that an appropriate contract is in place which follows the relevant guidance and is countersigned by the substantive employer of the honorary contract holder.

Since its inception in May 2020, the Vivaldi Social Care study has evolved through collaboration with groups like TOSCIC and Rights for Residents to better understand the care sector and the complexities of using routinely collected data. The study aims to ensure residents, including those lacking capacity, can participate while providing clear opportunities to opt out. This opt-out model, developed in consultation with care home stakeholders, balances inclusivity with ethical data usage. Engagement efforts have included task groups, working groups, and dissemination events, culminating in a co-produced governance model involving UCL, Care England, and TOSCIC as joint data controllers.

The study has actively sought resident and relative input, holding targeted events and care home visits to refine materials and address concerns. Feedback has been overwhelmingly supportive, with residents recognising the study's importance in improving care quality, particularly in light of challenges faced during the COVID-19 pandemic. Rights for Residents plays a vital role, advocating for transparency, data protection, and resident representation. The project is overseen by the Adult Social Care Engagement Collective (ASCEC) and remains committed to collaboration, aiming to share findings widely across the care sector. ASCEC comprises 30-40 members of the public, care home relatives, care providers, care home staff and charities. UCL, Care England and TOSCIC will work with ASCEC to find effective ways to share the research findings with the care sector. This builds on Vivaldi's existing experience of working collaboratively with care home residents, relatives, staff and providers, which is explored on their website.

Expected Benefits:

The findings of this research study are expected to yield benefits for a wide range of groups, particularly:

> Care home residents: improved quality of life due to fewer infections and fewer infection-related hospital admissions
> Families / relatives: Improved quality of life as fewer care home closures due to outbreaks, meaning relatives can consistently visit their loved ones.
> Care home staff: reduced infection-related sickness absence, improving wellbeing and income
> Care providers: Financial benefits. Outbreaks close care homes to new admissions which means loss of income. Outbreaks / sickness also mean providers have to employ agency (temporary) staff who are expensive. Better evidence on how to prevent infections will also improve care quality for residents.
> Commissioners and local authorities: Dashboards will help to identify care homes with poorer outcomes, supporting targeted quality improvement and better commissioning decisions
> Integrated Care Boards: Better data on infections will help ICBs target their infection prevention and control activities
> Public: Better data helps people make informed decisions when selecting a care home
> NHS: Better evidence and new ways to reduce infections and outbreaks will benefit the NHS by reducing winter pressures due to care home outbreaks e.g. flu, norovirus
> Regulator (CQC): longer term, better data could be used as part of CQC inspections

Overall, the outputs of this programme are expected to benefit residents (improved physical and mental health and wellbeing), policymakers (e.g. evidence to reduce NHS winter pressures and inform the AMR national action plan), patients and the public (improved delivery of health and social care services).

This will be achieved by:
1) generating reliable estimates of the burden of infection and AMR in residents to inform prioritisation of activities by NHSE and UKHSA,
2) identifying variation in practice to inform local interventions by providers, UKHSA and NHSE, and
3) enabling public health research to generate new strategies / policies to prevent/reduce infection in care homes.

Outputs:

The expected outputs of the processing will be:
> Dashboard(s) for policymakers, care home providers and the public. These will summarise key features of the care home population and will include care home level data on the burden of infection and AMR in residents. This information will help providers and policymakers at local, regional and national level understand variation in care outcomes (e.g. rates of antibiotic prescribing, rates of hospital admissions for urinary tract infections) to better target their quality improvement activities and evidence-based policymaking.
> Submissions to peer reviewed journals
> Presentations at appropriate Care Sector conferences

The outputs will not contain NHS England Data and will only contain aggregated information with small numbers suppressed as appropriate in line with the relevant disclosure rules for the dataset(s) from which the information was derived.

The outputs will not result in the re-identification of individual care homes.

The outputs will be communicated to relevant recipients through the following dissemination channels:
> Journals
> Conferences
> Podcasts
> Short videos aimed at care home staff, care providers, relatives/family members of residents and residents (recognising that many residents have cognitive impairment so we will also need simpler outputs to convey findings to this group). These videos will be shared via social media (e.g. LinkedIn, Facebook, Twitter), the study website, at conferences, trade shows, care sector meetings, and meetings with policymakers.

The aim is to begin setting up the study dashboards in mid-2025. Research outputs (publications, policy briefings, conference presentations) are then expected to be made available from January 2026 onwards.

Multi-media outputs will be produced throughout this period (newsletters, podcasts, press releases, conference presentations and updates on the project, care home visits, video(s)). Engagement activities and dissemination plans are overseen by the Adult Social Care Engagement Collective, a group of relatives, care home staff, providers, members of the public and people representing charities.

Processing:

Nourish, Person Centred Software - PCS and Camascope (three software vendors) will transfer data to NHS England.

The data will contain the following details:
- NHS number
- System_ID: a supplier ID from the care home
- CQC_ID: the CQC ID for the care home
- Resident date: date stamp for when the resident is present within the care home

A project specific opt out and the national data opt-out will be applied before data is transferred from the software vendors to NHS England.

The identifiable data will be passed through an automated data pipeline which will:

1) validate whether the NHS number is in PDS
2) store the NHS Numbers as a cohort for the study
3) link the data via NHS number to other record-level healthcare datasets listed in this Data Sharing Agreement

NHS England will supply the relevant records from the HES, ECDS, Civil Registration of Deaths, and NHSBSA datasets to UCL and a file containing System_ID, CQC_ID, Resident date and STUDY_ID.
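
The three pipeline steps and the resulting file can be sketched as follows. This is an illustration under stated assumptions, not NHS England's implementation: PDS is stood in for by a plain set, the STUDY_ID format is invented, and all field names other than System_ID, CQC_ID, Resident date and STUDY_ID are hypothetical.

```python
from typing import Dict, Iterable, List, Set, Tuple


def build_cohort(
    vendor_rows: Iterable[dict], pds_numbers: Set[str]
) -> Tuple[Dict[str, str], List[dict]]:
    """Steps 1 and 2: keep rows found in the stand-in PDS set and store the cohort."""
    study_ids: Dict[str, str] = {}
    cohort_file: List[dict] = []
    for row in vendor_rows:
        nhs = row["nhs_number"]
        if nhs not in pds_numbers:  # step 1: validate against PDS (stand-in set)
            continue
        if nhs not in study_ids:  # step 2: store the NHS number in the cohort
            study_ids[nhs] = f"STUDY{len(study_ids) + 1:06d}"  # hypothetical ID format
        cohort_file.append({
            "STUDY_ID": study_ids[nhs],
            "System_ID": row["system_id"],
            "CQC_ID": row["cqc_id"],
            "Resident_date": row["resident_date"],
        })
    return study_ids, cohort_file


def link_records(study_ids: Dict[str, str], records: Iterable[dict]) -> List[dict]:
    """Step 3: link a record-level dataset via NHS number, then swap in STUDY_ID."""
    linked: List[dict] = []
    for rec in records:
        sid = study_ids.get(rec["nhs_number"])
        if sid is None:
            continue  # person is not in the study cohort
        out = {k: v for k, v in rec.items() if k != "nhs_number"}
        out["STUDY_ID"] = sid
        linked.append(out)
    return linked


# Placeholder values only; none of this is real data.
rows = [
    {"nhs_number": "NHS-A", "system_id": "sys1", "cqc_id": "cqc1", "resident_date": "2024-01-05"},
    {"nhs_number": "NHS-X", "system_id": "sys1", "cqc_id": "cqc1", "resident_date": "2024-01-05"},
]
ids, cohort = build_cohort(rows, pds_numbers={"NHS-A"})
admissions = link_records(ids, [{"nhs_number": "NHS-A", "admission": "2024-02-01"}])
```

In this sketch the NHS number never appears in either output; only the study identifier is carried forward.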

The data will be stored on servers at the UCL Data Safe Haven (DSH).

UCL uses offsite data centre services provided by VIRTUS data centre.

Amazon Web Services provides cloud hosting services to UCL and will store the data as contracted by UCL.

The Data will be accessed by authorised personnel via remote access.

The Controller(s) must confirm and provide evidence upon audit by NHS England that access via any remote device complies with the data security obligations within this DSA and the Data Sharing Framework Contract.

For remote access:

- Remote access will only be from secure locations situated within the territory of use (as further restricted elsewhere within the DSA if so done) stated within this DSA;
- Access controls granting users the minimum level of access required are in place;
- Remote access is only via secure connections (e.g., VPNs or secure protocols) to protect data;
- Multifactor authentication (MFA) is required for remote access;
- Device security, including up-to-date software and operating systems, antivirus software, and enabled firewalls are utilised for the remote access;
- All remote access is undertaken within the scope of the organisation's DSPT (or other security arrangements as per this DSA) and complies with the organisation's remote access policy.

The above applies in addition to any condition set out elsewhere within the DSA (e.g. who may carry out processing, and for what purpose).

Remote processing will be from secure locations within England. The data will not leave England at any time.

Data will be accessed by an individual with an honorary contract with UCL. The individual will act as an agent of UCL at all times under supervision from employees of UCL. Aside from this individual, access is restricted to employees or agents of UCL and Quantaim Limited who have authorisation from the Principal Investigator.

All personnel accessing the Data have been appropriately trained in data protection and confidentiality.

The Data will not be linked with any other person-level data.

There will be no requirement and no attempt to reidentify individuals when using the Data.

Analysts from UCL and Quantaim and individuals holding an honorary contract with UCL will process the Data for the purposes described above.

An aggregate (care home level) copy of the dataset, with small numbers suppressed, will be shared with the UKHSA and used to create care home level dashboards that will be shared with care providers and policymakers. As this is a pilot the content of the dashboard and the mechanism for sharing data with providers and policymakers is still being developed. This dataset will be stored on UKHSA servers.
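
A care home level aggregate with small numbers suppressed could look like the sketch below. The threshold of 5 and the use of None as the suppression marker are assumptions for illustration; the actual values depend on the disclosure rules for each source dataset, which this document does not state.

```python
from collections import Counter
from typing import Dict, Iterable, Optional

SUPPRESSION_THRESHOLD = 5  # hypothetical value, not taken from the DSA


def care_home_counts(event_cqc_ids: Iterable[str]) -> Counter:
    """Count events (for example admissions) per care home, keyed by CQC_ID."""
    return Counter(event_cqc_ids)


def suppress_small_numbers(
    counts: Dict[str, int], threshold: int = SUPPRESSION_THRESHOLD
) -> Dict[str, Optional[int]]:
    """Replace any count below the threshold with None (suppressed)."""
    return {home: (n if n >= threshold else None) for home, n in counts.items()}


# Placeholder care home IDs and events, for illustration only.
events = ["home-1"] * 12 + ["home-2"] * 3
table = suppress_small_numbers(care_home_counts(events))
```

Suppression is applied after aggregation, so only the care home level table, never a record-level row, would leave the analysis environment in this sketch.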

The impact of reimbursement schemes on healthcare providers' operational performance — DARS-NIC-727610-S2V3N

Opt outs honoured: unknown (Excuses: Does not include the flow of confidential data)

Legal basis: Health and Social Care Act 2012 s261(2)(a)

Purposes: No (Academic)

Sensitive: Non-Sensitive

When:DSA runs 2024-10 – 2027-10 2024.11 — 2025.11. SMLS reported a DPA serious incident; breached contract — audit report.

Access method: System Access
(System access exclusively means data was not disseminated, but was accessed under supervision on NHS Digital's systems)

Data-controller type: UNIVERSITY COLLEGE LONDON (UCL)

Sublicensing allowed: No

AGD/predecessor discussions: AGD minutes - 27th March 2025 final.pdf, AGD minutes - 12th September 2024 finalv2.pdf, AGD minutes - 20 June 2024 - Final.pdf

Datasets:

Hospital Episode Statistics Admitted Patient Care (HES APC)

Type of data: Anonymised - ICO Code Compliant (note: this information not disclosed for TRE projects )

Outputs:

The expected outputs of the processing will be:
At least 1 submission to a peer-reviewed journal (such as Management Science, Manufacturing & Service Operations Management, Production and Operations Management, Journal of Operations Management, British Medical Journal).
At least 3 presentations at appropriate national and international conferences (such as INFORMS Healthcare Conference, INFORMS Annual Meeting, INFORMS MSOM Conference, POMS Conference, Health Economists Study Group Meetings, ISPOR Europe Conference, European Health Economics Association Conference, European Public Health Conference, European Operations Management Association Conference).

The outputs will not contain NHS England Data and will only contain aggregated information with small numbers suppressed as appropriate in line with the relevant disclosure rules for the dataset(s) from which the information was derived.

The outputs will be communicated to relevant recipients through the following dissemination channels:
Journals
Conferences
Workshops and seminars open to UCL staff, academics, researchers outside of UCL, and policymakers

The target date for production and dissemination of the outputs is 2 years following receipt of the Data and is expected to be ongoing until the end of the project.

Processing:

No data will flow to NHS England for the purposes of this Data Sharing Agreement (DSA).

NHS England will grant access to the Data via the Secure Data Environment (SDE). The SDE is a secure data and research analysis platform. It allows approved researchers with approved projects access to pseudonymised data and industry-leading analytics tools.

NHS England will provide access to the relevant records from the HES APC Data Set to UCL via the NHS England Secure Data Environment (SDE).

The Data will contain no direct identifying data items. The Data will be pseudonymised and individuals cannot be reidentified through linkage with other data in the possession of the recipient.

SDE users can request exportation of aggregated analysis results (suppressed and summarised according to the NHSE SDE Disclosure Control rules) subject to review and approval by the NHS England SDE Output Checking team. The SDE Output Checking team will ensure that no output contains information which could be used either on its own or in conjunction with other data to breach an individual's privacy.

Users must identify themselves via a multi-factor authentication mechanism and are only able to access the datasets detailed within this DSA. The access and use of the system is fully auditable, and all users must comply with the use of the Data as specified in this DSA.

Users are only authorised to access the Data specified in this DSA and can utilise a variety of analytical tools available within the SDE platform. Users are not permitted to export record-level data from the SDE.

The Data will be accessed by authorised personnel via remote access.

The Controller(s) must confirm and provide evidence upon audit by NHS England that access via any remote device complies with the data security obligations within this DSA and the Data Sharing Framework Contract.

For remote access:
Remote access will only be from secure locations situated within the territory of use (as further restricted elsewhere within the DSA if so done) stated within this DSA;
Access controls granting users the minimum level of access required are in place;
Remote access is only via secure connections (e.g., VPNs or secure protocols) to protect data;
Multifactor authentication (MFA) is required for remote access;
Device security, including up-to-date software and operating systems, antivirus software, and enabled firewalls are utilised for the remote access;
All remote access is undertaken within the scope of the organisation's DSPT (or other security arrangements as per this DSA) and complies with the organisation's remote access policy.
The above applies in addition to any condition set out elsewhere within the DSA (e.g. who may carry out processing, and for what purpose).

Remote processing will be from secure locations within England/Wales. The Data will not leave England/Wales at any time.

Access is restricted to UCL substantive employees and researchers including UCL faculty and UCL Doctoral students who have authorisation from the research director.

All personnel accessing the Data have been appropriately trained in data protection and confidentiality.

The Data will not be linked with any other data.

There will be no requirement and no attempt to reidentify individuals when using the Data.

Researchers from UCL will process the Data for the purposes described above.

A phase III, double blind, placebo controlled, randomised trial assessing the effects of aspirin on disease recurrence and survival after primary therapy in common non metastatic solid tumours ( ODR1718_261 ) — DARS-NIC-656806-N9V7N

Opt outs honoured: (Excuses: Consent (Reasonable Expectation))

Legal basis: Health and Social Care Act 2012 s261(2)(a)

Purposes: No (Academic)

Sensitive: Sensitive

When:DSA runs 2024-01 – 2025-01 SMLS reported a DPA serious incident; breached contract — audit report.

Access method: One-Off

Data-controller type: UNIVERSITY COLLEGE LONDON (UCL)

Sublicensing allowed: No

Datasets:

NDRS Cancer Registrations
NDRS Linked Cancer Waiting Times (Treatments only)
NDRS Linked DIDs
NDRS Linked HES AE
NDRS Linked HES APC
NDRS Linked HES Outpatient
NDRS National Radiotherapy Dataset (RTDS)
NDRS Systemic Anti-Cancer Therapy Dataset (SACT)

Type of data: Anonymised - ICO Code Compliant

Objectives:

University College London (UCL) requires access to NHS England data for the purpose of the following research project:
A phase III, double blind, placebo controlled, randomised trial assessing the effects of aspirin on disease recurrence and survival after primary therapy in common non metastatic solid tumours.

The following is a summary of the aims of the research project provided by UCL:

A clinical trial to find out whether taking aspirin daily for 5 years after treatment for an early-stage cancer stops or delays the cancer coming back. The work described here tests whether routine data sources can replace traditional approaches to patient follow-up.

The primary outcomes of this study are:
- Overall survival for all participants
- Invasive disease-free survival for breast cancer
- Disease-free survival for colorectal cancer
- Disease-free survival for gastro-oesophageal cancer
- Biochemical recurrence-free survival for prostate cancer
Secondary outcomes are: In all participants these will include adherence, toxicity including serious haemorrhage, and cardiovascular events, as well as some tumour site-specific secondary outcome measures.

The following NHS England Data will be accessed:
NDRS Linked Hospital Episode Statistics
o Admitted Patient Care
o Accident & Emergency
o Outpatients
NDRS Linked Cancer Registration
NDRS Linked Diagnostic Imaging Dataset (DID)
NDRS Systemic Anti-Cancer Therapy Dataset (SACT)
NDRS Radiotherapy Dataset (RTDS)
NDRS Linked Cancer Waiting Times (Treatment Data) (CWT)

The level of the data will be pseudonymised.

The Data will be minimised as follows:
Limited to a consented study cohort identified by UCL. UCL will provide the NHS Numbers and cancer site/morphology codes for the data required.
Limited to data starting from trial initiation in 2015.
For each individual patient, data will only be provided from date of enrolment into the trial and until 10 years after their completion in the trial.
Limited to the following geographic areas: England.
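
The per-patient time window in the minimisation rules above can be expressed as a simple date filter. This is a minimal sketch: the function names, the inclusive boundaries and the handling of 29 February are assumptions, and the DSA does not describe how NHS England applies the rule.

```python
from datetime import date


def add_years(d: date, years: int) -> date:
    """Shift a date by whole years, mapping 29 Feb to 28 Feb when needed."""
    try:
        return d.replace(year=d.year + years)
    except ValueError:  # 29 Feb in a target year that is not a leap year
        return d.replace(year=d.year + years, day=28)


def in_follow_up_window(
    event_date: date, enrolment: date, completion: date, years_after: int = 10
) -> bool:
    """True if the event falls between enrolment and completion plus 10 years.

    Both ends are treated as inclusive, which is an assumption.
    """
    return enrolment <= event_date <= add_years(completion, years_after)


# Made-up dates for a hypothetical participant.
enrolled, completed = date(2016, 5, 1), date(2021, 5, 1)
print(in_follow_up_window(date(2030, 1, 1), enrolled, completed))
```

An event before enrolment, or more than 10 years after trial completion, would be excluded from the extract under this reading of the rule.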

UCL is the research sponsor and the controller as the organisation responsible for ensuring that the Data will only be processed for the purpose described above.

Although Tata Memorial Centre (TMC) is the study sponsor, TMC will not carry out any controllership activities nor have the ability to limit or suppress outputs.

The lawful basis for processing personal data under the UK GDPR is:

Article 6(1)(e) - processing is necessary for the performance of a task carried out in the public interest or in the exercise of official authority vested in the controller;

The lawful basis for processing special category data under the UK GDPR is:
Article 9(2)(j) - processing is necessary for archiving purposes in the public interest, scientific or historical research purposes or statistical purposes in accordance with Article 89(1) based on Union or Member State law which shall be proportionate to the aim pursued, respect the essence of the right to data protection and provide for suitable and specific measures to safeguard the fundamental rights and the interests of the data subject.

The funding is provided by Cancer Research UK and NIHR HTA (Health Technology Assessment Programme). The funding is specifically for the study described. Funding is in place until 31 August 2028.

The funder(s) will have no ability to suppress or otherwise limit the publication of findings.

Amazon Web Services provides backup services to UCL and will store the Data as contracted by UCL.
VIRTUS provides IT support to UCL.

No one else other than UCL listed above will be accessing the data.

Data will be accessed by a PhD student affiliated with UCL. The individual has completed mandatory data protection and confidentiality training and is subject to UCL's policies on data protection and confidentiality. The individual accessing the data will do so under the supervision of a substantive employee of UCL. UCL would be responsible and liable for any work carried out by the individual. The PhD student would only work on the data for the purposes described in this Data Sharing Agreement (DSA).

There are 8 Public and Patient Involvement representatives with the Add-Aspirin trial who all support this research.

The national data opt-out does not apply where explicit consent has been obtained from the patient for the specific purpose.

Where individuals have opted out of disease registration by the National Disease Registration Service (NDRS), their data has been permanently removed from the registry and therefore will not be disseminated under this Data Sharing Agreement (DSA). https://digital.nhs.uk/ndrs/patients/opting-out

Yielded Benefits:

A previous publication has detailed the process and practical considerations for accessing routinely-collected healthcare data for use in clinical trials to support other research groups in undertaking such work (Macnair et al. Trials (2021) 22:340). Update as per latest confirmation report submitted 09 Jan 2024: Publication in a peer-reviewed cancer research journal reporting on the comparison between trial-specific follow-up and routinely-collected health data for ascertaining key outcomes within the Add-Aspirin trial (an ongoing large phase III RCT in multiple cancer types). Submission planned for November 2023 with expected publication date Q1 - 2 2024. The above publication, along with the previous publication on this work (see below), have the potential for wide-reaching implications for future conduct of cancer clinical trials. There is currently much interest in use of routinely collected health data for improving the efficiency of cancer trials and reducing the burden on NHS teams and resources. These publications report on both the practicalities of obtaining and using such data within trials, as well as the extent to which it is fit-for-purpose for the assessment of key (cancer and non-cancer) outcomes.

Expected Benefits:

The above publication, along with the previous publication on this work (see below), have the potential for wide-reaching implications for future conduct of cancer clinical trials. There is currently much interest in use of routinely collected health data for improving the efficiency of cancer trials and reducing the burden on NHS teams and resources. These publications report on both the practicalities of obtaining and using such data within trials, as well as the extent to which it is fit-for-purpose for the assessment of key (cancer and non-cancer) outcomes.

The following specific benefits to patients are expected, subject to the findings:
- Streamlining the data collection process through routine health data could reduce the overall cost of conducting trials. This may contribute to making innovative cancer treatments more accessible for patients.
- A comprehensive assessment and comparison of trial data and HSD may provide a more holistic understanding of the impact of treatments on patients' overall health and quality of life by tracking mortality, recurrence, new cancer diagnoses and major non cancer events.
It is also envisaged that the benefits listed above will lead to wider public benefits.

Outputs:

The expected outputs of the processing will be a publication in a peer-reviewed cancer research journal reporting on the comparison between trial-specific follow-up and routinely-collected health data for ascertaining key outcomes within the Add-Aspirin trial (an ongoing large phase III RCT in multiple cancer types). Submission planned for November 2023 with expected publication date Q1 - 2 2024.

The outputs will not contain NHS England Data and will only contain aggregated information with small numbers suppressed as appropriate in line with the relevant disclosure rules for the dataset(s) from which the information was derived.

The outputs will be communicated to relevant recipients through the following dissemination channels:
Journals
Public reports
MRC CTU website

The target date for production and dissemination of the outputs is Q1-Q2 of 2024.

Processing:

No more data will flow under this agreement. Historically, UCL transferred data to NDRS (at Public Health England, now NHS England). The data consisted of identifying details (specifically NHS Number and Date of Birth) for the consented cohort to be linked with NDRS data.

NDRS provided the relevant records from the NDRS (HESAPC), NDRS (HESAE), NDRS (HESOP), NDRS Cancer Registration, NDRS (DID), NDRS (SACT), NDRS (RTDS), NDRS (CWT) datasets to UCL.

The Data contained no direct identifying data items but contained a unique person ID which can be used to link the Data with other record level data already held by the recipient.

The Data will not be transferred to any other location. The Data will not leave England.

The Data will be stored on servers at Amazon Web Services (AWS).
AWS's role is limited to secure back-up of data stored in UCL's Data Safe Haven. UCL uses offsite data centre services provided by VIRTUS data centre. VIRTUS does not have access to the data.

UCL uses offsite back-up services provided by AWS.

The Data will be accessed by authorised personnel on site and via remote access.

The Controller(s) must confirm and provide evidence upon audit by NHS England that access via any remote device complies with the data security obligations within this DSA and the Data Sharing Framework Contract.

For remote access:
- Remote access will only be from secure locations situated within the territory of use (as further restricted elsewhere within the DSA if so done) stated within this DSA;
- Access controls granting users the minimum level of access required are in place;
- Remote access is only via secure connections (e.g., VPNs or secure protocols) to protect data;
- Multifactor authentication (MFA) is required for remote access;
- Device security, including up-to-date software and operating systems, antivirus software, and enabled firewalls are utilised for the remote access;
- All remote access is undertaken within the scope of the organisation's DSPT (or other security arrangements as per this DSA) and complies with the organisation's remote access policy.

The above applies in addition to any condition set out elsewhere within the DSA (e.g. who may carry out processing, and for what purpose).

Remote processing will be from secure locations within England/Wales. The data will not leave England/Wales at any time.

Data will be accessed by individuals with an honorary contract with UCL. The individuals will act as agents of UCL at all times under supervision from employees of UCL. Aside from these individuals, access is restricted to employees or agents of UCL who have authorisation from the Principal Investigator.

AWS and VIRTUS are not permitted to access the Data.

All personnel accessing the Data have been appropriately trained in GDPR/data protection and confidentiality.

The Data will not be linked with any other data outside of this agreement.

There will be no requirement and no attempt to reidentify individuals when using the Data.

Researchers from UCL will use the relevant subset of data to undertake the socio-economic analysis described above.

Understanding and improving the use of investigations in primary care in patients subsequently diagnosed with cancer (ODR1920_196) — DARS-NIC-656863-R0V6Q

Opt outs honoured: (Excuses: Does not include the flow of confidential data)

Legal basis: Other-The Health Service (Control of Patient Information) Regulations- Regulation 2

Purposes: No (Academic)

Sensitive: Sensitive

When:DSA runs 2023-12 – 2026-12 SMLS reported a DPA serious incident; breached contract — audit report.

Access method: One-Off

Data-controller type: UNIVERSITY COLLEGE LONDON (UCL)

Sublicensing allowed: No

Datasets:

NDRS Cancer Registrations
NDRS National Cancer Diagnosis Audit (NCDA)

Type of data: Anonymised - ICO Code Compliant

Objectives:

University College London (UCL) requires continued access to NHS England National Disease Registration Service (NDRS) National Cancer Registration and Analysis Service (NCRAS) data for the following research project:
Understanding and improving the use of investigations in primary care in patients subsequently diagnosed with cancer.

Before the dissolution of Public Health England (PHE), this request was managed by the PHE Office for Data Release (ODR) under the reference ODR1920_196. All data being retained under this Agreement was previously disseminated by the ODR.

The following is a summary of the aims of the research project:
1. To understand how often primary care investigations are used in patients before a cancer diagnosis, and which patient and disease factors are associated with their use.
2. To understand how the use of these investigations is associated with the length of diagnostic intervals and process and outcome measures of the diagnostic process.

The following NHS England NDRS NCRAS data will be accessed:
NDRS National Cancer Diagnosis Audit (NCDA)
NDRS Cancer Registrations
Access to the above datasets is necessary because they provide information that is integral to achieving the aim of the study.

The level of the data is pseudonymised.

The data has been minimised as follows:
Limited to individuals the NDRS identify as being included in the NCDA 2014 cohort, defined by the NCDA_Diagnosisdatebest field as patients diagnosed with any cancer type between January 1st 2014 and December 31st 2014 and who were registered at one of the participating general practices in England
AND;
Limited to individuals the NDRS identify as being included in the NCDA 2018 cohort- as defined by the NCDA_diagnosisdatebest field as patients diagnosed with any cancer type between January 1st 2018 and December 31st 2018
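The date-based part of the minimisation above can be sketched as a simple filter. This is only an illustration: it reads each cohort window as a full calendar year, uses made-up records and a hypothetical `pseudo_id` field, and does not model the registration at a participating practice that the 2014 cohort also requires. Only the field name NCDA_Diagnosisdatebest is taken from the text.

```python
from datetime import date

# Cohort windows as described in the text, read as full calendar years (assumption).
COHORT_WINDOWS = {
    "NCDA 2014": (date(2014, 1, 1), date(2014, 12, 31)),
    "NCDA 2018": (date(2018, 1, 1), date(2018, 12, 31)),
}


def assign_cohort(diagnosis_date):
    """Return the label of the cohort window containing the date, or None."""
    for label, (start, end) in COHORT_WINDOWS.items():
        if start <= diagnosis_date <= end:
            return label
    return None


# Made-up records for illustration; "pseudo_id" is a hypothetical field name.
records = [
    {"pseudo_id": "A1", "NCDA_Diagnosisdatebest": date(2014, 6, 30)},
    {"pseudo_id": "B2", "NCDA_Diagnosisdatebest": date(2016, 3, 14)},
    {"pseudo_id": "C3", "NCDA_Diagnosisdatebest": date(2018, 12, 31)},
]

# Keep only records whose diagnosis date falls in either cohort window.
retained = [
    r for r in records
    if assign_cohort(r["NCDA_Diagnosisdatebest"]) is not None
]
```

Here the 2016 record is dropped and the other two are retained, each tagged with its cohort by `assign_cohort`.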

UCL is the research sponsor and the controller is the organisation responsible for ensuring that the data will only be processed for the purpose described above.

The lawful basis for processing personal data under the UK GDPR is Article 6(1)(e) - processing is necessary for the performance of a task carried out in the public interest or in the exercise of official authority vested in the controller;

The lawful basis for processing special category data under UK GDPR is Article 9(2)(j) processing is necessary for archiving purposes in the public interest, scientific or historical research purposes or statistical purposes in accordance with Article 89(1) based on Union or Member State law which shall be proportionate to the aim pursued, respect the essence of the right to data protection and provide for suitable and specific measures to safeguard the fundamental rights and the interests of the data subject.

Amazon Web Services (AWS) is a processor acting under the instructions of UCL. AWS's role is limited to secure backup of data stored in UCL's Data Safe Haven.

UCL uses offsite data centre services provided by VIRTUS data centre. VIRTUS does not have access to the data.

The study benefits from oversight/steering by the National Cancer Diagnosis Audit methodology/steering groups, with input from colleagues at the University of Exeter and Cancer Research UK. Neither CRUK nor the University of Exeter will have access to the data held under this Agreement.

Data will be accessed by:
Undergraduate, Masters or PhD students affiliated with UCL. Any student working with the data held under this Agreement must have completed mandatory data protection and confidentiality training and is subject to UCL's policies on data protection and confidentiality. Any students accessing the data will do so under the supervision of a substantive employee of UCL. UCL would be responsible and liable for any work carried out by students. These students would only work on the data for the purposes described in this Agreement.

In line with the national data opt-out policy, opt-outs are not applied because the data is not Confidential Patient Information as defined in section 251(10) and section 251(11) of the National Health Service Act 2006.

Where individuals have opted out of disease registration by the National Disease Registration Service (NDRS), their data has been permanently removed from the registry and therefore will not be disseminated under this Data Sharing Agreement (DSA). https://digital.nhs.uk/ndrs/patients/opting-out

Yielded Benefits:

The findings of the study have so far identified avenues for quality improvement activity relating to the management and referral of suspected cancer.

Expected Benefits:

The findings of this research study are expected to contribute to evidence-based decision-making for policy-makers, local decision-makers such as doctors, and patients to inform best practice to improve the care, treatment and experience of health care users relevant to the subject matter of the study.

Through achieving the study's aims, the findings may allow the identification of opportunities for earlier diagnosis that can be obtained by use of blood and other tests (endoscopy, imaging) in primary care among patients who were subsequently diagnosed with cancer.

The study may allow researchers to obtain a granular understanding of variation between different patient groups in the use of diagnostic tests ordered by GPs, therefore identifying opportunities for improvement. The researchers additionally may gain an understanding of the predictors and implications of using or not using the test for the patients, to support the prioritisation of the implementation of the findings, and their incorporation into clinical audit and improvement initiatives.

It is hoped that through the publication of findings in appropriate media, the findings of this research will add to the body of evidence that is considered by the bodies, organisations and individual care practitioners charged with making policy decisions for or within the NHS or treatment decisions in relation to specific patients.

Outputs:

The expected outputs of the processing will be:
Submissions to peer-reviewed journals by the end of November 2024
Presentations at appropriate conferences such as the Society for Academic Primary Care or other national and international cancer research or diagnosis research conferences.

The outputs will not contain NDRS data and will only contain aggregated information with small numbers suppressed as appropriate in line with the relevant disclosure rules for the datasets from which the information was derived.

The outputs will be communicated to relevant recipients through the following dissemination channels:
Journals
Social media
Public reports
Posters displayed at appropriate conferences

All outputs are due for dissemination by November 2024.

Processing:

No data will flow to NHS England for the purposes of this Agreement.

No data is being disseminated under this Agreement. UCL wish to retain NDRS NCRAS data previously disseminated by the PHE ODR. The data contains no direct identifying data items. The data is pseudonymised and individuals cannot be reidentified through linkage with other data in the possession of the recipient.

The data will be stored on servers that support the UCL Data Safe Haven.

The data will not leave or be accessed outside the UK at any time.

Access is restricted to employees or agents of UCL who have authorisation from the study lead.

All personnel accessing the data have been appropriately trained in data protection and confidentiality.

The data will not be linked with any other data.

There will be no requirement and no attempt to reidentify individuals when using the data.

Analysts from UCL will analyse the data for the purposes described above.

Loneliness among people with 'Complex Emotional Needs' (CEN): A cross-sectional UK study — DARS-NIC-674976-S4T1V

Opt outs honoured: (Excuses: Does not include the flow of confidential data)

Legal basis: Health and Social Care Act 2012 s261(2)(a)

Purposes: No (Academic)

Sensitive: Non-Sensitive

When:DSA runs 2023-01 – 2024-01 SMLS reported a DPA serious incident; breached contract — audit report.

Access method: One-Off

Data-controller type: UNIVERSITY COLLEGE LONDON (UCL)

Sublicensing allowed: No

AGD/predecessor discussions: AGD minutes - 18 April 2024 final.pdf

Datasets:

Adult Psychiatric Morbidity Survey (APMS)

Type of data: Anonymised - ICO Code Compliant

Objectives:

The objective is to use the Adult Psychiatric Morbidity Survey (APMS) 2014 dataset for research purposes. Specifically, a secondary analysis of the data will be conducted by University College London (UCL) researchers to investigate the relationship between "complex emotional needs" (CEN, or "personality disorder" traits) and loneliness and suicidality outcomes, and the relationship of loneliness to discrimination, self-harm, and suicidal ideation in people with CEN.

Background:

Considering the prevalence and impact of loneliness on both physical and mental health outcomes, recovery, and quality of life, loneliness has been identified as an intervention target with the potential to alleviate symptoms and improve outcomes for people with mental health conditions such as depression, complex emotional needs (CEN), and psychosis. The term complex emotional needs (CEN) is used to describe people who are diagnosable with a personality disorder, who are a neglected group of great research and clinical interest. This is particularly important as the quality of care and available interventions for people with CEN have been criticized as lacking by service users and clinicians.

The research evidence has demonstrated the centrality of loneliness in the day-to-day lives of people with CEN. A quantitative study describing the intensity of loneliness in the lives of people with CEN found that poor social functioning and objective social isolation did not account for the severity of loneliness experienced by people with CEN. Not only do these findings illustrate the need to further investigate determinants of loneliness and the effects of loneliness among people with CEN, but they also highlight the probable role of negative and discriminatory societal experiences in exacerbating a sense of loneliness. Recent sociological and psychological theories and studies exploring the relationship between suicidal thoughts and self-harm and loneliness report that a lack of belonging is one of the prominent risk factors associated with suicide and self-harm. Collectively, the evidence points to the need to explore the role of self-harm, suicidal ideation as well as discriminatory experiences and their consequent effects on loneliness outcomes among people with CEN. Although there is a relationship between loneliness and CEN, there is very little known about the relationship between individual CEN traits/symptoms and loneliness outcomes, and whether specific symptoms of CEN are more relevant in relation to loneliness. Given the extent to which interventions focus on symptomatic reductions, investigating whether self-harm and suicidal ideation exacerbate levels of loneliness could provide a pathway for future interventions to target the reduction of loneliness.

Aims and objectives:
In this study, UCL aim to conduct a quantitative study to investigate the relationship between CEN and loneliness and suicidality outcomes, and the relationship of loneliness to discrimination, self-harm, and suicidal ideation in people with CEN. Moreover, UCL will also assess the effects of relevant demographic factors in the relationship between discrimination and loneliness outcomes. UCL will use the APMS 2014 database to explore loneliness in this group and therefore the objectives are to:
- investigate the relationship between the number of traits endorsed that are suggestive of complex emotional needs (CEN), and the presence of loneliness controlling for stressful life events, including serious injuries/assaults, deaths, financial/job problems, prison time, bullying, violence, sexual abuse, homelessness, being institutionalized or in foster care, and other sociodemographic variables.
- Investigate the relationship between the number of traits endorsed that are suggestive of CEN and the presence of suicidality and self-harm, controlling for stressful life events including serious injuries/assaults, deaths, financial/job problems, prison time, bullying, violence, sexual abuse, homelessness, being institutionalized or in foster care and other sociodemographic variables.
- Assess whether loneliness modifies the relationship between the number of CEN traits and suicidality/self-harm by testing interaction effects.
- investigate the association between individual CEN traits and loneliness, controlling for covariates.
- assess the possibility that experiences of discrimination modify the relationship between specific traits suggestive of CEN and loneliness by testing interaction effects.
- investigate the interaction effect between discrimination and sexual orientation, and loneliness (outcome), among people who meet the cut-off for a diagnosis of CEN.

UCL is requesting the Adult Psychiatric Morbidity Survey 2014 (APMS 2014). The UK Data Service (UKDS) (https://ukdataservice.ac.uk/), which holds APMS 2014 on behalf of NHS Digital and is responsible for disseminating it under the direction of NHS Digital, would provide the whole data set to UCL as there is no facility to select individual variables. Upon signing the data sharing agreement, UCL would then download the pseudonymised APMS data through UKDS for the period specified within the DSA. All APMS data are pseudonymised. In line with the procedures and standards, when the DSA expires, all local copies of the APMS 2014 dataset will be erased and destroyed. The PI will be dedicating a period of one year to work with the dataset, carry out secondary analysis, and write up results.

The data will be held in the UCL Data Safe Haven using UCL-approved computers. The Data Safe Haven is a highly confidential technical solution for transferring and storing data. It meets the requirements for the NHS Digital governance toolkit and ISO 27001 Information Security Standards.

The aim of this secondary analysis of cross-sectional data from the 2014 Adult Psychiatric Morbidity Survey (APMS) is to investigate the relationship between loneliness and complex emotional needs (or personality disorder) and suicidality outcomes in a representative sample in England. The APMS 2014, the most recent Adult Psychiatric Morbidity Survey, provides relevant data on personality disorder, loneliness, discrimination, and suicidality for a nationally representative sample of residents in private households in England, aged 16 and over. Therefore UCL requires the full database of this large nationally representative sample to achieve a sufficient sample size to obtain meaningful results for this research. The data obtained will be anonymous and stored and processed only on UCL's Data Safe Haven. The data will only be accessed by UCL employees, registered UCL Ph.D. students and registered UCL MSc students, as required or necessary. All students sign up to the UCL Academic Manual and will be working under appropriate supervision on behalf of the data controller/processor within this agreement; they will have access to the data only for the purposes described in this agreement.

The data will be processed under Article 6(1)(e) - legitimate interest under public task, as UCL is a public authority and processing data for research is one of UCL's public tasks. Processing the APMS 2014 database to achieve the objective of the proposed research is necessary and there is no other way of fulfilling the purpose of this research. Further, this is the least restrictive and a safe way of achieving the research goals, as individuals will not be harmed through the processing of this data. The data will be processed under Article 9(2)(j) - processing is necessary for archiving, research, or statistical purposes. Given the adverse effects of loneliness, with severe levels of loneliness predicting suicidal and self-injurious behaviours in people with CEN, and the prevalence of CEN in the general population, it is of public interest to conduct research on loneliness among people with CEN traits. This research would potentially build the groundwork for intervention development. Specifically, investigating whether self-harm and suicidal ideation exacerbate levels of loneliness could provide a pathway for future interventions to both target the reduction of loneliness, therefore increasing personal recovery outcomes, and reduce self-harm and suicidal ideation.

UCL ensures that the processing of data is protected by appropriate technical and organisational measures, which include pseudonymisation of the data.

This research is undertaken as part of a Ph.D. research project. The Ph.D. student conducting this research is sponsored by UCL and the Economic and Social Research Council (ESRC); however, the ESRC will not be involved in the processing of data from the APMS 2014.

There is no control group in this study as the APMS consists of participants (aged over 16 years) who completed the APMS survey.

Expected Benefits:

Improving the quality of care and broadening the range of interventions offered, to include socially focused intervention targets, such as loneliness, has been prioritised and emphasized by people with lived experience of complex emotional needs (or a personality disorder), professionals, and policymakers. Given the adverse effects of loneliness and the call for a more holistic treatment approach for people with complex emotional needs, as opposed to a narrow focus on self-harm and symptomatic reduction, loneliness is a potentially promising intervention target. To develop an intervention that targets loneliness, an understanding of the relationship between loneliness and complex emotional needs and suicidality, as well as investigating the role of discrimination would be useful and necessary. Loneliness, self-harm, and suicide have been a public health crisis and priority, and complex emotional needs traits are common in the general population. Therefore, developing evidence-based interventions targeting loneliness among people with complex emotional needs would be more beneficial if rooted in a detailed understanding of the associations of loneliness among people with complex emotional needs. However, pending this aim of developing an intervention, findings from this study can suggest that asking service users with complex emotional needs about social connections and loneliness can be beneficial in formulating a treatment plan. If this study reveals a link between loneliness and self-harm/suicidality then this could provide a pathway for future interventions to target the reduction of loneliness, which would consequently contribute to reductions in symptoms of self-harm and suicidality.

The identification of the specific symptoms that could be playing a particular role in exacerbating loneliness could be helpful in developing an effective intervention, with a focus on those specific symptoms exacerbating loneliness. This would give a target in intervention development.
Given the extent to which interventions for people with complex emotional needs focus on symptomatic reductions of self-harm and suicidality, investigating whether self-harm and suicidal ideation exacerbate levels of loneliness could provide a pathway for future interventions to target the reduction of loneliness. This would increase personal recovery outcomes (i.e. loneliness), as well as reduce self-harm and suicidal ideation.

The study will also make a case for the role of loneliness and the way in which it could maintain symptoms (self-harm), which can promote future studies exploring interventions for people with complex emotional needs to include outcome measures such as loneliness.

Assessing the possibility that experiences of discrimination modify the relationship between specific traits suggestive of CEN and loneliness would both be a targetable focus of intervention and would promote clinical discussion on the way loneliness intersects with sexuality, gender, and ethnicity.
The results of this study could also contribute to a better understanding of the complex interplay between loneliness and mental health. The associated factors and moderators that will be assessed could shape future questionnaire design for a loneliness scale specific to people with complex emotional needs. This is an issue raised recently by people with lived experience.

Outputs:

The results of this quantitative study will be published in a relevant peer-reviewed journal specialising in psychiatry or personality disorders such as BMC Psychiatry or Journal of Personality Disorders. A jargon-free and accessible blog describing this study in layperson's terms will also be published on the Mental Elf website for easy access and on other relevant websites such as the Loneliness and Social Isolation Mental Health Network website. This study will also be submitted as part of a Ph.D. thesis and presented in research meetings, and therefore will be uploaded and freely available at https://discovery.ucl.ac.uk/ upon completion of the Ph.D. Outputs included in the published papers will contain aggregated level data with small numbers suppressed. The date the researcher aims to publish the paper is January 2024. The blog and other mediums of dissemination will be published after the journal accepts the manuscript (i.e. March 2024). The results of this research study will be presented at relevant academic conferences with an interest in the link between complex emotional needs and loneliness and suicidality. To increase public awareness the results may also be posted on Twitter. The study results will also be shared in the Loneliness and Social Isolation Research Network Newsletter and other mental health and loneliness-related newsletters. To protect patient confidentiality in publications resulting from analysis of APMS 2014 data, users must: guarantee that any output made available to anyone other than those with whom this agreement is made will meet required standards, including the guarantee, methods, and standards contained in the Code of Practice for Official Statistics and the Office for National Statistics (ONS) Statistical Disclosure Control from tables produced from surveys; and apply the methods and standards specified in the Microdata handling and Security Guide to Good Practice for disclosure control for statistical outputs.

The results will be disseminated to mental health charities with a focus on targeting loneliness and supporting people with complex emotional needs such as befriending networks, and charities such as The British and Irish Group for the Study of Personality Disorder (BIGSPD). This piece of research forms one component of a Ph.D. thesis focused on using mixed methods to explore loneliness among people with complex emotional needs. Therefore, the researcher will contextualize and elucidate quantitative findings within qualitative experiences; therefore, combining subjectivity and complexity of reality with standardised and representative findings gathered through the APMS survey. As well as this study informing policy and potential practice, the results of all these studies will be disseminated in an accessible blog in laypeople's terms, and sensitively, to inform the public (in October/November 2024). The Ph.D. student conducting this research is funded by the Economic and Social Research Council (ESRC) and UCL. This project is a part of a Ph.D. research study.

Processing:

Once the agreement is active, the flow of data will be from the UK Data Service (UKDS), which will allow access for UCL to download APMS data. There are no other flows of data. Data will only be stored, processed, and held in accordance with UCL's data protection policy. It will be accessed, held, and stored in the UCL Data Safe Haven, within the Division of Psychiatry, by the research team. The research team consists of a UCL Ph.D. student, an MSc student (registered solely at UCL), and senior researchers and clinicians who are employees of UCL. The UCL Data Safe Haven has its own set of accepted and standard procedures that are described on the following website: https://www.ucl.ac.uk/isd/services/file-storage-sharing/data-safe-haven-dsh. Data will be stored within the Data Safe Haven, which is characterized by dual-factor authentication, a firewall with a default deny policy, data entry via a managed file transfer mechanism, and draw-down permission granted by default only to the asset owner. All those analysing data within the team are required to complete the Data Security Training as provided by Health Education England.

All APMS participant data will be included in the data analysis and research study to achieve meaningful results. UCL will securely destroy all local copies of the dataset once the DSA expires and will inform the Data Access Request Service (DARS) as required by standard procedures. This 2014 version of the APMS dataset available via DARS has been redacted based on Disclosure Procedure advice to minimize the likelihood of participants being identified.

Methodology
Sample
UCL is conducting a secondary analysis of cross-sectional data from the 2014 Adult Psychiatric Morbidity Survey (APMS), a survey commissioned by NHS Digital that is carried out by the National Centre for Social Research (NatCen). The APMS provides data on mental health and treatment access for a nationally representative sample of residents in private households in England, aged 16 and over. The APMS uses a stratified probability sampling design that consisted of two stages: the initial stage is an interview with the whole sample, and the second stage involves clinically trained interviewers conducting face-to-face interviews with a subset of participants.

Measures
The following clinical measures and sociodemographic characteristics collected in the APMS 2014 survey will be used:
Exposure variable:
- Personality disorder traits: personality disorder traits were measured using the Standardised Assessment of Personality - Abbreviated Scale (SAPAS). SAPAS is an eight-item screen for identifying a possible diagnosis of personality disorder. The eight items are the following: 1) difficulty making and keeping friends, 2) identifying self as a loner, 3) difficulty trusting others, 4) tendency to lose temper, 5) tendency of being impulsive, 6) tendency to worry, 7) tendency to depend on others, and 8) tendency to be a perfectionist. The SAPAS has been validated for use in the psychiatric population and general population. SAPAS scores correlate with nurses' ratings of externalizing behaviours, and with future functioning and response to antidepressant treatment. A score at or above the cut-off point of 4 on the SAPAS indicates a high probability of a diagnosis of a personality disorder.
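The SAPAS scoring described above can be sketched as follows. This is a minimal illustration, assuming each of the eight items has already been coded 1 when the trait is endorsed and 0 otherwise; the function names are hypothetical and this is not the survey's own derivation code.

```python
SAPAS_ITEMS = 8   # eight-item screen, as described in the text
SAPAS_CUTOFF = 4  # score of 4 or more suggests a probable diagnosis


def sapas_score(item_responses):
    """Sum eight binary item codes (1 = trait endorsed, 0 = not endorsed)."""
    if len(item_responses) != SAPAS_ITEMS:
        raise ValueError("SAPAS expects exactly eight item responses")
    if any(r not in (0, 1) for r in item_responses):
        raise ValueError("each item must be coded 0 or 1")
    return sum(item_responses)


def screens_positive(score):
    """True when the total score meets or exceeds the cut-off."""
    return score >= SAPAS_CUTOFF


# Made-up respondent endorsing items 1, 2, 3 and 6.
example = [1, 1, 1, 0, 0, 1, 0, 0]
example_score = sapas_score(example)
example_positive = screens_positive(example_score)
```

The total score (0 to 8) serves as the continuous exposure, while the cut-off flag identifies respondents meeting the threshold for a probable diagnosis.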

Outcomes

- Loneliness: Loneliness is assessed using one item from the eight-item Social Functioning Questionnaire, a valid and robust measure to assess social functioning. Participants were asked to what degree they agreed with the statement "I feel lonely and isolated from other people" over the past two weeks. Participants indicated their agreement on a four-point Likert scale, ranging from "not at all" (scored 0) to "almost all the time" (scored 3), generating scores ranging from 0 to 3. To transform these scores and avoid including zero values, one will be added to each score, giving a range of 1 to 4.

- Self-harm, suicidal ideation, and suicidal attempt:
- Self-harm: Questions on self-harm were asked in both the self-completion section and the face-to-face interview. Information about self-harm was obtained through questions: 1) Have you deliberately harmed yourself in any way but not with the intention of killing yourself? A variable will be created on self-harm in the past year, derived from either the self-completion section or the face-to-face section.

- Suicidal ideation: In the face-to-face interview, questions on suicidal ideation were included and participants were asked: 1) Have you ever thought of taking your life, even if you would not really do it? An affirmative response was followed by questions on when this last occurred. A variable will be created on suicidal thoughts in the past year. This approach has been used by other studies.

- Suicide attempts: Questions on suicide attempts were asked in both the self-completion section and the face-to-face interview. Data on suicide attempts were obtained by the question: Have you ever made an attempt to take your life, by taking an overdose of tablets or in some other way? A follow-up question on when this took place was asked. A variable will be created combining reports of suicide attempts in the past year, in either section (i.e. the face-to-face or the self-completion section).
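The outcome derivations above can be sketched in a few lines. The function names are hypothetical, and the handling of missing answers (treated as missing only when both sections are missing) is an assumption, since the text does not say how missing responses are treated.

```python
def shift_loneliness(raw_score):
    """Add one to the 0-3 loneliness item so that the scale runs 1-4."""
    if raw_score not in (0, 1, 2, 3):
        raise ValueError("loneliness item is scored 0 to 3")
    return raw_score + 1


def past_year_either(self_completion, face_to_face):
    """Combine past-year reports from the two survey sections.

    Each argument is True, False, or None (not answered). The result is
    True if either section has a report, False if no section reports it,
    and None when both are missing (an assumption, not from the text).
    """
    reports = [r for r in (self_completion, face_to_face) if r is not None]
    if not reports:
        return None
    return any(reports)


shifted = [shift_loneliness(s) for s in range(4)]
attempt_flag = past_year_either(False, True)
```

The same `past_year_either` helper would apply to both self-harm and suicide attempts, which are each asked in the self-completion and face-to-face sections.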

Potential modifiers of the association between the exposure and outcome:
- Discrimination
o The APMS includes binary measures on experiences of discrimination, based on a computer-assisted self-interview. Participants were asked a series of questions on whether people have been treated unfairly in the past year on the basis of belonging to a particular group such as their mental health condition, ethnicity, sexuality, sex, age, religious beliefs, and physical health problems or disabilities. UCL will be focusing on discrimination based on sexuality and mental health condition.

Covariates
The following sociodemographic characteristics collected and recorded in the APMS 2014 survey are covariates that might be associated with both the exposure and the outcome and therefore confound any association. Sociodemographic variables such as gender, age, Registrar General's Social Class, and marital status are all measured and recorded and included in a set of covariates and UCL will agree on these as a team.

- Gender
- Age
- Socioeconomic status
- Marital status
- Social participation
- Stressful life events
- Childhood emotional, physical, and sexual violence or abuse

Analysis plan

UCL will conduct all analyses using Stata (statistical software). Descriptive statistics, including sociodemographic and clinical characteristics, will be presented by exposure group. For these descriptive statistics the exposure will be divided into three categories: those scoring 0, those scoring 1-3, and those scoring 4 or more (meeting the criteria for a probable diagnosis).
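The three descriptive categories can be illustrated with a minimal Python sketch. The project itself states that it will use Stata; the function name and the example scores below are hypothetical and are not taken from the APMS data.

```python
from collections import Counter


def exposure_group(cen_score: int) -> str:
    """Map the number of CEN traits endorsed to a descriptive category."""
    if cen_score < 0:
        raise ValueError("score must be non-negative")
    if cen_score == 0:
        return "0"
    if cen_score <= 3:
        return "1-3"
    return "4+"  # 4 or more: meets the criteria for a probable diagnosis


# Hypothetical scores, for illustration only.
scores = [0, 2, 5, 3, 0, 4, 1]
group_counts = Counter(exposure_group(s) for s in scores)
print(dict(group_counts))
```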

The relationship between the number of traits suggestive of CEN (continuous measure) and loneliness (binary measure) will be investigated using logistic regression models. UCL will then fit models adjusted for the confounders described, adding blocks sequentially. Similarly, logistic regression models will be run to produce odds ratios for the outcome variables, suicidality and self-harm. UCL will then assess whether loneliness modifies the relationship between CEN and suicidality and self-harm by testing the interaction between the number of CEN traits endorsed and loneliness.
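As a simplified illustration of effect modification, the sketch below computes odds ratios within two loneliness strata and their ratio. It assumes a binary exposure (probable CEN yes/no) rather than the continuous trait count described above, and all counts are hypothetical. In a logistic model containing a binary exposure, a binary modifier and their product, the exponentiated interaction coefficient equals this ratio of stratum-specific odds ratios.

```python
import math


def odds_ratio(exposed_cases, exposed_noncases, unexposed_cases, unexposed_noncases):
    """Odds ratio from the four cells of a 2x2 table."""
    return (exposed_cases * unexposed_noncases) / (exposed_noncases * unexposed_cases)


# Hypothetical counts of self-harm by probable-CEN status, split by loneliness.
lonely = dict(exposed_cases=30, exposed_noncases=70,
              unexposed_cases=20, unexposed_noncases=180)
not_lonely = dict(exposed_cases=10, exposed_noncases=90,
                  unexposed_cases=15, unexposed_noncases=285)

or_lonely = odds_ratio(**lonely)
or_not_lonely = odds_ratio(**not_lonely)

# A ratio of 1 (log of 0) would indicate no effect modification on the odds scale.
ratio_of_ors = or_lonely / or_not_lonely
interaction_log_odds = math.log(ratio_of_ors)

print(round(or_lonely, 3), round(or_not_lonely, 3), round(ratio_of_ors, 3))
```

A formal test of the interaction, with adjustment for confounders and survey weights, would still require a fitted regression model.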

To assess whether experiences of discrimination modify the relationship between specific traits suggestive of CEN and loneliness, UCL will test for interaction effects between each trait of CEN and the modifier. UCL will carry out a separate statistical analysis to investigate the interaction between discrimination and sexual orientation in relation to loneliness outcomes.

UCL will apply appropriate survey weightings to the data, using the relevant survey (svy) commands in Stata 15, which allow for the use of clustered data modified by probability weights and provide robust estimates of variance. This weighting will take account of the complex survey design and of non-response to ensure that estimates are representative of the household population in England.
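The effect of probability weights on a point estimate can be shown with a small Python sketch. It covers the weighted prevalence only; the variance estimation for clustered designs that the survey commands provide is not reproduced here. The outcome values and weights are hypothetical.

```python
def weighted_prevalence(outcomes, weights):
    """Weighted proportion of a binary outcome using survey probability weights."""
    if len(outcomes) != len(weights):
        raise ValueError("outcomes and weights must have the same length")
    total_weight = sum(weights)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive value")
    return sum(w * y for y, w in zip(outcomes, weights)) / total_weight


# Hypothetical respondents: 1 = reports loneliness, 0 = does not.
lonely = [1, 0, 0, 1, 0]
weights = [0.5, 1.5, 1.0, 2.0, 1.0]

unweighted = sum(lonely) / len(lonely)
weighted = weighted_prevalence(lonely, weights)
print(unweighted, round(weighted, 4))
```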

No linkage to other data will be conducted, and no linkages are requested at this time. There will be no attempt to re-identify any participants.

Modelling impact of interruptions to cancer screening with COVID ( ODR2021_016 ) — DARS-NIC-656876-L4B0V

Opt outs honoured: (Excuses: Does not include the flow of confidential data)

Legal basis: Health and Social Care Act 2012 - s261(5)(d), Health and Social Care Act 2012 s261(2)(a)

Purposes: No (Academic)

Sensitive: Sensitive

When:DSA runs 2023-02 – 2024-07 SMLS reported a DPA serious incident; breached contract — audit report.

Access method: One-Off

Data-controller type: UNIVERSITY COLLEGE LONDON (UCL)

Sublicensing allowed: No

AGD/predecessor discussions: AGD minutes - 23 February 2023 final.pdf

Datasets:

NDRS Cancer Registrations
NDRS Rapid Cancer Registrations

Type of data: Anonymised - ICO Code Compliant

Objectives:

Data for this study have previously been shared when the data were controlled and managed by Public Health England (PHE). PHE facilitated data release via its Office of Data Release service (ODR). ODR was responsible for providing a common governance framework for responding to requests to access PHE data for secondary purposes, including service improvement, surveillance and ethically approved research. All requests to access data were reviewed by the ODR and were subject to strict confidentiality provisions. The responsibility for the management of the National Disease Registration Service, of which the National Cancer Registration and Analysis Service is a part, transferred from PHE to NHS Digital (now NHS England) on 1st October 2021.

This project has three aims:

1. Estimate what impact delayed diagnosis and delayed treatment between one month and one year will have on:
a. The number of cancers that progress to a more advanced stage by the time they are diagnosed, and
b. Survival from cancer (specific cancers to be investigated are included in the data specification).

2. Model the impact of disruptions to breast cancer screening and identify strategies that could be used when re-starting screening that minimise any harms resulting from such disruption.

3. Predict the demand for diagnostic, treatment, and screening services.

Yielded Benefits:

Our modelling has shown interesting results on the impact of delays, but we have not yet finished our analyses.

Expected Benefits:

As previous.

Outputs:

The output of this project is modelling that shows the impact of different lengths of delay on stage and outcomes at diagnosis. This is relevant for both post-COVID planning and cancer health policy more generally.

We are applying for an extension due to personnel changes that have slowed our progress.

Processing:

Design of study: Mathematical modelling.

Study population: All adults aged 18 and over in England with one or more of lung, colorectal, prostate, breast, pancreatic, oesophageal, liver, bladder, kidney, or ovarian cancers.

Statistical analysis: This analysis will have three overarching stages:

1) Stage progression. We will analyse retrospective cancer registry data to derive the time taken for each cancer type being investigated to progress between stages.

We will apply for access through Public Health England's (PHE) Office for Data Release to Cancer Registry data for all patients aged 18 or older in England diagnosed with one of lung, colorectal, prostate, breast, pancreatic, oesophageal, liver, bladder, kidney, or ovarian cancer between 01/01/2013 and 31/12/2017. We will subsequently apply for Public Health England's rapid cancer registration dataset, providing data from 2018 up to the present.

Using depersonalised data on age at diagnosis, the year their cancer was diagnosed, the type of cancer, its stage at diagnosis, and year of death (if applicable), we will use a Markov multistate model to enable us to estimate the time taken for each cancer to progress between stages. We will then apply the resulting stage transition estimates to incidence and stage-at-diagnosis data for each cancer to derive the number of cancers expected at localised / advanced stages at different periods of time, under alternative lengths of delays to cancer services.

2) Modelling breast cancer screening. We will use a multistate model taking into account the natural history of breast cancer to derive the probability of a cancer being detected by screening or clinically with different periods of disruption to screening programmes. We will apply this probability to a decision analytic model that uses a life-table approach to understand the impact on cancer outcomes. We will analyse alternative catch-up screening strategies to identify that which best mitigates the disruption on breast cancer mortality and life-years gained.

3) Estimating impacts of delays and demand for cancer services. We will use the estimated number of cancers at different stages to analyse the impact of delays to diagnostic and treatment services on cancer outcomes and demand for services in England. Treatment parameters by stage will be obtained from PHE's Cancer Registry data. All other parameters for the models will be from aggregated anonymous sources, for example those released under an Open Government License, or peer-reviewed literature.
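The life-table step mentioned for the breast cancer screening model can be sketched as follows. This is a minimal illustration under stated assumptions, not the study's decision analytic model: the annual death probabilities are hypothetical, the horizon is truncated at five years, and deaths are assumed to occur on average halfway through a year.

```python
def expected_life_years(annual_death_probs):
    """Expected life-years over the horizon covered by the given annual death probabilities.

    People who die within a year are credited with half a year of life for that year.
    """
    alive = 1.0
    life_years = 0.0
    for q in annual_death_probs:
        if not 0.0 <= q <= 1.0:
            raise ValueError("death probabilities must lie between 0 and 1")
        deaths = alive * q
        life_years += (alive - deaths) + 0.5 * deaths
        alive -= deaths
    return life_years


# Hypothetical five-year mortality schedules, for illustration only.
screen_detected = [0.02, 0.02, 0.03, 0.03, 0.04]
interval_cancer = [0.10, 0.10, 0.12, 0.12, 0.15]

life_years_lost = expected_life_years(screen_detected) - expected_life_years(interval_cancer)
print(round(life_years_lost, 3))
```

Comparing catch-up strategies would then amount to weighting such differences by the probability of screen detection versus clinical diagnosis under each strategy.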

We will first develop a continuous time Markov multistate model to describe the progression of cancer through the following states: healthy, localised cancer, advanced cancer, and dead [1,2]. Using these probabilities, and incidence of cancer by stage, we will estimate the expected number of additional advanced cancer diagnoses and the expected number of localised and advanced cancers at different periods of time.

To analyse the impact of disruption to the breast cancer screening programme and alternative catch-up screening strategies that could be used when re-starting the programme, the following methods will be applied:

1. We will use a multistate model of the natural history of breast cancer in the preclinical phase to derive the probabilities of detecting cancer by screening or clinically (i.e. interval cancers) following different time periods of disruptions to screening services.

2. Using a life-table approach [3,4], accounting for the sojourn time of breast cancer by age, stage, and subtype, and using the derived probabilities for screen-detection and clinical diagnoses, we will model the impact of a suspension of screening services and of the backlog on interval cancer diagnoses and subsequent life years lost. We will consider different catch-up scenarios and identify the scenario that gives the fewest interval cancers and the least loss of life years.

3. We will consider delays in screening of between one month and one year and will liaise with PHE screening regarding alternative re-starting strategies that are under consideration.

Finally, to analyse the impact on cancer outcomes and on predicted demand for cancer diagnostic, assessment and treatment services we will apply data on diagnostic and treatment modalities by cancer stage to estimate demand for services, and aggregate data on 1-year and 5-year survival to estimate impact on survival and mortality.
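A continuous time Markov model over the four states named above can be sketched in plain Python. The transition rates below are hypothetical placeholders, not estimates from registry data, and the matrix exponential is a simple scaling-and-squaring implementation written for this illustration. Given a rate matrix Q (rows summing to zero), the transition probabilities over a delay of t years are P(t) = exp(Qt).

```python
import math

STATES = ["healthy", "localised", "advanced", "dead"]

# Hypothetical transition rates per year; each row sums to zero and "dead" is absorbing.
RATES = [
    [-0.02, 0.01, 0.00, 0.01],
    [0.00, -0.35, 0.30, 0.05],
    [0.00, 0.00, -0.40, 0.40],
    [0.00, 0.00, 0.00, 0.00],
]


def mat_mul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)] for i in range(n)]


def mat_exp(m, squarings=10, terms=12):
    """Matrix exponential via a Taylor series on a scaled matrix, then repeated squaring."""
    n = len(m)
    scaled = [[x / (2 ** squarings) for x in row] for row in m]
    result = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]
    for k in range(1, terms + 1):
        term = [[x / k for x in row] for row in mat_mul(term, scaled)]
        result = [[result[i][j] + term[i][j] for j in range(n)] for i in range(n)]
    for _ in range(squarings):
        result = mat_mul(result, result)
    return result


def transition_probabilities(rates, years):
    """P(t) = exp(Q * t) for rate matrix Q and elapsed time t in years."""
    return mat_exp([[q * years for q in row] for row in rates])


# Probability that a localised cancer is advanced after delays of 1, 6 and 12 months.
for months in (1, 6, 12):
    p = transition_probabilities(RATES, months / 12)
    print(months, round(p[1][2], 4))

six_months = transition_probabilities(RATES, 0.5)
# Check: the chance of remaining localised should equal exp(-0.35 * 0.5).
analytic_stay_localised = math.exp(-0.35 * 0.5)
```

Multiplying the localised-to-advanced probability by the number of incident localised cancers would give the expected number of additional advanced diagnoses for a given delay.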

Our statistical analysis plan has been chosen as it encompasses robust methods with which the study team has experience for predicting medium and long-term cancer outcomes. There are two major caveats to the quality of the rapid cancer registration data for our analyses: missing variables, and data inaccuracies. For the purposes of our study, we have focussed on ten more common cancers, for which missing variables and data inaccuracies in the rapid cancer registration data are less of a problem than in rarer cancers or cancers of unknown primary. In both cases, earlier data are more accurate than the most recently available months, allowing us to take the inaccuracies into account in our modelling. Importantly, in our analyses our focus is on broad TNM stage as early/advanced for cancers as a whole (e.g. breast cancer, rather than breast cancer subdivided by hormone receptor) such that the impact of missing details, for example of cancer subtype, is limited. In addition, these data remain the best possible within the current context and we feel their use is justified given the role our results may be able to have in supporting policy and planning decisions as the pandemic continues to develop.

Stratifying Genomic Causes of Intellectual Disability by Mental Health Outcomes in Childhood and Adolescence (IMAGINE-2) — DARS-NIC-168879-K2N8Q

Opt outs honoured: unknowNo, Yes (Excuses: Mixture of confidential data flow(s) with consent and non-confidential data flow(s))

Legal basis: Health and Social Care Act 2012 - s261(5)(d); Health and Social Care Act 2012 s261(2)(a); Health and Social Care Act 2012 s261(2)(c), Health and Social Care Act 2012 s261(2)(a)

Purposes: No (Academic)

Sensitive: Sensitive, and Non-Sensitive

When:DSA runs 2023-01 – 2026-01 2023.07 — 2025.11. SMLS reported a DPA serious incident; breached contract — audit report.

Access method: One-Off, System Access
(System access exclusively means data was not disseminated, but was accessed under supervision on NHS Digital's systems)

Data-controller type: UNIVERSITY COLLEGE LONDON (UCL)

Sublicensing allowed: No

AGD/predecessor discussions: IGARD Minutes - 10 November 2022 finalv1.pdf

Datasets:

Emergency Care Data Set (ECDS)
Hospital Episode Statistics Accident and Emergency (HES A and E)
Hospital Episode Statistics Admitted Patient Care (HES APC)
Hospital Episode Statistics Critical Care (HES Critical Care)
Hospital Episode Statistics Outpatients (HES OP)
Mental Health of Children and Young People (MHCYP)
Mental Health of Children and Young People (MHCYP) Survey

Type of data: Identifiable, Anonymised - ICO Code Compliant (note: this information not disclosed for TRE projects )

Objectives:

The IMAGINE-2 study is a medical research study funded by the Medical Research Council (2020-2024) that aims to investigate the impact of genetic disorders that are associated with learning difficulties on children and young people's mental health. It is a collaboration between University College London (UCL) and Cardiff University. Cardiff University will not have access to or process the NHS Digital data to be provided for this UCL data request. Cardiff University do not determine the purpose or the means of the data processing for the IMAGINE-2 study and are not therefore considered to be a data controller. The University of Cardiff Investigator has had no input on determining the purpose and means of workstream 1.

The UCL study team resides at the UCL Institute of Child Health department, which is a joint research office between UCL and Great Ormond Street Hospital (GOSH). As such, both UCL and GOSH logos are used in the materials for this study; however, GOSH does not play any further role in the study and does not determine any purposes of this study in any capacity.

The IMAGINE-2 study is a follow-up to the previous project, Intellectual Disability and Mental Health: Assessing Genomic Impact on Neurodevelopment (IMAGINE-ID), which was funded by the Medical Research Council (MRC) and Medical Research Foundation (2015-2020). A collaborator from the University of Cambridge who was involved in the IMAGINE-ID study is involved in IMAGINE-2 as a consultant only and may provide advice to the IMAGINE-2 project. The University of Cambridge will not have access to any newly collected data from the research programme and does not determine the purpose or the means of the data processing for IMAGINE-2. The MRC, as funder of IMAGINE-2, does not determine the purpose or the means of the data processing and will not process any of the study data. These organisations are not therefore considered to be data controllers or data processors.

IMAGINE-2 is divided into two workstreams. Workstream 1 aims to map trajectories of developmental risk for individuals with different types of Intellectual Disability (ID). Workstream 1 is led by University College London and requires NHS Digital data. Workstream 2 is led by Cardiff University and involves a face-to-face follow-up study of young people seen during IMAGINE-ID who have been identified as carrying a genetic variation which is high-risk for mental health problems. Cardiff University will independently collect data from participants in IMAGINE-2 by direct contact with the identified high-risk subset of families whom they will visit at home. Cardiff University will not have access to the NHS Digital data.

UCL is the sole data controller and also processes NHS Digital data for the IMAGINE-2 study. NHS Digital data will be handled exclusively by UCL in the Data Safe Haven (DSH), to which only staff who are associated with UCL will have access. Access to the UCL DSH will only be given to individuals who have had the appropriate UCL non-disclosure training and have signed a Non-Disclosure Agreement. No data will be exported outside the DSH.

UCL is a public authority, as defined in the Data Protection Act 2018, with a principal object of the organisation being research and its dissemination. The processing of identifiable personal data, including special category data, is necessary to carry out medical research that serves the public interest. As such, the legal bases for processing personal data are:
Article 6(1)(e) of the GDPR, processing is necessary for the performance of a task carried out in the public interest; and
Article 9(2)(j) of the GDPR, processing is necessary for archiving purposes in the public interest, scientific or historical research purposes.

Section 8 of the Data Protection Act 2018 clarifies that "In Article 6(1) of the GDPR (lawfulness of processing), the reference in point (e) to processing of personal data that is necessary for the performance of a task carried out in the public interest or in the exercise of the controller's official authority includes processing of personal data that is necessary for (d) the exercise of a function of the Crown, a Minister of the Crown or a government department". University College London has an established royal charter, which includes the following statement: "The objects of the College shall be to provide education and courses of study in the fields of Arts, Laws, Pure Sciences, Medicine and Medical Sciences, Social Sciences and Applied Sciences and in such other fields of learning as may from time to time be decided upon by the College and to encourage research in the said branches of knowledge and learning and to organise, encourage and stimulate postgraduate study in such branches."

As a higher education establishment, the University conducts research to improve health care and services, and the linkage requested is necessary for the performance of a task carried out in the public interest, i.e. improving the health outcomes of children with genetic disorders.

The IMAGINE-2 cohort consists of children and young people who were born between 1989 and 2016 and have intellectual disability (ID) or developmental delay caused in whole or in part by a known genetic variant: a CNV (copy number variant) or an SNV (single gene variant). The study aims to delineate the course and outcomes in CNV-associated intellectual disability (ID) and single gene disorders to provide information at the point of diagnosis and onwards for families, clinicians and service providers, as well as to pave the way to greater biological understanding and the personalisation of interventions. Several recurrent ID-associated CNVs and single gene disorders have been associated with poor mental health outcomes. However, there is considerable pleiotropy (i.e. variations in a single gene may affect multiple (possibly unrelated) observable characteristics of an individual), and also incomplete penetrance for specific psychiatric diagnoses (i.e. some individuals express the associated symptom or trait while others do not, even though they carry the disease-causing gene).

The cohort includes children with genetic disorders that put them at risk of autism, attention deficit hyperactivity disorder, anxiety and psychosis among other conditions. They are also at risk for non-psychiatric disorders including sensory impairments and epilepsy. No study to date has deployed systematic sampling and assessment to determine why some, but not all, ID-related CNVs and single gene disorders are associated with poor mental health outcomes, nor have they identified risk and resilience factors modifying outcomes across this population. Assessing the relative contributions of CNV and/or SNV genetic constitution, ID severity, cognitive profile, social/ environmental risk factors, and physical comorbidities, will highlight major determinants of adjustment. Better care could be provided if those individuals at greatest risk were identified early and if preventive intervention was timely and focused on salient biological and/or social processes. Early identification has the potential to reduce the costs of long-term care, better target key services/interventions, and improve quality of life over the life-course.

Background:
Participants were eligible for the IMAGINE-ID study if they were 4 years of age or over at the point of recruitment between 2015 and 2019, and if they possessed a genetic variant, identified by an NHS Regional Genetic Centre, that was reported to be causing intellectual disability/developmental delay. The vast majority of participants in IMAGINE-ID were identified as being eligible by one of 25 UK Regional Genetics Centres (RGC). The original genetic testing was ordered on the basis of unexplained learning disabilities. Eligible families were invited to participate by the paediatric team linked to the RGC. The study was advertised to patient groups through social media and at parent-supported events. Once participants had been recruited, they were invited to complete online assessments of their child's mental health, behaviour and well-being for the Workstream 1 data collection. 3402 participants were recruited in total. 500 families, a subset of the total sample, have been seen face-to-face for more detailed evaluations by Cardiff University collaborators for Workstream 2 of the study. The new MRC grant (2020-2024) provides funds for further study, and the title has been changed from "Intellectual disability and mental health: Assessing genomic impact on neurodevelopment (IMAGINE ID)" to "Stratifying Genomic Causes of Intellectual Disability by Mental Health Outcomes in Childhood and Adolescence (IMAGINE-2)". The new study will follow up the already-recruited cohort for 54 months, commencing 1st April 2020.

The study has established a patient, parents and carers consultation group. This group has been consulted from the inception of the study and is regularly updated. The group was established to provide feedback, comments and suggestions that have influenced the design and progress of the project.

The Workstream 1 data collection undertaken during IMAGINE-ID provided details of the children's development, well-being, mental health and adaptive functioning. A brief account of the children's medical history was obtained. The Adaptive Behaviour Assessment System (ABAS-3) was used to estimate the degree of developmental delay in key domains of adaptive functioning (e.g. language, self-care, motor skills). Children's mental health was assessed by the Development and Well-Being Assessment (DAWBA), which has been employed in three national UK studies over the period January 1999-December 2017. The DAWBA is a detailed semi-structured interview and covers many areas of development, behaviour and well-being. Rates of mental health disorder and behavioural/emotional dysfunction in the IMAGINE-ID cohort can therefore be directly compared with a nationally representative sample of typically developing children and young people, the Mental Health of Children and Young People (MHCYP) dataset from the UK Data Service. Significant general health problems are described in many cohort children with mental health disorders.

Mental Health of Children and Young People (MHCYP) data request:
MHCYP data is requested for use as the control data to compare with IMAGINE-ID cohort data collected from the assessments of mental health, behaviour and wellbeing (i.e. Workstream 1 data collection). The Mental Health of Children and Young People (MHCYP) survey provides record-level, pseudonymised data on the prevalence of mental disorders in children and young people (aged 2-19 years old) living in England. This dataset contains the same measures as the IMAGINE-ID research data and covers a similar age of children and young adults. No identifiable data is requested from the MHCYP dataset.

HES data request:
The study is applying for access to Hospital Episode Statistics (HES) data to assess the broader healthcare needs of this group as a function of their genetic disorder. Pseudonymised data is requested from NHS Digital but will be linked to existing study data thus making the data technically identifiable. Access to these data will permit a more detailed view of the strengths and difficulties of participants, their use of services, and comorbidities. In the IMAGINE-2 study (2020-2024), a longitudinal perspective will be taken, throughout childhood and adolescence, to ascertain disease trajectory and outcomes relating to social inclusion, education and their health needs. Better characterisation of the trajectories of health risks and wider developmental impact of these diverse ultra-rare genetic disorders will inform and improve future healthcare and management.

The request is to access Hospital Episode Statistics (A&E visits, Critical Care Episodes, Admitted Patient Care Episodes and Outpatients appointments) and Emergency Care Data Set for the IMAGINE-2 cohort and a control cohort. By linking HES and ECDS data to already collected data on the IMAGINE-2 cohort's mental health and family circumstances, the study aims to build a detailed picture of the more significant medical healthcare needs of this group of children and young people. The HES control cohort requested by UCL will be matched on age, sex and index of multiple deprivation.
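Matching a control cohort on age, sex and index of multiple deprivation can be illustrated with a small exact-matching sketch. The request does not describe the matching procedure, so the 1:1 ratio, the field names (`age`, `sex`, `imd_decile`) and the records below are all assumptions made for illustration.

```python
from collections import defaultdict


def match_controls(cases, candidates, keys=("age", "sex", "imd_decile")):
    """Assign each case at most one unused control with identical matching variables."""
    pool = defaultdict(list)
    for person in candidates:
        pool[tuple(person[k] for k in keys)].append(person)

    matches = {}
    for case in cases:
        bucket = pool[tuple(case[k] for k in keys)]
        # Each control is used once; cases without an exact match map to None.
        matches[case["id"]] = bucket.pop(0)["id"] if bucket else None
    return matches


# Hypothetical records, for illustration only.
cases = [
    {"id": "c1", "age": 10, "sex": "F", "imd_decile": 3},
    {"id": "c2", "age": 10, "sex": "F", "imd_decile": 3},
    {"id": "c3", "age": 14, "sex": "M", "imd_decile": 7},
]
candidates = [
    {"id": "k1", "age": 10, "sex": "F", "imd_decile": 3},
    {"id": "k2", "age": 14, "sex": "M", "imd_decile": 1},
]

matches = match_controls(cases, candidates)
print(matches)
```

Unmatched cases are kept visible (as `None`) so that the share of the cohort lacking a comparable control can be reported.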

Through analysis of the number of hospital visits, number of outpatient appointments, and length of stays in hospital, the study will examine to what extent the population of children and young people with intellectual disability or developmental delay caused by a known genetic variant relies on the NHS healthcare system more than typically developing children. The study will investigate the costs involved in caring for these children, including the costs to the children themselves (for example missing school because they have hospital appointments or are admitted to hospital). The study will examine potential links between specific ultra-rare genetic disorders and the need for specialist intervention in particular healthcare domains. For example, one third of the cohort has had seizures; in some cases, the seizures were associated with genetic anomalies that have never previously been studied in detail because of their rarity. It would not be possible in any other way to analyse cohort data at this scale, which has implications for improving future medical management in this vulnerable population. Preliminary data, from parental reports, indicates a high rate of frequent users, reflecting the complex nature and needs of many of these disorders.

Separately, UCL also require access to a standard extract of Mental Health of Children and Young People (MHCYP) 2017 and 2020 survey data. These data are not linked, and cannot be linked, to either UCL's cohort or the control cohort. They will be used for comparison purposes to give UCL a snapshot of statistics across several categories (mental health, behaviour and well-being).

UCL hypothesise that developmental trajectories of children and young people in the IMAGINE-2 cohort with genetic differences will differ from those of individuals without genetic disorders, and that this will be best captured by clusters of traits indicating their mental states, mental illnesses or disorders and impairment of their cognitive development. UCL also hypothesise that the trajectories of these clusters of traits will be impacted by genetic factors and related risk factors such as socioeconomic adversity and family environment. A control cohort matched by age, sex and geographic region will indicate the differences in impact according to, for example, genetic factors and index of multiple deprivation.

Data linkage:
The study aims to link HES data to the existing research data from Workstream 1 in order to obtain a detailed picture of participants' use of NHS secondary care services. The data held from Workstream 1 are genetic, medical history and mental health data, which are coded and held in non-identifiable form. The genetic data (from Regional Genetic Centre (RGC) laboratory reports) and observable individual characteristic data (through online psychiatric assessments) will be linked to medical history data in order to build a highly detailed picture of the cohort as it develops over a period of 5 years since participating families were originally recruited and interviewed. The study will compare service usage in this cohort of children to service usage by children and young people in England.

Using the diagnosis categories for each episode, the study will be able to assess the health problems that are associated with each genetic disorder and which are common across genetic disorders. The study will ascertain if there are health problems in common within and across genetic disorders which are not yet well-known or described in the literature and will contribute to the existing body of knowledge.

Using HES data about length of episodes and specialty involved, the study aims to further develop analyses of socioeconomic factors involved in these genetic disorders. This will have two outcomes: to provide information about the level of contact with health services parents might expect if their child is diagnosed with a genetic disorder, and to provide an assessment of the cost to the NHS of caring for this group of children (using NHS Reference Costs), which will be of use in decisions relating to commissioning services. Prevalence of neurodevelopmental disorders is increasing as more children survive due to better care, and with the constant development in genetic sequencing technology, more and more children in the future will have a genetic cause of their developmental delay or intellectual disability identified. Early identification and better understanding of the disease trajectory of these conditions has the potential to reduce the costs of long-term care, better target key services and interventions, and improve quality of life over the life-course.
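The costing step can be illustrated by multiplying activity counts by a unit cost per activity type. The unit costs and activity counts below are hypothetical placeholders and are not taken from NHS Reference Costs; the category names are likewise assumptions made for this sketch.

```python
def total_cost(activity_counts, unit_costs):
    """Sum of activity count times unit cost across activity types."""
    missing = set(activity_counts) - set(unit_costs)
    if missing:
        raise KeyError(f"no unit cost for: {sorted(missing)}")
    return sum(count * unit_costs[kind] for kind, count in activity_counts.items())


# Hypothetical unit costs in pounds; real analyses would use published reference costs.
unit_costs = {"outpatient": 150.0, "a_and_e": 200.0, "admitted_day": 500.0}

# Hypothetical annual activity for one cohort child and one matched control.
cohort_child = {"outpatient": 6, "a_and_e": 2, "admitted_day": 3}
control_child = {"outpatient": 1, "a_and_e": 1, "admitted_day": 0}

excess_cost = total_cost(cohort_child, unit_costs) - total_cost(control_child, unit_costs)
print(excess_cost)
```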

UCL will conduct network analyses and apply machine learning methods to look for commonalities which could indicate the mechanism by which specific genetic disorders are associated with medical/psychiatric disorder expression and provide paths of investigation for therapies. Machine learning is a form of statistical analysis that may be used in this study to analyse the research and HES data. Any machine learning analysis will not involve any personal or identifiable data. Researchers will use various statistical methods in conducting these analyses, such as regression, classification and machine learning methods, time series analysis, statistical inference and natural language processing, according to the approach that is most appropriate for the data.

In order to undertake the analyses as described, the study is applying to access HES A&E, Outpatients, Critical Care and Admitted Patient Care data and ECDS for each of the consented participants, covering as much of their lifespan as is available for each dataset (a range from 1994/95 to present) in order to gain a detailed and accurate picture of their medical history and use of services throughout their lifetime. As the data relates to individuals, the geographical spread of the data requested represents the current geographic spread of participants in England.

The study is requesting the medical history of each participant. There are no alternative or less intrusive ways of obtaining these data. Whilst the study has obtained, for approximately one third of participants, a basic medical questionnaire and a brief medical history, much more detailed data are required. It would be impractical to ask families to recount every healthcare interaction they have had. Obtaining data directly from primary or secondary care providers would not be feasible due to time and complexity, and the size of the cohort.

In order to minimise the data requested, the study is requesting very limited demographic data (limited to demographics relating to health care, such as Integrated Care Board/Trust names). No variables have been requested which are not necessary for the proposed analyses.

Yielded Benefits:

This is a new request for NHS Digital data. No yielded benefits have been attained to date using NHS Digital data. Information on the benefits of the wider IMAGINE-ID study can be found at https://imagine-id.org/

Expected Benefits:

In England, there are over a million people with learning disabilities, a quarter of whom are children of school age. Most moderate to severe intellectual disability (ID) has a genetic cause. The study hopes to have beneficial impacts upon the domains of clinical practice and care services, and quality of life for affected families. Potential benefits to health and social care include better understanding of the mental health and well-being of children with ID caused in whole or in part by a known genetic variant, with the aim of enabling more efficient targeting of healthcare resources to provide the best support to them.

Clinical practice: Clinicians in the NHS increasingly request specialist genetic investigations for children with ID. Usually, the results do not translate into specific recommendations for management or prognosis relating to behavioural adjustment, although families would welcome such knowledge. The patient, parents and carers consultation group gave feedback and comments that they highly supported more research on the genetic investigations for children with ID, in order to gain more understanding and knowledge of the conditions. With the genotypic (genetic constitution) data on specified genetic disorders, together with standardised phenotypic (observable individual characteristic) data and longitudinal health service utilisation, this study expects to generate valuable information on the mental health and well-being of this cohort. Identification and characterisation of the trajectory of these rare disorders will be used to provide information to clinicians and health professionals who see patients with these disorders. Better evidence-based information is expected to help clinicians and families in making appropriate health and social care decisions and aid in assessment of whether to undertake interventions or prescribe medication to ameliorate symptoms. If children are able to access better care or take advantage of adjustments or interventions, this may improve outcomes (e.g. better educational attainment, improved social communication and inclusion), which in the long term can result in fewer interactions with mental health care providers and so alleviate some of the pressures on the resources of the healthcare system.

Families: Unusual behaviour patterns or emotional disorders associated with ID are often ascribed to inappropriate parenting practices. Recognising common disorder-specific patterns is the first step to reassuring parents and educating clinicians/social support staff. This is expected to reduce self-blaming and stress, with resultant improved quality of life for affected families. The impact of child behaviour can be reliably and easily measured across time, and may independently predict future symptoms and psychiatric disorders, including the interactive process by which behavioural and emotional problems can undermine family/individual quality of life. Through documenting health service utilisation in conjunction with genotypic and phenotypic detail, new opportunities for intervention could arise, thus enhancing parents' economic activity (e.g. by promoting their mental health, reducing school exclusions, limiting risk of parental separation).

Health Care Professionals: Intellectual disability implies global impairments in cognitive skills, yet some developmental trajectories may be preserved (exemplified by the relatively good language skills of children with the genetic disorder Williams syndrome). Gaining knowledge about differences in ability across different genetic disorders has implications for education planning and fostering the maximisation of individual potential. Such discoveries could inform policy on the management of children with ID due to a genetic cause. Information on environmental factors influencing emergence of challenging behaviour linked to genotypic risk could point to genotype specific interventions, reducing risk of transfer to residential care and the associated costs.

These benefits will impact not just the NHS in terms of better evidence for clinical care decisions, but also every family which includes a child with intellectual disability or learning problems due to a genetic cause, both now and in the future. Through the outputs resulting from the analysis of the requested data, we aim to deliver benefits over the next 5 years (2022-2027) initially. UCL and Cardiff University will benefit from the publication of peer-reviewed journal articles which will contribute to further funding applications and research collaborations both nationally and internationally.

Outputs:

The data will be used to create analytical outputs which are subsequently intended to be used in research reports and published in peer-reviewed journals and/or presented at academic conferences appropriate for the nature of the analysis and message.

All outputs of the data analysis will be aggregated and small numbers will be suppressed in line with the HES Analysis Guide. The study will ensure that NHS Digital data will not be linked to any other data which would be likely to make it identifiable.

The audience for this research is primarily clinicians and researchers. However, the research teams are keen to disseminate findings to participants in the study and to the general public. The study will work with patient groups such as UNIQUE (Understanding Rare Chromosome and Gene Disorders), a registered charity which provides information and support to individuals with various genetic disorders, and raises public awareness of the conditions. The outcome of this study may include high-level insights gained through analysis of MHCYP and the IMAGINE research data to provide more information and understanding on the genetic disorders for the affected individuals and the public. UCL aim to submit for publication within 24 months of receipt of the dataset.

Planned disseminations include:

Publications: The study will use open access publication, in line with current Medical Research Council policy, to maximise the impact of peer reviewed publications. Publications will be targeted at a number of different academic and clinical audiences. UCL intend to reach out to the community of researchers working with intellectual disability through specialist publications. A broader readership will be targeted in academic psychiatry that is interested in specific findings of more general relevance through more academic publications, and higher impact journals if appropriate. Journals to be targeted include: The Lancet Psychiatry, British Journal of Psychiatry, Journal of Developmental Disorders, Journal of Intellectual Disability Research, American Journal of Psychiatry, Journal of Health Services Research and Policy, Archives of Disease in Childhood and the British Medical Journal. Both the principal investigator (PI) and the co-PIs have strong track records in these areas, including publications in a range of high impact journals.

Conferences: Researchers intend to present findings at academic conferences and other forums relating to research in neurodevelopmental disorders and behavioural phenotypes, genetics, and rare diseases in the UK, Europe and North America, including: International Society for Autism Research (INSAR), Neurodevelopmental Disorder Annual Seminar, UCL Mental Health symposium, Society for the Study of Behavioural Phenotypes Conference, Royal College of Psychiatrists Conference.

Clinical community: A main objective is to provide clinically valuable information to clinicians from a wide range of specialities, including community paediatricians, clinical geneticists, neurologists (both paediatric and adult) as well as specialists working with intellectually disabled people. The study will target these groups by presenting findings at clinically oriented conferences. The study intends to publish summaries of the findings in wide circulation journals of general interest to practitioners, including family doctors. The study will aim to ensure that its findings are distributed to clinicians and others working professionally with ID through contacts with appropriate professional and specialist Societies, including: Royal College of Paediatrics and Child Health (through publication in the Archives of Disease in Childhood); Royal College of Psychiatrists (through publication in the British Journal of Psychiatry); Members of the British Medical Association (through publication in the British Medical Journal).

Other Intellectual Disability Stakeholders: The study has worked closely with the former Chief Executive Officer (CEO) of the charity UNIQUE to advise upon recruitment of families to the study. The study is also working closely with a wide range of other parent support organizations for children with specific ID-related genetic disorders such as SWAN ('Syndromes without a Name'). This liaison ensures there is a gateway directly into the community of parents and carers of individuals with intellectual disability for whom much of our research will have direct relevance. The study has established a newsletter and a website (https://imagine-id.org/) in collaboration with patient and public involvement (PPI) groups, to make its findings available to all stakeholders with an interest in the research, and its implications for the ID community.

Wider audiences: UCL and Cardiff University are keen to promote public engagement in science at many levels. Research staff, at all sites, participate in outreach activities such as attending conferences held by support groups. Communication of the project outcomes to the general public will take place during the lifespan of the project via on-line announcements (e.g. Twitter and Facebook), as well as meetings with journalists from the popular scientific press and general press. Presentations at symposia and public lectures will broaden the impact of the research on the public. The study also intends to make available information for families in collaboration with patient-support organisations and other stakeholders.

UCL estimate that outputs will start to appear 6 months after receipt of the data from NHS Digital.

Processing:

Data:
UCL are requesting Mental Health of Children and Young People (MHCYP) 2017 and 2020 survey data via the UK Data Service (UKDS). This will enable comparison of the mental health, behaviour and well-being of the IMAGINE-2 participants with intellectual disability (ID) to the MHCYP cohort in the general population.

UCL are also requesting HES APC, HES CC, HES OP, HES A&E and ECDS data for IMAGINE-2 participants and a matched control cohort, identified by NHS Digital, who do not have recorded ID. UCL will provide NHS Digital with the NHS number, date of birth, postcode and gender of IMAGINE-2 participants on one occasion alongside a unique study identifier. HES data will be returned to UCL with the unique study identifier only. An equivalent pseudo-identifier will be generated for the matched control cohort.

Processing:
The MHCYP data is only available by system access via the UK Data Service (UKDS) and the data will be processed in their secure environment. The MHCYP data will not be linked to either cohort's data. In order to protect patient confidentiality in publications resulting from analysis of MHCYP data, users must apply the following rules:
· zeros should be shown,
· 1-7 to be rounded to 5,
· any other numbers rounded to nearest 5,
· rounding unnecessary for averages etc.,
· percentages calculated from rounded values,
· if zeros need to be suppressed, round to 5.
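
The rounding rules above can be sketched in code. This is a minimal illustration only, assuming non-negative integer counts; the function names `round_count` and `rounded_percentage` and the `suppress_zeros` flag are hypothetical and are not part of any UKDS or NHS Digital tooling.

```python
def round_count(n: int, suppress_zeros: bool = False) -> int:
    """Apply the rounding rules listed above to a single count."""
    if n < 0:
        raise ValueError("counts must be non-negative")
    if n == 0:
        # Zeros are shown as-is unless they need to be suppressed,
        # in which case they are rounded to 5.
        return 5 if suppress_zeros else 0
    if n <= 7:
        # Counts of 1-7 are always reported as 5.
        return 5
    # Any other count goes to the nearest multiple of 5.
    # Integer counts can never sit exactly halfway between two multiples.
    return 5 * ((n + 2) // 5)


def rounded_percentage(numerator: int, denominator: int) -> float:
    """Percentage calculated from the rounded values, not the raw counts."""
    rounded_denominator = round_count(denominator)
    if rounded_denominator == 0:
        raise ValueError("denominator rounds to zero")
    return 100 * round_count(numerator) / rounded_denominator
```

For example, a raw count of 3 out of 98 would be reported as 5 out of 100. Averages and similar summary statistics are left unrounded, as the rules state.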

The HES data will be stored and analysed securely within the UCL Data Safe Haven environment. Once the data are received, the IMAGINE-2 research team at UCL will perform exploratory data analysis and clean data as required (e.g. by removing or flagging missing data, subsetting the data for ease of analysis and any other necessary processing in order to make the data ready for analysis). The study intends to link records in the IMAGINE-ID database regarding details of the cohort's development, well-being, mental health and adaptive functioning to HES data for this project specifically. This will be done using each cohort member's unique study ID. There will be no requirement or attempt to re-identify individuals.
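
Linkage on a study ID alone, as described above, can be illustrated with a short sketch. The field names (`study_id`, `hes_episodes`) and the function `link_by_study_id` are assumptions made for illustration; the agreement does not specify the software or the record layout used inside the Data Safe Haven.

```python
from collections import defaultdict


def link_by_study_id(cohort_records, hes_episodes):
    """Attach HES episodes to cohort records that share a study ID.

    No direct identifiers are used: the study ID is the only join key.
    """
    episodes_by_id = defaultdict(list)
    for episode in hes_episodes:
        episodes_by_id[episode["study_id"]].append(episode)

    # Every cohort member is kept; members with no HES activity
    # receive an empty list rather than being dropped.
    return [
        {**record, "hes_episodes": episodes_by_id.get(record["study_id"], [])}
        for record in cohort_records
    ]
```

Because one cohort member can have many hospital episodes, the result is a one-to-many structure rather than a flat one-row-per-person table.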

All research data held in the Data Safe Haven for analysis is kept separate from the identifying data files (also stored in the Data Safe Haven). These data are kept in a separate secure system within the Data Safe Haven. The research data and identifiable data will not be linked, with the exception of date of birth and postcode for demographic analysis purposes if required. The raw data will not be transferred out of the Data Safe Haven. At the stage of creating publications or presentations, only aggregate and summary data with small numbers suppressed in line with the HES Analysis Guide will be transferred out of the Data Safe Haven. All data will be processed by UCL.

VIRTUS LONDON 4 do not access data held under this agreement as they only supply the building. Therefore, any access to the data held under this agreement would be considered a breach of the agreement. This includes granting of access to the database(s) containing the data.

The data will be stored for use as part of the research analysis carried out by the IMAGINE-2 study only. It will not be available at record level to third parties and will not be available for any commercial use. Access to the IMAGINE ID data within the UCL Data Safe Haven is controlled by the IMAGINE ID Principal Investigator at UCL. Only authorised users have access and access is via 2-factor authentication (username, password and authentication code).

Under this Agreement, the data will only be processed by substantive employees of UCL and those with access to the data have taken information governance training and are aware of their responsibilities and obligations.

Prevalence, clinical characteristics and impact of body dysmorphic disorder in young people — DARS-NIC-259538-Q4V0W

Opt outs honoured: (Excuses: Does not include the flow of confidential data)

Legal basis: Health and Social Care Act 2012 - s261 - 'Other dissemination of information', Health and Social Care Act 2012 - s261(5)(d)

Purposes: No (Academic)

Sensitive: Sensitive

When:DSA runs 2022-03 – 2023-09 SMLS reported a DPA serious incident; breached contract — audit report.

Access method: One-Off

Data-controller type: UNIVERSITY COLLEGE LONDON (UCL)

Sublicensing allowed: No

AGD/predecessor discussions: IGARD Minutes - 12 May 2022.pdf

Datasets:

Mental Health of Children and Young People
Mental Health of Children and Young People (MHCYP)
Mental Health of Children and Young People (MHCYP) Survey

Type of data: Anonymised - ICO Code Compliant

Objectives:

The research team at University College London (UCL) are requesting access to the 2017 Mental Health of Children and Young People (MHCYP) survey dataset for the purpose of examining the prevalence, clinical characteristics, and impact of body dysmorphic disorder (BDD) in young people.

Body dysmorphic disorder (BDD) is characterised by excessive preoccupation with perceived flaws in physical appearance (most commonly facial features), which appear minimal or completely unobservable to others. Sufferers typically engage in a range of compulsive and repetitive behaviours, such as extreme grooming rituals, often in an attempt to conceal or correct their perceived appearance flaws. The disorder has a devastating impact on quality of life, yet remains under-diagnosed, under-researched and poorly understood. There have been no epidemiological studies of BDD in young people, and therefore many fundamental questions remain unanswered. To this end, this project intends to examine the prevalence, clinical correlates and impairment associated with BDD in young people. This information could have direct implications for the detection and diagnosis of BDD and may identify care needs which could assist those designing, commissioning, and delivering Child and Adolescent Mental Health Services (CAMHS). More specifically, the project aims to answer the following questions:
- What is the prevalence of BDD in young people?
- How does the prevalence of BDD vary with age and sex?
- What are the patterns of psychiatric comorbidity associated with BDD?
- What is the psychosocial impairment associated with BDD?
- Is BDD associated with service utilisation?

The MHCYP 2017 data are uniquely able to address the aims of this project as this is the only population-based survey to include assessment of BDD in young people, either in the UK or internationally.

The MHCYP 2017 survey included 9,117 children and young people aged 2 to 19 years old, who were recruited from a stratified probability sample taken from GP registers. Parents reported on younger children, with additional self-report questions for those aged 11-16. Young people aged 17 and over completed their own questionnaires.

The survey included the Development and Well-Being Assessment (DAWBA), a validated diagnostic assessment tool. The DAWBA assesses a wide range of psychiatric disorders. Parents and young people (aged 11 upwards) complete a series of questions online. Within each diagnostic category, initial screening items are presented, followed by more detailed questions. If screening items are not endorsed then the informant can skip to the next diagnostic category. Parent and child responses are aggregated to determine whether a diagnostic threshold is met.

The DAWBA has been used in previous surveys of child and adolescent mental health in the UK (the British Child and Adolescent Mental Health Surveys), but in the MHCYP 2017 survey the DAWBA was extended to include assessment of body dysmorphic disorder (BDD) for the first time. In addition to the DAWBA, the MHCYP 2017 survey included the Strengths and Difficulties Questionnaire (SDQ), which is a validated dimensional measure of mental health difficulties and impacts. Furthermore, data on the socioeconomic circumstances of the family and the child or young person's contact with services was collected.

The data controller and processor will be UCL. Only those who are substantive employees of UCL will be accessing and analysing the data requested within the UCL secure research facility.

The GDPR lawful basis for UCL to process this data is Article 6(1)(e): 'processing is necessary for the performance of a task carried out in the public interest or in the exercise of official authority vested in the controller'. As per section 8(c) of the Data Protection Act (DPA) 2018, this includes processing of personal data that is necessary for the exercise of a function conferred on a person by an enactment or rule of law.

Power is conferred upon the university by the University College London Royal Charter to provide education and courses of study in the fields of Arts, Laws, Pure Sciences, Medicine and Medical Sciences, Social Sciences and Applied Sciences and in such other fields of learning as may from time to time be decided upon by the College and to encourage research in the said branches of knowledge and learning and to organise, encourage and stimulate postgraduate study in such branches.

The legal basis for the processing of special category data is GDPR Article 9(2)(j), for research or statistical purposes. This is further supported by section 10 of the DPA 2018, which states that an exception can be made to the prohibition on the processing of special categories of personal data if "the processing meets the requirement in point (b), (h), (i) or (j) of Article 9(2) of the GDPR for authorisation".

This project falls under this category because body dysmorphic disorder (BDD) is a major public health concern in the United Kingdom yet remains poorly understood.

Patient and Public Involvement (PPI) will be central to the generation of outputs for this project (see also section 5c), particularly public facing outputs such as blogs, podcasts, and leaflets for schools. Such materials will be developed in conjunction with PPI groups, including those linked with the BDD Foundation. The BDD Foundation is the UK's national charity for BDD and plays a key role in raising awareness of the condition and its treatment. The research lead for this study, who is a substantive employee of UCL, is a clinical advisor for the BDD Foundation, and will work closely with their PPI group to identify relevant forums for dissemination of findings (e.g. relevant podcasts), and to ensure appropriate wording in written outputs.

This project does not link to any wider studies or collaborations.

Expected Benefits:

The first outputs for this project are expected within a year of receiving the Mental Health in Children and Young People (MHCYP) 2017 data. The project will hopefully lead to recommendations which may have an impact on the following categories.

1) Impact on young people with BDD.
Outputs from this work may have direct clinical benefits for young people up to the age of 19 years with body dysmorphic disorder (BDD) across the UK. Findings may also be generalisable internationally. Although the prevalence of BDD is currently unknown, existing research indicates that the disorder is likely to affect 1-2% of adolescents, which equates to approximately 80,000 young people in the UK. BDD is grossly under-diagnosed and there are long delays in young people accessing treatment. This project may help to identify the characteristics of BDD in young people, which could help in raising awareness of this condition. In addition, this project aims to identify demographic and clinical correlates of BDD in youth, which could assist in improving detection and diagnosis of the disorder.

2) Impact on clinical services.
Recommendations based on the study may be able to guide professionals in the detection and diagnosis of BDD, thereby aiding early access to effective treatment. Understanding who is most likely to be affected by BDD (i.e. demographic and clinical correlates) could inform targeted screening of BDD within Child and Adolescent Mental Health Services. Similarly, those offering mental health support in schools (e.g. Education Mental Health Practitioners, school nurses and counsellors) could benefit from improved knowledge of the extent of service need and also which groups are particularly vulnerable to experiencing BDD. This could in turn improve prevention and early intervention.

3) Impact on commissioning.
A major beneficiary will be those designing and commissioning Child and Adolescent Mental Health Services (CAMHS) in England, as the research could provide evidence of the prevalence of body dysmorphic disorder and associated clinical needs.

4) Impact on policy.
Policy and commissioning could be impacted at a national level, with beneficiaries including the Parliamentary Health and Social Care Select Committee, and the Women and Equalities Select Committee. These have an important role in holding the Government to account on child mental health policy, and have an interest in addressing body image problems as demonstrated by their recent inquiries into this topic (e.g. Changing the Perfect Picture: an enquiry into body image published in April 2021). Other relevant national bodies include NHS England, as well as regional specialist mental health and child health commissioning networks. Policy briefings will be widely disseminated across these groups.

5) Impact on society.
As described above, this project may improve the understanding of BDD in young people, thereby increasing early detection, diagnosis and treatment of this condition. Previous BDD research has shown that longer duration of illness is associated with poorer treatment response. Therefore, improving diagnosis and treatment of BDD in youth, which is when the disorder usually emerges, is likely to improve long-term outcomes. This is not only important at an individual level, but may also have benefits at a societal level. If left untreated, BDD is a chronic disorder and associated with unemployment and high levels of service utilisation in adulthood. Early diagnosis and treatment are likely to reduce the financial impact of BDD.

Outputs:

This project aims to contribute to a better understanding of the prevalence and impact of body dysmorphic disorder in children and young people. Specific planned outputs are:
- Peer reviewed journal articles of international standing e.g. Journal of Child Psychology and Psychiatry, Journal of the American Academy of Child and Adolescent Psychiatry (Winter 2022).
- Conference presentations to a range of audiences including health and education (Summer-Autumn 2022).
- Blogs and other public facing outputs, including via social media (e.g. the researchers' Twitter accounts; @georginakrebs and @argStringaris). Public facing outputs will target schools, health professionals and the general public. These will be developed in conjunction with PPI groups and partners such as the BDD Foundation, a national charity (Summer-Autumn 2022).

All outputs will present only aggregate data with small numbers suppressed, in accordance with the special conditions detailed under this Agreement.

Processing:

NHS Digital are the data controller for the MHCYP 2017 survey data. The survey is being carried out by NatCen Social Research and the Office for National Statistics, who are co-data processors.

The collected data is checked, derived further (if required), minimised and pseudonymised (direct and indirect identifiers removed). The pseudonymised data asset is then sent to NHS Digital for information and to the UK Data Service (UKDS) (www.ukdataservice.ac.uk) for storage and further dissemination. Before UKDS are able to release the data to UCL, a Data Sharing Agreement (DSA) must be signed with NHS Digital.

The UKDS will securely transfer the entire standard MHCYP 2017 dataset to UCL. It is not possible to obtain individual variables. Personal data such as names, addresses and dates of birth are not included, only the unique serial number used to represent participants. To minimise the risk of re-identification in this pseudonymised dataset, the study team will also follow the Disclosure control for microdata produced from social surveys guidance set out by the Government Statistical Service.

UCL will electronically store the data in their 'Data Safe Haven', a secure research storage and processing environment, which has security assured under a Data Security and Protection Toolkit. Analysis will take place within the secure research environment. Only study team members who are substantive employees of UCL will have the authorisation to access the data for the purpose(s) described and will access the storage and processing environment using their individual password. Remote access will occur via multi-factor authentication over an encrypted connection. Only aggregate data (with small number suppression applied) will be exported for the purpose of dissemination of findings. The data will only be used for the purposes described in the agreement.

UCL will hold data as per the Data Sharing Agreement length, after which the data will be securely destroyed according to the Data Sharing Agreement between UCL and NHS Digital, unless an extension is applied for and granted.

Only aggregated outputs will be made available to third parties in peer reviewed publications and open access reports. There will be no requirement or attempt to re-identify individuals.

Data management will be done using a statistical analysis package. All analyses will be conducted using survey weights and controlling for complex survey design where appropriate, and for non-response. Descriptive statistics will initially be used, with stratification by age and gender where appropriate. This will be followed by the use of logistic and linear regression models to examine the association of a range of psychiatric comorbidities with BDD, as well as psychosocial impairment and service utilisation.
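
As a rough illustration of what "using survey weights" means for a prevalence estimate, the sketch below computes a weighted proportion. It is an assumption-laden example, not the study's analysis code: the agreement does not name the statistical package, and the function name `weighted_prevalence` and its inputs are hypothetical.

```python
def weighted_prevalence(cases, weights):
    """Weighted proportion of respondents who are cases.

    cases   -- sequence of booleans (True if the respondent meets criteria)
    weights -- sequence of non-negative survey weights, one per respondent
    """
    if len(cases) != len(weights):
        raise ValueError("cases and weights must have the same length")
    total_weight = sum(weights)
    if total_weight <= 0:
        raise ValueError("total weight must be positive")
    # Each respondent contributes in proportion to their survey weight.
    case_weight = sum(w for is_case, w in zip(cases, weights) if is_case)
    return case_weight / total_weight
```

This gives a point estimate only. Standard errors under a stratified, clustered design need design-based variance estimation, which dedicated survey routines in statistical packages provide and which this sketch does not attempt.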

Educational outcomes in children born after assisted reproductive technology; a population-based linkage study — DARS-NIC-258079-G7W1Y

Opt outs honoured: (Excuses: Section 251 NHS Act 2006)

Legal basis: Health and Social Care Act 2012 s261(7); National Health Service Act 2006 - s251 - 'Control of patient information'.

Purposes: No (Academic)

Sensitive: Non-Sensitive, and Sensitive

When:DSA runs 2022-02 – 2025-02 SMLS reported a DPA serious incident; breached contract — audit report.

Access method: One-Off

Data-controller type: UNIVERSITY COLLEGE LONDON (UCL)

Sublicensing allowed: No

AGD/predecessor discussions: IGARD Minutes - 7 October 2021 final.pdf, IGARD Minutes - 3 March 2022 - Final.pdf, IGARD Minutes - 17 February 2022 Final.pdf

Datasets:

Birth Notification Data
Civil Registration - Births
Demographics

Type of data: Anonymised - ICO Code Compliant, Identifiable

Objectives:

Infertility problems are common, with approximately 1 in 6 couples experiencing difficulty in conceiving children naturally. Fertility treatments can help many of these couples, and their use is rapidly increasing. Fertility treatments in which either eggs or embryos are handled are called assisted reproductive technologies (ART), and they include in vitro fertilisation (IVF) and other related techniques. Nearly 1 in 50 children in the UK are born to parents who have benefited from fertility treatments. However, a major outstanding concern is whether children born after assisted reproductive technologies (ART) are at higher risk of developing learning or behavioural problems.

There are biologically plausible reasons for increased vigilance regarding the development of children conceived after ART. These procedures involve the handling of eggs and embryos outside of the body at a vulnerable period of early human development, which could impact upon development of the nervous system and brain (neurodevelopment) in children conceived in this way. ART also carries increased risk of multiple births, premature delivery, and low birth weight, all of which are adversely associated with neurodevelopment.

The long-term cognitive and behavioural development of this increasing population of children has not been adequately investigated to date, because existing studies have been small, and have not included adequate comparison groups. A recent systematic review summarised that there is insufficient evidence to conclude whether the long-term neurodevelopment of children born after ART is comparable to that of spontaneously conceived children. The European Society of Human Reproduction and Embryology has emphasised that high-quality research is essential to understand whether or not this increasing population of children are at higher risk of developing any problems as they grow and develop, so that couples considering ART can receive appropriate and reliable information, and so that any problems in children can be identified and managed early.

University College London (UCL) are therefore conducting a population-based cohort study (a research study in which group(s) of individuals are followed over time) to compare educational outcomes in children born after ART procedures with two control (or comparison) groups of naturally conceived children: 1) naturally conceived siblings of the ART children, and 2) unrelated children chosen at random from the same schools (school-matched controls).

A parallel study conducted by UCL investigating children's health following ART has identified a cohort of children born following non-donor ART procedures (ART procedures which did not involve the use of donor sperm or eggs) throughout England between 1992-2009 (n=86,064), as well as the naturally conceived siblings of these children (n=23,299). These cohorts were identified by linking data from the Human Fertilisation and Embryology Authority (HFEA) database with the Office for National Statistics (ONS) birth records. This data is securely held by NHS Digital, with each individual's records in the dataset pseudonymised using a unique member number (UMN). Pseudonymisation is a technique which helps to protect the personal information of data subjects by replacing information in a data set that identifies an individual with a reference number.
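
The general idea of pseudonymisation described above can be sketched as follows. This is an illustrative example only: how NHS Digital actually generates the UMN is not stated in the agreement, and the function `pseudonymise`, the `umn` field name and the use of random tokens are all assumptions.

```python
import secrets


def pseudonymise(records, id_fields, key_field="umn"):
    """Replace identifying fields with a reference number.

    Returns the pseudonymised records and the lookup table, which would be
    held separately by the data custodian and never sent to researchers.
    """
    lookup = {}      # identifying values -> reference number
    issued = set()   # reference numbers already handed out
    output = []
    for record in records:
        identity = tuple(record[field] for field in id_fields)
        if identity not in lookup:
            token = secrets.token_hex(8)
            while token in issued:  # regenerate on the rare collision
                token = secrets.token_hex(8)
            issued.add(token)
            lookup[identity] = token
        # Keep only the non-identifying fields, plus the reference number.
        cleaned = {k: v for k, v in record.items() if k not in id_fields}
        cleaned[key_field] = lookup[identity]
        output.append(cleaned)
    return output, lookup
```

The same individual always receives the same reference number, so their records can still be grouped for analysis without exposing who they are.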

For this study, the proposal is to explore educational outcomes in this cohort of children born following ART procedures as well as their naturally conceived siblings. This will be achieved by linking the data for these children (held by NHS Digital) with the National Pupil Database (NPD), which is held by the Department for Education (DfE). The NPD contains detailed information about the educational achievement of all pupils in state sector schools and sixth-form colleges in England. Following a change in legislation with the Digital Economy Act 2017, which aims to enable and facilitate the secure use of data from across the government sector for research, access to NPD data is now being made available for research purposes through the ONS Secure Research Service (ONS-SRS). The NPD will also be used to identify a second control (or comparison) group of unrelated (non-sibling) school-matched controls for the ART Cohort (unrelated children chosen at random from the same schools as the ART conceived children). The DfE will transfer the identifiers (including names, date of birth, sex, and postcode) of the unrelated school-matched controls to the ONS. The ONS will then identify information related to the birth of these children (birth weight, multiple birth status, and maternal age) from the national birth records. This birth information is important to ensure that the educational outcomes in these school-matched controls can be meaningfully compared with the ART conceived children. The ONS will then pseudonymise this birth information for the school-matched controls, and deposit it into the ONS-SRS where the pseudonymised data will be analysed by the UCL research team.

Through the linkage of several large existing national datasets, the proposed study will address an important gap in scientific knowledge regarding the educational achievement of children born following ART. It has a number of important methodological strengths compared to existing studies: much larger size, follow-up of children throughout the full age range of school education in England (4 to 18 years), national coverage, and the inclusion of two control (or comparison) groups of naturally conceived children (siblings of ART children, and unrelated school-matched controls) to provide a more robust and reliable assessment of educational achievement in the ART conceived children.

Data linkage using existing national datasets is the least intrusive, most efficient, and only feasible methodology for adequately investigating this research question with a sufficiently large and representative sample size. The existence of the HFEA database in the UK, with the mandatory recording of all treatment cycles of ART undertaken nationally as a legal requirement, offers an internationally unique opportunity to study the health and development of children conceived following ART. Linkage of the national cohort of children born after ART with the NPD is the only way in which the HFEA database can be utilised to investigate the educational outcomes in these children.

Within the remit of this data sharing agreement, UCL specifically aims:
i. To compare educational outcomes among children born following ART with children born following natural conception.
ii. To compare the frequency of special educational needs (SEN) and school exclusion among children born following ART with children born following natural conception.
iii. To compare outcomes for specific types of ART (e.g. intracytoplasmic sperm injection vs in-vitro fertilisation; fresh vs cryopreserved cycles) and specific causes of infertility.

It is necessary to use data for the whole available cohort of children conceived following ART (born between 1992-2009) for this study. This is because the large size of the study cohort is a key methodological strength of this study, and is necessary for the study to deliver reliable and accurate results. The size of the study cohort is critical to allow the study to have the statistical power to adequately address the research question, by providing sufficient numbers of children in the ART conceived group and the sibling control group at each Key Stage level of assessment in the national educational curriculum. The size of the study cohort is also essential to facilitate planned subgroup analysis, which will compare outcomes separately depending on the type of ART treatment used, and depending on the cause of infertility. Under the terms of the Human Fertilisation and Embryology Act 2008, HFEA data on ART cycles carried out before 01/10/2009 can be used for research, subject to ethical approval, without explicit patient consent. Patients can withdraw consent, but only around 350 have done so. This makes the data prior to 01/10/2009 virtually complete. After this date, an opt-in system for research consent applies and the data is much less complete. The data after 01/10/2009 is not covered by Confidentiality Advisory Group s251 approval and is therefore not available for use in this study.

The principles of data minimisation will be respected by limiting the data transfer to the minimum individuals necessary (individuals within the study cohorts), by limiting the variables (or data items) transferred to the minimum necessary to facilitate the linkage and subsequent analysis (see Processing Activities), and by pseudonymisation of the data as soon as data linkage is complete (prior to data analysis by UCL). It is not possible to further minimise the data, as the size and geographic spread of the study cohort across England are essential for the study to have sufficient statistical power to adequately investigate the research question. The ART cohort includes all children born in England following IVF procedures between 1992-2009.

With respect to the processing of information flowing from NHS Digital, the Data Controller for the study will be University College London (UCL). The study purpose and design has been determined by the research team at UCL.

UCL, the DfE and the ONS will be Data Processors. The DfE are processing the data for their respective linkage step (combining data for the study groups with the National Pupil Database). UCL research staff will process the record level pseudonymised data after data linkage processes have been completed. The linked pseudonymised data will be held within the ONS-SRS for analysis by UCL research staff. The ONS-SRS is a system hosted by the ONS which will store the linked record level NHS Digital data. ONS staff check all output to be transferred out of the SRS to ensure that it is aggregated with small numbers suppressed (so that information regarding individuals is not disclosed), and to ensure that it cannot be combined with other data sources to identify individuals.

The main ethical consideration regarding the proposed data dissemination is the use of individuals' data for research without individual consent being sought. Favourable independent review of the study proposal by a Research Ethics Committee (REC) and the Health Research Authority Confidentiality Advisory Group (CAG) has confirmed that use of individual data without consent is justified on the basis of the public interest of this high-quality research and the safeguards in place to maintain data confidentiality (pseudonymisation of the data before it is made available to the UCL research team for analysis, and the data security assurances for the data controllers and processors).

Access to personal identifiable data from NHS Digital will be restricted to a limited number of staff at the DfE who are experienced in handling sensitive data and who are bound by confidentiality agreements as part of their employment contracts and codes of practice. Robust security measures will be in place for the transfer and storage of personal identifiable and sensitive data, and the UCL research team will not have access to any personal identifiable data at any stage. The minimum necessary patient identifiers will be transferred between organisations for the purposes of linkage, alongside birth weight, multiple birth indicator and maternal age (with no transfer of other fertility, treatment or educational data between the organisations performing linkage).

The study has been funded by a research grant from the Nuffield Foundation. The Nuffield Foundation has no role in the design or conduct of the study, and will not be processing or accessing any data. One of the co-investigators in the study team is based at the University of Oxford and Birkbeck University of London. They have an advisory role only, and will advise upon the statistical analysis and interpretation of the data. They will not have access to individual level data in the ONS SRS. The University of Oxford and Birkbeck University of London are not considered to be Data Controllers or Processors for the study.

The legal basis for the data processing proposed for this study is that of Public Task, as set out in Article 6(1)(e) of the GDPR. Universities are classed as public authorities for the purposes of data protection law. When carrying out research, as is proposed in this agreement, UCL will be carrying out tasks in the public interest in its capacity as a public authority. The legal basis for processing special category data is under Article 9(2)(j) of the GDPR (processing is necessary for archiving purposes in the public interest, scientific or historical research purposes or statistical purposes). Appropriate safeguards will be in place when processing data in accordance with Article 89(1) of the GDPR. All data made available to the UCL research team will be pseudonymised using a unique member number (UMN). The UCL research team will not have access to individual identifiable data at any stage. Individual identifiable data will be accessible only to a limited number of DfE staff responsible for conducting the data linkage, within secure DfE internal systems.

Legal provision for the processing of identifiable data from the HFEA Register for the purposes of medical research is provided under section 33D of the Human Fertilisation and Embryology Act 2008. UCL have received approval from the HFEA's Register Research Panel for the use of HFEA Register data for this project.

Yielded Benefits:

This is a new Data Sharing Agreement. No data has been disseminated by NHS Digital for this research study. There are therefore no yielded benefits to date.

Expected Benefits:

The HFEA regulates fertility treatment in the UK, with HFEA policy having the potential to influence the large number of fertility treatments performed each year - approximately 54,000 women received ART in the UK in 2018. UCL will provide the HFEA with a report of the study findings via the HFEA Register Research Panel, and anticipate that the results of the study may influence policy decisions by the HFEA. For example, there are concerns at present that some more invasive and expensive forms of ART are being over-used in some treatment centres, with wide variation in practice between centres. Should the results of this study demonstrate potential risks associated with particular types of ART, this may influence HFEA guidance regarding the use of different types of treatment, potentially saving costs for the health service and/or distress for the recipients of ART. Any change in HFEA guidance is likely to directly influence the clinical practice undertaken by fertility clinics.

The HFEA have also articulated concern about the use of add-on treatments by private fertility clinics, and the linkage methodology established by this study is expected to facilitate replication of the study in the future for evaluation of the safety of emerging technologies. Measurement of these potential benefits is intended to be through changes in HFEA guidance and policy regarding fertility treatment, and ongoing analysis of national ART treatment practices in the HFEA annual fertility treatment statistics.

Publication of the research findings in peer reviewed journals and presentation at scientific conferences is anticipated to enable dissemination of the results to a wider medical, educational, and scientific audience. This is intended to enable the results to influence clinical practice in fertility medicine, including beyond the remit of the HFEA such as internationally. It is also expected to help inform education providers as to whether this group of children may benefit from additional educational support, and enable the results to influence future academic research in this field.

UCL's engagement with the HFEA and the media is anticipated to help disseminate the results of the study to the general public. The study's findings are expected to be of considerable interest to families who have used or are considering ART. Previous research has shown that many parents who have used ART are concerned whether their children are at higher risk of developing learning or behavioural problems as a result of the fertility treatment. There are biologically plausible reasons for increased vigilance regarding the neurodevelopment of children conceived after ART, because of the potential for ART procedures to influence the nervous system during a vulnerable period of development. This important aspect of the health and development of ART conceived children has not been adequately investigated. As educational performance is a key measure of children's neurodevelopmental progress once they reach school age, this study is expected to provide families with robust information that addresses these concerns.

Although UCL do not anticipate that the study's findings would directly influence couples' decisions about whether or not to use ART, couples using ART treatments experience considerable anxiety and psychological stress both during and after treatment. Should the study show there to be no difference in the educational attainment of ART conceived children compared to naturally conceived children, it is anticipated that this will provide considerable reassurance for families. The resulting reduction in anxiety experienced by parents of ART children has the potential to improve their wellbeing and family functioning. The study has the potential to benefit tens of thousands of families annually across the UK. Measurement of this potential benefit is likely to be through ongoing qualitative research and surveys exploring the experiences of people using ART, such as the HFEA national fertility patient survey.

Should the study show evidence of reduced educational attainment in ART conceived children, this will identify that these children may benefit from additional educational support. This could influence the decisions made by parents and education providers regarding the provision of such extra educational support for their children. It might also stimulate further research to explore the specific domains in which children may experience difficulties and benefit from support. Targeted extra support for children could have the potential benefit of minimising or overcoming any identified attainment gaps, thus improving the educational outcomes of these children. This has the potential to benefit a large number of children annually; over 18,000 children were born following ART in the UK in 2018.

Dissemination of the research findings to families using ART and to education providers is planned to be achieved via publicity through various stakeholder organisations, including Fertility Network UK, the HFEA and the Nuffield Foundation. Measurement of the potential benefits of this dissemination is proposed to be through qualitative research and surveys exploring the experiences of families using ART, as well as through follow-up studies of educational outcomes in children conceived by ART (for example by repetition of the linkage methodology used for this study).

The Confidentiality Advisory Group specifically considered the public interest of the research, and concluded: The Group discussed the application and agreed that this defined a clear medical purpose which was in the public interest as high-quality research which was essential to understand whether or not this increasing population of children born following assisted reproduction therapy (ART) were at higher risk of developing learning or behavioural problems as they grow. This information would also enable individuals considering ART to receive appropriate and reliable information, and ensure that any problems in children can be identified and managed early.

Outputs:

Outputs are expected to include:
a) Peer reviewed publications in medical journals. The main results of the study are intended to be submitted for publication in a high impact general medical journal (such as The Lancet, The Journal of the American Medical Association (JAMA), or the British Medical Journal (BMJ)). Publications will be Open Access as per UCL policy, and freely available via both journal websites and UCL webpages.
b) National and international conference presentations. Proposed conferences for presentation include the European Society of Human Reproduction and Embryology (ESHRE) Annual Meeting (2023, date not available yet), and the American Society for Reproductive Medicine (ASRM) Scientific Congress (October 2023).
c) Report for the study funder (Nuffield Foundation), which will be publicly available via the study webpage on the Nuffield Foundation website.
d) Brief lay summary report, which will be made publicly available via the websites of stakeholder organisations (HFEA, Fertility Network UK, Nuffield Foundation, and UCL)

Outputs will contain only aggregate level data with small numbers (<10) suppressed in line with ONS and DfE policy.

Dissemination of the research findings to researchers and scientists will involve presentation at national and international conferences and publication in peer-reviewed medical journals, as detailed above.

Dissemination of the research findings to the public (key stakeholders being parents who have used or are considering ART) will be facilitated through existing collaborations with the HFEA and Fertility Network UK (the leading patient organisation supporting people suffering from infertility). The research project has already been informed by Patient and Public Involvement work, with UCL conducting a survey of parents of children born following ART, which determined that children's educational potential was one of the most common concerns they had.

Dissemination of the research findings to a lay audience will be in the form of a brief research report and a video summary. Communication of the research findings to the public will be via the websites of stakeholder organisations (HFEA, Fertility Network UK, Nuffield Foundation, and UCL), and via the newsletters and social media channels (e.g. Twitter) of these organisations.

Research regarding fertility treatments, including previous work of the UCL research team, has attracted a high level of media interest, and the team anticipate that this will be the case for the proposed study. The team are acutely aware of the potential harmful effect of inaccurate or sensational reporting of research findings in this sensitive area, and the confusion and anxiety this can cause for couples and parents. The team will work closely with the HFEA, Fertility Network UK & UCL to co-ordinate press releases and ensure that information is conveyed accurately and responsibly.

The research team will commence data analysis as soon as the linked data has been made available. The team would anticipate that the process of data analysis, interpretation and report writing would take approximately 12 months. The team anticipate that the analysis will be completed and outputs generated in late 2023, although this estimate is dependent upon the timeframe for data access approvals being obtained and the data linkage being completed.

Processing:

The study will involve the following data processing and linkage steps:

1. Linkage of the ART Cohort and Sibling Cohort to the NPD:

Individual level data held by NHS Digital under DARS-NIC-180665-GJMW5 regarding the ART and Sibling Cohorts will be securely electronically transferred to the DfE, for linkage to the NPD. Data to be transferred will include individual identifiers required to facilitate linkage (forename, surname, gender, date of birth, postcodes at multiple time points, and unique member number (UMN)), as well as existing variables in a separate pseudonymised bridge file that will be used as covariates in subsequent analysis (birth weight, multiple birth indicator, mother's month and year of birth). The DfE will not attempt to match individuals' birth information to their identifiable data. In order to optimise linkage to the NPD - which records postcode data yearly over the course of individuals' period of school education - NHS Digital will transfer all available postcodes for individuals in the study cohorts from 2004 onwards (the period for which the NHS demographics dataset is available). Prior to data transfer, NHS Digital will exclude individuals who have submitted a National Data Opt Out.

Substantively employed staff at the DfE will perform the linkage to the NPD on-site in DfE internal systems, with development of the linkage algorithm in collaboration with the UCL research team. The DfE will link the ART children, and their naturally conceived siblings, to their educational outcome data (from the Early Years Foundation Stage Profile through to Key Stage 5), as well as to information regarding secondary outcome measures (special educational needs and school exclusion).

2. Identification of school-matched controls:

Staff at the DfE will identify unrelated (non-sibling) matched controls for the cohort of ART children, using a 1:1 matching algorithm developed alongside UCL. For each level of Key Stage outcome data available for each child in the ART Cohort, one matched control will be identified and randomly selected from the NPD database, with matching by school, age, and gender. A different matched control will be identified for each level of Key Stage outcome data available for each ART child. This is because children can change school during follow-up, and because a single control child may not have outcome data available at all of the Key Stage levels for which the matched ART child has data available. Linking the matched controls back to the ART Cohort file will facilitate a check that the matched controls were not conceived by ART. If any matched controls are found to be ART conceived, they will be replaced with an alternative randomly selected matched control who is not ART conceived.
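
The matching step described above can be illustrated with a minimal sketch. This is not the algorithm developed by the DfE and UCL, which is not specified in this agreement; the record layout (the keys 'pupil_id', 'key_stage', 'school', 'age' and 'gender') and the function name are hypothetical. Two simplifications are assumed: known ART conceived children are excluded from the candidate pool up front rather than replaced after a later check, and the same control may be drawn for more than one ART child, since the text does not say whether controls can be reused. Exclusion of siblings is not modelled.

```python
import random
from collections import defaultdict


def select_matched_controls(art_records, npd_records, art_ids, seed=0):
    """Pick one random control for each (ART child, Key Stage) record.

    art_records, npd_records: lists of dicts with the hypothetical keys
        'pupil_id', 'key_stage', 'school', 'age', 'gender'.
    art_ids: set of pupil ids known to be ART conceived; these are never
        eligible as controls (a simplification of the replacement step).
    """
    rng = random.Random(seed)

    # Index eligible controls by the matching variables.
    pool = defaultdict(list)
    for rec in npd_records:
        if rec["pupil_id"] not in art_ids:
            key = (rec["key_stage"], rec["school"], rec["age"], rec["gender"])
            pool[key].append(rec["pupil_id"])

    matches = {}
    for rec in art_records:
        key = (rec["key_stage"], rec["school"], rec["age"], rec["gender"])
        candidates = pool.get(key, [])
        # None signals that no eligible control exists for this record.
        chosen = rng.choice(candidates) if candidates else None
        matches[(rec["pupil_id"], rec["key_stage"])] = chosen
    return matches
```

Because the result is keyed on both the child and the Key Stage, a child who changes school between Key Stages is matched independently at each level, which mirrors the reason given above for using a different control per Key Stage.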

UCL will request the following outcome and demographic data held in the NPD for children in the study cohorts: test scores for national Key Stage pupil assessments at fixed points during school education between 5 and 18 years of age (Early Years Foundation Stage Profile through to Key Stage 5), special educational needs (SEN) status, school exclusion status, ethnicity, eligibility for free school meals, main language, and Income Deprivation Affecting Children Indices (IDACI) score.

3. Linkage to ONS birth records for school-matched controls:

In order to ascertain information regarding key potential confounding variables (birth weight, multiple birth status, and maternal age) for the unrelated school-matched controls, a minimum number of individual identifiers (names, dates of birth, gender, and postcode) for the unrelated school-matched control group will be securely transferred from the DfE to ONS for linkage to ONS birth records. The data linkage will be performed by ONS staff, within ONS internal systems. N.B. This data linkage step is outside of the scope of the data sharing agreement with NHS Digital, occurring under a separate agreement between UCL, the DfE and ONS.

4. Creation of final dataset for analysis:

Once data linkage processes have been completed, the data will be pseudonymised by DfE and ONS staff using the UMN. The pseudonymised dataset will contain demographic data (gender, age, age within academic year, ethnicity, area-based index of deprivation (IDACI), and eligibility for free school meals) as well as educational outcome data for children in all three groups (ART Cohort, Sibling Cohort, and unrelated school-matched controls). The pseudonymised datasets will be transferred from DfE and ONS internal systems to the ONS-SRS system to allow access by the UCL research team for data analysis. The bridge file of birth covariates (birth weight, multiple birth status, and maternal age) transferred from NHS Digital to the DfE, keyed on UMN, will also be onwardly transferred into the ONS-SRS and not retained by the DfE. The SRS is the ONS facility for providing secure access to sensitive detailed data for accredited Approved Researchers, and conforms to the NHS Data Security and Protection Toolkit. Named Approved Researchers are able to access data related to their project either through a secure online connection to the SRS system from a University computer, or at a physical desk in one of the ONS offices (designated Safe Settings).

Data regarding fertility treatment for the ART cohort is held by UCL in pseudonymised form (using the UMN), under a data sharing agreement with the HFEA. This will be securely transferred from UCL to the ONS-SRS. This data will include the type of ART used and the cause of infertility. The UCL research team will link this data to the demographic and outcome data for the cohort within the ONS-SRS system, using the UMN.
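
As an illustration only, linkage on the UMN amounts to a keyed join between the treatment file and the outcome file. The sketch below assumes each file is a list of dictionaries carrying a hypothetical 'umn' field; the other field names are invented for the example. Rows without treatment data (for instance siblings and school-matched controls) are kept unchanged, which is an assumption about the desired behaviour rather than something this agreement states.

```python
def link_on_umn(treatment_rows, outcome_rows):
    """Attach treatment fields to outcome rows that share the same UMN.

    Every outcome row is kept (a left join); rows with no matching
    treatment record are returned without treatment fields.
    """
    treatment_by_umn = {row["umn"]: row for row in treatment_rows}
    linked = []
    for row in outcome_rows:
        treatment = treatment_by_umn.get(row["umn"], {})
        # Outcome fields take precedence if a field name appears in both.
        linked.append({**treatment, **row})
    return linked
```

For example, link_on_umn([{"umn": 1, "art_type": "ICSI"}], [{"umn": 1, "ks2": 104}, {"umn": 2, "ks2": 99}]) returns one row carrying both 'art_type' and 'ks2' for UMN 1, and the UMN 2 row unchanged.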

The UCL research team will conduct analysis of the final pseudonymised dataset through the ONS-SRS. At no time will UCL have access to identifiable data. Access to the data will be restricted to a limited number of named UCL research staff working on the project (both substantive employees of UCL and honorary contract holders). Named UCL research staff responsible for conducting the analysis for the project will complete the ONS Researcher Accreditation process, which involves specific training in the safe use of research data environments. They will sign and adhere to the ONS Accredited Researcher Declaration, and will be required to adhere to ONS data protection policies and procedures. This training in data protection and confidentiality through the ONS Researcher Accreditation processes, and the contractual agreements with the ONS to protect the data through the ONS Accredited Researcher Declaration, will apply to both substantive UCL employees and honorary contract holders working on the project.

In the data analysis, performance at each Key Stage level will be compared between the ART group and each of the two control groups (naturally conceived siblings and unrelated matched controls) using linear regression models. Because the data are matched, mixed effects linear regression models will be used. Regression models will also be used to analyse the secondary outcomes (SEN and school exclusion). Because these outcome variables are binary (yes/no) variables, logistic regression models will be used. Analyses will control for birth weight, multiple birth status, birth order, and age within academic year (by month). Analysis involving matched, unrelated controls will additionally control for ethnicity, maternal age at time of birth, postcode-linked social deprivation score (Income Deprivation Affecting Children Indices), and eligibility for free school meals (as a marker of poverty). Subgroup analyses will be performed for children conceived using different types of ART treatment (intracytoplasmic sperm injection vs in-vitro fertilisation, fresh vs cryopreserved cycles) and also for children whose parents have specific causes of infertility (female factor, male factor, both male and female factors, unexplained).
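
One possible way to write down models of this kind is sketched below. The notation is an assumption made for illustration and is not taken from the study protocol: y_ij is the Key Stage score of child i in matched set j (a family or a school-matched pair), ART_ij indicates ART conception, x_ij collects the covariates listed above, u_j is a random intercept for the matched set, and z_ij is a binary secondary outcome. Whether the logistic models would also include a random intercept is not stated in the text, so none is shown.

```latex
% Mixed effects linear model for a Key Stage score (illustrative form)
y_{ij} = \beta_0 + \beta_1\,\mathrm{ART}_{ij}
       + \boldsymbol{\beta}^{\top}\mathbf{x}_{ij} + u_j + \varepsilon_{ij},
\qquad u_j \sim \mathcal{N}(0, \sigma_u^2),
\quad \varepsilon_{ij} \sim \mathcal{N}(0, \sigma_\varepsilon^2)

% Logistic model for a binary secondary outcome (SEN or school exclusion)
\operatorname{logit} \Pr(z_{ij} = 1)
  = \gamma_0 + \gamma_1\,\mathrm{ART}_{ij}
  + \boldsymbol{\gamma}^{\top}\mathbf{x}_{ij}
```

In this form, \beta_1 is the adjusted mean difference in score between ART conceived children and controls, and \exp(\gamma_1) is the adjusted odds ratio for the secondary outcome.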

Individual level data will at all times be held within ONS systems, and at no stage will individual level data be transferred to or processed within UCL systems. All data to be transferred out of the SRS (the results of the analyses) will be checked by ONS staff to ensure that no individual level data, or potentially identifiable data, is transferred. Only aggregate level data, with small numbers (<10) suppressed, will be transferred out of the SRS system for publication. When the UCL research team access the pseudonymised dataset in the ONS-SRS they will be performing all analyses within the ONS-SRS system, and will only be extracting aggregated data.
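
The small-number rule described above can be sketched as a simple masking step applied to a table of counts before release. This is an illustration and not the ONS output checking procedure: the function name and the '<10' marker are hypothetical, and the sketch masks every count below the threshold, including zero, because the text does not say how true zeros are treated. It also does not attempt secondary suppression (masking further cells so that a suppressed value cannot be recovered from row or column totals).

```python
def suppress_small_counts(table, threshold=10, marker="<10"):
    """Return a copy of a {category: count} table with small counts masked.

    Any count strictly below `threshold` is replaced by `marker`;
    counts at or above the threshold are left unchanged.
    """
    return {
        category: (marker if count < threshold else count)
        for category, count in table.items()
    }
```

For example, suppress_small_counts({"excluded": 3, "not excluded": 250}) returns {"excluded": "<10", "not excluded": 250}.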

Identifiers that the research team at UCL will have access to (ethnicity, gender, and month & year of birth) are important potential confounders to control for in the analysis. The identification of research participants will not be possible from these variables. UCL does not hold any other information which could allow identification of individuals through interaction with the pseudonymised data that UCL will receive during the study (from NHS Digital, the DfE, and the ONS). There will be no requirement or attempt to reidentify individuals.

5. Data Retention:

Individual identifiable data for the study cohort will be held by the DfE for a period of three months following completion of the data linkage. This will allow for DfE staff to investigate any data anomalies or discrepancies, should they be identified by the UCL research team during data cleaning and analysis. The DfE will securely delete all identifiable and birth covariate data for the study cohort after this three-month period. Only the pseudonymised analysis dataset will be retained after this point, to facilitate analysis by the UCL research team.

NHS Digital reminds all organisations party to this agreement of the need to comply with the Data Sharing Framework Contract requirements, including those regarding the use (and purposes of that use) by Personnel (as defined within the Data Sharing Framework Contract ie: employees, agents and contractors of the Data Recipient who may have access to that data).

1970 British Cohort Study - Tracing — DARS-NIC-129836-D5F3W

Opt outs honoured: Yes (Excuses: Section 251 NHS Act 2006)

Legal basis: Health and Social Care Act 2012 s261(7); Other-National Health Service Act 2006 - s251 - 'Control of patient information'.

Purposes: No (Academic)

Sensitive: Sensitive

When:DSA runs 2022-10 – 2025-10 2022.12 — 2025.11. SMLS reported a DPA serious incident; breached contract — audit report.

Access method: Ongoing

Data-controller type: UNIVERSITY COLLEGE LONDON (UCL)

Sublicensing allowed: No

Datasets:

Demographics

Type of data: Identifiable

Objectives:

The Centre for Longitudinal Studies (CLS) at University College London (UCL) is an academic resource centre responsible for producing and disseminating data resources for the scientific community. It is responsible for four of Britain's internationally renowned longitudinal cohort studies: the 1958 National Child Development Study, the 1970 British Cohort Study (BCS70), Next Steps, and the Millennium Cohort Study (MCS). All these studies are following the groups of participants from cradle to grave. As such, this group of studies is unique and has provided, and is still providing, a wealth of information used in the policy decisions affecting society's health and well-being.

This Data Sharing Agreement is specific to the 1970 British Cohort Study (BCS70).

The purpose of this Agreement is to update CLS' database with new addresses. CLS is requesting to receive updated addresses for the BCS70 cohort members on an annual basis. The new addresses will be used to invite participants to take part in the upcoming survey. They will also be used to communicate with participants in between sweeps for various other activities related to their participation in the survey, for example pilot surveys, participant engagement, and birthday communications.

Background
The British Cohort Study 1970 (BCS) is one of Britain's world-renowned national longitudinal birth cohort studies. It follows a large sample of individuals born over a limited period of time (all those born in one week in 1970) through the course of their lives, charting the effects of events and circumstances in early life on outcomes and achievements later on. The study has its origins in the British Births Survey in which information was gathered about almost 17,500 babies. The original study focused on the circumstances and outcomes of birth but since then the study has broadened in scope to map all aspects of health, education, social and economic development. The Study is funded by the Economic and Social Research Council (ESRC).

Since 1970 there have been nine attempts to gather information from the whole cohort. Over time, the scope of enquiry has broadened from a medical focus at birth, to encompass physical and educational development at the age of five, physical, educational and social development at the ages of ten and sixteen, and then to include economic development and other wider factors at ages 26, 30, 34, 38, 42 and 46. The current sweep age 51 was scheduled to begin in 2020. Due to the pandemic, it had to be postponed. It is now underway and is scheduled to continue until early 2023.

The Sweep age 51 will provide the opportunity to collect a range of information from cohort members to aid the understanding of midlife outcomes across multiple life domains and their lifetime determinants. This data collection will build on the extensive data collected from birth and across the lifetime of cohort members and will facilitate comparisons with other generations, particularly the 1958 cohort at 50, and the 1946 cohort at 53, allowing for the study of social change. The data will be of interest to researchers working in a wide range of disciplines, including population health and epidemiology, economics, sociology, demography, psychology and others. It has the potential to inform a wide range of policies, including those relating to work, health, relationships, and civic participation. It will include a face-to-face interview with cognitive assessments (some interviews may be carried out via video link rather than in person), a paper self-completion questionnaire and an online diet questionnaire. The interview and paper self-completion questionnaire will cover the following three broad themes:

- Family, relationships and identity: including topics such as social networks, relationships with partners, parents, children, friends, neighbourhood, social and cultural capital, social and political participation, attitudes and values, religion, and expectations.
- Finances and employment: including topics such as work, income, wealth (savings and debts, pensions, and housing), inheritance (receiving and giving) and other transfers, and education.
- Health, wellbeing and cognition: including topics such as physical health, mental health, medical care, medication, smoking, drinking, diet, exercise, and cognitive function.

Data Summary
CLS are requesting access to record level, identifiable data linked to the cohort from the following datasets:
- Demographics dataset

Aim
Of the approximately 17,500 individuals that have ever participated in the study there will always be a number of individuals for whom the Centre for Longitudinal Studies (CLS) at University College London will not have confirmed addresses at the time of carrying out the next survey.

The ongoing success of the study depends on maintaining contact with as large a number of study members as possible. Therefore, CLS are seeking permission to be supplied with updated addresses for BCS70 cohort members. CLS believe that a substantial number of these individuals would be willing to participate in the Age 51 survey if they could be contacted. Previous efforts to re-establish contact for earlier surveys for the BCS70 cohort study have been very successful, using NHS Digital data to assist with maximising participation in the Surveys.

All of these individuals have made an informed decision to participate in the study over the years and have been made aware that the study is seeking to follow them throughout their lives. This information is provided to participants on the study website under 'How we find you' (https://bcs70.info/faqs/#keeping-in-touch#): 'Do you use information held by Government to find us?'. CLS provide a link to the information on the study website in all materials provided to cohort members. Cohort members receive an advance booklet with complete information about each upcoming survey.

Each year CLS sends a postal mailing to all BCS70 participants. CLS asks participants to complete and return a reply slip, which allows them to tell CLS about any change in their details, e.g. a new email address or phone number. CLS also asks them to return the reply slip even if none of their details have changed, i.e. to give positive confirmation that the address CLS holds for them is correct. As a result, CLS can maintain the cohort members' latest details on the BCS70 database. If the annual mailing does not reach the participant, it is returned to CLS as a 'return to sender'. CLS will attempt to trace all these returns, but if CLS cannot locate the participants then they are flagged on the database as a 'gone-away'. NHS Digital may hold a more recent address, which would give CLS an opportunity to invite the cohort member to re-join the study. CLS will send the details of approximately 12,000 cohort members to be linked to the NHS Digital Personal Demographics Service (PDS) dataset. This number excludes those who have died and those who have requested to be withdrawn from the study. NHS Digital will supply new addresses for study members who can be matched to the PDS dataset.
CLS have contracted an external supplier, NatCen Social Research (the trading name of the National Centre for Social Research), to carry out the interviews with individual study members for the Age 51 Survey. NatCen have also contracted an additional Data Processor, Kantar Public, to assist with the interviews.

CLS intend to use the new addresses to invite participants to continue participating in the study. This will include sharing addresses with NatCen and Kantar so that participants can be invited to take part in the current Age 51 Survey. Further details about exactly how addresses will be used are provided below in Section 5b.

If a study member makes clear that they do not wish to take part in the study this is flagged on the CLS database with a code denoting whether their refusal is temporary (i.e. for a particular wave/survey) or permanent (i.e. they wish to have no further involvement in the study). Any previously deposited anonymised survey data for a study member and confidential data from the address database are retained unless the study member specifically asks us not to, in which case this data is securely deleted.

With regard to a request for 'withdrawal' from a participant CLS classifies them as a 'withdrawal from the current survey' or a 'withdrawal from the study' and these are handled slightly differently:
(a) Withdrawal from the current survey: CLS will flag this on its computer system to indicate that the participant will not be taking part in the current survey and the reason for not wanting to take part is also recorded. For example, they may just not have the time to take part. Therefore, there will be no further contact with the participant for the duration of the current survey, but they will be invited to take part in the next survey.
(b) Withdrawal from the study: CLS will flag this on its computer system as a permanent refusal to indicate that the participant will not be taking any further part in the study itself and the reason for this type of withdrawal is also recorded for analysis purposes. Therefore, there will be no further contact with the participant for the remainder of the longitudinal study. If this request is received in writing, then CLS will acknowledge the request and notify the participant that they have been flagged and will no longer be contacted or receive any further communications. This request may sometimes be accompanied by a request for the destruction of their data.

As University College London (UCL) determines the purposes and means of personal data processing under this agreement, it is the sole Data Controller and also processes data. NatCen and Kantar Public are Data Processors.

UCL's legal basis for processing (acquiring, linking and sharing) personal data is public task under GDPR Article 6(1)(e), i.e. processing is necessary for the performance of a task carried out in the public interest (as is made explicit to participants in the information leaflets provided). UCL also process special categories of personal data for research under GDPR Article 9(2)(j), i.e. processing is necessary for archiving purposes in the public interest, scientific or historical research purposes or statistical purposes. In addition, for ethical reasons and under the Common Law Duty of Confidentiality, UCL sought permission from the Confidentiality Advisory Group to access this data without consent. CLS has also received Research Ethics Committee (REC) approval for tracing participants via NHS Digital.

Research using the data from this study will contribute to a body of evidence which has the potential to inform government and result in policy change in various areas, including health. As a result, it has the potential to benefit the public in general.

Participants are aware that the study will attempt to trace them, and CLS are confident that many of those newly traced via the NHS will be happy to take part. CLS also uses its own methods for tracing cohort members, for example asking the named stable contacts (relative, neighbour or friend) for the participant's new address.

The Economic and Social Research Council (ESRC) are the funder for this study. No data received under this agreement will be shared with the funder.

Yielded Benefits:

In the BCS70 Age 42 Survey, which was conducted in 2012, UCL CLS achieved almost 800 interviews with participants who had been newly traced to an address supplied by NHS Digital, and this included almost 600 interviews with study members who had not taken part in the study for over 10 years. This shows very clearly the value of tracing participants in this way. The addresses obtained in previous versions of this agreement were also very useful for inviting study members (BCS70 age 46 sweep) to take part and re-engage with the studies. Using addresses provided by NHS Digital helped CLS get in touch with cohort members who would otherwise not have been able to take part in a new survey. The BCS46 data is now available for researchers to access via the UK Data Service, providing an important resource for UK Social Science, including researchers in health and social care. A link to the dataset is provided here https://beta.ukdataservice.ac.uk/datacatalogue/studies/study?id=8547

Expected Benefits:

The continuing success of the study will be underpinned by the successful matching of untraced cases. Submitting the cohort for up to date contact details and being able to keep the information for future sweeps, will allow the researchers to re-contact the participants who CLS have lost touch with and give them the opportunity to re-engage or clearly state that they wish to withdraw. It will also ensure that literature goes to the correct name and address. It will also ensure that no contact will be made with participants who have died.

The information collected during the Age 46 Survey and in the future sweeps age 51 and 54 may enable researchers to uncover life course and inter-generational factors which contribute to healthy ageing among this generation, and thus to inform the development of preventative health policies across the whole of life that will expand healthy life expectancy, and reduce the burden of ill-health and disease at older ages.

The BCS46 data is now available for researchers to access via the UK Data Service, providing an important resource for UK Social Science, including researchers in health and social care. This sweep involved many data collection elements, including a full range of bio-measures administered by a nurse. The inclusion of objective measures of health will allow researchers to assess the longitudinal predictors of health in midlife. Many of the measures were included in CLS's other study, the NCDS age 44 biomedical sweep, which will allow for cross-cohort comparisons. A link to the dataset is provided here https://beta.ukdataservice.ac.uk/datacatalogue/studies/study?id=8547

The use of the data may result in papers that will be published, presented at conferences and sometimes receive media coverage. Most papers will contribute to a body of evidence which will result in improvements to health care users' experience or health care delivery. It is expected that occasionally these may have a higher impact, such as the examples highlighted below:

- For example the Welsh Government policy on early years planning - http://www.closer.ac.uk/news-opinion/2013/welsh-governments-early-years-childcare-plan-draws-evidence/

- Encouraging reading for pleasure and children's cognitive development.
Research using data from the 1970 British Cohort Study (BCS70) has revealed how reading for pleasure can help children excel not only in English but also in maths. This important work, led by CLS, has had a big influence on reading for pleasure programmes, policies and practice in the UK and beyond, benefitting millions of children worldwide. The link between reading for pleasure and children's maths and vocabulary scores was covered extensively in the media, including in articles in the Daily Telegraph, Sydney Morning Herald and Vancouver Sun and in interviews for BBC Radio 4's Today Programme, BBC London and Al Jazeera. The findings attracted a remarkable amount of interest from schools, libraries and literacy organisations around the world. They have been used to help protect library services, to persuade children of all ages to spend more time reading, and to encourage parents to support schools' home reading initiatives. In the UK, the research was cited in a 2015 Department for Education report, 'Reading: the next steps', underpinning recommendations for government funding to support book clubs, resources for reading, and instructing schools to promote library membership.
Selected coverage:
The Guardian: 'Reading for fun improves children's brains, study confirms'
Daily Telegraph: 'Reading for pleasure boosts pupils' results in maths'
Vancouver Sun: 'Libraries are worthwhile public investment'
Sydney Morning Herald: 'Reading gives kids an edge, study says'

Outputs:

The addresses supplied by NHS Digital under this agreement will help to boost the sample size and to increase the data collection at age 51. Prior to a new survey collection, there are always a number of people with whom CLS will have lost contact between the previous survey and the upcoming survey. The output of the demographic data will be participation in the survey by those participants who wouldn't otherwise have participated in the Age 51 Survey. In the BCS70 Age 42 Survey, which was conducted in 2012, CLS achieved almost 800 interviews with participants who had been newly traced to an address supplied by NHS Digital, and this included almost 600 interviews with study members who had not taken part in the study for over 10 years. This shows very clearly the value of tracing participants in this way.
The main outcome for the study is the next sweep, Age 51 currently underway and scheduled to complete in 2022. This will be a fully documented, anonymised research dataset which will be archived with the UK Data Service to provide a strategically important resource for UK Social Science, including researchers in health and social care. For clarity, the data stored on UK Data Service does not include NHS Digital data but responses from participants of the study.

The scientific priorities and questionnaire content of the survey were elaborated and developed in consultation with the academic and policy community with the aim of collecting information relevant both to cohort members' lives at age 51 and to later life outcomes, as well as repeat measures of topics covered at age 51. CLS will continue to prospectively harmonise the content with other comparable cohorts, particularly those in the UK, by drawing on comparable measures at a similar age. All surveys are overseen by the CLS Strategic Advisory Board (SAB), which contains representatives from UK Research and Innovation (UKRI), Wellcome Trust, Medical Research Council, the scientific community, and government departments. The SAB provide high-level strategic oversight for CLS to ensure the cohort studies led by the centre are developed, managed, and maintained in a manner that maximises their benefit as long-term scientific resources of importance both nationally and internationally, while protecting participants' interests. The SAB ensure that the content is closely aligned with research priorities, as well as the areas of research interest (ARIs) published by government departments. CLS reflect these priorities when deciding on the major themes which CLS intend to cover at the sweep.

In addition to the creation of this rich database, which will be BCS70 Age 51, CLS will continue to produce outputs from the study via the UK Data Service in the form of aggregated reports with small numbers suppressed, for the benefit of the wider research community, as BCS70 data has previously proven to be in demand across a wide range of research areas.

CLS will also publish papers in a range of journals; however, it is not possible to provide detail at this point as to precisely which journals and dates, but the intention is to produce outputs along the same lines as those produced after the previous sweep. All scientific papers using the BCS70 data are published on the CLS Bibliography page online.
https://www.bibliography.cls.ucl.ac.uk/Bibliography.aspx?sitesectionid=647&sitesectiontitle=Bibliography

Processing:

NHS address tracing and matching variables:
CLS wish to use the demographics data to receive updated addresses regularly (annually) for the BCS70 cohort members.

CLS will supply NHS Digital with a file of approximately 12,000 study members to match to NHS data. The file supplied will only contain eligible study members who have participated in at least one wave of BCS70. It will not include study members known to have died or to have withdrawn from the study. The file will contain the following identifying data items:
- CLS identifier
- First name
- Last name
- Middle name (where available)
- Date of birth
- Sex
- Last known address, and postcode
- NHS Number
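
The eligibility rule stated before this list (at least one wave of participation, not known to have died, not withdrawn from the study) can be sketched as a simple filter. This is only an illustration: the `StudyMember` class, its field names and `build_tracing_file` are hypothetical and do not describe the actual CLS database or file format.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class StudyMember:
    # Hypothetical record layout, for illustration only.
    cls_id: str
    waves_participated: int
    deceased: bool = False
    withdrawn_from_study: bool = False


def eligible_for_tracing(member: StudyMember) -> bool:
    """True if the member may be included in the file sent for matching."""
    return (
        member.waves_participated >= 1
        and not member.deceased
        and not member.withdrawn_from_study
    )


def build_tracing_file(members: List[StudyMember]) -> List[StudyMember]:
    """Keep only eligible members, preserving their original order."""
    return [m for m in members if eligible_for_tracing(m)]
```

A permanent withdrawal from the study excludes a member here, whereas a withdrawal from a single survey would not, which is why the sketch tracks only the study-level flag.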

NHS Digital will match the cohort identifiers to the demographics data and then supply the following details to CLS:
- CLS identifier
- NHS number
- Requested fields from Demographics dataset.

No other data will be linked to the NHS Digital data received.

The NHS number is very useful in determining whether a returned record belongs to the correct cohort member. Where there are people in the NHS database with the same name as the cohort member, and the address CLS receive is different from the address held in CLS's database, CLS will use the full name and the NHS number to ensure that the new address NHS Digital have sent is for the right person. CLS do not use the NHS number for anything other than validation and matching: when CLS need to send NHS Digital a file for linkage in future, the NHS number will be included in the matching file so that NHS Digital can more easily match CLS's cohort members in NHS Digital's database.

All those accessing the data supplied by NHS Digital are substantive employees of University College London or employees of the processor organisations (NatCen and Kantar Public) carrying out work on behalf of UCL who have been appropriately trained in data protection and confidentiality.

At UCL, the NHS Digital data will be held securely at the UCL Data Safe Haven (DSH) and accessed remotely by CLS staff. The UCL DSH is certified to ISO 27001:2013 and is compliant with NHS Digital's Data Security and Protection Toolkit. Staff using the DSH complete annual training and regularly review data access arrangements, ensuring access to data is limited to those authorised to have it. UCL Computing Regulations are based on the premise that access to resources is generally forbidden unless expressly permitted. All data transfers from the DSH require approval and are carried out through secure portals which are fully audited. Access to the UCL DSH is via remote desktop and requires multi-factor authentication. In addition to a strong password, each user has to use a six-digit number generated by a smartphone app or physical token at each login. Passwords must be changed at regular intervals, and unused accounts are automatically disabled after a fixed period. Once inside the environment, robust access control ensures that researchers can only examine information that they are approved to use.

The data file supplied by NHS Digital will be reviewed by CLS, who will make a judgement as to whether each address should be considered new. The decision as to whether to regard an address provided by the NHS as new will be made as follows:

- Is the NHS address the same as the current address held on the study database, or the same as a historical address held in our database which we have previously established is no longer the address of the study member? If so, the NHS address will not be uploaded.
- Has the current address on our database been confirmed in a recent survey or via some other way? If so, the current address will be retained and the NHS address will not be uploaded.
- Is the date associated with the NHS address more recent than the date at which the current address on our database was most recently confirmed? If not, the current address will be retained and the NHS address will not be uploaded.

The NHS address will therefore be regarded as a new address if a) we do not already have a recently confirmed address on our database, b) the NHS address is more recent than the address on our database and c) the NHS address is not the same as an existing address on our database. Addresses which are considered to be new will be uploaded into our database.
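
The three conditions above can be read as a single decision function. The sketch below is illustrative only: `HeldRecord`, `normalise`, `is_new_address` and the `recent_cutoff` parameter are hypothetical names, and the agreement does not define how "recently confirmed" is measured or how addresses are compared, so both are assumptions here.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import Optional, Set


@dataclass
class HeldRecord:
    # Hypothetical view of what the study database holds for one member.
    current_address: Optional[str]
    last_confirmed: Optional[date]
    historical_addresses: Set[str] = field(default_factory=set)


def normalise(address: str) -> str:
    """Assumed comparison rule: ignore case and repeated whitespace."""
    return " ".join(address.upper().split())


def is_new_address(nhs_address: str, nhs_date: date,
                   held: HeldRecord, recent_cutoff: date) -> bool:
    # c) Reject addresses already known (current or historical).
    known = {normalise(a) for a in held.historical_addresses}
    if held.current_address:
        known.add(normalise(held.current_address))
    if normalise(nhs_address) in known:
        return False
    if held.last_confirmed is not None:
        # a) Reject if the held address was confirmed recently
        #    (the cutoff date is an assumption, not from the agreement).
        if held.last_confirmed >= recent_cutoff:
            return False
        # b) Reject if the NHS address is not more recent than the
        #    last confirmation of the held address.
        if nhs_date <= held.last_confirmed:
            return False
    return True
```

Only addresses for which the function returns `True` would be uploaded; in every other case the current address is retained.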

At the outset of any data collection project, CLS sends a sample file to the fieldwork agency which contains the latest contact details for all study members who are to be invited to take part. This information includes name, sex, date of birth, addresses, telephone numbers and email addresses.

The individuals selected to take part will be all those who have taken part in at least one recent survey (last three sweeps) and all those who have not taken part in a recent survey but for whom we have recently obtained new contact details. NatCen will send a letter (and email) to all study members on behalf of CLS, which will invite them to participate in the forthcoming survey and will let study members know that an interviewer from NatCen or from Kantar Public will be making contact with them soon. NatCen will allocate half of the study members to be contacted and interviewed by Kantar Public interviewers (NatCen have sub-contracted 50% of interviewing to Kantar Public because of capacity constraints). NatCen will send the names and addresses of these cases to Kantar Public so that they can allocate study members to their interviewers. NatCen interviewers and Kantar Public interviewers will both gain access to the names and addresses via NatCen systems.

Kantar Public interviewers will access NatCen systems via a Virtual Machine Network.

Invitations are sent out and then shortly after, interviewers will make contact with participants via telephone and via personal visits to the address.

Ideally, this NHS tracing exercise would have been completed prior to the commencement of the current fieldwork so that the NHS addresses could have been supplied to NatCen in the original sample file. This would have allowed invitation mailings to be sent to the new addresses.

Fieldwork on the current survey is being conducted in waves. In total around 12,000 study members will be invited to take part, in waves of around 2,000 which are spaced several months apart.

We have so far issued 3 waves of fieldwork so around half of the total. For these waves, postal invitation mailings were sent to the latest addresses held in our database and interviewers have been trying to reach these cases in order to see if participants are willing to take part and to conduct interviews if so.

Because fieldwork has already started, the new NHS addresses will be provided to NatCen as an update to the original sample file.

The way in which the new NHS addresses will be used will depend on whether the individual has already been invited to take part and whether they have been located.

If an individual has already been sent their invitation, and the interviewer has subsequently located them, then the NHS address will not be used.

If an individual has already been sent their invitation, but the interviewer has not been able to locate them then a new invitation letter will be sent to the new NHS address and the interviewer will then attempt to contact the individual at that address.

If an individual is allocated to a wave that has not yet commenced then the initial invitation mailing will be sent to the new NHS address and the interviewer will then attempt to contact the individual at that address.

If a letter that is sent to the new NHS address is returned to sender, or the interviewer is unable to make contact with the participant at that address, then interviewers will try to locate the participant elsewhere, and failing this the individual will be classified as untraced and marked as such on the CLS database. If the interviewer makes contact with a participant but the participant makes clear that they no longer wish to take part in the study, the case will be marked as a permanent refusal on the CLS database and will not be invited to take part in any future studies.
If a new address is obtained for a participant who was not included in the original sample file provided to NatCen it will not now be possible to invite that participant to take part in the Age 51 Survey. The address would still be uploaded to the CLS database. CLS send an annual mailing to all study members each April in their birthday week. The new addresses for cases not already provided to NatCen will be used to send this birthday mailing.
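
The handling rules for a new NHS address during fieldwork amount to a small decision table, sketched below. The `Action` enum, the `route_nhs_address` function and its three boolean inputs are hypothetical names chosen for illustration; they are not taken from CLS or NatCen systems.

```python
from enum import Enum


class Action(Enum):
    NOT_USED = "NHS address not used; participant already located"
    SEND_NEW_INVITATION = "new invitation sent to NHS address, then interviewer contact"
    SEND_INITIAL_INVITATION = "initial invitation sent to NHS address, then interviewer contact"
    BIRTHDAY_MAILING_ONLY = "address uploaded and used for the annual birthday mailing"


def route_nhs_address(in_sample_file: bool, invited: bool, located: bool) -> Action:
    """Decide how a new NHS address is used for one study member."""
    if not in_sample_file:
        # Not in the original sample file: cannot be invited to Age 51.
        return Action.BIRTHDAY_MAILING_ONLY
    if not invited:
        # Allocated to a wave that has not yet commenced.
        return Action.SEND_INITIAL_INVITATION
    if located:
        # Already invited and found by the interviewer.
        return Action.NOT_USED
    # Already invited, but the interviewer could not locate them.
    return Action.SEND_NEW_INVITATION
```

Follow-up outcomes (a returned letter, an untraced case, a permanent refusal) would be recorded after this step and are outside the sketch.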

At the end of each interview, names, addresses and other contact details will be confirmed or updated on NatCen systems prior to being returned to CLS.

All personal information provided to or collected by the fieldwork agencies will then be destroyed on completion of their contracts. Kantar Public do not hold any data on their own systems. Access to NHS data by Kantar Public is strictly through a NatCen Virtual Machine Network used to look up participant data.

Creating synthetic data for health research — DARS-NIC-419453-G3G1G

Opt outs honoured: (Excuses: Does not include the flow of confidential data)

Legal basis: Health and Social Care Act 2012 - s261 - 'Other dissemination of information'

Purposes: No (Academic)

Sensitive: Non-Sensitive

When:DSA runs 2021-03 – 2024-03 SMLS reported a DPA serious incident; breached contract — audit report.

Access method: One-Off

Data-controller type: UNIVERSITY COLLEGE LONDON (UCL)

Sublicensing allowed: No

AGD/predecessor discussions: AGD minutes - 18th September 2025 final.pdf, AGD minutes - 4 May 2023 final.pdf, IGARD Minutes - 22nd April 2021 final.pdf, IGARD Minutes - 27th May 2021 final.pdf

Datasets:

Hospital Episode Statistics Admitted Patient Care
Hospital Episode Statistics Admitted Patient Care (HES APC)

Type of data: Anonymised - ICO Code Compliant

Objectives:

The study aims to evaluate methods for creating synthetic datasets for health research. The idea is to create artificial datasets that look like the original data source (preserving the structure and statistical properties of the data and relationships between variables) but that do not contain information on any real individuals, and therefore pose no confidentiality risks. If such datasets can be created, synthetic data could be used by researchers to understand the structure of the data, develop data cleaning protocols, codes and algorithms, and test out methods. Final analyses could then be conducted once approvals are in place (in secure settings), or alternatively, by the data providers themselves (so that researchers would not need any access to confidential information).

The concept of synthetic data is not new, but its use is increasing. For example, synthetic versions of general practice data (from the Clinical Practice Research Datalink) have recently been generated, including for COVID-19 research (https://www.cprd.com/content/synthetic-data). Health Data Research UK have also recently prioritised work in this area (for more information, see https://www.hdruk.ac.uk/synthetic-data/).

To achieve this aim, the research team at University College London (UCL) will compare a range of methods for generating synthetic versions of Hospital Episode Statistics (HES) data, in particular Admitted Patient Care (APC).

Legal basis
The legal basis for processing personal data for this purpose at UCL falls under Article 6(1)(e) of the General Data Protection Regulation (GDPR), i.e. a task carried out in the public interest. It also falls under Article 9(2)(j), i.e. processing is necessary for archiving purposes in the public interest, scientific or historical research purposes or statistical purposes.

There is clear public interest for this application, as it could lead to a significant streamlining of research using electronic health data. The potential for using electronic health data for timely health research has recently been highlighted by COVID-19. Strict governance restrictions to protect confidentiality mean that data, when released, are highly anonymised and/or only accessed in secure research settings. This is for good reason - there are some who argue that individual-level data can never be truly anonymous. However, methods to anonymise data, such as by removing exact event dates or categorising variables, can mean that the resulting data are not sufficiently granular for research purposes. Synthetic datasets could provide an alternative resource for health researchers.

Findings from the study will help data providers decide whether providing synthetic versions of electronic health data could help address the increasing pressure to deliver timely outputs in the context of increasing numbers (and complexity) of data access applications. Sensitive or potentially identifiable datasets such as Hospital Episode Statistics have great potential for economic and social impact, leading to better informed policy decisions and effective public services. Widening the use of these data through synthetic data would therefore lead to increased efficiency in health research, ultimately benefiting the public.

The researchers will access pseudonymised data only (as only pseudonymised data has been released under NIC-393510).

How the data requested will achieve the aim
The HES data requested will achieve the aim of this study by providing an original data source that can be used as a basis for evaluating a range of synthetic data generators. For example, the UCL team will select a core set of variables to be synthesized including patient characteristics (index of multiple deprivation score (IMD), ethnic group, sex), birth outcomes, number of admissions, and high-dimensional fields such as diagnosis fields.

Exploring whether it is possible to accurately generate lookalike variables in a synthetic dataset will help inform researchers and data providers on the value of synthetic data. Information from HES will also allow the researchers to determine for which types of data the methods are effective. For example, they will be able to determine whether the methods can be used to generate synthetic versions of early HES data (from 1997, likely to be less complete) and more recent data (from 2019, more complete).

The data will be used to evaluate three synthetic data generators, Synthpop, Simulacrum and Jomo. Synthpop and Jomo are implemented in open source software (R packages). Simulacrum was developed by Health Data Insight (a social enterprise overseen by the Office of the Regulator of Community Interest Companies). UCL will evaluate these generators, in terms of how well they can create synthetic datasets, by the following:
- Assess general utility by visualising marginal distributions of key variables and by estimating the standardised propensity mean square error (pMSE). The pMSE is a measure of poor discrimination between the original and the synthetic data (a positive feature in this context) and is derived from a logistic regression model for the propensity of a record to be from the original dataset. Coefficients of the propensity model will be inspected to identify ways in which the synthetic and original data diverge.
- Assess specific utility by estimating coefficients from selected models of interest using the synthetic and the real data, and then deriving standardised differences and percentage bias for the coefficients of interest. UCL will assess the extent to which inferences based on the synthetic data are robust, by estimating the overlap of confidence intervals for coefficients derived from the original and synthetic datasets using the interval overlap measure. Results will be averaged across multiple versions of the synthetic data.
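
As a rough illustration of the measures named above, the sketch below computes a raw (unstandardised) pMSE from propensity scores that are assumed to have been estimated already, together with the percentage bias of a coefficient and one common definition of confidence interval overlap. The function names are hypothetical, the propensity model itself and the standardisation step are omitted, and the formulas are textbook forms rather than necessarily those the research team will use.

```python
from typing import Sequence, Tuple


def pmse(propensities: Sequence[float], share_original: float) -> float:
    """Mean squared deviation of propensity scores from the share of
    original records in the combined data. Values near 0 mean the model
    cannot tell original and synthetic records apart."""
    n = len(propensities)
    return sum((p - share_original) ** 2 for p in propensities) / n


def percentage_bias(original_coef: float, synthetic_coef: float) -> float:
    """Relative difference of the synthetic estimate, in percent."""
    return 100.0 * (synthetic_coef - original_coef) / original_coef


def interval_overlap(original_ci: Tuple[float, float],
                     synthetic_ci: Tuple[float, float]) -> float:
    """Average share of each interval covered by their intersection.
    1.0 means identical intervals; disjoint intervals give a negative value."""
    lo_o, hi_o = original_ci
    lo_s, hi_s = synthetic_ci
    intersection = min(hi_o, hi_s) - max(lo_o, lo_s)
    return 0.5 * (intersection / (hi_o - lo_o) + intersection / (hi_s - lo_s))
```

In practice these values would be computed for each synthetic version of the data and then averaged, as the text describes.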

Relevant background to the request
This is a methodological research study funded by the Economic and Social Research Council (ESRC).

Relationship between proposed project and associated work
UCL are proposing to re-use an existing extract of HES APC data (held by members of the wider research team under a separate DSA; NIC-393510). UCL are requesting a re-purposing of this DSA to allow them to access specifically the years 1997/98 and 2019/20, which will enable them to establish whether methods can be used to generate synthetic versions of both early HES APC data and more recent, more complete data.

The purpose of the request
The research team are requesting that health data captured in HES are used to evaluate whether it is possible to generate synthetic versions of health data that can be used for health research. The purpose of this request is to answer a set of research questions about the feasibility and usability of synthetic data, aiming to generate evidence on the usefulness of synthetic data for data providers and health researchers.

For this purpose, the research team are requesting pseudonymised HES APC data for 1997/98 and 2019/20. National data are required in order to capture the variation in data quality across different providers and to evaluate whether synthetic data methods can handle large amounts of data accurately. There are no alternative ways of achieving the purpose of this application. The research team will use the minimum data required in order to answer the research questions.

Organisations involved
Data controller: UCL
Data processor: UCL

Although the study involves a co-applicant at the London School of Hygiene and Tropical Medicine (LSHTM), they will only contribute advice (particularly on the use of the Jomo package). They will not access any HES data. LSHTM is not considered as a Data Controller or Data Processor.

There are no funders or commissioners directly involved in the project.

No party involved in the application will receive any form of commercial benefit from the use of data.

Expected Benefits:

The research will benefit the provision of health care and the promotion of health, by informing policy on whether synthetic versions of electronic healthcare datasets can be shared with researchers, in order to streamline the research process. This will have direct relevance to all health research using healthcare datasets such as HES. The research is in the public interest, because the public have vocalised opinions about the need for timely access to high quality healthcare data, especially in light of COVID-19. The results of this study will provide evidence on whether synthetic data can be used to speed up the data access applications, data management, and data analysis stages of research. The study will directly benefit the Health and Social Care sector by providing data providers, researchers, governance bodies and policy makers with detailed and up-to-date evidence to aid decision making about the use of synthetic data to support a wide range of health research. Our guidelines on the appropriate use of synthetic data will help facilitate timely access to administrative datasets, improve the efficiency of research and streamline approval processes in the context of increasing demands on data providers. Our work will help minimise access to identifiable personal data, by allowing researchers to develop methods using anonymised, synthetic data, with final models being implemented by data providers or within secure settings.

The study team will engage with data providers, researchers and the public throughout the study. UCL will offer workshops on synthetic data and the results of our study with NHS Digital, DfE and ONS. This will allow the study team and data providers to establish views on the resource implications associated with synthetic datasets, and the likely efficiency benefits of being able to provide synthetic datasets to researchers.

Outputs:

The main output will consist of a set of guidelines and evidence on the use of synthetic data, including a comparison of approaches. These guidelines will be developed alongside data providers and other researchers as part of an engagement phase of the study. UCL will disseminate the guidelines using existing networks, e.g. with colleagues at the Office for National Statistics (ONS), the Department for Education (DfE), NHS Digital and Public Health England (PHE), as well as Administrative Data Research UK (ADR UK) and Health Data Research UK. Findings will be used by data providers to inform ongoing research into the use of synthetic data. Findings will also be published as peer-reviewed publications in high quality journals (e.g. PLoS One, submitting within 3 years of data access). The researchers will also work with members of the public to co-produce a range of outputs suitable for communicating results to members of the public interested in the use of electronic health data for research, e.g. through training events or fact sheets.

All journal articles will be published with open access, to ensure the wide dissemination of the study's results to data providers, healthcare professionals, governance bodies, and other researchers. Results of the study will also be made available in both clinical and methodological research forums: abstracts will be submitted to the following conferences within 2 years of data access: International Population Data Linkage Network, Public Health Science.

As the main output will be guidelines on the use of synthetic data, and will not inform any decisions about individuals, the researchers do not expect that an EQIA will be required. However, the guidelines will include an assessment of how well synthetic data can preserve information in the real data about people with protected characteristics (e.g., to ensure that ethnic groups are represented in the same way within the synthetic and real data).

Data will not be used for sales or marketing purposes.

Processing:

The research team at UCL will extract a core set of variables, including patient characteristics (IMD, ethnic group, sex), birth outcomes, number of admissions and high dimensional fields such as diagnosis fields, from an existing HES APC extract held under a separate Data Sharing Agreement (NIC-393510). The intention is to use this original data set and replace all of the values with synthetic ones, causing minimal distortion of the statistical information contained in the original data set. This would result in a new dataset, in which every value for every variable will be synthetic, i.e. the synthetic data will not contain any records that correspond to a real person. The new extract will be transferred to a new location in the UCL Data Safe Haven. All analysis will take place within the UCL Data Safe Haven.

The data flows are summarised as follows:
1. HES APC data with no identifiers will be extracted for 1997/98 and 2019/20 from NIC-393510-J1Q6T
2. The new extract will be transferred to a new location on the UCL Data Safe Haven.
3. All analysis will take place in the UCL Data Safe Haven.

There will be no attempts to identify individuals. Risk of re-identification will be mitigated by checking all outputs for small cell sizes. No potentially disclosive outputs will be shared or published. Data processing will only be carried out by substantive employees of UCL who have been appropriately trained in data protection and confidentiality.

Childhood outcomes after perinatal brain injury (Data flowing to ONS) — DARS-NIC-342322-Q1N7M

Opt outs honoured: Yes (Excuses: Mixture of confidential data flow(s) with support under section 251 NHS Act 2006 and non-confidential data flow(s))

Legal basis: Health and Social Care Act 2012 - s261 - 'Other dissemination of information', Health and Social Care Act 2012 - s261(5)(d)

Purposes: No (Academic)

Sensitive: Non-Sensitive, and Sensitive

When:DSA runs 2021-10 – 2024-10 2024.02 — 2025.11. SMLS reported a DPA serious incident; breached contract — audit report.

Access method: One-Off

Data-controller type: UNIVERSITY COLLEGE LONDON (UCL), IMPERIAL COLLEGE LONDON, UNIVERSITY COLLEGE LONDON (UCL)

Sublicensing allowed: No

AGD/predecessor discussions: IGARD Minutes - 14 July 2022 final.pdf, IGARD Minutes - 30 September 2021 final.pdf

Datasets:

Birth Notification Data
Civil Registration - Births
Civil Registration - Deaths
Hospital Episode Statistics Accident and Emergency
Hospital Episode Statistics Admitted Patient Care
Hospital Episode Statistics Outpatients
Mental Health Services Data Set
Civil Registration (Deaths) - Secondary Care Cut
Civil Registrations of Death
Hospital Episode Statistics Accident and Emergency (HES A and E)
Hospital Episode Statistics Admitted Patient Care (HES APC)
Hospital Episode Statistics Outpatients (HES OP)
Mental Health Services Data Set (MHSDS)
Civil Registrations of Death - Secondary Care Cut
Emergency Care Data Set (ECDS)

Type of data: Anonymised - ICO Code Compliant

Yielded Benefits:

This is a new Data Sharing Agreement. No data has been disseminated by NHS Digital for this research study. There are therefore no yielded benefits to date.

Expected Benefits:

This population study is hoped to provide the most complete picture of how children's lives are affected by perinatal brain injury, providing essential information to answer parents' questions accurately and in a meaningful family-centric manner. This information is intended to reshape clinical practice and facilitate optimum service planning within the NHS, to meet the needs of these children and their families through to adulthood, and ultimately improve their future health outcomes. An understanding of the sequelae of perinatal brain injury, specifically how and when children are affected, is expected to inform enhanced developmental surveillance across the NHS and enable the design of targeted multidisciplinary interventions to support children as needed. For example, premature infants (prone to inattention) can benefit from delayed school entry, Special Educational Needs (SEN) support, and educational packages raising awareness amongst educators of these specific challenges.

Anticipated impact on neonatal care, society and NHS services:

Impact on neonatal care
Equip healthcare professionals with reliable information to counsel families (target date 2024).
Communication aids will facilitate meaningful family-centred conversations on the neonatal unit (target date 2024)
Help prepare families for their child's future and understand what additional support may be needed (long-term)
Encourage healthcare professionals to consider the long-term impact of various neonatal care decisions (long-term)

Impact on the NHS and policymakers
Help those involved in shaping policy, resource planning and service provision to make informed decisions about how to most effectively support these children whilst maximising the efficiency of services (target date 2024-25)
UCL findings are intended to inform national guidelines on follow-up after brain injury (target date 2024-25)

Impact on schooling and policymakers
Equip parents with important information about the academic impact of brain injuries to help them plan their child's future and support them with their educational needs (long-term)
Provide key information and education to teachers about how they can support children with perinatal brain injuries (long-term)
Help the Department for Education in determining resource allocation and the provision of additional educational support (long-term)

Outputs:

The research plan to date has been shaped by detailed feedback from charity representatives and over 30 parents and ex-neonatal unit patients via the Great Ormond Street Parent Advisory Committee, the BLISS Insight and Involvement group, and the Meningitis Research Foundation. Parents, via these partner charity organisations, will continue to be involved in focus groups over the lifetime of the study in order to explore the study results and capture their thoughts on what they mean for parents and how best to communicate these results. The Patient and Public Involvement (PPI) work, undertaken in collaboration with the aforementioned charities, highlighted that evidence about the long-term impact of brain injuries (particularly the unseen impact on mental health and schooling) was a frequently overlooked parental priority. It matters to the people most affected.

Academic outputs are hoped to include high-impact peer-reviewed publications and international conference presentations. Findings are expected to be submitted for publication in high impact general medical journals, such as the New England Journal of Medicine, the British Medical Journal, and JAMA Pediatrics. The study results are intended to be presented at international conferences such as the Royal College of Paediatrics and Child Health annual conference, the King's Fund annual conference, and the Paediatric Academic Societies meeting in the USA.

Publications will be Open Access as per UCL policy, and freely available both on journal websites and via the UCL webpage. Outputs will contain only aggregate level data with small numbers suppressed in line with National Neonatal Research Database (NNRD), NHS Digital, Office for National Statistics (ONS) and Department for Education (DfE) policy and guidance. All data will be stored within the ONS secure research service (SRS) and all outputs from this server undergo independent checks by ONS staff to ensure outputs meet regulations and could not be deemed identifiable in any way.

Dissemination of the research findings to the public (parents who have children with a perinatal brain injury) is intended to be facilitated through existing collaborations with the Neonatal Data Analysis Unit (NDAU), BLISS (the charity for babies born sick or premature) and the Meningitis Research Foundation. UCL are also looking to create an infographic or information leaflet to improve communication of prognosis after perinatal brain injury between doctors and parents. Public dissemination is intended to include production of lay research reports publicised on the NDAU, BLISS, UCL and Meningitis Research Foundation websites. Research regarding neonatal outcomes has attracted a high level of media interest, and it is anticipated that this will be the case for the proposed study. UCL are acutely aware of the potential harmful effect of inaccurate or sensational reporting of research findings in this sensitive area, and the confusion and anxiety this can cause for affected families. UCL are planning to work closely with BLISS and Imperial College London to co-ordinate press releases and ensure that information is conveyed accurately and responsibly. BLISS and the Meningitis Research Foundation are also expected to publicise findings to their followers and the general public through their social media channels.

UCL will commence analysing the data as soon as it has been made available in the ONS SRS. It is anticipated that the process of data analysis, interpretation and report writing will take approximately 36 months, with papers submitted for publication in mid to late 2024.

Processing:

The study will involve the following data processing and linkage steps:

1. Infants meeting the Department of Health definition for perinatal brain injury will be identified within the National Neonatal Research Database (NNRD) (cohort 1, n = 40,166). This database contains care data for all neonates admitted to NHS neonatal units across England, Wales and Scotland. Its population coverage is internationally unique with 100% coverage since 2012 and high representative coverage since 2008. The 14,911 premature infants (< 34 weeks gestation) in cohort 1 will be matched to a comparator group of infants within the NNRD (cohort 2, n = 14,911).
2. The pseudonymised neonatal care data for cohort 1 and 2 will be transferred to the ONS Secure Research Service (SRS) by Imperial College London.
3. The NNRD will transfer the minimum identifiers for the NNRD cohorts (1 and 2) to NHS Digital (NHS number, date of birth, sex and postcode at birth). The NNRD will also provide the birth weight, gestation (from 2015), and multiplicity status (i.e. twins, triplets etc) for the remaining 25,255 children with gestation time > 34 weeks in cohort 1 to NHS Digital.
4. The 25,255 un-matched infants in cohort 1 with perinatal brain injury will be matched in a 1:3 ratio, by NHS Digital, to a comparator group of infants, identified from Birth Notifications and Civil Registrations (Births) data to create a term control cohort (cohort 3, n = 75,765).
5. All 3 cohorts will be linked to Civil Registrations (Deaths), Hospital Episode Statistics (HES) Admitted Patient Care (APC), HES Accident and Emergency (A&E), HES Outpatients and the Mental Health Services Data Set (MHSDS) up to December 31st 2020, by NHS Digital. The pseudonymised health outcomes and analysis covariates from the Births products for the three cohorts will be transferred from NHS Digital to the ONS SRS.
6. Under DARS-NIC-475526-F3Z5H, a file containing a list of personal identifiers (forename, surname, date of birth, sex, and postcodes) for linkage to the National Pupil Database (NPD) will be transferred from NHS Digital to the Department for Education. The NPD contains detailed information on the educational attainment, special educational needs and attendance of children at state schools across England between the ages of 5-18 years. A logic model, designed to maximise the chance of a reliable postcode match (given the variation over time), will be used. After linkage, all identifiers will be removed (only the unique study ID number will be retained) and these pseudonymised educational data will also be securely transferred for storage within the ONS SRS.
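
The cohort sizes quoted in the steps above are internally consistent, and the ratio matching can be illustrated with a toy example. The sketch below is an assumption-laden illustration, not the procedure used by the NNRD or NHS Digital: the function `match_controls`, the record layout and the choice of exact matching on shared key values are all hypothetical, and the source does not state the actual matching algorithm.

```python
import random
from collections import defaultdict

# Cohort sizes as quoted in the processing steps.
COHORT_1_TOTAL = 40_166      # all infants with perinatal brain injury
COHORT_1_PRETERM = 14_911    # < 34 weeks gestation
TERM_MATCH_RATIO = 3         # 1:3 matching for the term control cohort

cohort_1_term = COHORT_1_TOTAL - COHORT_1_PRETERM   # remaining infants
cohort_2 = COHORT_1_PRETERM                         # sizes imply 1:1 matching
cohort_3 = cohort_1_term * TERM_MATCH_RATIO         # term control cohort


def match_controls(cases, pool, keys, ratio, seed=0):
    """Draw `ratio` controls per case, without replacement, from pool
    records that share the case's values on every field in `keys`.
    Cases with too few eligible controls get an empty list."""
    rng = random.Random(seed)
    strata = defaultdict(list)
    for record in pool:
        strata[tuple(record[k] for k in keys)].append(record)
    for members in strata.values():
        rng.shuffle(members)

    matched = {}
    for case in cases:
        candidates = strata[tuple(case[k] for k in keys)]
        if len(candidates) < ratio:
            matched[case["id"]] = []
            continue
        matched[case["id"]] = [candidates.pop() for _ in range(ratio)]
    return matched
```

With these figures, `cohort_1_term` evaluates to 25,255 and `cohort_3` to 75,765, matching the numbers given in steps 3 and 4.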

UCL researchers will only have access to pseudonymised data held within the ONS SRS. In order to access any data in the ONS SRS, all researchers will need to be ONS accredited and undergo data protection and confidentiality training. No data will be held by or at UCL. There will be no requirement or attempt to re-identify participants. Indeed, this would not be possible for UCL.

Named UCL research staff responsible for conducting the analysis for the project will complete the ONS Researcher Accreditation process, which involves specific training in the safe use of research data environments. They will sign and adhere to the ONS Accredited Researcher Declaration, and will be required to adhere to ONS data protection policies and procedures. All data to be transferred out of the SRS (the results of the analyses) will be checked by ONS staff to ensure that no individual level data, or potentially identifiable data, is transferred. Only aggregate level data with small number suppression will be transferred out of the SRS system for publication.
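
Small number suppression of an aggregate table can be sketched as follows. This is a minimal illustration, not the ONS output-checking procedure: the function name, the default threshold of 10 and the decision to leave zero counts visible are assumptions made for the example, and real disclosure control typically also considers secondary suppression so that hidden cells cannot be recovered from totals.

```python
def suppress_small_counts(counts, threshold=10, marker=None):
    """Return a copy of `counts` in which every non-zero count below
    `threshold` is replaced by `marker`.

    counts: mapping of category label to integer count.
    threshold: hypothetical cut-off; the applicable policy sets the real one.
    """
    return {
        category: (marker if 0 < n < threshold else n)
        for category, n in counts.items()
    }


def has_disclosive_cells(counts, threshold=10):
    """True if any cell would need suppressing before release."""
    return any(0 < n < threshold for n in counts.values())
```

A table such as `{"admitted": 3, "not admitted": 25}` would be released with the first cell replaced by the marker.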

Data retention
The linkage keys used for the health and educational linkages will be securely held by NHS Digital and the Department for Education respectively. Only the pseudonymised dataset will be retained within ONS SRS to facilitate analysis by the UCL research team.
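
The separation between linkage keys and the pseudonymised research dataset can be sketched as below. This is an illustration under stated assumptions rather than the process used by NHS Digital or the Department for Education: the function name, the field names and the use of a random token as the study ID are hypothetical, and the source does not describe how study IDs are generated.

```python
import secrets

# Identifier fields named in the linkage step; the exact field names
# used here are assumptions for the example.
IDENTIFIER_FIELDS = ("forename", "surname", "date_of_birth", "sex", "postcode")


def pseudonymise(records, identifier_fields=IDENTIFIER_FIELDS):
    """Split records into a linkage-key table and a research dataset.

    The key table (study ID -> identifiers) stays with the data provider.
    The research rows keep only the study ID and non-identifying fields.
    """
    linkage_keys = {}
    research_rows = []
    for record in records:
        study_id = secrets.token_hex(8)  # random, carries no personal data
        linkage_keys[study_id] = {
            f: record[f] for f in identifier_fields if f in record
        }
        row = {k: v for k, v in record.items() if k not in identifier_fields}
        row["study_id"] = study_id
        research_rows.append(row)
    return linkage_keys, research_rows
```

Only `research_rows` would be transferred to the secure environment in this sketch; `linkage_keys` would remain with the organisation that performed the linkage.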

Research on Health and Ageing using English Longitudinal Study of Ageing (ELSA) — DARS-NIC-30493-Y0C0K

Opt outs honoured: (Excuses: Does not include the flow of confidential data)

Legal basis: Health and Social Care Act 2012 s261(1) and s261(2)(b)(ii), Health and Social Care Act 2012 s261(2)(b)(ii), Health and Social Care Act 2012 - s261 - 'Other dissemination of information'

Purposes: No (Academic)

Sensitive: Non-Sensitive, and Sensitive

When:DSA runs 2019-02 – 2022-02 SMLS reported a DPA serious incident; breached contract — audit report.

Access method: One-Off, Ongoing

Data-controller type: NATIONAL CENTRE FOR SOCIAL RESEARCH, UNIVERSITY COLLEGE LONDON (UCL)

Sublicensing allowed: No

Datasets:

Hospital Episode Statistics Accident and Emergency
Hospital Episode Statistics Admitted Patient Care
Hospital Episode Statistics Critical Care
Hospital Episode Statistics Outpatients
MRIS - Cause of Death Report
MRIS - Cohort Event Notification Report
MRIS - Flagging Current Status Report
MRIS - Members and Postings Report
Hospital Episode Statistics Accident and Emergency (HES A and E)
Hospital Episode Statistics Admitted Patient Care (HES APC)
Hospital Episode Statistics Critical Care (HES Critical Care)
Hospital Episode Statistics Outpatients (HES OP)

Type of data: Anonymised - ICO Code Compliant, Identifiable

Objectives:

The English Longitudinal Study of Ageing (ELSA) is a well-established, on-going, multi-disciplinary cohort study involving a collaboration between University College London (UCL), the Institute for Fiscal Studies (IFS), the University of Manchester (UoM), and NatCen Social Research (NatCen).

Since its inception in 2002 it has provided valuable insights into a range of social, health and economic issues. Traditionally, data have been collected biennially face-to-face via interview and clinical examination. While this approach has been very useful and will continue, linkage of study members in ELSA to routinely-collected data offers not only additional rich, complementary information about their health which cannot be gathered using these methods (e.g., valid data on diagnosis and prognosis of common chronic diseases such as cancer and depression) but, crucially, data which come at no burden to the study members. Participants are invited to re-consent every 2 years (known as Waves) with Wave 8 recently completed and Wave 9 due to start in Summer 2018.

The Department of Epidemiology and Public Health at University College London (UCL) require linked pseudonymised Hospital Episode Statistics (Admitted Patient Care, Outpatient, Critical Care and Accident & Emergency) and cancer registration data to meet their research obligations as a member of the ELSA research group. This agreement will permit NatCen (under NIC-311182-N0L1Y subject to an active DSA and supporting purpose) to share pseudonymised linked HES and cancer data so that UCL (under this agreement) can carry out those obligations.

The requested data will be used for a programme of research on health and ageing in England. This is a long-standing and on-going programme of work which aims to improve understanding of the ageing process, and how the use of health care affects this ageing process and the evolution of health over the lifecycle.

Linking NHS Digital data with ELSA will allow UCL to combine detailed information on health outcomes; the use of hospital services; the quality of health care and the identification of trends in health that will impact on future demands for health care with wider characteristics of the elderly population. The proposed linkage of ELSA to administrative health data will provide novel data for research on ageing in England. Existing studies on ageing, and in particular on the use of health and social care services by individuals as they age, have been restricted by extremely limited data on the use of these services. Studies on the evolution of health at older ages using administrative health records have also been limited by a lack of information on the socio-economic and wider health characteristics of individuals. Linking the data together therefore provides a rich dataset which enables research in this crucial policy area.

The work will be carried out by researchers at UCL and is funded by research grants from the National Institute on Aging (NIA) in the USA, and the Economic and Social Research Council (ESRC) in the UK. There is a team at UCL who carry out research on ELSA as well as manage the study, and this is funded by the MRC, Cancer Research UK and other funders as well as by NIA and the ESRC. This request is to allow that team to carry out this research.

These new data will be used to continue UCL's programme of research on ageing and health in England which aims to improve understanding of the ageing process, including predictors of various age-related disease diagnoses (e.g., heart disease, stroke, specific cancers, depression, dementia), and how health and the use of health care affect that ageing process. This will have value in planning services, predicting future needs, estimating the costs of care, and understanding the impact of various ageing states on individuals and their families. All analyses will use de-identified data.

Specifically, the data will improve understanding of:
1) Variation in hospital use across individuals with different individual and family characteristics, particularly at the end of life, providing a new understanding of the extent to which spending is efficiently allocated across different types of people.

2) The relationship between social care provision and hospital use, in particular the extent to which lack of social care availability may increase use of hospital care either through more entry into hospital or delayed exit.

3) Social inequalities in health among older people in England, and the relationship between these inequalities and other social characteristics, quality of care, and disability. This work is important for future planning of health care provision.

4) New funding from the National Institute on Aging has charged UCL explicitly to model the incidence of dementia in the ELSA population, and this will be greatly facilitated by the availability of hospital care statistics. At present, UCL's estimates of dementia incidence and prevalence are based on cognitive tests and ratings from relatives and carers. Having information about the use of hospital services by ELSA participants with dementia and cognitive impairment will strengthen the evidence base and provide a platform for more detailed analyses of the determinants and consequences of dementia. UCL will also be able to fulfil the mandate from the National Institute on Aging to provide data that can be used to compare dementia rates in the UK and USA.

UCL will use the pseudonymised linked data for research on ageing to understand what factors influence survival, and whether attrition from the repeated measures in the study has happened because participants dropped out or because they died.

Another objective of this research project is to compare the risk of mortality following the onset of different health conditions across demographic and socioeconomic groups within the older population in England, and between similar groups in England and the USA.

1) The cancer registration data is specifically required for projects funded by Cancer Research UK. These will address the following issues: The relationship between body weight, changes in body weight, and cancer incidence among older people.

2) The impact of cancer diagnosis on health behaviours and quality of life. Up to now, UCL have based these analyses on self-reported diagnoses (Br J Cancer, 2013; Psycho-Oncology, 2016). But registry data will allow these issues to be investigated with greater precision.

3) The association between bowel cancer screening (measured in ELSA since 2012) and cancer incidence.

4) The relationship between psychosocial factors (depression, social isolation, cognitive function) and cancer incidence.

Other projects that will take place as part of this programme of work are:
(1) To examine how the pattern of hospital care use changes in the final year(s) of life, and to examine whether it is proximity to death, as opposed to age, that determines healthcare utilisation (controlling for other characteristics captured in the ELSA data).

(2) To compare the risk of survival following the onset of different health conditions across demographic and socioeconomic groups within the older population in England, and between similar groups in England. UCL will use the information on cause of death to find out who has had an onset of a condition prior to their death, so that UCL can work out the probability of survival among those who experience (e.g.) a heart attack. UCL have missing survey information on those who die before they are able to report a new onset, and the cause of death information allows UCL to fill in the gap.

The requested data would be used solely for research purposes, in accordance with the research aims stated above. UCL would model the use of NHS-funded inpatient services (provided by HES) by individuals with similar underlying health needs, but who differ in other, non-need based characteristics (provided by ELSA). UCL would then model the relationship between receipt of social care and hospital admissions to examine whether cuts in social care are likely to increase the probability of hospital admission.

Yielded Benefits:

The data were only received in Summer 2018, and as yet no work has been published. As a result, this work has not yet yielded any of the expected benefits. It is expected that publications will be produced from Autumn 2019 onwards, with benefits to follow after this.

Expected Benefits:

The twin pressures of a rapidly ageing population and a prolonged period of public spending austerity will produce unprecedented pressures on NHS services over the coming years. The English population aged 65 and over is expected to grow by more than 20% over the next decade. Meanwhile, the NHS is experiencing a period of funding austerity, with little increase over the past few years. Understanding how to meet these additional demands with fewer resources is therefore a key challenge for health policymakers and practitioners. The importance of this challenge is reflected in the recent policy and practice debate (e.g. the Better Care Fund), and the size of the challenge has been well documented by the Dilnot Commission and initiatives such as the Quality Innovation Productivity Prevention (QIPP) programme. It also highlights the importance of prevention rather than cure, and the crucial role played by lifestyle and behaviour in more effective prevention.

Existing work on ELSA has been used to inform policy makers including Monitor, NHS England, the Department of Health, the Cabinet Office and representatives from PCTs. This type of academic work helps to understand the impacts of former policy and guides improvements to the existing health and social care system. The DH has noted that they have no doubt that the linkage of the Hospital Episode Statistics with survey data from the English Longitudinal Study of Ageing will be a valuable source of information in understanding the variation of health care use across individuals with similar medical needs but different characteristics.

The data linkage between HES and Cancer Registration data with ELSA would provide an important contribution to this debate. ELSA has high quality data on the social circumstances of a large representative sample of older people living in England, very precise economic data on wealth and income, a broad range of psychological factors, information about health and disability, lifestyle factors relevant to health including physical activity, diet and cancer screening participation, objective measures of cognitive function, physical capacity, health-related biomarkers, and genetic data. A particular strength of ELSA is that it is a longitudinal study involving data collection every two years since 2002. This allows trajectories of social, economic, psychological and biological processes to be tracked over prolonged periods.

The linkage would provide detailed information on the characteristics of individuals who use health and social care services and who experience major health problems. This will allow a detailed analysis of who uses these services, and to identify any spillovers in the use of health and social care (e.g. do cuts in social care spending have negative impacts on NHS services). In particular, the ability to follow the same individuals over an extended period of time will provide information on how needs for (and use of) health and social care have changed across cohorts. This will contribute directly to an important debate over the size of additional pressures on services as a result of an ageing population (e.g. does healthy ageing lead to increased health spending?). UCL will also be able to strengthen the evidence base for the importance of maintenance of healthy lifestyles into older ages. Some people believe that once they have reached 60 or 65 years old, then sustained physical activity, healthy dietary choices and other lifestyle factors no longer matter because they have already done their damage. UCL research with other datasets indicates this is not the case, but ELSA can provide even more convincing evidence because of its representative sample and long follow-up, and will help to drive the prevention agenda in order to reduce health and social care costs among the elderly.

Outputs:

The number of publications using ELSA data is substantial (more than 200 at time of writing), and ELSA researchers have given numerous presentations and talks at government seminars, academic conferences, and policy workshops. Other outputs will include presentations at academic conferences and presentations to policy makers. Presentations to policy makers will focus on disseminating results and helping to inform the government departments involved in planning and delivering NHS care to elderly individuals. All outputs will report only large-sample aggregate statistics and regression outputs, and small numbers will be suppressed in line with the HES analysis guide. No individual or episode level data will ever be published.

Two types of written output are expected:
(i) Articles submitted to peer-reviewed journals. Previous results have been published in high impact general medical and scientific journals (Lancet, British Medical Journal, Journal of the American Medical Association, Proceedings of the National Academy of Sciences USA), and in specialist journals in epidemiology, public health, and social science.
(ii) Non-technical research summaries, which will be press-released and will target policy makers such as the Department of Health and NHS England.

Processing:

As specified, ELSA consists of NatCen, IFS, UCL, and UoM working in collaboration, with NatCen being the lead organisation. NatCen are the holders of the ELSA cohort and thus only NatCen hold the identifiable data associated with this cohort. IFS and UCL only have access to a pseudonymised version of the ELSA data, with the pseudonymised study ID as the only form of identifier. For the purpose of this agreement UCL will obtain pseudonymised data from NatCen directly.

The data shared by NatCen contains both ELSA data (pseudonymised) and data provided by NHS Digital under NIC-311182-N0L1Y (also pseudonymised). The NHS Digital data shared will be restricted to the fields and pseudonymisation specified in this agreement and in NatCen's agreement under NIC-311182-N0L1Y.

The data received from NHS Digital will be converted by NatCen into a pseudonymised format before onward sharing to UCL by removing identifying data. Date of Birth, Date of Death, Date of Inquest, and Date of Registration will be converted to MM/YYYY format. Cancer registration number will be truncated to the first 6 digits. Only pseudonymised data will be shared with UCL, and UCL may only receive, process and retain the data with an active NHS Digital Data Sharing Agreement in place.
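
The date and registration-number conversions described above can be sketched as follows. This is a minimal illustration in Python, not NatCen's actual procedure: the field names, the function names and the choice to represent dates as `datetime.date` objects are all assumptions made for the example, and the removal of direct identifiers is not shown.

```python
from datetime import date
from typing import Any, Dict

# Hypothetical column names; the agreement names the dates but not the fields.
DATE_FIELDS = (
    "date_of_birth",
    "date_of_death",
    "date_of_inquest",
    "date_of_registration",
)


def to_month_year(value: date) -> str:
    """Reduce a full date to MM/YYYY by dropping the day."""
    return value.strftime("%m/%Y")


def truncate_registration_number(number: str, keep: int = 6) -> str:
    """Keep only the first `keep` characters of a registration number."""
    return number[:keep]


def pseudonymise(record: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy of `record` with dates coarsened and the
    cancer registration number truncated. Missing fields are left alone."""
    out = dict(record)
    for field in DATE_FIELDS:
        if out.get(field) is not None:
            out[field] = to_month_year(out[field])
    if out.get("cancer_registration_number") is not None:
        out["cancer_registration_number"] = truncate_registration_number(
            out["cancer_registration_number"]
        )
    return out
```

Coarsening a date to month and year keeps enough resolution for age and survival calculations while removing the exact day.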

All organisations party to this agreement must comply with the Data Sharing Framework Contract requirements, including those regarding the use (and purposes of that use) by Personnel (as defined within the Data Sharing Framework Contract, i.e. employees, agents and contractors of the Data Recipient who may have access to that data).

No data will be shared outside of the ELSA research group. Each organisation requesting access to the data will be required to hold an active agreement with NHS Digital.

All persons accessing the data are direct employees of UCL who are named ELSA collaborators.

The Data will only be used for the purposes described in this agreement.

UCL do not require identifiable data nor will they attempt to re-identify this data. The data will not be linked to any other dataset.

The UCL team require the earliest data (from 1997/98 or as far back as possible) on all admissions, outpatient appointments, critical care and A&E attendances at NHS hospitals in England. The youngest ELSA cohorts were born in the 1950s and that coupled with information UCL have about early life experiences allows UCL to predict both the risks and consequences of hospital usage over time and across much of the life course. The statistical power of the research will be greatly enhanced by capturing as many years of hospital usage as possible and will enable the UCL team to carry out their proposed research in a statistically robust manner.

To turn the supplied data into outputs, UCL will do the following:

(1) Produce hospital utilization measures
(2) Produce death outcome measures
(3) Produce cause of death measures
(4) Use (1), (2) and (3) to run regressions, correlations and produce descriptive statistics to further our understanding of the relationship between these and the ageing process.
(5) Create indicators of hospital admissions at the local authority level
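
Steps (1) and (5) above amount to counting hospital events per participant and per area. A minimal Python sketch follows, assuming each hospital episode is a record carrying a pseudonymised study ID and a local authority code. The field names `study_id` and `local_authority` and the toy records are invented for illustration; they are not the actual HES field names or real data.

```python
from collections import Counter
from typing import List, Mapping


def admissions_per_participant(episodes: List[Mapping[str, str]]) -> Counter:
    """Step (1): number of hospital episodes per pseudonymised study ID."""
    return Counter(e["study_id"] for e in episodes)


def admissions_per_local_authority(episodes: List[Mapping[str, str]]) -> Counter:
    """Step (5): number of hospital episodes per local authority."""
    return Counter(e["local_authority"] for e in episodes)


# Toy records, invented purely to show the shape of the input.
toy_episodes = [
    {"study_id": "A1", "local_authority": "LA-1"},
    {"study_id": "A1", "local_authority": "LA-1"},
    {"study_id": "B2", "local_authority": "LA-2"},
]

per_person = admissions_per_participant(toy_episodes)
per_area = admissions_per_local_authority(toy_episodes)
```

Measures like these would then feed the regressions and descriptive statistics in step (4); any area-level counts intended for publication would still be subject to small-number suppression.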

Dose-response relationship between alcohol and suicide — DARS-NIC-287229-D6F9F

Opt outs honoured: (Excuses: Does not include the flow of confidential data)

Legal basis: Health and Social Care Act 2012 s261(1) and s261(2)(b)(ii), Health and Social Care Act 2012 s261(2)(b)(ii), Health and Social Care Act 2012 s261(2)(a)

Purposes: No (Academic)

Sensitive: Non-Sensitive

When:DSA runs 2019-08 – 2022-08 SMLS reported a DPA serious incident; breached contract — audit report.

Access method: One-Off

Data-controller type: UNIVERSITY COLLEGE LONDON (UCL)

Sublicensing allowed: No

Datasets:

Adult Psychiatric Morbidity Survey
Adult Psychiatric Morbidity Survey (APMS)

Type of data: Anonymised - ICO Code Compliant

Objectives:

University College London (UCL) requires Adult Psychiatric Morbidity Survey (APMS) data to investigate if there is a dose-response relationship between weekly alcohol consumption and suicidal/self-harm behaviours in the general population.

A link between alcohol misuse and suicidal behaviours is well-established. Alcohol use disorder (AUD) and acute use of alcohol prior to attempt (AUA) are particularly significant risk factors for suicidal behaviours. There is little research exploring the relationship between alcohol consumption and self-harm and suicidality at a population level, and previous research has shown inconsistent results.

The data is being processed in line with EU GDPR Article 6 (1) (e), for the performance of a task carried out in the public interest, and EU GDPR Article 9 (2) (j), where processing is necessary for archiving purposes in the public interest, scientific or historical research purposes or statistical purposes in accordance with Article 89(1). Approximately 800,000 people die by suicide worldwide each year, with approximately 6,000 suicide deaths in the UK. It is in the public interest to conduct research to identify trends and risk factors for suicidal and self-harming behaviours to inform policy and practice and potentially inform interventions for these at-risk groups to lessen the societal burden of suicide and self-harm. Participants have willingly participated in the Adult Psychiatric Morbidity Survey (APMS) knowing their data will be used for research purposes and consenting to this. All APMS 2014 data is pseudonymised, protecting the identities of participants.

The APMS data will achieve the aim as statistical analyses (logistic regression) will be carried out to test the association between alcohol use and self-harm and suicidality. Alcohol use has been collected in APMS using the AUDIT score, including specific questions on units of alcohol consumed and drinking days. A weekly measure of alcohol consumption will be created based on the responses to these questions. Total AUDIT scores will also be used to analyse the relationship between alcohol consumption and self-harm & suicidality.
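
The derived weekly measure described above can be illustrated with a short Python sketch. The agreement does not state how the survey codes units or drinking days, so the sketch assumes both are already available as numbers (units per drinking day and drinking days per week); the function name and parameters are hypothetical.

```python
def weekly_units(units_per_drinking_day: float, drinking_days_per_week: float) -> float:
    """Estimate weekly alcohol consumption in units.

    Assumes both inputs have already been converted from the survey's
    response categories to numeric values; that conversion is not shown.
    """
    if units_per_drinking_day < 0:
        raise ValueError("units per drinking day cannot be negative")
    if not 0 <= drinking_days_per_week <= 7:
        raise ValueError("drinking days per week must be between 0 and 7")
    return units_per_drinking_day * drinking_days_per_week
```

For example, `weekly_units(3, 4)` gives 12 units per week. A continuous measure of this kind is what allows a dose-response relationship to be tested in a logistic regression, rather than only comparing drinkers with non-drinkers.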

This research will be undertaken as part of PhD research on alcohol and substance use as a risk factor for suicide and self-harm across the lifespan, funded by the NIHR School for Public Health Research. This will be the first in a series of studies unpacking the relationship between alcohol consumption and self-harm/suicidality at a population health level. Depending on the results observed in this study, it is expected that future longitudinal research will be conducted using UK GP records (THIN/CPRD).

The data subjects for this study are UK residents aged over 16 years living in private households who have completed the APMS survey. There is no control group.

The APMS data is a nationally representative sample of the UK general population. Because self-harm and suicidality are rare outcomes, UCL needs a sufficiently large sample to get meaningful results for the research question, so the full 2014 APMS sample is required. Individuals' responses within APMS are pseudonymised. No linkages are requested at this time.

University College London (UCL) is the sole Data Controller and also processes the data.

NIHR School for Public Health Research are funding the PhD project for which this research project is being conducted.

Outputs:

The research paper will be submitted to an academic peer-reviewed journal specialising in psychiatry/public health, such as the BMJ or Journal of Affective Disorders.

The PhD thesis will include the research results. The thesis will be uploaded and freely available via http://discovery.ucl.ac.uk/ upon completion of the PhD.

The aim is for the results to be presented at an academic conference with an interest in suicide/self-harm/mental health. Results will be presented at the School for Public Health Research Annual Scientific Meeting in March 2020 and will be submitted to the ECR/MCR Suicide and Self-Harm Research Forum and the European Symposium on Suicide and Suicidal Behaviour in September 2020.

If results of interest emerge from the study, a public health resource may be created to inform clinicians and policy makers. This will include recommendations for practitioners, policy makers and the public with respect to the improved knowledge of the links between alcohol consumption and self-harm/suicidality. The public resource summarising the outputs and key findings should assist clinicians with the recognition of at-risk individuals and populations so they can be offered the necessary support and appropriate targeted interventions can be designed for these groups. The core research team work clinically in primary and secondary healthcare settings, and have direct links with the NIHR School for Public Health Research and Public Health England. The research team will work with these organisations to plan the distribution and dissemination of this work to maximise its impact, while being mindful of the sensitive nature of this subject.

Results will also be broadcast through Twitter. Relevant organisations, such as Samaritans, Addaction and Rethink Mental Illness, will be contacted through both Twitter and email to make them aware of the results, and they will be encouraged to broadcast results through their own channels to increase public awareness of the research findings.

The paper for the peer-reviewed journal is expected to be submitted by January 2020. The expected date for presenting at a conference is by March 2020.

APMS low numbers and suppression
In order to protect patient confidentiality in publications resulting from analysis of APMS data, users must:
· guarantee that any outputs made available to anyone other than those with whom this agreement is made, will meet required standards, including the guarantee, methods and standards contained in the Code of Practice for Official Statistics (http://www.statisticsauthority.gov.uk/assessment/code-of-practice/index.html) and the ONS Statistical Disclosure Control (https://gss.civilservice.gov.uk/statistics/methodology-2/statistical-disclosure-control/) for tables produced from surveys;
· apply methods and standards specified in the Microdata Handling and Security Guide to Good Practice (http://www.data-archive.ac.uk/media/132701/UKDA171-SS-MicrodataHandling.pdf) for disclosure control for statistical outputs.

Processing:

This request is for the APMS 2014 dataset to flow out of NHS Digital. Some of this relates to health-related data given that some screening questionnaires were used in the conduct of the survey. Once this agreement is active the actual flow of data will be from the UK Data Service (UKDS), who will grant system access to pseudonymised APMS data to University College London. There are no subsequent flows of data.

Data will be stored and upheld in line with UCL's Data Protection Policy. It will be marked as highly confidential and only be accessed by the principal research team through the UCL Safe Haven source. The principal research team consists of a PhD student and senior researchers (including a professor and a senior clinical lecturer). All are substantive employees of UCL.

Logistic regression will be carried out on the data for the main statistical analyses. All statistical testing will be conducted using Stata version 16. All files, including coding instructions, will be deleted upon conclusion of this agreement.

All APMS participants' data will be included in the research. No linkage to other sources will be conducted. APMS data are pseudonymised.

There will be no requirement or attempt to re-identify individuals.

The 2014 APMS dataset (English adult population aged 16 and over) is held on behalf of NHS Digital by the UK Data Service (UKDS) (www.ukdataservice.ac.uk) and UKDS are responsible for dissemination under direction by NHS Digital. UCL will get the whole dataset; there is no facility to select individual variables.

UCL will be able to download the dataset from UKDS for the period specified within the data sharing agreement (DSA) and they must securely destroy all local copies of the dataset when the DSA expires and notify the Data Access Request Service (DARS) in line with standard procedures. This 2014 version of the dataset available via DARS has been redacted on Disclosure Control Procedure advice to minimise the likelihood of individuals being able to identify anyone taking part in the survey.

NHS Digital reminds all organisations party to this agreement of the need to comply with the Data Sharing Framework Contract requirements, including those regarding the use (and purposes of that use) by Personnel (as defined within the Data Sharing Framework Contract i.e.: employees, agents and contractors of the Data Recipient who may have access to that data).

Examining loneliness in people with borderline intellectual functioning compared to the general population and its relationship to mental and physical health outcomes — DARS-NIC-177523-N8J2S

Opt outs honoured: (Excuses: Does not include the flow of confidential data)

Legal basis: Health and Social Care Act 2012 s261(1) and s261(2)(b)(ii), Health and Social Care Act 2012 - s261 - 'Other dissemination of information', Health and Social Care Act 2012 s261(2)(b)(ii)

Purposes: No (Academic)

Sensitive: Non-Sensitive

When:DSA runs 2018-06 – 2021-05 SMLS reported a DPA serious incident; breached contract — audit report.

Access method: One-Off

Data-controller type: UNIVERSITY COLLEGE LONDON (UCL)

Sublicensing allowed: No

AGD/predecessor discussions: IGARD Minutes - 20 January 2022 final.pdf

Datasets:

Adult Psychiatric Morbidity Survey
Adult Psychiatric Morbidity Survey (APMS)

Type of data: Anonymised - ICO Code Compliant

Objectives:

The objective is to use the APMS 2014 dataset for the purposes of research (MSc research project). A secondary analysis of the data will be carried out by researchers at UCL in order to investigate how loneliness may affect people with borderline intellectual functioning.

Background
A prevalence of loneliness of 10.5% has been reported in the general population.
Loneliness has been associated with being female but there appears to be a complex relationship with age, with some studies reporting a U-shaped relationship with higher levels of loneliness in younger and older people, or higher levels in older age. Other socio-demographic factors associated with loneliness include being single, living alone, low education and income, immigration status and low social support. Loneliness has been associated with lifestyle factors such as smoking, being less physically active and lower consumption of fruit and vegetables. Loneliness is associated with increased mortality and higher rates of chronic diseases such as raised blood pressure and cholesterol and chronic heart disease. Loneliness has also been linked to depression and higher levels of psychological distress, suicide and psychosis.

The prevalence of loneliness in people with intellectual disability (ID) has been reported to be 44.7%, which is thought to be higher than the general population. Loneliness in people with ID has been associated with increasing age and living in a large residential setting, with lower levels of loneliness being associated with having choice of living companions or living with family. Loneliness was also associated with being afraid at home and in the neighbourhood (but liking where you live was associated with less loneliness). In addition, social contact with friends and family was associated with less loneliness. Studies on the association between loneliness and mental health problems are limited. However, one study did find an association with depression. Not feeling lonely has been associated with better physical health.

However, little is known about the prevalence, risk factors and outcomes associated with loneliness in people with borderline intellectual functioning. Borderline intellectual functioning is generally defined as having an IQ score between 70 and 85. This group has increased vulnerability to social disadvantage and mental health problems.

The APMS dataset has not previously been used to explore loneliness in this group.

The aim of the study is to examine loneliness in people with borderline intellectual functioning and compare their physical and mental outcomes to the general population. The specific objectives are to:
1. Compare the prevalence of loneliness/social support in people with borderline intellectual functioning and the general population
2. Explore the association between loneliness/social support and socio-demographic variables (age, sex, ethnicity, qualifications, income, employment, accommodation and neighbourhood characteristics) separately in people with borderline intellectual functioning and the general population to explore similarities and differences in the associations.
3. Explore the relationship between loneliness/social support and wellbeing, common mental disorders (depression and anxiety disorders) and chronic physical health conditions separately in people with borderline intellectual functioning and the general population in order to identify similarities and differences in the associations
4. Examine whether loneliness/social support moderates the relationship between intellectual functioning and mental disorders (anxiety, depression etc.), chronic physical disorders and wellbeing.

The data will be analysed only within University College London. It will not be used to support a larger programme of work.

Expected Benefits:

The research study will further the understanding of the mental and physical health impacts of loneliness in people with borderline intellectual functioning.

The results will be disseminated to commissioners and mental health charities (e.g. MIND), including befriending organisations that support people who are lonely or those who have limited or no social support (target date 01/06/2019). Tackling and reducing loneliness may lead to improvements in physical and mental health outcomes, and therefore the research findings could help to promote the role of befriending and volunteering organisations and provide evidence for the need to develop interventions to reduce loneliness in this group (and other disadvantaged groups).

Outputs:

The results of the study will be of interest to mental health practitioners and mental health services that encounter people with borderline intellectual impairment. The study will raise awareness of the issues experienced by people with borderline intellectual functioning and how their needs should be better met. The findings of the study will be published in a peer reviewed scientific journal such as the Journal of Intellectual Disability Research and presented at conferences (e.g. the International Association for the Scientific Study of Intellectual and Developmental Disabilities). Personal identifiable data will not be published. Outputs will only contain aggregate level data.

The target date for publication within a scientific journal is 02/01/2019.

In order to protect patient confidentiality in publications resulting from analysis of APMS data users must:
· guarantee that any outputs made available to anyone other than those with whom this agreement is made, will meet required standards, including the guarantee, methods and standards contained in the Code of Practice for Official Statistics and the ONS Statistical Disclosure Control for tables produced from surveys;
· apply methods and standards specified in the Microdata Handling and Security Guide to Good Practice for disclosure control for statistical outputs.

Processing:

NHS Digital reminds all organisations party to this agreement of the need to comply with the Data Sharing Framework Contract requirements, including those regarding the use (and purposes of that use) by Personnel (as defined within the Data Sharing Framework Contract, i.e. employees, agents and contractors of the Data Recipient who may have access to that data).

The 2014 APMS dataset is held on behalf of NHS Digital by the UK Data Service (UKDS) (www.ukdataservice.ac.uk) and UKDS are responsible for dissemination under direction by NHS Digital. UCL will get the whole dataset; there is no facility to select individual variables. They will be able to download the dataset from UKDS for the period specified within the DSA and they must securely destroy all local copies of the dataset when the DSA expires and notify DARS in line with standard procedures. This 2014 version of the dataset available via DARS has been redacted on Disclosure Control Procedure advice to minimise the likelihood of individuals being able to identify anyone taking part in the survey.

Once an active data sharing agreement is in place, UKDS will transfer the pseudonymised APMS data to UCL. It will be transferred and accessed within the Data Safe Haven. This is UCL's data service for storing, handling and analysing identifiable data. It has been certified to the ISO27001 information security standard and conforms to NHS Digital's Information Governance Toolkit.

The data obtained will be fully anonymous. It will be stored directly and processed only using UCL Data Safe Haven, which uses encryption and is therefore very secure. The data will only be accessed by substantive employees of UCL. Registered UCL MSc students will only have access to aggregated data with small numbers suppressed. No data will be linked to record level patient data.

Justification for processing the data:
The data will be processed according to Article 6(1)(e): legitimate interest under public task.
UCL is a public authority and therefore the legitimate interest for processing data is under public task. Processing data for the purposes of research is considered to be one of UCL's public tasks. The processing of the APMS dataset is considered necessary as there are no other means of examining the objectives in a less restrictive way. Individuals will not be harmed through the processing of the data.
In addition, data will be processed according to Article 9(2)(j): processing is necessary for archiving purposes in the public interest, scientific or historical research purposes or statistical purposes.
UCL ensures that the processing meets the public interest test and that appropriate safeguards are in place, such as using technical and organisational measures to ensure minimisation (e.g. pseudonymisation) and not processing in a way that will cause damage or distress to individuals.

Method
A secondary analysis of data will be conducted using The Adult Psychiatric Morbidity Survey, 2014. This is the fourth and most recent survey of adult mental health in the general population. It comprised two phases, an initial interview with the whole sample and a second phase interview that was conducted with a sub-sample of phase one participants by clinically trained interviewers coordinated by the University of Leicester.
The survey employed a multi-stage stratified probability sampling design. The sampling frame was based on the small user Postcode Address File (PAF), which permitted private households to be identified. The primary sampling units (PSU) were individual or groups of postcode sectors. The PSUs were stratified by a number of different strata and a random sample was obtained from this list. Addresses that did not contain private households were excluded. One person over the age of 16 was randomly selected to take part in the survey per household.
Of 13,313 individuals contacted, 7,546 completed the survey (57% response rate).
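
The quoted response rate follows directly from the two counts. As a quick check in Python (the variable names are illustrative):

```python
contacted = 13313   # individuals contacted
completed = 7546    # participants who completed the survey

response_rate = completed / contacted
rate_one_decimal = f"{response_rate:.1%}"     # "56.7%"
rate_rounded = round(response_rate * 100)     # 57
```

The unrounded figure is about 56.7%, which rounds to the 57% reported.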

Measures
1. Measuring intellectual functioning
Intellectual functioning will be assessed using the National Adult Reading Test (NART), which is a standardised test designed to estimate the premorbid intelligence level of adults. The NART consists of 50 English words presented in ascending order of difficulty (Nelson & Willison, 1991). The NART error score is calculated from the total number of reading errors made by the candidate and this is used to estimate a verbal IQ score. An IQ between 70 and 85 will be used to identify the sample of participants with borderline intellectual functioning (BIF), and those with an IQ greater than 86 will be identified as being in the general population (variable name: iqbest2g). The sample will be further defined by excluding participants who have educational qualifications of A-levels or higher.
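
The grouping rule above can be written out as a small Python function. This is a sketch under stated assumptions: it works on a numeric estimated IQ rather than the survey's own `iqbest2g` variable, it follows the thresholds literally as written (so a score of exactly 86, or below 70, falls in neither group), and it assumes the A-level exclusion applies to the BIF group only, which the text does not spell out. The function and label names are hypothetical.

```python
from typing import Optional


def classify_group(estimated_iq: float, a_level_or_higher: bool) -> Optional[str]:
    """Assign a participant to 'BIF' or 'general population', or None if excluded.

    Thresholds follow the text literally: 70 to 85 inclusive for BIF,
    strictly greater than 86 for the general population.
    """
    if 70 <= estimated_iq <= 85:
        # Assumption: the qualification-based exclusion applies to the BIF group.
        return None if a_level_or_higher else "BIF"
    if estimated_iq > 86:
        return "general population"
    return None
```

Writing the rule as code makes the boundary cases visible: whether a score of 86 belongs to the general population is something the analysis plan would need to settle.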

2. Measuring loneliness and social support
Loneliness will be measured using one item: "I feel lonely and isolated from other people". A four-point Likert scale was used (very much, sometimes, not often and not at all).
Social support will be measured using the total score from 7 items measuring social support, which includes items such as "there are people I know amongst my family and friends who make me happy", "there are people I know amongst my family and friends who can be relied on no matter what" and "there are people amongst my family and friends who give me support and encouragement". These questions were rated on a three-point scale (not true, partly true, certainly true).
In addition, one further item will be used: a variable derived from the social support scale that examines the number of family and friends the respondent feels close to (variable name: Primgrp).

3. Wellbeing
Wellbeing will be measured using the total score on the Warwick-Edinburgh Mental Wellbeing Scale (WEMWBS). This is a 14-item scale with five response categories, providing a total score ranging from 14 to 70. The items cover both feeling and functioning aspects of mental wellbeing. A higher score indicates a higher level of mental wellbeing.
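
The score range follows from the scale's structure: 14 items, each with five response categories, give totals from 14 to 70 if the categories are coded 1 to 5. A minimal Python sketch of the total score, assuming that 1 to 5 coding (the function name is hypothetical):

```python
from typing import Sequence


def wemwbs_total(responses: Sequence[int]) -> int:
    """Sum 14 item responses, each assumed to be coded 1 to 5."""
    if len(responses) != 14:
        raise ValueError("WEMWBS has 14 items")
    if any(r not in (1, 2, 3, 4, 5) for r in responses):
        raise ValueError("each response must be an integer from 1 to 5")
    return sum(responses)
```

The lowest possible total is 14 (all items answered 1) and the highest is 70 (all items answered 5); how incomplete responses are handled is not covered here.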

4. Common mental health disorders
The Clinical Interview Schedule Revised (CIS-R) was used to identify the presence of common mental disorders.
The following will be examined: participants who were diagnosed with depression in the past 12 months, participants who were diagnosed with phobia in the past 12 months, participants who were diagnosed with panic attacks in the last 12 months and participants who were diagnosed with Post Traumatic Stress Disorder in the past 12 months. The overall CIS-R score (variable name CISR Two) and participants who were diagnosed and treated with any common mental health problem in the last 12 months will also be examined.
Suicidal thoughts in the last 12 months will be analysed.

5. Physical health disorders
One item about general health will be used: "How is your health in general?". This item is rated on a five-point scale (excellent, very good, good, fair or poor).
Participants were presented with a list of 22 physical conditions and were asked whether they had ever had any of these conditions; whether they had the condition in the past year; whether the condition had been diagnosed by a health professional and if they received any medication or other treatment for it. The presence of any chronic disease (e.g. asthma, diabetes, epilepsy, high blood pressure, cancer) in the last 12 months will be examined as well as individual disorders.

6. Socio-demographic variables
The following socio-demographic variables will be analysed: age, sex, marital status, ethnicity, income, any educational qualifications, employment (paid work in the last 7 days (wrking), ever had a job) and accommodation.
Whether people feel safe in their neighbourhood will also be examined using one item: "I feel safe around here in the day time". This item was measured on a five-point scale (strongly agree to strongly disagree).

Analysis
Stata will be used to analyse the data and sampling weights will be applied to all the analyses. Descriptive statistics will be used to describe the sample (proportion of people with borderline intellectual functioning, the number of males and females, mean age and ethnicity in both groups: general population and borderline intellectual functioning). The proportion of people reporting loneliness will be compared in people with borderline intellectual functioning and the general population. Data will be presented as weighted percentages and chi-square tests/t-tests will be reported, where appropriate.
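
A weighted percentage of the kind mentioned above is the sum of the sampling weights of respondents with the characteristic, divided by the sum of all weights. The Python sketch below shows only that point estimate; it is not the Stata analysis itself, it does not produce design-based standard errors or test statistics, and the function name is hypothetical.

```python
from typing import Sequence


def weighted_percentage(flags: Sequence[bool], weights: Sequence[float]) -> float:
    """Percentage of the weighted sample for which the flag is True."""
    if len(flags) != len(weights):
        raise ValueError("flags and weights must have the same length")
    total = sum(weights)
    if total <= 0:
        raise ValueError("sum of weights must be positive")
    flagged = sum(w for f, w in zip(flags, weights) if f)
    return 100 * flagged / total
```

With equal weights this reduces to the ordinary percentage; unequal weights let respondents from under-sampled groups count for more, which is why weighted and unweighted prevalence can differ.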

Subgroup analysis will be carried out in both groups to identify the relationship between loneliness (dependent variable) and socio-demographic variables, mental health and chronic physical disorders.
The moderating effects of loneliness on the relationship between intellectual functioning and chronic mental health and physical disorders will be analysed.

MR104C - British Women's Heart and Health Study (s251 cohort) — DARS-NIC-174486-Q8J1B

Opt outs honoured: (Excuses: Section 251 NHS Act 2006)

Legal basis: Health and Social Care Act 2012 s261(7); National Health Service Act 2006 - s251 - 'Control of patient information'., Health and Social Care Act 2012 s261(7)

Purposes: No (Academic)

Sensitive: Sensitive

When:DSA runs 2018-06 – 2020-03 SMLS reported a DPA serious incident; breached contract — audit report.

Access method: Ongoing, One-Off

Data-controller type: UNIVERSITY COLLEGE LONDON (UCL)

Sublicensing allowed: No

AGD/predecessor discussions: IGARD Minutes - 20th May 2021 final.pdf, igard_minutes_1_march_2018.pdf

Datasets:

MRIS - Cause of Death Report
MRIS - Cohort Event Notification Report
MRIS - Flagging Current Status Report

Type of data: Identifiable

Objectives:

University College London (UCL) currently holds mortality data, cancer registration data and demographic data for use in the British Women's Heart and Health Study (BWHHS).

The BWHHS has previously been managed by both the University of Bristol and the London School of Hygiene and Tropical Medicine but since 2015 has been managed by UCL at The Institute of Health Informatics. UCL is now the sole Data Controller and, for the primary purpose of the BWHHS, UCL is the sole processor of the data.

The University of Bristol and the London School of Hygiene and Tropical Medicine have no ongoing role in the study other than individuals from the respective organisations working as study collaborators with UCL in accordance with the arrangements and processes described below.

The BWHHS aims to determine the contribution of both established and new risk factors to the considerable variation in ischaemic heart disease and stroke in Great Britain. It is also concerned with the effects of risk factor changes and their impact on Cardiovascular Disease (CVD) events and on other common causes of morbidity and mortality in British women. The present aim is to continue to collect CVD and other common diseases of incident morbidity and mortality to provide information for the prevention and promotion of a disability-free life in older women.

The BWHHS is a prospective cohort study of cardiovascular disease in women aged over 60 years, in England, Scotland and Wales.

The study was set up in 1999 to complement the British Regional Heart Study (BRHS), to describe and establish risk factors and the differences in their impact in women compared to the men followed up by the BRHS.

The study selected women at random from 24 GP practices, in 23 towns from 1999 to 2000. Of the 7,296 invited, 4,286 (60%) were recruited and attended the baseline examinations and completed questionnaires. Follow up consisted of postal questionnaires and regular reviews of GP medical records. This agreement covers the release of data in respect of 1049 participants of the study where the legal basis to address the Common Law Duty of Confidentiality (CLDC) is under Section 251.

The BWHHS has used the patient tracking service provided by NHS Digital and predecessor organisations to receive notifications of its cohort members' deaths (date and cause), cancer registrations, exits from the NHS and changes in recorded demographics (such as name, NHS number, etc.). The latest demographics data has been used for administrative purposes to support ongoing contact with participants (e.g. for sending questionnaires) and for requesting primary care data from GPs.

The BWHHS was set up to explore the current patterns of CVD & Coronary Heart Disease (CHD) risk factors (and recent changes in this pattern), and prevention and treatment for CHD in older British women.

The BWHHS involves multiple ongoing analyses and investigations testing different hypotheses within this scope. The research activities are determined by the BWHHS study director on a monthly basis. The BWHHS study director is responsible for determining what analyses will be undertaken and what data will be used for each analysis in support of the objectives. All analyses are undertaken by members of the BWHHS team, all of whom are UCL employees working under supervision of the study director.

The clinical outcomes that are the current focuses of the BWHHS are: myocardial infarction, stroke, angina, heart-failure, diabetes, deep vein thrombosis and pulmonary embolism (DVT/PE), dementia, atrial fibrillation and cancers.

The main focus is the measurement of biological variables (biomarkers) using the biological specimens collected at baseline (in the form of DNA, plasma, and serum samples) and their effect on clinical diseases of relevance to post-menopausal women such as cardio-metabolic disease, dementia and common cancers. The continuing provision of mortality and cancer registration data increases the number of events accrued over time, which will increase the statistical power of BWHHS's analyses.

Under this Data Sharing Agreement, the BWHHS team is not permitted to use the data to expand the focus of the BWHHS to non-cardiovascular conditions of relevance to post-menopausal women that the BWHHS currently does not collect.

Additionally, UCL has made data from the BWHHS available to third parties subject to an approval process outlined below. NHS Digital has assessed what data is shared and determined that it is not sufficiently derived and therefore may not be onwardly shared without NHS Digital's express permission.

Under this Data Sharing Agreement, UCL is not permitted to share any data supplied by NHS Digital or data derived from that data with any third parties or with UCL employees for any purposes other than the primary objectives of the BWHHS.

UCL has informed NHS Digital that it has previously shared pseudonymised data with the following organisations:
1. University College London for the purpose of a UCL-LSHTM-Edinburgh-Bristol (UCLEB) Consortium
2. UCL investigators at the Department of Medicine
3. University of Bristol
4. University of Cambridge
5. London School of Hygiene & Tropical Medicine
6. Birkbeck, University of London
7. Universidad de Salamanca
8. University of Alcala
9. Imperial College London

No new data may be shared with these third parties. Under this Data Sharing Agreement, the above organisations are permitted to retain the data while UCL prepares and submits an application to NHS Digital for approval to onwardly share data under appropriate controls. The third parties are not permitted to onwardly share the data and may only use it for the purposes of the projects that were previously authorised by the BWHHS Study Director following the approvals process outlined in the Processing activities section below. No changes to the existing purposes may be approved.

Data will only be used for academic research purposes and not commercially.

Findings from data have been used to inform policy and clinical guidelines and have also led to the development of prediction models available as web tools. Work conducted operates in the pre-competitive arena so does not contain patentable or commercially exploitable results.

UCL have identified the appropriate legal basis for processing under General Data Protection Regulation (GDPR). Based on the purpose for processing, the legal basis is 'processing is necessary for the performance of a task carried out in the public interest or in the exercise of official authority vested in the controller.' Article 6(1)(e). As the research involves health data, which is included in the definition of special categories of personal data, an additional condition for processing is required. For health research this will be Article 9(2)(j) 'processing is necessary for archiving purposes in the public interest, scientific or historical research purposes or statistical purposes in accordance with Article 89(1) based on Union or Member State law...'.

Yielded Benefits:

The BWHHS has 218 publications in peer-reviewed journals to date addressing the aims outlined previously in this application. Many of the journals publishing BWHHS manuscripts are considered high ranking. Journal ranking is measured by the impact factor, which reflects the frequency with which the average article in a journal has been cited within the year. 31% of the journals publishing BWHHS manuscripts have a high impact factor (>8). This means that the output is published in journals that are widely read by doctors, scientists and public health practitioners, ensuring a greater impact of the work. The BWHHS has also had measurable benefit directly, as fourteen of the 218 publications have contributed to the following fifteen clinical care and public health guidelines:
1. National Clinical Guideline Centre NICE clinical guideline CG181 Lipid modification (2014)
2. Diabetes, Pre-Diabetes and Cardiovascular Diseases developed with the EASD ESC Clinical Practice Guidelines (2013)
3. Dyslipidaemias 2016 (Management of) ESC Clinical Practice Guidelines (2011)
4. Dyslipidaemias 2016 (Management of) ESC Clinical Practice Guidelines (2016)
5. Arterial Hypertension (Management of) ESC Clinical Practice Guidelines (2013)
6. CVD Prevention in Clinical Practice (European Guidelines on) (2016)
7. Factors Influencing the Decline in Stroke Mortality: A Statement from the American Heart Association/American Stroke Association (2013)
8. Genetics and Genomics for the Prevention and Treatment of Cardiovascular Disease: Update A Scientific Statement From the American Heart Association (2013)
9. Guidelines for the Prevention of Stroke in Patients With Stroke and Transient Ischemic Attack: A Guideline for Healthcare Professionals From the American Heart Association/American Stroke Association (2014)
10. Update on Prevention of Cardiovascular Disease in Adults With Type 2 Diabetes Mellitus in Light of Recent Evidence: A Scientific Statement From the American Heart Association and the American Diabetes Association
11. Social Determinants of Risk and Outcomes for Cardiovascular Disease: A Scientific Statement From the American Heart Association
12. Future Translational Applications From the Contemporary Genomics Era: A Scientific Statement From the American Heart Association
13. Basic Concepts and Potential Applications of Genetics and Genomics for Cardiovascular and Stroke Clinicians: A Scientific Statement From the American Heart Association
14. Salt Sensitivity of Blood Pressure: A Scientific Statement From the American Heart Association
15. Preventing and Experiencing Ischemic Heart Disease as a Woman: State of the Science. A Scientific Statement from the American Heart Association

Expected Benefits:

The benefits are labelled in a way that matches the specific outputs to be produced (section 5c, above).

1) The BWHHS has on-going work on neighbourhood deprivation focusing directly on government policy to narrow the gap between the most deprived areas and the rest of the country, with the aim to provide a tool to explore the specific attributes of the built environment that affect health in elderly people (work to be completed by end of 2018).

2) BWHHS's work on metabolomics and proteomics will help to discover which substances in the blood should be measured (called biomarkers of therapeutic efficacy) to better inform the development of new medications. The expectation is that this work will help to discover blood markers that will help to produce new medications to raise the levels of good cholesterol (HDL-cholesterol) and hence improve cardiovascular health (work to be completed by end of 2018).

3) As new cardiovascular medications that are still under patent (e.g. PCSK9 inhibitors) are adopted by healthcare systems worldwide, there will be a resurgence of interest in risk-prediction models. BWHHS's work on metabolomics, proteomics and cardiovascular disease is expected to identify new ways of knowing who will suffer from a cardiovascular disease in the future; this is called risk prediction (work to be completed by end of 2020).

4) BWHHS's work on Mendelian randomization confirms the potential of a genomic-led strategy not only to identify and validate new drug targets, but also to discover more adequate biomarkers of therapeutic efficacy and, overall, to help optimise the drug discovery process (work to be completed by end of 2020).

Outputs:

The BWHHS has led to 218 publications in peer reviewed journals to date addressing the aims outlined above.

The BWHHS typically produces manuscripts. Those working on the study do not typically engage with the end users of the study's findings. UCL has an infrastructure for communications and determines the dissemination strategy for outputs on a case by case basis depending on factors such as the quality and impact of the manuscript. UCL's press office may engage with the media and the public. There are examples on the webpage.

The following manuscripts were delivered within the last year:

"Causal Associations of Adiposity and Body Fat Distribution With Coronary Heart Disease, Stroke Subtypes, and Type 2 Diabetes Mellitus: A Mendelian Randomization Analysis" was published in Circulation in 2017.
"Investigating the importance of the local food environment for fruit and vegetable intake in older men and women in 20 UK towns: a cross-sectional analysis of two national cohorts using novel methods" was published in Int J Behav Nutr Phys Act in 2017.
"Functional Analysis of the Coronary Heart Disease Risk Locus on Chromosome 21q22" was published in Disease Markers in 2017.
"Identifying low density lipoprotein cholesterol associated variants in the Annexin A2 (ANXA2) gene" was published in Atherosclerosis in 2017.
"Optimising measurement of health-related characteristics of the built environment: Comparing data collected by foot-based street audits, virtual street audits and routine secondary data sources" was published in Health Place in 2017.
"PCSK9 genetic variants and risk of type 2 diabetes: a mendelian randomisation study" was published in Lancet Diabetes & Endocrinology in 2017.
"HAPRAP: a haplotype-based iterative method for statistical fine mapping using GWAS summary statistics" was published in Bioinformatics in 2017.
"Worldwide trends in blood pressure from 1975 to 2015: a pooled analysis of 1479 population-based measurement studies with 19.1 million participants" was published in Lancet in 2017.
"Challenges of monitoring global diabetes prevalence" was published in Lancet Diabetes & Endocrinology in 2017.

Data from the BWHHS was used for PhD training. In the last year, two PhD theses that used BWHHS data were awarded:

Barcoding Cardiovascular risk: Predicting cardiovascular disease in patients with systemic lupus erythematosus (SLE) was completed in 2017.
Mitochondrial DNA copy number as a phenotypic trait for human diseases in genetic epidemiological studies was completed in 2017.

The following outputs will be produced:

Scientific papers are being written to investigate:

1) Which components of the physical environment (e.g. access to green space) where participants of the BWHHS live affect lifestyle behaviours (e.g. physical activity) known to affect the risk of cardiovascular disorders. This will improve understanding of how modifying the physical environment could positively impact the cardiovascular health of older women in the UK. The analysis has been completed and the manuscript is currently being drafted. Once completed, it will be submitted in the coming year to peer-reviewed journals such as International Journal of Epidemiology.

2) Which substances in the blood (called metabolites) are responsible for the correlation of high levels of good cholesterol (called HDL-cholesterol) with cardiovascular health. This work will help to inform the development of new medications that, by raising the levels of good cholesterol, may lead to improved cardiovascular health. The analysis has been completed and the manuscript is currently being drafted. Once completed, it will be submitted in the coming year to peer-reviewed journals such as Circulation.

3) The best ways to identify whether someone will develop a heart attack or a stroke in the future. This is done through the analysis of thousands of substances in the blood (what scientists call proteomics) and is intended to be more accurate than the methods currently used in clinical practice by GPs and specialists. The analysis will commence in November (once new data is received) and it is estimated that the analysis will be complete and the manuscript ready for publication by mid-2018. The manuscript will be submitted to peer-reviewed journals such as Circulation.

4) New drug-targets for prevention of cardiovascular disorders. By combining the information on the genetic information of BWHHS participants with the levels of substances in the blood (called proteomics and metabolomics), it is expected that targets for new medications aimed to prevent the occurrence of heart attacks or stroke will be identified. This strategy is called Mendelian randomization. Future work will use this strategy extensively.

All outputs will contain only data that is aggregated with small numbers suppressed in line with the HES Analysis Guide.

Processing:

This agreement permits continued retention of the data only. The agreement does not permit any other processing of the data.

Identifying details for study participants were supplied to NHS Digital's predecessor organisation. These were matched to the patient entries on the NHS Central Register (since replaced by the Personal Demographics Service) and a flag was added to each matched entry to facilitate ongoing event reporting.

NHS Digital (and predecessors) provided routine notifications of deaths, cancer registrations and exits from or re-entries to registration with the NHS along with the latest recorded demographic data for members of the cohort.

To ensure that type 2 patient objections can be appropriately upheld, NHS Digital purged the list of participants it currently holds and the cohort will be re-flagged in two subgroups.

UCL will supply to NHS Digital two separate lists of study participants: one listing the individuals who have given informed consent and one listing the individuals for whom section 251 support permits the processing of their personal data without informed consent. The lists will contain the following identifiers: NHS Number, Date of Birth, Surname, Forename, Postcode, Gender and unique study ID.

NHS Digital will trace the relevant patient entries for each participant and flag them as being part of the relevant subgroup. NHS Digital will return separate reports for each subgroup to UCL. These will list all individuals who were successfully traced and, for those covered by section 251 support, who had not registered a type 2 patient objection.

UCL will review the data it has historically received from NHS Digital (and predecessors) and if any data was provided for individuals who were not included in either of the lists UCL will send to NHS Digital (described above), UCL must securely destroy the data of those individuals.
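The subgroup reporting and destruction rules described above amount to a few set operations. The following is a minimal sketch only, assuming each participant is represented by a unique study ID held in plain Python sets; the function names and the ID values in the usage note are hypothetical and do not describe NHS Digital's or UCL's actual systems.

```python
from typing import Iterable, Set, Tuple


def build_subgroup_reports(
    consented_ids: Iterable[str],
    s251_ids: Iterable[str],
    traced_ids: Iterable[str],
    type2_objector_ids: Iterable[str],
) -> Tuple[Set[str], Set[str]]:
    """Return (consented report, section 251 report) as sets of study IDs."""
    traced = set(traced_ids)
    # Consented subgroup: every individual who was successfully traced.
    consented_report = set(consented_ids) & traced
    # Section 251 subgroup: traced AND no type 2 patient objection registered.
    s251_report = (set(s251_ids) & traced) - set(type2_objector_ids)
    return consented_report, s251_report


def ids_to_destroy(
    historically_held_ids: Iterable[str],
    consented_ids: Iterable[str],
    s251_ids: Iterable[str],
) -> Set[str]:
    """IDs with historical data that appear on neither list sent to NHS Digital."""
    return set(historically_held_ids) - (set(consented_ids) | set(s251_ids))
```

For example, with consented IDs {"A1", "A2"}, section 251 IDs {"B1", "B2", "B3"}, traced IDs {"A1", "B1", "B2"} and one objector "B2", the two reports would contain "A1" and "B1" respectively.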

NHS Digital will then provide routine notifications of cancer registrations, deaths and changes to status of registration with the NHS. These will include personal demographics data.

UCL stores the data on a server in its secure safe haven facility which can be remotely accessed at the UCL Institute of Health Informatics.

Data will only be accessed by individuals within the BWHHS team at the UCL Institute of Health Informatics who have authorisation from the study director to access the data.

Participant identifiers are stored in a table that is kept separate from the research dataset, which contains all other collated study data. Access to the identifiers is restricted to a small number of authorised individuals, all of whom are substantive employees of UCL. The identifiers are used for administrative purposes only, such as contact with participants and requesting primary care data from participants' GPs. The information has only been managed (received, entered and stored) by four people during the 18 years of the study.

From the data provided by NHS Digital, the UCL staff (described above) produce derivations that are added to the separate research dataset, which contains only pseudonymised data from multiple sources including data from the baseline interview, questionnaires, GP record reviews and biological variables measured from biological specimens. This dataset does not contain participants' names, NHS numbers, Dates of Birth or Dates of Death. Instead, it contains variables such as participants' age rather than Date of Birth and, rather than Date of Death or Date of Cancer Registration, it contains calculated time to event from the date of recruitment (baseline). The baseline date is not included. Full details of how data provided by NHS Digital is converted into data added to the research dataset are given below. All subsequent analyses use only the data in the research dataset.

Extracts from the research dataset are shared with UCL staff and external collaborators for use in analyses in support of the BWHHS objectives subject to a formal approval process (outlined below). Following an audit by NHS Digital in February 2017 the BWHHS team has developed a standard operating procedure for sharing of data. External researchers wishing to collaborate with the BWHHS team will have access to the standard operating procedure document which will be available on the newly updated website at the Institute of Health Informatics, UCL.

Prospective collaborators contact the BWHHS study coordinator to informally discuss the feasibility of the proposed project, to ensure minimum criteria are met and to discuss any questions regarding the application before passing onto the BWHHS Study Director.

The application requires the submission of the BWHHS Collaborators Request Form, completed and signed to confirm that prospective researchers have read the terms and conditions before initial approval by the BWHHS research team and the Study Director. The application requires the following information:
Name, affiliation and contact details of the principal investigator, who should be the main applicant.
Project Title.
Aims - should be focused and as specific as possible.
Higher degree - For those that do not involve a BWHHS researcher (i.e. UCL employees working with BWHHS with approval to access BWHHS data, who will be ONS approved researchers and have attended information governance and data safe haven training), the main applicant should be the Principal Investigator or PhD supervisor. These applications will be considered as a regular data sharing project; no additional supervision should be expected from any BWHHS researcher.
Details of funding - The applicant should specify whether they are seeking funding to carry out the project and whether the grant application is partially or totally based on the use of BWHHS data.
Variables - list of exact variable names, stating from which data collection. Variables requested must be consistent with the project proposal.
Biological samples required.
Ethical issues.
Timing of study.
Signature and date - to confirm that the applicant has read and will comply with the terms and conditions.

All requests are reviewed by the BWHHS study coordinator followed by a review with the BWHHS Director. The Director may deem it necessary to hold discussions with other members of the BWHHS team, before deciding whether to approve the application. Where necessary external peer review will take place.

BWHHS's internal review process will consider:
That there is no overlap with internal projects
Researchers have the necessary skills
Bona Fide Researchers - BWHHS will ensure that collaborators have conducted high quality, ethical projects for research purposes using rigorous scientific methods. They must also have a formal relationship with a bona fide research organisation, which is an established academic institution, research body or organisation with the capability to lead or participate in high quality, ethical research.
Investigators are based and data is stored in the European Economic Area
That the request is adequate - data requested must consist of the variables necessary to answer the research question correctly and to a high quality.
That requested data is relevant to the proposal
It is not an excessive request - only data necessary to answer the specific research question for which the request was made will be provided.
Data is only intended to be used for the aims highlighted in the request form for a specific research question. If future research questions arise, the investigators will complete a new request for collaboration.
No intention to pass on the data to a third party not named in the request
That the request follows BWHHS's rules of data pseudonymisation and data release as detailed below.

Once the completed and signed proposal has been reviewed and authorised by the PI, the study co-ordinator for BWHHS will be responsible for compiling the pseudonymised dataset.

Any project involving external collaborators is recommended to actively involve at least one member of the BWHHS team at UCL. Further use of data outside those specifically stated in the application form will require a new application being submitted and approval from the BWHHS team.

Sharing the data with collaborators allows wider research, maximising its value. Data shared would include only necessary variables for the research question from any of the sources including, event data from GP records, participant self-reported data, data from biological samples, data from the baseline medical examination and the derived variables from NHS Digital data. No NHS Digital data is shared with collaborators.

The data shared with collaborators complies with the following standards:

The BWHHS does not share the following variables: individual identifiers, names, addresses, postcode, telephone numbers, NHS numbers, GP details, place or date of birth. BWHHS will also not share data on low frequency events (n<30), from which an individual may be re-identified.

The BWHHS does not share any sensitive data including occupations or information on mental health.
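The low-frequency rule (n<30) can be illustrated with a short sketch. It assumes records are plain dictionaries and that a rare value is blanked rather than the whole record being dropped; the text does not say how BWHHS applies the threshold in practice, so the function name, the blanking behaviour and the variable name in the usage note are all illustrative assumptions.

```python
from collections import Counter
from typing import Any, Dict, List

MIN_COUNT = 30  # threshold taken from the text (n<30 is not shared)


def suppress_rare_values(
    records: List[Dict[str, Any]], variable: str, min_count: int = MIN_COUNT
) -> List[Dict[str, Any]]:
    """Return copies of the records with rare values of `variable` set to None."""
    # Count how often each non-missing value of the variable occurs.
    counts = Counter(
        r[variable] for r in records if r.get(variable) is not None
    )
    cleaned = []
    for record in records:
        copy = dict(record)  # never modify the caller's data in place
        value = copy.get(variable)
        if value is not None and counts[value] < min_count:
            copy[variable] = None  # assumption: blank the value, keep the row
        cleaned.append(copy)
    return cleaned
```

Calling `suppress_rare_values(records, "event")` on an extract would leave events seen 30 or more times untouched and blank the rest.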

The following restrictions have been applied to the data released to all users.

1. Records are pseudonymised and identified by a unique BWHHS numerical ID.

2. All datasets will be stripped of specific variables that can create a risk of participant identification including:

Date of Birth
Date of clinical events - this includes date of recruitment (baseline)
Any variable with a low prevalence

3. Index of multiple deprivation and super output area variables are only used by members of the BWHHS team.

4. ICD codes are hidden behind generic labels such as MI or stroke. A separate variable will specify whether cause of death is underlying, direct or both. The following describes the data received from NHS Digital and shared and an explanation of the derived variables created before sharing.

Date of Birth - Shared as Derived Variable - By subtracting birth dates from baseline dates and dividing by 365.25, BWHHS gets the age at baseline. Age in years at baseline is shared.
ICD-codes for direct & underlying causes of death - Shared as Derived Variable - Using the event-specific ICD10 codes, variables are created for the specified outcome, classifying it as an underlying cause or direct cause. If both are true, the classification is Underlying + Direct. If neither is true, the outcome is not a cause of death. The result on whether the outcome played a role in the cause of death, either as a direct (D) or as an underlying (U) cause or as both (D+U), is shared.
Date of Death - Shared as Derived Variable - By subtracting baseline dates from each death date, BWHHS gets the number of days to death after the baseline date. Number of days from baseline to death is shared.

5. Instead of date of birth, age at baseline is provided in years, but not the specific date it was collected (actual date of birth falls in a range of 40 months).

6. Dates of clinical events are not provided; instead, time to event is given as the number of days from or to baseline.

7. Data will be transferred through the Data Safe Haven file transfer mechanism only.
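The derived variables described in item 4 can be sketched in a few lines. This is an illustration only: rounding age down to whole years, matching ICD-10 codes by prefix, and the example prefix "I21" (acute myocardial infarction) are assumptions made here, not details given in the agreement, and the function names are hypothetical.

```python
from datetime import date
from typing import Iterable, Optional

DAYS_PER_YEAR = 365.25  # divisor stated in the text


def age_at_baseline(date_of_birth: date, baseline: date) -> int:
    """Age in years at baseline: (baseline - birth) in days / 365.25.

    Truncating to whole years is an assumption of this sketch.
    """
    return int((baseline - date_of_birth).days / DAYS_PER_YEAR)


def days_to_event(event_date: date, baseline: date) -> int:
    """Days from baseline to the event (negative if the event came first)."""
    return (event_date - baseline).days


def cause_of_death_role(
    direct_codes: Iterable[str],
    underlying_codes: Iterable[str],
    outcome_prefixes: Iterable[str],
) -> Optional[str]:
    """Return 'D', 'U', 'D+U', or None if the outcome was not a cause of death."""
    prefixes = tuple(outcome_prefixes)
    is_direct = any(code.startswith(prefixes) for code in direct_codes)
    is_underlying = any(code.startswith(prefixes) for code in underlying_codes)
    if is_direct and is_underlying:
        return "D+U"
    if is_direct:
        return "D"
    if is_underlying:
        return "U"
    return None
```

Only the outputs of these functions would enter the research dataset; the dates and raw ICD codes themselves stay in the restricted identifier table.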

This data will be stored in an Access database where all other BWHHS data is held. Only authorised researchers will have access to this data for the purpose of analysis for publications. All staff working with this data will undergo compulsory annual training in information governance run by UCL Information Services Division.

Prior to transfer of the dataset, another BWHHS researcher will check that the dataset contains authorised variables only, and that personal information has not been mistakenly included. A record of this check is made on BWHHS's collaboration record along with the collaborator's details, project details, the date the dataset is generated and the date the dataset is confirmed to be destroyed.
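A pre-transfer check of this kind could be partly automated. The sketch below assumes an extract is described by its column names and that the approved request lists the authorised variables; the identifier column names are hypothetical placeholders, and the check is a supplement to, not a replacement for, the second researcher's manual review.

```python
from typing import Dict, Iterable, List

# Hypothetical column names for identifiers that must never leave the safe haven.
PROHIBITED_COLUMNS = {
    "name",
    "address",
    "postcode",
    "telephone",
    "nhs_number",
    "date_of_birth",
    "date_of_death",
    "baseline_date",
}


def check_extract(
    columns: Iterable[str], authorised: Iterable[str]
) -> Dict[str, List[str]]:
    """Report columns that are not authorised or that look like identifiers."""
    present = {c.lower() for c in columns}
    allowed = {c.lower() for c in authorised}
    return {
        "unauthorised": sorted(present - allowed),
        "identifiers": sorted(present & PROHIBITED_COLUMNS),
    }


def is_safe_to_transfer(report: Dict[str, List[str]]) -> bool:
    """The extract passes only if both problem lists are empty."""
    return not report["unauthorised"] and not report["identifiers"]
```

The returned report could be stored alongside the entry in the collaboration record as evidence that the check was run.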

All organisations party to this Agreement must comply with the Data Sharing Framework Contract requirements, including those regarding the use (and purposes of that use) by Personnel (as defined within the Data Sharing Framework Contract - i.e. employees, agents and contractors of the Data Recipient who may have access to that data).

A study exploring the relationships between cognitive and sensory impairments and experiences of abuse and discrimination — DARS-NIC-164594-K4C5N

Opt outs honoured: (Excuses: Does not include the flow of confidential data)

Purposes: No (Academic)

Sensitive: Non-Sensitive

When:DSA runs 2019-03 – 2020-02 SMLS reported a DPA serious incident; breached contract — audit report.

Access method: One-Off

Data-controller type: UNIVERSITY COLLEGE LONDON (UCL)

Sublicensing allowed: No

AGD/predecessor discussions: IGARD Minutes - 31 March 2022 FINAL.pdf, igard-minutes---19th-march-2020-final.pdf, igardminutes-11thfebruary2021final.pdf

Datasets:

Adult Psychiatric Morbidity Survey
Adult Psychiatric Morbidity Survey (APMS)

Type of data: Anonymised - ICO Code Compliant

Objectives:

The applicant has published previously on the 2000 and 2007 APMS datasets. Of this recurring survey, which first took place in 1993, the 2014 dataset is the first to be held by NHS Digital. The survey is undertaken by NatCen, supported by a writing group of academics (including the applicant) from several universities including UCL, who undertake secondary data analyses that have contributed a significant body of knowledge on mental wellbeing in the UK over the past 25 years. UCL are requesting access to the 2014 APMS Data Set for the purposes of a study instigated by a clinical reader at the UCL Division of Psychiatry.

This study will explore how and why cognitive and sensory impairments might be associated with worse mental health, and how health and social disadvantages might increase the risk of experiencing abusive and discriminatory experiences. The UK Department of Health defines abuse as "a violation of an individual's human and civil rights by another person(s)" (2). Abuse is defined by the impact, rather than the intention, of actions or inactions on an individual. Hearing and visual loss are very common, with hearing loss alone affecting a third of people aged 65 and over. Sensory impairments can present barriers to social engagement, and people who experience sensory impairments are more likely to require help from others in Activities of Daily Living. Sensory impairments might therefore be a risk factor for experiencing abuse and discrimination, although this has not been studied before.

The APMS survey data would be transferred to UCL data safe haven, where it would be analysed. The only outputs would be for publication in research journals, and no participant would be identifiable from these publications. No other organisations are involved in the data analysis.

Previous publications from UCL using the 2007 APMS dataset have found that older people are less likely to receive evidence-based treatment for common mental disorders (CMD). Using the 2000 APMS dataset, it was reported that the lower quality of life previously reported by people with cognitive impairment is due to the greater physical and mental health problems in this population, rather than to cognitive impairment per se. These findings have helped to build an overall understanding that it is often the adverse circumstances of vulnerable groups (barriers to accessing treatment and physical morbidity) that result in their vulnerability to CMD. In the planned study UCL wish to extend this work to people with sensory impairment.

The purpose of the project is to conduct secondary analysis of the APMS 2014 data, to further understand how sensory and cognitive impairments might impact on abusive and discriminatory experiences, in order to increase scientific understanding in this area. There is a paucity of research currently on the mental health of people with sensory impairment in the UK, which is why the applicant wishes to explore the proposed work plan. The research team at UCL want to find out how having a sensory impairment can impact on mental health and social experiences, so that ways to improve the lives of people with sensory impairments can be explored. The researchers intend to publish findings in an open access scientific journal. The secondary analysis work was initiated by UCL in discussion with NatCen and the wider APMS writing group.

This work is not independently funded and will be carried out by the applicant and a team of researchers based at UCL. The work was instigated within the remit of the applicant's post as clinical reader at the UCL Division of Psychiatry. No organisations other than UCL are involved in the planned data analysis; all those involved in the processing of the data are substantive employees of UCL. No elements of the work will take place outside the UK.

Yielded Benefits:

Secondary analyses of the 2007 APMS by the writing group have already been widely cited. The 2014 APMS report, published by NHS Digital, cites the contribution of UCL's previous work in this area and its influence on the 2014 survey that UCL are requesting as follows: Analyses of APMS 2007 data indicated that white people were the ethnic group most likely to receive mental health treatment (Cooper et al. 2013) and that people of working age were more likely than older people to get appropriate treatment, especially psychological therapy (Cooper et al. 2010). APMS 2014 allowed researchers to examine whether these inequalities have persisted, and (due to the introduction of a new question in 2014) whether some groups of people are more likely to have requested mental health treatment but not received it than other groups. A paper (listed on previous page as paper 1) that used the APMS data to explore the relationship between sensory impairment and common mental illnesses, and assessed social functioning as a mediator of this relationship as well as the 'treatment gap', has now been submitted for publication.

Expected Benefits:

There is a lack of research currently on the mental health of people with sensory impairment in the UK, which is why UCL wishes to explore the proposed work plan. Previous work carried out by this group of researchers has raised awareness of the mental health needs of older people. It is expected that this research will allow for increased scientific knowledge of how cognitive and sensory impairments might impact on the likelihood of experiencing abuse or discrimination. The benefit of exploring the mental health of people with sensory impairment in the UK is that more information becomes available to a variety of clinicians and organisations to make better decisions in the provision of health care for those affected. With the results derived from the statistical analysis undertaken, this extra scientific knowledge can be disseminated as relevant information to a variety of people.

The employees of UCL involved in this research project are well placed to disseminate knowledge directly into mental health services through a number of channels:

1. Via professional work as clinical psychiatrists. The research can be disseminated directly to colleagues, and discussed/used in continuing professional development forums to improve the standard of health care for people with sensory impairment in the UK. If the researchers are able to understand to what extent and how people with sensory impairments, for example, are at greater risk of mental disorder, this will inform targeting/planning of future talking therapies, and adaptation to remove barriers to this population accessing mental health services.

2. The applicant is also the lead on the Alzheimer's Society Centre of Excellence Independence at Home, which is based at University College London. Any findings that are directly related to patients affected by Alzheimer's can be disseminated via the work here to inform future work programmes that can address any inequalities identified in the analysis, and to help improve the standard of healthcare given to those who have sensory or cognitive impairments.

3. They are well placed and connected with a variety of Third Sector Organisations (Charities). Any findings that are directly relevant to the older generation can be disseminated to Age UK to inform future targeted work programmes again to address the inequalities discovered through the research to help better the healthcare provision for those with cognitive and sensory impairments, and to remove the barriers that this population may face when accessing mental health services.

Outputs:

UCL will disseminate findings through peer reviewed journal papers, supported by UCL media press release where findings are likely to be of widespread national interest.
The following is a list of planned papers for publication in peer reviewed journals; it is anticipated these will be published by February 2020:

1. A paper exploring the relationships between sensory impairment and mental health and possible mediators of this relationship, as well as the 'treatment gap' for people with sensory impairments relative to those without.
2. A paper exploring the relationships between cognitive impairment and mental health and possible mediators of this relationship
3. A paper exploring how sensory impairment is associated with suicidal ideation and any mediators of an association
4. A paper exploring what predicts whether people who are experiencing symptoms of common mental disorder self-diagnose or receive a professional diagnosis
5. A paper exploring sensory impairment and psychosis and possible mediators of any association.

Target journals include British Journal of Psychiatry, Lancet Psychiatry, International Psychogeriatrics. The papers are intended for the scientific community, as well as interested members of clinical professions and the public. The applicant organisation has collaborated with the Alzheimer's Society on previous press releases.

All outputs published will be in the form of aggregated outputs with small numbers suppressed (as is in line with the HES-Analysis guide).
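As an illustration of what small number suppression in an aggregated output can look like, the following is a minimal Python sketch. The threshold value, the function name and the choice to leave true zeros unsuppressed are assumptions made for this example only; the actual rules are those set out in the HES Analysis Guide, which is not reproduced here.

```python
from typing import Dict, Optional

# Hypothetical threshold for illustration; the applicable guide defines the real value.
ASSUMED_THRESHOLD = 8


def suppress_small_counts(
    counts: Dict[str, int], threshold: int = ASSUMED_THRESHOLD
) -> Dict[str, Optional[int]]:
    """Return a copy of the table in which non-zero counts below the
    threshold are replaced by None (shown as suppressed when published)."""
    return {
        group: (None if 0 < n < threshold else n)
        for group, n in counts.items()
    }


# Made-up aggregate counts, not survey data.
example_table = {"group_a": 120, "group_b": 3, "group_c": 0}
suppressed_table = suppress_small_counts(example_table)
print(suppressed_table)
```

This sketch covers primary suppression only. It does not handle secondary suppression, where further cells are withheld so that a suppressed value cannot be recovered from row or column totals.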

Processing:

All organisations party to this agreement must comply with the Data Sharing Framework Contract requirements, including those regarding the use (and purposes of that use) by Personnel (as defined within the Data Sharing Framework Contract i.e.: employees, agents and contractors of the Data Recipient who may have access to that data).

The 2014 APMS dataset is held on behalf of NHS Digital by the UK Data Service (UKDS) (www.ukdataservice.ac.uk ) and UKDS are responsible for dissemination under direction by NHS Digital. UCL will get the whole dataset; there is no facility to select individual variables. They will be able to download the dataset from UKDS for the period specified within the DSA and they must securely destroy all local copies of the dataset when the DSA expires and notify DARS in line with standard procedures. This 2014 version of the dataset available via DARS has been redacted on Disclosure Control Procedure advice to minimise the likelihood of individuals being able to identify anyone taking part in the survey.

Once an active data sharing agreement is in place, UKDS will transfer the pseudonymised APMS data to UCL. It will be transferred and accessed within the Data Safe Haven. This is UCL's data service for storing, handling and analysing identifiable data. It has been certified to the ISO27001 information security standard and conforms to NHS Digital's Information Governance Toolkit.

The data transferred from UKDS to UCL data safe haven contains survey data that is potentially identifiable. This transfer will be governed by accepted, standard procedures used by UCL Data Safe Haven identifiable data transfer portal and described on their website: http://www.ucl.ac.uk/isd/itforslms/services/handling-sens-data/tech-soln

Analysis of the data will be undertaken using statistical software, in order to understand the mental health needs of those with sensory impairment. All analyses will be performed using data weighted to take account of the complex survey design and of non-response in order to ensure that the results are representative of the British household population. Conceptual models developed that seek to explain abuse consider: victim vulnerability, abuser stress, psychopathology or impairment, intra-individual dynamics and societal attitudes. The research team at UCL have based the hypotheses of the research project on this model, hypothesising that those who are more vulnerable due to cognitive or sensory impairments are at greater risk of worse mental health and suicidal ideation, abusive or discriminatory experiences, and that this increased risk may be due to dependence on others for care and relative isolation from protective social support.

The data will not be stored, processed or in any other way accessible by a third party organisation. The data will be held on, and only analysed on a UCL computer within the UCL Division of Psychiatry.

Data transfers out of UCL will only be in the form of aggregated, non-identifiable data, published as research journal papers, with all small numbers suppressed as is in line with the HES-Analysis guide. Data will not be accessed outside the UK. All those processing the data are substantive employees of University College London. The data NHS Digital supplies will be stored within the Data Safe Haven. The data will not be linked or compared (matched) with other data sets. There will be no attempts to try and re-identify or re link to identifiable record level patient data.

Mental disorders and help seeking for mental or physical health conditions in sexual minorities — DARS-NIC-159576-C0V1M

Opt outs honoured: (Excuses: Does not include the flow of confidential data)

Purposes: No (Academic)

Sensitive: Non-Sensitive

When:DSA runs 2019-05 – 2022-05 SMLS reported a DPA serious incident; breached contract — audit report.

Access method: One-Off

Data-controller type: UNIVERSITY COLLEGE LONDON (UCL)

Sublicensing allowed: No

AGD/predecessor discussions: IGARD Minutes - 29 September 2022 final.pdf

Datasets:

Adult Psychiatric Morbidity Survey
Adult Psychiatric Morbidity Survey (APMS)

Type of data: Anonymised - ICO Code Compliant

Objectives:

The Division of Psychiatry at UCL's legal basis for processing personal data under GDPR is the performance of a public task (by a public organisation) as set out in Article 6(1), point (e) (necessary for the performance of a task carried out in the public interest or in the exercise of official authority vested in the controller, where the task or function has a clear basis in law) and Article 9(2), point (j) (necessary for archiving purposes in the public interest, scientific or historical research purposes or statistical purposes).

The Division of Psychiatry at UCL has a 20 year history of research into LGBT (lesbian, gay, bisexual, and transgender) mental health and well-being. The Adult Psychiatric Morbidity Survey (APMS) in 2007 was the first time that questions on sexual orientation were included in such a national survey. The chief investigator for this project from UCL collaborated with the survey makers (NatCen) on the most appropriate wording for questions about sexual orientation. Thus, UCL has a long history of research in this field and particularly in collaboration with the APMS survey.

There are many positive aspects to being gay. Evidence suggests that lesbians and gays may be more resourceful and self-reliant, have a wider circle of supportive friends and more disposable income than heterosexuals. However, there is also evidence that people who self-identify as gay or lesbian have poorer psychological health and lower social well-being than the heterosexual population in modern Britain. This is puzzling as recent social attitudes to same-sex relationships have become much more positive. The last Adult Psychiatric Morbidity Survey was the first national survey of its kind in the UK to include questions on sexual minority status. The results showed an excess of mental disorder of all types in the lesbian, gay and bisexual population (Chakraborty et al 2010). The current research seeks to discover whether this has changed and, in particular, whether mental distress may be less common in younger generations who have experienced less negative social attitudes.

There is considerable evidence that people who identify as LGBT suffer higher rates of mental disorder than their heterosexual counterparts. However, the sample populations studied are often not probabilistic, the origins of this distress are not always clear and it is not known whether higher levels of distress fall as societies become more accepting of same sex couples. UCL therefore are planning a research study with the following aims:

1. To compare rates of mental disorders and help seeking for mental or physical health conditions in non-heterosexual and heterosexual people in 2007 and 2014.
2. To model predictors of mental distress, well-being and help seeking in non-heterosexual people

The predictors identified, and the interventions suggested to modify them, will have important clinical and public health implications. This study therefore has significant potential to address health inequalities.

The study population consists of people aged less than 65 years (people aged 65+ were not asked about their sexuality in 2014) who responded to either the 2007 or 2014 Adult Psychiatric Morbidity Survey (APMS). The study design takes the form of cross-sectional studies carried out in England.

Chakraborty A T, McManus S, Bebbington P, Brugha T, Nicholson S, & King M (2010). Mental health of the non-heterosexual population of England. British Journal of Psychiatry: 198, 143-148.

Yielded Benefits:

The research has yielded no tangible benefits so far as it has taken the statistician longer than envisaged to carry out the analyses. The approach to the data and analysis must be as painstaking as possible as it is complex, and the researchers wish to make certain that the analyses and interpretation are accurate. Additionally, as this work is unfunded and funded work with more pressing deadlines has taken priority, there has been a delay in completing the work. Furthermore, the clinical trials unit (PRIMENT at UCL) in which the researchers work has been distracted with the international nature of some trials with regard to Brexit, for example supply of medications for CTIMPS (drug trials: Clinical Trial of Investigational Medicinal Products).

Expected Benefits:

Using these data, it will be possible to see whether lesbians, gays and bisexual people have excess mental distress and the factors that are associated with this. Information will also be available on positive aspects of wellbeing and resilience, and whether there have been changes since the 2007 survey. This is a key issue for the health of sexual minorities. It is also a current priority for Government, which has recently published a large survey of the health of sexual minorities. The problem with this survey, however, is that it is a much less representative sample of LGBT people than those in the APMS study. Thus, this analysis will be an important contribution to the current state of knowledge. UCL plan to disseminate it at the International Meeting of the Royal College of Psychiatrists International Congress, as well as at the annual meeting of the World Association of Social Psychiatry. Study findings will be published in leading journals such as Lancet Psychiatry, the British Journal of Psychiatry, and World Psychiatry. The clinical experience within the team will maximise the chances of our findings being translated into clinical and public health recommendations regarding interventional work. Our links with the PRIMENT Clinical Trials Unit will improve the chances of successful applications for trial funding to evaluate such interventions. Members of our research team have participated in a Government Equalities Office (GEO) to advise on effective interventions to improve LGBT mental health. These policy links will enhance the application of our research findings on effective interventions to policy developments.
This APMS work therefore represents an important foundation for future work to benefit the health of LGB people, in line with the Government's LGBT Action Plan 2018.

UCL will take action to improve mental healthcare for LGBT people. The Department of Health and Social Care and the Government Equalities Office will jointly develop a plan focused on reducing suicides amongst the LGBT population. The Department of Health and Social Care will ensure LGBT people's needs are addressed in the updated Suicide Prevention Strategy, and the new Health Education England suicide prevention competency framework will cover high-risk groups including LGBT people.

There have been large positive changes in the UK and other western societies in terms of acceptance of LGBT rights, including equal marriage and the Equality Act. Thus, society should have begun to see changes in their mental health. However, UCL's recent analysis of data from an English birth cohort of non-heterosexual people aged 20 (the Avon Longitudinal Study of Parents and Children, ALSPAC) indicates that levels of psychological distress remain markedly elevated over that of their heterosexual peers (Irish et al, 2019). This is puzzling and needs replication in a national sample that is more representative of England. If UCL find a similar picture in the APMS data, available co-variates will need to be explored in more detail to understand why that might be. This will inform more targeted prevention, as well as support and treatment for those affected in the general population.

The 2007 APMS was the first to ask a question about sexual orientation and so the results of UCL's analysis (which combines the 2007 and 2014 surveys) of mental health and service use in LGBT people will be crucial in determining how best services, schools and the wider community can act together to improve the health of this sexual minority. This may involve health and well-being campaigns that are directed at the whole community and not just LGBT people, who unfortunately are still too often the targets of exclusion and discrimination.

Outputs:

All outputs will contain only data that is aggregated with small numbers suppressed in line with the HES Analysis Guide.

The results of this research will be published in high-impact peer-reviewed journals and presented at conferences (for example, the Royal College of Psychiatrists Annual International Congress 2019; the Health Studies User Conference 2019). In particular, results will be published in journals concerned with mental health and medical practice. The peer-reviewed articles will be available on the UCL repository giving free access to either the published article or the accepted manuscript depending on the journal. The aim is to have the work published by the end of 2019.

To reach a broader audience UCL will also use Twitter, UCL blogs, Mental Elf blogs, and media such as The Conversation to disseminate the key messages of our research, and their clinical implications.

UCL plan to disseminate and recommend action in late 2018 or early 2019. Organisations involved in implementation include the Royal College of Psychiatrists, the Mental Health Foundation and Stonewall. The principal investigator (PI) for this project is on the Executive of the Royal College of Psychiatrists Special Interest Group in LGB mental health and thus has a direct line to their policy units.

Processing:

All organisations party to this agreement must comply with the Data Sharing Framework Contract requirements, including those regarding the use (and purposes of that use) by Personnel (as defined within the Data Sharing Framework Contract ie: employees, agents and contractors of the Data Recipient who may have access to that data).

The 2014 APMS data set is held on behalf of NHS Digital by the UK Data Service (UKDS) (www.ukdataservice.ac.uk ) and UKDS are responsible for dissemination under direction by NHS Digital. UCL will get the whole data set; there is no facility to select individual variables. UCL will be able to download the data set from UKDS for the period specified within the DSA and must securely destroy all local copies of the data set when the DSA expires and notify NHS Digital in line with standard procedures. This 2014 version of the data set available has been redacted on Disclosure Control Procedure advice to minimise the likelihood of individuals being able to identify anyone taking part in the survey.

Once an active data sharing agreement is in place, UKDS will transfer the pseudonymised APMS data to UCL. It will be transferred and accessed within the Data Safe Haven. This is UCL's data service for storing, handling and analysing identifiable data. It has been certified to the ISO27001 information security standard and conforms to NHS Digital's Information Governance Toolkit.

The data transferred from UKDS to UCL data safe haven contains survey data that is potentially identifiable. This transfer will be governed by accepted, standard procedures used by UCL Data Safe Haven identifiable data transfer portal and described on their website: http://www.ucl.ac.uk/isd/itforslms/services/handling-sens-data/tech-soln. Data will also be stored in the UCL Data Safe Haven which is IG Toolkit assured, dual-factor authenticated, access is determined on a need-to-know basis, firewall has a default deny policy, data enters via a managed file transfer mechanism and only the information asset owner has permission by default to draw down any data. All those on the study team who will analyse data are required to complete Data Security Training, as provided by Health Education England, and to complete the authorisation process. Data will be destroyed in time for the termination of the Data Sharing Agreement with NHS Digital so that only the outputs remain. A notice of destruction will be issued in writing to confirm, 90 days after the user has clicked delete, once the backup cycle has overwritten the files and the data have not been restored in the intervening time.

All analyses will make use of the weightings provided with the data sets to give results that are applicable to the population and account for the primary sampling unit. New weightings for the 2007 survey, provided in 2016, will be used. Descriptive statistics will also be carried out unweighted (representing those in the data sets only). All weighted analyses will be carried out using Stata version 14 survey (svy) commands.
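To show the difference between a weighted and an unweighted descriptive estimate, the following is a minimal Python sketch of a weighted prevalence (the sum of weight times outcome divided by the sum of weights). It is an illustration only and not the study team's method, which uses Stata svy commands. The function name and the toy values are assumptions for this example and are not survey data.

```python
from typing import Sequence


def weighted_prevalence(outcomes: Sequence[int], weights: Sequence[float]) -> float:
    """Weighted proportion of respondents with outcome == 1.

    outcomes: 0/1 indicator per respondent (hypothetical variable).
    weights:  survey weight per respondent (hypothetical variable).
    """
    if len(outcomes) != len(weights):
        raise ValueError("outcomes and weights must have the same length")
    total_weight = sum(weights)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive value")
    return sum(w * y for y, w in zip(outcomes, weights)) / total_weight


# Toy values for illustration only.
toy_outcomes = [1, 0, 0, 1]
toy_weights = [0.5, 1.5, 1.0, 1.0]

unweighted = sum(toy_outcomes) / len(toy_outcomes)
weighted = weighted_prevalence(toy_outcomes, toy_weights)
print(unweighted, weighted)
```

The sketch gives a point estimate only. Standard errors that respect the primary sampling unit and other features of the complex design need dedicated survey routines, which is the role of the svy commands named above.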

Descriptive analysis
Descriptive analyses will be undertaken for each survey (2007 and 2014) as well as in the whole data set combined. However for some of the variables below, it will not be possible to examine these by year of survey because questions were changed somewhat between surveys. If any particular outcome from those noted above is not available for either year, it will only be reported in the analysis for the year it is available.

Variables to be included:
Social, demographic and personal: such as age, sex, children, and social support.

Mental health outcomes:
The main outcome of interest is what is called common mental disorder. This is a variable which combines all the diagnoses which come under the general rubric of depression and anxiety. Other factors such as severe mental illness, self-harm and drug use will also be examined. However UCL do not wish to lose sight of positive factors such as emotional well-being and will also present group comparisons on these measures.

Help-seeking and use of services:
The 2014 APMS data set is rich in information on reported help seeking and so UCL shall explore to what extent people have received counselling, their contacts with primary medical care and social care services, and whether they have been admitted to hospital.

Other variables:
UCL will also explore differences between heterosexual and non-heterosexual people on factors such as spiritual beliefs and religious practice.

The differences between the data contained in the 2007 and 2014 APMS will enable UCL to examine change with time in terms of what is called period effects, to determine whether there are differences for people of similar age and background between the two time points.

In more sophisticated analyses UCL will explore the possible reasons for hypothesised elevated rates of psychological distress in LGB people, for example discrimination, childhood abuse and neglect, parenting, interpersonal violence, and bullying. UCL will examine the effect of age, sex and general health on the associations observed and where possible examine men and women separately. In some instances this may be limited by numbers. If numbers become very small in sub-categories, UCL shall amalgamate the two groups into the more general comparison of heterosexual versus non-heterosexual.

There will be no data linkage undertaken with NHS Digital data provided under this agreement that is not already noted in the agreement.

Data will only be accessed and processed by substantive employees of University College London and will not be accessed or processed by any other third parties not mentioned in this agreement.

This research will only report aggregate data showing patterns overall; meaning no individual will be able to be identified.

MR1129 - SCORAD Feasibility Study — DARS-NIC-156409-W056Z

Opt outs honoured:

Legal basis: Informed Patient consent to permit the receipt, processing and release of data by NHS Digital

Purposes: No (Academic)

Sensitive: Sensitive, and Non-Sensitive

When:DSA runs 2011-10 – 2026-10 SMLS reported a DPA serious incident; breached contract — audit report.

Access method: One-Off

Data-controller type: UNIVERSITY COLLEGE LONDON (UCL)

Sublicensing allowed: No

AGD/predecessor discussions: igard-draft-minutes-4th-july-2019-final.pdf, igard-draft-minutes-13th-june-2019-final.pdf, igard-minutes-11th-april-2019---final.pdf, igard-minutes-17th-january-2019---final.pdf

Datasets:

MRIS - Cause of Death Report
MRIS - Cohort Event Notification Report
MRIS - Flagging Current Status Report
MRIS - Members and Postings Report
MRIS - Personal Demographics Service
MRIS - Scottish NHS / Registration

Type of data: Identifiable

Objectives:

SCORAD III: A randomised phase III trial of single fraction radiotherapy compared to multifraction radiotherapy in patients with metastatic spinal cord compression

This trial aims to determine whether patients with spinal cord compression can maintain or regain the ability to move and walk as well after one dose of radiotherapy as after five doses.

It will also examine whether they have similar quality of life and tolerance to treatment regardless of which treatment they receive. It will evaluate single fraction radiotherapy against multifraction radiotherapy in terms of ambulatory status, function, quality of life and toxicity to 3 months and survival to 12 months. This trial is attractive to radiotherapists because either treatment is simple to deliver. Furthermore, the single fraction regimen requires fewer NHS resources, results in both a reduced hospital stay and less movement for patients with spine damage and shortened waiting lists as radiotherapy slots are freed.

MR688 - UK Ductal Carcinoma in Situ (DCIS) Trial — DARS-NIC-148348-XTNPJ

Opt outs honoured:

Legal basis: Approved researcher accreditation under section 39(4)(i) and 39(5) of the Statistical Registration Service Act 2007; National Health Service Act 2006 - s251 - 'Control of patient information'.

Purposes: No (Academic)

Sensitive: Sensitive

When:DSA runs 2010-09 – 2020-09 SMLS reported a DPA serious incident; breached contract — audit report.

Access method: Ongoing

Data-controller type: UNIVERSITY COLLEGE LONDON (UCL)

Sublicensing allowed: No

Datasets:

MRIS - Cause of Death Report
MRIS - Cohort Event Notification Report
MRIS - Flagging Current Status Report
MRIS - Members and Postings Report
MRIS - Personal Demographics Service
MRIS - Scottish NHS / Registration

Type of data: Identifiable

Objectives:

UK Ductal Carcinoma in situ (DCIS) Trial

To perform a randomised 2 x 2 trial to determine, in screen detected DCIS, the effect on the incidence of subsequent invasive breast cancer of complete excision (WLE) alone compared to that of WLE followed by radiotherapy to the residual breast tissue and/or tamoxifen 20mg daily for five years. The incidence of subsequent DCIS and the cause of death were also monitored, as was the continued use of hormone replacement therapy or oral contraceptives.

Policy Research Unit for Children, Young People and Families — DARS-NIC-393510-D6H1D

Opt outs honoured: Y, No - data flow is not identifiable, No (Excuses: Does not include the flow of confidential data)

Legal basis: Health and Social Care Act 2012, Section 42(4) of the Statistics and Registration Service Act (2007) as amended by section 287 of the Health and Social Care Act (2012), Health and Social Care Act 2012 – s261(1) and s261(2)(b)(ii), Health and Social Care Act 2012 - s261 - 'Other dissemination of information', Health and Social Care Act 2012 s261(1) and s261(2)(b)(ii), , Health and Social Care Act 2012 s261(2)(b)(ii), Health and Social Care Act 2012 s261(2)(a)

Purposes: No (Academic)

Sensitive: Non Sensitive, and Sensitive, and Non-Sensitive

When:DSA runs 2019-03 – 2022-03 2017.06 — 2025.11. SMLS reported a DPA serious incident; breached contract — audit report.

Access method: One-Off, Ongoing

Data-controller type: UNIVERSITY COLLEGE LONDON (UCL)

Sublicensing allowed: No

AGD/predecessor discussions: AGD minutes - 18th September 2025 final.pdf, AGD minutes - 4 May 2023 final.pdf, IGARD Minutes - 17 February 2022 Final.pdf, IGARD Minutes - 22nd April 2021 final.pdf, igard-minutes-27th-june-2019---final.pdf, igardminutes-8thoctober2020final.pdf, IGARD_Minutes_20.07.17.pdf, IGARDMinutes-4thMarch2021final.pdf, IGARD Minutes - 26 May 2022 final.pdf, IGARD Minutes - 31 March 2022 FINAL.pdf, IGARD Minutes - 25 November 2021 final.pdf

Datasets:

Hospital Episode Statistics Outpatients
Hospital Episode Statistics Admitted Patient Care
Hospital Episode Statistics Critical Care
Hospital Episode Statistics Accident and Emergency
Office for National Statistics Mortality Data (linkable to HES)
Office for National Statistics Mortality Data
Civil Registration - Deaths
HES:Civil Registration (Deaths) bridge
Emergency Care Data Set (ECDS)
Mental Health and Learning Disabilities Data Set
Mental Health Minimum Data Set
Mental Health Services Data Set
Civil Registration (Deaths) - Secondary Care Cut
Birth Notification Data
Civil Registration - Births
Community Services Data Set
COVID-19 Second Generation Surveillance System
Covid-19 UK Non-hospital Antigen Testing Results (pillar 2)
HES-ID to MPS-ID HES Accident and Emergency
HES-ID to MPS-ID HES Admitted Patient Care
HES-ID to MPS-ID HES Outpatients
MRIS - Bespoke
MSDS (Maternity Services Data Set)
MSDS (Maternity Services Data Set) v1.5
Civil Registrations of Death - Secondary Care Cut
Hospital Episode Statistics Accident and Emergency (HES A and E)
Hospital Episode Statistics Admitted Patient Care (HES APC)
Hospital Episode Statistics Critical Care (HES Critical Care)
Hospital Episode Statistics Outpatients (HES OP)
Community Services Data Set (CSDS)
COVID-19 Second Generation Surveillance System (SGSS)
COVID-19 UK Non-hospital Antigen Testing Results (Pillar 2)
Maternity Services Data Set (MSDS) v1.5
Mental Health and Learning Disabilities Data Set (MHLDDS)
Mental Health Minimum Data Set (MHMDS)
Mental Health Services Data Set (MHSDS)
COVID-19 SGSS First Positives (Second Generation Surveillance System)
Maternity Services Data Set (MSDS) v2
Civil Registrations of Death

Type of data: Anonymised - ICO Code Compliant

Objectives:

The data is requested for a programme of research within the healthcare provision theme of the Policy Research Unit for Children, Young People and Families, within University College London (UCL) funded by the Department of Health (DoH). The objectives of the research are:

a) To determine variation in use of secondary care services by children and young people over time and their transition to adult services. UCL will analyse variation by patient characteristics (e.g. age, gender, GP registration), and by area/unit level area characteristics such as trust, practice characteristics such as QOF scores, and area indicators for deprivation.

b) To determine risk factors for emergency use of secondary care and risk factors for recurrent use (e.g. according to individual patient characteristics such as age, chronic conditions, deprivation, sex), past use (e.g. frequency and type of past contact such as A&E or admissions). UCL will also examine NHS trust and area factors associated with secondary care use. Where possible, UCL will use birth cohort analyses, based on postnatal admissions of children linked to maternity to maternal risk factors (e.g.: maternal age) and birth factors (e.g.: birth weight, prolonged stay in neonatal intensive care), to investigate associations with risk of emergency use of secondary care and other outcomes, including mortality.

c) UCL will conduct prognostic analyses for children and young people based on diagnosis and procedure codes to identify risk factors for emergency hospital care and for subsequent long-term adverse outcomes into adulthood (e.g.: further emergency admissions or death).

d) UCL would also like to request all ONS death records for deaths registered in England from 1st January 1998 until as late as possible, for all persons who died aged 0-55; in other words, both records that link to HES as well as those that do not link to HES. UCL need all deaths in order to assess the degree of misclassification of outcome (alive/dead) due to linkage errors between ONS and HES datasets. It is crucial that UCL get the age at death on the ONS records in order to do this. Full dates of death are required to be able to estimate age at death in days for the work on infant mortality (for instance, to be able to distinguish first-week deaths from later neonatal deaths and from postneonatal deaths), as well as for the work on cause specific mortality where UCL classify deaths based on admissions within a certain number of days from death. Additionally, having date of death available enables UCL to determine delay in death registration for data validation.
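The reason full dates are needed in item d) can be illustrated with a minimal Python sketch that derives age at death in days, assigns an infant death category and computes the registration delay. The function names, the example dates and the 365-day cut-off for the first year of life are assumptions for this illustration; the day bands follow the commonly used definitions of early neonatal (under 7 days), late neonatal (7 to 27 days) and postneonatal (28 days to under one year) deaths, and are not taken from the agreement itself.

```python
from datetime import date


def age_at_death_days(date_of_birth: date, date_of_death: date) -> int:
    """Age at death in completed days."""
    days = (date_of_death - date_of_birth).days
    if days < 0:
        raise ValueError("date of death precedes date of birth")
    return days


def classify_infant_death(age_days: int) -> str:
    """Assign a category using assumed, commonly used day bands."""
    if age_days < 7:
        return "early neonatal"
    if age_days < 28:
        return "late neonatal"
    if age_days < 365:  # approximation of "under one year"
        return "postneonatal"
    return "not an infant death"


def registration_delay_days(date_of_death: date, date_of_registration: date) -> int:
    """Days between death and its registration, usable as a validation check."""
    return (date_of_registration - date_of_death).days


# Made-up dates for illustration only.
example_age = age_at_death_days(date(2010, 1, 1), date(2010, 1, 5))
example_category = classify_infant_death(example_age)
example_delay = registration_delay_days(date(2010, 1, 5), date(2010, 1, 12))
print(example_age, example_category, example_delay)
```

With only month or year of death, the first two categories could not be separated, which is why the full date is requested.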

What will be done with the data?
UCL will use longitudinal HES data, linked to ONS death records, to construct cohorts for a number of patient subgroups defined by age, sex, and clinical characteristics, to address the questions above. All analyses will be done within the safe haven. All outputs will contain only data that is aggregated with small numbers suppressed in line with the HES Analysis Guide.
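As an illustration of what small number suppression means for an aggregated output, the following sketch masks non-zero counts below a threshold. The threshold value and the marker are assumptions chosen for the example; the actual rule is whatever the HES Analysis Guide specifies, and the function name is hypothetical.

```python
from typing import Dict, Union


def suppress_small_numbers(
    counts: Dict[str, int],
    threshold: int = 8,   # assumed value for illustration only
    marker: str = "*",
) -> Dict[str, Union[int, str]]:
    """Replace non-zero counts below `threshold` with `marker`; zeros are kept."""
    suppressed: Dict[str, Union[int, str]] = {}
    for group, n in counts.items():
        if n < 0:
            raise ValueError(f"negative count for {group!r}")
        suppressed[group] = marker if 0 < n < threshold else n
    return suppressed
```

This sketch covers primary suppression only. It does not handle secondary suppression (masking further cells so that a suppressed value cannot be recovered from row or column totals), which a real output check would also need to consider.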

Yielded Benefits:

Benefits from the data already received include: The data has shown an increase in adversity-related injury rates (defined as self-inflicted or related to drug/alcohol use or violence) in teenagers in England. The increases may indicate a growing problem, particularly in females and for intentional self-injury in males aged 15-19 years. UCL aim to extend this research to determine how early preventive interventions in community services (schools, family, and primary care) can affect presentations to hospital. Research has also included work on mortality in vulnerable mothers with opioid use during pregnancy, which found that mothers with opioid use were 11 times more likely to die during the 10 years after childbirth. UCL also explored time to next live birth for vulnerable mothers, finding that women with records indicating vulnerability, such as mental health problems, age <20 years or high parity (defined as the number of pregnancies reaching viable gestational age, including live births and stillbirths), have their next child sooner than other women. Results have been fed back to the Department of Health and will inform the next years of the CPRU programme.

Expected Benefits:

The research carried out by UCL directly influences DoH policy makers, service providers, healthcare professionals and the general public. This directly benefits the health of children and the healthcare provided to children in the here and now and this is key to reducing future burden on the NHS.

Benefits from the data already received include:

For example, UCL's research, recently published in The Lancet, showed similarly increased risks of death over the 10 years after hospital discharge for adolescents hospitalised for self-harm as for those hospitalised for drug or alcohol misuse or violence. The results have led to recommendations for similar psychosocial interventions to be considered for both groups, not just those admitted for self-harm, and to include preventive strategies for drug and alcohol misuse, which accounts for just as many deaths in the 10 years after hospital discharge as does suicide. UCL's research will extend this type of preventive thinking to a range of population subgroups within the child and young adult age range and allow follow-up to determine long-term, and potentially preventable, outcomes.

Additionally, UCL's research on readmissions has shown that for children and young people these occur predominantly in patients with underlying long-term conditions. When looking specifically at 30-day readmissions (emergency readmissions within 30 days of a previous discharge, which are subject to the readmission rule), UCL found that about half of readmissions were for a problem different from the reason for the first emergency admission. This further suggests that readmissions in children and young people are due to complexity of cases rather than hospital failings, and supports a review of the current policy of not reimbursing hospitals for care provided for 30-day readmissions. UCL have also shown that chronic conditions underlie the sharp increase in admissions across the transition from paediatric to adult services, which has important implications for specialist care.
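The 30-day readmission definition above can be sketched as follows. This is a simplified illustration, not UCL's method: the record layout and names are hypothetical, the comparison is made against the immediately preceding admission for the same patient, and transfers or overlapping episodes are not handled.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Tuple


@dataclass
class Admission:
    patient_id: str          # pseudonymised identifier (hypothetical field names)
    admitted: date
    discharged: date
    emergency: bool
    primary_diagnosis: str


def flag_30_day_readmissions(
    admissions: List[Admission], window_days: int = 30
) -> List[Tuple[Admission, Admission, bool]]:
    """Return (readmission, previous admission, same_reason) for each emergency
    admission starting within `window_days` of the previous discharge."""
    flagged = []
    ordered = sorted(admissions, key=lambda a: (a.patient_id, a.admitted))
    for prev, curr in zip(ordered, ordered[1:]):
        if prev.patient_id != curr.patient_id:
            continue  # consecutive rows belong to different patients
        gap = (curr.admitted - prev.discharged).days
        if curr.emergency and 0 <= gap <= window_days:
            same_reason = curr.primary_diagnosis == prev.primary_diagnosis
            flagged.append((curr, prev, same_reason))
    return flagged
```

The share of flagged readmissions with `same_reason` equal to `False` corresponds to the kind of figure quoted above (readmissions for a different problem), subject to how "reason" is coded in practice.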

The measurable benefits to the health service will be in improving the understanding of longitudinal patterns of emergency health care use overall and which groups (e.g. with chronic conditions) are most at risk. The study will provide new knowledge about long-term outcomes across the child life course and into adulthood. Specifically,
1) assessing the use of hospital service and relevant outcomes, including mortality before and after transition from paediatric to adult health care for young people with chronic conditions,
2) assessing variation in readmission rates by hospital and determining to what extent this variation is due to case mix (based on the full longitudinal hospitalisation record), organisational factors or changes over time,
3) comparing outcomes for vulnerable mothers (e.g. those with a past history of adversity-related injury admissions) and children.

The research (using the new data) will extend this type of preventive thinking to a range of population subgroups within the child and young adult age range. The benefits to the service will be in improving the understanding of longitudinal patterns of emergency health care use overall and which groups (e.g. with chronic conditions) are most at risk. The study will provide new knowledge about long-term outcomes across the child life course and into adulthood. The results may be used to inform NHS services through, for example, targeting of preventive care strategies, evaluation of the quality of care, and development of services and policy to support follow up of risk groups. UCL will also engage with CLARHC about implementation of the research into practical services within UCL Partners.

Finally, UCL have a focus on vulnerable children and families, and use admission data, combined with their indicators for chronic conditions and birth characteristics to explore use of health services for vulnerable mothers and children. All papers are reviewed and commented on by DoH and findings fed back to DoH policy makers as well as more widely, for example, through presentations to young people groups (through the National Children’s Bureau), to the NHS (e.g. through the Child and YoCLARHC), and through trusts to clinicians (e.g. through seminars and CPRU symposia involving patient groups, policy makers and clinicians).

The results may be used to inform NHS services through, for example, targeting of preventive care strategies during pregnancy to support vulnerable mothers and children, evaluation of the quality of care, and development of services and policy to support follow-up of risk groups who can be recognised by hospital services (e.g. those with underlying chronic conditions, or indicators of adversity).

All of the CPRU current and past research can be found on the CPRU website (https://www.ucl.ac.uk/cpru).

Outputs:

The programme of research in this application informs policy and practice, and all proposals and outputs are seen and approved by the Department of Health (DoH). Through UCL's Children's Policy Research Unit (CPRU) there will be regular engagement with the DoH about the projects during development and outputs, and DoH will review outputs to give feedback. All analyses undertaken as part of this programme of research for the policy research unit aim to provide evidence to inform health care professionals, service providers, policy makers, and service users about children's health and how services meet their needs.

Other outputs include presentations to service providers through meetings with the RCPCH (Royal College of Paediatrics and Child Health), the North London Collaborations for Leadership in Applied Research and Care (CLARHC) and through the academic health sciences network (AHSN). The findings will also be presented to clinicians at clinical practice meetings including, but not limited to, those of the Royal College of Paediatrics and Child Health in 2017 and 2018. This engagement is occurring with the direct goal of changing practice in the health care field.

The findings will also be published in peer reviewed journals and policy briefings for the DoH. The projects in this application are expected to finish by December 2018.

Specifically, the research will inform DoH policy makers, service providers and practitioners about patient and service factors associated with emergency use of secondary care and long-term adverse outcomes through the child life course and into adulthood. The research programme will engage with DoH policy makers, practitioners and the public during the research, to refine questions and applications of findings, and during the dissemination phase. In this way, UCL will ensure that the study is relevant to NHS systems, and UCL will endeavour to feed back results to the NHS.

The mechanisms for engagement and dissemination with NHS systems and the public are as follows:

a) UCL has well-established mechanisms for patient and public involvement through CPRU. This is facilitated by the National Children’s Bureau (NCB) Research Centre.

b) The study is conducted as part of a programme of research for the Policy Research Unit (CPRU) for Children, Young People and Families, funded by the Department of Health Policy Research Programme. CPRU aims to improve the health of children, young people and families by undertaking research to provide evidence for health policy and practice. The CPRU programme requires regular engagement with policy makers at the Department of Health.

c) The project team at the Great Ormond Street Institute of Child Health contribute to the Academic Health Science Network at UCL Partners AHSN theme on Integrated children and young people’s programme which aims to implement research findings into practice. Engagement is also through the CLARHC, hosted by UCL Partners.

The papers resulting from these studies will be published in peer-reviewed journals (such as the Lancet, Archives of Disease in Childhood, PLoS Medicine, BMJ Open) and presented at scientific conferences (such as the International Population Data Linkage Conference, International Society for the Prevention of Child Abuse and Neglect, Royal College of Paediatrics and Child Health annual conference, and Informatics for Health conference). UCL aim to present the work at scientific conferences during 2017 and use feedback provided at these meetings to write up papers to be submitted for publication in late 2017 and 2018.

Processing:

Only individuals, working under appropriate supervision on behalf of data controller(s) / processor(s) within this agreement, who are subject to the same policies, procedures and sanctions as substantive employees will have access to the data and only for the purposes described in this agreement.

ONS mortality data will be processed according to the standard Office for National Statistics terms and conditions.

The data will not be shared with third parties or linked to any other datasets.

UCL have no requirement to re-identify the supplied data and will not attempt to do so.

The data requested will be kept in UCL's Data Safe Haven (IDHS). It has been certified to the ISO27001 information security standard and conforms to the NHS Information Governance Toolkit. A file transfer mechanism enables information to be transferred into the Safe Haven simply and securely.

IDHS uses Dual Factor Authentication to access and handle data transferred into the IDHS service. This ensures that only the named applicants will have access to the data from IDHS. Removing data from the Data Safe Haven is only allowed for the PI.

Data flows
When the data extract is available from NHS Digital, a nominated researcher will download the data and immediately transfer it into the UCL data safe haven. Once in the data safe haven, researchers based at the Institute of Child Health and Farr Institute of Health Informatics London (the researchers are all substantive employees of UCL apart from one PhD student) will be able to access the data in the safe haven. The IDHS safe haven operates as a walled space and researchers are not able to connect to the internet or export data from it.

All outputs will be in aggregate form only with small numbers suppressed in line with the HES analysis guide.

Investigating the utility of machine learning methods to predict prognosis and guide treatment decisions for people with lung cancer (Lung-ORACLE) — DARS-NIC-678273-F2S0V

Opt outs honoured: No (Excuses: Does not include the flow of confidential data)

Legal basis: Health and Social Care Act 2012 s261(2)(a)

Purposes: No (Academic)

Sensitive: Sensitive

When:DSA runs 2024-10 – 2026-10 2025.05 — 2025.10. SMLS reported a DPA serious incident; breached contract — audit report.

Access method: One-Off

Data-controller type: UNIVERSITY COLLEGE LONDON (UCL)

Sublicensing allowed: No

AGD/predecessor discussions: AGD minutes - 12th September 2024 finalv2.pdf

Datasets:

NDRS Cancer Registrations
NDRS Linked Cancer Waiting Times (Treatments only)
NDRS Linked HES APC
NDRS Lung Cancer Data Audit (LUCADA)
NDRS National Lung Cancer Audit (NLCA)
NDRS National Radiotherapy Dataset (RTDS)
NDRS Somatic Molecular Dataset
NDRS Systemic Anti-Cancer Therapy Dataset (SACT)

Type of data: Anonymised - ICO Code Compliant

Objectives:

University College London (UCL) requires access to NHS England data for the purpose of the following research project:
Investigating the utility of machine learning methods to predict prognosis and guide treatment decisions for people with lung cancer (Lung-ORACLE)

The following is a summary of the aims of the research project:
Derive and validate multiple prognostic models for lung cancer patients, using a variety of statistical approaches to identify the most accurate modelling approach, with a view to improve treatment decisions for lung cancer patients

The following National Disease Registration Service (NDRS) National Cancer Registration and Analysis Service (NCRAS) datasets will be accessed:
NDRS Cancer Registrations (inc. Route to Diagnosis Information) - necessary because one of the main objectives of this research is to develop an accurate prognostic tool for patients diagnosed with lung cancer, to aid treatment decision making. The study team therefore require identification of people diagnosed with non-small cell lung cancer (NSCLC); details of their diagnosis date to estimate survival time; details of their cancer to include as prognostic variables (e.g. stage of cancer, grade of cancer); patient demographics which will impact survival to include as prognostic variables (e.g. age at diagnosis, ethnicity, IMD); route to diagnosis as a prognostic variable (as emergency admissions are likely to present in later stages compared to GP referrals); cause of death and date of death to attain vital status and vital status date for survival analysis models.
NDRS Linked Cancer Waiting Times (Treatments Only) - necessary because the study team need to understand all treatments undertaken to estimate impact on survival. Fields such as whether a patient's treatment may be useful as a prognostic variable to explore as this could potentially impact effectiveness of treatments and survival. Knowledge of whether treatment was undertaken in a clinical trial setting is important as this is likely not to reflect real-world survival, whilst knowledge of treatment intent will allow analysis of survival following treatment dependent on intent.
NDRS National Radiotherapy Dataset (RTDS) - necessary because the study team's prognostic model aims to estimate survival dependent on treatments undertaken. Hence, data in RTDS is essential to determine what treatments patients underwent and the impact on survival. The dates of treatments will also enable the study team to link to the other datasets to identify the ramifications of cancer therapy as outlined elsewhere.
NDRS Systemic Anti-Cancer Therapy (SACT) Dataset - necessary because the prognostic model aims to estimate survival dependent on treatments undertaken. Hence, data in SACT is essential to determine what treatments patients underwent and the impact on survival. The dates of treatments will also enable the study team to link to the other datasets to identify the ramifications of cancer therapy as outlined elsewhere.
NDRS Somatic Molecular Dataset - necessary because tumour genes impact patient treatment eligibility and prognosis. Hence the study team will be able to adjust the model to consider impact of treatment dependent on tumour gene on survival.
NDRS Linked Hospital Episode Statistics (HES) Admitted Patient Care (APC) - necessary for the identification of patients' attendance, source of admission, length of stay, and diagnoses during admission. This is because hospital admissions may be prognostic of survival, hence these variables can be included in the prognostic model to determine their impact on survival. In addition, the linked data set will enable the study team to identify hospital admissions following initiation of cancer treatment (or during treatment), which will allow them to consider outcomes additional to survival, such as predicted days spent in hospital in the year following treatment initiation or predicted number of hospital admissions in the year following treatment initiation. These provide a proxy for quality of life following treatment initiation, which was important to the study's PPI group.
NDRS Lung Cancer Data Audit (LUCADA) - necessary because (1) it is a source of data not available elsewhere e.g. comorbidities, FEV1, performance status which are important prognostic factors for survival; and (2) if it turns out not to be complete enough, including the data at least allows the assessment of completeness and potential usefulness. It is acknowledged that this data is only available for some years (diagnoses up to 2014 and some variables partially within that).
NDRS National Lung Cancer Audit (NLCA) - necessary because (1) it is a source of data not available elsewhere e.g. Charlson comorbidity score, and smoking status; and (2) if it turns out not to be complete enough, including the data at least allows the assessment of completeness and potential usefulness. It is acknowledged that this data is only available for some years (diagnoses up to 2014 and some variables partially within that).
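The additional outcomes mentioned for the linked HES APC data (admissions and days in hospital in the year following treatment initiation) could be derived along the following lines. This is a hedged sketch: the function name and input layout are hypothetical, the follow-up year is approximated as 365 days, and bed days are counted as nights (discharge date minus admission date), so day cases contribute zero.

```python
from datetime import date, timedelta
from typing import List, Tuple


def hospital_use_after_treatment(
    treatment_start: date,
    stays: List[Tuple[date, date]],   # (admission date, discharge date) per stay
    follow_up_days: int = 365,
) -> Tuple[int, int]:
    """Return (number of admissions, bed days) within the follow-up window
    [treatment_start, treatment_start + follow_up_days)."""
    window_end = treatment_start + timedelta(days=follow_up_days)
    n_admissions = 0
    bed_days = 0
    for admitted, discharged in stays:
        if discharged < admitted:
            raise ValueError("discharge precedes admission")
        # Count an admission only if it starts inside the window.
        if treatment_start <= admitted < window_end:
            n_admissions += 1
        # Count only the part of the stay that overlaps the window.
        overlap_start = max(admitted, treatment_start)
        overlap_end = min(discharged, window_end)
        if overlap_end > overlap_start:
            bed_days += (overlap_end - overlap_start).days
    return n_admissions, bed_days
```

Stays that straddle the start or end of the window contribute only their overlapping days, which keeps the outcome comparable between patients with different treatment dates.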

The level of the Data will be Pseudonymised.

The Data will be minimised as follows:
Limited to a study cohort identified by the NDRS as meeting the following criteria: adult patients (aged 18 and over) who received a diagnosis of non-small-cell lung cancer (as defined by specific ICD codes) in the UK between 2010 and 2021.
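The minimisation criteria amount to a simple filter that the NDRS applies before release. The sketch below shows the logic only; the record layout is hypothetical, and because the agreement does not list the ICD codes, they are passed in as a parameter rather than assumed.

```python
from dataclasses import dataclass
from typing import Iterable, List, Set


@dataclass
class Registration:
    pseudo_id: str            # hypothetical field names
    age_at_diagnosis: int
    diagnosis_year: int
    icd_code: str


def select_cohort(
    records: Iterable[Registration],
    nsclc_codes: Set[str],    # the "specific ICD codes"; not stated in the agreement
    first_year: int = 2010,
    last_year: int = 2021,
    min_age: int = 18,
) -> List[Registration]:
    """Keep adults diagnosed with a listed code between first_year and last_year."""
    return [
        r
        for r in records
        if r.age_at_diagnosis >= min_age
        and first_year <= r.diagnosis_year <= last_year
        and r.icd_code in nsclc_codes
    ]
```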

UCL is the controller as the organisation responsible for ensuring that the Data will only be processed for the purpose described above.

The lawful basis for processing personal data under the UK GDPR is:
Article 6(1)(e) - processing is necessary for the performance of a task carried out in the public interest or in the exercise of official authority vested in the controller.

The lawful basis for processing special category data under the UK GDPR is:
Article 9(2)(j) - processing is necessary for archiving purposes in the public interest, scientific or historical research purposes or statistical purposes in accordance with Article 89(1) based on Union or Member State law which shall be proportionate to the aim pursued, respect the essence of the right to data protection and provide for suitable and specific measures to safeguard the fundamental rights and the interests of the data subject.

The funding is provided by the National Institute for Health and Care Research. The funding is specifically for the study described. The funder will have no ability to suppress or otherwise limit the publication of findings.

Amazon Web Services provides IT hosting services that support the UCL Data Safe Haven and will store the Data as contracted by UCL.

Data will be accessed by a PhD student affiliated with UCL, substantive employees of UCL and an individual on an honorary contract with UCL. Any individual working with the Data held under this Data Sharing Agreement (DSA) must have completed relevant data protection and confidentiality training and is subject to UCL's policies on data protection and confidentiality. Individuals accessing the Data will do so under the supervision of the Principal Investigator. UCL will be responsible and liable for any work carried out by individuals accessing the data, who will only work on the Data for the purpose described in this DSA.

To inform this research, a Public and Patient Involvement and Engagement (PPI) group completed an online survey and participated in an online meeting. The group supported the collection of the data for the purposes described above.

Further, healthcare professionals (HCPs) consulted to inform this research stressed that any model deployed in healthcare must be appropriately designed to support patients and must be helpful in practice. Members of the PPI group, upon initial consultation, also had many thoughts and ideas around prognostic model design and use, and were keen that the patient/public voice was heard. Consequently, throughout the research and in addressing the objectives, the study will adopt a methodology committed to involving HCPs, those who support patients, and members of the public in model design and development.

Approaches to this will include Multidisciplinary expert panel meetings and PPI group sessions. The PPI group was convened by advertising involvement via UCL PPI networks, charities (BME Cancer Communities, Roy Castle Lung Cancer Foundation, Maggie's, Macmillan Cancer Support) and patient advocate groups (Independent Cancer Patients' Voice). The group comprises individuals with personal experience of lung cancer or other cancers, those affected by a relative's cancer, and those interested in this research. The Multidisciplinary expert panel was formed via professional connections and by reaching out to charity staff, and consists of professionals from across the country. To offer a broad range of views on model development and use, job roles include medical oncology, clinical oncology, thoracic surgery, nurse specialist, respiratory physician, palliative care physician, Macmillan GP, and Maggie's centre head. Both PPI and Multidisciplinary expert panel meetings provide a platform for the groups to offer their views on prognostic model design and approach; for example, what is important to patients when making cancer treatment decisions, and how user-friendly current prognostic models in clinical use for other cancers are. In recent meetings, for instance, the groups have provided feedback on an initial prototype for the lung cancer prognostic model, highlighting the need for appropriate language, simplicity and security.

In line with the national data opt-out policy, opt-outs are not applied because the data is not Confidential Patient Information as defined in section 251(10) and section 251(11) of the National Health Service Act 2006.

Where individuals have opted out of disease registration by the National Disease Registration Service (NDRS), their data has been permanently removed from the registry and therefore will not be disseminated under this Data Sharing Agreement (DSA). https://digital.nhs.uk/ndrs/patients/opting-out

Expected Benefits:

The findings of this research study are expected to contribute to evidence-based decision-making support for policy-makers, local decision-makers such as doctors, and patients, in the following ways:

1) By working collaboratively with clinicians and patients throughout, the study team will ensure the research is interpreted and presented appropriately so as to increase confidence in prognostic models for cancer.
2) The work will inform how such prognostic models (including lung and other cancer types) should be designed for use in clinical practice in the future.
3) The study team will review how clinician behaviour in therapeutic decision-making may change due to the availability of model predictions, and the prognostic accuracy afforded by the models versus clinician judgement alone. This will provide an estimate for the potential impact such models may have on lung cancer patient outcomes if implemented widely into clinical care.
4) Following on from this work, future implementation of the lung cancer prognostic tool (following a future multi-centre trial to assess model effectiveness) has the potential to improve clinical outcomes for lung cancer patients and aid clinicians in recommending treatments and providing accurate prognostic predictions. From a patient perspective the uncertainty around outcomes and treatment decisions will be reduced.

Hence, this work provides evidence paving the way for designing effective, implementable prognostic models in the future; informing best practice to improve the care, treatment and experience of health care users relevant to the subject matter of the study.

It is hoped that through publication of findings in appropriate media, the findings of this research will add to the body of evidence that is considered by the bodies, organisations and individual care practitioners charged with making policy decisions for or within the NHS or treatment decisions in relation to specific patients.

The project team plan to engage with a variety of charities and use connections with NIHR to ensure that findings reach as wide an audience as possible.

Outputs:

The expected outputs of the processing will be:
Submissions to peer reviewed journals expected from 2026 onwards
Report to NIHR
A PhD thesis due for submission
Presentations at appropriate conferences: British Medical Journal Future Health, American Association for Cancer annual meeting, British Thoracic Society.
Production of a prognostic tool to aid treatment decisions for patients diagnosed with lung cancer. The tool will incorporate (and test user thoughts on) aspects of quality of life never before considered for such prognostic tools used in clinical practice. Real-world implementation of the prognostic tool is not intended to be covered during this research; that will require future work, including a large multi-centre trial to assess model effectiveness and the development and testing of an implementation strategy.

The outputs will not contain NHS England Data and will only contain aggregated information with small numbers suppressed as appropriate in line with the relevant disclosure rules for the dataset(s) from which the information was derived.

The outputs will be communicated to relevant recipients through the following dissemination channels:
Journals
Workshops
Webinars
Social media
Public reports
Press/media engagement

Outputs will be published from March 2026 onwards.

Processing:

No data will flow to NHS England for the purposes of this Data Sharing Agreement (DSA).

The NHS England NDRS will provide the relevant records from the above-listed datasets. The Data will contain no direct identifying data items. The Data will be pseudonymised and individuals cannot be reidentified through linkage with other data in the possession of the recipient.

Once received, the requested data will be uploaded to the UCL Data Safe Haven. Amazon Web Services (AWS) is a processor acting under the instructions of UCL. AWS's role is limited to secure backup of data stored in UCL's Data Safe Haven. UCL uses offsite data centre services provided by VIRTUS data centre. VIRTUS does not have access to the data.

The Data will be accessed by authorised personnel via remote access.

The Controller(s) must confirm and provide evidence upon audit by NHS England that access via any remote device complies with the data security obligations within this DSA and the Data Sharing Framework Contract.

For remote access:
- Remote access will only be from secure locations situated within the territory of use (as further restricted elsewhere within the DSA if so done) stated within this DSA;
- Access controls granting users the minimum level of access required are in place;
- Remote access is only via secure connections (e.g., VPNs or secure protocols) to protect data;
- Multifactor authentication (MFA) is required for remote access;
- Device security, including up-to-date software and operating systems, antivirus software, and enabled firewalls are utilised for the remote access;
- All remote access is undertaken within the scope of the organisation's DSPT (or other security arrangements as per this DSA) and complies with the organisation's remote access policy.

The above applies in addition to any condition set out elsewhere within the DSA (e.g. who may carry out processing, and for what purpose).

Remote processing will be from secure locations within England. The data will not leave or be accessed outside of England at any time.

Data will be accessed by an individual with an honorary contract with UCL. Aside from this individual, access is restricted to substantive employees and students of UCL who have authorisation from the Principal Investigator.

All personnel accessing the Data have been appropriately trained in data protection and confidentiality.

The Data will not be linked with any other data not already referenced in this Data Sharing Agreement (DSA).

There will be no requirement and no attempt to reidentify individuals when using the Data.

Analysts from UCL will process the Data for the purposes described above.

SUMMIT Study: Cancer screening study with or without low-dose lung CT to validate a multi-cancer early detection test (Previously ODR1718_316) — DARS-NIC-656813-F4H5W

Opt outs honoured: No (Excuses: Consent (Reasonable Expectation))

Legal basis: Health and Social Care Act 2012 s261(2)(c)

Purposes: Yes (Academic)

Sensitive: Non-Sensitive, and Sensitive

When:DSA runs 2022-12 – 2027-12 2023.01 — 2025.10. SMLS reported a DPA serious incident; breached contract — audit report.

Access method: One-Off

Data-controller type: GRAIL, LLC, UNIVERSITY COLLEGE LONDON (UCL), GRAIL INC., UNIVERSITY COLLEGE LONDON (UCL)

Sublicensing allowed: No

AGD/predecessor discussions: IGARD Minutes - 12 January 2023 final.pdf, IGARD Minutes - 15 December 2022 final.pdf, IGARD Minutes - 10 November 2022 finalv1.pdf

Datasets:

Emergency Care Data Set (ECDS)
NDRS Cancer Registry
NDRS Linked DIDs
NDRS Linked HES APC
NDRS Linked HES Outpatient
NDRS National Lung Cancer Audit (NLCA)
NDRS National Radiotherapy Dataset (RTDS)
NDRS Rapid Cancer Registrations
NDRS Somatic Molecular Dataset
NDRS Systemic Anti-Cancer Therapy Dataset (SACT)
NDRS Cancer Registrations

Type of data: Anonymised - ICO Code Compliant

Objectives:

University College London (UCL) and GRAIL, Limited Liability Company (LLC) are requesting NHS Digital record level data for the Study: "SUMMIT: Cancer screening study with or without low-dose lung CT* to validate a multi-cancer early detection test"

(*low-dose computed tomography (also called a low-dose CT scan, or LDCT) is a screening test for lung cancer. During an LDCT scan, you lie on a table and an X-ray machine uses a low dose (amount) of radiation to make detailed images of your lungs. The scan only takes a few minutes and is not painful.)

Data for this study has previously been shared when the data were controlled and managed by Public Health England (PHE). PHE facilitated data release via its Office of Data Release service (ODR). ODR was responsible for providing a common governance framework for responding to requests to access PHE data for secondary purposes, including service improvement, surveillance and ethically approved research. All requests to access data were reviewed by the ODR and were subject to strict confidentiality provisions. The responsibility for the management of the National Disease Registration Service, of which the National Cancer Registration and Analysis Service is a part, transferred from PHE to NHS Digital on 1st October 2021. The SUMMIT study previously accessed data via Public Health England under the reference: ODR1718_316.

MAIN AIM AND PURPOSE OF SUMMIT
The SUMMIT Study aims to understand ways to detect lung cancer before there are any symptoms, when treatment can be simpler and more successful.

The SUMMIT Study is a prospective cohort study of approximately 13,000 participants from London designed to investigate how cancer screening can be improved and delivered. The study will recruit individuals at high risk for cancer, especially lung cancer, due to significant smoking history. The study has two main aims:
1. To develop and evaluate the performance of the GRAIL blood test for the detection of multiple cancer types and the identification of tissue of cancer origin
2. To examine the performance and feasibility of delivering a low dose CT (LDCT) screening service for lung cancer to a high-risk population in London and the surrounding area.

***THIS VERSION (v1): OCTOBER 2022***
This is a request to Renew and Amend a Data Sharing Agreement (DSA) with University College London (UCL) and GRAIL, LLC.

STUDY AIMS
The data requested are key to the SUMMIT Study, as they will allow the study team at UCL to understand what types of cancers the participants may develop during the course of, and after, LDCT screening, thereby identifying what types of cancer signals may be present in participants' blood. The data requested will also allow the study team to analyse the performance of LDCT screening. LDCT screening has been shown to reduce lung-cancer mortality in at-risk populations by 20-26%, and it is hoped that demonstrating the feasibility of LDCT screening will enable the study team to make a case for the adoption of a UK national lung cancer screening programme, vastly improving lung cancer outcomes. Similarly, the development of a blood test (Galleri Test) for screening would reduce morbidity and mortality for many types of cancers through early detection.
Both GRAIL and UCL blood samples are taken from consenting patients during the SUMMIT study. Genetic testing is being performed by GRAIL, LLC on the samples taken for the purpose of developing the blood test to detect cancer early. UCL may also conduct genetic testing on the samples taken and stored by UCL. However, the purpose of this will not be to develop a blood test to detect cancer, but to support research into lung, cardiac and other diseases. This information is provided to participants on the Patient Information Sheet and Informed Consent form, which have received ethical approval from the London - City & East Research Ethics Committee.

NEW DATA REQUESTED
UCL and GRAIL, LLC are requesting further access to the following National Cancer Registration and Analysis Service (NCRAS) National Disease Registration Service (NDRS) datasets (formerly available via Public Health England (PHE)):
- NDRS Rapid Cancer Registrations
- NDRS Cancer Registry
- NDRS Systemic Anti-Cancer Therapy Dataset (SACT)
- NDRS National Radiotherapy Dataset (RTDS)
- NDRS Linked HES Admitted Patient Care (APC)
- NDRS Linked HES Outpatient (OP)
- NDRS Somatic Molecular Dataset
- Emergency Care Dataset (ECDS)
- NDRS Linked Diagnostic Imaging Dataset (DIDS)
- NDRS National Lung Cancer Audit (NLCA)

How the data requested will achieve the study aims:
The NDRS Rapid Cancer Registry data and NDRS Cancer Registry data are requested to clinically validate a cell-free nucleic acid (cfNA) based GRAIL blood test for early detection of multiple types of cancer, including lung cancer, and also to investigate how low dose CT lung cancer screening can be improved and delivered. The additional datasets (ECDS, DIDS, HES APC, HES OP, NLCA, RTDS, Somatic Molecular Dataset and SACT) will all be used for these aims as well as to answer the secondary endpoints of the study, including investigation of the uptake of LDCT screening (and the demographic and psychological characteristics of this), examination of the harms associated with LDCT screening, and exploration of the outcomes following lung cancer screening and subsequent treatment.

All elements of the GRAIL test will be pre-specified prior to combining the assay results with the clinical data including the classifier and the cut-points for defining positive vs negative results. In addition the data should inform on efficient implementation of an LDCT screening service to detect early-stage lung cancer among current and former smokers. This includes understanding important operational parameters such as screening interval, uptake and adherence, and also psychosocial issues such as psychological impact and harms and quality of life measures.

NHS Digital pseudonymised record level data (linked to SUMMIT study ID) from consented participants are required to link to the Galleri Test and LDCT scan test results obtained within the study at participant study visits. In this agreement, NDRS NCRAS data is requested from 6 months (183 days) prior to the date the participants consented up to around Winter 2027 (with monthly data drops until December 2023 and then quarterly until the end date of this agreement). The Cancer Registry provides historical cancer data dating back to 1995. Ascertaining the historic cancer diagnosis background of each study participant is important, as a historic diagnosis may impact SUMMIT test results or treatment decisions.
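As a minimal sketch of the look-back rule described above, the start of each participant's data window can be derived from their consent date. The function names and the `window_end` parameter are hypothetical; the agreement only gives the end as "around Winter 2027", so it is left for the caller to supply.

```python
from datetime import date, timedelta

# "6 months (183 days)" prior to consent, as stated in the agreement text.
LOOKBACK_DAYS = 183


def extraction_window_start(consent_date: date) -> date:
    """Earliest date from which records would be requested for a participant."""
    return consent_date - timedelta(days=LOOKBACK_DAYS)


def in_window(event_date: date, consent_date: date, window_end: date) -> bool:
    """True if an event falls inside the participant's requested data window.

    window_end is an assumption supplied by the caller; no exact end date
    is given in the agreement.
    """
    return extraction_window_start(consent_date) <= event_date <= window_end
```

For example, a participant who consented on 14/05/2021 (the date the last participant was enrolled) would have a window starting on 12/11/2020.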

The SUMMIT study is currently scheduled to last until 10 years after the last participant was enrolled in the study (last participant enrolled 14/05/2021) at which point the end of study will be declared. An extension to the NDRS Cancer Registry Dataset may be requested in the future to coincide with the end of the study.

Long term follow up data is required to ensure any cancers identified after/outside of SUMMIT LDCT screening or developing late can be linked to any potential biomarkers of cancer in the Galleri blood test or any early radiological signs at screening. In addition, long term follow up data (such as recurrence data) can be assessed alongside radiological parameters collected at LDCT screening (for example assessing whether radiological growth rates can predict post treatment outcomes).

PREVIOUSLY HELD DATA
To note: The study team at UCL currently hold data which was collected via PHE (under ODR0718_316).

The Data sharing contract between UCL and Public Health England (which transferred to NHS Digital via novation agreement on 13/12/2021) granted UCL access to cancer registration data from a bespoke early ascertainment dataset. In this version of the agreement (v1) the study team wish to transfer from receiving the Early ascertainment dataset over to the Rapid Registry Dataset. Additionally, the study team also request that GRAIL, LLC and GRAIL Bio UK Ltd are permitted access to the early ascertainment datasets which UCL already hold. These datasets will be transferred from UCL to GRAIL, LLC and GRAIL Bio UK Ltd in the same way as described for the NDRS NCRAS datasets applied for in this agreement.

It is hoped that the ODR Early Ascertainment datasets, NDRS Rapid Registry Dataset and NDRS Cancer Registry dataset can all be used to inform on future work related to the development of the blood test to detect cancer early and also for the implementation of a UK National Lung Cancer Screening programme. This work will follow on from the SUMMIT study.

Cancer Research UK (CRUK) provides core funding to UCL Clinical Trials Unit (CTU). All employees working for 'Cancer Research UK and UCL Centre Trials Centre' (UCL CTC) are substantive employees of UCL and not employed directly by CRUK. CRUK does not have direct impact on how the SUMMIT study is to be run, and no CRUK employees will be able to view, access or process any NHS Digital record level data, and are therefore not included as a Data Processor on this agreement.

COMMON LAW DUTY OF CONFIDENTIALITY
The study team at UCL will be providing one cohort for this request containing approximately 13,000 individual records.
Consent to data linkage has been sought for all participants in the study. NHS Digital will not apply National Data Opt-Out for these participants and are content that the consent materials are compatible with the flow of data described in this agreement.

LAWFUL BASIS FOR THE PROCESSING OF PERSONAL DATA (GDPR)
University College London is relying on GDPR Article 6(1)(e): processing is necessary for the performance of a task carried out in the public interest, and additionally (as health data is a special category of Personal Data), Article 9(2)(j): processing is necessary for archiving purposes in the public interest, scientific or historical research purposes. Participants that lack capacity to provide fully informed consent were not included in the SUMMIT study and consultee consent was not permitted. Data minimisation processes are being followed and only data that is required specifically for the purposes of this study has been requested, to protect the rights of the data subjects. In addition, data will not be collected from participants that have withdrawn their consent to future data collection.

GRAIL, LLC is using GDPR Article 6(1)(f) "processing is necessary for the purposes of the legitimate interests pursued by the controller or by a third party except where such interests are overridden by the interests or fundamental rights and freedoms of the data subject which require protection of personal data, in particular where the data subject is a child." Processing personal data is necessary for GRAIL, LLC's legitimate interests, which are described in this application. The data to which access is requested are proportionate and necessary to achieve those interests. GRAIL, LLC has completed a legitimate interests assessment (LIA). The data subjects' interests and fundamental rights are protected through appropriate minimisation of fields and patient records being processed, protection of the data in a secure environment, and guaranteeing secure destruction at any stage at the request of NHS Digital or after a defined period on completion of the project. Additionally (as health data is a special category of Personal Data), GRAIL, LLC is also relying on Article 9(2)(j): special category data used for archiving in the public interest, scientific or historical research or statistical purposes, with a basis in law. The data in the SUMMIT study is being requested to be used in the public interest, as the SUMMIT study aims to understand ways to detect lung cancer before there are any symptoms, when treatment can be simpler and more successful. If the study team are successful in their endeavours, this intervention could be brought to a wider UK population, thereby vastly improving lung cancer outcomes for UK patients and benefiting the NHS.

PATIENT AND PUBLIC INVOLVEMENT (PPIE):
The acceptability of the SUMMIT study has been discussed with and approved by PPIE and General Practitioner (GP) groups, who have been involved throughout the design and running of the study.

There have been three separate SUMMIT specific face-to-face PPIE sessions, with different members in each session. These members were representative of the group being invited to this study (i.e. smokers and former smokers in the eligible age bracket). The earlier sessions discussed the design and concept of the study, particularly the invitation process. The later sessions were extremely focused and looked in detail at the Participant Information Sheet (PIS), consent form and other documents including the collection and processing of personal data. This group also looked at invitation and results letters sent to the participants in order to report back the results of their LDCT scans. All feedback received from both PPI members and GPs has been considered and incorporated appropriately into the study design.

It was important to get input from a diverse PPI group. The group included:
- Eleven attendees plus one person giving feedback by phone.
- Nine males and three females (including the phone feedback).
- Cancer patients: 5 have received or are currently receiving treatment for cancer, but not lung cancer.
- Smoking history: 8 of the attendees were either light or heavy smokers.
- Three attendees who work in the construction industry.
- Four attendees who are members of the UCLH Cancer Patient and Public Advisory Group.
- Two people who indicated that they have caring responsibilities for a family member or friend with a cancer diagnosis.

The study team have two PPIE members on the Project Steering Group and also a centralised UCL Cancer Trials Centre (CTC) PPIE group to call upon when needed. These members have continued to help the SUMMIT team understand and accommodate the public perspective on LDCT screening, sampling and data processing throughout the duration of the study. The PPIE members will also be key in interpreting and disseminating the study results.

Organisation Roles and Responsibilities:
The SUMMIT study is an academic study sponsored by UCL and funded by, and run in collaboration with GRAIL, LLC.
UCL are a joint data controller and lead for this agreement, who also process the data and are responsible for sending participant identifiers to NHS Digital for data linkage and for receiving NHS Digital record level data, downloading onto the Data Safe Haven and sharing with GRAIL, LLC.
GRAIL, LLC are a joint data controller who also processes the data and are responsible for receiving NHS Digital pseudonymised record level data and sharing with GRAIL Bio UK Ltd.
GRAIL Bio UK Ltd will be receiving pseudonymised record-level NHS Digital data via GRAIL, LLC and are therefore listed as a Data Processor in this agreement.

NOTE: GRAIL, LLC is the successor in interest to GRAIL, Inc. GRAIL, LLC encompasses all GRAIL locations including GRAIL Bio UK Ltd which is also a data processor located in the UK. GRAIL Bio UK Ltd were not in existence when SUMMIT was set up and open to recruitment. It is agreed by UCL and GRAIL that from the participant information sheet and consent documentation that participants would be aware that their data would be processed by GRAIL encompassing both locations.

For full transparency of the Commercial element of this agreement, it is noted that GRAIL LLC and/or GRAIL Bio UK Ltd may take the results of the SUMMIT study to further refine the algorithm of their MCED test that could add commercial value to their product(s). Therefore in the future, GRAIL LLC and/or GRAIL Bio UK Ltd may receive commercial benefit (including intangible or indirect commercial benefits such as positive publicity) from the successful outcomes of the trial.

Yielded Benefits:

There are no current yielded benefits from receipt of the Early Ascertainment data to document as yet.

Expected Benefits:

Demonstrating the feasibility of LDCT screening for lung cancer should enable the study team to add to the evidence needed to establish a national screening programme in the UK. By successfully carrying out the SUMMIT study, the study team could bring this intervention to a wider UK population, thereby vastly improving lung cancer outcomes. Should a national lung cancer screening programme go ahead, SUMMIT will also be able to provide valuable information to shape this programme. This includes improving current lung cancer risk models (such as Prostate, Lung, Colorectal and Ovarian risk (PLCO)), finding optimal invitation strategies, understanding the demographic and psychological characteristics of participants undergoing screening, optimising screening processes, and also informing how best to implement LDCT screening (e.g. annually vs biennially). It is hoped that this data will improve the uptake and efficiency of future screening programmes and increase the accuracy and sensitivity of lung cancer detection among those participating, ultimately improving treatment outcomes and survival.

As the largest population based LDCT screening study in the UK, the SUMMIT Study continues to be of considerable public benefit; both directly through screening high-risk adults for lung cancer and indirectly through answering outstanding questions the national screening committee have on how to implement LDCT screening in future UK screening programmes. SUMMIT has also helped direct and inform the NHS England commissioned Targeted Lung Health Check programme currently being implemented. Many of the key points learnt from running SUMMIT will be implemented into the roll out of this programme at UCLH.

In the medium term, the development of an early cancer blood test will provide improved cancer screening and earlier diagnosis. A minimally invasive, relatively inexpensive blood test to detect multiple types of cancer will be invaluable to future healthcare worldwide where cancer can be detected earlier when it can be better treated and cured. It is expected that the Galleri test will also predict the origin of the cancer signal with high accuracy to help guide diagnosis. Using the Galleri test alongside existing screening tools is expected to improve early cancer detection for patients at an elevated risk of cancer, such as those aged 50 or older.

Through implementation of LDCT lung cancer screening and a blood test to detect cancer, most patients should be diagnosed earlier than they would otherwise be, with an expected decrease in cancer stage for patients presenting in cancer clinics (a general downshift in cancer stage). Although it is understood that an earlier diagnosis might not benefit every patient, in the majority of cases cancers will be detected and treated earlier, when treatment success and survival rates are better. The 1-year survival rates based on cancer stage between 2013-2017 are below (data from CRUK):
- Stage 1: 87.2%
- Stage 2: 73.0%
- Stage 3: 48.7%
- Stage 4: 19.3%

Outputs:

The ultimate result of the data processing for the SUMMIT study is to develop a blood test to detect cancer early and also implement a national LDCT lung cancer screening programme in the UK.

The expected outputs include submission to peer-reviewed journals, conferences and presentations. The planned journals include Lancet Respiratory Medicine, Lancet Oncology, European Respiratory Journal and Annals of Oncology.

The first primary outputs are expected with a target end date of 2023/mid-2024. The study team intend to release further publications once the data matures and more NDRS NCRAS data is received. Primary outputs will be linked to study endpoints:
- To examine LDCT screening delivery using established measures of performance and risk prediction.
- To quantify the uptake of LDCT screening, and examine the demographic and psychological characteristics and smoking status of those who consent to be screened.
- To examine adherence to and practicability of biennial versus annual LDCT screening.
- To identify the psychological and screening-related factors which predict uptake of, and repeat adherence to, LDCT screening for lung cancer, as well as their sociodemographic and smoking-related correlates.
- To investigate QoL over time, and to explore associations with screening adherence, the frequency of screening and abnormal LDCT results.
- To examine the harms associated with LDCT screening.
- To evaluate the performance of the GRAIL test for the detection of lung cancer within 12 months of Y0, Y1 and Y2 timepoints.
- To evaluate the performance of the GRAIL test for the detection of invasive cancer and identification of tissue of cancer origin within 12 months of Y1 and Y2 timepoints.
- To evaluate the performance of the GRAIL test for the detection of invasive cancer and identification of tissue of cancer origin within 24 months of Y0 and Y1 timepoints.
- To evaluate the performance of the GRAIL test by cancer type, stage and method of diagnosis.
- To evaluate the association of the GRAIL test result with cause-specific survival (e.g. cancer, cardiovascular) and overall survival.

The plan is to disseminate aggregate results (with small numbers suppressed according to the HES Analysis Guide) to public and patient communities, for example on the UCL CTC website, Cancer Research UK (CRUK) website, clinicaltrials.gov and via the HRA Final Report. The study team also plan to send a newsletter summarising the results in lay language to all participants that have taken part in SUMMIT. Newsletters will be reviewed by the SUMMIT PPIE members and by the UCL CTC PPIE group before submission to REC for review. The study team also intend to include the PPIE group in any other dissemination activities within patient groups, such as lung cancer charities, conferences and PPIE Open days.
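Small-number suppression of aggregate results, as referred to above, can be illustrated with a short sketch. The threshold, the marker character and the function name are assumptions for illustration only; the actual rules to apply are those set out in the HES Analysis Guide and are not reproduced in this agreement.

```python
def suppress_small_numbers(counts, threshold=7, marker="*"):
    """Replace small non-zero counts in an aggregate table with a marker.

    counts    : mapping of category label -> integer count
    threshold : counts from 1 up to and including this value are hidden
                (hypothetical default; take the real value from the guide)
    marker    : placeholder shown instead of the suppressed count
    """
    return {
        label: (marker if 0 < n <= threshold else n)
        for label, n in counts.items()
    }


# Illustrative (made-up) counts by category.
table = {"Category A": 120, "Category B": 4, "Category C": 0}
print(suppress_small_numbers(table))
```

Zero counts are left visible in this sketch; whether zeros should also be masked is a policy decision that the sketch does not settle.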

In addition the outputs of data processing at the end of the study aim to include conference abstracts, reports to NHS England and GRAIL, and submissions of SUMMIT findings to peer reviewed journal(s). The publications will not contain the data, only the results of its statistical analysis that will be summarized overall.

GRAIL may take the results of the SUMMIT study to further refine the algorithm of their MCED test that could add commercial value to their product(s).

Analysis of interim study data has already been published in various journals (see below), posters presented at BTOG 20/22, World Lung 19/20 and ERS 22, and abstracts submitted to BTS 20 and ATS 21:
- Horst C, Dickson JL, Tisi S, Ruparel M, Nair A, Devaraj A, Janes SM. Delivering low-dose CT screening for lung cancer: a pragmatic approach. Thorax 2020;75:831-832.
- Quaife SL, Waller J, Dickson JL, Brain KE, Kurtidu C, McCabe J, Hackshaw A, Duffy SW, Janes SM. Psychological Targets for Lung Cancer Screening Uptake: A Prospective Longitudinal Cohort Study. J Thorac Oncol. 2021 Dec;16(12):2016-2028.
- Dickson JL, Hall H, Horst C, Tisi S, Verghese P, Mullin AM, Teague J, Farrelly L, Bowyer V, Gyertson K, Bojang F, Levermore C, Anastasiadis T, Sennett K, McCabe J, Devaraj A, Nair A, Navani N, Callister ME, Hackshaw A; SUMMIT Consortium, Quaife SL, Janes SM. Telephone risk-based eligibility assessment for low-dose CT lung cancer screening. Thorax. 2022 Jul 21:thoraxjnl-2021-218634.
- Dickson JL, Bhamani A, Quaife SL, Horst C, Tisi S, Hall H, Verghese P, Creamer A, Prendecki R, McCabe J, Gyertson K, Bowyer V, El-Emir E, Cotton A, Mehta S, Bojang F, Levermore C, Mullin AM, Teague J, Farrelly L, Nair A, Devaraj A, Hackshaw A, Janes SM; SUMMIT consortium. The reporting of pulmonary nodule results by letter in a lung cancer screening setting. Lung Cancer. 2022 Jun;168:46-49.

A newsletter was also disseminated to participants in June 2021, providing an update on the study's progress.

Processing:

All organisations party to this agreement must comply with the Data Sharing Framework Contract requirements, including those regarding the use (and purposes of that use) by Personnel (as defined within the Data Sharing Framework Contract). There will not be any access to the data by any third parties.

This application is to request the renewal and amendment of data previously provided under the agreement with the Office for Data Release ODR0718_316.

DATAFLOW
The data flow outlines the high level workflow for how data flow occurs between UCL, GRAIL LLC, GRAIL Bio UK Ltd and NHS Digital for the SUMMIT study. The flow covers two key steps:
1. Transfer of linkage file/identifiers from UCL to NHS Digital.
2. Transfer of linked data from NHS Digital back to UCL, GRAIL LLC and GRAIL Bio UK Ltd.

Transfer of linkage file/identifiers from UCL to NHS Digital:
UCL will send the list of Patient Identifiable Data (PID) - including participant Study ID, NHS Number, Gender, Date of Birth and Postcode - securely to NHS Digital via a Secure Electronic File Transfer Service (SEFT) or other secure, NHS Digital approved file transfer mechanism. This identifiable data is stored in the SUMMIT Clinical Records Management system (SCRMS). This list will only include participants that have consented to SUMMIT and have not withdrawn their consent for future data collection. This is to ensure that participants' rights to object are respected and to abide by data minimisation principles.
The NHS Digital data production teams will link patient identifiers to the datasets requested in section 3.
The NHS Digital production team will remove patient identifiers from the linked data to mitigate any risk of reidentification of participant data.
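The linkage and identifier-removal steps above can be sketched roughly as follows. This is an illustration only, not NHS Digital's linkage method: it assumes an exact match on NHS Number, whereas real linkage would typically draw on several identifiers, and all field and function names are hypothetical.

```python
# Identifier fields stripped from the output (hypothetical field names).
IDENTIFIER_FIELDS = ("nhs_number", "gender", "date_of_birth", "postcode")


def link_and_pseudonymise(cohort, records):
    """Link dataset records to the cohort and keep only the Study ID as key.

    cohort  : list of dicts with "study_id" and "nhs_number"
              (only consented, non-withdrawn participants)
    records : list of dicts from a requested dataset, each with "nhs_number"
    """
    study_id_by_nhs = {p["nhs_number"]: p["study_id"] for p in cohort}
    linked = []
    for rec in records:
        study_id = study_id_by_nhs.get(rec["nhs_number"])
        if study_id is None:
            continue  # record does not belong to a cohort participant
        clean = {k: v for k, v in rec.items() if k not in IDENTIFIER_FIELDS}
        clean["study_id"] = study_id
        linked.append(clean)
    return linked
```

Records for people outside the submitted cohort never appear in the output, which mirrors the point that only consented participants are sent for linkage.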

Transfer of linked data from NHS Digital back to UCL, GRAIL LLC and GRAIL Bio UK:
NHS Digital transfers the data via SEFT or other secure, NHS Digital approved file transfer mechanism. This data is pseudonymised (only the Study ID from the linkage transfer is kept and identifiable fields are removed).
UCL team downloads the data onto the UCL Data Safe Haven (DSH) (see below).
GRAIL, LLC has created a UK-specific, permission-controlled and encrypted AWS S3 bucket. A UCL user is designated and given access by the GRAIL team to transfer data from the DSH to this GRAIL, LLC S3 bucket.
Once the data is in the S3 bucket, it is replicated to the US and a copy can be ingested into the GRAIL, LLC analysis pipeline.
GRAIL LLC can then provide the pseudonymised data set to GRAIL Bio UK via a separate AWS S3 bucket.
UCL, GRAIL, LLC and GRAIL Bio UK Ltd will have a copy of the same dataset.
Pseudonymised dataset is available in a permission-controlled manner to GRAIL study team users.

STORAGE AND PROCESSING LOCATIONS
Amazon Web Services (AWS UK) supply IT infrastructure for GRAIL Bio UK Ltd and are therefore listed as data processors. They supply support to the system, but do not access data. Therefore, any access to the data held under this agreement would be considered a breach of the agreement. This includes granting of access to the database[s] containing the data. AWS UK use only UK data centres and provide a private cloud platform which hosts the Clinical Records Management System (CRMS), which was developed by GRAIL, INC (now GRAIL, LLC) and is currently managed by GRAIL Bio UK Ltd. The record-level pseudonymised data extracts referred to in section 5a (above) will be stored in secure S3 folders which are hosted on Amazon Web Services (AWS UK). The PID will be stored separately in the CRMS and is the main PID used to invite and book patients into study appointments. Only authorised study team members of GRAIL Bio UK Ltd and UCL have access to NHS Digital record-level pseudonymised data and PID data stored in the secure S3 folders hosted by AWS in the United Kingdom. Enrolled participants also consent to the transfer and storage of their health data to GRAIL Bio UK Ltd for the purposes of processing the pseudonymised data extracts for this application.

Amazon Web Services, Inc (USA) supply IT infrastructure for GRAIL, LLC and are therefore listed as data processors. They supply support to the system, but do not access data. Therefore, any access to the data held under this agreement would be considered a breach of the agreement. This includes granting of access to the database[s] containing the data. Enrolled participants also consent, as expressly stated in the consent form and participant information sheet, to the transfer of their pseudonymised health data to GRAIL, LLC in the US for purposes permitted by the study participant consent form. The pseudonymised data will be transferred by UCL from the UCL Data Safe Haven to a secure S3 folder hosted on Amazon Web Services, Inc. (USA). The transfer will be undertaken using a secure, encrypted network connection.

The UCL Data Safe Haven (DSH) is a safe haven system which conforms to NHS Digital's Information Governance Toolkit. Access is via a remote desktop arrangement served via Citrix. Access is controlled via the use of a username, password, PIN and one-time token-based password. The token-based password is generated algorithmically and is changed every minute. Access will only be granted to substantive UCL employees for the purpose of processing outlined in the section above. The data analyses of the performance of screening delivery (e.g. uptake and factors that affect uptake) will be undertaken by statisticians at UCL, all using pseudonymised data. The Data Safe Haven is subject to external professional penetration testing on an ongoing basis. Failed logon attempts are recorded in the Data Safe Haven system and are managed by the Data Safe Haven Service Operation Manager. Intrusion attempts and port scans are detected and reported to the UCL security function for investigation as necessary. Data is transferred into the system via a secure gateway technology and is then retained via policy and systems that prevent data leakage (for example, through transfer of data to USB media or copy and paste to the client machine). Whilst using the DSH, users are prevented from accessing any external network resources (web sites, email, etc). The SLMS Data Safe Haven is certified to ISO 27001:2013. Limited PID is also stored in the DSH for the purposes of SUMMIT research and NHS-D data linkage; this includes Date of birth, Age, GP practice name, NHS number, Postcode, Gender, Ethnicity, IMD score and rank and smoking status. This is stored securely in the UCL DSH, access to which is carefully controlled, and only those that have permission to view this PID have access to the UCL Data Safe Haven. Those that do not have permission to access PID will not be able to access the DSH where the data linkage documents are stored and will not be able to link the data to a specific patient.
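To illustrate how a token-based password can be "generated algorithmically" and change every minute, the sketch below follows the general time-based one-time password construction (RFC 6238, HMAC-SHA1) with a 60-second step. This is an assumption for illustration: the agreement does not state which algorithm, token product or parameters the Data Safe Haven actually uses, and the function name is hypothetical.

```python
import hashlib
import hmac
import struct


def time_based_code(secret: bytes, unix_time: int,
                    step_seconds: int = 60, digits: int = 6) -> str:
    """Derive a short numeric code from a shared secret and the current time.

    The code stays the same within one time step and changes at the next.
    """
    counter = unix_time // step_seconds
    digest = hmac.new(secret, struct.pack(">Q", counter), hashlib.sha1).digest()
    offset = digest[-1] & 0x0F  # dynamic truncation, as in RFC 4226
    value = struct.unpack(">I", digest[offset:offset + 4])[0] & 0x7FFFFFFF
    return str(value % 10 ** digits).zfill(digits)
```

Both the token and the server hold the same secret and compute the same code independently, so the password itself never needs to be stored or transmitted in advance.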

Data analysis related to the Galleri blood test will be led by one of GRAIL's senior bio-statisticians/bio-informaticians (employed by the funder, GRAIL, LLC), in the US and UK, all using pseudonymised data. Data may be transferred by GRAIL, LLC to GRAIL Bio UK.

Data processing will only be carried out by substantive employees of UCL, GRAIL, LLC and GRAIL Bio UK Ltd. All employees with access to NHS Digital Record level data have been appropriately trained in data protection and confidentiality.

The role of IAPT in the prevention of dementia and the amelioration of its impact on service use and co-morbidities (the MODIFY project) — DARS-NIC-157211-T8B2M

Opt outs honoured: No - data flow is not identifiable, No (Excuses: Does not include the flow of confidential data)

Purposes: No (Academic)

Sensitive: Non Sensitive, and Sensitive, and Non-Sensitive

When:DSA runs 2020-01 – 2023-01 2020.06 — 2025.10. SMLS reported a DPA serious incident; breached contract — audit report.

Access method: One-Off, Ongoing

Data-controller type: UNIVERSITY COLLEGE LONDON (UCL)

Sublicensing allowed: No

AGD/predecessor discussions: AGD minutes - 16 March 2023 final.pdf, igard-minutes-17th-october-2019---final.pdf, igard-minutes-19th-december-2019-final.pdf

Datasets:

Bridge file: Hospital Episode Statistics to Mental Health Minimum Data Set
Mental Health Services Data Set
Mental Health and Learning Disabilities Data Set
Mental Health Minimum Data Set
Civil Registration - Deaths
HES:Civil Registration (Deaths) bridge
Hospital Episode Statistics Admitted Patient Care
Hospital Episode Statistics Accident and Emergency
Hospital Episode Statistics Outpatients
Improving Access to Psychological Therapies Data Set
Emergency Care Data Set (ECDS)
Civil Registration (Deaths) - Secondary Care Cut
HES-ID to MPS-ID HES Accident and Emergency
HES-ID to MPS-ID HES Admitted Patient Care
HES-ID to MPS-ID HES Outpatients
Improving Access to Psychological Therapies Data Set_v1.5
Civil Registrations of Death - Secondary Care Cut
Hospital Episode Statistics Accident and Emergency (HES A and E)
Hospital Episode Statistics Admitted Patient Care (HES APC)
Hospital Episode Statistics Outpatients (HES OP)
Improving Access to Psychological Therapies (IAPT) v1.5
Mental Health and Learning Disabilities Data Set (MHLDDS)
Mental Health Minimum Data Set (MHMDS)
Mental Health Services Data Set (MHSDS)
Improving Access to Psychological Therapies (IAPT) v2
Civil Registrations of Death

Type of data: Anonymised - ICO Code Compliant

Objectives:

The MODIFY project, otherwise known as “Mental health and other psychological therapy Outcomes; their relationship to Dementia Incidence in the Following Years”, is funded by the Alzheimer's Society and led by University College London researchers. MODIFY aims to enhance understanding of dementia prevention in the UK by examining the role of psychological therapies offered within the England-wide Improving Access to Psychological Therapies (IAPT) services in dementia prevention.

35% of dementia cases are thought to be attributable to modifiable risk factors. Many of these dementia risk factors such as anxiety, depression, social isolation or alcohol use may be modifiable through psychological therapy. Despite this, no one has yet tested whether psychological therapies are associated with reduced future risk of dementia. Consequently, the overarching purpose of this request is to understand whether and how IAPT psychological therapies might play a role in preventing dementia and help those already living with the condition as well as elucidating which factors might affect their utility in doing so.

The researchers plan to create a data resource that links data from the ‘NHS Improving Access to Psychological Therapies’ (IAPT) services to electronic medical records of dementia diagnosis. Using this resource, the researchers will be able to find out whether receiving successful treatment for anxiety and depression is associated with reduced risk of developing dementia. The Applicant has confirmed that this resource and the data within it will not be used for any studies other than those described in this data sharing agreement.

The researchers will also look into the possibility of measuring change due to other risk factors not treated in IAPT interventions, such as poor sleep, loneliness, isolation, physical inactivity and high alcohol use. If this is possible, it could lead to further research into whether psychological therapy may help to prevent dementia by changing other risk factors of dementia.

This request for NHS Digital data comprises one component of the MODIFY project which will commence in November 2019. The MODIFY project also includes a feasibility study examining the impact of psychological therapies on dementia relevant outcomes that are not recorded in the national IAPT dataset. This second component will start approximately in March 2020 and will involve prospectively collecting data from several London based IAPT services. This will be subject to a separate IRAS application and it will not involve any data from NHS Digital.

Young-onset dementia is typically defined as emerging from age 30 onwards; the MODIFY project would therefore like to capture this often ignored group of people with early onset dementia in its data analysis.

DATA REQUESTED
The study requires data for adults aged 20 years and upwards, from the start of the study (2012) up to study end in 2022, who were referred to IAPT, together with their linked Hospital Episode Statistics (HES), Mental Health records and Mortality records as described in the data products section of the application. This comprises annual pseudonymised data for the IAPT data set and linked records in MHMDS (01 April 2013 – 31 August 2014), MHLDDS (01 Sept 2014 – 30 Nov 2015) or MHSDS (1 Jan 2016 to present; v4 from 1 April 2019), Mortality data, and the HES APC, OP and AE and ECDS data sets covering the periods 2012-13 to 2021-22.

The proposed dissemination will create a unique longitudinal linked data set, the analysis of which will provide new information on how psychological therapies in general, and IAPT services in particular, might help prevent dementia. It will, for the first time, enable quantification of:
i) the potential longitudinal impact of IAPT interventions on dementia risk and dementia risk factors,
ii) how that impact might be achieved,
iii) what elements of IAPT provision might maximise that impact and
iv) inequalities in access to IAPT as a potential dementia prevention resource.

The processing of the data is being carried out under article 6(1)e and 9(2)j of the 2018 Data Protection Act. The proposal meets the requirements of article 6(1)e (public interest), because the outputs will provide the first data on the potential utility of an England wide NHS service (IAPT psychological therapies) in addressing a key public health issue - the prevention of dementia and reduction of chronic disease burden.

The proposal meets the requirements of article 9(2)j as it relates to scientific research (described in more detail below) conducted in order to provide the public interest outcome listed above. It is of note, that the scientific research conducted based on this dissemination will also substantially enhance knowledge about dementia prevention globally, since this is the first study to directly examine how psychological therapies for anxiety and depression might prevent dementia.

Expected Benefits:

Findings from this work disseminated and communicated as above have the potential to benefit the future provision of healthcare by providing the first data on an important potential benefit (dementia prevention and amelioration) of psychological therapies already offered throughout England via IAPT.

There are between 850,000 – 1,000,000 people with dementia in the UK and a recent Lancet commission report estimated that 35% of dementia risk is explained by modifiable risk factors. Psychological therapies do have the potential to ameliorate several of the Lancet commission identified risks (depression, social isolation) and other known risks too (anxiety). While precise estimates of reduction in risk are not possible to make, if depression alone were eliminated then it is estimated that dementia prevalence could be reduced by 34000 at 2015 rates. Given the devastation and healthcare utilisation costs that dementia brings, this reduction in incidence would be of significant import to population wellbeing and healthcare budgets. Through quantifying and understanding the potential impact of IAPT on dementia risk reduction, appropriate information can be provided to the 500,000 IAPT attendees yearly.
It may also be possible to use results to tailor psychological therapy provision for the 10s of 1,000s of older people who use IAPT services yearly towards dementia reduction (e.g. it may be that some types of psychological therapies offered in IAPT are better at preventing dementia than others and these could be promoted).

A dementia prevention aspect to psychological therapies would also support the current push to increase the availability of psychological therapies for older adults (who numerous reports suggest are under-represented in IAPT). Indeed, since recent studies have found that dementia is a key fear among older adults, having this knowledge might improve access and uptake in and of itself. Furthermore, IAPT psychological therapies are already offered throughout England and, thus, unlike other potential dementia prevention tools, the proposed prevention activity is already in place and has a delivery system England wide.

This work will also generate important research findings and new questions, the dissemination and communication of which will stimulate researchers and clinicians internationally to develop new research projects and clinical practices aimed at dementia prevention.

Finally, the examination of whether there might be inequalities in access to IAPT services, which might be relevant to dementia prevention, could provide the basis for campaigns to reduce inequalities in IAPT access on dementia prevention grounds. As a consequence of all the above, the proposed dissemination is in the public interest.

The benefits above will be achieved following the end of the three-year project as a result of the dissemination and communication of the outputs as discussed in the outputs section to key figures in:
i) Healthcare policy (the founder of IAPT, the director of the UCL Centre for Outcome Research, presentations at the Alzheimer’s Society national research and policy conference)
ii) Clinical practice (Trust leads for IAPT in London based NHS trusts, integration into training of 150 clinical psychologists and 200 IAPT trainees at UCL, presentations and workshops to clinical audiences)
iii) IAPT service user and dementia movements (Grant co investigators and collaborators include 5 members of the Alzheimer’s Society volunteer research network, all of whom are affected by dementia, as well as IAPT service users who are active in promoting IAPT service user interests)
iv) Dementia research

Outputs:

The outputs will include progress reports to Alzheimer's Society, which will be annual with the first report due in July 2020 and the next in 2021, with the final one in July 2022.

It is expected that several empirical papers will be submitted to key relevant peer reviewed journals such as the Lancet Psychiatry, the British Journal of Psychiatry, Alzheimer’s and Dementia and the International Journal of Geriatric Psychiatry. The first of these is expected to be submitted around end of year 1 of the grant (July 2020) with other publications submitted in the three years following that.

There will be conference presentations including presentations at the annual Alzheimer’s Society conference (May/June or July 2020, 2021, 2022) and a planned presentation at the world’s largest international dementia conference, the Alzheimer’s Association International Congress (AAIC), in July 2021 and 2022.

OUTPUTS
All data in all outputs will be aggregated with small numbers suppressed as per the HES Analysis Guidance and the IAPT, ECDS and the Mental Health (MHSDS, MHLDDS, MHMDS) data sets Disclosure Policies.

There will be an active process of dissemination of the project to relevant stakeholders as the project goes on. This will be done through ongoing engagement with the project stakeholder reference group (which includes people affected by dementia, IAPT service users and IAPT clinicians). There will be conference presentations detailed above to audiences that include clinicians, people affected by dementia, depression and anxiety, researchers and policy makers. Summaries of outputs will be published in patient facing journals. The Alzheimer’s Society will publish results in their ‘care and cure’ magazine.

The project also has links with AnxietyUK who would publish results in their newsletter. There will also be direct bilateral engagement with key figures in psychological therapy research and practice. For instance, the founder of IAPT services supported the funding application. Key publications, findings and conference presentations will be disseminated through UCL media channels (including collaborator and departmental twitter accounts). Workshops based on findings will be run with the 150 clinical psychology trainees on the UCL clinical psychology doctorate of which the applicant is Clinical Director. Findings will also be integrated into the IAPT training courses at UCL, over which the applicant, who is involved in the project and a substantive employee of UCL, has oversight. There are direct links to NICE and consequent policy influence through the director of the UCL Centre for Outcomes Research and Effectiveness who is involved with this project and is a substantive employee of UCL. IAPT service leads and senior practitioners in London IAPT services are collaborators on MODIFY and will take findings to senior trust meetings. These contacts in research, policy, clinical practice, dementia and IAPT service user movements will also facilitate the development of new relationships which will be used to further disseminate findings.

There will be active communication to ensure findings reach their intended audience. This will be through many of the dissemination activities and channels listed above but also through a project website (still in production) and promotion of research at conferences with large degrees of lay attendance (e.g. the Alzheimer’s Society Conference). The UCL and Alzheimer’s Society press offices will be contacted at the point of publication of papers or conference presentations to discuss potential dissemination in the wider media. Throughout, there will be consultation with the stakeholder steering group and other partners above as to how best to communicate findings.

There will not be other exploitation of results or outputs. Only substantive employees of UCL, or students on UCL MSc and doctorate courses under the supervision of UCL substantive employees will have access to the data. All outputs outside of this will be aggregated with small numbers suppressed.

Processing:

NHS Digital will link the datasets listed in products (all of which are held by NHS Digital) using a non-identifiable linking key. Aside from these internal NHS Digital linkages there will be no other linkages of this data. The entire linked dataset will be pseudonymised and no identifiable variables have been included in the dataset.

The data flow out of NHS Digital would be the data sets described in products and above, which would include special categories of pseudonymised health data going to the data safe haven at UCL. There will be no subsequent data flows out of UCL. There will be no flow of data into NHS Digital.

The data is not being matched to publicly available data. Re-identification of individuals is not permitted under this agreement.

The data will be held on the UCL data safe haven using UCL approved computers. The Data Safe Haven is UCL's technical solution for transferring and storing research information that is highly confidential. It meets the requirements of the NHS Digital Information Governance Toolkit and ISO 27001 Information Security standard. Access is controlled by the ‘Information Asset Owner’ and they complete training in confidentiality and data protection, which is renewed annually.

System access is from UCL approved computers and is secure: the user must enter a username and a randomly generated number from a device held by the user, combined with a PIN and a regularly updated password of specified length and complexity.

No organisations other than UCL are involved in the planned data analysis. All those involved in the processing of the data are substantive employees of UCL, or students on UCL MSc and doctorate courses under the supervision of UCL substantive employees, and must sign and adhere to an Honorary Contract, to which extra wording covering the responsibility for students has been added. The work undertaken by the students is only for the purpose stated in this Purpose section.

No elements of the work will take place outside the UK.

All UCL students are expected to undertake annual training on handling highly confidential information. All trainees and students register for and complete NHS Digital’s Data Security Awareness (NHSD) course provided by e-Learning for Health. The course covers data security awareness, the law, threats to data security, breaches and incidents, and the General Data Protection Regulation. Completion of the course is sufficient evidence of basic information governance training to handle highly confidential information under the School of Life and Medical Sciences (SLMS) Information Governance Training Policy. Trainees' and students' up-to-date training is recorded in the SLMS IG training register.

All students working on the MODIFY project are students from UCL. All students working on the MODIFY project will undertake NHS Digital’s Data Security Awareness course provided by e-Learning for Health. UCL has a specific data protection and information security policy, which applies to all staff and students when processing personal data on behalf of UCL. All UCL students working on the MODIFY project are bound by this policy and will face potential sanctions in the event of a breach of the policy.

All students sign up to the UCL's Academic Manual. The Student Academic Misconduct section of the 2019-2020 manual Section 9.1, item 3 states "All instances of Research Misconduct whether by taught students, research students or members of staff will be investigated under UCL’s Procedure for Investigating and Resolving Allegations of Misconduct in Academic Research".

DATA MINIMISATION
• The study requires data for adults aged 20 and above from 2012 (study baseline). There is no upper age limit. Data is required only where an individual also has a record in the IAPT data.
• Only periods necessary for analysis have been requested (from the start of IAPT data 2012 to end of study 2022).
• Identifiable data is not required.
• Only fields necessary for analyses of the prevention of dementia or amelioration of its co-morbidities or impacts on health and service use (e.g. the entire maternity category is omitted) have been included.

British Regional Heart Study (BRHS) - data linkage of established cohort to NHS Digital datasets (HES, MHMDS, DIDS) — DARS-NIC-28591-H5Q3X

Opt outs honoured: No - data flow is not identifiable, No (Excuses: Consent (Reasonable Expectation))

Legal basis: Health and Social Care Act 2012 – s261(2)(c), Health and Social Care Act 2012 s261(2)(c)

Purposes: No (Academic)

Sensitive: Sensitive, and Non Sensitive, and Non-Sensitive

When:DSA runs 2018-07 – 2021-06 2019.01 — 2025.10. SMLS reported a DPA serious incident; breached contract — audit report.

Access method: One-Off, Ongoing

Data-controller type: UNIVERSITY COLLEGE LONDON (UCL), NEWCASTLE UNIVERSITY, UNIVERSITY COLLEGE LONDON (UCL), UNIVERSITY COLLEGE LONDON (UCL), UNIVERSITY OF NEWCASTLE UPON TYNE

Sublicensing allowed: No

Datasets:

Mental Health Services Data Set
Mental Health Minimum Data Set
Bridge file: Hospital Episode Statistics to Mental Health Minimum Data Set
Hospital Episode Statistics Outpatients
Hospital Episode Statistics Accident and Emergency
Hospital Episode Statistics Critical Care
Hospital Episode Statistics Admitted Patient Care
Diagnostic Imaging Dataset
Bridge file: Hospital Episode Statistics to Diagnostic Imaging Dataset
Emergency Care Data Set (ECDS)
Diagnostic Imaging Data Set (DID)
Hospital Episode Statistics Accident and Emergency (HES A and E)
Hospital Episode Statistics Admitted Patient Care (HES APC)
Hospital Episode Statistics Critical Care (HES Critical Care)
Hospital Episode Statistics Outpatients (HES OP)
Mental Health Minimum Data Set (MHMDS)
Mental Health Services Data Set (MHSDS)

Type of data: Anonymised - ICO Code Compliant

Objectives:

The British Regional Heart Study (BRHS) is an established, long-term cohort study of cardiovascular disease and other common chronic diseases - the study comprises older men, currently aged 75-94 years, who originally joined the study in 1978. The BRHS currently obtain cancer registration data, mortality data and use the tracing service under NIC-148411-Q64H8 (MR104). The data held under NIC-148411-Q64H8 will not be linked to the data disseminated under this agreement. Morbidity rates in this study population are exceptionally high, with many study participants developing cardiovascular disease (CVD) and other physical illnesses; fractures and dementia are also major health problems. In order to obtain an accurate assessment of chronic disease outcomes, the researchers are seeking to supplement information on the cohort with disease information from hospital consultations and admissions (HES Data, MHMDS data and DIDs data).

The additional data provided by NHS Digital will be used to inform and develop a larger programme of research on the prevention of CVD (CHD and stroke), heart failure and CVD related ageing conditions including dementia, frailty and physical disability. For example, there is growing evidence that dementia and CVD share common risk factors. BRHS data resource now includes a wide range of novel risk factors measured at both 60-79 years and at 72-91 years (with blood stored for the measurement of further markers). The data from NHS Digital will enhance the study and will lay the ground for investigation into the aetiology, mechanisms and prevention of these age-related conditions in older men and allow us to test new hypotheses in cardiovascular ageing.

Linking the NHS Digital data with the BRHS cohort database will strengthen/enhance the data on chronic disease diagnoses and on health service use. The researchers will use these NHS Digital data, using Study ID, along with data already available in the cohort study on social, biological, behavioural and environmental determinants of health - this will allow the researchers to undertake detailed research on the determinants of cardiovascular disease and other chronic diseases in later life.

The overarching objectives/purpose of this data request is to enhance the BRHS cohort study by obtaining more robust detailed data on disease outcomes in order to research ways to prevent CVD, heart failure, dementia and disability in older ages. The researchers will link the NHS Digital data to pseudonymised data in the BRHS cohort study, which has been obtained (from the cohort) over the last 40 years - this includes mortality, cancer, postal questionnaire data completed by the participants, questionnaire data collected from General Practice and data collected during the physical assessments in 1978-1980, 1998-2000 and 2010-2012.

The following scientific/research objectives will be investigated in the BRHS data based on the detailed disease outcomes data from NHS Digital -
1. Prediction of CVD risk in older people - To investigate the use of non-invasive arterial markers and novel blood markers reflecting a range of biological pathways in improving CVD risk prediction in older men.
2. Lifestyle determinants of CVD in older age - To assess patterns of key health behaviour (physical activity, obesity, diet) in influencing CVD morbidity and mortality in both men with and without established CVD.
3. Modifiable risk factors and dementia - To investigate lifestyle factors measured in mid-life and older age (obesity, smoking, physical activity) as well as diet quality and nutritional markers in older age and risk of developing dementia.
4. Socioeconomic determinants of cardiovascular aging - To investigate the impact of socioeconomic factors that are important in preventing CVD and dementia in older people.
5. Dementia and CVD – To investigate shared risk factors and mechanistic pathways underlying CVD and dementia and improving early identification of CVD and dementia. This research will help develop strategies to prevent dementia and CVD.
6. Determinants of Heart failure – (i) To investigate pathways to prevention of heart failure distinguishing between reduced ejection heart failure and preserved ejection heart failure which is more common in older adults; (ii) to develop prediction risk scores for use in clinical practice to identify older adults at high risk of developing heart failure.
7. Later life determinants of stroke – To differentiate subtypes of strokes and distinguish risk factors for ischaemic and hemorrhagic strokes.
8. Physical disability and frailty – To identify social, lifestyle and biological factors that affect physical functioning and frailty and identify common pathways underlying CVD and frailty which can inform efforts to prevent the development of disability in older people with CVD.
9. Type 2 Diabetes, CVD and dementia - To examine the influence of duration of diabetes on CVD risk and dementia and identify metabolic pathways linking diabetes with dementia.
10. Improving clinical outcome – To identify and inform ways of evaluating and improving clinical outcomes in patients with CVD and/or dementia such as reducing hospitalisations and mortality.

Expected Benefits:

Enhanced data on health and disease outcomes in the BRHS cohort study will allow further detailed research on ways to prevent chronic diseases in older ages.

This will lead to development of the research evidence base that is needed to inform clinical guidelines and health policies to improve the health of ageing populations.

Global trends of ageing populations will acutely increase the health and social care burdens on individuals and society from chronic diseases such as cardiovascular disease, diabetes, dementia, other chronic diseases and disability in later life - these chronic diseases present both health and social care challenges in older populations. Therefore, research in this cohort study BRHS aims to establish the contributions of potentially important factors (obesity, diabetes, health behaviours, environmental and social factors) to prevent cardiovascular disease, diabetes, dementia, other chronic diseases and disability in later life - this research evidence is crucially needed to inform health policies and clinical guidelines to reduce the health and social care burden of chronic diseases in older people. The long term goal (>5 years) of the research is to lead to improved care and prevention of chronic diseases and disability in older people.

The BRHS has a track record of providing high quality evidence to improve the health of the public in the UK and internationally. To date (using data received under NIC-148411-Q64H8), the study has published over 500 peer reviewed research papers, providing high quality evidence about the epidemiology of these conditions and improving understanding on how to manage, treat and prevent them. Importantly, these papers have informed evidence based strategies to reduce the health and social care burden in older populations, as outlined in detail in section “Specific output” above. The researchers have contributed to a range of influential UK and international clinical guidelines for management and treatment of important chronic conditions including CHD, stroke, angina, arrhythmias, and diabetes which together cause substantial burdens of ill health in UK and globally.

The BRHS have also contributed to public health guidelines and clinical guidelines about the modification of important cardiovascular disease risk factors (e.g. lipids, obesity, alcohol use, physical activity, smoking and passive smoking).

Outputs:

Short term goals - 1 year

Enhancing the BRHS cohort study with detailed data on disease outcomes based on HES, MHSDS and DIDs data requested under this iteration of the Data Sharing Agreement.

Medium term goals - 2-5 years

Peer reviewed publications in scientific and clinical journals based on research objectives mentioned in objective for processing section.

Long term goals - 5 years and over

Adding to the scientific evidence base and knowledge to inform clinical guidelines and health policy.

The specific outputs from the use of the data will be to generate further high quality research evidence about prevention of chronic diseases and to improve the health of older populations. Importantly, the NHS Digital data linkage requested will substantially strengthen and enhance the BRHS data on chronic disease events and diagnosis and on health service use. The new data linkage will substantially increase the quality of data relating to treatment and management of disease events, permitting the researchers to investigate the disease endpoints in greater detail than has been done previously (e.g. understanding treatment received, recurrence of events and categorising sub-types of cardiovascular events). Linking the existing BRHS database to NHS Digital data will permit research into a wide range of public health relevant topics. The potential benefits for the prevention of cardiovascular disease, diabetes, dementia and other chronic diseases and disability in later life are substantial. Target dates will run from the time of acquiring the data until 2019 with plans to further extend funding for the study.

The BRHS cohort study has previously led to the development of evidence, knowledge and translation of evidence into health policies, as described below:

More than 500 peer-reviewed reports have already been published based on the study which uses mortality data from NHS Digital. It is hoped that research from this new study will be published and utilised in the same way.

Research from the BRHS has been used to shape and change many policies on cardiovascular disease prevention, both nationally and internationally. The BRHS provide outputs in the form of peer reviewed publications from the research directly to funding bodies and policy makers (Department of Health, British Heart Foundation, National Institute of Health Research, Medical Research Council, UK Health Forum), clinicians, public health specialists and other health researchers who then use the evidence to develop preventive strategies.

The researchers have a track record of research findings informing health policies on a range of issues related to primary and secondary prevention of cardiovascular disease (CVD), management of stroke, angina, arrhythmias, and diabetes and modification of risk factors e.g. lipids, obesity, alcohol use, physical activity, smoking and passive smoking.

The Study findings will be cited in reports by a range of influential national and international public sector bodies including the UK House of Commons Health Select Committee, the UK Department of Health, the U.S. Surgeon General (whose reports inform health policies both in USA and other countries around the world) and the World Health Organisation (e.g. their Guidelines for assessment and management of cardiovascular risk).

Previous research is also cited in guidelines produced by professional organisations for treatment of specific chronic conditions, e.g. NICE guidelines, American Heart Association guidelines for prevention of stroke and transient ischemic attack, for management of cardiovascular disease, and management of patients with ventricular arrhythmias, Australian guidelines for management of cardiovascular disease risk, Joint British Societies management of cardiovascular disease guidelines, Endocrine Society guidelines on hypertriglyceridemia and obesity.

Evidence generated from the research has also been used to support local public health programmes, for example, in developing initiatives for primary prevention of CVD and dementia in South East London. The research findings have been published in open access peer-reviewed scientific journals related to public health.

Processing:

The BRHS currently receives data from three sources:
1. Study participants - Physical Examinations - 1978-80, 1998-2000, 2010-2012 and regular postal questionnaires - no further personal identifiers are collected. Participants are asked to provide their date of birth when returning the postal questionnaire, to ensure the form is completed by the intended recipient.
2. GP record review - data collected annually directly from participants' GP using a questionnaire sent to the GP. An update of the participant address is requested so that the researchers can continue to contact cohort members.
3. NHS Digital - Participants were flagged in 1978-80 and the study receives Mortality notification & Cancer registration on a monthly basis via the Data Exchange Service (received under NIC-148411-Q64H8). This cohort is the original cohort; no new participants are added.

The BRHS cohort has been followed up since 1978. The researchers are requesting historic NHS Digital data (HES, MHSDS and DIDs) as far back as possible for this cohort (i.e. all the available years of data) and on an annual basis going forward. Only those members of the cohort who have provided consent will be followed up for the research purposes mentioned in the objective for processing section. These data will be used to enhance the data already held in the cohort and help to produce robust research findings. Data are requested for this cohort going as far back as possible because this will provide detailed information necessary for research on cardiovascular disease and dementia. A key feature of a cohort study is that health outcomes are assessed over time, which provides information on incidence (development) of disease. Therefore data on all available years are requested so as to have complete information on development of diseases - this is needed in order to investigate the research objectives, which are to investigate determinants and prevention of diseases.

Without all the retrospective data requested, the research will be limited to only assessing the prevalence of diseases and lead to biased and limited data analysis.

Processing of data for the linkage requested:

The BRHS will provide NHS Digital with Study ID, NHS number, DOB, Sex & last known postcode for linkage to the data requested from NHS Digital for 4,123 consented participants.

NHS Digital will return a pseudonymised dataset to the applicant containing study ID and match rank code.

The Data manager will then link this NHS Digital pseudonymised dataset, using the Study ID, to the BRHS cohort data ID for analysis. The NHS Digital data will not be linked back to any personal identifiers.
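The Study ID linkage step described above can be sketched as follows. This is a minimal illustration only, not the study's actual procedure: the field names (`study_id`, `match_rank`, `diagnosis`, `smoker`) and the function `link_by_study_id` are hypothetical, and the sketch assumes each returned NHS Digital row carries exactly one Study ID and no personal identifiers.

```python
from typing import Any, Dict, List

Row = Dict[str, Any]


def link_by_study_id(cohort_rows: List[Row], nhs_rows: List[Row]) -> List[Row]:
    """Attach cohort data to each pseudonymised NHS Digital row via Study ID.

    Field names are hypothetical. One cohort member may have many
    NHS Digital rows (for example several hospital episodes), so the
    result has one row per matched NHS Digital record.
    """
    # Index the cohort by Study ID for constant-time lookup.
    cohort_by_id = {row["study_id"]: row for row in cohort_rows}

    linked: List[Row] = []
    for nhs_row in nhs_rows:
        cohort_row = cohort_by_id.get(nhs_row["study_id"])
        if cohort_row is None:
            # Rows without a matching cohort member are left out
            # (an assumption of this sketch).
            continue
        # Combine the two records; neither contains personal identifiers.
        linked.append({**cohort_row, **nhs_row})
    return linked


cohort = [
    {"study_id": "S1", "smoker": True},
    {"study_id": "S2", "smoker": False},
]
nhs = [
    {"study_id": "S1", "match_rank": 1, "diagnosis": "I21"},
    {"study_id": "S1", "match_rank": 1, "diagnosis": "I50"},
    {"study_id": "S9", "match_rank": 3, "diagnosis": "F03"},
]
linked = link_by_study_id(cohort, nhs)
```

Keying the join on Study ID alone means the analysis data set never needs NHS number, date of birth or postcode, consistent with the statement that the NHS Digital data will not be linked back to personal identifiers.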

The pseudonymised dataset will be stored on UCL’s Sync & Share network drives which are only accessible with a UCL user ID and password. The pseudonymised data will then be made available to the research team of Medical Statisticians, Epidemiologists and Public Health clinicians, to carry out their research analysis. All the researchers working on the data are substantive employees of UCL.

For data from the Mental Health (MHSDS, MHLDDS, MHMDS) data sets, and any Mental Health data linked to HES or SUS, the following disclosure control rules must be applied:
• National-level figures only may be presented unrounded, without small number suppression
• Suppress all numbers between 0 and 5
• Round all other numbers to the nearest 5
• Percentages can be calculated based on unrounded values, but need to be rounded to the nearest integer in any outputs
• In addition for Learning Disability data in Mental Health (MHSDS, MHLDDS, MHMDS), the England-level data also must apply the suppression of all numbers between 0 and 5, and rounding of other numbers to the nearest 5.
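As an illustration, the suppression and rounding rules listed above could be applied to a single count as follows. This is a hedged sketch rather than an official implementation: it assumes "between 0 and 5" means counts of 1 to 4 (zero is left as zero), and the `*` marker and function names are hypothetical.

```python
import math

SUPPRESSED = "*"  # placeholder marker; the actual symbol is an assumption


def disclose_count(count, national=False):
    """Apply the suppression and rounding rules to one count.

    Pass national=False for Learning Disability data even at England
    level, since the rules above require suppression and rounding there.
    """
    if count < 0:
        raise ValueError("counts cannot be negative")
    if national:
        return count  # national-level figures may be presented unrounded
    if 0 < count < 5:
        return SUPPRESSED  # assumption: "between 0 and 5" means 1 to 4
    return 5 * ((count + 2) // 5)  # nearest multiple of 5


def disclose_percentage(numerator, denominator):
    """Percentage computed from unrounded values, rounded to an integer."""
    pct = 100 * numerator / denominator
    return math.floor(pct + 0.5)  # round half up
```

For example, `disclose_count(3)` is suppressed, `disclose_count(12)` becomes 10 and `disclose_count(13)` becomes 15.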

All small numbers under 5 must be suppressed in line with the HES analysis guide.

All organisations party to this agreement must comply with the Data Sharing Framework Contract requirements, including those regarding the use (and purposes of that use) by “Personnel” (as defined within the Data Sharing Framework Contract, i.e. employees, agents and contractors of the Data Recipient who may have access to that data).

All outputs will be restricted to aggregated data with small numbers suppressed in line with the HES analysis guide. No publications/outputs from the BRHS have ever presented or will present data which allow the identification of individuals. All data presentation is based on groups of subjects (generally >50 subjects, often considerably larger numbers).

The data from NHS Digital will not be used for any purpose other than that outlined in this Agreement.

MR472B - SABRE: Southall and Brent Revisited - Consented participants — DARS-NIC-148407-LRP3M

Opt outs honoured: No - consent provided by participants of research study, No - data flow is not identifiable, No (Excuses: Consent (Reasonable Expectation))

Legal basis: Health and Social Care Act 2012, Other-Data originally supplied on the basis of National Health Service Act 2001 – s60, and subsequently National Health Service Act 2006 - s251., Other-Data originally supplied on the basis of National Health Service Act 2001 – s60. Subsequent data releases under Approved researcher accreditation under section 39(4)(i) and 39(5) of the Statistical Registration Service Act 2007., Other-Data originally supplied on the basis of National Health Service Act 2001 – s60, and subsequently National Health Service Act 2006 - s251 - 'Control of patient information'. | New data to be disseminated on the basis of Informed Patient consent to, Approved researcher accreditation under section 39(4)(i) and 39(5) of the Statistical Registration Service Act 2007 , Informed Patient consent to permit the receipt, processing and release of data by the HSCIC, Health and Social Care Act 2012 – s261(2)(c), Informed Patient consent to permit the receipt, processing and release of data by NHS Digital, Health and Social Care Act 2012 – s261(1) and s261(2)(b)(ii), Health and Social Care Act 2012 s261(1) and s261(2)(b)(ii), Health and Social Care Act 2012 s261(2)(c), Health and Social Care Act 2012 s261(2)(b)(ii)

Purposes: No (Academic)

Sensitive: Non Sensitive, and Sensitive, and Non-Sensitive

When:DSA runs 2019-04 – 2022-03 2017.12 — 2025.10. SMLS reported a DPA serious incident; breached contract — audit report.

Access method: One-Off, Ongoing

Data-controller type: UNIVERSITY COLLEGE LONDON (UCL)

Sublicensing allowed: No

AGD/predecessor discussions: igardminutes-17thdecember2020final.pdf, igardminutes-14thjanuary2021final.pdf, IGARD_Minutes_06.04.17.pdf

Datasets:

Hospital Episode Statistics Admitted Patient Care
MRIS - Flagging Current Status Report
MRIS - Cause of Death Report
MRIS - Cohort Event Notification Report
MRIS - Members and Postings Report
Demographics
Civil Registration - Deaths
Cancer Registration Data
MRIS - Scottish NHS / Registration
MRIS - List Cleaning Report
Hospital Episode Statistics Admitted Patient Care (HES APC)
Civil Registrations of Death

Type of data: Anonymised - ICO Code Compliant, Identifiable

Objectives:

The data supplied by the NHS IC to University College London will be used only for the approved Medical Research Project MR472.

Yielded Benefits:

The research has also enabled improved ascertainment of incident coronary heart disease and stroke events and resulted in 10 publications in high impact journals relating these outcomes to risk factors measured in mid-life (ages 40-70 at baseline). Findings include:
- Diabetes incidence in older British South Asians and African Caribbeans remains at least 2-fold even at age 80 years compared with British Europeans. The ethnic differentials in women were largely explained by midlife truncal obesity and insulin resistance, but the study was unable to explain the ethnic difference in men. The study showed that obesity cut-points of 24 and 27 kg/m2 in South Asians and African Caribbeans respectively were equivalent to a body mass index of 30kg/m2 in Europeans in terms of diabetes risk; these latter analyses contributed to recent NICE guidelines for prevention of diabetes. Diabetes was also found to be more ‘toxic’ in terms of stroke risk in the ethnic minorities. Widely used tools (Framingham and QRISK2) for estimation of cardiovascular disease risk were found to be less precise in South Asians and African Caribbeans (particularly women), while a selection of 3 metabolic markers measured by NMR spectroscopy was found to be strongly predictive of cardiovascular risk regardless of ethnicity.
- Lack of adherence to four combined health behaviours was associated with a 2 to 3-fold increased risk of incident CVD in Europeans and South Asians. A substantial population impact in the South Asian group indicates important potential for disease prevention in this high-risk group by adherence to healthy behaviours.
- The study has also found marked ethnic differences in associations between blood pressure parameters and stroke and concluded that undue focus on systolic blood pressure for risk prediction, and current age and treatment thresholds may be inappropriate for individuals of South Asian ancestry.
- The study reported detrimental associations between air pollution (particulate measures) and cardiovascular disease mortality in both the SABRE and NHSD cohorts. The study highlighted ethnic differences in associations between prediabetes in midlife and later development of coronary heart disease and stroke.
- The study has confirmed ongoing excess coronary heart disease incidence in South Asians, with lower incidence in African Caribbeans compared with Europeans and confirmed elevated risk of stroke in both ethnic minority groups. Measured baseline metabolic risk factors could not explain the ethnic group differences. Future work in the cohort will examine whether these ethnic differentials continue into older age and whether newer genetic, epigenetic and metabolomic analyses will add to understanding of the underlying mechanisms. Of particular concern was a much stronger association between diabetes and stroke risk in both ethnic minority groups compared with Europeans with diabetes, an association which is the subject of ongoing study in the SABRE cohort. HES data contributed importantly to the identification of incident coronary and stroke events reported in these analyses.

Expected Benefits:

The rich phenotypic and genotypic dataset gathered over a 25 year period will enable analyses assessing mid-life predictors of health and ill-health in older age and will enable unique analyses of how these associations may be related to ethnicity and migration. Good physical and cognitive functions are vital to healthy ageing and factors which influence these across the life course are poorly understood, particularly in non-European populations. As the cohort is reaching older age, an increase in risk of heart failure, which can be severely debilitating, is expected. Ethnic differentials in heart failure rates are not well studied to date. Increasing length of follow-up and novel analytic techniques, both statistical and relating to stored images and samples bring opportunities for more sophisticated analyses and the addition of hospital admission data to key outcome variables enhances the study’s power to identify events and to further elucidate mechanisms underlying the very marked ethnic differences in cardiometabolic disorders which were observed at visit 2.

Understanding of mechanisms in people of different ethnicities will ultimately lead to appropriate preventive strategies and treatments at different stages of life.

As noted with some detailed examples under ‘specific outputs’, previous use of HES data (1989-2011) enabled improved ascertainment of incident coronary heart disease and stroke events and resulted in 10 publications in high impact journals relating these outcomes to risk factors measured in mid-life (ages 40-70 at baseline). A brief summary of some of these findings in the cohort to 2011 follows:

Diabetes incidence in older British South Asians and African Caribbeans remains at least 2-fold even at age 80 years compared with British Europeans. The ethnic differentials in women were largely explained by midlife truncal obesity and insulin resistance, but the study was unable to explain the ethnic difference in men. The study showed that obesity cut-points of 24 and 27 kg/m2 in South Asians and African Caribbeans respectively were equivalent to a body mass index of 30kg/m2 in Europeans in terms of diabetes risk; these latter analyses contributed to recent NICE guidelines for prevention of diabetes. Diabetes was also found to be more ‘toxic’ in terms of stroke risk in the ethnic minorities. Widely used tools (Framingham and QRISK2) for estimation of cardiovascular disease risk were found to be less precise in South Asians and African Caribbeans (particularly women), while a selection of 3 metabolic markers measured by NMR spectroscopy was found to be strongly predictive of cardiovascular risk regardless of ethnicity.

Lack of adherence to four combined health behaviours was associated with 2 to 3-fold increased risk of incident CVD in Europeans and South Asians. A substantial population impact in the South Asian group indicates important potential for disease prevention in this high-risk group by adherence to healthy behaviours.

The study also found marked ethnic differences in associations between blood pressure parameters and stroke and concluded that undue focus on systolic blood pressure for risk prediction, and current age and treatment thresholds may be inappropriate for individuals of South Asian ancestry. This is not an exhaustive list of study findings in relation to incident cardiometabolic disease but indicates that the study is building a steady accumulation of understanding of ethnic differentials. There is clear need for further study, which this cohort is uniquely able to address. The addition of HES data to 2016 is key to maximising event ascertainment in old age.

Outputs:

Study findings will continue to be published in peer-reviewed scientific journals, predominantly related to epidemiology, cardiovascular and metabolic disorders, cognitive, physical and psychological function, but also including more generic journals such as the BMJ, reflecting the increasing focus on overall health in older age. Publications will contain only aggregate level data without local identifiers and with suppression of small numbers in line with HES analysis guide.

Publications to date are listed on the study website: www.sabrestudy.org. All publications since 2008 are open-access. The audience is expected to consist mainly of academic researchers and clinicians.

Two examples of previous SABRE study related publications are listed below. Both sets of analyses were importantly informed by data from a previous HES extract (no longer retained), and were published in high-impact factor peer-reviewed journals. These generated considerable media interest and are widely cited.

Tillin T, Hughes AD, Mayet J, Whincup P, Sattar N, Forouhi NG, McKeigue PM, Chaturvedi N. The relationship between metabolic risk factors and incident cardiovascular disease in Europeans, South Asians and African Caribbeans. SABRE (Southall and Brent revisited) – a prospective population based study. J Am Coll Cardiol. 2013 Apr 30;61(17):1777-86. http://dx.doi.org/10.1016/j.jacc.2012.12.046. This paper, published in JACC, the world no 1 cardiology journal (impact factor 16.5), confirmed ongoing excess coronary heart disease incidence in South Asians, with lower incidence in African Caribbeans compared with Europeans and confirmed elevated risk of stroke in both ethnic minority groups. Measured baseline metabolic risk factors could not explain the ethnic group differences. Future work in the cohort will examine whether these ethnic differentials continue into older age and whether newer genetic, epigenetic and metabolomic analyses will add to understanding of the underlying mechanisms. Of particular concern was a much stronger association between diabetes and stroke risk in both ethnic minority groups compared with Europeans with diabetes, an association which is the subject of ongoing study in the SABRE cohort. HES data, although not directly reported in the manuscript, contributed importantly to the identification of incident coronary and stroke events reported in these analyses.

Tillin T, Hughes AD, Godsland IF, Whincup P, Forouhi NG, Welsh P, Sattar N, McKeigue PM, Chaturvedi N. Insulin resistance and truncal obesity as important determinants of the greater incidence of diabetes in Indian Asians and African Caribbeans compared to Europeans? The Southall And Brent REvisited (SABRE) cohort. Diabetes Care 2013;36(2)(383-393). http://care.diabetesjournals.org/content/36/2/383.long. This paper demonstrated the extraordinarily high risk of incident diabetes continuing into old age in South Asians and African Caribbeans in comparison with Europeans. Metabolic pathways leading to diabetes remain poorly understood. The study found that baseline insulin resistance and truncal obesity could explain the ethnic differences in women but not in men. Further work continues to determine the reasons for the excess risk in men and to understand what underlies insulin resistance and truncal obesity. HES data, although not directly reported in this manuscript, supported these analyses by enabling sensitivity analyses to assess the effects of bias due to loss to follow-up.

A further 8 journal publications have examined associations between baseline risk factors and incident coronary heart disease or stroke where the outcomes were a composite of first events identified through participant reported events, primary care record review identified events and HES identified hospital admissions. One of these was published in Circulation (Wurtz et al), impact factor 14.3 and demonstrated in 3 separate population based studies (including SABRE) that metabolite profiling in large prospective cohorts identified phenylalanine, monounsaturated fatty acids, and polyunsaturated fatty acids as biomarkers for cardiovascular risk, substantiating the value of high-throughput metabolomics for biomarker discovery and improved risk assessment. Another publication in Heart (Tillin et al) identified that 2 widely used cardiovascular risk prediction tools (QRISK2 and Framingham) did not perform consistently well in all ethnic groups and suggested that further validation of QRISK2 in other multi-ethnic datasets, and better methods for identifying high risk African Caribbeans and South Asian women, are required.

In addition to journal publications, UCL will continue to submit abstracts for presentation at national and international conferences, such as Diabetes UK, the European Association for the Study of Diabetes, the European Society of Cardiology, Artery, and the British Hypertension Society. All data for abstracts/presentations will be at aggregate level with suppression of small numbers in line with HES analysis guide.

The study team will further disseminate findings via participant and GP feedback sessions; newsletters, and the study website. All data for these occasions will be at aggregate level with suppression of small numbers in line with HES analysis guide.

At the end of the current funding period (2018) a report will be submitted to the funders (the British Heart Foundation) summarising findings. This may be published on their website. It will contain aggregate level data only, with small numbers suppressed in line with HES Analysis Guide.

Processing:

The identifiers of SABRE participants have previously been shared with NHS Digital’s predecessor organisation(s) and NHS Digital has provided regular event notifications including notifications of mortality and cancer registrations. The cohort was previously split into two groups: cancer notifiable participants and non-cancer notifiable participants.

The cohort will be reorganised into three groups: participants who gave informed consent (cancer notifiable); cancer notifiable participants covered by section 251 support, and non-cancer notifiable participants covered by section 251 support. To ensure that participants are correctly reorganised into the appropriate groups, UCL will send NHS Digital 3 separate files (one for each respective group) containing participant identifiers.

NHS Digital will then provide reports on a monthly basis while the study is in active follow-up. Notifications will contain no participant identifiers other than unique study Pseudo-IDs. Month and Year of Death will also be included.

NHS Digital will link the respective cohort groups to HES data and will supply to UCL encrypted files containing hospital admissions data identified only by study Pseudo-ID and encrypted HESID and containing no other identifiers. The dataset will be placed immediately into UCL’s Data Safe Haven.

Using the Pseudo-ID, the data is linked at record level to the existing dataset of mortality and cancer records, clinical measures, primary care record review and participant responses to health and lifestyle questionnaires across the course of the study. The data is stored in an encrypted file within the Data Safe Haven at the Gower Street location. The data can be remotely accessed at the Institute of Cardiovascular Science by accredited SABRE study researchers only – all of whom are substantive employees of UCL. Access must be approved by the Data Manager.

The data supplied by NHS Digital will not be downloaded or otherwise transferred from the Data Safe Haven. Data including variables derived from the NHS Digital data may be downloaded from the Data Safe Haven and stored on a UCL server at the Institute of Cardiovascular Science to be used solely for the purposes of statistical analyses in accordance with the study objectives. Such variables include, for example, date of first admission related to a diagnosis of coronary heart disease but will not include any part of the dataset supplied by NHS Digital. Using this pseudonymised dataset, study analysts will examine associations between risk factors measured during the course of the study and cardiometabolic events. The rich phenotypic and genotypic dataset will enable identification of ethnic differences in cardiometabolic disease risk and physical, mental and cognitive function into older age and it will be possible to identify which measured risk factors may explain ethnic differentials and at which period of life they may act most strongly.

To meet study objectives UCL require information on admissions where diagnostic code lists include coronary heart disease, stroke, heart failure, diabetes, renal failure, dementia, retinopathy, hypertension, other cardiovascular disease. Respiratory diseases will also be studied, and mental health disorders and other common disorders which are considered to exert important influences on function and well-being in older age may be added. As an example, from the HES extract, and within the UCL Data Safe Haven, it is expected that a variable will be generated which identifies a first or subsequent admission with coronary heart disease (ICD-9 codes 410 through 415 or ICD-10 codes I200 through I259, or any of the following operation codes from the Office of Population Censuses and Surveys (OPCS) classification of interventions and procedures: K401 through K469, K491 through K504, K751 through K759, or U541 (coronary revascularization interventions or rehabilitation for ischemic heart disease)). Date of first or subsequent event would be summarised as year of event.
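A derived variable of this kind might be generated along the following lines. This is a minimal sketch under stated assumptions, not the study's actual code: the record layout (`admission_date`, `icd_version`, `diagnoses`, `operations`) and all function names are hypothetical, codes are assumed to be stored with or without a dot, and the ICD-10 range I200 through I259 is read as any code in categories I20 to I25.

```python
import re
from datetime import date

# Inclusive ranges taken from the code lists quoted in the text.
ICD9_CHD = (410, 415)
ICD10_CHD_CATEGORIES = (20, 25)  # I200 through I259, i.e. I20 to I25
OPCS_CHD = {
    "K": [(401, 469), (491, 504), (751, 759)],
    "U": [(541, 541)],
}


def _normalise(code):
    """Upper-case a code and drop dots and spaces, e.g. 'i21.9' -> 'I219'."""
    return re.sub(r"[.\s]", "", code.upper())


def is_chd_diagnosis(code, icd_version):
    """True if a diagnosis code falls in the quoted CHD ranges."""
    code = _normalise(code)
    if icd_version == 9:
        m = re.match(r"\d{3}", code)
        return bool(m) and ICD9_CHD[0] <= int(m.group()) <= ICD9_CHD[1]
    m = re.match(r"I(\d{2})", code)
    lo, hi = ICD10_CHD_CATEGORIES
    return bool(m) and lo <= int(m.group(1)) <= hi


def is_chd_operation(code):
    """True if an operation code falls in the quoted OPCS ranges."""
    m = re.match(r"([A-Z])(\d{3})", _normalise(code))
    if not m:
        return False
    number = int(m.group(2))
    return any(lo <= number <= hi for lo, hi in OPCS_CHD.get(m.group(1), []))


def first_chd_year(admissions):
    """Year of the earliest admission with a CHD code, or None."""
    dates = [
        adm["admission_date"]
        for adm in admissions
        if any(is_chd_diagnosis(c, adm["icd_version"]) for c in adm["diagnoses"])
        or any(is_chd_operation(c) for c in adm["operations"])
    ]
    return min(dates).year if dates else None


# Hypothetical admissions for one Pseudo-ID, for illustration only.
example_admissions = [
    {"admission_date": date(2004, 3, 2), "icd_version": 10,
     "diagnoses": ["I63.9"], "operations": []},
    {"admission_date": date(2009, 7, 15), "icd_version": 10,
     "diagnoses": ["I21.9"], "operations": ["K49.1"]},
]
example_year = first_chd_year(example_admissions)
```

Only the year is returned, in line with the statement that the date of event would be summarised as year of event.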

The data is stored separately to participant identifiers. The two datasets will not be re-linked and the data will remain pseudonymised as described above. Month and Year of Death are stored in the dataset and used for statistical analyses but the dataset does not include full Date of Death. Participant identifiers are retained separately solely for study administration purposes.

MR472A - SABRE: Southall and Brent Revisited - S251 participants not cancer notifiable — DARS-NIC-99077-Q0K6Z

Opt outs honoured: Yes - patient objections upheld, Yes (Excuses: Section 251 NHS Act 2006)

Legal basis: Approved researcher accreditation under section 39(4)(i) and 39(5) of the Statistical Registration Service Act 2007 , Health and Social Care Act 2012, Section 251 approval is in place for the flow of identifiable data, National Health Service Act 2006 - s251 - 'Control of patient information'. , Health and Social Care Act 2012 – s261(1) and s261(2)(b)(ii), Health and Social Care Act 2012 – s261(7), Health and Social Care Act 2012 s261(1) and s261(2)(b)(ii), Health and Social Care Act 2012 s261(7), Health and Social Care Act 2012 s261(2)(b)(ii), Health and Social Care Act 2012 s261(7); National Health Service Act 2006 - s251 - 'Control of patient information'., Health and Social Care Act 2012 - s261(5)(d)

Purposes: No (Academic)

Sensitive: Sensitive, and Non Sensitive, and Non-Sensitive

When:DSA runs 2019-04 – 2022-03 2017.09 — 2025.10. SMLS reported a DPA serious incident; breached contract — audit report.

Access method: Ongoing, One-Off

Data-controller type: UNIVERSITY COLLEGE LONDON (UCL)

Sublicensing allowed: No

AGD/predecessor discussions: IGARD Minutes - 17 November 2022 v1.pdf, IGARD Minutes - 4 August 2022.pdf, IGARD_Minutes_06.04.17.pdf

Datasets:

MRIS - Cause of Death Report
MRIS - Cohort Event Notification Report
Hospital Episode Statistics Admitted Patient Care
MRIS - Members and Postings Report
Civil Registration - Deaths
Demographics
MRIS - Flagging Current Status Report
MRIS - List Cleaning Report
Hospital Episode Statistics Admitted Patient Care (HES APC)
Civil Registrations of Death

Type of data: Anonymised - ICO Code Compliant, Identifiable

Objectives:

University College London (UCL) requires notifications of mortality and linked HES data for its study cohort for use in the Medical Research Project: SABRE (Southall And Brent Revisited). This is a population-based cohort study, conducted at University College London, funded by the British Heart Foundation in its current 25 year follow-up phase. It is unique as a long-standing tri-ethnic cohort consisting of people of European descent and first generation migrants of South Asian or African Caribbean descent. This is an academic research study focusing on identifying and understanding the underlying reasons for ethnic group and sex differences in cardiometabolic disease and in physical, psychological and cognitive function in older age.

Specific questions for the 25 year follow-up study are:
1. How large are ethnic/sex differences in cardiac function, cognitive function and hippocampal volumes in older age?
2. To what extent do cardiac function, cognitive function and hippocampal volumes change over a 5 year period in each ethnic group?
3. Which risk factors measured in mid-life and in early old age are most strongly associated with current cardiac and cognitive function and hippocampal volumes and with 5 year changes in these parameters? Can these risk factors explain ethnic differences in cardiac and cognitive function?
4. How large are gender differences in current disorders of cardiac and cognitive function and in their associations with current risk factors?
5. Do ethnic differences in incident cardiometabolic disorders persist into older age?
6. Which risk factors or risk factor profiles measured in mid-life and early old age are most strongly associated with incident cardiometabolic disorders and which best explain ethnic differences in incidence?

The study receives ongoing notifications of mortality from NHS Digital. Continuing supply of this data is required in order to meet study objectives. Death and cause of death are key outcomes for the research objectives.

The study has previously utilised the List Cleaning service from time to time when in active follow-up in order to ensure that the correct participant addresses are used in order to contact participants. Use of this service has helped the study to avoid trying to contact deceased participants. The List Cleaning outputs were used to update the administration database (held separately from other data within the UCL data safe haven) so that UCL could write to as many participants as possible inviting them to complete questionnaires or come into the UCL clinic for a detailed investigation. Under this Data Sharing Agreement, UCL may retain List Cleaning outputs received previously but is not permitted to make further use of the List Cleaning service.

Linked HES data is required to identify incident cardiometabolic events (in particular coronary heart disease, heart failure, stroke, dementia, diabetes), and other events which may affect physical and cognitive function, which have occurred during the follow-up period. Details of all hospital episodes involving the cohort (not limited to the previously stated conditions) are required to address key study objectives with regard to physical and cognitive function in older age in association with current and mid-life risk factors. Analysis needs to consider any and all potential contributing factors.

These events will supplement information provided by participant self-report at 20 and 25 years, from primary care medical record review conducted during the 20 year follow-up and from mortality flagging, together with detailed clinical measurements made at the SABRE clinics at baseline, 20 and 25 year follow-up. The SABRE cohort is increasingly elderly (median age of survivors in 2016 = 77 years, range 65-98). At visit 3, although many are willing and able to visit UCL’s clinic and/or to complete questionnaires, many who attended the last follow-up 5 years ago are now too frail or unwell to attend the 25 year follow-up clinic or to complete the health and lifestyle questionnaires, and sadly many have died (approximately 1,500, or 31%). Diagnosis of disease events/states identified during admission to hospital is increasingly important in assessing health in this elderly cohort and will inform all key event outcomes. This is particularly important in assessing health in those otherwise lost to direct follow-up. The data will be used to analyse risk factors measured in mid- and later life in association with these incident events in order to build on current understanding of causal mechanisms.

Data from 1989 to the present is required because participants underwent detailed examinations at baseline (1989-91) and the aim is to follow this cohort through their experiences since to understand what happened in later life and relate that to the baseline. This will enable UCL to gain as complete as possible a picture of hospital admissions, and hence incident events, over the entire cohort follow-up. Data from the entire study period are crucial for determining age of onset of events, as well as the extent and nature of ill-health from mid to later life, and for relating these to current and mid-life cardiometabolic and other risk factors and how these influence the key study outcomes of physical and cognitive function in older life in each of the three ethnic groups.

Yielded Benefits:

The research has also enabled improved ascertainment of incident coronary heart disease and stroke events and resulted in 10 publications in high impact journals relating these outcomes to risk factors measured in mid-life (ages 40-70 at baseline). Findings include:
- Diabetes incidence in older British South Asians and African Caribbeans remains at least 2-fold even at age 80 years compared with British Europeans. The ethnic differentials in women were largely explained by midlife truncal obesity and insulin resistance, but the study was unable to explain the ethnic difference in men. The study showed that obesity cut-points of 24 and 27 kg/m2 in South Asians and African Caribbeans respectively were equivalent to a body mass index of 30kg/m2 in Europeans in terms of diabetes risk; these latter analyses contributed to recent NICE guidelines for prevention of diabetes. Diabetes was also found to be more ‘toxic’ in terms of stroke risk in the ethnic minorities. Widely used tools (Framingham and QRISK2) for estimation of cardiovascular disease risk were found to be less precise in South Asians and African Caribbeans (particularly women), while a selection of 3 metabolic markers measured by NMR spectroscopy was found to be strongly predictive of cardiovascular risk regardless of ethnicity.
- Lack of adherence to four combined health behaviours was associated with a 2 to 3-fold increased risk of incident CVD in Europeans and South Asians. A substantial population impact in the South Asian group indicates important potential for disease prevention in this high-risk group by adherence to healthy behaviours.
- The study has also found marked ethnic differences in associations between blood pressure parameters and stroke and concluded that undue focus on systolic blood pressure for risk prediction, and current age and treatment thresholds may be inappropriate for individuals of South Asian ancestry.
- The study reported detrimental associations between air pollution (particulate measures) and cardiovascular disease mortality in both the SABRE and NHSD cohorts. The study highlighted ethnic differences in associations between prediabetes in midlife and later development of coronary heart disease and stroke.
- The study has confirmed ongoing excess coronary heart disease incidence in South Asians, with lower incidence in African Caribbeans compared with Europeans and confirmed elevated risk of stroke in both ethnic minority groups. Measured baseline metabolic risk factors could not explain the ethnic group differences. Future work in the cohort will examine whether these ethnic differentials continue into older age and whether newer genetic, epigenetic and metabolomic analyses will add to understanding of the underlying mechanisms. Of particular concern was a much stronger association between diabetes and stroke risk in both ethnic minority groups compared with Europeans with diabetes, an association which is the subject of ongoing study in the SABRE cohort. HES data contributed importantly to the identification of incident coronary and stroke events reported in these analyses.

Expected Benefits:

The rich phenotypic and genotypic dataset gathered over a 25 year period will enable analyses assessing mid-life predictors of health and ill-health in older age and will enable unique analyses of how these associations may be related to ethnicity and migration. Good physical and cognitive functions are vital to healthy ageing and factors which influence these across the life course are poorly understood, particularly in non-European populations. As the cohort is reaching older age, an increase in risk of heart failure, which can be severely debilitating, is expected. Ethnic differentials in heart failure rates are not well studied to date. Increasing length of follow-up and novel analytic techniques, both statistical and relating to stored images and samples bring opportunities for more sophisticated analyses and the addition of hospital admission data to key outcome variables enhances the study’s power to identify events and to further elucidate mechanisms underlying the very marked ethnic differences in cardiometabolic disorders which were observed at visit 2.

Understanding of mechanisms in people of different ethnicities will ultimately lead to appropriate preventive strategies and treatments at different stages of life.

As noted with some detailed examples under ‘specific outputs’, previous use of HES data (1989-2011) enabled improved ascertainment of incident coronary heart disease and stroke events and resulted in 10 publications in high impact journals relating these outcomes to risk factors measured in mid-life (ages 40-70 at baseline). A brief summary of some of these findings in the cohort to 2011 follows:

Diabetes incidence in older British South Asians and African Caribbeans remains at least 2-fold even at age 80 years compared with British Europeans. The ethnic differentials in women were largely explained by midlife truncal obesity and insulin resistance, but the study was unable to explain the ethnic difference in men. The study showed that obesity cut-points of 24 and 27 kg/m2 in South Asians and African Caribbeans respectively were equivalent to a body mass index of 30kg/m2 in Europeans in terms of diabetes risk; these latter analyses contributed to recent NICE guidelines for prevention of diabetes. Diabetes was also found to be more ‘toxic’ in terms of stroke risk in the ethnic minorities. Widely used tools (Framingham and QRISK2) for estimation of cardiovascular disease risk were found to be less precise in South Asians and African Caribbeans (particularly women), while a selection of 3 metabolic markers measured by NMR spectroscopy was found to be strongly predictive of cardiovascular risk regardless of ethnicity.

Lack of adherence to four combined health behaviours was associated with 2 to 3-fold increased risk of incident CVD in Europeans and South Asians. A substantial population impact in the South Asian group indicates important potential for disease prevention in this high-risk group by adherence to healthy behaviours.

The study also found marked ethnic differences in associations between blood pressure parameters and stroke and concluded that undue focus on systolic blood pressure for risk prediction, and current age and treatment thresholds may be inappropriate for individuals of South Asian ancestry. This is not an exhaustive list of study findings in relation to incident cardiometabolic disease but indicates that the study is building a steady accumulation of understanding of ethnic differentials. There is clear need for further study, which this cohort is uniquely able to address. The addition of HES data to 2016 is key to maximising event ascertainment in old age.

Outputs:

Study findings will continue to be published in peer-reviewed scientific journals, predominantly related to epidemiology, cardiovascular and metabolic disorders, cognitive, physical and psychological function, but also including more generic journals such as the BMJ, reflecting the increasing focus on overall health in older age. Publications will contain only aggregate level data without local identifiers and with suppression of small numbers in line with HES analysis guide.

Publications to date are listed on the study website: www.sabrestudy.org. All publications since 2008 are open-access. The audience is expected to consist mainly of academic researchers and clinicians.

Two examples of previous SABRE study related publications are listed below. Both sets of analyses were importantly informed by data from a previous HES extract (no longer retained), and both were published in high-impact factor peer-reviewed journals. These generated considerable media interest and are widely cited.

Tillin T, Hughes AD, Mayet J, Whincup P, Sattar N, Forouhi NG, McKeigue PM, Chaturvedi N. The relationship between metabolic risk factors and incident cardiovascular disease in Europeans, South Asians and African Caribbeans. SABRE (Southall and Brent revisited) – a prospective population based study. J Am Coll Cardiol. 2013 Apr 30;61(17):1777-86. http://dx.doi.org/10.1016/j.jacc.2012.12.046. This paper, published in JACC, the world no. 1 cardiology journal (impact factor 16.5), confirmed ongoing excess coronary heart disease incidence in South Asians, with lower incidence in African Caribbeans compared with Europeans and confirmed elevated risk of stroke in both ethnic minority groups. Measured baseline metabolic risk factors could not explain the ethnic group differences. Future work in the cohort will examine whether these ethnic differentials continue into older age and whether newer genetic, epigenetic and metabolomic analyses will add to understanding of the underlying mechanisms. Of particular concern was a much stronger association between diabetes and stroke risk in both ethnic minority groups compared with Europeans with diabetes, an association which is the subject of ongoing study in the SABRE cohort. HES data, although not directly reported in the manuscript, contributed importantly to the identification of incident coronary and stroke events reported in these analyses.

Tillin T, Hughes AD, Godsland IF, Whincup P, Forouhi NG, Welsh P, Sattar N, McKeigue PM, Chaturvedi N. Insulin resistance and truncal obesity as important determinants of the greater incidence of diabetes in Indian Asians and African Caribbeans compared to Europeans? The Southall And Brent REvisited (SABRE) cohort. Diabetes Care 2013;36(2)(383-393). http://care.diabetesjournals.org/content/36/2/383.long. This paper demonstrated the extraordinarily high risk of incident diabetes continuing into old age in South Asians and African Caribbeans in comparison with Europeans. Metabolic pathways leading to diabetes remain poorly understood. The study found that baseline insulin resistance and truncal obesity could explain the ethnic differences in women but not in men. Further work continues to determine the reasons for the excess risk in men and to understand what underlies insulin resistance and truncal obesity. HES data, although not directly reported in this manuscript, supported these analyses by enabling sensitivity analyses to assess the effects of bias due to loss to follow-up.

A further 8 journal publications have examined associations between baseline risk factors and incident coronary heart disease or stroke where the outcomes were a composite of first events identified through participant reported events, primary care record review identified events and HES identified hospital admissions. One of these was published in Circulation (Wurtz et al), impact factor 14.3 and demonstrated in 3 separate population based studies (including SABRE) that metabolite profiling in large prospective cohorts identified phenylalanine, monounsaturated fatty acids, and polyunsaturated fatty acids as biomarkers for cardiovascular risk, substantiating the value of high-throughput metabolomics for biomarker discovery and improved risk assessment. Another publication in Heart (Tillin et al) identified that 2 widely used cardiovascular risk prediction tools (QRISK2 and Framingham) did not perform consistently well in all ethnic groups and suggested that further validation of QRISK2 in other multi-ethnic datasets, and better methods for identifying high risk African Caribbeans and South Asian women, are required.

In addition to journal publications, UCL will continue to submit abstracts for presentation at national and international conferences, such as Diabetes UK, the European Association for the Study of Diabetes, the European Society of Cardiology, Artery, and the British Hypertension Society. All data for abstracts/presentations will be at aggregate level with suppression of small numbers in line with HES analysis guide.

The study team will further disseminate findings via participant and GP feedback sessions, newsletters, and the study website. All data for these occasions will be at aggregate level with suppression of small numbers in line with HES analysis guide.

At the end of the current funding period (2018) a report will be submitted to the funders (the British Heart Foundation) summarising findings. This may be published on their website. It will only contain at most aggregate level data, with small numbers suppressed in line with HES Analysis Guide.
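
The small-number suppression referred to throughout the outputs above can be illustrated with a minimal sketch. The threshold and marker used here are hypothetical parameters chosen only for illustration; the text does not quote the actual rules, which are those set out in the HES Analysis Guide.

```python
def suppress_small_numbers(counts, threshold=7, marker="*"):
    """Replace small non-zero counts in an aggregate table with a marker.

    `threshold` and `marker` are illustrative assumptions, not values
    taken from the HES Analysis Guide.
    """
    return {
        group: (marker if 0 < n <= threshold else n)
        for group, n in counts.items()
    }


# Hypothetical aggregate table: events per group
table = {"group_a": 3, "group_b": 0, "group_c": 25}
safe_table = suppress_small_numbers(table)
```

Zero counts are left unchanged in this sketch; whether zeros also need masking would depend on the applicable guidance.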

Processing:

The identifiers of SABRE participants have previously been shared with NHS Digital’s predecessor organisation(s) and NHS Digital has provided regular event notifications including notifications of mortality and cancer registrations (for eligible participants only). The cohort was previously split into two groups: cancer notifiable participants and non-cancer notifiable participants.

The cohort will be reorganised into three groups: participants who gave informed consent (cancer notifiable); cancer notifiable participants covered by section 251 support, and non-cancer notifiable participants covered by section 251 support. To ensure that participants are correctly reorganised into the appropriate groups, UCL will send NHS Digital 3 separate files (one for each respective group) containing participant identifiers.

NHS Digital will then provide reports on a monthly basis while the study is in active follow-up. Notifications will contain no participant identifiers other than unique study Pseudo-IDs. Month and Year of Death will also be included.

NHS Digital will link the respective cohort groups to HES data and will supply to UCL encrypted files containing hospital admissions data identified only by study Pseudo-ID and encrypted HESID and containing no other identifiers. The dataset will be placed immediately into UCL’s Data Safe Haven.

Using the Pseudo-ID, the data is linked at record level to the existing dataset of mortality, clinical measures, primary care record review and participant responses to health and lifestyle questionnaires across the course of the study. The data is stored in an encrypted file within the Data Safe Haven at the Gower Street location. The data can be remotely accessed at the Institute of Cardiovascular Science by accredited SABRE study researchers only – all of whom are substantive employees of UCL. Access must be approved by the Data Manager.
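
As a rough illustration of the record-level linkage described above, the sketch below joins hospital episodes to existing study records on the Pseudo-ID alone. The field names (`pseudo_id`, `hes_episodes`) and the in-memory dictionaries are assumptions made for the example; the text does not describe the study's actual file formats or tooling.

```python
def link_by_pseudo_id(study_records, hes_episodes):
    """Attach each participant's HES episodes to their study record.

    Linkage uses only the Pseudo-ID; no other identifier is consulted.
    """
    episodes_by_id = {}
    for episode in hes_episodes:
        episodes_by_id.setdefault(episode["pseudo_id"], []).append(episode)

    linked = []
    for record in study_records:
        # Participants with no admissions get an empty episode list
        matched = episodes_by_id.get(record["pseudo_id"], [])
        linked.append({**record, "hes_episodes": matched})
    return linked


# Hypothetical example data
study = [{"pseudo_id": "P001", "sbp": 142}, {"pseudo_id": "P002", "sbp": 128}]
hes = [
    {"pseudo_id": "P001", "admission_year": 2009},
    {"pseudo_id": "P001", "admission_year": 2014},
]
linked = link_by_pseudo_id(study, hes)
```

Keeping the join key to the Pseudo-ID mirrors the separation between analysis data and participant identifiers that the text describes.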

The data supplied by NHS Digital will not be downloaded or otherwise transferred from the Data Safe Haven. Data including variables derived from the NHS Digital data may be downloaded from the Data Safe Haven and stored on a UCL server at the Institute of Cardiovascular Science to be used solely for the purposes of statistical analyses in accordance with the study objectives. Such variables include, for example, date of first admission related to a diagnosis of coronary heart disease but will not include any part of the dataset supplied by NHS Digital. Using this pseudonymised dataset, study analysts will examine associations between risk factors measured during the course of the study and cardiometabolic events. The rich phenotypic and genotypic dataset will enable identification of ethnic differences in cardiometabolic disease risk and physical, mental and cognitive function into older age and it will be possible to identify which measured risk factors may explain ethnic differentials and at which period of life they may act most strongly.

To meet study objectives UCL requires information on admissions where diagnostic code lists include coronary heart disease, stroke, heart failure, diabetes, renal failure, dementia, retinopathy, hypertension and other cardiovascular disease. Respiratory diseases will also be studied, and mental health disorders and other common disorders considered to exert important influences on function and well-being in older age may be added. As an example, from the HES extract, and within the UCL Data Safe Haven, it is expected that a variable will be generated which identifies a first or subsequent admission with coronary heart disease (ICD-9 codes 410 through 415 or ICD-10 codes I200 through I259, or any of the following operation codes from the Office of Population Censuses and Surveys (OPCS) classification of interventions and procedures: K401 through K469, K491 through K504, K751 through K759, or U541 (coronary revascularization interventions or rehabilitation for ischemic heart disease)). Date of first or subsequent event would be summarised as year of event.
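
The derived coronary heart disease variable described above could be sketched as follows. The code ranges are those quoted in the text; everything else is an assumption for illustration, including the episode layout (`admission_date`, `diagnoses`, `procedures`), the treatment of codes starting with a digit as ICD-9, and matching on the first three (ICD-9) or four (ICD-10, OPCS) characters with any dot removed.

```python
from datetime import date

# OPCS operation code ranges quoted in the text
OPCS_RANGES = [("K401", "K469"), ("K491", "K504"), ("K751", "K759"), ("U541", "U541")]


def _normalise(code):
    return code.replace(".", "").strip().upper()


def is_chd_diagnosis(code):
    """True for ICD-9 410-415 or ICD-10 I200-I259 (ranges from the text)."""
    code = _normalise(code)
    if not code:
        return False
    if code[0].isdigit():  # assumption: numeric codes are ICD-9
        return len(code) >= 3 and code[:3].isdigit() and 410 <= int(code[:3]) <= 415
    return len(code) >= 4 and "I200" <= code[:4] <= "I259"


def is_chd_procedure(code):
    """True if the code falls in one of the quoted OPCS ranges."""
    code = _normalise(code)
    return len(code) >= 4 and any(lo <= code[:4] <= hi for lo, hi in OPCS_RANGES)


def first_chd_year(episodes):
    """Year of the earliest admission with a CHD diagnosis or procedure, else None."""
    years = [
        ep["admission_date"].year
        for ep in episodes
        if any(is_chd_diagnosis(c) for c in ep["diagnoses"])
        or any(is_chd_procedure(c) for c in ep["procedures"])
    ]
    return min(years) if years else None


# Hypothetical episodes for one Pseudo-ID
episodes = [
    {"admission_date": date(2009, 5, 2), "diagnoses": ["I639"], "procedures": []},
    {"admission_date": date(2012, 1, 15), "diagnoses": ["I21.9"], "procedures": []},
    {"admission_date": date(2015, 8, 30), "diagnoses": ["I10X"], "procedures": ["K401"]},
]
chd_year = first_chd_year(episodes)
```

Returning only the year follows the statement that the date of the event would be summarised as year of event.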

The data is stored separately to participant identifiers. The two datasets will not be re-linked and the data will remain pseudonymised as described above. Month and Year of Death are stored in the dataset and used for statistical analyses but the dataset does not include full Date of Death. Participant identifiers are retained separately solely for study administration purposes.

MR472 - SABRE: Southall and Brent Revisited - S251 participants — DARS-NIC-91374-Z5V6Y

Opt outs honoured: Yes - patient objections upheld, Yes (Excuses: Section 251 NHS Act 2006)

Legal basis: Approved researcher accreditation under section 39(4)(i) and 39(5) of the Statistics and Registration Service Act 2007, Health and Social Care Act 2012, Section 251 approval is in place for the flow of identifiable data, National Health Service Act 2006 - s251 - 'Control of patient information', Health and Social Care Act 2012 – s261(7), Health and Social Care Act 2012 s261(1) and s261(2)(b)(ii), Health and Social Care Act 2012 s261(2)(b)(ii), Health and Social Care Act 2012 - s261(5)(d)

Purposes: No (Academic)

Sensitive: Sensitive, and Non-Sensitive

When:DSA runs 2019-04 – 2022-03 2017.09 — 2025.10. SMLS reported a DPA serious incident; breached contract — audit report.

Access method: Ongoing, One-Off

Data-controller type: UNIVERSITY COLLEGE LONDON (UCL)

Sublicensing allowed: No

AGD/predecessor discussions: igard-minutes-9th-april-2020-final.pdf, IGARD_Minutes_06.04.17.pdf

Datasets:

MRIS - Cause of Death Report
MRIS - Cohort Event Notification Report
Hospital Episode Statistics Admitted Patient Care
MRIS - Members and Postings Report
Civil Registration - Deaths
Demographics
Cancer Registration Data
MRIS - Flagging Current Status Report
MRIS - List Cleaning Report
Hospital Episode Statistics Admitted Patient Care (HES APC)
Civil Registrations of Death

Type of data: Anonymised - ICO Code Compliant, Identifiable

Objectives:

University College London (UCL) requires notifications of mortality and cancer registrations and linked HES data for its study cohort for use in the Medical Research Project: SABRE (Southall And Brent Revisited). This is a population-based cohort study, conducted at University College London, funded by the British Heart Foundation in its current 25 year follow-up phase. It is unique as a long-standing tri-ethnic cohort consisting of people of European descent and first generation migrants of South Asian or African Caribbean descent. This is an academic research study focusing on identifying and understanding the underlying reasons for ethnic group and sex differences in cardiometabolic disease and in physical, psychological and cognitive function in older age.

Specific questions for the 25 year follow-up study are:
1. How large are ethnic /sex differences in cardiac function, cognitive function and hippocampal volumes in older age?
2. To what extent do cardiac function, cognitive function and hippocampal volumes change over a 5 year period in each ethnic group?
3. Which risk factors measured in mid-life and in early old age are most strongly associated with current cardiac and cognitive function and hippocampal volumes and with 5 year changes in these parameters? Can these risk factors explain ethnic differences in cardiac and cognitive function?
4. How large are gender differences in current disorders of cardiac and cognitive function and in their associations with current risk factors?
5. Do ethnic differences in incident cardiometabolic disorders persist into older age?
6. Which risk factors or risk factor profiles measured in mid-life and early old age are most strongly associated with incident cardiometabolic disorders and which best explain ethnic differences in incidence?

The study receives ongoing notifications of mortality and cancer registrations from NHS Digital. Continuing supply of this data is required in order to meet study objectives. Death and cause of death are key outcomes for the research objectives and cancer registrations are also key to understanding ethnic disparities in development and survival from the most frequent types of cancer and how these impact upon function in older age.

The study has previously utilised the List Cleaning service from time to time when in active follow-up in order to ensure that the correct participant addresses are used in order to contact participants. Use of this service has helped the study to avoid trying to contact deceased participants. The List Cleaning outputs were used to update the administration database (held separately from other data within the UCL data safe haven) so that UCL could write to as many participants as possible inviting them to complete questionnaires or come into the UCL clinic for a detailed investigation. Under this Data Sharing Agreement, UCL may retain List Cleaning outputs received previously but is not permitted to make further use of the List Cleaning service.

Linked HES data is required to identify incident cardiometabolic events (in particular coronary heart disease, heart failure, stroke, dementia, diabetes), and other events which may affect physical and cognitive function, which have occurred during the follow-up period. Details of all hospital episodes involving the cohort (not limited to the previously stated conditions) are required to address key study objectives with regard to physical and cognitive function in older age in association with current and mid-life risk factors. Analysis needs to consider any and all potential contributing factors.

These events will supplement information provided by participant self-report at 20 and 25 years, from primary care medical record review conducted during the 20 year follow-up and from cancer and mortality flagging, together with detailed clinical measurements made at the SABRE clinics at baseline, 20 and 25 year follow-up. The SABRE cohort is increasingly elderly (median age of survivors in 2016=77 years, range 65-98). At visit 3, although many are willing and able to visit UCL’s clinic and/or to complete questionnaires, many who attended the last follow-up 5 years ago are now too frail or unwell to attend the 25 year follow-up clinic or to complete the health and lifestyle questionnaires, and sadly many have died (approximately 1,500 (31%)). Diagnosis of disease events/states identified during admission to hospital is increasingly important in assessing health in this elderly cohort and will inform all key event outcomes. This is particularly important in assessing health in those otherwise lost to direct follow-up. The data will be used to analyse risk factors measured in mid- and later life in association with these incident events in order to build on current understanding of causal mechanisms.

Data from 1989 to the present is required because participants underwent detailed examinations at baseline (1989-91) and the aim is to follow this cohort through their experiences since to understand what happened in later life and relate that to the baseline. This will enable UCL to gain as complete as possible a picture of hospital admissions, and hence incident events, over the entire cohort follow-up. Data from the entire study period are crucial for determining age of onset of events, as well as the extent and nature of ill-health from mid to later life, and for relating these to current and mid-life cardiometabolic and other risk factors and how these influence the key study outcomes of physical and cognitive function in older life in each of the three ethnic groups.

Yielded Benefits:

The research has also enabled improved ascertainment of incident coronary heart disease and stroke events and resulted in 10 publications in high impact journals relating these outcomes to risk factors measured in mid-life (ages 40-70 at baseline). Findings include:
- Diabetes incidence in older British South Asians and African Caribbeans remains at least 2-fold even at age 80 years compared with British Europeans. The ethnic differentials in women were largely explained by midlife truncal obesity and insulin resistance, but the study was unable to explain the ethnic difference in men. The study showed that obesity cut-points of 24 and 27 kg/m2 in South Asians and African Caribbeans respectively were equivalent to a body mass index of 30kg/m2 in Europeans in terms of diabetes risk; these latter analyses contributed to recent NICE guidelines for prevention of diabetes. Diabetes was also found to be more ‘toxic’ in terms of stroke risk in the ethnic minorities. Widely used tools (Framingham and QRISK2) for estimation of cardiovascular disease risk were found to be less precise in South Asians and African Caribbeans (particularly women), while a selection of 3 metabolic markers measured by NMR spectroscopy was found to be strongly predictive of cardiovascular risk regardless of ethnicity.
- Lack of adherence to four combined health behaviours was associated with a 2 to 3-fold increased risk of incident CVD in Europeans and South Asians. A substantial population impact in the South Asian group indicates important potential for disease prevention in this high-risk group by adherence to healthy behaviours.
- The study has also found marked ethnic differences in associations between blood pressure parameters and stroke and concluded that undue focus on systolic blood pressure for risk prediction, and current age and treatment thresholds may be inappropriate for individuals of South Asian ancestry.
- The study reported detrimental associations between air pollution (particulate measures) and cardiovascular disease mortality in both the SABRE and NHSD cohorts. The study highlighted ethnic differences in associations between prediabetes in midlife and later development of coronary heart disease and stroke.
- The study has confirmed ongoing excess coronary heart disease incidence in South Asians, with lower incidence in African Caribbeans compared with Europeans and confirmed elevated risk of stroke in both ethnic minority groups. Measured baseline metabolic risk factors could not explain the ethnic group differences. Future work in the cohort will examine whether these ethnic differentials continue into older age and whether newer genetic, epigenetic and metabolomic analyses will add to understanding of the underlying mechanisms. Of particular concern was a much stronger association between diabetes and stroke risk in both ethnic minority groups compared with Europeans with diabetes, an association which is the subject of ongoing study in the SABRE cohort. HES data contributed importantly to the identification of incident coronary and stroke events reported in these analyses.

Expected Benefits:

The rich phenotypic and genotypic dataset gathered over a 25 year period will enable analyses assessing mid-life predictors of health and ill-health in older age and will enable unique analyses of how these associations may be related to ethnicity and migration. Good physical and cognitive functions are vital to healthy ageing and factors which influence these across the life course are poorly understood, particularly in non-European populations. As the cohort is reaching older age, an increase in risk of heart failure, which can be severely debilitating, is expected. Ethnic differentials in heart failure rates are not well studied to date. Increasing length of follow-up and novel analytic techniques, both statistical and relating to stored images and samples bring opportunities for more sophisticated analyses and the addition of hospital admission data to key outcome variables enhances the study’s power to identify events and to further elucidate mechanisms underlying the very marked ethnic differences in cardiometabolic disorders which were observed at visit 2.

Understanding of mechanisms in people of different ethnicities will ultimately lead to appropriate preventive strategies and treatments at different stages of life.

As noted with some detailed examples under ‘specific outputs’, previous use of HES data (1989-2011) enabled improved ascertainment of incident coronary heart disease and stroke events and resulted in 10 publications in high impact journals relating these outcomes to risk factors measured in mid-life (ages 40-70 at baseline). A brief summary of some of these findings in the cohort to 2011 follows:

Diabetes incidence in older British South Asians and African Caribbeans remains at least 2-fold even at age 80 years compared with British Europeans. The ethnic differentials in women were largely explained by midlife truncal obesity and insulin resistance, but the study was unable to explain the ethnic difference in men. The study showed that obesity cut-points of 24 and 27 kg/m2 in South Asians and African Caribbeans respectively were equivalent to a body mass index of 30kg/m2 in Europeans in terms of diabetes risk; these latter analyses contributed to recent NICE guidelines for prevention of diabetes. Diabetes was also found to be more ‘toxic’ in terms of stroke risk in the ethnic minorities. Widely used tools (Framingham and QRISK2) for estimation of cardiovascular disease risk were found to be less precise in South Asians and African Caribbeans (particularly women), while a selection of 3 metabolic markers measured by NMR spectroscopy was found to be strongly predictive of cardiovascular risk regardless of ethnicity.

Lack of adherence to four combined health behaviours was associated with 2 to 3-fold increased risk of incident CVD in Europeans and South Asians. A substantial population impact in the South Asian group indicates important potential for disease prevention in this high-risk group by adherence to healthy behaviours.

The study also found marked ethnic differences in associations between blood pressure parameters and stroke and concluded that undue focus on systolic blood pressure for risk prediction, and current age and treatment thresholds may be inappropriate for individuals of South Asian ancestry. This is not an exhaustive list of study findings in relation to incident cardiometabolic disease but indicates that the study is building a steady accumulation of understanding of ethnic differentials. There is clear need for further study, which this cohort is uniquely able to address. The addition of HES data to 2016 is key to maximising event ascertainment in old age.

Outputs:

Study findings will continue to be published in peer-reviewed scientific journals, predominantly related to epidemiology, cardiovascular and metabolic disorders, cognitive, physical and psychological function, but also including more generic journals such as the BMJ, reflecting the increasing focus on overall health in older age. Publications will contain only aggregate level data without local identifiers and with suppression of small numbers in line with HES analysis guide.

Publications to date are listed on the study website: www.sabrestudy.org. All publications since 2008 are open-access. The audience is expected to consist mainly of academic researchers and clinicians.

Two examples of previous SABRE study related publications are listed below. Both sets of analyses were importantly informed by data from a previous HES extract (no longer retained), and both were published in high-impact factor peer-reviewed journals. These generated considerable media interest and are widely cited.

Tillin T, Hughes AD, Mayet J, Whincup P, Sattar N, Forouhi NG, McKeigue PM, Chaturvedi N. The relationship between metabolic risk factors and incident cardiovascular disease in Europeans, South Asians and African Caribbeans. SABRE (Southall and Brent revisited) – a prospective population based study. J Am Coll Cardiol. 2013 Apr 30;61(17):1777-86. http://dx.doi.org/10.1016/j.jacc.2012.12.046. This paper, published in JACC, the world no. 1 cardiology journal (impact factor 16.5), confirmed ongoing excess coronary heart disease incidence in South Asians, with lower incidence in African Caribbeans compared with Europeans and confirmed elevated risk of stroke in both ethnic minority groups. Measured baseline metabolic risk factors could not explain the ethnic group differences. Future work in the cohort will examine whether these ethnic differentials continue into older age and whether newer genetic, epigenetic and metabolomic analyses will add to understanding of the underlying mechanisms. Of particular concern was a much stronger association between diabetes and stroke risk in both ethnic minority groups compared with Europeans with diabetes, an association which is the subject of ongoing study in the SABRE cohort. HES data, although not directly reported in the manuscript, contributed importantly to the identification of incident coronary and stroke events reported in these analyses.

Tillin T, Hughes AD, Godsland IF, Whincup P, Forouhi NG, Welsh P, Sattar N, McKeigue PM, Chaturvedi N. Insulin resistance and truncal obesity as important determinants of the greater incidence of diabetes in Indian Asians and African Caribbeans compared to Europeans? The Southall And Brent REvisited (SABRE) cohort. Diabetes Care 2013;36(2)(383-393). http://care.diabetesjournals.org/content/36/2/383.long. This paper demonstrated the extraordinarily high risk of incident diabetes continuing into old age in South Asians and African Caribbeans in comparison with Europeans. Metabolic pathways leading to diabetes remain poorly understood. The study found that baseline insulin resistance and truncal obesity could explain the ethnic differences in women but not in men. Further work continues to determine the reasons for the excess risk in men and to understand what underlies insulin resistance and truncal obesity. HES data, although not directly reported in this manuscript, supported these analyses by enabling sensitivity analyses to assess the effects of bias due to loss to follow-up.

A further 8 journal publications have examined associations between baseline risk factors and incident coronary heart disease or stroke where the outcomes were a composite of first events identified through participant reported events, primary care record review identified events and HES identified hospital admissions. One of these was published in Circulation (Wurtz et al), impact factor 14.3 and demonstrated in 3 separate population based studies (including SABRE) that metabolite profiling in large prospective cohorts identified phenylalanine, monounsaturated fatty acids, and polyunsaturated fatty acids as biomarkers for cardiovascular risk, substantiating the value of high-throughput metabolomics for biomarker discovery and improved risk assessment. Another publication in Heart (Tillin et al) identified that 2 widely used cardiovascular risk prediction tools (QRISK2 and Framingham) did not perform consistently well in all ethnic groups and suggested that further validation of QRISK2 in other multi-ethnic datasets, and better methods for identifying high risk African Caribbeans and South Asian women, are required.

In addition to journal publications, UCL will continue to submit abstracts for presentation at national and international conferences, such as Diabetes UK, the European Association for the Study of Diabetes, the European Society of Cardiology, Artery, and the British Hypertension Society. All data for abstracts/presentations will be at aggregate level with suppression of small numbers in line with HES analysis guide.

The study team will further disseminate findings via participant and GP feedback sessions, newsletters, and the study website. All data for these occasions will be at aggregate level with suppression of small numbers in line with HES analysis guide.

At the end of the current funding period (2018) a report will be submitted to the funders (the British Heart Foundation) summarising findings. This may be published on their website. It will only contain at most aggregate level data, with small numbers suppressed in line with HES Analysis Guide.

Processing:

The identifiers of SABRE participants have previously been shared with NHS Digital’s predecessor organisation(s) and NHS Digital has provided regular event notifications including notifications of mortality and cancer registrations. The cohort was previously split into two groups: cancer notifiable participants and non-cancer notifiable participants.

The cohort will be reorganised into three groups: participants who gave informed consent (cancer notifiable); cancer notifiable participants covered by section 251 support, and non-cancer notifiable participants covered by section 251 support. To ensure that participants are correctly reorganised into the appropriate groups, UCL will send NHS Digital 3 separate files (one for each respective group) containing participant identifiers.

NHS Digital will then provide reports on a monthly basis while the study is in active follow-up. Notifications will contain no participant identifiers other than unique study Pseudo-IDs. Month and Year of Death will also be included.

NHS Digital will link the respective cohort groups to HES data and will supply to UCL encrypted files containing hospital admissions data identified only by study Pseudo-ID and encrypted HESID and containing no other identifiers. The dataset will be placed immediately into UCL’s Data Safe Haven.

Using the Pseudo-ID, the data is linked at record level to the existing dataset of mortality and cancer records, clinical measures, primary care record review and participant responses to health and lifestyle questionnaires across the course of the study. The data is stored in an encrypted file within the Data Safe Haven at the Gower Street location. The data can be remotely accessed at the Institute of Cardiovascular Science by accredited SABRE study researchers only – all of whom are substantive employees of UCL. Access must be approved by the Data Manager.

The data supplied by NHS Digital will not be downloaded or otherwise transferred from the Data Safe Haven. Data including variables derived from the NHS Digital data may be downloaded from the Data Safe Haven and stored on a UCL server at the Institute of Cardiovascular Science to be used solely for the purposes of statistical analyses in accordance with the study objectives. Such variables include, for example, date of first admission related to a diagnosis of coronary heart disease but will not include any part of the dataset supplied by NHS Digital. Using this pseudonymised dataset, study analysts will examine associations between risk factors measured during the course of the study and cardiometabolic events. The rich phenotypic and genotypic dataset will enable identification of ethnic differences in cardiometabolic disease risk and physical, mental and cognitive function into older age and it will be possible to identify which measured risk factors may explain ethnic differentials and at which period of life they may act most strongly.

To meet study objectives UCL require information on admissions where diagnostic code lists include coronary heart disease, stroke, heart failure, diabetes, renal failure, dementia, retinopathy, hypertension and other cardiovascular disease. Respiratory diseases will also be studied, and mental health disorders and other common disorders which are considered to exert important influences on function and well-being in older age may be added. As an example, from the HES extract, and within the UCL Data Safe Haven, it is expected that a variable will be generated which identifies a first or subsequent admission with coronary heart disease (ICD-9 codes 410 through 415 or ICD-10 codes I200 through I259, or any of the following operation codes from the Office of Population Censuses and Surveys (OPCS) Classification of Interventions and Procedures: K401 through K469, K491 through K504, K751 through K759, or U541 (coronary revascularization interventions or rehabilitation for ischemic heart disease)). Date of first or subsequent event would be summarised as year of event.
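
The derived variable described above can be sketched as follows. This is a minimal illustration only, not the study team's code: the record layout (`pseudo_id`, `admission_date`, `diagnoses`, `procedures`) and the function names are assumptions, and the code ranges are copied from the paragraph above. It flags any admission carrying a qualifying diagnosis or procedure code and keeps only the year of the first such admission per Pseudo-ID.

```python
import re
from datetime import date

# OPCS procedure ranges quoted in the text (K401-K469, K491-K504, K751-K759).
OPCS_K_RANGES = [(401, 469), (491, 504), (751, 759)]


def normalise(code: str) -> str:
    """Strip dots and whitespace so 'I21.9' and 'I219' compare equal."""
    return code.replace(".", "").strip().upper()


def is_chd_diagnosis(code: str) -> bool:
    c = normalise(code)
    # ICD-10 I200 through I259: the letter I followed by 20..25.
    if re.match(r"I2[0-5]", c):
        return True
    # ICD-9 410 through 415: purely numeric codes whose first three digits are in range.
    if re.fullmatch(r"\d{3,5}", c):
        return 410 <= int(c[:3]) <= 415
    return False


def is_chd_procedure(code: str) -> bool:
    c = normalise(code)
    if c == "U541":
        return True
    if re.fullmatch(r"K\d{3}", c):
        n = int(c[1:])
        return any(lo <= n <= hi for lo, hi in OPCS_K_RANGES)
    return False


def first_chd_year(episodes):
    """Return {pseudo_id: year of first CHD admission}; non-cases are omitted."""
    first = {}
    for ep in episodes:
        hit = any(is_chd_diagnosis(d) for d in ep["diagnoses"]) or any(
            is_chd_procedure(p) for p in ep["procedures"]
        )
        if hit:
            pid, year = ep["pseudo_id"], ep["admission_date"].year
            first[pid] = min(year, first.get(pid, year))
    return first
```

Only the year leaves the function, which matches the statement that the event date would be summarised as year of event; the full admission date and the codes themselves stay with the source records.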

The data is stored separately to participant identifiers. The two datasets will not be re-linked and the data will remain pseudonymised as described above. Month and Year of Death are stored in the dataset and used for statistical analyses but the dataset does not include full Date of Death. Participant identifiers are retained separately solely for study administration purposes.

Whitehall II (MR262) — DARS-NIC-346693-F2X1G

Opt outs honoured: Y, N, Yes - patient objections upheld, No - data flow is not identifiable, Yes, No (Excuses: Section 251 NHS Act 2006)

Legal basis: Section 251 approval is in place for the flow of identifiable data, Approved researcher accreditation under section 39(4)(i) and 39(5) of the Statistics and Registration Service Act 2007, Health and Social Care Act 2012 - s261(7), Health and Social Care Act 2012 - s261(5)(d); National Health Service Act 2006 - s251 - 'Control of patient information'.

Purposes: No (Academic)

Sensitive: Non Sensitive, and Sensitive, and Non-Sensitive

When:DSA runs 2018-06 – 2021-06 2017.09 — 2025.10. SMLS reported a DPA serious incident; breached contract — audit report.

Access method: One-Off, Ongoing

Data-controller type: UNIVERSITY COLLEGE LONDON (UCL)

Sublicensing allowed: No

AGD/predecessor discussions: AGD minutes - 15 June 2023 final.pdf, IGARD_Minutes_06.07.17.pdf, DAAG_Minutes_15.09.15.pdf, DAAG_Minutes_06.10.15.pdf, IGARD_Minutes_03.08.17.pdf

Datasets:

Mental Health and Learning Disabilities Data Set
Bridge file: Hospital Episode Statistics to Diagnostic Imaging Dataset
Diagnostic Imaging Dataset
Hospital Episode Statistics Admitted Patient Care
MRIS - Cause of Death Report
MRIS - Cohort Event Notification Report
MRIS - Scottish NHS / Registration
Hospital Episode Statistics Accident and Emergency
Mental Health Services Data Set
Civil Registration - Deaths
Demographics
Cancer Registration Data
Hospital Episode Statistics Outpatients
Mental Health Minimum Data Set
MRIS - Members and Postings Report
Diagnostic Imaging Data Set (DID)
Hospital Episode Statistics Accident and Emergency (HES A and E)
Hospital Episode Statistics Admitted Patient Care (HES APC)
Hospital Episode Statistics Outpatients (HES OP)
Mental Health and Learning Disabilities Data Set (MHLDDS)
Mental Health Minimum Data Set (MHMDS)
Mental Health Services Data Set (MHSDS)
Civil Registrations of Death
COVID-19 SGSS First Positives (Second Generation Surveillance System)
COVID-19 Vaccination Status
Emergency Care Data Set (ECDS)
HES-ID to MPS-ID HES Accident and Emergency
HES-ID to MPS-ID HES Admitted Patient Care

Type of data: Anonymised - ICO Code Compliant, Identifiable

Objectives:

The Whitehall II study was set up in 1985 as a prospective cohort project to explore the relationship between socio-economic status, stress and cardiovascular disease. The study, based at University College London (UCL), recruited civil servants working in London. The participants were sent a self-completion questionnaire covering a wide range of topics, and underwent a comprehensive clinical examination.

Since 1985 there have been eleven phases of data collection of a similar nature. These data have always been collected on the original cohort recruited in 1985, and no additional recruitment of participants has taken place since then. In addition to cardiovascular measures, the Whitehall II study has over the years added further measures to test physical functioning, cognitive functioning, mental health, measures of cortisol levels and new cardiovascular tests such as Heart Rate Variability (HRV) and Pulse Wave Velocity (PWV).

There are three distinct aspects to the study:
1) The compilation of research data, which consists of the collection of self-completion questionnaires and medical examination data from the Whitehall II cohort participants. Medical data and mortality data from this cohort are also obtained through data linkage with external data sources such as NHS Digital and ONS. The totality of these data are compiled into the Whitehall II research database for use as a research resource;

2) Public health research studies undertaken within the scope of the Whitehall II study, which aim to answer specific questions and are primarily funded by grants from the Medical Research Council and the British Heart Foundation. Further studies have been funded by the European Commission Horizon 2020 and the Economic and Social Research Council. No raw or record level NHS Digital data is shared with funders.

3) Making pseudonymised data available to the scientific community for use in UCL-approved research studies beyond the scope of the Whitehall II study. This will not include NHS Digital data. Any data supplied to third parties, whether as part of the EU-funded LIFEPATH project or for any other purpose, will comprise:
a. Self-reported data provided voluntarily by the participants; and/or
b. Variables derived from the ONS mortality data. Specifically ‘yes’ or ‘no’ indicators to indicate if the participant is deceased and, if so, if specific causes of death were applicable or not; and/or
c. Verified self-reported clinical events in the form of ‘yes’ or ‘no’ variables to indicate if the participant has had a specific clinical event such as a stroke, cancer or CHD episode.

Regarding the last item, HES, Mental Health and Cancer registration data are used solely for the purpose of verifying self-reported data and are not included in any datasets shared with third parties. As an example, if a participant self-reported a stroke, the applicant would cross-check the data with the HES data to verify the diagnosis. If verified, the research data that could potentially be made available to third parties would include a ‘yes’/‘no’ indicator confirming the self-reported stroke.

The data will be used for public health research purposes. Based on 30 years of follow-up, the aim is to examine the interrelationships between biological, psychosocial and behavioural factors in the ageing process, and identify key determinants of late life depression, cognitive decline, chronic disease, and physical functioning. The study’s healthy ageing cohort is an ideal platform for studying primary prevention of vascular disease (CHD, stroke) and diabetes. The cohort is now aged 62-84 years and is measured for age-related physical and cognitive functioning and mental health. The study contributes to the evidence on the potential for preventing functional decline through therapeutic risk factor reduction and behavioural interventions.

Self-reported clinical events data are open to major limitations of bias, including missing responses and attrition. Therefore, since 1997, UCL have supplemented the self-reported events with information extracted from GP and paper hospital notes, and also with data provided by NHS Digital.

The Whitehall II MRC grant (K013351, Adult Determinants of Late Life Depression, Cognitive Decline and Physical Functioning - The Whitehall II Study of Ageing) has dementia, disability and depression as the outcome variables. In order to be able to study these outcomes in older individuals it is crucial to have complete data from all possible sources. Data on psychiatric conditions are important outcomes in their own right, but they are also needed to study other conditions. For example, the diagnosis of dementia involves ruling out major psychiatric disorder as an underlying condition for the observed clinical phenotype. In order to do so, external data on psychiatric conditions are required. This cannot be achieved without access to the Mental Health and Learning Disabilities Data Set (MHLDDS). In addition, information on clinical procedures, such as brain MRI or CT, is important to evaluate the validity of dementia diagnosis and changes in diagnostic testing over time, a potential source of bias that needs to be considered in longitudinal analyses. For this reason, records from the Diagnostic Imaging Dataset are also needed for the Whitehall II dementia project.

Yielded Benefits:

The key benefit to the public/patients is that linkage to English and Welsh records is helping expand the knowledge base to which the Whitehall II study has already contributed. UCL’s analyses will continue to generate evidence to inform public health policies, clinical guidelines, health care professionals and workplaces, and to promote healthier lifestyles in the general public for the benefit of patients and the healthcare system. Evidence from previous benefits: Whitehall II have contributed evidence to current clinical guidelines, such as the ‘European Guidelines on Cardiovascular Prevention in Clinical Practice’ (see Eur Heart J 2012;33:1635-1701 and Eur Heart J 2016;37:2315-2381) and the ‘Guidelines for the Prevention of Stroke in Patients With Stroke and Transient Ischemic Attack: A Guideline for Healthcare Professionals From the American Heart Association/American Stroke Association’ (see Stroke. 2014;45:2160-2236). UCL have used Whitehall II data in their state-of-the-art reviews on prediabetes (Tabak A, Kivimaki M. Lancet 2013; 379:2279-2290) and stress (Steptoe A, Kivimaki M. Nature Reviews Cardiology 2012;9(6):360-70 and Steptoe A, Kivimaki M. Annu Rev Public Health 2013;34:337-54.) to inform health professionals and policy makers in the UK and elsewhere. The Whitehall II study has contributed evidence to the World Health Organization (WHO) policy documents for reducing social inequalities in health globally (Commission of Social Determinants in Health 2008) and the European and the UK reviews of inequalities and working conditions (Review of Social Determinants and the Health Divide in the WHO European Region, updated in 2014 and Fair Society, Healthy Lives, 2010) and developed a guide for evidence-based public health in a project led by the UK National Institute of Clinical Excellence (NICE, Killoran 2009).
In addition, the study has provided evidence to European Union Occupational Safety and Health recommendations and contributed to priority settings in occupational health research at a European level (https://osha.europa.eu/en/tools-and-publications/publications/e-facts/efact18/view; https://osha.europa.eu/en/tools-and-publications/publications/reports/management-psychosocial-risks-esener; https://osha.europa.eu/en/tools-and-publications/publications/reports/summary-priorities-for-osh-research-in-eu-for-2013-20). Furthermore, the Whitehall II study has contributed evidence to the American Heart Association prevention policy, which in turn influences UK policy (American Heart Association Behavior Change Committee of the Council on Epidemiology and Prevention, Council on Lifestyle and Cardiometabolic Health, Council for High Blood Pressure Research, and Council on Cardiovascular and Stroke Nursing. ‘Better population Health through behaviour change in adults: a call to action’ Circulation. 2013 Nov 5;128(19):2169-76). The paper on long working hours and stroke (Lancet. 2015 Oct 31;386(10005):1739-46) was referenced by WHO (Preventing disease through healthy environments: a global assessment of the burden of disease from environmental risks. World Health Organization. http://www.who.int/iris/handle/10665/204585) and received widespread media coverage (ranked 12th in the Altmetric Top 100 worldwide). Two Whitehall papers contributed evidence to NICE guidelines on Dementia (NG16) published in October 2015. The papers were referenced in “Dementia, disability and frailty in later life – mid-life approaches to delay or prevent onset” (Sabia S, Singh‑Manoux A, Hagger‑Johnson G et al. (2012) Influence of individual and combined healthy behaviours on successful aging. Canadian Medical Association Journal doi: 10.1503/cmaj.121080; Singh‑Manoux A, Marmot MG, Glymour M et al. (2011) Does cognitive reserve shape cognitive decline? Annals of Neurology 70: 296–304).
More recently, a Whitehall II paper on alcohol consumption and cognitive decline ‘Moderate alcohol consumption as risk factor for adverse brain outcomes and cognitive decline: longitudinal cohort study’ (BMJ 2017;357:j2353) received widespread media coverage, demonstrating the neurotoxicity of alcohol consumption (the paper was rated in the top 5% of all outputs scored by Altmetric).

Expected Benefits:

UCL’s analyses will continue to generate evidence to inform public health policies, clinical guidelines, health care professionals and workplaces, and to promote healthier lifestyles in the general public for the benefit of patients and the healthcare system.

Evidence from previous benefits:
Whitehall II have contributed evidence to current clinical guidelines, such as the “European Guidelines on Cardiovascular Prevention in Clinical Practice” (see Eur Heart J 2012;33:1635-1701) and the “Guidelines for the Prevention of Stroke in Patients With Stroke and Transient Ischemic Attack: A Guideline for Healthcare Professionals From the American Heart Association/American Stroke Association” (see Stroke. 2014;45:2160-2236). UCL have used Whitehall II data in their state-of-the-art reviews on prediabetes (Tabak A,…Kivimaki M. Lancet 2013; 379:2279–2290) and stress (Steptoe A, Kivimäki M. Nature Reviews Cardiology 2012;9(6):360-70 and Steptoe A, Kivimäki M. Annu Rev Public Health. 2013;34:337-54.) to inform health professionals and policy makers in the UK and elsewhere.

UCL have contributed evidence to the World Health Organization (WHO) policy documents for reducing social inequalities in health globally (Commission of Social Determinants in Health 2008) and the European and the UK reviews of inequalities and working conditions (Review of Social Determinants and the Health Divide in the WHO European Region, updated in 2014 and Fair Society, Healthy Lives, 2010) and developed a guide for evidence-based public health in a project led by the UK National Institute of Clinical Excellence (NICE, Killoran 2009). In addition, UCL have provided evidence to European Union Occupational Safety and Health recommendations and contributed to priority settings in occupational health research at a European level (https://osha.europa.eu/en/tools-and-publications/publications/e-facts/efact18/view; https://osha.europa.eu/en/tools-and-publications/publications/reports/management-psychosocial-risks-esener; https://osha.europa.eu/en/tools-and-publications/publications/reports/summary-priorities-for-osh-research-in-eu-for-2013-20).

In addition, UCL have contributed evidence to the American Heart Association prevention policy, which in turn influences UK policy (American Heart Association Behavior Change Committee of the Council on Epidemiology and Prevention, Council on Lifestyle and Cardiometabolic Health, Council for High Blood Pressure Research, and Council on Cardiovascular and Stroke Nursing. “Better population Health through behaviour change in adults: a call to action”. Circulation. 2013 Nov 5;128(19):2169-76).

Whitehall II findings on modifiable protective factors and risk factors have generated wide interest in media in the UK and worldwide. For example, the paper on alcohol consumption and cognitive ageing was ranked Altmetric Top 100 in the world in 2014 (http://www.altmetric.com/top100/2014/).

Outputs:

All published outputs will be aggregate with small numbers suppressed in line with the HES analysis and mental health guides.

The Whitehall II researchers will use peer-reviewed journals to report the contribution of midlife inflammatory, vascular, and metabolic factors to chronic disease, depression, cognitive impairment and functional health in later life. They will also assess whether the adoption of a healthy lifestyle even at older ages modifies functional trajectories, and aim to develop multi-factorial predictive algorithms, like those developed for cardiovascular diseases, to facilitate early identification of adverse ageing outcomes.
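
As a rough illustration of what small-number suppression of aggregate outputs can look like, the sketch below masks low counts and rounds the rest. The threshold (counts of 1 to 7 masked) and the rounding base (nearest 5) are assumptions chosen for the example and are not taken from this document; the applicable HES analysis and mental health guides determine the actual rules. The function name and table layout are likewise hypothetical.

```python
def suppress_small_numbers(counts, threshold=7, base=5, marker="*"):
    """Mask small counts and round the remainder before publication.

    counts: mapping of category -> non-negative integer count.
    threshold and base are illustrative assumptions, not values from the source.
    """
    published = {}
    for category, n in counts.items():
        if n == 0:
            published[category] = 0  # true zeros are kept as zeros
        elif n <= threshold:
            published[category] = marker  # small counts are masked
        else:
            # integer rounding to the nearest multiple of `base`
            published[category] = base * ((n + base // 2) // base)
    return published


table = {"North": 0, "South": 3, "East": 8, "West": 123}
print(suppress_small_numbers(table))
```

Rounding the unmasked cells as well as masking the small ones reduces the chance that a suppressed value can be recovered by subtraction from a published total.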

The study dissemination plan, which has been very successful up to now (please see examples below), involves publications in high impact scientific journals, scientific meetings, briefing papers for policy makers, regular and ad hoc meetings with interested parties such as the UK Health Forum.

A research dataset will be created for the UCL study researchers named in the Data Sharing Agreement. It will contain all records with all directly identifiable data removed. It will include the study ID but no personal variables. Any sensitive variables that might identify a participant (such as hospital dates or full ICD-10 codes) will never be published, reported or provided to third parties.

The scientific conclusions of the Whitehall II study will be published in international peer-reviewed journals starting from a few months after the data are available. UCL aim to continue publishing the analyses in journals with high coverage and high impact factor and UCL’s preference is journals with an open access option (web version of the paper available free of charge). Some examples of journals where the Whitehall II researchers have published their results in 2015 are PLoS One, American Journal of Medicine, Stroke, European Heart Journal, Neurology, Nature, Lancet, Epidemiology, British Medical Journal, etc.

A full list of the project publications to date is published on the Whitehall II website (https://www.ucl.ac.uk/whitehallII/publications).

Processing:

Only substantive employees of UCL will have access to the data and only for the purposes described in this document.

UCL will process the ONS data in accordance with the standard ONS terms and conditions.

The Whitehall II study at UCL currently holds sensitive and identifiable data from several sources, all linked to the cohort. These are the Personal Demographics Service (PDS), Cancer Registrations, ONS Mortality, HES admitted patient care, HES outpatient, MHLDDS and HES Accident and Emergency.

UCL have already supplied NHS number, date of birth, and gender to NHS Digital for linkage.

Linking with electronic health records is at the core of the project, as they provide the objective health outcomes needed for our project. These data will be used by the researchers using a variety of statistical methods to fulfil the study aims described above.

All personal information about the study participants is treated in the strictest confidence in accordance with the Data Protection Act (1998) and the NHS Information Governance requirements. As described in the study NHS IG Toolkit, the study safeguards and security policies ensure appropriate use of all personal information collected. Personal data about study participants (e.g. name, NHS number, contact details, date of birth, GP details, etc) are stored securely on the UCL secure computer network managed by the UCL School of Life and Medical Sciences (SLMS). These data are handled by the Whitehall II administrative and data management personnel and are used only to contact participants.

Clinical information about participants provided by external sources such as NHS Digital is also stored separately on this secure UCL SLMS area.

Whitehall II researchers do not have direct access to the identifiable records in either paper or electronic form. No identifiable personal data will ever be published.

The clinical, questionnaire and medical data collected by the study (including HES, mental health, diagnostic imaging and ONS data) are used for research purposes only. These data are pseudonymised before being moved from the secure area to the research area on the UCL SLMS network. Pseudonymisation is achieved by assigning each participant a unique identifier and by removing all personal information (e.g. name, NHS number, contact details, date of birth, GP details, etc) before the data are added to the database used by the researchers.

A data sharing policy is in place to make the pseudonymised research data available to the scientific community; this refers to the trial data and not data supplied by NHS Digital.
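
The pseudonymisation step described above can be sketched as follows. This is a hedged illustration rather than the study's actual procedure: the field names, the `study_id` format and the use of a random token are all assumptions. Each participant receives a stable random identifier, the personal fields are dropped, and the lookup table linking identifiers to people is returned separately so that it can stay in the secure area.

```python
import secrets

# Hypothetical names for the personal fields listed in the text.
IDENTIFIER_FIELDS = {"name", "nhs_number", "contact_details", "date_of_birth", "gp_details"}


def pseudonymise(records, key_field="nhs_number"):
    """Return (research_records, lookup).

    research_records carry a study_id and no personal fields;
    lookup maps the original key to the study_id and must not
    leave the secure area.
    """
    lookup = {}
    research_records = []
    for rec in records:
        key = rec[key_field]
        if key not in lookup:
            lookup[key] = "P" + secrets.token_hex(8)  # random, non-derivable ID
        clean = {k: v for k, v in rec.items() if k not in IDENTIFIER_FIELDS}
        clean["study_id"] = lookup[key]
        research_records.append(clean)
    return research_records, lookup
```

A random token is used here instead of a hash of the NHS number because a hash of a short, structured identifier can be reversed by enumeration; that is a design choice for the sketch, not something the source specifies.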

All collaborators must be bona-fide scientists with an established record, who will conduct high quality, ethical research. The research files provided to these external collaborators are tailored to their project and are securely transferred for their use only. Any data supplied to third parties will comprise:
• Self-reported data provided voluntarily by the participants; and/or
• Alive or dead status flag derived from the ONS mortality data. Specifically ‘yes’ or ‘no’ indicators to indicate if the participant is deceased and, if so, if specific causes of death were applicable or not; and/or
• Verified self-reported clinical events in the form of ‘yes’ or ‘no’ variables to indicate if the participant has had a specific clinical event such as a stroke, cancer or CHD episode.

Regarding the last item, HES, Mental Health, DID and Cancer registration data are used solely for the purpose of verifying self-reported medical events and are not included in any datasets shared with third parties. As an example, if a participant self-reported a stroke, the applicant would cross-check the data with the HES data to verify the diagnosis. If verified, the research data that could potentially be made available to third parties would include an indicator confirming the self-reported stroke.
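
The verification logic in the example above can be sketched as a small function. The event names, the input layout and the function name are hypothetical; the only behaviour taken from the text is that linked records are used to confirm a self-report and that nothing but a ‘yes’/‘no’ indicator appears in the shareable output.

```python
EVENTS = ("stroke", "cancer", "chd")  # illustrative event list


def verified_event_flags(self_reported, linked_events):
    """Build shareable yes/no indicators.

    self_reported: {study_id: set of events the participant reported}
    linked_events: {study_id: set of events found in linked records}
    An event is 'yes' only when it was self-reported AND confirmed;
    no dates, codes or other linked-record content is copied across.
    """
    flags = {}
    for study_id, reported in self_reported.items():
        confirmed = linked_events.get(study_id, set())
        flags[study_id] = {
            event: "yes" if event in reported and event in confirmed else "no"
            for event in EVENTS
        }
    return flags
```

Participants who appear only in the linked records are not added to the output, since the indicator is defined as confirmation of a self-report.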

Funding arrangements, both UK and non-UK funding, will not include sharing NHS Digital record level data with these funders or permit them to influence the results or dissemination of results.

Recording Antimicrobial Resistance during Death Certification in England — DARS-NIC-734773-Y3P3J

Opt outs honoured: Yes (Excuses: Section 251 NHS Act 2006)

Legal basis: Health and Social Care Act 2012 - s261(5)(d)

Purposes: No (Academic)

Sensitive: Sensitive, and Non-Sensitive

When:DSA runs 2025-08 – 2026-08 2025.09 — 2025.09. SMLS reported a DPA serious incident; breached contract — audit report.

Access method: One-Off

Data-controller type: UNIVERSITY COLLEGE LONDON (UCL)

Sublicensing allowed: No

AGD/predecessor discussions: AGD minutes - 14th November 2024 final.pdf

Datasets:

Civil Registrations of Death
Hospital Episode Statistics Admitted Patient Care (HES APC)
Hospital Episode Statistics Critical Care (HES Critical Care)

Type of data: Identifiable

Objectives:

University College London (UCL) requires access to NHS England data for the purpose of the following research project:
The AMR-DC study: Antimicrobial Resistance in Death Certification

The following is a summary of the aims of the research project provided by University College London (UCL):
The aim of this project is to calculate the burden of antimicrobial resistance (AMR) on mortality in England. AMR has been declared a top 10 threat to humanity by the World Health Organisation. Precise estimation of the mortality burden of AMR is important, as it raises awareness regarding the magnitude of the problem among relevant stakeholders, including governments and the public. It also allows accurate epidemiological monitoring of drug-resistant infections and correct allocation of resources to address the infections with the highest burden of disease.

Aims
· To describe the burden and trend of AMR-associated mortality in England in the years 2021 through to 2023
· To establish a pathway for public health authorities for calculating AMR-associated deaths in England using routinely collected data

The following NHS England Data will be accessed:
· Hospital Episode Statistics, necessary to calculate the total number of AMR-associated deaths:
o Admitted Patient Care
o Critical Care
· Civil Registration Mortality, necessary to identify the cohort.

The level of the Data will be:
· Identifiable: identifiers are necessary to enable linkage of the data with Second Generation Surveillance System (SGSS) data collected from UK Health Security Agency (UKHSA).

The Data will be minimised as follows:
· Limited to a study cohort identified by NHS England as meeting the following criteria: all patients who had their death registered between 01/01/2021 and 31/12/2023;
· Limited to the following geographic areas: England.
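
The minimisation criteria above amount to a simple filter on death registrations. The sketch below assumes hypothetical field names (`date_of_registration`, `country`) and treats both boundary dates as inclusive, which the source does not state explicitly.

```python
from datetime import date

# Registration window taken from the criteria above; inclusivity is an assumption.
START, END = date(2021, 1, 1), date(2023, 12, 31)


def in_study_cohort(record) -> bool:
    """True when the death was registered in the window and in England."""
    return (
        START <= record["date_of_registration"] <= END
        and record["country"] == "England"
    )


def minimise(records):
    """Keep only the records that satisfy the cohort criteria."""
    return [r for r in records if in_study_cohort(r)]
```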

University College London (Great Ormond Street Institute of Child Health) is the research sponsor and the controller as the organisation responsible for ensuring that the Data will only be processed for the purpose described above.

The lawful basis for processing personal data under the UK GDPR is:
Article 6(1)(e) - processing is necessary for the performance of a task carried out in the public interest or in the exercise of official authority vested in the Controller.

The lawful basis for processing special category data under the UK GDPR is:
Article 9(2)(j) - processing is necessary for archiving purposes in the public interest, scientific or historical research purposes or statistical purposes in accordance with Article 89(1) based on Union or Member State law which shall be proportionate to the aim pursued, respect the essence of the right to data protection and provide for suitable and specific measures to safeguard the fundamental rights and the interests of the data subject.

This processing is in the public interest because it adheres to the UK Policy Framework for Health and Social Care Research, which protects and promotes the interests of patients, service users and the public, and aims to produce generalisable and publicly available information to inform future decisions over patients' treatments or care.

The funding is provided by UKHSA. The funding is specifically for the study described. The funder will have no ability to suppress or otherwise limit the publication of findings.

UKHSA is a processor acting under the instructions of University College London (UCL). UKHSA's role is limited to receiving identifiable data from NHS England, linking it to SGSS data, removing all identifiers, and then sending these data to UCL.

Amazon Web Services (AWS) is a processor acting under the instructions of UCL. AWS's role is limited to secure back-up of data stored in UCL's Data Safe Haven.

UCL uses offsite data centre services provided by VIRTUS data centre. VIRTUS does not have access to the data.

The London School of Hygiene and Tropical Medicine and the British Society for Antimicrobial Chemotherapy are study collaborators and have advised on the study protocol as experts in the field.

Individuals holding an honorary contract under the supervision of a substantive employee of [Organisation (a named controller or processor)] for the purposes described in this DSA only. UCL must maintain records in a single location that cover the following details of each individual given access under an honorary contract:
Their substantive employer;
Their role in respect of the purpose for the processing specified in the DSA;
The start date and end date of the duration in which the Data will be accessed by the individual under an honorary contract;
The necessity for the Data to be accessed by the person(s) holding an honorary contract, instead of a substantive employee of an organisation named as controller or a processor in this DSA;
Confirmation that an appropriate contract is in place which follows the relevant guidance and is countersigned by the substantive employer of the honorary contract holder.

A Public and Patient Involvement and Engagement group helped refine the purpose of the research. The group supported the collection of the data for the purposes described above. Thirteen PPIE members were consulted through both interviews and questionnaires. PPIE members agreed that this study is important; the need to learn more about drug-resistant infections and protect future generations was emphasized. They agreed that the steps taken by the study team to safeguard patient confidential data were sufficient. PPIE members agreed that the aim of this study justifies temporary access and dissemination of patient identifiable information for the purposes of data linkage. PPIE members agreed that the study poses low risk to patient confidentiality.

Expected Benefits:

The findings of this research study are expected to contribute to:
· evidence-based decision-making for the management of antibiotic-resistant infections, which have been declared a public health emergency by the WHO.
· better quality evidence to the UK Health Security Agency on the epidemiology and burden of antibiotic-resistant infections.
· identifying improvements required in the death certification system with regard to antibiotic-resistant infections.
· advancing understanding of regional and national trends in mortality from antibiotic-resistant infections and assessing inequality metrics among patients dying from these infections.

The use of the data could:
· help the system to better understand the health and care needs of populations.
· lead to the identification or improvement of treatments or interventions, or health and care system design to improve health and care outcomes or experience.
· advance understanding of regional and national trends in health and social care needs.
· advance understanding of the need for, or effectiveness of, preventative health and care measures for particular populations or conditions such as drug resistant infections.
· inform planning health services and programmes, for example to improve equity of access, experience and outcomes.
· inform decisions on how to effectively allocate and evaluate funding according to health needs.
· provide a mechanism for checking the quality of care. This could include identifying areas of good practice to learn from, or areas of poorer practice which need to be addressed.

It is hoped that through journal and media publications of the study findings, this research will add to the body of evidence that is considered by organisations and individual care practitioners with regards to the management of drug resistant infections within the NHS.

The findings will be advertised to the general public through the following charities: British Society for Antimicrobial Chemotherapy, Antibiotic Research UK.

Outputs:

The expected outputs of the processing will be:
· Publications in peer-reviewed journals within 2025 (one or two)
· Conference presentations (oral and poster) within 2025

The outputs will not contain NHS England Data and will only contain aggregated information with small numbers suppressed as appropriate in line with the relevant disclosure rules for the dataset(s) from which the information was derived.
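As an illustration of small-number suppression in aggregated outputs, the sketch below masks low counts before a table is released. The threshold value, the `*` marker and the choice to leave zero counts visible are all assumptions made for the example; the actual rules are those of the relevant dataset's disclosure guidance, which this document does not quote. Secondary suppression (preventing masked cells from being recovered from totals) is not handled here.

```python
def suppress_small_counts(counts, threshold, marker="*"):
    """Replace non-zero counts below `threshold` with `marker`.

    `counts` maps a group label to an aggregate count. The threshold is
    supplied by the caller because it depends on the dataset's rules.
    """
    return {
        group: (marker if 0 < n < threshold else n)
        for group, n in counts.items()
    }


# Illustrative figures only, not study data.
table = suppress_small_counts({"North": 120, "South": 3, "East": 0}, threshold=5)
print(table)
```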

The outputs will be communicated to relevant recipients through the following dissemination channels:
· Social media
· Website
· Newsletter of the British Society for Antimicrobial Chemotherapy
· Journals

Processing:

No data will flow to NHS England for the purposes of this Data Sharing Agreement (DSA).

NHS England will provide the data from the HES and Civil Registrations of Death datasets to UKHSA. The Data will contain directly identifying data items (specifically NHS Number and Date of death) which are required to link the Data at record level with data already held by UKHSA.

UKHSA will extract linked SGSS data comprised of anonymised data and securely transfer the linked data to UCL.

The Data shared directly with UCL will contain no direct identifying data items but will contain a unique person ID which can be used to remove duplicate records.

The de-identified data will be stored on servers at UCL Data Safe Haven.

Amazon Web Services provides cloud hosting services to UCL and will store the data as contracted by UCL. UCL uses offsite data centre services provided by VIRTUS data centre.

The Data will be accessed by authorised personnel via remote access.

University College London (UCL) must confirm and provide evidence upon audit by NHS England that access via any remote device complies with the data security obligations within this DSA and the Data Sharing Framework Contract.

For remote access:
- Remote access will only be from secure locations situated within the territory of use (as further restricted elsewhere within the DSA if so done) stated within this DSA;
- Access controls granting users the minimum level of access required are in place;
- Remote access is only via secure connections (e.g., VPNs or secure protocols) to protect data;
- Multifactor authentication (MFA) is required for remote access;
- Device security, including up-to-date software and operating systems, antivirus software, and enabled firewalls are utilised for the remote access;
- All remote access is undertaken within the scope of the organisation's DSPT (or other security arrangements as per this DSA) and complies with the organisation's remote access policy.
The above applies in addition to any condition set out elsewhere within the DSA (e.g. who may carry out processing, and for what purpose).

Remote processing will be from secure locations within England/Wales. The data will not leave England/Wales at any time.

Access is restricted to individuals within the Great Ormond Street Institute of Child Health department of University College London (UCL) who have authorisation from one of the Principal Investigators. All such individuals are employees or agents of UCL.

Access to confidential patient identifiable data is restricted to employees or agents of UKHSA for the purposes of data linkage for this study only.

Employees or agents of UCL Great Ormond Street Institute of Child Health are permitted to access anonymised data only.

Data will be accessed by individuals with an honorary contract with UCL. The individuals will act as an agent of UCL at all times under supervision from employees of UCL. Aside from these individuals, access is restricted to employees or agents of UCL who have authorisation from one of the Principal Investigators.

All personnel accessing the Data have been appropriately trained in data protection and confidentiality.

The Data will be linked at person record level with SGSS dataset obtained from UKHSA.

There will be no requirement and no attempt to reidentify individuals when using the Data.

Researchers from UCL will analyse the Data for the purposes described above.

MR358 - NATIONAL CHILD DEVELOPMENT STUDY 1958 COHORT — DARS-NIC-147922-T7W2F

Opt outs honoured: Yes - patient objections upheld, N, Yes

Legal basis: Approved researcher accreditation under section 39(4)(i) and 39(5) of the Statistical Registration Service Act 2007, Section 251 approval is in place for the flow of identifiable data, National Health Service Act 2006 - s251 - 'Control of patient information', Health and Social Care Act 2012 – s261(7), Approved researcher accreditation under section 39(4)(i) and 39(5) of the Statistical Registration Service Act 2007, National Health Service Act 2006 - s251 - 'Control of patient information', Health and Social Care Act 2012 - s261(5)(d); National Health Service Act 2006 - s251 - 'Control of patient information'.

Purposes: No (Academic)

Sensitive: Sensitive, and Non Sensitive, and Non-Sensitive

When:DSA runs 2011-11 – 2026-11 2016.06 — 2025.09. SMLS reported a DPA serious incident; breached contract — audit report.

Access method: Ongoing, One-Off

Data-controller type: UNIVERSITY COLLEGE LONDON (UCL)

Sublicensing allowed: No, Yes

AGD/predecessor discussions: AGD minutes - 27th March 2025 final.pdf, AGD minutes - 21st November 2024 final.pdf, AGD Minutes - 27 June 2024 final.pdf

Datasets:

MRIS - Cause of Death Report
MRIS - Cohort Event Notification Report
MRIS - Scottish NHS / Registration
MRIS - Flagging Current Status Report
MRIS - Members and Postings Report
MRIS - Personal Demographics Service
Civil Registrations of Death
Demographics

Type of data: Identifiable

Objectives:

British Cohort Study 1970 (BCS70) is a national longitudinal study which provides the medical and social science community with a set of data comprising information about the lives of its cohort members and their parents. This data set can be used to investigate factors which influence physical, psychological, educational and social development and outcomes.

Since 1970, there have been three subsequent attempts to gather information from the full cohort. With each successive attempt, the scope of enquiry has broadened from a strictly medical focus at birth, to encompass physical and educational development at the ages of ten and sixteen.

Aims and Investigations

1) To monitor child development (educational, physical and psychological) in the 1970s, in comparison with the findings of the two previous surveys during the 1950s and 1960s. Further study is also desirable of factors such as early hospitalisation, maternal employment, immunisation and vaccination, and housing conditions.

2) To analyse via in-depth studies, with comparison of special groups from the 1958 Study, involving 'deprived' children or children in anomalous family situations e.g. children in care, illegitimate, fostered, from one-parent families, or socially disadvantaged.

3) To examine associations between high-risk medical and social factors in the perinatal period and subsequent child development, e.g. smoking in pregnancy, X-rays etc. This type of analysis, as in any longitudinal investigation, involves gathering data on a large range of 'intermediate variables', both social and environmental, which might affect both the 'causal' factor and the outcome.

4) To identify special groups in childhood who fail to use the services provided by DHSS and DES e.g. schooling, health centres, dental care, speech therapy, etc. and the availability of these services to children in different areas, e.g. urban/rural districts.

5) To analyse regional variations in many factors on which data are available. The needs of pre-school children, for example, may vary from one region to the next on account of differences in the degree of urbanisation, level and type of industrialisation, family incomes, educational and housing policies, etc. This type of information would be of great value in the administrative and policy-making areas of local government.

MR104 - Regional Heart Study — DARS-NIC-148411-Q64H8

Opt outs honoured: Yes - patient objections upheld, Yes (Excuses: Section 251, Section 251 NHS Act 2006)

Legal basis: Section 251 approval is in place for the flow of identifiable data, Approved researcher accreditation under section 39(4)(i) and 39(5) of the Statistical Registration Service Act 2007 , Health and Social Care Act 2012 – s261(7), Health and Social Care Act 2012 s261(7), Health and Social Care Act 2012 s261(7); National Health Service Act 2006 - s251 - 'Control of patient information'., Health and Social Care Act 2012 - s261(5)(d); National Health Service Act 2006 - s251 - 'Control of patient information'.

Purposes: No (Academic)

Sensitive: Sensitive

When:DSA runs 2018-03 – 2021-03 2017.06 — 2025.09. SMLS reported a DPA serious incident; breached contract — audit report.

Access method: Ongoing, One-Off

Data-controller type: UNIVERSITY COLLEGE LONDON (UCL), NEWCASTLE UNIVERSITY, UNIVERSITY COLLEGE LONDON (UCL), UNIVERSITY COLLEGE LONDON (UCL), UNIVERSITY OF NEWCASTLE UPON TYNE

Sublicensing allowed: No

AGD/predecessor discussions: IGARD Minutes - 3 November 2022 finalv1.pdf, IGARD Minutes - 17 February 2022 Final.pdf, igard-minutes-2nd-august-2018-final.pdf, IGARD Minutes - 12 January 2023 final.pdf, igard-minutes-23-august-2018-final.pdf

Datasets:

MRIS - Cohort Event Notification Report
MRIS - Cause of Death Report
MRIS - Members and Postings Report
Civil Registration - Deaths
Demographics
Cancer Registration Data
MRIS - Flagging Current Status Report
Civil Registrations of Death

Type of data: Identifiable

Objectives:

The data supplied by the NHS IC to UCL Medical School will be used only for the approved Medical Research project.

Yielded Benefits:

Research from the BRHS has been used to shape and change many policies on cardiovascular disease prevention, both nationally and internationally; a selection of these is listed below. Previous research is also cited in guidelines produced by professional organisations for treatment of specific chronic conditions, e.g. NICE guidelines, American Heart Association guidelines for prevention of stroke and transient ischemic attack, for management of cardiovascular disease, and management of patients with ventricular arrhythmias, Australian guidelines for management of cardiovascular disease risk, Joint British Societies management of cardiovascular disease guidelines, Endocrine Society guidelines on hypertriglyceridemia and obesity. Evidence generated from the research has also been used to support local public health programmes, for example, in developing initiatives for primary prevention of CVD and dementia in South East London. The research findings have been published in open access peer-reviewed scientific journals related to public health.

Cardiovascular and Stroke Prevention:
- 2000 UK Parliament Select Committee on Health, Memorandum by the Stroke Association (TB 17).
- 2003 European Society of Cardiology clinical practice guidelines - European Heart Risk Score. Estimation of ten-year risk of fatal cardiovascular disease in Europe: the SCORE project. European Heart Journal (2003).
- 2005 Joint British Societies' Guidelines on Prevention of Cardiovascular Disease in Clinical Practice.
- 2004 NICE - Public health guidance on the prevention of cardiovascular disease (CVD) at population level.
- 2007 Management of stable angina. SIGN guidance 96.
- 2007 Risk estimation and the prevention of cardiovascular disease. SIGN guidance 97.
- 2007 WHO Prevention of Cardiovascular Disease Guidelines for assessment and management of cardiovascular risk.
- 2008 Management of patients with stroke or TIA: assessment, investigation, immediate management and secondary prevention. A national clinical guideline. SIGN National guideline 108.
- 2008 European Guidelines for management of ischaemic stroke and transient ischaemic attack.
- 2010 Cardiovascular disease prevention. Public health guideline [PH25]. NICE Guidance.
- 2011 European Guidelines for management of ischaemic stroke and transient ischaemic attack.
- 2011 AHA / ASA Guidelines for the Primary Prevention of Stroke.
- 2014 AHA / ASA Guidelines for the Prevention of Stroke in Patients With Stroke and Transient Ischemic Attack.
- 2014 AHA / ASA Guidelines for the Primary Prevention of Stroke.
- 2014 Joint British Societies' consensus recommendations for the prevention of cardiovascular disease (JBS3).
- 2016 European guidelines on cardiovascular disease prevention in clinical practice.

Smoking & Passive smoking:
- 2016 Stopping Smoking: What health professionals should know and how to encourage smokers to quit: British Thoracic Society Tobacco Specialist Advisory Group, March 2016.
- 2012 Papers examining the health effects of passive smoking using an objective measure of smoke exposure rather than a self report were published in 2009-10 and received news media coverage. The findings informed the UK 2012 government campaign about the dangers of passive smoking. The BRHS papers are cited in the evidence about passive smoking and risk of CHD and stroke in the updated Surgeon General report in the USA.
- 2014 The Health Consequences of Smoking - 50 Years of Progress: A Report of the Surgeon General. Editors: National Center for Chronic Disease Prevention and Health Promotion (US) Office on Smoking and Health. Atlanta (GA): Centers for Disease Control and Prevention (US); 2014.

Alcohol:
- 2010 Dietary guidelines for Americans: Alcohol.
- 2012 House of Commons Science and Technology Committee. Alcohol guidelines. Eleventh Report of Session 2010-12: Volume II, Additional written evidence. Ordered by the House of Commons to be published 12 and 19 October 2011.

Diabetes:
- 2008 An Endocrine Society Clinical Practice Guideline. Primary Prevention of Cardiovascular Disease and Type 2 Diabetes in Patients at Metabolic Risk.
- 2010 Management of diabetes. SIGN National clinical guideline 116.
- 2012 Endocrine Society clinical practice guidelines for Hypertriglyceridemia.
- 2014 Lipid modification. NICE clinical guideline CG181.
- 2015 AHA / ADA. Update on Prevention of Cardiovascular Disease in Adults With Type 2 Diabetes.
- 2011 ASA/ACCF/AHA/AANN/AANS/ACR/ASNR/CNS/SAIP/SCAI/SIR/SNIS/SVM/SVS Guideline on the Management of Patients With Extracranial Carotid and Vertebral Artery Disease.
- 2012 UK National Screening Committee. The Handbook for Vascular Risk Assessment, Risk Reduction and Risk Management.
- 2015 Endocrine Society clinical practice guidelines for the pharmacological management of obesity.
- 2015 NICE Clinical Guideline CG 43 Obesity Prevention.

Social Determinants:
- 2015 AHA Scientific Statement. Social Determinants of Risk and Outcomes for Cardiovascular Disease.

Physical activity (not a guideline but a resource):
- ACSM's Resource Manual for Guidelines for Exercise Testing and Prescription, edited by David P. Swain, ACSM, Clinton A. Brawner.
- 2013 IACR Cardiac Rehabilitation Guidelines.

Expected Benefits:

The BRHS has a track record of providing high quality evidence to improve health of the public in UK and internationally. Global trends of ageing populations will acutely increase the health and social care burdens on individuals and society from chronic diseases such as cardiovascular disease, diabetes, dementia, other chronic diseases and disability in later life. Therefore, research in a cohort study of older men aims to establish the contributions of potentially important factors (obesity, diabetes, health behaviours, environmental and social factors) to prevent cardiovascular disease, diabetes, dementia, other chronic diseases and disability in later life.

To date the study has published over 500 peer reviewed research papers, providing high quality evidence about the epidemiology of these conditions and improving understanding on how to manage, treat and prevent them.

Importantly, these papers have informed evidence based strategies to reduce the health and social care burden in older populations, as outlined in detail in section “Specific output” above. The researchers have contributed to a range of influential UK and international clinical guidelines for management and treatment of important chronic conditions including CHD, stroke, angina, arrhythmias, and diabetes which together cause substantial burdens of ill health in UK and globally, and will continue to contribute with findings from the new data requested.

The specific benefits from the use of the data will be to generate further high quality research evidence about prevention of chronic diseases and to improve the health of older populations. Linking the existing BRHS databases to NHS Digital data will permit the researcher to study a wider range of public health relevant topics. The potential benefits for prevention of cardiovascular disease, diabetes, dementia, other chronic diseases and disability in later life are substantial. Target dates will run from the time of acquiring the data until 2019 with plans to further extend funding for our study.

Outputs:

More than 500 peer-reviewed reports have already been published based on the study which uses mortality data from NHS Digital. It is hoped that research from using the data requested under this Agreement will be published and utilised in the same way.

Research from the BRHS has been used to shape and change many policies on cardiovascular disease prevention, both nationally and internationally, for example, in developing initiatives for primary prevention of CVD and dementia in South East London. The BRHS provide outputs in the form of peer reviewed publications from the research in speciality journals in cardiovascular disease, heart failure, diabetes, stroke and geriatric medicine. It also provides research directly to funding bodies and policy makers (Department of Health, British Heart Foundation, National Institute of Health Research, Medical Research Council, UK Health Forum), clinicians, public health specialists and other health researchers who then use the evidence to develop preventive strategies.

Findings will be further disseminated via national conference presentations including The Society for Social Medicine, the Nutrition Society, the British Geriatric Society, and Public Health England and via international conference presentations including the AHA Epidemiology and Prevention | Lifestyle and Cardiometabolic Health and The International Society of Behavioral Nutrition and Physical Activity meetings.

The Study findings will also be cited in reports by a range of influential national and international public sector bodies including the UK House of Commons Health Select Committee, the UK Department of Health, the U.S. Surgeon General (whose reports inform health policies both in USA and other countries around the world) and the World Health Organisation (e.g. their Guidelines for assessment and management of cardiovascular risk).

All outputs will be restricted to aggregate data with small numbers suppressed in line with the HES Analysis Guide. No publications/ outputs from the British Regional Heart Study have ever presented or will present data which allow the identification of individuals. All data presentation is based on groups of subjects (generally > 50 subjects, often considerably larger numbers). The data from NHS Digital will not be used for any other purpose other than that outlined in this Agreement.

Processing:

UCL have requested continuation of the monthly updates to the cohort regarding cancer registration, date, fact and cause of death.

The BRHS currently receives data from three sources:

1. Study participants - Physical examinations (1978-80, 1998-2000, 2010-2012) and regular postal questionnaires
2. GP record review - Morbidity data collected annually directly from participants' GPs
3. NHS Digital - Participants were flagged in 1978-80 and the study receives mortality notifications and cancer registrations on a monthly basis via this existing data sharing agreement. HES, MHMDS and DIDs data fall under NIC-28591-H5Q3X-v0.18

The personal identifiers are held in the data safe haven and access to this is strictly limited to a few named individuals, all substantive employees of University College London (UCL). UCL will provide NHS Digital with the following cohort identifiers for linkage to the datasets:

1) Study ID
2) NHS Number
3) Date of Birth
4) Sex
5) Last known postcode.

NHS Digital will return a pseudonymised dataset to the applicant containing Study ID and match rank code. UCL's Data manager will then link this NHS Digital pseudonymised dataset to the BRHS cohort data Study ID for analysis.

Only Study IDs are used to link the NHS Digital data to the BRHS cohort data. No personal identifiers are contained within this dataset.
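The Study ID linkage described above amounts to a key-based join in which the only shared field is the study identifier. The sketch below is illustrative only: the field names `study_id`, `match_rank` and `age_at_baseline` and the sample values are hypothetical, and the real linkage is carried out by UCL's Data Manager using whatever tooling the study uses.

```python
def link_on_study_id(cohort_rows, returned_rows):
    """Inner-join two lists of dicts on 'study_id'.

    `returned_rows` stands in for the pseudonymised extract, which carries
    the Study ID but no personal identifiers.
    """
    cohort_by_id = {row["study_id"]: row for row in cohort_rows}
    linked = []
    for row in returned_rows:
        match = cohort_by_id.get(row["study_id"])
        if match is not None:
            # Merge cohort fields with the returned fields for this Study ID.
            linked.append({**match, **row})
    return linked


# Hypothetical example rows.
cohort = [{"study_id": "S001", "age_at_baseline": 52}]
extract = [{"study_id": "S001", "match_rank": 1}, {"study_id": "S999", "match_rank": 2}]
print(link_on_study_id(cohort, extract))
```

Because the join key is the Study ID alone, the analysis dataset never needs to contain NHS Number, date of birth or postcode.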

The data will then be made available to the research team of Medical Statisticians, Epidemiologists and Public Health clinicians, to carry out their research analysis. All the researchers working on the data are substantive employees of UCL.

No publications/outputs from the British Regional Heart Study have ever presented or will present data which allow the identification of individuals. All data presentation is based on groups of subjects (generally >50 subjects, often considerably larger numbers). Therefore all outputs will be restricted to aggregate data with small numbers suppressed in line with the HES Analysis Guide.

The data from NHS Digital will not be used for any other purpose other than that outlined in this Agreement. Data will not be linked with any other sources, other than those specified in this Agreement.

All organisations party to this agreement must comply with the Data Sharing Framework Contract requirements, including those regarding the use (and purposes of that use) by “Personnel” (as defined within the Data Sharing Framework Contract, i.e. employees, agents and contractors of the Data Recipient who may have access to that data). There will be no requirement nor attempt to re-identify individuals from the data. All processing of ONS data will be in line with ONS standard conditions. The data from NHS Digital will not be used for any purpose other than that outlined in this agreement.

Inequalities in cancer care pathways — DARS-NIC-777554-J2V4K

Opt outs honoured: No (Excuses: Does not include the flow of confidential data)

Legal basis: Health and Social Care Act 2012 s261(2)(a)

Purposes: No (Academic)

Sensitive: Non-Sensitive

When:DSA runs 2025-05 – 2028-05 2025.08 — 2025.08. SMLS reported a DPA serious incident; breached contract — audit report.

Access method: Ongoing

Data-controller type: UNIVERSITY COLLEGE LONDON (UCL)

Sublicensing allowed: No

AGD/predecessor discussions: AGD minutes - 20th March 2025 final.pdf

Datasets:

NDRS Cancer Consolidated Data Set

Type of data: Anonymised - ICO Code Compliant

Objectives:

University College London (UCL) requires access to NHS England data for the purpose of the following research project:
Inequalities in cancer care pathways

The following is a summary of the aims of the research project provided by UCL:

UCL aim to understand proximal causes of inequalities in cancer outcomes (i.e., survival), within England, the UK, and internationally. Specific objectives are to understand the causes of inequalities in cancer survival, stage at diagnosis, route to diagnosis, and first-line treatment.

UCL are separately requesting data for Northern Ireland, Wales and Scotland from the relevant data owners. UCL will not be combining row-level datasets across jurisdictions but do seek to apply the same statistical analyses in all datasets. For some analyses, this will involve fitting models in one jurisdiction (i.e., England), and then using those coefficients for statistical analyses in other jurisdictions.
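To illustrate how coefficients fitted in one jurisdiction could be reused elsewhere without moving row-level data, the sketch below applies a set of exported coefficients to a local record. It assumes a logistic regression model purely for illustration; the source does not state the model type, and the covariate names and coefficient values are hypothetical.

```python
import math


def predict_probability(coefficients, intercept, covariates):
    """Apply exported logistic-regression coefficients to one local record.

    Only the coefficients and intercept cross the jurisdiction boundary;
    `covariates` stays with the dataset it came from.
    """
    linear = intercept + sum(
        coefficients[name] * covariates[name] for name in coefficients
    )
    return 1.0 / (1.0 + math.exp(-linear))


# Hypothetical coefficients and record, not study results.
exported = {"age_decades": 0.30, "late_stage": 1.20}
p = predict_probability(exported, intercept=-3.0,
                        covariates={"age_decades": 7.0, "late_stage": 1})
print(round(p, 3))
```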

For international analyses, UCL will work with partners to run substantively identical analyses in all jurisdictions and use meta-analysis to combine and explore variation in results.
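One common way to combine per-jurisdiction results is inverse-variance fixed-effect pooling, with Cochran's Q as a simple measure of variation between jurisdictions. The source does not say which meta-analysis method will be used, so the sketch below is an assumption, and the input figures are illustrative rather than study results.

```python
import math


def fixed_effect_pool(estimates, standard_errors):
    """Inverse-variance fixed-effect pooling of per-jurisdiction estimates.

    Returns (pooled estimate, pooled standard error, Cochran's Q).
    """
    weights = [1.0 / se ** 2 for se in standard_errors]
    total = sum(weights)
    pooled = sum(w * b for w, b in zip(weights, estimates)) / total
    pooled_se = math.sqrt(1.0 / total)
    # Q sums the weighted squared deviations from the pooled estimate.
    q = sum(w * (b - pooled) ** 2 for w, b in zip(weights, estimates))
    return pooled, pooled_se, q


# Illustrative log hazard ratios and standard errors from two jurisdictions.
pooled, pooled_se, q = fixed_effect_pool([0.2, 0.4], [0.1, 0.1])
print(pooled, pooled_se, q)
```

Only the estimates and their standard errors need to be shared between partners, which is consistent with row-level datasets not being combined across jurisdictions.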

The following NHS England Data will be accessed:
> NDRS Cancer Consolidated Data Set necessary to understand proximal causes of inequalities in cancer outcomes to complete the research project.

The level of the Data will be:
> Pseudonymised

The Data will be minimised as follows:
> Limited to a study cohort identified by NHS England as meeting the following criteria: all patients diagnosed with one of the 21 solid organ cancers that form part of the composite measure used to monitor progress towards the 2028 Government early stage target (oral cavity, oropharynx, oesophagus, stomach, colon, rectum, pancreas, lung, female breast, melanoma, kidney, bladder, uterine cervix, uterus, ovary, prostate, testis, Hodgkin lymphoma, non-Hodgkin lymphoma, thyroid, and larynx) in England. A list of ICD10 cancer site codes for the data to be minimised by has been provided to NHS England by UCL.
> Limited to data from January 2013 to the latest available. This data period is required as UCL are specifically interested in trends and changes in stage, route and treatment (and associated changes in survival) and thus request all available years with reasonable quality stage data.
> All of England data required: this is because the study aims to look at geographical differences in inequalities in cancer care pathways
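The minimisation criteria above can be pictured as a filter on cancer site and diagnosis date. This is a sketch only: the minimisation is performed by NHS England, the record field names are hypothetical, and the two ICD-10 codes in the example (C34 for lung, C50 for breast) are an illustrative subset, not the list UCL supplied.

```python
from datetime import date


def minimise(records, site_codes, start=date(2013, 1, 1)):
    """Keep records whose three-character ICD-10 site code is in `site_codes`
    and whose diagnosis date is on or after `start`."""
    return [
        r for r in records
        if r["icd10_site"][:3] in site_codes and r["diagnosis_date"] >= start
    ]


# Hypothetical records, not real data.
records = [
    {"icd10_site": "C34.1", "diagnosis_date": date(2015, 3, 2)},
    {"icd10_site": "C34.1", "diagnosis_date": date(2012, 12, 31)},
    {"icd10_site": "C71.0", "diagnosis_date": date(2016, 5, 9)},
]
print(minimise(records, site_codes={"C34", "C50"}))
```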

UCL is the research sponsor and the controller as the organisation responsible for ensuring that the Data will only be processed for the purpose described above.

The lawful basis for processing personal data under the UK GDPR is:
Article 6(1)(e) - processing is necessary for the performance of a task carried out in the public interest or in the exercise of official authority vested in the controller

The lawful basis for processing special category data under the UK GDPR is:
Article 9(2)(j) - processing is necessary for archiving purposes in the public interest, scientific or historical research purposes or statistical purposes in accordance with Article 89(1) based on Union or Member State law which shall be proportionate to the aim pursued, respect the essence of the right to data protection and provide for suitable and specific measures to safeguard the fundamental rights and the interests of the data subject.

This processing is in the public interest because the research aims to examine inequalities in patient diagnosis and treatment, and how these explain differences in survival. As such, it may support improvements in patient care leading to better outcomes and as such is in the public's interest.

The funding comes from multiple sources. Current funders include:
> Cancer Research UK
> International Cancer Benchmarking Partnership
Funding to continue the work described will be sought on an ongoing basis.

The funder(s) will have no ability to suppress or otherwise limit the publication of findings.

Amazon Web Services (AWS) provides IT hosting services to UCL and will store the Data as contracted by UCL. AWS's role is limited to secure backup of data stored in UCL's Data Safe Haven.

UCL uses offsite data centre services provided by VIRTUS data centre. VIRTUS does not have access to the data.

A professor at the Karolinska Institute will be acting as an advisor but will not have access to the Data other than aggregated output information with small numbers suppressed, nor any input into determining the means or purpose of data processing.

Data will be accessed by:
> Substantive employees of UCL
> PhD students enrolled with UCL. The individuals have completed mandatory data protection and confidentiality training and are subject to UCL's policies on data protection and confidentiality. The individuals accessing the data will do so under the supervision of a substantive employee of UCL. UCL would be responsible and liable for any work carried out by the individuals. The PhD students would only work on the data for the purposes described in this Data Sharing Agreement (DSA).
> Individuals holding an honorary contract under the supervision of a substantive employee of UCL for the purposes described in this DSA only. UCL must maintain records in a single location that cover the following details of each individual given access under an honorary contract:
- Their substantive employer;
- Their role in respect of the purpose for the processing specified in the DSA;
- The start date and end date of the duration in which the Data will be accessed by the individual under an honorary contract;
- The necessity for the Data to be accessed by the person(s) holding an honorary contract, instead of a substantive employee of an organisation named as controller or a processor in this DSA;
- Confirmation that an appropriate contract is in place which follows the relevant guidance and is countersigned by the substantive employer of the honorary contract holder.

UCL have an ongoing Patient and Public Involvement (PPI) programme, at which they discuss planned and ongoing projects with cancer survivors and other members of the public. This project specifically was discussed with four PPI volunteers, including a lung cancer survivor and a person diagnosed with cancer as an emergency, in June 2024, helping crystallise key components of the research. UCL held additional online meetings with two of the four volunteers, including the person with personal experience of emergency diagnosis of cancer, and these meetings helped clarify the research focus. UCL held a further PPI meeting in November 2024, focusing on experience of emergency diagnosis. PPI representatives highlighted various relevant questions, particularly what they perceived as an obvious connection between GP shortages and increased A&E use, which supports the study's interest in examining area-level measures of diagnostic capacity and function. PPI representatives were also concerned about the geographic and demographic diversity within the dataset, which supports the study team's intention to use UK-wide data and their specific focus on inequalities in diagnosis and care.

Expected Benefits:

UCL expect this study to identify areas of the UK where cancer patients are receiving care that would generally be viewed as unusual, or perhaps inappropriate. This is expected to support further investigations by health professionals to identify the reasons for this deviation from normal practice, and so improve patient care.

The research findings are expected to:
> help the healthcare system to better understand health and care needs of populations
> help identify improvement of treatments or interventions or healthcare system-design to improve health and care outcomes
> advance understanding of regional and national trends in health and social care needs
> inform planning health services and programmes, for example, to improve equity of access and outcomes
> provide a mechanism for checking quality of care which could include identifying areas of good practice to adopt, or areas of poorer practice which should be addressed
> support knowledge creation or exploratory research, along with the innovations and developments that might result from that exploratory work

Benefits to patients are expected to arise from improvements in the quality of care across the UK.

UCL will engage with National Clinical Audits to discuss whether the proposed measures of 'appropriate' care can or should be added to their reporting. UCL will work with Cancer Research UK to inform their own early diagnosis and treatment strategies and would expect to attend CRUK-organised events aimed at disseminating relevant results to key stakeholders (e.g., Cancer Alliance leads).

UCL are co-investigators in the NIHR Policy Research Unit for Cancer Awareness, Screening and Early Diagnosis and, if relevant, will feed key findings directly to DHSC.

Outputs:

The expected outputs of the processing will be:
> Submissions to peer reviewed journals; a minimum of four submissions are expected over the next three years. In reality, UCL expect to publish at least ten papers based on the planned analyses of this data extract.
> Presentations at specific conferences, namely, the CRUK Early Diagnosis conference, the Health Services Research UK conference, the European Network of Cancer Registries conference, and the World Cancer Congress.
> UCL also expect to publish fully anonymous aggregate information for further use in examining trends and in meta-analyses. This may be in the form of dashboards.

The outputs will not contain NHS England Data and will only contain aggregated information with small numbers suppressed as appropriate in line with the relevant disclosure rules for the dataset(s) from which the information was derived.
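As a rough illustration of the small-number suppression described above, the following minimal sketch masks low counts in an aggregated table before publication. The threshold of 5, the decision to leave zero counts unmasked, and the function name are all assumptions made for illustration; the actual rules are those set for the dataset(s) from which the information was derived.

```python
from typing import Optional

# Hypothetical threshold: the real value comes from the disclosure rules
# of the source dataset, which are not stated in this text.
SUPPRESSION_THRESHOLD = 5


def suppress_small_counts(
    counts: dict[str, int], threshold: int = SUPPRESSION_THRESHOLD
) -> dict[str, Optional[int]]:
    """Return a copy in which counts from 1 to threshold - 1 are replaced by None.

    Leaving zero counts unmasked is an assumption; some rules also mask
    or round them.
    """
    return {
        area: (None if 0 < n < threshold else n)
        for area, n in counts.items()
    }


# Illustrative aggregated counts (invented values, not real data).
example = {"Area A": 120, "Area B": 3, "Area C": 0}
published = suppress_small_counts(example)
```

Only the masked table would be included in an output; the unmasked counts stay inside the secure environment.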

The outputs will be communicated to relevant recipients through the following dissemination channels:
> Journals; aimed at researchers
> Conferences; aimed at researchers and relevant professionals
> Social media; aimed at researchers
> Blogs; aimed at members of the public
> PPIE events; aimed at members of the public
> University and press news articles; aimed at members of the public
> Events targeted at Cancer Alliances
> Engagement with national clinical audits

Processing:

No data will flow to NHS England for the purposes of this Data Sharing Agreement (DSA).

NHS England will provide the relevant records from the NDRS cancer consolidated dataset to the UCL Data Safe Haven (DSH). The Data will contain no direct identifying data items. The Data will be pseudonymised and individuals cannot be reidentified through linkage with other data in the possession of the recipient.

The Data will not be transferred to any other location.

The data will be stored on servers at the UCL DSH.

UCL uses offsite data centre services provided by VIRTUS data centre.

Amazon Web Services provides cloud hosting services to UCL and will store the data as contracted by UCL.

The Data will be accessed by authorised personnel via remote access.

The Controller(s) must confirm and provide evidence upon audit by NHS England that access via any remote device complies with the data security obligations within this DSA and the Data Sharing Framework Contract.

For remote access:

- Remote access will only be from secure locations situated within the territory of use (as further restricted elsewhere within the DSA if so done) stated within this DSA;
- Access controls granting users the minimum level of access required are in place;
- Remote access is only via secure connections (e.g., VPNs or secure protocols) to protect data;
- Multifactor authentication (MFA) is required for remote access;
- Device security, including up-to-date software and operating systems, antivirus software, and enabled firewalls are utilised for the remote access;
- All remote access is undertaken within the scope of the organisation's DSPT (or other security arrangements as per this DSA) and complies with the organisation's remote access policy.

The above applies in addition to any condition set out elsewhere within the DSA (e.g. who may carry out processing, and for what purpose).

Remote processing will be from secure locations within the UK. The data will not leave the UK at any time.

Data will be accessed by individuals with an honorary contract with UCL. The individuals will act as agents of UCL at all times under supervision from employees of UCL. Aside from these individuals, access is restricted to employees of UCL who have authorisation from the Principal Investigator.

All personnel accessing the Data have been appropriately trained in data protection and confidentiality.

The Data will be linked with some additional geographic-level information (i.e., at sub-ICB or Cancer Alliance level) from public data sources. Namely, UCL will link in some information on population structure and on diagnostic test use, including changes in screening. For each patient, UCL will know which sub-ICB or Cancer Alliance they live in, as this is given in the NDRS dataset. UCL will also know, from other data, the number of faster diagnostic standard referrals per head in the sub-ICB. The geographic-level information will therefore be linked to each patient record using the sub-ICB or Cancer Alliance recorded in the NDRS dataset.
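The geographic-level linkage described above amounts to a lookup on the area code held in each record. A minimal sketch follows, assuming hypothetical field names (`pseudo_id`, `sub_icb_code`, `referrals_per_head`) and invented values; none of these come from the NDRS dataset specification.

```python
def attach_area_measures(patients, area_measures, key="sub_icb_code"):
    """Attach area-level measures to each pseudonymised record.

    `area_measures` maps an area code to a dict of public, area-level
    values. Records whose area has no entry are returned unchanged.
    """
    linked = []
    for row in patients:
        measures = area_measures.get(row[key], {})
        linked.append({**row, **measures})
    return linked


# Invented example rows; field names and values are illustrative only.
patients = [
    {"pseudo_id": "p1", "sub_icb_code": "X01"},
    {"pseudo_id": "p2", "sub_icb_code": "X02"},
]
area_measures = {"X01": {"referrals_per_head": 0.04}}

linked = attach_area_measures(patients, area_measures)
```

Because the added values describe areas rather than individuals, this join adds no person-level information beyond what the area code already conveys.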

The Data will not be linked with any other data.

The aggregated data with small numbers suppressed derived from the Data will be combined with aggregated data from other UK nations to allow cross-UK analyses. This includes defining case-mix adjustment models in one national dataset and taking the coefficients or weights of that model to other datasets.
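One way to picture taking the coefficients of a case-mix model to another dataset is sketched below. It assumes, purely for illustration, a logistic model whose coefficients were estimated elsewhere and are applied to aggregated strata to obtain an expected event count. The model form, covariate names, coefficient values and counts are all hypothetical; the text does not specify how the adjustment models are defined.

```python
import math


def expected_events(strata, intercept, coefs):
    """Expected number of events across strata under an external model.

    Each stratum is a dict holding covariate values and a size `n`.
    The probability per stratum is the logistic function of the linear
    predictor built from the supplied coefficients.
    """
    total = 0.0
    for stratum in strata:
        linear = intercept + sum(
            coefs[name] * stratum[name] for name in coefs
        )
        probability = 1.0 / (1.0 + math.exp(-linear))
        total += stratum["n"] * probability
    return total


# Hypothetical coefficients estimated in one national dataset.
intercept = -2.0
coefs = {"age_75_plus": 0.8, "deprived": 0.3}

# Hypothetical aggregated strata from another dataset.
strata = [
    {"age_75_plus": 0, "deprived": 0, "n": 1000, "events": 110},
    {"age_75_plus": 1, "deprived": 0, "n": 400, "events": 95},
    {"age_75_plus": 1, "deprived": 1, "n": 200, "events": 60},
]

observed = sum(s["events"] for s in strata)
expected = expected_events(strata, intercept, coefs)
ratio = observed / expected  # observed-to-expected ratio after adjustment
```

Only coefficients and aggregated counts move between datasets in this sketch, which is consistent with the statement that record-level Data is not combined across nations.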

Aggregated data with small numbers suppressed derived from this data will also be used to support meta-analyses covering international jurisdictions.

There will be no requirement and no attempt to reidentify individuals when using the Data.

Analysts from UCL and individuals with an honorary contract with UCL will analyse the Data for the purposes described above.

General Health Outcomes in Subfertile Men: a UK register-based cohort study — DARS-NIC-692254-N3J5W

Opt outs honoured: Yes (Excuses: Section 251 NHS Act 2006)

Legal basis: Health and Social Care Act 2012 s261(2)(a)

Purposes: No (Academic)

Sensitive: Non-Sensitive

When:DSA runs 2024-08 – 2027-07 2025.08 — 2025.08. SMLS reported a DPA serious incident; breached contract — audit report.

Access method: One-Off

Data-controller type: UNIVERSITY COLLEGE LONDON (UCL)

Sublicensing allowed: No

AGD/predecessor discussions: AGD minutes - 11 July 2024 final.pdf

Datasets:

Cancer Registration Data
Civil Registrations of Death
Hospital Episode Statistics Accident and Emergency (HES A and E)
Hospital Episode Statistics Admitted Patient Care (HES APC)
Hospital Episode Statistics Critical Care (HES Critical Care)
Hospital Episode Statistics Outpatients (HES OP)

Type of data: Anonymised - ICO Code Compliant

Objectives:

University College London (UCL) requires access to NHS England data for the purpose of the following research project: General Health Outcomes in Subfertile Men: a UK register-based cohort study.

The following is a summary of the aims of the research project provided by UCL:

To investigate the risks of long-term malignant and non-malignant health outcomes, as well as early death, in men with known subfertility in the UK.

The proposed study will utilize routinely collected administrative health records to investigate the risk of long-term malignant and non-malignant health outcomes, as well as early death, in subfertile men in the UK and compare them with men with similar demographic and socioeconomic characteristics from the general population. This project will link data held in several large existing national databases and will not involve any direct patient/participant contact. Records of males with known subfertility will be identified through the Human Fertilisation and Embryology Authority (HFEA) Register of couples who underwent non-donor assisted fertility treatments in UK clinics between August 1991 and September 2009. These will then be linked to hospital admissions, cancer, and mortality data held by NHS England to allow investigation of longitudinal health outcomes in this cohort.

The following NHS England Data will be accessed:
Hospital Episode Statistics Admitted Patient Care, Critical Care, Accident & Emergency, and Outpatients - necessary to, for example, estimate the risk of non-oncological morbidities such as diabetes or cardiovascular disease, other chronic diseases such as urogenital systemic infections, endocrinopathies and metabolic disorders, respiratory disease, autoimmune diseases and also to estimate the risk of hospital admissions (e.g., rates per year, causes of admission, and length of hospital stay).
Civil Registrations of Death - necessary to estimate, for example, mortality rates and risk of early death
Cancer Registration Data - necessary to, for example, estimate the risk of oncological conditions such as testicular or prostate cancer and other types of cancers such as urinary or digestive system cancers, cancers of the lymphatic or circulatory systems etc.

The level of the Data will be:
Pseudonymised

The Data will be minimised as follows:
Limited to a study cohort identified by HFEA - ART cohort of males with known subfertility identified through the Human Fertilisation and Embryology Authority (HFEA) Register of couples who underwent non-donor assisted reproduction treatments in UK clinics between August 1991 and September 2009.
Hospital data - 1997/98 to latest available
Cancer Registration Data and Mortality data - up to latest available
The majority of outcome measures (e.g., malignancies, chronic non-malignant health conditions, early death) being examined in this study are conditions that develop in later life. Accessing data up to the most recent year available will allow UCL to investigate the health effects of subfertility/infertility using a life-course approach. It will provide valuable insight into the longitudinal health of these men which, in turn, can help identify opportunities for early detection and intervention.

UCL is the controller as the organisation responsible for ensuring that the Data will only be processed for the purpose described above.

The lawful basis for processing personal data under the UK GDPR is:
Article 6(1)(e) - processing is necessary for the performance of a task carried out in the public interest or in the exercise of official authority vested in the controller.

The lawful basis for processing special category data under the UK GDPR is:
Article 9(2)(j) - processing is necessary for archiving purposes in the public interest, scientific or historical research purposes or statistical purposes in accordance with Article 89(1) based on Union or Member State law which shall be proportionate to the aim pursued, respect the essence of the right to data protection and provide for suitable and specific measures to safeguard the fundamental rights and the interests of the data subject.

This processing is in the public interest because it adheres to the UK Policy Framework for Health and Social Care Research, which protects and promotes the interests of patients, service users and the public, and aims to produce generalisable and publicly available information to inform future decisions over patients' treatments or care.

This study is funded by a Wellcome Investigator Award in Science awarded to the principal investigator.

The funder will have no ability to suppress or otherwise limit the publication of findings.

Amazon Web Services (AWS) is the processor acting under the instructions of UCL. AWS's role is limited to secure back-up of data stored in UCL's Data Safe Haven.

UCL uses offsite data centre services provided by VIRTUS data centre. VIRTUS does not have access to the data.

The data will be accessed by substantive employees of the Great Ormond Street Institute of Child Health, University College London (GOS ICH, UCL).

The proposed research was developed in conjunction with members of the British Fertility Society (consisting of andrologists, counsellors, embryologists, endocrinologists, nurses, and other professional groups working in this field) and the Fertility Network UK (the largest fertility patient support network in the UK).

UCL also invited men in the general public to share their views by completing a short anonymous survey (available at: https://liftresearchucl.com/surveys/), with the aim of understanding (1) their awareness of and concerns about potential adverse health effects associated with male fertility problems and (2) their views on studies such as this one. The survey was distributed via newsletters, social media, and a study-specific website. The group members and survey respondents supported a study of this nature and recognised the importance of collecting data for this purpose. They also felt that, in addition to providing critical insight into the health consequences of subfertility, a study of this nature could be an important starting point in addressing the stigma around male subfertility.

Expected Benefits:

The findings of this research study are expected to contribute to evidence-based decision-making for policy-makers, local decision-makers such as doctors, and patients.

The use of the data could:
help the system to better understand the health and care needs of subfertile males.
identify opportunities and develop strategies for early detection and prevention of particular conditions in this population.
inform planning health services and programmes to improve outcomes.
inform decisions on how to effectively allocate and evaluate funding according to health needs.
lead to the identification or improvement of treatments or interventions, or health and care system design to improve health and care outcomes or experience.
support knowledge creation and exploratory research.

There will be no direct benefits to the study participants. However, the study will have wider benefits to various assisted reproduction technologies (ART) stakeholder groups by addressing the gap in scientific knowledge regarding the long-term health outcomes of subfertile males. Dissemination of the research findings to researchers and scientists will involve presentation at national and international conferences and publications in peer-reviewed medical journals. This will benefit the medical and scientific community by addressing the gaps in knowledge regarding risks associated with infertility. It can also potentially facilitate further research and influence early detection and prevention policy decisions. Dissemination of the research findings to the public (key stakeholders being subfertile males) will be facilitated through existing collaborations with the HFEA and Fertility Network UK (the leading patient organization supporting people suffering from infertility). This will benefit patients and families affected by subfertility, commissioners and providers, and staff caring for people needing fertility treatments.

It is hoped that, through publication of findings in appropriate media, the findings of this research will add to the body of evidence that is considered by the bodies, organisations and individual care practitioners charged with making policy decisions for or within the NHS or treatment decisions in relation to specific patients.

The results will further the understanding of the long-term health outcomes of subfertility in males which, in turn, will allow patients and families affected by subfertility to make informed decisions. It will also enable clinicians and staff caring for people requiring fertility treatments to develop a better understanding of prognosis and provide targeted treatment and appropriate support. Finally, it will allow researchers and health care services to develop strategies for prevention or early identification of high-risk individuals, thereby reducing the potential burden on the NHS.

Outputs:

The expected outputs of the processing will be:
Peer-reviewed publications in medical journals. The main results of the study will be submitted for publication in a high-impact general medical journal (such as The Lancet, The Journal of the American Medical Association (JAMA), or the British Medical Journal (BMJ)). Publications will be Open Access as per UCL policy, and freely available via both journal websites and UCL webpages.
National and international conference presentations.
Dissemination of findings via the UCL press department and through the study-specific website (https://liftresearchucl.com/)
Report for the study funder (Wellcome Trust) which will be publicly available via their study webpage on the Wellcome Trust website.
Lay summary report which will be made publicly available via the websites of stakeholder organizations (HFEA; Fertility Network UK; NHS England's Fertility Treatments Advisory Group; the Royal Colleges, including the Royal College of Obstetricians and Gynaecologists and the Royal College of Paediatrics and Child Health; Wellcome Trust; and UCL)

The outputs will not contain NHS England Data and will only contain aggregated information with small numbers suppressed as appropriate, in line with the relevant disclosure rules for the datasets from which the information was derived.

Dissemination of the research findings to researchers and scientists will involve presentations at national and international conferences and publications in peer-reviewed medical journals, as detailed above. Dissemination of the research findings to the public (key stakeholders being subfertile males) will be facilitated through existing collaborations with the HFEA and Fertility Network UK (the leading patient organization supporting people suffering from infertility). Dissemination of the research findings to a lay audience will be in the form of a brief research report and a video shared via the websites, newsletters, and social media channels of stakeholder organizations (HFEA, Fertility Network UK, Wellcome Trust, and UCL). Research regarding fertility treatments, including previous work published by the research team, has attracted a high level of media interest, and the team anticipates that this will be the case for the proposed study. The team is acutely aware of the potentially harmful effect of inaccurate or sensational reporting of research findings in this sensitive area, and the confusion and anxiety this can cause for couples and parents. The team will work closely with the HFEA, Fertility Network UK, and UCL to coordinate press releases and ensure that information is conveyed accurately and responsibly.

The research team will commence analysing the data as soon as it has been made available. The team anticipates that data analysis, interpretation, and report writing will take approximately 18 months and that outputs will be generated by early 2025, although this estimate depends on the timeframe for obtaining data access approvals and completing the data linkage.

Processing:

HFEA will transfer cohort data to NHS England. The data will consist of identifying details, specifically name and date of birth, to allow linkage of the cohort to NHS England data.

The flow of identifying data from HFEA to NHS England is permitted under the Human Fertilisation and Embryology (Disclosure of Information for Research Purposes) Regulations 2010, which enable HFEA to lawfully share this data with NHS England for the purpose of this research.

NHS England will provide the relevant records from the Civil Registrations of Death, Cancer Registration Data and HES datasets to UCL. The Data will contain no direct identifying data items but will contain a unique person ID which can be used to link with other record level data already held by the recipient (i.e., pseudonymised background fertility data for the males provided by HFEA and transferred directly to UCL).

The Data will not be transferred to any other location.

The data will be stored in the UCL Data Safe Haven (DSH).

Amazon Web Services provides cloud hosting services to UCL and will store the data as contracted by UCL. UCL uses offsite data centre services provided by VIRTUS data centre.

The Data will be accessed by authorised personnel via remote access and onsite.

UCL must confirm and provide evidence upon audit by NHS England that access via any remote device complies with the data security obligations within this DSA and the Data Sharing Framework Contract.

For remote access:
- Remote access will only be from secure locations situated within the territory of use (as further restricted elsewhere within the DSA if so done) stated within this DSA;
- Access controls granting users the minimum level of access required are in place;
- Remote access is only via secure connections (e.g., VPNs or secure protocols) to protect data;
- Multifactor authentication (MFA) is required for remote access;
- Device security, including up-to-date software and operating systems, antivirus software, and enabled firewalls are utilised for the remote access;
- All remote access is undertaken within the scope of the organisation's DSPT (or other security arrangements as per this DSA) and complies with the organisation's remote access policy.

The above applies in addition to any condition set out elsewhere within the DSA (e.g. who may carry out processing, and for what purpose).

The Data will not leave England at any time.

Access to pseudonymised data is restricted to individuals within the Population, Policy and Practice unit of the GOS ICH, who have authorisation from the Principal Investigator. All such individuals are substantive employees of UCL.

All personnel accessing the Data have been appropriately trained in data protection and confidentiality.

The Data will be linked at person record level with a pseudonymised dataset containing background fertility information for the males. This dataset, provided by the HFEA, will be transferred to UCL directly and will contain the same study ID numbers as the dataset obtained from NHS England. This will allow researchers at UCL to link the two datasets for analysis without the need for access to identifiable data.
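The person-level linkage described above can be pictured as a join on the shared study identifier, with no identifying details involved. The following minimal sketch assumes hypothetical field names (`study_id`, `treatment_year`, `hes_admissions`) and invented values; the real variable names and file layouts are not given in this text.

```python
def link_on_study_id(nhs_rows, hfea_rows, key="study_id"):
    """Join two pseudonymised extracts on a shared study identifier.

    Returns the linked rows and the identifiers from `nhs_rows` that had
    no match in `hfea_rows`, so that unmatched records can be reviewed.
    """
    background = {row[key]: row for row in hfea_rows}
    linked, unmatched = [], []
    for row in nhs_rows:
        match = background.get(row[key])
        if match is None:
            unmatched.append(row[key])
        else:
            linked.append({**match, **row})
    return linked, unmatched


# Invented example rows; no real identifiers or values.
hfea_rows = [
    {"study_id": "S001", "treatment_year": 1998},
    {"study_id": "S002", "treatment_year": 2004},
]
nhs_rows = [
    {"study_id": "S001", "hes_admissions": 2},
    {"study_id": "S003", "hes_admissions": 1},
]

linked, unmatched = link_on_study_id(nhs_rows, hfea_rows)
```

Reporting unmatched identifiers separately is a design choice in this sketch; it makes incomplete linkage visible rather than silently dropping records.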

The UCL research team will not have access to any personal identifiable data at any stage. The identifying details are held by HFEA. All analyses will use the pseudonymised dataset. There will be no requirement and no attempt to re-identify individuals when using the pseudonymised dataset.

Analysts and researchers at GOS ICH, UCL will analyse the Data for the purposes described above.

Virus Watch: Understanding community incidence, symptom profiles, and transmission of COVID-19 in relation to population movement and behaviour — DARS-NIC-372269-N8D7Z

Opt outs honoured: No - consent provided by participants of research study, No (Excuses: Consent (Reasonable Expectation))

Legal basis: Health and Social Care Act 2012 s261(2)(c)

Purposes: No (Academic)

Sensitive: Non-Sensitive, and Sensitive

When:DSA runs 2020-09 – 2021-09 2020.10 — 2025.08. SMLS reported a DPA serious incident; breached contract — audit report.

Access method: Ongoing, One-Off

Data-controller type: UNIVERSITY COLLEGE LONDON (UCL)

Sublicensing allowed: No

AGD/predecessor discussions: AGD minutes - 22 June 2023 final.pdf, IGARD Minutes - 30 September 2021 final.pdf, IGARD Draft Minutes - 22 July 2021 FINAL.pdf, igard-minutes---30th-july-2020-final.pdf, igard-minutes---28th-may-2020-final.pdf.pdf, IGARD Minutes - 2 July 2020 final.pdf, igardminutes-8thoctober2020final.pdf

Datasets:

Covid-19 UK Non-hospital Antigen Testing Results (pillar 2)
COVID-19 Second Generation Surveillance System (Beta version)
COVID-19 Second Generation Surveillance System
Civil Registration - Deaths
Emergency Care Data Set (ECDS)
Hospital Episode Statistics Admitted Patient Care
Hospital Episode Statistics Critical Care
Hospital Episode Statistics Outpatients
COVID-19 Vaccination Adverse Reactions
COVID-19 Vaccination Status
HES-ID to MPS-ID HES Outpatients
Civil Registrations of Death
COVID-19 Second Generation Surveillance System (SGSS)
COVID-19 UK Non-hospital Antigen Testing Results (Pillar 2)
Hospital Episode Statistics Admitted Patient Care (HES APC)
Hospital Episode Statistics Critical Care (HES Critical Care)
Hospital Episode Statistics Outpatients (HES OP)
COVID-19 SGSS First Positives (Second Generation Surveillance System)

Type of data: Anonymised - ICO Code Compliant, Identifiable

Objectives:

The Coronavirus (COVID-19) pandemic has caused large numbers of deaths and impacted lives around the world with the closure of schools, workplaces, and limitations on our freedom of movement. Most current knowledge of COVID-19 comes from observations at the more severe end of the disease in hospitalised patients. There is currently a lack of understanding of COVID-19 community incidence, symptom profile, severity, infectious period, risk factors, strength and duration of immunity, genetic differences in immune response, asymptomatic infection and viral shedding, household and community transmission risk and population behaviours during periods of wellness and illness (including social contact and movement and respiratory hygiene). This information can only be gathered accurately through large scale community studies. Virus Watch is one of the largest such studies anywhere in the world and will help to inform NHS planning and the national public health response.

Virus Watch is a household community cohort study. Approximately 42,500 participants will be recruited via a postal invitation or using social media platforms, and asked to fill out a baseline questionnaire, followed by weekly and monthly update questionnaires, all online. Information will be gathered on all members of participating households. There is concern about an increased risk of COVID-19 infection and death among people who are from a black or minority ethnic or migrant group. Persons from black and minority ethnic (BAME) groups and some migrant groups will be oversampled.

The approximate cohort size of 42,500 will consist of a targeted recruitment of 12,500 individuals from BAME groups and 30,000 from the general population. Persons from Poland will also be oversampled. This is because Poland is the most common European country of birth for people born abroad and resident in the UK, and Polish is the most common nationality in the UK after British, according to the ONS. Polish is also the second most common language spoken in England according to the 2011 Census. The Polish population resident in Britain is therefore a sizable and important minority population whose risk of COVID-19 infection is of interest to the researchers.

A subset of 10,000 participants will be recruited for swab and blood sampling to estimate the incidence of COVID-19 infections and development of antibody responses. Participants can also choose to submit geotracking data via their mobile phone.

The data has been requested by University College London (UCL), which is acting as both the sole data controller and the sole data processor.

The primary purpose of linking the Virus Watch questionnaire data to hospital and mortality data held by NHS Digital is to estimate population-based COVID-19 related hospital visits (accident and emergency attendances and admissions) and deaths, to address objective h). A secondary purpose of linkage to HES data is to examine how social distancing measures have affected routine use of health services (e.g., planned procedures and outpatient appointments), to address objective g).

The primary purpose of linking Virus Watch to PHE Second Generation Surveillance System (SGSS) and National Pathology Exchange (NPEX) data is to identify any laboratory confirmed infections in the cohort (including individuals who are not part of the 10,000 participant-swabbing cohort), addressing objectives a) and i)-k).

The Virus Watch study has multiple objectives:

a) To measure the frequency of respiratory infection syndromes and related behaviours across the population of England & Wales.
b) To compare the impact in different sociodemographic, occupational and ethnic groups
c) To understand reasons underlying differential mortality impact in different ethnic groups
d) To assess the impact of the pandemic control measures on different population groups
e) To monitor population movement and assess the extent to which public contact increases the risk of infection, and social distancing measures decrease the risk.
f) To assess uptake of, compliance with, and the effectiveness and impact of recommended COVID-19 control measures
g) To assess the impact of social distancing on routine use of health services
h) To measure the impact of infections on hospitalisations and deaths.
i) To measure the incidence of PCR confirmable COVID-19
j) To measure COVID-19 clinical profiles (including the range of symptoms of COVID-19 disease and the proportion of infections that are asymptomatic)
k) To measure the proportion of the population infected after each wave of the pandemic
l) To measure the protective effect of antibodies acquired through natural infection to seasonal and pandemic coronavirus.
m) To assess the accuracy of finger prick blood tests for antibodies to COVID-19 for potential use in COVID-19 control and vaccine effectiveness studies.
n) To measure the extent of pre-symptomatic and asymptomatic viral shedding in household contacts.
o) To ensure availability of specimens to measure the protective effect of T and B cell responses and to assess the value of proteomic analysis in assessing vulnerability to severe infection.

The legal basis for processing personal data for this purpose at UCL falls under Article 6(1)(e) of the General Data Protection Regulation (GDPR), i.e. “a task carried out in the public interest”. It also falls under Article 9(2)(j), “processing is necessary for archiving purposes in the public interest, scientific or historical research purposes or statistical purposes”. The processing of data for this study is a task of public interest as it will improve understanding of the risk posed by COVID-19 (such as likelihood of infection, likely symptoms, those at greatest risk of complications and the effect of risk factors such as social contact), which will help guide proportionate public responses and reinforce public health messaging.

Due to the nature of this study and the urgent national call to set it up as soon as possible, UCL did not involve participants in its design. UCL have previously conducted Patient and Public Involvement to support similar community cohort studies of acute infections using similar methodologies. UCL have engaged the Young Persons' Advisory Group for research at Great Ormond Street Hospital to provide feedback on the Children's Participant Information Sheets. UCL will provide opportunities for survey participants to comment on survey methodology at the first monthly survey and consider revisions based on this. UCL will produce regular newsletters for survey participants.

Expected Benefits:

Virus Watch will provide data relevant to a wide range of audiences involved in pandemic response. Summary data at national and regional level will be presented on open access dashboards so that it is available to all these audiences in a timely way. Audiences include:

1) THOSE PLANNING AND UNDERTAKING PUBLIC HEALTH MEASURES TO MINIMISE TRANSMISSION - Cutting-edge methods to measure contact with others (including time spent at home and work, in social venues, transport modes, conversational contact, household contact) and hand/respiratory hygiene, and to determine how these change over time, affect risk of infection and are affected by illness. Development of technologies and pathways for remotely supporting self-testing and self-isolation.
2) THOSE RESPONSIBLE FOR PLANNING THE NHS RESPONSE - Understanding of the number of people affected over time, the range of severity, health care seeking behaviour and the case hospitalisation and mortality ratios will allow better predictions of surges in NHS activity supporting measures such as triaging, cohorting, care outside hospital, cancelling routine activities etc.
3) THOSE PROVIDING FRONT LINE CARE - Improved case definitions by age, sex and ethnicity guiding targeting of diagnostics and contact tracing.
4) ACADEMIC GROUPS INVOLVED IN UNDERSTANDING THE PANDEMIC. Extensive data shared according to the principles of the Joint statement on sharing COVID-19 data.
5) THE GENERAL PUBLIC AND THE MEDIA. Better understanding of the risk posed by COVID-19 such as likelihood of infection, likely symptoms, those at greatest risk of complications and the effect of risk factors such as social contact will help guide proportionate public responses and reinforce public health messaging.

Outputs:

UCL plans to disseminate the outputs through a number of channels:

1) Journal publications (open), including The Lancet, the British Medical Journal (target dates: March 2021, June 2021)
2) Presentations to the Scientific Advisory Group for Emergencies (SAGE), (target dates: October & December 2020, February, April, June 2021); presentation to Department of Health and Social Care (target date: October 2020)
3) Presentations at scientific conferences, including European Respiratory Society Annual Congress, European Society for Paediatric Infectious Diseases Scientific meeting, Public Health England Annual conference (target dates: April 2021, September 2021)
4) Regular updates published on the VirusWatch website (http://ucl-virus-watch.net/) and results dashboard, aimed at the general public. These will be published monthly throughout the study.

All outputs will be in aggregate form only with small numbers suppressed in line with the HES analysis guide.
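
As an illustration of small-number suppression in aggregate outputs, the following is a minimal sketch only. The threshold value, the `*` marker and the function name `suppress_small_numbers` are assumptions made for this example; the actual rule must be taken from the HES analysis guide itself.

```python
from typing import Dict, Union

# Assumed threshold for illustration: non-zero counts at or below this value
# are hidden. The real value must come from the HES analysis guide.
SUPPRESSION_THRESHOLD = 7


def suppress_small_numbers(
    counts: Dict[str, int], threshold: int = SUPPRESSION_THRESHOLD
) -> Dict[str, Union[int, str]]:
    """Replace every non-zero count at or below `threshold` with a '*' marker."""
    return {
        group: ("*" if 0 < n <= threshold else n)
        for group, n in counts.items()
    }


# Hypothetical aggregate table of admissions by region.
regional_admissions = {"Region A": 124, "Region B": 3, "Region C": 0}
print(suppress_small_numbers(regional_admissions))
```

Zero counts are left visible in this sketch; whether zeros should also be masked is a policy choice that the guide, not this example, decides.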

Processing:

UCL will use the Royal Mail Post Office Address File and systematically sample addresses in order to ensure representative samples are taken within each subgroup of interest. There is increasing evidence that people from BAME groups are at greater risk of hospitalisation from COVID-19 and that their outcomes are worse. UCL will therefore oversample participants from BAME groups. UCL will also seek to recruit Polish groups.

UCL will use a commercial company to efficiently send recruitment postcards inviting households to sign up online. UCL anticipate a 25% response rate from the general population yielding approximately 30,000 participants and 15% from the BAME population yielding approx. 12,500 participants (42,500 participants in total). The sample size is required in order to have sufficient statistical power to estimate less common events, including hospital admissions, in different subgroups, including people from Black and Minority Ethnic backgrounds. Depending on response rate and demographics of those registering, UCL will undertake a second recruitment that allows them to target any underrepresented groups. Invitation postcards will be in English and include a sentence in six languages, pointing participants to the study website containing Patient Information Sheets translated into their language.
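
The recruitment figures above can be checked with a short calculation. This is a rough sketch only: it assumes, purely for illustration, that each invitation yields at most one participant, which the text does not state (postcards are sent to households, which may contain several participants). The function name `invitations_for_target` is hypothetical.

```python
import math


def invitations_for_target(target_participants: int, response_rate: float) -> int:
    """Invitations needed to reach a target at a given response rate, rounded up.

    Simplifying assumption (not from the source): one participant per invitation.
    """
    if not 0 < response_rate <= 1:
        raise ValueError("response_rate must be in the interval (0, 1]")
    return math.ceil(target_participants / response_rate)


general_population = 30_000  # anticipated at a 25% response rate
bame_population = 12_500     # anticipated at a 15% response rate

# The stated total is the sum of the two anticipated yields.
total_participants = general_population + bame_population
print(total_participants)

print(invitations_for_target(general_population, 0.25))
print(invitations_for_target(bame_population, 0.15))
```

Under that assumption the two groups would need different numbers of invitations per participant, which is one reason a lower anticipated response rate translates into a larger mail-out for the oversampled group.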

All information sheets and consent forms have been translated into 6 languages (Urdu, Bengali, Punjabi, Portuguese, French, Polish). It is not feasible to obtain verbal consent on the phone, given the number of people participating, and in any case written consent is preferred. An online system for consent and participation is the only feasible option for this size of study.

UCL will assess recruitment rates and the representativeness of the sample following the first mail out of 50,000 postcards. If recruitment is lower than expected and/or under-representative of the national population, the researchers will create a digital recruitment campaign that will use social media adverts on the following platforms: Facebook, Google, Twitter, Instagram, LinkedIn. The social media adverts will have tailored messages that aim to improve recruitment of BAME communities in addition to achieving an age, sex and geographically representative sample of the UK population. Social media users will receive the recruitment adverts and be directed to the study website http://ucl-virus-watch.net/.

In order for a household to be enrolled, they must have an internet connection and email address and all household members must agree to take part. Households will nominate a lead householder with whom the study will communicate and email weekly surveys to. The lead householder needs to be able to read English to support other household members in survey completion. The lead householder will need to be proficient in English in order to answer the weekly and monthly surveys which will be in English only.

The concept of a ‘lead householder’ has been successfully used in previous studies run by UCL, including Fluwatch and Bugwatch (see: https://doi.org/10.1093/ije/dyv370; https://doi.org/10.1136/bmjopen-2018-028676).

UCL will supply NHS Digital with study participants’ Study ID, NHS numbers, names, addresses (including postcodes) and dates of birth for linkage. The cohort would only be submitted to NHS Digital once. The NHS Digital Data linkage team will hold the VirusWatch identifiers throughout the period of the DSA and link the VirusWatch identifiers to HES, Civil Registration Deaths and SGSS datasets on a quarterly basis. Only de-identified data (Study ID and attribute variables from the datasets held by NHS Digital) will be returned to UCL. The linked data supplied by NHS Digital will be uploaded to the DSH by UCL as soon as received. A file transfer mechanism enables information to be transferred into the Safe Haven simply and securely.

UCL are requesting the following data to be linked to the VirusWatch cohort:
• Hospital Episode Statistics (HES) Admitted Patient Care
• HES Critical Care data
• HES Emergency Care Dataset
• HES Outpatient Dataset
• Civil Registration Deaths data - to ensure the researchers capture deaths for all participants, not just those who have a HES record.
• Public Health England (PHE) Second Generation Surveillance System (SGSS) data on confirmed cases of SARS-CoV-2 and other respiratory infections (influenza, respiratory syncytial virus, seasonal coronavirus, adenovirus, rhinovirus, parainfluenza virus, human metapneumovirus).
• National Pathology Exchange (NPEX) data on results of COVID-19 PCR tests carried out by commercial partners (the ‘Pillar 2’ testing programme).

The Virus Watch survey data collection will take place between June 2020 and April 2021. UCL are requesting linked HES and SGSS data for the period January 2020 (the first UK case of COVID-19 was detected in late January) until five years after the end of follow up (2026) as COVID-19 may continue to circulate in the population, and to allow UCL researchers to examine long-term health impacts of COVID-19 infection. UCL will request Civil Registration Deaths data from the start of follow-up up to 5 years after the end of the study.

UCL request that linkage to HES, SGSS and Death data is refreshed quarterly during the period of questionnaire data collection (June 2020 to April 2021). UCL may request less frequent updates after April 2021 (depending on COVID-19 circulation). This will be subject to an amended Data Sharing Agreement with NHS Digital.

All VirusWatch data, including the data requested from NHS Digital will be stored in the UCL Data Safe Haven (DSH). Patient identifying variables (including names and addresses), which are requested from survey participants, will be kept in a different file in the DSH. Identifiers will be kept separate to survey responses, laboratory test results, linked NHS Digital data and geotracking data. The UCL DSH uses Dual Factor Authentication to access and handle data transferred into the DSH service. This ensures that only the named applicants will have access to the data from the DSH.

Only researchers (UCL substantive employees or PhD students), working under appropriate supervision on behalf of the data controller/processor within this agreement will have access to the data and only for the purposes described in this agreement. These individuals are experienced in handling individual-level, sensitive data and complete annual courses in information governance and data protection (a requirement for accessing the UCL DSH).

The data supplied by NHS Digital will not be shared with third parties or linked to any other datasets. UCL have no requirement to re-identify the supplied data and will not attempt to do so.

MR1393 - Join Dementia Research — DARS-NIC-366913-C2V5F

Opt outs honoured: No - consent provided by participants of research study, No - data flow is not identifiable, No, Yes (Excuses: Consent (Reasonable Expectation))

Legal basis: Informed Patient consent to permit the receipt, processing and release of data by the HSCIC, Health and Social Care Act 2012 – s261(2)(c), Informed Patient consent to permit the receipt, processing and release of data by NHS Digital, Health and Social Care Act 2012 – s261(7), Health and Social Care Act 2012 s261(7), Health and Social Care Act 2012 s261(2)(c)

Purposes: No (Research)

Sensitive: Non Sensitive, and Sensitive

When:DSA runs 2019-02 – 2022-01 2017.09 — 2025.08. SMLS reported a DPA serious incident; breached contract — audit report.

Access method: Ongoing, One-Off

Data-controller type: DEPARTMENT OF HEALTH AND SOCIAL CARE

Sublicensing allowed: No

AGD/predecessor discussions: AGD minutes - 13th November 2025 final.pdf, AGD minutes - 3rd July 2025 Final.pdf, IGARD Minutes - 7th July 2022 final.pdf, igard-draft-minutes-10th-october-2019-final.pdf, igard-draft-minutes-11th-july-2019-final.pdf

Datasets:

MRIS - List Cleaning Report
Demographics

Type of data: Identifiable, Anonymised - ICO Code Compliant

Objectives:

The ‘Join dementia research’ register is a national service funded by the Department of Health; it enables members of the public to register to be contacted about potential research studies. In registering they consent for their information to be available to the dementia research community.

The link requested to HSCIC information will ensure people who are deceased are removed from the ‘register’ of potential research volunteers to ensure that no harm or distress is caused by contacting people who have died. The Department of Health (DOH) letter states that the ‘delegation will run up until September 2015’.
The intention is to send HSCIC information on all volunteers from the register on a monthly or quarterly basis (depending on cost). The HSCIC will simply confirm if any of the volunteers have died by supplying fact of death. No updated demographics will be provided to University College London (UCL).

Yielded Benefits:

Using the NHS List Cleaning Product has yielded several benefits to several parties:
• NIHR CRNCC has been able to remove several hundred deceased volunteers, enabling the Register to meet its required standards;
• NIHR CRNCC is able to maintain the currency of the JDR Register;
• NIHR CRNCC is able to prevent undue distress to JDR Registrants or their families by ensuring the research staff do not contact bereaved families;
• NIHR CRNCC is able to keep its promise to JDR Registrants and/or their families by ensuring the Registrant’s details are removed from the Register upon death;
• The currency of the Register fosters trust between individuals and encourages participants to sign up;
• Increased numbers signing up to the Register increase the likelihood of the JDR system being able to meet the PM challenge. The JDR Register ensures a steady supply of research participants to research studies for which they may have been matched. This increases the level of research into dementia, the potential for improving treatments for those with dementia and the likelihood of finding a cure for this terrible disease.
• Nearly 40,000 volunteers have signed up to JDR to be contacted about research opportunities; of these, over 11,000 volunteers have been enrolled into research studies. JDR has been used on over 330 research studies in over 250 NHS, university and commercial sites.

Data that has been supplied, or will continue to be supplied, from NHS Digital will not be used in support of a particular PhD or postgraduate research study.

Expected Benefits:

The benefits are that ‘Join Dementia Research’ will be able to process data fairly, without unintentionally breaching the undertaking given to volunteers that their identifiable information will be removed from the register after their deaths, and that the risk of causing distress by attempting to contact members of the cohort who are deceased will be reduced.
The following information provides background on ‘Join Dementia Research’:
The service has been running since July 2014, and was nationally launched in February 2015. The benefits described are already being recognised, but they will increase over the next 2-3 years as the register grows.
Benefits of the register -
‘Join dementia research’ has been funded by the Department of Health and is delivered in partnership with the National Institute for Health Research, Alzheimer Scotland, Alzheimer’s Research UK and the Alzheimer’s Society. Its development was prompted by the Prime Minister’s Challenge on Dementia, and its purpose is to support the PM Challenge target to ensure 10% of all people with dementia are involved in dementia research.
The benefit of this being that:
a. The system gives everyone in the country aged over 18 an opportunity to express an interest in being involved in research.
b. All dementia research studies taking place in the UK (funded by government, NIHR, charities and commercial organizations) with ethical approval can use the system, which provides a new and improved way of identifying and recruiting volunteers into vitally important dementia research studies.
c. The traditional way of recruiting dementia research volunteers is through NHS memory clinics. This method takes time, as researchers wait for suitable subjects to come through clinic. Join dementia research removes this barrier by having volunteers ready and waiting to join studies. All dementia research studies will recruit more quickly, saving time and money. Currently over 70% of research studies exceed recruitment target times; this system will shorten those times.
d. As a result of studies being concluded more quickly, we can ensure that the findings of those studies can be acted upon and implemented or considered for the benefit of patients and the public. The service will also help ensure that studies funded and delivered across the world could be attracted to take place in the UK.
e. The studies look at prevention, diagnosis, treatment, care and potentially cures for people living with dementia.
f. Over the next 12-18 months we expect the service to have attracted 100,000 volunteers and to become the main mechanism by which researchers find study volunteers.
g. The system is already supporting recruitment to 29 studies (over half of all studies on the NIHR CRN portfolio) and has recruited 219 (over 10%) of all participants into research studies such as the PROTECT study at King’s College, an important study which gathers data to support innovative research to improve our understanding of the ageing brain and why people develop dementia, and EXPEDITION 3, an Eli Lilly study which is testing a new medication for people with mild early dementia symptoms.
h. The service was nationally launched in February and announced in the media; the press release, with comments from the Secretary of State for Health and the Chief Medical Officer, is available at http://news.joindementiaresearch.nihr.ac.uk/press-pack-toolkit/
i. It will contribute to delivery of the Prime Minister’s Challenge on Dementia target of having 10% of all people with dementia involved in research.
www.joindementiaresearch.nihr.ac.uk

Outputs:

The link to HSCIC will lead to the removal of records of deceased patients from the register which enables ‘Join Dementia Research’ to comply with the following undertaking from the consent forms used when recruiting patients:
“I understand that if I withdraw, or after my death, then all identifiable information will be removed from ‘Join dementia research’.”
The timely removal of deceased patients’ records will reduce the chances of contacting people who are deceased.
The register will be updated at least quarterly and possibly more frequently.

Processing:

University College London Hospitals NHS Foundation Trust (UCLH) will periodically provide HSCIC with lists of identifying details of patients from the register. The lists will include name, date of birth, NHS number, postcode and gender. Using its List Cleaning service, the HSCIC will confirm which patients are deceased. The information is then used to remove deceased patients from the register. Once the deceased patients have been removed from the register, the data supplied by the HSCIC will be permanently deleted. The data provided by HSCIC will not be shared with, or processed by, any third party, and no third party can access records of patients deleted from the register to identify which were reported as deceased by the HSCIC.
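
The list-cleaning flow described above can be sketched as follows. This is an illustrative outline only: the volunteer IDs, the record layout and the function name `remove_deceased` are hypothetical, and the real exchange with the List Cleaning service is not modelled.

```python
from typing import Dict, Iterable, Set


def remove_deceased(
    register: Dict[str, dict], deceased_ids: Iterable[str]
) -> Dict[str, dict]:
    """Return a copy of the register without volunteers reported as deceased."""
    deceased: Set[str] = set(deceased_ids)
    return {vid: record for vid, record in register.items() if vid not in deceased}


# Hypothetical register keyed by an internal volunteer ID.
register = {
    "V001": {"name": "Volunteer One"},
    "V002": {"name": "Volunteer Two"},
    "V003": {"name": "Volunteer Three"},
}

# Hypothetical fact-of-death response: IDs confirmed as deceased.
fact_of_death = ["V002"]

register = remove_deceased(register, fact_of_death)

# Once the register has been updated, the supplied data is discarded so that
# no record remains of which volunteers were reported as deceased.
fact_of_death.clear()

print(sorted(register))
```

Clearing the in-memory list stands in for the permanent deletion described in the text; a real system would also need to delete any files or database rows holding the response.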

Evaluation of aid to diagnosis for congenital dysplasia of the hip in general practice: controlled randomised trial — DARS-NIC-309509-L2G1J

Opt outs honoured: Yes (Excuses: Section 251 NHS Act 2006)

Legal basis: Health and Social Care Act 2012 - s261(5)(d); National Health Service Act 2006 - s251 - 'Control of patient information'.

Purposes: No (Academic)

Sensitive: Non-Sensitive

When:DSA runs 2023-12 – 2026-12 2025.07 — 2025.07. SMLS reported a DPA serious incident; breached contract — audit report.

Access method: One-Off

Data-controller type: GREAT ORMOND STREET HOSPITAL FOR CHILDREN NHS FOUNDATION TRUST

Sublicensing allowed: No

AGD/predecessor discussions: AGD minutes - 6 July 2023 final.pdf

Datasets:

Hospital Episode Statistics Admitted Patient Care (HES APC)

Type of data: Identifiable

Objectives:

Great Ormond Street Hospital for Children NHS Foundation Trust (GOSH) requires access to NHS England data for the purpose of the following research project: Hip Dysplasia Screening Programme (HipDyS study)

The following is a summary of the aims of the research project provided by GOSH:
Developmental dysplasia of the hip (DDH) is one of the most common congenital abnormalities and early diagnosis is key for successful treatment. Infant hips are initially examined at birth; however, this cannot always detect cases of DDH, therefore all infants undergo a second examination with their General Practitioner (GP) at 6-8 weeks of age. Despite both of these hip checks, 1-2 in 1000 children are still diagnosed late with DDH. As a result, more than 2000 hip replacements are performed every year in the UK because of DDH. It can be harmful both to miss DDH and to incorrectly diagnose infants as having DDH.

It is not well understood why children that do not have DDH are incorrectly diagnosed as having DDH and why some who actually have DDH are not detected early enough. Studies suggest that this may be related to the examiner's knowledge, skills and/or the way the hip check consultation is conducted. The Hip Dysplasia Screening Programme (HipDyS study) seeks to address these disparities and to improve the ability of GPs to evaluate infants' hips using a diagnostic aid, which in turn could yield a financial benefit to the NHS by eliminating unnecessary referrals to secondary care, appointments and subsequent treatments for DDH. More specifically, the main aim of this randomised controlled trial is to determine whether the diagnostic aid for DDH reduces the number of clinically insignificant referrals from primary to secondary care, as well as the number of DDH cases diagnosed late.

GPs from 172 GP practices across England, who carry out the 6-week hip check on infants between 42 and 70 days old, will be divided into two groups. Eligible participants will be identified by general practice patient registers and infants will be invited to attend a 6-week check at the practice. One group of GP practices will be given the diagnostic aid, comprising a video tool and a checklist (HipDyS checklist) to use in all hip checks they carry out. The other group will screen for DDH as normal, without the use of the HipDyS checklist. The two groups will then be compared to see if the first group better identified infants with DDH than the second group. Researchers from the study will also evaluate whether using the checklist reduces costs for families around trips to doctors or hospitals, and costs to the NHS.

The following NHS England data will be accessed:
Hospital Episode Statistics
o Admitted Patient Care: necessary because inpatient data will provide details of any inpatient procedures conducted by orthopaedics (e.g. surgery) on infants after their 6-week hip check.

The level of the data will be:
Identifiable: necessary because, although the applicant will pseudonymise the data before analysis, Date of Birth is required to ensure that the data received is matched to the correct infant.

The data will be minimised as follows:
Limited to a study cohort of approximately 16,720 infants between 42 and 70 days old, invited for the 6-week hip check at participating GP surgeries in England.
Limited to data from 1st December 2020 to June 2026. For each individual patient, data will only be provided from the supplied hip check date up to the 2nd anniversary of that hip check date.
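
The per-patient date restriction above can be sketched as a small calculation. This is a minimal illustration under stated assumptions: "June 2026" is taken to mean 30 June 2026, a hip check on 29 February is given an anniversary of 28 February, and the names `EXTRACT_START`, `EXTRACT_END`, `second_anniversary` and `follow_up_window` are hypothetical.

```python
from datetime import date
from typing import Optional, Tuple

EXTRACT_START = date(2020, 12, 1)
EXTRACT_END = date(2026, 6, 30)  # assumption: "June 2026" means end of month


def second_anniversary(d: date) -> date:
    """Return the date two years after `d`."""
    try:
        return d.replace(year=d.year + 2)
    except ValueError:
        # 29 February has no counterpart two years later; assume 28 February.
        return d.replace(year=d.year + 2, day=28)


def follow_up_window(hip_check: date) -> Optional[Tuple[date, date]]:
    """Dates for which records may be supplied for one infant, or None."""
    start = max(hip_check, EXTRACT_START)
    end = min(second_anniversary(hip_check), EXTRACT_END)
    return (start, end) if start <= end else None


print(follow_up_window(date(2021, 3, 15)))
print(follow_up_window(date(2025, 1, 10)))
```

In the second call the window is cut short by the overall extract end date rather than by the 2nd anniversary, which is how the two limits interact for infants checked late in the study.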

The study has received Section 251 support from the Confidentiality Advisory Group (CAG) (19/CAG/0198) to access information on the 6-week hip check for infants who meet those criteria. The study team seek to get this information through the HES inpatient (admitted patient care) data set, as well as through central monitoring. The alternative way of following up these infants would be to visit all 172 practices and the hospitals that infants have been referred to and go through the records of at least 110 infants per GP practice at 2 years; however, the study does not have the resources (e.g. staffing or financial) to achieve this.

GOSH is the research sponsor and the controller as the organisation responsible for ensuring that the data will only be processed for the purpose described above.

The lawful basis for processing personal data under the UK GDPR is:
Article 6(1)(e) - processing is necessary for the performance of a task carried out in the public interest or in the exercise of official authority vested in the controller;

The lawful basis for processing special category data under the UK GDPR is:
Article 9(2)(j) - processing is necessary for archiving purposes in the public interest, scientific or historical research purposes or statistical purposes in accordance with Article 89(1) based on Union or Member State law which shall be proportionate to the aim pursued, respect the essence of the right to data protection and provide for suitable and specific measures to safeguard the fundamental rights and the interests of the data subject.

The study is in the public interest as it aims to reduce unnecessary costs to the NHS (thus enabling the NHS to reallocate investment) and it aims to reduce unnecessary stress on affected infants and their parents/guardians.

The funding is provided by the National Institute for Health and Care Research (NIHR). The funding is specifically for the HipDyS study as described.

University College London (UCL) is a processor acting under the instructions of GOSH. UCL's role is limited to storing and processing the data in line with the study protocol (as determined by GOSH).

Amazon Web Services (AWS) is a processor acting under the instructions of UCL. AWS's role is limited to secure back-up of data stored in UCL's Data Safe Haven.

Under the HipDyS study umbrella, a qualitative study will examine the effects of implementing the trial intervention in practice. It includes interviews and non-participant observations with a sample of GPs, parents/carers of infants and hospital consultants, conducted by a researcher from King's College London (KCL), who is leading the qualitative study with the University of Bedfordshire. The qualitative study does not include data from NHS England and neither KCL nor University of Bedfordshire have any influence over the NHS England data under this Data Sharing Agreement.

Data will only be accessed by individuals who are substantive employees of UCL.

GOSH will take advice from patient and public involvement (PPI) groups as to the best way to disseminate results to relevant service users. The PPI group comprises professionals and parents from GOSH and the STEPs charity (https://www.stepsworldwide.org/). The PPI group have played a vital role in designing the trial. They advised and collaborated with the study to put together the ethics application and developed study materials including the online training video for GPs (part of the diagnostic aid), the study questionnaires and the interview schedules for parents/carers and GPs. The PPI group were also involved in the review of the findings from prior research linked to this randomised controlled trial. The PPI group meet with the study team at various points throughout the trial, with the outcome of these meetings feeding into the trial's steering group meetings.

Expected Benefits:

At present, the number of referrals that are deemed clinically insignificant is higher than the number deemed clinically significant. In addition, 1-2 in every 1000 children are diagnosed late with DDH. The primary aim of this trial is to reduce the number of clinically insignificant referrals to secondary care (referrals that result in immediate discharge and no diagnosis). Additionally, the trial aims to reduce the number of late diagnoses of DDH and increase the accuracy of clinically significant referrals (referrals that result in treatment/monitoring), to improve health outcomes for infants found to have DDH.

Having access to HES data will allow the study team to track children that were assessed using the HipDyS trial checklist and compare their data to those who were not assessed using the checklist. This comparison will allow the trial team to assess whether the diagnostic aid reduces the number of insignificant and/or late referrals to secondary care. As such, the diagnostic aid will provide a structured approach for GPs to the examination of infants within primary care, thus utilising the referral pathways for those truly in need, with the potential of allocating NHS resources more efficiently in the long term and relieving the impact on the already strained NHS service.

Similarly, the intended benefits to infants and their parents/carers are:
• Alleviate unnecessary worry for the parents/carers of infants that would have otherwise been incorrectly diagnosed as having DDH
• Improve health outcomes of infants who do truly have DDH, as they are able to be seen and treated by a specialist more quickly
• Reduction or elimination of travel costs to unnecessary hospital appointments for parents/carers of infants that would have otherwise been incorrectly diagnosed.

The use of routine electronic health records within multicentre trials has the potential both to reduce the impact of additional trial-specific GP practice visits on clinical research teams and to significantly reduce the costs of central trial coordination and data collection.

If the results of the study are favourable, the team hope to make the diagnostic aid available to GPs by mid 2027 for use when conducting 6-week checks, alongside other current government guidance they may use on conducting these checks.

Outputs:

Trial findings will be disseminated in the following ways:

1. Presentations at national and international academic conferences to ensure members of the academic community within paediatrics, orthopaedics and behavioural psychology (regarding the qualitative and Health Economics studies) are informed. These conferences include: The European Paediatric Orthopaedic Society conference, British Society for Children's Orthopaedic Surgery, Paediatric Orthopaedic Society of North America and International Society of Behavioural Medicine conference

2. Publication in biomedical journals: the trial will be reported in accordance with the CONSORT (Consolidated Standards of Reporting Trials) statement (www.consort-statement.org). GOSH aims to publish its results in a high-impact general medical journal. GOSH will also contact the free publications received by most UK GPs to ask them to publicise the results. These journals include: The Journal of Bone & Joint Surgery, Journal of Children's Orthopaedics, Lancet, British Medical Journal, The Journal of Pediatrics, Translational Behavioural Medicine, Implementation Science Journal and British Journal of Health Psychology

3. Royal Colleges: GOSH will ensure the primary care community, physicians and nurses are informed of the results through links with the Royal Colleges of General Practitioners; Surgeons; Paediatrics & Child Health.

4. NHS: All results will be communicated to NHS England and to local Integrated Care Boards, especially the Clinical Reference Groups for Orthopaedics and Paediatrics, of which the chief investigator is a member.

5. National Institute for Health and Care Excellence (NICE), Clinical Reference Group Orthopaedics NHS England, British Society of Children's Orthopaedics: All will be made aware of trial results.

6. Service Users: GOSH will actively communicate with patient and public involvement (PPI) groups.

7. Press: UCL and Great Ormond Street have press offices, which connect with medical journalists throughout the global media.

8. Websites: GOSH will publicise results on the UCL website. GOSH will also approach Great Ormond Street Hospital Children's Charity to link to other web sites.

9. Support groups: groups such as STEPs or MumsNet will be engaged to make members of the public who have an interest in, or are affected by, hip problems aware of the trial results and how their care could be impacted.

10. Several co-applicants have international reputations and may be asked to lecture on various aspects of primary care, paediatric orthopaedics, and research methodology, thus introducing new practices into the relevant medical curriculum.

The specific journals and conferences mentioned will allow for coverage of key stakeholders in the setting of primary care, clinical trial design and clinicians involved in orthopaedics research. The aim is to begin presenting results and submitting papers to journals within months of trial results being established, approx. December 2026.

All outputs will contain only data that is aggregated with small numbers suppressed in line with the HES Analysis Guide.

Processing:

UCL will transfer data to NHS England. The data will consist of identifying details (specifically NHS Number, Date of Birth) and a unique person ID for the cohort to be linked with NHS England data.

NHS England will provide the relevant records from the HES Admitted Patient Care dataset to UCL. The data will contain directly identifying data items (Date of Birth) which are required to confirm the correct link at record level with data already held by the recipient. Additionally, the data flow from NHS England will contain a unique person ID which can be used to link the data with other record level data already held by the recipient.

The data will not be transferred to any other location.

The data will be stored on servers at UCL.

Amazon Web Services provides cloud hosting services to UCL and will store the data as contracted by UCL.

UCL uses offsite data centre services provided by VIRTUS data centre. VIRTUS does not have access to the data.

The Data will be accessed by authorised personnel via remote access.
The Controller(s) must confirm and provide evidence upon audit by NHS England that access via any remote device complies with the data security obligations within this DSA and the Data Sharing Framework Contract.
For remote access:
- Remote access will only be from secure locations situated within the territory of use (as further restricted elsewhere within the DSA if so done) stated within this DSA;
- Access controls granting users the minimum level of access required are in place;
- Remote access is only via secure connections (e.g., VPNs or secure protocols) to protect data;
- Multifactor authentication (MFA) is required for remote access;
- Device security, including up-to-date software and operating systems, antivirus software, and enabled firewalls are utilised for the remote access;
- All remote access is undertaken within the scope of the organisation's DSPT (or other security arrangements as per this DSA) and complies with the organisation's remote access policy.
The above applies in addition to any condition set out elsewhere within the DSA (e.g. who may carry out processing, and for what purpose).

The data will not leave England at any time.

Access is restricted to individuals within the department of UCL who have authorisation from the study team. All such individuals are substantive employees of UCL.

GOSH is not permitted to access the data.

All personnel accessing the data have been appropriately trained in data protection and confidentiality.

The data will be linked at person record level with the reported study data obtained from GPs.

The identifying details will be stored in a separate database to the linked dataset used for analysis. All analyses will use the pseudonymised dataset. There will be no requirement and no attempt to reidentify individuals when using the pseudonymised dataset.

Researchers from the PRIMENT Clinical Trials Unit (https://www.ucl.ac.uk/priment/home/priment-clinical-trials-unit) within UCL will process the data for the purposes described above.

PreHOspital Triage for potential stroke patients: lessONs from systems Implemented in response to COVID19 (PHOTONIC) — DARS-NIC-680546-S0V4K

Opt outs honoured: No (Excuses: Does not include the flow of confidential data, Statutory exemption to flow confidential data without consent)

Legal basis: Health and Social Care Act 2012 s261(2)(a)

Purposes: No (Academic)

Sensitive: Sensitive, and Non-Sensitive

When:DSA runs 2023-10 – 2025-10 2024.03 — 2025.07. SMLS reported a DPA serious incident; breached contract — audit report.

Access method: One-Off

Data-controller type: UNIVERSITY COLLEGE LONDON (UCL)

Sublicensing allowed: No

AGD/predecessor discussions: AGD minutes - 3 August 2023 final.pdf

Datasets:

Civil Registrations of Death - Secondary Care Cut
Emergency Care Data Set (ECDS)
Hospital Episode Statistics Admitted Patient Care (HES APC)

Type of data: Anonymised - ICO Code Compliant

Objectives:

University College London (UCL) requires access to NHS England data for the purpose of the PHOTONIC study.

The following is a summary of the aims of the research project provided by UCL:

The PHOTONIC study aims to assess how prehospital triage for suspected stroke patients was set up, run and experienced by patients, carers and staff during the COVID-19 (Coronavirus Disease 2019) pandemic. It aims to provide robust evidence on how prehospital video triage affects patient outcomes and cost-effectiveness, with the results used to improve care for both stroke patients and patients displaying stroke symptoms (stroke mimics) which were subsequently diagnosed not to be stroke.

Some of the most common stroke mimics are seizures, migraine, fainting, serious infections and functional neurological disorder.

The study will utilise four study areas (North Central London, East Kent, Maidstone, and Darent Valley) where prehospital video triage has been implemented since 2020. Throughout the analysis, the intervention is introduced at different sites at different times, and different individuals are in the analysis at each timepoint. This allows the period before implementation in the intervention sites to act as a further control. The quantitative component of the study will assess how prehospital triage affects patient transfer, care delivery and patient outcomes for stroke patients and mimics using longitudinal data.

Firstly, the study will analyse how prehospital triage was set up, run, and experienced by patients, carers, and staff.

Secondly, the study will analyse the impact of prehospital triage on healthcare services and patient outcomes. NHS England data will be studied to analyse whether introducing prehospital triage results in: more patients being taken to the right hospital service; more patients remaining at home and avoiding hospital admission; patients (stroke and non-stroke) getting the right care more quickly, and if there are any adverse effects of prehospital triage; and improving how well patients do after their stroke.

Thirdly, the study will analyse whether prehospital video triage services are delivering good value for money to the NHS. Analyses will compare performance before and after prehospital triage was introduced. Analyses will also compare the areas that are using prehospital triage with other parts of England that are not currently using it.

The following NHS England data will be accessed:
Hospital Episode Statistics Admitted Patient Care - Necessary to identify patients admitted with strokes and stroke mimic symptoms through ICD-10 diagnosis codes and to be able to obtain the overall length of stay for both subsets of patients
Emergency Care Data Set (ECDS) - necessary to identify which patients went to an emergency department and whether they arrived by ambulance.
- UCL hopes to be able to identify patients with stroke mimic symptoms by identifying those who had an ECDS diagnosis of stroke but no longer had a diagnosis of stroke once they were admitted as an inpatient (after linkage).
Civil Registration Deaths data - necessary to obtain mortality data for both subsets of patients
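
As a rough illustration of the mimic definition described above, the following Python sketch labels linked records. It is not UCL's method: the record layout, the field names and the ICD-10 prefixes treated as stroke (I61, I63, I64) are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Tuple

# Assumption: ICD-10 prefixes counted as stroke for this example only.
STROKE_ICD10_PREFIXES = ("I61", "I63", "I64")


@dataclass(frozen=True)
class LinkedRecord:
    """Hypothetical ECDS attendance already linked to its HES APC admission."""
    person_key: str
    ecds_stroke_diagnosis: bool           # stroke recorded in the emergency department
    apc_diagnosis_codes: Tuple[str, ...]  # ICD-10 codes recorded on admission


def has_stroke_code(codes: Tuple[str, ...]) -> bool:
    """True if any admission diagnosis starts with an assumed stroke prefix."""
    return any(code.upper().startswith(STROKE_ICD10_PREFIXES) for code in codes)


def classify(record: LinkedRecord) -> str:
    """Label a linked record as 'stroke', 'mimic' or 'out_of_scope'."""
    if not record.ecds_stroke_diagnosis:
        return "out_of_scope"
    return "stroke" if has_stroke_code(record.apc_diagnosis_codes) else "mimic"
```

A record with an emergency department stroke diagnosis and an admission code outside the assumed stroke prefixes is labelled a mimic; the real code lists would come from the study's own specification.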

The level of data will be pseudonymised.

The data for the cohort will be minimised as follows:
Limited to data between April 2019 and the latest month available
Limited to a study cohort identified by NHS England, restricted to conditions relevant to the study as identified by specific diagnosis codes;
Limited to individuals (in addition to meeting the diagnosis criteria above) who were aged 18 or over at the start of the episode.
Limited to England - having a larger control group from the whole of England would provide a larger and more representative sample, thereby diluting the effects of potential contamination and heterogeneity in stroke services.
Civil Registration Deaths will be restricted to individuals meeting the ECDS or HES APC criteria as specified above.

UCL is the research sponsor and the controller as the organisation responsible for ensuring that the data will only be processed for the purpose described above.

The lawful basis for processing personal data under the UK GDPR is:
Article 6(1)(e) - processing is necessary for the performance of a task carried out in the public interest or in the exercise of official authority vested in the controller;

The lawful basis for processing special category data under the UK GDPR is:
Article 9(2)(j) - processing is necessary for archiving purposes in the public interest, scientific or historical research purposes or statistical purposes in accordance with Article 89(1) based on Union or Member State law which shall be proportionate to the aim pursued, respect the essence of the right to data protection and provide for suitable and specific measures to safeguard the fundamental rights and the interests of the data subject.

The study has been funded by the National Institute for Health and Care Research (NIHR) Health and Social Care Delivery Research programme, and the funding is specifically for the PHOTONIC study described.

Amazon Web Services (AWS) is a processor acting under the instructions of UCL. AWS's role is limited to secure backup of data stored in UCL's Data Safe Haven.

UCL uses offsite data centre services provided by VIRTUS data centre. VIRTUS does not have access to the data.

Patients and the public were central to the study from the outset. The PPIE group includes two stroke survivors and a Stroke Association representative. The patient representatives provided detailed feedback on the wording and content of the research application. This included clarifying the language used in the plain English summary and research questions; patient representatives also influenced how the case for the research was presented (e.g. identifying ways in which the research might benefit the NHS at system-level, and encouraging UCL to foreground the need for prompt, appropriate care for both stroke and non-stroke patients), the study design (e.g. helping us clarify the approach to conducting and analysing interviews), and knowledge mobilisation strategy (e.g. identifying several relevant dissemination opportunities).

The team incorporated this valuable feedback throughout. In addition, over the course of preparing this study the team consulted with members of the South East Coast Ambulance Service (SECAmb) patient representative group, who confirmed that they are fully supportive of the purpose and approach of this work and have agreed to join the Study Steering Committee. The stroke survivor representatives are full members of the Study team. They will thus participate in the monthly team meetings and contribute to all aspects of the research that they wish to, e.g. research strategy, recruitment documents, interpretation of findings, co-authoring articles and summaries, and presenting at events.

Expected Benefits:

It is hoped the findings of this research study will influence UK national guidelines on the use of prehospital video triage for patients with stroke. UCL expects that the intervention will improve patient transfer and care delivery, and consequently patient outcomes in terms of length of stay and mortality.

The use of the data could:
Help the system to better understand the health and care needs of populations.
Help inform health economic models which will generate incremental cost-effectiveness ratios (ICERs). When judged against cost-effectiveness thresholds, ICERs provide policymakers such as the National Institute for Health and Care Excellence (NICE) with information on the cost-effectiveness of prehospital triage.
Advance understanding of regional and national trends in health and social care needs.
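
For readers unfamiliar with ICERs, the standard definition is the difference in cost between two options divided by the difference in their effect. The Python sketch below shows that arithmetic only; the function names, the example figures and the threshold comparison are illustrative assumptions and are not taken from the PHOTONIC analysis plan.

```python
def icer(cost_new: float, cost_old: float,
         effect_new: float, effect_old: float) -> float:
    """Incremental cost per unit of additional effect (e.g. per QALY gained)."""
    delta_effect = effect_new - effect_old
    if delta_effect == 0:
        # The ratio is undefined when both options have the same effect.
        raise ValueError("effects are equal; ICER is undefined")
    return (cost_new - cost_old) / delta_effect


def within_threshold(ratio: float, threshold: float) -> bool:
    """Simplified check, meaningful only when the new option is both
    more costly and more effective than the comparator."""
    return ratio <= threshold


# Hypothetical figures: 2,000 extra cost for 0.5 extra units of effect.
example_ratio = icer(12000.0, 10000.0, 6.5, 6.0)
```

In this hypothetical case the ratio is 4,000 per unit of effect, which a decision-maker would then compare against whatever threshold applies.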

If the intervention is found not to be effective and/or cost effective, the study will provide the intervention sites with justification to discontinue prehospital video triage for stroke patients. Time and resources can then be used more efficiently elsewhere.

Patients will benefit from findings as policymakers are better informed on the most effective form of emergency care for stroke patients. It is crucial to identify whether prehospital video triage results in more efficient and timely care for patients since this will have a positive impact on patient outcomes for stroke patients, and free up resources for the treatment of other patients in the NHS.

The study's results are urgent as they will inform sites that are yet to roll out prehospital video triage, preventing them from potentially implementing an intervention which is not beneficial.

It is hoped that publication of findings will add to the body of evidence that is considered by those making decisions that affect patients and the public such as organisations setting policy and strategy or health and care professionals delivering individual care.

The results from this study will feed into a wider body of research exploring the use of prehospital video triage in stroke patients. A national pilot study will roll out trial periods of prehospital triage across England in 2023. This pilot is overseen by a group of individuals, two of whom are the National Speciality Adviser and the GIRFT National Clinical Lead for Stroke, who are both members of the PHOTONIC team. There are therefore vital links in place to ensure the results of PHOTONIC reach the relevant sites.

Outputs:

The expected outputs of the processing will be:
A final report of findings to key stakeholders (July 2024)
Submissions to peer reviewed journals (Open access from July 2024)

The outputs will be communicated to relevant recipients through the following dissemination channels:
A final report of findings to key stakeholders (July 2024)
Submission to open access peer reviewed journals (Open access from July 2024)
Presentations to conferences
Accessible summaries
Short films
Slide sets
PHOTONIC podcast series
Webinars

The outputs will not contain NHS England data and will only contain aggregated information with small numbers suppressed as appropriate in line with the relevant disclosure rules for the dataset(s) from which the information was derived. Outputs from the analysis are expected July 2024.
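
Small-number suppression of an aggregated table can be sketched as follows. This is a minimal illustration, not the rule that applies to these datasets: the threshold of 8 and the "*" marker are assumptions, and the applicable disclosure guidance defines the real values (and may also require rounding).

```python
from typing import Dict, Union


def suppress_small_numbers(
    counts: Dict[str, int],
    threshold: int = 8,   # assumption: counts from 1 to threshold-1 are hidden
    marker: str = "*",    # assumption: placeholder shown instead of the count
) -> Dict[str, Union[int, str]]:
    """Replace non-zero counts below the threshold with a marker."""
    return {
        group: (marker if 0 < n < threshold else n)
        for group, n in counts.items()
    }


# Hypothetical aggregated output: admissions by discharge destination.
table = {"home": 412, "care_home": 5, "other_hospital": 0}
safe_table = suppress_small_numbers(table)
```

Zero counts are left visible in this sketch; whether zeros may be published is itself a matter for the relevant disclosure rules.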

Processing:

No data will flow to NHS England for the purposes of this agreement.

NHS England will provide the relevant records from the HES APC, ECDS and Civil Registration Deaths datasets to UCL. The data will contain no direct identifying data items. The data will be pseudonymised and individuals cannot be reidentified through linkage with other data in the possession of the recipient. For the avoidance of doubt, the data from NHS England is not linked with any other data.

The data will not be transferred to any other location and will be stored on the servers at UCL in the Data Safe Haven. Amazon Web Services provides cloud hosting services to UCL and will store the data as contracted by UCL.

The data will be accessed onsite at the premises of UCL or by authorised personnel from the Institute of Epidemiology and Health Care at UCL via remote access.

The Controller must confirm and provide evidence upon audit by NHS England that access via any remote device complies with the data security obligations within this DSA and the Data Sharing Framework Contract.

For remote access:

- Remote access will only be from secure locations situated within the territory of use (as further restricted elsewhere within the DSA if so done) stated within this DSA;
- Access controls granting users the minimum level of access required are in place;
- Remote access is only via secure connections (e.g., VPNs or secure protocols) to protect data;
- Multifactor authentication (MFA) is required for remote access;
- Device security, including up-to-date software and operating systems, antivirus software, and enabled firewalls are utilised for the remote access;
- All remote access is undertaken within the scope of the organisation's DSPT (or other security arrangements as per this DSA) and complies with the organisation's remote access policy.

The above applies in addition to any condition set out elsewhere within the DSA (e.g. who may carry out processing, and for what purpose).

The data will not leave England at any time.

Access is restricted to individuals within the Institute of Epidemiology and Health Care of UCL who have authorisation from the Principal Investigator of the PHOTONIC Study. All such individuals are substantive employees of UCL. All personnel accessing the data have been appropriately trained in data protection and confidentiality.

The ECDS and HES APC dataset extracts will be sent separately by NHS England to UCL and linked at the person record level using EPIKEY by authorised personnel at UCL.
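
The linkage step described above amounts to a join of two extracts on a shared pseudonymised key. A minimal sketch, assuming each extract is a list of dictionaries that carries an "EPIKEY" field (the row layout and other field names are hypothetical):

```python
from collections import defaultdict
from typing import Dict, List


def link_on_key(ecds_rows: List[dict], apc_rows: List[dict],
                key: str = "EPIKEY") -> List[dict]:
    """Attach to each ECDS row the APC rows that share the same key value."""
    apc_by_key: Dict[str, List[dict]] = defaultdict(list)
    for row in apc_rows:
        apc_by_key[row[key]].append(row)

    linked = []
    for row in ecds_rows:
        # .get avoids creating empty entries for unmatched keys
        linked.append({"ecds": row, "apc": apc_by_key.get(row[key], [])})
    return linked


# Hypothetical extracts
ecds = [{"EPIKEY": "k1", "arrival_mode": "ambulance"},
        {"EPIKEY": "k2", "arrival_mode": "walk-in"}]
apc = [{"EPIKEY": "k1", "length_of_stay": 4}]
linked_rows = link_on_key(ecds, apc)
```

Unmatched ECDS rows are kept with an empty list, which makes attendances that did not lead to an admission easy to count.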

The patients in these extracts are linked to Civil Registration (Deaths). UCL will receive the full Date of Death (DOD) and on receipt, UCL will derive 30-day and 90-day mortality indicators and destroy the full DOD.
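
Deriving the mortality indicators and then discarding the full date of death could look like the sketch below. The field names, the choice of index date and the convention that "within 30 days" includes day 30 are assumptions for illustration; the study's own definitions take precedence.

```python
from datetime import date
from typing import Dict, Optional


def mortality_flags(index_date: date,
                    date_of_death: Optional[date]) -> Dict[str, bool]:
    """30-day and 90-day mortality relative to an assumed index date."""
    if date_of_death is None:
        return {"died_within_30_days": False, "died_within_90_days": False}
    days = (date_of_death - index_date).days
    return {
        "died_within_30_days": 0 <= days <= 30,
        "died_within_90_days": 0 <= days <= 90,
    }


def derive_and_drop_dod(record: dict) -> dict:
    """Return a copy of the record with indicators added and the full DOD removed."""
    flags = mortality_flags(record["index_date"], record.get("date_of_death"))
    kept = {k: v for k, v in record.items() if k != "date_of_death"}
    kept.update(flags)
    return kept


# Hypothetical record: death 59 days after the index date
row = {"person_key": "p1",
       "index_date": date(2021, 1, 1),
       "date_of_death": date(2021, 3, 1)}
derived = derive_and_drop_dod(row)
```

Only the two indicators survive in the derived record, which mirrors the stated intention to destroy the full date once the indicators exist.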

The analysis will use various models to compare mortality, length of stay and discharge destination between patients in intervention and control areas. These results will feed into the health economic decision models.

There will be no requirement and no attempt to reidentify individuals when using the data. Researchers from the Institute of Epidemiology and Health Care at UCL will analyse the data for the purposes described above.

MR737 - ESRC MILLENNIUM COHORT STUDY (MCS) child of the new century — DARS-NIC-147860-0RSHN

Opt outs honoured: N, Yes - patient objections upheld, Yes

Legal basis: Informed Patient consent to permit the receipt, processing and release of data by the HSCIC, Health and Social Care Act 2012 – s261(7), , Informed Patient consent to permit the receipt, processing and release of data by NHS Digital, Health and Social Care Act 2012 s261(7), Health and Social Care Act 2012 s261(7); National Health Service Act 2006 - s251 - 'Control of patient information'.; Other-National Health Service Act 2006 - s251 - 'Control of patient information'. ,, Health and Social Care Act 2012 s261(7); National Health Service Act 2006 - s251 - 'Control of patient information'., Health and Social Care Act 2012 s261(7); National Health Service Act 2006 - s251 - 'Control of patient information'. ; Other-National Health Service Act 2006 - s251 - 'Control of patient information'. ,, Health and Social Care Act 2012 - s261(5)(d); National Health Service Act 2006 - s251 - 'Control of patient information'., Health and Social Care Act 2012 - s261(5)(d); National Health Service Act 2006 - s251 - 'Control of patient information'. ; Other-National Health Service Act 2006 - s251 - 'Control of patient information'. ,

Purposes: No (Academic)

Sensitive: Non Sensitive, and Sensitive, and Non-Sensitive

When:DSA runs 2011-11 – 2026-11 2016.04 — 2025.07. SMLS reported a DPA serious incident; breached contract — audit report.

Access method: Ongoing, One-Off

Data-controller type: UNIVERSITY OF LONDON (UOL), UNIVERSITY COLLEGE LONDON (UCL)

Sublicensing allowed: No, Yes

AGD/predecessor discussions: AGD minutes - 21st November 2024 final.pdf, AGD Minutes - 27 June 2024 final.pdf, IGARD Minutes - 24th June 2021 final.pdf

Datasets:

MRIS - Scottish NHS / Registration
MRIS - Cause of Death Report
MRIS - Cohort Event Notification Report
MRIS - Flagging Current Status Report
MRIS - Members and Postings Report
MRIS - Personal Demographics Service
Civil Registration - Deaths
Demographics
Civil Registrations of Death

Type of data: Identifiable

Objectives:

ESRC Millennium Cohort Study (MCS) Child of the New Century

A longitudinal cohort study involving interviews with the parents of participating babies, when the baby was about 9-10 months old. The main aim is to lay the foundations of a multi-purpose dataset to be used into the future by the researchers. In the short term, it will enable a national study of childhood in the first years of the new Millennium.

The objectives are to:
1) Chart the initial conditions of social, economic and health advantages and disadvantages facing new children in the new century, capturing information that the research community of the future will require.
2) Provide a basis for comparing patterns of development with that of members of the preceding cohorts.
3) Collect information on previously neglected topics, such as father's involvement in the children's care and development.
4) Focus on the children's parents as the most immediate elements of the child's 'background', charting their experience as mothers and fathers of this year's babies, to record how they (and any other children in the family) are adapting to the newcomer, and what their aspirations for his/her future may be.
5) Establish intergenerational links including those back to the parents' own childhood.
6) Investigate the wider social ecology of the family including social networks, civic engagement and community facilities and services, splicing in geo-coded data when available.
7) Improve the quality and completeness of the health data and linkage on pregnancy and birth by collecting information available on hospital records and birth registrations. Assessing the health and other outcomes of recruited individuals over a life-long period in relation to the social, health and clinical conditions identified around and at the time of birth.

AspECT EXceL- Aspirin Esomeprazole Chemoprevention Trial- Section 251 Subset of the Cohort — DARS-NIC-776114-J7Q9J

Opt outs honoured: Yes (Excuses: Section 251 NHS Act 2006)

Legal basis: Health and Social Care Act 2012 - s261(5)(d)

Purposes: No (Academic)

Sensitive: Sensitive

When:DSA runs 2025-02 – 2028-02 2025.04 — 2025.06. SMLS reported a DPA serious incident; breached contract — audit report.

Access method: One-Off

Data-controller type: UNIVERSITY COLLEGE LONDON (UCL)

Sublicensing allowed: No

AGD/predecessor discussions: AGD minutes - 30th January 2025 - Final.pdf

Datasets:

Civil Registrations of Death
Medicines dispensed in Primary Care (NHSBSA data)
NDRS Cancer Consolidated Data Set

Type of data: Identifiable

Objectives:

The University College London requires access to NHS England data for the purpose of the following Clinical Trial:

AspECT EXceL - Aspirin Esomeprazole Chemoprevention Trial EXtension Long-term; for the definitive risks vs benefits.

AspECT EXceL is a phase 4 study that involves long term follow-up of individuals previously recruited into the AspECT trial. There are no further direct patient interventions required. This study aims to capture a single data snapshot.

As part of the original AspECT trial, participants were assigned to receive either:

1) Low dose oral Proton Pump Inhibitor (PPI) (Esomeprazole) and no aspirin
2) High dose PPI (Esomeprazole) and no aspirin
3) Low dose PPI (Esomeprazole) and aspirin
4) High dose PPI (Esomeprazole) and aspirin

Inclusion Criteria:
1. Participants recruited to the original AspECT trial.
2. AspECT participants who have signed an AspECT EXceL consent form
3. Participants who have given consent to allow access to AspECT data

Exclusion Criteria:
1. Participants who are alive and unable or unwilling to give consent to participate in AspECT EXceL
2. Participants unwilling to give consent to allow access to AspECT data
3. Participants not included in the original AspECT trial

Data accessed under this agreement is for the deceased participants only.

Primary Outcomes:
The primary composite endpoint is time to all-cause mortality, oesophageal adenocarcinoma, or high-grade dysplasia, whichever occurs first, between randomisation into AspECT and the single data capture of AspECT EXceL.
This will be analysed with accelerated failure time modelling adjusted for minimisation factors (age, Barrett's oesophagus length, intestinal metaplasia) in all participants in the intention-to-treat population.

Key Secondary Outcomes:
1. Effects of combined PPI and aspirin on each of the three components of the primary endpoint separately.
2. Time to progression to low grade dysplasia.
3. Death from oesophageal cancer.
4. New solid Gastrointestinal (GI) tumours.
5. COVID-19 related deaths.
6. The Charlson Comorbidity Index (CCI).

The primary composite endpoint in AspECT was time to all-cause mortality, oesophageal adenocarcinoma, or high-grade dysplasia, which was analysed with accelerated failure time modelling adjusted for minimisation factors (age, Barrett's oesophagus length, intestinal metaplasia) in all participants in the intention-to-treat population. University College London now wish to follow up the same endpoints in AspECT trial participants in the AspECT EXceL study.

The data provided will be more accurate and complete than what can be obtained from hospital sites' completion of Case Report Forms, which relies on medical records that are often outdated or missing as a result of patients having moved hospitals after the original AspECT trial and the restructuring of NHS Hospital Trusts. If the analysis confirms that aspirin and PPI use leads to lower rates of oesophageal cancer, this may lead to changes in NICE recommendations/national guidelines for the use of these drugs in the prevention of oesophageal cancer, which is more cost-effective for the NHS compared to the current standard for diagnosis, endoscopy, which often involves 1+ year waiting lists for patients.

The following NHS England Data will be accessed:

Civil Registrations of Death- necessary to support the analysis of time to progression to oesophageal cancer and/or death against Aspirin and PPI use.

NDRS Cancer Consolidated Data Set- necessary to support the analysis of time to progression to oesophageal cancer against Aspirin and PPI use.

Medicines Dispensed in Primary Care- necessary to support the analysis of aspirin and PPI use for the prevention of oesophageal cancer.

The level of the Data will be:
Identifiable

The Data will be minimised as follows:
Limited to a trial- This subset of the cohort consists of 300 individuals (deceased patients of the AspECT EXceL trial)
Limited to data from
- NHSBSA and NDRS Cancer Consolidated Data Set - 2017 to latest available
- CRD- Latest available

The University College London is the research sponsor and the controller as the organisation responsible for ensuring that the Data will only be processed for the purpose described above.

UCL has delegated responsibility for the overall management of AspECT EXceL to the Comprehensive Clinical Trials Unit (CCTU) based at UCL.

The lawful basis for processing personal data under the UK GDPR is:
Article 6(1)(e) - processing is necessary for the performance of a task carried out in the public interest or in the exercise of official authority vested in the controller.

The lawful basis for processing special category data under the UK GDPR is:
Article 9(2)(j) - processing is necessary for archiving purposes in the public interest, scientific or historical research purposes or statistical purposes in accordance with Article 89(1) based on Union or Member State law which shall be proportionate to the aim pursued, respect the essence of the right to data protection and provide for suitable and specific measures to safeguard the fundamental rights and the interests of the data subject.

This processing is in the public interest because it adheres to the UK Policy Framework for Health and Social Care Research, which protects and promotes the interests of patients, service users and the public, and aims to produce generalisable and publicly available information to inform future decisions over patients' treatments or care.

The funding is provided by Cancer Research UK (CRUK). The funding is specifically for the trial described.

Amazon Web Services (AWS) is a processor acting under the instructions of UCL. AWS's role is limited to secure backup of data stored in UCL's Data Safe Haven.

UCL uses offsite data centre services provided by VIRTUS data centre. VIRTUS does not have access to the data.

There are multiple clinical collaborators and patient and public involvement (PPI) representatives involved in the management of the trial. They will not be involved with, or advise on, data processing, but are likely to help interpret the aggregated results (with small numbers suppressed) for the purpose of the report.

The Oesophageal Patients Association (OPA) has worked with UCL to support the study throughout its duration, with 4 Patient and Public Involvement (PPI) members who bring broad experience to the PPI group.

Expected Benefits:

The findings of this research study are expected to contribute to evidence-based decision-making for policy-makers, local decision-makers such as doctors, and patients to inform best practice to improve the care, treatment and experience of health care users relevant to the subject matter of the study.

The use of the data could:
Lead to the identification or improvement of treatments or interventions, or health and care system design to improve health and care outcomes or experience.
Advance understanding of the need for, or effectiveness of, preventative health and care measures for the Barrett's oesophagus patient population who are at risk of developing oesophageal adenocarcinoma.
Inform planning health services and programmes, for example to improve equity of access, experience and outcomes.
Inform decisions on how to effectively allocate and evaluate funding according to health needs.
Support knowledge creation or exploratory research (and the innovations and developments that might result from that exploratory work).

Specific benefits to patients expected are:
- Establishing the optimum dose, duration or age to begin aspirin therapy for the prevention of oesophageal cancer.
- Preventing Barrett's patients, who are often on waiting lists of 1+ years for endoscopies (the current standard for diagnosis), from having undiagnosed and untreated oesophageal cancer for a long time.

It is hoped that through publication of findings in appropriate media, the findings of this research will add to the body of evidence that is considered by the bodies, organisations and individual care practitioners charged with making policy decisions for or within the NHS or treatment decisions in relation to specific patients. E.g. for NICE to make national recommendations for the use of these drugs to prevent cancer.

CRUK has funded the study, and the controller has provided regular updates on the study to CRUK. The controller will also generate a report of the study findings at the end of the study, which will support the findings from the original AspECT study, which CRUK also funded.

Outputs:

The expected outputs of the processing will be:

A single report of findings to CRUK, the trial funder, upon completion of the trial.
Single submission to peer reviewed journals upon completion of the trial.

The outputs will not contain NHS England Data and will only contain aggregated information with small numbers suppressed as appropriate in line with the relevant disclosure rules for the dataset(s) from which the information was derived.

The outputs will be communicated to relevant recipients through the following dissemination channels:
Journals
Webinars
Social media
Public reports
Industry newsletters
Press/media engagement
Public promotion of the research via journals or presentations
Participant newsletters

The target period to produce these outputs will be soon after the analysis of the data has been completed, approximately August 2025.

Processing:

UCL CCTU will transfer data to NHS England. The data will consist of identifying details (specifically NHS Number, Date of Birth, Family Name, Gender) and a Study ID for the cohort to be linked with NHS England data.

NHS England will provide the relevant records from the Civil Registrations Deaths, NDRS Rapid Cancer Registrations and Medicines Dispensed In Primary Care datasets to UCL.

The Data will contain directly identifying data items which are required for analysis purposes and a unique person ID which can be used to link the Data with other record level data already held by the recipient.

The Data will be stored on servers at UCL.

Amazon Web Services provides cloud hosting services to UCL and will store the data as contracted by UCL.

The Controller must confirm and provide evidence upon audit by NHS England that access via any remote device complies with the data security obligations within this DSA and the Data Sharing Framework Contract.

For remote access:
- Remote access will only be from secure locations situated within the territory of use (as further restricted elsewhere within the DSA if so done) stated within this DSA;
- Access controls granting users the minimum level of access required are in place;
- Remote access is only via secure connections (e.g., VPNs or secure protocols) to protect data;
- Multifactor authentication (MFA) is required for remote access;
- Device security, including up-to-date software and operating systems, antivirus software, and enabled firewalls are utilised for the remote access;
- All remote access is undertaken within the scope of the organisation's DSPT (or other security arrangements as per this DSA) and complies with the organisation's remote access policy.

The above applies in addition to any condition set out elsewhere within the DSA (e.g. who may carry out processing, and for what purpose).

The Data will not leave England or Wales at any time.

Access is restricted to individuals within the CCTU at UCL who work on the AspECT EXceL trial.

Data will be accessed by an individual (the Chief Investigator) with an honorary contract with UCL. This individual will act as an agent of UCL at all times under supervision from employees of UCL. Aside from this individual, access is restricted to employees or agents of UCL who have authorisation from the Principal Investigator.

Oxford Oncology Clinical Trials Office (OCTO), the sponsor of the original AspECT study, and hospital sites are not permitted to access the Data.

All personnel accessing the Data have been appropriately trained in data protection and confidentiality.

The Data will be linked at person record level with their data from the original AspECT study obtained from OCTO, in addition to AspECT EXceL trial data obtained from hospital sites.

The identifying details will be stored in a separate database to the linked dataset used for analysis.

AspECT EXceL- Aspirin Esomeprazole Chemoprevention Trial- Consenting Subset of the Cohort — DARS-NIC-644891-N5T3S

Opt outs honoured: No (Excuses: Consent (Reasonable Expectation))

Legal basis: Health and Social Care Act 2012 s261(2)(c)

Purposes: No (Academic)

Sensitive: Sensitive

When:DSA runs 2025-02 – 2028-02 2025.04 — 2025.06. SMLS reported a DPA serious incident; breached contract — audit report.

Access method: One-Off

Data-controller type: UNIVERSITY COLLEGE LONDON (UCL)

Sublicensing allowed: No

AGD/predecessor discussions: AGD minutes - 30th January 2025 - Final.pdf

Datasets:

Civil Registrations of Death
Medicines dispensed in Primary Care (NHSBSA data)
NDRS Cancer Consolidated Data Set

Type of data: Identifiable

Objectives:

The University College London requires access to NHS England data for the purpose of the following Clinical Trial:

AspECT EXceL - Aspirin Esomeprazole Chemoprevention Trial EXtension Long-term; for the definitive risks vs benefits.

AspECT EXceL is a phase 4 study that involves long term follow-up of individuals previously recruited into the AspECT trial. There are no further direct patient interventions required. This study aims to capture a single data snapshot.

As part of the original AspECT trial, participants were assigned to receive either:

1) Low dose oral Proton Pump Inhibitor (PPI) (Esomeprazole) and no aspirin
2) High dose PPI (Esomeprazole) and no aspirin
3) Low dose PPI (Esomeprazole) and aspirin
4) High dose PPI (Esomeprazole) and aspirin

Inclusion Criteria:
1. Participants recruited to the original AspECT trial.
2. AspECT participants who have signed an AspECT EXceL consent form or have the use of personal data covered by the Data Sharing Agreement (DSA) reference DARS-NIC-776114-J7Q9J (the Section 251 subset of the cohort)
3. Participants who have given consent to allow access to AspECT data or have access covered by DARS-NIC-776114-J7Q9J

Exclusion Criteria:
1. Participants who are alive and unable or unwilling to give consent to participate in AspECT EXceL.
2. Participants unwilling to give consent to allow access to AspECT data and are not covered by DARS-NIC-776114-J7Q9J
3. Participants not included in the original AspECT trial

Primary Outcomes:
The primary composite endpoint is time to all-cause mortality, oesophageal adenocarcinoma, or high-grade dysplasia, whichever occurs first, between randomisation into AspECT and the single data capture of AspECT EXceL.
This will be analysed with accelerated failure time modelling adjusted for minimisation factors (age, Barrett's oesophagus length, intestinal metaplasia) in all participants in the intention-to-treat population.
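
As an illustration only, the time-to-first-event logic behind a composite endpoint of this kind can be sketched as below. The function name `time_to_composite_event`, its parameters and the censoring rule (censor at the data capture date when no component event is recorded) are assumptions made for this sketch and are not taken from the trial's analysis plan. The accelerated failure time modelling itself is not shown.

```python
from datetime import date
from typing import Optional, Tuple


def time_to_composite_event(
    randomisation: date,
    data_capture: date,
    death: Optional[date] = None,
    adenocarcinoma: Optional[date] = None,
    high_grade_dysplasia: Optional[date] = None,
) -> Tuple[int, bool]:
    """Return (days from randomisation, event_observed).

    The event is whichever component occurs first between randomisation
    and the data capture date. If none is recorded, the participant is
    treated as censored at data capture (an assumption of this sketch).
    """
    candidates = [
        d
        for d in (death, adenocarcinoma, high_grade_dysplasia)
        if d is not None and randomisation <= d <= data_capture
    ]
    if candidates:
        return (min(candidates) - randomisation).days, True
    return (data_capture - randomisation).days, False
```

The pair returned for each participant (duration and event indicator) is the usual input format for survival models, which is why the sketch stops at that point.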

Key Secondary Outcomes:
1. Effects of combined PPI and aspirin on each of the three components of the primary endpoint separately.
2. Time to progression to low grade dysplasia.
3. Death from oesophageal cancer.
4. New solid Gastrointestinal (GI) tumours.
5. COVID-19 related deaths.
6. The Charlson Comorbidity Index (CCI).

The primary composite endpoint in AspECT was time to all-cause mortality, oesophageal adenocarcinoma, or high-grade dysplasia, which was analysed with accelerated failure time modelling adjusted for minimisation factors (age, Barrett's oesophagus length, intestinal metaplasia) in all participants in the intention-to-treat population. University College London now wishes to follow up the same endpoints in AspECT trial participants in the AspECT EXceL study.

The data provided will be more accurate and complete than what can be obtained from hospital sites' completion of Case Report Forms, which relies on medical records that are often outdated or missing because patients have moved hospitals since the original AspECT trial and NHS Hospital Trusts have been restructured. If the analysis confirms that aspirin and PPI use leads to lower rates of oesophageal cancer, this may lead to changes in NICE recommendations or national guidelines for the use of these drugs in the prevention of oesophageal cancer, an approach that is more cost-effective for the NHS than the current standard for diagnosis, endoscopy, which often involves waiting lists of a year or more for patients.

The following NHS England Data will be accessed:

Civil Registrations of Death (CRD)- necessary to support the analysis of time to progression to oesophageal cancer and/or death against Aspirin and PPI use.

NDRS Cancer Consolidated Data Set- necessary to support the analysis of time to progression to oesophageal cancer against Aspirin and PPI use.

Medicines Dispensed in Primary Care- necessary to support the analysis of aspirin and PPI use for the prevention of oesophageal cancer.

The level of the Data will be:
Identifiable

The Data will be minimised as follows:
Limited to a trial- This subset of the cohort consists of 550 individuals who survived and regained capacity in order to provide informed patient consent to participate in the study.
Limited to data from
- NHSBSA and NDRS Cancer Consolidated Data Set - 2017 to latest available
- CRD- Latest available

University College London is the research sponsor and the controller, as the organisation responsible for ensuring that the Data will only be processed for the purpose described above.

UCL has delegated responsibility for the overall management of AspECT EXceL to the Comprehensive Clinical Trials Unit (CCTU) based at UCL.

The lawful basis for processing personal data under the UK GDPR is:
Article 6(1)(e) - processing is necessary for the performance of a task carried out in the public interest or in the exercise of official authority vested in the controller.

The lawful basis for processing special category data under the UK GDPR is:
Article 9(2)(j) - processing is necessary for archiving purposes in the public interest, scientific or historical research purposes or statistical purposes in accordance with Article 89(1) based on Union or Member State law which shall be proportionate to the aim pursued, respect the essence of the right to data protection and provide for suitable and specific measures to safeguard the fundamental rights and the interests of the data subject.

This processing is in the public interest because it adheres to the UK Policy Framework for Health and Social Care Research, which protects and promotes the interests of patients, service users and the public, and aims to produce generalisable and publicly available information to inform future decisions over patients' treatments or care.

The funding is provided by Cancer Research UK (CRUK). The funding is specifically for the trial described.

Amazon Web Services (AWS) is a processor acting under the instructions of UCL. AWS's role is limited to secure backup of data stored in UCL's Data Safe Haven.

UCL uses offsite data centre services provided by VIRTUS data centre. VIRTUS does not have access to the data.

There are multiple clinical collaborators and Patient and public involvement (PPI) representatives involved in the management of the trial. They will not be involved with, or advise on data processing, but are likely to help interpret the aggregated results (with small numbers suppressed) for the purpose of the report.

The Oesophageal Patients Association (OPA) has worked with UCL to support the study throughout its duration, with 4 Patient and Public Involvement (PPI) members who bring broad experience to the PPI group.

Expected Benefits:

The findings of this research study are expected to contribute to evidence-based decision-making for policy-makers, local decision-makers such as doctors, and patients to inform best practice to improve the care, treatment and experience of health care users relevant to the subject matter of the study.

The use of the data could:
Lead to the identification or improvement of treatments or interventions, or health and care system design to improve health and care outcomes or experience.
Advance understanding of the need for, or effectiveness of, preventative health and care measures for the Barrett's oesophagus patient population who are at risk of developing oesophageal adenocarcinoma.
Inform planning health services and programmes, for example to improve equity of access, experience and outcomes.
Inform decisions on how to effectively allocate and evaluate funding according to health needs.
Support knowledge creation or exploratory research (and the innovations and developments that might result from that exploratory work).

Specific benefits to patients expected are:
- Establishing the optimum dose, duration or age at which to begin aspirin therapy for the prevention of oesophageal cancer.
- Preventing Barrett's patients, who are often on waiting lists of a year or more for endoscopies (the current standard for diagnosis), from having undiagnosed and untreated oesophageal cancer for a long time.

It is hoped that, through publication in appropriate media, the findings of this research will add to the body of evidence considered by the bodies, organisations and individual care practitioners charged with making policy decisions for or within the NHS or treatment decisions in relation to specific patients; for example, by NICE when making national recommendations for the use of these drugs to prevent cancer.

CRUK has funded the study, and the controller has provided regular updates on the study to CRUK. The controller will also generate a report of the study findings at the end of the study, which will support the findings from the original AspECT study, which CRUK also funded.

Outputs:

The expected outputs of the processing will be:

A single report of findings to CRUK, the trial funder, upon completion of the trial.
Single submission to peer reviewed journals upon completion of the trial.

The outputs will not contain NHS England Data and will only contain aggregated information with small numbers suppressed as appropriate in line with the relevant disclosure rules for the dataset(s) from which the information was derived.
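
A minimal sketch of small-number suppression on a table of aggregate counts is given below, for illustration only. The threshold of 10, the `<10` marker and the choice to leave zero counts unsuppressed are assumptions of this sketch; the applicable values come from the disclosure rules of the dataset the counts were derived from. Secondary suppression (preventing a suppressed value from being recovered from totals) is not covered.

```python
from typing import Dict, Union

# Assumed threshold for illustration only; the real value comes from the
# disclosure rules of the dataset the counts were derived from.
SUPPRESSION_THRESHOLD = 10


def suppress_small_numbers(
    counts: Dict[str, int], threshold: int = SUPPRESSION_THRESHOLD
) -> Dict[str, Union[int, str]]:
    """Replace non-zero counts below the threshold with a marker string."""
    return {
        group: (f"<{threshold}" if 0 < n < threshold else n)
        for group, n in counts.items()
    }
```

For example, `suppress_small_numbers({"arm_a": 3, "arm_b": 25})` returns `{"arm_a": "<10", "arm_b": 25}`; the group names here are hypothetical.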

The outputs will be communicated to relevant recipients through the following dissemination channels:
Journals
Webinars
Social media
Public reports
Industry newsletters
Press/media engagement
Public promotion of the research via journals or presentations
Participant newsletters

The target period to produce these outputs will be soon after the analysis of the data has been completed, approximately August 2025.

Processing:

UCL CCTU will transfer data to NHS England. The data will consist of identifying details specifically NHS Number, Date of Birth, Family Name, Gender and a Study ID for the cohort to be linked with NHS England data.

NHS England will provide the relevant records from the Civil Registrations of Death, NDRS Rapid Cancer Registrations and Medicines Dispensed in Primary Care datasets to UCL.

The Data will contain directly identifying data items which are required for analysis purposes and a unique person ID which can be used to link the Data with other record level data already held by the recipient.

The Data will be stored on servers at UCL.

Amazon Web Services provides cloud hosting services to UCL and will store the data as contracted by UCL.

The Controller must confirm and provide evidence upon audit by NHS England that access via any remote device complies with the data security obligations within this DSA and the Data Sharing Framework Contract.

For remote access:
- Remote access will only be from secure locations situated within the territory of use (as further restricted elsewhere within the DSA if so done) stated within this DSA;
- Access controls granting users the minimum level of access required are in place;
- Remote access is only via secure connections (e.g., VPNs or secure protocols) to protect data;
- Multifactor authentication (MFA) is required for remote access;
- Device security, including up-to-date software and operating systems, antivirus software, and enabled firewalls are utilised for the remote access;
- All remote access is undertaken within the scope of the organisation's DSPT (or other security arrangements as per this DSA) and complies with the organisation's remote access policy.

The above applies in addition to any condition set out elsewhere within the DSA (e.g. who may carry out processing, and for what purpose).

The Data will not leave England or Wales at any time.

Access is restricted to individuals within the CCTU at UCL who work on the AspECT EXceL trial.

Data will be accessed by an individual (the Chief Investigator) with an honorary contract with UCL. This individual will act as an agent of UCL at all times under supervision from employees of UCL. Aside from this individual, access is restricted to employees or agents of UCL who have authorisation from the Principal Investigator.

Oxford Oncology Clinical Trials Office (OCTO), the sponsor of the original AspECT study, and hospital sites are not permitted to access the Data.

All personnel accessing the Data have been appropriately trained in data protection and confidentiality.

The Data will be linked at person record level with their data from the original AspECT study obtained from OCTO, in addition to AspECT EXceL trial data obtained from hospital sites.

The identifying details will be stored in a separate database to the linked dataset used for analysis.

Centre for Longitudinal Studies -Next Steps Study - Mortality — DARS-NIC-431565-K9V9N

Opt outs honoured: Yes (Excuses: Section 251 NHS Act 2006)

Legal basis: Health and Social Care Act 2012 - s261(5)(d); National Health Service Act 2006 - s251 - 'Control of patient information'.

Purposes: No (Academic)

Sensitive: Sensitive

When:DSA runs 2025-03 – 2026-03 2025.05 — 2025.05. SMLS reported a DPA serious incident; breached contract — audit report.

Access method: One-Off

Data-controller type: UNIVERSITY COLLEGE LONDON (UCL)

Sublicensing allowed: Yes

AGD/predecessor discussions: AGD minutes - 21st November 2024 final.pdf, AGD Minutes - 27 June 2024 final.pdf

Datasets:

Civil Registrations of Death
Demographics

Type of data: Identifiable

Objectives:

The Centre for Longitudinal Studies (CLS) at University College London (UCL) requires access to NHS England data to support the following research programme:
Centre for Longitudinal Studies - Next Steps Study

The Centre for Longitudinal Studies (CLS) at University College London (UCL) is an academic resource centre responsible for producing and disseminating data resources for the scientific community. CLS manages four world-renowned cohort studies: the National Child Development Study 1958 (NCDS), the 1970 British Cohort Study (BCS70), the Millennium Cohort Study (MCS) 2000 and, now, the Next Steps cohort. Three of these studies (NCDS, BCS70 and MCS) are 'birth' studies, following the groups of participants from cradle to grave. As such, this group of studies is unique and has provided, and continues to provide, a wealth of information used in the policy decisions affecting society's health and well-being. CLS is an Economic and Social Research Council (ESRC) Centre, based at the Department of Quantitative Social Science, UCL Institute of Education. This Data Sharing Agreement covers data access granted to UCL for the purpose of the Next Steps study.

Next Steps, previously known as the Longitudinal Study of Young People in England (LSYPE), is an established longitudinal study which has followed the lives of (originally) people born in 1989/90 since Year 9 of secondary school. Study members were interviewed annually between 2004 and 2010. The study was previously managed by the Department for Education (DfE). In 2013 the Economic and Social Research Council took over the funding, and the study management legally transferred to the Centre for Longitudinal Studies (CLS) at the University College London (UCL) Institute of Education. The study includes rounds of data collection at different age intervals to assess the progression of participants in different aspects of life. The most recent round of data collection, the Next Steps Age 25 survey, took place between August 2015 and September 2016. Information was collected on many aspects of cohort members' lives, such as education, employment, health and well-being, relationships and family life, housing and finances, social participation and attitudes. Data collection focused on young people's transitions into further/higher education and the labour market or to other outcomes, such as parenthood. LSYPE is the largest and most detailed research study of its kind.

Identifiable Civil Registration Mortality and Demographics are requested in support of the following aims:
Update participants' details on the CLS database with a view to 1) avoiding attempts to contact those who have died, which could cause distress to friends and relatives, and 2) preventing study resources from being wasted trying to contact individuals who have since emigrated outside of the UK.
Understand the mortality outcomes of the Next Steps cohort and investigate how individual behaviours and social or economic determinants of health behaviours, such as drug and alcohol use, sexual health, diet and exercise, may have influenced outcomes.
Support further research within the CLS
Support further research outside the CLS via sub licencing arrangements

The CLS has previously received data solely for the purpose of updating the study's database to prevent contact with those who have died or emigrated. The CLS now requests to use this data for research purposes, and to sublicence this data via the UKDS.

The requested data will be minimised as follows:
Data is limited to the 15,770 individuals included in the Next Steps cohort.

Where data is shared within the CLS, or outside the CLS to support further research, it will be minimised on a project-by-project basis; this minimisation will be evaluated by the CLS Data Access Committee (DAC). The date of death will be de-identified as month and year, and cause of death (ICD-10 codes) may be de-identified by truncation where needed.
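
The two de-identification steps described above can be sketched as follows. This is a minimal illustration, not the CLS procedure: the function names are hypothetical, and truncating ICD-10 codes to three characters is an assumed default, since the level of truncation is decided per project.

```python
from datetime import date


def deidentify_date_of_death(dod: date) -> str:
    """Reduce a full date of death to month and year (mm/yyyy)."""
    return dod.strftime("%m/%Y")


def truncate_icd10(code: str, keep: int = 3) -> str:
    """Truncate an ICD-10 code to its first `keep` characters.

    Three characters (for example "I21" from "I21.9") is an assumed
    default for this sketch; the dot separator is removed first.
    """
    return code.replace(".", "").strip().upper()[:keep]
```

For instance, a death recorded on 17 March 2021 with cause code `I21.9` would be released as `03/2021` and `I21` under these assumptions.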

UCL is the data controller as the organisation responsible for ensuring that the data will only be processed for the purpose described above.

The lawful basis for processing personal data under the UK GDPR is Article 6(1)(e) - processing is necessary for the performance of a task carried out in the public interest or in the exercise of official authority vested in the controller;

The lawful basis for processing special category data under UK GDPR is Article 9(2)(j) - processing is necessary for archiving purposes in the public interest, scientific or historical research purposes or statistical purposes. CLS sought permission from the Confidentiality Advisory Group and obtained S251 approval to trace participants and to use their mortality data for research.

The funding is provided by the Economic and Social Research Council.

The University of Essex is a processor acting under the instructions of UCL. The University of Essex hosts the UKDS; its role is limited to storing the linked pseudonymised data and facilitating access for third-party researchers who have the necessary approvals and contractual measures in place to access the data.

Only substantive employees whose role supports the provision of the UKDS as a service are permitted to process the data. Should any researchers from the University of Essex wish to use the data deposited within the UKDS for research purposes they will be required to apply for access as per the process described in the Sublicensing section below.

Amazon Web Services (AWS) is a processor acting under the instructions of UCL. AWS's role is limited to secure backup of data stored in UCL's Data Safe Haven.

UCL uses offsite data centre services provided by VIRTUS data centre. VIRTUS does not have access to the data.

****Sharing Data Within the CLS****
CLS researchers may request access to the pseudonymised mortality dataset disseminated under this Agreement for research purposes.

CLS researchers requesting access to the data must submit a CLS Data Access Application Form. The form requires that they provide general information about the project including its title, aims and data required. Applicants are required to demonstrate how the project provides a measurable benefit in the provision of health and social care in England.

Should the request be approved the following restrictions apply:

Access will be restricted to CLS researchers who meet the following requirements:
i. The researcher must be substantively employed in the CLS by UCL;
ii. The researcher must be registered with the UKDS;
iii. The researcher must have completed NHS England's Data Security Awareness course;
iv. The researcher must have submitted a project proposal for review by the CLS Data Access Committee (DAC) and the CLS DAC must have approved the access request;
v. Once approved, the researcher will sign a licence agreement with CLS and will then be granted access to the relevant subset of data via the UCL Data Safe Haven (DSH).

Current proposed projects intend to:
- Document and monitor socio-economic, demographic and other inequalities in cause-specific mortality over time
- Understand the joint progress of morbidity and mortality and to what extent healthy life expectancy keeps pace.
- Investigate the links between mental health and cause specific mortality
- Investigate the links between early life circumstances, childhood characteristics and cause specific mortality.

****Sharing Data via UKDS Sublicensing****
Non-CLS researchers may request access to a pseudonymised subset of the mortality data disseminated under this Agreement for research purposes. All access is via the UK Data Service (UKDS).

The process of accessing data, and the licence type under which data is accessed varies depending on the disclosivity of the data, the sensitivity of the data and the potential consequences of the misuse of data. Research data have been classified into Tiers depending on how disclosive, sensitive and risky individual data items are. Full details of how the CLS classify data can be found in the CLS Data Classification Policy
https://cls.ucl.ac.uk/wp-content/uploads/2017/02/CLS_Data_Classification_Policy-1.pdf

**UKDS Special safeguarded data (Tier 1b)**
Special safeguarded data (Tier 1b) have a medium level of potential disclosure risk and/or sensitivity. Examples of mortality fields that fall within tier 1b include the fact of death and the month and year of death (mm/yyyy).

Tier 1b data is accessed via a UKDS Special Licence. A UKDS Special Licence may be granted to successful applicants based within the UK.

The process for access to Tier 1b data is as follows:
1. The researcher registers with the UKDS and signs the UKDS End User Licence Agreement.
2. Applicant submits and signs the UKDS Special Licence application form.
3. By signing the Special Licence application form the researcher agrees with the terms in the UKDS Research Data Handling and Security Guide for Users (https://ukdataservice.ac.uk/app/uploads/cd171-researchdatahandling.pdf)
4. The UKDS Data Access team screens the application to ensure it is properly completed.
5. The UKDS Data Access team sends the application form to CLS for approval by the Data Access Committee (CLS DAC). The CLS DAC have delegated to the CLS Research Data Management (RDM)* team the capability to evaluate and approve Special Licence data access requests. However, the RDM will seek advice and guidance from the Committee where novel issues arise.
6. A member of the CLS RDM team will assess the application on behalf of the CLS DAC and decide to approve it or request further information. The assessment is done against the CLS DAC assessment criteria set out in the CLS DAC Terms of Reference (https://cls.ucl.ac.uk/wp-content/uploads/2023/03/CLS_DAC_Terms_of_Reference.pdf). Should an application be rejected, a researcher can apply again with a revised application.
7. The CLS RDM team will inform UKDS that the project has been approved.
8. Special Licence-approved applications are reported at the next CLS DAC.
9. UKDS will inform the researcher that their project was approved and make the data available to them.
10. The researcher downloads the Special Licence data into their institutional server. Researchers must abide by the conditions laid out in the UKDS Research Data Handling and Security Guide for Users, which includes details of the individual and institutional penalties that are enforceable in the event of a breach of the conditions. They can only merge these data with CLS highly de-identified non-disclosive research data, which are also subject to UKDS Data Sharing Agreement terms and conditions.

*RDM follow the DAC Terms of Reference i.e. they are reviewing against the same criteria

**UKDS Controlled data (Tier 2)**
Controlled data (Tier 2) have a high level of potential disclosure risk (for example exact dates, detailed geographical indicators) and/or high sensitivity. Examples of mortality fields that fall within tier 2 include date of death (dd/mm/yyyy) and cause of death including clinical codes for the cause of death.

Tier 2 data is accessed via the UKDS SecureLab, which is the UKDS Trusted Research Environment (TRE). The UKDS SecureLab supports UK-based research projects only.
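
As a rough sketch, the relationship between mortality fields, tiers and access routes described above could be expressed as a lookup. The field identifiers are hypothetical, and the rule that a request follows the route of its most restrictive field is an assumption of this sketch rather than a statement of CLS or UKDS policy.

```python
# Example fields are those named in the text; the identifiers themselves
# are hypothetical and chosen only for this illustration.
FIELD_TIERS = {
    "fact_of_death": "1b",
    "month_year_of_death": "1b",
    "date_of_death": "2",
    "cause_of_death": "2",
}

ACCESS_ROUTES = {
    "1b": "UKDS Special Licence",
    "2": "UKDS SecureLab",
}


def access_route(requested_fields):
    """Return the access route for the most restrictive requested field.

    Assumes any Tier 2 field moves the whole request to the Tier 2 route.
    """
    tiers = {FIELD_TIERS[field] for field in requested_fields}
    if not tiers:
        raise ValueError("no fields requested")
    return ACCESS_ROUTES["2" if "2" in tiers else "1b"]
```

Under these assumptions, a request for the fact of death plus cause of death would be routed to the UKDS SecureLab, while a request for the fact and month/year of death alone would fall under a UKDS Special Licence.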

The process for access to Tier 2 data is as follows:
i. The researcher submits an application, including the UKDS 'Accredited Researcher application form', the 'Research proposal' and the UCL Licence Agreement to the UKDS.
ii. The UKDS Data Access team screens the application to ensure it is properly completed. Once it is, they forward the application to the CLS.
iii. The CLS Research Data Management (RDM) managers check the Organisational Information Governance and security assurance evidence provided in the UCL Licence Agreement and either request further evidence to support the request or submit the application for CLS DAC approval.
iv. The CLS DAC assesses both project documents (UKDS project proposal and the UCL Licence Agreement) and decides whether to approve the project, reject it, or request further information. CLS DAC considerations include an assessment of the expected benefits to health care, adult social care or the promotion of health. Should an application be rejected, a researcher can apply again with a revised application.
v. If the CLS DAC approves the project:
a. CLS RDM team informs UKDS that the project has been approved.
b. The CLS authorised representative signs the UCL Licence Agreement and sends it back to the UKDS to be forwarded to the researcher.
vi. UKDS informs the researcher that their project was approved and sends them a countersigned copy of the UCL Licence Agreement.
vii. UKDS makes the data available to the researcher via their UKDS SecureLab account, using Multi-Factor Authentication.
viii. The data will be provided to the researcher via their own UKDS SecureLab project folder, which will contain only the data that the researcher needs to see for their project. The research-linked data provided to researchers are pseudonymised and de-identified, and will never contain identifiable information such as name, address, date of birth, NHS or NI number.
ix. The researcher accessing the data via the UKDS SecureLab will not be able to download any data into their own institutional server. Once the researcher has finished their research, the UKDS will delete the data folder with the tailored dataset for the specific project.
x. Strict disclosure control checks are carried out by the UKDS before any research outputs, e.g. publication, can be extracted from the UKDS SecureLab account.
xi. CLS DAC will publish the information about any data dissemination on the CLS website, including the name of the organisation to which data was provided, purpose (summary of the project) and what data was released. (NB: If CLS DAC does not approve the project, no data will be disseminated). https://cls.ucl.ac.uk/data-access-training/data-access/

Yielded Benefits:

Evidence from Next Steps used in Government Policy on Youth Unemployment

Findings from Next Steps have been used to inform the Government's strategy on getting young people into education, employment or training. The struggling economy has often been cited as the reason for the large number of young adults not in education, employment or training (NEET). However, new research using data from Next Steps has shown that other factors influence whether or not a person spends time NEET. The findings showed that educational attainment at age 16 is one of the most important factors in determining a person's future path. Forty-five per cent of those with no GCSEs had spent more than a year NEET by the time they turned 18, compared to just 4 per cent of those with five or more GCSEs at A*-C.

Building Engagement, Building Futures lays out the Government's plans to tackle the causes of youth unemployment. This strategy proposes to maximise the participation of 16-24 year olds in education, training and work. The strategy was developed by the Department for Work and Pensions, the Department for Education, and the Department for Business, Innovation and Skills. It comes in response to recommendations of the influential Wolf Report, which also draws on evidence from Next Steps.

(https://nextstepsstudy.org.uk/home/what-have-we-learned/bullying/)

Helping the Next Generation

Findings from Next Steps have been used in several anti-bullying campaigns and initiatives, as well as in guidance for teachers and schools about how to stop bullying, and the efforts have paid off. A major new study of secondary school pupils has shown that pupils today are less likely to be bullied: ten thousand fewer pupils are being bullied every day than 10 years ago.

The Department for Education report compares the experiences of Next Steps study members to those of members of Our Future, a new study which started following Year 9 pupils in 2013. This landmark research, which involved tens of thousands of young people in 2004 and 2013, is one of the largest of its kind ever undertaken. It shows that bullying among Year 9 pupils has fallen dramatically since Next Steps study members were in school. The findings show that:
- 30,000 fewer pupils said they had been bullied in the last 12 months
- 30,000 fewer pupils said they had been victims of violent bullying
- 10,000 fewer pupils reported being bullied every day

Speaking before the start of Anti-Bullying Week 2014, Education Secretary Nicky Morgan praised teachers, charities and parents for their efforts. She also urged them to continue their moral mission to further reduce bullying, recognising that many parents consider it their number one concern about what happens at school.

Benefits of the mortality data received

Use of the notifications of deaths and embarkations has helped to minimise the risk of invitations going to the incorrect address and of contact being made with participants who have died. CLS wishes to maintain a strong relationship with its study participants and to avoid any instances of contacting participants who have passed away, which can potentially cause distress to friends and family members. Notification of participants who have left the UK has allowed better management of the resources devoted to contacting participants, reducing instances where attempts are made to contact individuals who are no longer in the country. In addition, CLS was notified of the death of 21 cohort members under the previous version of this agreement. This prevented CLS from contacting, and potentially upsetting, family members, and also saved resources by preventing CLS from trying to contact those who have passed away.

Expected Benefits:

The study produces rich, longitudinal, policy-relevant data, currently unavailable elsewhere, for a large representative sample of young adults. Next Steps data is widely used by policy makers to evaluate and develop policy and improve services for young people, and by academic researchers to chart and understand social change. The information provided by cohort members provides valuable evidence for the research and policy community about the cohort's transitions out of education and into early adult life. To enhance the research resource for secondary users, a fully documented, pseudonymised dataset was archived with the UK Data Service in May 2017.

Next Steps Age 32 and previous age 25 survey data will enrich the already deposited data for the cohort (waves 1 to 7) and is expected to be particularly valuable for the research community, including researchers in health and social care, providing rich survey data on a range of different domains of young people's lives. Particularly beneficial is the opportunity for a life course approach and to follow young people's experiences over time to analyse later life outcomes. Next Steps data is a resource with great potential for the research and policy community, and the information collected on health and its social determinants widens its potential value for health research and policy interventions. Cohort members were asked a range of questions about their physical and emotional health and wellbeing. There is, however, a great deal more information about potential underlying determinants, in this and the earlier sweeps of Next Steps, available for researchers via the UKDS.

The age 25 survey data is already providing important research evidence on transitions out of education and into early adult life, informing a range of key interlinked policy questions relating to higher education, employment, housing and family formation, and health. Data from the age 25 survey were deposited at the UK Data Service in June 2017 and have already been downloaded for over 100 research projects in many disciplines including economics, education and sociology; examples of outputs from those projects can be found at the Next Steps home page (https://cls.ucl.ac.uk/cls-studies/next-steps/), as well as in publications in the Journal of Physical Activity and Health, the Journal of Adolescence, the European Journal of Public Health, and others. Its influence and impact will grow over the next few years, as it is used for research and policy on a wide range of different issues, and as the existing data is enhanced and augmented, particularly through linkage to mortality data.

The use of the survey combined with mortality data will result in papers that will be published, presented at conferences and sometimes receive media coverage.
Most papers will contribute to a body of evidence which will result in improvements to health care users' experience or health care delivery. It is expected that occasionally these may have a higher impact, as in the examples highlighted below:

Evidence from Next Steps used to reform vocational education for young people.
In 2011 the Department for Education commissioned an investigation into the effectiveness of the vocational education system in the UK in helping young adults secure jobs. The study, conducted using data from Next Steps, showed that young adults aged 16-19 were actively seeking work, but that around a third to a half of them struggled to find appropriate courses and jobs, and as a result changed occupations frequently and spent periods of time not in work, education or training.

Based on these findings, 27 specific recommendations were made on how to improve practical education and training opportunities for young people that will help them to get jobs with good opportunities for progression.

In March 2014 the Department for Business, Innovation and Skills announced the Government's reform plan for vocational qualifications, which included a series of reforms to vocational qualifications, such as the apprenticeship reform whereby existing apprenticeship frameworks were replaced by new employer-designed apprenticeship standards. These reforms came in response to recommendations of the influential Wolf Report, which also drew on evidence from Next Steps.

https://nextstepsstudy.org.uk/evidence-from-next-steps-used-in-government-policy-on-youth-unemployment/

Processing:

** Data linkage **
The CLS will supply NHS England with a file of 15770 study members to match to NHS data.

The file supplied will contain all cohort members who have ever participated in the study (excluding those who have requested the study to stop using their data).

Participants can:
a) withdraw from a specific data collection sweep of the study (in which case they would still be invited to take part in future sweeps of the study)
b) permanently withdraw from the study (in which case they will not be invited to take part in any future sweeps of the study) or
c) permanently withdraw from the study AND request that the data can no longer be used.

The file will contain the following items:
-CLS ID
-First name
-Last name
-Middle name (where available)
-Sex
-Date of birth
-Postcode
-NHS number (where available)
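
The selection rule and file layout described above could be sketched as follows. This is a minimal illustration under stated assumptions, not the CLS implementation: the `CohortMember` class, its field names, the `WithdrawalStatus` labels and the ISO date format are hypothetical. Only the list of items and the exclusion rule (drop members who asked that their data no longer be used) come from the text.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class WithdrawalStatus(Enum):
    # Hypothetical labels for the withdrawal options (a) to (c) listed above.
    ACTIVE = "active"
    SWEEP_ONLY = "withdrawn from one sweep"            # option (a)
    PERMANENT = "permanently withdrawn"                # option (b)
    PERMANENT_NO_DATA_USE = "withdrawn, no data use"   # option (c)


@dataclass
class CohortMember:
    cls_id: str
    first_name: str
    last_name: str
    sex: str
    date_of_birth: str                  # ISO date string (an assumption)
    postcode: str
    middle_name: Optional[str] = None   # supplied where available
    nhs_number: Optional[str] = None    # supplied where available
    status: WithdrawalStatus = WithdrawalStatus.ACTIVE


def build_linkage_file(members):
    """Return one row per member, excluding only option (c) withdrawals."""
    rows = []
    for m in members:
        if m.status is WithdrawalStatus.PERMANENT_NO_DATA_USE:
            continue  # the member asked that their data no longer be used
        rows.append({
            "cls_id": m.cls_id,
            "first_name": m.first_name,
            "last_name": m.last_name,
            "middle_name": m.middle_name,
            "sex": m.sex,
            "date_of_birth": m.date_of_birth,
            "postcode": m.postcode,
            "nhs_number": m.nhs_number,
        })
    return rows
```

Members who withdrew under options (a) or (b) stay in the file, because the text excludes only those who asked the study to stop using their data.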

Following data linkage, NHS England will supply the following details to CLS:
- CLS identifier
- NHS Number
- Forename of the deceased
- Latest middle name (where available)
- Surname of the deceased
- Date of birth
- Date of death
- Fact of death
- Cause of death
- Gender
- Town of birth
- Address
- Date of address registration or update
- Other variables as chosen in the Civil Registration and Demographic dataset.

The identifiable data received under this agreement will be used to validate the information of the cohort on CLS's cohort maintenance database (meaning names and addresses are used to ensure the correct cohort member has passed away or emigrated). These identifiable data will not be linked to any other data, and it will not be made available to researchers.
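
One way to picture the validation step is a simple comparison between the record held on the cohort maintenance database and the record returned after linkage. The sketch below is illustrative only: the matching rule (same CLS identifier, same surname ignoring case and spacing, same date of birth) and the dictionary keys are assumptions made for this example, and the text does not specify how CLS actually confirms a match.

```python
def normalise(value):
    """Collapse whitespace and ignore case so trivial differences do not block a match."""
    return " ".join(value.split()).casefold()


def confirms_identity(held, returned):
    """Return True when the returned record plausibly refers to the held cohort member.

    Both arguments are dicts with the hypothetical keys
    'cls_id', 'surname' and 'date_of_birth'.
    """
    return (
        held["cls_id"] == returned["cls_id"]
        and normalise(held["surname"]) == normalise(returned["surname"])
        and held["date_of_birth"] == returned["date_of_birth"]
    )
```

A record that fails this kind of check would need manual review before the maintenance database is updated.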

The pseudonymised mortality data will be linked to the pseudonymised survey data for research purposes by the CLS Research Data Management team.

Researchers applying to use the survey data linked to mortality data may also apply to use the linked data in combination with other datasets, including Hospital Episode Statistics (HES), held by the study under DARS-NIC-51342-V1M5W.

***Access by the study operational teams and the wider CLS***
All data being accessed by the study team and the wider CLS is stored on the UCL Data Safe Haven (DSH).

Amazon Web Services provides cloud hosting services to UCL and will store the data as contracted by UCL.

UCL uses offsite data centre services provided by VIRTUS data centre. VIRTUS does not have access to the data.

Access to the UCL DSH is via remote access. For remote access:
- Remote access will only be from secure locations situated within the territory of use (as further restricted elsewhere within the DSA if so done) stated within this DSA;
- Access controls granting users the minimum level of access required are in place;
- Remote access is only via secure connections (e.g., VPNs or secure protocols) to protect data;
- Multifactor authentication (MFA) is required for remote access;
- Device security, including up-to-date software and operating systems, antivirus software, and enabled firewalls are utilised for the remote access;
- All remote access is undertaken within the scope of the organisation's DSPT (or other security arrangements as per this DSA) and complies with the organisation's remote access policy.

Remote processing will be from secure locations within the UK. The data will not leave the UK at any time.

Access to data held on the UCL DSH is restricted to the direct study team, and other individuals employed by UCL in the CLS who have received approval from the CLS DAC to access the data.

***Access by Sublicensees***
Access to data via sublicence depends on the tier the data falls under.

** UKDS Special safeguarded data (Tier 1b)**
The data will be stored on servers at the UK Data Archive based at the University of Essex. The UK Data Archive will store the pseudonymised analysis file only.

Subject to approval by the CLS DAC and obtaining a UKDS Special Licence, researchers receive data and work on their institutional servers.

Access is restricted to researchers who have received authorisation from CLS DAC and have obtained a UKDS Special Licence for their specific project.

This data dissemination includes the following safeguards:
i. Data transfers are made securely e.g. encrypted.
ii. The data will be used for database update and statistical research purposes and will not involve any direct decision-making about the health or treatment of a participant
iii. Data are stored in secure environments certified to ISO 27001 and/or a Standards Met DSP Toolkit. (this is true about the data we receive from you).
iv. Data are only accessed in pseudonymised form and treated for disclosure if necessary.
v. Data will not be shared outside of the UK at any time.

** UKDS Controlled data (Tier 2)**
The data will be stored on servers at the UK Data Archive based at the University of Essex. The UK Data Archive will store the pseudonymised analysis file only.

Subject to approval by the CLS, the UKDS will make data available to researchers via the UKDS Secure Lab.

The Data will be accessed via remote access. For remote access:
- Remote access will only be from secure locations situated within the territory of use (as further restricted elsewhere within the DSA if so done) stated within this DSA;
- Access controls granting users the minimum level of access required are in place;
- Remote access is only via secure connections (e.g., VPNs or secure protocols) to protect data;
- Multifactor authentication (MFA) is required for remote access;
- Device security, including up-to-date software and operating systems, antivirus software, and enabled firewalls are utilised for the remote access;
- All remote access is undertaken within the scope of the organisation's DSPT (or other security arrangements as per this DSA) and complies with the organisation's remote access policy.

Remote processing will be from secure locations within the UK. The data will not leave the UK at any time.

Access is restricted to researchers who have received authorisation from CLS DAC for their specific project.
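
The two sublicence tiers can be summarised as a small decision rule. The following is a hedged sketch: the function name, tier labels and return strings are hypothetical, and it encodes only what the text states (CLS DAC approval for both tiers, a UKDS Special Licence for Tier 1b, and access via the UKDS Secure Lab for Tier 2).

```python
def access_route(tier, dac_approved, has_special_licence=False):
    """Return how an approved researcher accesses the data, or None if access is refused."""
    if tier not in ("1b", "2"):
        raise ValueError(f"unknown tier: {tier!r}")
    if not dac_approved:
        return None  # both tiers require CLS DAC authorisation for the specific project
    if tier == "1b":
        # Tier 1b also requires a UKDS Special Licence; data is then
        # received and worked on using the researcher's institutional servers.
        return "institutional servers" if has_special_licence else None
    # Tier 2 data is accessed remotely via the UKDS Secure Lab.
    return "UKDS Secure Lab"
```

For example, `access_route("1b", dac_approved=True)` returns `None` because the Special Licence is missing, while `access_route("2", dac_approved=True)` returns `"UKDS Secure Lab"`.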

All personnel accessing the data have been appropriately trained in data protection and confidentiality.

This data dissemination includes the following safeguards:
i. Linkages are covered by the Section 251 support (use of mortality data for research projects)
ii. Identifying variables are held separately from the survey responses, including during the matching process.
iii. Data transfers are made securely e.g. encrypted.
iv. The data will only be used for database update and statistical research purposes and will not involve any direct decision-making about the health or treatment of a participant
v. Data are stored in secure environments certified to ISO 27001 and/or a Standards Met DSP Toolkit.
vi. Data are only accessed in pseudonymised form and treated for disclosure if necessary.
vii. Data are accessed via the receiving organisation's secure environment.
viii. Disclosure control checks are carried out before any research publication.

Data will not be shared outside of the UK at any time.

1970 British Cohort Study - MR21 — DARS-NIC-17218-B0W9X

Opt outs honoured: Yes - patient objections upheld, Yes (Excuses: Section 251, Section 251 NHS Act 2006)

Legal basis: Section 251 approval is in place for the flow of identifiable data, Health and Social Care Act 2012 – s261(7), National Health Service Act 2006 - s251 - 'Control of patient information'. , Health and Social Care Act 2012 s261(7), Health and Social Care Act 2012 s261(7); National Health Service Act 2006 - s251 - 'Control of patient information'., Health and Social Care Act 2012 s261(7); Other-National Health Service Act 2006 - s251 - 'Control of patient information'. ,, Health and Social Care Act 2012 s261(7); Other-National Health Service Act 2006-S251- Control of Patient information', Health and Social Care Act 2012 - s261(5)(d); National Health Service Act 2006 - s251 - 'Control of patient information'.

Purposes: No (Academic)

Sensitive: Sensitive

When:DSA runs 2018-06 – 2021-05 2018.03 — 2025.05. SMLS reported a DPA serious incident; breached contract — audit report.

Access method: Ongoing, One-Off

Data-controller type: UNIVERSITY COLLEGE LONDON (UCL)

Sublicensing allowed: No, Yes

Datasets:

MRIS - List Cleaning Report
MRIS - Cause of Death Report
MRIS - Cohort Event Notification Report
Civil Registration - Deaths
Demographics
MRIS - Flagging Current Status Report
MRIS - Members and Postings Report
Civil Registrations of Death

Type of data: Identifiable, Anonymised - ICO Code Compliant

Objectives:

The British Cohort Study 1970 (BCS) is one of Britain’s world renowned national longitudinal birth cohort studies. It follows a large sample of individuals born over a limited period of time (all those born in one week in 1970) through the course of their lives, charting the effects of events and circumstances in early life on outcomes and achievements later on. They show how histories of health, wealth, education, family and employment are interwoven for individuals and vary between them.

The study is run by the Centre for Longitudinal Studies (CLS), at the Institute of Education, University of London and funded by the Economic and Social Research Council.

Since 1970 there have been nine attempts to gather information from the whole cohort. Over time, the scope of enquiry has broadened from a medical focus at birth, to encompass physical and educational development at the age of five, physical, educational and social development at the ages of ten and sixteen, and then to include economic development and other wider factors at ages 26, 30, 34, 38 and 42. The current survey is at Age 46 and future sweeps will take place roughly every 5 years. The ongoing success of the study depends on maintaining contact with as many study members as possible.

The study has its origins in the British Births Survey in which information was gathered about almost 17,500 babies. The original study focused on the circumstances and outcomes of birth but since then the study has broadened in scope to map all aspects of health, education, social and economic development. The current survey will provide an updated picture of the circumstances and experiences of those born in the early seventies in England, Scotland and Wales and will help develop an understanding of their progress into the latter period of their lives.

The current Age 46 survey began in July 2016 and is scheduled to run until July 2018. As of the beginning of February 2018, data has been collected from just over 6,000 participants and by completion it is projected that approximately 8,500 will have taken part. The Age 46 Survey has a particular focus on health and is being conducted by interviewers and registered nurses. The survey involves an interview, anthropometric measurements, blood pressure assessment, measures of physical functioning (grip strength and balance assessments) and the collection of blood samples (for immediate analysis of cholesterol and glycated haemoglobin, storage for future analysis and future DNA extraction). In addition, participants are asked to wear a device which measures physical activity levels for 7 days and to complete an online questionnaire about their diet. The objective measures of health have been funded by the Medical Research Council and the British Heart Foundation.

For the BCS70 Age 46 survey, CLS has contracted an external supplier, NatCen Social Research (the trading name of the National Centre for Social Research), to carry out the individual study members’ interviews.

The pseudonymised research data will be deposited at the UK Data Service. Aggregated data with small numbers suppressed will be made available to the research community in late 2019, forming an invaluable resource for health research. Researchers will be able to use the rich life-history data collected over the duration of the study’s life in conjunction with the data collected in the BCS70 Age 46 survey to examine the longitudinal predictors of health in mid-life. It is then planned that the measures conducted will be repeated in future sweeps of the study, which will allow for research that deepens the understanding of changes in health which occur with ageing. No patient identifiable data is made available for research.
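
Small-number suppression of aggregated counts can be illustrated with a short sketch. The threshold of 10, the `*` marker and the decision to leave zero counts visible are all assumptions made for this example; the text does not state which suppression rule is applied.

```python
def suppress_small_counts(counts, threshold=10, marker="*"):
    """Replace any non-zero count below `threshold` with `marker`.

    `counts` maps a category label to an integer count. The threshold and
    marker defaults are illustrative assumptions, not values from the agreement.
    """
    return {
        label: (marker if 0 < n < threshold else n)
        for label, n in counts.items()
    }
```

For instance, `suppress_small_counts({"North": 120, "South": 4, "East": 0})` masks only the `"South"` cell.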

Of the roughly 12,500 individuals invited to participate in the BCS70 Age 46 Survey, there are approximately 1,370 for whom the address held by CLS has been found to be out of date. These are not individuals who have informed CLS that they wish to withdraw from the study; the researchers have simply lost touch with them because they have moved home without informing CLS. This application for the MRIS list cleaning report covers only the members of the cohort lost to follow-up, namely these 1,370 individuals.

The ongoing success of the study depends on maintaining contact with as large a number of study members as possible. Therefore, CLS are seeking permission to be supplied with updated addresses for these 1370 study members whose whereabouts are currently unknown. CLS feel that a substantial number of these individuals would be willing to participate in the Age 46 survey if they could be contacted. Previous efforts to re-establish contact for our BCS70 cohort study have been very successful using this route.

CLS will supply NHS Digital with a data file containing the personal contact details currently held for these study members - CLS are specifically interested in receiving the 'address' that NHS Digital holds for them. The data returned by NHS Digital will be entered into the secure address database for BCS70 and used to invite these newly traced study members to take part in the BCS70 Age 46 survey.

The on-going success of the study depends on maximising participation. The successful list clean exercise which took place in August/September 2015, prior to the launch of fieldwork, allowed CLS to invite around 600 previously untraced study members to take part. Therefore, a 2nd matching list clean exercise was planned i.e. this application (for 1370 cases), to try and find updated addresses for those who are found during fieldwork to have moved from the address held. This will further boost the number who can be contacted and invited to participate.

Yielded Benefits:

There have been hundreds of published journal articles, books, chapters, reports or conference presentations based on data from the 1970 British Cohort Study. Below are some examples of existing publications using BCS70 data benefiting public health:

• TAYLOR, B and WADSWORTH, J. (1987) Maternal smoking during pregnancy and lower respiratory tract illness in early life. Archives of Disease in Childhood, 62(8), 786-791.
IMPACT: Research from BCS70 has contributed to the understanding of the effects of maternal smoking on child health.
SUMMARY: In a national study of 12,743 children maternal, but not paternal, smoking was confirmed as having a significant influence on the reported incidence of bronchitis and admission to hospital for lower respiratory tract illness during the first five years of life. Reported rates of admissions to hospital for lower respiratory tract diseases were found to be as high in children born to mothers who stopped smoking during pregnancy as in those whose mothers smoked continuously both during and after pregnancy. Rates of admissions to hospital for lower respiratory tract diseases in children whose mothers started smoking only postnatally were no higher than in those whose mothers remained non-smokers. Postnatal smoking seemed to exert a significant influence on the reported incidence of bronchitis, but less than smoking during pregnancy. These findings suggest that maternal smoking influences the incidence of respiratory illnesses in children mainly through a congenital effect, and only to a lesser extent through passive exposure after birth.

• MARMOT, M and BELL, R. (2016) Social inequalities in health: a proper concern of epidemiology. Annals of Epidemiology, 26(4), 238-240.
IMPACT: Research using BCS70 has highlighted and interrogated socio-economic inequalities in health.
ABSTRACT: Social inequalities are a proper concern of epidemiology. Epidemiological thinking and modes of analysis are central, but epidemiological research is one among many areas of study that provide the evidence for understanding the causes of social inequalities in health and what can be done to reduce them. Understanding the causes of health inequalities requires insights from social, behavioural and biological sciences, and a chain of reasoning that examines how the accumulation of positive and negative influences over the life course leads to health inequalities in adult life. Evidence that the social gradient in health can be reduced should make us optimistic that reducing health inequalities is a realistic goal for all societies.

• PLOUBIDIS, G.B, SULLIVAN, A, BROWN, M and GOODMAN, A. (2017) Psychological Distress in Mid-Life: Evidence from the 1958 and 1970 British Birth Cohorts. Psychological Medicine, 47(2), 291-303.
IMPACT: Research using BCS70 has highlighted the growing problem of depression in the UK.
ABSTRACT: This paper addresses the levels of psychological distress experienced at age 42 years by men and women born in 1958 and 1970. Comparing these cohorts born 12 years apart, we ask whether psychological distress has increased, and, if so, whether this increase can be explained by differences in their childhood conditions. Data were utilized from two well-known population-based birth cohorts, the National Child Development Study and the 1970 British Cohort Study. Latent variable models and causal mediation methods were employed. After establishing the measurement equivalence of psychological distress in the two cohorts we found that men and women born in 1970 reported higher levels of psychological distress compared with those born in 1958. These differences were more pronounced in men (b = 0.314, 95% confidence interval 0.252–0.375), with the magnitude of the effect being twice as strong compared with women (b = 0.147, 95% confidence interval 0.076–0.218). The effect of all hypothesized early-life mediators in explaining these differences was modest. Our findings have implications for public health policy, indicating a higher average level of psychological distress among a cohort born in 1970 compared with a generation born 12 years earlier. Due to increases in life expectancy, more recently born cohorts are expected to live longer, which implies – if such differences persist – that they are likely to spend more years with mental health-related morbidity compared with earlier-born cohorts.

• MATEI, V.P, MIHĂILESCU, A.I, DIACONESCU, L.V, PURNICHI, T, GRIGORAȘ, R and POPA-VELEA, O. (2018) Depression in young adults diagnosed with cancer – an analysis of the outcomes of 1970 British Cohort Study.
IMPACT: Research using BCS70 Cohort Study data has contributed to the understanding of the risk of depression in young patients diagnosed with cancer.
SUMMARY: The study found that the risk of depression is higher in people with onset of cancer before age 30. The study did not identify an increased risk of depression by socioeconomic status. Instead, the authors suggest the importance of active screening and treatment of depression in young patients with cancer. Detailed information about the study can be found here: https://www.sciencedirect.com/science/article/pii/S0022399918303088?via%3Dihub

• BANN, D, JOHNSON, W, LI, L, KUH, D and HARDY, R. (2018) Socioeconomic inequalities in childhood and adolescent body-mass index, weight, and height from 1953 to 2015: an analysis of four longitudinal, observational, British birth cohort studies. Lancet Public Health, 3(4), e194-e203.
IMPACT: Research using BCS70 Cohort Study data has contributed to the understanding of how socioeconomic inequalities in childhood body-mass index (BMI) have been documented in high-income countries, how they have changed over time, how inequalities in the composite parts (i.e. weight and height) of BMI have changed, and whether inequalities differ in magnitude across the outcome distribution. The study investigated how socioeconomic inequalities in childhood and adolescent weight, height, and BMI have changed over time in Britain. Detailed information about the study can be found here: https://www.sciencedirect.com/science/article/pii/S2468266718300458?via%3Dihub

• CHENG, H and FURNHAM, A. (2018) Teenage locus of control, psychological distress, educational qualifications and occupational prestige as well as sex are independent predictors of adult binge drinking. Alcohol, advance online access, 1 Sept 2018.
IMPACT: Research using BCS70 Cohort Study data has contributed to the understanding of how various psychological and socio-demographic factors in childhood and adulthood relate to alcohol intake and binge drinking at age 42 years.
ABSTRACT: Data were drawn from the 1970 British Cohort Study (BCS70). The analytic sample comprised 5267 cohort members with data on parental social class at birth, cognitive ability at age 10, locus of control at age 16, psychological distress at age 30, educational qualifications at age 34, and current occupation and alcohol consumption at age 42 years. Results showed that sex (male), lower parental social class, adolescent external locus of control, psychological distress, lower scores on childhood intelligence, lower educational qualifications and less professional occupations were all significantly and positively associated with binge drinking in adulthood. Both psychological and social factors influence adult excessive alcohol consumption. Adolescent locus of control beliefs had a modest but significant effect on adult binge drinking 26 years later. Detailed information can be found here: https://www.sciencedirect.com/science/article/pii/S0741832916301677?via%3Dihub

For more information about published work using this study, see cls.ucl.ac.uk

Expected Benefits:

The British Cohort Study 1970 (BCS70) is one of Britain’s world renowned national longitudinal birth cohort studies. It follows all those born in one week in 1970 through the course of their lives, charting the effects of experiences in early life on outcomes and achievements later on. They show how histories of health, wealth, education, family and employment are interwoven for individuals and vary between them.

The study has its origins in the British Births Survey in which information was gathered about almost 17,500 babies. The original study focused on the circumstances and outcomes of birth but since then the study has broadened in scope to map all aspects of health, education, social and economic development. Forthcoming sweeps will provide an updated picture of the circumstances and experiences of those born in the early seventies in England, Scotland and Wales and will help develop an understanding of their progress into the latter period of their lives.

The study is run by the Centre for Longitudinal Studies (CLS), at the UCL Institute of Education and funded by the Economic and Social Research Council (ESRC).

Since 1970 information has been gathered from the cohort on nine previous occasions with the scope of enquiry broadened from a strictly medical focus at birth, to encompass physical and educational development at the age of seven, physical, educational and social development at the ages of eleven and sixteen, and then to include economic development and other wider factors at ages 23, 33, 42, 44, 46, 50 and 55. The age 46 study is currently in the field and is expected to end in the summer of 2018. Future sweeps of the study are planned to take place every five years.

The data collected by the study is used extensively by researchers in the UK and elsewhere and has had much impact on policy over the years, for example the Welsh Government policy on early years planning - http://www.closer.ac.uk/news-opinion/2013/welsh-governments-early-years-childcare-plan-draws-evidence/ The continuing success of the study will be underpinned by the successful matching of untraced cases.

The information collected during the Age 46 Survey will enable researchers to uncover life course and inter-generational factors which contribute to healthy ageing among this generation, and thus to inform the development of preventative health policies across the whole of life that will expand healthy life expectancy, and reduce the burden of ill-health and disease at older ages.

Benefits of the list cleaning:

Submitting the cohort for list cleaning will allow the researchers to recontact the participants who CLS have lost touch with and give them the opportunity to re-engage or clearly state that they wish to withdraw. It will also ensure that literature goes to the correct name and address. It will also ensure that no contact will be made with participants who have died.

Outputs:

The data file supplied by NHS Digital as part of this list clean application will be processed within CLS and entered into CLS’s secure confidential address database, i.e. CLS will load more recent addresses into the database. Furthermore, any updated addresses will be used by NatCen to invite study members to take part in the current survey. All BCS70 study members’ contact information is held in this secure confidential address database at CLS.

Any study members choosing not to take part in the study are flagged on this database with a code denoting whether their refusal is temporary (i.e. for a particular wave/survey) or permanent (i.e. they wish to have no further involvement in the study). Any previously deposited anonymised survey data for a study member and confidential data from the address database are retained unless the study member specifically asks us not to, in which case this data is securely deleted.

These addresses obtained from NHS Digital will be used to invite study members with whom we have lost contact (classed as ‘UNTRACED’) to take part in the BCS70 Age 46 survey. However, the data received via this application is never sent to, or published via, the UK Data Service.

With regard to a request for 'withdrawal' from a participant, CLS classifies it as either a 'withdrawal from the current survey' or a 'withdrawal from the study', and these are handled slightly differently:

• Withdrawal from the current survey: CLS will flag this on its computer system to indicate that the participant will not be taking part in the current survey and the reason for not wanting to take part is also recorded. For example, they may just not have the time to take part. Therefore there will be no further contact with the participant for the duration of the current survey but they will be invited to take part in the next survey.

• Withdrawal from the study: CLS will flag this on its computer system as a permanent refusal to indicate that the participant will not be taking any further part in the study itself and the reason for this type of withdrawal is also recorded for analysis purposes. Therefore there will be no further contact with the participant for the remainder of the longitudinal study. If this request is received in writing then CLS will acknowledge the request and notify the participant that they have been flagged and will no longer be contacted or receive any further communications. This request may sometimes be accompanied by a request for the destruction of their data.

Outputs for the List Clean:
The main outcome from the BCS70 Age 46 survey will be a fully documented, anonymised research dataset and this will be archived with the UK Data Service in late 2019 to provide a strategically important resource for UK Social Science, including researchers in health and social care.

Processing:

ACTIVITY 1. NHS address tracing. CLS wish to use the patient status and tracking products, which use NHS registration data, to trace as many of the 1370 supplied BCS70 study members as possible, either by finding new address details or verifying existing address details for the cohort.

1.1 CLS will supply NHS Digital with a file of 1370 study members to match to NHS data. The file supplied will only contain eligible study members who have participated in at least one wave of BCS70. It will not include study members known to have died or to have withdrawn from the study. The file will contain the following data items:

- CLS identifier
- First name
- Last name
- Middle name (where available)
- Date of birth
- Sex
- Last known address, and postcode
- NHS Number

1.2 CLS want all 1370 cases to be sent for auto-matching.

1.3 Once the auto-matching process is complete, CLS want any unsuccessful cases to be put through for operator matching.

1.4 NHS Digital would supply the following details to CLS:

- CLS identifier
- Latest surname
- Latest forename
- Latest middle name (where available)
- Date of birth
- Gender
- Latest address and postcode
- Fact of Death
- Date of address registration or update
- NHS Number

In addition to the receipt of any 'new' matched address information for the study members, CLS would like NHS Digital to add an additional variable that describes the outcome of the matching process to the data that is returned to CLS – that is, this additional variable will allocate each study member to one of the following three categories:

• new/different address found,
• existing address confirmed,
• no match found.
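
The three outcome categories above could be represented as follows. This is only an illustrative sketch: the source does not describe how NHS Digital derives the variable, so the function name, the comparison rule (ignoring case and extra whitespace) and the example addresses are all assumptions.

```python
from enum import Enum
from typing import Optional


class MatchOutcome(Enum):
    """The three categories requested for the additional outcome variable."""
    NEW_ADDRESS_FOUND = "new/different address found"
    EXISTING_ADDRESS_CONFIRMED = "existing address confirmed"
    NO_MATCH_FOUND = "no match found"


def _normalise(address: str) -> str:
    # Assumed comparison rule: ignore case and repeated whitespace only.
    return " ".join(address.upper().split())


def classify_outcome(supplied_address: str,
                     returned_address: Optional[str]) -> MatchOutcome:
    """Hypothetical helper: assign one study member to an outcome category."""
    if returned_address is None:
        return MatchOutcome.NO_MATCH_FOUND
    if _normalise(supplied_address) == _normalise(returned_address):
        return MatchOutcome.EXISTING_ADDRESS_CONFIRMED
    return MatchOutcome.NEW_ADDRESS_FOUND


# Made-up addresses, for illustration only.
print(classify_outcome("1 High St, AB1 2CD", "2 Low Rd, EF3 4GH").value)
```

A real matching service would probably apply stricter address standardisation than this; the sketch only shows that each returned record falls into exactly one of the three categories.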

Understanding excess child and adolescent mortality in the UK — DARS-NIC-141410-W6H4Y

Opt outs honoured: No - data flow is not identifiable, No (Excuses: Does not include the flow of confidential data)

Legal basis: Health and Social Care Act 2012 – s261(1) and s261(2)(b)(ii), Health and Social Care Act 2012 s261(1) and s261(2)(b)(ii), Health and Social Care Act 2012 s261(2)(b)(ii), Health and Social Care Act 2012 - s261 - 'Other dissemination of information', Health and Social Care Act 2012 s261(2)(a)

Purposes: No (Academic)

Sensitive: Non Sensitive, and Non-Sensitive, and Sensitive

When:DSA runs 2018-11 – 2021-11 2019.04 — 2025.04. SMLS reported a DPA serious incident; breached contract — audit report.

Access method: One-Off

Data-controller type: UNIVERSITY COLLEGE LONDON (UCL)

Sublicensing allowed: No

AGD/predecessor discussions: igard-minutes---9-july-2020-final.pdf, igard-minutes-13th-december-2018-final.pdf, igard-minutes---20th-august-2020-final.pdf, igard-minutes-24th-january-2019---final.pdf

Datasets:

Hospital Episode Statistics Admitted Patient Care
Hospital Episode Statistics Outpatients
Civil Registration - Deaths
Hospital Episode Statistics Accident and Emergency
HES:Civil Registration (Deaths) bridge
Hospital Episode Statistics Critical Care
Civil Registration (Deaths) - Secondary Care Cut
Emergency Care Data Set (ECDS)
Civil Registrations of Death - Secondary Care Cut
Hospital Episode Statistics Accident and Emergency (HES A and E)
Hospital Episode Statistics Admitted Patient Care (HES APC)
Hospital Episode Statistics Critical Care (HES Critical Care)
Hospital Episode Statistics Outpatients (HES OP)
Civil Registrations of Death
HES-ID to MPS-ID HES Admitted Patient Care
HES-ID to MPS-ID HES Outpatients

Type of data: Anonymised - ICO Code Compliant, Identifiable

Objectives:

University College London (UCL) are the sole Data Controller who also process data for this project. There are no other organisations involved in the project.

This project was part of a successful application for a Medical Research Council (MRC) Clinical Research Training Fellowship. The project will contribute to a PhD. The funder is not involved in any aspect of the analysis.

The objective of this project is to explore why the rate at which children and young people (CYP) die in the United Kingdom is higher than in many other developed countries.

In the 1970s UK child mortality rates (the number of child deaths per 100,000 population) were similar to those in comparable wealthy nations, and in many areas the UK performed well. Although UK child mortality has been falling since then, the rate of decline has been slower than in other countries, and the UK now has one of the highest child mortality rates in Europe. If the UK had a mortality rate similar to Sweden (the best performing country), about 2000 fewer children would die each year, or 5 fewer a day.

Aims of the study:

1) Identify causes of death in CYP 0-24 where the UK performs poorly compared to similar countries using publicly available data provided by the WHO World Mortality Database.

2) Analyse geographic and socioeconomic variability in mortality outcomes by cause for CYP 0-24 within England and Wales using death certification data provided by the Office for National Statistics (Office for National Statistics. (2017). Death Registrations in England and Wales, 1993 – 2016: Secure Access. [data collection] 2nd Edition. Accessed via UK Data Service).

3) Analyse the contribution of health service factors to mortality for CYP 0-24 in England for causes of death identified in aims 1 and 2 (i.e. causes of death where the UK performs poorly internationally and where there is wide geographic and socioeconomic variability in outcomes). This will require analysing data on health service use prior to death provided by Hospital Episode Statistics linked with Civil Registration (Deaths) data, requested from NHS Digital.

Analyses within aim 3 will be performed by age-group / sex as appropriate and will include:
a. Variability in the contribution of health service factors to predominant mortality causes for CYP in England by NHS provider Trust in children and young people 0-24.
b. Variability in the contribution of health service factors to predominant mortality causes for CYP in England by demographic factors (e.g. socioeconomic status).
c. An analysis of how the contribution of health service factors to predominant mortality causes for CYP in England has changed over time 2007 – 2017.

For aim 3 processing, the data subjects are all children and young people aged 0-24 who have accessed secondary health services in England between 2007 and latest date available. The data requested is Civil Registration (Deaths) Secondary Care which is to be linked to Hospital Episode Statistics (HES) Accident and Emergency, HES Outpatients, HES Admitted Patient Care, HES Critical Care.

The datasets requested are Hospital Episode Statistics (HES) Accident and Emergency, HES Outpatients, HES Admitted Patient Care, HES Critical Care, Civil Registration (Deaths) Secondary Care. The data are pseudonymised (no identifiable variables will be required).

The data requested will achieve the identified aim by allowing cohorts to be identified of CYP who have accessed secondary health services with the predominant conditions where UK mortality is higher than in other wealthy countries within each age-group and by sex. This will allow a comparison of patterns of healthcare usage amongst CYP who died of these conditions with age matched controls, who did not die but presented to health services with the same diagnosis. The predominant conditions will be determined from the analysis outcomes of aim 1 and 2.
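
The comparison described above, CYP who died of a condition set against age-matched controls who presented with the same diagnosis but did not die, might be set up along these lines. The source does not specify the matching method, so exact matching on diagnosis and age in years, and the field names `id`, `diagnosis` and `age`, are assumptions made for illustration.

```python
from collections import defaultdict
from typing import Dict, List


def match_controls(cases: List[dict], survivors: List[dict]) -> Dict[str, List[str]]:
    """Hypothetical sketch: list candidate controls for each case.

    A survivor is a candidate control when they share the case's
    diagnosis and age in years (the matching rule is an assumption).
    """
    pool = defaultdict(list)
    for person in survivors:
        pool[(person["diagnosis"], person["age"])].append(person["id"])
    return {
        case["id"]: list(pool.get((case["diagnosis"], case["age"]), []))
        for case in cases
    }


# Made-up records, for illustration only.
cases = [{"id": "c1", "diagnosis": "asthma", "age": 12}]
survivors = [
    {"id": "s1", "diagnosis": "asthma", "age": 12},
    {"id": "s2", "diagnosis": "asthma", "age": 15},
    {"id": "s3", "diagnosis": "epilepsy", "age": 12},
]
print(match_controls(cases, survivors))
```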

The years of data that will be requested are 2007 to latest date available. These dates were defined due to:

a) Constraints of data availability
In order to examine healthcare utilisation across all types of secondary care use, the study will require data for years where all datasets are available (HES Accident and Emergency, HES Outpatients, HES Admitted Patient Care 2007/2008; HES Critical Care 2008/2009).

b) Examine healthcare use prior to death
Multiple years of data are required to examine healthcare use (planned outpatient appointments / missed appointments / emergency admissions) over three year periods amongst CYP who died compared with those who did not die but attended secondary services with the same diagnosis. These patterns of healthcare use will be used as markers of severity, standard of care received, and predictors of mortality risk in the years prior to death.

c) The need to combine deaths over several years for some causes
It will be necessary to combine deaths/admission episodes over 3-5 year periods due to anticipated low numbers in some age / sex groups / regions of the country for some causes.

d) To examine trends over time
A key aim of this project is for a longitudinal analysis to examine trends in any association between healthcare utilisation and mortality over time.

The geographic spread of the data (England) will allow for an analysis of how patterns of healthcare use prior to death vary by NHS provider trust/geographic region in England.

The evidence for excess UK mortality extends throughout the early life course, with high total mortality amongst infants and 1-4 year olds, and high non-communicable diseases (NCD) mortality for all CYP age-groups, particularly adolescents and young people (10-24). In order to fully explore the contribution of health service factors to excess CYP mortality causes, and how this varies by age, the project will require data on secondary healthcare use and mortality within England for children and young people 0-24.

It was considered at length whether data could be filtered to specific conditions of relevance. This project will require data for all secondary care attendance in CYP 0-24, linked to mortality outcomes, for all causes over the study period (2007 – 2016). Causes of death will be mapped to the Global Burden of Disease mortality hierarchy across 4 levels. For example, acute lymphoblastic leukaemia (level 4) is classified within leukaemias (level 3), neoplasms (level 2) and non-communicable diseases (NCD) (level 1). Due to the low number of deaths in CYP in each year/sex/age group, it will not always be possible to analyse mortality by level 4 cause, and causes may need to be aggregated by level 3, level 2, or even level 1 group. The level at which a cause of death can be analysed will only be determined after the number of deaths/attendances to secondary care within the dataset (by sex/age group/year) are known. This need to be flexible to allow grouping of causes over different levels depending on numbers of deaths and admissions will mean it will not be possible to perform the analysis if data are only requested on mortality and healthcare use for specific causes. Thus, it is not possible to limit the request to only specific conditions.
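
The flexible grouping described above can be sketched as a roll-up over the four-level hierarchy: try the most detailed level first and move to a coarser level whenever any group is too small. The hierarchy path for acute lymphoblastic leukaemia is the example given in the text; the second cause, the function name and the `min_count` threshold are assumptions added for illustration, since the source sets no specific threshold.

```python
from collections import Counter
from typing import Dict, List, Optional, Tuple

# Path from level 1 down to level 4 for each level 4 cause.
# The first entry is the example from the text; the second is an
# illustrative assumption, included so there is something to aggregate.
HIERARCHY: Dict[str, Tuple[str, str, str, str]] = {
    "acute lymphoblastic leukaemia": (
        "non-communicable diseases", "neoplasms", "leukaemias",
        "acute lymphoblastic leukaemia"),
    "acute myeloid leukaemia": (
        "non-communicable diseases", "neoplasms", "leukaemias",
        "acute myeloid leukaemia"),
}


def finest_analysable_level(
    level4_causes: List[str], min_count: int
) -> Optional[Tuple[int, Dict[str, int]]]:
    """Return the most detailed level at which every group has at least
    min_count deaths, with the counts at that level; None if no level does."""
    for level in (4, 3, 2, 1):
        counts = Counter(HIERARCHY[cause][level - 1] for cause in level4_causes)
        if counts and min(counts.values()) >= min_count:
            return level, dict(counts)
    return None


# Made-up counts: 6 and 3 deaths are too few at level 4 with a threshold of 5,
# so the two causes are combined at level 3.
deaths = ["acute lymphoblastic leukaemia"] * 6 + ["acute myeloid leukaemia"] * 3
print(finest_analysable_level(deaths, min_count=5))
```

Because the usable level is only known once the counts are in hand, a request restricted in advance to particular level 4 causes could not support this kind of roll-up, which is the point the paragraph makes.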

The study will only use the minimum amount of personal data required to perform the analyses. The data requested will not be identifying, and will be pseudonymised. Data will then be aggregated by 5-year age group, cause of death group and 3-5 year period (by year of death/admission).

The research proposal and dissemination plan were presented to members of the National Children’s Bureau Young Research Advisors (YRAs) group in March 2018, as part of a Patient and Public Involvement and Engagement initiative. The YRAs are a diverse group of CYP recruited from across the country who have received training in research methods and policy. A focus group of 25 young people aged 7-22 (and parents) was held to discuss the acceptability of the research methods (including the use of data without consent). The YRAs were supportive of the importance of the research and the necessity of analysing data without consent. Specific feedback regarding strategies to inform young people and their families of the research was incorporated into the project proposal and transparency statement.

The individual accessing the data under this agreement is a substantive employee of UCL and is funded by an MRC Clinical Research Training Fellowship which is held by UCL.

The wider study is the PhD project, which includes analysis where the study will use HES data, and the other analyses described in the application. The PhD project has three aims: 1) Identify causes where the UK performs poorly compared with other wealthy nations; 2) Analyse variation in cause-specific mortality by region of the UK/England and socioeconomic status; and 3) Compare health service use for predominant causes of child and young person mortality amongst children who died with those who did not die. The third strand is the part of the PhD that will use the Data disseminated under this agreement.

Expected Benefits:

This project will increase understanding of high UK child and young person mortality, directly impacting on efforts to improve outcomes, and thus enhance the quality of life, health and wellbeing of the population.

The research findings will achieve these benefits by informing public health, healthcare systems, and healthcare financing research. This has the potential to directly influence health policy development for CYP, leading to reform of services and improved outcomes. NHS England is currently developing its 10-year long-term plan, working closely with people on the project including the Principal Investigator, and reducing excess child mortality is a central plank in planned work for CYP. Other countries (e.g. the Netherlands) have significantly reduced infant and child mortality over the past decade through targeted interventions based upon knowledge of where the problems lie – and this research will provide data to inform similar targeting of interventions in England.

In addition to the moral case for reducing CYP mortality, there are substantial economic benefits. CYP are the workers of the next 20 years and the parents of the next generation. Higher mortality amongst CYP in the UK compared with other wealthy nations puts the UK at a direct economic productivity disadvantage: essentially the UK is losing 1000 potential workers each year compared with the European average, and 2000 per year compared with the best in Europe. Improving the survival of healthy children and young people in the UK will directly contribute to national wealth and productivity.

The number of healthcare users affected by excess CYP mortality, and so who would potentially benefit as a result of this research, is large. Reducing current CYP mortality to be the best in Europe would save 2000 lives a year, or 5 a day, and as UK outcomes are set to further diverge from other wealthy countries, this number is likely to increase.

Analysing the contribution of health service factors to excess UK mortality for CYP 0-24 (requiring the data processing activities described above) will directly influence health service delivery reform. The findings will enable health providers (e.g. NHS England; Clinical Commissioning Groups; Trusts) to identify variation in performance by NHS provider within certain groups of causes of death. This will allow local services to learn from the best performing units, and so introduce specific interventions to improve outcomes. These benefits may be realised within 2-3 years of finalisation of the research.

In the medium term (3-5 years), these findings will benefit research into implementing different models of accessing paediatric specialists in the community, already established in the best performing countries for CYP mortality. The findings may also be directly used to plan studies investigating how to intervene in improving child health services; for example, the Evelina Children and Young People's Health Partnership.

In the longer term (5-10 years), this analysis could be used as evidence to support a fundamental change in the way national health services are delivered for CYP in the UK. This is likely to include improving integration between primary and secondary services, and a move away from the UK’s predominately hospital-centric model. This will improve health service efficiency and sustainability, further benefiting healthcare users.

This study is in support of a PhD research study.

Outputs:

The primary output will be the analysis of secondary healthcare usage amongst CYP prior to death in England for causes where UK mortality is poor, compared with age matched controls. This will provide estimates of contributions of a range of health system and provider factors to excess CYP UK mortality.

The first stage of analysis will be completed within 6 months of gaining access to the data (Jun 2019) and the aim is to publish preliminary results within 1 year (Dec 2019). The final analysis will be completed within 18 months (Jun 2020).

Each sub-analysis within aim 3 will form a separate publication exploring the contribution of health service factors to mortality outcomes by NHS provider trust, socio-economic status and changes over time. The primary targets for publication will be peer-reviewed journals including the Lancet, British Medical Journal and Archives of Disease in Childhood. Estimated publication date for these analyses will be Dec 2019 – Sept 2020.

The wider project will contribute to a PhD thesis which will be submitted to UCL in September 2020.

Findings will also be presented at national and international conferences such as those of the Royal College of Paediatrics and Child Health (RCPCH) and the International Paediatric Association (IPA), and through public and media initiatives organised through UCL and King's College London. Other professional bodies such as the Royal College of Nursing, Royal College of General Practitioners and the British Association for Child and Adolescent Public Health will provide further opportunities for knowledge exchange and communication to a range of interested parties. Charities focusing on CYP will also be potential partners for dissemination and will include the NSPCC and the Child Accident Prevention Trust, who actively campaign to reduce UK child mortality. All publications, conference presentations, media engagements and other dissemination activities are promoted on Twitter, via institutional (UCL) accounts and the Principal Investigator’s account (>1500 followers).

The aims, methods and ethical considerations of this project were presented to members of the National Children’s Bureau Young Research Advisors group in March 2018. As part of this process, the Young Research Advisors expressed interest in presenting the main research findings in an accessible way for young people, which will be facilitated by the National Children’s Bureau. This may include a written summary of the report, short videos, animations, or engaging with social media platforms.

All outputs will contain only data that is aggregated with small numbers suppressed in line with the HES Analysis Guide.

Processing:

This study will require Civil Registration (Deaths) Secondary Care to be linked to Hospital Episode Statistics (HES) Accident and Emergency, HES Outpatients, HES Admitted Patient Care, HES Critical Care. NHS Digital will perform this linking of data.

HES data on CYP secondary healthcare use in England between 2007 and latest available date, linked with CYP from civil registration (deaths) data, will flow out of NHS Digital to UCL.

The NHS Digital data will be transferred to UCL Data Safe Haven and the NHS Digital data will only be analysed within the UCL Data Safe Haven. Data will be fully anonymised prior to the analysis. The individual level data will be aggregated by cause of death group, 3-5 year period (by year of death/admission), 5-year age group (1-4, 5-9, 10-14, 15-19, 20-24) and sex. The anonymised data will then be extracted from UCL Data Safe Haven after analysis.
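
The aggregation step described above, followed by suppression of small cells, could look roughly like this. It is a sketch under stated assumptions: the record field names, the separate under-1 band (the text lists bands from 1-4 upward) and the `threshold` parameter are all hypothetical, and the actual suppression rules are those of the HES Analysis Guide rather than anything shown here.

```python
from collections import Counter
from typing import Dict, List, Optional, Tuple

# Age bands as listed in the text.
AGE_BANDS = [(1, 4), (5, 9), (10, 14), (15, 19), (20, 24)]


def age_band(age: int) -> str:
    """Map an age in years to its band label."""
    if age == 0:
        return "<1"  # assumption: infants kept as their own band
    for low, high in AGE_BANDS:
        if low <= age <= high:
            return f"{low}-{high}"
    raise ValueError(f"age {age} is outside the 0-24 study range")


def aggregate_with_suppression(
    records: List[dict], threshold: int
) -> Dict[Tuple[str, str, str, str], Optional[int]]:
    """Count records per (cause group, period, age band, sex) cell and
    replace any count below the threshold with None (suppressed)."""
    counts = Counter(
        (r["cause_group"], r["period"], age_band(r["age"]), r["sex"])
        for r in records
    )
    return {cell: (n if n >= threshold else None) for cell, n in counts.items()}


# Made-up records, for illustration only.
records = (
    [{"cause_group": "neoplasms", "period": "2007-2011", "age": 12, "sex": "F"}] * 6
    + [{"cause_group": "neoplasms", "period": "2007-2011", "age": 3, "sex": "M"}] * 2
)
for cell, n in aggregate_with_suppression(records, threshold=5).items():
    print(cell, "suppressed" if n is None else n)
```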

The analysis of secondary health care usage amongst CYP prior to death in England for causes where the UK performs poorly, compared with age matched controls, will then be performed on the anonymised dataset.

All outputs will contain only data that is aggregated with small numbers suppressed in line with the HES Analysis Guide.

Data will only be accessed by individuals within UCL who have authorisation to access the data for the purpose described, all of whom are substantive employees of UCL.

The data will not be linked with any record level data. There will be no requirement nor attempt to re-identify individuals from the data. The data will not be made available to any third parties other than those specified except in the form of aggregated outputs with small numbers suppressed in line with the HES Analysis Guide.

NHS Digital reminds all organisations party to this agreement of the need to comply with the Data Sharing Framework Contract requirements, including those regarding the use (and purposes of that use) by “Personnel” (as defined within the Data Sharing Framework Contract ie: employees, agents and contractors of the Data Recipient who may have access to that data).

EVenti: The Prognostic Performance of the Enhanced Liver Fibrosis Test in UK Patients with Chronic Liver Disease Assessed 20 Years After Recruitment to the EUROGOLF study (EVenti). — DARS-NIC-414309-R7H4W

Opt outs honoured: Yes (Excuses: Section 251 NHS Act 2006)

Legal basis: Health and Social Care Act 2012 - s261(5)(d); National Health Service Act 2006 - s251 - 'Control of patient information'.

Purposes: Yes (Academic)

Sensitive: Sensitive, and Non-Sensitive

When:DSA runs 2024-12 – 2027-12 2025.02 — 2025.03. SMLS reported a DPA serious incident; breached contract — audit report.

Access method: One-Off

Data-controller type: UNIVERSITY COLLEGE LONDON (UCL)

Sublicensing allowed: No

AGD/predecessor discussions: AGD minutes - 8th August 2024 final.pdf

Datasets:

Cancer Registration Data
Civil Registrations of Death
Emergency Care Data Set (ECDS)
Hospital Episode Statistics Accident and Emergency (HES A and E)
Hospital Episode Statistics Admitted Patient Care (HES APC)
Hospital Episode Statistics Critical Care (HES Critical Care)
Hospital Episode Statistics Outpatients (HES OP)

Type of data: Identifiable

Expected Benefits:

The findings of this research study are expected to contribute to evidence-based decision-making for policy-makers, local decision-makers such as doctors, and patients to inform best practice to improve the care, treatment and experience of health care users relevant to the subject matter of the study.

Chronic liver disease is the third most common cause of death in the UK. In all cases, liver injury leads to liver fibrosis that accumulates to cause cirrhosis which may lead to fatal complications including liver failure and liver cancer. Liver fibrosis does not cause symptoms and is silent in most cases until complications of cirrhosis develop. At this point treatment is often too late.

New tests have recently made it possible to detect liver fibrosis before serious irreversible harm has developed. The EUROGOLF study set out to identify a blood test that could assess liver fibrosis without the need for a liver biopsy (inserting a needle into the liver). Through the EUROGOLF study, the Enhanced Liver Fibrosis (ELF) test was discovered through the examination of blood samples obtained from 921 patients with chronic liver disease between 1998 and 2000, including 457 patients in the UK. Twenty years after the initial recruitment of this cohort, UCL want to determine the long-term prognostic performance of the ELF test.

This work may:
Better inform those planning health services and programmes for those with liver cirrhosis

It is hoped that through publication of findings in appropriate media, the findings of this research will add to the body of evidence that is considered by the bodies, organisations and individual care practitioners charged with making policy decisions for or within the NHS or treatment decisions in relation to specific patients.

Measures will be taken to attempt to disseminate the results to the general public through press releases to major media outlets including the print press and radio and television journalists. The British Liver Trust is aware of this work and will be informed of the results, along with the British Liver Alliance.

Outputs:

The expected outputs of the processing will be:
Submissions to peer reviewed journals. It is anticipated that at least two publications will be submitted within 12 months of receipt of the data from NHS England.
Presentations to the original sites in England that participated in EUROGOLF so that the results can be disseminated to colleagues and patients in the year following completion of the data analyses.
Presentations at the British Association for the Study of the Liver annual meeting, the European Association for the Study of the Liver annual meeting and the American Association for the Study of Liver Diseases meeting in the cycle of the year following completion of the analyses.

The outputs will not contain NHS England Data and will only contain aggregated information with small numbers suppressed as appropriate in line with the relevant disclosure rules for the dataset(s) from which the information was derived.

The outputs will be communicated to relevant recipients through the following dissemination channels:
Journals
Social media
Posters displayed at appropriate conferences.
Press/media engagement
Public promotion of the research via the UCL webpages and the study's own webpages

Outputs will be produced from approximately six months after receipt of the data, and will be produced on an on-going basis.

Processing:

UCL will transfer data to NHS England. The data will consist of identifying details, specifically: NHS Number, Date of Birth, Gender and a unique person ID. This will allow for the cohort to be linked with NHS England data.

NHS England will provide the relevant records from the HES/ECDS, Deaths and Cancer Registration datasets to the recipient. The Data will contain no direct identifying data items but will contain a unique person ID which can be used to link the Data with other record level data already held by the recipient.

Once received, the requested data will be uploaded to the UCL Data Safe Haven. Amazon Web Services (AWS) is a processor acting under the instructions of UCL. AWS's role is limited to secure backup of data stored in UCL's Data Safe Haven. UCL uses offsite data centre services provided by VIRTUS data centre. VIRTUS does not have access to the data.

The Data will be accessed by authorised personnel via remote access.

The Controller must confirm and provide evidence upon audit by NHS England that access via any remote device complies with the data security obligations within this DSA and the Data Sharing Framework Contract.

For remote access:
- Remote access will only be from secure locations situated within the territory of use (as further restricted elsewhere within the DSA if so done) stated within this DSA;
- Access controls granting users the minimum level of access required are in place;
- Remote access is only via secure connections (e.g., VPNs or secure protocols) to protect data;
- Multifactor authentication (MFA) is required for remote access;
- Device security, including up-to-date software and operating systems, antivirus software, and enabled firewalls are utilised for the remote access;
- All remote access is undertaken within the scope of the organisation's DSPT (or other security arrangements as per this DSA) and complies with the organisation's remote access policy.

The above applies in addition to any condition set out elsewhere within the DSA (e.g. who may carry out processing, and for what purpose).

Remote processing will be from secure locations within England. The data will not leave England at any time.

Access is restricted to substantive employees or agents of UCL who have authorisation from the Principal Investigator.

All personnel accessing the Data have been appropriately trained in data protection and confidentiality.

The Data will be linked at the person record level with data collected during the original EUROGOLF study.

Researchers from UCL will process the Data for the purposes described above.

OLIVE: Improving the early detection of lung cancer in never-smokers — DARS-NIC-740806-W2F6Q

Opt outs honoured: No (Excuses: Does not include the flow of confidential data)

Legal basis: Health and Social Care Act 2012 s261(2)(a)

Purposes: No (Academic)

Sensitive: Non-Sensitive

When:DSA runs 2024-09 – 2027-09 2025.03 — 2025.03. SMLS reported a DPA serious incident; breached contract — audit report.

Access method: One-Off

Data-controller type: UNIVERSITY COLLEGE LONDON (UCL)

Sublicensing allowed: No

Datasets:

NDRS Cancer Pathway
NDRS Cancer Registrations
NDRS Somatic Molecular Dataset

Type of data: Anonymised - ICO Code Compliant

Objectives:

University College London (UCL) requires access to NHS England data for the purpose of the following research project:
OLIVE: Improving the early detection of lung cancer in never-smokers

The following is a summary of the aims of the research project provided by UCL:

Lung cancer is the second most common cancer and the biggest cause of cancer death both worldwide and in the UK. Only 39% of adults diagnosed with lung cancer survive more than one year and this is likely related to late diagnosis: half of all individuals are diagnosed with stage IV disease at presentation and a quarter are diagnosed through emergency care. Although commonly associated with smoking, lung cancer also occurs in never-smokers, defined as adults who have smoked fewer than 100 tobacco cigarettes in their lifetime. Up to 25% of lung cancer worldwide is diagnosed in never-smokers; lung cancer in never-smokers (LCINS) is the seventh most common cause of cancer death.

In the UK, people who have smoked are invited for screening with scans. However, we do not have any methods to detect lung cancer in never-smokers (LCINS) early. LCINS is generally not very well understood. If this is changed, more lives can be saved and more lives of people living with lung cancer can be improved. With the proportion of never-smokers increasing worldwide and in Europe, LCINS mortality is likely to increase further. Methods to reduce mortality are clearly needed and this is likely to be best achieved by ensuring adults are diagnosed early and at an earlier stage.

OLIVE is a PhD project with four work packages.
> WP1: A systematic review will be performed to identify and summarise factors which predict LCINS.
> WP2: Registry data will be analysed to examine the relationship between sociodemographic factors (such as gender and ethnicity) and LCINS (including outcomes like survival).
> WP3: Standard statistical and machine learning techniques in multiple datasets will be used to create and validate a risk prediction model to estimate the five-year incidence and mortality risk of LCINS.
> WP4: Machine learning will be used to review primary care data of never-smokers to identify patterns that will contribute to early diagnosis.

The NHS England Data requested under this Agreement is only required for WP2. Only aggregated results with small numbers suppressed generated from the processing of NHS England Data for WP2 may be used for the other WPs. The aim of this part of the project is to describe LCINS in the UK and the relationship between factors such as age, gender and ethnicity and LCINS.

The following NHS England Data will be accessed:
> NDRS Cancer Registration and NDRS Somatic Molecular Testing - necessary to understand important variables which may be associated with the outcomes of stage and survival from LCINS.
> NDRS Cancer Pathway - necessary to understand treatment which may affect outcomes; treatment data is required as it may be linked to survival from lung cancer.

The level of the Data will be:
> Pseudonymised

The Data will be minimised as follows:
> Limited to a study cohort identified by NHS England as meeting the following criteria:
- Inclusion Criteria
For cases: adults over the age of 21 with a diagnosis of primary lung cancer
- Exclusion Criteria
Adults who do not meet inclusion criteria
> Limited to data between 2015-2022.
> Limited to a diagnosis of primary lung cancer
> Limited to the following geographic areas: England.
> Data requested will be minimised to avoid any chance of re-identifiability e.g. age in years instead of birth date.
> Ethnicity will be requested as it is likely to be associated with diagnosis of and survival from lung cancer.
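
As an illustration only, the minimisation criteria listed above could be expressed as a filter over registry records. This is a minimal sketch and not NHS England's extraction logic: the `Registration` class and all of its field names are hypothetical, the sample records are made up, and the 2015-2022 range is assumed to be inclusive.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Registration:
    # All field names are hypothetical; they do not reflect the NDRS schema.
    age_at_diagnosis: int        # age in years, not date of birth
    diagnosis_year: int
    primary_lung_cancer: bool
    country: str


def in_study_cohort(r: Registration) -> bool:
    """Apply the inclusion criteria described in the text."""
    return (
        r.age_at_diagnosis > 21                 # adults over the age of 21
        and 2015 <= r.diagnosis_year <= 2022    # assumed inclusive range
        and r.primary_lung_cancer               # primary lung cancer only
        and r.country == "England"              # geographic restriction
    )


# Made-up records, one per exclusion reason plus one eligible case.
records = [
    Registration(64, 2018, True, "England"),
    Registration(21, 2018, True, "England"),    # not over 21
    Registration(70, 2014, True, "England"),    # outside 2015-2022
    Registration(55, 2020, False, "England"),   # not primary lung cancer
]
cohort = [r for r in records if in_study_cohort(r)]
```

Holding age in years rather than a date of birth mirrors the stated aim of reducing the chance of re-identification.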

UCL as the research sponsor is the controller as the organisation responsible for ensuring that the Data will only be processed for the purpose described above.

The lawful basis for processing personal data under the UK GDPR is:
> Article 6(1)(e) - processing is necessary for the performance of a task carried out in the public interest or in the exercise of official authority vested in the University College London (UCL);

The lawful basis for processing special category data under the UK GDPR is:
> Article 9(2)(j) - processing is necessary for archiving purposes in the public interest, scientific or historical research purposes or statistical purposes in accordance with Article 89(1) based on Union or Member State law which shall be proportionate to the aim pursued, respect the essence of the right to data protection and provide for suitable and specific measures to safeguard the fundamental rights and the interests of the data subject.

This processing is in the public interest because it could lead to a better understanding of LCINS which could ultimately save more lives and improve the lives of people living with lung cancer.

The funding comes from multiple sources. Funders include:
> NIHR Doctoral Fellowship
> Ruth Strauss Foundation, a charity which aims to provide emotional support for families to prepare for the death of a parent and raise awareness of the need for more research & collaboration in the fight against non-smoking lung cancers.

The funding is specifically for the project described.

The funders will have no ability to suppress or otherwise limit the publication of findings.

Amazon Web Services (AWS) is the processor acting under the instructions of UCL. AWS's role is limited to secure back-up of data stored in UCL's Data Safe Haven.

UCL uses offsite data centre services provided by VIRTUS data centre. VIRTUS does not have access to the data.

Data will be accessed by:
- Substantive employees of UCL
- A PhD student enrolled with UCL. The individual has completed mandatory data protection and confidentiality training and is subject to UCL's policies on data protection and confidentiality. The individual accessing the data will do so under the supervision of a substantive employee of UCL. UCL would be responsible and liable for any work carried out by the individual. The PhD student would only work on the data for the purposes described in this Data Sharing Agreement (DSA).

A Public and Patient Involvement and Engagement group helped refine the purpose of the research. The group supported the collection of the data for the purposes described above. Patients also helped to ensure that the research outcomes and variables studied are appropriate and acceptable to the public. They felt this research is important and likely to lead to meaningful change for other patients and the public. They also felt the processes, such as data storage, were acceptable and did not raise any other concerns or ethical objections. Ongoing research will be discussed to ensure it remains acceptable and gather structured feedback for change. Feedback from patient representatives has been invaluable in adapting the methods and outcomes of this research to ensure it is relevant to patients and the public. These representatives will continue to be involved to guide research and develop effective methods for implementing findings.

Expected Benefits:

This research will improve the currently limited understanding of LCINS by advancing knowledge about symptoms as well as sociodemographic and risk factors. As recommended by patient representatives and the National Cancer Equality Initiative, measuring and understanding differences helps to reduce inequalities and design appropriate interventions. Understanding symptoms may influence a future NICE guideline update on referral from primary care.

It will also contribute to creating novel risk prediction models to identify high-risk adults who will benefit from early detection strategies. Such research will be world-leading and capitalise on the co-operation between primary and secondary care already established by screening ever-smokers.

Many patient representatives say this research will increase awareness of LCINS amongst healthcare professionals and health-seeking individuals and contribute to other research and public health measures.

The use of the data could:
> help the system to better understand the health and care needs of populations.
> lead to the identification or improvement of treatments or interventions, or health and care system design to improve health and care outcomes or experience.
> advance understanding of regional and national trends in health and social care needs.
> advance understanding of the need for, or effectiveness of, preventative health and care measures for particular populations or conditions.
> inform planning health services and programmes, for example to improve equity of access, experience and outcomes.
> inform decisions on how to effectively allocate and evaluate funding according to health needs.
> provide a mechanism for checking the quality of care. This could include identifying areas of good practice to learn from, or areas of poorer practice which need to be addressed.
> support knowledge creation or exploratory research (and the innovations and developments that might result from that exploratory work).

This research will improve the currently limited understanding of LCINS by advancing knowledge about symptoms as well as sociodemographic and risk factors. Measuring and understanding differences helps to reduce inequalities and design appropriate interventions. Understanding symptoms may influence a future NICE guideline update on referral from primary care. It will also contribute to creating novel risk prediction models to identify high-risk adults who will benefit from early detection strategies.

It is hoped that through publication of findings in appropriate media, the findings of this research will add to the body of evidence that is considered by the bodies, organisations and individual care practitioners charged with making policy decisions for or within the NHS or treatment decisions in relation to specific patients. Clients will need to take action based on the information provided to them in order to realise the potential improvement opportunities.

The applicant is funded by a charity, Ruth Strauss Foundation, as well as NIHR. The applicant will work with the funders and with patient representatives to ensure that the research is shared with a wider audience than the academic/scientific community.

Outputs:

The expected outputs of the processing will be:
> Annual reports to NIHR
> At least one submission to peer reviewed journals e.g. European Respiratory Journal, Lung Cancer, Thorax.
> Presentations at appropriate conferences e.g. British Thoracic Society annual conference, British Thoracic Oncology Group conference.
> Production of algorithms and risk prediction models to predict and detect lung cancer in never-smokers for use in NHS healthcare. Expected completion date 2027.
> Inclusion in PhD thesis

The outputs will not contain NHS England Data and will only contain aggregated information with small numbers suppressed as appropriate in line with the relevant disclosure rules for the dataset(s) from which the information was derived.
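
To illustrate what small-number suppression of aggregated outputs can look like, here is a minimal sketch. It is not the disclosure rule for any NDRS dataset: the threshold of 5, the `*` marker, the decision to leave zeros unsuppressed, and the example counts are all assumptions chosen for illustration.

```python
# Hypothetical threshold; the real value comes from the disclosure rules
# for the dataset from which the counts were derived.
SUPPRESSION_THRESHOLD = 5


def suppress_small_numbers(counts, threshold=SUPPRESSION_THRESHOLD, marker="*"):
    """Replace non-zero counts below the threshold with a marker.

    Zero counts are left as-is here; whether zeros may be published is
    itself a policy decision and is assumed for this sketch.
    """
    return {
        group: (marker if 0 < n < threshold else n)
        for group, n in counts.items()
    }


# Made-up aggregate counts, not real results.
stage_counts = {"Stage I": 120, "Stage II": 3, "Stage III": 48, "Stage IV": 0}
published = suppress_small_numbers(stage_counts)
```

This sketch covers primary suppression only. Real disclosure control often also needs secondary suppression so that a hidden cell cannot be recovered from row or column totals.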

The outputs will be communicated to relevant recipients through the following dissemination channels:
> Journals
> Patient engagement events at least twice a year
> Patient information leaflets, distributed either at GP practices or through work with the Ruth Strauss charity.
> Posters at GP practices and through work with the Ruth Strauss charity.
> Oral presentations at national and international meetings

The target date will be 2027 (end of PhD fellowship).

Processing:

No data will flow to NHS England for the purposes of this Data Sharing Agreement (DSA).

NHS England will provide the relevant records from the Cancer Pathway Data, Cancer Registration Data and Somatic Molecular Data datasets to University College London (UCL). The Data will contain no direct identifying data items. The Data will be pseudonymised and individuals cannot be reidentified through linkage with other data in the possession of the recipient.

The Data will not be transferred to any other location.

The Data will be stored on servers at UCL Data Safe Haven.

UCL uses offsite data centre services provided by VIRTUS data centre.

Amazon Web Services provides cloud hosting services to UCL and will store the data as contracted by UCL.

The Data will be accessed by authorised personnel via remote access.

The Controller(s) must confirm and provide evidence upon audit by NHS England that access via any remote device complies with the data security obligations within this DSA and the Data Sharing Framework Contract.

For remote access:
- Remote access will only be from secure locations situated within the territory of use (as further restricted elsewhere within the DSA if so done) stated within this DSA;
- Access controls granting users the minimum level of access required are in place;
- Remote access is only via secure connections (e.g., VPNs or secure protocols) to protect data;
- Multifactor authentication (MFA) is required for remote access;
- Device security, including up-to-date software and operating systems, antivirus software, and enabled firewalls are utilised for the remote access;
- All remote access is undertaken within the scope of the organisation's DSPT (or other security arrangements as per this DSA) and complies with the organisation's remote access policy.

The above applies in addition to any condition set out elsewhere within the DSA (e.g. who may carry out processing, and for what purpose).

Remote processing will be from secure locations within the UK. The data will not leave the UK at any time.

Access is restricted to employees of University College London (UCL).

University College London Hospitals NHS Foundation Trust (UCLH) is not permitted to access the Data.

All personnel accessing the Data have been appropriately trained in data protection and confidentiality.

The Data will not be linked with any other data.

There will be no requirement and no attempt to reidentify individuals when using the Data.

Researchers from the University College London (UCL) will analyse the Data for the purposes described above.

MR1362: Extension of NIC-349413-F1J1N - Next Steps Cohort Study — DARS-NIC-15226-X7Z9R

Opt outs honoured: N, Yes - patient objections upheld, Yes (Excuses: Section 251 NHS Act 2006)

Legal basis: Section 251 approval is in place for the flow of identifiable data, Health and Social Care Act 2012 – s261(7), Health and Social Care Act 2012 s261(7), Health and Social Care Act 2012 s261(7); National Health Service Act 2006 - s251 - 'Control of patient information'.

Purposes: No (Academic)

Sensitive: Non Sensitive, and Sensitive, and Non-Sensitive

When:DSA runs 2019-09 – 2022-09 2018.03 — 2025.03. SMLS reported a DPA serious incident; breached contract — audit report.

Access method: Ongoing, One-Off

Data-controller type: UNIVERSITY COLLEGE LONDON (UCL)

Sublicensing allowed: No

Datasets:

MRIS - List Cleaning Report
Civil Registration - Deaths
Demographics
MRIS - Cause of Death Report
MRIS - Cohort Event Notification Report
MRIS - Flagging Current Status Report
Civil Registrations of Death

Type of data: Identifiable

Objectives:

The Next Steps Longitudinal Study of Young People in England (LSYPE) is an established longitudinal study which has followed the lives of 15,620 people born in 1989/90, since year 9 of secondary school. Study members were interviewed annually between 2004 and 2010 to map their transitions through education and into adulthood and the labour market. LSYPE is the largest and most detailed research study of its kind.

The most recent round of data collection - Next Steps Age 25 survey - took place between August 2015 and September 2016. Information was collected from 7,707 cohort members on many aspects of their lives such as education, employment, health and well-being, relationships and family life, housing and finances. Additionally, during the Age 25 survey, a wide range of data linkage consents were collected, including consent to linkage of health records held by the NHS.

The study was previously managed by the Department for Education (DfE). In 2013 the Economic and Social Research Council took over the funding and the study management legally transferred to the Centre for Longitudinal Studies (CLS) at the University College London Institute of Education.

CLS have already received data under the original data agreement (NIC-316681-W7P2R) and, under the subsequent amendment (NIC-349413-F1J1N), were able to pass data to their contracted fieldwork agency NatCen to allow them to initiate/conduct the survey.

Next Steps was conducted annually between 2004 and 2010. Data collection focused on young people’s transitions into further/higher education and the labour market or to other outcomes, such as parenthood. The next wave took place in 2015 when cohort members were aged 24/25 years. The Age 25 survey gathered information about the lives of the cohort including education, employment, economic circumstances, family life, physical and emotional health and well-being, social participation and attitudes.

The ongoing success of the study depends on re-establishing and maintaining contact with as many study members as possible.

The aim of being provided with the address details and other up to date identifiers for the cohort was to trace as many study members as possible in advance of the next wave of fieldwork.

CLS supplied NHS Digital with a file of study members and their last known address, extracted from the CLS address database. CLS asked that these details are matched with NHS Registration Data and registered addresses supplied, where available.

The data supplied was entered into the secure address database and is used to maintain contact with study members and to invite them to take part in each fieldwork wave. When study members were contacted to invite them to participate in the Next Steps Age 25 study, it was made explicitly clear that they can inform CLS that they no longer wish to participate in the study and they will not be contacted again.

To conduct the age 25 survey, CLS contracted an external supplier, NatCen Social Research (the trading name of the National Centre for Social Research - www.natcen.ac.uk), to carry out the individual cohort study members’ interviews.

The survey is now completed and the aggregated outputs from the survey have now been deposited with the UK Data Archive, located at the University of Essex; no patient identifiable data is deposited. Please note that the data files supplied from NHS Digital, as part of this application, have been processed within CLS and entered into CLS’s secure address database. They have been used to maintain contact with study members and to invite them to take part in each fieldwork wave – this data is not sent to the UK Data Service.

NatCen Social Research were an external Data Processor and carried out the survey fieldwork and associated mailings for the Next Steps Longitudinal Study of Young People in England (LSYPE) Age 25 survey – specifically they were contracted to carry out: (1) Email and postal mailings to LSYPE cohort members about the study; (2) Interviews with Next Steps (LSYPE) cohort members. As the survey is now completed, the contract with NatCen has also ended; NatCen are therefore no longer acting as a data processor for CLS, and any data files have been securely deleted from NatCen’s systems.

CLS also require access to NHS Numbers for the cohort participants to be reinstated so that future matching and linkage exercises can take place, including those related to a Next Steps linked data application (NIC-51342, a separate data sharing agreement which provides HES data linked to the cohort).

Yielded Benefits:

The age 25 survey data is already providing important research evidence on transitions out of education and into early adult life, informing a range of key interlinked policy questions relating to higher education, employment, housing and family formation, and health. Data from the age 25 survey were deposited at the UK Data Service in June 2017 and have already been downloaded for over 100 research projects in many disciplines including economics, education and sociology; examples of outputs from those projects can be found at the Next Steps home page (https://cls.ucl.ac.uk/cls-studies/next-steps/), as well as publications in the Journal of Physical Activity and Health, the Journal of Adolescence and the European Journal of Public Health, and others. Its influence and impact will grow over the next few years, as it is used for research and policy on a wide range of different issues, and as the existing data is enhanced and augmented, particularly with linked administrative data.

The Next Steps study has proven to be a strong resource for researchers addressing different developing aspects of life for young people. Through this research, Next Steps has contributed to the restructuring of public opinion and policy focused on young people in England.

Next Steps being used in the reform of vocational education for young people: In 2011 the Department for Education commissioned investigative work to find out how effective the ‘vocational’ education system in the UK was in helping young adults to secure jobs. The study, conducted by Professor Alison Wolf using data from the Next Steps study, showed that young adults aged 16-19 were actively seeking work, but that around a third to a half of them struggled to find appropriate courses and jobs, and as a result changed occupations frequently and spent periods of time not in work, education or training. Based on these findings, Professor Wolf was able to make 27 specific recommendations on how to improve practical education and training opportunities for young people that will help them to get jobs with good opportunities for progression. More information on this report can be found at the link - https://assets.publishing.service.gov.uk/government/uploads/system/uploads/attachment_data/file/180504/DFE-00031-2011.pdf - The government committed to acting on all the recommendations in Professor Wolf’s report with comments addressing the input from the Wolf Report.

Next Steps being used in Government Policy on Youth Unemployment: Findings from Next Steps have been used to inform the Government’s strategy on getting young people into education, employment or training. The struggling economic climate has often been cited as the reason for the large number of young adults not in education, employment or training (NEET). However, new research using data from Next Steps has shown that there are other factors that influence whether or not a person spends time NEET. The findings showed that educational attainment at age 16 is one of the most important factors in determining a person’s future path. Forty-five per cent of those with no GCSEs had spent more than a year NEET by the time they turned 18, compared to just 4 per cent of those with five or more GCSEs at A*-C. Building Engagement, Building Futures lays out the Government’s plans to tackle the causes of youth unemployment. This strategy proposes to maximise the participation of 16-24 year olds in education, training and work. More info can be found here - https://www.gov.uk/government/publications/building-engagement-building-futures - The strategy was developed by the Department for Work and Pensions, the Department for Education, and the Department for Business, Innovation and Skills. It comes in response to recommendations of the influential Wolf Report, which also draws on evidence from Next Steps.

Next Steps being used in Anti-Bullying Campaigns: Findings from Next Steps have been used in several anti-bullying campaigns and initiatives, as well as guidance for teachers and schools about how to stop bullying. And the efforts have paid off. A recent study has shown that secondary school pupils today are less likely to be bullied. Ten thousand fewer pupils are being bullied every day than 10 years ago, a major new study of secondary school pupils has revealed. The Department for Education report compares the experiences of Next Steps study members to members of Our Future, a new study which started following Year 9 pupils in 2013. This landmark research, which involved tens of thousands of young people from 2004 and 2013, is one of the largest of its kind ever undertaken and highlighted a fall in bullying since participants of Next Steps were recruited for the study.

Initial findings from the age 25 data, produced and published by CLS, have also contributed to political debate in relation to the labour market conditions for this generation, with Members of Parliament referring to findings on the negative impact of zero hours contracts on health in Prime Minister’s Questions in July 2017.

Further examples of what has been learnt from the study include:

Education: Next Steps has provided information on the factors that influence young people's performance at school, including the fact that attainment gaps between young people from rich and poor backgrounds emerged early in life and were very large by the time GCSEs were taken. Findings were used in setting up the Education Maintenance Allowance, a scheme which helps young people from low income families with the costs of travel, books and equipment for school or college.

Employment: Next Steps has contributed to the understanding of young people's experiences of the labour market. It has shown that young people's educational attainment at age 16 is the most important factor affecting whether they are in education, training or employment at age 18. In 2011 the government used the findings in their policy on tackling the root causes of youth unemployment.

Social exclusion linked to academic struggles for young people in poor health: According to new research from Next Steps, teenagers with poor physical and mental health are often excluded from social circles and activities, which can have a knock-on effect on their performance at school and in the labour market. More information can be found here - https://nextstepsstudy.org.uk/social-exclusion-linked-to-academic-struggles-for-young-people-in-poor-health/

Expected Benefits:

The expected measurable benefits from the original agreement:

The study produces rich, longitudinal, policy-relevant data, currently unavailable elsewhere, for a large, representative sample of young adults. LSYPE data is widely used by policy makers to evaluate and develop policy and improve services for young people and also by academic researchers to chart and understand social change.

The information provided by cohort members provides valuable evidence for the research and policy community about the cohort’s transitions out of education and into early adult life. To enhance the research resource for secondary users, a fully documented, anonymised dataset was archived with the UK Data Service in May 2017.

Next Steps Age 25 survey data will enrich the already deposited data for the cohort (waves 1 to 7) and is expected to be particularly valuable for the research community, including researchers in health and social care, providing rich survey data on a range of different domains of young people’s lives. Particularly beneficial is the opportunity for a life course approach and to follow young people’s experiences over time to analyse later life outcomes.

Next Steps data is a resource with great potential for the research and policy community, and the information collected on health and its social determinants widens its potential value for health research and policy interventions. Through the set up at the UK Data Service, researchers are able to apply and carry out research utilising the established link to benefit health and social care.

Next Steps Age 25 Survey data has been deposited with the UKDS and the cohort members’ health is an important aspect in the Age 25 Sweep. Cohort members were asked a range of questions about their physical and emotional health and wellbeing and CLS is currently looking at initial findings on probable mental ill health at age 25 and its association with a number of potential risk factors. There is, however, a great deal more information about potential underlying determinants, in this and the earlier sweeps of Next Steps, available for researchers via the UKDS.

This request is to extend the Data Sharing Agreement. Retaining contact details of non-responders (to an annual mail-out) will enable the researcher to try and re-establish contact before the next wave of the longitudinal study (date to be confirmed) to be able to continue with the research.

Outputs:

Previous outputs from original agreement:

On receipt of the data, CLS processed the files and loaded more recent addresses to the database. Contact was made with the cohort members via the contracted fieldwork agency, NatCen, inviting them to take part in the survey. CLS have posted a participant survey information pack to all cohort members announcing the imminent launch of the survey. It will be made explicitly clear to study members that they can withdraw from the study if they no longer wish to participate. CLS: Participant contact information is held in a secure address database at the Centre for Longitudinal Studies. Any participants choosing not to take part in the study are flagged on this database with a code denoting whether their refusal is temporary (i.e. to this particular wave of data collection) or permanent (i.e. they wish to have no further involvement in the study). Anonymised survey data and confidential data from the address database are retained unless the participant specifically asks us not to, in which case this data is deleted.

As mentioned earlier in this section, CLS have contracted an external supplier (and Data Processor), NatCen Social Research, to carry out the survey fieldwork and associated mailings for the Next Steps (LSYPE) Age 25 survey – specifically to carry out: (1) Email and postal mailings to LSYPE cohort members about the study; (2) Interviews with Next Steps (LSYPE) cohort members. These activities have been completed and the data files will be securely deleted from NatCen Social Research systems. This is in the process of being completed, and a special condition has been added stating that NatCen will delete the data within 3 months of the signing of this new agreement.

The fully documented, anonymised research dataset was archived with the UK Data Service in early-2017 to provide a strategically important resource for UK Social Science, including researchers in health and social care.

Extension request 2018:
There will be no further outputs at this stage. This request is to extend the Data Sharing Agreement to enable the researcher to retain the cohort contact details so that cohort members can be contacted at the next wave of the longitudinal study (date to be confirmed) and the research can continue.

Processing:

Previous processing activities from the previous agreement:

ACTIVITY 1. NHS address tracing. CLS wished to use NHS Digital patient status and tracking products, which use NHS registration data, to trace as many study members as possible, either by finding new address details or verifying existing address details for the cohort.

1.1 CLS supplied NHS Digital with a file of around 15,600 cohort members to match to the NHS data. The file supplied only contained eligible study members who had participated in at least one wave of Next Steps. It did not include study members known to have died or to have withdrawn from the study. The file contained the following data items:
- CLS identifier,
- First name,
- Last name,
- Middle name (where available),
- Date of birth,
- Sex,
- Last known address and postcode.
NHS numbers were not available for any study members.

1.2 CLS required all 15,600 cases to be sent for auto-matching.

1.3 Once the auto-matching process was complete, CLS reviewed the results and took a decision about which cases should be put forward for operator matching. CLS thought that any cases classified as status: 'gone away' or ‘unconfirmed address’ (1,026 and 3,367 cases respectively) were likely sub-groups for operator matching.
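
The selection step in 1.3 can be pictured as a simple filter on the auto-match status. This is a sketch under stated assumptions, not the procedure CLS or NHS Digital actually ran: the two status labels come from the text, while the `(identifier, status)` tuple layout, the `matched` status and the example identifiers are hypothetical.

```python
# Statuses named in the text as likely sub-groups for operator matching.
OPERATOR_MATCH_STATUSES = {"gone away", "unconfirmed address"}


def select_for_operator_matching(auto_match_results):
    """Return identifiers whose auto-match status suggests operator review.

    `auto_match_results` is assumed to be an iterable of
    (cls_identifier, status) pairs; this layout is hypothetical.
    """
    return [
        cls_id
        for cls_id, status in auto_match_results
        if status.strip().lower() in OPERATOR_MATCH_STATUSES
    ]


# Made-up identifiers and statuses for illustration.
results = [
    ("A001", "matched"),
    ("A002", "gone away"),
    ("A003", "Unconfirmed address"),
]
to_review = select_for_operator_matching(results)
```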

1.4. NHS Digital supplied the following details to CLS.
- CLS identifier,
- Latest surname,
- Latest forename,
- Latest middle name (where available),
- Date of birth,
- Gender,
- Latest address and postcode,
- Fact of death,
- Date of address registration or update.

In addition to the receipt of any 'new' matched address information for the cohort members, CLS required NHS Digital to add an additional variable that described the outcome of the matching process to the data that is returned to them. This additional variable allocated each cohort member to one of the following three categories:
• new/different address found,
• existing address confirmed,
• no match found.
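
As a rough illustration of how an outcome variable with these three categories might be derived, the sketch below compares the supplied address with the traced one. It is an assumption-laden example, not NHS Digital's matching process: the `MatchOutcome` enum, the `classify` and `normalise` helpers, and the naive normalisation rule are all hypothetical.

```python
from enum import Enum
from typing import Optional


class MatchOutcome(Enum):
    # Category labels as listed in the text.
    NEW_ADDRESS = "new/different address found"
    CONFIRMED = "existing address confirmed"
    NO_MATCH = "no match found"


def normalise(address: str) -> str:
    """Naive normalisation: upper-case, drop commas, collapse whitespace."""
    return " ".join(address.upper().replace(",", " ").split())


def classify(supplied: str, traced: Optional[str]) -> MatchOutcome:
    """Assign one of the three outcome categories to a cohort member."""
    if traced is None:
        return MatchOutcome.NO_MATCH
    if normalise(supplied) == normalise(traced):
        return MatchOutcome.CONFIRMED
    return MatchOutcome.NEW_ADDRESS
```

Real address matching would need far more robust comparison (abbreviations, flat numbers, postcode formats); the point here is only that every record falls into exactly one of the three categories.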

All organisations party to this agreement must comply with the Data Sharing Framework Contract requirements, including those regarding the use (and purposes of that use) by “Personnel” (as defined within the Data Sharing Framework Contract, i.e. employees, agents and contractors of the Data Recipient who may have access to that data).

The data from NHS Digital will not be used for any purpose other than that outlined in this Agreement.

Extension request 2018:
There will be no further processing at this stage. This request is to extend the Data Sharing Agreement to enable the researcher to retain the cohort contact details so that cohort members can be contacted at the next wave of the longitudinal study (date to be confirmed) and the research can continue.

Centre for Longitudinal Studies - Millennium Cohort Study (MCS)- (Age 17 consent) — DARS-NIC-384504-N2V5B

Opt outs honoured: No (Excuses: Consent (Reasonable Expectation))

Legal basis: Health and Social Care Act 2012 s261(2)(c), Health and Social Care Act 2012 - s261 - 'Other dissemination of information'

Purposes: No (Academic)

Sensitive: Non-Sensitive

When:DSA runs 2021-01 – 2022-01 2021.09 — 2025.01. SMLS reported a DPA serious incident; breached contract — audit report.

Access method: One-Off

Data-controller type: UNIVERSITY COLLEGE LONDON (UCL)

Sublicensing allowed: Yes

AGD/predecessor discussions: AGD minutes - 23rd January 2025 final.pdf, AGD Minutes - 27 June 2024 final.pdf, AGD Draft minutes - 9 May 2024 final.pdf, IGARD Minutes - 24th June 2021 final.pdf, igardminutes-14thjanuary2021final.pdf

Datasets:

Emergency Care Data Set (ECDS)
Hospital Episode Statistics Accident and Emergency
Hospital Episode Statistics Admitted Patient Care
Hospital Episode Statistics Critical Care
Hospital Episode Statistics Outpatients
Hospital Episode Statistics Accident and Emergency (HES A and E)
Hospital Episode Statistics Admitted Patient Care (HES APC)
Hospital Episode Statistics Critical Care (HES Critical Care)
Hospital Episode Statistics Outpatients (HES OP)

Type of data: Anonymised - ICO Code Compliant, Identifiable

Objectives:

The Centre for Longitudinal Studies (CLS) at University College London (UCL) is an academic resource centre responsible for producing and disseminating data resources for the scientific community. It is responsible for four of Britain's internationally renowned longitudinal cohort studies: the 1958 National Child Development Study, the 1970 British Cohort Study, Next Steps and the Millennium Cohort Study (MCS). All these studies are following their groups of participants from cradle to grave. As such, this group of studies is unique and has provided, and is still providing, a wealth of information used in policy decisions affecting society's health and well-being. The purpose of this application covers two aspects:

a) Request linkage of Hospital Episodes Statistics (HES) and Emergency Care Data Set (ECDS) data to a subset of the MCS (only cohort members who consented to have their health records linked to their survey data)
b) CLS seeks permission to sub-license this linked data to the research community via the UKDS.

MCS is renowned worldwide for the evidence it provides on children's experience of growing up in the United Kingdom in the 21st Century. Since the study's launch there have been seven attempts to re-contact and gather information from the whole cohort (at ages 9 months, 3 years, 5 years, 7 years, 11 years, 14 years and 17 years). The MCS covers such diverse topics as parenting; childcare; schooling and education (e.g. academic qualifications, vocational qualifications); daily activities and behaviour; cognitive development; child and parent mental and physical health; employment and education; income and poverty; housing, neighbourhood and residential mobility; and social capital, ethnicity and identity. The information collected in previous sweeps of the study has made MCS a high-quality data resource for scientific investigation across the life course and across domains.

The seventh, Age 17 survey (2018-19) added to the data already collected in previous sweeps by updating information on the current circumstances of the cohort and the experiences they have had since the last sweep. In previous sweeps, schooling will have been the main activity common to the vast majority of cohort members. The Age 17 survey marked an important transitional time in the cohort members' lives, where educational and occupational paths can diverge significantly. It is also an important age in data collection terms, since it may be the last sweep at which parents are interviewed and it is an age when direct engagement with the cohort members themselves, rather than their families, is crucial to the long-term viability of the study. To reflect this, CLS conducted face-to-face interviews with the cohort members for the first time. Cohort members were also asked to do a range of other activities, including filling in a self-completion questionnaire on the interviewer's tablet, completing a cognitive assessment (number activity) and having their height, weight and body fat measurements taken.

This was a unique opportunity to measure factors that underlie different types of transition into adult life, which may affect future wellbeing in unprecedented ways. Capturing these transitions well, alongside the contemporary factors underlying them was critical. It was important to build up a picture of daily life, including factors such as: relationship with parents, family and peers, risky behaviours, social media engagement and efforts on activities such as education /school. Additional factors affecting decisions at this age include attitudes and preferences, such as preferences for education, attitudes to risk, willingness to trade off resources at different points in time, and expectations about future life events. Measuring social and emotional development, mental health and cognitive development and using well-validated instruments, was also a critical component of the survey.

During this survey, CLS also obtained informed consent from cohort members for their health data to be linked to the data collected in the study. In total, consent to health linkage was obtained from approximately 6,118 cohort members in England. These are the cases which CLS is seeking permission to link to HES and ECDS data. More information about this survey can be found here: https://cls.ucl.ac.uk/wp-content/uploads/2020/09/MCS7-user-guide-Age-17-ed1.pdf

Linking health data from HES and ECDS to the MCS survey data will greatly increase the possibilities for using the cohort to study how health outcomes impact on the individual aspects of their life such as education, work, relationships and family life and, likewise, how health outcomes relate to the individual behaviours and lifestyles aspects such as drug and alcohol use, sexual health, diet and exercise, which are all documented as part of the study. The successful inclusion of HES and ECDS data will enrich these data by revealing which cohort members have been admitted to or attended hospital and the reasons for this, e.g. drug and alcohol treatment, accident and emergency, maternity and mental health services which could help CLS better understand how health conditions could be better treated or supported. This kind of analysis necessitates pseudonymised record level data.

Data about health behaviours may be more accurate if obtained from administrative records, because self-reported data can suffer from misreporting of complex health conditions, under-reporting of particular health problems, or perceived sensitivities around certain behaviours and lifestyle aspects. There are no alternative, less intrusive ways of obtaining such information. This also offers an interesting methodological opportunity to validate the data collected in the survey and vice versa.

CLS at UCL is requesting data from 2001 (where available) to most recent data available. The first data collection of the MCS study happened in 2001 when cohort members were 9 months old. CLS therefore wants to access the historical information for its cohort members. Health events can be experienced over an extended period. The objective HES/ ECDS data will complement and enhance the existing survey data, also improving the accuracy of the data collected in the survey. The large data range will facilitate research that CLS anticipate will be carried out on the effects of familial socioeconomic circumstances, lifestyle and environmental factors on the evolution of the wellbeing, health and development of cohort members, offering huge potential for scientific and policy-related research. This will build on the extensive body of work focused on the millennium cohort as given in Yielded Benefits.

The MCS study follows the lives of young people across England, Scotland, Wales and Northern Ireland. As this project requests data linkage to the MCS study, the geographical spread of the data requested will be across England.

CLS at UCL have considered data minimisation in terms of what CLS needs but further minimisation is not possible. The data requested in this application will be part of a database, created to serve various research projects. Data minimisation will be applied when CLS sub-licences the data to third parties. Third party organisations wishing to access the data will need to specify the variables needed and will only be given access to a sub-set of the data which is needed to conduct their research.
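
The sub-licence minimisation step described above (each third party names the variables it needs and receives only that subset) could look roughly like the sketch below. The function name `minimise`, the `study_id` field name and the record layout are hypothetical and chosen only for illustration.

```python
from typing import Dict, Iterable, List

Record = Dict[str, object]


def minimise(records: Iterable[Record],
             approved_variables: List[str],
             id_field: str = "study_id") -> List[Record]:
    """Return copies of the records holding only the approved variables.

    The pseudonymised study ID is always kept so rows remain linkable.
    """
    keep = [id_field] + [v for v in approved_variables if v != id_field]
    minimised = []
    for record in records:
        missing = [v for v in keep if v not in record]
        if missing:
            # Fail loudly rather than silently release a partial extract.
            raise KeyError(f"variables not in dataset: {missing}")
        minimised.append({v: record[v] for v in keep})
    return minimised
```

For example, a project approved only for the admissions count would receive rows containing `study_id` and that one variable, and nothing else.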

The overall aim of the linkage is to:

1. Validate, enhance and improve the quality of the cohort data, in this way creating a uniquely rich administrative/survey linked data set (HES/ECDS - MCS).
2. Use this data set to produce methodological papers on the quality of the data (eg around measurement, representativeness) and research papers helping to showcase its benefits for health and social care.
3. Promote and make possible wider use of this linked data set, by providing access to the linked NHS Digital HES / CLS MCS data to the research community via the UK Data Service (UKDS) Secure Lab, through a sub-licensing agreement agreed between CLS and NHS Digital.

THE SUB-LICENCE:
Sub-licensing of the data will be in line with the DARS sub-licensing data standard: https://digital.nhs.uk/services/data-access-request-service-dars/dars-guidance/sub-licencing-and-onward-sharing-of-data

UCL seeks permission to include onward sharing of the linked HES/ ECDS and CLS MCS data with the UK Data Service (UKDS), where data can be accessed by accredited researchers in a Secure Research Environment, known as Secure Lab, following a "Sub-licensing model".

The UKDS is funded by the Economic and Social Research Council (ESRC) with contributions from the University of Essex, the University of Manchester and Jisc (Jisc is a United Kingdom not-for-profit company whose role is to support post-16 and higher education, and research, by providing relevant and useful advice, digital resources and network and technology services, while researching and developing new technologies and ways of working). The UKDS provides access to high quality data to meet the data needs of researchers, students and teachers from all sectors including academia and central and local government.

The UKDS is based at, and hosted by, the University of Essex. The University of Essex are therefore listed as a data processor and also listed in the data processing and storage location sections. Only staff who are permitted to work at the UKDS (and are substantively employed by University of Essex) will process the data. Should any substantively employed researchers from University of Essex wish to use the UKDS data, they will be required to apply via the sub-licence route, the same as other researchers from other organisations.

Under the "Sub-licensing model", NHS Digital shares data with UCL who are in turn licensed to share these data with other organisations, subject to agreed controls, scoped in this agreement between NHS Digital and UCL. In line with this onward sharing model, the data sharing controls in place between NHS Digital and UCL are replicated between UCL and the other organisations. UCL, which houses CLS at the UCL Institute of Education, is fully accountable for the actions of the parties involved in subsequent data share and use. The agreement mirrors the Data Sharing Framework Contract in place between NHS Digital and UCL. It also requests information about the research proposal, benefits to health and/or social care, organisational security assurance and terms and conditions regarding onward sharing of data, responsibilities and processing activities etc.

Under the sub-licensing model, CLS will deposit the linked data with UKDS who will serve as a data repository. Access to the deposited data will be granted to approved researchers within a Secure Research Environment on behalf of UCL as outlined below. NHS Digital will retain the ability to directly audit UKDS's compliance with the outlined and agreed data access arrangements.

The anticipated volume is 1-2 sub-licences per month, and the potential length of each sub-licence is 2-3 years.

There will be no charge applied to licenses supplied by UCL.

The territory of use in the sub-licence will be the same or narrower than the territory of use stated in this data sharing agreement, namely the UK.

In this sharing model of the linked data, UCL will be a data controller, determining the purposes for which and the manner in which the linked data are processed. UKDS will be a data processor, as they will be processing the data on behalf of UCL. This includes holding the linked data in a secure environment, screening for completeness of applications for data access, providing training for use of linked data securely, entering into contractual agreements with approved researchers, extraction of approved data and setting up access systems, and approving statistical outputs, following a statistical disclosure control procedure.

The approved organisations and researchers who are granted access to the linked data via the UKDS Secure Lab agree to terms and conditions of use, and to their rights and responsibilities as users of the linked data, as defined by the UKDS. In addition to the agreements signed with the UKDS, the organisation of the researcher applying to use the linked data will enter into a Licence agreement with UCL.

ORGANISATIONAL AGREEMENTS
UCL will provide a sub-licence to UK organisations undertaking research that will be of benefit to the public (this will be assessed in the project proposal form submitted to the UKDS and to UCL). Applicants (potential licensees) will need to show that the provision of the sub-licence will be in the public interest and that the data will be used either (i) for the provision of health care or adult social care; or (ii) for the promotion of health. UCL will not provide data access to commercial organisations for research for commercial purposes. Additionally, the UCL Licence agreement will assess the project proposal against its assessment criteria to determine the details of the project, the people who will be accessing the data, and what data will be requested. Applicants will need to be accredited researchers, or agree to undertake training and become accredited, prior to accessing the data. Additionally, an applicant's organisation will need to provide evidence that it has information governance and security assurances in place. Members of the CLS Data Access Committee (DAC) will review the evidence provided and decide whether it satisfies the criteria.

Applicants (licensees) and their organisations will have to sign two agreements to obtain a sub-licence, one with the UKDS and another with UCL. In both cases the licensee will agree with the terms stated in the Confidentiality Section of the UCL Licence Agreement and with the Confidentiality Terms stated in the Secure Access Agreement which will be signed with the UKDS. By signing these agreements, the licensee agrees to adhere to these terms, including respecting the privacy of health services user data they will receive. Licensees are also reminded of the penalties they are likely to incur if they do not comply with the terms they have agreed. In addition to the above, the UKDS agreement stipulates that data users must complete mandatory training before they are allowed to access the data.

To ensure the security of the linked information that NHS Digital shares with UCL, and that CLS at UCL subsequently shares with the UKDS (where approved researchers can access the data in a Secure Lab), UCL envisages the following controls at the different steps of depositing, approving and sharing the linked information:

-An agreement between NHS Digital and UCL to onwardly share linked HES/ ECDS and CLS MCS information under the "Sub-licencing model", which outlines the terms and conditions of use of the linked data via the UKDS Service Secure Lab as the data repository, and the full accountability of UCL (housing CLS at the UCL Institute of Education) for the actions of the parties involved in subsequent access to the linked data.

-An agreement between UCL (as a data controller) and UKDS at The University of Essex (as a data processor), which outlines the terms and conditions under which the linked data can be accessed via the UKDS Secure Lab.

-An agreement between UKDS and the approved researcher, which outlines the terms and conditions of use of the linked data in the UKDS's Secure Lab.

-An Agreement between CLS at UCL and the organisation requesting to use the linked data via the UKDS, which outlines the terms and conditions of use of the linked data.

The researcher accessing the data via the UKDS Secure Lab will not be able to download any data. Once the researcher has finished their research, the UKDS will destroy the data folder with the tailored dataset for the specific project.

The data held at UCL will be destroyed if the data sharing agreement between NHS Digital and UCL ceases. In that event, the license agreement between UCL and the licensee organisation will also be terminated.

UCL's legal basis for processing (acquiring, linking and sharing) personal data is public task under GDPR (article 6(1)(e)), i.e. processing is necessary for the performance of a task carried out in the public interest (as is made explicit to participants in the information leaflets provided). UCL also processes special categories of personal data for research under GDPR (article 9(2)(j)), i.e. processing is necessary for archiving purposes in the public interest, scientific or historical research purposes or statistical purposes. In addition, for ethical reasons and under the Common Law Duty of Confidentiality, UCL sought permission from cohort members to access and link their routine health records to their survey data, and to the onward sharing of this linked data in pseudonymised form (via a secure setting with appropriate safeguards).

This data dissemination includes the following safeguards:
i. Linkages are based on informed consent
ii. Identifying variables are held separately to the survey responses, including during the matching process
iii. Data transfers are made securely e.g. encrypted
iv. The data will only be used for statistical research purposes and will not involve any decision making affecting a person
v. Data are stored in secure environments certified to ISO 27001
vi. Data are only accessed in pseudonymised form and treated for disclosure if necessary
vii. Data are accessed in a secure environment.
viii. Disclosure control checks are carried out before any research publication.

All data processing under the sub-licence will rely on the same legal basis as mentioned above, namely GDPR (article 6(1)(e)) and GDPR (article 9(2)(j)). The UCL Licence agreement will require licensees to provide the legal basis of their request to link health data via CLS, and the CLS Data Access Committee (DAC) will therefore only grant approval to applications from researchers within public bodies who have a legal basis to process data under GDPR. The data disseminated to UCL will be accessed by substantive employees of UCL, who will work on the data to make it research ready and deposit it at the UKDS for researchers applying to use it for specific projects.

Yielded Benefits:

Below are examples of existing publications using the MCS data benefiting public health.

Drinking in pregnancy
In the age 3 survey, MCS cohort children completed activities to show which words they understood and spoke, and which colours, letters, numbers, shapes and objects they were familiar with. Parents were also asked about different aspects of children's behaviour, such as how well they got on with other children and how active they were. Research using MCS survey data has found that children whose mothers drank heavily while they were pregnant were more likely to have behaviour problems at age 3 than those whose mothers didn't drink or drank lightly. On average they also did less well in the different activities, although many other factors are also important.

Smoking in pregnancy
Several studies based on MCS have looked at how smoking during pregnancy relates to children's development. One group of researchers found that babies with mothers who smoked at any point while they were pregnant weighed on average 146 grams less when they were born (around the weight of a smartphone) than babies with mums who did not smoke. Overall, the more cigarettes a mother smoked a day, the less her baby weighed at birth. Babies with mothers whose partners smoked around them while they were pregnant also weighed on average 36 grams less (about the weight of a chocolate bar) than those with mothers who were not exposed to smoke.

Breastfeeding and child health
An influential study found that babies who were breastfed in the first months of their lives were less likely to go to hospital for diarrhoea or respiratory problems, such as infections and pneumonia. The researchers estimated that half of hospital stays for diarrhoea, and a quarter of stays for respiratory problems, could be prevented every month if all babies in the UK were fed entirely on breast milk for at least six months.

Breastfeeding and child development
Between ages 3 and 7 MCS children took part in a range of activities to show which words they knew and the patterns they could identify in shapes and images. Studies have found that children who were breastfed tended to do better in these exercises and to have fewer behaviour problems. Research has also suggested that there is a relationship between breastfeeding and young children's ability to coordinate the movements of their arms and legs and to reach milestones such as standing up for the first time and taking their first steps.

Expected Benefits:

MCS surveys include questions relating to health. CLS at UCL will compare these responses with the HES/ ECDS data to obtain a better understanding of the relationship between self-reported and administrative data. The findings will be shared as methodological information assessing the data quality and comparability of two important data sources. This will benefit research on health and social care.

This data linkage will facilitate research that CLS anticipate will be carried out on the effects of familial socioeconomic circumstances, lifestyle and environmental factors on the evolution of the wellbeing, health and development of cohort members. This could be of direct benefit to the NHS and to community services interfacing with schools through informing policy to improve healthy lifestyles.

The creation of this linked MCS- HES/ ECDS database will be a rich data resource with great potential for research on health and for informing policy interventions on health and social care.

Outputs:

Following the data quality and validation work, the first output will be the creation of the linked MCS- HES/ ECDS dataset. The HES/ ECDS data will add an important layer to this already rich data as well as providing the means for data quality checking.

The second output will be health-related research, alongside methodological papers on the linked dataset, published in peer-reviewed journals. The methodological assessments are expected to finish three years after obtaining the data. Outputs will contain only aggregate level data with small numbers suppressed in line with the HES analysis guide. CLS researchers doing research and/or methodological work will access this data via the UCL Data Safe Haven.
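
As a rough illustration of small-number suppression in aggregate outputs, the sketch below masks low counts and optionally rounds the rest. The function name, the marker character and the parameter values used in the usage note are assumptions for illustration; the actual threshold and rounding rules should be taken from the current HES analysis guide.

```python
from typing import Dict, Union


def suppress_small_numbers(counts: Dict[str, int],
                           threshold: int,
                           round_to: int = 1,
                           marker: str = "*") -> Dict[str, Union[int, str]]:
    """Mask counts from 1 to `threshold`; optionally round the others.

    Zero counts are left as zero. Rounding uses integer arithmetic so
    that ties always round up, independent of float behaviour.
    """
    safe: Dict[str, Union[int, str]] = {}
    for category, n in counts.items():
        if 0 < n <= threshold:
            safe[category] = marker
        elif round_to > 1:
            safe[category] = (n + round_to // 2) // round_to * round_to
        else:
            safe[category] = n
    return safe
```

With an illustrative threshold of 7 and rounding to the nearest 5, counts of 3, 12 and 48 would be published as `*`, 10 and 50 respectively.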

The creation of this MCS- HES/ ECDS database and the research and methodological papers are the first steps in establishing a robust research database which will be of benefit to health and social care. This data linkage opens new research opportunities by combining reliable administrative data with detailed survey data. This linkage increases the number of variables available for research in the dataset and complements the health information provided by the participants in the survey. Combined, these data sources enhance each other, making it possible to capture detailed information regarding an individual's health and wellbeing. Health events can be experienced over an extended period; tracking all relevant events over such a long period may not be feasible in a single database. Using HES/ ECDS data, which records such information, can improve the accuracy of the data collected in the survey, offering huge potential for scientific and policy-related research.

CLS at UCL actively promotes the use of their data among the research community through publications and events (e.g. training and workshops on each data set to help researchers better use the data), and provides extensive documentation and guidance on the use of the data, ultimately benefiting health and social care.

Processing:

1. CLS at UCL will supply NHS Digital with identifiers of cohort members who have consented to this data linkage, including full name, sex, postcode, date of birth, NHS number (if known) and study ID (study-specific pseudonymised identifier).

2. NHS Digital will link the identifiable study data to HES and ECDS data. NHS Digital will then remove identifiers from the linked datasets and return the pseudonymised datasets to the CLS team at UCL with the study ID. The data disseminated to UCL will be accessed by substantive employees of UCL who have been appropriately trained in data protection and confidentiality. The data will be held at the secure server in the UCL Data Safe Haven (DSH) and accessed remotely by CLS staff.

3. CLS will carry out validation of the administrative pseudonymised data received (linked HES/ ECDS data) and will combine the supplied administrative data with the information collected from the participant as part of the MCS study using the study ID.

Once the linked survey-administrative data files have been created, CLS may perform other activities to prepare the data for use such as coding and cleaning, derivation of summary variables and compilation of data documentation.

4. CLS researchers will use these data to create an analysis file which will not contain any identifiable data.

5. CLS will create derived variables that summarise study members' hospitalisation and health histories (e.g. hospital admissions and re-admissions, incidence of common diseases, children's ailments etc.), and will compare MCS survey data with data from hospital statistics, in order to compare and validate the data collected in CLS surveys.
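
Steps 3 and 5 above (combining the pseudonymised administrative data with survey data by study ID, then deriving summary variables) can be sketched as follows. The field names `study_id` and `n_admissions` and the record layout are hypothetical; the real HES/ ECDS and MCS file structures are not described in this document.

```python
from collections import Counter
from typing import Dict, Iterable, List


def count_admissions(hes_episodes: Iterable[Dict[str, str]]) -> Counter:
    """Derive a per-member admissions count from pseudonymised episodes."""
    return Counter(episode["study_id"] for episode in hes_episodes)


def link_to_survey(survey_rows: List[Dict[str, object]],
                   admissions: Counter) -> List[Dict[str, object]]:
    """Attach the derived count to each survey row using the study ID only.

    No name, date of birth or NHS number is needed at this stage.
    """
    linked = []
    for row in survey_rows:
        merged = dict(row)  # copy so the survey input is left unchanged
        merged["n_admissions"] = admissions[row["study_id"]]  # 0 if absent
        linked.append(merged)
    return linked
```

Cohort members with no linked episodes simply receive a count of zero, which keeps the analysis file complete for every consenting member.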

CLS researchers who need to access the data to produce methodological papers on the quality of the data (eg around measurement, representativeness) and research papers helping to showcase its benefits for health and social care will need to submit an application to CLS DAC detailing their project proposal. Upon DAC approval, a pseudonymised dataset will be provided to the researcher. The data will be held at the secure server in the UCL Data Safe Haven.

Identifiers will be held separately from attribute characteristics. HES/ ECDS data will not be relinked to the identifiable data which is held separately from the survey responses. Re-identification will only happen at the occasion of a request, made from a cohort member, for withdrawal from the study, and this includes removal of data. Where a participant wishes to withdraw from the study, the identifiable data is used to locate the study id, and then in turn destroy their data.
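
The withdrawal route described above, where the separately held identifiable data is used only to locate the study ID and the research data is then destroyed, might be sketched like this. Both stores are modelled as plain dictionaries and all names are hypothetical; whether the identifier record itself is also removed is not stated in the source, so the sketch leaves it untouched.

```python
from typing import Dict


def withdraw(identifier_store: Dict[str, str],
             research_data: Dict[str, dict],
             participant_key: str) -> bool:
    """Destroy a withdrawing cohort member's research data.

    identifier_store maps an identifiable key to the study ID and is
    held separately from research_data, which is keyed by study ID.
    Returns True if a study ID was found for the participant.
    """
    study_id = identifier_store.get(participant_key)
    if study_id is None:
        return False
    research_data.pop(study_id, None)  # no error if already removed
    return True
```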

UCL DATA SAFE HAVEN

The UCL DSH is certified to ISO 27001:2013 and is compliant with NHS Digital's Data Security and Protection Toolkit. Research teams using the DSH complete annual training and regularly review data access arrangements, ensuring access to data is limited to those authorised to have it. UCL Computing Regulations are based on the premise that access to resources is generally forbidden unless expressly permitted. All data transfers from the DSH require approval and are carried out through secure portals which are fully audited. Access to the UCL DSH is via remote desktop and requires multi-factor authentication. In addition to a strong password, each user has to use a six-digit number generated by a smartphone app or physical token at each login. Passwords must be changed at regular intervals, and unused accounts are automatically disabled after a fixed period. Once inside the environment, robust access control ensures that researchers can only examine information that they are approved to use.
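
The document does not say which algorithm produces the six-digit login number. One widely used scheme for such codes is time-based one-time passwords (TOTP, RFC 6238); the sketch below shows that scheme purely as an assumption about how a code of this kind can be generated, not as a description of the DSH implementation.

```python
import hashlib
import hmac
import struct
import time
from typing import Optional


def totp(secret: bytes, at: Optional[float] = None,
         step: int = 30, digits: int = 6) -> str:
    """Compute an RFC 6238 TOTP code using HMAC-SHA1."""
    now = time.time() if at is None else at
    counter = int(now // step)  # number of elapsed time steps
    digest = hmac.new(secret, struct.pack(">Q", counter), hashlib.sha1).digest()
    offset = digest[-1] & 0x0F  # dynamic truncation (RFC 4226)
    value = struct.unpack(">I", digest[offset:offset + 4])[0] & 0x7FFFFFFF
    return str(value % 10 ** digits).zfill(digits)
```

The server and the user's app share the secret and both derive the code from the current time, so a code is only valid for a short window.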

UKDS DATA ACCESS MECHANISMS:

As an ESRC resource centre, CLS at UCL shares its survey data with the research community via the UKDS under safeguarded or controlled access mechanisms, dependent on the likelihood and potential impact of disclosure. Data with higher risk of disclosure is treated with an appropriate degree of security and management. CLS data fall into the following categories which are defined by the likelihood and potential impact of disclosure:

-Tier 1: data with low level of disclosure: e.g. participant self-reported survey data. These data are made available through the UKDS End User Licence and have a low impact of disclosure;
-Tier 2a: data that is potentially disclosive: e.g. medium level and coarse geographies or sensitive information about cohort members. These data are made available through the UKDS Special Licence and have a medium impact of disclosure;
-Tier 2: data that are too detailed, sensitive or confidential to be made available under the standard End User Licence or Special Licence, such as detailed geographical indicators or fine-grained individual level linked data. These data have a high impact of disclosure and are made available through the UKDS Secure Access.
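
The mapping from disclosure impact to UKDS access mechanism in the list above can be expressed as a simple lookup. This sketch keys on the impact level (low, medium, high) stated in the text; the function and constant names are hypothetical.

```python
ACCESS_ROUTE_BY_IMPACT = {
    "low": "UKDS End User Licence",
    "medium": "UKDS Special Licence",
    "high": "UKDS Secure Access",
}


def access_route(disclosure_impact: str) -> str:
    """Return the UKDS access mechanism for a given impact of disclosure."""
    key = disclosure_impact.strip().lower()
    try:
        return ACCESS_ROUTE_BY_IMPACT[key]
    except KeyError:
        raise ValueError(f"unknown disclosure impact: {disclosure_impact!r}") from None
```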

ACCESS MECHANISMS TO NHS DIGITAL HES/ ECDS DATA LINKED TO CLS COHORT STUDIES VIA THE UKDS:

The HES/ ECDS data provided to CLS at UCL by NHS Digital, which are linked to the CLS cohort members, have been processed by the CLS data management team to minimise the risk of disclosure when linked to the CLS survey data. This has been achieved by removing highly identifiable variables and altering other variables by top-coding or truncating them. Following this processing, the final health datasets have been classified under Tier 2.

It is UCL's intention to deposit these Tier 2 linked HES/ ECDS data with UKDS under the UKDS Secure Access, and provide access to this information for approved researchers, following the process and contractual arrangements, outlined above and described in more detail below, following the onward sharing model agreed between NHS Digital, CLS at UCL, and UKDS.

The data provided will be pseudonymised, and will be accessed only via the UKDS secure lab. Data accessed in this way cannot be downloaded. This means that researchers will have access to a screen view only and it is not possible to remove data from the environment. Once researchers and their projects are approved, they can analyse the data remotely from their organisational desktop, or by using the UKDS Safe Room. Specialised staff will apply statistical control techniques to ensure the delivery of safe statistical results.

ADDITION OF THE SUB-LICENCE:

The process of accessing the linked data via the UKDS Service Secure Lab includes the following steps:
-Registration with the UKDS Service.
-Submission of an application, including an 'Accredited Researcher application form' and 'Research proposal'.
-Screening of application by UKDS for completeness.

Once a researcher has registered and UKDS has screened / approved the application:

1) UKDS sends project proposal (researcher application forms) to CLS at UCL for Data Access Committee (DAC) approval. Applicants are required to demonstrate they have security assurance in place (System Level Security Policy/ISO Certificate/DSPT).

2) CLS sends the CLS Licence Agreement for linked NHS Digital Data (henceforth referred to as UCL License Agreement) to researchers to be completed and signed by their organisation. This agreement covers items not covered by the UKDS application, including the benefits to health and social care and evidence of organisational security assurances.

3) Researcher/their organisation representative will send the UCL License Agreement for Linked NHS Digital data completed and signed back to CLS.

4) CLS will: a) check the organisational Information Governance and security assurance evidence provided as per Section 15 Organisational Security Assurance of the UCL Licence agreement, b) send the project for CLS DAC approval.

5) Should the evidence of organisational Information Governance and security assurance provided not meet the requirements (as outlined in the sub-licence), CLS will request the applicant to provide further evidence, and will only submit the project to CLS DAC for approval when the evidence provided is satisfactory. Approval for data access will only be granted to applicant organisations that meet the security assurance requirement.

6) CLS DAC will assess both documents (UKDS project proposal + UCL Licence agreement) and make a decision to approve it, not approve it, or require further information. Should an application be rejected, a researcher can apply again with a revised application.

7) In the UCL License agreement, CLS DAC will, among other things, assess the benefits for health and social care statement and decide whether it is satisfied with the answer.

8) Once CLS DAC approves the project :
a) CLS will inform UKDS that the project has been approved.
b) A CLS representative should sign the UCL License agreement, noting the DAC reference number on the License document, and send it back to the organisation of the applicant (the Principal Investigator for the study requiring access will sign the agreement - they will be an authorised signatory for their organisation).
c) CLS DAC will publish the information about any data dissemination on the NHS Digital release register, including the name of the organisation to which data was provided, purpose (summary of the project) and what data was released. (NB. If CLS DAC doesn't approve the project, no data will be disseminated).

9) If CLS DAC is not satisfied with the evidence provided by the applicant about the benefits to health and social care, then CLS DAC can ask the applicant to provide additional information and the project can be re-submitted for CLS DAC approval at the next CLS DAC meeting or via Chair approval.

10) UKDS will inform the researcher that their project was approved and make the data available to them via Secure access to linked data at the Safe Centre at the UKDS (hosted at the University of Essex) or via the researcher's own institutional desktop PC, depending on the sensitivity/impact level of the data being requested.

11) UCL will inform NHS Digital as to who they have issued sub-licences to, in a format agreed with NHS Digital.

Only staff who are permitted to work at the UKDS (and are substantively employed by University of Essex) will process the data for sub-licensing. Note that any data accessed through the UKDS Secure Lab can only be accessed under secure conditions and cannot be downloaded. The linked data provided to approved researchers may be subject to sub-setting of variables (and, if necessary, cases) to minimise disclosure risks and ensure that no individual or organisation can be identified from the results. In addition, all statistical outputs are subject to statistical disclosure control procedures. Access to the Secure Lab is only available to researchers who are based at a UK academic institution or an ESRC-funded research centre, and are an ESRC Accredited Researcher. PhD and research students can request access but must apply jointly with their supervisors from established organisations.

No further onward sharing can occur beyond the sub-licence.

UKDS SECURE DATA HANDLING PROCEDURES:

UKDS has received government technical accreditation and has been certified for its secure data handling procedures under the international standard for information security (ISO 27001). To maintain this certification, regular internal and external audits are undertaken. UKDS also hires a government-approved company to conduct internal and external penetration testing of its Secure Lab systems.

More widely, the UKDS employs an Information Security Management System (ISMS) to ensure compliance with the ISO accreditation. The Secure Lab falls into this system, and a number of documented processes are regularly maintained and reviewed to ensure these processes are robust, relevant, and fit-for-purpose. The ISMS is overseen by an Information Security Management Group (ISMG), which regularly meets and approves changes to procedures.

All organisations party to this agreement must comply with the Data Sharing Framework Contract requirements, including those regarding the use (and purposes of that use) by 'Personnel' (as defined within the Data Sharing Framework Contract, i.e. employees, agents and contractors of the Data Recipient who may have access to that data).

Extended follow-up of the TARGIT A Trial — DARS-NIC-126676-G1X4M

Opt outs honoured: No (Excuses: Consent (Reasonable Expectation))

Legal basis: Health and Social Care Act 2012 s261(2)(c)

Purposes: No (Academic)

Sensitive: Sensitive, and Non-Sensitive

When:DSA runs 2019-04 – 2022-03 2025.01 — 2025.01. SMLS reported a DPA serious incident; breached contract — audit report.

Access method: One-Off

Data-controller type: UNIVERSITY COLLEGE LONDON (UCL)

Sublicensing allowed: No

AGD/predecessor discussions: AGD minutes - 4 July 2024 final.pdf, IGARD Draft Minutes - 22 July 2021 FINAL.pdf, igard-minutes-7th-february-2019-final.pdf, igard-minutes-7th-march-2019---final.pdf

Datasets:

Civil Registration (Deaths) - Secondary Care Cut
Civil Registrations of Death - Secondary Care Cut
Cancer Registration Data
Civil Registrations of Death
Hospital Episode Statistics Admitted Patient Care (HES APC)
Hospital Episode Statistics Critical Care (HES Critical Care)
Hospital Episode Statistics Outpatients (HES OP)

Type of data: Identifiable

Objectives:

Breast cancer remains the most common female malignancy and its incidence continues to rise. The common conventional treatment of early breast cancer involves surgical excision of the tumour and surgery to the axillary lymph nodes. This breast conserving surgery needs to be followed by external beam radiotherapy given over several weeks of daily treatments, with the intention of reducing the rate of further cancer developing within the operated breast. Whilst this is an effective treatment with a low rate of local recurrence of cancer, laboratory work and its clinical correlation have suggested that radiation to the whole breast may not be necessary in all cases and that radiotherapy only to the tissue around the tumour, within a risk-adapted approach, may be as effective.

University College London (UCL) has been awarded a grant from NIHR Health Technology Assessment (HTA) to run the study Extended follow up of the TARGIT-A trial through the Surgical & Interventional Trials Unit (SITU). The SITU is part of UCL. The SITU specialises in providing infrastructure for running studies and has a history of managing large-scale randomised controlled clinical trials in solid tumours, historically in breast cancer. UCL will be the only organisation with access to the record-level data requested from and supplied by NHS Digital.

The TARGIT-A randomised clinical trial compared a risk-adapted approach using single-dose targeted intra-operative radiotherapy (TARGIT-IORT) with conventional external beam radiotherapy (EBRT) given as a daily course over 3 to 6 weeks. The initial and 5-year results have been published and found that TARGIT-IORT is non-inferior to EBRT.

The published results of the TARGIT-A trial show that compared with conventional external beam radiotherapy given over several weeks, TARGIT given at the time of lumpectomy within a risk-adapted approach achieves much the same results in terms of breast cancer control (locally and systemically). Interestingly, TARGIT was found to have a significantly lower mortality from causes other than breast cancer due to fewer deaths from cardiovascular causes and other cancers.

Although the current results are convincing enough for the treatment to be adopted worldwide (over 20,000 women have now had this treatment worldwide), it is essential that the whole UK cohort of 608 patients who were randomised into the trial is followed up over a longer period of time and the data analysed as per the original TARGIT-A trial protocol. For patients in the UK cohort, their data will come from NHS Digital. The current plan is to analyse by pre-pathology and post-pathology strata, as well as to perform subgroup analysis by hormone receptor status and hormone therapy. Multivariate analysis will also be performed to assess the predictive value of other tumour and patient factors such as age, tumour size, grade, lymph node status, margins, lymphovascular invasion, time since randomisation, etc. Recruitment to the trial was completed in June 2012.

This extended follow-up study will enable timely recording of additional local recurrences and deaths. With a higher number of events, it would be possible to perform meaningful subgroup analysis using predictive factors such as hormone receptors (available data suggest that these have a predictive value), tumour grade and lymph node involvement, which would allow fine tuning of patient selection criteria. Furthermore, the effect on non-breast-cancer and overall mortality will also be ascertained.

It is expected that the new data (including that from NHS Digital) will significantly influence wider and enthusiastic adoption of this approach that will be greatly welcomed by patients. As a large proportion of such patients are screen-detected, their overtreatment would be avoided by such adoption.

Expected Benefits:

The results from the TARGIT-A trial show that both conventional and novel means of administering radiotherapy produce similar results. SITU would like to continue to collect data about the health status of all patients in the trial to enable the researcher to learn about long-term differences in the effects of these treatments on health.

This extended follow-up study will enable timely recording of additional deaths. Furthermore, the effect on non-breast-cancer and overall mortality will also be ascertained. This study will therefore be expected to produce measurable benefits to the health of NHS patients within the UK, as patients could receive all of their radiotherapy treatment whilst in the operating theatre rather than having to return daily for several weeks for radiotherapy treatment. For the patients, the biggest benefit of having TARGIT-IORT during their lumpectomy procedure, under the same anaesthetic, is that they complete their local treatment in one session and with lower toxicity.

It is anticipated that the results from this study will add to the researchers' overall understanding of breast cancer and how it may be better treated in future. In addition, a successful outcome will mean that the methods for obtaining follow-up information used in this study could be applied to future clinical trials where long-term follow-up of patients is important. Early breast cancer has a very good survival rate, and the vast majority of women will have no problems after their initial treatments (surgery and radiotherapy). Therefore, in the UK these women tend to be discharged from hospital clinical care after three years. However, in the trial we want to obtain follow-up data for at least ten years, so obtaining the information from hospitals is becoming increasingly difficult. Directly contacting patients seems to be a way to obtain the required data in a more straightforward manner, and fills the gap between hospital follow-up and ONS data.

For any healthcare system including the NHS, TARGIT-IORT has been shown to be cost effective and incurs a lower overall cost to the NHS. It also reduces the journey times for patients who would otherwise need to travel, on average, 730 miles for their EBRT treatment.

Outputs:

TARGIT-A data has been published several times at different junctures through the trial follow up. This request to obtain Civil Registration data from NHS Digital will enable SITU to incorporate death data in order to update the dataset used in these publications.

SITU will also aim to publish the results in high impact international journals, such as The Lancet. Patient-level data from NHS Digital will not be published.

In addition, reports of interim results will be provided to the Extended follow-up of TARGIT-A Trial Steering Committee, Sponsor and Funder.

The final report of results is anticipated to be submitted to the funder in 2023. This publication is expected to be open access in a peer reviewed journal such as The Lancet, British Medical Journal, etc.

Further academic papers will be published in open-access, high impact, peer reviewed journals on the methodology and impact on mortality. A patient-friendly version of the findings will be published on the UCL website.

For each paper published, a short presentation will be developed to summarise the findings for a range of stakeholders who have an active interest in the TARGIT method of administering radiotherapy, including health care professionals, patient groups, and/or their carers. Findings will be presented at national (NCRI Cancer Conference event, Association of Breast Surgeons - ABS meeting) and international events (ASCO annual meeting). It is important for patients, their family, and friends, that they are reassured the TARGIT technique has established long term safety and efficacy.

Attendance is planned at national conferences such as The National Cancer Research Institute (NCRI) annual meeting and international conferences such as The American Society of Clinical Oncology (ASCO).

The overall mortality and onset of new cancers are two critical outcomes for this study.

All outputs and publications contain only aggregated data with small numbers suppressed in line with the HES Analysis Guide.

Processing:

The study has been divided into two Work Packages.

Work Package 1: For patients in the UK cohort (England only), continue to gather efficacy, safety and follow-up data to year 10 by contacting patients directly and asking them to consent to have their information collected through Work Package 2 (below), and complete an annual questionnaire.

Work Package 2: Collect death data for UK patients through NHS Digital.

Collection of death data from UK patients through NHS Digital will help improve the completeness of the data, and will support WP1.

Identifiers for the cohort who have consented to be in the study will be sent to NHS Digital. Identifiers include:
~ NHS Number
~ Date of birth
~ Postcode
~ Unique Study ID

In addition, as part of Work Package 1, the patient will be contacted annually directly by SITU and asked to complete a follow-up (questionnaire) form, and return by post in the pre-paid envelope provided. Patients will be followed up until death or withdrawal of consent. Patients can refuse consent at any time by contacting either SITU or the PI, in which case no further contact will be made.

NHS Digital will then return linked Civil Registration data (death data linked to the study identifier) to the SITU unit at UCL.

This data file will then be linked to the existing TARGIT A Trial database (using the Unique Study ID). UCL will extract a subset of the data containing no direct patient identifiers and periodically send this to the Trial Statistician for analysis via a secure encrypted electronic process carried out at UCL. Each patient will only be identified by a unique subject number (e.g. TT 018 001 X).

SITU do not require direct patient identifiers from NHS Digital. SITU already hold identifiable data on current TARGIT A Trial patients. Identifiable data already held is only accessed by authorised individuals within UCL, all of whom are substantive employees.

No attempts will be made to re-identify individuals from the data and the identifiable data will not be made available to any third parties.

In order to comply with the requirements of the grant, the patients will be followed up for 5 years in the first instance, as most patients already have 5 years of follow-up and data needs to be collected until they have been in the study for 10 years. NHS Digital civil registration data is only required from patients who were recruited into the TARGIT A trial from 6 UK-based hospitals (London UCL, London Royal Free, London Whittington, London Guy's and Winchester Royal Hampshire).

All data obtained will be held securely in UCL. Patient identifiers (such as name, address, etc.) will be held on a separate Data Safe Haven which is a service that provides a technical solution for storing, handling and analysing identifiable data provided within UCL.

All organisations party to this agreement must comply with the Data Sharing Framework Contract requirements, including those regarding the use (and purposes of that use) by Personnel (as defined within the Data Sharing Framework Contract, i.e. employees, agents and contractors of the Data Recipient who may have access to that data).

Data will only be requested from NHS Digital where patients have provided a complete and valid signed consent form. For avoidance of doubt, the death data will only be requested and will only flow after the consent has taken place.

This application relates solely to the follow-up phase of the study.

Linkage of NHS Digital data to young people with perinatal HIV, to monitor cancers and deaths. — DARS-NIC-368477-C9Q1X

Opt outs honoured: No, Yes (Excuses: Consent (Reasonable Expectation))

Legal basis: Health and Social Care Act 2012 - s261 - 'Other dissemination of information'; Health and Social Care Act 2012 - s261(5)(d), Health and Social Care Act 2012 s261(2)(c)

Purposes: No (Academic)

Sensitive: Non-Sensitive, and Sensitive

When:DSA runs 2022-02 – 2025-02 2022.04 — 2024.12. SMLS reported a DPA serious incident; breached contract — audit report.

Access method: Ongoing

Data-controller type: UNIVERSITY COLLEGE LONDON (UCL)

Sublicensing allowed: No

AGD/predecessor discussions: IGARD Minutes - 9 December 2021 final.pdf, IGARD Minutes - 13 January 2022 final.pdf

Datasets:

Cancer Registration Data
Civil Registration - Deaths
Demographics
Civil Registrations of Death

Type of data: Anonymised - ICO Code Compliant, Identifiable

Expected Benefits:

This study hopes to provide evidence on health outcomes in early adulthood and to provide the foundation for long-term monitoring. The linkage to NHS Digital for cancer and mortality data will be critical to estimate the risk of disease progression, hospitalisation and mortality and, it is hoped, will help to tailor future HIV care accordingly, either to help diagnose cancers earlier, or to prevent cancers and deaths from occurring. Improved data on critical health risks such as cancers and deaths are of public interest to government agencies such as the NHS and UK Health Security Agency (UKHSA), to quantify the burden of PHIV-related ill health and to tailor prevention and treatment.

The outputs of this study aim either to give reassurance that there is no additional risk of cancer or mortality among young people with PHIV, or to indicate what the increased risk is and in which subgroups, if any. Should increased risk be detected, results (through conference presentation and also review by guideline committees) are likely to change clinical guidelines in the UK, in Europe and internationally on the management of young people with PHIV. The aim would be for increased healthcare resources to be available for this group in order to maximise health benefits in the future. It is hoped that this will be achieved through sharing publications with the UK Health Security Agency and NHS England.

The MRC CTU at UCL aims to conduct analyses of these data to inform this overall aim. The study team aim to analyse incidence of cancers and deaths by age in young people living with PHIV, controlling for potential confounding factors.

Outputs:

The results of this study are expected to be published in high-impact peer reviewed journals such as Clinical Infectious Diseases or AIDS Care to give the highest impact and broadest readership. The aim is to publish papers in line with UCL's open access policy. This would include publication in open access journals, and summaries of results may be made available on the MRC CTU at UCL website, which is freely open to the public.

Outputs will be anonymised to the level required by the Information Standards Board for Health and Social Care (ISB) anonymisation standard and will contain aggregate and suppressed data (according to the HES analysis guide) only.

In addition, it is hoped that the results will be presented at scientific conferences and professional meetings related to HIV. The conferences will be chosen depending on the key findings and also the target audience. Possible conferences include the Children's HIV Association (CHIVA) and the British HIV Association (BHIVA) conferences. This would allow for coverage of key stakeholders in the setting of registry data and clinicians involved in HIV research, as well as young people themselves and their families.

The study team have already produced a leaflet about CHIPS and CHIPS+, targeted at young people themselves, to help the recruitment of young people into the study, and plan to produce another leaflet about the findings of CHIPS+. This approach worked well in the Adolescents and Adults Living with HIV Cohort Study (AALPHI), where the study team engaged young people in a project on the dissemination of the findings, and they developed a leaflet and film with a graphic designer and film maker. It is hoped that the leaflet will be sent out to participating clinics and put on the Children's HIV Association (CHIVA) website. CHIVA is a registered charity based in Bristol which, among other things, aims to enhance the health and social outcomes of children, young people and families living with HIV (https://www.chiva.org.uk/about/).

The Youth Trials Board, which is part of CHIVA and made up of nine young people who have some training on clinical trials, is going to help develop the study dissemination strategy targeting young people. They have suggested using social media to disseminate the results, using their Twitter chat with @freedom2speak. Their Instagram content is being developed in 2022 and could also be used as a platform to share results with young people. They could also give presentations at CHIVA events, including CHIVA camp, which happens every summer.

Processing:

All organisations party to this agreement must comply with the Data Sharing Framework Contract requirements, including those regarding the use (and purposes of that use) by Personnel (as defined within the Data Sharing Framework Contract).

The processing activities are as follows:

1. The CHIPS+ Study team will identify the study participants for linkage to NHS Digital data, and inform the UCL MRC CTU Head of Data Management Systems (DMS). The team will provide the Study ID, Date of Birth, initials and sex to the Head of DMS.

2. The UCL MRC CTU Head of DMS has a secure database of Patient Identifiable Data (PID) in UCL's Data Safe Haven, and will merge the PID with a list of Study IDs.

3. The resultant study cohort of participants (approx. 750 records) will be sent by the UCL MRC CTU Head of DMS to NHS Digital via Secure Electronic File Transfer Service (SEFT) with the following identifiers for linkage to the requested NHS Digital data products (Demographics data extract, Cancer Registrations data extract, and Civil Registration (Deaths) data extracts):
Study ID (CHIPS+ participant identifier)
NHS Number
Date of birth
Date of withdrawal (if applicable)
Additionally, those participants who have been recruited under a consultee will be flagged and will have National Data Opt Out applied to them.

Patients have consented for data to be shared with researchers in an anonymised form, which will be aggregated with small numbers suppressed as per the HES analysis guidance. They have also agreed that personal details can be used to obtain long term follow up information from national registries.

4. NHS Digital will use the cohort identifiers to extract linked data from the requested data products, including the full date and cause of death, cumulative from cohort member's date of first recruitment (unless subsequently withdrawn from the study).

5. The pseudonymised record-level NHS Digital datasets with the Study ID will be sent to the MRC CTU at UCL using SEFT. There will be three drops of pseudonymised record-level NHS Digital data during the period of this agreement.

6. NHS Digital records will be uploaded to UCL's Data Safe Haven. An output file for trial statisticians, containing the study-specific trial number, will be prepared and checked to ensure there is no PID within the file.

7. This output file is placed in a secure directory with limited access only to certain members of CHIPS+ study team.

8. Trial statisticians undertake data cleaning/validation activities for the processing of the data for the CHIPS+ study.
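
Steps 6 and 7 above (preparing an output file keyed on the Study ID and checking it for PID) can be sketched as follows. This is a minimal illustration under stated assumptions, not the MRC CTU's actual procedure: the field names, the set of fields treated as PID and the helper functions are all hypothetical.

```python
# Assumed set of direct identifiers; the real PID definition is not given in the text.
PID_FIELDS = {"nhs_number", "date_of_birth", "initials", "name", "address"}


def build_output_file(linked_records, study_records):
    """Join returned records to study records on Study ID and drop PID fields."""
    study_by_id = {row["study_id"]: row for row in study_records}
    output = []
    for record in linked_records:
        study_row = study_by_id.get(record["study_id"])
        if study_row is None:
            continue  # no matching participant, so nothing to link
        merged = {**study_row, **record}
        output.append({k: v for k, v in merged.items() if k not in PID_FIELDS})
    return output


def contains_pid(rows):
    """Return True if any row still carries a field listed in PID_FIELDS."""
    return any(field in PID_FIELDS for row in rows for field in row)


# Hypothetical example data.
study_records = [
    {"study_id": "S001", "nhs_number": "0000000000", "sex": "F"},
    {"study_id": "S002", "nhs_number": "1111111111", "sex": "M"},
]
linked_records = [{"study_id": "S001", "date_of_death": "2023-05-01"}]

output_file = build_output_file(linked_records, study_records)
```

Keeping the PID check as a separate function means it can be run again on the finished file immediately before it is placed in the restricted directory.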

The requested pseudonymised record-level data from NHS Digital will be used to ascertain the incidence of cancers and mortality in young people with PHIV from the CHIPS+ cohort. The data from NHS Digital will allow the MRC CTU at UCL to ensure that deaths are captured and included promptly by all centres and clinics involved in the trial, and to verify that all information is correct and recorded. The number of participants and the rationale for their inclusion will always be included in all presented results.

No linkage will be made to any other data set not already stated in this agreement.

The data will reside in UCL's Data Safe Haven and will be identified by Study ID only, thus there will be no identifying personal data attached to a study number. Only defined members of the CHIPS+ study team will have access to the Data Safe Haven for data analysis; all are substantive employees of UCL. All UCL substantive employees have completed training in data protection and confidentiality, and users of the Data Safe Haven receive appropriate training before being granted access.

The data will be held on UCL's Data Safe Haven. The Data Safe Haven is UCL's technical solution for transferring and storing research information that is highly confidential. It meets the requirements of the NHS Digital DSP Toolkit and ISO 27001 Information Security standard. Access is controlled by the Information Asset Owner, and all UCL staff complete training in confidentiality and data protection, which is renewed annually.

Statistical data analysis will be carried out via UCL-owned devices connected to the UCL network, either directly in person or remotely, using an appropriate statistical package. Remote access to the server from a remote device requires a secure 2-factor authenticator (VPN), after which users are able to securely access the secure server on the University's IT framework. All data analysis will be conducted within the confines of the University's secure server, and data will not be downloaded to remote devices for storage or processing.

The study team collects the participants' NHS Number and Date of Birth at the time of consent/registration in the study only to enable linkage with NHS Digital. NHS Number and Date of Birth will be stored in the UCL Data Safe Haven until they are transferred to NHS Digital for linkage.

Records of all NHS Numbers will be immediately destroyed following the linkage process. Linkage with NHS Digital will enable the study team to capture data on incidence of cancers (including site and type) and deaths (date and cause).

Pseudonymised record-level data provided by NHS Digital will only be accessed and processed by substantive employees of UCL. The pseudonymised data will be stored separately to the Identifiable data on the UCL Data Safe Haven.

HES DISCLOSURE CONTROL / SMALL NUMBER SUPPRESSION
In order to protect patient confidentiality, when presenting results calculated from HES record level data, outputs will contain only aggregate level data with small numbers suppressed in line with HES Analysis Guide. When publishing HES data, data processors must make sure that:
· National-level figures only may be presented unrounded, without small number suppression
· Cell values from 1 to 7 (inclusive) are suppressed at a sub-national level to prevent possible identification of individuals from small counts within the table.
· Zeros (0) do not need to be suppressed.
· All other counts will be rounded to the nearest 5.
Data will not be made available to any third parties other than those specified except in the form of aggregated outputs with small numbers suppressed in line with the HES Analysis Guide.
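
The suppression and rounding rules listed above can be expressed as a short function. This is a sketch of those four rules only; the function name, the `national` flag and the use of "*" as the suppression marker are assumptions made for illustration and are not taken from the HES Analysis Guide.

```python
def publishable_count(count: int, national: bool = False) -> str:
    """Apply the small-number rules listed above to a single table cell."""
    if count < 0:
        raise ValueError("counts cannot be negative")
    if national:
        return str(count)  # national-level figures may be presented unrounded
    if count == 0:
        return "0"  # zeros do not need to be suppressed
    if 1 <= count <= 7:
        return "*"  # assumed marker for a suppressed cell
    # All other counts are rounded to the nearest 5 (integer arithmetic).
    return str(((count + 2) // 5) * 5)


# Hypothetical sub-national table of counts.
raw_cells = [0, 3, 7, 8, 12, 13, 101]
published_cells = [publishable_count(c) for c in raw_cells]
```

Integer arithmetic is used for the rounding step so that the result does not depend on floating-point behaviour; for whole-number counts there is never an exact tie between two multiples of 5.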

MR1450 - National Child Development Study (NCDS) — DARS-NIC-137864-T1P9B

Opt outs honoured: Yes (Excuses: Section 251 NHS Act 2006)

Legal basis: Section 251 approval is in place for the flow of identifiable data, Health and Social Care Act 2012 s261(7), Health and Social Care Act 2012 - s261(5)(d)

Purposes: No (Academic)

Sensitive: Sensitive

When:DSA runs 2019-01 – 2022-01 2018.03 — 2024.12. SMLS reported a DPA serious incident; breached contract — audit report.

Access method: Ongoing, One-Off

Data-controller type: UNIVERSITY COLLEGE LONDON (UCL)

Sublicensing allowed: No

AGD/predecessor discussions: igard-minutes-28th-february-2019---final.pdf, igard_minutes_1_february_2018.pdf, igardminutes-17thdecember2020final.pdf, igard-minutes-25th-april-2019---final.pdf

Datasets:

MRIS - List Cleaning Report
Demographics

Type of data: Identifiable

Objectives:

The National Child Development Study (NCDS) is the second of Britain’s world renowned national longitudinal birth cohort studies. It follows all those born in one week in 1958 through the course of their lives, charting the effects of experiences in early life on outcomes and achievements later on. The study has its origins in the Perinatal Mortality Survey. Sponsored by the National Birthday Trust Fund, this was designed to examine the social and obstetric factors associated with stillbirth and death in early infancy among the children born in Great Britain in that one week. Information was gathered from almost 17,500 babies.

Since 1958 information has been gathered from the NCDS cohort on nine occasions. Over time, the scope of enquiry has broadened from a strictly medical focus at birth, to encompass physical and educational development at the age of seven, physical, educational and social development at the ages of eleven and sixteen, and then to include economic development and other wider factors at ages 23, 33, 42, 44, 46, 50 and 55. The next NCDS survey will take place in 2019 when study members will be aged 61.

In 1958, when the birth survey was carried out, consent to participate in surveys was gained by respondents agreeing to be interviewed or respondents returning the completed questionnaire to the study team. Involvement in subsequent surveys adopted the same approach. Individuals could withdraw from the study at any time by simply expressing the wish to do so.

In all recent follow-ups the approach to collecting consent has been very similar. During fieldwork, study members were sent an advance letter advising them about the survey. The letter was accompanied by an information leaflet explaining what is involved. Study members had the opportunity to request further information, or to opt out of the survey at this point. They could also seek further information, or refuse further involvement when the interviewer attempted to make an appointment to visit; when the interviewer visited and at any point during the administration of any elements of the surveys.

Of the approximately 18,500 individuals who have ever participated in the study, there are now approximately 3,500 for whom the Centre for Longitudinal Studies (CLS) at University College London do not currently have a confirmed address. These individuals have not informed CLS that they wish to withdraw from the study; CLS have simply lost touch with them.

The ongoing success of the study depends on maintaining contact with as large a number of study members as possible. Therefore, CLS are seeking permission to be supplied with updated addresses for these 3,500 study members whose whereabouts are currently unknown. All of these individuals have made an informed decision to participate in the study over the years and have been made aware that the study is seeking to follow them throughout their lives.

Objective:

Each March CLS sends a birthday card by post to all NCDS participants. Participants are asked to complete and return a ‘reply slip’, which allows them to provide CLS with any change in their details, e.g. a new email address or phone number. CLS also ask them to return the reply slip even if none of their details have changed, i.e. seeking positive confirmation that the address CLS hold for them is correct.

As a result, CLS can maintain the cohort's latest details on the NCDS database. If the birthday card does not reach the participant, it is returned to CLS as a ‘return to sender’. CLS will attempt to trace all these returns, but if CLS cannot locate the participants then they are flagged on the database as a ‘gone-away’. It is these cases (approximately 3,500) that are being sent to NHS Digital for list cleaning, as the NHS may hold a more recent address, which would give CLS an opportunity to invite these cohort members to re-join the study.

NHS Digital will supply new addresses for untraced study members who can be matched to the NHS Central Registry/Personal Demographics Service (PDS).

CLS need to trace lost study members between now and the Age 61 survey in 2019, which is currently in the planning stage. Any study members successfully traced via this route would be written to and asked to provide updated contact details. They will then be invited to participate in the NCDS Age 61 survey (unless they withdraw from the study).

All those the researchers would seek to trace have participated in at least one prior sweep of the study and none have ever informed CLS that they no longer wish to participate in the study. The researchers feel that a substantial number of these individuals would be willing to participate in the study if they could be contacted. Previous efforts to re-establish contact for other cohort studies have been very successful using this route. When the cohort are contacted they will be given the opportunity to withdraw.

If the participant has died no contact will be made and the study will be updated to reflect this.

Yielded Benefits:

Expected Benefits:

Benefits of the list cleaning:

Submitting the cohort for list cleaning will allow the researchers to recontact the participants with whom CLS have lost touch and give them the opportunity to re-engage or clearly state that they wish to withdraw. It will also minimise the risk of literature going to an incorrect address and of contact being made with participants who have died.

Benefits of the Study:

The 1958 cohort, as it approaches age 60, has now entered a critical period for the understanding of the heterogeneous processes of ageing, and in particular how earlier life experiences impact on health and well-being later in life. The continued ageing of the population in the UK and elsewhere make the understanding of healthy ageing a top priority policy concern, across a wide range of health and social policy domains.

The main outcome from the NCDS Age 61 survey will be a fully documented, anonymised research dataset and this will be archived with the UK Data Service in early 2022 to provide a strategically important resource for UK Social Science, including researchers in health and social care.

GENERAL BACKGROUND INFORMATION & CONTEXT inc. PUBLICATIONS

The Age 61 Survey will comprise two major components:
1) A core interview which will cover the following topics:
- Health, well-being and cognition: physical health, mental health, medical care, health behaviours (e.g. smoking, drinking, diet, exercise), cognitive function.
- Finances and employment: work, income, wealth (savings and debts, pensions, & housing), retirement plans & education.
- Family, relationships and identity: social networks, relationships with partners, parents, children, friends, neighbourhood, social capital, social and political participation, attitudes and values, and religion.

2) A detailed biomedical assessment including measures of anthropometry, physical functioning, cardiovascular risk factors and a full range of blood tests. The central aim of this proposed biomedical assessment is to enable new research that will inform key public health concerns.

The cohort is now transitioning between midlife and early older age, a critical time when biological ageing in key systems (e.g., cardiovascular, metabolic, immunity) starts to accelerate, and a series of health conditions that have a profound influence on well-being first become clinically manifest.

The information collected during the Age 61 Survey will enable researchers to uncover life course and inter-generational factors which contribute to healthy ageing among this generation, and thus to inform the development of preventative health policies across the whole of life that will expand healthy life expectancy, and reduce the burden of ill-health and disease at older ages.

Below are some examples of existing publications using NCDS data benefiting public health:

• Power, C., & Matthews, S. (1997). Origins of health inequalities in a national population sample. The Lancet, 350(9091), 1584-1589.
• Hyppönen E, Power C. Hypovitaminosis D in British adults at age 45 y: nationwide cohort study of dietary and lifestyle predictors. Am J Clin Nutr. 2007; 85 (3):860-8.
• Strachan, D.P., 2000. Family size, infection and atopy: the first decade of the 'hygiene hypothesis'. Thorax, 55 (Suppl 1), p.S2.
• Clark C, Rodgers B, Caldwell T, Power C, Stansfeld S. Childhood and adulthood psychological ill health as predictors of midlife affective and anxiety disorders: the 1958 British Birth Cohort. Arch Gen Psychiatry. 2007; 64 (6):668-78.
• Orfei L, Strachan DP, Rudnicka AR, Wadsworth M. Early influences on adult lung function in two national British cohorts. Arch Dis Child. 2008; 93 (7):570-4.
• Johnson W, Li L, Kuh D, Hardy R. How Has the Age-Related Process of Overweight or Obesity Development Changed over Time? Co-ordinated Analyses of Individual Participant Data from Five United Kingdom Birth Cohorts. PLoS Med. 2015; 12 (5):e1001828.

Outputs:

Any study members choosing not to take part in the study are flagged on the secure confidential address database at the CLS with a code denoting whether their refusal is temporary (i.e. for a particular wave/survey) or permanent (i.e. they wish to have no further involvement in the study). Any previously deposited pseudo-anonymised survey data for a study member and confidential data from the address database are retained unless the study member specifically asks CLS not to retain them, in which case the data are securely deleted.

The addresses obtained from NHS Digital will be used to maintain contact with study members, e.g. to send them a special birthday mailing for their 60th birthday in March 2018 and then later to invite them to take part in the Age 61 survey. However, the data received via this application is never sent or published to the UK Data Service.

The main outcome from the NCDS Age 61 survey will be a fully documented, anonymised research dataset and this will be archived with the UK Data Service in early 2022 to provide a strategically important resource for UK Social Science, including researchers in health and social care.

Processing:

NHS address tracing. CLS wish to use the patient status and tracking products, which use NHS registration data, to trace as many NCDS study members as possible, either by finding new address details or verifying existing address details for the cohort.

CLS will supply NHS Digital with a file of around 3500 study members to match to the NHS data. The file supplied will only contain eligible study members who have participated in at least one wave of NCDS. It will not include study members known to have died or to have withdrawn from the study. The file will contain the following data items:

- CLS identifier
- First name
- Last name
- Middle name (where available)
- Date of birth
- Sex
- Last known address, and postcode
- NHS Number
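
The eligibility rules for the file sent to NHS Digital can be sketched as a simple filter. This is a minimal illustration only: the `StudyMember` record and its field names are hypothetical and do not describe the actual CLS address database.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass
class StudyMember:
    # Hypothetical fields, chosen only to mirror the rules described above.
    cls_id: str
    waves_participated: int  # number of NCDS waves the member took part in
    known_dead: bool
    withdrawn: bool          # member has withdrawn from the study
    gone_away: bool          # no confirmed address after tracing attempts


def eligible_for_tracing(member: StudyMember) -> bool:
    """True if the member should be included in the list-cleaning file."""
    return (
        member.gone_away
        and member.waves_participated >= 1
        and not member.known_dead
        and not member.withdrawn
    )


def build_trace_file(members: Iterable[StudyMember]) -> List[str]:
    """Return the CLS identifiers of members to send for address tracing."""
    return [m.cls_id for m in members if eligible_for_tracing(m)]
```

In this sketch a member with a confirmed address is never sent for tracing, even if otherwise eligible.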

NHS Digital would supply the following details to CLS:

- CLS identifier
- Latest surname
- Latest forename
- Latest middle name (where available)
- Date of birth
- Gender
- Latest address and postcode
- Fact of Death (and embarkations)
- Date of address registration or update
- NHS Number.

In addition to the receipt of any 'new' matched address information for the study members, NHS Digital will add an additional variable that describes the outcome of the matching process to the data that is returned to CLS – that is, this additional variable will allocate each study member to one of the following four categories:

• new/different address found,
• existing address confirmed,
• no match found,
• participant has died.
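
As an illustration, the four matching outcomes could be handled on receipt along the following lines. The record layout (a dictionary with `address`, `contactable` and `deceased` keys) is an assumption made for this sketch, not the structure of the CLS database.

```python
from enum import Enum
from typing import Optional


class MatchOutcome(Enum):
    NEW_ADDRESS = "new/different address found"
    ADDRESS_CONFIRMED = "existing address confirmed"
    NO_MATCH = "no match found"
    DIED = "participant has died"


def apply_outcome(record: dict, outcome: MatchOutcome,
                  new_address: Optional[str] = None) -> dict:
    """Return an updated copy of a (hypothetical) address record."""
    updated = dict(record)
    if outcome is MatchOutcome.NEW_ADDRESS:
        updated["address"] = new_address
        updated["contactable"] = True
    elif outcome is MatchOutcome.ADDRESS_CONFIRMED:
        updated["contactable"] = True
    elif outcome is MatchOutcome.DIED:
        # No contact is made with members who have died; the record is only updated.
        updated["deceased"] = True
        updated["contactable"] = False
    else:
        # No match: the member remains untraced.
        updated["contactable"] = False
    return updated
```

The function copies the record rather than changing it in place, so the original entry can be kept for audit.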

The data file supplied by NHS Digital will be processed within CLS only and entered into CLS’s secure database, i.e. CLS will load more recent addresses into the database. All NCDS study members' contact information is held in this secure confidential address database at the Centre for Longitudinal Studies and used to maintain contact with study members and to invite them to take part in the NCDS Age 61 survey.

Study members newly traced would be written to and invited to re-engage with the study. Any newly traced study members who on being contacted were to indicate that they no longer wish to participate in the study would be recorded as a 'permanent refusal' on the CLS database and not approached again.

All those accessing the data supplied by NHS Digital are substantive employees of University College London.

All organisations party to this agreement must comply with the Data Sharing Framework Contract requirements, including those regarding the use (and purposes of that use) by “Personnel” (as defined within the Data Sharing Framework Contract, i.e. employees, agents and contractors of the Data Recipient who may have access to that data).

MR1470 - Using routine data to identify and assess clinical outcomes for the STAMPEDE trial: Systemic Therapy in Advancing or Metastatic Prostate Cancer: Evaluation of Drug Efficacy. — DARS-NIC-59873-D8C6G

Opt outs honoured: No - data flow is not identifiable, No, Yes (Excuses: Consent (Reasonable Expectation), Mixture of confidential data flow(s) with consent and flow(s) with support under section 251 NHS Act 2006)

Legal basis: Health and Social Care Act 2012 – s261(1) and s261(2)(b)(ii), Health and Social Care Act 2012 s261(1) and s261(2)(b)(ii), Health and Social Care Act 2012 s261(2)(b)(ii), Health and Social Care Act 2012 - s261(5)(d); Health and Social Care Act 2012 s261(2)(c); National Health Service Act 2006 - s251 - 'Control of patient information'.

Purposes: No, Yes (Academic)

Sensitive: Non Sensitive, and Sensitive, and Non-Sensitive

When:DSA runs 2020-07 – 2023-06 2021.01 — 2024.11. SMLS reported a DPA serious incident; breached contract — audit report.

Access method: Ongoing

Data-controller type: UNIVERSITY COLLEGE LONDON (UCL)

Sublicensing allowed: No

AGD/predecessor discussions: AGD minutes - 10 August 2023 final.pdf, igard-minutes---18th-june-2020-final.pdf, igard-minutes-9th-august-2018-final.pdf, IGARD Minutes - 26 January 2023 final.pdf, igard-minutes--16-july-2020-final.pdf

Datasets:

Demographics
Cancer Registration Data
Civil Registration - Deaths
Civil Registrations of Death

Type of data: Anonymised - ICO Code Compliant, Identifiable

Objectives:

Prostate cancer is a major health problem world-wide and accounts for nearly one fifth of all newly-diagnosed male cancers. In the UK, approximately 48,500 men were diagnosed with prostate cancer in 2017 and over 12,000 men died from the disease.

The Systemic Therapy in Advancing or Metastatic Prostate Cancer: Evaluation of Drug Efficacy (STAMPEDE) trial is a randomised controlled trial which is looking at adding therapies to standard care for men with high-risk prostate cancer starting long-term hormone therapy for the first time. The trial's definitive primary outcome measure is overall survival and intermediate primary outcome measure is failure-free survival.

The Medical Research Council Clinical Trials Unit at University College London (MRC CTU at UCL) is the Data Controller and processes data for this trial.

This project plans to use routinely collected health data within STAMPEDE to identify and assess clinical outcomes. The overall aim of STAMPEDE is to assess novel approaches for the treatment of men with prostate cancer who are starting long-term Androgen Deprivation Therapy (ADT). Open since October 2005, this Multi-Arm-Multi-Stage (MAMS) trial is the largest study of treatments for prostate cancer in the world, and is currently recruiting to three arms:
• Standard of Care (SOC, arm A)
• SOC with Metformin (arm K)
• SOC with Transdermal oestradiol (arm L).

With over 13,000 men recruited, most are from England (85%), with others from Wales (6%), Scotland (7%), N Ireland (2%) and a small number from overseas (1%). To take part in STAMPEDE, patients agree for us to collect information on outcomes such as long-term survival and failure-free survival which can be accessed from routine data sources such as the ONS mortality data and Hospital Episode Statistics (HES).

STAMPEDE initially assessed the effects of several medications: bisphosphonate (zoledronic acid), a cytotoxic chemotherapeutic agent (docetaxel) and a cyclooxygenase (Cox-2) inhibitor (celecoxib), as single agents or combinations, in patients commencing long-term ADT for locally advancing or metastatic prostate cancer. Since the start of the trial, a number of new research arms have been added to STAMPEDE over time to evaluate: abiraterone, a steroid synthesis inhibitor; prostate radiotherapy for patients with newly-diagnosed metastatic disease; enzalutamide, an inhibitor of androgen receptor signalling, given with abiraterone; and metformin, an anti-diabetic medication. In Protocol version 16.0, a new research arm was added for transdermal oestradiol, given as an alternative form of ADT.

The multi-stage element of the trial design allows patient recruitment to discontinue in treatment arms that are not showing sufficient activity, based on a series of pre-planned, interim, lack-of-benefit analyses. In general, MAMS is an adaptive design and can be regarded as one type of group sequential design.
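
One generic way to express a lack-of-benefit rule at an interim analysis is sketched here. It assumes that each research arm is compared with control using an estimated hazard ratio and that an arm stops recruiting when the estimate is above a pre-specified critical value. The arm labels and numbers in the usage example are purely illustrative; the actual STAMPEDE stopping guidelines are set out in the trial protocol and are not reproduced here.

```python
from typing import Dict, List


def arms_continuing(estimated_hr: Dict[str, float], critical_hr: float) -> List[str]:
    """Return the research arms that continue recruiting after an interim analysis.

    An arm is stopped for lack of benefit when its estimated hazard ratio
    versus control is above the critical value for that stage.
    """
    return sorted(arm for arm, hr in estimated_hr.items() if hr <= critical_hr)


# Illustrative values only (not trial data).
interim_estimates = {"arm X": 0.85, "arm Y": 1.05}
print(arms_continuing(interim_estimates, critical_hr=1.0))
```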

There are eight research arms closed to recruitment, five of which have already reported results, and three arms are now in long-term follow-up with further analyses planned:
• SOC with Abiraterone (arm G)
• SOC with Prostate radiotherapy (arm H)
• SOC with Enzalutamide and abiraterone (arm J)

Over the next three years (2020 to 2023), five analyses are planned with the following comparisons:
1. Arm A vs Arm G: participants with metastatic prostate cancer
2. Arm G vs Arm J: participants with locally advanced prostate cancer
3. Arm A vs Arm J: participants with metastatic prostate cancer
4. Arm A vs Arm H: participants with metastatic prostate cancer
5. Arm A vs Arm K: all participants.

More than half of men taking part in STAMPEDE will survive for five years or more, but the Medical Research Council (MRC) Clinical Trials Unit (CTU) at University College London (UCL) are concerned that some trial centres find it difficult to maintain full follow-up over this period. Flagging data through NHS Digital’s three data products, Demographics, Civil Registration (Deaths) and Cancer Registration Data will allow the MRC CTU at UCL to ensure that deaths and cancer events are captured promptly. Linked data from NHS Digital will therefore improve the estimates of survival, and may also reduce the burden on NHS sites. For the purposes of survival analysis, the MRC CTU will also be able to assume that patients are alive at a set point in time if not reported dead, thereby increasing reliability of data for this study.

Furthermore, no man should die from prostate cancer without prior progression, so a reported death will allow MRC CTU to check that events are not missed on the Case Report Forms (CRFs) that clinicians complete for their trial participants. Based on previous discussions with ONS statisticians, the MRC CTU will make assumptions about the survival of patients not reported as dead.

For the pre-specified primary analysis of time to overall survival or censoring, the MRC CTU uses the date of death to the nearest day (collected on the Death CRF), or time to the day the patient was last known to be alive for individuals who are censored. The aim is to maintain this level of precision in any analyses using information from external sources; using routine data that records time to death or censoring to the nearest day (as opposed to the nearest week or month) enhances the precision with which the MRC CTU will be able to distinguish a difference in survival between two treatment groups.
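
A minimal sketch of how time to death or censoring can be derived to the nearest day is given here. It assumes the time origin is the date of randomisation, which is not stated in this section; the function and parameter names are hypothetical.

```python
from datetime import date
from typing import Optional, Tuple


def time_to_event(randomised: date, death: Optional[date],
                  last_known_alive: date) -> Tuple[int, bool]:
    """Return (days from randomisation, died).

    If a date of death is recorded, the participant has an event on that day.
    Otherwise the participant is censored on the day last known to be alive.
    """
    if death is not None:
        return (death - randomised).days, True
    return (last_known_alive - randomised).days, False
```

Working in whole days avoids the loss of precision that would come from recording time only to the nearest week or month.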

Provided that the statistical models are correct, this enables MRC CTU's estimates of the survival difference for patients allocated to one treatment relative to another to reflect as closely as possible the reality of any survival difference attributable to treatment allocation in the study setting; and to obtain a greater degree of confidence in understanding of the true effect on survival of a new treatment based on the data available to us.

This can have important implications for the treatment future patients receive: if the evidence collected is strong enough to conclusively suggest a survival advantage gained from a treatment, it is more likely that this treatment will be made available to those patients. Conversely, if the analysis suggests the treatment is effective but the MRC CTU at UCL are not sufficiently confident in the strength of the evidence to support this, the findings are less likely to translate into a real difference for patients.

For information, there is intention to request access to Hospital Episode Statistics (HES) data linked to this request later, which would allow the MRC CTU at UCL to understand the treatment patterns of the trial cohort. This will be subject to a new application, and thus agreement, with NHS Digital.

The linkage requested is necessary for the performance of a task carried out in the public interest: improving the treatment offered to men with prostate cancer (GDPR Article 6(1)(e)). Processing is also necessary for archiving purposes in the public interest, scientific or historical research purposes or statistical purposes in accordance with Article 9(2)(j) of GDPR.

STAMPEDE is currently funded by Cancer Research UK’s Clinical Research Committee (formerly the Clinical Trials Advisory Awards Committee; CTAAC; ref C547/A3804 - STAMPEDE). UCL is the trial’s Sponsor and delegates sponsorship responsibilities to MRC CTU at UCL.

Expected Benefits:

In the UK there are 48,500 new cases of prostate cancer each year and 12,000 deaths. STAMPEDE aims to provide evidence as to what is the best way of treating men with newly diagnosed advanced prostate cancer, and the trial has already led to improvements in standard care. It involves many different comparisons, and as an ongoing trial is expected to provide further important results over coming years.

Since opening in 2005, over 13,000 participants have joined the trial. MRC CTU at UCL has already reported practice-changing results that show adding docetaxel or abiraterone improves disease control and life expectancy. Several other strategies have been tested with more results expected soon, including the results of the abiraterone and enzalutamide combination and of radiotherapy to the prostate in men with newly-diagnosed metastatic disease.

The benefits of utilising routine data from NHS Digital will be improvement in the timely collection of survival data, with improved accuracy, thereby gaining robust evidence contributing to the impact on healthcare of men with prostate cancer. It will also alleviate some of the reporting burden on trial centre staff.

Outputs:

The MRC CTU at UCL will create intermediate trial reports for review by the Independent Data Monitoring Committee (IDMC), an independent group of experts who monitor patient safety and treatment efficacy data. The IDMC usually meets a minimum of once a year per "comparison". Reports to the committee are confidential, and its members are the only people to see data by randomised group while the trial is in progress. The IDMC will only see NHS Digital data that has been aggregated with small numbers suppressed. They may recommend changes to the trial, for action by the trial steering committee. Results for each comparison are triggered by a certain number of deaths among the contemporaneously randomised control arm patients for that comparison. Peer-reviewed publications will be produced in high-impact medical journals, either a cancer-specific journal (such as JCO or Lancet Oncology) or a general medical journal (such as The Lancet, The Journal of the American Medical Association or The New England Journal of Medicine). MRC CTU at UCL will look to general journals first but will review the results and whether or not they might appeal to a general audience.

MRC CTU will communicate the STAMPEDE results using at least:
• Presentation at major international and national scientific conferences
• Publication in high-impact peer-reviewed journals
• A written summary of results distributed to participants
• News articles on the STAMPEDE website
• Tweets on the @MRCCTU Twitter account

MRC CTU at UCL will communicate the results to the wider patient population via articles in the Tackle Prostate newsletter, Prostate Matters.

MRC CTU at UCL will also inform Prostate Cancer UK of the results, building on the relationship MRC CTU at UCL have with them for other trials in MRC CTU's prostate cancer portfolio. If appropriate, MRC CTU at UCL will work with the MRC and UCL press offices to develop press release(s) about the results. Depending on what the results show, MRC CTU at UCL may also look at other methods of communication. For previous prostate cancer trials MRC CTU at UCL have used films, briefing papers and events to communicate the results to health-workers and patients.

MRC CTU will communicate the results to the trial participants via a lay summary which will be distributed by STAMPEDE site staff. The summary is prepared at the MRC CTU by the STAMPEDE trial team and the MRC CTU PPI Group (includes patient representatives and our Policy, Communications and Research Impact Coordinator); this is the same way that previous STAMPEDE findings were disseminated to participants, and examples of the correspondence to STAMPEDE site staff and the participant summary from earlier comparisons have been saved as supporting documents.

All outputs will be aggregated with small numbers suppressed and in line with the HES Analysis Guide.

The next indicative date for data output is Summer 2020 for the Abiraterone long-term comparison (Arms A vs G) in metastatic prostate cancer patients. All outputs will be aggregated with data already held on the consenting patients. No data published will lead to individuals being identified.

Processing:

All organisations party to this agreement must comply with the Data Sharing Framework Contract requirements, including those regarding the use (and purposes of that use) by “Personnel” (as defined within the Data Sharing Framework Contract). There will not be any access to the data by any third parties.

The requested data will be used for the long-term assessment of men in the many comparisons of the STAMPEDE trial protocol. The primary outcome measure is survival, but cause of death, which can be difficult to ascertain in men with prostate cancer, is an important secondary outcome measure. The data from NHS Digital will allow the MRC CTU at UCL to ensure that deaths are captured and included promptly by all centres and clinics involved in the trial, and to verify that all information is correct and recorded. The number of participants and the rationale for their inclusion will always be included in all presented results.

MRC CTU at UCL is not permitted to re-identify individuals under this agreement.

The data will be held on UCL’s Data Safe Haven using UCL approved computers. The Data Safe Haven is UCL’s technical solution for transferring and storing research information that is highly confidential. It meets the requirements of the NHS Digital DSP Toolkit and ISO 27001 Information Security standard. Access is controlled by the Information Asset Owner, and all UCL staff complete training in confidentiality and data protection, which is renewed annually.

The processing activities are as follows:

1. The STAMPEDE Trial team will identify the trial participants for linkage to NHS Digital data, and inform the UCL MRC CTU Head of Data Management Systems (DMS; acting as the Cohort Contributor and the Data Recipient). The team will provide Study ID, date of birth and date of last visit to the Head of DMS.

2. The UCL MRC CTU Head of DMS has a secure database of Patient Identifiable Data (PID) in UCL’s Data Safe Haven; and will extract the PID and merge this with a list of Study IDs.

3. The Study ID list will be sent by the UCL MRC CTU Head of DMS to NHS Digital with the following identifiers for linkage to the requested NHS Digital data products (Demographics, Cancer Registration, Civil Registration (Deaths)):
• Study ID (STAMPEDE trial participant identifier)
• NHS Number
• Date of birth
• Postcode
There is no linkage field for gender as the entire trial cohort is male.

Patients have consented for data to be shared with researchers in an anonymised or linked anonymised form (this is where the 'Linked anonymised data' are anonymous to the people who receive and hold it (e.g. a research team) but contain information or codes that would allow the suppliers of the data to identify people from it). They have also consented that personal details can be used to obtain long term follow up information from national registries.

A privacy notice detailing the linkage of trial participant information to electronic health records held by NHS Digital and other similar bodies was published on the STAMPEDE website in 2018. At the same time, two letters about 1) Ongoing participation and trial updates for participants in Arms A (joined after 15 Nov 2011), G, H, J, K and L, and 2) End of participation for participants in Arms B, C, D, E, F and Arm A (who joined before 15 Nov 2011) were sent by trial centres. These letters provided updated information to trial participants about the use of their personal data (name, postcode, NHS number) to obtain health data from NHS Digital, Public Health England and the National Cancer Registration and Analysis Service. All participants have the opportunity to withdraw from the trial if they have any objections to the use of their data in this way.

4. NHS Digital will use the supplied information to extract linked data from the requested data products, including the full date and cause of death. These pseudonymised datasets with the Study ID will be sent to the MRC CTU at UCL using the specified transfer method. The data will reside in UCL’s Data Safe Haven and will be identified by Study ID only, thus there will be no identifying personal data attached to a study number. Only defined members of the STAMPEDE trial team and MRC CTU’s methodology team will have access to the Data Safe Haven for data analysis; all are substantive employees of UCL. All UCL substantive employees have completed training in data protection and confidentiality, and users of the Data Safe Haven receive appropriate training before being granted access.

5. NHS Digital records will be uploaded to UCL’s Data Safe Haven, and an output file for trial statisticians with the study-specific trial number will be prepared and checked to ensure there is no PID within the file.

6. This output file is placed in a secure directory with access limited to certain members of the STAMPEDE trial team.

7. Trial statisticians undertake data cleaning/validation activities for the processing of the data for the STAMPEDE trial.

8. The data will be used as a prompt to follow up with sites so that they complete STAMPEDE Case Report Forms.
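
The check in step 5, that the output file for trial statisticians contains no PID, could be automated along the following lines. The column names treated as PID here are assumptions made for illustration; the real check would use whatever identifiers the trial team defines.

```python
from typing import Iterable, List

# Hypothetical column names treated as Patient Identifiable Data (PID).
PID_FIELDS = {"nhs_number", "date_of_birth", "postcode", "first_name", "last_name"}


def find_pid_columns(columns: Iterable[str]) -> List[str]:
    """Return any column names that match a known PID field (case-insensitive)."""
    return sorted(c for c in columns if c.strip().lower() in PID_FIELDS)


def check_no_pid(rows: List[dict]) -> List[dict]:
    """Raise ValueError if any row carries a PID column; otherwise return the rows."""
    for row in rows:
        leaked = find_pid_columns(row.keys())
        if leaked:
            raise ValueError(f"PID columns present in output file: {leaked}")
    return rows
```

A name-based check of this kind only catches known column names, so it would complement rather than replace a manual review of the file.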

Data provided by NHS Digital will only be accessed and processed by substantive employees of UCL.

There will be no access to data by other third parties not listed in this agreement.

All outputs produced with data provided by NHS Digital will be aggregated with small numbers suppressed and in line with the HES Analysis Guide.
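
Small-number suppression of aggregated counts can be sketched as follows. The threshold of 7 and the `*` marker are assumptions made for this example; the rules that actually apply are those set out in the HES Analysis Guide.

```python
from typing import Dict, Union


def suppress_small_numbers(counts: Dict[str, int], threshold: int = 7,
                           marker: str = "*") -> Dict[str, Union[int, str]]:
    """Replace counts between 1 and `threshold` (inclusive) with a marker.

    Zero counts and counts above the threshold are left unchanged.
    """
    return {
        key: (marker if 0 < value <= threshold else value)
        for key, value in counts.items()
    }


# Illustrative counts only (not trial data).
print(suppress_small_numbers({"group 1": 3, "group 2": 12, "group 3": 0}))
```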

Assessing the impact of the COVID-19 pandemic on vulnerable children: the DHSC-ECHILD-COVID study — DARS-NIC-381972-Q5F0V

Opt outs honoured: No - data flow is not identifiable, No, Yes (Excuses: Does not include the flow of confidential data)

Purposes: No (Academic)

Sensitive: Non Sensitive, and Sensitive, and Non-Sensitive

When:DSA runs 2020-08 – 2023-08 2020.12 — 2024.11. SMLS reported a DPA serious incident; breached contract — audit report.

Access method: Ongoing, One-Off

Data-controller type: UNIVERSITY COLLEGE LONDON (UCL)

Sublicensing allowed: No, Yes

Datasets:

Hospital Episode Statistics Admitted Patient Care
Civil Registration - Deaths
Emergency Care Data Set (ECDS)
HES:Civil Registration (Deaths) bridge
Hospital Episode Statistics Accident and Emergency
Hospital Episode Statistics Critical Care
Hospital Episode Statistics Outpatients
HES-ID to MPS-ID HES Admitted Patient Care
Civil Registrations of Death
Hospital Episode Statistics Accident and Emergency (HES A and E)
Hospital Episode Statistics Admitted Patient Care (HES APC)
Hospital Episode Statistics Critical Care (HES Critical Care)
Hospital Episode Statistics Outpatients (HES OP)
Birth Notification Data
Community Services Data Set (CSDS)
Maternity Services Data Set (MSDS) v1.5
Maternity Services Data Set (MSDS) v2
Mental Health and Learning Disabilities Data Set (MHLDDS)
Mental Health Minimum Data Set (MHMDS)
Mental Health Services Data Set (MHSDS)
Mental Health Services Data Set (MHSDS) v5.0
Civil Registration - Births

Type of data: Anonymised - ICO Code Compliant, Identifiable

Objectives:

The data is requested for a programme of research relevant to the aims of the National Institute of Health Research Policy Research Unit for Children, Young People and Families (CPRU), within University College London (UCL).

CPRU is one of 15 NIHR Policy Research Units formed to undertake research to inform decision-making by government and arms-length bodies. CPRU works closely with the Department of Health and Social Care to determine priorities and provide evidence directly to the Secretary of State for Health, government departments and arms-length bodies, such as NHS England and Public Health England.

For this programme of research, UCL are the sole Data Controller who also process data. The London School of Hygiene and Tropical Medicine (LSHTM), the Office for National Statistics (ONS) and The Institute for Fiscal Studies (IFS) are also listed as data processors.

The study is looking at the impact of COVID-19 and lockdown on Children and young people and whether there are any differences in the health and social effects of household confinement on vulnerable children and young people when compared to other children and young people. Children and young people (CYP) who are vulnerable due to social welfare or chronic health needs are expected to experience more adverse health and social effects of the COVID-19 lockdown than other CYP.

Key concerns for services are the effects of household confinement during the COVID-19 lockdown, combined with the limited access to support from health, social care and education services. The researchers urgently need to understand what impacts COVID-19 infection and related public health responses (such as lockdown) have had on CYP, to inform strategies for the current wave of infection, and any future waves.

This study, Department of Health and Social Care - Education and Child Health Insight Linked Data - COVID (DHSC-ECHILD-COVID), builds on the Education and Child Health Insight Linked Data (ECHILD) project (DARS-NIC-27404-D5Z3F - approved), which uses linked education and HES data for four one-year cohorts amounting to two million CYP in England. The linkage under this application will be extended urgently to address the impact of COVID-19 on all CYP (linkage involving an expected 18 million CYP) and in particular vulnerable CYP, as this is the group most likely to be impacted by lockdown. The researchers wish to include all children and young people (CYP) appearing in HES records from (the later of) birth or April 1997 onwards, who are aged between 0 and 24 years in the COVID-pandemic year (hence the start date for birth is the start of school year 1.9.1995).

PURPOSE

DHSC-ECHILD-COVID addresses four priority areas raised by the Department of Health and Social Care (DHSC) with the Children’s Policy Research Unit (CPRU) team relating to the secondary impacts of infection and lockdown on:

~ CYP who need safeguarding
~ poorer families
~ CYP with special educational needs
~ health inequalities

These vulnerable groups can only be reliably identified through linkage of longitudinal health, education and social care data.

For the purpose of this application 'vulnerable' can be defined as:

The researchers will draw on the published DfE definition for vulnerable children and young people. This relates to children and young people aged 0-25 years who are assessed as being in need under section 17 of the Children Act 1989 (i.e. have a child in need plan, child protection plan, or are a looked-after child), have an education, health and care (EHC) plan or have been assessed as otherwise vulnerable by educational providers or local authorities (e.g. children on the edge of receiving support or those at risk of becoming not in employment, education or training).

The researchers also explore whether children with long-term health conditions such as asthma or poor mental health, and those allocated any special educational needs (as indicators of underlying health or behavioural problems), are at greater risk of adverse impacts of infection or lockdown.
Children with indicators of vulnerability can only reliably be identified through linkage of health, education and social care data.

The researcher will focus on two specific research questions:

RQ1: What are the differences in emergency hospital contacts during the COVID-19 pandemic for vulnerable CYP compared with other CYP? Is there any evidence that differences are related to COVID-19 infection or the secondary effects of lockdown?

RQ2: What is the predicted deferred health care use and what are the long-term health, education and social care outcomes due to restrictions during the COVID-19 pandemic?

The researcher will use longitudinal linked data from hospital episodes statistics (HES), linked to education and social care data (held by DfE) to assess the impact of the COVID-19 pandemic on CYP and in particular vulnerable CYP. As vulnerable CYP are hard to identify in healthcare records, the researcher will identify these CYP through administrative data histories of ever being a Child in Need (CiN), having special educational needs (SEN), a chronic health condition requiring hospitalisation, or combinations of these exposures. The researcher will derive these vulnerability indicators from a linked longitudinal dataset comprising social care, education and hospital records (HES) for all CYP in England. Examining health data from the time of birth to current age (up to, but not including, age 25 years) is critical for identifying markers of vulnerability in administrative data. For example, previous work completed by UCL has shown that chronic underlying conditions, or congenital disorders associated with special education needs may not be recorded at every admission (e.g., asthma may not be recorded when a child is admitted for an operation) and UCL have demonstrated the added value of using the whole longitudinal record.

To enable the analyses to address these research questions, the researcher will link HES data (i.e. HES APC, outpatient, critical care, A&E and ECDS data, plus death registration data) to administrative data contained in the datasets collectively supplied within the National Pupil Dataset (NPD), provided by DfE (the researcher refers to NPD data as education, CiN, and children looked after (CLA)). These datasets (HES-NPD) will be linked by NHS Digital for children and young people in England using pseudonymised linkage keys.

The legal basis for processing personal data for this purpose at UCL falls under Article 6(1)(e) of the General Data Protection Regulation (GDPR), i.e. “a task carried out in the public interest”. It also falls under Article 9(2)(j), “processing is necessary for archiving purposes in the public interest, scientific or historical research purposes or statistical purposes”. The processing of data for this study is a task of public interest as it will provide evidence on the effect of the COVID-19 pandemic on health outcomes and use of healthcare services among vulnerable children. This will benefit and inform policy makers, service providers, vulnerable children and their families.

All organisations party to this agreement must comply with the Data Sharing Framework Contract requirements, including those regarding the use (and purposes of that use) by “Personnel” (as defined within the Data Sharing Framework Contract ie: employees, agents and contractors of the Data Recipient who may have access to that data).

Expected Benefits:

This project aims to produce urgent results on the impact of COVID-19 infection and lockdown on the health of children and young people and in particular vulnerable children, characterised by education and social care indices from the NPD linked datasets. The study will provide vital understanding of the repercussions of the current response strategies on the health and well-being of key population groups, and provide insight into how infection control and lockdown strategies should be developed to better meet the needs of children and young people. These results (preliminary results in October 2020 and January 2021) are critical for addressing current health needs arising from COVID-19 infection and responses, and also for informing strategies for future waves of infection.

The study will compare different groups of vulnerable and non-vulnerable groups of children and young people, using indicators of vulnerability drawn from health, social care and education histories in administrative data. The analyses will address a priority for DHSC policy makers, that COVID-19 and lockdown have resulted in disproportionate impacts on health for some groups. The analyses aim to explore this question for the whole population, to inform policy to better support children and young people, and to better understand which types of vulnerability are most affected.

The study will examine which groups of children and young people did present to services during the lockdown, whether their problems were directly related to COVID-19 or the secondary impact of lockdown, and what underlying health or social risk factors were present. The study also aims to understand unmet need and predict future health needs for the large proportion of CYP who would have been expected to present to services, based on past patterns of care, but did not attend during the pandemic. For example, despite messages to urge patients requiring urgent medical treatment to seek care through the appropriate channels (e.g. A&E), during the pandemic there was a dramatic and unexplained decrease in A&E attendances. Serious concerns have been raised about the impact of the resulting treatment delays, yet much more evidence is required to quantify the scale of the problem in different population groups and to predict future/ongoing needs.

The research seeks to help to fill this gap, firstly by evaluating high level differences in impacts for groups of vulnerable (in terms of clinical, socio-demographic and educational needs) versus other children and young people, which will guide the development of more detailed, in depth research within groups for which there is evidence of the most significant adverse impacts. Specific examples include examining the impact of delays in time-sensitive procedures, where delays are expected to have significant and prolonged negative impacts on health and education outcomes (e.g. surgical correction of cleft lip and palate). The results will establish the scale and urgency (e.g. how many children, how extensive were the delays, what are likely the unmet healthcare and education needs) of these impacts and guide the development of reactive policies and changes in service provision to mitigate long-term impacts for specific population groups. The research will also add evidence on the impact of COVID-19 on health inequalities - including for black and minority ethnic groups – and the mechanisms that drive these. The comprehensive geographical coverage and population base of our research is a real strength of this research and will allow the researchers to draw conclusions for all vulnerable children and young people in England, and to identify groups that are being failed by current policy and services.

The project is commissioned and funded by DHSC, and findings will be reported directly to NHS policy makers. The project also addresses two priorities on the impact of COVID19 on vulnerable patients set out by Health Data Research UK. The study will report preliminary results to DHSC policy makers through our regular 2-monthly meetings, through seminars with wider NHS staff (DHSC, PHE and NHS England – Simon Kenny, NHSE clinical director), and through briefing reports and papers published in peer reviewed journals.

Outputs:

All outputs will contain aggregate level data only and all small numbers will be suppressed in line with the HES analysis guide. Outputs will be monitored for compliance with ADRN statistical output controls and the HES Analysis Guides. No potentially disclosive outputs will be shared or published.
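As a rough illustration of what small-number suppression in an aggregate output can look like, the sketch below masks small non-zero counts before a table is released. The threshold value, function name and group labels are assumptions made for this example only; the actual rules are those set out in the HES analysis guide and applied by ONS SRS output checkers, and real disclosure control may also require secondary suppression or rounding.

```python
from typing import Dict, Optional

# Hypothetical threshold for illustration only; the real rule comes from the
# HES analysis guide and ONS SRS output checking, not from this sketch.
SUPPRESSION_THRESHOLD = 10


def suppress_small_counts(
    table: Dict[str, int], threshold: int = SUPPRESSION_THRESHOLD
) -> Dict[str, Optional[int]]:
    """Return a copy of an aggregate table with small non-zero counts masked.

    Counts strictly between 0 and `threshold` are replaced by None so they
    cannot be published; zeros and counts at or above the threshold are kept.
    """
    return {
        group: (None if 0 < count < threshold else count)
        for group, count in table.items()
    }


# Made-up aggregate counts, not real data.
example = {"group_a": 1250, "group_b": 4, "group_c": 0}
masked = suppress_small_counts(example)
```

Here only the count for the hypothetical `group_b` is masked, because it is non-zero and below the assumed threshold.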

Preliminary reports will be shared with DHSC, PHE and NHS England, and with DfE through the ECHILD project and a project advisory group. Preliminary results will be produced for October 2020 (RQ1) and January 2021 (RQ2).

The researcher will submit full reports for fast-track publication in peer reviewed journals and produce briefing reports for DHSC, DfE and other public bodies through the Children and Families Policy Research Unit (CPRU). Findings will also be used in public involvement and engagement events. Study findings will also be disseminated through peer-reviewed academic journals (e.g. BMJ, Lancet Public Health), and social media including lay summaries.

UCL would expect to present findings at conferences such as the Lancet Public Health conference, and International Population Data Linkage Conference within two years of obtaining the data.

Relevant findings will be shared with policy makers, clinicians/health professionals, educators and parent groups particularly in accessible formats (e.g. lay summaries, videos or animations). This could include forums such as the National Children's Bureau (NCB) Young Person and Parent group, the Great Ormond Street Hospital (GOSH) Patient Engagement group. These groups can be accessed through the joint institute of UCL Great Ormond Street Hospital Institute of Child Health, through the North Thames ARC (led by UCL) and through the CPRU. Lay summaries of the study findings can be published on the CPRU website, and linked through websites for these organisations.

The data analyses are conducted on the ONS Secure Research Service. Detailed individual level child data cannot leave the ONS Secure Research Service. Results of analyses can be exported by a secure encrypted transfer system, which is audited.

Any outputs from analyses that are published have to meet statistical disclosure controls that prevent small sizes in accordance with NHS Digital and DfE requirements. Tabulations of aggregate data are assessed for statistical disclosure control and authorized for export by an ONS data scientist not involved in the project.

Processing:

The data shown as held in this application is the data held under NIC-393510. This agreement (NIC-381972) will allow the data to be linked to that which is already held at UCL. The only data requested under NIC-381972 is a record identifier which allows the customer to link the matched records with the ECDS data held under NIC-393510; this is needed because there are no HES IDs in ECDS.

Analyses for RQ1:

The researcher will analyse outcomes for all CYP, and compare differences in health outcomes (e.g. emergency hospital contacts, deaths) between groups classified as vulnerable or not (stratified by age group), in the years before, compared with after, COVID-19 onset. Descriptive analyses will explore whether changes post-COVID-19 reflect increased risks of contacts related to infection, mental health, adversity, or acute complications of chronic or complex health conditions for all CYP, and according to whether they had indices for vulnerability or not. The researcher will use diagnostic and procedure codes, and types of hospital contact (eg emergency/elective admission), recorded in HES, to infer whether hospital contacts are related to underlying chronic conditions. Preliminary results October 2020.

Analyses for RQ2:

The researcher will model expected healthcare contacts for all CYP after the onset of the COVID-19 pandemic, based on observed trajectories of healthcare contact for CYP for periods prior to the start of COVID-19. The researcher will assess potential unmet healthcare need, according to age and vulnerability status, by comparing expected vs observed healthcare contacts after COVID-19 onset. Differences between expected and observed new diagnoses and interventions, will be used to infer potential unmet need. The researcher will vary their assumptions about whether and when these healthcare needs may manifest, and whether deferred presentation is likely to be more severe. The researcher will estimate how much deferred healthcare use could add to predicted rates of healthcare contacts in the post-COVID-19 period, and potential long-term outcomes of COVID-19 restrictions. Preliminary results in Jan 2021.
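The expected-versus-observed comparison described for RQ2 can be sketched in a few lines. This is a deliberately simplified illustration: the expectation here is a plain average of pre-pandemic counts, whereas the study describes modelling trajectories of healthcare contact. The function names, group labels and counts are all made up for the example.

```python
from statistics import mean
from typing import Dict, List


def expected_contacts(pre_pandemic_counts: List[int]) -> float:
    """Naive expectation: the mean of the pre-pandemic period counts.

    Assumption for illustration; the study describes modelling trajectories,
    which would be richer than a simple average.
    """
    return mean(pre_pandemic_counts)


def contact_shortfall(
    pre_pandemic: Dict[str, List[int]], observed: Dict[str, int]
) -> Dict[str, float]:
    """Expected minus observed contacts per group (positive = fewer than expected)."""
    return {
        group: expected_contacts(history) - observed[group]
        for group, history in pre_pandemic.items()
    }


# Made-up counts for two hypothetical groups.
history = {"vulnerable": [100, 110, 120], "other": [400, 420, 440]}
observed_2020 = {"vulnerable": 70, "other": 350}
shortfall = contact_shortfall(history, observed_2020)
```

A positive shortfall for a group indicates fewer contacts than expected, which under the approach described above would be read as a signal of potential unmet need rather than proof of it.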

The researcher will also conduct analyses of all CYP to explore associations between inequalities (using index of multiple deprivation), ethnic group, and vulnerable vs other CYP, and outcomes measured in health care, NPD data (i.e. education and CiN/CLA) in periods before, during and after the COVID-19 pandemic. The researcher will refresh the linked NPD datasets annually in 2021 and 2022 to evaluate the longer-term impacts of COVID-19 infection and response on health, education and social care outcomes for vulnerable, compared with other CYP.

To address these research questions, HES (APC, A&E, OPD, ECDS, critical care) and death registration data (HES-mortality data) provided by NHS Digital will be linked to administrative data from the National Pupil Dataset (NPD) provided by the Department for Education (DfE) to the ONS SRS. Using a pseudonymised linkage key, these datasets will be linked for all children and young people (CYP) born in England on or after 1.9.1995.

The full cohort (longitudinal data for all children and young people in England) is justified for the following reasons:

1) Longitudinal coverage:

Examining health data from the time of birth to current age (up to, but not including, age 25 years) is critical for identifying markers of vulnerability in administrative data. For example, previous work completed by UCL has shown that chronic underlying conditions, or congenital disorders associated with special education needs may not be recorded at every admission (e.g., asthma may not be recorded when a child is admitted for an operation) and UCL have demonstrated the added value of using the whole longitudinal record. The researchers have requested the minimum data necessary for their research, which reflects the administrative history of the child for a subset of the available fields (e.g. the researchers have requested 60% of available inpatient fields, with no sensitive or identifiable fields).

2) Geographical coverage:

The research aims to draw conclusions that are valid for all children and young people in England. Yet the pandemic has had differential impacts across the country (reflecting both infection rates and public health responses) at different times. For example surveys (e.g. RCPCH) indicate geographical heterogeneity, including re-routing/re-deployment of healthcare staff and services, uptake of school access by eligible children, which are likely to disproportionately impact on areas with higher levels of overcrowding, less outside space, and greater deprivation. However, many surveys have incomplete coverage by geography or over time, making it difficult to accurately estimate the scale of the problem. Understanding time-varying patterns of change is increasingly important as public health responses shift towards localised management (e.g. local lockdowns) to control spread. The researchers therefore need data that makes it possible to understand local area impacts. The researchers have requested the minimum granularity possible, for example by requesting MSOA rather than LSOA.

3) Cohort:

The research focuses on the impact of the pandemic and lockdown on vulnerable children and young people (further defined below). Reliable identification of children meeting this definition is not trivial, requiring longitudinal data from birth across health, education and social care. As a result there are relatively few robust estimates of the size of this population.

However, there is good evidence that these indicators of vulnerability are common. For example, new research estimates that 25% of all children are ever designated a child in need and that 44% are ever referred to children’s social care before the age of 16 years. A further subset of children will have other indicators of vulnerability reflecting health or educational needs. In order for the research to draw meaningful conclusions the researchers wish to draw comparisons between the impact of the pandemic and lockdown on different groups of vulnerable children relative to a series of control children. The researchers will draw high level comparisons (e.g. to all other children) relevant to evaluating impacts at national level and for international comparison, as well as detailed comparisons against synthetic control groups (e.g. through propensity score matching) to better understand the impacts of vulnerability in the context of related factors such as local environment, access to schools and healthcare needs.

The researchers therefore require data for all children and young people in England as without these data our comparisons would be incomplete, at greater risk of selection bias and not generalisable.
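To make the idea of a matched control group concrete, the sketch below shows one simple form of propensity score matching: greedy 1:1 nearest-neighbour matching with a caliper. It assumes the propensity scores have already been estimated elsewhere; the identifiers, scores, caliper value and function name are hypothetical, and nothing here is taken from the researchers' actual method.

```python
from typing import Dict, List, Tuple


def greedy_match(
    treated: Dict[str, float],
    controls: Dict[str, float],
    caliper: float = 0.05,
) -> List[Tuple[str, str]]:
    """Greedy 1:1 nearest-neighbour matching on precomputed propensity scores.

    Each treated unit is paired with the closest unused control whose score
    differs by no more than `caliper`; unmatched treated units are dropped.
    """
    available = dict(controls)
    pairs = []
    for t_id, t_score in sorted(treated.items(), key=lambda kv: kv[1]):
        if not available:
            break
        c_id = min(available, key=lambda cid: abs(available[cid] - t_score))
        if abs(available[c_id] - t_score) <= caliper:
            pairs.append((t_id, c_id))
            del available[c_id]
    return pairs


# Made-up scores for hypothetical vulnerable (treated) and control children.
treated_scores = {"t1": 0.30, "t2": 0.62, "t3": 0.95}
control_scores = {"c1": 0.31, "c2": 0.60, "c3": 0.10, "c4": 0.33}
matches = greedy_match(treated_scores, control_scores)
```

In this toy example `t3` finds no control within the caliper and is left unmatched, which illustrates why a large pool of potential controls matters for this kind of comparison.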

Linkage of identifiers from HES-mortality data and NPD will be conducted at NHS Digital, which will then transfer only the pseudonymised linkage key to the UCL Data Safe Haven to flag linked records in the UCL-curated HES extract for transfer to the ONS Secure Research Service (SRS). NPD attribute data will only be available in the ONS SRS. Using the pseudonymised linkage key, linkage of pseudonymised attribute data (clinical or education characteristics) will then occur separately, at the ONS SRS. The following outline describes the complete data flow and details how identifiable and non-identifiable data extracts will be handled:

1) DfE will supply the Trusted Third Party (NHS Digital) with a list of NPD identifier variables. These identifiers include name, date of birth, full postcode and sex, alongside a study-specific pseudonymised linkage key known as the anonymised Pupil Matching Reference (aPMR). The identifying variables will be used for linkage to the Personal Demographic Service (PDS) (as previously done for NIC 27404). DfE will transfer the variables for any CYP born on or after cohort inception (1.9.95).

2) NHS Digital will match the identifiers from DfE to records held in the PDS using an algorithm that makes use of the chronology of postcodes in NPD and PDS. Matching to PDS data will be done internally within NHS Digital; no PDS data will be disseminated to ONS SRS or UCL Data Safe Haven. NHS Digital will link the NPD pseudonymised linkage key (i.e. anonymised PMR or young person ID) to PDS, and then to the ECDS data.

3) For those CYP whose NPD identifiers were matched to PDS, onward linkage to HES-mortality data will occur within NHS Digital to link aPMRs and HES-IDs. NHS Digital will then transfer encrypted HES-IDs, aPMRs, and indicators of match rank (denoting the step at which the match to HES and PDS was made) for these linked cases to the UCL Data Safe Haven for linkage to the existing HES-mortality extract held by UCL (NIC-393510).

4) UCL will extract the HES-mortality data for all CYP born on or after 1.9.1995, using the existing HES extract (NIC 393510), and link the aPMR and match rank statistics for those CYP that were linked by NHS Digital in step (3) from NPD. The de-identified HES-mortality extract will be transferred to the ONS SRS. Only month/year of birth and death will be transferred to the ONS SRS, in order to account for well-established effects of month of birth on school achievement (i.e. research consistently shows that children born in September do better than children born in July/August). The attribute data will also include high-level categorical maternal indicators relevant to birth (e.g. parity – 0, 1, 2+ prior births), which have an important bearing on child health. These derived health indicators are created by analysing variables in our analysis files, that are covered by this DSA (e.g. diagnosis codes and baby tail) and for which we have the appropriate permissions.

5) DfE will supply ONS SRS with requested de-identified attribute data extracts, with the aPMR for all CYP born on or after 1.9.1995. The deidentified attribute NPD and HES data will be linked within the ONS SRS by the research team, using the aPMR. Data will only be used by researchers authorised for the project, with strict output controls applied by ONS SRS staff.

6) The final data set that will be used for analyses will remain within the ONS SRS. The files will not contain any identifiable data. No additional record level data will be gathered or linked to the dataset. The aPMR is the only variable supplied from NPD data that is supplied by NHS Digital to UCL Data Safe Haven and then to ONS SRS.

7) NHS Digital will retain the identifier file of all individuals linked in NPD-PDS and PDS-HES and all the postcodes used in linkage and postcode dates for 12 months to address data queries or potential linkage errors. This data set will not contain any attribute data and will be accessible only to NHS Digital staff. At the end of the 12 months, NHS Digital will confirm deletion of the data to DfE. NHS Digital will not send confidential data to DfE or UCL DSH.
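The final linkage step in the flow above, where de-identified HES and NPD attribute extracts are joined inside the ONS SRS using the aPMR, can be sketched as a simple key-based join. All field names and values below are hypothetical and exist only for illustration; the sketch assumes each aPMR appears at most once in the NPD extract and says nothing about the software actually used.

```python
from typing import Dict, List

# All field names and values below are hypothetical, for illustration only.
hes_extract = [
    {"apmr": "A1", "birth_month": "1996-03", "emergency_admissions": 2},
    {"apmr": "A2", "birth_month": "2001-11", "emergency_admissions": 0},
    {"apmr": "A4", "birth_month": "2010-07", "emergency_admissions": 1},
]
npd_extract = [
    {"apmr": "A1", "ever_child_in_need": True},
    {"apmr": "A2", "ever_child_in_need": False},
    {"apmr": "A3", "ever_child_in_need": True},
]


def link_on_key(left: List[Dict], right: List[Dict], key: str = "apmr") -> List[Dict]:
    """Inner-join two lists of records on a shared pseudonymised key.

    Assumes the key is unique within `right`; records without a partner
    in the other extract are dropped.
    """
    right_by_key = {row[key]: row for row in right}
    return [
        {**row, **right_by_key[row[key]]}
        for row in left
        if row[key] in right_by_key
    ]


linked = link_on_key(hes_extract, npd_extract)
```

Only records present in both extracts are retained, and the joined rows carry the pseudonymised key and attribute fields but no names, full dates of birth or postcodes.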

LSHTM and IFS will each have a named researcher and PI who will undertake analyses (on the ONS SRS) relating to a specific component of the wider research question on the impact of the pandemic on vulnerable children and young people. For example, LSHTM will examine the impact (on health and education) of delays in time-sensitive procedures (e.g. surgical correction of cleft lip and palate) on children with underlying health conditions. These named researchers will be substantive employees of the respective organisations.

The de-identified linked HES-NPD attribute data will be held on the ONS SRS and will only be accessible remotely from the UCL Data Safe Room which has restricted and monitored access. No record level data can be removed from the ONS SRS and statistical disclosure controls are applied by ONS staff. Access will be restricted to named users, who are part of the study team and are accessing the data for the purposes outlined in this DSA. Access to the data is via the ONS SRS environment.

The Office for National Statistics (ONS) and UCL have signed and maintain an organisational agreement to use the ONS Secure Research Service (SRS) for the purposes of secure statistical research, signed on 21/03/2019 with an indefinite expiry date. The only HES data stored in the ONS SRS are the 4 one-year cohorts of HES, plus the anonymised PMRs for those records that link to NPD. The HES data will include only month of death and month of birth, and therefore no identifying data.

For security and resource reasons the SRS is a Managed Service. Equiniti Ltd (based in Belfast) maintains the system on behalf of the ONS SRS. They do so by remotely accessing (RA) the SRS through an encrypted (TLS 1.2) VPN tunnel. All Equiniti Ltd administrators are SC cleared and have no access to any data. ONS SRS Research Support “Admin” staff only have permissions to carry out such tasks as creating users, updating patches, testing and installing software applications, arranging DR and ITHC for the SRS environment, and closing SRS sessions down, i.e. all the SRS environment admin maintenance; essentially they are “power users”. There have been no data infractions by Equiniti Ltd staff in the last 5 years of them maintaining the environment; they have been very professional.

The high level security document that Equiniti Ltd provided states: 6.3. Service Management support for the SRS Service is provided from Equiniti offices in Belfast; all staff are SC cleared. The office hosting the SRS Desk is ISO/IEC 27001 2018 certified. Neither Equiniti Ltd nor any of their staff process the data. Therefore Equiniti Ltd is not considered to be a Data Processor.

The ONS SRS environment is an isolated system. It has no connectivity to the internet other than using it as a bearer to pass TLS 1.2 encrypted image packages for a virtual desktop infrastructure (VDI), hosted on an accredited cloud server hosted by UKCloud Ltd on the mainland UK. UKCloud Ltd merely host the environment; they have no access to data. Therefore UKCloud Ltd is not considered to be a Data Processor.

CLOUD SECURITY
NHS Digital Security has provided assurance regarding the use of the Office for National Statistics' Secure Research Service (ONS SRS), hosted by UKCloud Ltd, in this application. The Office for National Statistics has submitted a selection of security documentation to support the use of cloud storage. NHS Digital Security have reviewed the documentation and provided relevant feedback, where necessary. NHS Digital are satisfied that the documentation demonstrates the level of security and governance in place.

The Office for National Statistics have supplied evidence to support:
• The use of the Data Risk Model to assess the Risk Profile Class.
• Risk Management of the use of the Cloud for this data, taking into consideration Confidentiality, Integrity and Availability.
• The use of Pseudonymisation.
• Board level involvement in the Risk Management Process evidenced through Minutes of these meetings.
• Understanding of the Shared Responsibility Model

The Office for National Statistics have a very good understanding of the security controls available to them to secure data in the Cloud.

Using the Cloud brings the benefit of inherited controls that cannot practically be replicated locally, such as Physical Controls, Resilience of Systems, Power Supplies, Communications and Geographically dispersed Data Centres within a region.

Elasticity in provisioning is also a consideration that benefits organisations in managing workloads. The Cloud provider, UKCloud Ltd, will use UK Data Centres only.

MR1a - Health and Development Study - Consented Cohort Members — DARS-NIC-148100-6RFK9

Opt outs honoured: No - consent provided by participants of research study, No - data flow is not identifiable, Y, No (Excuses: Reasonable Expectation, Consent (Reasonable Expectation))

Legal basis: Informed Patient consent to permit the receipt, processing and release of data by the HSCIC, Health and Social Care Act 2012 – s261(2)(c), Health and Social Care Act 2012 – s261(7), Section 251 approval is in place for the flow of identifiable data, Health and Social Care Act 2012 s261(2)(c), Health and Social Care Act 2012 s261(2)(c); Health and Social Care Act 2012 s261(7)

Purposes: No (Academic)

Sensitive: Sensitive, and Non Sensitive, and Non-Sensitive

When:DSA runs 2019-03 – 2022-03 2017.09 — 2024.11. SMLS reported a DPA serious incident; breached contract — audit report.

Access method: One-Off, Ongoing

Data-controller type: UNIVERSITY COLLEGE LONDON (UCL)

Sublicensing allowed: No

AGD/predecessor discussions: AGD minutes - 27th March 2025 final.pdf, igard_minutes_8_february_2018.pdf, igardminutes-17thdecember2020final.pdf, igardminutes-14thjanuary2021final.pdf, IGARD_Minutes_03.08.17.pdf, IGARD_Minutes_24.08.17.pdf

Datasets:

Hospital Episode Statistics Admitted Patient Care
Hospital Episode Statistics Outpatients
Hospital Episode Statistics Accident and Emergency
MRIS - Flagging Current Status Report
MRIS - Cause of Death Report
MRIS - Cohort Event Notification Report
MRIS - Scottish NHS / Registration
MRIS - List Cleaning Report
Demographics
Civil Registration - Deaths
Cancer Registration Data
MRIS - Members and Postings Report
Emergency Care Data Set (ECDS)
Hospital Episode Statistics Accident and Emergency (HES A and E)
Hospital Episode Statistics Admitted Patient Care (HES APC)
Hospital Episode Statistics Outpatients (HES OP)
Civil Registrations of Death

Type of data: Identifiable

Objectives:

The MRC National Survey of Health and Development (NSHD) is the oldest and longest running of the British birth cohort studies. From an initial maternity survey of 13,687 (82%) of all births recorded in England, Scotland and Wales during one week of March, 1946, a socially stratified sample of 5,362 singleton babies born to married parents was selected for follow-up. The NSHD study team is housed within the MRC Unit for Lifelong Health and Ageing (LHA) at University College London (UCL).

The NSHD study team has collected unique lifetime data on body size and maturation, cognitive and physical function, socioeconomic status and diet; and has repeat adult data on diet, smoking, physical activity, blood pressure and lung function. The most intensive data collection in 2006-2010, when study members were aged 60-64 years, included measurement of cardiac structure and function, body composition and bone density.

The 24th and most recent data collection to the whole sample included a postal questionnaire in 2014 and a home visit by a trained research nurse for interview and assessment in 2015/2016. At the 24th follow-up, the target sample was 2816 study members still living in mainland Britain; this is the maximum sample used in the analyses. Of the remaining 2546 (47%) study members: 957 (18%) had already died, 620 (12%) had previously withdrawn permanently, 574 (11%) lived abroad, and 395 (7%) had remained untraceable for more than 5 years.
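
The sample breakdown above can be checked with simple arithmetic. The following is a minimal sketch in Python that uses only the figures quoted in the text; the variable and function names are illustrative and are not part of the study's own tooling.

```python
# Figures quoted in the text above; all names here are illustrative only.
ORIGINAL_SAMPLE = 5362   # singleton babies selected for follow-up
TARGET_SAMPLE = 2816     # target sample at the 24th follow-up
NOT_IN_TARGET = {
    "died": 957,
    "withdrawn permanently": 620,
    "living abroad": 574,
    "untraceable for more than 5 years": 395,
}


def percent_of_original(count: int) -> int:
    """Share of the original sample, rounded to the nearest whole percent."""
    return round(100 * count / ORIGINAL_SAMPLE)


# Study members outside the target sample, and the share for each reason.
remaining = ORIGINAL_SAMPLE - TARGET_SAMPLE
breakdown = {reason: percent_of_original(n) for reason, n in NOT_IN_TARGET.items()}

if __name__ == "__main__":
    print(f"remaining: {remaining} ({percent_of_original(remaining)}%)")
    for reason, pct in breakdown.items():
        print(f"{reason}: {NOT_IN_TARGET[reason]} ({pct}%)")
```

The four categories sum to the 2546 members outside the target sample, and each percentage is expressed relative to the original 5,362.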

Where study members have become lost to follow up, data is being provided under a separate application, NIC-86954. NSHD will use the data under that application to seek to re-contact those study members and invite them to continue participating in the study, i.e. to re-consent these participants.

The NSHD was the first study (in 1971) to have participants flagged on the NHS Central Register for mortality (ICD codes are used to code cause of death) and cancer registrations. The LHA receives notifications on an ongoing quarterly frequency.

The LHA wishes to link NSHD study members to HES data in order to improve the quality of information on hospital admissions and health outcomes for research purposes. Currently, the study obtains self-reported hospital admission data at each follow-up which are then confirmed through contact with each hospital.

The Unit has a 5-year MRC core funded programme of research based on the NSHD with the objective to investigate risk and protective factors from across the life course that influence the ageing process. This core funding has been in place since 1962 and is renewed every five years after scientific review.

The data from HES will be used to improve the identification of acute events such as those caused by cardiovascular disease (CVD). For example, the unit will assess how life course risk factor trajectories of body size, resting heart rate, blood pressure, socio-economic position (SEP) and health related behaviours, accumulate and interact to influence incidence of CVD, thus potentially identifying possibilities for earlier prevention. As the cohort is entering older age, hospital care becomes increasingly frequent and study members are thus less likely to report hospital admissions over a number of years accurately. It is therefore important to capture this information in other ways. New research within LHA on health service use is being developed which will utilise these data and investigate life course predictors of health care utilisation.

The data collected on the NSHD cohort, including that provided by NHS Digital, is used across five integrated research programmes with the overarching aim of identifying social and biological factors that affect lifelong health, ageing and the development of chronic disease risk.

The five programmes are:
1) Enhancing NSHD
2) Functional Trajectories and Cardiovascular Ageing
3) Physical Capability and Musculoskeletal Ageing
4) Mental Ageing
5) Wellbeing in older age

All those with access to the data are substantive employees of University College London.

All processing of ONS data will be in line with ONS standard conditions.

All outputs will be restricted to aggregate data with small numbers suppressed in line with the HES Analysis Guide.

The data from NHS Digital will not be used for any purpose other than that outlined in this Agreement. There will be no onward sharing of data as part of this application.

The legal basis for the applicant to continue to hold the Scottish Registration data is consent.

Yielded Benefits:

Delays in obtaining HES data have prevented the study team at UCL from investigating the life course influencers and predictors of health care utilisation. However, since receiving the data, the study team have been working on cleaning and deriving variables that will allow identification of acute events, such as coronary heart disease, heart failure and dementia, which will allow UCL to conduct the planned work. Despite the delays, UCL have been able to produce a couple of outputs. For example, work by Dehbi et al (Environment Int. 2017) has provided further evidence of the role of air pollution in cardiovascular mortality, which may be used to influence policies on air quality. Another article in press (British Journal of Psychiatry), looking at adolescent affective symptoms and mortality, may be important in assessing health care services.

Expected Benefits:

The NSHD has informed UK health care, education and social policy for 70 years and is the oldest and longest running of the British birth cohort studies. Today, with study members in their early seventies, the NSHD offers a unique opportunity to explore the long-term biological and social processes of ageing and how ageing is affected by factors acting across the whole of life.

Evidence is growing from this cohort study and others, that factors from early life (such as growth, neurodevelopment, nutrition and family socioeconomic circumstances) as well as later life (such as adult smoking, diet, exercise and socioeconomic circumstances) affect the opportunity to age well. This is of interest to policymakers, practitioners, and older people themselves.

The research using NSHD life course information will provide insights into when in the life course interventions to prevent disease (in particular CVD) and, as the cohort ages, hospitalisation, will be most useful. This information will inform the design of future interventions which can then be tested in controlled trials. As the study is nationally representative, it will also provide valuable information regarding the factors associated with health care utilisation of the ageing population.

In particular, through knowledge transfer, public engagement, publications, presentations and invited commentaries (http://www.nshd.mrc.ac.uk/findings/) the MRC LHA has contributed to a body of evidence to influence policies and support evidence-based medicine. For example, a recent paper in PLOS Medicine comparing lifetime trajectories of overweight and obesity across NSHD and the later-born cohorts has been cited in the Government’s recent Child Obesity Strategy. Other examples highlighting the depth and breadth of this lifelong study include:

• NSHD is a member of the Dementias Platform UK, a £53 million collaboration between universities and industry established by the MRC in 2014, to transform the best dementia research into the best treatments as quickly as possible. It combines the power of multiple population studies to compare healthy people with people at all stages of dementia.

• The NSHD finding, in 2014, that more rapid rises in systolic blood pressure during midlife (even if not crossing into hypertension) were related to poorer cardiac structure (published in the European Heart Journal in 2014) has implications for treatment guidelines as it suggests that identification and treatment of people with rapidly increasing SBP, even if they are not reaching the criteria for hypertension, may be beneficial in preventing subsequent cardiovascular disease.

• The NSHD findings (published in The Lancet Diabetes & Endocrinology in 2014) suggesting that those who lost weight at any age during adulthood, even if weight was regained later, had better cardiovascular risk profiles than those who remained overweight or obese support public health strategies that help individuals to lose weight at all ages.

• In 2014, the NSHD finding that better performance in tests of physical capability (i.e. grip strength, chair rising and standing balance) in midlife was linked to higher survival rates over 13 years of follow-up was published in the British Medical Journal. This highlighted the value of these simple objective physical tests in helping to identify those people who from at least as early as midlife onwards may require more support than others to achieve a long and healthy life.

• Subsequent work examining changes in objective measures of physical capability between ages 53 and 60-64 has highlighted that age-related decline may not be entirely inevitable and is potentially modifiable. This work has also suggested that there may be a need to monitor physical capability from at least as early as midlife onwards as opportunities to help some high risk groups may already have been missed if no action is taken until later in life.

• A 2009 report on adult life chances in relation to childhood mental health using NSHD was cited by the government in support of a case for early intervention to build mental capacity and resilience.

• The study’s findings of the continuing effect of early life growth and development on health outcomes in adulthood add to the arguments for early intervention of the kind provided by the national SureStart programme.

• The 1999 paper comparing children’s diet in 1950 with that in the 1990s (‘Food and nutrient intake of a national sample of four-year-old children in 1950: comparison with the 1990s’, Public Health Nutrition) had an impact because of its evidence that the quality and nutrient value of infant and childhood diet had declined between 1950 and 1990.

• The study’s finding (published in All our Future in 1968) of the extent and inequity of the ‘waste of talent’ – in terms of high ability children who did not continue into further or higher education – added to arguments for improving opportunities for, and expectations of, children from poorer families.

• The Home and the School (1964) had a great impact, probably because it provided the first hard evidence that parents and preschool circumstances had a significant impact on ability and attainment at age eight, and so showed that preschool development and experience formed the bedrock on which primary schooling was built.

• Press reports that followed the publication of Maternity in Great Britain (1948), which were concerned with the ‘Need for Better Care and Lower Costs’ (The Times), are likely to have influenced the arguments for improvements in the care of mothers and babies.

Outputs:

The data will be used on an ongoing basis to update study member records. The database will be updated after each data release.

The primary output of the linkages with HES, ONS mortality and Cancer Registration data is the maintenance and enhancement of the NSHD-DR. This is in turn used to achieve multiple research outputs that benefit health and social care.

The programme ‘Enhancing NSHD’ examines many of the genomic, other metabolomic or epigenomic factors that influence the risk of many age-related diseases and quantitative traits, often in collaboration with external researchers.

The programme ‘Functional Trajectories and Cardiovascular Ageing’ examines which factors from across the life course promote good adult cardiovascular function and prevent disease onset, and which increase vulnerability to accelerated cardiovascular ageing.

The programme ‘Physical Capability and Musculoskeletal Ageing’ examines which factors from across the life course promote good adult physical capability and musculoskeletal health, and which increase vulnerability to accelerated decline in capability.

The programme ‘Mental Ageing’ examines which factors from across the life course promote cognitive capability and protect against depression and which factors increase vulnerability to cognitive decline.

The programme ‘Wellbeing in older age’ examines what social contexts and experiences in childhood and early adulthood promote wellbeing in later life and whether wellbeing protects against functional ageing.

Each of these programmes generates multiple publications in peer-reviewed journals annually and findings are further disseminated via conference presentations. A full list of publications produced to date plus details of the current priorities for each programme are published on the MRC LHA website at: http://www.nshd.mrc.ac.uk/.

Publications and presentations only use data that is aggregated with small numbers suppressed in line with the HES Analysis Guide.
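
As an illustration of the small-number suppression described above, here is a minimal sketch in Python. The threshold of 5 and the `*` marker are assumptions made for the example only; the rules that actually apply are those set out in the HES Analysis Guide, which should be consulted for the correct values.

```python
from typing import Dict, Union

# ASSUMPTION: placeholder values for illustration, not taken from the HES Analysis Guide.
SUPPRESSION_THRESHOLD = 5
SUPPRESSED_MARKER = "*"


def suppress_small_numbers(
    counts: Dict[str, int],
    threshold: int = SUPPRESSION_THRESHOLD,
    marker: str = SUPPRESSED_MARKER,
) -> Dict[str, Union[int, str]]:
    """Replace non-zero counts at or below `threshold` with `marker`.

    Zero counts and counts above the threshold are returned unchanged.
    """
    return {
        category: (marker if 0 < count <= threshold else count)
        for category, count in counts.items()
    }


if __name__ == "__main__":
    # Hypothetical aggregate table of admissions by age band.
    table = {"60-64": 3, "65-69": 0, "70-74": 12}
    print(suppress_small_numbers(table))
```

A sketch like this covers only primary suppression. Where totals are published alongside the cells, further steps may be needed so that a suppressed value cannot be recovered by subtraction.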

This MRC Unit is committed to research on ageing – outputs arising from ONS data will be anonymised in the form of tables, graphs, peer reviewed journals, presentations and books.

These data have been used in a number of publications. A full list of publications can be found at http://www.nshd.mrc.ac.uk/findings/
Examples of NSHD publications using mortality data are below:
1. Davis D, Cooper R, Terrera GM, Hardy R, Richards M, Kuh D. Verbal memory and search speed in early midlife are associated with mortality over 25 years' follow-up, independently of health status and early life factors: a British birth cohort study. Int J Epidemiol. 2016 Aug 6. pii: dyw100.
2. Zhou CK, Sutcliffe S, Welsh J, Mackinnon K, Kuh D, Hardy R, Cook MB. Is birthweight associated with total and aggressive/lethal prostate cancer risks? A systematic review and meta-analysis. Br J Cancer. 2016 Mar 29;114(7):839-48.
3. Teschendorff AE, Yang Z, Wong A, Pipinikas CP, Jiao Y, Jones A, Anjum S, Hardy R, Salvesen HB, Thirlwell C, Janes SM, Kuh D, Widschwendter M. Correlation of Smoking-Associated DNA Methylation Changes in Buccal Cells With DNA Methylation Changes in Epithelial Cancer. JAMA Oncol. (2015 Jul 1); 1(4):476-85
4. Hartaigh B, Gill TM, Shah I, Hughes AD, Deanfield JE, Kuh D, Hardy R. Association between resting heart rate across the life course and all-cause mortality: longitudinal findings from the Medical Research Council (MRC) National Survey of Health and Development (NSHD). J Epidemiol Community Health, 2014 Sep;68(9):883-9.
5. Albanese E, Strand BH, Guralnik JM, Patel KV, Kuh D, et al. (2014) Weight Loss and Premature Death: The 1946 British Birth Cohort Study. PLoS ONE 9(1): e86282.
6. Maughan B, Stafford M, Shah I, Kuh D. Adolescent conduct problems and premature mortality: follow up to age 65 in a national birth cohort. Psychological Medicine 2013 Aug 21:1-10.
7. Ong K, Hardy R, Shah I, Kuh D on behalf of the NSHD scientific and data collection teams. Childhood stunting and mortality between 36 and 64 years: the British 1946 birth cohort study. Journal of Clinical Endocrinology and Metabolism. 2013 May;98(5):2070-7.
8. Strand BH, Kuh D, Shah I, Guralnik J, Hardy R Childhood, adolescent and early adult body mass index in relation to adult mortality: results from the British 1946 birth cohort. J Epidemiol Community Health. 2012 Mar; 66(3): 225–232.
9. Henderson M, Hotopf M, Shah I, Hayes RD, Kuh D. Psychiatric disorder in early adulthood and risk of premature mortality in the 1946 British Birth Cohort. BMC Psychiatry 2011 Mar 8;11:37.
10. Kuh D, Shah I, Richards M, Mishra G, Wadsworth M, Hardy R. Do childhood cognitive ability or smoking behaviour explain the influence of lifetime socioeconomic conditions on premature adult mortality in a British post war birth cohort? Soc Sci Med. 2009 May; 68(9): 1565–1573.
11. Clennell S, Kuh D, Guralnik J, Patel K, Mishra G. Characterisation of smoking behaviour across the life course and its impact on decline in lung function and all-cause mortality: evidence from a British birth cohort. Journal of Epidemiology and Community Health 2008;59:304-14.
12. Kuh D, Richards M, Hardy R, Butterworth S, Wadsworth MEJ. Childhood cognitive ability and deaths up until middle age: a post war birth cohort study. International Journal of Epidemiology 2004;33:408-13.
13. Kuh D, Hardy R, Langenberg C, Richards M, Wadsworth MEJ. Mortality in adults aged 26-54 years related to socioeconomic conditions in childhood and adulthood: post war birth cohort study. British Medical Journal 2002;325:1076-80.

Processing:

NSHD receives data from two main sources i) collected from the study members themselves over the past 70 years and ii) from NHS Digital; these data are held in the NSHD-Data Repository (NSHD-DR). Study participants are flagged with NHS Digital. NHS Digital provides notifications of deaths and cancer registrations on a quarterly frequency. These data are incorporated into the NSHD-DR to enhance that dataset for research purposes. The mortality data (fact of death) are also used for administrative purposes. As well as being used to identify specific health events, linkage to HES data will allow the derivation of useful aggregate variables such as number of hospital admissions and length of time in hospital. The derived aggregate variables are then used for other research analyses by LHA scientists and may be shared with external researchers.

In scientific studies in the period that pre-dated the MREC/LREC structure, consent was assumed by participation. In this study, the period of assumed consent covers the years from birth to age 35 years (from 1946 to 1981). Ethical permission for the 1982 and 1989 studies was obtained from the local ethical committees that preceded the LRECs and were run by the teaching hospital to which the NSHD research team were then affiliated (Bristol in 1982 and UCL in 1989). In 1999, MREC approval was obtained for the data collection and its use for research purposes by the team and their collaborations (MREC98/1/121). Ethical approval for the feasibility study (MREC06/Q1407/26) and extension study (07/H1008/245) was obtained from the Central Manchester Research Ethics Committee, and additional Scottish approval (08/MRE00/12) was granted through the Scotland A Research Ethics Committee. Most recently, a favourable opinion was obtained from the London Queen Square REC (14/LO/1073) and Scotland A REC (14/SS/1009).

The consented cohort includes participants who consented using previous versions of the consent material. Consent is taken at face-to-face contact, which is typically a home visit by a research nurse; this occurs roughly every five to ten years. Consent is sought from study participants, using the updated consent materials, prior to each data collection. Consent materials provided to participants explain the purpose of the data collection and provide the opportunity for individuals to withdraw from the data collection and/or the entire study. The data flow for participants who are either lost to follow-up or non-responders is covered by Section 251 support. The Section 251 support does not cover any participant who has withdrawn their consent.

Derived NHS Digital data will be linked to the NSHD-DR which stores all study member data in pseudonymised form going back to 1946. NHS Digital identifiable data can only be viewed by named NSHD staff and is stored separately from pseudonymised derived data. The NSHD-DR additionally holds hospital admissions data that was previously obtained directly from the hospitals or General Practitioners.

Cancer Registry-wide study in infants with neuroblastoma; Task 11.4 of the ENCCA Network of Excellence (ODR1516_119) — DARS-NIC-656760-F8Y3C

Opt outs honoured: No (Excuses: Does not include the flow of confidential data)

Legal basis: Health and Social Care Act 2012 - s261(5)(d), Health and Social Care Act 2012 s261(2)(a)

Purposes: No (Academic)

Sensitive: Non-Sensitive

When:DSA runs 2023-01 – 2023-12 2024.09 — 2024.09. SMLS reported a DPA serious incident; breached contract — audit report.

Access method: One-Off

Data-controller type: UNIVERSITY COLLEGE LONDON (UCL)

Sublicensing allowed: No

AGD/predecessor discussions: IGARD Minutes - 12 January 2023 final.pdf

Datasets:

NDRS Cancer Registry
NDRS Cancer Registrations

Type of data: Anonymised - ICO Code Compliant

Objectives:

The ENCCA (European Network for Cancer Research in Children and Adolescents, www.encca.eu) Network of Excellence aims to accelerate clinical and translational research in paediatric and adolescent oncology and to promote evaluation of and access to innovative therapies. The ENCCA network of 34 partners spans 11 European countries and includes 27 eminent paediatric oncology institutions. ENCCA links all the multinational clinical trial groups and national childhood cancer professional societies across Europe in an ENCCA European Clinical Research Council. It is structured as a consortium that will carry out 18 work packages. Work package 11 aims to establish methods of linkage of the population-based cancer registries with other forms of routine health care data to conduct future research in childhood cancers where the overall population has a good prognosis. Work package 10 aims to develop risk-adapted therapies in solid tumours, mainly neuroblastoma.

Primary aim: To understand the outcomes of current treatments for neuroblastoma in infants in relation to the success of first-line therapy (event-free survival) and the burden of treatment received by the individual child and reasons for any differences between countries.

This project will develop mechanisms and methods of collaborative work between the population-based cancer registries and the clinical databases across the participating European countries and clinical registries. The aim will be to link the series of cases arising in a well-defined (by age at diagnosis) population of infants with neuroblastoma and registered in cancer registries, enhanced with the detailed information held in the clinical databases/hospital records at the patient's treatment centres.

The objective is to understand the reasons for the observed decline in overall survival rates for infants diagnosed with neuroblastoma in England in the 2000s. Hypotheses that will be explored include whether the decline is due to poorer compliance with the international 'best practice' standards of diagnosis and treatment following the ending of the SIOPEN INES 99 study in 2004. There has since been no open clinical trial in the UK for neuroblastoma in this age group.

Yielded Benefits:

An extensive list of publications published by ENCCA can be found at the following link: http://worldspanmedia.s3.amazonaws.com/media/siope/wp-content/uploads/2013/06/SIOPE-ENCCA-Scientific-Articles.pdf. Information on how the research has so far benefitted the provision of health and social care can be found within these publications.

Expected Benefits:

The aim of this study is to better understand the outcomes of current treatments for neuroblastoma in infants in relation to the success of first-line therapy (event-free survival) and the burden of treatment received by the individual child and reasons for any differences between countries. Carrying out this research may assist in identifying optimal practice in the clinical care of children with neuroblastoma, and therefore has the potential to benefit the provision of health and social care in England.

Outputs:

It is anticipated that the study findings will be published in peer-reviewed journals and will also be presented at relevant conferences.

Should the opportunity arise, the study may publish findings on the ENCCA webpages, hold open lectures or engage with the press; this will aid the dissemination of the findings and help them reach interested groups in civil society.

Processing:

All data held under this Agreement was disseminated to the International Agency for Research on Cancer (IARC) by Public Health England (PHE) prior to its dissolution in October 2021 under the assigned reference of ODR1516_119.

No identifiers were provided to PHE to support this dissemination; a set of inclusion criteria, previously agreed with the National Disease Registration Service (NDRS) analysis team, defines the cohort. The NDRS have previously pseudonymised cancer registration data based on these inclusion criteria.

IARC has conducted statistical analyses on the data provided to fulfil the study's aims and objectives. Training for all Data Users will be organised by the Information Security Officer and the Director for Administration and Finance on a periodic basis.

Data is kept in a safe and secure environment, available only to authorised users with a legitimate need to access it, and protected against unauthorised access. IARC's System Level Security Policy enforces the following controls:

Access Control:
Physical and logical access controls must be established in order to protect the Data at all times. The Data is stored in a secure location requiring either badge or key access to its physical location. Logical access will be controlled with Access Control Lists (ACLs), username and strong password combinations, and file and share level permissions.

User Access:
Access to the Data will only be granted by the Principal Investigator (PI). A central log of users having access to the Data will be maintained.

Passwords:
All passwords will be strong in nature. Passwords must never be written down or shared.

Virus Protection:
All IT equipment storing or accessing the Data will have up-to-date anti-virus protection.

Operating System Management:
All IT equipment storing or accessing the data must be updated automatically on a regular basis with the operating system and security patches in order to avoid potential security breaches.

Backup:
The Data will be backed up on a regular basis and stored in a separate location from the original data in order to allow the recovery of the data after a major incident.

Logging:
Logging of access to the Data will be put in place in order to allow a clear audit trail to be maintained of access and modifications made by each authorised User.

Network Security:
The network where the data is stored will be secured to avoid unauthorized access to the information. Network segregation and firewalls should be implemented to increase the safety of the data.

Millennium Cohort Study (also known as Child of the New Century) - Tracing — DARS-NIC-408892-F1R1Y

Opt outs honoured: Yes (Excuses: Section 251 NHS Act 2006)

Legal basis: Health and Social Care Act 2012 s261(7); National Health Service Act 2006 - s251 - 'Control of patient information'., Health and Social Care Act 2012 - s261(5)(d); National Health Service Act 2006 - s251 - 'Control of patient information'.

Purposes: No (Academic)

Sensitive: Sensitive

When:DSA runs 2021-07 – 2024-07 2021.12 — 2024.08. SMLS reported a DPA serious incident; breached contract — audit report.

Access method: Ongoing

Data-controller type: UNIVERSITY COLLEGE LONDON (UCL)

Sublicensing allowed: No

AGD/predecessor discussions: IGARD Minutes - 24th June 2021 final.pdf

Datasets:

Demographics

Type of data: Identifiable

Objectives:

The Centre for Longitudinal Studies (CLS) at University College London (UCL) is an academic resource centre responsible for producing and disseminating data resources for the scientific community. It is responsible for four of Britain's internationally renowned longitudinal cohort studies: the 1958 National Child Development Study, the 1970 British Cohort Study, Next Steps, and the Millennium Cohort Study (MCS). All these studies are following the groups of participants from cradle to grave. As such, this group of studies is unique and has provided, and is still providing, a wealth of information used in the policy decisions affecting society's health and well-being.

The purpose of this application covers: Update CLS' database with new addresses and deaths - CLS is requesting up to date addresses for the MCS cohort members. The new addresses will be used to invite participants to take part in the upcoming survey. Date of Death is requested to ensure that untraced cohort members are not contacted if they have died (although deaths are provided under agreement DARS-NIC-147860-0RSHN-v1.3 they would be disseminated at a different time to the new addresses and deaths under this application.)

The Millennium Cohort Study (MCS), also known as the Child of the New Century to cohort members and their families, is following the lives of around 19,000 young people born across England, Scotland, Wales and Northern Ireland in 2000-02.

MCS is renowned worldwide for the evidence it provides on children's experience of growing up in the United Kingdom in the 21st Century. Since the study's launch there have been seven attempts to re-contact and gather information from the whole cohort (at ages 9 months, 3 years, 5 years, 7 years, 11 years, 14 years and 17 years). The MCS covers such diverse topics as parenting; childcare; schooling and education (e.g. academic qualifications, vocational qualifications); daily activities and behaviour; cognitive development; child and parent mental and physical health; employment and education; income and poverty; housing, neighbourhood, and residential mobility; and social capital, ethnicity and identity. The information collected in previous sweeps of the study has formed the high-quality data resource that is MCS, for scientific investigation across the life course and domains. The seventh, Age 17 survey (2018-19) added to the data already collected in previous sweeps by updating information on current circumstances of the cohort and experiences they have had since the last sweep. In previous sweeps, schooling will have been the main activity common to the vast majority of cohort members.

The Age 17 survey marked an important transitional time in the cohort members' lives, where educational and occupational paths can diverge significantly. It is also an important age in data collection terms since it may be the last sweep at which parents are interviewed and it is an age when direct engagement with the cohort members themselves rather than their families is crucial to the long-term viability of the study. To reflect this, CLS conducted face-to-face interviews with the cohort members for the first time. Cohort members were also asked to do a range of other activities including filling in a self-completion questionnaire on the interviewer's tablet, completing a cognitive assessment (number activity) and having their height, weight and body fat measurements taken.

This was a unique opportunity to measure factors that underlie different types of transition into adult life, which may affect future wellbeing in unprecedented ways. Capturing these transitions well, alongside the contemporary factors underlying them, was critical. It was important to build up a picture of daily life, including factors such as: relationship with parents, family and peers, risky behaviours, social media engagement and efforts on activities such as education/school. Additional factors affecting decisions at this age include attitudes and preferences, such as preferences for education, attitudes to risk, willingness to trade off resources at different points in time, and expectations about future life events. Measuring social and emotional development, mental health and cognitive development, using well-validated instruments, was also a critical component of the survey.

Of the approximately 19,000 individuals that have ever participated in the study there will always be a number of individuals for whom the Centre for Longitudinal Studies (CLS) at University College London will not have confirmed addresses at the time of carrying out the next survey.

The ongoing success of the study depends on maintaining contact with as large a number of study members as possible. Therefore, CLS are seeking permission to be supplied with updated addresses for MCS cohort members. All of these individuals have made an informed decision to participate in the study over the years and have been made aware that the study is seeking to follow them throughout their lives. This information is provided to participants on the study website, on the page 'CNC | FAQs' (childnc.net), under 'How we find you'. CLS provide a link to the information on the study website in all materials provided to cohort members. Cohort members receive an advance booklet with complete information about each upcoming survey.

Each year CLS sends an annual postal mailing to all MCS participants. CLS asks that participants complete and return a reply slip, which allows them to provide CLS with any change in their details, e.g. a new email address or phone number. CLS also asks them to return the reply slip even if none of their details have changed, i.e. seeking a positive confirmation that the address CLS holds for them is correct. As a result, CLS can maintain the cohort's latest details on the MCS database. If the annual mailing does not reach the participant, it is returned to CLS as a 'return to sender'. CLS will attempt to trace all these returns, but if CLS cannot locate the participants then they are flagged on the database as a 'gone-away'. NHS Digital may potentially hold a more recent address and provide CLS with an opportunity to invite the cohort to re-join the study. CLS will send the details of approximately 14,100 cohort members to be linked to the NHS Digital Personal Demographics Service (PDS) dataset. This number excludes those who have died and those who have requested to be withdrawn from the study. NHS Digital will supply new addresses for study members who can be matched to the PDS dataset. Any study members for whom CLS successfully received a new address via this route would be written to and asked to provide updated contact details.
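
The selection of cohort members to send for tracing, as described above, can be sketched as follows. This is a hypothetical illustration in Python: the `CohortMember` record, its field names and the `select_for_tracing` function are assumptions made for the example and do not describe the actual CLS database or process.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class CohortMember:
    """Hypothetical record; field names are assumptions for illustration."""
    member_id: str
    deceased: bool = False
    withdrawn_from_study: bool = False


def select_for_tracing(members: List[CohortMember]) -> List[CohortMember]:
    """Keep members who have neither died nor asked to be withdrawn from the study."""
    return [
        m for m in members
        if not m.deceased and not m.withdrawn_from_study
    ]


if __name__ == "__main__":
    cohort = [
        CohortMember("A001"),
        CohortMember("A002", deceased=True),
        CohortMember("A003", withdrawn_from_study=True),
    ]
    print([m.member_id for m in select_for_tracing(cohort)])
```

Applying the exclusions before any file leaves CLS means that records for members who have died or withdrawn are never sent for linkage.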

CLS will appoint an external supplier agency (data processor) to carry out the next survey interviews for age 22 which is currently planned to take place in 2023. CLS will also use a Mailing house supplier to send correspondence to participants inviting them to re-engage with the study. CLS intend to share new addresses received from NHS Digital with these organisations in order for them to invite study members to take part in future surveys. Once the agency is appointed, CLS will inform NHS Digital of the new data processors.

Any study members choosing not to take part in the study are flagged on the CLS database with a code denoting whether their refusal is temporary (i.e. for a particular wave/survey) or permanent (i.e. they wish to have no further involvement in the study). Any previously deposited anonymised survey data for a study member and confidential data from the address database are retained unless the study member specifically asks CLS not to retain them, in which case the data are securely deleted.

With regard to a request for 'withdrawal' from a participant CLS classifies them as a 'withdrawal from the current survey' or a 'withdrawal from the study' and these are handled slightly differently:
(a) Withdrawal from the current survey: CLS will flag this on its computer system to indicate that the participant will not be taking part in the current survey and the reason for not wanting to take part is also recorded. For example, they may just not have the time to take part. Therefore, there will be no further contact with the participant for the duration of the current survey but they will be invited to take part in the next survey.
(b) Withdrawal from the study: CLS will flag this on its computer system as a permanent refusal to indicate that the participant will not be taking any further part in the study itself and the reason for this type of withdrawal is also recorded for analysis purposes. Therefore there will be no further contact with the participant for the remainder of the longitudinal study. If this request is received in writing then CLS will acknowledge the request and notify the participant that they have been flagged and will no longer be contacted or receive any further communications. This request may sometimes be accompanied by a request for the destruction of their data.

University College London (UCL) is the sole Data Controller for this agreement and will also process data.

UCL's legal basis for processing (acquiring, linking and sharing) personal data is for a public task under GDPR (article 6(1)(e)) i.e. processing is necessary for the performance of a task carried out in the public interest (as is made explicit to participants in the information leaflets provided). UCL also processes special categories of personal data for research under GDPR (article 9(2)(j)) i.e. processing is necessary for archiving purposes in the public interest, scientific or historical research purposes or statistical purposes. In addition, for ethical reasons and under the Common Law Duty of Confidentiality, UCL sought permission from the Confidentiality Advisory Group to access this data without consent. CLS has also received Research Ethical Committee (REC) approval for tracing participants via NHS Digital.

Participants are aware that the study will attempt to trace them and CLS are confident that many of those newly traced via the NHS will be happy to take part. CLS also uses its own methods for tracing cohort members, for example asking the named stable contacts (relative, neighbour or friend) for the participant's new address.

The Economic and Social Research Council (ESRC) are the funder for this study.

Yielded Benefits:

Expected Benefits:

The study produces rich, longitudinal, policy-relevant data, currently unavailable elsewhere, for a large, representative sample of children/young adults. MCS data is widely used by policy makers to evaluate and develop policy and improve services for young people and also by academic researchers to chart and understand social change. The information provided by cohort members provides valuable evidence for the research and policy community about the cohort's transitions to education/work and into early adult life. To enhance the research resource for secondary users, a fully documented, pseudonymised dataset collected at age 17 was archived at the UK Data Service.

A specific benefit of the data dissemination under this agreement is being able to trace cohort members. Using this data ensures that the study sample is maintained and that the study remains representative of the studied population.

Below are examples of existing publications using the MCS data benefiting public health.

Drinking in pregnancy
In the age 3 survey, MCS cohort children completed activities to show which words they understood and spoke, and which colours, letters, numbers, shapes and objects they were familiar with. Parents were also asked about different aspects of children's behaviour, such as how well they got on with other children and how active they were. Research using MCS survey data has found that children whose mothers drank heavily while they were pregnant were more likely to have behaviour problems at age 3 than those whose mothers did not drink or drank lightly. On average they also did less well in the different activities, although many other factors are important too.

Smoking in pregnancy
Several studies based on MCS have looked at how smoking during pregnancy relates to children's development. One group of researchers found that babies with mothers who smoked at any point while they were pregnant weighed on average 146 grams less when they were born (around the weight of a smartphone) than babies with mums who did not smoke. Overall, the more cigarettes a mother smoked a day, the less her baby weighed at birth. Babies with mothers whose partners smoked around them while they were pregnant also weighed on average 36 grams less (about the weight of a chocolate bar) than those with mothers who were not exposed to smoke.

Breastfeeding and child health
An influential study found that babies who were breastfed in the first months of their lives were less likely to go to hospital for diarrhoea or respiratory problems, such as infections and pneumonia. The researchers estimated that half of hospital stays for diarrhoea, and a quarter of stays for respiratory problems, could be prevented every month if all babies in the UK were fed entirely on breast milk for at least six months.

Breastfeeding and child development
Between ages 3 and 7 MCS children took part in a range of activities to show which words they knew and the patterns they could identify in shapes and images. Studies have found that children who were breastfed tended to do better in these exercises and to have fewer behaviour problems. Research has also suggested that there is a relationship between breastfeeding and young children's ability to coordinate the movements of their arms and legs and to reach milestones such as standing up for the first time and taking their first steps.

The paragraph below is taken from this 2011 impact case study - https://cls.ucl.ac.uk/wp-content/uploads/2017/06/Impact-case-studies-Millennium-Cohort-Study-November-2011.pdf

Breastfeeding and birth weight: The MCS research that found breastfeeding to be associated with lower hospitalisation rates for respiratory infections and child diarrhoea has proved to be very influential. It has been widely cited by health organisations, most notably in:

the National Institute for Health and Clinical Excellence (NICE) guidance on Maternal and Child Nutrition;
guidance issued by the Department of Health/Department for Children, Schools and Families, Commissioning local breastfeeding support services;
Infant Feeding Survey 2005: A commentary on infant feeding practices in the UK, by the Scientific Advisory Committee on Nutrition.

The finding is highlighted in the nutrition guidelines and breastfeeding strategy documents published by many UK primary care trusts, including North Somerset, Stoke on Trent, Blaenau Gwent, North Lincolnshire, Knowsley and Kent and Medway. It is also cited in documents published by the NCT (formerly the National Childbirth Trust), such as NCT breastfeeding support services - the evidence (2010). This finding has, additionally, had an impact far beyond the UK. It has been used to help underpin the South African government's policy on breastfeeding (see SA Breastfeeding Program: Strategic and action plan 2007-2012). It is also referred to in several documents and public statements issued by Unicef UK on behalf of the Baby Friendly Initiative, a worldwide programme of the World Health Organization and Unicef.

Thanks to MCS and this research, mothers have more information and guidance about the health benefits of breastfeeding for their children.

Outputs:

Output from the data received from NHS Digital
The addresses previously obtained from NHS Digital for other studies were used to invite study members to take part and re-engage with those studies. For this study, the addresses will be used to invite participants to take part in the upcoming survey. Using addresses provided by NHS Digital has helped CLS get in touch with cohort members who would otherwise not be able to take part in a new survey.

The main outcome for the study is the next sweep, MCS age 22, provisionally planned to take place in 2023, which will be a fully documented, anonymised research dataset and this will be archived with the UK Data Service to provide a strategically important resource for UK Social Science, including researchers in health and social care.

The scientific priorities and questionnaire content of the next sweep (age 22) will be elaborated and developed in consultation with the academic and policy community with the aim of collecting both information relevant to their lives at age 22 and to later life outcomes, as well as repeat measures of topics covered at age 17. CLS will continue to prospectively harmonise the content with other comparable cohorts, particularly those in the UK, by drawing on comparable measures at a similar age. All surveys are overseen by the CLS Strategic Advisory Board (SAB) which contains representatives from UKRI, Wellcome Trust, Medical Research Council, the scientific community, and government departments. The SAB provide high level strategic oversight for CLS to ensure the cohort studies led by the centre are developed, managed, and maintained in a manner that maximises their benefit as long-term scientific resources of importance both nationally and internationally, while protecting participants' interests. The SAB ensure that the content is closely aligned with research priorities, as well as the areas of research interest (ARIs) published by government departments. CLS will reflect these priorities when deciding on the major themes which CLS intend to cover at the next sweep collection at age 22 which will certainly include health themes such as Mental health and wellbeing, including psychological distress and anxiety, mental wellbeing, life satisfaction, loneliness, coping mechanisms. Physical health and health behaviours, including weight, substance use, sleep, diet and exercise will also be included.

Research using this data often features in the news, potentially reaching policy making communities in this way. For example: https://www.theguardian.com/society/2020/sep/17/children-living-in-more-costly-homes-have-fewer-mental-health-problems-study. Initial findings from the survey are shared on the study website (https://childnc.net/initial-findings-from-the-age-17-survey/) and are also shared with participants. Similar outputs are expected for the current project. This will encourage engagement with the public, the scientific and the policy-making communities.

CLS will continue to produce outputs from the study via the UK Data Service in the form of aggregated reports for the benefit of the wider research community, as MCS data has proven to be sought after across a wide range of research areas. CLS will also publish papers in a range of journals; however, it is not possible to provide detail at this point as to precisely which journals and dates, but the intention is to produce outputs along the same lines as those produced after previous sweeps. All scientific papers using the MCS data are published on the CLS Bibliography page online.
https://www.bibliography.cls.ucl.ac.uk/Bibliography.aspx?sitesectionid=647&sitesectiontitle=Bibliography

Processing:

All organisations party to this agreement must comply with the Data Sharing Framework Contract requirements, including those regarding the use (and purposes of that use) by "Personnel" (as defined within the Data Sharing Framework Contract i.e.: employees, agents and contractors of the Data Recipient who may have access to that data).

NHS address tracing and matching variables:
NHS address tracing. CLS wish to use the demographics products to receive updated addresses regularly.

CLS will supply NHS Digital with a file with study members to match to the NHS data. The file supplied will only contain eligible study members who have participated in at least one wave of MCS. It will not include study members known to have died or to have withdrawn from the study. The file will contain the following data items:
- CLS identifier
- First name
- Last name
- Middle name (where available)
- Date of birth
- Gender
- Postcode
- NHS Number
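
For illustration only, the eligibility rules and data items listed above could be sketched as below. This is a minimal sketch under stated assumptions: the `StudyMember` type, the column names, the status flags and the CSV layout are all hypothetical and are not taken from CLS or NHS Digital systems.

```python
import csv
import io
from dataclasses import dataclass
from typing import Iterable, Optional

# Hypothetical column names mirroring the data items listed in the agreement.
COLUMNS = ["cls_id", "first_name", "last_name", "middle_name",
           "date_of_birth", "gender", "postcode", "nhs_number"]


@dataclass
class StudyMember:
    cls_id: str
    first_name: str
    last_name: str
    date_of_birth: str
    gender: str
    postcode: str
    nhs_number: str
    middle_name: Optional[str] = None   # supplied only where available
    waves_participated: int = 0         # hypothetical status fields
    deceased: bool = False
    withdrawn_from_study: bool = False


def is_eligible(member: StudyMember) -> bool:
    """Took part in at least one wave, not known to have died, not withdrawn."""
    return (member.waves_participated >= 1
            and not member.deceased
            and not member.withdrawn_from_study)


def build_matching_file(members: Iterable[StudyMember]) -> str:
    """Return CSV text containing only eligible members and only the listed items."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=COLUMNS, lineterminator="\n")
    writer.writeheader()
    for member in members:
        if is_eligible(member):
            writer.writerow({col: getattr(member, col) or "" for col in COLUMNS})
    return buffer.getvalue()
```

The status fields are used only to filter members and are deliberately left out of the output, so the file carries nothing beyond the items listed.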

NHS Digital would supply the following details to CLS:
- CLS identifier
- NHS number
- Data as stated in the Demographics extract.

No other data will be linked to the NHS Digital data received.

All those accessing the data supplied by NHS Digital are substantive employees of University College London or employees of subcontractor organisations (organisation to be appointed) carrying out work on behalf of UCL who have been appropriately trained in data protection and confidentiality.

At UCL, the NHS Digital data will be held on the secure server in the UCL Data Safe Haven (DSH) and accessed remotely by CLS staff. The UCL DSH is certified to ISO 27001:2013 and is compliant with NHS Digital's Data Security and Protection Toolkit. Staff using the DSH complete annual training and regularly review data access arrangements, ensuring access to data is limited to those authorised to access it. UCL Computing Regulations are based on the premise that access to resources is generally forbidden unless expressly permitted. All data transfers from the DSH require approval and are carried out through secure portals which are fully audited. Access to the UCL DSH is via remote desktop and requires multi-factor authentication. In addition to a strong password, each user has to use a six-digit number generated by a smartphone app or physical token at each login. Passwords must be changed at regular intervals, and unused accounts are automatically disabled after a fixed period. Once inside the environment, robust access control ensures that researchers can only examine information that they are approved to use.

The data file supplied by NHS Digital will be reviewed by CLS. Where addresses supplied by NHS Digital are new or more recent than the address currently held on the CLS confidential database, the new addresses will be uploaded. CLS will write to all newly traced cohort members at the addresses that are supplied by NHS Digital and will ask them to confirm their address by return of a reply slip, telephone, email or via the CLS website. For this purpose, CLS will send names and addresses to the mailing house company (to be appointed) so they can send correspondence to cohort members on behalf of CLS. If cohort members confirm their address, this will be recorded on the CLS database as a confirmed address. If the letter is returned to sender, this will also be recorded on the CLS database. There will also be cases where no confirmation is received and the CLS letter is not returned to sender.
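
The review and confirmation steps described above can be pictured with a short sketch. It is illustrative only: the `Address` type, the `effective_from` date, the status names and the reading of "new or more recent" (no address held, or a different address with a later date) are assumptions made for the example, not a description of the CLS database.

```python
from dataclasses import dataclass
from datetime import date
from enum import Enum
from typing import Optional


class AddressStatus(Enum):
    CONFIRMED = "confirmed"                    # cohort member confirmed the address
    RETURNED_TO_SENDER = "returned_to_sender"  # letter came back undelivered
    UNCONFIRMED = "unconfirmed"                # no reply and letter not returned


@dataclass
class Address:
    text: str
    effective_from: date  # hypothetical "address valid from" date


def should_upload(held: Optional[Address], supplied: Address) -> bool:
    """Upload when nothing is held, or the supplied address differs and is more recent."""
    if held is None:
        return True
    if supplied.text == held.text:
        return False
    return supplied.effective_from > held.effective_from


def record_outcome(confirmed: bool, returned_to_sender: bool) -> AddressStatus:
    """Map the result of the confirmation letter onto one of the three cases."""
    if confirmed:
        return AddressStatus.CONFIRMED
    if returned_to_sender:
        return AddressStatus.RETURNED_TO_SENDER
    return AddressStatus.UNCONFIRMED
```

Keeping the third status explicit reflects the case noted above where no confirmation is received and the letter is not returned.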

CLS will use a mailing house organisation (to be appointed) to send correspondence to participants inviting them to re-engage with the study. Furthermore, addresses will be used by CLS to invite study members to take part in the current survey and future surveys. These new/more recent addresses will also be shared with the fieldwork agency (to be appointed) for the purpose of inviting participants to take part in the upcoming survey. Once the mailing organisation is appointed, CLS will submit an amendment to add the new processor to this application. Only after receiving NHS Digital approval of the new processor will CLS share any data with them.

Centre for Longitudinal Studies Birth Cohort Studies Data Linkage: National Child Development Study — DARS-NIC-49297-Q7G1Q

Opt outs honoured: No

Legal basis: Informed Patient consent to permit the receipt, processing and release of data by the HSCIC, Health and Social Care Act 2012 s261(2)(c); Informed Patient consent to permit the receipt, processing and release of data by NHS Digital, Health and Social Care Act 2012 s261(2)(c)

Purposes: No (Academic)

Sensitive: Non Sensitive, and Non-Sensitive, and Sensitive

When:DSA runs 2017-05 – 2020-04 2017.12 — 2024.08. SMLS reported a DPA serious incident; breached contract — audit report.

Access method: One-Off

Data-controller type: UNIVERSITY COLLEGE LONDON (UCL)

Sublicensing allowed: No, Yes

AGD/predecessor discussions: AGD Minutes - 27 June 2024 final.pdf, AGD minutes - 15 June 2023 final.pdf, IGARD Minutes - 13 January 2022 final.pdf, igard-minutes---6-aug-2020-final.pdf, igardminutes-1stoctober2020final.pdf, IGARD_Minutes_20.07.17.pdf, AGD Draft minutes - 9 May 2024 final.pdf, IGARD Minutes - 26 January 2023 final.pdf, IGARD Minutes - 26 August 2021 final.pdf, IGARD Minutes - 29 July 2021 - FINAL.pdf, IGARD Minutes - 20th May 2021 final.pdf, igard-minutes---3rd-september-2020-final.pdf, igardminutes-21stjanuary2021final.pdf, igardminutes-14thjanuary2021final.pdf, IGARD_Minutes_10.08.17.pdf

Datasets:

Hospital Episode Statistics Admitted Patient Care
Hospital Episode Statistics Accident and Emergency
Hospital Episode Statistics Outpatients
Hospital Episode Statistics Critical Care
Emergency Care Data Set (ECDS)
Hospital Episode Statistics Accident and Emergency (HES A and E)
Hospital Episode Statistics Admitted Patient Care (HES APC)
Hospital Episode Statistics Critical Care (HES Critical Care)
Hospital Episode Statistics Outpatients (HES OP)
HES-ID to MPS-ID HES Accident and Emergency
HES-ID to MPS-ID HES Admitted Patient Care
HES-ID to MPS-ID HES Outpatients

Type of data: Anonymised - ICO Code Compliant, Identifiable

Objectives:

The Centre for Longitudinal Studies (CLS) is an Economic and Social Research Council (ESRC) Centre, based at the Department of Quantitative Social Science, UCL Institute of Education. It is responsible for three of Britain's internationally renowned birth cohort studies: the 1958 National Child Development Study, the 1970 British Cohort Study and the Millennium Cohort Study (MCS). All these studies are 'birth' studies, following the groups of participants from cradle to grave. As such, this group of studies is unique and has provided, and is still providing, a wealth of information used in the policy decisions affecting society's health and well-being. This application is for data to be linked to a subset of the 1958 National Child Development Study (NCDS).

In 1958 doctors and scientists were concerned at the high rate of infant death and ill health in Britain. There were an alarming number of stillbirths and children dying in the first few weeks of life. And so the National Child Development Study (NCDS) began as the Perinatal Mortality Survey. Nearly 17,500 babies were studied. Information was collected on the family background of the mother, her pregnancy and labour, and about her baby at birth and during its first week of life.

Seven years later it was decided that it would be worthwhile to find the families included in the original birth survey and see what had happened to the babies since they were born – how healthy they were, how they were getting on at school, and so on. This second survey was carried out in 1965. Since then there have been eight other major surveys, attempting to trace all those born in the week of the original 1958 survey – in 1969, 1974, 1981, 1991, 1999/2000, 2004/5, 2008/9 and most recently in 2013. In addition, a major ‘bio-medical’ survey took place in 2002/3.

During the 2008 (Aged 50) survey, CLS obtained informed consent from cohort members for their health data to be linked to the data collected in the study. In total consent was obtained from 6529 cohort members who at the time were in England.

Linking health data from Hospital Episodes Statistics (HES) to the NCDS survey data will greatly increase the possibilities for using the cohort to study how health outcomes impact on the individual and aspects of their life such as work, relationships and family life and, likewise, how health outcomes relate to individual behaviours and lifestyle choices such as drug and alcohol use, sexual health, diet and exercise, which are all documented as part of the study. The successful inclusion of HES data will enrich these data by revealing which cohort members have been admitted to or attended hospital and the reasons for this, e.g. drug and alcohol treatment, accident and emergency, maternity and mental health services, which could help CLS better understand how health conditions could be better treated or supported.

Data about health behaviours may be more accurate if obtained from administrative records, because survey responses can be affected by misreporting of complex health conditions, under-reporting of particular health problems, or perceived sensitivities around certain behaviours and lifestyle choices. This also offers a methodological opportunity to validate the data collected in the survey against the administrative data, and vice versa.

At this stage the aim of the research is to:
1. validate and improve the quality of the cohort data
2. produce methodological papers describing the quality of the data and its benefit to health and social care
3. develop and create a useful and rich HES linked NCDS Aged 50 dataset

Expected Benefits:

NCDS surveys include questions relating to health outcomes and hospitalisations. CLS will compare these responses with the corresponding data available in HES to obtain a better understanding of the relationship between self-reported and administrative data. This will be shared via methodological information which will assess the data quality and comparability of two important data sources. This will be of benefit to research investigating health and social care.

This data linkage will facilitate research that CLS anticipate will be carried out on the effects of familial socioeconomic circumstances, lifestyle and environmental factors on the evolution of the wellbeing, health and development of family members. This could be of direct benefit to the NHS, patients and to community services interfacing with schools through informing policy to improve healthy lifestyles.

Below are examples of existing publications using NCDS data benefiting public health in the areas of pregnancy health, birth, breastfeeding, vitamin D, obesity, diabetes, respiratory disease.

Delpierre, C., Fantin, R., Barboza-Solis, C., Lepage, B., Darnaudéry, M., and Kelly-Irving, M. (2016) The early life nutritional environment and early life stress as potential pathways towards the metabolic syndrome in mid-life? A lifecourse analysis using the 1958 British Birth cohort. BMC Public Health. 2016 Aug 18; 16(1):815. Epub 2016 Aug 18.

LLEWELLYN, A, SIMMONDS, M, OWEN, C.G and WOOLACOTT, N. (2016) Childhood obesity as a predictor of morbidity in adulthood: a systematic review and meta-analysis. Obesity Reviews, 17(1), 56-67.

BARBOSA-SOLÍS, C, KELLY-IRVING, M, FANTIN, R, DARNAUDÉRY, M, TORRISANI, J, LANG, T and DELPIERRE, C. (2015) Adverse childhood experiences and physiological wear-and-tear in midlife: Findings from the 1958 British birth cohort. Proceedings of the National Academy of Sciences of the United States of America, 112(7), E738–E746.

BERRY, D.J, HESKETH, K, POWER, C and HYPPONEN, E. (2011) Vitamin D status has a linear association with seasonal infections and lung function in British adults. British Journal of Nutrition, 106(9), 1433-1440.

MONTGOMERY, S.M and EKBOM, A. (2002) Smoking during pregnancy and diabetes mellitus in a British longitudinal birth cohort. British Medical Journal, 324, 26-27.

In its nearly sixty years of research the NCDS cohort has been responsible for proving beyond doubt that mothers who smoked heavily during pregnancy harmed the health and reduced the weight and height of their children, continuing on to damage English and maths scores at 16 years old. The study also informed the debate about the best place to deliver babies, indicating that mothers should only opt for home births when very early transfer to hospital is possible at the first sign of need and where highly experienced midwives and doctors are available. The study repeatedly demonstrated the need for steps to promote the health of pregnant mothers and facilities for safe childbirth. This led to the modernisation of maternity services, with ready availability of high quality obstetrics on the one hand and better and more personal care for all on the other. The case was made for adequate numbers of hospital beds and abolition of the lottery of where to give birth. Research has also made use of the longitudinal nature of the NCDS to examine the long-term effects of breastfeeding. For example, Rudnicka et al (2007) demonstrate that, compared with those who were bottle-fed with formula milk, children who were breastfed for more than a month had a reduced waist circumference and waist/hip ratio, and lower odds of obesity as adults in their mid-forties.

RUDNICKA, A. R, OWEN, C. G and STRACHAN, D. P. (2007) The effect of breast feeding on cardio-respiratory risk factors in adulthood. Pediatrics, 119(5), E1107-15.

Delpierre, Fantin, Barboza-Solis, Lepage, Darnaudéry and Kelly-Irving (2016) examined the influence of both the early nutritional environment and the psychosocial environment on the subsequent risk of metabolic syndrome (MetS) in midlife. Early nutritional environment, represented by mother’s pre-pregnancy BMI, was associated with the risk of MetS in midlife. An important mechanism involves a mother-to-child BMI transmission, independent of birth or perinatal conditions, socioeconomic characteristics and health behaviors over the lifecourse. However, this mechanism was not sufficient for explaining the influence of mother’s pre-pregnancy BMI, which implies the need to further explore other mechanisms, in particular the role of genetics and early nutritional environment. Adverse Childhood Experiences (ACEs) (identified through categories such as child in care, physical neglect, offenders, parental separation, mental illness, alcohol abuse) were not independently associated with MetS. However, the authors suggest that other early life stressful events such as emergency caesarean deliveries and poor socioeconomic status during childhood may contribute as determinants of MetS (Delpierre, C., Fantin, R., Barboza-Solis, C., Lepage, B., Darnaudéry, M., and Kelly-Irving, M. (2016) The early life nutritional environment and early life stress as potential pathways towards the metabolic syndrome in mid-life? A lifecourse analysis using the 1958 British Birth cohort. BMC Public Health. 2016 Aug 18; 16(1):815. Epub 2016 Aug 18).

Early negative circumstances during childhood, collected prospectively in the British birth cohort 1958, could be associated with physiological wear-and-tear in midlife as measured by allostatic load. This relationship was largely explained by health behaviors, body mass index, and socioeconomic status in adulthood, but not entirely. The results suggested that a biological link between adverse childhood exposures and adult health may be plausible. The authors’ findings contribute to the development of more adapted public health interventions, both at a societal and individual level (BARBOSA-SOLÍS, C, KELLY-IRVING, M, FANTIN, R, DARNAUDÉRY, M, TORRISANI, J, LANG, T and DELPIERRE, C. (2015) Adverse childhood experiences and physiological wear-and-tear in midlife: Findings from the 1958 British birth cohort. Proceedings of the National Academy of Sciences of the United States of America, 112(7), E738–E746).

In a meta-analysis that included the NCDS, Llewellyn, Simmonds, Owen, and Woolacott (2016) investigated the ability of childhood body mass index (BMI) to predict obesity-related morbidities in adulthood. The authors found that high childhood BMI was associated with an increased incidence of adult diabetes, coronary heart disease (CHD) and a range of cancers, but not stroke or breast cancer. The accuracy of childhood BMI to predict any adult morbidity was low. Only 31% of future diabetes and 22% of future hypertension and CHD occurred in children aged 12 or over classified as being overweight or obese. Only 20% of all adult cancers occurred in children classified as being overweight or obese. Childhood obesity was associated with moderately increased risks of adult obesity-related morbidity, but the increase in risk was not large enough for childhood BMI to be a good predictor of the incidence of adult morbidities, as the majority of adult obesity-related morbidity occurred in adults who were of healthy weight in childhood. Therefore, the authors suggest, targeting obesity reduction solely at obese or overweight children may not substantially reduce the overall burden of obesity-related disease in adulthood (LLEWELLYN, A, SIMMONDS, M, OWEN, C.G and WOOLACOTT, N. (2016) Childhood obesity as a predictor of morbidity in adulthood: a systematic review and meta-analysis. Obesity Reviews, 17(1), 56-67).

Using cross-sectional data from the NCDS biomedical survey, Berry, Hesketh, Power and Hypponen (2011) found that vitamin D status had a linear relationship with respiratory infections and lung function, but randomised controlled trials are warranted to investigate the role of vitamin D supplementation on respiratory health and to establish the underlying mechanisms (BERRY, D.J, HESKETH, K, POWER, C and HYPPONEN, E. (2011) Vitamin D status has a linear association with seasonal infections and lung function in British adults. British Journal of Nutrition, 106(9), 1433-1440).

Montgomery and Ekbom (2002) tested the hypothesis that maternal smoking during pregnancy increases both the risk of early onset type 2 diabetes and nondiabetic obesity in offspring. The association of diabetes with maternal smoking during pregnancy (independent of finer-grain measures of mothers' smoking in 1974, own smoking at age 16, and other potential confounding factors) suggested that it is a true risk factor for early adult onset diabetes. Cigarette smoking as a young adult was also independently associated with an increased risk of subsequent diabetes.

In utero exposures due to smoking during pregnancy may increase the risk of both diabetes and obesity through programming, resulting in lifelong metabolic dysregulation, possibly due to fetal malnutrition or toxicity. The odds ratios for obesity without type 2 diabetes are more modest than those for diabetes and the scope for confounding may be greater. Smoking during pregnancy may represent another important determinant of metabolic dysregulation and type 2 diabetes in offspring. The authors stress that smoking during pregnancy should always be strongly discouraged (MONTGOMERY, S.M and EKBOM, A. (2002) Smoking during pregnancy and diabetes mellitus in a British longitudinal birth cohort. British Medical Journal, 324, 26-27).

Research using this cohort has also shed light on cancer and leukaemia in childhood, behavioural disorder, educational delay and disability.

Linking Hospital Episode Statistics (HES) to the NCDS survey data will greatly increase the potential of this unique dataset, which has already been benefiting health outcomes for nearly 60 years. Our society is changing fast. This cohort study will be used to chart and understand how society has changed over the years, and how life experiences differ for each generation. It helps in understanding the impact of societal trends such as the ageing population and the growth in lone-parent and step-families, and changes such as growing employment insecurity. Evidence from this cohort study has contributed to many policy decisions in diverse areas, such as increasing the duration of maternity leave, raising the school leaving age and updating the breastfeeding advice given to parents.

Further examples of benefits to health can be found on the study website at https://ncds.info/home/what-have-we-learned/

Outputs:

Following the data quality and validation work, the first output will be the creation of the linked NCDS/HES dataset. The HES data will add an important layer to this already rich data as well as providing the means for data quality checking.

The second output will be methodological papers published in peer reviewed journals reviewing the linkage and validating the data from the two data sources. These methodological assessments are expected to finish two years after obtaining the data. Outputs will contain only aggregate level data with small numbers suppressed in line with HES analysis guide.
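The small-number suppression mentioned above can be illustrated with a minimal sketch. The function name, the threshold of 7 and the choice to leave zero counts visible are assumptions made for illustration only; the rule that actually applies should be taken from the HES analysis guide.

```python
from typing import Dict, Optional


def suppress_small_counts(
    counts: Dict[str, int], threshold: int = 7
) -> Dict[str, Optional[int]]:
    """Replace small non-zero counts with None before publishing a table.

    `threshold` is an assumed parameter, not a value stated in this
    agreement. Zero counts are left as they are (also an assumption).
    """
    published: Dict[str, Optional[int]] = {}
    for category, n in counts.items():
        if 0 < n <= threshold:
            published[category] = None  # suppressed cell
        else:
            published[category] = n
    return published


# Made-up aggregate counts, for illustration only.
example_table = {"asthma": 3, "fracture": 12, "other": 0}
published_table = suppress_small_counts(example_table)
```

A real release would also need to consider secondary suppression, where a suppressed cell could be recovered from row or column totals; that is outside this sketch.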

The creation of this HES/NCDS Aged 50 database and the methodological papers are the first steps in establishing a robust research database which will be of benefit to health and social care. The onward sharing to researchers via an agreed mechanism will be subject to a further application to NHS Digital.

The outputs in the long term from this dataset are difficult to quantify, but the CLS currently has a searchable bibliography on its website with over 3,600 publications based on data from the 1958, 1970, Next Steps and Millennium Cohort studies.

CLS actively promotes the use of their data among the research community through publications and events, as well as providing extensive documentation, guidance, training and workshops on each data set to help researchers better use the data and so ultimately benefit health and social care.

Processing:

Only individuals, working under appropriate supervision on behalf of data controller(s) / processor(s) within this agreement, who are subject to the same policies, procedures and sanctions as substantive employees will have access to the data and only for the purposes described in this document.

Identifiers will be held separately from attribute characteristics. HES data will not be re-linked to the identifiable data which is held separately from the survey responses. Re-identification will only happen at the occasion of a request, made from a cohort member, for withdrawal from the study, and this includes removal of data. Where a participant wishes to withdraw from the study, the identifiable data is used to locate the study ID which is used to destroy the data.

1. CLS team will supply NHS Digital with identifiers of cohort members who have consented to this data linkage, including full name, sex, postcode, date of birth and unique ID (study-specific pseudonymised identifier).

2. NHS Digital will link the identifiable study data to HES data. NHS Digital will then remove identifiers from the linked dataset and return it to the CLS team at UCL with the study ID.

3. CLS will carry out validation of the administrative data received (linked HES data) and will combine the supplied administrative data with the information collected from the participant as part of the NCDS study using the study ID.

Once the linked survey-administrative data files have been created, CLS may perform other activities to prepare the data for use by other researchers, such as coding and cleaning, derivation of summary variables and compilation of data documentation but as above.

4. CLS researchers will use these data to create an analysis file that will not contain any identifiable data.

5. CLS will create derived variables that summarise study members’ hospitalisation and health histories (e.g. hospital admissions and re-admissions, incidence of common diseases, children’s ailments etc.), and will compare NCDS survey data with data from hospital statistics, in order to compare and validate the data collected in CLS surveys.
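The data flow in steps 1 to 4 above can be sketched as follows. This is an illustration under stated assumptions, not the procedure NHS Digital or CLS actually run: the class and function names, the field names, the exact-match rule on all four identifiers and the example records are all hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Identifiers:
    """Step 1: identifiers CLS supplies for a consenting cohort member."""
    study_id: str  # study-specific pseudonymised identifier
    full_name: str
    sex: str
    postcode: str
    date_of_birth: str


def link_and_deidentify(cohort: List[Identifiers], hes_records: List[dict]) -> List[dict]:
    """Step 2 stand-in: match HES rows to cohort members, then drop identifiers.

    Exact matching on all four fields is an assumption made for brevity.
    """
    lookup = {
        (p.full_name, p.sex, p.postcode, p.date_of_birth): p.study_id
        for p in cohort
    }
    linked = []
    for rec in hes_records:
        key = (rec["full_name"], rec["sex"], rec["postcode"], rec["date_of_birth"])
        study_id = lookup.get(key)
        if study_id is not None:
            # Only the study ID and attribute data leave this step.
            linked.append({
                "study_id": study_id,
                "admission_date": rec["admission_date"],
                "diagnosis": rec["diagnosis"],
            })
    return linked


def build_analysis_file(survey: Dict[str, dict], linked_hes: List[dict]) -> Dict[str, dict]:
    """Steps 3-4: combine survey responses and HES rows using the study ID."""
    analysis = {sid: {**answers, "hes_episodes": []} for sid, answers in survey.items()}
    for row in linked_hes:
        if row["study_id"] in analysis:
            episode = {k: v for k, v in row.items() if k != "study_id"}
            analysis[row["study_id"]]["hes_episodes"].append(episode)
    return analysis


# Entirely made-up example data.
cohort = [Identifiers("S001", "Example Person", "F", "AB1 2CD", "1958-03-05")]
hes_records = [
    {"full_name": "Example Person", "sex": "F", "postcode": "AB1 2CD",
     "date_of_birth": "1958-03-05", "admission_date": "2010-06-01", "diagnosis": "J45"},
    {"full_name": "Someone Else", "sex": "M", "postcode": "ZZ9 9ZZ",
     "date_of_birth": "1960-01-01", "admission_date": "2011-02-02", "diagnosis": "I10"},
]
survey = {"S001": {"self_reported_admission": True}}

linked = link_and_deidentify(cohort, hes_records)
analysis = build_analysis_file(survey, linked)
```

The point of the design is that the study ID is the only key shared between the identifier file and the analysis file, so the analysis file can be built and used without names, postcodes or dates of birth.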

UCL are prohibited from linking the identifiable data they hold with data disseminated from NHS Digital. The only exception to this condition would be where a participant wishes to withdraw from the study, the identifiable data would be used to locate the study id and then in turn to destroy the data.

UCL will not share the linked HES/NCDS Aged 50 data with third parties.

Centre for Longitudinal Studies Next Steps Data Linkage: Next Steps Age 25 Study — DARS-NIC-51342-V1M5W

Opt outs honoured: No

Legal basis: Informed Patient consent to permit the receipt, processing and release of data by the HSCIC, Informed Patient consent to permit the receipt, processing and release of data by NHS Digital, Health and Social Care Act 2012 s261(1) and s261(2)(b)(ii), Health and Social Care Act 2012 s261(2)(b)(ii), Health and Social Care Act 2012 s261(2)(c); National Health Service Act 2006 - s251 - 'Control of patient information'., Health and Social Care Act 2012 s261(2)(c)

Purposes: No (Academic)

Sensitive: Non Sensitive, and Non-Sensitive, and Sensitive

When:DSA runs 2017-03 – 2020-04 2017.09 — 2024.08. SMLS reported a DPA serious incident; breached contract — audit report.

Access method: One-Off

Data-controller type: UNIVERSITY COLLEGE LONDON (UCL)

Sublicensing allowed: No, Yes

AGD/predecessor discussions: AGD minutes - 21st November 2024 final.pdf, AGD Minutes - 27 June 2024 final.pdf, AGD Draft minutes - 9 May 2024 final.pdf, AGD minutes - 15 June 2023 final.pdf, AGD minutes - 30 March 2023 final.pdf, IGARD Minutes - 20th May 2021 final.pdf, igard-minutes---6-aug-2020-final.pdf, igard-minutes-31st-october-2019---final.pdf, igardminutes-1stoctober2020final.pdf, IGARD_Minutes_02.03.17.pdf, IGARD Draft Minutes 26th September FINAL.pdf, DAAG_Minutes_31.01.17.pdf, IGARD Minutes - 26 January 2023 final.pdf, IGARD Minutes - 26 August 2021 final.pdf, IGARD Minutes - 29 July 2021 - FINAL.pdf, IGARD_Minutes_09.02.17.pdf

Datasets:

Hospital Episode Statistics Outpatients
Hospital Episode Statistics Critical Care
Hospital Episode Statistics Admitted Patient Care
Hospital Episode Statistics Accident and Emergency
Emergency Care Data Set (ECDS)
Hospital Episode Statistics Accident and Emergency (HES A and E)
Hospital Episode Statistics Admitted Patient Care (HES APC)
Hospital Episode Statistics Critical Care (HES Critical Care)
Hospital Episode Statistics Outpatients (HES OP)
HES-ID to MPS-ID HES Accident and Emergency
HES-ID to MPS-ID HES Admitted Patient Care
HES-ID to MPS-ID HES Outpatients

Type of data: Anonymised - ICO Code Compliant, Identifiable

Objectives:

The Centre for Longitudinal Studies (CLS) is an academic resource centre responsible for producing and disseminating data resources for the scientific community. CLS manages three world-renowned birth cohort studies: the National Child Development Study 1958, the British Cohort Study 1970, and the Millennium Cohort Study 2000. It now also has the Next Steps cohort in its portfolio.

Next Steps is a longitudinal study following the lives of 16,000 people born in 1989/90, originally sampled from schools in England at age 13/14 years and initially managed by the Department for Education. Next Steps participants were interviewed annually between 2004 and 2010 and again in 2015/16 to map their journeys through education and transitions into adulthood and the labour market.

Next Steps is the largest and most detailed research study of its kind trying to understand the changing experiences of this generation. As such, Next Steps has already been highly valuable in informing policy decisions and in enhancing understanding of how specific Government policies can influence and shape the lives of young people. Next Steps data has also been widely used by academic researchers in the UK and elsewhere.

During the 2015/16 survey, CLS obtained informed consent from cohort members for their health data to be linked to the data collected in the study. In total consent was obtained from approximately 4941 cohort members.

Linking health data from Hospital Episode Statistics (HES) to the Next Steps survey data will greatly increase the possibilities for using the cohort to study how health outcomes affect the individual and aspects of their life such as work, relationships and family life and, likewise, how health outcomes relate to individual behaviours and lifestyle choices such as drug and alcohol use, sexual health, diet and exercise, which are all documented as part of the study. The successful inclusion of HES data will enrich these data by revealing which cohort members have been admitted to or attended hospital and the reasons for this (e.g. drug and alcohol treatment, accident and emergency, maternity and mental health services), which could help us better understand how health conditions could be better treated or supported.

Data about health behaviours may be more accurate when obtained from administrative records, because survey responses can be affected by misreporting of complex health conditions, under-reporting of particular health problems, or perceived sensitivities around certain behaviours and lifestyle choices. The linkage therefore also offers an interesting methodological opportunity to validate the data collected in the survey against the administrative records, and vice versa.

At this stage the aim of the researchers is to:

1. Validate and improve the quality of the cohort data
2. Produce methodological papers describing the quality of the data and its benefit to health and social care
3. Develop and create a useful and rich HES linked Next Steps dataset

Expected Benefits:

Next Steps surveys include questions relating to health outcomes and hospitalisations. CLS will compare these responses with the data available from HES to obtain a better understanding of the relationship between self-reported and administrative data. This will be shared via methodological information which will assess the data quality and comparability of two important data sources. This will benefit research looking at health and social care.

This data linkage will facilitate research that CLS anticipate will be carried out on the effects of familial socioeconomic circumstances, lifestyle and environmental factors on the evolution of the wellbeing, health and development of family members. This could be of direct benefit to the NHS and to community services interfacing with schools through informing policy to improve healthy lifestyles.

It is difficult to predict in advance the type of research question that might be put forward. Below are examples of existing publications using Longitudinal Study of Young People in England (LSYPE) data benefiting public health.

Calderwood, L., and Sanchez, C. (2016). Next Steps (formerly known as the Longitudinal Study of Young People in England). Open Health Data, 4(1), e2

Hale, D., and Viner, R. (2016). The correlates and course of multiple health risk behaviour in adolescence. BMC Public Health. 2016 May 31; 16: 458. doi: 10.1186/s12889-016-3120-z.

Semlyen, J., King, M., Varney, J., and Hagger-Johnson, G. (2016). Sexual orientation and symptoms of common mental disorder or low wellbeing: combined meta-analysis of 12 UK population health surveys. BMC Psychiatry. 2016 Mar 24;16: 67. doi: 10.1186/s12888-016-0767-z.

Symonds, J., Dietrich, J., Chow, A., and Salmela-Aro, K. (2016). Mental health improves after transition from comprehensive school to vocational education or employment in England: A national cohort study. Developmental Psychology, 52(4), 652-665

Chatzitheochari, S., Parsons, S., and Platt, L. (2015). Doubly Disadvantaged? Bullying Experiences among Disabled Children and Young People in England. Sociology, advance online access, 28 April 2015

Debell, D. (2015). Public Health for Children, Second Edition. London: CRC Press.

Hale, D., and Viner, R. (2015). Health in adolescence influences educational attainments and life chances: longitudinal associations in the Longitudinal Study of Young People in England (LSYPE). Archives of Disease in Childhood, 100 (Suppl.3), A210-A211.

Hatton, C., and Emerson, E. (2015). International Review of Research into Developmental Disabilities: Health Disparities and Intellectual Disabilities. London: Academic Press.

Department for Education, TNS BMRB. (2015). Second Longitudinal Study of Young People in England: Wave 1, 2013: Secure Access. [Data collection]. UK Data Service. SN: 7838, http://dx.doi.org/10.5255/UKDA-SN-7838-1.

Department for Education, NatCen Social Research (2013). First Longitudinal Study of Young People in England: Waves One to Seven, 2004-2010: Secure Access. [Data collection]. 2nd Edition. UK Data Service. SN: 7104, http://dx.doi.org/10.5255/UKDA-SN-7104-2.

Next Steps (formerly known as the Longitudinal Study of Young People in England - LSYPE) data is a resource with great potential for the research and policy community, and the information collected on health and its social determinants widens its potential value for health research and policy interventions.

Researchers currently have access to the LSYPE data and are able to apply and carry out research utilising the established link to benefit health and social care. Below are some examples of existing publications using LSYPE data (waves 1 to 7) benefiting public health.

Hale and Viner (2015), for example, examine longitudinally the causal pathways from poor adolescent health to low academic attainment and unemployment in young adulthood, and make recommendations for policy interventions to focus on improving outcomes for unhealthy adolescents.

Having a chronic condition, poor mental health and poor self-reported general health were assessed between ages 13 and 15. Outcome variables included poor academic performance (non-attainment of expected academic proficiency based on mandated school examinations) at age 16 and NEET status (not in education, employment or training) at age 19. The authors examined associations between health and subsequent outcomes, and conducted mediator analyses to assess the proportion of the association attributable to hypothesised mediators including school absences, classroom behaviour, truancy, social exclusion, health behaviours and psychological distress. The study revealed that poor mental and general health and long-term conditions predicted low educational attainment at age 16. Poor mental health and poor general health (but not long-term conditions) predicted unemployment. Social exclusion was a consistent mediating variable. Long-term absences mediated associations between general health and mental health and later outcomes whereas school behaviour, truancy and substance use were significant mediators for general health and mental health. Poor adolescent health disrupts educational and employment pathways. Due to the economic and social costs of educational underachievement and unemployment, policy interventions should focus on improving outcomes for unhealthy adolescents (Hale, D., and Viner, R. (2015). Health in adolescence influences educational attainments and life chances: longitudinal associations in the Longitudinal Study of Young People in England (LSYPE). Archives of Disease in Childhood, 100 (Suppl.3), A210-A211.).

The same authors, Hale and Viner (2016), examined the association between health risk behaviours (such as smoking, alcohol use, illicit drug use, delinquency and unsafe sexual behaviour) throughout adolescence (at ages 14, 16 and 19) and identified common risk factors for multiple risk behaviour (involvement in two or more risk behaviours) in late adolescence (at age 19), drawing attention to the need for a policy focus on preventing adolescent health risk behaviours.

All early risk behaviours were found to be associated with other risk behaviours at age 19. A number of sociodemographic, interpersonal, school and family factors at age 14 predicted risk behaviour and multiple risk behaviour at age 19. Past risk behaviour was a strong predictor of risk behaviour at age 19: those involved in multiple risk behaviour in early adolescence were far more likely to be multiple risk-takers at age 19, while many involved in only one form of risk behaviour in mid-adolescence did not progress to multiple risk behaviour (Hale, D., and Viner, R. (2016). The correlates and course of multiple health risk behaviour in adolescence. BMC Public Health. 2016 May 31; 16: 458. doi: 10.1186/s12889-016-3120-z.).

LSYPE data has also contributed to evidence on childhood disability. Chatzitheochari, Parsons and Platt (2015) enhanced the evidence on the experience of school bullying among disabled children, which is likely to have a strong negative impact on social and psychological outcomes in later life. The authors studied the relationship between bullying victimisation and childhood disability, and revealed an independent association of disability with bullying victimisation, suggesting a potential pathway to cumulative disability-related disadvantage and drawing attention to the school as a site of reproduction of social inequalities (Chatzitheochari, S., Parsons, S., and Platt, L. (2015). Doubly Disadvantaged? Bullying Experiences among Disabled Children and Young People in England. Sociology, advance online access, 28 April 2015).

Semlyen, King, Varney, and Hagger-Johnson (2016) studied the association between sexual orientation identity and poor mental health, and drew attention to the higher prevalence of poor mental health and low well-being among LGB adults in the UK when compared to heterosexuals. They highlighted the need for routine measurement of sexual orientation in health studies and administrative data in order to influence national and local policy development and service delivery. Their findings reiterate the need for local government, NHS providers and public health policy makers to consider how to address inequalities in mental health among these minority groups (Semlyen, J., King, M., Varney, J., and Hagger-Johnson, G. (2016). Sexual orientation and symptoms of common mental disorder or low wellbeing: combined meta-analysis of 12 UK population health surveys. BMC Psychiatry. 2016 Mar 24;16: 67. doi: 10.1186/s12888-016-0767-z.).

Symonds et al. (2016) analysed mental health at the school to work transition. The authors examined how adolescents’ anxiety, depressive symptoms, and positive functioning developed as they transferred from comprehensive school to further education, employment or training, or became NEET (not in education, employment or training), at age 16 years. Controlling for childhood achievement, socioeconomic status, ethnicity, and gender, the authors found that NEET adolescents had the largest losses in mental health. This pattern was similar to adolescents staying on at school who had increased anxiety and depression, and decreased positive functioning, after transition. In comparison, adolescents transferring to full time work, apprenticeships or vocational college experienced gains in mental health (Symonds, J., Dietrich, J., Chow, A., and Salmela-Aro, K. (2016). Mental health improves after transition from comprehensive school to vocational education or employment in England: A national cohort study. Developmental Psychology, 52(4), 652-665).

The Next Steps Age 25 survey broadens the information collected on health and well-being, including family relationships, employment, education and income, which will add to the potential of the data and future research in the area of health.

Outputs:

Following the data quality and validation work, the first output will be the creation of the linked Next Steps/HES dataset. The HES data will add an important layer to this already rich data as well as providing the means for data quality checking.

The second output will be methodological papers published in peer reviewed journals reviewing the linkage and validating the data from the two data sources. These methodological assessments are expected to finish two years after obtaining the data. Outputs will contain only aggregate level data with small numbers suppressed in line with HES analysis guide.

The creation of this HES/Next Steps database and the methodological papers are the first steps in establishing a robust research database which will be of benefit to health and social care. The onward sharing to researchers via an agreed mechanism will be subject to a further application to NHS Digital.

The outputs in the long term from this dataset are difficult to quantify, but the CLS currently has a searchable bibliography on its website with over 3,600 publications based on data from the 1958, 1970, Millennium cohort and Next Steps studies. CLS actively promotes the use of their data among the research community through publications and events, as well as providing extensive documentation, guidance, training and workshops on each data set to help researchers better use the data and so ultimately benefit health and social care.

Processing:

Data disseminated from NHSD to UCL will only be accessed by substantive employees of UCL and only for the purposes described in this document.

Identifiers will be held separately from attribute characteristics. HES data will not be relinked to the identifiable data which is held separately from the survey responses. Re-identification will only happen at the occasion of a request, made from a cohort member, for withdrawal from the study, and this includes removal of data. Where a participant wishes to withdraw from the study, the identifiable data is used to locate the study id, and then in turn destroy their data.

1. CLS team will supply NHS Digital with identifiers of cohort members who have consented to this data linkage, including full name, sex, postcode, date of birth and study ID (study-specific pseudonymised identifier).

2. NHS Digital will link the identifiable study data to HES data. NHS Digital will then remove identifiers from the linked dataset and return the dataset to the CLS team at UCL with the study ID.

3. CLS will carry out validation of the administrative data received (linked HES data) and will combine the supplied administrative data with the information collected from the participant as part of the Next Steps study using the study ID.

Once the linked survey-administrative data files have been created, CLS may perform other activities to prepare the data for use, such as coding and cleaning, derivation of summary variables and compilation of data documentation.

4. CLS researchers will use these data to create an analysis file that will not contain any identifiable data.

5. CLS will create derived variables that summarise study members’ hospitalisation and health histories (e.g. hospital admissions and re-admissions, incidence of common diseases, children’s ailments etc.), and will compare Next Steps survey data with data from hospital statistics, in order to compare and validate the data collected in CLS surveys.
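Step 5 above can be sketched as two small helpers: one that derives admission and re-admission counts per study member, and one that measures agreement between a self-reported hospitalisation flag and the corresponding HES flag using Cohen's kappa. The function names, the 30-day re-admission window, the use of admission dates only, and the choice of kappa as the agreement measure are all assumptions for illustration; the agreement does not specify how validation will be done.

```python
from datetime import date
from typing import Dict, List, Tuple


def admission_summary(
    admissions: Dict[str, List[date]], window_days: int = 30
) -> Dict[str, Tuple[int, int]]:
    """Return (number of admissions, number of re-admissions) per study ID.

    A re-admission is counted when an admission starts within `window_days`
    of the previous admission date (a simplification: discharge dates are
    ignored, and the 30-day default is an assumption).
    """
    summary = {}
    for study_id, dates in admissions.items():
        ordered = sorted(dates)
        readmissions = sum(
            1
            for previous, current in zip(ordered, ordered[1:])
            if (current - previous).days <= window_days
        )
        summary[study_id] = (len(ordered), readmissions)
    return summary


def cohens_kappa(self_report: Dict[str, bool], hes_flag: Dict[str, bool]) -> float:
    """Chance-corrected agreement between survey and HES flags."""
    ids = sorted(set(self_report) & set(hes_flag))
    n = len(ids)
    if n == 0:
        raise ValueError("no study IDs in common")
    observed = sum(self_report[i] == hes_flag[i] for i in ids) / n
    p_survey = sum(self_report[i] for i in ids) / n
    p_hes = sum(hes_flag[i] for i in ids) / n
    expected = p_survey * p_hes + (1 - p_survey) * (1 - p_hes)
    if expected == 1.0:
        return 1.0  # both sources are constant and identical
    return (observed - expected) / (1 - expected)


# Made-up data keyed by study ID only.
admissions = {"A": [date(2015, 1, 1), date(2015, 1, 20), date(2015, 6, 1)], "B": []}
summary = admission_summary(admissions)

self_report = {"A": True, "B": False, "C": True, "D": False}
hes_flag = {"A": True, "B": False, "C": False, "D": False}
kappa = cohens_kappa(self_report, hes_flag)
```

Both helpers work on the study ID alone, consistent with the requirement that the analysis file holds no identifiable data.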

Centre for Longitudinal Studies Birth Cohort Studies Data Linkage: 1970 British Cohort Study — DARS-NIC-49826-T0J7C

Opt outs honoured: No

Purposes: No (Academic)

Sensitive: Non Sensitive, and Non-Sensitive, and Sensitive

When:DSA runs 2017-03 – 2020-04 2017.06 — 2024.08. SMLS reported a DPA serious incident; breached contract — audit report.

Access method: One-Off

Data-controller type: UNIVERSITY COLLEGE LONDON (UCL)

Sublicensing allowed: No, Yes

AGD/predecessor discussions: AGD minutes - 21st November 2024 final.pdf, AGD Minutes - 27 June 2024 final.pdf, AGD Draft minutes - 9 May 2024 final.pdf, AGD minutes - 15 June 2023 final.pdf, IGARD Minutes - 13 January 2022 final.pdf, IGARD Minutes - 20th May 2021 final.pdf, igard-minutes---6-aug-2020-final.pdf, igardminutes-1stoctober2020final.pdf, IGARD_Minutes_02.03.17.pdf, IGARD Minutes - 26 January 2023 final.pdf, IGARD Minutes - 26 August 2021 final.pdf, IGARD Minutes - 29 July 2021 - FINAL.pdf, igard-minutes-13th-december-2018-final.pdf, igardminutes-14thjanuary2021final.pdf

Datasets:

Hospital Episode Statistics Accident and Emergency
Hospital Episode Statistics Admitted Patient Care
Hospital Episode Statistics Critical Care
Hospital Episode Statistics Outpatients
Emergency Care Data Set (ECDS)
Hospital Episode Statistics Accident and Emergency (HES A and E)
Hospital Episode Statistics Admitted Patient Care (HES APC)
Hospital Episode Statistics Critical Care (HES Critical Care)
Hospital Episode Statistics Outpatients (HES OP)
HES-ID to MPS-ID HES Admitted Patient Care
HES-ID to MPS-ID HES Outpatients

Type of data: Identifiable, Anonymised - ICO Code Compliant

Objectives:

The Centre for Longitudinal Studies (CLS) is an Economic and Social Research Council (ESRC) Centre, based at the Department of Quantitative Social Science, UCL Institute of Education. It is responsible for three of Britain's internationally renowned birth cohort studies: the 1958 National Child Development Study, the 1970 British Cohort Study and the Millennium Cohort Study (MCS). All these studies are 'birth' studies, following the groups of participants from cradle to grave. As such, this group of studies is unique and has provided, and is still providing, a wealth of information used in the policy decisions affecting society's health and well-being.

The 1970 British Cohort Study (BCS70) has its origins in the late 1960s, when there was a great deal of concern amongst doctors and others about the number of babies born with abnormalities, or dying very early in life. They decided to compare those mothers and babies who had problems, with those who did not in order to see what could be done about this issue. The simplest way to do this was to study all the babies born in one week. With the help of doctors, midwives, and health authorities throughout England, Wales and Scotland, this study was carried out in 1970.

Information was collected on the family background of the mother, her pregnancy and labour, and about her baby at birth and in the first week of life. Almost 17,500 babies were studied.

It was not for another 5 years that it was decided that it would be worthwhile trying to find the families from the original birth survey to see what had happened to the babies since 1970 – how healthy they were, how they were getting on at school, and so on. This second survey was carried out in 1975. Since then there have been seven other major surveys, attempting to trace all those born in the week of the original 1970 survey – in 1980, 1986, 1996, 1999/2000, 2004/5, 2008 and in 2012 when study members were aged 42.

During the 2012 survey, CLS obtained informed consent from cohort members for their health data to be linked to the data collected in the study. In total consent was obtained from 6181 cohort members who at the time were in England.

Linking health data from Hospital Episode Statistics (HES) to the BCS70 survey data will greatly increase the possibilities for using the cohort to study how health outcomes affect the individual and aspects of their life such as work, relationships and family life and, likewise, how health outcomes relate to individual behaviours and lifestyle choices such as drug and alcohol use, sexual health, diet and exercise, which are all documented as part of the study. The successful inclusion of HES data will enrich these data by revealing which cohort members have been admitted to or attended hospital and the reasons for this (e.g. drug and alcohol treatment, accident and emergency, maternity and mental health services), which could help improve understanding of how health conditions could be better treated or supported.

Data about health behaviours may be more accurate when obtained from administrative records, because survey responses can be affected by misreporting of complex health conditions, under-reporting of particular health problems, or perceived sensitivities around certain behaviours and lifestyle choices. The linkage therefore also offers a valuable methodological opportunity to validate the data collected in the survey against the administrative records, and vice versa.

At this stage the aim of the researchers is to:
1. Validate and improve the quality of the cohort data
2. Produce methodological papers describing the quality of the data and its benefit to health and social care
3. Develop and create a useful and rich HES linked (Age 42) BCS70 dataset

UCL will not link the identifiable data they hold with data disseminated from NHS Digital. The only exception would be where a participant wishes to withdraw from the study.

UCL will not share the linked HES/Age 42 BCS70 dataset with third parties.

Expected Benefits:

The BCS70 surveys include questions relating to health outcomes and hospitalisations. CLS will compare these responses with the data available from HES to obtain a better understanding of the relationship between self-reported and administrative data. This will be shared via methodological information which will assess the data quality and comparability of two important data sources. This will be of benefit to research looking at health and social care issues which in turn, through time and cost savings, will be of benefit to patients.

This data linkage will facilitate research that CLS anticipate will be carried out on the effects of familial socioeconomic circumstances, lifestyle and environmental factors on the evolution of the wellbeing, health and development of family members. This will be of direct benefit to the NHS and to community services such as those interfacing with schools through informing policy to improve healthy lifestyles.

It is difficult to predict in advance the type of research question that might be put forward. Below are four examples of existing publications using BCS70 data benefiting public health.

GREENE, G, GREGORY, A.M, FONE, D and WHITE, J. (2015) Childhood sleeping difficulties and depression in adulthood: the 1970 British Cohort Study. Journal of Sleep Research, 24(1), 19-23.

VINER, R.M. and TAYLOR, B. (2007) Adult outcomes of binge drinking in adolescence: findings from a UK national birth cohort. Journal of Epidemiology and Community Health, 61(10), 902-907.

CABLE, N, KELLY, Y, BARTLEY, M, SATO, Y and SACKER, A. (2014) Critical role of smoking and household dampness during childhood for adult phlegm and cough: a research example from a prospective cohort study in Great Britain. BMJ Open, 4(4), e004807.

SMITH, L, GARDNER, B, AGGIO, D and HAMER, M. (2015) Association between participation in outdoor play and sport at 10 years old with physical activity in adulthood. Preventive Medicine, 74(May 2015), 31–35.

The following paragraphs expand on the benefits of these existing publications using BCS70 data, drawing attention to how early life course experiences and exposures shape health outcomes into adulthood.

Greene, Gregory, Fone and White (2015), for example, investigated the relationship between childhood sleeping difficulties (at age 5) and depression in adulthood (age 34), to conclude that severe sleeping problems in childhood may be associated with increased susceptibility to depression in adult life. Adjusting for the potential confounding influences of maternal depression and sleeping difficulties, parental reports of severe sleeping difficulties at 5 years were associated with an increased risk of depression at age 34 years [odds ratio (OR) = 1.9, 95% confidence interval (CI) = 1.2, 3.2] whereas moderate sleeping difficulties were not (OR = 1.1, 95% CI = 0.9, 1.3). Further research, however, is needed to explore whether screening and the treatment of children for poor sleeping patterns might impact upon their mental health in adulthood.

Persistent sleep problems are an increasing health concern. In addition, poor sleep in adulthood has been linked with hypertension, diabetes, depression and obesity, as well as cancer and increased mortality (Colten and Altevogt, 2006). Therefore, successful identification and treatment of children with sleeping difficulties could, if the association identified by the authors is causal, pay large dividends across many aspects of health in the future (GREENE, G, GREGORY, A.M, FONE, D and WHITE, J. (2015) Childhood sleeping difficulties and depression in adulthood: the 1970 British Cohort Study. Journal of Sleep Research, 24(1), 19-23.).

Viner and Taylor (2007) studied outcomes in adult life (at age 30) of binge drinking in adolescence (at age 16). Adolescent binge drinking predicted an increased risk of adult alcohol dependence (OR 1.6, 95% CI 1.3 to 2.0), excessive regular consumption (OR 1.7, 95% CI 1.4 to 2.1), illicit drug use (OR 1.4, 95% CI 1.1 to 1.8), psychiatric morbidity (OR 1.4, 95% CI 1.1 to 1.9), homelessness (OR 1.6, 95% CI 1.1 to 2.4), convictions (OR 1.9, 95% CI 1.4 to 2.5), school exclusion (OR 3.9, 95% CI 1.9 to 8.2), lack of qualifications (OR 1.3, 95% CI 1.1 to 1.6), accidents (OR 1.4, 95% CI 1.1 to 1.6) and lower adult social class, after adjustment for adolescent socioeconomic status and adolescent baseline status of the outcome under study.
The authors note that these associations appear to be distinct from those associated with habitual frequent alcohol use, and that binge drinking may contribute to the development of health and social inequalities during the transition from adolescence to adulthood (VINER, R.M. and TAYLOR, B. (2007) Adult outcomes of binge drinking in adolescence: findings from a UK national birth cohort. Journal of Epidemiology and Community Health, 61(10), 902-907.).

The findings of Cable, Kelly, Bartley, Sato and Sacker (2014) from BCS70 data support current public health interventions for adult smoking and raise concerns about the long-term effects of a damp home environment on the respiratory health of children. The authors examined the associations between childhood exposures to smoking and household dampness (at age 10), and phlegm and cough in adulthood (29 years of age), and found that childhood smoking and exposure to marked household dampness at age 10 were associated with phlegm (childhood smoking: relative risk ratio (RRR)=1.45, 95% CI 1.02 to 2.05; dampness: RRR=2.05, 95% CI 1.07 to 3.91) and co-occurring cough and phlegm (childhood smoking: RRR=1.35, 95% CI 1.08 to 1.67; dampness: RRR=2.73, 95% CI 1.88 to 3.99), while exposure to two or more adult smokers in the household was associated with cough-related symptoms (cough only: RRR=1.28, 95% CI 1.04 to 1.58; phlegm and cough: RRR=1.32, 95% CI 1.06 to 1.64).

These associations were independent of adult smoking, childhood phlegm and cough, early social background and sex. Smoking at age 29 contributed to all symptom patterns; however, a substantial association between household dampness and co-occurring phlegm and cough suggests long-term detrimental effects of childhood environmental exposures.

The authors' findings support current public health interventions to reduce adult smoking, but also indicate that the management of childhood risk factors such as exposure to smoke (active or second-hand) and household dampness can be a way to prevent adults experiencing poor respiratory health (CABLE, N, KELLY, Y, BARTLEY, M, SATO, Y and SACKER, A. (2014) Critical role of smoking and household dampness during childhood for adult phlegm and cough: a research example from a prospective cohort study in Great Britain. BMJ Open, 4(4), e004807).

Smith, Gardner, Aggio and Hamer (2015) investigated whether active outdoor play and/or sports at age 10 is associated with sport/physical activity at age 42. Final adjusted Cox regression models showed that participants (n=6458) who often participated in sports at age 10 were significantly more likely to participate in sport/physical activity at age 42 (RR 1.10; 95% CI 1.01 to 1.19). Active outdoor play at age 10 was not associated with participation in sport/physical activity at age 42 (RR 0.99; 95% CI 0.91 to 1.07). The authors suggest that childhood activity interventions might best achieve lasting change by promoting engagement in sport rather than active outdoor play (Tammelin et al., 2003a, 2003b) (SMITH, L, GARDNER, B, AGGIO, D and HAMER, M. (2015) Association between participation in outdoor play and sport at 10 years old with physical activity in adulthood. Preventive Medicine, 74(May 2015), 31–35.)

To provide an example of the sorts of benefits to health that this linkage and use of this data may provide, it may be useful to be aware of the impact and benefit to health that the 1958 National Child Development Study (NCDS) cohort has made. This is a similar birth cohort that is still following its members today. In its nearly sixty years, research from this cohort has been responsible for proving beyond doubt that mothers who smoked heavily during pregnancy harmed the health and reduced the weight and height of their children, continuing on to damage English and maths scores at 16 years old. The study also informed the debate about the best place to deliver babies, indicating that mothers should only opt for home births when very early transfer to hospital is possible at the first sign of need and where highly experienced midwives and doctors are available. The study repeatedly demonstrated the need for steps to promote the health of pregnant mothers and facilities for safe childbirth. This led to the modernisation of maternity services, with ready availability of high quality obstetrics on the one hand and better and more personal care for all on the other. The case was made for adequate numbers of hospital beds and abolition of the lottery of where to give birth. Research has also made use of the longitudinal nature of the NCDS to examine the long-term effects of breastfeeding. For example, Rudnicka et al (2007) demonstrate that, compared with those who were bottle-fed with formula milk, children who were breastfed for more than a month had a reduced waist circumference and waist/hip ratio, and lower odds of obesity as adults in their mid-forties. Research using this cohort has also shed light on cancer and leukaemia in childhood, behavioural disorder, educational delay and disability.

RUDNICKA, A. R, OWEN, C. G and STRACHAN, D. P. (2007) The effect of breast feeding on cardio-respiratory risk factors in adulthood. Pediatrics, 119(5), E1107-15.

BCS70 data is a rich and unique resource for the research and policy community, and the information collected on health and its social determinants widens its potential value for health research and policy interventions. Linking health data from Hospital Episodes Statistics (HES) to the BCS70 survey data will greatly increase the potential of the data and future research in the area of health.

The validation of self-reported and hospital-reported outcomes will benefit health research by offering research methodologies that are quicker and more cost effective. This will be of benefit to patients who are the recipients of research, such as public health and medical interventions. For example, using health administration data such as HES could make the delivery of research more efficient and potentially more accurate; it may increase the volume of research, and it may ensure that research takes place into diseases which are currently difficult to fund.

Outputs:

Following the data quality and validation work, the first output will be the creation of the linked BCS70 (Age42)/HES dataset. The HES data will add an important layer to this already rich data as well as providing the means for data quality checking.

The second output will be methodological papers published in peer reviewed journals reviewing the linkage and validating the data from the two data sources. These methodological assessments are expected to finish two years after obtaining the data. Outputs will contain only aggregate level data with small numbers suppressed in line with HES analysis guide.
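The small-number suppression mentioned above can be illustrated with a minimal sketch. The threshold value, the marker character, and the example counts are all hypothetical; the actual rules are those set out in the HES analysis guide, which this sketch does not reproduce. It also covers only primary suppression and does not handle cases where a suppressed cell could be recovered from row or column totals.

```python
def suppress_small_counts(table, threshold, marker="*"):
    """Replace non-zero counts below `threshold` with `marker`.

    `threshold` is an illustrative parameter, not the value
    mandated by any particular guidance.
    """
    return {
        label: (marker if 0 < count < threshold else count)
        for label, count in table.items()
    }

# Hypothetical aggregate counts; not real data.
counts = {"admitted_once": 412, "admitted_twice": 57, "admitted_3plus": 4}
safe = suppress_small_counts(counts, threshold=5)
```

Here only the smallest cell is masked; the larger counts pass through unchanged.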

The creation of this database and the methodological papers are the first steps in establishing a robust research database which will be of benefit to health and social care. No onward sharing to researchers will take place. Any onward sharing will be subject to a further application to NHS Digital.

The outputs in the long term from this dataset are difficult to quantify, but the CLS currently has a searchable bibliography on its website with over 3,600 publications based on data from the 1958, 1970 and millennium cohort studies.

Processing:

Data disseminated from NHSD to UCL will only be accessed by substantive employees of UCL and only for the purposes described in this document.

HES data will not be relinked to the identifiable data, which is held separately from the survey response data. Re-identification will only happen when a cohort member requests withdrawal from the study, which includes removal of data. Where a participant wishes to withdraw from the study, the identifiable data is used to locate the study ID, and in turn to destroy their data.

1. The CLS team will supply NHS Digital with the following identifiers of cohort members who have consented to this data sharing: sex, postcode, date of birth, NHS number (if known) and unique ID (study-specific pseudonymised identifier).

2. NHS Digital will link the identifiable study data to HES data. NHS Digital will then remove identifiers from the linked dataset and return the dataset to the CLS team at UCL with the study ID.

3. CLS will carry out validation of the linked HES data and will combine the supplied HES data with the information collected from the participant as part of the BCS70 study.

Once the linked survey-HES data files have been created, CLS may perform other activities to prepare the data for use in research, such as coding and cleaning, derivation of summary variables and compilation of data documentation.

4. CLS researchers will use these data to create an analysis file that will not contain any identifiable data.

5. CLS will create derived variables that summarise study members’ hospitalisation and health histories (e.g. hospital admissions and re-admissions, incidence of common diseases, children’s ailments etc.), and will compare BCS70 survey data with data from hospital statistics, in order to compare and validate the data collected in CLS surveys.
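The flow in steps 1 and 2 above can be sketched as follows: identifiers and a study ID go in, records are linked, and only the study ID plus the health fields come back. This is an illustration under stated assumptions, not the linkage method actually used: it matches deterministically on NHS number alone, whereas the text lists several identifiers, and all field names and records are hypothetical.

```python
def link_and_deidentify(cohort, hes_records):
    """Link cohort members to hospital records, then drop identifiers.

    Assumption: exact match on NHS number only. The returned rows
    carry the study ID but no personal identifiers.
    """
    by_nhs = {}
    for rec in hes_records:
        by_nhs.setdefault(rec["nhs_number"], []).append(rec)

    linked = []
    for member in cohort:
        for rec in by_nhs.get(member["nhs_number"], []):
            linked.append({
                "study_id": member["study_id"],
                "admission_date": rec["admission_date"],
                "diagnosis": rec["diagnosis"],
            })
    return linked

# Hypothetical records; not real data.
cohort = [
    {"study_id": "S001", "nhs_number": "111", "sex": "F", "dob": "1970-04-05"},
    {"study_id": "S002", "nhs_number": "222", "sex": "M", "dob": "1970-04-09"},
]
hes_records = [
    {"nhs_number": "111", "admission_date": "2010-01-02", "diagnosis": "J45"},
    {"nhs_number": "111", "admission_date": "2012-06-30", "diagnosis": "S52"},
    {"nhs_number": "333", "admission_date": "2011-03-03", "diagnosis": "I10"},
]
linked = link_and_deidentify(cohort, hes_records)
```

Keeping the study ID as the only key in the returned rows is what allows the analysis file in step 4 to be built without identifiable data.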

Childhood Outcomes after Perinatal Brain Injury (Data flowing to DfE) — DARS-NIC-475526-F3Z5H

Opt outs honoured: Yes (Excuses: Section 251 NHS Act 2006)

Legal basis: Health and Social Care Act 2012 - s261(5)(d); National Health Service Act 2006 - s251 - 'Control of patient information'.

Purposes: No (Academic)

Sensitive: Non-Sensitive

When:DSA runs 2023-11 – 2024-10 2024.07 — 2024.07. SMLS reported a DPA serious incident; breached contract — audit report.

Access method: One-Off

Data-controller type: IMPERIAL COLLEGE LONDON, UNIVERSITY COLLEGE LONDON (UCL)

Sublicensing allowed: No

AGD/predecessor discussions: IGARD Minutes - 14 July 2022 final.pdf, IGARD Minutes - 30 September 2021 final.pdf

Datasets:

Demographics

Type of data: Identifiable

Yielded Benefits:

No data has been disseminated by NHS England for this research study. There are therefore no yielded benefits to date.

Expected Benefits:

This population study is hoped to provide the most complete picture of how children's lives are affected by perinatal brain injury, providing essential information to answer parents' questions accurately and in a meaningful family-centric manner. This information is intended to reshape clinical practice and facilitate optimum service planning within the NHS, to meet the needs of these children and their families through to adulthood, and ultimately improve their future health outcomes. An understanding of the sequelae of perinatal brain injury, specifically how and when children are affected, is expected to inform enhanced developmental surveillance across the NHS and enable the design of targeted multidisciplinary interventions to support children as needed. For example, premature infants (prone to inattention) can benefit from delayed school entry, Special Educational Needs (SEN) support, and educational packages raising awareness amongst educators of these specific challenges.

Stakeholder meeting with a view to impacting policy:
One of the project supervisors has a strong track record in bringing together key stakeholders on the issue of perinatal brain injury. Stakeholder meetings have previously been arranged with representatives from the Department of Health and Social Care, NHS Improvement, NHS England, neonatal doctors and nurses, and parent representatives; to discuss how perinatal brain injury should be defined in research. This project intends to utilise these existing connections to maximise the impact of this work.

UCL is looking to hold a stakeholder meeting in the final year of the fellowship, inviting representatives with whom the research team have established relationships, from the Department of Health and Social Care, the Department for Education, NHS England, BLISS, Meningitis Research Foundation, neonatal doctors and nurses, and parent representatives. It is planned that a Professor of Human Development from Oxford University with considerable experience in shaping social and education policy globally will be invited to join the stakeholder meeting, as well as a psychologist with considerable experience in designing educational interventions for preterm infants. The study results are intended to be discussed in this meeting alongside how these findings should shape policy and practice going forward. UCL additionally anticipates creating an action plan to determine an effective strategy to disseminate this information to teachers and other professionals within the education sector, in collaboration with its stakeholders. This is hoped to ensure widespread awareness across the education sector, and that long-lasting sustainable measures are in place or planned to support current and future children with perinatal brain injury, beyond completion of this fellowship.

Anticipated impact on neonatal care, society and NHS services:

Impact on neonatal care
Equip healthcare professionals with reliable information to counsel families (target date 2024).
Communication aids will facilitate meaningful family-centred conversations on the neonatal unit (target date 2024)
Help prepare families for their child's future and understand what additional support may be needed (long-term)
Encourage healthcare professionals to consider the long-term impact of various neonatal care decisions (long-term)

Impact on the NHS and policymakers
Help those involved in shaping policy, resource planning and service provision to make informed decisions about how to most effectively support these children whilst maximising the efficiency of services (target date 2024-25)
UCL findings are intended to inform national guidelines on follow-up after brain injury (target date 2024-25)

Impact on schooling and policymakers
Equip parents with important information about the academic impact of brain injuries to help them plan their child's future and support them with their educational needs (long-term)
Provide key information and education to teachers about how they can support children with perinatal brain injuries (long-term)
Help the Department for Education in determining resource allocation and the provision of additional educational support (long-term)

Outputs:

Academic outputs are hoped to include high-impact peer reviewed publications and international conference presentations. Findings are expected to be submitted for publication in high impact general medical journals, such as the New England Journal of Medicine, the British Medical Journal, and JAMA Pediatrics. The study results are intended to be presented at international conferences such as the Royal College of Paediatrics and Child Health annual conference, the King's Fund annual conference, and the Paediatric Academic Societies meeting in the USA.

Publications will be Open Access as per UCL policy, and freely available both on journal websites and via the UCL webpage. Outputs will contain only aggregate level data with small numbers suppressed in line with National Neonatal Research Database (NNRD), NHS England, Office for National Statistics (ONS) and Department for Education (DfE) policy and guidance. All data will be stored within the ONS secure research service (SRS) and all outputs from this server undergo independent checks by ONS staff to ensure outputs meet regulations and could not be deemed identifiable in any way.

Dissemination of the research findings to the public (parents who have children with a perinatal brain injury) is intended to be facilitated through existing collaborations with the Neonatal Data Analysis Unit (NDAU), BLISS (the charity for babies born sick or premature) and the Meningitis Research Foundation. UCL are also looking to create an infographic or information leaflet to improve communication of prognosis after perinatal brain injury between doctors and parents. Public dissemination is intended to include production of lay research reports publicised on the NDAU, BLISS, UCL and Meningitis Research Foundation websites. Research regarding neonatal outcomes has attracted a high level of media interest, and it is anticipated that this will be the case for the proposed study. UCL are acutely aware of the potential harmful effect of inaccurate or sensational reporting of research findings in this sensitive area, and the confusion and anxiety this can cause for affected families. UCL are planning to work closely with BLISS and Imperial College London to co-ordinate press releases and ensure that information is conveyed accurately and responsibly. BLISS and the Meningitis Research Foundation are also expected to publicise findings to their followers and the general public through their social media channels.

UCL will commence analysing the data as soon as it has been made available in the ONS SRS. It is anticipated that the process of data analysis, interpretation and report writing will take approximately 36 months, with papers submitted for publication in mid to late 2024.

Processing:

The study will involve the following data processing and linkage steps:

1. Infants meeting the Department of Health definition for perinatal brain injury will be identified within the National Neonatal Research Database (NNRD) (cohort 1, n = 54,733). This database contains care data for all neonates admitted to NHS neonatal units across England, Wales and Scotland. Its population coverage is internationally unique with 100% coverage since 2012 and high representative coverage since 2008. The premature infants (< 34 weeks gestation) in cohort 1 will be matched to a comparator group of infants within the NNRD (cohort 2, n = 24,612).
2. The pseudonymised neonatal care data for cohort 1 and 2 will be transferred to the ONS Secure Research Service (SRS) by Imperial College London.
3. Under DARS-NIC-342322-Q1N7M, the NNRD will transfer the minimum identifiers for the NNRD cohorts (1 and 2) to NHS England (NHS number, date of birth, sex and postcode at birth). The NNRD will also provide the birth weight, gestation (from 2015), and multiplicity status (i.e. twins, triplets etc) for the remaining children with gestation time > 34 weeks in cohort 1 to NHS England.
4. The un-matched infants in cohort 1 with perinatal brain injury will be matched in a 1:3 ratio, by NHS England, to a comparator group of infants, identified from Birth Notifications and Civil Registrations (Births) data to create a term control cohort (cohort 3, n = 90,363) (DARS-NIC-342322-Q1N7M).
5. All 3 cohorts will be linked to Civil Registrations (Deaths), Hospital Episode Statistics (HES) Admitted Patient Care (APC), HES Accident and Emergency (A&E), HES Outpatients and the Mental Health Services Data Set (MHSDS) up to December 31st 2020, by NHS England. The pseudonymised health outcomes and analysis covariates from the Births products for the three cohorts will be transferred from NHS England to the ONS SRS (DARS-NIC-342322-Q1N7M).
6. Under this Data Sharing Agreement (DARS-NIC-475526-F3Z5H), a file containing a list of personal identifiers (forename, surname, date of birth, sex, and postcodes) for linkage to the National Pupil Database (NPD) will be transferred from NHS England to the Department for Education (DfE). The NPD contains detailed information on the educational attainment, special educational needs and attendance of children at state schools across England between the ages of 5-18 years. A logic model, designed to maximise the chance of a reliable postcode match (given the variation over time), will be used. After linkage, all identifiers will be removed (only the unique study ID number will be retained) and these pseudonymised educational data will also be securely transferred for storage within the ONS SRS.
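The 1:3 matching described in step 4 can be sketched as below. The source does not state the matching variables, so the key used here (sex and birth year) is purely hypothetical, as are the record layout and the function name. The sketch draws controls at random without replacement from those sharing the case's key, using a fixed seed so the draw is repeatable.

```python
import random

def match_controls(cases, pool, key, ratio=3, seed=0):
    """Assign up to `ratio` controls per case, without replacement.

    `key` maps a record to its matching stratum; the choice of
    stratum is an assumption for illustration.
    """
    rng = random.Random(seed)
    available = {}
    for control in pool:
        available.setdefault(key(control), []).append(control)
    for group in available.values():
        rng.shuffle(group)

    matches = {}
    for case in cases:
        group = available.get(key(case), [])
        take = min(ratio, len(group))
        matches[case["id"]] = [group.pop()["id"] for _ in range(take)]
    return matches

# Hypothetical records; not real data.
cases = [
    {"id": "c1", "sex": "F", "year": 2012},
    {"id": "c2", "sex": "M", "year": 2013},
]
pool = (
    [{"id": f"f{i}", "sex": "F", "year": 2012} for i in range(4)]
    + [{"id": f"m{i}", "sex": "M", "year": 2013} for i in range(3)]
)
matches = match_controls(cases, pool, key=lambda r: (r["sex"], r["year"]))
```

Because each selected control is removed from its stratum, no control is assigned to more than one case; a case whose stratum runs short simply receives fewer than three controls.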

Identifiable information transferred to the DfE for matching will be controlled through secure access arrangements in line with DfE policy. Access is limited to a team of qualified (permanent DfE staff) data engineers employed on the maintenance and production of the National Pupil Database (NPD). All DfE staff accessing this data are cleared to levels in line with the department's vetting protocols and have Baseline Personnel Security Standard (BPSS), Disclosure and Barring Service (DBS) and Level 2 Non-Police Personnel Vetting (NPPV) check clearance. The DfE uses Microsoft Azure cloud hosting for the storage and processing of data, applying a combination of software and hardware controls which meet the ISO27001 standards and the Government Security Policy Framework. The Department's use of Microsoft Azure hosting has approval from the Cabinet Office and meets all the relevant guidelines for holding and processing personal and restricted data. This includes ensuring the systems comply with Data Protection Legislation and other relevant legislative obligations that apply to data rated at OFFICIAL-SENSITIVE.

Microsoft Limited Azure supply Cloud Services for the DfE and are therefore listed as a processor. They supply support to the system, but do not access data. Therefore, any access to the data held under this agreement would be considered a breach of the agreement. This includes granting of access to the database(s) containing the data. Microsoft Limited Azure servers are located within the EEA.

UCL researchers will only have access to pseudonymised data held within the ONS SRS under DARS-NIC-342322-Q1N7M. In order to access any data in the ONS SRS, all researchers will need to be ONS accredited and undergo data protection and confidentiality training. No data will be held by or at UCL. There will be no requirement or attempt to re-identify participants. Indeed, this would not be possible for UCL.

UCL research staff responsible for conducting the analysis for the project will complete the ONS Researcher Accreditation process, which involves specific training in the safe use of research data environments. They will sign and adhere to the ONS Accredited Researcher Declaration, and will be required to adhere to ONS data protection policies and procedures. All data to be transferred out of the SRS (the results of the analyses) will be checked by ONS staff to ensure that no individual level data, or potentially identifiable data, is transferred. Only aggregate level data with small number suppression will be transferred out of the SRS system for publication.

Data retention
The linkage keys used for the health and educational linkages will be securely held by NHS England and the Department for Education respectively. These will be retained for the duration of the agreement should further linkage be required (this will be requested separately to this version of the DSA). Only the pseudonymised dataset will be retained within ONS SRS to facilitate analysis by the UCL research team.

Using Large-scale Routine Data to Monitor and Improve Ethnic Inequalities in Cancer and Cardiovascular Disease ( ODR1920_301 ) — DARS-NIC-656874-T3L9D

Opt outs honoured: No (Excuses: Does not include the flow of confidential data)

Legal basis: Health and Social Care Act 2012 s261(2)(a), Other-The Health Service (Control of Patient Information) Regulations 2002- Regulation 2

Purposes: No (Academic)

Sensitive: Sensitive

When:DSA runs 2023-04 – 2023-12 2024.03 — 2024.03. SMLS reported a DPA serious incident; breached contract — audit report.

Access method: One-Off, Ongoing

Data-controller type: UNIVERSITY COLLEGE LONDON (UCL), UNIVERSITY COLLEGE LONDON (UCL), UNIVERSITY OF LEICESTER

Sublicensing allowed: No

AGD/predecessor discussions: AGD minutes - 6 June 2024 final.pdf, AGD minutes - 11 May 2023 final.pdf, IGARD Minutes - 3 November 2022 finalv1.pdf

Datasets:

NDRS Cancer Registrations
NDRS Linked HES AE
NDRS Linked HES APC
NDRS Linked HES Outpatient

Type of data: Anonymised - ICO Code Compliant

Objectives:

The project aims are to:
1. Investigate ethnicity reporting in cancer and cardiovascular diseases (CVD) data.
2. Characterise the burden of coexisting cancer and CVD in Black Minority Ethnic (BME) groups.

The objectives are:
· Look at the quality of ethnicity reporting in routine healthcare data.
· Determine the incidence and prevalence of co-existing cancer and CVD by ethnic group.

Yielded Benefits:

Data for this study has previously been shared when the data were controlled and managed by Public Health England (PHE). As such, there are some yielded benefits to be observed from the access to the data for the study prior to NHS Digital becoming data controller. These yielded benefits are noted below: a review of the impact of COVID-19 on multimorbid ethnic minority groups was conducted and published in the JACC cardio-oncology journal.

Expected Benefits:

The importance of the overlap between cancer and cardiovascular disease (CVD) is illustrated by the novel discipline of cardio-oncology which has emerged from the recognition that anti-cancer treatments (e.g. chemotherapy) can be associated with adverse CVD complications (e.g. heart failure). There are similarities at the level of risk factors (e.g. tobacco, obesity) which represent opportunities for shared prevention strategies. Better characterisation of coexisting cancer and CVD may lead to improvements in treatment and prevention of both diseases.
Research has found that ethnic minority groups living in white majority European countries have a higher prevalence of multi-morbidity (2 or more chronic illnesses) and have an earlier onset of multi-morbidity than their white counterparts.
Based on this evidence, there is a need for further investigation into the rates of individuals living with both cancer and CVD as well as looking into the occurrence of these co-morbidities across ethnic groups.
In both CVD and cancer, health inequalities for disease incidence, outcomes and treatment have been reported. Ethnicity health data in the UK has historically been inaccurate and there is a need for further research to determine where the gaps are and how it can be improved.

Characterising the overlap between cancer and cardiovascular disease in BME groups is beneficial to the health and social care system for the following reasons:

First, it will improve the public's health and wellbeing, by identifying health inequalities in individuals suffering from both cancer and CVD.

Second, it will improve population health through sustainable health and care services, by maximising the use of existing national audit data and other National Health Service (NHS) programmes to gain new insights to guide the planning of health services.

Moreover, enhancements to coding for individuals and healthcare utilisation by ethnicity are possible from this research.

Third, it will build the capacity and capability of the public health system, by highlighting which individuals and areas have the greatest need in overlapping multi morbidity for the two most common types of disease, by combining datasets from different health audits, for cancer and CVD.

Outputs:

The anticipated outputs for this project will be several recommendations and papers.
The results from the analysis looking at the quality of ethnicity coding in the data are anticipated to be published as a paper, together with several recommendations to improve the analytical capabilities of routinely collected ethnicity data and the coding terminology used in the collection of data. The results from the analysis looking at the incidence, hospitalisation and mortality rates for people with both cancer and cardiovascular disease will also be published as papers. Dissemination of the outputs can be done via 3 routes: media; scientific publications and/or presentations; and Patient and Public Involvement (PPI) engagement. The PhD outputs are also expected to be presented at relevant conferences, whether cardiology (e.g. European Society of Cardiology Annual Scientific Congress), public health or ethnicity and health (e.g. South Asian Health Foundation annual conference).
The expected target date for submission of this PhD project is 31 August 2023.

Processing:

Within this programme of work there are a range of research questions requiring a variety of analytical strategies. The research team includes epidemiologists and statisticians with a considerable track record of the analysis of similar large datasets.

Initially, we will perform preliminary analyses, which will be exploratory though focused around the key hypotheses, to better understand the different treatment processes for the subsets of patients. In additional preliminary analyses, we will then quantify effect sizes using simple logistic regression modelling techniques whilst appropriately accounting for potentially confounding covariates. Where appropriate we will also consider different study designs utilising the rich nature of the linked data resource, for example matched cohort studies where patients with both cardiovascular disease and cancer are matched to patients suffering from a single condition with similar covariate patterns. Many of the outcomes are of a time-to-event nature.

Examples include time to revascularisation, time to recurrence of cancer and time to death. For these analyses, we will use flexible parametric survival models in order to appropriately account for non-proportional hazards and to potentially account for competing risks where necessary. We will build on previous work utilising excess mortality modelling techniques to understand mortality associated with the diagnosis of multiple conditions [18]. Where necessary, we will use mixed effects models to account for the hierarchical nature of the data. The group have experience of quantifying the outputs from complex models in ways that are easily interpretable for a wide variety of audiences, for example the use of avoidable deaths [32], loss in expectation of life [18], and real-world probabilities [33] accounting for competing risks. We will investigate regional variation across the different analysis strategies and research questions where appropriate. We will utilise mapping techniques and funnel plots to present variation beyond that expected by chance.
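To illustrate what "variation beyond that expected by chance" means in a funnel plot, the following minimal sketch computes control limits around an overall rate and flags regions that fall outside them. It assumes a normal approximation to the binomial and approximate 95% limits (z = 1.96); the source does not specify how the limits will be constructed, and the regional figures are hypothetical.

```python
import math

def funnel_limits(overall_rate, n, z=1.96):
    """Approximate control limits for a proportion at sample size n.

    Uses the normal approximation to the binomial (an assumption).
    """
    se = math.sqrt(overall_rate * (1 - overall_rate) / n)
    return max(0.0, overall_rate - z * se), min(1.0, overall_rate + z * se)

def flag_outliers(regions, z=1.96):
    """Return the pooled rate and the names of regions outside the limits."""
    total_events = sum(r["events"] for r in regions)
    total_n = sum(r["n"] for r in regions)
    overall = total_events / total_n
    flagged = []
    for r in regions:
        lower, upper = funnel_limits(overall, r["n"], z)
        rate = r["events"] / r["n"]
        if rate < lower or rate > upper:
            flagged.append(r["name"])
    return overall, flagged

# Hypothetical regional counts; not real data.
regions = [
    {"name": "A", "events": 100, "n": 1000},
    {"name": "B", "events": 210, "n": 2000},
    {"name": "C", "events": 90, "n": 500},
]
overall, flagged = flag_outliers(regions)
```

The limits narrow as n grows, which gives the plot its funnel shape: a small region needs a much larger deviation from the pooled rate before it is flagged.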

This study will not recruit patients but will use existing pseudonymised national audit data for the purposes of research.

UK Early Life Cohort Feasibility Study (ELC-FS) — DARS-NIC-482185-K8G0F

Opt outs honoured: Yes (Excuses: Section 251 NHS Act 2006)

Legal basis: Health and Social Care Act 2012 - s261(5)(d), National Health Service Act 2006 - s251 - 'Control of patient information'.

Purposes: No (Academic)

Sensitive: Sensitive

When:DSA runs 2023-05 – 2024-04 2023.05 — 2024.02. SMLS reported a DPA serious incident; breached contract — audit report.

Access method: One-Off, Ongoing

Data-controller type: UNIVERSITY COLLEGE LONDON (UCL)

Sublicensing allowed: No

AGD/predecessor discussions: AGD minutes - 11th September 2025 final.pdf, AGD minutes - 20 April 2023 final.pdf, AGD minutes - 2 March 2023 final.pdf, IGARD Minutes - 1 December 2022 final.pdf

Datasets:

Birth Notification Data
Civil Registration - Births
Demographics

Type of data: Identifiable

Objectives:

University College London requires access to NHS England data for the purpose of the following research project: The Early Life Cohort Feasibility Study (ELC-FS).

The following is a summary of the aims of the research project provided by University College London:

The ELC-FS will test proof of concept for a new national birth cohort study for the UK. It will collect rich data on babies born across the UK during two consecutive months of 2022 or 2023, capturing the economic and social environments into which these babies are born, and their health, well-being and development in their first 6-10 months. The study will provide data of substantive value in itself, providing vital evidence on new lives across the UK at a critical time, particularly with regards to the shock to health and the economy induced by the COVID-19 pandemic, as well as the as yet unknown impacts of Brexit on our economy and society. It will highlight major sources of early developmental inequalities and family stressors, and identify potential foci for early intervention and support.

NHS England data will be used for the recruitment and sampling elements of the study, including sampling participants in England and Wales for the study from birth registrations linked to birth notifications and using an opt-out approach to taking part in the study following first contact with the sample.

The primary scientific aim of the study will be to understand how variability, and particularly inequalities, in the central domains of early child development emerge over time and to determine the social and biological factors influencing their trajectories. The study will use the ethnicity fields to boost the England sample. Although for this study it is unlikely to be practicable to include boost samples based on the other fields, securing access to these variables is important to demonstrating the future feasibility of this, including for Wales (and Scotland and Northern Ireland under separate applications to relevant controllers). The study also proposes to use additional fields for targeted fieldwork approaches to maximise engagement and inclusivity. Access to the individual-level characteristics by the Centre for Longitudinal Studies (CLS) also provides possibilities for wider use of records for methodological research, including non-response analysis to assess population representivity, weighting and adjustment.

NHS England will draw a sample of registrations of births linked to NHS birth notifications for the two selected birth months. The initial source of the sampling fields would be (i) the mother who has registered a birth and (ii) the informant, where the father/other parent is not at the same address as the mother. Before contacting the sample, NHS England would provide the study with up-to-date address details, embarkations and death notifications from its Personal Demographics Service. The study requires access to names and addresses including baby name and address (from birth notifications), mother name and address (from birth registrations), father/other parent name and address (from birth registrations).

For assessing feasibility of using the birth registrations linked to NHS birth notifications as a sample frame, for over-sampling on ethnicity, for targeted fieldwork materials and also for methodological and research purposes the study also requires additional fields from the birth notification and birth registration records. These include fields such as age of mother, multiple birth, birthweight, ethnicity baby, ethnicity mother, gestational age from birth notification records; and fields such as age of mother, multiple birth, birth weight, socioeconomic status, occupation, registration type (sole/joint), name and address mother, name and address father (if registered), country of birth mother, country of birth father (if registered), previous births from birth registration records. Variables which are in both the birth notifications and birth registrations are requested for verification/validation purposes and to fill in missing data.

The study team also require de-identified sampling fields for all patients who are sampled for the study for non-response analysis and adjustment. Non-response analysis, carried out on data from the whole sample, forms a vital part of the feasibility study since it is needed for the project to understand response rates among different population groups, what the biases are in who decides to take part, and whether or not the sample achieved is sufficiently representative of the national population. This analysis will be used for non-response adjustment (in particular to generate non-response weights, as well as other statistical methods). These weights and guidance on non-response adjustment will be provided to data users and are essential to the study in order that any substantive scientific analyses that are undertaken using the feasibility study data (and in due course, the main study) can be adjusted for any biases due to selection into the sample, and to ensure the study findings are robust and valid.
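
One common way to generate non-response weights of the kind described above is a weighting-class adjustment, where each respondent is weighted by the inverse of the response rate in their group. The sketch below is illustrative only: the function name and the group labels are hypothetical, and the study may well use a different method (for example, response propensity models).

```python
from collections import Counter


def nonresponse_weights(sampled_groups, responded_flags):
    """Weighting-class adjustment: weight = sampled / responded per group.

    sampled_groups  -- group label for every sampled case (whole sample)
    responded_flags -- True where the matching case took part
    """
    sampled = Counter(sampled_groups)
    responded = Counter(
        group for group, took_part in zip(sampled_groups, responded_flags) if took_part
    )
    weights = {}
    for group, n_sampled in sampled.items():
        n_responded = responded.get(group, 0)
        # A group with no respondents cannot be weighted up; flag it instead.
        weights[group] = n_sampled / n_responded if n_responded else None
    return weights


# Hypothetical groups: half of group A responds, a quarter of group B.
groups = ["A"] * 4 + ["B"] * 4
flags = [True, True, False, False, True, False, False, False]
print(nonresponse_weights(groups, flags))
```

This is why the sampling fields are needed for the whole sample and not only for respondents: the response rate in each group cannot be computed without knowing who was sampled but did not take part.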

Phase I: Data for Sampling

The following NHS England data will be accessed:
Birth Notifications: necessary because this dataset will be used to identify all births in the selected two-month period and will contain the dates of birth and each baby's ethnicity;
Civil Registration Births: necessary because this dataset will contain the mother's postcode which will be converted to Lower Super Output Area (LSOA).

The level of the data will be:
Pseudonymised

The data will be minimised as follows:
Limited to all babies born in England and Wales during two consecutive months in either 2022 or 2023;
Limited to four non-identifying variables:
o Unique person ID
o Baby's ethnicity
o Baby's month of birth
o Lower Super Output Area of mother's place of residence

Phase II: Data for Fieldwork and Recruitment

The following NHS England data will be accessed:
Birth Notifications: necessary because this dataset provides universal coverage of the population of babies, contains key characteristics of the baby, mother and father, including where they live, and may allow own-household fathers (OHFs, defined as fathers resident at a different address to the baby at the time of the interview) to be recruited in their own right;
Civil Registration Births: necessary because this dataset contains additional variables which could be used for sampling, including the ethnic group of the baby, and because it facilitates timely access to updated addresses for any post-birth moves and provides notifications of early infant deaths, as well as providing possibilities for wider use of health records for substantive and methodological research;
Demographics: necessary to facilitate timely access to address updates for any post-birth moves and to provide notifications of deaths.

The level of the data will be:
Identifiable: necessary in order to be able to make contact with the selected sample to let them know they have been chosen to take part in a study and to give them the option to opt out.

The data will be minimised as follows:
Limited to approximately 2,970 families selected for recruitment by Ipsos and meeting the inclusion criteria (~2,376 families in England and ~594 in Wales); a 'family' in this context is defined as the baby, their mother and their father or other parent;
NHS England will screen for deaths and/or possible adoption cases and will apply National Data Opt-Outs and families will be excluded from the data disseminated if:
o either the baby or mother are identified as deceased;
o either the baby or mother are identified as having de-registered from the NHS for the reason of moving abroad ('embarkation');
o the baby's record is marked as sensitive or is not traced;
o any of the baby, mother or father has registered a National Data Opt-Out.
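
The four exclusion rules listed above can be expressed as a single check per family. The following is a minimal sketch under stated assumptions: the `Person` record and its field names are hypothetical and do not reflect NHS England's actual data model or screening process.

```python
from dataclasses import dataclass


@dataclass
class Person:
    # All field names are illustrative assumptions.
    role: str                 # "baby", "mother" or "father"
    deceased: bool = False
    embarked: bool = False    # de-registered from the NHS on moving abroad
    sensitive: bool = False   # record marked as sensitive
    traced: bool = True
    opted_out: bool = False   # National Data Opt-Out registered


def family_is_excluded(family):
    """Return True if any of the listed exclusion rules applies."""
    for person in family:
        # Rule 4: an opt-out by any family member excludes the family.
        if person.opted_out:
            return True
        # Rules 1 and 2: death or embarkation count for baby and mother only.
        if person.role in ("baby", "mother") and (person.deceased or person.embarked):
            return True
        # Rule 3: sensitive or untraced status counts for the baby's record only.
        if person.role == "baby" and (person.sensitive or not person.traced):
            return True
    return False


family = [Person("baby"), Person("mother"), Person("father", opted_out=True)]
print(family_is_excluded(family))
```

Note that, as the rules are written, a father's death or embarkation does not by itself exclude the family, whereas a father's National Data Opt-Out does.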

Sample sizes have been calculated taking into account estimated recruitment rates and to ensure a representative sample. The fields being requested from the data sources are those that are necessary to draw the sample and contact people, to be able to assess the feasibility of using them in the sampling frame, to oversample on particular characteristics such as ethnicity, for targeted engagement (e.g. including leaflets aimed at teen mums), responsive design (e.g. during fieldwork checking to see if certain groups are not taking part and putting more effort into recruiting from those groups) and for important methodological purposes, including non-response analysis to assess population representivity, weighting and adjustment.

University College London is the research sponsor and the controller as the organisation responsible for ensuring that the data will only be processed for the purpose described above.

The lawful basis for processing personal data under the UK GDPR is:
Article 6(1)(e) - processing is necessary for the performance of a task carried out in the public interest or in the exercise of official authority vested in the controller;

The lawful basis for processing special category data under the UK GDPR is:
Article 9(2)(j) - processing is necessary for archiving purposes in the public interest, scientific or historical research purposes or statistical purposes in accordance with Article 89(1) based on Union or Member State law which shall be proportionate to the aim pursued, respect the essence of the right to data protection and provide for suitable and specific measures to safeguard the fundamental rights and the interests of the data subject.

This processing is in the public interest because it adheres to the UK Policy Framework for Health and Social Care Research, which protects and promotes the interests of patients, service users and the public, and aims to produce generalisable and publicly available information to inform future decisions over patients' treatments or care.

The funding is provided by the Economic and Social Research Council (ESRC). The funding is specifically for the feasibility study described. Funding is in place until June 2024.

Ipsos is a processor acting under the instructions of University College London. Ipsos is a fieldwork agency which will undertake the sampling and recruitment tasks.

Amazon Web Services (AWS) is a processor acting under the instructions of UCL. UCL stores data on the Cloud provided by Amazon Web Services.

The project is led by three Co-Directors employed by UCL. These Co-Directors are supported by a team of Co-Investigators, including experts in each of the four nations of the UK, including from UCL, Swansea University, University of Edinburgh, University of Ulster, and the Fatherhood Institute.

A number of highly experienced senior collaborators are supporting the project by providing input into the study design from Bryson Purdon Social Research, Public Health Scotland, University of Edinburgh, University of Belfast, Manchester Metropolitan University, ScotCen. These senior collaborators are not Controllers or Processors for the study. These contributions are all funded by an ESRC grant.

The study also has project partners who are providing non-financial support to the project and are not directly funded. These project partners, who also bring extensive networks and experience to the project, are the Nuffield Family Justice Observatory, First 1001 Days Movement, and the National Children's Bureau (NCB).

Public and Patient Involvement and Engagement helped refine the purpose of the research.

During September-November 2021, two waves of public dialogue workshops for the ELC-FS were hosted by Kantar Public, with the approach carefully modified in each UK nation to reflect the different legal frameworks in place across the UK. These involved 62 participants, all parents with young children across five locations (two in England and one each in Wales, Scotland and Northern Ireland). Kantar explored the attitudes of parents of young children to the proposed uses of administrative data in the ELC-FS through the public dialogue workshops. This included exploring views about the use of linked birth registration data and NHS maternity records as the sample frame for this project, and the proposed recruitment process. Overall, parents were accepting and supportive of the proposed uses of identifiable administrative data and recruitment processes proposed, as they saw them as necessary to draw the sample and to ensure sufficient representativeness of the study. The public engagement suggested that parents of young children understood the rationale and public benefit for this proposed use of their data and its importance to building an inclusive cohort.

UCL has used existing parent and young person advisory groups, recruited from across England and Northern Ireland, run by NCB as representatives of potential study participants. The Young Research Advisors (YRAs) are a diverse group of approximately 40 children and young people aged from 7-18. The Family Research Advisory Group (FRAG) comprises approximately 25 parents and carers, some of whom are parents of young people with additional support needs. Workshops have been conducted with each of these groups focusing on co-production of the scientific content, ethics and participant engagement. In addition to work with the NCB, UCL commissioned Ipsos to conduct qualitative research with own household fathers (OHFs) and low-income families to understand how best to engage these groups in the ELC-FS. In both the NCB and qualitative research projects, the participants had useful suggestions about how to build trust with study participants through our communication strategies and interviewer training.

Outputs:

The expected outputs of the processing will be:
A research resource deposited with the UK Data Archive late 2024, available to bona fide researchers for the purposes of statistical data analysis, comprised of information collected from the consented participants. The information provided by NHS England will have facilitated the collection of this information. The deposited data will not include the sampling field data provided by NHS England.
A number of recommendations for funders and stakeholders around the design for a future main early life cohort study. These will include a set of high-quality, open-access outputs (including reports, working papers and journal articles) to enable a thorough assessment of the feasibility of the main ELC and to inform its design and implementation. Outputs from the feasibility study will include an evaluation of recruitment rates and biases, and of data quality (including item non-response), the suitability and scalability of data collection innovations that have been tested, and an evaluation of experimental components (targeted incentives and bio-samples).
An assessment of the quality of data fields on the achieved sample frames, and their suitability for use for over-sampling or targeted recruitment strategies, a report on record linkages (lessons learned), recommended next steps in developing a national study of children in need, reports from the public dialogue, parent and young person groups, including qualitative work with fathers.
A set of design protocols for the main study including for the sample design, participant contact, and scientific content (including data collection instruments, bio-samples and record linkages) and a public engagement policy.

The outputs will not contain NHS England data and will only contain aggregated information with small numbers suppressed as appropriate in line with the relevant disclosure rules for the dataset(s) from which the information was derived.
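
Small-number suppression of aggregated outputs can be sketched as below. The threshold of 5, the `*` marker and the treatment of zero counts are assumptions made for illustration; the actual values depend on the disclosure rules of the dataset from which each figure is derived.

```python
def suppress_small_counts(counts, threshold=5, marker="*"):
    """Replace non-zero counts below the threshold with a marker.

    counts    -- mapping of category to aggregated count
    threshold -- assumed disclosure threshold (illustrative only)
    marker    -- placeholder shown instead of a suppressed count
    """
    return {
        category: (marker if 0 < count < threshold else count)
        for category, count in counts.items()
    }


# Hypothetical aggregated table.
table = {"group_a": 3, "group_b": 12, "group_c": 0}
print(suppress_small_counts(table))
```

In practice, secondary suppression may also be needed so that a suppressed cell cannot be recovered by subtracting the other cells from a published total; this sketch does not attempt that.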

The outputs will be communicated to relevant recipients through the following dissemination channels:
Reports to funders and stakeholders;
The CLS website https://cls.ucl.ac.uk/cls-studies/
A participant facing ELC-FS website with its own branding (to be commissioned).

The target date for production and dissemination of the above outputs is 2023/2024.

Processing:

Phase I: Data for Sampling

NHS England will provide the relevant records from the Civil Registration Births and Birth Notifications datasets to Ipsos. The data will contain no direct identifying data items but will contain a unique person ID which can be used by NHS England to link the data with other record level data it holds.

The data will be stored on servers at Ipsos. Access is restricted to employees or agents of Ipsos.

Analysts from Ipsos will process the data for the purpose of selecting a random sample of families for recruitment to the study.

Ipsos will transfer data to NHS England. The data will consist of the unique person ID (as originally supplied by NHSE) for the ~2,970 babies selected for recruitment.
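
The selection and return of unique person IDs in Phase I could look like the sketch below, which draws a simple random sample within each country. It assumes the country can be read from the first letter of the LSOA code (E for England, W for Wales); the function name, frame layout and seed are hypothetical. The real design is likely to be more elaborate, for example over-sampling on ethnicity, which is not shown here.

```python
import random


def draw_sample(frame, targets, seed=0):
    """Draw a simple random sample of person IDs within each country.

    frame   -- iterable of (unique_person_id, lsoa_code) pairs
    targets -- mapping of LSOA prefix letter to the number to select
    seed    -- fixed seed so the draw can be reproduced (an assumption)
    """
    rng = random.Random(seed)
    by_country = {}
    for person_id, lsoa_code in frame:
        by_country.setdefault(lsoa_code[0], []).append(person_id)
    selected = []
    for prefix, n_wanted in targets.items():
        # Sort first so the result does not depend on input order.
        pool = sorted(by_country.get(prefix, []))
        selected.extend(rng.sample(pool, min(n_wanted, len(pool))))
    return selected


# Tiny hypothetical frame; the study's approximate targets are ~2,376 and ~594.
frame = [(f"E{i:03d}", "E01000001") for i in range(10)]
frame += [(f"W{i:03d}", "W01000001") for i in range(5)]
print(draw_sample(frame, {"E": 4, "W": 1}))
```

Only the selected IDs would need to be returned to NHS England, since NHS England can use them to look up the corresponding records.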

NHSE will extract the relevant data from the Civil Registration Births and Birth Notifications datasets for each baby, their mother and, where known, their father. NHSE will match the details of each family member with the latest Demographics data to identify any deaths, embarkations or no traces, and will apply National Data Opt-Outs to the output. NHSE will then remove any family where any member has applied the National Data Opt-Out or where either the baby or mother is deceased or where the baby is not traced.

Phase II: Data for Fieldwork and Recruitment

NHS England will provide the relevant records from the Civil Registration Births, Birth Notifications and Demographics datasets to Ipsos. The data will contain directly identifying data items including Names, NHS Number, Date of Birth, Address and Postcode which are required to facilitate contact with the parents and also to enable Ipsos to screen for deaths and obtain latest address information from NHSE ahead of each contact attempt.

The data will be stored on servers at Ipsos.

Ipsos fieldworkers will use the name and contact details to send an initial letter to the relevant mothers and fathers giving them information about the study and giving an opportunity for respondents to remove themselves from the sample. Additionally, analysts from Ipsos will process the data for the purpose of real time analyses of response rates.

Prior to mailing out further correspondence to potential recruits, Ipsos will upload the details of the cohort, excluding those who opt out, to NHS England's Cohort Management System (CMS) and will download reports containing latest vital status and addresses.

Ipsos will extract a pseudonymised subset of the data containing variables necessary for non-response analysis and securely transfer this to UCL. The data will be stored in UCL's Data Safe Haven.

UCL uses offsite back-up services provided by VIRTUS Data Centres.

UCL stores data on the Cloud provided by Amazon Web Services.

The data will be accessed by authorised personnel via remote access. The data will remain on the servers at UCL at all times. There will be no requirement and no attempt to reidentify individuals when using this data.

Once the fieldwork has concluded, Ipsos will transfer the details, including direct identifiers, of all people who consented to participate to UCL along with the deidentified data for non-respondents and those who declined to participate or opted out of taking part.

Ipsos will retain the data until instructed by UCL to destroy it or contractually required to destroy it. UCL will undertake verification checks on receipt of the data from Ipsos and will instruct deletion when ready. Ipsos are contractually required to delete the data within 28 calendar days from the end of their contract with UCL.

The data will not leave England and Wales at any time.

All personnel accessing the data have been appropriately trained in data protection and confidentiality.

Advancing Survivorship after Cancer: Outcomes Trial (ODR1819_039) — DARS-NIC-656825-X7T4K

Opt outs honoured: No (Excuses: Consent (Reasonable Expectation))

Legal basis: Health and Social Care Act 2012 s261(2)(c)

Purposes: No (Academic)

Sensitive: Sensitive

When:DSA runs 2023-08 – 2026-08 2023.12 — 2023.12. SMLS reported a DPA serious incident; breached contract — audit report.

Access method: One-Off

Data-controller type: UNIVERSITY COLLEGE LONDON (UCL)

Sublicensing allowed: No

AGD/predecessor discussions: AGD minutes - 18 April 2024 final.pdf, AGD minutes - 13 July 2023 finalv2.pdf

Datasets:

NDRS Cancer Registrations

Type of data: Anonymised - ICO Code Compliant

Yielded Benefits:

Data for this study has previously been shared when the data were controlled and managed by Public Health England (PHE). As such there are some yielded benefits to be observed from the access to the data for the study prior to NHS England becoming data controller. These yielded benefits are noted below. Previously disseminated data has been linked to the trial data so that the participants can be correctly classified based on their cancer diagnosis. This has been used in the analysis for a number of papers which UCL are due to submit to peer-reviewed journals in the near future, including the main trial results (up to 6-month follow-up). Preliminary findings suggest the intervention was successful at supporting patients to improve their health behaviours.

Expected Benefits:

Breast, prostate and colorectal cancer are three of the most common cancers diagnosed each year, with approximately 150,000 new diagnoses a year in the UK across these three cancer types. This intervention involves a booklet and feedback from a health professional about health behaviours and the potential benefits of making healthy changes to these. If this intervention is shown to benefit cancer patients, it can be incorporated into patient care so that all patients have access to it. This could then improve health behaviours in this group by helping them to move closer to meeting public health recommendations. Such improvements have the potential to improve their cancer outcomes as well as other health conditions. The NHS England data will allow UCL to assess whether the intervention has improved cancer outcomes and survival. If the intervention has resulted in these improvements this could lead to further work focused on how to implement this intervention within the NHS both locally and nationally. The researchers will disseminate the findings widely and discuss with hospitals and funders (including relevant charities such as WCRF, Cancer Research UK and Macmillan). The next steps are to work towards providing this intervention as a part of standard care. The aim is that this will happen during 2023.

Outputs:

The data on cancer diagnoses disseminated in the first data share will be used to describe the sample in all analyses and subsequent papers, in peer reviewed journals using the trial data. This main trial paper is still in progress.

The data on cancer diagnoses and mortality will be used as part of the trial analysis of the impact of the intervention which will be published as a paper. It will hopefully also be used in further analyses linking variables UCL collected data on in the trial to longer term cancer outcomes and survival. These will all be published as research papers in peer reviewed journals.

A report of the main results will be sent to ASCOT Study participants via a newsletter and shared on the study website once the data has been analysed. As more subsequent studies are published, participants will be signposted to the study website.

These results will be presented at other research institutions, at national and international conferences (e.g. UK Society for Behavioural Medicine, American Society of Clinical Oncology, International Society for Behavioural Nutrition and Physical Activity) and at patient-facing events (e.g. charity hosted), and will be shared with the recruitment sites via email. UCL may seek additional funding to explore the potential to implement their intervention within the NHS over the long-term, which could involve engagement with local and national policymakers.

UCL hope to complete the majority of this dissemination before the end of August 2023 in line with project funding however some research will continue beyond this, in particular the work linking to later cancer events and mortality.

No participants will be identifiable in this data. Data will be aggregated with small numbers suppressed such that averages are presented, e.g. number of participants with each type of cancer, average time since diagnosis and the different variables will not be linked.

Processing:

Eligible patients were identified by searching the electronic cancer databases (e.g. Somerset or Info-flex) of each of the participating NHS trusts cited in objective for processing. The initial list was drawn up by an appropriate member of staff at the trusts (e.g. a nurse or data analyst). The list was carefully cross-checked (via surgical diaries or the multidisciplinary team) to ensure that patients are still alive and have a diagnosis of breast, prostate or colorectal cancer.

Once the list of eligible patients had been finalised the NHS sites only shared a list of finalised ID numbers and cancer types with UCL. The research team at UCL could then issue the staff at the trusts with the appropriate number of copies of the ASCOT initial patient survey (health and lifestyle questionnaire), the letter from the consultant and envelopes. At the back of the survey there was an invitation for participants to leave their details if they would be interested in hearing more about a trial of a lifestyle intervention for cancer patients. The NHS Trusts then sent the letter and survey together in the envelope to the patients to ensure that the research team at UCL did not have initial access to patient identifiable information unless a participant chose to leave their details on the initial survey invitation. All subsequent questionnaires for consented patients for 3/6 and 2 year follow ups were sent by UCL.

Staff at the NHS sites kept a copy of the list of patients with allocated ID numbers that were printed on the initial survey. Completed initial surveys were returned by patients directly to UCL. Staff at UCL then informed the NHS sites of the ID numbers that were returned so reminders could be sent to any patients who hadn't completed the initial survey. When surveys were returned UCL checked eligibility for the trial for participants who chose to leave their contact details on the survey. If participants were eligible, they were sent an information sheet and consent form for the trial and could consent.

UCL will share trial ID numbers, NHS numbers, names, date of birth, sex and postcode of the enrolled consented participants to allow NHS England to identify the trial participants in the NCRAS data. NHS England will then return, against the trial ID numbers, data on cancer diagnosis, hospital records, treatment information, health status and mortality for these participants. The patient-level cancer registration data is disseminated from NHS England to UCL's data safe haven via SEFT (secure electronic file transfer).

UCL will use cancer registration data collected by the National Cancer Registration and Analysis Service (NCRAS; cancer registry in England) to allow them to group the sample population based on the cancer type and stage at diagnosis. The study team will also use the data (on cancer diagnoses and mortality) to compare the experimental groups (those who received the intervention and the control group) to assess if the intervention had a positive impact on these outcomes. The study team are also interested in additional related research questions, for example, exploring whether relationships between health behaviours and survival are moderated by patient variables (like patient reported outcomes of anxiety/depression) to determine if any groups are especially impacted. There is a funded UCL PhD student (who is part of the ASCOT team) who will explore these questions, as well as continuing other planned ASCOT analyses. The disseminated data will be integrated into the trial dataset so that cancer outcomes and survival data become an outcome that can be analysed in relation to various data that the study team collected from participants during the trial.

Data processing will only be carried out by employees of UCL and one enrolled PhD student. All of those carrying out data processing via the data safe haven will complete yearly training on information governance, data protection and confidentiality. The 2nd Principal Investigator (PI) is based at the University of Leeds but will not have access to data disseminated under this agreement. The 2nd PI was involved in the study design and plans for analysis, will contribute towards paper writing, and will undertake these duties under an honorary contract with UCL.

ASCOT began in 2015 and data collection from participants in the form of the ASCOT patient questionnaires ended in 2021. However, the trial is not considered complete until the study team have received the final NCRAS data from NHS England. UCL plan to examine co-morbidities up to ten years after participants consented. UCL will therefore retain personal identifiers until this time (2028-2029 at the latest). After this, UCL will destroy all identifiable data by deleting the database which links patient IDs to patient identifiers, making all the remaining data anonymous to the UCL study team.

All data collected as part of the ASCOT patient questionnaires is pseudonymised using participant ID numbers as soon as the surveys are received. Consented patient identifiable information is stored separately to their ID numbers in a locked filing cabinet in the Department of Epidemiology and Public Health at UCL.

Electronic data collected as part of the ASCOT patient questionnaire and the ASCOT trial is stored in the data safe haven (a secure environment) by the research team in UCL's Department of Behavioural Science and Health and accessed remotely by the study team for statistical analysis. The study team can look at the data on the server but are prohibited from moving it to any other machines.

The listed joint data processor, VIRTUS Data Centres, is used exclusively for secure server storage. VIRTUS only supply the physical location for storage. VIRTUS operate 7 layers of physical security on site, including perimeter fencing, access control, external and internal CCTV and restricted pass code access. Staff at VIRTUS Data Centres will not have access to NHS England data.

The study team will process, store and dispose of patient identifiable information (names and contact details) in accordance with all applicable legal and regulatory requirements, including the Data Protection Act 2018 and any amendments thereto.

All data used in publications and outputs will be completely anonymous using aggregated data with small numbers suppressed.

LAUNCHES QI: Linking AUdit and National datasets in Congenital HEart Services for Quality Improvement. — DARS-NIC-234297-P4M5G

Opt outs honoured: Yes - patient objections upheld, Yes (Excuses: Section 251 NHS Act 2006)

Legal basis: Health and Social Care Act 2012 – s261(7), Health and Social Care Act 2012 s261(7), Health and Social Care Act 2012 - s261(5)(d), Health and Social Care Act 2012 s261(2)(a)

Purposes: No (Academic)

Sensitive: Non Sensitive, and Sensitive, and Non-Sensitive

When:DSA runs 2019-07 – 2022-06 2020.09 — 2023.08. SMLS reported a DPA serious incident; breached contract — audit report.

Access method: One-Off

Data-controller type: UNIVERSITY COLLEGE LONDON (UCL)

Sublicensing allowed: No

AGD/predecessor discussions: AGD minutes - 11th January 2024 final.pdf, IGARD Minutes - 12 January 2023 final.pdf, IGARD Minutes - 15 December 2022 final.pdf, IGARD Minutes - 7th July 2022 final.pdf, IGARD Minutes - 1 July 2021 Final.pdf, igard-minutes-3rd-october-2019-final.pdf

Datasets:

HES:Civil Registration (Deaths) bridge
Civil Registration - Deaths
Hospital Episode Statistics Accident and Emergency
Hospital Episode Statistics Admitted Patient Care
Hospital Episode Statistics Outpatients
Civil Registration (Deaths) - Secondary Care Cut
HES-ID to MPS-ID HES Accident and Emergency
HES-ID to MPS-ID HES Admitted Patient Care
HES-ID to MPS-ID HES Outpatients
Civil Registrations of Death - Secondary Care Cut
Hospital Episode Statistics Accident and Emergency (HES A and E)
Hospital Episode Statistics Admitted Patient Care (HES APC)
Hospital Episode Statistics Outpatients (HES OP)

Type of data: Anonymised - ICO Code Compliant

Objectives:

University College London (UCL) requires pseudonymised data of life status, age at life status, place of death and a subset of HES data for use in a new study called “LAUNCHES QI: Linking AUdit and National datasets in Congenital HEart Services for Quality Improvement” (IRAS ID 246796). UCL are the sole data controller who also process data.

LAUNCHES QI aims to indirectly improve services for congenital heart disease (CHD) by providing the first description of how CHD patients interact with the NHS acute sector and where variation in outcomes or service use exists. This information is the first crucial step in supporting service improvement by building the evidence base on which aspects of the current service offer the most potential for improvement programmes. The team will link for the first time NCHDA (National Congenital Heart Disease Audit), PICANet (paediatric intensive care audit), ICNARC CMP (adult intensive care audit), life status and place of death, and HES (Hospital Episode Statistics) data. This will: a) provide information on the challenges in linking national datasets and whether it is feasible to do this routinely, and b) create a research dataset to examine the interactions CHD patients have with different NHS services over time. The team will aim to improve services by: describing patient care trajectories through secondary and tertiary care; identifying useful metrics for driving quality improvement (QI), informing commissioning and policy; and exploring variation across services to identify priorities for QI.

Measuring, reporting and learning from outcomes should drive quality improvement (QI), but this is particularly challenging for lifelong conditions such as CHD where outcomes need to be interpreted in the context of changing treatment options, service provision and the natural evolution of disease. Given the complex care trajectories of such patients, rich datasets and careful multi-disciplinary analysis are required to identify meaningful variations and opportunities for targeted QI. The study will produce the first comprehensive understanding of care received by a complex population from birth to adulthood, and a basis for creating a step change in how quality in CHD services is measured and improved.

The activity is compliant with the principles of GDPR. Based on guidance, as this is for University research, the lawful basis for processing data is GDPR article 6(1)(e): ‘Processing is necessary for the performance of a task carried out in the public interest or in the exercise of official authority vested in the controller’. Also referred to as ‘Public Task’. As the research involves health data, which is included in the definition of special categories of personal data, it requires an additional condition for processing. Based on guidance, for health research this is GDPR article 9(2)(j), which details that processing is necessary for scientific and research purposes, subject to appropriate safeguards.

LAUNCHES QI is a dataset analysis of five linked audit and national datasets which will include up to approximately 144,000 patients with congenital heart disease that have been captured by the National Congenital Heart Disease Audit (NCHDA) since 2000, the core dataset defining the study CHD population. Included will be all patients, with no age restrictions, who have had at least one intervention for congenital heart disease that has generated at least one entry in NCHDA. Patients undergoing cardiac intervention since April 2000 (when NCHDA began) will be included in the study, but the requested records for these patients in HES is from 1997/98 (where available) until March 2017, since these patients may have interacted with NHS services prior to their cardiac intervention. CHD is a life-long condition, so obtaining 20 years of data (including 3 years prior to the NCHDA dataset) is crucial for establishing as complete trajectories of their service use as possible, which will enable the study to get a more detailed understanding of how patients interact with secondary care throughout their lives, how this varies across centres, and how it impacts on outcomes. National data is required to allow exploration of the different CHD services across the country. The pseudonymised linkage of the other 4 datasets to NCHDA at patient level will allow trajectories of care to be determined.

The study research dataset will be generated by linking pseudonymised NCHDA data (English and Welsh centres) to the four additional sources:
1) Paediatric intensive care data from the Paediatric Intensive Care Audit Network (PICANet) based at University of Leeds (English centres).
2) Adult intensive care audit data (Case Mix Programme, CMP) (English and Welsh centres).
3) Hospital Episode Statistics (HES), (English centres).
4) Death registrations (Civil Registration Deaths). (Will link to identifiers from patients of English and Welsh centres).

UCL has received authorisation from HQIP to transfer personal identifiers from NICOR (which curates the NCHDA dataset) to ICNARC, PICANet and NHS Digital. NICOR is located at Bart’s Health NHS Trust. UCL has also received authorisation from HQIP and ICNARC to receive the pseudonymised clinical information at University College London. UCL has Ethics (18/NS/0106) and CAG (18CAG0180) approval for the study to process the data and to link the datasets.

Two datasets will be created: one containing the patient level, pseudonymised information of all hospital admissions, outpatient appointments and procedures for each CHD patient; the other a pseudonymised hospital admission level dataset covering all hospital stays for CHD patients. The final pseudonymised datasets will be stored within UCL’s secure data safe haven.

Patient identifiers will be sent to NHS Digital from the NCHDA database. UCL are requesting that NHS Digital match the identifiers from the NCHDA dataset to HES and Civil Registration Deaths and extract the requested fields. NCHDA will provide a record level LAUNCHES ID which should be transferred to each pseudonymised HES/Civil Registration Deaths record that matches. UCL additionally request that NHS Digital generate a unique record-level Study ID that will be common between NHS Digital and UCL to facilitate data queries. The data is to be securely transferred with the record level study numbers to University College London.

The LAUNCHES QI study was instigated and is led by the Principal Investigators based at the Clinical Operational Research Unit (CORU) at UCL. University College London will be the only organisation to require access to the record level data supplied from NHS Digital.

The sole funder, The Health Foundation, are involved in the study only to provide the award, as grant-funding. They will also oversee progress of the project through annual reports and award meetings. The Health Foundation have no influence over the results. They are not permitted to access any record level NHS Digital data.

As in many studies, a number of collaborators provide an advisory role. Only UCL substantive employees are working with the data. The organisations involved as collaborators/advisors other than UCL are Birmingham Children’s Hospital, Intensive Care National Audit & Research Centre, Leeds University, Royal Brompton NHS Foundation Trust, Great Ormond Street Hospital NHS Trust, Leeds Teaching Hospitals NHS Foundation Trust, University Hospital Southampton, and University Hospitals Bristol NHS Foundation Trust. These organisations will not have access to the data. UCL researchers make all the final decisions as data controllers.

Expected Benefits:

The LAUNCHES study addresses important areas related to services in congenital heart disease (CHD) about which there is currently little or no information. In response to national recognition of the urgent need, the study will provide information regarding better system measures for identifying opportunities for quality improvement, trajectories of service use, and variation in CHD service provision.

The main areas of uncertainty addressed by the LAUNCHES study are:
1. There is little known about long term survival and outcomes for CHD patients.
2. Little is known about patient trajectories through the NHS for lifelong conditions like CHD (i.e. for a given patient, their outcomes and sequence of contacts with the NHS over time). The study will generate important understanding about service use and variation, and use this to identify key areas to target improvements in health care quality.
3. Many specialised CHD services are commissioned directly by NHS England. To assure quality, NHS England rely primarily on a Quality Dashboard which necessarily uses fairly crude measures available in the current system. The study will directly address a stated need for better measures of service quality by NHS England by developing new robust service metrics.
4. Other lifelong conditions, particularly those where patients are likely to require periods of intensive care such as Cystic Fibrosis or Renal Disease, will also benefit from their approach for linking audit data to measure quality and variations.

Benefits will include:
1. Designing meaningful outcomes for commissioners, managers, patients and clinical teams that can be interpreted in the context of changing treatment options, service provision and the natural evolution of disease.
2. Connecting different data sources to track outcomes and service use across multiple sectors over a lifetime.
3. Identifying and understanding variations in outcomes to drive quality improvement.

This research will benefit the provision of health care by:

Identifying candidate metrics to inform commissioning and drive change for CHD: Many specialised CHD services are commissioned directly by NHS England. To assure quality, they rely primarily on a Quality Dashboard which necessarily uses fairly crude measures available in the current system. The study will directly address a stated need for better measures of service quality by NHS England. The importance of the research to commissioners is demonstrated through the CO-Is from the relevant Clinical Reference Group and the relevant national audits, and endorsement from NHS England, clinical bodies (BCCA, SCTS, RCSEd) & HQIP.

Generate important new knowledge about lifelong service use & variation for CHD, and target areas for QI: Little is known about patient trajectories through the NHS for lifelong conditions like CHD (i.e. for a given patient, their outcomes and sequence of contacts with the NHS over time). Leveraging the national audits, the study will establish patient trajectories for the CHD population for the first time, generating important understanding about service use & variation, and using this to identify key areas to target improvements in health care quality.

Provide a template for other high-resource, high-impact NHS services: The approach for linking audit data to measure quality and variations will be applicable to other lifelong conditions, particularly those where patients are likely to require periods of intensive care such as Cystic Fibrosis or Renal Disease. The team will share & build on the generalisable learning e.g.: technical & governance solutions for linking national datasets routinely; methods for incorporating commissioning & management perspectives in developing metrics for QI & Quality Assurance (QA); approaches to tracking patient trajectories through multiple datasets.

With approximately 200,000 people currently living with CHD in the UK, and with services that are expensive, high profile and have an enormous impact on the lives of patients and their families, dissemination is in the public interest. With the possibility of applying findings to other lifelong conditions, the magnitude of the impact is potentially even higher.

The researchers will generate the high-quality evidence necessary to guide the quality improvement of CHD services and inform decisions about national policies. Scientific manuscripts will be written detailing the findings. UCL expect the publications to contribute to the evidence and expert opinion for the development and update of clinical guidelines in congenital heart disease in the future. The dissemination via publications and presentations will be completed following conclusion of the study in 2021. The timescale of the full benefit through changes in policies and in procedures is impossible to determine.

Outputs:

The results of the study will be disseminated actively and extensively. The research team has strong links with the Congenital Heart Services Clinical Reference Group, NHS England and clinical bodies including the British Congenital Cardiac Association, the Society for Cardiothoracic Surgery and the Royal College of Surgeons of Edinburgh. UCL also have strong links with CHD charities including The Somerville Foundation, Children’s Heart Federation, The British Heart Foundation and Little Hearts Matter.

Outputs will involve approximately ten publications in peer-reviewed Medical and Scientific Journals, oral and written presentations at national and international conferences. Target journals for the papers are Circulation, Heart, The Annals of Thoracic Surgery, and Archives of Disease in Childhood. The final outputs will only contain aggregate results with small number suppression, in line with the HES Analysis Guidelines.

The LAUNCHES study team plan to complete the ten peer-reviewed publications within six months of the study end. The study end date is the end of February 2021. The team therefore aim to complete the publications by summer 2021 and will present findings to key stakeholders (e.g. professional societies, national audit bodies, the Care Quality Commission, HQIP, commissioners and local hospitals) through meetings and short briefing documents. UCL will disseminate to the public through a project website (https://www.ucl.ac.uk/operational-research/domains/congenital_heart_disease/launches), and via social media (Twitter @UCL_CORU) and blogs. Updates to the website will be ongoing throughout the study and as and when publications and communications are available. The final communication will coincide with the final publication.

UCL will ensure that lay summaries are provided (reviewed in collaboration with patients and parents on their Advisory Committee). The patients and parents on the advisory committee attend annual advisory group meetings to receive updates and to provide feedback on any aspect of the study.

The team have also arranged the dissemination of findings through the Children’s Heart Federation. Where appropriate, results will be promoted as press releases (2019-2021). UCL will also submit reports to the Health Foundation (the project funders) and partner with them to draw on their networks and skills in dissemination and spread to make an impact more widely, which may include generating accessible resources such as downloadable leaflets and case studies, research highlights, blogs and webinars. UCL will also publish details of LAUNCHES QI on the PICANet, ICNARC and NICOR websites as well as on the UCL CORU website. Dissemination will occur throughout the course of the LAUNCHES study (2019-2021).

Processing:

LAUNCHES QI has the necessary research ethics and section 251 approvals. A favourable opinion has been obtained from the North of Scotland Research Ethics Committee, reference number 18/NS/0106. Section 251 support has been received to ensure that the accessing, linking and processing of the datasets is in line with the common law duty of confidence (Ref: 18CAG0180). Data will not be handled by any additional third party organisations. Data will not be accessed outside the UK.

The planned data flows are as follows:
1) Data flows to NHS Digital
National Institute for Cardiovascular Outcomes Research (NICOR) will securely transfer a file to NHS Digital. This file will contain patient identifiable information (NHS Number, name, postcode, date of birth, local hospital patient ID) for all patients in the National Congenital Heart Disease Audit (NCHDA), and a unique study ID (NCHDA record-level LAUNCHES Study ID).
2) NHS Digital will identify common records between NCHDA data and HES and Civil Registration Deaths data, including the requested derived fields. NHS Digital will generate a unique record-level Study ID for each identified record.
3) Data flows from NHS Digital
i) NHS Digital will return HES data for all individuals in the NCHDA cohort to University College London. The unique study ID (LAUNCHES QI record level study number and HES record-level study number) and requested derived fields will be appended to the end of every episode record returned to University College London.
ii) Civil Registration Deaths derived fields for all individuals in the NCHDA cohort will be returned to University College London. The unique study ID (LAUNCHES QI record level study number and HES record-level study number) will be appended to the end of every record.
4) The linkage strategy is that NICOR will provide the personal identifiers NHS number, hospital number, date of birth and postcode to NHS Digital, ICNARC and PICANet from the NCHDA data. Patient name will also be sent to PICANet and NHS Digital. Each will identify in their respective datasets which records pertain to those CHD individuals and return to UCL the requested clinical and administrative data they hold on the matched individuals (UCL will not receive data for patients that do not match), pseudonymised with the LAUNCHES record level ID and local dataset record level IDs. UCL will receive from NICOR the pseudonymised clinical data of the NCHDA dataset and will not receive any personal identifiers.
UCL will then link HES and Civil Registration Deaths data received from NHS Digital to pseudonymised clinical data received from NICOR (NCHDA data), PICANet and ICNARC (CMP data), via the LAUNCHES QI unique record-level study number and the LAUNCHES Patient IDs (which will be contained within the NCHDA data set). UCL will not be receiving any personal identifiers from any of the data sources. UCL will use the record level NHS Digital study identifiers in case of any queries about specific records with NHS Digital.
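
The final linkage step described above can be pictured as a join on the pseudonymous study ID. The following is a minimal sketch only, assuming each source arrives as a list of records carrying a hypothetical `launches_id` field; the field names and sample values are invented for illustration and are not the study's actual schema.

```python
def link_by_study_id(nchda, sources):
    """Group records from every source under the shared pseudonymous ID.

    The NCHDA records act as the spine: records from other sources whose
    ID is not in NCHDA are dropped, mirroring the rule that unmatched
    patients are not returned. All field names are hypothetical.
    """
    linked = {}
    for rec in nchda:
        entry = linked.setdefault(rec["launches_id"], {"nchda": []})
        entry["nchda"].append(rec)
    for name, records in sources.items():
        for rec in records:
            entry = linked.get(rec["launches_id"])
            if entry is None:
                continue  # no NCHDA match, so the record is not linked
            entry.setdefault(name, []).append(rec)
    return linked


# Invented sample data: no personal identifiers, only study IDs.
nchda = [
    {"launches_id": "L001", "procedure": "A"},
    {"launches_id": "L002", "procedure": "B"},
]
sources = {
    "hes": [
        {"launches_id": "L001", "hes_study_id": "H9", "age_at_admission": 3.2512},
        {"launches_id": "L999", "hes_study_id": "H7", "age_at_admission": 40.0},
    ],
    "deaths": [{"launches_id": "L002", "life_status": "dead"}],
}
linked = link_by_study_id(nchda, sources)
```

In this sketch the record-level `hes_study_id` is carried along unchanged, so a query about a specific record could still be raised with the data provider without any personal identifier being involved.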

The data is stored and processed within the UCL Identifiable Data Handling Solution (IDHS) called the Data Safe Haven (DSH). The data will be held within a secure environment where all statistical analyses will be undertaken. Access to this record level data will be limited to only four members of the LAUNCHES QI team, who are all substantive employees at UCL. Staff accessing the UCL data safe haven attend training in its use and security procedures. Staff are also required to complete mandatory annual Information Governance and GDPR training. Each study working on the data safe haven has what is known as its own ‘share’, where the study specific data is kept. Access to this share is granted only by the Principal Investigators who request access for each user. Any team member leaving the study has their access revoked.

Re-identification is not permitted under this data sharing agreement.

Any linkage that could identify an individual is not permitted under this agreement.

No linkage, other than that described within the agreement is permitted and no further data linkage will be undertaken.

As part of the LAUNCHES analysis, admission, procedure, diagnosis, discharge, and provider information are required for all matched patients. UCL are also requesting derived fields: age at admission/appointment to four decimal places and age at discharge to four decimal places. UCL are not requesting any dates of birth for this project, to prevent identifiability of the data, and so LAUNCHES will use ages at health service interaction to determine each patient’s treatment trajectory. In- and out-patient HES data is critical in identifying the treatment that each patient has received. In addition, HES A&E data are required to identify adverse outcomes such as unplanned emergency treatment following discharge from CHD surgery or through deterioration in a patient’s health. The requested Civil Registration Deaths derived fields of life status, age at life status to four decimal places and place of occurrence of death (home/hospice/hospital/care home/other communal establishment/elsewhere), are vital to complete the patient trajectories.
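
A derived age field of this kind could be produced as in the sketch below, so that the date of birth itself never needs to be released. This is illustrative only: the agreement does not state how the ages are calculated, and the 365.25 days-per-year convention and the function name are assumptions.

```python
from datetime import date

# Assumption: the agreement does not specify the year-length convention.
DAYS_PER_YEAR = 365.25


def age_at_event(date_of_birth: date, event_date: date) -> float:
    """Return age in years at the event, rounded to four decimal places."""
    days = (event_date - date_of_birth).days
    return round(days / DAYS_PER_YEAR, 4)


# Invented dates: 366 days elapse because 2000 is a leap year.
example_age = age_at_event(date(2000, 1, 1), date(2001, 1, 1))
```

Four decimal places of a year is a resolution of under an hour, so the ordering and spacing of a patient's contacts can be reconstructed from ages alone.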

Methods of analysis for the study will include:
1. Data cleaning and descriptive analysis of individual datasets.
2. Develop and update clinical coding maps.
3. Establish and examine variations in longitudinal patient trajectories.
4. Identify candidate metrics to inform routine quality improvement and assurance.

The results of all analyses will be published in aggregate form, with small numbers suppressed in line with HES guidance. No identifiable data will be held by University College London, therefore no identifiable data will be released.
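
Small number suppression of an aggregate table can be sketched as follows. The threshold and the marker used here are assumptions chosen for illustration; the actual rules are those set out in the HES guidance referred to above and should be taken from that guidance, not from this example.

```python
def suppress_small_counts(counts, threshold=8, marker="*"):
    """Replace non-zero counts below `threshold` with `marker`.

    `threshold` and `marker` are illustrative assumptions, not values
    taken from the agreement.
    """
    return {
        group: (marker if 0 < n < threshold else n)
        for group, n in counts.items()
    }


# Invented counts per centre.
table = suppress_small_counts({"Centre A": 120, "Centre B": 5, "Centre C": 0})
```

Here `table` keeps the counts for Centre A and Centre C and shows the marker for Centre B. A real disclosure-control step would also need to handle totals and secondary suppression, which this sketch does not attempt.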

NHS Digital reminds all organisations party to this agreement of the need to comply with the Data Sharing Framework Contract requirements, including those regarding the use (and purposes of that use) by “Personnel” (as defined within the Data Sharing Framework Contract ie: employees, agents and contractors of the Data Recipient who may have access to that data).

Camden & Islington Clinical Record Interactive Search (CRIS) Linkage with HES/Mortality Data — DARS-NIC-408171-X7F8W

Opt outs honoured: Yes (Excuses: Section 251 NHS Act 2006)

Legal basis: Health and Social Care Act 2012 s261(7); National Health Service Act 2006 - s251 - 'Control of patient information'., Health and Social Care Act 2012 s261(2)(a); National Health Service Act 2006 - s251 - 'Control of patient information'.

Purposes: No (Academic)

Sensitive: Sensitive, and Non-Sensitive

When:DSA runs 2021-04 – 2024-04 2023.05 — 2023.06. SMLS reported a DPA serious incident; breached contract — audit report.

Access method: One-Off

Data-controller type: CAMDEN AND ISLINGTON NHS FOUNDATION TRUST

Sublicensing allowed: No

AGD/predecessor discussions: IGARD Minutes - 19 January 2023 final.pdf, IGARD Minutes - 22nd April 2021 final.pdf, IGARD Minutes - 3rd June 2021 final.pdf

Datasets:

Civil Registration - Deaths
Demographics
Hospital Episode Statistics Accident and Emergency
Hospital Episode Statistics Admitted Patient Care
Hospital Episode Statistics Critical Care
Hospital Episode Statistics Outpatients
Civil Registrations of Death
Hospital Episode Statistics Accident and Emergency (HES A and E)
Hospital Episode Statistics Admitted Patient Care (HES APC)
Hospital Episode Statistics Critical Care (HES Critical Care)
Hospital Episode Statistics Outpatients (HES OP)

Type of data: Anonymised - ICO Code Compliant

Objectives:

This agreement aims to link HES and Mortality data from NHS Digital with the Camden & Islington NHS Foundation Trust (C&I) Clinical Record Interactive Search (CRIS) Research Database for the purpose of research in the public interest.

Camden & Islington NHS Foundation Trust (C&I) is a large mental healthcare provider serving a geographic catchment area of two inner-city London boroughs, and approximately 470,000 residents. Based on social deprivation scores of 326 local authorities in England, Camden is the 74th and Islington is the 14th most deprived local authority. The variation in the levels of deprivation within both boroughs is large, highlighting the inequalities between different population groups and places. Within Camden there are areas that are within the top 10% most deprived areas in England and areas that are in the 20% least deprived. C&I provides mental health and substance misuse services to people living in Camden and Islington, substance misuse services to Westminster, and a substance misuse and psychological therapies service to residents in Kingston. The Trust has two inpatient facilities, at Highgate Mental Health Centre and St Pancras Hospital, as well as community-based services throughout the London boroughs of Camden and Islington. The Trust provides services for adults of working age, adults with learning difficulties, and older people in community or inpatient settings.

The objective of the data collection is to create a research resource to be used for research projects aiming to investigate physical health outcomes (including mortality) and receipt of health care in people with mental and behavioural health disorders attending secondary mental health care services provided by C&I.

The proposed linkage would significantly increase high quality research outputs that examine the interface between mental and physical health. There is increasing emphasis on the health inequalities experienced by individuals diagnosed with severe mental illness (SMI), commonly defined as schizophrenia, bipolar disorder, schizoaffective disorder and other non-organic psychotic illnesses. These individuals have been found to have a reduced life expectancy of up to 20 years. What is less clear is how other extremely disabling psychiatric disorders, such as severe depression, post-traumatic stress disorder and personality disorders, compare in terms of premature mortality, self-harm and physical health comorbidities. By linking HES and ONS mortality data with C&I CRIS data the study team will explore and quantify this currently under-researched health disparity. The study team will focus specifically on commonly occurring comorbidities and adverse complications such as cardiovascular, respiratory, cancer, liver disease and self-harm.

This data linkage is supported by service user members of the C&I CRIS Oversight Committee, along with local patient and public involvement (PPI) groups the study team have consulted. People using secondary mental health services are rightfully concerned about their risk of premature mortality and morbidity, and this linkage has the potential to answer outstanding questions.

Routine recording of Electronic Health Records (EHRs) at C&I commenced in mid-2008 using RiO, an electronic patient record system. RiO contains a comprehensive, longitudinal record of all clinical information recorded throughout patients' contacts with Trust services, including socio-demographic information, dates and other details of referrals and admissions, detailed clinical assessments, care plans and standardized assessment forms. The record consists of both structured fields (such as dates and pick-lists) and unstructured free text (including progress notes and correspondence). The CRIS tool, developed by South London and Maudsley NHS Foundation Trust (SLaM) Biomedical Research Cluster (BRC) to extract information from their bespoke electronic Patient Journey System (PJS), consists of a series of data-processing pipelines which both structure and de-identify fields in the electronic patient record, rendering effectively anonymized data from the full clinical record available at the researcher interface. The system allows researchers to search against any combination of structured and unstructured fields that exists in the database. Users then specify the precise fields they want returned (such as specific diagnostic codes, demographic information and/or a particular text string in a clinical assessment).

University College London (UCL) is C&I's long-standing research partner in clinical research and this is reflected in the development and operation of the C&I CRIS Research Database. The C&I CRIS Research Database administrator is formally employed by University College London with a substantive honorary research contract with C&I. The C&I CRIS Research Database clinical academic lead holds an academic appointment with UCL in the Division of Psychiatry and is a consultant psychiatrist with C&I.

The C&I CRIS Research Database employs the same security model as that developed by SLaM to address the legal and ethical considerations attendant upon the use of confidential health data. Authorized researchers are provided with regulated access to anonymized information extracted from electronic patient records. The Research Database is used to support epidemiological and population-based research using only anonymized data, for which no patient consent is necessary though patients can opt out entirely if they choose.

The data subjects are individuals who:
(i) have received treatment from the Trust between 2012/13 and 2017/18 and some treated earlier where records existed and could be migrated AND who have not notified C&I that they wish to opt out of having their data collected and/or linked, and/or

(ii) individuals who are or have been resident within the London boroughs of Camden & Islington geographic catchment between 2012/13 and 2017/18 and attended hospital for any reason whilst resident in that catchment area.

C&I, the Data Controller, have contracted with the SLaM Clinical Data Linkage Service (CDLS) to carry out certain necessary duties for the processing, supply and hosting of the distinct, C&I CRIS research database. The South London and Maudsley NHS Foundation Trust will act as the Data Processor for the application.

Specifically, the linkage activity, i.e. sending of patient identifiers and receiving of HES and mortality attribute data, will be conducted by the SLaM Clinical Data Linkage Service (CDLS), and the C&I CRIS - HES and mortality linked data will be stored and hosted by SLaM within the C&I specific secure area. The SLaM CDLS provides data processing services (linkage, storage, and data extraction) to external collaborators. All reasonable security steps have been taken to protect data held by SLaM, for example no standalone devices are used and no data is permitted to be stored on the local drives. Furthermore, security measures designed to protect data from being saved outside the SLaM firewall are in place. Data cannot be accessed directly by researchers wishing to use linked data; researchers will only have access to project specific extracts of the C&I-HES/ mortality data. Extractions will be carried out within the C&I secure area within the SLaM firewall by C&I CRIS staff. If appropriate, support is provided to external researchers who wish to access the linked data to help them fulfil the necessary requirements to gain approved status.

In summary, SLaM CDLS, as a Data Processor, processes C&I clinical data on behalf of and under contractual obligation to C&I. C&I maintains exclusive control over access to the C&I CRIS research database as Data Controller. SLaM CDLS has no ability or permission to access C&I data. This contractual relationship is analogous to that of an NHS Foundation Trust and any third-party data processor and host insofar as the NHS Foundation Trust remains the Data Controller while the third-party vendor acts as a Data Processor (for example C&I's relationship with its electronic health record provider). As defined above, SLaM CDLS processes the data, governed by the Data Processing Agreement, to fulfil the terms of its contractual obligation to C&I: to create and maintain the C&I CRIS research database. The lawful bases being relied upon to support the flow of confidential patient information from Camden and Islington NHS Foundation Trust (C&I) to SLaM to facilitate the creation of the C&I CRIS Database and this data linkage are Article 6(1)(e) and Article 9(2)(j) of the General Data Protection Regulation (EU) 2016/679 (GDPR). SLaM are acting as a data processor (as defined in GDPR Article 28) on behalf of C&I who serve as the data controller. Per GDPR Article 28(3), C&I has established a data processing contract and processing agreement for the lawful flow of confidential patient information from C&I to SLaM specifically for the purpose of data processing. SLaM CDLS staff do not have access to the C&I research database.

Access to the C&I CRIS research database is limited to the C&I research database administrator and approved research users at C&I, only for research projects which are approved by the C&I research database oversight committee. Access can only be gained via the C&I network. This includes all data validation and quality checks for pseudonymisation which are conducted by C&I staff following data processing by SLaM CDLS. All external researchers with no contractual arrangements with C&I are required to obtain a Research Passport and Honorary Research Contract prior to project approval. Honorary Research Contracts must be signed by approved research users, their substantive employers, and C&I with wording that the employee will be subject to their substantive employer's disciplinary process if they do anything that they shouldn't with the data. Researchers only have access to pseudonymised linked NHS Digital Data. All research projects are carried out within the C&I network and the linked data remain within the C&I NHS firewall at all times (as is the security model requirement for all analyses of C&I data, regardless of data linkage).

The C&I CRIS Oversight Committee will consider research proposals to use the linked dataset. The Oversight Committee includes research and development governance, Caldicott/ information governance, technical, clinical and service user representation. The Oversight Committee will be responsible for granting or denying approval for all applications to use the linked data. A key consideration of this panel will be to decide if a project poses an increased risk of providing de-anonymised results due to anticipated small cell sizes. Where the panel envisages this to be likely, additional reassurances will be asked for from the applicant and amendments to the application form may be requested. Any successful application will be provided with a bespoke dataset, i.e. one with only the specific variables identified in the application as necessary for the planned analysis. Speculative studies (data dredging) will not be permitted.

Research requests vary from year to year. Historically, CRIS has hosted between 5 and 12 research projects per year from MSc and PhD students at UCL and academic clinical staff at Camden & Islington NHS Foundation Trust. This number has risen in recent years, given the increasing profile of the CRIS research database as a research platform. The study team expect this number to increase following the successful linkage with HES-ONS data, which will facilitate more robust, longitudinal analyses. The study team foresee an increase in research projects to 10-18 projects per year. They expect projects pertaining to their established areas of expertise: severe mental illness, suicide and suicidality, substance use disorders, psychosis, eating disorders, and personality disorders. Understanding the physical health comorbidities of these patients, as well as the causes of mortality, is crucial to research which will elucidate the risk factors and potential interventions to improve care for these individuals.

C&I will only grant access to HES data as part of a linked dataset comprising a minimum of HES and CRIS data (i.e. not for analysis of HES data alone). Broadly, the studies using the linkage have adopted the following designs:
1. Investigations carried out on HES data from the C&I catchment, identifying a HES-derived outcome and comparing its occurrence between people with/without a given diagnosed mental and behavioural health disorder in order to derive standardised morbidity ratios (for example, some current research investigating respiratory disease admissions in people with learning disability compared to the local population);
2. Investigations restricted to people with a given HES-derived outcome and comparing subsequent events between people with/without a given diagnosed mental and behavioural health disorder (for example, further analyses of people with/without a learning disability who have a respiratory disease admission, comparing duration of hospitalisation and risk of readmission between the two groups);
3. Investigations restricted to people with a given diagnosed mental and behavioural health disorder, investigating one or more HES-derived outcomes in relation to C&I-derived information (for example, investigating the relationship between mental health symptom profiles and physical health events in people with severe mental illness);
4. Investigations primarily carried out using C&I data, where HES-derived information is used to provide supplementary information (for example, the ability to adjust for serious physical illness in a number of analyses). This includes the use of mental healthcare data contained on HES for residents in the C&I catchment to capture mental health service use by providers other than C&I (e.g. out-of-catchment hospitalisations);
5. Investigations primarily carried out using C&I data where a HES outcome is used to define the sample (for example, a series of analyses investigating medication and health outcomes before and after childbirth in women with pre-existing severe mental illness).
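
Design 1 above refers to standardised morbidity ratios. As a minimal sketch of indirect standardisation, the ratio is the observed number of events in the cohort divided by the number expected if the reference population's stratum-specific rates applied to the cohort's person-time. The age bands, rates, and counts below are invented for illustration and are not study data; the function name is hypothetical.

```python
def standardised_morbidity_ratio(observed_events, person_years, reference_rates):
    """Indirectly standardised ratio: observed events / expected events.

    person_years and reference_rates are dicts keyed by stratum (e.g. age band).
    reference_rates holds events per person-year in the comparison population.
    """
    expected = sum(person_years[s] * reference_rates[s] for s in person_years)
    if expected <= 0:
        raise ValueError("expected events must be positive")
    return observed_events / expected


# Hypothetical figures, for illustration only.
person_years = {"18-44": 1000.0, "45-64": 500.0, "65+": 200.0}
reference_rates = {"18-44": 0.002, "45-64": 0.010, "65+": 0.040}

# Expected events = 2 + 5 + 8 = 15, so 30 observed events gives a ratio of about 2.
smr = standardised_morbidity_ratio(30, person_years, reference_rates)
```

A ratio above 1 would indicate more events in the cohort than the reference rates predict; a real analysis would also report a confidence interval.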

One further planned linkage with Public Health England and the National Cancer Registry is currently under review by Public Health England's Office for Data Release. This dataset will be stored separately from the proposed HES-ONS-CRIS data linkage and there is no intention or technical ability to link data from the National Cancer Registry to HES-ONS-CRIS linked data.

Expected Benefits:

The over-arching objective of this research programme is to provide information that will assist in narrowing the mortality and physical morbidity disadvantage experienced by people with diagnosed mental and behavioural health disorders. Improvement in the physical health of people with diagnosed mental and behavioural health disorders is highlighted regularly in Government policy and the monitoring of physical health outcomes is increasingly becoming a metric for mental health Trusts, as well as for national structures such as the PHE Mental Health Intelligence Network. The proposed linkage would significantly increase high quality research outputs that examine the interface between mental and physical health. This innovation is supported by service user members of the C&I CRIS oversight committee, along with local patient and public involvement groups the study team have consulted. People using secondary mental health services are rightfully concerned about their risk of premature mortality and morbidity; this linkage has the potential to answer the outstanding questions the study team has outlined below.

There is increasing emphasis on the health inequalities experienced by individuals diagnosed with severe mental illness (SMI), commonly defined as schizophrenia, bipolar disorder, schizoaffective disorder and other non-organic psychotic illnesses. These individuals have been found to have a life expectancy reduced by up to 20 years. What is less clear is how other extremely disabling psychiatric disorders, such as severe depression, post-traumatic stress disorder and personality disorders, compare in terms of premature mortality, self-harm and physical health comorbidities. By linking HES and ONS mortality data with C&I CRIS data the study team will explore and quantify this currently under-researched health disparity. The study team will focus specifically on commonly occurring comorbidities and adverse complications such as cardiovascular disease, respiratory disease, cancer, liver disease and self-harm. Self-harm is an interesting example, as instances of this are currently not well covered in the C&I CRIS records, whereas severe self-harm attempts resulting in emergency department attendance will be well recorded in HES data.

In acknowledgement of this health inequality for mental health patients, there have been concerted national efforts over the past two decades to improve health parity. For example, the Department of Health's 'No health without mental health' policy document and, more recently, the NHS's 'Five year forward view for mental health' strategic guidance identify reduction of morbidity and mortality for people with diagnosed mental and behavioural health disorders as key targets. Moreover, the issue of increased morbidity and mortality is a key strand in the study team's clinical work for people with mental health problems locally at C&I. Recent research indicates that people with severe mental illness remain vulnerable. Therefore, understanding the relationship between physical and mental health, and pathways and barriers to receipt of appropriate physical healthcare, is of enduring relevance. The reasons underlying these disparities in morbidity and mortality are complex and thought to be due to a combination of individual and social factors. These may include the long-term use of antipsychotics and adverse social or economic determinants of health (smoking, obesity, inactivity, and illicit drug use), as well as the cumulative effects of deprivation, stigma and social exclusion, which may all contribute to higher rates of cardiovascular disease, respiratory disease, diabetes mellitus and its complications.

Given the study team's existing data assets which detail pathways of secondary mental health clinical care through the Trust, including clinical free-text along with structured fields, the study team believe that the C&I Research Database will offer greater insights into clinical care than obtaining NHS Digital Mental Health data sets.

Thus far, guidelines on physical healthcare in people with diagnosed mental and behavioural health disorders are mostly extrapolated from studies of the general population without considering more specific risks in those with mental health problems. Given the lack of improvement in health inequalities associated with mental illness, and the persistence of differential morbidity/mortality, there is a pressing need for further research and more thorough investigation into reasons for general hospital admissions among people with diagnosed mental and behavioural health disorders. Insights from this research can meaningfully inform clinical practices, including care guidelines, which can improve routine care and improve understanding of how physical health comorbidities present differently and uniquely among those with diagnosed mental and behavioural health disorders. Reducing disparities in morbidity/mortality is a key goal for overall population health.

There is an existing HES-ONS-CRIS linkage using data from South London and Maudsley (SLaM) NHS Trust (a completely separate data flow from C&I's proposed linkage). The adverse health impact of people with diagnosed mental and behavioural health disorders has been demonstrated with this powerful data-linkage which the study team aims to replicate and extend. Examples include a description of the most common reasons for acute hospital admissions in people with severe mental illness and the predictors of admissions with falls and fractures in this group. SLaM have also provided an evaluation of the accuracy of HES discharge diagnoses for ascertaining diagnosed mental and behavioural health disorders, of importance for groups using HES for this purpose. Moreover, a number of publications have used linked HES data to investigate physical health outcomes experienced by people with a recent dementia diagnosis, including investigations of hospitalisations in people suffering dementia with Lewy bodies, the impact of polypharmacy on hospitalisation outcomes, predictors of falls and fractures, emergency department use close to the end of life, and an evaluation of the accuracy of dementia diagnoses recorded on HES. These publications show that linkage of CRIS data with HES and ONS mortality data has clear potential to yield novel and high quality research publications. However, there has been little research output regarding:
(1) co-morbidity and premature mortality in non-SMI patient groups, and
(2) pathways to treatment for physical health problems in various patient groups, examining if treatment options are offered equitably and how these affect outcomes and mortality.

Furthermore, all studies presented above are based on data from a single Trust (SLaM). This means important work is needed to replicate findings across multiple Trusts with different patient populations and NHS providers, along with further closing of the gaps in knowledge outlined above.

Physical health disadvantages are likely to cross multiple disorders and multiple levels of morbidity: from mortality to non-fatal conditions, and from the individual impact of serious health conditions to the wider economic impacts of increased secondary care use, longer hospitalisations, and increased risk of readmission. There is therefore a need for a coordinated series of analyses to inform on specific areas of inequality in order to target interventions to improve health. In order to improve morbidity and mortality through health and social care interventions, it is important both to have information on the adverse outcomes potentially underlying disadvantages and to be able to characterise groups most at risk of these outcomes.

Outputs:

The primary output of the linkage is the production and maintenance of a research resource for the purpose of use in informative research analyses for publication in peer-reviewed journals and other standard routes of academic dissemination (e.g. conference presentations).

All secondary outputs (whether tables or visuals) will only include aggregated data suppressed according to the HES analysis guide. Outputs must also comply with the UK Data Service's Handbook on Statistical Disclosure Control for Outputs, including the rules around secondary suppression where applicable.

The study team expect that a minimum of two research papers would be published per year using the proposed data linkages. Examples of papers planned for publication include:

1. Co-morbidity and premature mortality in non-severe mental illness patients. Severe mental illness (SMI) is commonly defined as schizophrenia, bipolar disorder and other psychotic disorders. Published research has explored prevalence and incidence of physical co-morbidities among patients with SMI, but less is known about these comorbidities for other mental diagnostic groups such as depression, PTSD, and personality disorder. The paper will explore if common physical co-morbidities such as cardiovascular disease, diabetes, severe asthma, chronic obstructive pulmonary disease, and cancer are also overrepresented in non-SMI patient populations using secondary care mental health services. By linking C&I CRIS with HES-ONS mortality the study team will compare incidence and prevalence between SMI and non-SMI patients, using both published population control estimates and matched HES-ONS mortality control data drawn from Camden and Islington boroughs, looking at both co-morbidity and mortality from the physical conditions known to be overrepresented in SMI. The study team will also explore predictors of mortality and co-morbidity including demographic, social and clinical factors.

2. Mental illness - Pathways to physical healthcare. In order to reduce the physical health inequality experienced by patients suffering from mental health issues, availability and quality of treatment offered prior to and after receiving a comorbid physical diagnosis is of high importance. For example, are patients with mental health problems less likely to receive coronary angioplasty and stents? Are they less likely to receive transplants? Through the linkage of CRIS with HES-ONS mortality, the study team will be able to map out which treatments were offered to patients, and explore how such treatments (or lack of) impacted outcomes of physical and mental health and/or if treatments can be related to cause-specific mortality. The study team will also explore if there is inequality between mental health diagnostic groups in terms of treatment offers and pathways to physical healthcare, and the degree to which social deprivation, ethnicity, diagnosis, medication, age and sex explain any disparities.

All the potential uses of the linked data fall within the stated primary purpose of investigating physical health in people with diagnosed mental and behavioural health disorders. The data being requested will only be used for the purpose described. Any proposed changes will be submitted to NHS Digital for amendment and approval before implementation.

Publication targets will clearly depend on the nature of individual findings and the potential audience envisaged. Where possible, the study team will target general medical and/or public health journals with a broad audience, because analyses are likely to cross disciplines; however, they will also consider specialist journals within the mental health field as well as the individual medical specialties implicated. Dissemination at national and international conferences will adopt a similar strategy of aiming for as broad as possible a reach. They will include mental health focused meetings such as the Royal College of Psychiatrists and European Psychiatric Association congresses, and psychiatric epidemiology meetings such as the International Federation of Psychiatric Epidemiology (IFPE) but they will also seek presentations at medical specialty conferences where results have relevance to those audiences, as well as meetings where commissioners are likely to be represented.

For each application received, the CRIS Oversight Committee considers the study design and advises on optimisation of benefits. The CRIS Oversight Committee also has a responsibility for publicity and dissemination of findings to relevant parties, media and patient groups.

Patient and public involvement (PPI) is central to the operation and ethical approval for the C&I CRIS research database. There are three service users on the C&I CRIS research database oversight committee. All applications for projects to access the C&I research database are reviewed by a service user.

Separately, the study team also have a Data Science PPI group (chaired by the McPin Foundation, a charity integrating experts by experience into research; www.mcpin.org) who comment on and contribute to the design, conduct, and dissemination of studies using the C&I CRIS research database. While this group does not review applications for use of the C&I research database, their advisory role provides important guidance and insights into academic research, including ensuring that research questions are appropriately framed and that research findings are meaningfully contextualised. This PPI group continues to meet regularly to offer their guidance to C&I research database users. The McPin Foundation are not considered Data Controllers as they do not have any say over the data processing methodology. They provide facilitation support to the separate Data Science PPI group given their expertise in integrating lived experience into academic research. The Data Science PPI group provides important insights and guidance but does not regulate access to CRIS data; that is the remit of the CRIS Governance Board, which also includes service user/carer representation.

Processing:

The Clinical Record Interactive Search (CRIS) system contains pseudonymised copies of C&I's electronic patient records for all patients (i.e. all C&I service users) other than those who exercised their right to opt out of participation.

The study research will start with a broad descriptive analysis and the study team will then focus on two specific projects. These projects demonstrate how the linked fully pseudonymised dataset will be used to investigate physical health service provision to adults who have been referred to C&I services compared with the local population. Outputs from the studies have the potential to rapidly inform local and national health and mental health service developments, especially given the size and well characterised nature of the sample. The longitudinal nature of the data will provide future investigations the opportunity to study the impact of treatment and diagnosis on individual health-related outcomes over time. Result summaries will be fed back to relevant organisations such as NICE, and promoted locally with the aim of directly impacting NHS policies and current patient care.

The justification for using non-consent approaches is that linking administrative data is preferred over primary data collection because it provides accurate and complete information and makes efficient use of existing resources. It also has ethical advantages over collecting new survey data, particularly from disadvantaged and vulnerable individuals whose responses are of the greatest importance yet particularly challenging to obtain. These ethical and methodological advantages are of course subject to the security and confidentiality of data linkage, storage and access, and to rigorous information governance and stakeholder consultation procedures.

The study is designed as a series of retrospective clinical cohort studies of adults who have received secondary mental health care, utilising an individually matched dataset containing longitudinal pseudonymised health data on physical health and mortality. The data requested are listed below:

Hospital Episode Statistics (HES) Critical Care 2012/13 to 2017/18
HES Outpatients 2012/13 to 2017/18
HES Admitted Patient Care 2012/13 to 2017/18
HES Accident & Emergency 2012/13 to 2017/18
Civil Registrations (Deaths) data extract to cover this period.
Demographics data extract to cover this period.

The cohort will be made up of: All adults (aged 18 and over) who have been referred for C&I treatment between 1st January 2008 and 31st December 2018. The sample size is approximately 146,000 adults, and characterised with a range of symptom severity from common diagnosed mental and behavioural health disorders (e.g. depression and anxiety) to severe diagnosed mental and behavioural health disorders (e.g. schizophrenia, bipolar affective disorder), substance use disorders and organic disorders (e.g. neurological syndromes associated with severe intellectual impairment).

Measures: As described in a number of recent studies, C&I CRIS data provides individual level data on sociodemographics (date of birth, sex, ethnicity, neighbourhood deprivation) and time variant data on ICD-10 psychiatric diagnoses, diagnostic assessments, illness severity (e.g. via scales including the Health of the Nation Outcome Scales), risks (e.g. suicidal ideation, physical disability, to others and from others), mental health treatment (frequency of contact, type, professionals involved, local or specialist services, community vs inpatient), medication (e.g. antipsychotics, stimulants, anti-depressants, hypnotics), psychotherapeutic interventions (individual or group CBT, family therapy, psychodynamic etc.) and treatment adherence. The study team use General Architecture for Text Engineering (GATE) software to develop precise text mining algorithms to extract and code clinically relevant free text data (typed notes) from CRIS.
Hospital Episode Statistics (HES) are held by NHS Digital and include all accident and emergency, hospital admissions and outpatient visits which occur in all hospitals throughout England. This includes important clinical information such as diagnoses, operations or the speciality of the treating clinician; demographic information such as age, sex and ethnicity; and also administrative data such as methods of admission and discharge.

The Office for National Statistics (ONS) collects information on cause of death from an individual's death certificate; this information is held by NHS Digital in the form of Civil Registration (Deaths) data extracts. This includes diagnosis (using both free text and structured ICD-10), and date and cause of death.

Using deterministic matching techniques NHS Digital will link the C&I CRIS and HES/mortality data sets for all patients seen by C&I services. This includes those resident to the boroughs of Camden and Islington. However, it also includes those referred to C&I national and specialist services from outside the catchment area. In addition, pseudonymised HES/mortality data on the residents of the two London boroughs which form the C&I catchment area (Camden and Islington) will also be sought to enable comparison. The HES/mortality data will enhance CRIS data, enabling researchers to explore and identify health inequalities.

The linkage will not generate or collect new data. Rather, it will be a static linking of datasets. Both datasets have been previously created as a matter of course in the performance of service activities by each Data Controller. The utility of the linked dataset created will be demonstrated through the programme of research described below.

Methodology:
1. SLaM CDLS creates a cohort (approximately 146,000 individuals) with identifiers to include Study ID (BRCID), NHS Number, Post Code, First Name, Last Name, sex, and Date of Birth and sends this to NHS Digital via Secure Electronic File Transfer (SEFT).

2. NHS Digital extracts the HES and mortality data fields requested and removes the identifiers, leaving the Study ID in place. NHS Digital sends the pseudonymised extract to SLaM CDLS via SEFT.

3. SLaM CDLS uploads the pseudonymised data to the C&I CDLS data safe haven.
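
The three steps above can be sketched in a few lines. This is illustrative only: the field names, the matching key (NHS number plus date of birth), and the sample records are assumptions made for the example, not NHS Digital's actual matching algorithm or data schema.

```python
# Fields treated as identifiers and stripped before return (assumed names,
# based on the identifier list in step 1).
IDENTIFIER_FIELDS = {
    "nhs_number", "postcode", "first_name", "last_name", "sex", "date_of_birth",
}


def link_and_pseudonymise(cohort, hes_records):
    """Exact-match HES records to the cohort, then drop identifiers.

    cohort: list of dicts holding a study_id plus identifier fields.
    hes_records: list of dicts holding identifiers plus attribute fields.
    Returns attribute rows that carry only the study_id as a key.
    """
    # Deterministic (exact) lookup on the assumed matching key.
    lookup = {(p["nhs_number"], p["date_of_birth"]): p["study_id"] for p in cohort}
    extract = []
    for rec in hes_records:
        study_id = lookup.get((rec["nhs_number"], rec["date_of_birth"]))
        if study_id is None:
            continue  # no exact match, so the record is not returned
        row = {k: v for k, v in rec.items() if k not in IDENTIFIER_FIELDS}
        row["study_id"] = study_id
        extract.append(row)
    return extract


# Entirely fictitious records, for illustration only.
cohort = [{
    "study_id": "BRC001", "nhs_number": "9990000001", "postcode": "ZZ1 1ZZ",
    "first_name": "Test", "last_name": "Person", "sex": "F",
    "date_of_birth": "1980-01-01",
}]
hes_records = [
    {"nhs_number": "9990000001", "date_of_birth": "1980-01-01",
     "postcode": "ZZ1 1ZZ", "admission_date": "2015-06-01", "diagnosis": "J18"},
    {"nhs_number": "9990000002", "date_of_birth": "1975-05-05",
     "postcode": "ZZ2 2ZZ", "admission_date": "2016-01-01", "diagnosis": "I21"},
]
extract = link_and_pseudonymise(cohort, hes_records)
```

Only the first fictitious record matches, and the returned row carries the Study ID and attribute fields with no identifiers.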

The HES and mortality data will not be linked with patient identifiers from C&I's electronic patient record and no attempt will be made to re-identify individuals in the data under any circumstances.

C&I will manage and finance the resources required to sustain the proposed database. More specifically, the day to day processes of running the database will be conducted by a collaborative team within the SLaM Clinical Data Linkage Service (CDLS) who are hosting C&I's data within a C&I-specific area within a secure firewall in the SLaM network. Therefore, the day to day processes of hosting the database will be managed by this team. The SLaM CDLS is an impartial, trusted third party service comprising a small, dedicated team of informatics, IT, and Information Governance (IG) professionals. The SLaM CDLS is part of both SLaM ICT and Information Governance Departments. There will be no further linkage of the NHS Digital data.

All the datasets will be stored separately and are only accessible to a restricted number of approved technical support staff. Technical staff (all of whom are substantive employees of C&I or C&I's Data Processor, SLaM CDLS) will then assemble bespoke de-identified linked databases meeting the approved requirements of the research study. These are deposited in shared network drives within the C&I network. For each research database created, a different encoded identifier variable (anonym) will be assigned, meaning there are no common identifiers or pseudo-IDs across different databases, making it impossible for researchers to link their database with source C&I, HES, or Mortality data. This uses a one-way encryption method following which anonyms cannot be reverse engineered.
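
One common way to produce per-database identifiers that cannot be joined across databases is a keyed one-way hash with a different secret for each research database. The sketch below uses HMAC-SHA256 as an assumption; the source does not state which one-way method is used, and the function and variable names are hypothetical.

```python
import hashlib
import hmac
import secrets


def make_anonym(study_id: str, project_key: bytes) -> str:
    """Derive a project-specific anonym from a study ID with a keyed one-way hash."""
    return hmac.new(project_key, study_id.encode("utf-8"), hashlib.sha256).hexdigest()


# Each research database gets its own random secret, held only by technical staff.
key_project_a = secrets.token_bytes(32)
key_project_b = secrets.token_bytes(32)

# The same individual receives unrelated anonyms in the two databases.
anonym_a = make_anonym("BRC001", key_project_a)
anonym_b = make_anonym("BRC001", key_project_b)
```

Because the keys differ, the two anonyms share no common value, and without the key an anonym cannot feasibly be mapped back to the study ID.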

Microsoft Ltd provide Azure Backup Storage Services for South London and Maudsley NHS Foundation Trust and are therefore listed as a data processor. They supply support to the system, but do not access data. Therefore, any access to the data held under this agreement would be considered a breach of the agreement. This includes granting of access to the database[s] containing the data.

When an application has been approved by the CRIS Oversight Committee, technical staff, all of whom are substantive employees of C&I, assemble bespoke de-identified linked databases meeting the approved requirements of the research study. These are deposited in shared network drives within the C&I network.

Approved researchers can only access the data on location within the C&I network. All research databases remain within the C&I firewall at all times on the C&I network. A dedicated office suite has been set up in the Bloomsbury Building onsite at St. Pancras Hospital in order to facilitate analyses using C&I data. Removal of data from this environment is expressly forbidden other than in the form of aggregated summary data with small numbers suppressed in line with the HES Analysis Guide. For each research database created a different encoded identifier variable (anonym) is assigned meaning there are no common identifiers or pseudo-IDs across different databases making it impossible for researchers to link their database with source CRIS, HES or mortality data. This uses a one-way encryption method following which anonyms cannot be reverse engineered. Researchers do not have access to the record level identifiable or pseudonymised HES or mortality data.

At the completion of research projects, the databases used are removed from the shared network drive and archived for a period of 5 years and then permanently destroyed.

HES and ECDS DISCLOSURE CONTROL / SMALL NUMBER SUPPRESSION
In order to protect patient confidentiality, when presenting results calculated from HES record level data, outputs will contain only aggregate level data with small numbers suppressed in line with HES Analysis Guide. When publishing HES data, you must make sure that:
cell values from 1 to 7 are suppressed at a local level to prevent possible identification of individuals from small counts within the table.
Zeros (0) do not need to be suppressed.
All other counts will be rounded to the nearest 5.
Data will not be made available to any third parties other than those specified except in the form of aggregated outputs with small numbers suppressed in line with the HES Analysis Guide.
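
The suppression rules stated above can be expressed as a small function. This is a sketch of those rules only (suppress counts of 1 to 7, leave zeros, round all other counts to the nearest 5); the function name and the `*` suppression marker are assumptions, and secondary suppression is not handled.

```python
def disclosure_control(count: int):
    """Apply the small-number rules to a single table cell."""
    if count < 0:
        raise ValueError("counts cannot be negative")
    if count == 0:
        return 0          # zeros do not need to be suppressed
    if count <= 7:
        return "*"        # values from 1 to 7 are suppressed
    # Round to the nearest 5 using integer arithmetic (no ties for whole counts).
    return (count + 2) // 5 * 5


# Hypothetical table cells, for illustration only.
table = {"group_a": 0, "group_b": 4, "group_c": 8, "group_d": 13, "group_e": 52}
published = {name: disclosure_control(n) for name, n in table.items()}
# published == {"group_a": 0, "group_b": "*", "group_c": 10, "group_d": 15, "group_e": 50}
```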

Understanding the health needs of mothers involved in family court cases — DARS-NIC-196263-J9Q7Z

Opt outs honoured: No - Statutory exemption to flow confidential data without consent, No (Excuses: Statutory exemption to flow confidential data without consent)

Purposes: No (Academic)

Sensitive: Non Sensitive, and Sensitive, and Non-Sensitive

When:DSA runs 2020-12 – 2023-11 2021.01 — 2023.06. SMLS reported a DPA serious incident; breached contract — audit report.

Access method: One-Off

Data-controller type: UNIVERSITY COLLEGE LONDON (UCL)

Sublicensing allowed: No

AGD/predecessor discussions: AGD minutes - 18th September 2025 final.pdf, AGD minutes - 2 May 2024 final.pdf, AGD minutes - 4 May 2023 final.pdf, igardminutes-8thoctober2020final.pdf, igarddraftminutes10thdecember2020final.pdf

Datasets:

MRIS - Bespoke
Civil Registration (Deaths) - Secondary Care Cut
HES:Civil Registration (Deaths) bridge
Hospital Episode Statistics Accident and Emergency
Hospital Episode Statistics Admitted Patient Care
Hospital Episode Statistics Outpatients
MRIS - List Cleaning Report
Civil Registrations of Death - Secondary Care Cut
Hospital Episode Statistics Accident and Emergency (HES A and E)
Hospital Episode Statistics Admitted Patient Care (HES APC)
Hospital Episode Statistics Outpatients (HES OP)

Type of data: Anonymised - ICO Code Compliant

Objectives:

University College London (UCL) are proposing to link an existing cohort of mothers and babies (held by the research team under a separate Data Sharing Agreement (DSA) with UCL; NIC-393510-D6H1D) with programme information from the Children and Family Court Advisory and Support Service (Cafcass). Women aged between 15 and 50 years, with at least one live birth recorded in the Hospital Episodes Database between 01/04/1997 and 31/03/2017, will make up the cohort. It is estimated that there will be a maximum of 12 million women in this cohort. This cohort of women will be linked to Cafcass data on women involved in care proceedings between 01/04/2007 and 31/03/2019, which includes 113,191 mothers. Those women who are not linked to Cafcass data will act as a control cohort for analysis. This will allow UCL to compare women linked in the Cafcass data to the general population of women giving birth, as 97% of women in England give birth in NHS hospitals.
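
As an illustration of the cohort logic described above, the sketch below applies the stated age range and birth window, then splits eligible women into a Cafcass-linked group and a comparison group. The record layout, field names, inclusive age bounds, and sample values are assumptions made for the example.

```python
from datetime import date

# Live-birth window stated in the objectives.
BIRTH_WINDOW = (date(1997, 4, 1), date(2017, 3, 31))


def is_eligible(age_at_birth: int, delivery_date: date) -> bool:
    """Aged 15 to 50 (assumed inclusive) with a live birth inside the window."""
    in_window = BIRTH_WINDOW[0] <= delivery_date <= BIRTH_WINDOW[1]
    return 15 <= age_at_birth <= 50 and in_window


def split_cohort(mothers, cafcass_ids):
    """Return (linked, controls) lists of IDs for eligible mothers."""
    linked, controls = [], []
    for m in mothers:
        if not is_eligible(m["age_at_birth"], m["delivery_date"]):
            continue
        (linked if m["id"] in cafcass_ids else controls).append(m["id"])
    return linked, controls


# Fictitious records, for illustration only.
mothers = [
    {"id": "M1", "age_at_birth": 24, "delivery_date": date(2005, 6, 1)},
    {"id": "M2", "age_at_birth": 31, "delivery_date": date(2010, 2, 14)},
    {"id": "M3", "age_at_birth": 29, "delivery_date": date(2018, 1, 1)},  # outside window
]
linked, controls = split_cohort(mothers, cafcass_ids={"M1"})
```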

The Cafcass extract used for this study will be minimised by the UCL research team to only contain information on women aged 15-50 years who were party to section 31 applications (care proceedings where local authorities apply to have a child removed from parental supervision due to serious concerns for the child’s safety and wellbeing). The extract will contain demographic information on the mothers (date of birth, local authority of residence), information on the children subject to the case (number of children, month and year of birth, sex), and information on the care proceedings (hearing dates, final legal output).

The Cafcass data is held by the Children and Family Court Advisory and Support Service, and available for researchers upon request. (Research must be approved by the Cafcass Research Advisory Committee. The committee checks if the proposed research is feasible, as well as beneficial for Cafcass and for children and families using the service. The research will only be approved if it is relevant to Cafcass’ statutory remit and strategic aims. In this case, the committee has approved the UCL project as being done in accordance with CJCSA section 13 and falling within its strategic aims).

The data under NIC-393510-D6H1D is disseminated for a programme of research within the healthcare provision theme of the Policy Research Unit for Children, Young People and Families (CPRU), within University College London (UCL) which is funded by the Department of Health and Social Care (DHSC). The extract of data from NIC-393510-D6H1D for the purpose of this agreement NIC-196253 will only contain HES APC, A&E, Outpatient data and Civil Registration data on women who had a record of a live birth between the ages of 15 and 50 years old between 1 April 1997 and 31 March 2017.

This request aims to assess the feasibility of using linked health and family court administrative data sets to facilitate research exploring underlying issues such as long-term health and mental health problems. Further, it aims to inform policy about associations between justice system risk factors, such as regional variation in returns to court, and health outcomes, such as time to next pregnancy and mental health exacerbations during court proceedings.

The study aims to generate evidence about the health needs of mothers involved in public law care proceedings in England. There is clear evidence that mothers, whose children enter public care or are adopted, often have complex health needs, such as drug and/or alcohol misuse, exposure to violence, mental health problems as well as chronic physical conditions.

Key questions are:

Question 1: Are there health characteristics of mothers that are associated with a high likelihood of care proceedings?

Question 2: Among mothers involved with care proceedings, what characteristics are associated with time to subsequent pregnancy, adversity related admissions and adverse outcomes related to court?

UCL has defined adversity related admissions as admissions related to alcohol and/or illicit drug use, self-harm, violence and severe mental health problems. As adverse outcomes, UCL will assess time to next pregnancy (and return to court for further children in the Cafcass data), emergency hospital admissions and deaths.

Key policy questions will be addressed by this study:
1) Is there an unmet burden of health need that could be reduced by interventions in courts, social services or health services?
2) Could improved input by healthcare services reduce the chance of being the subject of care proceedings and improve health and welfare outcomes for mother and child?

UCL will attempt to generate research evidence to answer these questions by assessing which health characteristics of mothers are associated with a high likelihood of being involved in care proceedings, and determining what characteristics are associated with time to subsequent pregnancy, adversity related admissions and adverse outcomes related to court.

UCL will explore risk factors known at live birth (such as maternal age, parity, (history of) mental health related admissions, underlying long-term conditions, adversity-related injury admissions) and their association with timing of care proceedings. UCL require whole HES records to assess how and when women use health services, determine whether there is proactive management through outpatients, assess whether health crises recorded as Accident and Emergency (A&E) attendances or unplanned admissions are related to hearing dates, and assess underlying conditions using women’s medical history as recorded in Hospital Episode Statistics (HES) (e.g. long-term conditions or adversity-related admissions as recorded in ICD-10 or OPCS codes (classification of Interventions and Procedures)). UCL request HES data from 1997 in order to identify the birth episodes for children involved in court cases from 2007 onwards, as well as to identify any long-term conditions in the mothers.

The purpose of this application falls under Article 6 (1) (e) of the GDPR and the lawful basis for using information collected routinely for administrative purposes for research is the ‘public task’. This is part of the University’s commitment to ‘integrate research and innovation for the long-term benefit of humanity’. The application also falls under Article 9 (2) (j), as scientific research.

The study has been designed by UCL researchers without the involvement of the funders (Nuffield Foundation).

There are no alternative, less intrusive ways of achieving the purpose of this study.

UCL are the sole data controller who also process the data under this agreement.

There are no other organisations involved in the study other than those referenced.

Expected Benefits:

The research project in this application informs policy and practice. All analyses undertaken as part of this project aim to provide evidence to inform health care professionals, service providers, policy makers and service users about the health of mothers in the family court system and how services currently meet their needs.

Results will provide evidence of how healthcare need and use of services differs between women who become subject to care proceedings and those who do not. This will provide evidence that can guide the design of potential interventions to reduce care proceedings that Cafcass could test and implement.

This study will examine whether indicators for maternal adversity recorded in healthcare such as a history of admissions for mental health problems, drug/alcohol abuse or exposure to violence can identify groups of women at a live birth event who could benefit from proactive or preventive healthcare input that might reduce risk of involvement in (recurrent) care proceedings and improve health outcomes for both mothers and children.

This information will help family courts as it can identify health needs that health services are aware of at first contact with courts. Improved communication between services could direct potential interventions at this point. Additionally, results will evaluate the healthcare trajectories of women involved in (recurrent) care proceedings. This will provide evidence to assess whether healthcare services can identify (and potentially address) health needs in women at live birth before they go to court.

Evidence is lacking about the extent to which health needs are addressed by services before, during and after court involvement.

The sparse evidence currently available on families involved in recurrent care proceedings suggests this group is particularly vulnerable. Understanding when and how mothers use healthcare, what their health trajectories before care proceedings look like compared to other women with a live birth, and what their healthcare needs are before, during and after involvement in care proceedings will inform interventions aimed at safeguarding vulnerable families (defined as families involved in section 31 care proceedings for this study).

The hypothesis is that a longer time to next pregnancy following a first care proceeding is associated with a reduced risk of recurrent care proceedings, but this association may vary by area, final legal order of index care proceeding and individual risk factors such as presence of long-term conditions or mental health needs in the mother.

This study will also test the feasibility and success of linkage between health and family court data, while evaluating an important policy question about the association between health service use and recurrent care proceedings, taking into account risk factors.

Outputs:

The research will inform Department of Health and Social Care and Ministry of Justice policy makers, service providers and practitioners about patient and service factors associated with the health needs of mothers and children involved in care proceedings in family court. UCL will work with the funder, the Nuffield Foundation to organise a symposium inviting key policy makers and scientists to showcase the findings.

Additionally, UCL will share results with patients who are part of a mental health service user group based at King’s College London and PAUSE, a charity working with women who have experienced, or are at risk of, repeat removals of children from their care. https://www.pause.org.uk/about-us/

UCL has also formed a project advisory group that includes six leading practitioners from across health, family justice and child safeguarding. Part of UCL’s planned discussions with this group will include identifying ways to improve communication about this project and to identify a wide range of organisations including within health, children’s social care, family justice and the voluntary sector with which to share study findings. This group includes a PAUSE practice lead and the programme manager for SHRINE (a human rights -based service in South London delivering sexual and reproductive healthcare to marginalised populations including people with serious mental illness or substance misuse).

Previous public and patient engagement work for this data linkage project has helped to identify further stakeholders to engage with this work. UCL will continue to undertake public and patient engagement work to support interpretation of findings and to discuss methods of sharing study findings, to ensure findings are accessible to policy makers, practitioners and the study population.

The findings will be published in peer reviewed journals and reports prepared for the funder, the Nuffield Foundation.

The papers resulting from these studies will be published in peer-reviewed journals (such as the Lancet, Archives of Disease in Childhood, PLoS Medicine, BMJ Open) and presented at scientific conferences (such as the International Population Data Linkage Conference, International Society for the Prevention of Child Abuse and Neglect, Royal College of Paediatrics and Child Health annual conference, and Informatics for Health conference).

UCL aim to present the work at scientific conferences during 2021 and use feedback provided at these meetings to write up papers to be submitted for publication in 2021.

Outputs will be aggregated with small numbers suppressed in line with the HES analysis guide.
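
As an illustration only, the aggregation and suppression step could look like the Python sketch below. The threshold of 7, the rounding base of 5, the "*" marker and the function name are assumptions chosen for the example; the rules actually applied are those set out in the HES analysis guide, which this record does not reproduce.

```python
def suppress_small_counts(table, threshold=7, base=5, marker="*"):
    """Mask small non-zero counts and round the rest to the nearest `base`.

    `threshold`, `base` and `marker` are illustrative assumptions, not
    values taken from this agreement.
    """
    released = {}
    for group, count in table.items():
        if 0 < count <= threshold:
            # Small non-zero cells are replaced by a marker.
            released[group] = marker
        else:
            # Integer arithmetic avoids float rounding surprises.
            released[group] = base * ((count + base // 2) // base)
    return released


# Hypothetical aggregated counts per group (made-up numbers).
counts = {"group_a": 3, "group_b": 12, "group_c": 0, "group_d": 113191}
released = suppress_small_counts(counts)
print(released)
```

Applying the rule only to the released aggregate keeps record-level data inside the secure environment.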

Processing:

Data sources

The datasets to be linked are:
1) Hospital Episode Statistics (HES APC, A&E and OPD) and death registration data which contains details of all hospital contacts and deaths in NHS hospitals in England collected from 1 April 1997 to date (data UCL already hold under NIC-393510-D6H1D). Women aged between 15 and 50 years, with at least one live birth recorded in the Hospital Episodes Database between 01/04/1997 and 31/03/2017, will be extracted to make up the cohort to be linked with the Cafcass data.
2) The Children and Family Court Advisory and Support Service database (Cafcass) contains information on family care proceedings, including demographic information on persons involved in court applications (adults and children), hearing and application dates, and proceeding outcomes. Data is available from Cafcass from 2007 onwards. The Cafcass extract used for this study will be minimised to only contain information on women aged 15-50 years who were party to section 31 applications (care proceedings where local authorities apply to have a child removed from parental supervision due to serious concerns for the child’s safety and wellbeing). Cafcass will minimise the data in this way before it is sent to UCL. The extract will contain demographic information on the mothers (date of birth, local authority of residence), information on the children subject to the case (number of children, month and year of birth, sex), and information on the care proceedings (hearing dates, final legal output).

The study cohort will include all women for whom a section 31 application has been made (by an English local authority) concerning their child(ren) between April 2007 and March 2019. Though most section 31 applications result in a section 31 order being made, a considerable number result in children being placed with extended family under private law orders (such as a Special Guardianship Order) and a small number of applications are dismissed or are subject to an ‘Order of No Order’. Each of these outcomes is equally important, particularly when considering the heterogeneity of this study population with respect to identifying risk factors for returning to court for subsequent s31 court proceedings.

To mitigate the risk of re-identification, identifiers will be used by the trusted third party (NHS Digital) to carry out the link between the two data sets (Cafcass and HES, via PDS); the final analysis file will only contain pseudonymised non-sensitive variables.

Data flow:
• Cafcass will supply NHS Digital with a list of identifier variables for women aged 15-50 years involved in section 31 care proceedings, including name (first and surname), date of birth, local authority of residence during care proceedings and postcode histories, alongside a study specific pseudo-identifier number (pseudo-study ID) for mothers involved in care proceedings in England that started between 1 April 2007 and 31 March 2017. Only identifiers for women aged 15 to 50 years involved in section 31 proceedings will flow. No other Cafcass data will flow. The legal basis for the flow of identifying data from Cafcass to NHS Digital is Practice Direction 12G, which enables Cafcass to lawfully share information about family court proceedings for the purposes of an approved research project.

These identifiers will be linked to records held in the Personal Demographics Service (PDS) using an algorithm that prioritises the most recent postcode in Cafcass (at the end of the last case the woman was involved in). The identifiable information disclosed by Cafcass will be used by NHS Digital to facilitate linkage with the PDS data in order to identify patients within the HES data.

• For the remaining unmatched Cafcass records, the second postcode in Cafcass will then be compared with PDS as above. This approach will be repeated up to a maximum of 3 postcodes in Cafcass. (Only up to 3 postcodes per person are available from Cafcass for linkage but UCL would expect NHS Digital to compare these with up to 5 postcodes from the PDS).

• A linked file will be created, with the pseudo-study ID provided by Cafcass and the matching NHS number and postcode as recorded in PDS.

• The national data opt-out will then be applied to this linked file and anyone from the Cafcass cohort who has opted out of having their data shared for research or planning purposes will be removed from the linked file.

• UCL will use HES data from an existing data extract (NIC-393510-D6H1D) which is currently held by UCL to create a minimised extract containing HES APC, A&E, Outpatient data and Civil Registration data on only women who had a record of a live birth between the ages of 15 and 50 years old between 1 April 1997 and 31 March 2017.

NHS Digital will disclose a file of pseudonymised HES-IDs for women within the Cafcass dataset to UCL to enable linkage with this existing HES-Civil Registration extract for the purpose of this study. The NHS Digital file will not contain the NHS Number or the postcode.

• NHS Digital will retain the identifier file of all individuals linked in Cafcass-PDS and PDS-HES and all the postcodes used in linkage and postcode dates for 12 months to address data queries. This data set will not contain any attribute data and will be held separately from the final analysis file.

• NHSD will transfer to UCL a list of pseudonymised study IDs for women who link and encrypted HES-IDs for women who link and women who do not link.

There will be no requirement or attempt to re-identify individuals.
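
The postcode cascade described in the data flow above (most recent Cafcass postcode first, then up to two further postcodes, each compared with up to five PDS postcodes) can be sketched as follows. This is a simplified illustration and not NHS Digital's linkage algorithm: the exact-match rule on name and date of birth, the `Person` structure, the function names and all example records are assumptions made up for the example.

```python
from dataclasses import dataclass


@dataclass
class Person:
    ref: str            # pseudo-study ID (Cafcass) or NHS number (PDS)
    first_name: str
    surname: str
    date_of_birth: str  # ISO date string
    postcodes: list     # most recent first


def _clean(text):
    # Ignore case and spacing differences when comparing strings.
    return "".join(text.split()).upper()


def _identity(p):
    return (_clean(p.first_name), _clean(p.surname), p.date_of_birth)


def cascade_link(cafcass, pds, max_cafcass_postcodes=3, max_pds_postcodes=5):
    """Try each Cafcass postcode in turn; later postcodes are only used
    for records still unmatched after the earlier passes."""
    linked = {}
    unmatched = list(cafcass)
    for rank in range(max_cafcass_postcodes):
        remaining = []
        for woman in unmatched:
            candidates = []
            if rank < len(woman.postcodes):
                postcode = _clean(woman.postcodes[rank])
                candidates = [
                    p for p in pds
                    if _identity(p) == _identity(woman)
                    and postcode in {_clean(x) for x in p.postcodes[:max_pds_postcodes]}
                ]
            if len(candidates) == 1:  # accept unambiguous matches only
                linked[woman.ref] = candidates[0].ref
            else:
                remaining.append(woman)
        unmatched = remaining
    return linked, [w.ref for w in unmatched]


# Entirely fictitious records for illustration.
cafcass = [
    Person("S001", "Ann", "Example", "1985-03-14", ["AB1 2CD", "EF3 4GH"]),
    Person("S002", "Bea", "Sample", "1990-07-01", ["ZZ9 9ZZ", "IJ5 6KL"]),
    Person("S003", "Cat", "Nomatch", "1979-11-30", ["MN7 8OP"]),
]
pds = [
    Person("9000000001", "ANN", "EXAMPLE", "1985-03-14", ["ab12cd"]),
    Person("9000000002", "Bea", "Sample", "1990-07-01", ["IJ5 6KL", "QR9 0ST"]),
]
linked, unlinked = cascade_link(cafcass, pds)
print(linked)
print(unlinked)
```

In this sketch the second record only links on its second postcode, and the third record stays unlinked, which mirrors the "remaining unmatched records" step in the data flow.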

Data processing is only carried out by substantive employees of UCL who have been appropriately trained in data protection and confidentiality. The data requested will be kept in UCL’s Data Safe Haven (DSH). A file transfer mechanism enables information to be transferred into the Safe Haven simply and securely.

The UCL DSH uses Dual Factor Authentication to access and handle data transferred into the DSH service. This ensures that only the named applicants will have access to the data from DSH. Removing data from the Data Safe Haven is only allowed for the Principal Investigator.

UCL will not attempt to link the Cafcass cohort linked to HES and Civil Registration data disseminated under NIC-393510-D6H1D to any of the other datasets held under NIC-393510-D6H1D.

Analysis of the linked dataset will involve the following steps:
Question 1: Are there health characteristics of mothers that are associated with a high likelihood of care proceedings?

• As a first step, UCL will determine the proportion of mothers in the linked Cafcass-HES cohort who have prior indicators of vulnerability (history of admissions for mental health problems, injuries related to self-harm, drug or alcohol misuse or violence, exposure to violence, chronic conditions, or young age (<20 years) at (previous) live birth).
• UCL will determine the frequency of hospital contacts (planned and unplanned) for this group of women and describe how this relates to timing of pregnancy and birth, as well as the care proceeding. The findings will inform policy makers about the potential for health services to offer more holistic care for this population of women before they are involved in care proceedings (e.g. are indicators of vulnerability recorded before pregnancy?), and whether proactive care could be warranted during care proceedings (e.g. if women are at risk of exacerbations such as self-harm admissions during proceedings).

From previous research as part of the Children Policy Research Unit (NIC-393510) UCL will have information on the sub-cohort of mothers with history of risk factors for child maltreatment (e.g. mental health, alcohol/illicit drug, or violence related hospital admissions) as compared to the whole population, for instance the rates and ages at first live birth.
• The main analyses for question 1 will determine whether there are specific factors in the healthcare records before and during birth and immediately after that could identify groups of mothers who might benefit from targeted early intervention. By having a comparator group of mothers not involved in court proceedings, UCL will quantify the relative importance of these risk factors, and identify potential points of intervention for healthcare services (e.g. at antenatal care, or during hospitalisation prior to pregnancy or birth).
• In secondary analyses, UCL will determine whether associations between healthcare risk factors and having a record of care proceedings are affected by indicators of unmet healthcare need such as emergency admissions to hospital or serious healthcare conditions in the mother. UCL hypothesise that contact with healthcare services provides opportunities for interventions, such as the provision of contraception to delay future pregnancies, or treatment for mental health problems, which might improve capacity to parent.
• UCL will use multiple regression techniques to study these associations. This will provide UCL with an estimate of the likelihood of care proceedings in each risk group, adjusted for risk factors such as area-based deprivation levels and underlying health needs.

Question 2: Among mothers involved with care proceedings, what characteristics are associated with time to subsequent pregnancy, adversity related admissions and adverse outcomes related to court?
• Two key outcomes for question 2 are the timing of subsequent pregnancies after an initial set of care proceedings and involvement in recurrent care proceedings, given a further pregnancy. From previous work, UCL would expect that approximately 25% of mothers will have recurrent care proceedings, but it is not currently known how many women have subsequent pregnancies.
• UCL will also analyse a range of secondary outcomes obtained from the health data for mothers, including rates of hospital admission and death. Assessing these outcomes will allow UCL to examine how indicators of healthcare need are associated with the primary outcome ‘timing of subsequent pregnancies and care proceedings’. Findings from these analyses will inform inferences about whether input from healthcare might help prevent further care proceedings.
• UCL will use multivariable survival analyses to determine time to next pregnancy and recurrent care proceedings, and whether this is related to underlying risk factors evident before or during the index care proceedings. UCL will also explore whether any associations are mediated through healthcare events or needs that are indicated by HES records after the first set of care proceedings. For example, UCL will determine whether risk factors such as maternal age at first birth, presence of other children, underlying chronic health conditions, a history of injury due to self-harm, drug or alcohol misuse or violence, are associated with the timing of subsequent pregnancy and recurrent care proceedings and whether these risk factors could be used to identify mothers most likely to experience further care proceedings.
• The analyses will quantify the relative contribution of health events and healthcare needs for mother and/or child to the likelihood of recurrent care proceedings. If UCL find strong associations, UCL could hypothesise that health interventions postpone the timing of subsequent pregnancy and reduce the likelihood of recurrent care proceedings. The findings will provide evidence on the potential for healthcare interventions to reduce the risk of subsequent care proceedings.
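
To make the time-to-event idea concrete, the sketch below computes a Kaplan-Meier estimate of the proportion of women without a subsequent pregnancy over time. It is a deliberately simplified, unadjusted stand-in for the multivariable survival analyses described above; the function name and the follow-up data are made up for illustration.

```python
def kaplan_meier(durations, events):
    """Return [(time, survival_probability)] at each distinct event time.

    durations: follow-up time per woman (e.g. months since index proceedings)
    events:    1 if a subsequent pregnancy was observed, 0 if censored
    """
    data = sorted(zip(durations, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        observed = 0
        leaving = 0
        # Group everyone whose follow-up ends at time t.
        while i < len(data) and data[i][0] == t:
            observed += data[i][1]
            leaving += 1
            i += 1
        if observed:
            survival *= 1 - observed / at_risk
            curve.append((t, survival))
        at_risk -= leaving
    return curve


# Hypothetical follow-up in months; 0 marks censoring (no event seen).
months = [6, 12, 12, 18, 24]
pregnancy_observed = [1, 1, 0, 1, 0]
curve = kaplan_meier(months, pregnancy_observed)
for t, s in curve:
    print(t, round(s, 3))
```

Censored women contribute to the number at risk until their follow-up ends, which is why they cannot simply be dropped from the analysis.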

When describing and analysing differences in secondary care service use among the study and control cohorts, UCL will use proxy measures for potentially unmet healthcare need such as having recurrent A&E attendances, readmissions to hospital, and emergency admissions. In addition, UCL will look at reasons for and length of emergency admissions; for example, 0-1 night stays, ambulatory care sensitive conditions as well as admissions related to mental health, substance misuse and exposure to violence have been used previously to identify potentially avoidable admissions to hospital.

However, UCL also recognise the limitations of the data. As UCL attempt to answer these policy questions, UCL expect to develop hypotheses that could be addressed in further research focussing on evaluating interventions to respond to and reduce unmet healthcare need in the study population.

UCL have formed a Project Advisory Group that includes practitioners across health, social care and family justice to support the interpretation of study findings. In addition, similar work by researchers in Wales using Cafcass Cymru linked to NWIS (NHS Wales Informatics Service) data within the SAIL data bank will offer opportunities to compare findings across England and Wales to strengthen UCL’s interpretation of study findings.

No new data will flow to UCL (University College London) for the purpose of this study. The data for the control cohort has already flowed to UCL under DARS NIC-393510, therefore UCL are not requesting any additional data on hospitalisations, A&E attendances, outpatient appointments, or civil registrations to flow between NHS Digital and UCL.

UCL requires data on the whole population (i.e. all women with a live birth episode in HES between 15 and 50 years old in England) to accurately quantify sociodemographic and health differences between mothers in England who are involved in s31 applications (i.e. who link to Cafcass) and mothers who are not. UCL also expect these differences to vary by region and by local authority as children’s social care practice and local population characteristics vary considerably across England, as well as by time (i.e. 2007 to 2019). Understanding differences by area and time is crucial to ensuring any study findings are generalisable and relevant to both national and local policy makers and service planners.

For some analyses UCL must use statistical matching methods to match mothers in HES who link (to Cafcass) to controls (mothers in HES who don’t link to Cafcass) based on a number of variables in HES (e.g. age at first HES birth episode, age and number of children at baseline, ethnicity, local authority of residence, comorbidities recorded in HES etc). The variables used in matching, as well as the timing for ‘baseline’ measurements, will vary by analysis, depending on the study outcome. In order to achieve balance across matching variables, UCL require a large pool of controls representative across all local authorities in England. In addition, for particularly rare study outcomes (e.g. maternal mortality and adversity-related admissions) UCL require a sufficient number of controls to robustly model outcomes.
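
A minimal sketch of one possible matching approach is given below: exact matching on a set of covariates, drawing controls without replacement. The record does not say which matching method UCL use, so the method, the 2:1 ratio, the field names and the example records are all assumptions for illustration.

```python
import random
from collections import defaultdict


def match_controls(cases, controls, keys, ratio=2, seed=0):
    """Exact-match each case to up to `ratio` controls sharing all `keys`.

    Controls are drawn without replacement, so a small control pool can
    leave some cases with fewer matches than requested.
    """
    rng = random.Random(seed)
    pool = defaultdict(list)
    for control in controls:
        pool[tuple(control[k] for k in keys)].append(control)
    for stratum in pool.values():
        rng.shuffle(stratum)  # random order within each stratum
    matched = {}
    for case in cases:
        stratum = pool[tuple(case[k] for k in keys)]
        n = min(ratio, len(stratum))
        matched[case["id"]] = [stratum.pop()["id"] for _ in range(n)]
    return matched


# Hypothetical records: "linked" mothers (cases) and unlinked mothers (controls).
cases = [
    {"id": "C1", "age_band": "20-24", "local_authority": "X"},
    {"id": "C2", "age_band": "30-34", "local_authority": "Y"},
]
controls = [
    {"id": "K1", "age_band": "20-24", "local_authority": "X"},
    {"id": "K2", "age_band": "20-24", "local_authority": "X"},
    {"id": "K3", "age_band": "20-24", "local_authority": "X"},
    {"id": "K4", "age_band": "25-29", "local_authority": "Y"},
]
matched = match_controls(cases, controls, keys=("age_band", "local_authority"))
print(matched)
```

Here the second case finds no control in its stratum, which illustrates why a large control pool covering every local authority is needed to achieve balance.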

The proposed control group would not result in any additional data flowing between NHS Digital and UCL, and a lack of a whole-population control group would impact feasibility of the whole study and UCL’s ability to meet the objectives of this study, to ensure that analyses are robust, and to inform policy and practice at the local as well as national level.

Assessing the utility of healthcare systems data for trials: data utility comparisons in the STAMPEDE trial (DUCkS)(previously: ODR1718_094) — DARS-NIC-656801-R9F6Z

Opt outs honoured: Yes, No (Excuses: Mixture of confidential data flow(s) with consent and flow(s) with support under section 251 NHS Act 2006)

Legal basis: Health and Social Care Act 2012 s261(2)(c); Health and Social Care Act 2012 s261(7); National Health Service Act 2006 - s251 - 'Control of patient information'

Purposes: Yes (Academic)

Sensitive: Non-Sensitive, and Sensitive

When:DSA runs 2022-10 – 2025-10 2022.11 — 2023.03. SMLS reported a DPA serious incident; breached contract — audit report.

Access method: One-Off

Data-controller type: UNIVERSITY COLLEGE LONDON (UCL)

Sublicensing allowed: No

AGD/predecessor discussions: IGARD Minutes - 13 October 2022 final.pdf

Datasets:

Emergency Care Data Set (ECDS)
Hospital Episode Statistics Accident and Emergency
Hospital Episode Statistics Admitted Patient Care
Hospital Episode Statistics Outpatients
NDRS Cancer Registry
NDRS Linked HES A&E
NDRS Linked HES APC
NDRS Linked HES Outpatient
NDRS National Radiotherapy Dataset (RTDS)
NDRS Systemic Anti-Cancer Therapy Dataset (SACT)
Hospital Episode Statistics Accident and Emergency (HES A and E)
Hospital Episode Statistics Admitted Patient Care (HES APC)
Hospital Episode Statistics Outpatients (HES OP)
NDRS Cancer Registrations
NDRS Linked HES AE

Type of data: Anonymised - ICO Code Compliant

Objectives:

University College London (UCL) are requesting NHS Digital record level data for the Study: "Assessing the utility of healthcare systems data for trials: data utility comparisons in the STAMPEDE* trial (DUCkS)"

*STAMPEDE (Systemic Therapy in Advancing or Metastatic Prostate Cancer: Evaluation of Drug Efficacy) is a clinical study. STAMPEDE aims to identify new treatments for prostate cancer.

Data for this study has previously been shared when the data were controlled and managed by Public Health England (PHE). PHE facilitated data release via its Office of Data Release service (ODR). ODR was responsible for providing a common governance framework for responding to requests to access PHE data for secondary purposes, including service improvement, surveillance and ethically approved research. All requests to access data were reviewed by the ODR and were subject to strict confidentiality provisions. The responsibility for the management of the National Disease Registration Service, of which the National Cancer Registration and Analysis Service is a part, transferred from PHE to NHS Digital on 1st October 2021. The STAMPEDE study previously accessed data via Public Health England under the reference: ODR1718_094.

The Medical Research Council Clinical Trials Unit at University College London (MRC CTU at UCL) is the Data Controller, who also processes data for this study. Hereafter, the Data Controller is referred to as University College London or UCL. UCL is the trial’s Sponsor and delegates sponsorship responsibilities to MRC CTU at UCL.

The STAMPEDE study has received funding from Cancer Research UK and the DUCkS study has received funding from Health Data Research (HDR) UK. HDR UK and Cancer Research UK do not determine the purpose(s) the data will be processed for, nor undertake any role which defines them as Data Controllers. HDR UK and Cancer Research UK will also have no access to NHS Digital data other than in the form of aggregated and suppressed tabulations (as per the HES analysis guide).

***THIS VERSION (v1): OCTOBER 2022***
This is a request to Renew and Amend a Data Sharing Agreement (DSA) with University College London.

In order for trials to use healthcare system data, UCL are requesting record level NHS Digital data in order to assess the utility of the data by comparison with trial collected data. The study team intend to do this for data from approximately 10,500 participants from within the STAMPEDE trial.

STUDY AIMS
University College London aim to assess the concordance agreement between traditional trial-specific data collection and healthcare systems data (routinely-collected healthcare data) in approximately 10,500 STAMPEDE participants. The analyses will involve assessment of five objectives:

1. Assessment of survival,
2. Chemotherapy treatments,
3. Radiotherapy treatment,
4. Second-line treatment,
5. Toxicities.

UCL are therefore requesting NHS Digital record level health data linked to specific participants of the STAMPEDE trial. The study team will curate the data to be analysed against the five objectives above, working through each in turn.
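
As a rough illustration of what a concordance assessment could involve for one of these objectives, the sketch below compares a per-participant yes/no indicator (for example, whether chemotherapy is recorded) between trial-collected data and routine healthcare data, reporting raw agreement and Cohen's kappa. The record does not state which agreement statistics the study team use, so the choice of kappa, the function name and the example values are assumptions.

```python
def agreement(trial, routine):
    """Return (observed agreement, Cohen's kappa) for paired labels."""
    if len(trial) != len(routine) or not trial:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(trial)
    observed = sum(a == b for a, b in zip(trial, routine)) / n
    labels = set(trial) | set(routine)
    # Agreement expected by chance, from each source's marginal rates.
    expected = sum((trial.count(l) / n) * (routine.count(l) / n) for l in labels)
    if expected == 1:
        return observed, 1.0  # both sources use a single identical label
    kappa = (observed - expected) / (1 - expected)
    return observed, kappa


# Hypothetical indicator per participant: 1 = treatment recorded, 0 = not.
trial_data = [1, 1, 1, 0, 0, 0, 1, 0]
routine_data = [1, 1, 0, 0, 0, 0, 1, 1]
observed, kappa = agreement(trial_data, routine_data)
print(observed, kappa)
```

Kappa discounts the agreement expected by chance, so it is usually read alongside the raw agreement rather than in place of it.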

NEW DATA REQUESTED
UCL are requesting access to the following NHS Hospital Episode Statistics (HES) data sets:
- HES Outpatients (OP)
- HES Admitted Patient Care (APC)
- HES Accident and Emergency (A&E)
- Emergency Care Dataset (ECDS)

The HES data provided by NHS Digital contains additional fields to those previously received from PHE and will therefore enrich the data previously held. To date, approximately 10,500 people have joined STAMPEDE within England and Wales, and recruitment continues (though the study Arms A, H-L are due to close their recruitment around September-October 2022). However, this agreement is for a one-off extraction of record-level data, and the study does not anticipate returning for health data on those recruited after the cohort has been provided to NHS Digital under this agreement.

UCL are also requesting further access to the following National Cancer Registration and Analysis Service (NCRAS) National Disease Registration Service (NDRS) datasets (formerly available via Public Health England (PHE)):
- NDRS Radiotherapy Data Set (RTDS)
- NDRS Systemic Anti-Cancer Therapy (SACT)
- NDRS Cancer Registry

These data sets, holding information on patients treated with radiotherapy or chemotherapy, allow the audit build of the full picture of the treatment provided to cancer patients, in-depth analysis of specific regimens and changes to prescribed treatments. It allows the exploration of whether the radiotherapy and chemotherapy data items, collected by the Audits from hospitals, are appropriate and necessary. Should particular data be available in existing, national data sets, these data items could be removed from the data collection, to ease the burden on data providers (hospital staff). RTDS data is linked to Cancer Registry data using the Patient ID pseudonym; it is not possible to request RTDS data in isolation without this linkage, therefore NDRS Cancer Registry has also been requested.

In order to establish a full picture of the data, the study team are requesting HES and NDRS data from 2005/06 to 2021/22 (where datasets will allow) for approximately 10,500 participants from the STAMPEDE trial.

PREVIOUSLY HELD DATA
To note: The study team currently hold data which was collected via PHE (under ODR1718_094), for the period 2005/06 to 2017/18 for a historic (previously submitted) cohort. This includes:
- NDRS RTDS
- NDRS SACT
- NDRS Linked HES A&E Data
- NDRS Linked HES Inpatient Data
- NDRS Linked HES Outpatient Data

PATIENT AND PUBLIC INVOLVEMENT AND ENGAGEMENT (PPIE)
PPIE was sought to discuss the use of routine data of trial participants recruited before 2013, where it was unknown if they also consented to data linkage (optional participation). Two focus groups of twelve and seven people met in July and September 2021 respectively. They were recruited by Prostate Cancer UK and Cancer Research UK. All who attended have been affected by cancer in some capacity, either a patient or a family member.

Overall, the groups expressed agreement that it was acceptable to access routine data of trial participants without consent provided there was transparency, i.e. information stating that this was being done in line with the reasons given on the trial website. As only a few trial participants (up to 12) did not agree to linkage of their routine data before 2013 and it was not known why, the groups felt it was acceptable to use the routine data as most of the trial cohort would have agreed, and it was not practical to go back to them to ask for retrospective consent. Some felt involvement in the clinical trial should automatically allow for the use of routine data, so participants would have to opt out or withdraw from the trial if they didn't agree. Consequently, the privacy notice was revised on the trial website to clearly explain the use of routine data without explicit consent to linkage.

Members discussed how the patient information sheet and consent form should clearly state that routine data will be accessed and that participation in the trial would permit access to the records held by NHS Digital and other data providers. Both documents have provided this information about data linkage since 2013 when all patient-facing documentation was updated.

COMMON LAW DUTY OF CONFIDENTIALITY
The study team at UCL will be providing one cohort for this request containing approximately 10,500 individual records. Within this cohort, approximately 1,600 individuals will be flagged as submitted under Section 251.

The separate legal bases for dissemination are as follows:
1. Informed Consent
Consent to data linkage has been sought for approximately 8,900 participants, spread across all arms of the STAMPEDE trial. NHS Digital will not apply National Data Opt-Out for these participants and are content that the consent materials are compatible with the flow of data described in this agreement.

2. Section 251 Support from the Confidentiality Advisory Group (CAG)
For STAMPEDE trial arms A-G, approximately 1,600 participants were recruited; however, consent for data linkage was not recorded. The dissemination of data for these participants is covered by the section 251 approval which has support from the Confidentiality Advisory Group (CAG). CAG application reference is 21CAG0048. This cohort of participants will have National Data Opt-Outs applied.

NOTE: to date five participants have directly opted out of data linkage by informing the study team. These five and any future participants who directly opt-out of data linkage with the study team will be removed from the cohort and record-level data on these participants will not be disclosed by NHS Digital.

LAWFUL BASIS FOR THE PROCESSING OF PERSONAL DATA (GDPR)
The Data Controller will process the data under GDPR Article 6 (1) (e) - Processing is necessary for the performance of a task carried out in the public interest or in the exercise of official authority vested in the controller. As a higher education establishment, the University conduct research to improve health care and service and the linkage requested is necessary for the performance of a task carried out in the public interest; i.e. improving the treatment offered to men with prostate cancer.

Additionally, under GDPR Article 9(2)(j) processing of Special Category Personal Data is necessary for archiving for research purposes. Data minimisation process is being followed and only data that is required specifically for the purposes of this study has been requested, to protect the rights of the data subjects.

-----------------------
Version 0 (JULY 2022) - This request was previously handled by Public Health England (PHE) under the reference ODR1718_094.

OVERVIEW
Project Aims:
1. To identify better ways of obtaining clinical trial outcome data to (A) Repeat reported STAMPEDE analyses, to validate the PhD project's algorithm capabilities and the trial results; (B) Perform new, secondary analyses not possible with conventionally-collected trial data, including but not limited to, rates of neutropenic sepsis when chemotherapy is given at different times in their disease or cardiac events or hormone therapy.
2. To develop a clinically useable tool (e.g. algorithm), for use in a clinical trial, to accurately identify disease-driven events and trial outcomes.
3. To help reduce the burden of collecting clinical trial data from traditional patient and clinician contact. By using data that has already been accurately collected by the NHS, it may be possible to improve timeliness, reduce costs and save resources that can be used elsewhere.

Objectives:
1. Develop, enhance, and validate a methodology to calculate trial outcomes from EHRs. Can the new tool accurately detect disease-related events and trial outcome events, and therefore successfully identify treatment effect differences? The tool must:
a. Either be able to use HES data alone or use non-HES-based data in addition
b. Be clinically useable for planned applications
2. Identify if the routine data can be used directly for clinical trial follow-up through comparison assessments of specific clinical events and treatments recorded in routine data and trial data (survival, chemotherapy, radiotherapy, second-line treatment, toxicities including safety events).
a. If yes, then utilise this data for routine trial follow-up.

COMMERCIAL BENEFIT AND TRANSPARENCY
For transparency it is noted here that the following pharmaceutical companies provided discounted or free drugs used in the STAMPEDE Trial and funding of Educational grants in return for accessing the final primary outcomes report and record-level trial-related data (NOT ODR/NHS Digital data), but had no influence over how the trial was conducted nor how outcomes were reported.
> Sanofi Aventis, Novartis, Pfizer, Janssen, Astellas and CLOVIS.

These pharmaceutical companies have no involvement with the DUCkS analysis in this agreement.

Yielded Benefits:

Data for this study has previously been shared when the data were controlled and managed by Public Health England (PHE). As such there are some yielded benefits to be observed from the access to the data for the study prior to NHS Digital becoming data controller. These yielded benefits are noted below. Cancer progression and recurrence is not routinely recorded within patient health records, and so it is difficult for clinical trialists to accurately estimate time to cancer progression. Time to cancer progression is often used as a primary trial outcome to determine if a trial intervention is effective or not. The development of an algorithm that uses routinely collected health data to better estimate fact and time of cancer progression should allow trialists to test if a cancer treatment is effective. The data previously obtained from PHE (obtained under ref: ODR1718_094) has enabled MRC CTU at UCL to develop and test an algorithm to estimate cancer recurrence. The results have been written up in a PhD thesis and are also expected to be published as a peer-reviewed research article in 2023.

Expected Benefits:

Benefits type: Knowledge about use of healthcare systems datasets for clinical trials.

The DUCkS trial is in place to assess the utility of healthcare systems data for trials.

The results of the DUCkS project aim to support trialists, funders and NHS Digital in the better use of HES, ECDS, and NCRAS data in clinical trials. Dissemination of the results from the DUCkS study aims to enable the trials community to understand if centrally-collated national datasets can replace trial-specific data collection of important outcomes such as chemotherapy and radiotherapy treatments and the occurrence of serious safety events (for example cardiovascular events and toxicities).

Should the DUCkS study prove that data collection through national datasets is effective, this has the potential to improve efficiency of trials, saving time and money. Although the exact savings for the NHS cannot be predicted, it is hoped this will be achieved through research nurses/practitioner/doctors spending less time on data collection for clinical trials. Consequently, trials could potentially be completed more quickly with less missing data and fewer participants lost to follow-up. This supports innovation and faster research and development of better treatments.

It is hoped that the project could form a basis for HDR UK research activities in the "Transforming Data for Trials" programme over the 5 years from 2023. One of the key aims is to transform the utility of healthcare systems data for trials.

The DUCkS project hopes to pave the way for others to conduct data comparison studies which could determine the utility of other datasets to support clinical trials. Additional studies should provide the trials community with further evidence of which healthcare datasets can be used for specific trial outcomes. This will allow more trials to be designed using these data as the main resource, minimising time and effort by healthcare workers to collect this information. Instead, they would have more time to provide care to patients.

Outputs:

The project aims to complete the analyses by April 2023. The study team plan to write up results (containing aggregate data with small numbers suppressed) for publication and dissemination. The study team intend to submit papers to journals by Autumn 2023 and it is hoped these will be ready for publication around Autumn 2024.

If subsequent funding is secured, the study team may wish to retain the data, via an extension to the agreement, in order to extend the data comparison studies and investigate the utility and completeness of data for other key clinical events in relation to the STAMPEDE trial.

The MRC CTU at UCL plan to communicate the methods used and the results of these data comparison studies via:
- Presentations at major international and national scientific conferences
- Publications in high-impact peer-reviewed journals
- Training workshops disseminating study results to trialists, funders and other key stakeholders.
- Results may also be made available via the HDR Innovation Gateway to enable researchers to decide data relevance for their studies.

The study team also aim to communicate the main STAMPEDE trial results via a booklet to trial participants and to share findings via the trial's website.

The study team plan to write a paper of the methodology used and a paper of the results. The aim is to disseminate the results via workshop around March 2023.

Processing:

All organisations party to this agreement must comply with the Data Sharing Framework Contract requirements, including those regarding the use (and purposes of that use) by Personnel (as defined within the Data Sharing Framework Contract). There will not be any access to the data by any third parties.

This application is to request the renewal of NCRAS data (previously provided under the agreement with the Office for Data Release ODR1718_094), and a request for additional HES data to cover a new cohort of approximately 10,500 individuals.

The data flow will be as follows:
1. University College London will send two cohort files (of approx. 10,500 patient records in total) each with the identifiers listed below to NHS Digital securely via Secure Electronic File Transfer (SEFT). One cohort file will contain individuals processed under Section 251 and one cohort file will contain individuals processed under consent.
2. The NHS Digital data production team will link patient identifiers to the NHS Digital datasets (HES OP, HES APC, HES A&E, ECDS, NCRAS Cancer Registrations, SACT and NCRAS RTDS)
3. NHS Digital will apply National Data Opt-Outs to the cohort of participants received under s251 (approx. 1,600).
4. The NHS Digital production team will then remove identifiable patient information from the linked data.
5. The NHS Digital production team will securely send the de-identified linked data, containing Study ID, to the data recipient at University College London via SEFT.
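
Steps 3 and 4 above can be sketched in code. This is an illustration only and does not describe NHS Digital's actual systems: the record layout, the field names, the `opted_out` set and the function name `prepare_release` are all assumptions. It shows the one asymmetry in the flow, namely that National Data Opt-Outs are applied to the Section 251 cohort but not to the consented cohort, and that only the Study ID survives as a key in the released data.

```python
from typing import Dict, Iterable, List, Set

# Hypothetical field names for the direct identifiers sent for linkage.
IDENTIFIERS = {"nhs_number", "first_name", "last_name", "date_of_birth", "postcode"}


def prepare_release(
    consented: Iterable[Dict[str, str]],
    section_251: Iterable[Dict[str, str]],
    opted_out: Set[str],
) -> List[Dict[str, str]]:
    """Apply opt-outs to the s251 cohort only, then strip direct identifiers."""
    # Consented participants are kept regardless of National Data Opt-Out status.
    kept = list(consented)
    # National Data Opt-Outs apply only to participants received under s251.
    kept += [p for p in section_251 if p["nhs_number"] not in opted_out]
    # Remove identifiable fields; Study ID and any linked data fields remain.
    return [{k: v for k, v in p.items() if k not in IDENTIFIERS} for p in kept]
```

The sketch covers the National Data Opt-Out only; the NDRS-specific opt-out, which applies to both cohorts, is not modelled.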

To facilitate the linkage of the STAMPEDE cohorts to HES and NCRAS data, the study team at University College London will securely transfer the following identifiers to the NHS Digital data production team:

FOR CONSENTED COHORT
Study ID (STAMPEDE trial participant identifier)
NHS Number
First Name
Last Name
Postcode

FOR SECTION 251 COHORT
Study ID (STAMPEDE trial participant identifier)
NHS Number
Date of Birth
Postcode
There is no linkage field for gender as the entire trial cohort is male.

These pseudonymised datasets with the Study ID will be sent to the MRC CTU at UCL using the specified transfer method above. The data will reside in UCL's Data Safe Haven and will be identified by Study ID only, thus there will be no identifying personal data attached to a study number. Only defined members of the DUCkS study team and MRC CTU's methodology team will have access to Data Safe Haven for data analysis - all are either substantive employees of UCL or Chief Investigator, Comparison Chief Investigators or clinical delegates involved in the blinded review of clinical information for this project only who are under a contract with UCL. All UCL substantive employees have completed training in data protection and confidentiality, and users of Data Safe Haven receive appropriate training before being granted access.

MRC CTU at UCL is not permitted to re-identify individuals under this agreement.

DATA OPT OUT
>National Data Opt-Out
NHS Digital record level data sought under s251 under this agreement is subject to National Data Opt-Out. If an individual has invoked their right to opt out from the use of their data for research or planning purposes (the National Data Opt-Out) their data will not be released under this Agreement. This will not apply for participants in the cohort who have provided informed consent.

> NDRS specific Data Opt-Out
NCRAS data is subject to both the NDRS Opt-Out as well as National Data Opt-Out. If an individual has indicated that they wish to be excluded from the national cancer registry, their data will be permanently removed from all NCRAS datasets within 20 days of receipt. This means that for this agreement, for both the cohort under s251 and the consented cohort, no NCRAS data will be released for any person who has registered an NDRS specific opt out which has been completed.

DATA STORAGE AND ANALYSIS
The data will be held on UCL's Data Safe Haven using UCL approved computers. The Data Safe Haven is UCL's technical solution for transferring and storing research information that is highly confidential. It meets the requirements of the NHS Digital DSP Toolkit and ISO 27001 Information Security standard. Access is controlled by the Information Asset Owner, and all UCL staff complete training in confidentiality and data protection, which is renewed annually.

Statistical data analysis will be carried out via UCL owned devices connected to the secure UCL network remotely, using an appropriate statistical package. Remote access to the server requires secure two-factor authentication (VPN), after which users are able to securely access the secure server on the University's IT framework. All data analysis will be conducted within the confines of the University's secure Data Safe Haven, and will not be downloaded to remote devices for storage or processing. The export of record-level data is not permitted from the Data Safe Haven under access restrictions.

HES and ECDS DISCLOSURE CONTROL / SMALL NUMBER SUPPRESSION
In order to protect patient confidentiality, when presenting results calculated from HES record level data, outputs will contain only aggregate level data with small numbers suppressed in line with HES Analysis Guide. When publishing HES data, data processors must make sure that:
- National-level figures only may be presented unrounded, without small number suppression.
- Cell values from 1 to 7 (inclusive) are suppressed at a sub-national level to prevent possible identification of individuals from small counts within the table.
- Zeros (0) do not need to be suppressed.
- All other counts will be rounded to the nearest 5.
Data will not be made available to any third parties other than those specified except in the form of aggregated outputs with small numbers suppressed in line with the HES Analysis Guide.
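
The suppression rules above can be sketched as a small helper. This is an illustrative sketch only and is not taken from the HES Analysis Guide itself; the function name `disclosure_safe_count` and the `national` flag are assumptions introduced here.

```python
from typing import Optional


def disclosure_safe_count(count: int, national: bool = False) -> Optional[int]:
    """Return a publishable count, or None when the cell must be suppressed.

    Hypothetical helper: the name and the `national` flag are not from the source.
    """
    if count < 0:
        raise ValueError("counts cannot be negative")
    if national:
        # National-level figures may be presented unrounded and unsuppressed.
        return count
    if count == 0:
        # Zeros do not need to be suppressed.
        return 0
    if 1 <= count <= 7:
        # Sub-national cell values from 1 to 7 (inclusive) are suppressed.
        return None
    # All other counts are rounded to the nearest 5 (integer arithmetic, no float ties).
    return 5 * ((count + 2) // 5)
```

For example, a sub-national count of 8 would be published as 10, a count of 12 as 10, and a count of 13 as 15, while a count of 5 would be suppressed.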

MR623 - NATIONAL MOTHER AND CHILD COHORT — DARS-NIC-148128-815J1

Opt outs honoured: Y, No (Excuses: Statutory exemption to flow confidential data without consent)

Legal basis: Health and Social Care Act 2012, Health and Social Care Act 2012 s261(7); Other-Regulation 3 of The Health Services Regulations 2002

Purposes: No (Academic)

Sensitive: Sensitive, and Non-Sensitive

When:DSA runs 2017-03 – 2020-11 2016.12 — 2022.11. SMLS reported a DPA serious incident; breached contract — audit report.

Access method: Ongoing, One-Off

Data-controller type: UNIVERSITY COLLEGE LONDON (UCL), NHS ENGLAND (QUARRY HOUSE)

Sublicensing allowed: No

AGD/predecessor discussions: IGARD Minutes - 27 October 2022 finalv2.pdf, IGARD Minutes - 6 October 2022 final.pdf, IGARD Minutes - 4 August 2022.pdf, IGARD Minutes - 9 December 2021 final.pdf, IGARD Minutes - 5th August 2021 final.pdf

Datasets:

MRIS - Cohort Event Notification Report
MRIS - Scottish NHS / Registration
MRIS - Cause of Death Report
MRIS - Flagging Current Status Report
MRIS - Members and Postings Report
Cancer Registration Data
Civil Registration - Deaths
Civil Registrations of Death

Type of data: Identifiable

Objectives:

The data supplied by the NHS IC to UCL Institute of Child Health will be used only for the approved Medical Research Project, National Mother and Child Cohort.

Yielded Benefits:

Flagging methods have been established and 100 death and cancer event notifications received for 8877 children reported in the NSHPC with a flagged status until 2017.

Expected Benefits:

It would not be appropriate or valid to share project results at this point in this long-term surveillance activity. The planned later dissemination of findings following access to data under a future version of this Agreement will be in the public interest for the following reasons:

1. The vast majority of children born to women living with HIV were not only exposed to HIV in utero but also to antiretroviral drugs, used to prevent vertical (mother-to-child) transmission and to treat maternal HIV disease. All people with HIV require lifelong treatment with antiretroviral drugs (ARVs) and it is recommended that treatment starts as soon as a new diagnosis is made; early initiation of ARVs with good adherence to medication should result in achievement of life spans similar to those seen in uninfected people. Specifically, for pregnant women, alongside the benefits for their own health, widespread and early use of ARVs has resulted in the risk of vertical transmission decreasing from around 18-25% without treatment to 0.2-0.3% in the current treatment era. The enormous benefits of ARVs within and outside pregnancy are therefore undisputable. However, ARVs have both known and unknown safety concerns when used in pregnancy, highlighting the importance of pharmacovigilance in this uniquely exposed population. Antiretroviral drugs have had reported mutagenic and carcinogenic effects, in addition to haematological and mitochondrial toxicities. The recent safety signal of increased risk of neural tube defects with periconception use of the integrase inhibitor Dolutegravir highlights the evidence gaps around use of newer ARVs in pregnancy. Understanding the effects of HIV and ART exposure in foetal and perinatal life will inform treatment guidelines and contribute to risk-benefit analyses of the use of different combinations of ARVs.

2. There is a growing body of research that suggests children who are HIV-exposed and uninfected (CHEU) have poorer morbidity and mortality outcomes than children HIV-unexposed and uninfected (CHUU). This project will provide an evidence-base for evaluating these outcomes for CHEU in the UK and enable key stakeholders such as PHE and the NHS to design and implement measures that could alleviate health inequities in this growing population.

3. Given that the in utero exposures to ARVs have already taken place in thousands of individuals born to mothers living with HIV, should there be a signal of concern from the data collected then this will require an appropriate, ethical and proportionate response in terms of communication of results requiring expert input from key stakeholders such as the MHRA and PHE.

This project can be considered a first phase in the long-term surveillance of CHEU in England and Wales.

Outputs:

No new outputs will be produced under this Data Sharing Agreement.

There will be no dissemination of results under this extension. The nature of this project is to facilitate long-term surveillance with respect to the incidence of cancers and of survival in CHEU through the establishment of the National Mother and Child Cohort. Given that these are rare events and with knowledge of the substantial latency period between exposures to potential carcinogens and cancer onset, it is not appropriate to conduct analyses at this time. However, there have been communications activities with respect to the project methods with key stakeholders, including Public Health England (who are now commissioning this work), the European Medicines Agency, the Medical and Healthcare Products Regulatory Agency, researchers and the wider HIV community.

UCL have achieved the aim of the initial project, which was to establish feasibility and methods for the use of national cancer and death registration data to monitor adverse health outcomes in CHEU in England and Wales through the establishment of the National Mother and Child Cohort, via linkage with the NSHPC database. Methods for matching appear to be sufficiently robust to continue this longitudinal surveillance activity. There have been 100 death and cancer events reported for the cohort to date. The first cancer event was reported in 1996 and cancer event reporting has demonstrated no concerns (e.g. around missing data).

Processing:

Under this Agreement, the data may be securely stored but not otherwise processed. No new data will be provided by NHS Digital under this Agreement.

The study data, including data provided by NHS Digital under previous agreements, are currently held by the University College London.
The following provides background on the processing activities undertaken prior to this Agreement:

Stage 1: Extraction
The dataset of children to be flagged was extracted from the NSHPC database (input dataset) including the following variables: unique study ID, child date of birth, multiple/singleton birth, sex, NHS number, hospital of birth, birthweight, mother's partial postcode (district of residence at delivery), mother's date of birth and mother's country of birth. Encrypted datasets were then emailed to the ONS Newport Mortality Team Leader. ONS Newport then sent the ONS input dataset (containing the variables detailed above in the input dataset) to ONS Titchfield, where the matching procedure on the births/deaths registration database (BDRD) was carried out.

Stage 2: Matching algorithm
The matching algorithm structure was a hierarchical set of six match types based on variables both in the NSHPC database and the BDRD: child NHS number, date of birth and sex, mother's date of birth and partial postcode.

Matching Algorithm
Match type 1) Child's date of birth, sex and NHS number
Match type 2) Child's date of birth, sex and mother's date of birth
Match type 3) Child's date of birth and mother's date of birth
Match type 4) Child's date of birth, sex and mother's partial postcode
Match type 5) Mother's date of birth, sex and mother's partial postcode
Match type 6) NHS number
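
A hierarchical match of this kind can be sketched as follows. This is a minimal illustration, not the ONS implementation: the field names, the `MATCH_TYPES` table, the exact-equality comparison and the rule that a missing value never counts as agreement are all assumptions. The function returns the first (lowest-numbered) match type that finds any candidates, together with the candidates found.

```python
from typing import Dict, List, Optional, Tuple

Record = Dict[str, Optional[str]]

# Match types 1-6 in priority order; field names are hypothetical.
MATCH_TYPES: List[Tuple[str, ...]] = [
    ("child_dob", "sex", "nhs_number"),
    ("child_dob", "sex", "mother_dob"),
    ("child_dob", "mother_dob"),
    ("child_dob", "sex", "mother_postcode"),
    ("mother_dob", "sex", "mother_postcode"),
    ("nhs_number",),
]


def agrees(a: Record, b: Record, fields: Tuple[str, ...]) -> bool:
    # Assumption: a missing value on either side never counts as agreement.
    return all(a.get(f) is not None and a.get(f) == b.get(f) for f in fields)


def hierarchical_match(
    child: Record, register: List[Record]
) -> Tuple[Optional[int], List[Record]]:
    """Return (match type, candidates) for the first match type with any hit."""
    for match_type, fields in enumerate(MATCH_TYPES, start=1):
        hits = [r for r in register if agrees(child, r, fields)]
        if hits:
            return match_type, hits
    # No match type produced a candidate.
    return None, []
```

The returned match type and `len(hits)` correspond to the match type and number of matches that the source says were included in the output dataset.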

Child birthweight, mother's country of birth and the variables used in the matching algorithm, as well as the match type and number of matches found for the child, were provided in the output dataset. The output dataset was emailed in a password-protected Excel file via ONS London to the researcher at NSHPC.

Stage 3: Confirmation of matches
To assess whether correct matches were made, the NSHPC team ran a programme to confirm all matches using probabilistic methods. This was necessary because NHS number was not available for all children reported to the NSHPC. A password-protected file including only those cases to be flagged (confirmed matches) was then emailed back to ONS Newport.

Stage 4: Flagging Subjects on NHS Central Register (NHSCR)
A dataset with children to be flagged was then sent by ONS Newport to the NHS Central Register for tracing. Their records were traced on the NHSCR and flagged with the National Mother and Child Cohort identifier. The decision was taken to use a generic flag name (i.e. National Mother and Child Cohort Study), with no mention of HIV.

Stage 5: Event Notifications
Once confirmation of flagging was received, a members and postings (M&P) listing was requested from the Medical Research Information Service (MRIS). The MRIS team sent event notifications annually to a named member of the NSHPC team. For death registration, date and cause of death were provided, and for cancer registration, year of diagnosis, site and type of cancer were provided with a pseudonymised identifier.

Stage 6: Updating NSHPC Database
A flagging table on the NSHPC database contained pseudonymised data on all children flagged in the National Mother and Child Cohort. Event notifications were also stored on a table within the database in a pseudonymised format.

The NSHPC database is currently stored on the AIMES secure ISO27001 environment. All NHS Digital data files from processing are stored in UCL's secure ISO27001 Data Safe Haven. Data is not linked to any third parties. Mortality data received to date was used to update the NSHPC (data processors and controllers) database with confirmation of deaths on the paediatric and maternity surveillance interfaces. Data on cancers were not stored on the NSHPC database as there were few events; this data was stored in UCL's secure ISO27001 Data Safe Haven.

Data processing is conducted solely by employees of University College London, specifically members of the ISOSS/NSHPC team and accessed in the secure ISOSS/NSHPC office at the UCL GOS Institute of Child Health. Mandatory annual information governance and GDPR training is required by all UCL staff. The ISOSS/NSHPC team members also complete Level 1 NHS Digital IG training and Data Security Awareness on an annual basis.

Family, household and environmental risk factors for hospital admissions in childhood — DARS-NIC-234656-C3J1D

Opt outs honoured: Yes - patient objections upheld, Yes, No (Excuses: Section 251 NHS Act 2006)

Legal basis: Health and Social Care Act 2012 – s261(7), Health and Social Care Act 2012 s261(7),

Purposes: No (Academic)

Sensitive: Sensitive, and Non-Sensitive

When:DSA runs 2019-04 – 2022-04 2020.01 — 2022.04. SMLS reported a DPA serious incident; breached contract — audit report.

Access method: One-Off

Data-controller type: CITY UNIVERSITY OF LONDON, UNIVERSITY COLLEGE LONDON (UCL), CITY, UNIVERSITY OF LONDON, UNIVERSITY COLLEGE LONDON (UCL), CITY ST GEORGE'S UNIVERSITY OF LONDON, UNIVERSITY COLLEGE LONDON (UCL)

Sublicensing allowed: No

AGD/predecessor discussions: igard-minutes-4th-april-2019---final.pdf, igardminutes-4thfebruary2021final_.pdf

Datasets:

Hospital Episode Statistics Admitted Patient Care
Demographics
HES-ID to MPS-ID HES Admitted Patient Care
Hospital Episode Statistics Admitted Patient Care (HES APC)

Type of data: Identifiable, Anonymised - ICO Code Compliant

Objectives:

In this project, researchers at University College London (UCL) will examine environmental and household risk factors for preventable hospital admissions in children, and whether children whose parents were born abroad face barriers to accessing preventive primary and community health services, which in turn leads to the need for hospital admission. These analyses therefore extend the analyses for the related project NIC-10094-P6P4B-v4.2.

The data disseminated under DARS-NIC-10094-P6P4B was for use in a specific project led by City, University of London entitled ‘Births and their outcome: analysing the daily, weekly and yearly cycles and their implications for the NHS’. The aims of this project were to compare daily, weekly and yearly variations in numbers of spontaneous and other births by time, day and season of birth and to compare variations in rates of adverse outcome. The project involved birth registration records already linked by ONS to birth notification records, previously known as the NHS Numbers for Babies (NN4B) records, being linked to HES data; this data was then added to a database. This database adds value to the HES data by adding further data items from birth registration and notification and more complete data for important data items, notably birthweight and gestational age where HES data was incomplete. It was created to enable ‘analyses relating to inequalities in the outcome of pregnancy and information for service users about the outcome of midwifery, obstetric and neonatal and related health care’.

UCL researchers will use an existing birth cohort of all babies born in England between 2005 and 2014 for this project. This birth cohort is available in the Office for National Statistics (ONS) Secure Research Service (SRS) as a result of linkage of:
-ONS birth registration data
-ONS death registration data
-ONS stillbirth registration data
-NHS birth notification data
-HES APC data for mothers and babies (NHS Digital data)

These datasets were linked by researchers at City, University of London for a previous project covered by Data Access Agreement NIC-10094-P6P4B-v4.2. The data controller for the existing birth cohort is City, University of London, who agreed that this data could be used for this amended purpose by UCL. This is why City, University of London is also the joint data controller for this project, together with UCL. The processing for this project will be carried out by ONS, who will link the existing birth cohorts to further datasets, and UCL, who will carry out the analyses on the linked, pseudonymised data. This is why UCL and ONS are the data processors. The identifiers (maternal and child dates of birth, NHS numbers and maternal postcodes at delivery) are kept in a separate database, accessible only by ONS staff members in the Health Analysis and Life Events Team.

UCL are asking NHS Digital for permission to use the existing linked database to extend the analyses carried out for project NIC-10094-P6P4B-v4.2, which focused on the role of birth hospital and obstetric practices on child health outcomes, including mortality and hospital admissions during infancy. UCL researchers want to build on these results to examine further risk factors for potentially preventable hospital admissions in young children. The research objectives are to:

1. examine the contribution of environmental factors (e.g. air pollution, type of housing, indoor air pollution (measured via type of heating) and occupancy rate) to the risk of respiratory hospital admissions in the first two years of life.
The research questions for this objective are:
1a) what is the association between air pollution exposure during pregnancy and risk of admission for respiratory tract infections in children?
1b) What is the association between adverse housing conditions (for example overcrowding, indoor air pollution exposure) and risk of admission for respiratory tract infections in children?

2. determine the impact of family factors (i.e. country of birth, year of migration and knowledge of English) on the risk of hospital admissions in children, focusing on hospital admissions which are considered preventable via timely access to preventive primary and community health services.
The specific research questions for this objective are:
2a) what is the association between parental country of birth and risk of hospital admission for conditions preventable through community or primary care in children?
2b) how do parents' knowledge of English/English proficiency (self-assessed from Census data) and length of residence in the UK affect the association between parental country of birth and risk of hospital admission in children for conditions preventable through community or primary care?

To carry out these two projects, UCL are proposing to link the existing birth cohort, via the ONS birth registration records, to:

a) 2011 Census data
b) Air pollution data (Freely available data on annual averages of 8 pollutants modelled at 1x1 km grids across the British Isles: https://uk-air.defra.gov.uk/data/pcm-data, available 2005-2014; distance to A roads for each postcode in England, available for 2005-2014; daily exposures for ozone, PM2.5 and Nitrogen dioxide modelled at 100m x 100m grid across London only for 2010-2012)
c) building characteristics data, available at postcode level from Energy Performance Certificates: https://epc.opendatacommunities.org/ - available 2005-2014

The Census data will be linked to the existing cohort via the mothers' date of birth, forename and surname and postcode at the time of delivery (as recorded on ONS birth registration records). Only births occurring up to 12 months before and until 12 months after the Census (that is, births between March 2010 and March 2012) will be linked to 2011 Census data. The air pollution and buildings data will be linked via the maternal postcode at delivery - also recorded on ONS birth registration records.
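The eligibility rule for Census linkage can be sketched as a simple date filter. This is an illustrative sketch only: the exact dates (27 March 2010 to 27 March 2012) are taken from the Processing section of this agreement, while the inclusive boundaries, the function name and the example records are assumptions.

```python
from datetime import date

# Window stated in the Processing section: births from 27 March 2010
# to 27 March 2012. Treating both ends as inclusive is an assumption.
WINDOW_START = date(2010, 3, 27)
WINDOW_END = date(2012, 3, 27)


def eligible_for_census_linkage(birth_date: date) -> bool:
    """Return True if the birth falls inside the Census linkage window."""
    return WINDOW_START <= birth_date <= WINDOW_END


# Hypothetical records: (study ID, baby's date of birth).
births = [
    ("A1", date(2009, 12, 31)),
    ("A2", date(2011, 3, 27)),
    ("A3", date(2012, 4, 1)),
]
census_subset = [sid for sid, dob in births if eligible_for_census_linkage(dob)]
```

Births outside the window stay in the cohort; they are simply not linked to Census 2011 data.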

Therefore, although the HES APC or NHS birth notification data in the cohort will not be used for linkage to Census, buildings or air pollution data, the resulting linked dataset will include birth and death registration data, NHS birth notification data, and HES APC data for mothers and babies, Census 2011 data, and postcode-level air pollution and building characteristics data.

The purpose of this application falls under Article 6 (1) (e) of the GDPR and the lawful basis for using information collected routinely for administrative purposes for research is the ‘public task’. This is part of the University’s commitment to ‘integrate research and innovation for the long-term benefit of humanity’. The application also falls under Article 9 (2) (j), as scientific research.

Amendment:
To allow NHS Digital to apply patient objections to the data, as approved by IGARD. ONS will send the encrypted HES IDs disseminated under DARS-NIC-110094-P6P4B to NHS Digital. NHS Digital would decrypt this data, run it through the patient objections process to remove those who have objected. NHS Digital would then encrypt the HES IDs before flowing the data back to ONS. The data processor as outlined in this agreement would only be permitted to access this ‘new’ data.

Expected Benefits:

The aim of this project is to identify household, parental and environmental risk factors for hospital admissions in children. UCL will also calculate population attributable risks for each of these factors, meaning that the relative contribution of, for example, overcrowded housing to emergency hospital admissions can be estimated. The results will allow the design and targeting of interventions, either through the NHS or other organisations (such as local authorities) to reduce the need for children to be admitted to hospital. This project therefore aims to improve the evidence base in order to save money for the NHS, reduce stress for families, and improve children’s access to health services and health outcomes both in the short and long term.
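As a rough illustration of what a population attributable risk expresses, the sketch below uses Levin's formula for the population attributable fraction. The agreement does not state which estimator UCL will use, so the choice of formula, the function name and the example numbers are all assumptions.

```python
def population_attributable_fraction(prevalence: float, relative_risk: float) -> float:
    """Levin's formula: PAF = p(RR - 1) / (1 + p(RR - 1)).

    prevalence    -- proportion of the population exposed (0 to 1)
    relative_risk -- risk in the exposed divided by risk in the unexposed
    """
    if not 0.0 <= prevalence <= 1.0:
        raise ValueError("prevalence must be between 0 and 1")
    if relative_risk <= 0.0:
        raise ValueError("relative_risk must be positive")
    excess = prevalence * (relative_risk - 1.0)
    return excess / (1.0 + excess)


# Hypothetical example: 10% of children live in overcrowded housing and
# their admission risk is twice that of other children.
paf_overcrowding = population_attributable_fraction(prevalence=0.10, relative_risk=2.0)
```

Under these made-up inputs, roughly one in eleven admissions would be attributable to overcrowding, provided the association is causal and unconfounded.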

The specific beneficiaries of the project will be:

1) Public health departments in local authorities. This project will provide information on the relative importance of risk factors including housing quality, overcrowding, air pollution and health access barriers to hospital admissions in children, providing a sound evidence base for prioritising spending in order to improve child health. UCL will also carry out subnational analyses to examine whether risk factors for hospital admissions vary according to region within England. The project will also highlight which Census variables are most strongly associated with children’s health status and need to be collected through other data sources when the Census is phased out.

2) The National Health Service (Clinical Commissioning Groups; CCGs): Part 2 of the project will examine whether all children have equal access to healthcare, independent of their parents’ country of birth. If inequalities are found, this will provide evidence for CCGs to improve access to particular preventive health care services (this could include extending interpreting services or setting up outreach clinics).

3) Parents and children using the NHS: This project will provide parents with information about the relative importance of housing quality, air pollution and health access barriers as risk factors for hospital admission in children. This evidence base can empower parents and civil society to push for improvements in local environments, housing quality, or healthcare access.

Outputs:

Expected outputs:

1) an enhanced administrative birth cohort including all children born in England between 2005-2014, linked to HES APC records for themselves and their mothers, air pollution data and building characteristics data, and Census 2011 data for mothers and their resident partners for a subset of children born between 2010 and 2012
2) estimates of the association between ambient air pollution exposure during pregnancy and the risk of hospital admission with respiratory tract infections during early childhood
3) estimates of the association between adverse housing characteristics (eg mould, indoor air pollution, overcrowding) and the risk of hospital admission with respiratory tract infections during early childhood
4) estimates of the association between maternal and paternal country of birth and admission to hospital for potentially preventable causes in children
5) estimates of the association between maternal and paternal knowledge of English/English proficiency (self-assessed from the Census data) and length of stay in the UK and admission to hospital for potentially preventable causes in children

Output 1), creation of the birth cohort, is expected by the end of June 2019. UCL will describe the success of linkage between Census 2011 data and the existing birth cohort, and conduct analyses to validate the linkage, including examining agreement among key variables such as maternal ethnic group and country of birth, that are available both in the existing birth cohort and on the Census records. The resulting Census-linked birth cohort database will be described in a journal publication, such as a Data Resource Profile in the International Journal of Epidemiology.

Output 2), an estimate of the strength of association between air pollution exposure during pregnancy and the risk of hospital admissions for respiratory tract infections: these results will be published in an academic journal. UCL will target high impact journals including BMJ, Lancet or Environmental Health Perspectives. UCL will also present the results at academic conferences, such as Society for Social Medicine or the Lancet Public Health Science Conference. UCL expect that these results would not be available until the end of this three-year data sharing agreement period, due to the complexity of the analyses.

Output 3). As for output 2).

Output 4), an estimate of the association between mother's and father's country of birth and the risk of potentially preventable hospital admissions: these results will be presented in a scientific journal; UCL will target the Lancet or the Lancet Child and Adolescent Health. UCL will also present the results at scientific conferences, such as the Migration, Ethnicity, Race and Health conference, and the NIHR School for Public Health Research Annual Scientific Meeting. UCL expect these results to be available by December 2020.

Output 5), estimates of the association between maternal and paternal knowledge of English/English proficiency (self-assessed from the Census data) and length of stay in the UK and admission to hospital for potentially preventable causes in children (as for output 4).

UCL researchers will communicate the results to policy makers and arms length bodies, including Public Health England, via meetings organised by the NIHR School for Public Health Research, and the NIHR Children's Policy Research Unit.

UCL Researchers will use several avenues to communicate results to parents. Updates from the project will be posted on a website hosted by UCL: https://www.ucl.ac.uk/child-health/research/population-policy-and-practice/child-health-informatics-group. UCL researchers will also work with the UCL press office to communicate results from the studies via press, television and radio outlets, and via social media. UCL researchers have worked with two charities who support this work: Shelter and The British Lung Foundation. The British Lung Foundation, through their partner organisation The Clean Air Parents' Network have agreed that updates about this study can be placed on their website and Facebook page.

UCL will also hold meetings with parents in collaboration with these two charities to update them on our results and discuss how they can best be communicated to the general public.

The new Policy Research Unit for Children and Families (CPRU), funded by the NIHR, will be setting up a new young people's advisory group who are specifically trained in research using administrative data sources. The principal investigator of this project is also a co-investigator on CPRU. Project 2 (on parents' country of birth and children's access to preventive health services) in particular is highly relevant to the work of the CPRU. UCL researchers will also be able to consult this group of children and young people about the work carried out, and how best to communicate results to parents. Further, UCL researchers will consult the Great Ormond Street Biomedical Research Centre Young Person's Advisory Group, and the Parents' and Carers' Advisory Group regarding how to communicate results to a broad public of parents and children.

Processing:

There are two proposed linkages with the existing birth cohort data:

1) Between the existing birth cohort and the Census 2011 data
2) Between the existing birth cohort and postcode-level air pollution and buildings data.

These are described in detail below. The existing birth cohort data is the result of linkage of Hospital Episode Statistics (HES) admitted patient care (APC) data (data previously disseminated by NHS Digital, and to which this agreement relates) and the following data disseminated from elsewhere (i.e. not NHS Digital):

-ONS birth registration data
-ONS death registration data
-ONS stillbirth registration data
-NHS birth notification data.

It should be noted that, whilst NHS Digital only disseminated the HES data, NHS Digital is now the data controller for all of the above data.

1 - Proposed linkage between existing birth cohort and Census 2011 data:

The following text explains the linkage between the ONS birth registration data in the existing birth cohort and the Census 2011 data, and the air pollution and building characteristics data. Please note that these linkages are not based on any variables in the HES APC or the NHS birth notifications datasets.

Births between 27th March 2010 and 27th March 2012 (that is, births up to one year before and one year after the 2011 Census) will be linked to data from 2011 Census questionnaires completed by the mothers in the cohort and their partners (or by the household reference person in the household in which the mothers live if this is not the mother or her partner). The proposed linkage can be summarised as follows:

1) Identifiers extracted from birth cohort identifier file & sent to ONS Data Processing Team
2) Identifiers extracted from identifiable birth registration file & sent to ONS Data Processing Team
3) ONS Data Processing team clean, hash and encrypt the identifiers from both files (which are kept separate during processing)
4) The files with encrypted identifiers are transferred to the ONS Data Integration team. The two files will be linked together, then to the Census 2011 data using hashed and encrypted identifiers.
5) Linked file will be transferred to ONS Security and Permissions Team for disclosure assessment
6) Data released to ONS Secure Research Service (SRS) where research team will access it for analysis.

These steps are now outlined in more detail:

1) Identifiers (Baby’s NHS number, Baby’s & mother’s date of birth, Sex of baby, Full postcode, Study ID) will be extracted by a member of the ONS Health Analysis and Life Events team from the file holding identifiers for mothers and babies in the English national birth cohort, and the file will be transferred to the ONS Data Processing Team.

2) Identifiers (Baby’s NHS number, Baby’s & mother’s date of birth, Sex of baby, Full postcode, Mother’s forename, Mother’s surname, Mother’s maiden name) will be extracted from a dataset of birth registrations held by the ONS Vital Statistics Output Branch in a separate area of the SRS. A member of the ONS Data Integration Team will extract the data and send the file to the ONS Data Processing Team. This step is necessary since linkage to Census requires mother’s first and last name, and this is not available from the birth cohort files.

3) The ONS Data processing team will clean, hash and encrypt the identifiers from the birth cohort identifier file and the birth registration file, using the encryption algorithm described in the document ‘ONS policy for safeguarding data whilst managing Admin Data Research Network projects'. This work will be carried out on the ONS Data Processing Team Secure Server. The files will be processed separately. The original (unencrypted identifiers) will be deleted after encryption.

The encryption algorithm is described in detail in the "ONS Policy for Safeguarding Data Whilst Managing ADRN Projects" (file:///C:/Users/rafa1/Downloads/adrcsafeguardingpaper_tcm77-404473.pdf). In brief, the fields forename, surname, initial, sex, date of birth and postcode are cleaned, then encrypted (hashed) in each dataset before the original identifiers are deleted. The encryption is done by creating 11 matchkeys for each individual, based on combinations of the variables forename, surname, initial, date of birth, sex and postcode, to allow for spelling discrepancies, forename/surname transpositions or moves between postcodes. The 11 matchkeys for each individual are then encrypted (hashed), and it is the encrypted matchkeys that are linked. Linkage is done hierarchically, so that if exact agreement is not reached between two records in the two datasets (using the hashkey for forename, surname, full date of birth, sex and postcode), the algorithm moves on to matching on less exact matchkeys. Once the datasets are linked, the hashed identifiers are also deleted.
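The hierarchical matchkey approach can be sketched as follows. This is a minimal illustration, not the ONS algorithm: it builds only three matchkeys rather than 11, the field combinations are hypothetical, and the use of a keyed SHA-256 hash (HMAC) with the placeholder key shown is an assumption, since the agreement does not specify the hashing method.

```python
import hashlib
import hmac

# Placeholder key for illustration only; real key management is not described here.
SECRET_KEY = b"example-key-not-real"
N_LEVELS = 3  # the policy describes 11 matchkeys; 3 are shown for brevity


def clean(value: str) -> str:
    """Upper-case and strip all whitespace so trivial differences do not block a match."""
    return "".join(value.upper().split())


def hashed_matchkeys(rec: dict) -> list:
    """Return hashed matchkeys ordered from most exact to least exact."""
    f, s = clean(rec["forename"]), clean(rec["surname"])
    dob, sex, pc = rec["dob"], rec["sex"], clean(rec["postcode"])
    combos = [
        (f, s, dob, sex, pc),      # exact agreement on every field
        (f[:1], s, dob, sex, pc),  # initial only: tolerates forename spelling
        (f, s, dob, sex),          # no postcode: tolerates a house move
    ]
    return [
        hmac.new(SECRET_KEY, "|".join(c).encode(), hashlib.sha256).hexdigest()
        for c in combos
    ]


def link(keys_a: dict, keys_b: dict) -> dict:
    """Link on hashes only, trying exact keys first and falling back level by level."""
    links, used_b = {}, set()
    for level in range(N_LEVELS):
        index_b = {}
        for b_id, keys in keys_b.items():
            if b_id not in used_b:
                index_b.setdefault(keys[level], []).append(b_id)
        for a_id, keys in keys_a.items():
            if a_id in links:
                continue
            candidates = [b for b in index_b.get(keys[level], []) if b not in used_b]
            if len(candidates) == 1:  # accept unambiguous matches only
                links[a_id] = (candidates[0], level)
                used_b.add(candidates[0])
    return links


# Hypothetical records; the clear-text fields would be deleted after hashing.
cohort = [
    {"id": "c1", "forename": "Ann", "surname": "Smith", "dob": "1985-02-01", "sex": "F", "postcode": "AB1 2CD"},
    {"id": "c2", "forename": "Mary", "surname": "Jones", "dob": "1990-07-15", "sex": "F", "postcode": "ZZ9 9ZZ"},
]
census = [
    {"id": "x1", "forename": "ann", "surname": "smith", "dob": "1985-02-01", "sex": "F", "postcode": "ab12cd"},
    {"id": "x2", "forename": "Mary", "surname": "Jones", "dob": "1990-07-15", "sex": "F", "postcode": "XY1 1XY"},
]
cohort_keys = {r["id"]: hashed_matchkeys(r) for r in cohort}
census_keys = {r["id"]: hashed_matchkeys(r) for r in census}
links = link(cohort_keys, census_keys)
```

In this example the first pair agrees on the most exact key, while the second pair links only once postcode is dropped. Recording the level at which each link was made gives a simple indicator of match quality.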

4) The two files with encrypted identifiers will be transferred to the ONS Data Integration Team. The data will be linked on the ONS Data Integration Server. The birth cohort file will first be linked to the birth registration file (to add encrypted identifiers based on mother’s name), then to the 2011 Census using the encrypted identifiers including mother’s name. The Data Integration Team hold a copy of the 2011 Census on their secure server, which includes encrypted identifiers and attribute data only (no actual identifiers). The Census household matrix will be used to extract data about the mother’s resident partner at the time of the Census.

5) The linked file will be transferred to the Security and Permissions Team, who check the data, and assess the risk of disclosure, including granularity of variables and the output level.

6) Data will only be released to the research team once the risk of disclosure has been minimised. The Census attribute data will be linked back to the birth cohort data via a record ID generated as a random number by the data processing team in step 3. The data will be held in the ONS SRS where the research team will access and analyse the data.

2 - Proposed linkage between birth cohort and postcode-level air pollution and buildings data:

The air pollution/buildings datasets only contain postcode-level data which cannot be linked specifically to one individual. The linkage will be carried out in the following steps:

1) A member of the Health Analysis and Life Events Team at ONS will extract mother’s full postcode at delivery and Study ID from birth cohort identifier file.
2) Postcode level datasets will be sent by UCL to the Health Analysis and Life Events Team at ONS.
3) Health Analysis and Life Events Team at ONS will link the postcode level data to the birth cohort identifier file using the full postcode at delivery.
4) Mother’s postcode at delivery will be removed from the linked file.
5) The air pollution and buildings data, together with the unique mother study IDs, will then be transferred to the area in the SRS where the clinical birth cohort data are held, so that the postcode level air pollution data and building characteristics data can be linked to the clinical data via the mother's study ID.

The researchers working on the birth cohort will therefore not be able to see the clinical data, the full postcode and the air pollution/buildings data simultaneously.
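The separation described in the steps above can be sketched with plain dictionaries. All variable names, field names and values are hypothetical; the point is only that the postcode is used as a join key in one environment and never travels with the clinical data.

```python
# Hypothetical identifier file held by ONS: study ID -> full postcode at delivery.
identifier_file = {"M001": "AB1 2CD", "M002": "EF3 4GH"}

# Hypothetical postcode-level exposure data supplied by UCL.
postcode_data = {
    "AB1 2CD": {"no2_annual": 31.5, "epc_band": "D"},
    "EF3 4GH": {"no2_annual": 18.2, "epc_band": "B"},
}

# Hypothetical pseudonymised clinical data held in a separate area of the SRS.
clinical = {"M001": {"admissions": 2}, "M002": {"admissions": 0}}


def link_exposures(identifiers: dict, exposures_by_postcode: dict) -> dict:
    """Steps 1 to 4: join on postcode, then keep only study ID and exposures."""
    linked = {}
    for study_id, postcode in identifiers.items():
        exposures = exposures_by_postcode.get(postcode)
        if exposures is not None:
            linked[study_id] = dict(exposures)  # the postcode is deliberately not copied
    return linked


def attach_to_clinical(clinical_rows: dict, exposures_by_id: dict) -> dict:
    """Step 5: join exposures to the clinical data on study ID only."""
    return {
        sid: {**row, **exposures_by_id.get(sid, {})}
        for sid, row in clinical_rows.items()
    }


exposures_by_id = link_exposures(identifier_file, postcode_data)
analysis_data = attach_to_clinical(clinical, exposures_by_id)
```

The analysis dataset produced this way carries exposure values and clinical fields but no postcode.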

These linkage steps will be carried out separately, ie the birth cohort will be linked to air pollution and building characteristics data first, then to Census data. The intended outcome is a database which integrates the clinical and birth registration information in the existing cohort with individual-level Census 2011 data from mothers and resident partners, and the postcode level data on air pollution and building standards.

All the datasets will be pseudonymised for analysis, that is, all mother and baby identifiers (date of birth, date of death, NHS numbers and full postcodes) will be kept separately to the clinical and Census data at all times. The identifiers will not be available to the researchers working on analysing the birth cohort data (the separation principle). Linkage between Census and the birth cohort data will be carried out using privacy preserving methods based on hashing and encrypting the identifiers before they are moved to the ONS Data Integration server for linkage. The clinical data, Census responses and identifiers cannot be simultaneously viewed by the ONS staff linking the Census and birth cohort data. Similarly, postcode level data will be linked to the birth cohort and clinical data using a method that ensures that full postcodes and clinical data cannot be viewed concurrently. All data analyses will take place in the Secure Research Service, an ONS data safe haven. All outputs from the data analyses (tables and figures) will be checked for potential individual disclosure risk by trained staff at the ONS.

Research staff working on analysing the birth cohort with linked Census and postcode-level data will require training in data security and information governance, and accreditation as Approved Researchers by the ONS. They will also require a substantive contract with UCL. There will also be one UCL-registered PhD student, who is supervised by the lead UCL researcher, working on the data. All researchers working on the data will therefore have a duty of confidentiality enshrined in their UCL contract and are required to follow the UCL data protection policy. NHS Digital draws to the attention of the University College London that NHS Digital regards the University College London as being responsible for the actions and omissions of the PhD student.

Once linked, UCL researchers will check, clean and validate the linked Census and postcode-level air pollution and building characteristics data to ensure linkage is reliable. UCL researchers will derive variables indicating pollution and housing quality, and variables indicating parents' country of birth, length of stay in the UK and knowledge of English/English proficiency (self-assessed from Census data) from the linked data. Other important risk factors for hospital admissions in children have already been derived in the available cohort database, that is in the linked ONS birth and registration and HES APC data, including birth weight, gestational age, mode of delivery, and maternal age.

UCL researchers will use the linked birth cohort, that is, linked ONS birth/death registration-HES APC-Census-air pollution-buildings data to calculate admission rates according to each of these risk factors as events per child-years at risk. They will use a statistical method that will measure the association between air pollution, housing and admissions for respiratory tract infections (project 1) and parental migration history and potentially preventable admissions (project 2) in children, whilst taking into account other important risk factors for hospital admissions including preterm birth, and birth weight.
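The rate calculation mentioned above (events per child-years at risk) can be sketched as below. The function names, the scaling to 1,000 child-years, the 365.25-day year and the example figures are assumptions for illustration; the agreement does not specify the statistical model used for the adjusted associations.

```python
from datetime import date
from typing import Optional


def child_years_at_risk(birth: date, end_of_follow_up: date, death: Optional[date] = None) -> float:
    """Time at risk from birth until death or end of follow-up, in years."""
    end = end_of_follow_up if death is None else min(death, end_of_follow_up)
    return max((end - birth).days, 0) / 365.25


def admission_rate(events: int, child_years: float, per: float = 1000.0) -> float:
    """Admissions per `per` child-years at risk."""
    if child_years <= 0:
        raise ValueError("child_years must be positive")
    return events * per / child_years


# Hypothetical totals for two exposure groups.
rate_overcrowded = admission_rate(events=30, child_years=400.0)
rate_not_overcrowded = admission_rate(events=45, child_years=1500.0)
rate_ratio = rate_overcrowded / rate_not_overcrowded
```

A crude rate ratio like this takes no account of preterm birth, birth weight or other risk factors, which is why the text refers to a statistical method that adjusts for them.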

MR1415 - Application for ONS mortality data to be used for flagging and analysis of the RADICALS trial, which is a large phase III randomised controlled trial for people with prostate cancer (ISRCTN 40814031). — DARS-NIC-37191-P5S9S

Opt outs honoured: No - consent provided by participants of research study, No - data flow is not identifiable, No (Excuses: Reasonable Expectation, Consent (Reasonable Expectation))

Legal basis: Informed Patient consent to permit the receipt, processing and release of data by the HSCIC, Health and Social Care Act 2012 – s261(7), Health and Social Care Act 2012 – s261(2)(c), Health and Social Care Act 2012 s261(2)(c)

Purposes: No (Academic)

Sensitive: Sensitive

When:DSA runs 2019-05 – 2022-05 2017.12 — 2022.03. SMLS reported a DPA serious incident; breached contract — audit report.

Access method: Ongoing, One-Off

Data-controller type: UNIVERSITY COLLEGE LONDON (UCL)

Sublicensing allowed: No

Datasets:

MRIS - Flagging Current Status Report
MRIS - Cause of Death Report
MRIS - Cohort Event Notification Report
Demographics
Civil Registration - Deaths
Cancer Registration Data
Civil Registrations of Death

Type of data: Identifiable, Anonymised - ICO Code Compliant

Objectives:

The RADICALS clinical trial protocol contains two clinical trials for men who have chosen surgery as their primary treatment for prostate cancer. These trials are RADICALS-RT, testing whether a certain subgroup of men should or should not have radiotherapy after their surgery; and RADICALS-HD testing whether men should have hormone therapy (and, if so, how long) with post-operative radiotherapy.

The first consent for the trial was taken in November 2007. A total of 2,863 patients were recruited in England & Wales.

The primary outcome measure for RADICALS-RT is freedom-from-distant-metastases. The primary outcome measure for RADICALS-HD is survival without death from prostate cancer. These patients, overall, will do well and most men will survive a long time without any failure. The Medical Research Council Clinical Trials Unit (MRC CTU) at University College London (UCL) is concerned that sites will not be able to follow patients as intensely as required over this full period. Flagging data from NHS Digital will allow MRC CTU to ensure that most deaths are captured and included and will also help with cause of death review. MRC CTU is aware that flagging data has been very valuable in previous RCTs. Furthermore no man should die from prostate cancer without prior progression so a reported death will allow MRC CTU to check that events are not being missed on the Case Report Forms that clinicians are directly completing for these consenting patients. MRC CTU will also, for the purposes of survival analysis, be able to assume that patients are alive at a set point in time if not reported dead, thereby increasing reliability of data. Finally, based on previous discussions with ONS statisticians MRC CTU will make assumptions about the survival of patients not reported as dead.
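The assumption that a patient not reported dead is alive at a set point in time corresponds to administrative censoring in survival analysis. The sketch below is illustrative only; the function name, the cut-off date and the example patients are hypothetical and are not taken from the trial's Statistical Analysis Plan.

```python
from datetime import date
from typing import Optional, Tuple


def survival_record(randomised: date, death: Optional[date], data_cutoff: date) -> Tuple[int, int]:
    """Return (follow-up days, event), where event is 1 for death and 0 for censored.

    A patient with no death reported by the cut-off is assumed alive at the
    cut-off and censored there.
    """
    if death is not None and death <= data_cutoff:
        return (death - randomised).days, 1
    return (data_cutoff - randomised).days, 0


cutoff = date(2011, 1, 1)  # hypothetical date up to which flagging data is complete
alive_patient = survival_record(date(2010, 1, 1), None, cutoff)
dead_patient = survival_record(date(2010, 1, 1), date(2010, 1, 31), cutoff)
```

Without flagging data, follow-up would have to stop at the last site visit, which shortens follow-up for patients who are in fact still alive.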

For information, there is intention to request HES data to be linked to this request at a later date which would allow MRC CTU to further understand the treatment patterns for the cohort. This will be subject to a new application to NHS Digital.

Yielded Benefits:

Comparison of ONS data with deaths of patients reported by study sites has shown that in most cases the study sites have reported deaths in a timely and accurate manner. A small number of discrepancies in cause of death have been identified and corrected through liaison with staff at sites, in each case so far these have been patients who died outside hospital and hospital staff had been uncertain about cause of death. The reassurance of well-reported data coming from study sites has helped the trial management group in planning the future of the trial.

Expected Benefits:

The RADICALS trial will define standard-of-care for men with localized prostate cancer who have chosen surgery. The results are keenly awaited by the uro-oncology community. In no way will the results of the analysis or the dissemination of findings be influenced by MRC CTU's funding organisations.

The trial is looking to see whether particular treatment approaches improve long-term disease-based outcomes, notably those based on survival or distant spread of the disease. The trial will standardise the use (or otherwise) of immediate post-operative radiotherapy and the use (and duration, or otherwise) of hormone therapy with any post-operative radiotherapy. This patient group will generally do very well and it is key that MRC CTU do not have missing event data. However, long-term follow-up is very difficult in trials; MRC CTU anticipate that some centres will not be able to follow patients adequately. Therefore, connection to flagging data will ensure that MRC CTU do not miss deaths from the analyses. Knowing the status of patients will help MRC CTU to better tailor requests to centres for current information. Furthermore, attributing cause of death is notoriously difficult for men with prostate cancer. Having death registry information will help with MRC CTU's review of causes. Also, if patients are reported to have died from prostate cancer without a prior report of metastases, MRC CTU can ensure sites provide this missing information (no man should die from prostate cancer without it first spreading).

Regardless of the findings for any of the three main comparisons, MRC CTU will be asked about treatment(s) at relapse(s). MRC CTU have deliberately collected little information on this from sites, preferring to seek this information from central sources.

RADICALS is expected to define standard of care in two aspects of prostate cancer treatment where differing practice has arisen in the absence of definitive randomised evidence. International coordination with other large trials is already in place and a meta analysis has been planned in which RADICALS will play a central part.

Outputs:

Intermediate trial reports will be produced for review by the Independent Data Monitoring Committee (IDMC) which is an independent group of experts who monitor patient safety and treatment efficacy data. The IDMC usually meets annually. Reports to the committee are confidential and the IDMC are the only people to see data by randomised group while the trial is in progress. They may recommend changes to the trial, for action by the trial steering committee. For clarification, no data supplied by NHS Digital would be shared with the IDMC.

Peer-reviewed publications in high-impact medical journals, either a cancer-specific journal (like JCO or Lancet Oncology) or a general medical journal (like Lancet, JAMA, NEJM). MRC CTU will look to general journals first but will review the results and whether they might or might not appeal to a general audience. It is hoped that the two comparisons for RADICALS-HD will be ready to be reported in Autumn 2017 and 2018 for short-term hormone therapy versus long-term hormone therapy, and no hormone therapy versus short-term hormone therapy respectively. RADICALS-RT should be ready for reporting in Autumn 2020.

MRC CTU will communicate the RADICALS results using at least:
• Presentation at major international and national scientific conferences
• Publication in high-impact peer-reviewed journals
• A written summary of results distributed to participants
• News articles on MRC CTU website
• Tweets on the @MRCCTU Twitter account

MRC CTU will also communicate the results to the wider patient population via articles in the Tackle Prostate newsletter, Prostate Matters. MRC CTU will also inform Prostate Cancer UK of the results, building on the relationship MRC CTU have with them for other trials in MRC CTU's prostate cancer portfolio. If appropriate, MRC CTU will work with the MRC and UCL press offices to develop press release(s) about the results. Depending on what the results show, MRC CTU may also look at other methods of communication. For previous prostate cancer trials MRC CTU have used films, briefing papers and events to communicate the results to health-workers and patients.

All outputs will be aggregated with small numbers suppressed, in line with the HES Analysis Guide.

Processing:

The data will be used for the long-term assessment of men in both the RADICALS-RT and RADICALS-HD comparisons of the RADICALS trial protocol. The main outcome measures involve survival and cause of death, which are predefined in both the protocol and the Statistical Analysis Plan. This data will be joined with the data that are requested on the study Case Report Forms and will be linked to those records already held on consenting trial participants. The number of participants and the rationale for their inclusion will always be included in all presented results.

1. The trial team will supply a list of cohort identifiers to NHS Digital containing Trial Number, NHS Number, Date of Birth, full names and postcode.

2. NHS Digital will use the supplied information to extract death information and send back reports using the specified transfer method. The reports will contain the Trial Number, Date of Death if applicable and no other identifiers.

3. Using the secure PID database, the Head of DMS will link these records back to the study-specific trial number and prepare an output file for trial statisticians containing no identifiers other than the full date of death.

4. This output file is placed in a secure directory with access limited only to people listed in the Data Sharing Agreement, all of whom are substantive employees of UCL.

5. Trial statisticians undertake data cleaning/validation activities.

6. The data extract received from NHS Digital then serves three aspects:
-i. Deaths already reported to MRC CTU by site – MRC CTU would use NHS Digital data to verify and corroborate date and cause;
-ii. Deaths not reported to MRC CTU by site – MRC CTU then follow up with site to corroborate date and cause, to add to study database;
-iii. Deaths not reported by site that sites are not aware of (died elsewhere) – in these cases, NHS Digital data becomes the date and cause for entry into MRC CTU’s study database.
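
The three uses of the extract in step 6 can be sketched as a simple classification. This is an illustrative sketch only: the function name, its arguments and the returned labels are hypothetical and are not taken from MRC CTU systems.

```python
from typing import Optional


def classify_death(reported_by_site: bool,
                   site_can_corroborate: Optional[bool] = None) -> str:
    """Return which of the three uses (i, ii, iii) applies to one death record.

    reported_by_site: the site had already reported this death to the trials unit.
    site_can_corroborate: outcome of follow-up with the site; None means the
    follow-up has not happened yet (hypothetical flag for illustration).
    """
    if reported_by_site:
        # (i) Already reported: the extract verifies date and cause.
        return "i: verify against site report"
    if site_can_corroborate is None:
        # (ii) Not reported: follow up with the site first.
        return "ii: follow up with site"
    if site_can_corroborate:
        # (ii) Site corroborates date and cause, which are added to the database.
        return "ii: site-corroborated record added"
    # (iii) Site is unaware (for example, died elsewhere): the extract is the source.
    return "iii: extract supplies date and cause"
```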

All processing of ONS data is in accordance with standard ONS terms and conditions.

Only substantive employees of UCL will access record level data or aggregated data containing small numbers. No data provided by NHS Digital will be shared with any third parties, including members of the IDMC, except in the form of aggregated data with small numbers suppressed in line with the HES Analysis Guide.

Evaluating the Family Nurse Partnership in England — DARS-NIC-136916-B7D5C

Opt outs honoured: Yes - patient objections upheld, Yes (Excuses: Section 251 NHS Act 2006, Does not include the flow of confidential data)

Legal basis: Health and Social Care Act 2012 – s261(1) and s261(2)(b)(ii), Health and Social Care Act 2012 s261(1) and s261(2)(b)(ii), Health and Social Care Act 2012 - s261 - 'Other dissemination of information', Health and Social Care Act 2012 s261(2)(b)(ii), Health and Social Care Act 2012 s261(2)(a)

Purposes: No (Academic)

Sensitive: Non Sensitive, and Non-Sensitive

When:DSA runs 2019-10 – 2022-09 2020.08 — 2021.09. SMLS reported a DPA serious incident; breached contract — audit report.

Access method: One-Off

Data-controller type: UNIVERSITY COLLEGE LONDON (UCL)

Sublicensing allowed: No

AGD/predecessor discussions: AGD minutes - 23 May 2024 final.pdf, IGARD Minutes - 26 May 2022 final.pdf, igard-minutes-17th-october-2019---final.pdf, igard-minutes-25th-july-2019-final.pdf, igardminutes-28thjanuary2021final.pdf, IGARD Minutes - 14 July 2022 final.pdf

Datasets:

MRIS - Bespoke
Civil Registration (Deaths) - Secondary Care Cut
HES:Civil Registration (Deaths) bridge
Hospital Episode Statistics Accident and Emergency
Hospital Episode Statistics Admitted Patient Care
Hospital Episode Statistics Outpatients
MRIS - List Cleaning Report
Civil Registrations of Death - Secondary Care Cut
Hospital Episode Statistics Accident and Emergency (HES A and E)
Hospital Episode Statistics Admitted Patient Care (HES APC)
Hospital Episode Statistics Outpatients (HES OP)

Type of data: Anonymised - ICO Code Compliant

Objectives:

University College London (UCL) are proposing to link an existing cohort of mothers and babies (held by the research team under a separate DSA with UCL; NIC-393510-D6H1D) with programme information from the Family Nurse Partnership (FNP) Information System. NHS Digital hold the FNP data and the data linkage will be completed by NHS Digital.

The data under NIC-393510-D6H1D is disseminated for a programme of research within the healthcare provision theme of the Policy Research Unit for Children, Young People and Families (CPRU), within University College London (UCL) which is funded by the Department of Health (DoH).

The HES data under NIC-393510-D6H1D is limited to those under the age of 56. This data will be further minimised for the purpose of this agreement to those mothers aged 13-25 giving birth between 2010 and 2019 in England. The Civil Registration and the bridging file data released under NIC-393510-D6H1D is limited to deaths registered in England between 1st January 1998 until as late as possible, for all persons who died aged 0-55. The data will be further minimised for the purpose of this agreement to those mothers aged 13-25 giving birth between 2010 and 2019 in England.

This request (and the purpose of this application) is for a longitudinal research study funded by NIHR.

The Family Nurse Partnership (FNP) is an intensive early home visiting programme for first time teenage mothers, delivered by trained nurses aiming to improve maternal and child outcomes by providing support throughout pregnancy and until the child’s second birthday. This study aims to evaluate the real-world implementation of FNP in England. To do this, the research team at UCL will use electronic health records (HES Outpatient, Accident and Emergency and Inpatient data, along with Civil Registration Data) that are routinely collected to compare outcomes for FNP participants with similar families who did not take part in FNP.

There will be two cohorts:

~ mothers aged between 13-24 and their babies born between 01 April 2010 and 31 March 2019. The cohort size is approximately 975,000 but this will be further reduced through detailed matching of mothers with FNP participants in terms of age, parity, and other demographic and health characteristics (e.g. deprivation, chronic conditions).

and

~ those within the FNP. Cohort size approximately 25,000.

The total sample size will be approximately 1,000,000 mother-baby pairs; around 25,000 of these will be FNP participants. In practice, a smaller control group will be created for the comparison (see Processing Activities – Propensity Score Matching). Creating this matched control group will involve detailed analysis of the characteristics of mothers prior to pregnancy, in order to achieve closely-matched groups that are required to account for differences between mothers who participated in FNP and those who did not. Characteristics used to create the matched comparator group will include maternal age, ethnicity, local authority, deprivation, number of A&E visits, admissions for mental health conditions, and admissions for injuries and adversity-related diagnoses (in the 5 years prior to pregnancy). Within some local authorities, there may be a small number of women that can be matched to a FNP mother. The propensity matching process will be iterative and thus the whole population of mothers is initially required in order to generate the most comparable groups that will allow for a robust analysis.

The purpose of this request is to answer a set of research questions, aiming to generate evidence on the real-world implementation of the FNP in England.

The purpose of this application falls under Article 6 (1) (e) of the GDPR and the lawful basis for using information collected routinely for administrative purposes for research is the ‘public task’. This is part of the University’s commitment to ‘integrate research and innovation for the long-term benefit of humanity’. The application also falls under Article 9 (2) (j), as scientific research.

The data will be used to address the following objectives:

- Describe variation in delivery of FNP and usual care across Local Authorities (LAs)
- Describe variation in health characteristics of families participating in FNP over time and by LA; compare characteristics of FNP participants with families who met eligibility criteria but did not enrol
- Explore individual and LA-level predictors of engagement (number of valid visits)
- Evaluate the effectiveness of FNP on a broad range of health outcomes for both children and mothers
- Determine which families stand to benefit from FNP using detailed information on maternal trajectories prior to pregnancy (e.g. chronic conditions)
- Evaluate outcomes for groups who have recently become eligible for FNP (e.g. mothers up to age 24)
- Explore the effect of contextual factors such as usual care models, nurse characteristics and programme content covered
- Determine how the effect of FNP differs between LAs.

The co-investigators:
~ London School of Hygiene and Tropical Medicine
~ Tavistock and Portman NHS Foundation Trust
~ University of Cambridge
~ University of Kent
~ University of Oxford

and the collaborators:
~ FNP National Unit
~ University of Cardiff

named in the protocol are not listed as joint data controllers. Although they were involved in the initial set-up of the study, they do not have any control or influence over the overall purpose of how the data will be used. UCL are the sole party who have control of the purpose and processing of the data for this study.

Expected Benefits:

The research will benefit the provision of health care and the promotion of health, by informing policy on the effectiveness of the FNP for vulnerable families in England, and providing evidence on the likely benefits to maternal and child health. This will have direct relevance to the 4% of babies each year (22,465 babies in 2016) that are born to mothers aged <20 years. The research is in the public interest, because teenage mothers face a number of challenges during pregnancy. Lower levels of education, less stable careers and lower income put teenage mothers at a disadvantage compared with older mothers. Combined with a greater risk of inadequate prenatal care and unhealthy behaviours during pregnancy, these factors can lead to greater healthcare needs for their children. Approximately 80% of infants born to first time teenage mothers attend an emergency department or are admitted to hospital at least once before their second birthday. Two-thirds of these mothers go on to have a subsequent pregnancy within two years. Despite these higher rates of hospital admissions and healthcare needs throughout childhood, teenage mothers are also less likely to seek out preventive care in the community, meaning that pregnancy and the postnatal period provide important opportunities for intervention in this vulnerable group. There is growing recognition of the need for evidence on the best ways to support young and vulnerable mothers.

This research will complement existing evidence from the Building Blocks trial of FNP in approximately 1600 families recruited in England between 2009-2010, by evaluating outcomes for the 25,000 families enrolled in FNP since 2010. The study will directly benefit the Health and Social Care sector by providing NHS managers, commissioners and policy makers with detailed and up-to-date evidence to aid decision making about the ongoing roll-out and targeting of early interventions designed to support young mothers.

The research will help inform decision makers on those most likely to benefit from increased early support, and on the potential gains from reducing maltreatment, abuse and neglect, and emergency use of hospital services. Findings will also be used by the FNP NU to inform ongoing adaptations of the FNP in England. Local Authorities require evidence on the implementation and effectiveness of FNP in their local area to monitor the service and support commissioning decisions. Linkage of administrative data will provide a resource to support LAs for these purposes. Identifying the characteristics of families participating in FNP, and those eligible but not participating, will be particularly useful for LAs wishing to target the most disadvantaged families.

Targeted support for the most disadvantaged children and their families is recognised as a priority for research, and programmes such as the FNP are likely to remain a priority for services as understanding how best to provide early support to young mothers and their families could help improve maternal and childhood outcomes. Research is therefore likely to have significant impact, given the ongoing roll-out of the FNP internationally.

The study team will work with commissioners on the study steering group, and continue to work with the FNP National Unit, to ensure that outputs are used to support policy makers and commissioners in their efforts to improve the quality of care for young mothers and their families in England. This study will therefore directly benefit the Health and Social Care sector by providing healthcare professionals, commissioners and policy makers with detailed evidence to inform policy and aid decision making in relation to young vulnerable mothers and their families.

Evidence generated by this study will support commissioners in providing improved services for mothers and children who could benefit most, and lead to increased efficiency through more effective targeting of resources. This information will inform targeting of appropriate services for families who are most in need, and most likely to gain from, additional support during pregnancy and early childhood. The results of this study will be used to inform professionals about the best ways to offer the FNP to those who could benefit from the service.

Outputs:

The main output will consist of a report on the effectiveness of FNP for different groups of families. The researcher will prepare briefings of these results for policy makers and disseminate them using the FNP National Unit’s existing networks. Findings will be used by the FNP NU to inform ongoing research into the adaptation of the FNP in England (ADAPT sites) and by Local Authorities wishing to target the most disadvantaged families. Findings will be published as peer-reviewed publications in high quality journals (e.g. Lancet Public Health, BMJ, JAMA Paediatrics, submitting within 3 years of data access).

The researchers will also work with parent representatives to co-produce a range of outputs suitable for communicating results to families participating in FNP, e.g. fact sheets about the impact of FNP from a parent perspective. The researchers have already had input on their study from a number of teenage mothers, and these mothers will continue to be involved in dissemination of results, e.g. by co-producing outputs and ensuring that public-facing materials are age-appropriate. The FNP National Unit are already very experienced in producing material that is appropriate for the ages of their participants, and they will have input to the outputs from this study. Two mothers sit on the study steering committee and will advise on appropriate routes to disseminate outputs, e.g. social media and blogs on the FNP study website.

Secondary outputs will include methodological research on the accuracy and reliability of linkage of data from health, education and social care sectors. These subsidiary analyses will be published to inform data providers and other researchers on the use of these data for future and ongoing studies. Targeted journals will include the International Journal of Epidemiology and PLoS One, submitting within 3 years of data access.

Outputs from the study will help policy-makers decide whether FNP should be offered to families in their local setting. Outputs will also provide commissioners with information on variation in health outcomes and healthcare use according to different maternal characteristics and differing engagement with FNP.

All journal articles will be published with open access, to ensure the wide dissemination of the study’s results to healthcare professionals, NHS managers, commissioners and policy makers. Results of the study will also be made available in both clinical and methodological research forums: abstracts will be submitted to the following conferences within 2 years of data access: International Population Data Linkage Network, Public Health Science, Society for Social Medicine.

Outputs will contain only aggregate level data with small numbers suppressed (in line with the HES Analysis Guide).

Only aggregated data with small numbers suppressed (in line with the HES Analysis Guide) will be used by the organisations mentioned in the protocol.
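
Small-number suppression of an aggregate table can be sketched as follows. This is a minimal illustration, not the rule itself: the `threshold` default and the `*` marker are assumptions chosen for the example, and the authoritative cut-off and any rounding requirements are those set out in the HES Analysis Guide.

```python
def suppress_small_numbers(counts, threshold=7, marker="*"):
    """Replace counts from 1 up to `threshold` with a marker.

    `threshold` is a placeholder (assumption); set it from the applicable
    guidance. Zeros and counts above the threshold are left unchanged.
    """
    return {
        key: (marker if 0 < n <= threshold else n)
        for key, n in counts.items()
    }


# Example: an aggregate table of admissions by (hypothetical) group label.
table = {"group_a": 0, "group_b": 3, "group_c": 12}
safe_table = suppress_small_numbers(table)
```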

Processing:

The data flows are summarised as follows:

A. FNP cohort. As some identifiers might have changed since enrolment (e.g. mother’s name, postcode), identifiers will first be updated using the Personal Demographics Service (PDS) within NHS Digital, so that the most relevant set of identifiers can be used for linkage.
1. Identifiers for FNP mothers and their first child are currently held by NHS Digital via Open Exeter. These identifiers (NHS number, GP code, name, sex, date of birth, postcode) will be transferred to the NHS Digital HES Production team and linked to records held in PDS as some identifiers might have changed since enrolment.

2. FNP cohort with the FNP study ID but no identifiers will be transferred separately to the secure data safe haven at UCL.

B. HES data
1. A HES cohort of mothers and babies will be prepared by the researchers at UCL based on an existing dataset (NIC-393510-D6H1D) held by the research team at UCL. Encrypted HESIDs for these records will be transferred to NHS Digital.
2. A limited version of the HES cohort, containing encrypted HESID and a number of analysis variables, will be extracted from an existing HES cohort held at UCL (DSA NIC-393510-D6H1D) to a new server within the secure setting.
3. NHS Digital will extract identifiers for the list of encrypted HESIDs (sex, date of birth, NHS number) and updated identifiers from PDS (name, postcode, GP code) and use these for linkage with the FNP cohort (using the key from the abstract for NIC-393510-D6H1D).
4. A pseudonymised link-key will be transferred from NHS Digital to the UCL data safe haven.

C. Secure setting
1. The link-key will be used to merge the de-identified FNP programme data (cohort) with the HES analysis variables within the secure setting of the data safe haven. Identifiers will not be held in the secure setting. The data will remain pseudonymised as the data is encrypted.
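
The merge in step C.1 can be sketched as a join on the pseudonymised link-key. This is a hedged illustration: the field names (`fnp_study_id`, `encrypted_hesid`) and the in-memory dictionaries are assumptions for the example, not the actual file layout used in the data safe haven.

```python
def merge_on_link_key(link_key, fnp_rows, hes_rows):
    """Join de-identified FNP programme data to HES analysis variables.

    link_key: list of (fnp_study_id, encrypted_hesid) pairs.
    fnp_rows: dict mapping fnp_study_id -> dict of programme variables.
    hes_rows: dict mapping encrypted_hesid -> dict of analysis variables.
    Only pairs present in both sources are kept; no identifiers are involved.
    """
    merged = []
    for study_id, hesid in link_key:
        if study_id in fnp_rows and hesid in hes_rows:
            merged.append({
                "fnp_study_id": study_id,
                "encrypted_hesid": hesid,
                **fnp_rows[study_id],
                **hes_rows[hesid],
            })
    return merged


# Example with made-up pseudonymous keys and variables.
link_key = [("F001", "H9a"), ("F002", "H7c")]
fnp_rows = {"F001": {"valid_visits": 30}, "F002": {"valid_visits": 12}}
hes_rows = {"H9a": {"ae_visits": 2}}
linked = merge_on_link_key(link_key, fnp_rows, hes_rows)
```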

There will be no attempts to identify individuals. Risk of re-identification will be mitigated by checking all outputs for small cell sizes. No potentially disclosive outputs will be shared or published. Data processing will only be carried out by substantive employees of UCL who have been appropriately trained in data protection and confidentiality.

The researcher will compare outcomes for mothers ever enrolled in FNP versus those who were never enrolled. Two analysis strategies will be used to take account of measured confounders related to both participation in FNP and outcomes: i) propensity score matching; ii) adjusted analyses.

Propensity Score Matching (statistical matching technique used to estimate the effect of a treatment)

To derive propensity scores, the researcher will regress FNP participation on all available maternal characteristics, e.g. pre-pregnancy chronic conditions. Matched groups will be formed based on the propensity of participation. Effects will be estimated as the difference in outcomes between matched groups. Statistical models will allow for clustering of families within LAs, and multiple imputation will be used to account for missing data.

The main analysis will restrict matching within the same LA and within the time periods in which FNP was offered within that LA. Secondary analyses aiming to achieve more closely matched groups (with potentially smaller numbers) will match i) within the same LA but in different time periods, comparing outcomes for eligible families before vs. after FNP was offered; and ii) within the same time period but in different LAs, comparing outcomes for eligible families in LAs that did and did not offer FNP.
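
The matching approach described above can be sketched in pure Python. This is a simplified illustration under stated assumptions: the source does not specify the matching algorithm, so greedy 1:1 nearest-neighbour matching with a caliper is used here as one common variant; all function names and the `caliper` value are hypothetical; covariates are assumed numeric and on comparable scales; clustering within LAs and multiple imputation are not shown.

```python
import math


def fit_propensity(X, treated, lr=0.1, epochs=2000):
    """Logistic regression of participation on covariates (gradient descent)."""
    n, p = len(X), len(X[0])
    w = [0.0] * (p + 1)  # w[0] is the intercept
    for _ in range(epochs):
        grad = [0.0] * (p + 1)
        for xi, ti in zip(X, treated):
            err = propensity(w, xi) - ti
            grad[0] += err
            for j, xj in enumerate(xi):
                grad[j + 1] += err * xj
        w = [wj - lr * gj / n for wj, gj in zip(w, grad)]
    return w


def propensity(w, xi):
    """Predicted probability of participation for one covariate vector."""
    z = w[0] + sum(wj * xj for wj, xj in zip(w[1:], xi))
    return 1.0 / (1.0 + math.exp(-z))


def match_within_la(scores, treated, la, caliper=0.1):
    """Greedy 1:1 matching without replacement, restricted to the same LA."""
    used, pairs = set(), []
    for i, ti in enumerate(treated):
        if ti != 1:
            continue
        candidates = [
            (abs(scores[i] - scores[j]), j)
            for j, tj in enumerate(treated)
            if tj == 0 and la[j] == la[i] and j not in used
        ]
        if candidates:
            dist, j = min(candidates)
            if dist <= caliper:  # discard poor matches
                used.add(j)
                pairs.append((i, j))
    return pairs


def matched_difference(pairs, outcome):
    """Mean outcome difference (participant minus matched control)."""
    if not pairs:
        raise ValueError("no matched pairs")
    return sum(outcome[i] - outcome[j] for i, j in pairs) / len(pairs)
```

In use, `fit_propensity` would be run on the maternal characteristics, `propensity` applied to every mother to obtain scores, and `match_within_la` called with those scores before comparing outcomes. The secondary analyses would change only the restriction inside `match_within_la` (for example, the same LA in a different time period).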

Adjusted analyses

This analysis will be an unmatched comparison, adjusting for maternal variables (e.g. pregnancy complications, ethnicity, Index of Multiple Deprivation) and neonatal variables (e.g. gestational age, birthweight, length of postnatal hospital stay, season of birth, congenital anomalies, admission to the Neonatal Intensive Care Unit (NICU)).

Sensitivity analyses will determine the strength of unmeasured confounding required to invalidate results. To further assess the robustness of findings to the analysis approach and to evaluate any potential differences in results due to the use of real-world data, the researcher will use the cohort to replicate findings observed in the Building Blocks Trial (a randomised control trial to evaluate the Family Nurse Partnership in England which was conducted by Cardiff University and ended in March 2016). For each analysis strategy, the researcher will derive trial outcomes for a group of families in the administrative data cohort with the same aggregate baseline characteristics as trial participants.
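
One widely used way to express the strength of unmeasured confounding needed to explain away a result is the E-value for a risk ratio. The source does not name the sensitivity method, so the sketch below is an assumption introduced for illustration, not the study's stated approach; the function name is hypothetical.

```python
import math


def e_value(rr):
    """E-value for an observed risk ratio `rr`.

    For rr >= 1 the E-value is rr + sqrt(rr * (rr - 1)); protective
    estimates (rr < 1) are inverted first.
    """
    if rr <= 0:
        raise ValueError("risk ratio must be positive")
    if rr < 1:
        rr = 1.0 / rr
    return rr + math.sqrt(rr * (rr - 1.0))
```

A larger E-value means that a stronger unmeasured confounder would be required to fully account for the observed association.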

MR1b - Health and Development Study - S251 Cohort members — DARS-NIC-86954-Y0R2N

Opt outs honoured: Y, N, Yes - patient objections upheld, Yes (Excuses: Section 251, Section 251 NHS Act 2006)

Purposes: No (Academic)

Sensitive: Sensitive, and Non Sensitive, and Non-Sensitive

When:DSA runs 2019-01 – 2022-01 2018.03 — 2021.09. SMLS reported a DPA serious incident; breached contract — audit report.

Access method: Ongoing, One-Off

Data-controller type: UNIVERSITY COLLEGE LONDON (UCL)

Sublicensing allowed: No

AGD/predecessor discussions: IGARD Minutes - 5 May 2022 final.pdf, igard_minutes_8_february_2018.pdf, igardminutes-17thdecember2020final.pdf, IGARD_Minutes_03.08.17.pdf, igard_minutes_1_march_2018.pdf, igardminutes-14thjanuary2021final.pdf

Datasets:

Hospital Episode Statistics Accident and Emergency
Hospital Episode Statistics Outpatients
Hospital Episode Statistics Admitted Patient Care
MRIS - Flagging Current Status Report
MRIS - List Cleaning Report
MRIS - Cause of Death Report
MRIS - Cohort Event Notification Report
MRIS - Members and Postings Report
Civil Registration - Deaths
Demographics
Cancer Registration Data
Hospital Episode Statistics Accident and Emergency (HES A and E)
Hospital Episode Statistics Admitted Patient Care (HES APC)
Hospital Episode Statistics Outpatients (HES OP)
Civil Registrations of Death

Type of data: Identifiable

Objectives:

The Medical Research Council National Survey of Health and Development (NSHD) is the oldest and longest running of the British birth cohort studies. From an initial maternity survey of 13,687 (82%) of all births recorded in England, Scotland and Wales during one week of March, 1946, a socially stratified sample of 5,362 singleton babies born to married parents was selected for follow-up. The NSHD study team is housed within the MRC Unit for Lifelong Health and Ageing (LHA) at University College London (UCL).

Linkages for Scotland and Wales will be performed separately to the NHS Digital linkage. The linkages to central NHS held data will only involve the transfer of data for patients recruited in those nations, so, for example, there will be no data transferred to NHS Digital for patients recruited in a Welsh institution.

The NSHD study team has collected unique lifetime data on body size and maturation, cognitive and physical function, socioeconomic status and diet; and has repeat adult data on diet, smoking, physical activity, blood pressure and lung function. The most intensive data collection in 2006-2010, when study members were aged 60-64 years, included measurement of cardiac structure and function, body composition and bone density.

The 24th and most recent data collection to the whole sample included a postal questionnaire in 2014 and a home visit by a trained research nurse for interview and assessment in 2015/2016. At the 24th follow-up, the target sample was 2816 study members still living in mainland Britain; this is the maximum sample used in the analyses. Of the remaining 2546 (47%) study members: 957 (18%) had already died, 620 (12%) had previously withdrawn permanently, 574 (11%) lived abroad, and 395 (7%) had remained untraceable for more than 5 years.
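
The sample figures above can be checked against the original selected sample of 5,362 with a few lines of arithmetic. The variable names are illustrative only; the numbers are those quoted in the text.

```python
selected_sample = 5362   # babies selected for follow-up in 1946
target_sample = 2816     # target sample at the 24th follow-up

# Study members not in the target sample, by reason.
not_in_target = {
    "died": 957,
    "withdrawn": 620,
    "abroad": 574,
    "untraceable": 395,
}

remaining = selected_sample - target_sample

# Percentages of the original selected sample, rounded to whole numbers.
percentages = {
    reason: round(100 * n / selected_sample)
    for reason, n in not_in_target.items()
}
remaining_pct = round(100 * remaining / selected_sample)
```

The four categories sum to the 2546 study members outside the target sample, and the rounded percentages agree with those reported.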

Where study members have become lost to follow up, the MRIS data being provided under this application will enable NSHD to seek to re-contact those study members and invite them to continue participating in the study, i.e. to re-consent these participants.

The NSHD was the first study (in 1971) to have participants flagged on the NHS Central Register for mortality (ICD codes are used to code cause of death) and cancer registrations. The LHA receives notifications on an ongoing quarterly frequency.

The LHA wishes to link NSHD study members to HES data in order to improve the quality of information on hospital admissions and health outcomes for research purposes. Currently, the study obtains self-reported hospital admission data at each follow-up which are then confirmed through contact with each hospital.

The Unit has a 5-year Medical Research Council core funded programme of research based on the NSHD with the objective to investigate risk and protective factors from across the life course that influence the ageing process. This core funding has been in place since 1962 and is renewed every five years after scientific review.

The data from HES will be used to improve the identification of acute events such as those caused by cardiovascular disease (CVD). For example, the unit will assess how life course risk factor trajectories of body size, resting heart rate, blood pressure, socio-economic position (SEP) and health related behaviours, accumulate and interact to influence incidence of CVD, thus potentially identifying possibilities for earlier prevention. As the cohort is entering older age, hospital care becomes increasingly frequent and study members are thus less likely to report hospital admissions over a number of years accurately. It is therefore important to capture this information in other ways. New research within LHA on health service use is being developed which will utilise these data and investigate life course predictors of health care utilisation.

The data collected on the NSHD cohort, including that provided by NHS Digital, is used across five integrated research programmes with the overarching aim of identifying social and biological factors that affect lifelong health, ageing and the development of chronic disease risk.

The five programmes are:
1) Enhancing NSHD
2) Functional Trajectories and Cardiovascular Ageing
3) Physical Capability and Musculoskeletal Ageing
4) Mental Ageing
5) Wellbeing in older age

Yielded Benefits:

UCL have not yet analysed the data, but examples of previous work are listed above.

Expected Benefits:

The NSHD has informed UK health care, education and social policy for 70 years and is the oldest and longest running of the British birth cohort studies. Today, with study members in their early seventies, the NSHD offers a unique opportunity to explore the long-term biological and social processes of ageing and how ageing is affected by factors acting across the whole of life.

Evidence is growing from this cohort study and others, that factors from early life (such as growth, neurodevelopment, nutrition and family socioeconomic circumstances) as well as later life (such as adult smoking, diet, exercise and socioeconomic circumstances) affect the opportunity to age well. This is of interest to policymakers, practitioners, and older people themselves.

The research using NSHD life course information will provide insights into when in the life course to intervene to prevent disease (in particular CVD). This information will inform the design of future interventions which can then be tested in controlled trials.

In particular, through knowledge transfer, public engagement, publications, presentations and invited commentaries (http://www.nshd.mrc.ac.uk/findings/) the MRC LHA has contributed to a body of evidence to influence policies and support evidence based medicine. For example, a recent paper in PLOS Medicine comparing lifetime trajectories of overweight and obesity across NSHD and the later born cohorts has been cited in the Government’s recent Child Obesity Strategy. Other examples highlighting the depth and breadth of this lifelong study include:

• NSHD is a member of the Dementias Platform UK, a £53 million collaboration between universities and industry established by the MRC in 2014, to transform the best dementia research into the best treatments as quickly as possible. It combines the power of multiple population studies to compare healthy people with people at all stages of dementia.

• The NSHD finding, in 2014, that more rapid rises in systolic blood pressure during midlife (even if not crossing into hypertension) were related to poorer cardiac structure (published in the European Heart Journal in 2014) has implications for treatment guidelines as it suggests that identification and treatment of people with rapidly increasing SBP, even if they are not reaching the criteria for hypertension, may be beneficial in preventing subsequent cardiovascular disease.

• The NSHD findings (published in The Lancet Diabetes & Endocrinology in 2014) suggesting that those who lost weight at any age during adulthood, even if weight was regained later, had better cardiovascular risk profiles than those who remained overweight or obese supports public health strategies that help individuals to lose weight at all ages.

• In 2014, the NSHD finding that better performance in tests of physical capability (i.e. grip strength, chair rising and standing balance) in midlife was linked to higher survival rates over 13 years of follow-up was published in the British Medical Journal. This highlighted the value of these simple objective physical tests in helping to identify those people who from at least as early as midlife onwards may require more support than others to achieve a long and healthy life.

• Subsequent work examining changes in objective measures of physical capability between ages 53 and 60-64 has highlighted that age-related decline may not be entirely inevitable and is potentially modifiable. This work has also suggested that there may be a need to monitor physical capability from at least as early as midlife onwards as opportunities to help some high risk groups may already have been missed if no action is taken until later in life.

• A 2009 report on adult life chances in relation to childhood mental health using NSHD was cited by the government in support of a case for early intervention to build mental capacity and resilience.

• The study’s findings of the continuing effect of early life growth and development on health outcomes in adulthood add to the arguments for early intervention of the kind provided by the national SureStart programme.

• The 1999 paper comparing children’s diet in 1950 with that in the 1990s (‘Food and nutrient intake of a national sample of four-year-old children in 1950: comparison with the 1990s’, Public Health Nutrition) had an impact because of its evidence that the quality and nutrient value of infant and childhood diet had declined between 1950 and 1990.

• The study’s finding (published in All our Future in 1968) of the extent and inequity of the ‘waste of talent’ – in terms of high ability children who did not continue into further or higher education – added to arguments for improving opportunities for, and expectations of, children from poorer families.

• The Home and the School (1964) had a great impact, probably because it provided the first hard evidence that parents and preschool circumstances had a significant impact on ability and attainment at age eight, and so showed that preschool development and experience formed the bedrock on which primary schooling was built.

• Press reports that followed the publication of Maternity in Great Britain (1948), which were concerned with the ‘Need for Better Care and Lower Costs’ (The Times), are likely to have influenced the arguments for improvements in the care of mothers and babies.

All outputs will contain only data that is aggregated with small numbers suppressed in line with the HES Analysis Guide.

Outputs:

The data will be used on an ongoing basis to update study member records. The database will be updated after each data release.

The primary output of the linkages with HES, ONS mortality and Cancer Registration data is the maintenance and enhancement of the NSHD-DR. This is in turn used to achieve multiple research outputs that benefit health and social care.

The programme ‘Enhancing NSHD’ examines many of the genomic and other epigenomic (genetic material of a cell) and metabolomics (systematic study of the unique chemical fingerprints that specific cellular processes leave behind) factors that influence the risk of many age-related diseases and quantitative traits, often in collaboration with external researchers.

The programme ‘Functional Trajectories and Cardiovascular Ageing’ examines which factors from across the life course promote good adult cardiovascular function and prevent disease onset, and which increase vulnerability to accelerated cardiovascular ageing.

The programme ‘Physical Capability and Musculoskeletal Ageing’ examines which factors from across the life course promote good adult physical capability and musculoskeletal health, and which increase vulnerability to accelerated decline in capability.

The programme ‘Mental Ageing’ examines which factors from across the life course promote cognitive capability and protect against depression and which factors increase vulnerability to cognitive decline.

The programme ‘Wellbeing in older age’ examines what social contexts and experiences in childhood and early adulthood promote wellbeing in later life and whether wellbeing protects against functional ageing.

Each of these programmes generates multiple publications in peer-reviewed journals annually, and findings are further disseminated via conference presentations. A full list of publications produced to date, plus details of the current priorities for each programme, is published on the MRC LHA website at: http://www.nshd.mrc.ac.uk/.

Publications and presentations only use data that is aggregated with small numbers suppressed in line with the HES Analysis Guide.
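As an illustration of small-number suppression applied to an aggregate table before release, a minimal sketch follows. The threshold of 5, the `*` marker and the choice to leave true zeros visible are placeholder assumptions; the actual rule must be taken from the HES Analysis Guide, which is not reproduced here.

```python
def suppress_small_numbers(table, threshold=5, marker="*"):
    """Replace counts from 1 to threshold-1 with a marker.

    `table` maps a category label to an aggregate count. The threshold and
    the treatment of zeros are assumptions made for illustration only.
    """
    return {
        label: (marker if 0 < count < threshold else count)
        for label, count in table.items()
    }
```

For example, a table with counts 3, 12 and 0 would be released with the 3 replaced by the marker and the other two cells unchanged.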

This MRC Unit is committed to research on ageing – outputs arising from ONS data will be anonymised in the form of tables, graphs, peer reviewed journals, presentations and books.

These data have been used in a number of publications. A full list of publications can be found at http://www.nshd.mrc.ac.uk/findings/
Examples of NSHD publications using mortality data are below:
1. Davis D, Cooper R, Terrera GM, Hardy R, Richards M, Kuh D. Verbal memory and search speed in early midlife are associated with mortality over 25 years' follow-up, independently of health status and early life factors: a British birth cohort study. Int J Epidemiol. 2016 Aug 6. pii: dyw100.
2. Zhou CK, Sutcliffe S, Welsh J, Mackinnon K, Kuh D, Hardy R, Cook MB. Is birthweight associated with total and aggressive/lethal prostate cancer risks? A systematic review and meta-analysis. Br J Cancer. 2016 Mar 29;114(7):839-48.
3. Teschendorff AE, Yang Z, Wong A, Pipinikas CP, Jiao Y, Jones A, Anjum S, Hardy R, Salvesen HB, Thirlwell C, Janes SM, Kuh D, Widschwendter M. Correlation of Smoking-Associated DNA Methylation Changes in Buccal Cells With DNA Methylation Changes in Epithelial Cancer. JAMA Oncol. 2015 Jul 1;1(4):476-85.
4. Hartaigh B, Gill TM, Shah I, Hughes AD, Deanfield JE, Kuh D, Hardy R. Association between resting heart rate across the life course and all-cause mortality: longitudinal findings from the Medical Research Council (MRC) National Survey of Health and Development (NSHD). J Epidemiol Community Health. 2014 Sep;68(9):883-9.
5. Albanese E, Strand BH, Guralnik JM, Patel KV, Kuh D, et al. (2014) Weight Loss and Premature Death: The 1946 British Birth Cohort Study. PLoS ONE 9(1): e86282.
6. Maughan B, Stafford M, Shah I, Kuh D. Adolescent conduct problems and premature mortality: follow up to age 65 in a national birth cohort. Psychological Medicine 2013 Aug 21:1-10.
7. Ong K, Hardy R, Shah I, Kuh D on behalf of the NSHD scientific and data collection teams. Childhood stunting and mortality between 36 and 64 years: the British 1946 birth cohort study. Journal of Clinical Endocrinology and Metabolism. 2013 May;98(5):2070-7.
8. Strand BH, Kuh D, Shah I, Guralnik J, Hardy R. Childhood, adolescent and early adult body mass index in relation to adult mortality: results from the British 1946 birth cohort. J Epidemiol Community Health. 2012 Mar; 66(3): 225–232.
9. Henderson M, Hotopf M, Shah I, Hayes RD, Kuh D. Psychiatric disorder in early adulthood and risk of premature mortality in the 1946 British Birth Cohort. BMC Psychiatry 2011 Mar 8;11:37.
10. Kuh D, Shah I, Richards M, Mishra G, Wadsworth M, Hardy R. Do childhood cognitive ability or smoking behaviour explain the influence of lifetime socioeconomic conditions on premature adult mortality in a British post war birth cohort? Soc Sci Med. 2009 May; 68(9): 1565–1573.
11. Clennell S, Kuh D, Guralnik J, Patel K, Mishra G. Characterisation of smoking behaviour across the life course and its impact on decline in lung function and all-cause mortality: evidence from a British birth cohort. Journal of Epidemiology and Community Health 2008;59:304-14.
12. Kuh D, Richards M, Hardy R, Butterworth S, Wadsworth MEJ. Childhood cognitive ability and deaths up until middle age: a post war birth cohort study. International Journal of Epidemiology 2004;33:408-13.
13. Kuh D, Hardy R, Langenberg C, Richards M, Wadsworth MEJ. Mortality in adults aged 26-54 years related to socioeconomic conditions in childhood and adulthood: post war birth cohort study. British Medical Journal 2002;325:1076-80.

Processing:

NSHD receives data from two main sources: i) data collected from the study members themselves over the past 70 years and ii) data from NHS Digital; these data are held in the NSHD-Data Repository (NSHD-DR). Study participants are flagged with NHS Digital. NHS Digital provides quarterly notifications of deaths and cancer registrations. These data are incorporated into the NSHD-DR to enhance that dataset for research purposes. The mortality data (fact of death) are also used for administrative purposes. As well as being used to identify specific health events, linkage to HES data will allow the derivation of useful aggregate variables such as number of hospital admissions and length of time in hospital. The derived aggregate variables are then used for other research analyses by LHA scientists and may be shared with external researchers.
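The derivation of aggregate variables described above (number of hospital admissions and length of time in hospital) could look roughly like the sketch below. The record layout, a tuple of study member ID, admission date and discharge date, is hypothetical and is not the HES schema; counting length of stay as whole days between the two dates is also an assumption.

```python
from collections import defaultdict
from datetime import date

def summarise_admissions(episodes):
    """Return {member_id: (number_of_admissions, total_days_in_hospital)}.

    `episodes` is an iterable of (member_id, admission_date, discharge_date)
    tuples. This layout is illustrative only.
    """
    counts = defaultdict(int)
    days = defaultdict(int)
    for member_id, admitted, discharged in episodes:
        counts[member_id] += 1
        # Whole days between admission and discharge; same-day stays count as 0.
        days[member_id] += (discharged - admitted).days
    return {member_id: (counts[member_id], days[member_id]) for member_id in counts}
```

Only the derived per-member summaries, not the underlying episodes, would then be carried forward into research datasets.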

In scientific studies in the period that pre-dated the MREC/LREC structure, consent was assumed by participation. In this study, the period of assumed consent covers the years from birth to age 35 years (from 1946 to 1981). Ethical permission for the 1982 and 1989 studies was obtained from the local ethical committees that preceded the LRECs and were run by the teaching hospital to which the NSHD research team were then affiliated (Bristol in 1982 and UCL in 1989). In 1999, MREC approval was obtained for the data collection and its use for research purposes by the team and their collaborations (MREC98/1/121). Ethical approval for the feasibility study (MREC06/Q1407/26) and extension study (07/H1008/245) was obtained from the Central Manchester Research Ethics Committee, and additional Scottish approval (08/MRE00/12) was granted through the Scotland A Research Ethics Committee. Most recently, a favourable opinion was obtained from the London Queen Square REC (14/LO/1073) and Scotland A REC (14/SS/1009).

The legal basis for access to NHS Digital data for DARS-NIC-148100 (MR1a) is through consent. In this parallel agreement for members who are lost to follow up, DARS-NIC-86954 (MR1b), the legal basis is through Section 251 of the NHS Act 2006 (CAG approval ref: 15/CAG/0139). ONS Terms and Conditions will be adhered to.

Derived NHS Digital data will be linked to the NSHD-DR which stores all study member data in pseudonymised form going back to 1946. NHS Digital identifiable data can only be viewed by named NSHD staff and is stored separately from pseudonymised derived data. The NSHD-DR additionally holds hospital admissions data that was previously obtained directly from the hospitals or General Practitioners.

All organisations party to this agreement must comply with the Data Sharing Framework Contract requirements, including those regarding the use (and purposes of that use) by “Personnel” (as defined within the Data Sharing Framework Contract ie: employees, agents and contractors of the Data Recipient who may have access to that data).

All those with access to the data are substantive employees of University College London.

All processing of ONS data will be in line with ONS standard conditions.

The data from NHS Digital will not be used for any purpose other than that outlined in this Agreement.

There will be no onward sharing of record level data as part of this application.

Evaluating protocols for identifying and managing patients with FH — DARS-NIC-300282-G9Q0Q

Opt outs honoured: Yes - patient objections upheld, Yes (Excuses: Section 251 NHS Act 2006)

Legal basis: Approved researcher accreditation under section 39(4)(i) and 39(5) of the Statistical Registration Service Act 2007 , Health and Social Care Act 2012 – s261(7), Health and Social Care Act 2012 s261(7), Health and Social Care Act 2012 - s261 - 'Other dissemination of information', Health and Social Care Act 2012 s261(7); National Health Service Act 2006 - s251 - 'Control of patient information'.

Purposes: No (Academic)

Sensitive: Non Sensitive, and Sensitive, and Non-Sensitive

When:DSA runs 2019-11 – 2022-10 2016.04 — 2021.05. SMLS reported a DPA serious incident; breached contract — audit report.

Access method: Ongoing, One-Off

Data-controller type: UNIVERSITY OF NOTTINGHAM, UNIVERSITY OF YORK

Sublicensing allowed: No

AGD/predecessor discussions: igard-minutes---23-july-2020-final.pdf, igarddraftminutes10thdecember2020final.pdf, igardminutes-18thfebruary2021final.pdf

Datasets:

MRIS - Members and Postings Report
MRIS - Cause of Death Report
MRIS - Cohort Event Notification Report
Civil Registration - Deaths
Hospital Episode Statistics Accident and Emergency
Hospital Episode Statistics Admitted Patient Care
Hospital Episode Statistics Outpatients
MRIS - Flagging Current Status Report
Civil Registrations of Death
Hospital Episode Statistics Accident and Emergency (HES A and E)
Hospital Episode Statistics Admitted Patient Care (HES APC)
Hospital Episode Statistics Outpatients (HES OP)

Type of data: Anonymised - ICO Code Compliant, Identifiable

Objectives:

The Register was designed as a dynamic prospective cohort study with notification of embarkation and death being provided by the NHS Central Register. 3,623 patients with FH (and a separate cohort of 340 patients with severe hypertriglyceridemia) were registered from 1980 through to 2012 by up to 22 UK hospital lipid clinics. Results from the Simon Broome Register published in 2008 showed that the high risk of CHD in FH can be reduced to that of the general population by early treatment, particularly with high potency statins as recommended in England by the National Institute for Health and Clinical Excellence. However, although the 3,382 patients were followed up for 46,560 person-years from 1980 to 2006, a number of important questions remain unanswered which may be addressed by continued follow-up until the end of 2014, by which time it is estimated that the person-years of follow-up will have increased by at least half.

UCL therefore wish to update mortality data for this study in order to increase power for this analysis examining the changes in coronary, all-cause, and cancer mortality in men and women with Definite and Possible heterozygous familial hypercholesterolaemia (FH) before and after lipid-lowering therapy with statins.

Yielded Benefits:

Benchmarking: A very similar analysis has been conducted in the context of stable coronary artery disease as part of the University of York led CALIBER project using HES data. The team plan to benchmark the analysis of cardiovascular outcomes against the well-established CALIBER programme findings. This will ensure a level of quality control over cardiovascular outcome definitions and allow comparison of findings. The study has allowed the team to produce long-term estimates of the risk of acute coronary syndrome, stroke/TIA and cardiovascular deaths in FH patients. This has supported the advocacy work of the respective charities (HeartUK and British Heart Foundation). Further, the statistical models produced from these data are informing a cost-effectiveness model quantifying the costs and benefits of diagnosis and treatment for FH patients. This model will be used to assess the cost-effectiveness of different configurations of FH cascade testing services for the purposes of the ongoing NIHR HTA grant. The model will also provide the basis for future assessments of interventions to improve FH diagnosis and treatment. The most recent analysis of the dataset has also highlighted to lipid specialists the persistent inequalities in management of the condition, with poorer morbidity and mortality in female patients and those from deprived backgrounds.

Expected Benefits:

Results from the study are expected to confirm the importance of early identification and treatment of heterozygous FH patients and to inform guidelines for patient treatment with the aim of preventing coronary mortality. Criteria from the study are used in the diagnosis of FH and the study results helped substantively in the development of the 2008 NICE guidelines. Previous results showed that the high risk of CHD in FH can be reduced to that of the general population by early treatment, particularly with high potency statins. These results were published in the European Heart Journal along with an editorial emphasising the importance of these results for diagnosis and management of FH in adults.

Outputs:

Results will be published in medical journals. A publication list for this study has been included with our application. Our 2008 results were published in the European Heart Journal along with an editorial emphasising the importance of these results for diagnosis and management of FH in adults (http://dx.doi.org/10.1093/eurheartj/ehn448). No other such registry data are available elsewhere and the register has provided uniquely valuable data which helped substantively in the development of the 2008 NICE guidelines. The published data will be summary data showing mortality rates for the study participants. No patient level data will be shown. We would aim to publish the main findings within 6 months of receiving the ONS data.

Processing:

Recruitment to the study is not ongoing, with the last patients recruited in 2012. However the study committee would like to recruit further patients to the study at a later date. The identifiable data will be stored and processed within the UCL Data Safe Haven (IDHS) which is a safe haven environment designed to meet the requirements of the NHS Information Governance Toolkit (certified to ISO 27001:2013 – certificate number: IS612909). Data from HSCIC will be collected via secure electronic data transfer by the study statistician, encrypted and immediately transferred to the IDHS. All other study data is pseudonymised. Only the study statistician and study PI have access to the identifiable data which will be used to link and update mortality data for the study. Causes of mortality will be coded and added along with dates to the pseudonymised data ready for analysis.

Patients will be censored on reaching the age of 80 years or, on emigration, at the date of embarkation. All deaths will be coded to ICD 9th revision. Person-years at risk will be accumulated within 5-year age groups and 5-year calendar periods for men and women in the cohort. The age and calendar-specific death rates for men and women in the general population of England and Wales will be applied to the person-years accumulated by men and women in the cohort to estimate the expected numbers of deaths from specified causes. The ratio of the number of deaths observed to the number expected will be expressed as the Standardised Mortality Ratio (SMR for reference population = 100) and exact 95% confidence intervals will be calculated for men and women separately and aggregated for both sexes. The test of significance used will be a two-sided Poisson probability of observing the number of deaths that occurred given the expected number of deaths. The absolute rates of mortality from coronary heart disease and all causes will be calculated per 100,000 person-years per five-year intervals of attained age and also aggregated by 20-year periods of attained age. Previous publications from the Simon Broome Register have used these methods.
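The SMR calculation described above can be sketched as follows. This is an illustration under stated assumptions, not the study's own code: the stratum keys, function names and example figures are hypothetical; the exact interval is computed here by bisection on the Poisson distribution (the Garwood interval); and the two-sided p-value uses the "double the smaller tail" convention, which is only one of several definitions and may differ from the one used by the Register.

```python
import math

def expected_deaths(person_years, reference_rates):
    """Sum of person-years x reference death rate over sex/age/period strata."""
    return sum(py * reference_rates[stratum] for stratum, py in person_years.items())

def poisson_cdf(k, mu):
    """P(X <= k) for X ~ Poisson(mu), with terms computed in log space."""
    if k < 0:
        return 0.0
    if mu <= 0:
        return 1.0
    log_mu = math.log(mu)
    total = sum(math.exp(i * log_mu - mu - math.lgamma(i + 1)) for i in range(k + 1))
    return min(total, 1.0)

def _bisect(f, lo, hi, iterations=200):
    """Root of an increasing function f with f(lo) < 0 < f(hi)."""
    for _ in range(iterations):
        mid = (lo + hi) / 2.0
        if f(mid) < 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0

def exact_poisson_ci(observed, alpha=0.05):
    """Exact (Garwood) confidence limits for a Poisson mean given one count."""
    hi = observed + 10.0 * math.sqrt(observed + 1.0) + 20.0  # generous search bound
    if observed == 0:
        lower = 0.0
    else:
        # Lower limit: P(X >= observed | mu) = alpha / 2
        lower = _bisect(lambda mu: (1.0 - poisson_cdf(observed - 1, mu)) - alpha / 2.0, 0.0, hi)
    # Upper limit: P(X <= observed | mu) = alpha / 2
    upper = _bisect(lambda mu: alpha / 2.0 - poisson_cdf(observed, mu), 0.0, hi)
    return lower, upper

def smr_with_ci(observed, expected, alpha=0.05):
    """SMR (reference population = 100) with exact confidence limits."""
    lower, upper = exact_poisson_ci(observed, alpha)
    scale = 100.0 / expected
    return observed * scale, lower * scale, upper * scale

def two_sided_poisson_p(observed, expected):
    """Two-sided p-value as twice the smaller Poisson tail (an assumed convention)."""
    lower_tail = poisson_cdf(observed, expected)
    upper_tail = 1.0 - poisson_cdf(observed - 1, expected)
    return min(1.0, 2.0 * min(lower_tail, upper_tail))
```

With made-up inputs of 10 observed deaths against 5 expected, `smr_with_ci(10, 5.0)` gives an SMR of 200 with limits of roughly 96 and 368.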

Patients recruited after 1996-1998 have consented to the use of their data in this way.

For patients recruited earlier we have obtained section 251 exemption. Approved researcher forms have been submitted for those processing the mortality data. Published data will be at a summary level and no record level data will be shared with third parties.

MR740 - United Kingdom Collaborative Trial of Ovarian Cancer Screening (UKCTOCS) — DARS-NIC-334952-R5M7K

Opt outs honoured: Yes - patient objections upheld, Yes (Excuses: Section 251, Section 251 NHS Act 2006)

Legal basis: Section 251 approval is in place for the flow of identifiable data, Approved researcher accreditation under section 39(4)(i) and 39(5) of the Statistical Registration Service Act 2007 , National Health Service Act 2006 - s251 - 'Control of patient information'. , Health and Social Care Act 2012 – s261(7), Health and Social Care Act 2012 s261(7), Health and Social Care Act 2012 - s261(5)(d), Health and Social Care Act 2012 - s261(5)(d); National Health Service Act 2006 - s251 - 'Control of patient information'.

Purposes: Yes (Academic)

Sensitive: Sensitive, and Non Sensitive, and Non-Sensitive

When:DSA runs 2018-12 – 2021-11 2017.12 — 2021.04. SMLS reported a DPA serious incident; breached contract — audit report.

Access method: One-Off, Ongoing

Data-controller type: UNIVERSITY COLLEGE LONDON (UCL)

Sublicensing allowed: No

AGD/predecessor discussions: AGD minutes - 11 May 2023 final.pdf, IGARD Minutes - 17 February 2022 Final.pdf

Datasets:

Hospital Episode Statistics Accident and Emergency
Hospital Episode Statistics Admitted Patient Care
Hospital Episode Statistics Critical Care
Hospital Episode Statistics Outpatients
MRIS - Cause of Death Report
MRIS - Cohort Event Notification Report
MRIS - Members and Postings Report
Civil Registration - Deaths
Demographics
Cancer Registration Data
MRIS - Scottish NHS / Registration
MRIS - Flagging Current Status Report
Hospital Episode Statistics Accident and Emergency (HES A and E)
Hospital Episode Statistics Admitted Patient Care (HES APC)
Hospital Episode Statistics Critical Care (HES Critical Care)
Hospital Episode Statistics Outpatients (HES OP)
Civil Registrations of Death

Type of data: Identifiable, Anonymised - ICO Code Compliant

Objectives:

There are two distinct purposes for which the data are required.

Primary Purpose:

The United Kingdom Collaborative Trial of Ovarian Cancer Screening (UKCTOCS) is a multicentre randomised controlled trial which aims to assess the impact of screening on ovarian cancer mortality while comprehensively evaluating physical and psychological morbidity, compliance and resource implications of screening and performance characteristics of a serum CA125 versus ultrasound based screening strategy.

The UKCTOCS team published the primary outcome on impact of screening for ovarian cancer on mortality in December 2015. The UKCTOCS team demonstrated for ovarian cancer a significant reduction in stage and a non-significant 15% mortality reduction in the multi-modal screening group compared to the control (no screening) group. When prevalent cancer cases were excluded, there was a significant reduction in mortality of 20%. The mortality rate was rising in the control arm at censorship while it seemed to be plateauing in the screen arms. On the basis of this evidence, the UKCTOCS team are undertaking extended follow-up of the UKCTOCS cohort through to end of 2019.

At the end of the extended follow-up there will be a further mortality and cost-effectiveness analysis. In addition, there will be continued detailed analysis of routes to diagnosis, investigations and treatment of the patients in the three arms of the trial.
The data will be processed only by the core UKCTOCS team for the purposes of:
• identification of episodes of ovarian and related cancers and details of management;
• identification of the treating physicians for the purposes of requesting medical documentation required for the review process;
• identification of resource costs involved in the diagnosis/treatment of ovarian cancer;
• identification of treatment – surgery which may have resulted in removal of ovaries for the purpose of censorship.
University College London is using the data to ensure that UKCTOCS have tracked individual patients for the cost-effectiveness analysis. The data will ensure that UKCTOCS capture ALL hospital events over the follow-up period. The analysis will be undertaken on an Intent to Treat basis from the perspective of the NHS provider and therefore, as there will be judgment used over which hospital events are directly related to ovarian cancer, particularly for the control arm patients, it will be valuable to include all hospital events for the individual patients enrolled in the trial.

Supporting Secondary Research Studies:

At recruitment, which took place between 2001 and 2005, participants donated serum samples to be used along with their data in future ethically approved secondary studies. They provided written consent to allow access to their medical notes and to permit use of their data.

The team is committed to using the data collected during the long-term follow up of the cohort together with donated serum samples for identification and validation of novel risk prediction/screening/ early detection/diagnosis/treatment strategies including costs, disease biomarkers, insights into natural history of disease and wellbeing in older women and impact of lifestyle and screening on disease outcome.

Proposals for secondary studies have to be ethically approved and reviewed by an oversight committee at UCL. Secondary studies in the field of cancer, cardio-vascular disease and increasingly other chronic diseases in older women are continuously being proposed, submitted for grant applications and being funded, with CRUK and European Union the largest funders so far. As it is not possible to predict exactly which cancer/disease will be investigated UKCTOCS requires HES data on the entire range of diagnoses rather than just the ovarian cancer episodes needed for the primary purpose.

To support the secondary studies detailed above, the UKCTOCS team will use the data supplied by NHS Digital solely for the purpose of:
(1) identification of participants to be included in a specific secondary study. Examples include:
a. for a case control study of myocardial infarction (MI), where the UKCTOCS team already have self-reported data on MI, HES data will be used to confirm and validate the reported data, to identify episodes that occurred after the date of self-reporting as well as additional cases. This would be done by searching for the specific ICD codes in relevant HES fields such as outpatient/inpatient DIAG. The information will be used towards generating a new processed FINAL DIAG data field.
b. for a case control study of dementia, where the UKCTOCS team have not asked about the disease on postal surveys, HES data would be used to identify cases by searching for the specific ICD codes in relevant HES fields such as outpatient/inpatient DIAG. The information will be used to generate a new processed FINAL DIAG data field.
(2) enrichment of phenotypic data (accrued from multiple sources including self-reported during postal follow-up) that is available on a woman with regard to a specific disease. For example:
a. in a cohort study of MI for a prognostic biomarker, the UKCTOCS team would classify the treatment patients underwent into a few major groups. This would be derived by searching the relevant HES field using OPCS codes. The information will be used to generate a final treatment category that will be stored in a new processed FINAL TREATMENT data field.
b. in phenotyping a cancer the UKCTOCS team might infer routes to diagnosis from the hospital department where the patient was seen in the episode that led to the diagnosis, for example an ovarian cancer patient being admitted through A&E or first seen in Gastroenterology outpatients instead of Gynaecological Oncology. The information will be used to generate a final routes to diagnosis category that will be stored in a new processed FINAL ROUTES DIAG data field.
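The case identification described in (1) above, searching diagnosis fields for specific ICD codes and combining the result with self-reported data, might be sketched as below. The episode layout (a dict with `participant_id` and a list of `diag` codes), the function names and the code list are hypothetical; the ICD-10 prefixes I21 and I22 for myocardial infarction are used only as an illustration and are not the study's case definition.

```python
def find_cases(episodes, icd_prefixes):
    """Return IDs of participants with any diagnosis code starting with a given prefix.

    `episodes` is an iterable of dicts with keys 'participant_id' and 'diag'
    (a list of diagnosis code strings). This layout is illustrative only.
    """
    prefixes = tuple(prefix.upper() for prefix in icd_prefixes)
    cases = set()
    for episode in episodes:
        for code in episode["diag"]:
            if code and code.strip().upper().startswith(prefixes):
                cases.add(episode["participant_id"])
                break  # one matching code is enough for this episode
    return cases

def final_diag(self_reported_ids, hes_ids):
    """Label each case by where it was found: both sources, HES only, or self-report only."""
    labels = {}
    for pid in set(self_reported_ids) | set(hes_ids):
        if pid in self_reported_ids and pid in hes_ids:
            labels[pid] = "confirmed"
        elif pid in hes_ids:
            labels[pid] = "hes_only"
        else:
            labels[pid] = "self_report_only"
    return labels
```

Here `find_cases(episodes, ["I21", "I22"])` yields the HES-identified cases, and `final_diag` shows one possible way a processed FINAL DIAG style field could distinguish validated self-reports from additional cases.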

The use of samples/data in secondary studies, usually case control studies, requires not only identifying the disease but also further details of the individual/disease to ensure eligibility for the particular study. This additional data, required to make the case/control selection, include type of treatment, route to diagnosis, treatment date so that appropriate samples before or after diagnosis/treatment/recurrence of a cancer are identified and confounding information such as other diseases both in individuals with diseases (cases) and controls. The enriched phenotypic information is generated from the combination of HES data with information provided by the women in their follow-up questionnaires as well as information provided by clinicians and primary care physicians. HES data (in relation to secondary purposes) may only be used to identify and characterize individuals that could have their serum samples/data used within particular studies.

The raw NHS Digital data will only be accessed, analysed and processed by the UKCTOCS team within the Data Safe Haven at the University College London (UCL) and will not be shared with any third party. No dates will ever be shared. Where a diagnosis or event has been identified using HES data the age at which the diagnosis/event occurred will be generated from the HES date and the self-reported date of birth which the participants have shared with the UKCTOCS team.
The data that third parties will have access to is limited to generated fields. Only anonymous data will be released to third parties.
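Generating an age at diagnosis/event from the HES date and the self-reported date of birth, so that no date needs to be released, could be done along these lines. The function name is hypothetical, and returning age in completed years is an assumption; the source does not state the precision used.

```python
from datetime import date

def age_at_event(date_of_birth, event_date):
    """Age in completed years at the event date."""
    years = event_date.year - date_of_birth.year
    # Subtract one if the birthday has not yet been reached in the event year.
    if (event_date.month, event_date.day) < (date_of_birth.month, date_of_birth.day):
        years -= 1
    return years
```

Only the resulting integer would go into a generated field; the underlying dates stay within the Data Safe Haven.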

The secondary studies involve both academic partners and industrial partners including Abcodia (a UCL spin-off) which has an exclusive commercial license to work on biomarker discovery and validation using serum samples from the UKCTOCS biobank. Neither the storage facility nor any third party (including Abcodia) will have access to the NHS Digital data. Only the UKCTOCS team will have access to the patient level data and data will only be accessed at the approved locations.

Yielded Benefits:

Below are a number of recent publications that have been prepared using NHS Digital provided data, either in the identification of ovarian cancer cases or in the identification of case/control nested sample sets. No record-level data supplied by NHS Digital was published.
1. The cost-effectiveness of screening for ovarian cancer: results from the UK Collaborative Trial of Ovarian Cancer Screening (UKCTOCS). Menon U, McGuire AJ, Raikou M, Ryan A, Davies SK, Burnell M, Gentry-Maharaj A, Kalsi JK, Singh N, Amso NN, Cruickshank D, Dobbs S, Godfrey K, Herod J, Leeson S, Mould T, Murdoch J, Oram D, Scott I, Seif MW, Williamson K, Woolas R, Fallowfield L, Campbell S, Skates SJ, Parmar M, Jacobs IJ. Br J Cancer. 2017 Aug 22;117(5):619-627. doi: 10.1038/bjc.2017.222. Epub 2017 Jul 25.
2. Risk of chronic liver disease in post-menopausal women due to body mass index, alcohol and their interaction: a prospective nested cohort study within the United Kingdom Collaborative Trial of Ovarian Cancer Screening (UKCTOCS). Trembling PM, Apostolidou S, Gentry-Maharaj A, Parkes J, Ryan A, Tanwar S, Burnell M, Jacobs I, Menon U, Rosenberg WM. BMC Public Health. 2017 Jun 28;17(1):603. doi: 10.1186/s12889-017-4518-y.
3. Elevation of TP53 Autoantibody Before CA125 in Preclinical Invasive Epithelial Ovarian Cancer. Yang WL, Gentry-Maharaj A, Simmons A, Ryan A, Fourkala EO, Lu Z, Baggerly KA, Zhao Y, Lu KH, Bowtell D, Jacobs I, Skates SJ, He WW, Menon U, Bast RC Jr; AOCS Study Group. Clin Cancer Res. 2017 Oct 1;23(19):5912-5922. doi: 10.1158/1078-0432.CCR-17-0284. Epub 2017 Jun 21.
4. Testing breast cancer serum biomarkers for early detection and prognosis in pre-diagnosis samples. Kazarian A, Blyuss O, Metodieva G, Gentry-Maharaj A, Ryan A, Kiseleva EM, Prytomanova OM, Jacobs IJ, Widschwendter M, Menon U, Timms JF. Br J Cancer. 2017 Feb 14;116(4):501-508. doi: 10.1038/bjc.2016.433. Epub 2017 Jan 12.
5. Novel risk models for early detection and screening of ovarian cancer. Russell MR, D'Amato A, Graham C, Crosbie EJ, Gentry-Maharaj A, Ryan A, Kalsi JK, Fourkala EO, Dive C, Walker M, Whetton AD, Menon U, Jacobs I, Graham RL. Oncotarget. 2017 Jan 3;8(1):785-797. doi: 10.18632/oncotarget.13648.

Expected Benefits:

Primary Objective:
Ovarian cancer is the fourth commonest cause of death from cancer amongst women in the UK. The majority of women unfortunate enough to develop this cancer have few symptoms until it has spread outside the ovaries. By this time it is difficult to treat and approximately 70% of these women will die. In contrast, the outlook for the small proportion of women diagnosed before ovarian cancer has spread is good. This research trial is based on the premise that screening which detects ovarian cancer at an early stage may reduce the number of deaths. At the end of the study UKCTOCS will have information about how many lives ovarian cancer screening can save, how much this will cost, how women feel about being screened and the complications of screening.

The Director of the NHS Cancer Screening Programme sits on the UKCTOCS steering committee and is already aware of the project and its detailed publication plan. The NHS Cancer Screening Programme has contacted the UKCTOCS team to plan independent analysis (which does not include HES data) of the cost effectiveness of any screening within the NHS to make a timely decision as to whether an NHS national screening programme for ovarian cancer should be introduced.

Supporting Secondary Research Studies:
UKCTOCS will also have information on the potential of endometrial cancer screening and the cost benefits if any. It is expected that the serum bank and associated data will lead to biomarkers that will be used for the early detection of disease in asymptomatic subjects as well as predictive and prognostic tests of clinical use. In the short term some of these may be incorporated into new clinical trials.

Outputs:

Primary Objective:
Products are limited to publications in peer-reviewed Medical and Scientific Journals, oral and written presentations at national and international conferences. Personal data is not disclosed.

The data will form the basis of the cost effectiveness analysis, supplemented by additional treatment data extracted from patient notes. The final output will be a publication which will only contain aggregate results with small number suppression, in line with the HES Analysis Guidelines. The aim is to publish the paper by the end of 2017.

The final mortality results were published in December 2015. The results showed a non-significant 15% reduction in mortality from Ovarian cancer in the Multi-modal screening group compared to the Control group and a significant reduction of 20% when prevalent cases were excluded from the analysis. On the basis of these results a NIHR HTA grant has been awarded for extension of follow-up of the UKCTOCS cohort for a further mortality analysis at end of 2019.

With regard to other outputs, in most cases, the data received from NHS Digital is/will be used only to identify/trace women diagnosed with ovarian cancer so that the study team can retrieve the patient notes on which analysis is based. The data has contributed to the identification and classification of some of these cancers. To date, there have been 15 published UKCTOCS papers and significantly more oral presentations. Multiple data sources have been used.

Further planned outputs in 2017 include publications on performance characteristics of the ultrasound screening strategy and the primary analysis results (Impact of screening on ovarian cancer mortality).

In addition, there will be analysis of data source contribution towards a confirmed diagnosis (e.g. in common cancers/diseases, what proportion had data available through NHS Digital cancer registration, death registration, HES, follow-up questionnaire etc.). This will be a useful indicator for future researchers working in the disease area as a comparison will be made between cases identified through the listed sources and the confirmed diagnosis classification.

Progress reports are also submitted to the NIHR (Evaluation, Trials and Study Coordinating Centre) who oversee the governance of the MRC funded clinical trials. These do not contain any patient data.

Supporting Secondary Research Studies:
The output of using the data is typically the identification of other cancers/diseases (in the same way as for ovarian cancer) to select cases for inclusion in nested case control studies. Data provided by NHS Digital is used to identify women with diagnoses of cancer/disease and determine the date of diagnosis in relation to when serum samples donated by the women were collected. The data is then no longer used in such secondary studies.

One exception is the CRUK-funded study of the cost effectiveness and possible impact of endometrial cancer currently being undertaken by UCL’s Gynaecological Cancer Research Centre in collaboration with the London School of Hygiene and Tropical Medicine. This study used HES data in analysis and will produce outputs containing aggregate results (complying with the HES Analysis Guidelines). For this study, no data was transferred outside the department and only UCL employees had access to the HES data. A publication with the working title ‘Cost of Survival Over 5 Years Following Endometrial Cancer Diagnosis’ was submitted to the British Journal of Obstetrics and Gynaecology in December 2015.

Processing:

Section 251 support has been obtained as the legal basis for obtaining NHS Digital HES data for both the primary purpose and secondary studies detailed above and below. UCL supplied the identifying details of a cohort to NHS Digital including name, date of birth, NHS Number and address. This cohort also included a Volunteer Reference Number in order to link to UCL’s participants and only pseudonymised data is transferred back to UCL where it is linked back to the original study database containing patient identifiable data.
The data is stored and processed within the UCL SLMS Identifiable Data Handling Solution (IDHS) which is the Data Safe Haven (DSH). The data is held within a Microsoft SQL 2005 database with access limited to staff specifically granted access.

Primary Purpose:
As part of the primary UKCTOCS analysis, diagnosis codes associated with an ovarian cancer and other gynaecological malignancies and corresponding operation codes are analysed. Where such codes are identified, UKCTOCS will request medical records from the GP or treating consultant for review and confirmation of diagnosis.
As part of the UKCTOCS cost-effectiveness analysis, the project will need to consider hospital in-patient and out-patient resource use and costs relating to standard therapy and any follow-on costs associated with an ovarian cancer diagnosis as well as with false positive surgery/investigations in the screened population.
A cost-effectiveness analysis compares the cost to the NHS of screening and treatment for ovarian cancer in the screen arms of the trial to costs of diagnosis and treatment in the control (no screening) arm of the trial. In- and out-patient HES data is critical in identifying the procedures and treatments that both groups received. In addition, HES A&E and HES Critical Care data are required for the routes to diagnosis analysis, as many ovarian cancers come through A&E. Some ovarian cancer patients are also likely to be admitted to critical care post-surgery or following a complication. All of this would need to be included in the cost-effectiveness analysis of treatment.

Supporting Secondary Research Studies:
The data will only be used to contribute to identification of cases for nested case control and cohort studies. The data contributes to disease identification and helps to establish the interval between sample collection and diagnosis date. In a proportion of studies, once potential cases are identified, the treating clinician is contacted for the details and confirmation of diagnosis, followed by selection of the appropriate serum sample sets from the UKCTOCS biobank. The serum samples are stored at a commercial facility, the costs of which are currently covered by the UCL contract with Abcodia, which allows Abcodia access to serum samples.

Abcodia receive serum samples donated by the UKCTOCS volunteers. The only NHS Digital-sourced data provided alongside the sample is information that identifies whether a sample originated from a case (diagnosis of the cancer/disease being investigated) or control subject, date of diagnosis and date of sample donation being converted to age and, where the volunteer has passed away, age at death and a generated field identifying whether death was caused by the disease in question. The source of the diagnosis will not be revealed to the third party. No other identifiable or record level data provided by NHS Digital will accompany the serum samples that Abcodia will access.

Many of the secondary studies involve collaborations: data analysis studies mostly involve academic collaborators, while biomarker studies involve both academic and industry collaborators. Third parties will only be provided with anonymised serum samples for nested case control studies. Data will be provided that identifies whether a sample originated from a case (as defined by whether the individual was diagnosed with the cancer/disease being investigated) or control. The source of the diagnosis will not be revealed to the third party. HES data is only used as described above to supplement data from the patient follow up questionnaires, NHS Digital cancer registrations and deaths, Myocardial Ischaemia National Audit Project (MINAP) data and GP data in identifying the appropriate samples that UKCTOCS make available for any studies – commercial or academic. No identifiable data and no NHS Digital data is released to Third Parties.

Evaluating variation in special educational needs provision for children with Down syndrome and associations with emergency use of hospital care. — DARS-NIC-50975-X6N3J

Opt outs honoured: Yes - patient objections upheld, Yes (Excuses: Section 251 NHS Act 2006)

Legal basis: Health and Social Care Act 2012 – s261(1) and s261(2)(b)(ii), Health and Social Care Act 2012 s261(1) and s261(2)(b)(ii), Health and Social Care Act 2012 s261(2)(b)(ii)

Purposes: No (Academic)

Sensitive: Sensitive, and Non-Sensitive

When:DSA runs 2019-01 – 2022-01 2020.12 — 2020.12. SMLS reported a DPA serious incident; breached contract — audit report.

Access method: One-Off, Ongoing

Data-controller type: UNIVERSITY COLLEGE LONDON (UCL)

Sublicensing allowed: No

AGD/predecessor discussions: igard-minutes-23-august-2018-final.pdf, igard-minutes-2nd-august-2018-final.pdf, igard-minutes-3-may-2018.pdf, igardminutes-29thoctober2020final.pdf, IGARD Minutes - 26 January 2023 final.pdf, IGARD Minutes - 26 August 2021 final.pdf, IGARD Minutes - 29 July 2021 - FINAL.pdf, igard-minutes-1-november-2018---final.pdf, igardminutes-4thfebruary2021final_.pdf

Datasets:

MRIS - Bespoke
Civil Registration - Deaths
Emergency Care Data Set (ECDS)
HES:Civil Registration (Deaths) bridge
Hospital Episode Statistics Accident and Emergency
Hospital Episode Statistics Admitted Patient Care
Hospital Episode Statistics Critical Care
Hospital Episode Statistics Outpatients
Civil Registrations of Death
Hospital Episode Statistics Accident and Emergency (HES A and E)
Hospital Episode Statistics Admitted Patient Care (HES APC)
Hospital Episode Statistics Critical Care (HES Critical Care)
Hospital Episode Statistics Outpatients (HES OP)

Type of data: Anonymised - ICO Code Compliant

Objectives:

There is considerable variation across England in both health and education services for children with chronic conditions and special educational needs. This request from University College London (UCL) relates to the first stage of a larger project to evaluate variation in service provision from education, children’s social care and healthcare for children with Down syndrome and determine the impact on emergency use of hospital care. This first stage will involve data from the National Down Syndrome Cytogenetic Register (NDSCR), part of the National Congenital Anomalies and Rare Diseases Registration Service provided by Public Health England (PHE), and Hospital Episode Statistics for England (HES).

The patient identifiable HES data provided by NHSD to PHE is linked to the records of individuals with a congenital anomaly or rare disease (CARD) held by PHE in the National Congenital Anomaly and Rare Diseases Register (https://www.gov.uk/guidance/the-national-congenitalanomaly-and-rare-disease-registration-service-ncardrs). PHE has Section 251 support to collect information on cases of CARDs in England and to link this to other sources of information on the diagnosis, treatment (such as HES) and outcomes of these individuals.

A planned second stage of the research will include linkage to the National Pupil Database but this is not permitted under this Data Sharing Agreement. Approval for the second stage will be sought as an amendment when the Department for Education has acquired the necessary certification for their information governance procedures.

This research is jointly funded by the Economic and Social Research Council (ESRC) and the Administrative Data Research Network (ADRN). The ESRC funds the Administrative Data Research Network (ADRN) (soon to be the Administrative Data Research Centre for England (ADRC-England)). The ADRN website will be the public-facing website for this study. ADRC-England is led by the University of Southampton and runs in collaboration with (amongst others) UCL, who are the data controllers for this request. ADRN have a prominent role in cross-sectoral linkages, which is why the study information is being published on the ADRN website.

ESRC is the funder for ADRN/ADRC, and support for the study will also be provided by the Children’s Policy Research Unit (CPRU) at UCL, which is funded by the Department of Health. The ADRN facilitates requests to access administrative data for research by independently reviewing applications and liaising with stakeholders. The ADRN review panel approved this study and published a summary on their website. The ADRN and ADRC-E are multicentre collaborations involving researchers from many institutions. This project is being led by ADRC-E researchers at University College London only and does not involve any researchers from the University of Southampton. The Institute of Education is an Institute of University College London but was a separate institution at the time of preparing the protocol. Lorraine Dearden, who has an appointment with the (UCL) Institute of Education, is a co-investigator and contributed to production of the protocol. Only substantive employees of UCL will have access to the data specified in this Agreement. A project summary has been published on https://adrn.ac.uk/research-impact/research/project165/. A summary will also be published on the CPRU website http://www.ucl.ac.uk/children-policy-research.

Objectives relating to this first stage are:

Objective A1: Identify a cohort of children with and without Down syndrome using linked data from HES and NDSCR.
Objective A2: Monitor variation in comorbidity, mortality rates and healthcare use in children with Down syndrome vs the general population, over time and by region.

Public Health England will be supporting the research by linking their NDSCR data to the copy of HES that they already hold, to identify the study cohort. PHE will generate pseudo-identifiers for the cohort and provide pseudonymised NDSCR data to researchers at UCL. PHE will also attach the pseudo-identifiers to a file containing matching variables from HES for the cohort, and send this file to NHS Digital for further linkage.

Using the matching variables provided by PHE, NHS Digital will link the cohort to the Personal Demographic Service to provide updated postcode histories in preparation for stage 2 (stage 2 will involve sending identifiers to the Department for Education for linkage but approval for that sharing does not form part of this application). NHS Digital will then attach encrypted HESID pseudonyms to the cohort using the encryption key for the HES dataset already held by UCL researchers (NIC-393510-D6H1D) and send to UCL a file containing only cohort ID, HESID, ONS mortality data, and indicators of linkage quality.

UCL will use the incoming file from NHS Digital to extract a subset of the data already provided under NIC-393510-D6H1D, for use in this project under a new data sharing agreement (just as if NHS Digital has provided a new project-specific extract, but avoiding an additional data flow). They will combine the HES data extracted from the existing project with the incoming NDSCR data from PHE and carry out analyses on the linked data in pursuit of the objectives stated above.

The legal basis for processing falls under GDPR Article 6(1)(e) and GDPR Article 9(2)(j). Processing and dissemination meet the public interest criteria as defined by the ICO for good decision-making by public bodies and securing the best use of public resources. The processing of data for this study is a task of public interest since it will benefit policy makers, the public and children with Down syndrome and their families, through evidence about how health outcomes and use of healthcare services compare between children with and without Down syndrome, and how they vary over time and region.

Yielded Benefits:

Data has not yet been disseminated so no benefits have yet been realised.

Expected Benefits:

This first stage of the study aims to demonstrate the feasibility and benefit of using linked NDSCR and HES data to monitor health outcomes and healthcare service use in children with Down syndrome. There will be initial benefit to policy makers, the public and children with Down syndrome and their families, through evidence about how health outcomes and use of healthcare services compare between children with and without Down syndrome, and how they vary over time and region. Such evidence is not currently available for England or any UK country.

Results will also be useful for the development of policies and strategies to reduce unplanned hospital admissions in children and young people with Down syndrome who have many medical problems, such as recurrent severe respiratory tract infections (RTIs). Finally, the study will benefit the data providers as the assessment of data quality and consistency between sources and the development of optimal strategies for linking large administrative datasets will be an integral part of the study.

Other benefits include:
~ Children with Down syndrome and their families will be able to access more comprehensive and up-to-date information about prognosis and life expectancy. It is expected that statistics reported in the scientific papers will be able to contribute to the production of literature tailored more for clinical and public audiences, and that engagement with Public Health England will support this.
~ A feasible method for routine linkage of disease registers to hospital episode statistics and mortality data at Public Health England will be established.
~ Currently nobody knows what level of coverage the National Down Syndrome Cytogenetic Register has, or how well Down syndrome is captured by Hospital Episode Statistics. This limits the interpretation of research and statistics generated from both. This research will generate evidence about the quality of each dataset with respect to recording and population coverage of Down syndrome, allowing for more accurate conclusions to be drawn.

Outputs:

The first output of this study is a dataset comprising all children with Down syndrome and a cohort of matched controls at a ratio of 1:9. This dataset will be used to monitor variation in comorbidity and healthcare use in children with Down syndrome vs the general population, over time and by region. Findings will be disseminated through peer-reviewed academic journals, conferences and social media. Findings will also be shared with the Down's Syndrome Association. The researchers will specifically target a number of journals and conferences. These include PLoS One, BMJ Open, Social Science and Medicine, Journal of Epidemiology and Community Health, Journal of Public Health, Archives of disease in childhood and paediatrics, Public Health Science Conference, and the International Conference on Congenital Anomalies and Pathology.

The researchers will publish a series of methodological papers in peer reviewed journals reviewing the linkage and validating the data from the two data sources. The targeted journals will include the International Journal of Epidemiology, and PLoS One.

The primary outputs will include a series of academic papers describing:
~ Health outcomes (as indicated by health service use and mortality) for people with Down syndrome, including how these have changed over time and how they compare with the general population
~ Methodological issues in the construction of the linked dataset.

The first of these reports covering each topic is expected to be produced within the first year of receiving data, with more detailed analysis and reporting to be conducted over the next three years. Methodological outputs are expected to lead directly to translation into practice at Public Health England. Findings about the health outcomes will be shared with stakeholder groups such as the Down's Syndrome Association, to support dissemination to patients and to receive feedback to guide the design of future research applications for the linked data (subject to approvals).

Findings will also be disseminated directly through seminars at Public Health England, which will help to develop plans for translating this demonstration project into the establishment of routine ongoing linkage systems for population health monitoring, both for Down syndrome and other disease registers maintained at Public Health England. Findings will also be presented to clinicians at meetings organised by the North Thames CLAHRC, GOSH and the Down’s Syndrome Medical Interest Group, who disseminate findings across the UK (https://www.dsmig.org.uk/). As funding for ongoing work will be through the CPRU, findings will be disseminated to policy makers at DH and through the CPRU network of stakeholders in children’s health.

All outputs will contain aggregate level data only and all small numbers will be suppressed in compliance with ADRC-E statistical output controls and the HES analysis guides. No potentially disclosive outputs will be shared or published. The projects in this application are expected to finish three years after obtaining the data.

Processing:

Data Processing:

1 - Processing at PHE:
a) Link NDSCR to HES (both datasets held by PHE) for children born between 1.4.97 and 31.03.14.
b) For every child identified as having Down syndrome through either a registration in NDSCR or clinical codes in HES, select 9 controls without Down syndrome from HES birth admissions. Assign a ‘cohort ID’ pseudo-identifier to each cohort member (case or control). The controls are selected at random from the same week and year of birth and local authority of residence as the children with Down syndrome in the NDSCR-HES cohort, using the HES data only. There will be approximately 117,000 controls and 13,000 cases. This equates to 9 controls per case.
c) Send to UCL a file containing cohort ID, Down syndrome registration flag, diagnosis date, and linkage quality indicators.
d) Send to NHS Digital a file containing cohort ID and conventional identifiers (Epikey for birth record, Financial year for birth record, NHS number, DOB, Postcode, Sex, Ethnic group, Birthweight, Birth order) obtained from HES.
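
The control selection in step 1(b) can be illustrated with a minimal Python sketch. It is not PHE's actual procedure: the record layout (the keys 'id', 'birth_week', 'birth_year' and 'local_authority'), the fixed random seed, and the rule that a candidate is used for at most one case are all assumptions made for illustration. The candidate list is assumed to contain only children without Down syndrome.

```python
import random
from collections import defaultdict

CONTROLS_PER_CASE = 9  # ratio stated in step 1(b)


def stratum(record):
    """Matching key: week and year of birth plus local authority of residence."""
    return (record["birth_week"], record["birth_year"], record["local_authority"])


def select_controls(cases, candidates, seed=0):
    """Randomly pick up to CONTROLS_PER_CASE controls per case from the same stratum.

    Records are dicts with hypothetical keys (see lead-in). Each candidate
    is assigned to at most one case (an assumption, not stated in the source).
    """
    rng = random.Random(seed)
    pool = defaultdict(list)
    for cand in candidates:
        pool[stratum(cand)].append(cand["id"])
    for ids in pool.values():
        rng.shuffle(ids)  # random order within each stratum

    matches = {}
    for case in cases:
        available = pool[stratum(case)]
        n = min(CONTROLS_PER_CASE, len(available))  # fewer if stratum runs out
        matches[case["id"]] = [available.pop() for _ in range(n)]
    return matches
```

At the stated ratio, approximately 13,000 cases correspond to 13,000 × 9 = 117,000 controls, which agrees with the figures in step 1(b).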

2 - Processing at NHS Digital
a) Match cohort to PDS, using matching variables provided by PHE (derived from HES).
b) Prepare file containing matching variables (Study ID, PDS sequence number, First name, surname, other names, DOB, postcode, ethnic group) for future sharing with Department for Education pending approvals for stage 2 (to be disseminated under a future amendment). This process is taking place now (rather than at the amendment stage) to ensure that the file created uses the same cohort. If this were done at a later date, the cohort could be slightly different and the results might therefore be incorrect.
c) Prepare file for sharing with UCL now, containing cohort ID, HESID (using the same encryption key as used for NIC-393510-D6H1D) and indicators of match quality to PDS.
d) NHS Digital to send file to UCL.

3 - Processing at UCL
a) UCL will use the file provided by NHS Digital to create an extract of cohort HES data from project NIC-393510-D6H1D for separate storage, access and processing by the researchers named on this (Down syndrome) project, and link this extract with the files provided by NHS Digital and Public Health England.
b) UCL researchers will categorise the NDSCR-HES cohort according to underlying chronic conditions, major congenital anomalies and other comorbidities (based on clusters of diagnostic and procedure codes in HES), local authority, gender, maternal age at birth, deprivation quintile and ethnicity.
c) UCL researchers will undertake descriptive analyses to determine variation in unplanned admissions and health outcomes (e.g. mortality) by region of residence, adjusting for child characteristics in the cohort with Down syndrome compared with controls.

Data minimisation:

Consistent with the principle of minimum data required for purpose, UCL have requested only variables that are necessary for the planned analyses. Where possible UCL will minimise the sensitivity of the dataset by requesting less detailed variables. UCL will minimise the potential risk of disclosure from the variables in the final analysis file by:
Rounding date of birth to month and year of birth, removing postcode, using broad ethnic groups, using quintile of index of multiple deprivation, retaining location of residence at local authority only.
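
The minimisation steps listed above could look roughly like the following Python sketch. The field names, the mapping from detailed to broad ethnic groups, and the derivation of a deprivation quintile from a rank (rank 1 falling in quintile 1) are all hypothetical; the agreement does not specify how UCL implements these steps.

```python
from datetime import date


def imd_quintile(rank, n_areas):
    """Quintile 1..5 from a deprivation rank in 1..n_areas (integer ceiling division)."""
    return (rank * 5 + n_areas - 1) // n_areas


def minimise(record, broad_groups, n_areas):
    """Return a reduced copy of one record for the analysis file.

    record is a dict with hypothetical keys: 'cohort_id', 'dob' (a date),
    'postcode', 'ethnicity', 'imd_rank', 'local_authority'.
    """
    return {
        "cohort_id": record["cohort_id"],
        # date of birth rounded to month and year
        "birth_month": record["dob"].strftime("%Y-%m"),
        # detailed ethnicity collapsed to a broad group
        "ethnic_group": broad_groups.get(record["ethnicity"], "Unknown"),
        # deprivation kept as a quintile only
        "imd_quintile": imd_quintile(record["imd_rank"], n_areas),
        # residence kept at local authority level; postcode is not copied
        "local_authority": record["local_authority"],
    }


example = minimise(
    {
        "cohort_id": "X1",
        "dob": date(2005, 3, 17),
        "postcode": "AB1 2CD",  # made-up value, dropped by minimise()
        "ethnicity": "detailed_1",
        "imd_rank": 21,
        "local_authority": "LA1",
    },
    broad_groups={"detailed_1": "Broad group 1"},
    n_areas=100,
)
```

Building a new dictionary from named fields, rather than deleting fields from the original, means any variable not explicitly listed is left out of the analysis file by default.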

UCL do require, however, data over the full range of years available of the HES record (from 1.4.97 onwards). This is because the researchers are interested in chronic conditions, which can only be identified from the longitudinal record, ideally from birth. Previous work has shown that chronic underlying conditions may not be recorded at every admission (for example, asthma may not be recorded when a child is admitted for an operation).

All organisations party to this agreement must comply with the Data Sharing Framework Contract requirements, including those regarding the use (and purposes of that use) by “Personnel” (as defined within the Data Sharing Framework Contract, i.e. employees, agents and contractors of the Data Recipient who may have access to that data).

MR1318 - General Health & Hospital Admissions in Children Born after ART; A Population Based Linkage Study — DARS-NIC-180665-GJMW5

Opt outs honoured: N, Yes - patient objections upheld, Yes (Excuses: Section 251 NHS Act 2006)

Legal basis: Health and Social Care Act 2012, Approved researcher accreditation under section 39(4)(i) and 39(5) of the Statistical Registration Service Act 2007 , Section 251 approval is in place for the flow of identifiable data, Health and Social Care Act 2012 – s261(1) and s261(2)(b)(ii), Health and Social Care Act 2012 s261(1) and s261(2)(b)(ii); Health and Social Care Act 2012 s261(7), Health and Social Care Act 2012 s261(1) and s261(2)(b)(ii), Health and Social Care Act 2012 s261(2)(b)(ii); Health and Social Care Act 2012 s261(7), Health and Social Care Act 2012 s261(2)(b)(ii), Health and Social Care Act 2012 - s261(5)(d), Health and Social Care Act 2012 - s261(5)(d); Health and Social Care Act 2012 s261(7), Health and Social Care Act 2012 s261(2)(a), Health and Social Care Act 2012 s261(2)(a); Health and Social Care Act 2012 s261(7)

Purposes: No (Academic)

Sensitive: Non Sensitive, and Non-Sensitive

When:DSA runs 2018-07 – 2021-07 2017.06 — 2020.10. SMLS reported a DPA serious incident; breached contract — audit report.

Access method: One-Off, Ongoing

Data-controller type: UNIVERSITY COLLEGE LONDON (UCL)

Sublicensing allowed: No

AGD/predecessor discussions: IGARD Minutes - 2 July 2020 final.pdf, IGARD_Minutes_20.07.17.pdf, DAAG_Minutes_31.03.15.pdf, igard-minutes---6-aug-2020-final.pdf, DAAG_Minutes_10.02.15.pdf

Datasets:

Hospital Episode Statistics Critical Care
Hospital Episode Statistics Accident and Emergency
Hospital Episode Statistics Admitted Patient Care
Hospital Episode Statistics Outpatients
MRIS - Bespoke
Civil Registration - Births
Hospital Episode Statistics Accident and Emergency (HES A and E)
Hospital Episode Statistics Admitted Patient Care (HES APC)
Hospital Episode Statistics Critical Care (HES Critical Care)
Hospital Episode Statistics Outpatients (HES OP)

Type of data: Anonymised - ICO Code Compliant

Objectives:

UCL (Great Ormond Street Institute of Child Health) wish to establish if children born after assisted conception (including IVF and related techniques) are at an increased risk of specific diagnoses compared to spontaneously conceived siblings and unrelated spontaneously conceived controls.

The diagnoses which are to be investigated include:
a) Complications of Prematurity (such as respiratory distress syndrome, necrotizing enterocolitis, retinopathy of prematurity, intra-ventricular haemorrhages, periventricular leucomalacia)
b) Cerebral palsy
c) Congenital malformations
d) Asthma and allergic disease
e) Developmental delay
f) Death
g) Cancer
h) Hospitalization rates and length of stay

These comparisons will help to provide robust risk estimates for this ever growing population. This is important as all of these potential risks have been suggested by previous research but have never been confirmed, as previous studies lacked the necessary power and design to do so.

UCL is not requesting identifiable data from NHS Digital and will not pass any data on to third parties, including the HFEA.

Fertility clinics have been engaged in a campaign to inform patients of how their data may be used for research and how they could opt out. Families connected to ART have been consulted on what they would like research to achieve. These service user views are incorporated in UCL's study design.

Yielded Benefits:

The receipt of all data and data linkage processes have taken much longer than expected. The main benefits will be delivered once the dataset is analysed and outputs are produced. This project has already produced the benefit of linking cohort members and their mothers, which has also enabled restricted access to more reliable identifiers of cohort members than was originally available from the HFEA. As per original data sharing agreement, these identifiers are currently being held by NHS-Digital securely, but would allow future researchers to undertake important linkage work not possible before this project (when only very limited cohort identifiers were held by the HFEA despite their mandate to collect such comprehensive data). Such future access is subject to approvals from HFEA, CAG, HRA and NHS-Digital. UCL have no access to this data.

Expected Benefits:

With 2% of babies now born through ART every year, this research is important for those babies, their mothers and those parents considering ART, the clinicians that treat these patients and the healthcare system as a whole. There are still many questions over the best ART methods and the long term health outcomes for these children. This study will benefit patients and the healthcare system by providing robust analysis to help remove some of these uncertainties.

This research has clear public benefits and its outcomes will significantly add to the currently very limited body of research in this area.

The results will be used to provide information to ART stakeholder groups, including fertility experts, patients wishing to undergo ART, children born after ART and their families and public health workers.

Benefits from this study will include robust risk estimates for children born after assisted conception in comparison to both control groups (spontaneously conceived siblings and spontaneously conceived unrelated children). This information is crucial for:

i. counselling of families of children born after assisted conception
ii. couples who wish to have assisted conception
iii. informing practitioners of any increased health risks enabling early diagnosis
iv. future health service planning for this population

Outputs:

Outputs will contain only aggregate level data with small numbers suppressed in line with the HES analysis guidance.
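
As an illustration of small number suppression on an aggregate table, here is a minimal Python sketch. The threshold of 5, the "*" marker and the choice to leave zero counts visible are assumptions made for the example; the applicable rules are those in the HES analysis guidance, which this sketch does not reproduce. It also does not handle secondary suppression, where further cells are masked so that a suppressed value cannot be recovered from row or column totals.

```python
def suppress_small_numbers(counts, threshold=5, marker="*"):
    """Mask counts from 1 to threshold inclusive; zeros and larger counts pass through.

    counts maps a category label to a non-negative integer count.
    threshold and marker are illustrative defaults, not values from the guidance.
    """
    return {
        label: (marker if 0 < n <= threshold else n)
        for label, n in counts.items()
    }
```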

The expected outputs from this project include a number of scientific papers, detailing robust risk estimates for the outcomes under investigation (including hospitalization incidence, and incidence of specific diagnoses). It is expected these papers will be submitted by the end of 2018 or the beginning of 2019 and published shortly afterwards.

It is expected that these papers will be submitted to broad medical peer-reviewed journals which may or may not be subscription only, however abstracts of this work will be open access. The main audience for these papers will be a scientific/ clinical audience, in order that clinicians disseminate results to their service users.

Additionally, the study will produce a report for the HFEA to publish open access on their website and disseminate via their networks (the fertility clinics, clinicians and directly to patients via these clinics and their website), aiming for June 2019 or 2 years after receiving data from NHS Digital.

It is aimed to also submit abstracts to the Royal College of Paediatrics and Child Health to further publicise results.

It is not possible to say exactly which journal will be the appropriate one for submission of these reports as this depends to some extent on the results the study will find. However, it is expected that these will be high quality journals. For example, work previously done linking this dataset to national cancer registries was published by the New England Journal of Medicine which has the highest impact factor of any medical journal (impact factor 59.6). (http://www.nejm.org/doi/full/10.1056/NEJMoa1301675#t=article)

Findings will also be presented at the European Society of Human Reproduction and Embryology conference.

Processing:

Data processing for this project has been designed to ensure that identifiable data are seen by as few people as possible, at secure locations and using secure methods.

A dataflow diagram has been supplied, but below is a short summary of the data flow and processes:

1. NHSD produces an extract of women who were treated with ART (produced from MR1208)
2. NHSD sends extract containing mothers details to ONS to match to births
3. ONS matches mothers to all births (ART children and non-ART siblings) and returns matched births to NHSD
4. HFEA sends NHSD HFEA births (containing unique ID number)
5. NHSD matches HFEA births to ONS births and creates the ART Cohort
6. NHSD sends the ART Cohort to ONS who return two controls for each member, creating the Control Cohort.
7. NHSD links all remaining ONS births that match to HFEA mothers and creates Sibling Cohort.
8. NHSD sends member numbers of all unmatched HFEA births to UCL.
9. NHSD links ART, Sibling and Control Cohorts to ONS mortality and HES data and removes identifiers
10. NHSD supplies de-identified data to UCL along with a deprivation score and unique study ID (this study ID cannot be used by UCL to re-identify)

Steps 1 to 5 have been completed under the previous agreement.

NHS Digital will produce de-identified outcomes for all the cohorts and securely send to UCL. UCL will then match this data to de-identified fertility treatment data (provided to UCL by HFEA) using the HFEA unique ID.
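The matching step described above can be sketched as a simple join on the shared identifier. This is a minimal illustration only, not the study's actual procedure: the field names (`hfea_id`, `admissions`, `treatment_type`), the sample rows and the decision to drop outcome rows with no treatment record are all assumptions made for the example.

```python
from typing import Dict, List


def link_by_hfea_id(outcomes: List[dict], treatments: List[dict]) -> List[dict]:
    """Join de-identified outcome rows to treatment rows on a shared ID.

    Assumption: each treatment row has a unique 'hfea_id'. Outcome rows
    without a matching treatment row are dropped (an inner join).
    """
    treatment_by_id: Dict[str, dict] = {t["hfea_id"]: t for t in treatments}
    linked = []
    for row in outcomes:
        match = treatment_by_id.get(row["hfea_id"])
        if match is not None:
            # Merge the two records; outcome fields win on a name clash.
            linked.append({**match, **row})
    return linked


# Hypothetical rows, invented for illustration only.
outcomes = [
    {"hfea_id": "A1", "admissions": 2},
    {"hfea_id": "B2", "admissions": 0},
]
treatments = [
    {"hfea_id": "A1", "treatment_type": "IVF"},
    {"hfea_id": "C3", "treatment_type": "ICSI"},
]
linked = link_by_hfea_id(outcomes, treatments)
```

Because both inputs are already de-identified, the join key is the only field that connects them, and nothing in the sketch requires names, dates of birth or NHS numbers.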

The final de-identified data-set will then be held encrypted and securely at UCL using UCL's data safe haven. This storage has IG approval from NHS-Digital via the IG toolkit.

Only individuals, working under appropriate supervision on behalf of data controller / processor within this agreement, who are subject to the same policies, procedures and sanctions as substantive employees will have access to the data and only for the purposes described in this document.

ONS data will be processed in accordance with the standard Office for National Statistics terms and conditions.

UCL have no requirement and will not attempt to re-identify the data.

UCL will not share the data with any third parties.

Using national electronic databases to validate cardiovascular outcomes in PATCH a pilot study to assess the use of electronic databases for clinical trial follow up. — DARS-NIC-242415-V9T5D

Opt outs honoured: No - data flow is not identifiable, No (Excuses: Consent (Reasonable Expectation))

Legal basis: Health and Social Care Act 2012 – s261(1) and s261(2)(b)(ii), Health and Social Care Act 2012 s261(1) and s261(2)(b)(ii), Health and Social Care Act 2012 s261(2)(b)(ii)

Purposes: No (Academic)

Sensitive: Non Sensitive, and Non-Sensitive

When:DSA runs 2019-12 – 2022-12 2020.05 — 2020.05. SMLS reported a DPA serious incident; breached contract — audit report.

Access method: One-Off

Data-controller type: UNIVERSITY COLLEGE LONDON (UCL)

Sublicensing allowed: No

AGD/predecessor discussions: igard-minutes-16th-january-2020-final.pdf

Datasets:

Hospital Episode Statistics Accident and Emergency
Hospital Episode Statistics Admitted Patient Care
Hospital Episode Statistics Accident and Emergency (HES A and E)
Hospital Episode Statistics Admitted Patient Care (HES APC)

Type of data: Anonymised - ICO Code Compliant

Objectives:

The Medical Research Council Trials Unit (MRC CTU) at University College London (UCL) requires pseudonymised Hospital Episode Statistics (HES) data for use in a sub-study of the PATCH (Prostate Adenocarcinoma: TransCutaneous Hormones, PR09, CRUK/06/001; ISRCTN70406718, ClinicalTrials.gov NCT00303784) trial. The sub-study is called “Using national electronic databases to validate cardiovascular outcomes in PATCH – a pilot study to assess the use of electronic databases for clinical trial follow up”. UCL are the sole Data Controller and also process data for this project.

In a sub-study of the PATCH trial, MRC CTU propose to compare the coverage and accuracy of cardiovascular outcome reports derived from:
- data collected electronically by the National Institute for Cardiovascular Outcomes Research (NICOR).
- data collected electronically in the HES datasets from NHS Digital.
- routine outcome ascertainment within the PATCH trial (data collected manually via the PATCH Case Report Forms (CRFs)).

This sub-study will be for patients currently recruited into the PATCH trial (approx. 1600 patients). It will inform whether future event ascertainment within the trial could be supplemented with registry data (NICOR or HES and other nationally collected datasets) to improve data quality and completeness, or could even reduce or remove the need for trial follow-up (and therefore reduce research costs). If the study can show good concordance between HES data and trial data, the team could in future use registry data to supplement the trial data. This is the objective of this methodological study. Additionally, the data source (i.e. data collection within PATCH trial CRFs vs routine data collection (HES/NICOR)) may also affect the quality of data. This is perhaps more important, since trials are moving towards electronic data capture, but there may still be a benefit to supplementing it with registry data.

The justification of processing this data under the principles of GDPR, is Article 6(1)(e): ‘Processing is necessary for the performance of a task carried out in the public interest or in the exercise of official authority vested in the controller’ as this is a task within the public interest and as the research involves health data, Article 9(2)(j) is also applicable, as this details that processing is necessary for scientific and research purposes, subject to appropriate safeguards.

PATCH is an ongoing clinical trial assessing the safety and efficacy of replacing androgen suppression using Luteinising Hormone Releasing Hormone (LHRH) analogues with transdermal oestrogen patches, in men with locally advanced or metastatic prostate cancer. The study is funded by Cancer Research UK and currently recruiting at multiple centres across the UK. The cardiovascular safety of systemic oestrogen administration is one of the key toxicity concerns around its usage in PATCH. Although transdermal oestrogens bypass the liver first pass effect seen with oral administration and therefore should decrease cardiovascular risk, surveillance of these events within the study is an ongoing endeavour. This is challenging and time consuming, as it requires frequent exchanges of Serious Adverse Event (SAE) forms and documentation between centres and the MRC CTU. Furthermore the complexity of the event data is subtly different from the more usual cancer specific endpoints that form the main endpoints of the trial.

In terms of cardiovascular outcomes in PATCH, a previous 2013 report (Langley et al, Lancet Oncology 2013) reported data from the first 254 patients randomised (169 oestrogen patches and 85 LHRHa), using events reported via the investigators (CRFs). These were independently reviewed by cardiologists and classified as one of: heart failure, acute coronary syndrome, thromboembolic stroke, arterial thrombosis (other) or venous thromboembolism. Importantly, absolute numbers of events were small.

For large scale clinical trials, clinical outcome data, often collected over many years, is central to the satisfactory analysis of both primary and secondary endpoints but also subsequent hypotheses and interpretation. Conventional data collection via CRFs, whilst considered a gold standard, is time consuming, costly and, despite the efforts of NHS research departments throughout the UK, subject to patients moving away and being lost to follow up. In addition, clinical trials frequently collect information for a limited follow-up period, for example five years. There may well be events of interest which occur many years after the end of the trial which are currently not collected.

With the development of national clinical databases and audits, there is the potential for electronic data capture of specific outcomes via routinely collected information, improving the robustness and efficiency of trial follow up. However, information around the totality of event capture within the different available datasets and how this compares with conventionally collected trial data via CRFs is lacking. Initial comparisons of such databases show differences in event detection arising for a number of reasons including difficulties in disease coding (Herrett et al, BMJ 2013).

The cohort of patients will be those enrolled within the PATCH trial. This will be in line with the data availability from HES and NICOR. Only events occurring during time on study will be retrieved and compared. The PATCH trial database will be used as the template for this, and only events occurring post randomisation will be analysed. NICOR and HES event data will be cleaned in this way by the MRC CTU.
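The cleaning rule described above (keep only events that occur after randomisation and during time on study) can be sketched as a date filter. This is an illustrative sketch under stated assumptions: the tuple layout, the `study_end` cut-off and the choice to treat both boundary dates as inclusive are hypothetical and are not taken from the agreement.

```python
from datetime import date
from typing import Dict, List, Tuple

Event = Tuple[str, date, str]  # (study_id, event_date, event_type)


def events_on_study(
    events: List[Event],
    randomisation_dates: Dict[str, date],
    study_end: date,
) -> List[Event]:
    """Keep events dated between randomisation and a study cut-off.

    Assumptions: boundaries are inclusive, and events for study IDs with
    no known randomisation date are discarded.
    """
    kept = []
    for study_id, event_date, event_type in events:
        start = randomisation_dates.get(study_id)
        if start is None:
            continue
        if start <= event_date <= study_end:
            kept.append((study_id, event_date, event_type))
    return kept


# Hypothetical data, invented for illustration only.
randomised = {"P001": date(2012, 3, 1), "P002": date(2014, 6, 15)}
raw_events = [
    ("P001", date(2011, 12, 1), "heart failure"),    # before randomisation
    ("P001", date(2013, 1, 10), "heart failure"),    # on study
    ("P002", date(2019, 2, 2), "venous thromboembolism"),  # after cut-off
    ("P999", date(2013, 5, 5), "thromboembolic stroke"),   # not in cohort
]
cleaned = events_on_study(raw_events, randomised, study_end=date(2018, 12, 31))
```

Applying the same filter to the NICOR and HES extracts, with the trial database supplying the randomisation dates, keeps the three sources comparable over the same follow-up window.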

UCL are requesting that HES Admitted Patient Care and A&E datasets are linked to the cohort and the requested fields extracted. The HES data, including the associated unique study number, will be transferred in pseudonymised form to UCL.

UCL are providing a cohort. The patient identifiers being sent to NHS Digital are Study ID, NHS Number, Date Of Birth, Postcode and Gender (although all cohort are male). The identifying patient data (cohort) provided to NHS Digital to enable linking to HES data is held within a separate UCL data safe haven and will not be combined with the disseminated data.

UCL are not permitted to re-identify individuals under this agreement.

The subset of the outcomes collected in PATCH will also be collected by the National Institute for Cardiovascular Outcomes Research (NICOR), based at Barts Health NHS Trust which houses national databases on cardiac surgery (SCS), percutaneous revascularisation (BCIS), myocardial infarction (MINAP), heart failure, arrhythmia and congenital heart disease.

The research design and objectives are as follows:

Substudy Design:
1) request NICOR data on patients enrolled into PATCH study from MINAP and Heart failure audits (under patient consent model).
2) request HES Inpatient and A/E data (cardiovascular outcomes to include stroke and thromboembolic disease) on the same cohorts from NHS Digital.
3) Perform 3-way comparison of coverage and accuracy of outcomes as reported in NICOR, HES and manually collected data (via the PATCH CRFs).
4) The data may also be used for methodology research into the quality, accuracy and completeness of the datasets, including the reasons or causes of any discrepancies.

The aim of the study is to compare cardiovascular events between PATCH study data, NICOR and HES data and see if they are comparable.

NICOR data will be analysed within the UCL data safe haven and contain no identifiable data. NICOR will send to the MRC CTU all cardiovascular events that have occurred in the cohort of PATCH trial participants supplied to them. Each cardiovascular event has the participant's study number attached to it. Although NICOR data could be re-identifiable, the team are not permitted to re-identify individuals in this study.

The NICOR and HES data will not be linked but using the study numbers of the PATCH trial participants researchers will compare cardiovascular events that occurred while on the trial to see if there are differences between the three different data sets.

Sponsorship for the PATCH trial is currently being transferred from Imperial College London to UCL. This sub-study will be solely managed, financed and run within UCL (MRC CTU).

Expected Benefits:

A challenge for several clinical trials is the long-term follow up of patients over many years, required to determine the risk/benefit of an intervention in the longer term. This is important in order to:
- understand whether early effects are retained in the longer term.
- measure any delayed effects (benefits and harms) from interventions that cannot be captured in the short term.

With traditional methods of follow-up, there is generally:
- attrition of participants over time which creates missing data
- a high cost associated with ongoing trial visits and data capture.

This loss-to-follow-up can also result in bias, for example:
- participants from lower socioeconomic classes may be more likely to withdraw from trials
- those with disease progression might be more or less likely to attend follow up (depending on the setting).

A potentially more cost-effective way of continuing follow-up in the long term is to use routine electronic health records. These records can contain demographic, clinical, and patient-centered data.

The use of routine electronic health records within multicentre trials has the potential to:
- reduce the burden of additional trial-specific clinic visits and assessments for both participants and clinical research teams.
- significantly reduce the costs of central trial coordination and data collection.

To see if this is a viable option for the future there needs to be a body of work to provide evidence of the suitability of this data. This study will add to this body of evidence to allow for more effective clinical trials in the future which in turn will benefit patients who participate in trials and also who benefit from the trial result.

The beneficiaries of this research could be:

1. Patients - trials could be streamlined so that they would not need such intense follow up. This would mean fewer visits to hospital, which can often be time consuming and costly.
2. Clinicians and researchers involved in clinical trials in NHS hospitals - the workload of providing evidence of serious adverse events would decrease, which could allow for more time with patients.
3. Clinical trial staff - workload would decrease because less paperwork would be needed to process serious adverse event forms. This would decrease cost and improve efficiency of large clinical trials.
4. This could allow for innovative designs of clinical trials that could be developed in the future in partnership with national registries.

Outputs:

The results of this study are expected to be published in high-impact peer reviewed journals such as Lancet Oncology, or an appropriate methodological journal such as Trials.

The study will be published in peer reviewed journals to give the highest impact and broadest readership.

In addition, the results will be presented at scientific conferences related to cancer, cardiovascular, and trial methodology, and to key stakeholders. The conferences will be chosen depending on the key findings and also the target audience. This would be important to deliver the findings with the most impact possible. Possible conferences include International Clinical Trials Methodology Conference, European of cancer registries conference and National institute of cancer research conference. This would allow for coverage of key stakeholders in the setting of registry data, clinical trial design and clinicians involved in cancer research.

The results will also be discussed at meetings within the clinical trials community in the use of electronic health records to allow for further development of this resource in trials.

Through these mechanisms the results will be widely disseminated around the clinical trial community, including clinical trial centres who run the majority of clinical trials in the UK. The results of this study will establish the usability of HES data to identify important adverse events such as heart attacks, by comparing HES data with a robust, gold-standard event data capture conducted as part of standard protocol in the PATCH trial. This will inform on the general quality of HES data for this purpose, and more specifically will inform whether this method of data collection is suitable in the longer term for the PATCH trial and for other clinical trials that may wish to use this data.

Publications will be published based on UCL open access policy. This would include publication in open access journals and access to the results on the MRC clinical trials website which is freely open to the public.

Outputs will be anonymised to the level required by the ISB anonymisation standard and will contain aggregate data only.

All outputs will contain only data that is aggregated with small numbers suppressed in line with the HES Analysis Guide.
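Small-number suppression of an aggregate table can be sketched as below. The sketch is illustrative only: the threshold is passed in as a parameter because the actual cut-off and rounding rules are set by the HES Analysis Guide and are not stated here, and the choices to leave zero counts unsuppressed and to use `*` as the marker are assumptions.

```python
from typing import Dict, Union

Cell = Union[int, str]


def suppress_small_numbers(
    counts: Dict[str, int], threshold: int, marker: str = "*"
) -> Dict[str, Cell]:
    """Replace non-zero counts below `threshold` with a marker.

    Assumptions: zero counts are published as-is, and no secondary
    suppression (hiding further cells so totals cannot be used to
    recover a suppressed value) is applied.
    """
    return {
        label: (marker if 0 < value < threshold else value)
        for label, value in counts.items()
    }


# Hypothetical event counts and a placeholder threshold.
table = {"heart failure": 3, "acute coronary syndrome": 12, "arterial thrombosis": 0}
published = suppress_small_numbers(table, threshold=5)
```

A real output check would also need to consider secondary suppression, which this sketch deliberately leaves out.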

Processing:

The PATCH research team based in MRC CTU at UCL will provide to NHS Digital, NHS number, Postcode, gender (all males) and DOB for each individual to allow accurate and reliable identification of patients within the cohort. These will be associated with a unique study number. Patients have consented for data to be shared with researchers in an anonymised or linked anonymised form ('linked anonymised data' are anonymous to the people who receive and hold them (e.g. a research team) but contain information or codes that would allow the suppliers of the data to identify people from them). They have also consented that personal details can be used to obtain long term follow up information from national registries.

The pseudonymised HES datasets, along with the unique study number, will be transferred to the UCL PATCH team by NHS Digital. This data will reside in the UCL data safe haven and will be identified by study number only, so there will be no identifying personal data attached to a study number. Only defined members of the PATCH team and MRC CTU’s methodology team will have access to the data safe haven for analysis of the data. These will be substantive employees of UCL. All UCL substantive employees must have completed training in data protection and confidentiality and also have had appropriate training on the UCL data safe haven.

Outcomes of interest will be defined pragmatically, for example using both generic and more specific codes within the HES data hierarchy. The data retrieved will then be assessed on an individual event basis by a PATCH clinician to assess the concordance of clinical definitions between the data sources.

Analysis will be achieved within the UCL data safe haven using pseudonymised data. Analysts will not have access to the identifiable data which is held within a different data safe haven within the clinical trials unit.

The research will perform a three-way comparison using the PATCH data (collected via the PATCH CRFs) as the gold standard, as compared with NICOR and HES. Major cardiac outcomes should be common to all, though collected in 3 different ways. Cerebrovascular and thromboembolic events will be compared between PATCH and HES.

The NICOR and HES data will not be linked but using the study numbers of the PATCH trial participants researchers will compare cardiovascular events that occurred while on the trial to see if there are differences between the HES, NICOR and the manually collected data sets.

HES data will not be shared with NICOR or any other third party.

NICOR data will not be shared with NHS Digital as per the data sharing agreement between NICOR and UCL.

Venn diagrams may be produced to assess concordance between databases (as per Herrett et al, BMJ 2013) and descriptive statistics used to detail the relative rates of event detection in the separate databases. The positive predictive value of each database in recording and classifying events subsequently collected via the CRF will be calculated. Additionally the reverse will be performed to assess the efficiency of CRFs in capturing events as compared with national electronic datasets.
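The concordance measures described above can be sketched with set operations. This is a simplified illustration, not the study's analysis plan: it assumes an event is identified by an exact (study number, event type) pair, whereas the comparison described in this agreement involves clinician review of individual events. All data below are invented.

```python
from typing import Dict, Optional, Set, Tuple

Event = Tuple[str, str]  # (study_number, event_type)


def positive_predictive_value(
    source: Set[Event], reference: Set[Event]
) -> Optional[float]:
    """Share of events in `source` that also appear in `reference`.

    Returns None when `source` is empty, since the ratio is undefined.
    """
    if not source:
        return None
    return len(source & reference) / len(source)


def venn_counts(a: Set[Event], b: Set[Event]) -> Dict[str, int]:
    """Region sizes for a two-set Venn diagram."""
    return {"only_a": len(a - b), "both": len(a & b), "only_b": len(b - a)}


# Hypothetical events, invented for illustration only.
crf = {
    ("P001", "heart failure"),
    ("P002", "acute coronary syndrome"),
    ("P003", "venous thromboembolism"),
}
hes = {
    ("P001", "heart failure"),
    ("P002", "acute coronary syndrome"),
    ("P004", "thromboembolic stroke"),
    ("P005", "heart failure"),
}

ppv_hes = positive_predictive_value(hes, crf)      # HES events confirmed by CRF
crf_capture = positive_predictive_value(crf, hes)  # the reverse comparison
overlap = venn_counts(hes, crf)
```

Swapping the arguments gives the reverse measure mentioned in the text, i.e. the proportion of CRF events that are also found in the electronic dataset.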

UCL are not permitted to re-identify individuals under this agreement.

Outcomes of this sub-study will not be used as outcomes for the PATCH trial in the future.

NHS Digital reminds all organisations party to this agreement of the need to comply with the Data Sharing Framework Contract requirements, including those regarding the use (and purposes of that use) by “Personnel” (as defined within the Data Sharing Framework Contract ie: employees, agents and contractors of the Data Recipient who may have access to that data).

Data provided will be for all members of the cohort from 2010/11 onward. This is so the cohort of data is standardised for all participants to verify trial data and attempt to eliminate false positives and negatives from analysis.

The relationship between education and health outcomes for children and young people across England: the value of using linked administrative data. — DARS-NIC-27404-D5Z3F

Opt outs honoured: No - data flow is not identifiable, No (Excuses: Does not include the flow of confidential data)

Purposes: No (Academic)

Sensitive: Non Sensitive, and Sensitive, and Non-Sensitive

When:DSA runs 2018-12 – 2022-01 2019.08 — 2020.04. SMLS reported a DPA serious incident; breached contract — audit report.

Access method: One-Off

Data-controller type: UNIVERSITY COLLEGE LONDON (UCL)

Sublicensing allowed: No

Datasets:

Hospital Episode Statistics Admitted Patient Care
Civil Registration (Deaths) - Secondary Care Cut
HES:Civil Registration (Deaths) bridge
Hospital Episode Statistics Accident and Emergency
Hospital Episode Statistics Critical Care
Hospital Episode Statistics Outpatients
Civil Registrations of Death - Secondary Care Cut
Hospital Episode Statistics Accident and Emergency (HES A and E)
Hospital Episode Statistics Admitted Patient Care (HES APC)
Hospital Episode Statistics Critical Care (HES Critical Care)
Hospital Episode Statistics Outpatients (HES OP)

Type of data: Anonymised - ICO Code Compliant

Objectives:

Background
The strong interrelationship between health and education services in relation to the health and wellbeing of children is recognised by policy makers, but evidence is lacking on how services complement or compensate for each other and there have been calls for a stronger evidence base to be developed.

This study has two purposes. The first is to address the methodology for linking HES and national pupil data (NPD) from the Department for Education and to address important policy and health service questions about the interrelationship of health and education for children with and without underlying chronic conditions. As part of the methodology programme of the ADRC-E, the aim is to assess the feasibility of using linked health and education administrative data to carry out such research. The research will benefit the data providers, and thereby indirectly benefit NHS systems, by undertaking an assessment of data quality and consistency between sources and by assessing the accuracy of linkage. The results will be published to inform others (government departments and researchers) of match rates and factors associated with linkage error.
The second purpose is to inform policy and service commissioners about associations between education risk factors and health outcomes, such as emergency hospital admissions. The hypothesis is that school attainment is associated with unplanned admissions to hospital, but this association may vary by area, type of school and individual risk factors, particularly presence of chronic conditions. Researchers need to study the whole of England in order to determine how education outcomes vary according to local authority and type of school, and what the impact is of such variation on emergency admissions to hospital.

The project combines a methodological component to evaluate linkage quality and an applied research component, which involves evaluating the association between education outcomes and emergency use of hospital services by children with and without underlying chronic conditions.
This study has three aims, namely to:

1. Evaluate linkage success between National Pupil Database (NPD), Personal Demographic Service (PDS), hospital admissions data (HES) and mortality data.

2. Evaluate the association between hospital admissions for children and adolescents with underlying chronic conditions and subsequent school achievement

3. Evaluate the association between education outcomes and subsequent use of hospital services, taking into account underlying chronic conditions.

Methods

Linkage
Linkage will be conducted by NHS Digital using identifiers supplied by DfE (forename, surname, date of birth, sex and postcode) captured in each school year. Identifiers will be matched initially to PDS and then to HES and de-identified. UCL will then receive the pseudonymised HES data from NHS Digital together with a Study_ID supplied by DfE. DfE will supply de-identified NPD data together with a Study_ID.

Analyses
Using the linked data UCL will examine how the association between school achievement and hospitalisation varies according to local authority and type of school, taking into account factors at the individual level (eg chronic conditions in the health record, free school meals, ethnicity in the school record) that might affect both school achievement and emergency use of hospital services (ie. confounding factors).

UCL will examine whether low school achievement is associated with subsequent emergency admissions to hospital, and whether health problems, manifest in admissions to hospital, are associated with subsequent changes in school achievement. UCL would expect these associations to vary across the country, as local factors, such as type of school and local services, eg support for children with chronic conditions, vary between local authorities. Evidence on such risk factors will be important for generating hypotheses about how healthcare and schools can reduce adverse outcomes for children and adolescents. To evaluate variation across the country, and to have sufficient power to evaluate outcomes for children with chronic conditions at different ages and in different services, UCL need to use data for the whole country and for all children (within the age restriction) in both datasets.

Both datasets contain longitudinal data of a child’s routinely captured information in the NPD and of hospital admissions, outpatient and A&E attendance in HES and mortality data. Within HES, diagnosis and procedural variables will be required to identify unplanned admissions, and to define cohorts of children with and without chronic conditions. Information from OPD and A&E adds information on planned outpatient specialist care and the frequency of emergency hospital contacts without admission. Comparison groups, comprising those with and those without chronic conditions will be identified from the longitudinal record of hospital admissions, if possible from birth. Previous work completed by UCL has shown that chronic underlying conditions may not be recorded at every admission (for example, asthma may not be recorded when a child is admitted for an operation) and UCL have demonstrated the added value of using the whole longitudinal record.

Description of the cohorts:
In order to minimise the amount of data, UCL have restricted the data requested to four one-year cohorts.
The four cohorts are:

Cohort 1. Young people cohort (born between 01/09/1990-31/08/1991 who entered reception class in September 1996):

This cohort will test the quality of linkage of data with young people receiving education up to age 18 years, and as young people become more mobile.

These young people will first be recorded in the NPD with their Key Stage 1 (KS1) from 1998 and Key Stage 2 (KS2) from 2002, and annual school census from 2001/2. UCL request NPD KS1, KS2, KS4, and KS5. They will also be captured in HES on their first hospitalisation on or after 01/04/1997 (at approximately 7 years of age or more). HES data is requested up until the most recent available period.

This cohort tests linkage with all children (state and non-state educated) who have a KS4 assessment (approximately 99% of adolescents aged 15/16). UCL will also follow up adolescents who sit KS5 (A levels). At this age they have shown high rates of emergency use of hospital for adversity related injury (self-harm, drug or alcohol misuse or violence). In this cohort, UCL can evaluate the antecedent education risk factors for such admissions.

Cohort 2. Primary-secondary school transition cohort (born between 01/09/1996-31/08/1997 who entered reception class in September 2002).

This cohort will comprise children whose educational profile can be followed from primary to secondary school and will capture health service utilisation from the early years and throughout the educational trajectory. The cohort will provide valuable information on the quality of data linkage, where there is an overlap between the time points on the main datasets (i.e. NPD and HES).

These children will be recorded in the NPD annual school census from 2001/2 and in KS1 data from 2003/4. These children enter secondary school in September 2008 and will have KS3 recorded in 2011 and KS4 recorded in 2012/13. This cohort will have annual school census data throughout the primary school years, but not all would have had a hospital admission (UCL expect around 40% to have been admitted at some point). Data on hospitalisations will be captured in HES on or after 01/04/1997 (when this cohort will be approximately 1 year of age) until the most recent data extract available.

Cohort 3. Preschool-primary school cohort (born between 01/09/1999-31/08/2000 who entered the reception class in September 2005).

This cohort will capture indicators of chronic conditions recorded in the birth record and in infancy, which is the period when the risk of admission to hospital is highest. It will also provide a complete record of primary school education. The frequent movement of children in the preschool years will present a challenge to linkage (ie. postcode changes), which UCL will seek to evaluate in this study using available HES resources, such as the Patient Demographic Service introduced in 2004. Linkage quality is expected to be less good than for Cohort 4, as identifiers in HES birth records improved between the cohort years. NPD data are requested up until KS4 data, which would end in 2015/16. UCL will examine the association between chronic conditions and school achievement in the cohort, accounting for chronic conditions and birth characteristics recorded in early life.

Cohort 4. Patient demographic service cohort (born between 01/09/2004-31/08/2005 who entered reception class in September 2010).

For this cohort, data in HES birth episodes and PDS will be concurrent, which should maximise the likelihood of successful linkage to the NPD (ie most children in NPD are expected to have a birth episode in HES). It should comprise a complete record of health service use from birth and early education that may be used to investigate the impact of early education/educational achievements on pre/post school hospital admissions.

NPD data are requested up until KS2 data.
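The four birth-date windows above translate directly into a lookup. The sketch below is illustrative: the function and constant names are hypothetical, and the end of Cohort 4's window is taken to be 31/08/2005, matching the one-school-year pattern of the other three cohorts.

```python
from datetime import date
from typing import Optional

# Birth-date windows for the four one-year cohorts (inclusive).
COHORT_BIRTH_WINDOWS = {
    1: (date(1990, 9, 1), date(1991, 8, 31)),  # young people cohort
    2: (date(1996, 9, 1), date(1997, 8, 31)),  # primary-secondary transition
    3: (date(1999, 9, 1), date(2000, 8, 31)),  # preschool-primary
    4: (date(2004, 9, 1), date(2005, 8, 31)),  # demographic service cohort
}


def assign_cohort(date_of_birth: date) -> Optional[int]:
    """Return the cohort number for a date of birth, or None if outside all windows."""
    for cohort, (start, end) in COHORT_BIRTH_WINDOWS.items():
        if start <= date_of_birth <= end:
            return cohort
    return None
```

For example, `assign_cohort(date(2000, 1, 15))` falls in the third window, while a child born in 1995 is outside every window and would not be included in the extract.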

Yielded Benefits:

Expected Benefits:

UCL will evaluate whether linkage error disproportionately affects certain ethnic or disadvantaged groups, who are also at increased risk of chronic conditions requiring hospital care by comparing characteristics of children who are linked with unlinked children in HES and NPD datasets. Findings can be used to modify linkage algorithms by NHS Digital when linking non-health data via the Personal Demographic Service to HES. Such information has the potential to benefit direct health care beyond the scope of this study.
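One simple way to look for uneven linkage error is to compare match rates across groups. The sketch below is a minimal illustration under assumptions: the group labels and records are invented, and a difference in crude match rates is only a starting point that would need formal statistical comparison.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple


def match_rate_by_group(records: Iterable[Tuple[str, bool]]) -> Dict[str, float]:
    """Proportion of records that linked, for each group label.

    Each record is (group_label, linked_flag).
    """
    totals: Dict[str, int] = defaultdict(int)
    linked: Dict[str, int] = defaultdict(int)
    for group, is_linked in records:
        totals[group] += 1
        if is_linked:
            linked[group] += 1
    return {group: linked[group] / totals[group] for group in totals}


# Hypothetical records: (group label, whether the record linked).
records = [
    ("group A", True), ("group A", True), ("group A", False), ("group A", True),
    ("group B", True), ("group B", False),
]
rates = match_rate_by_group(records)
```

A markedly lower rate in one group would suggest that analyses restricted to linked records under-represent that group.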

The results relating to the provision of health care will be as follows:
1. The study will inform health care services about whether certain types of schools or local authorities are associated with increased or decreased rates of emergency admissions for children with and without chronic conditions. For example, children with serious learning impairing conditions may have lower rates of emergency hospital admission if they attend a special school than if they attend a mainstream school; school type or area may affect rates of emergency admissions and A&E attendance for adversity-related conditions (eg self-harm, violence or mental health), after adjusting for underlying chronic conditions, previous admissions, age and socioeconomic factors. In this way, the study will generate hypotheses about how interventions in schools, or improved feedback from hospitals, could potentially reduce rates of emergency use of hospital services. The findings will inform preventive health strategies by local authorities and hospitals.

2. The study will provide useful information for children and their parents, which is relevant to the promotion of their health and wellbeing and relevant to clinical practice. The study will show how school achievement and absence varies between children with and without chronic conditions, across the age range and between areas. Such variation can be used to inform clinical practice by identifying potentially better practices (eg to reduce school absence for children with chronic conditions) that could be adopted more widely.

3. The study will examine whether indicators at school, such as absenteeism and school failure, can identify groups of vulnerable children and young people, particularly those with chronic conditions, who could benefit from proactive or preventive healthcare input that might reduce emergency use of hospital services, and improve health and educational outcomes.

Outputs:

A range of outputs are expected from the study relating to the aims of the study to investigate the quality of linkage and address relevant policy, healthcare and NHS systems questions on the association between child health and education (in the presence or absence of chronic illness; please refer to ‘Objective for processing’ for Aims).

All outputs will contain aggregate level data only and all small numbers will be suppressed. Outputs will be monitored for compliance with ADRN statistical output controls and the HES Analysis Guides. No potentially disclosive outputs will be shared or published. The project in this application/agreement is expected to finish three years after obtaining the data.
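
As a rough illustration of small-number suppression, the sketch below masks any non-zero count below a threshold before a table is released. The function name, the example counts and the threshold value are assumptions for illustration only; the actual rules are those set out in the ADRN statistical output controls and the HES Analysis Guides, which are not reproduced here. Leaving zero counts unmasked is also an assumption of this sketch.

```python
from typing import Dict, Optional


def suppress_small_counts(table: Dict[str, int], threshold: int) -> Dict[str, Optional[int]]:
    """Replace non-zero counts below `threshold` with None (to be shown as '*')."""
    return {
        group: (None if 0 < count < threshold else count)
        for group, count in table.items()
    }


# Hypothetical aggregate counts; the threshold used here is illustrative only.
admissions = {"special school": 3, "mainstream school": 412, "other": 0}
masked = suppress_small_counts(admissions, threshold=5)
print(masked)
```

In practice a check of this kind would be only one part of output clearance, alongside review by the person authorised to export results.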

Aim 1 (Linkage component):
i. Linkage evaluation: The first output of this study will be the evaluation of a bespoke linkage method that could be extended by NHS Digital to wider linkages of data involving postcode histories over time. UCL will publish methodological papers in peer-reviewed journals reporting the evaluation of the linkage accuracy. The methodological evaluation is expected to finish two years after obtaining the data. UCL will be targeting journals such as the International Journal of Epidemiology and PLoS One. Further, UCL will share this report with NHS Digital and DfE to inform their linkage methods. UCL will publish a summary of the outputs on the ADRC-E website.

ii. UCL would expect to present findings at conferences such as the International Population Data Linkage Conference and the Administrative Data Research Network after two years from obtaining the data.

Aim 2 /3: The other outputs will address the health service and policy questions relating to the linked data i.e. the applied research component.

iii. Study findings will be disseminated through peer-reviewed academic journals, and social media including lay summaries on the ADRC-E website. UCL expect the analysis to finish three years after obtaining the data. UCL will target both health and education journals and conferences. These include the Archives of Disease in Childhood, PLoS One and Journal of Public Health, and the British Educational Research Journal.

iv. Study findings will be widely disseminated through conferences and seminars such as the Public Health Science Conference, International Population Data Linkage Conference, Society for Social Medicine, Society for Longitudinal and Lifecourse Studies, and the British Educational Research Association Conference.

v. Relevant findings will be shared with policy makers, clinicians/health professionals, educators and parent groups, particularly those with an interest in chronic conditions, in accessible formats (e.g. lay summaries, videos or animations). This could include forums such as the National Children’s Bureau (NCB) Young Person and Parent group and the Great Ormond Street Hospital (GOSH) Patient Engagement group. These groups can be accessed through the joint institute of UCL Great Ormond Street Hospital Institute of Child Health, through the North Thames CLAHRC (led by UCL) and through the Children’s Policy Research Unit. Lay summaries of the study findings can be published on the ADRC-E website, and linked through websites for these organisations.

Feedback to the Department of Health and Department for Education will be through the Children’s Policy Research Unit.

Processing:

Data Flows
The HES data to be provided by NHS Digital and NPD data provided by DfE are for the one-year birth cohorts comprising records for all children specified in the following cohorts:

Cohort 1: Born between 01/09/90-31/08/91
Cohort 2: Born between 01/09/96-31/08/97
Cohort 3: Born between 01/09/99-31/08/00
Cohort 4: Born between 01/09/04-31/08/05
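
The cohort definitions above can be expressed as a simple date lookup. The sketch below is illustrative only: the function name is hypothetical, the date windows are treated as inclusive at both ends (an assumption), and a step like this would have to run where full dates of birth are held, since UCL researchers receive only month of birth.

```python
from datetime import date
from typing import Optional

# Birth-date windows as listed above, assumed inclusive at both ends.
COHORTS = {
    1: (date(1990, 9, 1), date(1991, 8, 31)),
    2: (date(1996, 9, 1), date(1997, 8, 31)),
    3: (date(1999, 9, 1), date(2000, 8, 31)),
    4: (date(2004, 9, 1), date(2005, 8, 31)),
}


def cohort_for(date_of_birth: date) -> Optional[int]:
    """Return the cohort number for a date of birth, or None if outside all cohorts."""
    for number, (start, end) in COHORTS.items():
        if start <= date_of_birth <= end:
            return number
    return None


print(cohort_for(date(1999, 12, 25)))
```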

The linkage of identifiers will be conducted separately from attribute data (clinical or education characteristics) by NHS Digital and pseudonymised before release to the UCL Safe Haven for analyses (the definition of pseudonymisation is where direct identifiers have been removed from the data extract; identifiers refer to fields such as names, sex, date of birth, full post code). UCL researchers will receive only pseudonymised HES data and pseudonymised NPD data (without sensitive variables, ie no date of birth or date of death).

The following outline describes how identifiable and non-identifiable extracts from the data scheme will be handled.

1) NHS Digital will create four cohorts of all HES records for children born in years of cohorts 1 to 4 (admissions, critical care, A&E and OPD), together with fact and date of death. Only month/year of death will be released to UCL, not full date of death. UCL request month of birth for each child in order to account for well-established effects of month of birth on school achievement (ie research consistently shows that children born in September do better than children born in July/August).

2) DfE will supply UCL Safe Haven with requested (attribute) data extracts, alongside the study specific pseudo-identifier number (Study_ID) for children in Cohort 1 – 4. There will be no identifiable information on the extract sent to UCL.

3) DfE will supply the Trusted Third Party (NHS Digital) with a list of NPD identifier variables, including name and postcode histories, alongside a study specific pseudo-identifier number (Study_ID) for children in Cohort 1 – 4.

4) NHS Digital will match the identifiers from DfE to records held in the PDS using an algorithm that prioritises the most recent postcode in NPD (at the school census date). Matching to PDS data will be done internally within NHS Digital; no PDS data will be disseminated to UCL.

The following methodology is proposed for the linkage by NHS Digital:

The current MIDAS algorithm will be modified to consider the 15 most recent postcodes in PDS, ranked according to distance from the date associated with the NPD postcode. For remaining unmatched NPD records, the second postcode in NPD will then be compared with PDS as above. This approach will be repeated up to a maximum of 5 postcodes in NPD or until a match is achieved with PDS. This is necessary because ignoring address changes in PDS after the school census date in NPD would increase the number of missed matches. For example, although NPD postcode might be correct in September, PDS postcode in the subsequent months until the next census is more likely to be the most recent postcode.
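
The postcode loop described above can be sketched as follows. This is not the MIDAS algorithm itself, whose matching rules are not described here; it covers only the iteration over postcode histories for a single pair of candidate records. The function name, the data layout and the use of exact string equality between postcodes are assumptions made for illustration, and the postcodes in the example are invented.

```python
from datetime import date
from typing import List, Optional, Tuple

PostcodeHistory = List[Tuple[str, date]]  # (postcode, date it was recorded)

MAX_PDS_POSTCODES = 15  # limit stated in the text above
MAX_NPD_POSTCODES = 5   # limit stated in the text above


def match_rank(npd: PostcodeHistory, pds: PostcodeHistory) -> Optional[Tuple[int, int]]:
    """Return (NPD postcode rank, PDS postcode rank) for the first agreement, else None.

    NPD rank 1 is the most recent NPD postcode; PDS rank 1 is the PDS postcode
    closest in time to the NPD postcode being tried.
    """
    recent_pds = sorted(pds, key=lambda p: p[1], reverse=True)[:MAX_PDS_POSTCODES]
    npd_recent_first = sorted(npd, key=lambda p: p[1], reverse=True)[:MAX_NPD_POSTCODES]
    for npd_rank, (npd_postcode, npd_date) in enumerate(npd_recent_first, start=1):
        # Rank PDS postcodes by distance in days from this NPD postcode's date.
        ranked = sorted(recent_pds, key=lambda p: abs((p[1] - npd_date).days))
        for pds_rank, (pds_postcode, _) in enumerate(ranked, start=1):
            if pds_postcode == npd_postcode:
                return npd_rank, pds_rank
    return None


npd_history = [("AB1 2CD", date(2010, 1, 21)), ("EF3 4GH", date(2009, 1, 15))]
pds_history = [("ZZ9 9ZZ", date(2010, 3, 1)), ("EF3 4GH", date(2008, 6, 1))]
print(match_rank(npd_history, pds_history))
```

In this example the most recent NPD postcode finds no agreement, so the second NPD postcode is tried and matches the PDS postcode closest to it in time.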

NHS Digital will not send confidential data to DfE.

5) NHS Digital will create two HES – NPD data files. Each file will contain the pseudo-id provided by NPD (Study_ID) and pseudo-id provided by HES (HESID). All individuals within the relevant cohorts 1-4 age range, whether they are linked or unlinked (i.e. appear in both HES and NPD or just one dataset) will be assigned a HESID, and those matched to NPD will be assigned the Study_ID transferred from DfE. These pseudonymised files will be transferred to the UCL Safe Haven to be accessed by the researcher. These files will not be provided to DfE.

The HES-Mortality/PDS-NPD Matched Assessment File will contain linkage details of the matches between NPD and PDS using the pseudo-study id for each of the data sets i.e. HESID and Study_ID. It will indicate the match rank at linkage with NPD (i.e. the first, second or third etc. running of the modified MIDAS algorithm). The file will not contain any postcodes or other identifiers.

The second file will contain attribute data. This contains the HES records for the matched HES-Mortality/PDS - NPD records and the unmatched HES records for patients in cohorts 1-4. All records will contain a HESID and those matched to NPD will also contain a Study_ID.

6) NHS Digital will retain the identifier file of all individuals linked in NPD-PDS and PDS-HES and all the postcodes used in linkage and postcode dates for 12 months to address data queries. This data set will not contain any attribute data and will be accessible only to NHS Digital staff.

7) The two HES-NPD files will be transferred to the UCL Data Safe Haven from NHS Digital for analysis by the UCL team. The pseudo Study IDs attached to NPD data supplied by DfE, attached to the matched HES-NPD records assessment file supplied by NHSD, and attached to the unmatched and matched HES attribute records, will be used to create the final dataset that will be used for analyses within the UCL Safe Haven. The files will not contain any identifiable data. No additional record-level data will be gathered or linked to the dataset.
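
A minimal sketch of how the three pseudonymised inputs could be combined inside the Safe Haven is given below. Only the key names Study_ID and HESID come from the description above; the function name and all other field names (match_rank, admissions, school_type) are hypothetical. The sketch starts from the HES attribute records, so NPD records with no HES link are not covered.

```python
from typing import Any, Dict, List

Row = Dict[str, Any]


def build_analysis_rows(hes_rows: List[Row], link_rows: List[Row], npd_rows: List[Row]) -> List[Row]:
    """Attach Study_ID, match rank and NPD attributes to each HES attribute row."""
    npd_by_study_id = {row["Study_ID"]: row for row in npd_rows}
    link_by_hesid = {row["HESID"]: row for row in link_rows}
    combined = []
    for hes in hes_rows:
        link = link_by_hesid.get(hes["HESID"])
        npd = npd_by_study_id.get(link["Study_ID"]) if link else None
        row = dict(hes)
        row["Study_ID"] = link["Study_ID"] if link else None
        row["match_rank"] = link["match_rank"] if link else None
        row["linked"] = npd is not None  # flag kept for linkage-error analysis
        if npd:
            row.update({k: v for k, v in npd.items() if k != "Study_ID"})
        combined.append(row)
    return combined


# Invented example records with no real identifiers.
hes = [{"HESID": "H1", "admissions": 2}, {"HESID": "H2", "admissions": 0}]
links = [{"HESID": "H1", "Study_ID": "S1", "match_rank": 1}]
npd = [{"Study_ID": "S1", "school_type": "mainstream"}]
rows = build_analysis_rows(hes, links, npd)
print(rows)
```

Keeping unmatched HES records in the output, with a linked flag, is what allows linked and unlinked children to be compared later.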

Data security
To maintain the physical and technical security measures, the following safeguards will be in place:
1) Data linkage: Linkage will be carried out by a Trusted Third Party (NHS Digital) following the separation principle. The NHS Digital team undertaking the linkage will only have access to identifiers from the HES team for hospital administrative data and receive full identifiers from DfE for NPD data. These identifiers include name, date of birth, full postcode and sex. No additional attribute data will be provided. These identifiers will be removed and replaced with the pseudonymised Study ID and HESID assigned to each child in the cohort before release to the UCL Safe Haven for access by the researchers. The original file will be retained by the Trusted Third Party for 12 months to address queries about the linkage accuracy.

As a consequence, the UCL researcher will not have access to the identifiers and NHS Digital will at no point have access to NPD attribute data.

2) Minimising data disclosure: During analyses, the UCL researchers will minimise the potential for deductive disclosure, by limiting the presentation of specific information that could potentially identify any individual in the cohort. Statistical disclosure control measures will be applied according to HES requirements. UCL have minimised disclosure risks by requesting: only month of death and month of birth, LSOA decile rather than rank, location defined only by area indicators (eg local authority), and UCL will allocate a pseudo-id for school.

3) Limited access: The data will be held on a secure server at University College London. The project has ADRN panel approval and REC approval: 17/LO/1494. Access will be restricted to named users, who are notified to i) the data providers and ii) the secure server operators, and who are contractually part of iii) the study team and have signed a data user agreement with all three parties. The users must also be accredited as data users (meaning they have been trained in governance and confidentiality) by the UK Administrative Data Research Network (ADRN, adrn.ac.uk). All users have university contracts that stipulate compliance with data governance. This means that the researcher can be dismissed from their post if they fail to comply with data governance requirements. Access to the data is via a remote login that requires a physical authentication system as well as a password (i.e. double authentication).

4) Limiting outputs: The data analyses are conducted on the UCL data safe haven. Detailed individual-level child data cannot leave the UCL Safe Haven (i.e. secure server), as that would be in breach of data sharing agreements. The only route out of the Safe Haven is a secure encrypted transfer system, which is used to export results of analyses and can be audited. Any outputs from analyses that are published have to meet statistical disclosure controls that prevent small cell sizes in accordance with NHSD requirements. Tabulations of aggregate data are assessed for statistical disclosure control and authorised for export by a data scientist not involved with the analysis who has delegated exporting rights authorised by the Principal Investigator of the project. No de-identified death registration data will leave the UCL safe setting.

Data Analysis
The record-level HES-mortality data will only be accessed by staff with contracts at UCL who have been trained and authorised to access the UCL safe haven. Any outputs will contain only aggregated data that complies with the small number suppression rules in the HES Analysis Guide.

The analyses of the linked dataset will involve the following steps. These will be concurrent with the evaluation of linkage accuracy:

i) Create inception cohorts of children:
(a) defined by birth (Cohorts 3 and 4)
(b) entry to primary school (Cohorts 1 and 2)

ii) Examine characteristics of linked and unlinked cohorts to determine impact of linkage error.

iii) Characterise children according to past admissions and history of chronic conditions in HES-mortality data.

iv) Determine variation in school achievement taking into account health care history and chronic condition status (based on admissions in previous years).

v) Determine association between school achievement and emergency use of hospital services taking into account individual characteristics (health care and chronic condition status, socioeconomic status) and type of school and local authority. School attainment will be used as a time dependent covariate that is associated with, or mediates, unplanned hospital admissions.
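
Step ii) above, comparing linked and unlinked children, could start from a linkage rate per group. The sketch below is an assumption about one simple way to do this, not the study's actual method; the function name and the field names (linked, group) are hypothetical, and any released figures would still be subject to small-number suppression.

```python
from collections import defaultdict
from typing import Any, Dict, List


def linkage_rate_by_group(records: List[Dict[str, Any]], group_key: str) -> Dict[Any, float]:
    """Return the proportion of records flagged as linked within each group."""
    totals: Dict[Any, int] = defaultdict(int)
    linked: Dict[Any, int] = defaultdict(int)
    for record in records:
        totals[record[group_key]] += 1
        if record["linked"]:
            linked[record[group_key]] += 1
    return {group: linked[group] / totals[group] for group in totals}


# Invented example: group A links for 2 of 4 records, group B for 3 of 3.
example = (
    [{"group": "A", "linked": True}] * 2
    + [{"group": "A", "linked": False}] * 2
    + [{"group": "B", "linked": True}] * 3
)
rates = linkage_rate_by_group(example, "group")
print(rates)
```

A marked difference in rates between groups would suggest that linkage error is not evenly distributed, which is the concern the evaluation sets out to test.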

All organisations party to this agreement must comply with the Data Sharing Framework Contract requirements, including those regarding the use (and purposes of that use) by “Personnel” (as defined within the Data Sharing Framework Contract ie: employees, agents and contractors of the Data Recipient who may have access to that data).

Mixed methods evaluation of the Getting it Right First Time programme - improvements to NHS orthopaedic care in England — DARS-NIC-112374-X0T4S

Opt outs honoured: Yes - patient objections upheld, Yes, No (Excuses: Section 251, Section 251 NHS Act 2006, Does not include the flow of confidential data)

Legal basis: Health and Social Care Act 2012 – s261(7), Health and Social Care Act 2012 s261(7)

Purposes: No (Academic)

Sensitive: Sensitive

When:DSA runs 2018-11 – 2021-11 2019.01 — 2019.08. SMLS reported a DPA serious incident; breached contract — audit report.

Access method: One-Off

Data-controller type: UNIVERSITY COLLEGE LONDON (UCL)

Sublicensing allowed: No

AGD/predecessor discussions: igard-minutes-8th-november-2018---final.pdf, igard-minutes-11th-october-2018.pdf

Datasets:

Hospital Episode Statistics Admitted Patient Care
Patient Reported Outcome Measures (Linkable to HES)
Hospital Episode Statistics Admitted Patient Care (HES APC)

Type of data: Identifiable, Anonymised - ICO Code Compliant

Objectives:

Background:
Researchers from University College London (UCL) are conducting an evaluation of the high profile ‘Getting it Right First Time’ (GIRFT) improvement programme currently being rolled out nationally in orthopaedic surgery. The first GIRFT report was published in 2012, by the then president of the British Orthopaedic Association (BOA). It recommended changes to NHS orthopaedic practice to improve patient outcomes, and achieve cost savings. NHS England subsequently funded a national professional pilot which involved senior clinicians visiting providers across the country to offer bespoke recommendations about improving care. By July 2015, the GIRFT project team, which is hosted on behalf of the BOA, at the Royal National Orthopaedic Hospital in Stanmore, had carried out reviews with 120 NHS Trusts in England. The project highlighted significant variations in practice and outcomes; device and procedure selection; costs; and infection rates. Consequently, the Department of Health and Social Care commissioned a three-year programme to address these challenges initially in orthopaedics, and then across ten other clinical specialities.

The wider GIRFT programme has now been expanded to over 37 specialties. Work is underway to enhance patient safety by addressing the complex issue of unwarranted variation in elective orthopaedic surgery. The programme aims to address this, by prioritising the following areas: increasing minimum volumes of complex procedures undertaken by individual surgeons, enforcing ring-fenced beds to reduce infection rates, reducing length of stay and waiting times to improve efficiency for patients, improving theatre efficiency, aiming for more competitive procurement processes, reducing unnecessary procedures (i.e. therapeutic arthroscopies), and gaining a better understanding of the causes of litigation. The programme also has strong links with Lord Carter’s Procurement and Efficiency programme.

Evaluation:
The GIRFT programme seeks to change practice (e.g. prosthesis choice) in order to improve patient outcomes (e.g. infection rates). It can be defined as a complex intervention, containing several interacting components, targeting different stakeholders and addressing a number of different behaviours. There is, however, limited evidence to support the approach being used. Whilst there have been reorganisations of a number of acute NHS services in recent years, there have been few examples on this scale. The UCL evaluation aims to contribute to the development of that evidence base by studying in depth the implementation of changes to orthopaedic services.

UCL are evaluating the intervention as a whole, including the original pilot and the work now being carried out. The researchers are focusing on outcomes directly relating to patient care, for the commonest elective orthopaedic procedures: total hip replacement (THR) and total knee replacement (TKR). Through this, they will identify lessons to inform future service improvement efforts. A cohort of patients for linkage will be provided from the National Joint Registry (NJR), and returned to UCL.

Relationship between UCL and GIRFT:
Although some members of the GIRFT team are listed as co-authors on the published study protocol, UCL are undertaking an independent evaluation of the programme. Indeed, the GIRFT Team are keen to ensure that the UCL evaluation provides an independent analysis of their work. The GIRFT team are neither funders nor co-applicants on the revised NJR application (or any other data requests they are making as part of the evaluation). To facilitate understanding of the programme, the GIRFT team provide UCL with key documents, dates and details of their visits and re-visits, as well as providing UCL with access to key events to undertake non-participant observations. UCL maintain contact with the GIRFT team to ensure both remain up-to-date with any developments of the national programme, which may influence UCL's ability to collect data, as well as influence the development of the findings. However, the GIRFT team have no role in the conduct of the evaluation.

The researchers’ overall aim is to evaluate the implementation of a complex intervention, the GIRFT programme which seeks to improve the quality and cost-effectiveness of NHS orthopaedic care in England. In doing so, they will identify lessons to guide future improvement work in other services. Their objectives are to use formative and summative evaluation methods to:

1. Examine the key processes of – and factors influencing - the ongoing development and implementation of planned improvements to orthopaedic services, nationally and locally, including any unintended consequences;
2. Assess whether the GIRFT programme has reduced variations in orthopaedic practice and expenditure, and improved patient outcome measures;
3. Estimate the economic impacts (e.g. cost savings) of these changes; and
4. Explore patient and public perceptions of the planned improvements to care.

Reflecting these objectives, the evaluation comprises four work streams: (WS1) an organisational study; (WS2) a quantitative study; (WS3) an economic study; and (WS4) patient and public focus groups.

Data request:
This agreement refers to the quantitative study (WS2) and the economic study (WS3). In order to assess whether the GIRFT programme has reduced variations in orthopaedic practice, and improved patient outcomes, longitudinal, provider-level data are required. With details of all admissions to NHS hospitals in England, the Hospital Episode Statistics database represents the only means by which the UCL researchers can evaluate the impact of the intervention nationally. Similarly, there are no alternative sources to the Patient Reported Outcome Measures database, which would provide data from all providers.

The researchers plan to link the data being requested from NHS Digital (i.e. Hospital Episode Statistics and Patient Reported Outcome Measures) to data from the National Joint Registry. The dataset will then be used to examine: the proportion of procedures by low volume surgeons and in low volume hospitals; the use of prostheses not recommended by GIRFT; rates of readmission and revisions; outcomes reported by patients in their post-operative PROMs questionnaires; and resource use (such as prosthesis type and length of hospital stay). The linked dataset will also inform an estimate of the economic impacts (e.g. cost savings) of the changes.

Clarification of Roles:

UCL have provided the following information to clarify the roles of parties mentioned in their fair processing notice and protocol documents.

1. NIHR CLAHRC (Collaboration for Leadership in Applied Health Research and Care) North Thames
National Institute of Health Research (NIHR) Collaborations for Leadership in Applied Health Research and Care (CLAHRCs) conduct world class applied health research which will have a direct impact on the health of patients with long term conditions and on the health of the public. CLAHRCs are collaborations between universities, the NHS (local service providers and commissioners), local authorities, industry, patients and the public, the local Academic Health Science Network (AHSN) and other relevant organisations in the region. NIHR CLAHRC North Thames is one of 13 CLAHRCs across England. It covers the geographical area of north central and north east London, south and west Hertfordshire, south Bedfordshire and south west and mid-Essex. The research team at UCL work within the NIHR CLAHRC North Thames structure.

2. The GIRFT programme team
This is an independent evaluation. The GIRFT programme team have no role in the conduct of the research, nor are they data controllers. They do not determine the purposes for which and the manner in which any personal data are to be processed by the data processors.

3. Royal National Orthopaedic Hospital (RNOH)
The GIRFT programme began from the RNOH, as the majority of the original GIRFT programme team were employed by this organisation. It is now co-delivered by the RNOH and NHS Improvement. As this is an independent evaluation, the RNOH have no role in the conduct of the research, nor are they data controllers. They do not determine the purposes for which and the manner in which any personal data are to be processed by the data processors.

4. British Orthopaedic Association (BOA)
The BOA is the Surgical Specialty Association for Trauma and Orthopaedics in the UK. The 2012 GIRFT ‘national professional pilot’ was hosted by the BOA (and RNOH), as the clinical lead and founder of the GIRFT programme was the president of the BOA at that time. As previously described, the GIRFT programme is now co-delivered by the RNOH and NHS Improvement. As this is an independent evaluation, the BOA have no role in the conduct of the research, nor are they data controllers. They do not determine the purposes for which and the manner in which any personal data are to be processed by the data processors.

Yielded Benefits:

This aspect of the evaluation has not yet yielded benefits, as researchers at UCL are still in the process of obtaining a linked NJR-HES-PROMs dataset. This is required to answer the research questions. Following successful receipt of the linked dataset in early 2019, and dissemination of the findings, researchers at UCL anticipate the benefits outlined in Section 5d.ii. will be achieved by mid-2020.

Expected Benefits:

The evaluation will yield benefits on three levels. First, through the use of formative methods, findings will be fed back by researchers at UCL in ‘real time’ directly to the GIRFT programme team. This will enable them to make use of the information as quickly as possible to modify the orthopaedic programme, if necessary, and to use the learning to refine the delivery of the intervention in other specialities. Findings from the evaluation will therefore enhance the wider national GIRFT improvement programme as it is rolled out. This should lead to an improved experience for those delivering the programme and the hospitals in receipt of it. Ultimately, this will increase the likelihood of the programme reducing variations in practice and improving outcomes for patients across the NHS.

Second, the summative outputs described above (such as peer-reviewed publications, a workshop and conference presentations) will be targeted at the audiences who have the greatest ability to use the findings to change practice, including clinicians, service managers and the GIRFT programme team. For example, if the research shows that elements of the GIRFT programme have been successful in reducing variations in practice and expenditure, similar approaches could be applied in other settings, with potential benefits for patients and the wider health service. However, UCL cannot assume with certainty that a reduction in expenditure will necessarily demonstrate patient benefit. Therefore, resource allocation will be taken into consideration when exploring cost-effectiveness.

The estimated total sample size for the quantitative and economic evaluation is approximately one million procedures (based on c120,000 elective hip and knee replacements per year from April 2009 to March 2017) in 134 NHS Trusts (as at 2016/17). GIRFT interventions in elective orthopaedic surgery commenced in September 2013, and by April 2014 GIRFT interventions were in place at over half of NHS Trusts. This will provide approximately 300,000 procedures occurring after the core period of GIRFT intervention in 2013 and the first half of 2014. No formal power calculation has been performed and, as described in the published protocol, researchers recognise that there may be insufficient data to produce precise estimates of impact for some outcomes. However, it is anticipated that the sample size will be sufficient to detect any meaningful differences in the proportion of low volume procedures and the use of prostheses not recommended by the GIRFT programme. Potential for over-interpretation of 'non-significant' findings will be minimised by researchers, through avoidance of hypothesis testing, and instead focusing on estimation with assessment of uncertainty. Similarly, the economic evaluation of the impact of the GIRFT programme will be conducted using available data on the cost of the GIRFT intervention, the cost for each Trust to implement the GIRFT recommendations, and the likely effects. This evaluation might highlight the lack of available and complete data to perform a full analysis, and therefore persuade other specialties to collect useful data on costs and effects for future evaluations.
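
The headline sample size above follows from simple arithmetic, shown here as a short check. Counting April 2009 to March 2017 as eight financial years is an interpretation of the stated period; the post-intervention figure of approximately 300,000 is not re-derived here because the text does not give the cut-off used.

```python
PROCEDURES_PER_YEAR = 120_000  # approximate figure quoted in the text
FIRST_YEAR_START = 2009        # April 2009
LAST_YEAR_END = 2017           # March 2017

# Financial years 2009/10 through 2016/17.
financial_years = LAST_YEAR_END - FIRST_YEAR_START
estimated_total = financial_years * PROCEDURES_PER_YEAR
print(financial_years, estimated_total)
```

The result, 960,000, is consistent with the quoted estimate of approximately one million procedures.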

Finally, whilst there have been reorganisations of a number of acute NHS services in recent years, there have been few examples on this scale. Reviews indicate that more research is needed to understand the drivers, processes, and outcomes when implementing change of this kind. This evaluation will identify lessons. Findings will be disseminated through multiple channels to a wide audience, who will be able to use the learning to optimise the organisation and delivery of future improvement programmes. For example, if the research shows that approaches used by the GIRFT programme appear to have been successful in facilitating change, similar approaches could be applied in other settings, with benefits for patients and the wider health service. Equally, if the programme has not achieved the anticipated benefits, then this is important for those planning improvement programmes in the future to take into account.

Outputs:

The proposed dissemination plan is multi-faceted, acknowledging the wide range of stakeholder groups with an interest in the GIRFT programme. The researchers will draw on expertise available within NIHR CLAHRC North Thames to support dissemination (e.g. communication and implementation expertise), to ensure that outputs are produced in the most appropriate format and targeted to the most appropriate audience, to speed up the adoption of the evaluation findings into practice.

Likely outputs that will utilise the linked NJR-HES-PROMs dataset are listed below:

• Two manuscripts are proposed for peer-reviewed journals aimed at the orthopaedic community (target journals include British Medical Journal and JAMA Surgery). Acknowledging the likely turnaround timelines for journal publication, it is anticipated that all manuscripts will be in press by mid-2020. The researchers will also explore publishing a summary of the findings in the Journal of Trauma and Orthopaedics, the BOA’s journal for trauma and orthopaedics professionals (mid 2019).

• As National Joint Registry (NJR) data will be used in the evaluation, the researchers are required to provide a 6-month interim progress report, and final report to the NJR (dates conditional on data release and final HQIP approval). These reports will be targeted at members of the NJR Research Sub-committee (clinicians, researchers, statisticians and economists); and will not be published on the NJR website or be made publicly available.

The number of each output type may be subject to change, depending on the level of interest in the work. Key findings are likely to include descriptive changes, impact of the intervention on processes, and impact of the intervention on outcomes and cost-effectiveness.

Likely outputs derived from the linked NJR-HES-PROMs dataset, as well as the wider evaluation are listed below:
• For those delivering and funding the programme (GIRFT programme team, NHS Improvement and the Royal National Orthopaedic Hospital (RNOH)) – a workshop will be held towards the end of the evaluation to summarise and discuss the findings. The audience will include project managers, clinicians, healthcare professionals, data analysts, policy makers and civil servants (Summer 2019).
• There has been considerable interest in the evaluation amongst the orthopaedic community. Therefore, a summary presentation will be offered to specialist groups, such as the British Hip Society (Summer 2019), aimed at health care professionals, patient representatives and researchers in the field of orthopaedic surgery. In addition, an abstract of the findings will be submitted to the British Orthopaedic Association’s (BOA) National Congress (September 2019).
• An abstract of the findings will also be submitted to the 2019 UK Health Services Research Network symposium (July 2019). Attendees include researchers, clinicians, policymakers and patient representatives with an interest in the organisation and delivery of health care services.
• Drawing upon findings from the wider evaluation, three further journal articles are also proposed for the wider academic community, with an interest in organisation and delivery of future improvement programmes (target journals include Implementation Science, Social Science and Medicine, and Health Policy). Again, it is anticipated that all manuscripts will be in press by mid-2020.

To reach as wide an audience as possible, researchers at UCL will make use of the NIHR CLAHRC North Thames social media channels, including tweeting links to outputs online, as well as posting links on the NIHR CLAHRC North Thames and UCL websites. CLAHRC BITEs (Brokering Innovation Through Evidence) will also be produced, which are headline ‘need to know’ summaries of high quality evidence based research findings, made available through the national NIHR CLAHRC network.

Patient and public involvement is a core part of the evaluation. A patient representative is a member of the bi-monthly steering committee, which reviews and advises on study progress. This individual has also been involved in the design and delivery of qualitative research components, for example co-facilitating focus groups. Researchers at UCL have also drawn on the support of the NIHR CLAHRC North Thames Communications and Patient and Public Involvement / Engagement (PPI/E) Officer and lay Research Advisory Panel (RAP). Members of the RAP have provided feedback on the design of the evaluation and study documentation. They also contributed to a consultation exercise, seeking public feedback about the acceptability of using pseudonymised patient data in the study. This was approved by the Health Research Authority Confidentiality Advisory Group.

All outputs will be aggregated with small number suppression in line with the HES analysis guide.

Processing:

All organisations party to this agreement must comply with the Data Sharing Framework Contract requirements, including those regarding the use (and purposes of that use) by “Personnel” (as defined within the Data Sharing Framework Contract ie: employees, agents and contractors of the Data Recipient who may have access to that data).

A list of personal identifiers of recipients of elective orthopaedic surgery who have consented for their details to be stored in the National Joint Registry (NJR) (managed by Northgate Public Services), will be securely transferred to NHS Digital. The following identifiers will be sent in for linkage from the NJR:

- NHS Number
- Date of Birth
- Sex
- Postcode
- Study ID

NHS Digital will link this data to information it collects, and extract details of all Hospital Episode Statistics and Patient Reported Outcome Measures data for the study cohort. NHS Digital will securely transfer linked pseudonymised data to University College London (UCL), with a Study ID and no patient identifiable data items. NHS Digital will also supply a fully pseudonymised HES/PROMS extract to UCL, created by extracting the PROMS eligibility codes from the HES data. This will be used for case ascertainment purposes, which will improve the quality of the evaluation. The NJR will then securely transfer pseudonymised data for the cohort from its records to UCL using the same unique ID so data can be linked and processed without researchers being able to identify patients. After this, the NJR will have no further involvement.

UCL will use study identifiers provided by NHS Digital to link pseudonymised information between the three data-sets: Hospital Episode Statistics, Patient Reported Outcome Measures, and the National Joint Registry data-set for the specified cohort. This information will be analysed alongside NHS Trust-level data on the timing of GIRFT interventions (provided by the GIRFT team) and an aggregated organisational survey of some NHS Trusts (that has been designed and sent out by UCL), but will not be linked to any other national data-sets. All processing for the purposes of analysis will be carried out by researchers at UCL. There will be no attempts to re-identify any of the patients in the study. GIRFT are not a data processor in this study.
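As a rough illustration of linking on a study identifier alone, the sketch below attaches HES and PROMs rows to each NJR cohort record. The field name `study_id`, the row layout, and the choice to keep every NJR record even when no HES or PROMs rows exist are all assumptions made for this example; none of them are taken from the agreement.

```python
from collections import defaultdict
from typing import Dict, Iterable, List


def group_by_study_id(rows: Iterable[dict]) -> Dict[str, List[dict]]:
    """Group rows by the (hypothetical) 'study_id' field."""
    grouped: Dict[str, List[dict]] = defaultdict(list)
    for row in rows:
        grouped[row["study_id"]].append(row)
    return grouped


def link_cohort(njr_rows: List[dict],
                hes_rows: List[dict],
                proms_rows: List[dict]) -> Dict[str, dict]:
    """Attach HES and PROMs rows to each NJR record using only the study ID.

    Assumption: the NJR extract defines the cohort, so every NJR record is
    kept and missing HES or PROMs data appears as an empty list.
    """
    hes = group_by_study_id(hes_rows)
    proms = group_by_study_id(proms_rows)
    linked = {}
    for record in njr_rows:
        sid = record["study_id"]
        linked[sid] = {
            "njr": record,
            "hes": hes.get(sid, []),
            "proms": proms.get(sid, []),
        }
    return linked
```

No direct identifier is needed at any point: the study ID is the only key the three extracts share.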

All outputs will be aggregated with small number suppression in line with the HES analysis guide.

The data will not be made available to any third parties. Data will be used for study outputs described in this Agreement.

Project 85 — DARS-NIC-330769-C9Y8Y

Opt outs honoured: Yes - patient objections upheld (Excuses: Does not include the flow of confidential data)

Legal basis: Health and Social Care Act 2012 – s261(1) and s261(2)(b)(ii), Health and Social Care Act 2012 – s261(7)

Purposes: ()

Sensitive: Non Sensitive, and Sensitive

When:2019.07 — 2019.07. SMLS reported a DPA serious incident; breached contract — audit report.

Access method: One-Off

Data-controller type:

Sublicensing allowed:

AGD/predecessor discussions: igard-minutes-12-april-2018.pdf, igard_minutes_21_december_2017.pdf, igard_minutes_28_september_2017.pdf, DAAG_Minutes_13.08.15.pdf

Datasets:

Hospital Episode Statistics Accident and Emergency
Hospital Episode Statistics Critical Care
Hospital Episode Statistics Admitted Patient Care
Hospital Episode Statistics Outpatients
Civil Registration - Deaths

Type of data:

Objectives:

Due to an error by one of the GP computer system suppliers (TPP), NHS Digital did not receive updated details on patients who opted out after 1 April 2016 in GP practices running SystmOne (run by TPP). This means that information may have been incorrectly included when NHS Digital provided information to approved organisations, up until 27th May 2018.

This amendment is to re-supply the data previously held under this agreement. The applicant organisation has destroyed all data previously held under this agreement, including the affected data, and supplied a data destruction certificate.
****************************
Improving the transition of young people from child-centred to adult-centred health care systems has been a health policy priority for successive governments. There is clear evidence that poor transitions result in poor health outcomes, particularly related to drop-out from healthcare at a key time in chronic disease management. Conversely, there is now reasonable quality evidence that improving transitions improves outcomes in diseases such as diabetes and chronic renal disease/transplantation. Such evidence has resulted in a decade of clear guidance to health services on the importance of providing good transitional care in England. Yet despite this, and subsequent Department of Health (DH) initiatives, such as the Transitions Champions programme, there appears to have been little change in the majority of health services in England. It is anecdotally believed that the majority of young people with long-standing conditions in England do not receive appropriate transitional care, although routine data on transitional care are not collected.

The objective for processing the data is to investigate, using a contemporary UK sample, the effect of transitioning from paediatric to adult care on indicators of illness management relating primarily to health service usage. University College London (UCL) will also investigate how specific features of the transition to adult services are associated with these outcomes to contribute to the evidence-base regarding the features of successful transitions. This research is limited to the following three purposes:

1) Examine the health impact of transitions from paediatric to adult care.
This will help determine the extent to which current transition arrangements are fit for purpose. This entails examining whether transition itself is associated with detrimental health-related outcomes such as changes in planned health service use, increased use of inpatient, A&E and critical care services and changes in the frequency of missed outpatient appointments.

2) Guide healthcare policy to improve health transitions.
This research will identify factors associated with good transition outcomes to guide policy efforts to improve healthcare transitions. First, UCL will examine how age of transition is associated with transition-related outcomes. This may influence guidance regarding appropriate age of transition. Secondly, UCL will examine the impact of the frequency of outpatient appointments in paediatric care on adult outcomes. Finally, UCL will examine differences in transition outcomes across sentinel health conditions and specialties: diabetes, renal disease and gastroenterology. This will identify areas requiring improvement in transition approaches.

3) Contribute to the development of a measurable outcome metric for transition that could be included in the NHS outcomes framework, to drive improved attention to transition by providers.
The research will constitute a trial of an approach to measuring transition outcomes using routine health data. Health outcomes which are found to be associated with health transitions could form the basis of quality measures for transition across regions, specialties or health authorities.

Yielded Benefits:

UCL's previous work using HES data was useful to provide proof of concept of transition analyses; researchers were able to identify disease cohorts and operationalise definitions of successful transition within the cohorts. Overall system performance on transition was found to be poor, with high dropout and lack of engagement in adult services. Furthermore, there were substantial inequalities across social groups and region. Researchers found some evidence linking poor transition to poor outcomes, namely within diabetes; poor transitions are associated with higher risk of death in 5 years post-transition as well as higher A&E use and emergency admissions. The research also suggests several pathways for achieving improved transition. Firstly, targeting under-served groups based on socio-economic status, ethnicity and area is critical in more equitable transition services. Second, general delaying of transition age could improve transition outcomes. Findings also provide some support for the use of age-appropriate services, but the enactment of this model would need to be handled well to avoid additional difficulties in transitioning among services. UCL researchers provided a briefing paper for DH on these emerging findings, plus presented data at conferences. The Briefing report outlines the results from the analysis and background and is for the policy makers within DH. In addition, a poster on ‘Optimising healthcare transitions for young people: a systematic review of reviews’ was presented to the Royal College of Physicians Conference, 2015; which researchers have previously also presented to the European Society for Prevention Research 6th Annual Conference, 2015 and Royal College of Paediatrics and Child Health, 2016.

Expected Benefits:

UCL's previous work using HES data was useful to provide proof of concept of their transition analyses; and they were able to identify disease cohorts and operationalise definitions of successful transition within the cohorts. UCL provided a briefing paper for the Dept of Health (DH) on these emerging findings, plus presented data at conferences.

UCL researchers found overall system performance on transition was poor, with high dropout and lack of engagement in adult services common. Furthermore, there were substantial inequalities across social groups and region. Evidence was found linking poor transition to poor outcomes, namely within diabetes; poor transitions are associated with higher risk of death in 5 years post-transition as well as higher A&E use and emergency admissions.

UCL's research suggests several pathways for achieving improved transition. Firstly, targeting under-served groups based on socio-economic status, ethnicity and area is critical in more equitable transition services. Second, research suggests that a general delaying of transition age could improve transition outcomes. They also provide some support for the use of age-appropriate services, but the enactment of this model would need to be handled well to avoid additional difficulties in transitioning among services.

However, UCL also identified several limitations in their work, particularly relating to the definition of disease cohorts. UCL researchers use ICD10 admission codes to define disease cohorts, and then examine transition from paediatric to adult outpatient activity. Therefore, cohorts may currently under-estimate the proportions who transition, as they can only include those with ICD10 codes from inpatient admissions after age 10 years. Further, researchers are unable to examine the transition of cohorts younger than those aged 10-19 years in 2004/05, limiting the generalisability of their findings. Additionally, UCL wish to look at wider quality markers in addition to transition, and examining younger cohorts will be important for this.

Therefore, UCL aim that this project and previous work will add towards the expected measurable benefits on health and social care to include:

1) Improved transition-related health policy
This research will guide Department of Health policy on the need, as well as specific policy measures, for improving transitions from paediatric care. This research will inform policy recommendations regarding appropriate age of transition and measures for reducing upheaval to service provision during the transition to adult services including frequency of appointments in late paediatric care and improving retention in adult services.

2) Developing quality measures for children’s health services and transition
The research will investigate the feasibility and value of measuring key aspects of quality gaps as identified from the recent RCPCH (Royal College of Paediatrics and Child Health) research. These will help identify health services where improved transition is justified and may contribute to the development of standards for transition quality.

Measurable benefits resulting from this research would relate to improved health outcomes and healthcare (including, for example, retainment in adult services, fewer missed appointments and less usage of emergency services) stemming from improved transition services.

Outputs:

Key outputs will be produced based on the data analyses:
1) Briefing reports targeted at health practitioners (particularly DH, who are funding the research), which summarise the key findings and suggest policy measures for maximizing health outcomes resulting from service transition. Policy guidance may include the development of measures for monitoring transition care and outcomes, especially in children and adolescents.
2) Peer-reviewed publication(s) designed to disseminate the findings from the health transitions research and from the quality measures for children’s health services.
3) Presentations to conferences will also be prepared.

The outputs will be presented in the following format for all of the above:
• Descriptive statistics for disease cohorts in the analyses will be presented. The cohorts will be combined on predictors selected from inpatient appointments pre- and post-transition, A&E and critical care appointments pre- and post-transition. Means and frequency distributions will be provided only on aggregated data, with small numbers suppressed in line with the HES analysis guide.
• Regression coefficients showing association or causal pathway for the outcomes and predictors.

The outputs will focus on explaining the hypothesis to DH policy makers, clinicians and academics. No data will be shared with third parties. The original target date for finishing the project was December 2018; however, restrictions on the processing of data following notification of the Type 2 Patient Opt Out error in TPP GP practices have pushed the conclusion of the project back to March 2019.

All outputs will be aggregated with small numbers suppressed in line with the HES Analysis Guide. No data will be linked to record patient level data, and record-level data will not be removed from the secure servers.

Processing:

Data will be stored in and accessed through UCL’s ‘Information Data Safe Haven’ (IDHS) which ensures the appropriate and safe handling of sensitive data (see: http://www.ucl.ac.uk/isd/itforslms/services/handling-sens-data/tech-soln). Only authorised UCL staff members will have access to the data and it will not be accessible by any third parties, nor will it be accessed outside the UK.

Data analyses will be conducted in Stata to extract summary statistics. UCL will determine age of transition with reference to paediatric and adult codes within the data. UCL will also define two additional characteristics of transition for each patient: the delay in transition (the gap between last paediatric code and first adult code), and retention in adult services (including changes in regular, planned outpatient appointments). The analysis is limited to comparing outcomes pre and post transition, and examining the effect of age of outcome, delays in transition to adult services and retention in adult services on these outcomes. UCL will conduct these analyses across conditions, as well as in three sentinel conditions: renal pathologies, diabetes and gastroenterological diseases.
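The delay-in-transition definition above (the gap between the last paediatric code and the first adult code) can be sketched as follows. This is an illustration in Python rather than the Stata code used by the study. The record layout, the `setting` labels, the per-appointment `age` field and the choice to take age of transition from the first adult appointment are all assumptions made for the example.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional


@dataclass
class Appointment:
    when: date
    setting: str  # hypothetical labels: "paediatric" or "adult"
    age: int      # hypothetical field: age in years at the appointment


@dataclass
class Transition:
    age_at_transition: int
    delay_days: int


def summarise_transition(appointments: List[Appointment]) -> Optional[Transition]:
    """Return age at transition and delay in days, or None if either
    setting is absent (for example a patient never seen in adult services)."""
    paediatric = [a for a in appointments if a.setting == "paediatric"]
    adult = [a for a in appointments if a.setting == "adult"]
    if not paediatric or not adult:
        return None
    last_paediatric = max(paediatric, key=lambda a: a.when)
    first_adult = min(adult, key=lambda a: a.when)
    # The delay can be negative if paediatric and adult activity overlap.
    delay = (first_adult.when - last_paediatric.when).days
    return Transition(age_at_transition=first_adult.age, delay_days=delay)
```

Patients for whom the function returns None would need separate handling, since the absence of any adult activity is itself relevant to retention in adult services.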

Outcomes are limited to use of inpatient care, A&E attendances, frequency of hospital admissions, critical care admissions, and mortality.

All organisations party to this agreement must comply with the Data Sharing Framework Contract requirements, including those regarding the use (and purposes of that use) by “Personnel” (as defined within the Data Sharing Framework Contract ie: employees, agents and contractors of the Data Recipient who may have access to that data).

Precision in Provision: Predicting Treatment Outcome and Resource Use in Child Mental Health — DARS-NIC-140981-R5N6Z

Opt outs honoured: No - data flow is not identifiable, No (Excuses: Does not include the flow of confidential data)

Purposes: No (Academic)

Sensitive: Non Sensitive, and Non-Sensitive

When:DSA runs 2018-10 – 2021-09 2019.05 — 2019.05. SMLS reported a DPA serious incident; breached contract — audit report.

Access method: One-Off

Data-controller type: UNIVERSITY COLLEGE LONDON (UCL)

Sublicensing allowed: No

AGD/predecessor discussions: IGARD Minutes - 3 February 2022 final.pdf, igard-minutes-30th-august-2018-final.pdf, igard-minutes-22nd-november-2018---final.pdf, igard-minutes-15th-november-2018---final.pdf, igard-minutes-1-november-2018---final.pdf

Datasets:

Mental Health Services Data Set
Mental Health Services Data Set (MHSDS)

Type of data: Anonymised - ICO Code Compliant

Objectives:

The University College London (UCL) requires a pseudonymised extract from the MHSDS for use in the Precision in Provision project to conduct analysis into predicting treatment outcome and resource use in child and adolescent mental health in children and young people aged 2 to 25.

UCL instigated this project after recognising that there is a call for precision medicine whereby child mental health interventions are tailored to meet the specific needs of individual young people. The team at UCL applied for and secured funding from MQ: Transforming Mental Health to undertake this work. MQ represents mental health and quality of life, two things MQ believe everyone deserves. The data controller for this project is UCL who will also process the data. MQ is purely the funder.

The individuals who will process the data are substantive employees of UCL or hold honorary contracts with UCL. Honorary contract holders are subject to the terms and conditions of their substantive employer and are subject to the same disciplinary procedure as substantive employees of UCL.

There is a lack of evidence about which characteristics of a young person are associated with treatment outcome and resource use. This research aims to address this gap and thereby expand the use of data resources for mental health research, while at the same time developing the skills base in data linkage. To this aim, UCL will link and analyse data on young people accessing child mental health services from the Child Outcomes Research Consortium (CORC) Research programme and the MHSDS data being requested from NHS Digital. No identifiable data is being used to link the data sets; rather, linkage is being performed using the probabilistic method (data will be linked using overlapping variables).

In addition to exploring the best methods for data linkage, the two substantive research questions will be:
RQ1. What is the association between case-mix characteristics and effective treatment outcome in young people accessing child mental health services? Here, effective treatment outcome will be assessed using fields relating to a service user’s change in symptoms and functioning between the start and end of treatment (e.g., Strengths and Difficulties Questionnaire, Revised Child Anxiety and Depression Scale).
RQ2. What is the association between case-mix characteristics and efficient resource use in young people accessing child mental health services? Here, efficient resource use will be assessed using fields relating to the number of sessions attended, drop out, and time taken for case closure.

The corresponding expected outcomes are:
O1. Evidence showing that young people with certain case-mix characteristics are more or less likely to achieve an effective treatment outcome;
O2. Evidence showing that young people with certain case-mix characteristics are more or less likely to achieve an efficient resource use.

Case-mix characteristics refer to characteristics of service users including demographics (e.g., age, gender, ethnicity, socio-economic status, special educational needs) and needs-based groupings.

***********
A secondary purpose of this project is to support the feasibility and sustainability of data linkage and future data linkage work that is part of the government’s commitment to better data on child mental health. This is not a separate project and no other parties are involved. UCL will link the pseudonymised record level data from the MHSDS to the pseudonymised record level data from the Child Outcomes Research Consortium Research dataset. Examples of the linkage variables that will be used are gender, organisation identifier, team local identifier, team type and care contact date. The linkage will be carried out using a probabilistic method.

Expected Benefits:

The measurable benefits of this work are a more effective and efficient service provision for young people with mental health difficulties and their families. By learning and sharing the learning about the different outcomes achieved by different young people with different sorts of problems and how this links with other factors in their lives, clinicians can target help more effectively and also ensure people get what they need in the most timely way possible. These findings will also help commissioners agree more realistic targets for different groups with service providers allowing more efficient allocation of resources. The direct benefit to young people and their carers and families is that they will be able to have more refined information about what sort of help might help them given their particular circumstances and this should enable them to make more informed decisions about their care. These benefits will be facilitated by providing training, guidance and helping people audit practice in relation to this.

NHS England who are developing payment systems for child mental health services will be able to draw on the learning from this project as UCL works closely with them and they can use the outcomes to help determine the most appropriate case-mix groupings for payment purposes. In addition, the organisation has close working links with child mental health policy leads in NHS England and Department of Health, as well as also being part of the work for both the mental health policy research unit and child policy research unit. Through all of these routes, the findings and learning from this project will influence and contribute to policy development in this area.

By understanding more about the factors associated with different outcomes for different children and young people, service providers and policy makers can target their work more effectively allowing for cost efficiencies. It is expected that these would be realised within two years of project end.

Findings from this project will also work toward the government’s commitment to better data on child mental health, and it will also support the sustainability of data linkage and future data linkage – for example, across education, health and social care. This will be achieved by sharing learning about the probabilistic data linkage approach being trialled here which offers the opportunity to link data sets without inclusion of personal identifiers. As noted above, UCL has extensive links with government departments including Department of Health and NHS England. In addition, UCL are part of two policy research units (child and mental health) which work closely with civil servants to advise on research evidence to inform policy and have a particular commitment to advancing data linkage and use of secondary data to inform policy. The funders for this study, MQ, are committed to using this and other funded projects to learn more about best ways forward for data linkage and to promote this through their national campaigns and future funding initiatives.

The benefits of this research are expected to be realised by early 2020.

Outputs:

This project will produce reports, publications and presentations. These will contain advice to commissioners, practitioners and service users about implications of case-mix for clinical outcomes. The advice will be written commentary on the learning from the analysis, in terms of the understanding of factors correlating with child mental health outcomes relevant to the original research question. All outputs included in reports, summaries and papers will contain only data that is aggregated with small numbers suppressed in line with the Mental Health data sets disclosure control rules.

UCL will work with young people, carers and therapists on the best ways of using the information about the associations between case-mix characteristics and effective treatment outcomes and efficient resource use in clinical discussions with young people and families. These groups will be accessed through UCL's existing networks, including charities such as the Anna Freud Centre and Young Minds, and through learning collaborations of practitioners and commissioners such as the Child Outcomes Research Consortium. These networks are well established and reach over six thousand practitioners and tens of thousands of children and young people. The guidance will be disseminated by email and web presence using existing newsletters and presentation events.

Clinicians will be encouraged to take note of the guidance via their regular clinical supervision. This will be supported through providing training, guidance and helping people to audit their practice. The organisation also works closely with colleagues in NHS England who are developing payment systems for child mental health services who will be able to draw on the learning from this project and can use the analysis outcomes to help determine case-mix groupings for payment purposes. These in turn will then underpin clinician behaviour.

UCL also has close working links with child mental health policy leads in NHS England and Department of Health who will be able to utilise the learning from this project to contribute and influence policy development in this area.

An academic article on the associations between case-mix characteristics and treatment outcome and resource use will be targeted to the Journal of Child Psychology and Psychiatry, a leading journal of interest to researchers and professionals in the sector. A blog and video summary will also be produced for the general public. This will be available by January 2020.

A summary of findings will be disseminated through UCL networks, the Child Outcomes Research Consortium’s (CORC) channels and UCL will liaise with MQ for dissemination via their networks. UCL and CORC websites will provide links to open access papers and offer free downloads of accessible summaries of findings. All publications and presentations will be promoted on twitter via the CORC account (over 1500 followers). Findings and implications will be presented at the CORC Members' Forum in November 2018 and the CAMHS/Child and Adolescent Mental Health conference in July 2019.

Processing:

NHS Digital reminds all organisations party to this agreement of the need to comply with the Data Sharing Framework Contract requirements, including those regarding the use (and purposes of that use) by “Personnel” (as defined within the Data Sharing Framework Contract ie: employees, agents and contractors of the Data Recipient who may have access to that data).

For data from the Mental Health (MHSDS, MHLDDS, MHMDS) data sets, the following disclosure control rules must be applied:
• National-level figures only may be presented unrounded, without small number suppression
• Suppress all numbers between 0 and 5
• Round all other numbers to the nearest 5
• Percentages can be calculated based on unrounded values, but need to be rounded to the nearest integer in any outputs
• In addition for Learning Disability data in Mental Health (MHSDS, MHLDDS, MHMDS), the England-level data also must apply the suppression of all numbers between 0 and 5, and rounding of other numbers to the nearest 5.
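The disclosure control rules above can be expressed as a short sketch. Two points are assumptions rather than anything the rules state: "between 0 and 5" is read here as inclusive at both ends (the more conservative reading), and a percentage is withheld whenever its numerator would itself be suppressed. Ties in percentage rounding are rounded up, which is also an assumption. All function and constant names are hypothetical.

```python
import math
from typing import Optional

SUPPRESS_LOW = 0    # assumption: lower bound is inclusive
SUPPRESS_HIGH = 5   # assumption: upper bound is inclusive
ROUND_BASE = 5


def disclose_count(n: int,
                   national: bool = False,
                   learning_disability: bool = False) -> Optional[int]:
    """Return the publishable count, or None if it must be suppressed."""
    # National-level figures may be shown as-is, except Learning Disability data.
    if national and not learning_disability:
        return n
    if SUPPRESS_LOW <= n <= SUPPRESS_HIGH:
        return None
    # Integer arithmetic avoids floating-point ties when rounding to the nearest 5.
    return ((n + ROUND_BASE // 2) // ROUND_BASE) * ROUND_BASE


def disclose_percentage(numerator: int, denominator: int) -> Optional[int]:
    """Percentage from unrounded values, rounded to the nearest integer."""
    if denominator == 0:
        return None
    # Assumption: withhold the percentage when the numerator is suppressible.
    if SUPPRESS_LOW <= numerator <= SUPPRESS_HIGH:
        return None
    return math.floor(100 * numerator / denominator + 0.5)
```

Returning None instead of a placeholder string leaves the choice of suppression marker to whatever produces the final table.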

Data flow:
1. UCL provides to NHS Digital a list of variables and a list of service providers and organisations submitting to MHSDS who have also submitted to UCL as part of the CORC Research Programme since January 2016.
2. NHS Digital will provide, via Secure Electronic File Transfer, a pseudonymised extract of MHSDS (package 1d - community). This will then be uploaded to the ‘Precision in Provision’ partitioned area of the UCL Data Safe Haven. This area is only accessible to named individuals within the project team, all of whom are substantive employees of UCL or hold an honorary contract with UCL. The MHSDS 1d community package provides data to support analysis of local level community activity (administrative data, clinical data and demographics).
3. Also in the ‘Precision in Provision’ partitioned area of the UCL Data Safe Haven will be an extract from the pseudonymised CORC Research Programme UCL dataset.

Once NHS Digital has shared the pseudonymised MHSDS data with UCL and UCL has uploaded it to the UCL Data Safe Haven, the data will then be prepared by standardising and cleaning each data set (pseudonymised NHS Digital data and the anonymised CORC Research programme dataset held by UCL) so they can be best prepared for linkage. Particular time and care will be taken in this step given that good file preparation is one of the most important factors for ensuring effective data linkage. Example activities will include transforming data sets from hierarchical to flat structures as required, converting overlapping variables to the same format (e.g. dates), mapping variables to a common set of codes where different code lists have been used (e.g. ethnicity), and recalculating age variables that are based on different reference dates.
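Three of the preparation activities named above (flattening hierarchical records, converting dates to one format, and mapping codes to a common list) might look like the sketch below. The accepted date formats, the code values and the record layout are all invented for illustration and do not reflect the actual MHSDS or CORC field definitions. Age recalculation is left out because it depends on which date fields each extract carries.

```python
from datetime import date, datetime
from typing import Dict, List

# Hypothetical source date formats; the real extracts may use others.
DATE_FORMATS = ("%Y-%m-%d", "%d/%m/%Y", "%Y%m%d")

# Hypothetical mapping from dataset-specific codes to a common code list.
COMMON_CODES: Dict[str, str] = {
    "SRC1_A": "C01",
    "SRC2_01": "C01",
    "SRC1_B": "C02",
    "SRC2_02": "C02",
}


def parse_date(text: str) -> date:
    """Convert a date string in any accepted format to a date object."""
    for fmt in DATE_FORMATS:
        try:
            return datetime.strptime(text.strip(), fmt).date()
        except ValueError:
            continue
    raise ValueError(f"Unrecognised date format: {text!r}")


def to_common_code(local_code: str) -> str:
    """Map a dataset-specific code to the common list; flag unknown codes."""
    return COMMON_CODES.get(local_code, "UNMAPPED")


def flatten(referral: dict) -> List[dict]:
    """Turn one referral with nested contacts into one flat row per contact."""
    parent = {k: v for k, v in referral.items() if k != "contacts"}
    return [{**parent, **contact} for contact in referral["contacts"]]
```

Mapping unknown codes to an explicit "UNMAPPED" value, rather than dropping them, keeps cleaning problems visible before linkage begins.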

The linkage approach will be driven by the established framework of probabilistic linkage. This step will involve determining the specifics of the approach most suited to linking the UCL data and MHSDS data set. It will involve confirming the overlapping variables to use for linking, picking blocking variables, estimating parameters for assigning matching weights to record pairs, setting thresholds for classifying record pairs as matches, non-matches or possible matches (for manual review), and deciding on rules for multiple iterative passes of linkage.
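The components listed above (a blocking variable, matching weights for record pairs, and two thresholds) can be illustrated with a minimal sketch in the style of the standard Fellegi-Sunter model. The agreement does not say which model or parameters will be used, so the m- and u-probabilities, the thresholds, the field names and the choice of organisation identifier as the blocking variable are all hypothetical. In practice the parameters would be estimated from the data, and missing values would need their own handling, which is not modelled here.

```python
import math
from typing import Dict, List, Tuple

# Hypothetical parameters per linkage variable:
# m = P(fields agree | true match), u = P(fields agree | non-match).
M_U: Dict[str, Tuple[float, float]] = {
    "gender": (0.98, 0.50),
    "team_type": (0.95, 0.20),
    "care_contact_date": (0.90, 0.01),
}

UPPER_THRESHOLD = 6.0  # hypothetical: at or above this, classify as a match
LOWER_THRESHOLD = 0.0  # hypothetical: at or below this, classify as a non-match


def match_weight(a: dict, b: dict) -> float:
    """Sum log2 agreement or disagreement weights over the linkage variables."""
    total = 0.0
    for field, (m, u) in M_U.items():
        if a.get(field) == b.get(field):
            total += math.log2(m / u)
        else:
            total += math.log2((1 - m) / (1 - u))
    return total


def classify(weight: float) -> str:
    """Pairs between the two thresholds go to manual review."""
    if weight >= UPPER_THRESHOLD:
        return "match"
    if weight <= LOWER_THRESHOLD:
        return "non-match"
    return "possible match"


def link(left: List[dict], right: List[dict],
         block_key: str = "organisation_id") -> List[Tuple[str, str, float, str]]:
    """Compare only record pairs that share the blocking variable."""
    blocks: Dict[str, List[dict]] = {}
    for record in right:
        blocks.setdefault(record[block_key], []).append(record)
    results = []
    for l_rec in left:
        for r_rec in blocks.get(l_rec[block_key], []):
            w = match_weight(l_rec, r_rec)
            results.append((l_rec["id"], r_rec["id"], w, classify(w)))
    return results
```

Blocking keeps the number of comparisons manageable, at the cost of never finding true matches whose blocking value differs between the two data sets; running several passes with different blocking variables is one way to reduce that loss.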

Record pairs falling between the thresholds will be manually reviewed for classification as matches or non-matches. Based on the findings from the manual reviewing, iterative refinements to the linkage approach may be made. After data linkage, manual checks will be carried out on small random samples of record pairs designated as matches.

All data processing and analysis will be conducted within the partitioned area of the UCL Data Safe Haven. This will include linking the pseudonymised record level data from the MHSDS to the pseudonymised record level UCL data using a probabilistic method. Examples of linkage variables that will be used are gender, organisation identifier, team local identifier, team type, care contact date. The CORC Research programme data set contains pseudonymised information of patient demographics, period of contact, events and questionnaire data.

Linking these datasets is crucial to maximise the strengths and overcome the limitations of each individual dataset; for example, the UCL dataset contains rich outcome information but limited demographic information, whereas the MHSDS contains rich demographic information but limited outcome information. The linked data will be used in analysis investigating association between case-mix characteristics, treatment outcome and resource use.

The data will be minimised by only including variables needed for linkage and analysis and filtering for only services that submit data to the data set held by UCL. The lists of service providers to filter by and variables to be included for linkage and analysis have been provided to NHS Digital.

The steps that have been put in place and the nature of the data mean that the possibility of re-identification is so remote as to render the data pseudonymised.

The data that does not link between the two sets of data will be considered for comparator analysis. The data will not be made available to any third parties except in the form of aggregated outputs with small numbers suppressed in line with the Mental Health data sets disclosure control rules.

For all outputs from this project, the following disclosure rules will be applied:
• Suppress all numbers between 0 and 5
• Round all other numbers to the nearest 5
• Percentages may be calculated based on unrounded values, but will be rounded to the nearest integer in any outputs
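
The three rules above can be sketched as follows. This is an illustration under stated assumptions, not the project's actual tooling: "between 0 and 5" is read here as strictly between (counts of 1 to 4), the `*` suppression marker is a placeholder, and the function names are hypothetical.

```python
import math

SUPPRESSED = "*"  # placeholder marker; an assumption, not part of the rules


def disclose_count(n):
    """Apply suppression and rounding to a single non-negative integer count."""
    if 0 < n < 5:  # assumed reading of "between 0 and 5"
        return SUPPRESSED
    return 5 * ((n + 2) // 5)  # nearest multiple of 5


def disclose_percentage(numerator, denominator):
    """Percentage from unrounded counts, rounded half-up to an integer."""
    return math.floor(100 * numerator / denominator + 0.5)
```

Integer arithmetic is used for the rounding so that the result does not depend on Python's round-half-to-even behaviour.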

As of January 2016, the Mental Health and Learning Disabilities Data Set (MHLDDS) standard superseded and replaced the Child and Adolescent Mental Health Services (CAMHS) data set. For this reason, the date period of the data requested will be January 2016 up to the most recent available extract.

MR1179 - INFANT study — DARS-NIC-147793-R05H3

Opt outs honoured: N, Yes, No

Legal basis: Health and Social Care Act 2012,

Purposes: No (Academic)

Sensitive: Non Sensitive, and Sensitive, and Non-Sensitive

When:DSA runs 2012-07 – 2027-07 2016.09 — 2019.03. SMLS reported a DPA serious incident; breached contract — audit report.

Access method: Ongoing, One-Off

Data-controller type: UNIVERSITY COLLEGE LONDON (UCL)

Sublicensing allowed: No

Datasets:

MRIS - Scottish NHS / Registration
MRIS - Personal Demographics Service

Type of data: Identifiable, Anonymised - ICO Code Compliant

Objectives:

The objectives of the study are:
1. to determine whether intelligent decision support can improve interpretation of the intrapartum cardiotocograph (CTG) and therefore improve the management of labour for women who are judged to require continuous electronic heart rate monitoring. Specifically, will the system, compared with current clinical practice:
i. identify more clinically significant heart rate abnormalities?
ii. result in more prompt and timely action on clinically significant heart rate abnormalities?
iii. result in fewer "poor neonatal outcomes"?
iv. change the incidence of operative interventions?
2. to determine whether use of the decision-support software has any effect on the longer term neurodevelopment of children born to women participating in the INFANT study.

Yielded Benefits:

Variation in Healthy Life Expectancy Throughout Childhood and Adulthood in England — DARS-NIC-06527-J1Q6T

Opt outs honoured: No - data flow is not identifiable, No (Excuses: Does not include the flow of confidential data)

Purposes: No (Academic)

Sensitive: Non Sensitive, and Sensitive, and Non-Sensitive

When:DSA runs 2018-12 – 2021-11 2019.03 — 2019.03. SMLS reported a DPA serious incident; breached contract — audit report.

Access method: One-Off

Data-controller type: UNIVERSITY COLLEGE LONDON (UCL)

Sublicensing allowed: No

AGD/predecessor discussions: igard-minutes---7th-may-2020-final.pdf, igard-minutes-6th-december-2018-final.pdf, IGARD_Minutes_30.03.17.pdf

Datasets:

Hospital Episode Statistics Critical Care
Hospital Episode Statistics Outpatients
HES:Civil Registration (Deaths) bridge
Hospital Episode Statistics Accident and Emergency
Hospital Episode Statistics Admitted Patient Care
Civil Registration (Deaths) - Secondary Care Cut
Civil Registrations of Death - Secondary Care Cut
Hospital Episode Statistics Accident and Emergency (HES A and E)
Hospital Episode Statistics Admitted Patient Care (HES APC)
Hospital Episode Statistics Critical Care (HES Critical Care)
Hospital Episode Statistics Outpatients (HES OP)

Type of data: Anonymised - ICO Code Compliant

Objectives:

The UCL Institute of Health Informatics Research is a hub for facilitating the improvement of healthcare in the NHS underpinned by rigorous research methods of complex health data. The Institute has a strong commitment towards developing a culture for sharing innovative methods and outputs with the aim of maximising the impact and visibility of research using linked health data. Building on the success of the Farr Institute, the MRC has established Health Data Research UK (HDR UK), a multi-funder UK institute for health and biomedical informatics research.

The Farr Institute was a UK-wide research collaboration involving 21 academic institutions and health partners in England, Scotland and Wales. Publicly funded by a consortium of ten organisations, led by the Medical Research Council, between 2013 and 2018, the Institute was committed to delivering high-quality, cutting-edge research using ‘big data’ to advance the health and care of patients and the public. The Farr Institute did not own or control data but analysed data to better understand the health of patients and populations.

The Farr Institute’s five years of funding comes to a close in October 2018 with the newly established Health Data Research UK stepping into position as the country’s national health data science institute. HDR UK is a joint investment led by the Medical Research Council, together with the National Institute for Health Research (England), the Chief Scientist Office (Scotland), Health and Care Research Wales, Health and Social Care Research and Development Division (Public Health Agency, Northern Ireland), the Engineering and Physical Sciences Research Council, the Economic and Social Research Council, the British Heart Foundation and Wellcome. There are 6 geographically placed centres, one of which is HDR London, which includes Imperial College London.

University College London will undertake four studies on variation in healthy life expectancy throughout childhood and adulthood in England. This programme proposes 4 clinically relevant research studies investigating the relationship between age at which people develop morbidities or disability requiring hospital admission and subsequent survival. A commonly accepted criterion for prioritising health interventions is not solely to prolong life but to keep people healthy longer (Objective 4 NHS Mandate). Substantial inequalities exist in disability-free and healthy life expectancies across cross-sections of the population, particularly in groups with deprived socio-economic characteristics. The NHS Constitution for England states that the NHS has a ‘social duty to promote equality through the services it provides and to pay particular attention to groups or sections of society where improvements in health and life expectancy are not keeping pace with the rest of the population’ (DH 2015). The dual challenge researchers will tackle is, therefore, to understand how specific conditions lead to an overall degradation of health and even to death, and how this burden of illness is distributed across geographical areas and patient characteristics. Such evidence is used by health organisations such as Public Health England (PHE) and the National Institute for Health and Care Excellence (NICE) to issue clinical and policy guidelines.

Using HES linked to death registrations, the researchers propose to estimate measures of healthy life expectancy for a range of sociodemographic groups and cohorts of patients with specific conditions or risk factors. Individual-level exposures to be examined include: age at inception to the cohort, presenting condition, indicators of underlying chronic conditions recorded in hospital records, hospital contacts (including inpatient, outpatient and A&E) and GP registration and demographic factors (e.g. deprivation, ethnic group, gender). Organisational and area-level exposures will include local authority and hospital at cohort inception, and organisational characteristics such as specialty services, number of admissions, A&E and outpatient provision. Outcomes related to healthy life expectancy will be defined as time to death and time to indicators of loss of healthy life e.g. occurrence of complications or morbidity identified in subsequent hospital presentations.

The researchers will not undertake any analyses outside the aims and objectives specified for the four studies listed below.

As part of work within Health Data Research UK (HDR-UK), UCL will harmonise and standardise methods for analysing cohorts across the life course, range of conditions and demographic indices. UCL aim to ensure consistent validation and use of algorithms across conditions and age groups to enable comparability of the studies. The four studies will also develop new tools to advance policy and research into healthy life expectancy and for use in outputs by other agencies (e.g. PHE).

The approach of using information from the whole HES record across the life course represents a major advance over current estimates based on potential years of life lost from chronic conditions such as heart disease or COPD recorded on death registrations and for single conditions. These methods fail to capture conditions and risk factors earlier on in the life course, which often do not appear on death certificates and might offer opportunities for intervention. Processing of hospital records linked to death records can inform clinicians and patients about the future risks for patients given a point such as first diagnosis, when prognostic information is crucial for patients and clinicians deciding on future treatment plans. Further, the methods will provide population-based information on the burden of illness in the population of people who are still alive. UCL aim to inform policy about the growing gap between life expectancy and healthy life expectancy: people are living longer but spending more years with disability. How can years of healthy life be increased, in whom, and at what point in disease trajectories is intervention likely to have most impact? The studies will develop methods to guide patients, clinicians and policy makers, and will inform key policy programmes such as the English Burden of Disease (EBD) Study led by PHE. No data will be shared with PHE.

None of the collaborating teams or funders listed in this application will have access to the data. The full research database will be accessible only to researchers who are substantive employees within UCL. Data access will not be granted to researchers who do not have UCL substantive contracts.

The four research studies are as follows:

Study 1. Population-based indicators of healthy life expectancy related to COPD.
A study focused on COPD will develop methods to use HES and death registration records to produce population health indicators to assess variations in healthy life expectancy. Hospital and death records provide survival data and information that can be used to predict patient-reported health and disability status. Focusing on COPD as an exemplar condition, the study will use longitudinal HES-mortality data for patients with any diagnoses related to COPD to estimate ‘population at risk’ statistics based on patterns of attendance and survival. Researchers will use this approach to infer the size of the population at risk of severe degradation of their health leading to emergency admission or death. The study will estimate the size of populations at risk of an acute exacerbation broken down into age, sex, ethnicity and either local authority district or main treatment hospital catchment. Estimates of population-at-risk sizes for COPD/other related conditions (heart failure, asthma) may be used to inform health organisations across England of who may benefit from a health intervention for an ambulatory care-sensitive condition to improve service delivery to help people recover from poor health or stay in good health longer.

The study will include all adult patients (aged 18y+ from 01.04.1997) with at least one of the following: (a) an inpatient/day case hospital emergency admission with diagnosis codes J40, J41, J42, J43, J44 or J47 (COPD) or J45 (asthma) or I11, I25, I42, I50 (heart failure) (b) registered cause of deaths J40-J44, J47, I11, I20-I25, I42, I50. It will include the Admitted Patient Care, the Out Patients and the Critical Care HES data. It will include all deaths both linked and unlinked.
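
The diagnosis-code part of this cohort definition can be sketched as below. The code lists are taken from the paragraph above; the function names and the normalisation of codes to their three-character ICD-10 category are assumptions made for illustration. The sketch does not check age, admission date or whether an admission was an emergency.

```python
# ICD-10 three-character categories taken from the cohort definition above.
ADMISSION_CODES = {
    "COPD": {"J40", "J41", "J42", "J43", "J44", "J47"},
    "asthma": {"J45"},
    "heart failure": {"I11", "I25", "I42", "I50"},
}
DEATH_CODES = (
    {"J40", "J41", "J42", "J43", "J44", "J47", "I11", "I42", "I50"}
    | {f"I{n}" for n in range(20, 26)}  # I20-I25
)


def category(code):
    """Reduce a code such as 'J44.1' or 'j441' to its 3-character category."""
    return code.strip().upper().replace(".", "")[:3]


def admission_conditions(diagnosis_codes):
    """Return the condition groups matched by an admission's diagnosis codes."""
    cats = {category(c) for c in diagnosis_codes}
    return {name for name, codes in ADMISSION_CODES.items() if cats & codes}


def death_qualifies(cause_codes):
    """True if any registered cause of death falls in the listed categories."""
    return any(category(c) in DEATH_CODES for c in cause_codes)
```

Note that J45 (asthma) qualifies an admission but is not in the list of qualifying causes of death, which mirrors the wording of the definition.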

Funder: HDR-UK.

Study 2. Reproductive health
Pregnancy and admission for delivery is an opportunity for health services to identify and address underlying chronic health, pregnancy complications and psychosocial needs in mothers and to plan future care for the mother and child. Maternal mortality during delivery is rare, but mortality in the long-term can be high for some groups (e.g. drug or alcohol misusing pregnant mothers). Likewise, adverse birth outcomes such as very pre-term birth and congenital malformations may impact on the future health and life expectancy of the child and impact maternal health. Better information on healthy life expectancy, including expected disability years of life for mothers, children and young people could inform more proactive healthcare with benefits to the NHS. This study will focus on mortality up to 15 years after delivery for mothers with high risk characteristics (e.g. underlying chronic conditions and/or psychosocial needs) and children with risk factors at birth (e.g. preterm birth, congenital malformations) compared with unaffected populations. Longitudinal HES records will provide a measure of onset of conditions/procedures (e.g. diagnosis of cerebral palsy or epilepsy in children, admissions for mental illness in mothers). UCL will use published external evidence on health states to estimate years lived with disability.

Children and young people aged < 25 years (i.e. the oldest one with follow up would be 45 – and aged 29.9 in 1997). All women with any codes indicating a live or stillbirth (from 1.4.97 onwards). It will include the Admitted Patient Care, the Out Patients, the Accident and Emergency and the Critical Care HES data. It will include all deaths both linked and unlinked for the children and young people <25 and linked only for the maternities.

Funder: UCL GOS Institute of Child Health and Institute of Women’s Health and Epidemiology and Health Care, HDR-UK and Great Ormond Street Hospital (GOSH) Biomedical Research Centre (BRC).

Study 3. Inequities
Health inequities are unfair, avoidable differences in health that occur across the gradient of social deprivation. Minimal research has examined interactions between ethnicity and social deprivation on the risk of these outcomes. Some groups (e.g. homeless people/drug users) experience much more extreme health inequities. With increasing levels of homelessness and alcohol- and drug-related deaths it is important that the public response to health inequities encompasses the gradient across all groups and more extreme health inequities. Recent research has demonstrated the importance of overlapping risk factors for extreme health inequalities including homelessness and drug use – an area UCL are calling inclusion health. Estimating the extent and nature of hospital contact for inclusion health populations is fundamental to understanding the need for preventive services in secondary care. In the most comprehensive assessment of NHS homeless hospital care utilisation to date, the Department of Health estimated need using the ‘No Fixed Abode’ (NFA) code in Hospital Episodes Statistics (HES) data. NFA is a proxy indicator for single people sleeping rough or in a hostel. This work is over 10 years old and UCL will address the recent lack of research delineating the impact of such extreme social exclusion on healthy life expectancy and access to health care. Current population health assessments guiding policy and practice do not address the extremes of social exclusion or interactions between social deprivation and exclusion, ethnicity and health. This hinders the development of targeted approaches to prevention and uptake of services to address inequities in healthy life expectancy. The researchers will address this by comparing key socially excluded groups (homeless people, injecting drug users) and ethnic minority groups against the general population across different strata of social deprivation. 
Looking at this issue across deprivation categories and including extreme exclusion will allow researchers to identify the need and rationale for prevention opportunities in hospital settings for all deprived groups.

Include all patients (aged 18y+ from 01.04.1997 onwards) with at least one of the following: a) Substance use disorders - all patients with diagnosis codes: SUD: F11 F14 b) Homeless – all patients with any use of code: Z59 c) Homeless: No Fixed Abode marker used for discharge and postcode of patient, or in any other field d) Ethnic groups: all HES records for all ethnic groups, e) IMD groups: all records for all IMD quintiles. Will include Admitted Patient Care, Out Patients, Critical Care, and Accident and Emergency HES data and all death records linked and unlinked.

Funders: Expected component of forthcoming UK-Prevention Research Partnership research application, NIHR PhD Fellowship (Luchenski), Health Data Research UK, NIHR CLAHRC North Thames.

Study 4. Multimorbidity: detecting high risk clusters
The number of people living with multiple diseases is increasing. The Academy of Medical Sciences report ‘Multimorbidity: a priority for global health research’ recently listed numerous evidence gaps in multimorbidity research and called for more research on the scale and nature of multimorbidity. The first aim is to answer research priority 5 from the report: “What strategies are best able to maximise the benefits and limit the risks of treatment among patients with multimorbidity?”. To do this, the researchers need first to address research priority 1: “What are the trends and patterns in multimorbidity?”. The researchers will (a) identify the most common multimorbidity clusters at the population level using cluster and network analysis (b) follow changes in these clusters over the lifecourse (c) distinguish clusters which are associated with functional deficits, disability, or mortality. The second aim is to develop tools to determine which diseases cluster together more often than expected by chance.
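
To illustrate what "more often than expected by chance" can mean for a pair of diseases, the sketch below compares the observed number of patients with both conditions against the number expected if the two occurred independently. This is a simple independence baseline assumed for illustration, not the cluster and network analysis the study proposes; the function name and input layout are hypothetical.

```python
from collections import Counter
from itertools import combinations


def observed_expected(patients):
    """Return {(a, b): (observed, expected)} for every co-occurring pair.

    `patients` is a list of sets of condition labels, one set per patient.
    Expected counts assume the two conditions occur independently.
    """
    n = len(patients)
    single = Counter()
    pair = Counter()
    for conditions in patients:
        single.update(conditions)
        pair.update(combinations(sorted(conditions), 2))
    return {
        (a, b): (obs, single[a] * single[b] / n)
        for (a, b), obs in pair.items()
    }
```

A pair whose observed count is well above its expected count is a candidate for a cluster, although a formal test and adjustment for age and other factors would be needed before drawing conclusions.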

Include all patients with a HES admission or death registration record, in all age groups from 1.4.97 onwards. Will include Admitted Patient Care, Out Patients, and Critical Care HES data and all death records linked and unlinked.

Funders: Wellcome Trust clinical PhD studentship; Rutherford Fellowship-MRC, UCLH BRC, HDR-UK.

Expected Benefits:

The programme of research will facilitate the conduct of clinically relevant research into variation in healthy life expectancy in children and adults in England. Expected measurable benefits have been outlined under each of the four research themes. More broadly, these benefits are of three kinds:

(i) Methodological benefits in improving the potential uses of HES and mortality data: UCL will evaluate new ways of analysing hospital data cohorts based on inception events such as occurrence of a first diagnosis or admission, and approaches for defining specific conditions and comparators. These benefits will be disseminated to the research community through research publications. In addition, UCL has planned very specific research in statistical inference and modelling that will enable the production of new public health indicators (e.g. study 1). Widening the range, frequency and quality of public health indicators will benefit a range of statistical and health organisations in monitoring the health of the population, shaping policy, and allocating resources. UCL will ensure these benefits are delivered by engaging with NHS bodies (e.g. Public Health England) and NHS Digital’s Methodological Review Panel to seek peer review and examine opportunities for implementation in statistics production. UCL will also engage with patients early on to generate evidence around what they understand from survival and healthy life expectancy estimates, and whether it can support them in making individual care decisions. Public engagement is relevant to all 4 studies.

(ii) There will be benefits to NHS systems and services nationally and patient outcomes across a wide range of clinical disease areas, including, but not limited to, cardiovascular diseases, cancers, renal diseases, respiratory diseases and mental health in adulthood. In childhood, priorities will include mortality in early childhood compared with other developed countries (recognised as a priority for the NHS), outcomes for children with rare or chronic conditions, vulnerable adolescents (e.g. those admitted for injury related to self-harm, drug or alcohol use or violence), and the impact of transition from paediatric to adult services on service use and healthy life expectancy. The range (richness) and breadth (from 1997 to latest available) of data will provide information on healthy life expectancy across the age range from childhood to early adulthood, and from early adulthood to middle and old age, and for a range of conditions, comorbidities and patient circumstances defined by area indicators (e.g. socioeconomic status, or local authority) or indicators of vulnerability (e.g. ethnic minorities, the homeless). The information from the programme can be used by policy makers and service commissioners to address disparities in healthy life expectancy.

(iii) Findings from the 4 studies on healthy life expectancy will shape policy by feeding into national programmes. There is potential to increase the quality and breadth of risk factors examined as part of the Global Burden of Disease study in collaboration with Public Health England. UCL researchers currently contributing to the Global and English Burden of Disease studies will apply methodological results from study 1 to propose methodological improvements to the studies. This will result in higher quality disease prevalence and health burden information required to better inform the prioritisation of health policy across a widened range of risk factors than is currently possible. For more specific diseases and health care procedures, highlighting variation in outcomes and services for rare diseases and for maternal and child health and health inequalities will inform the provision of health care by determining which patient groups are most at risk of poor healthy life expectancy, how outcomes vary across regions and hospital trusts, and where risk assessment and interventions should be targeted.

Benefits of the four studies include:

Study 1. Population-based indicators of healthy life expectancy related to COPD.
The investigators will widen the range of uses researchers can make of hospital data in the UK, taking advantage of the uniquely high coverage of HES compared to similar hospital datasets internationally. They will produce working papers that are directly relevant to the NHS and to PHE English Burden of Disease estimates. Production of new outputs on population-at-risk size estimates, disability-adjusted life years and healthy life years lost can produce better and more disaggregated information, enabling clinicians and public health professionals to improve the way they prioritise and target health interventions e.g. production of estimates of the size of populations at risk of an acute exacerbation of COPD for small groups and catchment areas will enable acute trusts, commissioners and organisations such as NICE to measure severity of illness in groups that would benefit from interventions such as smoking cessation or winter pneumococcal immunisation.

Study 2. Reproductive health
Findings will inform targeting of interventions by obstetric and child health services and public health policy for families where the mother and/or child is at high risk of adverse outcomes and reduced healthy life expectancy.

Study 3. Inequities
This study will generate information describing how HES-mortality data can be used to inform the development of health promotion and health service responses to health inequities at national and local levels. It will inform resource allocation, evaluation of interventions to address health inequities and monitoring of local authority and health service performance in relation to health.

This work will draw together evidence of needs and opportunities for hospital teams to implement evidence-based prevention for deprived patients. By providing evidence, access to prevention services can be improved, and with sufficient scale-up, contribute to a reduction in preventable disease, inequalities, and NHS costs.

Study 4. Multimorbidity: detecting high risk clusters
UCL researchers will create tools to identify patients with multiple morbidities that co-occur more often than expected by chance to develop preventive strategies targeted at patients in “high-risk” clusters. These tools can inform practice and policy and the selection of patients for inclusion in randomised controlled trials (RCTs). Currently, RCTs frequently exclude patients with multimorbidity and there is an urgent need to determine benefits and harms of interventions in this group and to identify high risk patients with precursor conditions that put them at risk of chronic disability or mortality who might benefit from early intervention.

Outputs:

Knowledge resulting from this programme will be communicated via HDR-UK meetings and other mechanisms, including publication in high impact journals, presentations at key conferences and events, and teaching and training activities.

Publications and reports include:

Multiple reports will be produced for publication for each of the four studies. These reports will be disseminated at different stages. Initial outputs will be preliminary reports to promote discussion and critique by clinicians, service providers, public and scientists. Feedback from these discussions will then inform peer reviewed and other publications.

By 12 months: Preliminary analyses for specific studies to develop the four studies, for example, reporting on the development of methods for phenotyping clinical conditions and computational methods, will be presented at a series of meetings within one year of receipt of the data, including Faculty or Institute seminars open to UCL staff, academics outside of UCL and NHS clinicians, clinical/health informatics interface meetings; CLAHRC/UCL Partner activities; and HDR-UK seminars and conferences. These fora will provide opportunities for discussion of preliminary results throughout the programme.

By 24 months: The applicants will submit abstracts for presentation of early findings from the 4 studies at key national and international clinical and data linkage conferences, e.g. Informatics for Health, the International Population Data Linkage conference, Medical Informatics Europe and Public Health Informatics. The applicants will feed back findings from individual studies to NHS clinicians through their involvement in Biomedical Research Centres at GOSH and UCL Hospital, through UCL Partners and links to AHSNs and applied health research networks (i.e. CLAHRCs) across HDR-UK London, through working with policy makers through four policy research units based at UCL, and through presentations, meetings and dissemination of working papers through HDR-UK.

Tools created include:

Within 24 months: New phenotypes, analytic scripts, tools and algorithms developed as part of the programme will be published on the UCL Institute of Health Informatics data portal, a resource made available for researchers to promote the transparent and scalable use of linked health data for research and benefits realisation for the NHS.

By 36 months, UCL will have produced papers for publication on all four studies and published tools on the website of UCL Institute. These working papers will be presented to relevant NHS bodies such as Public Health England, and NHS Digital’s methodological review panels as well as through HDR-UK. Each study will be expected to have at least one research report submitted for publication in key scientific journals by 36 months after the receipt of data. UCL will publish papers in high impact, open access scientific journals such as the Lancet, PLoS One, BMJ Open and Arch Dis Child, Heart, PlosMed, Journal of Public Health. Reports of findings will be shared with funders e.g. NIHR, MRC, and other stakeholders as appropriate. UCL will publish summaries of findings in newsletters disseminated through the IHI website, the NIHR CLAHRC, relevant clinical groups within UCL Partners, and to NHS organisations such as Public Health England, NHS Digital, Department of Health and NHS England.

Information for the public includes:

From the outset, a public list of approved protocols and publications (with lay summaries) will be maintained on the UCL Institute of Health Informatics website via the UCL Faculty of Population Health Sciences. Findings will also be shared with patient groups (e.g. National Children’s Bureau, Generation R, Great Ormond Street Hospital (GOSH) young people advisory group) and UCLH ‘About Me’ public engagement group.

It is anticipated that the public will be involved in these research studies from the outset. Case study examples will be developed to show how patient health data is used by researchers to inform decisions to improve health, with examples of how data are handled, as part of the applicants' mission to promote understanding of the use of health data among the public (see above).

All outputs and publications contain only aggregated data with small numbers suppressed in line with the HES Analysis Guide.

Outputs specific to the 4 research studies:

Study 1. Population-based indicators of healthy life expectancy related to COPD.
Methodological working papers will be disseminated to the RCP, PHE, NHS Digital’s Methodology Review Panel and NICE and to clinicians through the North Thames CLAHRC. This knowledge transfer exercise would serve to cross-examine data quality questions before working papers are submitted for publication. Further engagement with NHS Digital and the Government Statistical Service could lead to the development of new official statistics methodology.

Study 2. Reproductive health
The study will initially describe variation in maternal healthy life expectancy following pregnancy outcome, over time, and by area, age and specific risk groups. Healthy life expectancy for children will be reported for specific high-risk groups (e.g. according to gestational age at birth, chronic conditions or congenital malformations). Research papers will report results of clustering of maternal and child morbidity and discuss relevance for health interventions to families, mothers and children.

Study 3. Inequities
Outputs of this research will include development of a ‘Toolkit’ report to translate research findings into prevention practice and online resources, working in collaboration with Pathway (national and local teams), The Faculty for Homeless and Inclusion Health and Pathway, University College London Hospital (UCLH), and collaborators from Health Data Research UK and the UCLH NIHR Biomedical Research Centre (BRC). Findings will be published in research papers describing the inequities. The research will be used to develop a series of indicators to monitor health inequities in these populations at national and local level. Developments will be fed back to health organisations (e.g. PHE, NHSD, NHS England) to consider for their own public health statistics production. By providing burden estimates specific to these vulnerable populations, the work will also contribute to the GBD Estimates, which are highly influential in guiding policy.

Study 4. Multimorbidity: detecting high risk clusters
This study will generate tools to inform clinicians, healthcare providers, and researchers of the components and progression of high-risk multimorbidity clusters. These tools will be made available on the UCL IHI website, shared with relevant organisations (e.g. PHE, NHSD, commissioning groups), and submitted for publication. The study will generate information on multimorbidity prevalence rates and associated risk factors for multimorbidity clusters at high risk of mortality or reduced healthy life expectancy. Findings will be relevant to policy and service provision and for developing trials (e.g. relevant to industry and NIHR).

Processing:

UCL are requesting month of death (date for infants) and causes of death. This level of detail is essential to be able to use death registration to identify morbidities in people who die and to determine age at death and time from key events such as birth or first diagnosis. Timing of death is also important to examine changes in external factors such as flu season, environment (e.g. temperature), or services. The studies also require unlinked death registrations to identify individuals with conditions of interest who were not seen in hospital or whose links to HES were missed.

UCL also require hospital spell number and augmented care period local ID to produce rates of health care utilisation (emergency admissions, elective admissions, 30-day relapse admission, critical care admission) for the specific cohorts defined from an inception event such as a first diagnosis or a first admission.

UCL are requesting data since 1997 (a) to enable the inclusion of sufficient numbers of patients in cohorts for less common conditions, for complex conditions or combinations of conditions; (b) to gain certainty regarding when first diagnoses/admissions took place (which requires several years of data prior to the first diagnosis); (c) to allow adequate follow-up periods: cohort studies on respiratory disease or diabetes usually require 15 years of follow-up when it comes to mortality; (d) to use past admissions to characterise underlying risk factors (e.g. birth characteristics or prior admission with a chronic condition); (e) to examine evidence of changes over time and across birth/diagnosis cohorts in coding and hospitalisation practices; (f) to document changes over time in healthy life expectancy.

Data extracts are transferred from NHS Digital to the UCL Data Safe Haven by a named individual within the Institute of Health Informatics (IHI) using Secure Electronic File Transfer in the form of de-identified extracts on a yearly basis. They come with encrypted pseudonymised identifiers (pseudonymised HESID) which index records relating to any given patient.

UCL Institute of Health Informatics data scientists manage the dataset within the UCL Data Safe Haven and enrich the data with suitable metadata:
• NHS Classifications ICD-10 and OPCS-4 tables,
• Organisational Data Service tables referencing NHS trusts and hospitals,
• Data Quality Maturity Index information from NHS Digital reports,
• Geographical coordinates of census Lower layer Super Output Areas (LSOAs) and their relation with other health and administrative geographical units,
• Area-level aggregate characteristics for LSOAs and larger geographical units, including environmental exposure, indicators of the physical, social and built environment such as climate, deprivation, health prevalence or housing density.

Linkage with data other than these area- or organisation-level data is not allowed. There will be no linkage to other record level datasets.

Extracts will be prepared for the 4 studies to minimise the amount of data seen by researchers. For example, the data scientist will provide extracts related to relevant age ranges, or conditions, as required for the study.

Data scientists at UCL will use standardised data cleaning methods, algorithms and coding clusters for defining cohorts, exposures and outcomes, so that healthy life expectancy can be compared across different age groups and different condition-specific or age-related cohorts. Methods for defining phenotypes or condition-specific cohorts, and their validation across age and condition groups, constitute considerable added value brought by UCL to the HES-mortality dataset. Researchers will undertake univariate and multivariate analyses using standard statistical packages (R and Stata) to determine associations between a range of demographic, clinical and environmental factors (e.g. season, area) and life expectancy, constructing relevant control groups and making appropriate adjustments for confounders. HES-mortality data will be used to determine a range of clinical outcomes, to estimate health states and determine cause-specific mortality, as part of analyses of healthy life expectancy.

No record-level data will be exported outside the Data Safe Haven and no data will be shared with any third-party organisation or user. No aggregate data with small cell size in breach of NHS Digital requirements will be exported from the UCL Data Safe Haven.

Aggregate data outputs for export can only be exported through the established disclosure control procedure by a data scientist or PI. Data scientists authorised to export aggregate outputs will control outputs by scrutinising aggregate tables and figures to assess whether the outputs meet the following requirements:
• the HES analysis guide
• the ICO Anonymisation Code of Practice
• the Anonymisation Standard for Publishing Health and Social Care Data Specification (ISB1523)
• the ONS Disclosure control guidance for birth and death statistics
• the SLMS Health Informatics – Pseudonymisation ISO/TS 25237:2008 Overview for audit purposes
Outputs deemed to have an excessive re-identification risk will not be released and will instead be returned to the researcher for further processing.

Authorised researchers will hold a substantive contract with UCL. Researchers analysing the data will only have access to de-identified data. Only designated staff within the UCL Data Safe Haven who are responsible for managing the data (cleaning, validating, downloading and extracting data) will have access to the full dataset across all years. The data requested will only be used for the purposes described in this application.

Principal investigators of each of the four studies listed will be responsible for monitoring their respective study. All staff working with the data will undergo mandatory annual training in information governance run by the UCL Information Services Division (ISD), who maintain a log of course attendees.

Data processing for the four research studies:

Study 1. Population-based indicators of healthy life expectancy related to COPD.
This study will involve a combination of life table modelling and survival analysis, which require follow-up until death and hence require past HES records and 10 to 15 years of follow-up. UCL therefore require HES-Civil Registration data and unlinked deaths from April 1997 onwards for adults aged 18 years or more with specific ICD-10 codes relating to COPD, asthma or heart failure.

Study 2. Reproductive health
Data comprising the full HES record are required for all women with indicators of a live or stillbirth and for children (<25y) recorded in HES from 1997 until the most recent data available. This is to build cohort life tables describing changes in mortality across the period by age group and across clinical conditions, early risk factors (e.g. preterm birth or young age at onset of chronic conditions) and demographic risk factors. Information from HES-mortality data will be used to model the relationship between deprivation, ethnicity and specific conditions or comorbidities with mortality to determine how healthy life expectancy is changing over time. Causes of death are required to characterise clustering of health events and underlying conditions related to death. Date of death is required for infants <365 days old as age at death is strongly associated with pre- and postnatal causal factors in infancy. Unlinked deaths (i.e. death registration records not linked to HES) are required to minimise biases due to the substantial proportion of children and young people whose death is not linked to a hospital record. Information on area of residence, hospital trust and GP practice will be used to identify clustering of comorbidity and mortality, for mothers, children and within mother-child pairs. These results will be relevant to service commissioning for maternity, child health and primary care.

Study 3. Inequities
HES data enable exploration of the social gradient using the Index of Multiple Deprivation based on postcode of residence. Ethnicity codes enable identification of how social deprivation interacts with ethnicity to impact on healthy life expectancy. Groups experiencing more extreme disadvantage can be harder to identify but specific codes related to substance use disorders and homelessness as well as administrative codes such as “No fixed abode” can support this. HES and death registration data will be used to develop measures of healthy life expectancy and measures of avoidable mortality. UCL requires all HES records for adults aged 18 years and above linked to death records (including month and causes of death). UCL also requires unlinked death registrations to identify individuals who were not seen in hospital or whose links to HES were missed. UCL requires data from April 1997 to date to capture past history and long-term follow-up to measure morbidities and mortality and produce results on reductions/exacerbation of health inequities since 2001, when the NHS was officially mandated to address health inequities. Local authority of residence codes, CCG, GP practice and trust codes are needed to produce local level performance indicators and to link to spending measures via the Spend and Outcome Tool.

Study 4. Multimorbidity: detecting high risk clusters
UCL will use The Academy of Medical Sciences definition of multimorbidity as the co-existence of two or more of the following: non-communicable long-term conditions, mental health disorders, or chronic infections, with details of functional deficits or disabilities, frailty, states of poor health such as obesity, or health-related behaviours such as alcohol misuse. The case definition for a long-term condition will be based on codelists generated using HES diagnoses and procedural codes, and ICD codes recorded at death registration. Mortality data will be obtained from the death registration to estimate 5, 10, 15, and 20-year mortality risks. Cluster analysis will be applied to patient-disease matrices to identify disease clusters by age and sex. Comorbidity measures such as the Pearson correlation coefficient and the relative risk ratio, which compare observed with expected co-occurrence of diseases, will form the basis of network analysis methods which will investigate the correlation between pairs of diseases. Researchers will apply similar methods across the adult age range. UCL require HES admission data linked to mortality records and unlinked deaths from April 1997 onwards for adults (18y+). UCL will further minimise the data request by restricting to admission data and mortality only (i.e. no A&E or OP data) and by restricting to month of death.
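
As a rough illustration of the two pairwise comorbidity measures named above, the sketch below computes the relative risk (observed over expected co-occurrence) and the Pearson correlation for one pair of diseases, where the Pearson correlation of two binary indicators is the phi coefficient. The function name, argument names and counts are hypothetical and are not taken from the study; the actual codelists and cohort definitions are not specified here.

```python
import math

def comorbidity_measures(n_total, n_a, n_b, n_both):
    """Return (relative_risk, phi) for a hypothetical disease pair A and B.

    n_total: patients in the cohort
    n_a, n_b: patients with disease A / with disease B
    n_both: patients with both diseases
    """
    # Expected number with both diseases if A and B occurred independently.
    expected = n_a * n_b / n_total
    relative_risk = n_both / expected
    # Pearson correlation of two binary indicators (phi coefficient).
    numerator = n_both * n_total - n_a * n_b
    denominator = math.sqrt(n_a * n_b * (n_total - n_a) * (n_total - n_b))
    phi = numerator / denominator
    return relative_risk, phi
```

A relative risk above 1 (or a positive correlation) indicates that the pair co-occurs more often than expected by chance; a network analysis would typically keep such pairs as edges between disease nodes.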

Project 89 — DARS-NIC-221454-Z7R2K

Opt outs honoured: No - consent provided by participants of research study (Excuses: Consent (Reasonable Expectation))

Legal basis: Health and Social Care Act 2012 – s261(2)(c)

Purposes: ()

Sensitive: Sensitive

When:2018.10 — 2018.12. SMLS reported a DPA serious incident; breached contract — audit report.

Access method: One-Off

Data-controller type:

Sublicensing allowed:

AGD/predecessor discussions: igard-minutes-1-november-2018---final.pdf

Datasets:

MRIS - List Cleaning Report

Type of data:

Objectives:

The data controller, which also processes data for this study, is University College London (UCL). Other organisations have been involved in the wider project as data collectors: these are 28 palliative care services (i.e. hospices, hospital support teams and community services) in England and Wales, which have recruited patients to the study. The Prognosis in Palliative care Study II (PiPS2) was developed in response to a commissioned call for research by the National Institute for Health Research (NIHR) Health Technology Assessment (HTA) programme. The NIHR is the funder of the research.

The Priment Clinical Trials Unit is part of UCL and made up of the Research Department of Primary Care and Population Health, Division of Psychiatry and Department of Statistical Science.

Systematic identification of patients approaching the “end-of-life” is a key recommendation of the Department of Health end-of-life care strategy. The Gold Standards Framework (GSF) service improvement programme (widely used in general practice, nursing homes and increasingly in acute hospitals) uses a needs-based coding system dependent upon whether patients are expected to live for “days”, “weeks”, “months” or “years”. However, many patients who would potentially benefit from inclusion in such programs are currently unidentified by clinicians. Improved prognostication would benefit patients and their carers by providing them with better quality information to inform their choices about future care. Improved prognostication would also help clinicians to plan services and to ensure that patients are cared for in the most appropriate environment and with the most appropriate treatments.

The PiPS A&B prognostic models were developed (by members of the research team) in a previous cohort study (Gwilliam et al BMJ 2011). In the original study the PiPS A&B models were able to categorise patients into three prognostic groups: those with a survival of “days”, “weeks” or “months+”. However, before PiPS-A and PiPS-B can be recommended for clinical use they need to be validated in an independent cohort of patients. As described in the Commissioning Brief issued by the NIHR, the overall purpose of this research is “the validation of models of survival to improve prognostication in advanced cancer care to include the Prognosis in Palliative care Study (PiPS) predictor models”. The primary aim is to validate PiPS-A&B and to compare PiPS-B against clinicians’ predictions of survival. Secondary aims are to validate four other prognostic tools: the Palliative Prognostic Score (PaP); the Feliu Prognostic Nomogram (FPN); the Palliative Prognostic Index (PPI); and the Palliative Performance Scale (PPS).

PiPS2 is a multi-site, prospective, cohort study in palliative care patients with advanced incurable cancer. The cohort consists of consecutive referrals to participating palliative care services who fulfilled the inclusion and exclusion criteria of the study.

The inclusion criteria for the study were:
a) Patients who have been recently referred (or re-referred) to palliative care service. For inpatient palliative care patients (including hospital support teams), “recent” referral means that the patient should have been first seen by a member of the palliative care team no more than 7-days previously. For a community, day-hospice or palliative care outpatient, “recent” means that the patient should have had fewer than three previous contacts with the palliative care service before they are recruited to the study.
b) Patients with locally advanced or metastatic, incurable cancer.
c) Aged 18 years or over.
d) Sufficient English language skills for patients with capacity to understand study literature and undertake study assessments. Whenever possible translation services will be used to maximise the potential for patients or carers to give consent/agreement to participate.

The exclusion criterion for the study was:
a) Currently receiving (or planned to receive) treatment with curative intent.

Participants have been recruited from hospices, hospitals and community palliative care services across England and Wales.

Due to the nature of the cohort being asked to take part in this study there will be instances when the individual will be too unwell to decide for themselves whether they wish to consent to take part in the study. In this situation a family member or friend may be willing to act as a consultee of the individual and provide their view on whether the individual would wish to take part in the study (Personal Consultee form). Where an individual (who does not have the capacity to decide for themselves) does not have a family member or friend who is willing to act as a personal consultee then a nominated consultee may be asked to give their view of whether the individual would wish to take part in the study; this may be a carer or other care professional (Nominated Consultee form). Both personal consultees and nominated consultees are to be asked only to give the views and interests on behalf of the individual. If the individual (who does not have the capacity to decide for themselves) still states that they do not consent to take part in the study then the individual would not be included in this study. All individuals with capacity to decide for themselves will be asked for their consent as per the patient consent form.

The dates of death are an essential part of the study. There is no alternative, less intrusive way, of obtaining the dates of death other than by requesting them from NHS Digital. The dates of death are essential to achieving the aim of the study because without this information it will not be possible to calculate the length of time for which patients survived nor the accuracy of the prognostic tools.

Expected Benefits:

Obtaining the dates of death from NHS Digital will enable UCL to validate the prognostic tools which are being evaluated in the PiPS2 study. The first results of the study will be available by May 2019. The outputs will achieve the purpose of the research because they will be the way in which the results of the study are communicated to patients, professionals, the public and policy makers.

If the prognostic scores being studied in this research can be shown to be accurate and reliable then non-specialist clinicians will be able to use this method to identify patients entering the last few days, weeks or months of life. The results will be communicated to the clinicians by publications in peer-reviewed journals and presentations at conferences. Marie Curie may also produce and disseminate briefings on the results to clinicians.

At present patients are included on "end-of-life" electronic communication systems (such as Coordinate My Care) only when clinicians deem them to be approaching the end of life. If the PiPS tools are found to be more accurate than clinicians’ estimates then this will facilitate access to specialist services, inclusion of patients on "end-of-life" electronic communication systems and will aid the identification of patients approaching the end of life. Prognostic scores could also facilitate comparison of services by more accurately describing the case-mix of referrals. Depending on the results of the study, Marie Curie may produce policy briefings in order to influence key decision-makers.

Studies show that patients, carers and clinicians all value accurate prognostic information. The number of elderly patients with advanced cancer is anticipated to increase substantially over the next twenty years. The results of the research are therefore highly likely to remain relevant and important to NHS needs in the future. Improved prognostication is expected to lead to better quality care for the 160 000 patients who die from cancer in the UK every year. More accurate survival predictions are likely to result in fewer hospital deaths & better utilisation of NHS services.

Successful validation of the PiPS would provide both the NHS and the wider healthcare community (particularly the independent hospice sector) with a reliable and objective prognostic indicator. PiPS scores could then be used as supportive information on applications for hospice admissions, for referrals for "fast-track" continuing care funding and for identification of patients approaching the terminal stages of their illness (e.g. identifying patients suitable for end-of-life care plans or supporting information for patients with electronic palliative care communication records). Palliative care research is largely dependent on patient inclusion on the basis of clinician estimates of survival. An objective method of estimating survival could be used as an inclusion criterion for future palliative care studies and would greatly improve the scientific quality of much research in this area by providing clearer sampling characteristics.

The PiPS tools will be free to use and easy to access and it is envisaged that further development will result in applications (apps) that could also be used on hand-held devices to provide convenient access for healthcare professionals in any setting.

Improved prognostication will help patients by allowing them sufficient time to prepare for the ends of their lives and to ensure that advance care plans (if so desired) are put in place. It is likely to increase the probability that patients will die in their preferred place of care. The PiPS instruments will also benefit families by providing more accurate information and by allowing loved ones to make timely plans. The prognostic scores will help clinicians and health care providers by providing an independent and objective assessment of prognosis. This will improve the utilisation of end-of-life services and will minimise "late referrals" for patients entering the terminal phase of their illness.

As highlighted in the 2013 Neuberger report into the shortcomings of the Liverpool Care Pathway (More Care, Less Pathway), a key research priority for the NHS is to determine the best ways to communicate uncertainty to patients & families about prognostic estimates. The study will explore with patients & carers the type and extent of prognostic information that they require and the best (and most sensitive) way to present this to them.

Outputs:

The PiPS2 study will result in a number of outputs. The data in the published outputs will be aggregate data about the participants (diagnoses, age distribution, gender) and the performance of the various prognostic tools (their accuracy, calibration and discrimination). Outputs will include publications in peer-reviewed journals and presentations at academic conferences. No specific conferences have yet been identified but likely venues include the European Association of Palliative Care Scientific Congress or the Multinational Association of Supportive Care conference.

The results will also be fed-back to participating centres and to any clinicians, relatives or surviving patients who express an interest in receiving the results. The original development study of the PiPS attracted considerable media attention and was covered by print, on-line and broadcast media. It is anticipated that successful validation of the prognostic score will be of similar media interest and this will help to disseminate the research findings beyond the academic community. The Marie Curie Palliative Care Research Department has experience with disseminating research findings via Twitter and other social media. Marie Curie also has a policy department that can help to work with policy makers to implement results into clinical practice. The patient representative members of the research team will be asked to help disseminate the study results via patient groups, conferences and co-authorships. A PiPS prognosticator web-site will be developed so that the prognosticator can be more easily accessed by health care providers. The validated prognostic tools may also be made available as a resource via the End of Life Care Intelligence Network web-site at Public Health England (www.endoflifecareintelligence.org.uk). The National End of Life Care Intelligence Network is part of Public Health England and it aims to improve the collection and analysis of information related to the quality, volume and costs of care provided to adults approaching the end of life.

The first publication is expected in May 2019 (Report to HTA). The main findings will hopefully be published later in 2019 (with a probable aim for the BMJ since this is where the original paper describing the development of the PiPS prognostic tool was published). Subsequent publications will be submitted in 2020 - 2021 and will concern sub-analyses of the study data set (including qualitative research findings). Publications will acknowledge NHS Digital as a data source.

Dates of death will not appear in any publications or outputs. Only lengths of survival will be used in analyses.

Processing:

University College London (UCL) will send the cohort information to NHS Digital. This consists of direct patient identifiers: the study participants’ names, dates of birth, NHS numbers, gender, addresses and postcodes. NHS Digital will disseminate the one-off requested data for the cohort to UCL.

The disseminated data will be received by UCL. No data provided to UCL under its Data Sharing Agreement with NHS Digital will be stored outside of the Data Safe Haven. Data will only be processed by UCL substantive employees.

Data will be linked with participant-level individual data in the cohort (detailed in the paragraph below). The length of time for which each patient survived will be calculated as the difference between the date on which the patient was enrolled into the study and the date on which they died. The length of survival (not the actual date of death) for each participant will then be added to the main data set. The dates of death themselves will not be transferred to the main data set and will not be shared outside of the research team. The Date of Death dataset will only be stored for the duration of the data sharing agreement with NHS Digital, after which it will be securely destroyed and a certificate of destruction will be provided to NHS Digital.
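
The derivation described above can be sketched as follows. This is a minimal illustration only: the function names, the record layout and the field names are hypothetical, and the sketch assumes that survival is counted in whole days from enrolment.

```python
from datetime import date

def survival_days(enrolment_date, death_date):
    """Days survived from enrolment to death; None if no death is recorded."""
    if death_date is None:
        return None
    days = (death_date - enrolment_date).days
    if days < 0:
        raise ValueError("death date precedes enrolment date")
    return days

def build_main_record(participant_id, enrolment_date, death_date):
    # Only the derived length of survival is kept in the main data set;
    # the date of death itself is not copied across.
    return {
        "participant_id": participant_id,
        "survival_days": survival_days(enrolment_date, death_date),
    }
```

For example, a participant enrolled on 1 January 2018 who died on 15 January 2018 would be stored with a survival of 14 days and no date of death.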

The other study data consists of assessments by clinicians (age, gender, measures of performance status, observer-rated global health status, abbreviated mental test scores, pulse rate, clinician predictions of survival and estimated time since diagnosed with a terminal disease); blood results (white blood count, lymphocyte count, neutrophil count, platelet count, albumin, alkaline phosphatase, alanine transaminase, c-reactive protein, lactate dehydrogenase and urea); clinical signs and symptoms (presence or absence of key symptoms; anorexia, delirium, dysphagia, dyspnoea, fatigue, peripheral oedema, decreased oral intake, weight loss); and measures of disease extent (nature and site of primary and sites of metastases).

The data will be analysed in the following ways:

1. Descriptive analysis

Initially the predictors and the outcome will be summarised using descriptive analysis. Categorical predictors shall be reported as raw numbers and percentages. Reports of continuous variables shall include mean or median and standard deviation or interquartile (IQ) range as appropriate. The percentage of values missing for each predictor will also be presented. The survival times of patients will be summarised using median and IQ ranges and Kaplan Meier graphs.
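
For readers unfamiliar with the Kaplan Meier method mentioned above, the following is a minimal sketch of the estimator in plain Python. It assumes survival times in days and an event flag of 1 for death and 0 for censoring (for example, a patient still alive when follow-up ends); the function name and input layout are hypothetical, and the study itself would use a standard statistical package.

```python
def kaplan_meier(times, events):
    """Return [(time, survival_probability)] at each distinct death time.

    times: follow-up time in days for each patient
    events: 1 if the patient died at that time, 0 if censored
    """
    at_risk = len(times)
    survival = 1.0
    curve = []
    for t in sorted(set(times)):
        deaths = sum(1 for ti, ei in zip(times, events) if ti == t and ei == 1)
        leaving = sum(1 for ti in times if ti == t)
        if deaths > 0:
            # Multiply by the proportion of those at risk who survive time t.
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        # Deaths and censored patients both leave the risk set after time t.
        at_risk -= leaving
    return curve
```

The median survival can be read off the resulting curve as the first time at which the survival probability falls to 0.5 or below.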

2. Primary outcome analysis

2a. Validation of PiPS models

The PiPS models will be validated as they were presented for use in the original study by Gwilliam and co-workers (BMJ 2011). For both PiPS-A and PiPS-B, two separate models have been developed to predict the two week (14 day) and two month (56 day) survival of patients (thus generating three prognostic categories; less than two weeks, two weeks to two months and greater than two months). The week and month models include different sets of predictors. For both models (weeks and months), if the predicted probability of the event exceeded 50% for a patient, then the patient was classified to have the event. Otherwise it was assumed that the patient did not have the event. Thus if, for example, the models predicted that a patient would survive two weeks, but predicted that the patient would die within two months, then the PiPS model outcome would be that the patient was predicted to die in “weeks”.
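
The classification rule above can be illustrated with a short sketch. It assumes that the "event" in each model is survival beyond the cut-off (14 or 56 days) and that a patient is placed in "months+" only when both models predict survival; the text does not say how a contradictory pair of predictions is handled, so that choice, like the function and argument names, is an assumption. The model coefficients themselves are not reproduced here.

```python
def pips_category(p_survive_14_days, p_survive_56_days, threshold=0.5):
    """Combine the two model outputs into "days", "weeks" or "months+"."""
    survives_two_weeks = p_survive_14_days > threshold
    survives_two_months = p_survive_56_days > threshold
    if survives_two_weeks and survives_two_months:
        return "months+"
    if survives_two_weeks:
        # Predicted to survive two weeks but not two months.
        return "weeks"
    # Assumption: any patient not predicted to survive two weeks is "days".
    return "days"
```

With these assumptions, predicted probabilities of 0.8 (two weeks) and 0.3 (two months) give "weeks", matching the worked example in the paragraph above.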

The discriminatory ability of the models will be assessed using the C-statistic. Separate C-statistics will be calculated for the “two weeks” and the “month” models. The C-statistic will be estimated by forming all patient pairs and calculating the proportion of patient pairs where the patient who has the event has the higher predicted value. The PiPS online calculator (www.pips.sgul.ac.uk) provides a prediction as to whether a patient will survive for days, weeks or months. The model calibration will be assessed by comparing the observed and the predicted proportions for each of these categories. The calibration of the prognostic models will be further assessed using the calibration intercept and slope based on a logistic regression model fitted to the validation data using the predicted log-odds as the only predictor. This will also be done separately for the “two weeks” and the “month” models. The calibration intercept and slope, and the C-statistic, will initially be estimated without taking account of potential patient clustering within centres. In a second analysis, these performance measures will be calculated for each centre separately (assuming most centres have a sufficient number of events to allow such calculations) and the estimates pooled across centres using a weighted average. The calibration intercept and slope, and the C-statistic, will be presented as estimates with confidence intervals.
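
The pairwise definition of the C-statistic given above can be sketched directly. The version below forms every pair consisting of one patient with the event and one without, and counts the proportion in which the patient with the event has the higher predicted value. Counting tied predictions as half a concordant pair is a common convention but is an assumption here, as are the function and argument names.

```python
def c_statistic(predicted, outcomes):
    """Proportion of (event, non-event) pairs ranked correctly; ties count 0.5."""
    event_preds = [p for p, y in zip(predicted, outcomes) if y == 1]
    non_event_preds = [p for p, y in zip(predicted, outcomes) if y == 0]
    pairs = len(event_preds) * len(non_event_preds)
    if pairs == 0:
        raise ValueError("need at least one event and one non-event")
    concordant = 0.0
    for pe in event_preds:
        for pn in non_event_preds:
            if pe > pn:
                concordant += 1.0
            elif pe == pn:
                concordant += 0.5
    return concordant / pairs
```

A value of 0.5 corresponds to predictions no better than chance and 1.0 to perfect discrimination. The pairwise loop is quadratic in the number of patients, which is acceptable for a sketch but slower than rank-based implementations in statistical packages.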

Model performance will also be assessed by plotting Kaplan-Meier survival curves for each of the three risk groups identified by the PiPS models (“days,” “weeks,” and “months+”).

2b. Comparison between PiPS model and clinician predictions

To compare the accuracy of the model and clinicians’ predictions, the primary analysis will focus on the PiPS-B model. McNemar’s test will be used to compare the proportion of overall patient deaths predicted correctly by PiPS-B with the corresponding proportion predicted correctly by clinicians.
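
McNemar’s test uses only the discordant pairs: patients for whom the model was correct and the clinician was not, and the reverse. The sketch below computes the exact (binomial) two-sided p-value from those two counts. The text does not state which variant of the test will be used, so the exact version is an assumption, as are the function and argument names.

```python
from math import comb

def mcnemar_exact(model_only_correct, clinician_only_correct):
    """Two-sided exact McNemar p-value from the two discordant counts."""
    n = model_only_correct + clinician_only_correct
    if n == 0:
        # No discordant pairs: no evidence of a difference.
        return 1.0
    k = min(model_only_correct, clinician_only_correct)
    # Probability of a split at least this uneven under Binomial(n, 0.5).
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)
```

Patients for whom both predictions were correct, or both were wrong, do not enter the calculation because they carry no information about which predictor is more accurate.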

3. Secondary outcome analysis

As part of the secondary analyses UCL will combine the models’ predictions for the two week and two month cut-off points to produce a categorical prediction of survival (“days,” “weeks,” or “months/years”) and compare with clinicians’ estimates and the corresponding observed values descriptively with respect to their accuracy.

Linear weighted κ will also be used to compare the performance of the clinicians with that of the models. If appropriate, consideration will be given to using the net reclassification index (NRI) as part of this secondary analysis to compare clinician and model predictions, noting that NRI needs to be used with caution, particularly when there are three or more risk categories.
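
A minimal sketch of linear weighted κ for ordered categories is shown below. Disagreements are penalised in proportion to the distance between categories, so a "days" versus "months/years" disagreement counts twice as much as "days" versus "weeks". The function name and input layout are hypothetical, and the two sets of ratings are assumed to be paired by patient.

```python
def linear_weighted_kappa(rater_a, rater_b, categories):
    """Cohen's kappa with linear disagreement weights for ordered categories."""
    index = {c: i for i, c in enumerate(categories)}
    k = len(categories)
    n = len(rater_a)
    # Joint proportions: observed[i][j] is the share rated i by A and j by B.
    observed = [[0.0] * k for _ in range(k)]
    for a, b in zip(rater_a, rater_b):
        observed[index[a]][index[b]] += 1 / n
    row = [sum(observed[i]) for i in range(k)]
    col = [sum(observed[i][j] for i in range(k)) for j in range(k)]
    # Disagreement weight grows linearly with the distance between categories.
    obs_dis = sum(abs(i - j) / (k - 1) * observed[i][j]
                  for i in range(k) for j in range(k))
    exp_dis = sum(abs(i - j) / (k - 1) * row[i] * col[j]
                  for i in range(k) for j in range(k))
    if exp_dis == 0:
        raise ValueError("kappa is undefined when all ratings share one category")
    return 1 - obs_dis / exp_dis
```

A value of 1 indicates perfect agreement and 0 indicates agreement no better than expected by chance given each rater's marginal distribution.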

As part of the secondary analyses, the other risk models (PaP, FPN, PPI and PPS) will also be validated. The calibration of these prognostic models will be assessed using the calibration slope based on a logistic model for binary outcomes and Cox model for survival outcomes. Graphical comparisons of the observed and predicted risks for clinically relevant patient risk groups will also be made. Clinically relevant time points will be used for comparisons for survival outcomes. Model discrimination will be assessed using the C-statistic for binary outcomes and C-index for survival outcomes. The predictions made by the other prognostic models under evaluation in this project will also be compared with the corresponding observed outcomes and clinician predictions (where available). Potential missing data in predictor values will be handled as described.

4. Sensitivity and other planned analyses

Characteristics of patients with potential missing data will be compared with those with complete information to investigate any bias. Multiple imputation based on chained equations will be used to impute missing predictor values if considered necessary. In the previous study the outcome was complete and about 5% of the predictor values were missing with the exception of C-Reactive Protein (CRP) for which 13% of data were missing.

Once all of the analyses have been completed and the results have been disseminated, the pseudonymised data set may be made available to other researchers on request. Key identifying numbers will be removed. The pseudonymised data set will not include names, addresses, postcodes, NHS numbers, dates of birth, dates of death or dates of enrolment. It will not be possible to link individual records to publicly accessible information on individuals.

All data sharing requests will be considered by the Data Management Group at the Priment clinical trials unit (including a statistician) to ensure that the data-set has been pseudonymised before being shared with any third party. Dates of death will not be transferred to any third party.

NHS Digital reminds all organisations party to this agreement of the need to comply with the Data Sharing Framework Contract requirements, including those regarding the use (and purposes of that use) by “Personnel” (as defined within the Data Sharing Framework Contract ie: employees, agents and contractors of the Data Recipient who may have access to that data).

Variation in avoidable hospital admissions by mental health status — DARS-NIC-167186-V7J4F

Opt outs honoured: No - data flow is not identifiable (Excuses: Does not include the flow of confidential data)

Legal basis: Health and Social Care Act 2012 – s261(1) and s261(2)(b)(ii), Health and Social Care Act 2012 s261(1) and s261(2)(b)(ii), Health and Social Care Act 2012 s261(2)(b)(ii)

Purposes: No (Academic)

Sensitive: Non Sensitive, and Sensitive, and Non-Sensitive

When:DSA runs 2019-03 – 2022-03 2018.10 — 2018.12. SMLS reported a DPA serious incident; breached contract — audit report.

Access method: One-Off

Data-controller type: UNIVERSITY COLLEGE LONDON (UCL)

Sublicensing allowed: No

AGD/predecessor discussions: igard_minutes_8_march_2018.pdf, igard-minutes-5-april-2018.pdf

Datasets:

Hospital Episode Statistics Accident and Emergency
Hospital Episode Statistics Admitted Patient Care
Mental Health Services Data Set
Hospital Episode Statistics Accident and Emergency (HES A and E)
Hospital Episode Statistics Admitted Patient Care (HES APC)
Mental Health Services Data Set (MHSDS)

Type of data: Anonymised - ICO Code Compliant

Objectives:

The overall objective for the study is to identify the extent to which various supply (e.g., availability and quality of health and care services) and demand factors (e.g., population morbidity) influence variation between clinical commissioning groups (CCGs) in A&E attendance rates and rates of potentially avoidable emergency hospital admissions among 'mental health' and 'non-mental health' groups.

*************************
Potentially avoidable emergency admissions include those conditions which could be effectively managed outside of hospital. Such admissions partly reflect deficiencies in quality or access to physical health care, are expensive to the NHS and can be harmful to patients. Potentially avoidable admissions have been defined in relation to a specific set of conditions for which hospital admission could be prevented through adequate management or earlier detection outside of a hospital setting. The list of conditions comprising avoidable admissions varies between studies. A list of fourteen conditions (excluding one which relates only to children) will be used, identified through consensus methods by Coleman & Nicholl (2010), for which exacerbations should be able to be managed in a well-performing system without hospital inpatient admission. Relevant to the current study, this broadened the scope of ‘avoidable’ or ‘preventable’ admissions to include not just those that could be managed in primary care, but also those for whom exacerbations could be managed by a range of alternative care systems (e.g., GP out of hours services, district nursing, urgent care walk-in services). Examples of conditions include acute mental crisis, COPD, epileptic fit, non-specific abdominal pain, and blocked urinary catheter.

*******************

The research team at University College London (UCL) will assess variation at the CCG level (currently 207 in England), as these are responsible for the planning and commissioning of health care for their local area.

This study was instigated by the research team at UCL upon identification of an information gap in the published literature. Further, based on previous work with a local authority partner it was identified that there was a need to better understand whether quality and availability of services beyond primary care (i.e. including information about social services and mental health services) might influence variation in potentially avoidable hospital admissions.

UCL are examining CCG-level variation in potentially avoidable emergency admissions and A&E attendance rates among those in contact with secondary mental health services and comparing it with those not in contact with mental health services. This is because research has demonstrated that people with a variety of mental health conditions face elevated risks of morbidity and mortality; and, a wide range of barriers to good management of health conditions (e.g., due to symptoms and treatment related issues, availability of social support, stigma, and access to/receipt of appropriate health care). Studies also find those with mental health conditions have more frequent A&E attendances and potentially avoidable admissions (mainly for physical conditions) than the general population, not solely accounted for by greater severity of these conditions.

Locating and quantifying reasons for variation in these rates between areas of responsibility for commissioning health and care will help highlight where investment would make most difference to conditions management and prevention of avoidable hospital use.

The data requested will only be used to:

1. Calculate national CCG-level age and sex adjusted rates over a two year period (2016-2018) of i) potentially avoidable emergency hospital admissions and ii) A&E attendances for two cohorts: the national population (aged 18+ years) (non-mental health cohort) and those identified as in contact with secondary mental health services (aged 18+ years) (mental health cohort, including a subgroup identified with serious mental illness). The numerators of these rates will be calculated using admissions/attendance data; denominators will be calculated from openly available data (ONS mid-year population estimates for the national cohort) and from data received as part of this application (population in contact with mental health services for the mental health cohort). CCG-level variation in rates of potentially avoidable emergency admissions and A&E attendances will then be estimated. Rates over a two year period will be calculated to reduce the impact of annual fluctuations of these rates at a CCG level.

2. Create CCG level indicators which will be used in regression models as generally considered predictors of variation (e.g. duration of admission (from date of admission/discharge fields); % of admissions from GP referral (from admission source field)).

3. Describe characteristics of potentially avoidable emergency admissions among mental health/non-mental health cohorts (admission/discharge method, disposal, referral source).
Openly available routinely collected anonymised data aggregated to CCG level from external sources will subsequently be used to examine which supply (e.g. availability and quality of primary care) and demand factors (e.g. population deprivation, morbidity) explain any observed CCG-level variation in admission/attendance rates among mental health and non-mental health cohorts using regression analyses. Outputs will be at CCG-level rather than record level. These openly available sources include ONS, Atlas of Variation and Quality and Outcomes Framework.
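The age and sex adjusted rates described in item 1 could be computed as sketched here. The agreement does not state whether direct or indirect standardisation is intended; this example assumes direct standardisation, and the stratum keys, function name and figures are hypothetical.

```python
def directly_standardised_rate(events, population, standard_population, per=100_000):
    """Directly age/sex standardised rate for one CCG.

    events: dict mapping (age_band, sex) to the count of admissions or attendances.
    population: dict mapping (age_band, sex) to the CCG denominator population.
    standard_population: dict mapping (age_band, sex) to the reference population.
    """
    total_standard = sum(standard_population.values())
    expected = 0.0
    for stratum, standard_count in standard_population.items():
        denominator = population.get(stratum, 0)
        if denominator <= 0:
            raise ValueError(f"no denominator population for stratum {stratum}")
        stratum_rate = events.get(stratum, 0) / denominator
        # Weight each stratum-specific rate by the standard population.
        expected += stratum_rate * standard_count
    return expected / total_standard * per


if __name__ == "__main__":
    # Hypothetical two-stratum example, for illustration only.
    events = {("18-64", "F"): 10, ("65+", "F"): 30}
    population = {("18-64", "F"): 1000, ("65+", "F"): 1000}
    standard = {("18-64", "F"): 3000, ("65+", "F"): 1000}
    print(directly_standardised_rate(events, population, standard))
```

Applying the same standard population to every CCG makes the resulting rates comparable between areas with different age and sex structures.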

This is a non-commercial standalone research project carried out at University College London and does not involve any third parties. The findings will be disseminated widely and are intended to help planners and funders of health and care to decide how best to focus efforts and resources to reduce potentially avoidable emergency hospital visits overall and among people with poor mental health.

**********************

UCL request data for a two year period to calculate average rates of potentially avoidable admissions and A&E attendances to account for annual fluctuations in rates. As a national level study UCL are requesting HES data on all admission/attendances over that time period in order to calculate rates of admissions/attendances, specifically calculating rates for each Clinical Commissioning Group in England. The study will compare rates among people identified as in contact with mental health services to those not in contact with services, thus data are requested for all admissions/attendances rather than just those in contact with mental health services.

******************************

All outputs/findings will be non-identifiable and aggregated with small numbers suppressed in line with the HES analysis guide. Data will only be accessed, processed and analysed by substantive employees of University College London.

Yielded Benefits:

As yet there are no yielded benefits, but UCL anticipate these to accrue within the extension period.

Expected Benefits:

Through producing the outputs specified, the research team at UCL aims to deliver information to key decision-makers in an accessible format about how best to reduce potentially avoidable emergency admissions in general and among those in contact with mental health services. Specifically, the outputs will highlight which elements of the health and care system influence unnecessary variation in A&E visits and potentially avoidable hospital admissions among people with poor mental health. By doing so, the researchers will highlight where improvements to different elements of the health and care system may reduce avoidable emergency hospital use.

This is important as potentially avoidable admissions have been defined in relation to a specific set of physical and mental conditions. Admissions for these conditions are thought to be preventable through proper management outside of a hospital setting, such as in primary care (e.g., by a GP). Availability and quality of other services (social services, mental health services) may also influence such hospital use, but thus far there is limited information about the relative importance of these, compared to primary care.

The researchers believe that the following will benefit from the information created:

1) Potentially avoidable emergency admissions and A&E attendances are expensive to the NHS, thus benefits from the information provided include more efficient targeting of resources to reduce such hospital use, and savings from any reductions incurred.

2) Potentially avoidable emergency admissions and A&E attendances are also detrimental to patients, who may be better supported outside of the emergency care system. People with poor mental health may be particularly negatively affected by unplanned hospital visits whether these are for physical or psychological reasons. For example, long wait times can be stressful and exacerbate acute mental health crises, and non-mental health professionals treating patients’ health may not have the correct skills to provide appropriate care. There is also a risk of exposure to stigma from non-mental health professionals. Benefits of the information provided may therefore include reduced exposure to the detrimental sequelae of emergency care for patients.

3) High rates of avoidable hospital admissions are therefore thought to reflect poor primary care performance, poor access to primary care, and underlying population characteristics. However, the influence of quality and availability of other services (including social services and mental health) on potentially avoidable emergency admissions and A&E attendances has not yet been considered. Potential benefits of the information provided therefore also include a more holistic understanding of how such hospital use could be reduced which does not unfairly penalise primary care.

The research team at UCL will be responsible for sharing the findings in several formats in order to be accessible and usable by different end users (see specific outputs). Responsibility for implementing any changes informed by the findings would be borne by those planners and funders of relevant elements of health and care services (i.e. Clinical Commissioning Groups (e.g. mental health services, hospital services), NHS England (primary care), Local Authorities (social services)).

It is possible (and likely) that the research team at UCL will not be able to explain all the observed variation in emergency hospital use outcomes, due to lack of available information on all possible factors influencing such variation. It is unlikely, however, that the research team will not be able to explain any of the variation since previous work using a subset of the proposed indicators has been undertaken in which some of the variation could be explained by CCG-level differences in availability and quality of health care services (primary care).
Even before attempts to explain variation, benefits will also be accrued from information on the extent of national variation between CCGs in hospital use outcomes which will be used to highlight which CCGs are doing more or less well.

Outputs:

The university will disseminate findings widely among NHS Trusts, CCGs and Local Authority partners and through Collaborations for Leadership in Applied Health and Care (CLAHRC) networks. The research team will produce full and brief reports that will be e-mailed to CLAHRC partners through the research team organisation's Research Updates newsletter, made available on the website, and highlighted through the organisation's social media platforms (specifically the organisation's Twitter feed). The research team at UCL will also submit a paper for peer reviewed publication in a Journal (e.g., BMJ Quality & Safety) and conference presentation (e.g., Health Services Research UK).

All outputs will contain only aggregate level data with small numbers suppressed in line with the HES analysis guide. For all outputs data will be presented as age/sex adjusted CCG-level rates of potentially avoidable emergency admissions/A&E attendances per 10,000 or 100,000 population. CCGs will be the unit of analysis and the age/sex adjusted rates will be the dependent variable of interest. Coefficients will be presented indicating the strength/direction of association between CCG-level supply (e.g. % of population satisfied with GP consultations) and demand factors (e.g. population deprivation), and the dependent variable. The extent to which various supply and demand factors influence rates of hospital admission/A&E attendances will be depicted through changes in adjusted R-squared scores (from regression analyses).
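Adjusted R-squared penalises the ordinary R-squared for the number of predictors in the model, so a change in it between nested models indicates how much a block of supply or demand factors adds. A minimal sketch of the standard formula follows; the function name and example values are hypothetical.

```python
def adjusted_r_squared(r_squared: float, n_units: int, n_predictors: int) -> float:
    """Adjusted R-squared: 1 - (1 - R^2) * (n - 1) / (n - p - 1).

    n_units: number of units of analysis (here, CCGs).
    n_predictors: number of predictors in the model, excluding the intercept.
    """
    residual_df = n_units - n_predictors - 1
    if residual_df <= 0:
        raise ValueError("need more units than predictors plus one")
    return 1 - (1 - r_squared) * (n_units - 1) / residual_df


if __name__ == "__main__":
    # Hypothetical R-squared values for a demand-only model and a model
    # that also includes supply factors, across 207 CCGs.
    demand_only = adjusted_r_squared(0.30, 207, 4)
    demand_and_supply = adjusted_r_squared(0.42, 207, 9)
    print(demand_and_supply - demand_only)
```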

Specific outputs of the study will be:

• Full and brief report: Open Access (free) – made available electronically to CLAHRC North Thames partners (NHS Trusts, CCGs, Local Authorities) through the research team organisation's Research Updates e-newsletter. The research team will disseminate links to the report findings more broadly (e.g. to members of the public and mental health charities) through the organisation's website, and highlighted through the organisation's social media platforms - specifically through the associated Twitter account. The goal date of the dissemination of this report is currently October 2018.
• Peer reviewed article: Open Access (free) - research team submission to BMJ Quality & Safety. The intended customer group for this output are academics and clinicians. The goal date of the publication of the peer reviewed article is currently October 2018.
• Conference presentation: This will be a paid event – the research team will submit an abstract to the Health Services Research Network UK. It will be attended by clinicians and academics in the study field. The goal date for presentation at this conference is July 2019.

***************************

Findings will be shared with charities such as the Mental Health Foundation and the 'Mental Elf Service' to help disseminate to the general public. In particular, The Mental Elf shares and summarises information about evidence-based research findings in a lay format through blogs and Twitter. UCL will also contact the Royal College of Psychiatrists and the Royal College of General Practitioners to ask them to promote findings through their websites and newsletters.

Processing:

All organisations party to this agreement must comply with the Data Sharing Framework Contract requirements, including those regarding the use (and purposes of that use) by “Personnel” (as defined within the Data Sharing Framework Contract ie: employees, agents and contractors of the Data Recipient who may have access to that data)

Pseudonymised data will be transferred from NHS Digital to the Data Safe Haven within University College London (UCL) via secure File Transfer. Data will be stored, handled and analysed within the UCL Data Safe Haven, no data will be shared with third parties. The UCL Data Safe Haven has been certified to the ISO27001 information security standard and conforms to the NHS Information Governance Toolkit. It is built using a ‘walled garden’ approach and a file transfer mechanism enables information to be transferred into the ‘walled garden’ simply and securely.

Data will be processed by substantive employees of University College London, all with up to date Information Governance training, and only those members of the research team who have been granted access to the secure data storage area within the UCL Data Safe Haven will be able to access the data for the purposes outlined above.

The research team at UCL will use HES admitted patient care and A&E datasets to calculate CCG-level rates of potentially avoidable emergency admissions, and A&E attendances as outlined above. The bridging file will be used to identify which patients have been in contact with mental health services within each year requested to calculate the above rates for the mental health cohort. As those in contact with mental health services are a heterogeneous group, the MHSDS data will be used to identify a subgroup mental health cohort diagnosed with serious mental illness.

Diagnosis data and information relating to contact with mental health services is sensitive data thus the research team have kept information requested to a minimum (e.g. primary diagnosis for first finished consultant episode rather than all diagnosis codes). The research team at UCL will minimise the risk of re-identification by:

- including minimum necessary diagnosis data (e.g. first 3 or 4 characters of primary diagnosis of first finished consultant episode rather than all diagnosis codes),
- retaining information about those first finished episodes for which the primary diagnoses pertain to one of a list of conditions for which admissions have been identified as 'potentially avoidable' (e.g., non-specific chest pains, COPD, blocked urinary catheter). This list of conditions does not include any rare conditions (affecting fewer than 5 in 10,000 individuals), minimising the risk of re-identification from the data items requested.
- conducting analyses at a CCG-level & producing CCG level statistics as outputs (i.e. calculating age/sex standardised CCG-level rate of emergency admissions for potentially avoidable conditions).

The research team at UCL will use openly available routinely collected anonymised data aggregated to CCG level from external sources (e.g. Office for National Statistics, Atlas of Variation, Quality and Outcomes Framework) to examine which supply (e.g. availability and quality of primary care) and demand factors (e.g. population deprivation, morbidity) explain any observed CCG-level variation in admission/attendance rates among mental health and non-mental health cohorts using regression analyses. The CCG will be the unit of analysis and data from these other sources will be linked at a CCG level only.

No data will be linked to record-level patient data.

No data directly provided from NHS Digital will be stored, processed, or accessible to or by any third party organisations not listed in this agreement.

Data will only be accessed, processed and analysed by substantive employees of University College London.

******************

For data from the Mental Health (MHSDS, MHLDDS, MHMDS) data sets, and any Mental Health data linked to HES or SUS, the following disclosure control rules must be applied:
• National-level figures only may be presented unrounded, without small number suppression
• Suppress all numbers between 0 and 5
• Round all other numbers to the nearest 5
• Percentages can be calculated based on unrounded values, but need to be rounded to the nearest integer in any outputs
• In addition for Learning Disability data in Mental Health (MHSDS, MHLDDS, MHMDS), the England-level data also must apply the suppression of all numbers between 0 and 5, and rounding of other numbers to the nearest 5
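The suppression and rounding rules above could be applied to output counts as sketched here. The wording "between 0 and 5" does not settle whether the end points are included; this sketch assumes the more conservative inclusive reading (0 to 5 all suppressed), and the function names and the "*" marker are illustrative assumptions. The additional England-level rule for Learning Disability data is not modelled.

```python
import math


def disclose_count(count: int, national: bool = False) -> str:
    """Apply small-number suppression and rounding to a single count.

    national=True returns the unrounded figure, as permitted for
    national-level figures (Learning Disability data excepted).
    """
    if count < 0:
        raise ValueError("counts cannot be negative")
    if national:
        return str(count)
    if count <= 5:
        return "*"  # assumed inclusive reading of "between 0 and 5"
    # Round to the nearest 5 using integer arithmetic (no ties occur for integers).
    return str(5 * ((count + 2) // 5))


def disclose_percentage(numerator: int, denominator: int) -> int:
    """Percentage computed from unrounded values, rounded to the nearest integer."""
    if denominator <= 0:
        raise ValueError("denominator must be positive")
    return math.floor(100 * numerator / denominator + 0.5)
```

Percentages are deliberately computed from the unrounded counts and only rounded at the point of output, in line with the fourth rule.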

Usual Care versus Specialist Integrated Care: A Study of Hospital Discharge Arrangements for Homeless People in England — DARS-NIC-86666-V7Z1L

Opt outs honoured: Y (Excuses: Section 251, Section 251 NHS Act 2006)

Legal basis: Section 251 approval is in place for the flow of identifiable data, Approved researcher accreditation under section 39(4)(i) and 39(5) of the Statistical Registration Service Act 2007 , Health and Social Care Act 2012, Health and Social Care Act 2012 s261(7); National Health Service Act 2006 - s251 - 'Control of patient information'.

Purposes: No (Academic)

Sensitive: Sensitive, and Non Sensitive, and Non-Sensitive

When:DSA runs 2019-12 – 2020-11 2018.03 — 2018.09. SMLS reported a DPA serious incident; breached contract — audit report.

Access method: One-Off

Data-controller type: UNIVERSITY COLLEGE LONDON (UCL)

Sublicensing allowed: No

AGD/predecessor discussions: igard_minutes_7_december_2017.pdf

Datasets:

Hospital Episode Statistics Accident and Emergency
Hospital Episode Statistics Admitted Patient Care
Office for National Statistics Mortality Data
Bridge file: Hospital Episode Statistics to Mortality Data from the Office of National Statistics
Civil Registration - Deaths
Civil Registration (Deaths) - Secondary Care Cut
HES:Civil Registration (Deaths) bridge
Civil Registrations of Death - Secondary Care Cut
Hospital Episode Statistics Accident and Emergency (HES A and E)
Hospital Episode Statistics Admitted Patient Care (HES APC)

Type of data: Identifiable

Objectives:

The aim of this work is to evaluate the impact of specialist hospital discharge services, developed as part of the Department of Health’s Section 64 (voluntary sector-led) ‘ten million pound cash boost’ to improve hospital discharge for homeless people, compared to "usual care"; the research team is funded to do this work. Specifically, University College London (UCL) will evaluate evidence of differences in time-to-subsequent hospital admissions (re-admissions) and mortality between i) homeless people accessing care at discharge services delivered by participating study sites, ii) homeless people admitted to hospitals without specialist services and iii) hospitalised non-homeless people in the most deprived quintile. In addition, the study aims to quantify differences in the characteristics of people in each of the three groups that could drive differences in rates of readmission or mortality irrespective of the hospital care available. UCL will also present stratified data for the two dominant types of specialist hospital discharge schemes (housing-link and clinically-led services).

The principal research objectives are outlined as follows:
1. What are the rates of hospitalisation (overall admission rates, unscheduled admission rates and 28-day emergency readmission rates) for homeless people? Do these rates vary by type of specialist homeless hospital discharge service?
2. What are the mortality rates (including avoidable mortality) in homeless people and do these rates vary by type of specialist homeless hospital discharge service? How do mortality rates compare to people in the most deprived quintile who are not homeless?
3. What is the duration of hospital admission in homeless people accessing and not accessing specialist homeless hospital discharge services?
In addition UCL will present summary data for the characteristics of hospitalised individuals within each group (homeless accessing specialist services, homeless not accessing specialist services, deprived non-homeless).

The first work package seeks to gain an informed understanding of the ways in which specialist integrated homeless health and care services (SIHHC services) are being developed and implemented to facilitate hospital discharge in England and the impact this is having on quality of care and organisational outcomes such as the prevention of readmission to hospital. For this work package, local service providers will be asked to identify and nominate potential participants.
The second work package (WP2, for which support is requested for datasets 1, 3, 4, and 5) is a data linkage and health economic analysis work package that will work with twenty sites across England where homeless patients have been admitted to hospital. A cohort of homeless people who have used a specialist discharge scheme will be compared to a cohort of homeless people who have not used such provision. The study will also compare patients’ hospitalisation history before and after engagement with specialist services. Analysis will also be undertaken to understand whether the outcomes are a factor of homelessness specifically or are tied to deprivation.

Data provided under this agreement is not being used for work package one.

Data provided under this agreement will not be shared with Kings College London.

Yielded Benefits:

In any future application, the applicant will be required to provide details of the expected benefits resulting from the study. UCL have completed analysis on all primary outcomes for the study, but have not yet undertaken analyses on several secondary outcomes and some mortality outcomes. Aggregated results of the primary outcome analysis completed thus far have enabled UCL to use the data to complete an economic analysis of the hospital discharge schemes.

Expected Benefits:

The findings of this study will feed into policy development for care pathways for homeless people accessing hospital and housing services. Policy responses are expected within 5-10 years, i.e. 2023-2028.
In addition the outputs from this study will enable Commissioners to make better informed decisions in relation to the specialist services within this research.
The findings from this study will enable Commissioners to ensure specialist services offered within their area are better targeted to suit the needs of the local population.
Specific Outputs Expected, Including Target Date include:
- Report for the funders The National Institute for Health Research (February 2018), including a Health Technology Assessment. This document will outline the results from the objectives and analyses outlined in the study protocol and include a "roadmap" for commissioners (see Benefits section).
- Manuscript of findings for peer-reviewed publication (Spring 2018). As above this document will outline the results from the objectives and analyses outlined in our study protocol. UCL's target journals are BMJ, The Lancet Public Health and BMC public health.
- Aggregated, with small numbers suppressed in line with the HES analysis guide, site-specific summary of readmissions and mortality for each of the sites participating in the study (February 2018). This information will help sites better understand the care pathways of patients accessing their services.

As a direct result of this research, subsequent policy development and better commissioning, UCL expect that this study will directly benefit patients. The aim is to identify and improve:
- How hospital readmissions can be prevented through the provision of specialist services for people with experience of homelessness
- How UCL can help prevent people with experience of homelessness having to visit emergency departments after a hospital admission
- How UCL can provide more appropriate hospital treatment to people with experience of homelessness
- How UCL can avoid deaths in people with experience of homelessness through the provision of better services and treatment whilst in hospital

Outputs:

A report for the funders The National Institute for Health Research (February 2018), including a Health Technology Assessment. This document will outline the results from the objectives and analyses outlined in the study protocol and include a "roadmap" for commissioners (see Benefits section).
- Manuscript of findings for peer-reviewed publication (Spring 2018). As above this document will outline the results from the objectives and analyses outlined in our study protocol. UCL's target journals are BMJ, The Lancet Public Health and BMC public health.
- Aggregated, with small numbers suppressed in line with the HES analysis guide, site-specific summary of readmissions and mortality for each of the sites participating in the study (February 2018). This information will help sites better understand the care pathways of patients accessing their services.

To assist with development of policy from the findings, a series of feedback events is planned across the country. These events will be attended by local stakeholders - including the homeless health charity Pathway, homeless service users, health and social care commissioners and service providers - who are well placed to help us disseminate the findings in an effective way. These events will aid translation of the results into a "roadmap" against which commissioners can explore the strengths and weaknesses of their local provision. A final version of the roadmap will be published as part of a Health Technology Assessment paper, for use by policy developers such as NICE.

Processing:

Three groups of individuals admitted to hospital will be included in the study. The first group are homeless individuals admitted to hospital at any one of the 16 sites which offer a specialist discharge scheme. The second group are individuals seen by a community homeless service in London, Find and Treat. Find and Treat is a specialist outreach team that works alongside over 200 NHS and third sector front-line services to tackle TB among homeless people. The third group is a random sample of individuals, equal in size to the Find and Treat group, who are living in the lowest quintile of deprivation areas.

Three main datasets will be used for analysis:
• Unconsented data collected at the 16 sites that are part of this study
• Unconsented data from Find and Treat
• Linked HES and ONS mortality data

Data flow from specialist hospital discharge services for homeless people, and from Find and Treat services for homeless people:
1) At each site the research team will compile identifiers (NHS number, forename, surname, aliases, date of birth, sex) for each patient accessing the service between November 2013 and November 2016 and create a unique study identifier for each record for the service provider.
2) The data requested for the study will then be securely uploaded and processed at University College London (UCL). The data will be stored and cleaned.
3) Identifiable information (NHS number, forename, surname, aliases, date of birth, sex) required by NHS Digital for the linkage to Hospital Episode Statistics/Office for National Statistics (HES/ONS) will at this point be transferred to NHS Digital with the unique study identifier.
4) NHS Digital will use the Personal Demographics Service (PDS) to add NHS number (where missing) to the identifiable data from homeless healthcare users.
5) When NHS Digital have confirmed that the list is clean, and linkage to HES has been completed, the researchers will de-identify all data held by the researchers at UCL (i.e. all identifiers except the unique study identifier will be destroyed).
6) The data requested for the study - including the unique study identifier - will then be securely uploaded and processed on the data safe haven at UCL in line with the published study protocol.
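The identifier handling in steps 1 to 5 can be illustrated with a minimal Python sketch. It is not the study's actual software: the field names, the use of a random UUID as the study identifier and the function name split_record are all assumptions made for illustration.

```python
import uuid

# Identifier fields named in the data flow above (field names are hypothetical).
IDENTIFIER_FIELDS = ("nhs_number", "forename", "surname", "aliases",
                     "date_of_birth", "sex")

def split_record(record):
    """Return (linkage_row, research_row) that share only a new study identifier."""
    study_id = uuid.uuid4().hex  # assumption: any unique, meaningless ID would do
    linkage_row = {"study_id": study_id}   # sent for linkage, later destroyed
    research_row = {"study_id": study_id}  # retained after de-identification
    for field, value in record.items():
        if field in IDENTIFIER_FIELDS:
            linkage_row[field] = value
        else:
            research_row[field] = value
    return linkage_row, research_row

# Illustrative record only; not real data.
record = {
    "nhs_number": "0000000000",
    "forename": "Example",
    "surname": "Person",
    "aliases": "",
    "date_of_birth": "1970-01-01",
    "sex": "F",
    "hospital_of_admission": "Site A",
    "date_of_hospital_admission": "2014-05-01",
}
linkage_row, research_row = split_record(record)
```

Once linkage is confirmed, destroying every linkage_row leaves research rows that carry the study identifier but none of the listed identifiers.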

Dataset 1 will consist of data from homeless healthcare users and include forename, surname, aliases, date of birth, sex, address, contact numbers, hospital of admission, date of hospital admission, nationality, ethnicity, and NHS number. This data will be collected from study fieldwork sites.

Dataset 3 will consist of data from homeless healthcare users and include forename, surname, aliases, sex, address, land contact number from Find and Treat Service.

Dataset 4 Personal Demographics Service data from homeless healthcare users, including date of hospital admission, date of hospital discharge, date of hospital appointment and date of death. The data within PDS will be used to provide missing NHS numbers for Dataset 1 and Dataset 3. The research team will not at any point have access to these NHS Numbers, which will be used to improve the linkage of data to HES data.

Dataset 5: HES ONS mortality data from homeless healthcare users and a geographically comparable and representative sample of lowest quintile of deprivation population in HES equal in size to the Find and Treat dataset. This data will be de-identified by NHS Digital.

Outputs will contain only aggregate level data with small numbers suppressed in line with the HES analysis guide.
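As a rough illustration of small-number suppression in an aggregate table, the Python sketch below replaces low counts with a marker. The threshold of 10 and the "*" marker are placeholders chosen for the example; the HES analysis guide defines the actual rules, which are not reproduced here.

```python
def suppress_small_numbers(counts, threshold=10, marker="*"):
    """Replace any count below `threshold` with `marker`.

    `threshold` and `marker` are illustrative assumptions, not values
    taken from the HES analysis guide.
    """
    return {
        label: (count if count >= threshold else marker)
        for label, count in counts.items()
    }

# Illustrative counts only; not real results.
readmissions_by_site = {"Site A": 42, "Site B": 3, "Site C": 17}
published = suppress_small_numbers(readmissions_by_site)
```

In practice a suppressed cell can sometimes be recovered from row or column totals, so real disclosure control usually needs secondary suppression as well; this sketch does not attempt that.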

Critical Care Health Informatics Collaborative — DARS-NIC-27803-W8G1B

Opt outs honoured: Yes (Excuses: Section 251, Section 251 NHS Act 2006)

Legal basis: Section 251 approval is in place for the flow of identifiable data, Health and Social Care Act 2012 s261(7); National Health Service Act 2006 - s251 - 'Control of patient information'.

Purposes: No (Academic)

Sensitive: Non Sensitive, and Non-Sensitive

When:DSA runs 2017-07 – 2020-07 2017.12 — 2018.09. SMLS reported a DPA serious incident; breached contract — audit report.

Access method: Ongoing

Data-controller type: UNIVERSITY COLLEGE LONDON (UCL)

Sublicensing allowed: No

Datasets:

Hospital Episode Statistics Admitted Patient Care
Hospital Episode Statistics Critical Care
Hospital Episode Statistics Admitted Patient Care (HES APC)
Hospital Episode Statistics Critical Care (HES Critical Care)

Type of data: Identifiable

Objectives:

The Critical Care Health Informatics Collaborative (CCHIC) database is an informatics resource for health researchers.
There is currently very little knowledge about the long term outcomes of patients who have been through critical care. Current data is conflicting and there is no national strategy to follow these patients up. Up to this point most outcomes research has looked at a single cohort of patients over a defined (and relatively short) period of time. This project aims to create an information technology capability, enabling researchers to investigate the course and outcome of the critically ill. The data, which will be rich both longitudinally and in depth, will enable researchers to answer questions that have previously been impossible to answer.
The study aims to automatically collect and store routine clinical, demographic and long term outcome data of all patients admitted to participating critical care units. The database will be of interest to Health Services Researchers and Clinical Trial Researchers amongst others.
Linkage of HES and critical care data will enable longer term outcomes for critical care survivors to be tracked.
The broad objectives for this linkage are set out below:
1) Examining long term clinical outcomes. Do survivors of critical illness have significantly reduced life span or increased resource utilisation, compared to the healthy or general hospital population? Surprisingly this is not currently known but this knowledge would allow clinicians and patients alike to make informed care decisions. Equally, ongoing healthcare needs could be predicted and resources appropriately allocated.
2) Looking at predictors of long term outcome. Researchers have previously found a number of variables that may predict patient outcome; however, the datasets are frequently limited by the inability to continually re-analyse or examine trends over time. This constantly updating database will enable researchers to analyse how multiple predictive variables interact with each other and over time. The numbers of patients within the database will grow rapidly (the 5 participating NHS trusts admit approximately 10,000 critically ill patients a year), allowing researchers to investigate these predictors in a large population of UK patients over a prolonged period.
3) The impact of secular or process changes could be measured. The outcomes of many Critical Care interventions, whether they are part of research, process change or just change over time, are measured in the short term. Updating databases, such as CCHIC, that track longitudinally have the ability to look for an immediate stepwise change following implementation (e.g. High Impact Interventions in the Saving Lives campaign) and/or measure that impact over a longer period of time (following HES linkage). The latter is important as the perceived impact is often in reality attributable to secular change, as general levels of care improve, rather than to a specific intervention.
This approach ultimately allows for the use of registry trials of interventions, be it a new process of care, mode of ventilation or the introduction of a new drug. These trials are very efficient (can be run quickly with minimal cost) but until now this approach has been problematic in Critical Care.

4) Clinical Trials: Over the years, approximately 70 drug trials have been performed within critical care; not one has demonstrated a reproducible benefit to the patient. Our understanding of disease has taken huge leaps forward but we seem unable to translate that into patient benefit. The reason for this is multifactorial; however, trial design has often been called into question. Outcomes and patient selection are based on small population research and may well not be relevant to the population being studied. Large, all inclusive datasets will allow trialists to examine the course and outcome of their target population and question whether they have chosen the appropriate end points. These end points are often short term survival (usually 28 days, an FDA requirement), which is not highly relevant to the patient. Longer term outcomes could be tracked with HES linkage, which would give a far more meaningful outcome; thus CCHIC will help design more efficient trials.

Expected Benefits:

Very little is known about the long term outcomes of patients who have been through critical care. Following their recovery, do they have on-going specialist health needs? Despite surviving a critical illness, are they more likely to die earlier than those who have not had one? Current data is conflicting and there is no national strategy to follow these patients up. Combining the databases would allow examination of the impact that individual organ failures, or combinations of them, and severity of illness have on chronic health. The benefits of this exemplar include:

· Allowing clinicians to make accurate predictions on the course of an illness and the long term outcome

· Allow patients and next of kin to be informed of the condition, empowering them to make choices regarding their treatment with a level of confidence currently unobtainable

· To predict future health care needs and resources

· Education of both patients and healthcare professionals on the sequelae of critical care

· Identify longer term research end points that may be more informative. Cancer treatments often look at 1 and 5 year survivals to measure success or failure of treatment. Critical Care research often looks at day 28 mortality (an FDA requirement for drug trials) or other short term markers; patients are rarely followed up for more than a year, although longer term measures such as 5 year survival may be relevant for researchers.

· The ability to look at regional differences in care and outcome

Other benefits would include:

· Studying the effect of time on the control group. Clinical trials often run over many years; during this time the introduction of other interventions or processes of care may impact on the trial's endpoints, and this database could be used to examine this. Providing this data is accurate and up to date, it could conceivably act as a ‘virtual’ control group and allow trials to adapt to secular changes.

· Better information would allow more accurate recruitment targets, whilst the database can examine the impact of how recruitment rates may change by:

o altering entry criteria

o altering the time permitted to enrol patients

· Using this database, researchers could look accurately at a variety of different primary or secondary outcomes and endpoints for the trial, enabling each point to be accurately powered.

· Health Service Researchers can use this data to monitor impacts on longer term health care utilisation and long term survival of research interventions

· The potential for an alerting system, such that a recently admitted patient meeting specified criteria could be flagged to the research team. This could give patients the opportunity to take part in clinical trials despite being in a hospital remote to where the research team are based.

· Mapping morbidity relative to disease intervention strategies, thus improving our understanding of disease projection

As an IT resource, rather than a specific research project, UCL expect the outputs and benefits to be broad and ongoing. However, it is envisaged that the benefits will come from several angles. Little is known about the long term survival or outcome of patients who have survived a critical illness. Most previous research follows their acute admission but rarely looks beyond hospital discharge. The linking of the proposed databases will bring benefit to several areas that are directly relevant to UK patients.

1) The patient. Knowledge of long term outcomes and healthcare utilisation is vital if patients, or their proxy, are to make informed decisions over their care and have the appropriate expectations. The only available data is obtained from clinical trials, which are not reflective of the general critical care population and are often US based and so not very relevant to the UK. As this is about tracking longer term outcomes, its value will grow over years. Examples would include:

a. Are those who have survived an episode of severe respiratory failure more likely to be admitted back to hospital with respiratory failure in the future?
b. Are those who require haemodialysis for an episode of acute kidney injury more likely to develop chronic kidney disease?
c. There is an argument that the intense and prolonged inflammation may predispose to cancer and coronary artery disease in the future. Linkage to HES (and disease registries) will help address this argument. It is recognised that this work will take many years to complete.

With all outcome research it is important to try to identify and potentially modify any risk factors.

2) Clinicians and researchers will be able to objectively evaluate the impact of interventions or process change on the longer term health of critical care survivors. They will also be able to look at potential predictors of outcome, enabling them to keep the patient or next of kin fully informed. This will also allow health economic studies to begin evaluating these interventions, be it a new drug or a new process.

3) Commissioning groups. Knowing these outcomes will enable appropriate planning and anticipation of future healthcare needs.

4) Clinical trialists. All drug trials have failed to produce any reproducible benefit in critical care. The reasons behind this are multi-factorial, but a poor understanding of the patient population is certainly a component. A highly comprehensive database would enable researchers to examine the impact of altering their trial entry criteria and look at different longer term endpoints. This would enable the trial to be appropriately powered and recruitment rates more accurately predicted.

This database will enable:

a) Appropriate powering of studies by examining the properties of the control subjects and how this may change as inclusion/exclusion criteria are altered. This is important, as the nature of the control group appears highly variable between different trials (mortality rate ranging from 15 to 40% within the same therapeutic area). This database will enable trialists to investigate how altering inclusion/exclusion criteria may alter the outcome of their control group and allow appropriate powering of their study.

b) Predict more accurate recruitment rates. The majority of large NIHR sponsored critical care trials have failed to recruit to target; many believe the targets to be unrealistically set. This database will enable accurate modelling of trial recruitment.

c) Critical Care research, unlike cancer research, usually focuses on short term goals such as 28 day mortality or hospital length of stay. This database would enable researchers to examine the impact of interventions on long term outcomes (mortality and health care utilisation).

It could be envisaged that this data will be of value (but not exclusively) to:
• Allow clinicians to make accurate predictions on the course of an illness and the long term outcome.
• Allow patients and next of kin to be informed of the condition, empowering them to make choices regarding their treatment with a level of confidence currently unobtainable.
• Predict future health care needs and resources following critical care discharge.
• Educate both patients and healthcare professionals on the sequelae of critical care.
• Identify longer term research end points that may be more informative. Cancer treatments often look at 1 and 5 year survivals to measure success or failure of treatment. Critical Care research often looks at day 28 mortality (an FDA requirement for drug trials) or other short term markers; patients are rarely followed up for more than a year, although longer term measures such as 5 year survival may be relevant for researchers.
• Look at regional differences in care and outcome.
• Act as a ‘virtual’ control group and allow trials to adapt to secular changes, examine different end-points and inclusion criteria and set more accurate recruitment targets.
• Allow Health Service Researchers to monitor the impact of new changes or research interventions on longer term health care utilisation and long term survival.

Combining the databases would allow examination of the impact that individual organ failures, or combinations of them, and severity of illness have on chronic health. Data extracted from HES will be linked with clinical data obtained from each patient's critical care admission to track long term mortality and hospital re-admissions.

Outputs:

As the CCHIC database and the HES linkage should be seen as a core resource for researchers around the UK, the outputs are expected to be broad. However, there are specific projects that would use the HES linked data and these include:

Research

A well designed, adaptable and accessible database could help researchers, including the pharmaceutical and medical device industry, to design more accurate, predictable and efficient trials that are relevant to the NHS. This would continue to make the UK an attractive place to run these vital pieces of research. This database will include the clinical, pathological, outcome and demographic data that would allow researchers to construct ‘virtual trials’ examining how altering inclusion/exclusion criteria and endpoints etc. may impact on running the study. The linkage to longer term outcome databases (e.g. HES) will enable more relevant (and patient centred) endpoints to be examined as well as the impact on healthcare resources and the wider health economics. We are using this approach to help identify a patient cohort that may respond to a new treatment for pneumonia. Initial output is expected at the end of 2017; however, longer term outcomes following HES linkage will also be examined, with initial output expected at the end of 2018.

A novel project is also underway to map the new sepsis criteria onto the CCHIC cohort. These results will be submitted for publication. Longer term outcomes of this cohort are unknown; HES linkage will allow us to examine this highly relevant aspect. However, as data is collected prospectively it will take a minimum of a year before any publication becomes relevant.

Audit and Quality Improvement

The database can be used to audit the performance of individual units in complying with national and international guidelines. For example, we will use the data to examine the ability of ICUs to ventilate their patients within the recognised safe limits. Ventilating above these limits is associated with a poor outcome. The complexity of this data means that achieving this locally without specialist data analysts will be difficult. The database also enables variation between ICUs to be studied.

It is envisaged that these reports could be automated for participating units. As a first step, data quality reports are now automated (and returned to the Trusts). An initial review of this data (excluding HES) will aim to be submitted for publication by the end of 2017. Linkage to HES will track these outcomes into the future; this is important for analysing whether any differences are sustained.

Patient safety: Novel research modelling will aim to identify potential complex signals preceding a clinical deterioration. This could potentially warn clinical staff of impending problems before they become clinically apparent. This approach will then be used to model and predict the longer term outcomes and problems that the HES data will be used for.

CCHIC is being created as a resource for researchers to use. Although the CCHIC team will produce some technical papers around the utility of the database, it is hoped that the majority of the outputs will come from researchers who can use the data.

As with all research UCL would expect the output of the research to be disseminated in the appropriate academic journals and meetings. CCHIC aims to be completely open and transparent. All data releases will be logged on a public facing website. Any coding associated with the database development is freely available on a GitHub repository (no data) and the associated NIHR website is being updated (http://www.hic.nihr.ac.uk/nihr-hic-themes). Any publications stemming from CCHIC will be required to acknowledge the database.

UCL are already presenting the concepts and utility in Critical Care Conferences such as the Intensive Care Society State of the Art meeting and the UK Critical Care Forum. UCL have held ‘datathons’ where jittered and anonymised data can be examined by interested researchers to examine the utility. UCL aim to submit the first paper to a peer reviewed speciality journal such as Critical Care.

Processing:

This project aims to collect a rich clinical data set during a patient’s admission to a participating critical care unit. This data set (containing identifiable data for linkage) would be encrypted and transferred to the University College London Identifiable Data Handling Solution where it will be pseudonymised on landing and the identifiers split from the clinical data. At regular intervals a data extract from HES will be requested so as to link with the clinical data, thus providing the long term outcomes.

The dataset will be created from merging routine electronic data extracted from several existing databases. These databases are:
1. Critical Care patient information system, ICIP (IntelliVue Clinical Information Portfolio, Philips). This database integrates data from the hospital Patient Administration System, pathology database, patient monitors and point of care devices. It is also capable of extracting
2. The ICNARC (Intensive Care National Audit and Research Centre) pre-submission database: Data is routinely collected by NHS Trusts and submitted to ICNARC in order to benchmark and compare the performance of critical care units around the country. This database is populated partly by ICIP and partly through manual entry by a data manager.
3. Hospital Episode Statistics. Held by NHS Digital, this database contains information on hospital admissions.

Data flow:
• Identifiable patient data extracted from NHS Trust clinical systems
• Encrypted and transferred to UCL Safe Haven (legal basis covered by s251)
• Pseudonymised on landing
• Identifiers split from clinical data
• Identifiers sent to NHS Digital to extract requested HES data (Study ID, NHS number, date of birth, sex and postcode)
• HES data linked in UCL Identifiable Data Handling Solution to the clinical data – identifiers removed
• Researcher requests access to data (with appropriate approvals, including information governance concerns addressed, signed data sharing agreement)
• Anonymous, crippled, example dataset of fields requested released to researcher. All data has been randomly adjusted, to give an appearance of what the data will look like
• Researcher works in their own time, with their own tools using a cloned analysis engine (R package or virtual machine)
• Analysis script posted back to safe haven to run on live database
• Analysis results (summary data) returned to researcher
• Primary (patient level) and identifiable data never leaves the Identifiable Data Handling Solution

Data will automatically be extracted to a SQL database within each Trust’s firewall. An integration engine ensures the data is compatible (as defined by the dataset) between trusts. Data will then be moved to a SQL database within the University College London IDHS (Identifiable Data Handling Solution) Safe Haven using a secure, encrypted point to point protocol (AS256).

The Safe Haven meets the Information Governance standards required to hold sensitive and identifiable NHS data. It is envisaged that data will be transferred to the UCL Safe Haven every 24 hours. At the point of entry the Research Data Indexing Service within the Safe Haven pseudonymises the data and separates the clinical from the identifier demographics.

All personnel involved with the handling of sensitive and identifiable data will have been trained in information governance. For those involved with handling the data within the participating trusts, the training will be provided by the Trust's internal programme. All personnel will comply with local Trust guidelines. Those involved with handling data within the UCL safe haven will have undergone information governance training through the Information Services Division of UCL, and it will be a requirement that they comply with all the UCL regulations.

The research team currently consists of investigators from each site involved. Each investigator has a track record in high quality, academic critical care research. The IT support at University College London also has a highly successful research track record and the infrastructure required to support this project. All individuals with access to the record level HES data are substantive employees of UCL.

The Critical Care HIC management team will review all requests to access data. The management team will consist of a representative from each of the five participating NHS Trusts, a member (independent of the investigators) who is skilled in Information Governance, and a lay member. This review will consider the validity of the research, whether the database can provide the data required to answer the question, and the Information Governance, to ensure that the requested data subset cannot lead to any patients being identified.

The role of the Management Committee is to evaluate applications from researchers to access data. Each application will be assessed for research validity (by BRC members), information governance and re-identification risk (IG expert) and public interest (lay member). Applications will need to show:

1. Submission from bona fide research institutions

2. The data is being acquired solely for valid research that is non-commercial and intended for patient benefit. It is understood that some of these applications may be from the Bioscience Industry or from researchers outside the science field.

3. Researchers demonstrate an understanding of confidentiality and information governance and agree to abide by the UCL terms and conditions

4. Sign a data sharing agreement that states they will not try to re-identify the data, will not share the data and will destroy the data within a specified period of time. This data sharing agreement will align with all the data sharing requirements from the HSCIC

5. Have appropriate Research Ethics approval

Any request for the data will need to be from a researcher from a research institute. The research question will need the appropriate HRA ethical approval. The request will need to demonstrate that it is scientifically robust, for healthcare purposes and for public benefit. Providing this is satisfied, a ‘dummy’ data set will be released to the researcher. This data set will look very similar to the real data but will not contain any identifiers and the data will be ‘jittered’ (randomly altered). This ‘dummy’ data set will allow the researcher to develop their analysis script. Once developed, this script will be returned to the UCL Identifiable Data Handling Solution (IDHS), where the IDHS staff will run the script on the live database. Only aggregate data with small numbers suppressed in line with the HES analysis guide will be released to the researcher.
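The ‘jittering’ described above is not specified in detail. As one possible reading, the Python sketch below perturbs numeric fields by a random relative amount and shuffles row order. The noise model, the 10% noise level, the field names and the function name jitter_rows are assumptions for illustration, not CCHIC's documented method.

```python
import random

def jitter_rows(rows, numeric_fields, relative_noise=0.1, seed=None):
    """Return a shuffled copy of `rows` with numeric fields randomly altered.

    Each listed field is scaled by a factor drawn uniformly from
    [1 - relative_noise, 1 + relative_noise]. This is an assumed noise
    model chosen for the sketch.
    """
    rng = random.Random(seed)
    jittered = []
    for row in rows:
        new_row = dict(row)  # copy so the live rows are never modified
        for field in numeric_fields:
            factor = 1 + rng.uniform(-relative_noise, relative_noise)
            new_row[field] = row[field] * factor
        jittered.append(new_row)
    rng.shuffle(jittered)  # break any link between row position and patient
    return jittered

# Illustrative rows with identifiers already removed; not real data.
live_rows = [
    {"heart_rate": 60, "unit": "ICU-1"},
    {"heart_rate": 75, "unit": "ICU-2"},
    {"heart_rate": 90, "unit": "ICU-1"},
]
dummy_rows = jitter_rows(live_rows, ["heart_rate"], seed=1)
```

The dummy rows keep the same columns and plausible value ranges, which is what a researcher needs to develop a script before it is run on the live database inside the safe haven.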

All researchers will be required to sign an end user license agreement that is modelled on that used by the UK Data service (see https://www.ukdataservice.ac.uk).

The Critical Care HIC Management Team will examine all data access requests. Requests that have the potential for identifying individuals (e.g. narrow date ranges, extremes of age) will be rejected. All data requests will be analysed by the Critical Care HIC management team so as to ensure that small data subsets cannot identify individuals.

UCL will examine data requests, classifying data into 4 categories: Direct Identifiers (always removed), Key Variables, Sensitive Fields and Non-identifying variables. The Key Variables and Sensitive Fields could potentially become identifiers in small numbers, so UCL employ the concepts of k-anonymity and l-diversity to address this issue. This methodology ensures that data does not contain numbers lower than ten, thus minimising the risk of re-identification.
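A minimal Python sketch of a combined k-anonymity and l-diversity check follows. The value k=10 reflects the "lower than ten" threshold stated above; l=2, the field names and the function name are assumptions made for illustration and are not taken from UCL's procedures.

```python
from collections import defaultdict

def find_risky_groups(rows, key_variables, sensitive_field, k=10, l=2):
    """Return groups that fail k-anonymity or l-diversity.

    Rows are grouped by their combination of key variables. A group fails
    if it has fewer than `k` rows (k-anonymity) or fewer than `l` distinct
    values of the sensitive field (l-diversity). `l=2` is an assumed default.
    """
    groups = defaultdict(list)
    for row in rows:
        key = tuple(row[name] for name in key_variables)
        groups[key].append(row[sensitive_field])
    failing = {}
    for key, values in groups.items():
        size, distinct = len(values), len(set(values))
        if size < k or distinct < l:
            failing[key] = (size, distinct)
    return failing

# Illustrative data only: 12 rows in one group, 3 rows in another.
rows = [
    {"age_band": "60-69", "sex": "F",
     "diagnosis": "sepsis" if i % 2 else "pneumonia"}
    for i in range(12)
] + [
    {"age_band": "90+", "sex": "M", "diagnosis": "sepsis"}
    for _ in range(3)
]
risky = find_risky_groups(rows, ["age_band", "sex"], "diagnosis")
```

Here the ("90+", "M") group is flagged because it has only 3 rows and a single diagnosis value; a request producing such a group would need coarser key variables (for example wider age bands) or suppression before release.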

Researchers will only have access to non-identifiable data as described above.

Researchers will come from a range of research institutions:
• Universities
• NHS Institutions
• Charitable Organisations
• Bioscience industries

It is likely that interest will be in data from the whole population rather than from a specific Trust. However, it could be expected that an analysis of geographic variation in practice and outcome would be performed. Currently all Trusts who submit data get an automated data quality report. It is an aim that this data is automatically audited against recognised practice and the results fed back directly (this component of work is underway).

The pharmaceutical, device and diagnostics industry may apply for access to the data as outlined above. UCL believe that the industry is absolutely vital to progress in this field. The IP developed from the use of the data will remain with the researchers. Access cannot be provided for free, as there are significant ongoing costs to hosting and maintaining CCHIC; we would expect such companies to cover these costs. Importantly, CCHIC is under the control of the 5 BRCs and is not for profit. All revenue will be re-invested into the project.

UCL are only permitted to approve projects to make use of the data for research where there is a clear benefit to the provision of healthcare or the promotion of health.

MR1213 - Lung-SEARCH: A RANDOMISED CONTROLLED TRIAL OF SURVEILLANCE FOR THE EARLY DETECTION OF LUNG CANCER... — DARS-NIC-147948-6MSGP

Opt outs honoured: N, Yes

Purposes: No (Academic)

Sensitive: Sensitive

When:DSA runs 2011-03 – 2021-03 2016.06 — 2018.03. SMLS reported a DPA serious incident; breached contract — audit report.

Access method: Ongoing

Data-controller type: CANCER RESEARCH UK, UNIVERSITY COLLEGE LONDON (UCL), UNIVERSITY COLLEGE LONDON (UCL)

Sublicensing allowed: No

Datasets:

MRIS - Cause of Death Report
MRIS - Cohort Event Notification Report
MRIS - Scottish NHS / Registration
MRIS - Flagging Current Status Report
MRIS - Members and Postings Report
MRIS - Personal Demographics Service

Type of data: Identifiable

Objectives:

The Lung-SEARCH trial is a national screening study in which >1700 individuals who smoke and have COPD are randomised to yearly sputum testing (with those who are positive having an annual bronchoscopy and CT scan) or no screening. All patients are followed up for 5 years, and the aim of the trial is to determine whether this screening policy can identify cancers at a lower stage at diagnosis. Because patients are free from cancer when they enrol in the study, we need to flag all patients with the national register to ensure that we find out about cancers in both trial groups, and also deaths and cause of death. The target sample size is about 80 lung cancers, so it is important that we can identify as many as possible, particularly those in the unscreened group.

Yielded Benefits:

Outputs:

The Ethics committee has been informed that this trial will be registered with the NHS Information Centre. Patients’ names are required for this and the ethics committee have approved this on the Patient Information Sheet and Consent Form.

Processing:

Patients sign the consent form after reading the Patient Information Sheet. This states that information from the National Health Service Care Register, the NHS Information Centre and/or Cancer Registries will be used to follow the patients’ progress.
It also states that these bodies need to be sent identifiable information, including names, to be able to provide information. This means the patient is happy for their details (including name and date of birth) to be sent to the Cancer Research UK & UCL Cancer Trials Centre. With the patient’s permission, their GP will be notified of their patient’s participation in the trial.

Acute Day Units as Crisis Alternatives to Residential Care — DARS-NIC-93084-W4B4L

Opt outs honoured: No

Legal basis: Health and Social Care Act 2012, Health and Social Care Act 2012 s261(1) and s261(2)(b)(ii), Health and Social Care Act 2012 s261(2)(b)(ii)

Purposes: No (Academic)

Sensitive: Sensitive, and Non-Sensitive

When:DSA runs 2017-10 – 2020-09 2017.09 — 2018.02. SMLS reported a DPA serious incident; breached contract — audit report.

Access method: One-Off

Data-controller type: UNIVERSITY COLLEGE LONDON (UCL)

Sublicensing allowed: No

AGD/predecessor discussions: igard_minutes_5_october_2017.pdf

Datasets:

Hospital Episode Statistics Admitted Patient Care
Hospital Episode Statistics Accident and Emergency
Mental Health Minimum Data Set
Bridge file: Hospital Episode Statistics to Mental Health Minimum Data Set
Hospital Episode Statistics Accident and Emergency (HES A and E)
Hospital Episode Statistics Admitted Patient Care (HES APC)
Mental Health Minimum Data Set (MHMDS)

Type of data: Anonymised - ICO Code Compliant

Objectives:

University College London requires the requested data for use in the Acute Day Care (AD-CARE) study. UCL applied for and was awarded funding in response to the NIHR’s commissioned call ‘15/24 Assessing service models of community mental health response to urgent care needs’.

AD-CARE aims broadly to assess the real life effectiveness and user experience of Acute Day Units (ADUs) as a community response to mental health crises.

The requested data will be used for the objective of answering the following research questions:
a) Are acute readmission rates reduced in areas/Trusts with a more enhanced crisis care pathway, defined as having an Acute Day Unit (ADU) in the pathway?
b) In Trusts with ADUs, do individuals who access NHS-funded ADUs have different outcomes compared to similar people who have had an acute episode but do not access ADUs?

UCL is the only organisation that will have access to the record level data requested of NHS Digital, and the only people accessing the data will be substantive employees of UCL.

AD-CARE is a new study for which data has not been supplied before.

The data required from NHS Digital are: 1) Mental Health data for 2013/14, 2014/15 and 2015/16 (observation period), plus data for 2011/12 and 2012/13 to provide information about cohort members prior to the start of the observation period; 2) HES data for the same period (2011-2016), to provide information about A&E attendance and admissions to acute care; 3) a bridge file linking these two data sets.

Expected Benefits:

This study aims to determine the effectiveness of Acute Day Units in reducing admission rates to acute psychiatric units, as well as rates of compulsory admissions in a one year period. This will make up a part of the delivery of a comprehensive report on the current value of ADUs, and recommendations about service models.

The study is in a sense exploratory, as it aims to make up for a dearth of information regarding modern ADUs; measurable benefits are therefore highly dependent on its results. If, for example, the study finds that areas with ADUs have a 5% reduction in admissions, and can therefore recommend more widespread use of these units, then an expected measurable benefit would be a 5% reduction in admissions in areas which go on to introduce ADUs. The emphasis by NHS England on evidence-based practice suggests that if evidence was found of reduction of inpatient admissions in areas with ADUs, such units would be recommended by policy makers. The AD-CARE team have good working relationships with NHS England. Reduced admissions to inpatient units are of benefit in terms of reduced costs, waiting times, and future admissions, and improved outcomes for service users in terms of recovery and service experience. Given the low number of NHS trusts that currently have ADUs, a national roll-out of such services to every Trust would potentially impact a very large number of service users.

Outputs:

This is an NIHR funded study and all outputs of the trial will therefore be reported to the NIHR as part of the final report at the end of the study.

AD-CARE is a 36-month study which began in 07/16. Write up of Work Package 3, for which this data will be used, is anticipated for months 30-36.

The study final report submission is anticipated for month 36 (06/19). This will cover all findings of the study including: factors influencing planning and implementation; the key findings of the study; and the response to the research questions. The NIHR will then publish the report of the AD-CARE trial on its website.

Dissemination will be carefully planned with The McPin Foundation and NHS England to ensure high quality peer review of outputs and stakeholder engagement and information sharing. The McPin Foundation is a UK charity which exists to transform mental health research, by placing the lived experience of people with mental health problems at the centre of research methods and the research agenda. NHS England mental health leads are keen to work with the study so that findings feed directly into developments regarding crisis care across the NHS.

The usual full scientific reports, peer reviewed papers, powerpoint presentations, conference talks, and web output will be provided, as expected of all NIHR studies. PPI and NHS management colleagues will also be consulted to disseminate our findings across a range of NHS and health provider platforms. Summary documents will be provided in a range of formats suitable for different audiences.

The AD-CARE study website is hosted at UCL and is updated with new information as appropriate (https://www.ucl.ac.uk/psychiatry/research/epidemiology/ad-care). All ADUs and crisis services have been sent links to the website. Twitter has been a useful vehicle for distributing publications to a variety of audiences, and will be used to share findings more widely once they are published in open access journals. In addition, an expert consensus meeting/conference will be held in the final two months of the project.

All outputs will contain only data that is aggregated with small numbers suppressed in line with the HES Analysis Guide.
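As a minimal sketch of what small-number suppression of an aggregated table can look like: the threshold of 7, the rounding base of 5 and the function name `suppress_small_numbers` are illustrative assumptions, not values taken from this agreement. The HES Analysis Guide remains the authority on the rules that actually apply.

```python
from typing import Dict, Optional


def suppress_small_numbers(
    counts: Dict[str, int],
    threshold: int = 7,      # assumption: counts from 1 to `threshold` are hidden
    rounding_base: int = 5,  # assumption: larger counts are rounded to this base
) -> Dict[str, Optional[int]]:
    """Return a releasable copy of `counts`.

    Zero stays zero, small non-zero counts become None (suppressed),
    and all other counts are rounded to the nearest multiple of
    `rounding_base` (halves round up).
    """
    released: Dict[str, Optional[int]] = {}
    for group, n in counts.items():
        if n < 0:
            raise ValueError(f"negative count for {group!r}")
        if n == 0:
            released[group] = 0
        elif n <= threshold:
            released[group] = None
        else:
            released[group] = (n + rounding_base // 2) // rounding_base * rounding_base
    return released
```

Applied to a table of admissions by area, any cell holding between 1 and 7 admissions would be withheld and the remaining cells rounded before release.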

Processing:

No data will be provided to NHS Digital by UCL. NHS Digital will provide pseudonymised HES and Mental Health data to UCL. This will then be stored at the UCL SLMS Safe Haven, a technical environment for storing, handling and analysing identifiable data which has been certified to the ISO27001 information security standard and conforms to NHS Digital's Information Governance Toolkit. The data will only be accessible through the Data Safe Haven.

NHS Digital will create a cohort of mental health patients (MHMDS) for the years 2013/14, 2014/15 and 2015/16. Data will then be extracted from the MHMDS for this cohort for the years 2011/12, 2012/13, 2013/14, 2014/15 and 2015/16. The cohort created in the first step of this process will then be linked to the requested HES datasets for the years 2011/12, 2012/13, 2013/14, 2014/15 and 2015/16.
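The cohort construction and linkage described above can be sketched as follows. This is an illustration only, not NHS Digital's procedure: the record layout (`study_id`, `hes_id`, `year`), the function names and the representation of the bridge file as a dictionary are all assumptions.

```python
from typing import Dict, List, Set

OBSERVATION_YEARS = {"2013/14", "2014/15", "2015/16"}
EXTRACT_YEARS = {"2011/12", "2012/13"} | OBSERVATION_YEARS

Record = Dict[str, str]  # hypothetical layout, e.g. {"study_id": "S1", "year": "2014/15"}


def build_cohort(mhmds: List[Record]) -> Set[str]:
    """Step 1: everyone with an MHMDS record in the observation period."""
    return {r["study_id"] for r in mhmds if r["year"] in OBSERVATION_YEARS}


def extract_mhmds(mhmds: List[Record], cohort: Set[str]) -> List[Record]:
    """Step 2: all MHMDS records for cohort members across the extract years."""
    return [r for r in mhmds if r["study_id"] in cohort and r["year"] in EXTRACT_YEARS]


def link_hes(hes: List[Record], bridge: Dict[str, str], cohort: Set[str]) -> List[Record]:
    """Step 3: keep HES records that the bridge file maps to a cohort member."""
    linked = []
    for r in hes:
        study_id = bridge.get(r["hes_id"])  # None when the bridge has no match
        if study_id in cohort and r["year"] in EXTRACT_YEARS:
            linked.append({**r, "study_id": study_id})
    return linked
```

Defining the cohort from the observation years first, and only then pulling earlier years, is what limits the 2011/12 and 2012/13 data to people who appear in the observation period.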

The core datasets will only be accessed by one statistician within UCL, who will be a substantive UCL employee. The full data set will be used to consider whether engagement with community services is a predictor or consequence of ADU treatment. Cases in the data set that have used an acute mental health care service will then be identified. ‘Acute mental health care service’ will be defined by use of in-patient, Acute Day Service (ADU), Crisis Resolution Home Treatment (CRT), Crisis House, or other locally-defined acute services. Discussion and exploratory work with NHS Digital has resulted in the decision that UCL are best placed to identify these cases due to the lack of clarity over the definition of ‘Acute mental health care service’ users. UCL will identify the ‘Acute mental health care service’ users and the non-acute service users' records will be deleted from the dataset. UCL will also provide NHS Digital with the methodology on how to identify the non-acute service users’ records and remove them from the dataset.

NHS trusts use a variety of nomenclature for the services they provide, and identifying all and only the acute services is anticipated to be a complex task. It is unlikely that NHS Digital would be able to filter the full data set to provide all cases with acute service use. The data set will then be restricted to service users with records in MHMDS for the study years who have used any acute (urgent) mental health care service during this time. Non-acute service users’ records will be deleted from the data set at this point. Access to and use of these services will be used to identify the start and end of episodes of acute care. Multiple episodes per service user will be included. No data about or from service users will be obtained from any other sources.

Given that this study requires data from two separate databases (MHMDS and HES) across multiple years, a unique service user identifier (the MHMDS study ID) will be attached to each record by NHS Digital. This will make it possible to link care spells across reporting periods (years) and different providers of publicly available data, as well as to HES data. MHMDS also includes geographic identifiers (Provider Trust or hospital, Commissioner, GP practice and Census Lower Super Output Area (LSOA) based on the postcode for the service user’s residence).
MHMDS records will be linked to local Census and deprivation data at the area level:

2011 Census (A), which is publicly available from ONS (www.ons.gov.uk/ons/datasets-and-tables/index.html). It will be linked by geocoding the postcode of residence for each service user to its corresponding lower super output area (LSOA), a suitable spatial scale at which Census data is made available.
Index of Multiple Deprivation (A), created by the Department for Communities and Local Government, is publicly available data and was last issued in 2015. Service users’ postcode of residence will be geocoded to their LSOA, as above, to obtain estimates of deprivation.

There will be no requirement nor attempt to re-identify individuals from the data.

The data will not be made available to any third parties except in the form of aggregated outputs with small numbers suppressed in line with the HES Analysis Guide.

All organisations party to this agreement must comply with the Data Sharing Framework Contract requirements, including those regarding the use (and purposes of that use) by “Personnel” (as defined within the Data Sharing Framework Contract, i.e. employees, agents and contractors of the Data Recipient who may have access to that data).

A secondary analysis will be undertaken of the MHMDS data using multilevel modelling (sometimes known as hierarchical modelling). Multilevel modelling is a statistical method designed to take account of clustered data, e.g. to take account of the fact that people using an NHS Trust in one area of the country may be more similar to each other than to people using an NHS Trust in a different area.
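For illustration only, a two-level random-intercept model of the kind described above can be written as below, where y_ij is the outcome for service user i in Trust j, x_ij is an individual-level covariate (for example ADU use), and u_j is a Trust-level random effect. The notation and the choice of a linear model are assumptions; the agreement does not specify the model form.

```latex
y_{ij} = \beta_0 + \beta_1 x_{ij} + u_j + \varepsilon_{ij},
\qquad u_j \sim N(0, \sigma_u^2),
\quad \varepsilon_{ij} \sim N(0, \sigma_\varepsilon^2)

\rho = \frac{\sigma_u^2}{\sigma_u^2 + \sigma_\varepsilon^2}
```

Under these assumptions, the intraclass correlation ρ is the share of outcome variance that sits at the Trust level, which is one way of expressing how similar service users within the same Trust are to one another.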

Data will be analysed as a cohort. Two sets of analyses will be undertaken at the individual level: 1) Within those areas that have an ADU, outcomes for service users who use the ADU will be compared to outcomes for service users who do not; 2) Using all areas (whether or not they have an ADU), service users who access an ADU during an episode of acute care will be compared with those who do not.

The overall association will be explored between services with and without ADUs in terms of overall admission rates to inpatient and crisis services, exploring how much variance is explained at the organisational/geographical level, accounting for social variables such as demographics (gender, age, ethnicity) and deprivation score.

Time of cohort entry (the start of the first episode of acute care) will be defined by the first occurrence of any of (a) Crisis Resolution Home Treatment service contact; (b) admission to a mental illness bed; or (c) contact with an ADU.
Associations will be investigated between ADU attendance and study outcomes. The hypothesis of the study will be tested: that service users who attend an ADU during an acute care episode spend less time as in-patients than matched controls, have fewer A&E attendances and a longer time to relapse (i.e. further use of acute care services).
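The cohort entry rule above (the first occurrence of any of the three qualifying contacts) can be sketched as follows. The event labels and the function name are hypothetical; the real MHMDS fields used to recognise each contact type are not given in this document.

```python
from datetime import date
from typing import Iterable, Optional, Tuple

# Hypothetical labels for the three qualifying contact types (a), (b) and (c).
ENTRY_EVENTS = {"crht_contact", "inpatient_admission", "adu_contact"}


def cohort_entry_date(events: Iterable[Tuple[date, str]]) -> Optional[date]:
    """Return the earliest date of any qualifying acute care contact.

    `events` is an iterable of (date, event_type) pairs for one service
    user. Returns None if the user has no qualifying contact.
    """
    qualifying = [day for day, kind in events if kind in ENTRY_EVENTS]
    return min(qualifying) if qualifying else None
```

A service user with only non-acute contacts gets no entry date, which matches the restriction of the data set to acute service users.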

Two additional analyses will be undertaken, a survival analysis (which looks at how long it takes before someone uses acute services again), and a cohort analysis (which looks at the group of people who use a certain service for a period of time). The details of each analysis are as follows: (1) survival analyses (in which the outcome is time to subsequent acute care episode) using multilevel modelling to test for the main effects of ADU use versus no ADU use on study outcomes and to model variance in these associations between provider Trusts; and (2) cohort analysis in which the primary outcome will be the total time spent in acute care over the 6 months since entering the cohort.
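The primary outcome of the cohort analysis (total time in acute care over the 6 months after cohort entry) could be computed along these lines. This is a sketch under stated assumptions: 6 months is approximated as 183 days, episodes are (start, end) date pairs with the end date exclusive, and episodes for one service user do not overlap each other. None of these conventions comes from the agreement.

```python
from datetime import date, timedelta
from typing import Iterable, Tuple


def days_in_acute_care(
    entry: date,
    episodes: Iterable[Tuple[date, date]],
    window_days: int = 183,  # assumption: "6 months" taken as 183 days
) -> int:
    """Sum the days of acute care falling inside the follow-up window.

    The window runs from `entry` (inclusive) to `entry + window_days`
    (exclusive). Each episode is clipped to the window before counting.
    """
    window_end = entry + timedelta(days=window_days)
    total = 0
    for start, end in episodes:
        overlap_start = max(start, entry)
        overlap_end = min(end, window_end)
        if overlap_end > overlap_start:
            total += (overlap_end - overlap_start).days
    return total
```

Clipping each episode to the window means an admission that runs past the 6-month mark only contributes the days that fall within follow-up.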

In the economic evaluation, source unit costs will also be identified for all service use within each person-level episode file. This analysis will also use multilevel modelling, with cost as the dependent variable, to identify predictors of costs incurred by service users and to estimate the impact of ADU attendance on subsequent costs associated with service use. In summary:
• Individuals, working under appropriate supervision on behalf of data controller(s)/processor(s) within this agreement, are subject to the same policies, procedures and sanctions as substantive employees.
• All outputs will be restricted to aggregated data with small numbers suppressed in line with the HES Analysis Guide.
• All outputs will follow the Mental Health (MHSDS, MHLDDS, MHMDS) disclosure control rules.
• The data from NHS Digital will not be used for any purpose other than that outlined in this Agreement.

Small area geodemographic profiling of health needs — DARS-NIC-28051-Q3K7L

Opt outs honoured: N

Legal basis: Health and Social Care Act 2012, Health and Social Care Act 2012 s261(1) and s261(2)(b)(ii), Health and Social Care Act 2012 s261(2)(b)(ii)

Purposes: No (Academic)

Sensitive: Non Sensitive, and Non-Sensitive

When:DSA runs 2018-03 – 2020-09 2017.09 — 2017.05. SMLS reported a DPA serious incident; breached contract — audit report.

Access method: One-Off

Data-controller type: UNIVERSITY COLLEGE LONDON (UCL)

Sublicensing allowed: No

Datasets:

Hospital Episode Statistics Admitted Patient Care
Hospital Episode Statistics Admitted Patient Care (HES APC)

Type of data: Anonymised - ICO Code Compliant

Objectives:

The objective of this research project is to create geodemographic, small area profiling of Health Needs, which takes into account a range of patient and contextual characteristics.

Comprehensive and precise assessment of health needs, as required by The Local Government and Public Involvement in Health Act 2007 (see section 116 on JHWBs and JSNAs), remains a significant challenge due to the existence of complex pathways and multi-level processes. Because different groups of people have been shown to have different vulnerabilities to ill-health in different circumstances, this study seeks to identify:

(1) general health needs/vulnerability at the small area level and
(2) the geographical circumstances in which certain types of patients (classified by ethnicity) appear to be vulnerable/express different health needs.

This is novel research that will account for health needs at multiple levels (patient group, small areas) by taking a distinct geomedical angle that draws on the latest advances in spatio-epidemiological modelling, geodemographics, analysis of surname geographies and population genetics.

The work that has been completed under the current Data Sharing Agreement has found marked differences in health needs between geographical areas. However, the work was limited by the available years of HES extracts (2003-2009). This time period does not allow for appropriate and robust age-and-sex standardisation based on reliable data sources (i.e. Census 2001 and 2011). In addition, the short time period does not permit investigation of temporal trends and thereby the stability of area health profiles. Stability of small areas health profiles is an essential dimension of health needs assessment and crucial for informed policy decisions, such as resource allocation.

UCL therefore wish to refine these objectives in two ways:

a) to break down small area health profiles by patient category; and
b) to assess the stability of health profiles by using more years of HES extracts covering two Census periods.

As soon as 2001 and 2011 HES extracts become available, the work completed so far can be updated and put forward to peer review and academic publication as well as dissemination to the health care community (see section 5c below).

In short, the follow-up research will develop a dynamic model to predict long term health needs refined for groups of patients and small areas.

Reasons for this study:
Geo-temporal small-area profiling is useful in identifying need for intervention, assessing causes of health challenges (specifically different profiles of disease burdens) by characterising their nature and spatial extent as well as the geographic and demographic context in which they are manifest. Small area profiling thus supports the definition of policy priorities at the strategic level as well as more tactical decisions by care providers.

For example, a temporally persistent disadvantaged profile in a number of neighbourhoods in a city can support evidence-based policy making by defining local strategies to address these challenges. On the provider level, awareness about locally specific health challenges and their contexts can support operational decisions, such as treatment or screening choices.

Small-area profiling is a suitable method to support strategies and decisions, because of its capacity to estimate health challenges in robust ways (or at least with a number of confidence measures about the certainty of the estimate) and to account for geographical and demographic context.

Similar approaches have been applied to ONS mortality data, and geographically and demographically varying challenges could be identified (see e.g. Green et al 2014 Health & Place 30C, Shelton et al 2006 Health & Place 12.4). Yet mortality data are limited in the sense that they are retrospective and less useful for defining care priorities (since a death has already occurred), and they only consider the cause of death, leaving out non-fatal and temporary conditions.

There is no study that uses HES data in this way, but work that has been carried out under the previous agreement (NIC-33864-6226N) suggests that resulting classifications can provide significant benefit in summarising and contextualising health needs with direct implications for care.

A similar product that already exists is healthACORN, but this product is limited in terms of health conditions it focusses on and its transparency: as a commercial product, the methods are not revealed, only the resulting classifications of fairly broad categories.

More generic classifications have been employed in health studies, particularly focussing on health screening (e.g. Sheringham et al 2009 Sexual Health 6.1, Noaham et al 2010 Journal of Public Health 32.4). But these studies, too, had to rely on generic and partly commercial products, and it is expected that the utility in assessing need and health screening can be improved if classifications are devised more scientifically and focussed on observed morbidity.

Therefore, a transparent, robust, dynamic and contextualised small area classification would be a valuable resource for policy makers and care providers alike who seek to define priorities and take decisions that are informed by a detailed and local understanding of health needs in the spirit of joined-up health care (see NHS Institute for Innovation and Improvement 2010 ‘Joined-up Care’).

Yielded Benefits:

The research programme under which this research is funded lasts until September 2020, and it is within this context that the above mentioned benefits are expected to be realised.

Expected Benefits:

The outcome of this research is envisioned to support local Clinical Commissioning Groups (CCGs), local authorities and other public partners including third sector organisations in determining their budgeting and commissioning priorities, in which a long-term view and strategic need orientation is crucial. CCGs are expected to lead Joint Strategic Health Needs Assessments (JSNAs) and Joint Health and Well-being Strategies (JHWBs), and there is a need to find appropriate methods and data sources to do this.

By investigating and contextualising specific health needs by different groups of patients and areas, UCL intend to support these strategic players in assessing health needs as part of JSNAs and developing JHWBs by taking a novel long-term, geographic and temporal view of local health needs and preventive health care. At the strategic level, these will be the primary beneficiaries through the provision of a detailed, transparent and robust small area profiling. At an operational level, care providers (such as GP practices or hospital trusts) can use the classification in delivering personalised patient care, on which there is currently an important emphasis (see e.g. latest report of London Health Commission, 2014. ‘Better health for London’), including treatments and targeted health screening initiatives.

The classification, including metadata and policy recommendations, will be available for download at the aforementioned website and is thus likely to reach a wide audience. In the long run, a common evidence base will enable various players of the health care system to deliver a higher level of joined-up care.

So far, analysis of the data has demonstrated the relevance of geographical variations in health needs and is being prepared for publication and sharing with CCG Ealing to link the findings to practice in North West London. It is intended to further and refine the investigation (with the inclusion of ethnicity) to deliver benefits using the existing formal connection to CCG of Ealing (which is their representation on the advisory board of the UCL research group of which the applicants are part). The data will be made available for download when classifications can be refined, i.e. by the end of the project at the latest, if not earlier for sufficiently robust, intermediate findings.

The target date for this research is December 2017.

In summary, there are three main areas in which this study is intended to contribute to improving patient care.

1. At the strategic level, UCL seek to support Clinical Commissioning Groups (CCGs), local authorities and other public sector partners in defining budgeting and commissioning priorities as part of the compulsory Joint Strategic Health Assessments (JSNAs) and Joint Health and Well-Being Strategies (JHWBs) (as per section 116, The Local Government and Public Involvement in Health Act 2007).

2. At a more operational level, the study is also intended to improve personalised patient care by providing care providers, such as GP practices, with an evidence base (the profiling) that gives a detailed contextualisation of local challenges, including conditions by a refined measure of ethnicity. In addition, the classification is intended to facilitate targeted health screening in local areas, in which ethnicity has emerged to be a crucial factor in both screening uptake and condition onset.

3. The output of this research is intended to support preventive care by all strategic and operational organisations and agencies, by suggesting potential causes of observed health challenges through contextualisation, including links to a refined measure of ethnicity.

Outputs:

The output of the research will be presented at seminars and conferences (such as AAG - American Association of Geographers – Annual Conference: AAG Health and Medical Geography Specialty Group; International Population Data Linkage Conference – Farr Institute; REVES International Network on Health Expectancy Annual conferences), and included in published research papers in peer-reviewed journals (such as Social Science & Medicine; Health & Place; Int Journal of Epidemiology and Community Health). Intermediate findings are currently being prepared for publication in Health & Place and presentation at the above-mentioned AAG conference as part of the International Geography, GIScience, and Urban Health featured theme.

Once the data has been disseminated from NHS Digital to UCL, other conferences and papers may also be included but cannot be confirmed at this time.

The final classification of LSOAs along with recommendations for policy will also be published and made available for download on a website that is linked to the ESRC-funded Consumer Data Research Centre (CDRC – data.cdrc.ac.uk). The CDRC website attracts an increasing user base from a range of NHS organisations (Public Health England, CCGs, NHS trusts), local government and third sector organisations. New data products are announced and linked in email notifications, quarterly newsletters and special features, such as the CDRC Map of the Month, and Twitter feeds. On average, there are about 150,000 page views and 50,000 data downloads per year.

Outside the CDRC community, UCL are keen to engage with Public Health England and are in contact with them to help identify suitable methods of dissemination, possibilities of promoting the CDRC website and products to the health care community, notably CCGs.

In summary, the outputs of the research comprise:
• a novel geodemographic classification of health needs at LSOA level
• a detailed, contextualised characterisation of health needs and challenges associated with each area profile
• recommendations for policy and health and social care

The results will be disseminated in the following ways:

• presentation and publications within the academic community
• provision on data.cdrc.ac.uk, the CDRC data catalogue
• visualisation on maps.cdrc.ac.uk, the widely accessed map service of CDRC
• targeted engagement with Public Health England using existing links with this research group
• presentation of outputs to CCG Ealing using existing formal links between the CCG and CDRC

All outputs will use aggregate data with small numbers suppressed in line with the HES analysis guide.

Processing:

Data minimisation
The following data minimisation strategies have been considered:
(1) Temporal censoring is deemed impractical for the following two reasons. First, the health care benefit of the output (geodemographic profiles of health needs) is strongly limited if the question of temporal stability remains unaddressed: are the health needs only short-lived phenomena, or long-term challenges that require strategic attention? Second, the research needs to capture at least two census periods in the analysis so as to derive age-and-sex standardisation. Currently, the existing work with data pertaining to years 2003-2009 is compromised due to potentially inaccurate age-and-sex standardisation.

(2) Demographic censoring (restricting the analysis to certain age, sex or ethnic groups) would limit the validity of the work, since the work aims at developing profiles for the entire population that comprises all ethnicities, age and sex groups. Only with the full population, will the research be able to display the full range of challenges and priorities facing local health care.

(3) Geographic censoring would equally reduce the possibility to identify specifically local challenges and hence limit the utility of the geodemographic profiles for the health care sector. Specific regional and local challenges can only be identified by viewing local patterns against country-wide patterns (e.g. averages). For example, in previous analysis, it was found that London faces greater incidence of sense organs and nerves-related conditions than other city regions in England. This specific challenge would have remained masked in an analysis that focussed on London only.

(4) A sample would undermine the validity, granularity and robustness of the work. In addition, sampling would be extremely difficult to implement within the context of this study. The estimation of geographically varying challenges needs to be performed at a sufficient level of geographical granularity. Geodemographic indicators are typically developed at postcode or Census Output area level. The most granular level available in HES is LSOA (Lower Layer Super Output Area). There are approx. 27,000 LSOAs in England, and previous work with HES data has confirmed the need for a sufficient number of cases to develop robust small area estimates and to capture temporal trends. A sample would need to cover each LSOA, be proportionate to local admissions within each LSOA, be demographically representative of all patients in each LSOA and be repeated in this way each year. As such, this would be extremely difficult to implement and would significantly compromise the objective of the study to provide robust, full-fledged and sufficiently granular health profiles.

These data are obtained for research purposes in medical geography and will be processed fairly according to regulations and standards of the Data Protection Act and corresponding UCL policies.

The processing of the information is carried out for non-commercial research and educational purposes at a higher educational institution (UCL) in exercise of its legitimate functions of training and research.

The data will be accessed only by substantive employees of UCL and only for the purposes described in this document.

All relevant individuals (Data Protection Officer, Departmental IT Representative, Computer Security Team, Data Protection Coordinator) are informed about the research proposed and are able to monitor proper conduct in all procedures.

The data are used as direct input in the analysis and some data will be processed for the purpose of names classification and geographical classification. Items that are not used as input in the way set out in this application are not of interest and therefore not requested. In order to reduce the risk of identification to an acceptable minimum, a special procedure for names extraction has been developed jointly with NHS Digital.

At the end of the study, the data will be destroyed in accordance with UCL’s retention policy. The UCL Computer Security Team has developed guidelines of safe removal, and a retention schedule that is developed with the UCL Records Manager will ensure that timely removal can be monitored.

No data will be transferred to third parties, EU countries or countries outside the EU at any point of time in the research.

The project has successfully undergone Ethical Review and review by HRA's Confidentiality Advisory Group.

The DH Information Governance toolkit has been completed and is reviewed regularly. The project has achieved level 2 of requirements for Hosted Secondary Use Team/Project (IG toolkit version 13).

The data will be stored in a database at UCL's School of Life Science and Medical Studies’ safe haven environment and undergo statistical, multivariate analysis. The data will be processed from an authorised PC client located within UCL and queries will be performed through secure data access. Secure output files (e.g. statistical results) may be transferred through secure file transfer subject to standards of disclosure control. The data will also be related to census datasets (UK Census 2001 and 2011) using information at LSOA (Lower Layer Super Output Area) level. The LSOA code held in the HES extract will be used to match HES records to residential context.

In terms of patient classification, UCL will use demographic data (age, sex), primary diagnosis and admission and discharge information. Ethnicity will be derived by linking HES records to the PDS and classifying patients’ names. This linkage will be performed by NHS Digital. In previous communication with NHS Digital (ref NIC-216528-N0N5Q), UCL established the technical feasibility of a procedure to link HES extracts and use names stored in PDS to create a bespoke patient classification. CAG has reviewed this procedure and confirmed that section 251 support is not required as no confidential data will be disclosed to the researcher.

The data processing is a five-step process, as follows:

1. NHS Digital extracts the patient identifiers of all patients in the HES index for the years 1998/99 to 2012/14
2. NHS Digital links the identifiers to the PDS (Personal Demographics Services) data which is held on the MIDAS system
3. NHS Digital applies the names classification algorithm provided by UCL. No names are disclosed to the researcher.
4. NHS Digital adds the name class to each HESID row
5. NHS Digital supplies a file of HES records with the requested fields, including pseudonymised HESIDs and linked classes
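
As a rough illustration only, the five steps above could be sketched as follows. Everything in this sketch is hypothetical: the `classify_name` lookup stands in for the UCL-provided algorithm (which is not described in this document), the salted hash is just one possible way to pseudonymise a HESID, and the field names are invented for the example.

```python
import hashlib

# Hypothetical stand-in for the names classification algorithm supplied by UCL.
NAME_CLASSES = {"SMITH": "class_A", "PATEL": "class_B"}


def classify_name(surname):
    """Map a surname to a name class; unknown names fall into a default class."""
    return NAME_CLASSES.get(surname.upper(), "unclassified")


def pseudonymise(hesid, salt="demo-salt"):
    """Assumption: a salted SHA-256 digest. The real method is not stated."""
    return hashlib.sha256((salt + hesid).encode("utf-8")).hexdigest()[:12]


def build_extract(hes_rows, pds_surnames):
    """Return the rows that would be supplied to the researcher."""
    extract = []
    for row in hes_rows:                                  # step 1: HES identifiers
        surname = pds_surnames.get(row["hesid"])          # step 2: link to PDS
        name_class = (
            classify_name(surname) if surname else "unclassified"
        )                                                 # step 3: classify the name
        extract.append({                                  # step 4: attach the class
            "pseudo_hesid": pseudonymise(row["hesid"]),
            "diagnosis": row["diagnosis"],
            "lsoa": row["lsoa"],
            "name_class": name_class,
        })
    return extract                                        # step 5: no names included


hes_rows = [
    {"hesid": "H1", "diagnosis": "I63", "lsoa": "E01000001"},
    {"hesid": "H2", "diagnosis": "I61", "lsoa": "E01000002"},
]
pds_surnames = {"H1": "Smith"}  # H2 has no PDS match in this toy example
supplied = build_extract(hes_rows, pds_surnames)
```

The point of the sketch is the separation of duties: names are only touched inside `build_extract`, and the returned rows carry the class and a pseudonymised identifier rather than the name or the original HESID.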

UCL expect to develop such a method by grouping and aggregating health diagnoses for each small area by patient category over a study period from 1999/00 to 2013/14, which covers the 2001 and 2011 Census periods. This allows area linkage to contextual variables, including the prevalence of long-term limiting illness and aggregate demographic characteristics of Lower-Layer Super Output Areas (LSOAs), which are also required for age-and-sex standardisation. The Census neighbourhood statistics, which the data will be linked to, are Office for National Statistics-cleared aggregate statistics; they do not contain information on individuals.
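
A minimal sketch of the grouping and standardisation described above, under stated assumptions: the text does not say whether direct or indirect standardisation is intended, so direct standardisation is shown as one possibility, and the field names, age bands and population figures are invented for illustration.

```python
from collections import Counter


def aggregate_counts(records):
    """Count records per (LSOA, patient category, age band, sex)."""
    counts = Counter()
    for r in records:
        counts[(r["lsoa"], r["name_class"], r["age_band"], r["sex"])] += 1
    return counts


def directly_standardised_rate(events, population, standard):
    """Weight stratum-specific rates by a standard population.

    All three arguments are dicts keyed by (age_band, sex).
    Strata with no local population are skipped.
    """
    total_standard = sum(standard.values())
    weighted = 0.0
    for stratum, standard_pop in standard.items():
        local_pop = population.get(stratum, 0)
        if local_pop == 0:
            continue
        weighted += events.get(stratum, 0) / local_pop * standard_pop
    return weighted / total_standard


# Toy figures for a single area and patient category (hypothetical).
events = {("0-64", "F"): 2, ("65+", "F"): 6}
population = {("0-64", "F"): 100, ("65+", "F"): 100}
standard = {("0-64", "F"): 800, ("65+", "F"): 200}
rate = directly_standardised_rate(events, population, standard)
```

In practice the local population denominators would come from the Census aggregate statistics for each LSOA, which is why the area linkage is needed before any standardised rate can be computed.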

The work will be carried out by specified users in the Department of Geography at University College London, who are substantive employees of UCL, and only for the purposes described in this document. The project objectives and plan have been reviewed by academic staff from UCL Epidemiology and Public Health, and the interaction will continue throughout the project to ensure scientific rigour and maximum impact for relevant interest or user groups, notably CCGs.

Project 96 — DARS-NIC-18646-P0R3M

Opt outs honoured: N

Legal basis: Informed Patient consent to permit the receipt, processing and release of data by the HSCIC

Purposes: ()

Sensitive: Non Sensitive, and Sensitive

When:2017.06 — 2017.05. SMLS reported a DPA serious incident; breached contract — audit report.

Access method: Ongoing, One-Off

Data-controller type:

Sublicensing allowed:

Datasets:

Hospital Episode Statistics Accident and Emergency
Hospital Episode Statistics Admitted Patient Care
Hospital Episode Statistics Critical Care
Hospital Episode Statistics Outpatients
MRIS - Cohort Event Notification Report
MRIS - Cause of Death Report

Type of data:

Objectives:

The Clinical Relevance of Microbleeds in Stroke (CROMIS-2) study is an observational inception cohort study of patients throughout the UK (80 hospitals) started on best practice oral anticoagulant (without prior use) for presumed cardioembolic ischaemic stroke due to non-valvular AF with follow up for the occurrence of ICH, ischaemic stroke and cognitive function for two years.

Over the last decade, increasing use of oral anticoagulants to prevent cardioembolic ischaemic stroke due to atrial fibrillation (AF) in an ageing population has led to a five-fold increase in the incidence of anticoagulant-related intracranial haemorrhage (ICH) - a rare but unpredictable and catastrophic complication. Cerebral microbleeds (CMBs) on magnetic resonance imaging (MRI) may predict ICH risk, as may genetic polymorphisms influencing brain small-vessel integrity or anticoagulation stability.

The CROMIS-2 study aims to establish the value of CMBs and genetic factors in predicting symptomatic ICH following best practice oral anticoagulation to prevent recurrent ischaemic stroke due to AF.

The data provided by NHS Digital about patients recruited to the CROMIS-2 study will inform UCL when an outcome event occurs during their 24 month follow up period. This supports the objectives of the study and will allow reporting on the whole study population, and helps ensure families are not approached after a relative’s death.

The study has also recruited patients admitted to participating centres with intracerebral haemorrhage (ICH), and DNA is collected to increase the power of the genetic studies. Clinical and imaging data are collected from these ICH cases to investigate risk factors associated with anticoagulant-related ICH compared to non-anticoagulant-related ICH.

The CROMIS-2 primary research questions are:

(1) Does the presence of CMBs help predict the risk of symptomatic oral anticoagulant-related ICH in patients who are anticoagulated following cardioembolic stroke due to non-valvular AF?

(2) Do the burden (number) and distribution of CMBs at baseline influence the risk of ICH in this cohort?

Secondary Questions

(3) In patients anticoagulated after ischaemic stroke due to non-valvular AF, are CMBs associated with an increased risk of recurrent TIA, ischaemic stroke or death?
(4) Are genetic polymorphisms related to the integrity of brain small vessels or anticoagulant metabolism associated with an increased risk of ICH?
(5) Are CMBs a better predictor of oral anticoagulant-related ICH than clinical risk factors and/or leukoaraiosis on MRI scans?
(6) Can a useful risk prediction model incorporating clinical, imaging and genetic factors be developed to assess the risk of best practice oral anticoagulant-related ICH?
(7) Can UCL identify new genetic, clinical or radiological risk factors associated with anticoagulant-related ICH?

The data will not be used for commercial purposes, and will not be provided at record level to any third party, or used for direct marketing in any way.

The study tracks patients using NHS Digital's patient tracking service, and also by contacting the patients and their GPs at 6, 12 and 24 months to obtain follow up for the outcome measures. The purpose of receiving HES data is to capture any hospital episodes of the cohort that are missed by the other follow-up methods, ensuring high data completeness and the success of the study, which aims to directly benefit the health and social care system by allowing better management of patients following stroke. Patients were recruited from August 2011 until July 2015, as an extension was granted by the funders to increase the sample size. UCL are now in the follow up phase. Follow up will end July 2017.

Expected Benefits:

Both CROMIS-2 non-vitamin K oral anticoagulant (NOAC) ICH and non-NOAC ICH are expected to add a great deal to the understanding, prevention and management of ICH.

Published papers:

1) This paper outlined how NOAC intracerebral haemorrhage volumes were smaller and clinical outcomes better when compared with warfarin intracerebral haemorrhage in the CROMIS-2 data set. This may lead clinicians to have more confidence in prescribing non-vitamin K oral anticoagulants, as there has been previous concern that ICH on these medications would be large and devastating.

2) This paper outlines current vitamin K antagonist reversal strategies in ICH and their correlation with clinical outcome. We were able to show the combination of fresh frozen plasma and prothrombin complex concentrate might be associated with the lowest case fatality in reversal of vitamin K antagonist associated intracerebral haemorrhage (VKA-ICH) (VKA=warfarin in most cases), and fresh frozen plasma (FFP) (used to reverse anticoagulation) may be equivalent to prothrombin complex concentrate (PCC) (used to reverse vitamin K antagonists, i.e. warfarin). This helps clinicians make treatment decisions in patients with intracerebral haemorrhage.

Conference talks and published abstracts:
1) Missed opportunities to prevent stroke in patients with AF. European Stroke Organisation Conference 2016. European Stroke Journal.
This talk highlighted that only 1/3 of patients with AF and fulfilling guidelines for anticoagulation were anticoagulated. This should highlight to clinicians and GPs that patients with AF should be on anticoagulation to prevent ischaemic strokes.

Publications pending:
1) Results of CROMIS 2. Yet to be written/submitted
It is difficult to hypothesise what the benefits of this paper will be, as we do not know the results. Either way it will be the largest dataset of patients with ischaemic stroke, AF and anticoagulants with rating of cerebral microbleeds. It should inform clinicians what the risks of cerebral microbleeds are in these patients. This has not been written or submitted at this stage.

Through the papers listed above UCL would hope to be able to improve risk prediction for patients with ischaemic stroke and CMBs as well as for patients with ICH who have concurrent AF. Dissemination will be primarily through peer reviewed journals and presentations at major stroke conferences (International stroke conference, European stroke organisation conference and UKSF).

This study will help guide clinicians and policy as to when it is safe to provide anticoagulants to patients who have had ischaemic stroke with atrial fibrillation. The study results are also being pooled with an international collaboration, involving 15 other studies around the world that have been collecting data on the same sample of patients. This will further inform policy and benefit clinicians. The Stroke Association have a conference every year which UCL are part of, with a stand and also with talks and presentations. The Stroke Association are supportive of UCL’s research.

Outputs:

All outputs will be aggregated with small numbers suppressed in line with the HES analysis guidance.
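
A minimal sketch of small-number suppression on an aggregated table. The threshold of 5, the choice to leave zero counts visible and the `*` marker are assumptions made for illustration; the applicable HES analysis guidance should be consulted for the actual rules.

```python
def suppress_small_numbers(table, threshold=5, marker="*"):
    """Replace non-zero counts at or below the threshold with a marker.

    `table` maps a category label to an aggregate count.
    """
    return {
        label: (marker if 0 < count <= threshold else count)
        for label, count in table.items()
    }


# Hypothetical aggregate counts by outcome.
outcome_counts = {"ischaemic stroke": 42, "ICH": 3, "cardiac event": 6, "death": 0}
published = suppress_small_numbers(outcome_counts)
```

Primary suppression of this kind is only a first step: where row or column totals are also published, further suppression may be needed so that a hidden cell cannot be recovered by subtraction.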

Outcome measures are reoccurring stroke (by type), cardiac events, bleeding events and death. The study is funded until July 2017 and findings will be available early 2018. The data from the study will be anonymised fully, and be published in research journals, and the outputs will be accessible to clinicians, academics and the public, as well as charities such as the Stroke Association and the British Heart Foundation. The data will be beneficial to the health and social care system by enabling clinicians and academics to make informed decisions about anticoagulation for patients who have had an ischaemic stroke due to Atrial Fibrillation.

The study findings from CROMIS 2 study I (AF) will be available at the end of the study in 2017 (primary output). The study findings will be disseminated via a research article which will be submitted to a high impact peer reviewed journal (e.g. Lancet Neurology) and aims to help guide clinicians with anticoagulant decisions in patients with cerebral microbleeds.

In the interim, baseline data and other sub studies will be published. UCL are presenting two papers at the UK Stroke Forum (2016), highlighting findings from the baseline data. Both presentations are published as abstracts in the International Journal of Stroke. These abstracts will be made into papers in due course and submitted to high impact peer review journals who will review and decide whether to disseminate based on merit. The first talk outlines how common microbleeds are in this cohort and the second talk highlights the missed opportunities in preventing stroke secondary to atrial fibrillation in the UK.

CROMIS study II (ICH) has produced a paper currently in press on a sub study, showing the clinical and radiological characteristics as well as outcomes of warfarin intracerebral haemorrhage vs. direct thrombin inhibitor intracerebral haemorrhage. This paper has been accepted by Neurology, a high impact clinical journal. It will help guide clinicians in their choice of anticoagulant for stroke prevention in AF. This sub study has led to an international collaboration of 12 centres worldwide looking at the same question. CROMIS-2 has the most patients within this collaboration. CROMIS-2 (ICH) also contributed to another international collaboration looking at outcomes for different treatment regimens in ICH. Again this was published in a high quality peer reviewed journal and helps clinicians in their management of ICH (Annals of Neurology. Ann Neurol. 2015 Jul;78(1):54-62. doi: 10.1002/ana.24416. Epub 2015 May 14. PMID: 25857223).

Published papers:
1) Volume and functional outcome of intracerebral hemorrhage according to oral anticoagulant type. Wilson D, Charidimou A, Shakeshaft C, Ambler G, White M, Cohen H, Yousry T, Al-Shahi Salman R, Lip GY, Brown MM, Jäger HR, Werring DJ; CROMIS-2 collaborators. Neurology. 2016 Jan 26;86(4):360-6. doi: 10.1212/WNL.0000000000002310. Epub 2015 Dec 30. PMID: 26718576.

2) Reversal strategies for vitamin K antagonists in acute intracerebral hemorrhage. Parry-Jones AR, Di Napoli M, Goldstein JN, Schreuder FH, Tetri S, Tatlisumak T, Yan B, van Nieuwenhuizen KM, Dequatre-Ponchelle N, Lee-Archer M, Horstmann S, Wilson D, Pomero F, Masotti L, Lerpiniere C, Godoy DA, Cohen AS, Houben R, Al-Shahi Salman R, Pennati P, Fenoglio L, Werring D, Veltkamp R, Wood E, Dewey HM, Cordonnier C, Klijn CJ, Meligeni F, Davis SM, Huhtakangas J, Staals J, Rosand J, Meretoja A. Ann Neurol. 2015 Jul;78(1):54-62. doi: 10.1002/ana.24416. Epub 2015 May 14. PMID: 25857223.

In addition, UCL are investigating genotypes of haptoglobin in ICH among CROMIS-2 patients and risk prediction scores in ICH patients from CROMIS-2. UCL hope to publish these papers in 2016.

The HES data is vital for the primary outcome of the study, which is reoccurring stroke/TIA; bleeding events; cardiac events and death. Without this information UCL cannot be sure if they may have missed many primary outcomes which are absolutely essential to the reliability of the data and its outputs.

Processing:

UCL originally supplied the following identifiable details of the cohort to NHS Digital to flag on its IT system:
Member number, NHS number, surname, forename, date of birth, sex, postcode, address, date of address.

NHS Digital provides quarterly updates on participants’ deaths or changes in NHS registration status. The cohort identifiers NHS Digital holds will be used to link to HES data and the linked data will be supplied back to UCL.

The supplied data files will be downloaded by the Study Co-ordinator only. This will allow the patient database (accessed only by the Study Co-ordinator) to be updated with any primary and secondary outcomes that have occurred whilst the patient is being followed up and that UCL have not already been informed about via the other follow up methods (GP and patient questionnaires). The identifiable data will be removed from the clinical data. All outputs will be aggregated with small numbers suppressed in line with the HES analysis guidance.

The data will only be stored on a secure database at UCL at one location only.

Standard ONS Terms and conditions will be adhered to in regards to the data being processed.

The record level data will be accessed only by the study coordinator at UCL, who is a substantive employee of UCL.

MR104a - Regional Heart Study (Female Cohort) — DARS-NIC-148101-R7RSL

Opt outs honoured: Y (Excuses: Consent (Reasonable Expectation))

Legal basis: Section 251 approval is in place for the flow of identifiable data, Health and Social Care Act 2012 s261(7); Informed Patient consent to permit the receipt, processing and release of data by NHS Digital, Health and Social Care Act 2012 s261(7), Health and Social Care Act 2012 s261(2)(c)

Purposes: No (Academic)

Sensitive: Sensitive, and Non-Sensitive

When:DSA runs 2018-06 – 2020-03 2016.09 — 2017.02. SMLS reported a DPA serious incident; breached contract — audit report.

Access method: Ongoing, One-Off

Data-controller type: UNIVERSITY COLLEGE LONDON (UCL)

Sublicensing allowed: No

AGD/predecessor discussions: AGD minutes - 8 June 2023 final.pdf, IGARD Minutes - 20th May 2021 final.pdf, igard_minutes_1_march_2018.pdf, igard_minutes_9_november_20171.pdf

Datasets:

MRIS - Cause of Death Report
MRIS - Cohort Event Notification Report
MRIS - Scottish NHS / Registration
MRIS - Flagging Current Status Report
MRIS - List Cleaning Report
MRIS - Members and Postings Report
Cancer Registration Data
Civil Registrations of Death
Demographics

Type of data: Identifiable, Anonymised - ICO Code Compliant

Objectives:

The data supplied by the NHS IC to London School of Hygiene & Tropical Medicine will be used only for the approved Medical Research Project MR104A.

Yielded Benefits:

The BWHHS has 218 publications within peer reviewed journals to date addressing the aims outlined previously in this application. Many of the journals for BWHHS manuscripts are considered high ranking journals. This ranking is measured by the impact factor, which reflects the frequency with which the average article in a journal has been cited within the year. 31% of the journals publishing BWHHS manuscripts have a high impact factor (>8). This means that the output is published in journals that are widely read by doctors, scientists and public health practitioners, ensuring a greater impact of the work. The BWHHS has also had measurable benefit directly, as fourteen of the 218 publications have contributed to the following fifteen clinical care and public health guidelines:
1. National Clinical Guideline Centre NICE clinical guideline CG181 Lipid modification (2014)
2. Diabetes, Pre-Diabetes and Cardiovascular Diseases developed with the EASD ESC Clinical Practice Guidelines (2013)
3. Dyslipidaemias 2016 (Management of) ESC Clinical Practice Guidelines (2011)
4. Dyslipidaemias 2016 (Management of) ESC Clinical Practice Guidelines (2016)
5. Arterial Hypertension (Management of) ESC Clinical Practice Guidelines (2013)
6. CVD Prevention in Clinical Practice (European Guidelines on) (2016)
7. Factors Influencing the Decline in Stroke Mortality: A Statement from the American Heart Association/American Stroke Association (2013)
8. Genetics and Genomics for the Prevention and Treatment of Cardiovascular Disease: Update A Scientific Statement From the American Heart Association (2013)
9. Guidelines for the Prevention of Stroke in Patients With Stroke and Transient Ischemic Attack: A Guideline for Healthcare Professionals From the American Heart Association/American Stroke Association (2014)
10. Update on Prevention of Cardiovascular Disease in Adults With Type 2 Diabetes Mellitus in Light of Recent Evidence: A Scientific Statement From the American Heart Association and the American Diabetes Association
11. Social Determinants of Risk and Outcomes for Cardiovascular Disease: A Scientific Statement From the American Heart Association
12. Future Translational Applications From the Contemporary Genomics Era: A Scientific Statement From the American Heart Association
13. Basic Concepts and Potential Applications of Genetics and Genomics for Cardiovascular and Stroke Clinicians: A Scientific Statement From the American Heart Association
14. Salt Sensitivity of Blood Pressure: A Scientific Statement From the American Heart Association
15. Preventing and Experiencing Ischemic Heart Disease as a Woman: State of the Science. A Scientific Statement from the American Heart Association

Expected Benefits:

To be completed by the customer on re-submission

Outputs:

To be completed by the customer on re-submission

Processing:

To be completed by the customer on re-submission

Project 98 — DARS-NIC-161339-RC2NB

Opt outs honoured: Y

Legal basis: Section 42(4) of the Statistics and Registration Service Act (2007) as amended by section 287 of the Health and Social Care Act (2012), Other-Health and Social Care Act 2012 - section 261(1)

Purposes: ()

Sensitive: Sensitive, and Non Sensitive

When:2016.12 — 2017.02. SMLS reported a DPA serious incident; breached contract — audit report.

Access method: One-Off

Data-controller type:

Sublicensing allowed:

Datasets:

Office for National Statistics Mortality Data (linkable to HES)
Hospital Episode Statistics Critical Care
Hospital Episode Statistics Admitted Patient Care

Type of data:

Objectives:

This request forms part of an NIHR HS&DR funded evaluation that aims to study reconfiguration of acute stroke services in London and Greater Manchester. In doing so, it aims to identify lessons that will guide future reconfiguration work in stroke and other services.

BACKGROUND
Considerable changes in the provision of clinical care within the English National Health Service (NHS) have been discussed in recent years, with proposals to concentrate specialist services, such as major trauma, cardiac surgery, and specialist paediatrics, in fewer centres serving larger populations. The case for reconfiguring acute stroke services was strong, with clear evidence of unacceptable variations in quality of care and clinical outcomes, with many patients not receiving timely, evidence-based care, which may in turn influence patient outcomes.

Major system change in acute stroke care was prompted by the National Stroke Strategy (2007), which noted the importance of stroke services offering rapid access to evidence-based care, and the potential benefits of reorganising acute stroke services. London and Greater Manchester led the way in this process, radically re-organising their stroke services in 2010.

Before reconfiguration, in both areas, patients with suspected stroke were taken to the nearest hospital with an Accident and Emergency (A&E) service, then admitted to a specialist stroke unit or general medical ward.

LONDON
After reconfiguration in 2010, the following system was implemented, with 8 hyper-acute stroke units (HASUs) set up to provide rapid access to brain imaging, assessment by stroke specialists, and interventions including thrombolysis, 24 hours per day, seven days per week (24/7); 24 stroke units established to provide acute rehabilitation services; in addition, 5 NHS Trusts had all stroke services withdrawn. In London, all patients with suspected stroke are eligible for treatment in a HASU; once stable, they are transferred to a stroke unit, nursing home or their own home.

GREATER MANCHESTER ‘A’
After reconfiguration in 2010, the following system was implemented: 3 HASUs (one operating 24/7, the other two operating from 7am to 7pm, Monday to Friday); and 10 NHS Trusts providing district stroke centre (DSC) services. Patients with suspected stroke arriving at hospital within four hours of developing symptoms were eligible for treatment in a HASU; once stable, they were transferred to a DSC, nursing home or their own home. Patients with suspected stroke presenting outside the four-hour ‘window’ were taken to the nearest DSC, similar to the care pathway before reconfiguration.

GREATER MANCHESTER ‘B’
In March 2015, a revised service model was implemented in Greater Manchester: all suspected stroke patients are now eligible for HASU treatment; the in-hours HASUs have been extended to cover 7am-11pm, 7 days per week; and DSCs are no longer designated to treat patients with suspected acute stroke. With these changes, the Greater Manchester acute stroke system is now similar to the care pathway in London.

AIMS
This request is in support of an analysis of the impact of major system change in acute stroke services on stroke patient mortality (at 3, 30, and 90 days after admission), patient length of stay and cost-effectiveness of services, focusing on centralisation of services in London and Greater Manchester, and using the rest of England as a control.
The Stroke re-configuration team has already published an analysis of the impact of the changes implemented in London and Manchester in 2010. In this paper it was established that while both London and Greater Manchester centralisations were associated with significantly greater reductions in length of hospital stay than the rest of England, only the London centralisation was associated with significantly greater reductions in patient mortality than the rest of England (Morris et al, BMJ 2014).
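
For illustration, mortality flags at 3, 30 and 90 days after admission could be derived from linked admission and death dates along these lines. This is a sketch under assumptions: the function and field names are hypothetical, and treating a death on exactly day N as falling within the N-day window is a choice made here, not something stated in the application.

```python
from datetime import date


def mortality_flags(admission, death, windows=(3, 30, 90)):
    """Return {window_in_days: True/False} for death within each window.

    `death` is None when no death record is linked to the patient.
    """
    if death is None:
        return {w: False for w in windows}
    days_to_death = (death - admission).days
    return {w: 0 <= days_to_death <= w for w in windows}


# Hypothetical patient admitted on 1 March 2016 who died 10 days later.
flags = mortality_flags(date(2016, 3, 1), date(2016, 3, 11))
```

The flags would then be aggregated by region and period before any comparison between London, Greater Manchester and the rest of England.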

The research team now wish to conduct a follow up analysis, studying

A) the sustainability of the impact of the London centralisation, and

B) the impact of further centralisation in Greater Manchester.

Since this is a follow-up analysis the team is requesting exactly the same variables requested previously, but now through to 2016.

Expected Benefits:

The continuation of research with HES/ONS data will provide vital evidence regarding organisation and provision of acute stroke services, building on Morris et al [2014]. In particular, it will provide further evidence on the impact of centralised acute stroke services on stroke patient mortality and length of stay and the sustainability of such effects over time. Importantly, it will also provide evidence on the cost-effectiveness of such changes.

At present, many parts of the English NHS are exploring ways to reorganise care, including acute stroke services. Current developments, such as Sustainability and Transformation Plans, are anticipated to have a strong focus on major system change and centralisation. The Stroke Research Team at UCL anticipate that, by providing evidence on two issues that are central to such decision-making (i.e. impact on outcomes and cost-effectiveness), this research will have a significant and ongoing influence on the organisation and provision of stroke care and other services across the English NHS, with significant increases in the likelihood of patients receiving evidence-based care and improvements in outcomes.

IMPACT OF PREVIOUS PUBLICATIONS BASED ON THIS DATA SHARE

Key findings have already been published based on the HES/ONS data shared to date (Morris et al, British Medical Journal, 2014 - uploaded with this submission), with significant impacts on national policy and local stroke service organisation.

1. NATIONAL: findings were referred to in the Five Year Forward View (NHS England, 2014) as evidence of the benefits of greater concentration of care in terms of patient outcomes [http://www.england.nhs.uk/wp-content/uploads/2014/10/5yfv-web.pdf].

2. REGIONAL: findings were cited as evidence in support of the decision to further centralise stroke care services across Greater Manchester (a region covering approximately 3 million people) [see http://www.hsj.co.uk/5083372.article for summary].

Since the further centralisation, the Greater Manchester Operational Delivery Network reports that 84% of stroke patients are now transported to a HASU; this is an increase from 39% in 2010-12 [Ramsay et al, 2015]. Given evidence suggesting that HASUs are more likely to provide evidence-based care, this change should lead to increased likelihood of stroke patients in Greater Manchester receiving evidence-based care, and improved clinical outcomes.

Outputs:

The Stroke Research Team at UCL have employed an active dissemination strategy throughout this study. In line with the strategy, this data share will contribute to a range of academic and other outputs, including the final report to the funder (NIHR Health Services & Delivery Research); high impact, open access academic papers; accessible 1-page summaries of the findings; presentations to academic conferences; and presentations and workshops for other key stakeholders (including stroke patients and their carers, stroke clinicians, hospitals and commissioning organisations, national policy makers, and the wider public) at national and regional levels. The Stroke Research Team at UCL also make the findings available through the study webpage and promote outputs on social media.

All outputs will contain only aggregate level data with small numbers suppressed in line with the HES analysis guide. Further, the Stroke Research Team at UCL will not explicitly identify the provider organisations providing services in different localities, but will instead report the impacts of centralisation at regional level (e.g. ‘London’, ‘Greater Manchester’).

FINAL REPORT
The final report will be submitted to the funder (NIHR Health Services and Delivery Research programme) in July 2017. This will cover all findings of the study, covering factors influencing planning, implementation, impact, and sustainability of major system change in acute stroke services. Once finalised, this will be published in the open access, peer-reviewed NIHR journal, Health Services and Delivery Research, with an estimated publication date of January 2018.

ACADEMIC PAPERS
All academic papers are published open access in high impact, peer-reviewed academic journals. Below is a summary of the papers the Stroke Research Team at UCL anticipate publishing over the coming months.

PUBLISHED PAPER: IMPACT OF CENTRALISATION ON PATIENT MORTALITY AND LENGTH OF STAY
Morris et al. Impact of centralising acute stroke services in English metropolitan areas on mortality and length of hospital stay: difference-in-differences analysis. BMJ 2014.
This paper presented a controlled difference-in-differences analysis of HES/ONS data (2008-2012) to ascertain the effect of the centralisations in London and Greater Manchester on length of hospital stay and mortality at 3, 30, and 90 days for stroke patients. The key findings were:
- length of hospital stay reduced significantly more in London and Greater Manchester than in the rest of England;
- mortality in London reduced significantly more than that in the rest of England, while a similar reduction was not observed in Greater Manchester.
These findings suggest that fully centralised models are associated with better outcomes for stroke patients. Data on the significantly greater effect of the London changes on mortality and length of stay were reported in the final decision to further centralise stroke services in Greater Manchester, and presented as evidence of the potential benefits of centralising specialised healthcare services in the Five Year Forward View.
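
The basic difference-in-differences contrast behind this kind of analysis can be sketched as below. This is a deliberately simplified illustration with invented numbers: it compares group means only, whereas the published analysis is likely to have used a more elaborate model, and none of the figures here come from the study.

```python
def mean(values):
    return sum(values) / len(values)


def did_estimate(treated_pre, treated_post, control_pre, control_post):
    """Change in the treated region minus change in the control region."""
    treated_change = mean(treated_post) - mean(treated_pre)
    control_change = mean(control_post) - mean(control_pre)
    return treated_change - control_change


# Hypothetical lengths of stay in days, before and after reconfiguration.
region_pre, region_post = [10, 12], [7, 9]         # centralised region
rest_pre, rest_post = [11, 13], [10, 12]           # rest of England (control)
effect = did_estimate(region_pre, region_post, rest_pre, rest_post)
```

A negative estimate means the outcome fell by more in the centralised region than in the control, which is the pattern reported for length of stay. The estimate is only interpretable as an effect of centralisation if the two groups would otherwise have followed parallel trends.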

FUTURE PAPERS

1. COSTS AND COST-EFFECTIVENESS OF CENTRALISATION
Hunter et al have analysed costs and cost-effectiveness of the changes in London and Greater Manchester in 2010, using the current HES/ONS data share. The analysis has been written and submission is anticipated in December 2016.

2. FURTHER ANALYSIS OF IMPACT ON PATIENT MORTALITY AND LENGTH OF STAY
Morris et al will lead a further analysis of how centralisations in London and Greater Manchester influence patient mortality and length of stay, using the rest of England as a control. In this case, the focus will be the sustainability of the changes in London, and the impact of further centralisation in Greater Manchester in 2015. Analysis should be complete for submission in June 2017, and the article will be submitted to the BMJ. It is anticipated that this article will generate similar interest and impact to that of Morris et al (2014), discussed above.

3. FURTHER ANALYSIS OF COSTS AND COST-EFFECTIVENESS OF CENTRALISATION
Hunter et al will run a follow-up analysis of costs and cost-effectiveness of services, in terms of sustainability of the London centralisation and further changes implemented in Greater Manchester in 2015, using the rest of England as a control. Again, it is anticipated that this analysis will be completed for submission in June 2017.

ACCESSIBLE SUMMARIES
For all the academic papers published, the Stroke Research Team at UCL produce an accessible 1-page summary of the findings. These summaries present a clear outline of 1) what the Team knew; 2) what the Team found; and 3) what the findings mean.

The Stroke Research Team at UCL share the summaries with stakeholders (>200 members) and encourage them to share with their wider networks.
All summaries are uploaded to the study webpage, from which they may be freely downloaded.

CONFERENCE PRESENTATIONS
The research team has presented findings from this study at a wide range of national and international conferences. The aim is to present findings from the proposed share at the following academic conferences:
1. Health Services Research UK Symposium (July 2017)
2. UK Stroke Forum (November 2017)
3. Health Economists’ Study Group (HESG) conference (June 2017)

PRESENTATIONS AND WORKSHOPS FOR KEY STAKEHOLDERS
For each paper published, a short presentation is developed to summarise the findings for a range of stakeholders, including A) healthcare professionals and B) stroke patients and their carers. To share the lessons from the analyses described above, it is planned to develop similar presentations, which the Stroke Research Team at UCL aim to share at the following meetings:
1. UK Stroke Assembly 2017 (national conference for patients and carers, organised by the Stroke Association)
2. London Clinical Network stroke leaders’ meeting
3. Greater Manchester Stroke Network meeting
4. Kings’ Stroke Research Patients and Family Group
5. A number of local stroke patient and carer groups in Greater Manchester and London.

ENGAGING WITH THE STEERING COMMITTEE
The research team includes nationally and internationally respected clinical leaders in stroke. Effective sharing of lessons from this research is a key priority for the team. The Stroke Research Team engage actively at national and local levels, to ensure that findings are communicated accessibly to a wide range of stakeholders, including stroke patients, their carers and the wider public, stroke clinicians, hospitals and commissioning organisations, and national policy makers.

The Study Steering Committee (SSC) includes people in leadership roles in NHS England Strategic Clinical Networks and Clinical Commissioning Groups covering London, Greater Manchester, and the Midlands and East of England; it also features representatives of charities, the Stroke Association and Different Strokes, and a number of service user representatives. The team present developing findings to the SSC regularly, and explore how best to ensure these findings are communicated effectively and meaningfully to key stakeholders. SSC members have been highly supportive in sharing findings across local networks.

Initial findings from the new analyses will be presented to our SSC in March 2017, where the team will discuss methods, interpretation, and how best to disseminate these findings.

WEBSITE AND SOCIAL MEDIA
The study website provides links to our open access papers and offers free downloads of accessible summaries of findings. All publications and conference presentations are promoted on Twitter, via the UCL Department of Applied Health Research account (>700 followers) and the NIHR Collaboration for Leadership in Applied Health Research and Care West North Thames account (>1000 followers).

Processing:

The aim is to use the data requested to update analysis of the impact of centralising acute stroke services in Greater Manchester and London on stroke patient mortality, length of hospital stay, and cost-effectiveness of services.

CLINICAL OUTCOMES - LENGTH OF HOSPITAL STAY AND PATIENT MORTALITY
The analysis will use a between-region difference-in-differences regression analysis: this will allow comparison of clinical outcomes in each studied region (Greater Manchester and London), before and after centralisation, using the rest of England as a control. This approach was used in a previously-published paper (Morris et al, BMJ 2014), which analysed the impact of the changes implemented in London and Greater Manchester in 2010 and established significantly different outcomes of the two centralisations.

This approach will now be used to study
A) the impact of further centralisation in Greater Manchester in March 2015, and
B) the sustainability of the impact of the London centralisation.

This requires patient-level data for the whole of England over several years as this allows control for differences in patient characteristics between areas and over time. This analysis will have two stages:

Stage 1: To calculate expected risks of death at 3, 30 and 90 days after admission, and length of hospital stay, using patient level regressions, controlling for gender and age interactions (age measured in five year bands), stroke diagnosis using the first four digits of the primary ICD-10 diagnostic code (19 categories), Charlson index derived from secondary ICD-10 diagnostic codes, presence of 16 comorbidities included in the Charlson index, ethnic group, and deprivation quintile and urban/rural classification of the Lower Layer Super Output Area in which the patient lived. The regression coefficients will be used to predict the probability of mortality and the length of hospital stay for every patient.

Stage 2: These expected values will be aggregated to create a dataset of the actual percentage of patients who died and the expected percentage, and also the actual and expected length of hospital stay, by admitting hospital and quarter. This will test whether the reconfigurations have an impact on mortality and length of hospital stay using least squares regression of the actual minus expected mortality percentage and actual minus expected length of hospital stay against interaction terms between Greater Manchester and the post-reconfiguration period and London and the post-reconfiguration period.
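As an illustration only, Stage 2 can be sketched in Python as below. This is a minimal sketch, not the research team's code: it assumes each patient row already carries an `expected_risk` from Stage 1, the field names are hypothetical, and the region and post-period main effects in the design row are an assumption added so that the interaction coefficients read as difference-in-differences estimates (the text above names only the interaction terms).

```python
from collections import defaultdict


def aggregate_by_hospital_quarter(patients):
    """Collapse patient rows to actual minus expected mortality (%) per hospital and quarter."""
    cells = defaultdict(lambda: [0, 0.0, 0])  # deaths, summed expected risk, patient count
    for p in patients:
        cell = cells[(p["hospital"], p["quarter"])]
        cell[0] += p["died"]
        cell[1] += p["expected_risk"]
        cell[2] += 1
    return {
        key: 100.0 * (deaths - expected) / n
        for key, (deaths, expected, n) in cells.items()
    }


def did_design_row(region, post):
    """Intercept, region and post dummies, then the two region-by-post interactions.

    Any region other than "GM" or "London" is treated as the rest-of-England control.
    """
    gm = float(region == "GM")
    london = float(region == "London")
    post = float(post)
    return [1.0, gm, london, post, gm * post, london * post]


def ols(X, y):
    """Least squares via the normal equations (X'X) b = X'y, solved by Gauss-Jordan elimination.

    Assumes X has full column rank; a collinear design would divide by zero.
    """
    k = len(X[0])
    A = [
        [sum(row[i] * row[j] for row in X) for j in range(k)]
        + [sum(row[i] * yi for row, yi in zip(X, y))]
        for i in range(k)
    ]
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(k):
            if r != col:
                factor = A[r][col] / A[col][col]
                A[r] = [a - factor * b for a, b in zip(A[r], A[col])]
    return [A[i][k] / A[i][i] for i in range(k)]
```

In this sketch the last two coefficients returned by `ols` are the Greater Manchester and London difference-in-differences estimates relative to the rest of England. A real analysis would also need standard errors and any hospital or time effects used in the published model.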

This exactly replicates the approach used in previous analysis.

To analyse the impact of the further changes in Greater Manchester implemented in 2015, the approach described above will be replicated, comparing outcomes before and after the latest reorganisation was implemented. To analyse the sustainability of the London changes, separate effects of the centralisation over time will be estimated by including terms for the interaction between the intervention group and annual post-implementation periods.
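The annual interaction terms used for the sustainability analysis can be pictured as one indicator per post-implementation year, switched on only for the intervention group. The following Python sketch is illustrative only; the function name, its parameters and the use of calendar years are assumptions, not details taken from the application.

```python
def annual_interaction_terms(in_intervention_group, year, implementation_year, n_years):
    """Return one dummy per post-implementation year for a single observation.

    Position 0 is the implementation year itself, position 1 the year after, and so on.
    Control-group and pre-implementation observations get all zeros.
    """
    terms = [0.0] * n_years
    years_since = year - implementation_year
    if in_intervention_group and 0 <= years_since < n_years:
        terms[years_since] = 1.0
    return terms
```

Each position would receive its own regression coefficient, so an effect that fades or grows over time shows up as coefficients that differ by year.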

The requested data covers the time period 2003/04 – 2015/16 so that a comparison of performance before and after changes took place can be conducted (permitting a year of post-centralisation data for Greater Manchester B, and over 5 years of post-implementation data for London).

The research team has requested data covering the whole of England, including areas outside London and Greater Manchester. Using the rest of England as a control allows the team to replicate its previous analysis of HES/ONS data, published in the BMJ (Morris et al, 2014), and build on the influential lessons presented in that article. In addition, it means that any changes observed in Greater Manchester and London can be understood within the context of changes occurring at the national level and help ensure that any changes observed are attributed appropriately. To enable the research team to carry out the analysis objectively it is essential that data for the whole nation is used. This will enable the team to publish results based on several different variables rather than focus on a selection of geographical areas. To pick out certain areas would not be representative of the nation and therefore not comparable. Further, if a reduced number of areas were selected rather than using national data, the team would be unable to guarantee that the data selected are truly representative. This would A) weaken the analysis, and B) reduce the credibility of the findings (e.g. by prompting suggestions of ‘cherry-picking’).

As noted, patient-level data has been requested in order to be able to control for patient level factors affecting outcomes.

COST-EFFECTIVENESS

The analysis will combine the outputs from the analyses of length of stay and patient mortality described above with procedure data to update a cost-effectiveness model built as part of an evaluation of the London Stroke Strategy. The analysis will update research previously undertaken to evaluate the cost-effectiveness of the changes in London and Greater Manchester, which will be submitted to a journal soon. The update will account for the sustainability of the changes in London and the introduction of the Manchester B model, as described above.

STORAGE/SECURITY

The HES/ONS data will be stored on UCL's Data Safe Haven (https://www.ucl.ac.uk/isd/itforslms/services/handling-sens-data/tech-soln). It is built using a walled garden approach, where the data are stored, processed and managed within the security of the system, avoiding the complexity of assured end point encryption. A file transfer mechanism enables information to be transferred into the walled garden simply and securely.

The data will be accessed and analysed on secure, password-protected computers at two locations within UCL: 1-19 Torrington Place, and the UCL Medical School, based at the Royal Free Campus. This reflects the locations of the two health economists who will work on the econometric analysis and cost effectiveness analysis respectively. Both sites are compliant with all statements regarding data security within our application. All of the 4 approved users of the data are substantively employed by University College London and the data available will only be used for the purposes of this project.

MR1291: Clinical Cohorts in Coronary Disease Collaboration (4C) — DARS-NIC-152228-DL5MK

Opt outs honoured: Y, N (Excuses: Consent (Reasonable Expectation))

Legal basis: Informed Patient consent to permit the receipt, processing and release of data by the HSCIC, Health and Social Care Act 2012 s261(2)(c)

Purposes: No (Academic)

Sensitive: Sensitive, and Non Sensitive

When:DSA runs 2017-03 – 2020-02 2016.04 — 2017.02. SMLS reported a DPA serious incident; breached contract — audit report.

Access method: Ongoing, One-Off

Data-controller type: UNIVERSITY COLLEGE LONDON (UCL)

Sublicensing allowed: No

Datasets:

MRIS - Cause of Death Report
MRIS - Cohort Event Notification Report
MRIS - Flagging Current Status Report
MRIS - Members and Postings Report

Type of data: Identifiable

Objectives:

Clinical Cohorts in Coronary disease Collaboration (4C)

Aim:
To evaluate important new opportunities for improving quality of care and outcomes for patients with angina and acute coronary syndromes, integrated across the patient journey, and at different levels of care.

Objectives:
(i) To determine the cumulative impact on patient outcome of missed opportunities for improving patient outcome, from the beginning to the end of the patient journey, across five of the most common symptomatic coronary presentations, assessing inequalities in care and outcome.
(ii) To determine at the level of the individual hospital the extent to which the organisation and processes of care have an impact on the patient journey.
(iii) To establish the effectiveness and cost-effectiveness of a multi-faceted intervention targeting initial specialist management at hospital chest pain clinics of patients early in the symptomatic phase of the patient journey.
(iv) To determine whether novel biomarkers are a cost-effective addition to existing clinical information in predicting the progression of chronic stable angina to acute fatal and non-fatal events.

Yielded Benefits:

Few benefits have been realised from previous outputs, because outputs were produced less frequently than expected and because HES data linked to MRIS report data were excluded from outputs, HES data having been received for only a sub-cohort of participants. Future benefits are expected as follows:

1) As a resource
Despite their value in translational research, few large prospective clinical cohorts have been established in stable coronary artery disease. In the report to the National Institute of Health Research (Hemingway H et al. Programme Grants Appl Res 2017 (Feb);5(4)) the researcher highlighted a need for embedding genetic information in hospital EHR at scale in order to carry out a wide range of translational research, from discovery efforts, through drug repurposing and embedded pharmacogenetics for patient safety. 4C was established as a resource for research into cardiovascular disease and will fill this important gap.

2) Translational benefits
Data are contributing to the GENIUS Consortium (http://www.genius-chd.com/), an international collaboration of investigators seeking to collectively better understand the genetic and non-genetic drivers of subsequent or recurrent events in those who have established CHD. Clinical record data are highly effective in distinguishing risk groups, for diverse diseases and in diverse settings, and higher risk patients usually have more absolute benefit than those in lower risk groups (i.e. without biologic interaction). Clinical risk prediction algorithms and decision support are rapidly proliferating in CVD and many tools can be envisaged in the management of a single patient, spanning benefits and harms at different time points. Clinical data can outperform the Framingham risk score, can flexibly model start point populations and endpoints, and can be easily updated in the light of new imaging and genetic information and implemented in clinical practice.
Predictions may be improved by incorporating clinical trajectories. For example patients in whom blood pressure declines over time, without diagnosed heart failure, have a worse survival than those whose blood pressure remains stable.

Expected Benefits:

4C forms part of GENIUS-CHD (http://www.genius-chd.com/), an international consortium of 63 clinical cohorts containing coronary heart disease patients and DNA to associate genetic variants with subsequent events. The consortium was established in 2014 and contains data from cohorts from 17 countries, including over 270,000 patients - the largest effort of its kind studying determinants of risk for subsequent CHD events. 4C's contribution has already seen benefits being realised in the development of the consortium, as detailed above. Further benefits will be gained through 4C's role in GENIUS utilising further outputs from the study which will incorporate the use of HES & MRIS report data to better understand patient pathways.

Collaborative research based on individual patient data should be encouraged to strengthen prognostic model development and external validation of models. This is timely, with new standards to improve the quality of prognosis research and new opportunities to link clinical health-care data with other sources of electronic information and bespoke data at scale. To provide a new resource for prognosis research, we established 4C, a novel, contemporary clinical cohort which includes a DNA and biomarker resource linked to hospital EHR, questionnaire data and health outcomes. A combination of -omic, clinical and phenotypic data, 4C will support research on the causes of stable coronary disease and its main complication, acute coronary syndrome, and the complex interactions between genetic and environmental determinants of post-acute myocardial infarction outcomes.

Use of MRIS report (ONS) data provided as part of this application will support investigation of long-term outcomes for patients with stable coronary disease. Further use of this data will benefit the study by working to achieve the aim of follow-up on patient journey over a 10 year period (2014-2024). Use of HES & MRIS data within outputs made available to researchers is expected to broaden the scope of impact. Research findings will be published in a high impact, open access scientific journal. Study findings can be used to inform policy around the management of patients with stable coronary disease.

Outputs:

Direct outputs from the study were produced in aggregated form, with small numbers suppressed, and published in an open access journal available to all, in line with UCL's open access publication policy. Access to this data is via secure token permissions for approved researchers. To date, outputs produced in this form have utilised bespoke study data from the sources below:

Self-reported health questionnaire data, including information about the patient's chest pain and general health, and a one-year follow-up health questionnaire
A blood sample for research (circulating biomarkers and DNA)
Access to their GP and hospital medical record

Outputs to date have excluded HES data linked to MRIS report data because UCL holds hospital records for only a portion of the cohort.

Applicant researchers using these aggregate reports have a strong track record in publishing research in high impact journals. Published examples of the use of 4C data include the protocol, baseline characteristics of the cohort and patient genotyping data being used within projects aimed at improving care for patients suffering from angina and heart attacks (https://www.journalslibrary.nihr.ac.uk/programmes/pgfar/RP-PG-0407-10314/#/), and 4C data contributing to the GENIUS-CHD Consortium (Patel R et al. Circ Genom Precis Med. 2019 Mar 21 DOI: 10.1161/CIRCGEN.119.002470; DOI: 10.1161/CIRCGEN.119.002471). This research has been cited as a resource for CVD research (https://academic.oup.com/eurheartj/article/39/16/1481/4096831) and is registered on ClinicalTrials.gov (Identifier: NCT02402478).

Planned outputs:
The HES-mortality data which forms the subject of the planned application will lead to submission of at least two research papers for publication in leading peer-reviewed journals. It is currently envisaged that the first paper will describe 10-year follow-up of patients for CVD events and all-cause mortality. The second paper will focus on multi-omic (genomic, metabolomic and proteomic) associations with CVD events and mortality at 10-year follow up among this cohort. Any publication arising from analysis of study data will include only aggregated data summarised in tables and figures. We will submit research papers within 12-18 months of approval of our application for full renewal of the data sharing agreement. In accordance with UCL's policy on open access publishing, we will submit all publications to fully open access journals.

Processing:

Under this agreement, UCL previously received demographic and mortality (date and causes of death), registration & embarkation data for all patients recruited to 4C since date of enrolment to the study in order to ensure complete follow-up data, including outcomes and confounding variables, for all patients in the study.

Patients additionally consented to access and linkage to their past, present and future medical and other health-related records for research purposes.

Data was processed according to the following steps:
1. UCL sent patient NHS Numbers and demographic details (date of birth, sex and postcode) of the established cohort of participants to NHS Digital for tracing.

2. NHS Digital produced reports for mortality and registration data using participant data.

3. NHS Digital sent the resulting Medical Research reports to UCL to be received by the study data manager.

4. The study data manager removed patient identifiers and stored the data, pseudonymised at Study ID, in the UCL Data Safe Haven secure study database.

5. UCL IHI approved analysts processed the data using the statistical analysis package STATA.

6. Baseline characteristics of the cohort and DNA analyses were published in a funder final report using bespoke study data (see Outputs below). UCL intends to produce data reporting 10-year outcomes of patients using MRIS data requested under this agreement and HES data to be requested under an amendment, in conjunction with forthcoming proteomic analyses. Outputs will be produced in aggregated form with small numbers suppressed, published in an open access journal, and included in the news section of the UCL IHI website (https://www.ucl.ac.uk/health-informatics/), which is made available for UCL researchers.
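Step 4 above (removing patient identifiers and storing the data pseudonymised at Study ID) can be sketched as follows. This is an illustrative Python sketch only: the field names, the set of fields treated as identifiers and the lookup table from NHS number to Study ID are all assumptions, since the agreement does not specify them.

```python
# Hypothetical set of direct identifiers to strip before analysis.
IDENTIFIER_FIELDS = {"nhs_number", "date_of_birth", "postcode"}


def pseudonymise(report_rows, study_id_by_nhs_number):
    """Replace direct identifiers with the Study ID on each report row.

    Raises KeyError if a row's NHS number is not in the study lookup, so that
    unexpected records are not silently carried forward.
    """
    pseudonymised = []
    for row in report_rows:
        study_id = study_id_by_nhs_number[row["nhs_number"]]
        kept = {k: v for k, v in row.items() if k not in IDENTIFIER_FIELDS}
        pseudonymised.append({"study_id": study_id, **kept})
    return pseudonymised
```

Under this sketch the lookup table itself would remain with the study data manager, so analysts working on the output see only the Study ID.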

Medical Research (MRIS) data was downloaded from NHS Digital by the study data manager within UCL and stored in a secure database (project share) in the UCL Data Safe Haven (http://www.ucl.ac.uk/isd/itforslms/services/handling-sens-data/tech-soln) to which only approved study staff have access. The server is held on-site at UCL and access is restricted to named individuals according to UCL's security policy. The Data Safe Haven is governed by a strict information governance framework that is used throughout the UCL School of Life and Medical Sciences, of which the Senior Information Risk Owner (SIRO) is the Dean of the Faculty of Population Health Sciences, Professor Graham Hart.

DNA and biomarker samples, questionnaire and clinical data extracted from EHRs were linked to Medical Research report (formerly Office for National Statistics [ONS]) mortality and demographic data (DARS-NIC-152228-DL5MK) and Hospital Episode Statistics (HES) inpatient, outpatient and A&E data (NIC-152452-N5J8N - HES data were sent on 833 patients) using patients' NHS number, date of birth, sex and postcode.
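A linkage on NHS number, date of birth, sex and postcode could look like the sketch below. It is illustrative only and assumes exact (deterministic) matching on a normalised key; the agreement does not state whether exact or probabilistic matching was used, and the field names are hypothetical.

```python
def link_key(record):
    """Build a normalised match key from the four linkage fields."""
    return (
        record["nhs_number"].replace(" ", ""),
        record["date_of_birth"],
        record["sex"].strip().upper(),
        record["postcode"].replace(" ", "").upper(),
    )


def link(cohort, external_records):
    """Return (linked, unlinked): matched (study_id, external record) pairs and unmatched study IDs."""
    index = {link_key(r): r for r in external_records}
    linked, unlinked = [], []
    for person in cohort:
        match = index.get(link_key(person))
        if match is None:
            unlinked.append(person["study_id"])
        else:
            linked.append((person["study_id"], match))
    return linked, unlinked
```

Keeping the unlinked study IDs makes it straightforward to report how much of the cohort has external data, which matters here because HES data were received for only part of the cohort.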

Remote access to the project, shared by researchers analysing the data, is permitted but only via secure token for approved researchers (so processing is carried out on site), and with local printing and access to internet disabled. Only study staff who have passed NHS Digital information governance training (https://www.ucl.ac.uk/isd/it-for-slms/research-ig/information-governance-training-awareness-service) and who are authorised as a member of the study team were able to access NHS Digital data for this study. No record-level data will be exported outside the Data Safe Haven or shared with any third party organisation or user. No aggregate data with small numbers un-suppressed will be exported from the Data Safe Haven.
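Small-number suppression of an aggregate table before export can be sketched as below. This is a hedged illustration rather than the Data Safe Haven's actual disclosure control procedure: the threshold is deliberately left as a parameter because the applicable value comes from the guidance in force, and leaving zero counts unsuppressed is an assumption.

```python
def suppress_small_numbers(table, threshold, marker="*"):
    """Replace non-zero counts below `threshold` with a marker.

    `table` maps a cell label to a count. Zero counts are left as-is in this
    sketch; some disclosure policies treat zeros differently.
    """
    return {
        label: (marker if 0 < count < threshold else count)
        for label, count in table.items()
    }
```

For example, `suppress_small_numbers({"deaths": 3, "admissions": 120}, threshold=5)` masks the first cell and leaves the second unchanged. A full procedure would also need to guard against suppressed values being recovered from row or column totals.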

Study team members who will be responsible for processing study data will hold a substantive contract with UCL. Researchers responsible for analysing the data will only have access to de-identified data. The data requested will only be used for the purposes described in this application. The full HES-mortality data will be accessible only by the PI or researchers responsible for managing (downloading and extracting data) the resource, curating and cleaning the data.

Researchers cannot export results from their own analyses from the UCL Data Safe Haven. Rights to export will be restricted to only authorised staff within the study team. Aggregate data outputs for export can only be exported through the established disclosure control procedure. The UCL Data Safe Haven requires specific authorisation of staff who can export outputs. Staff authorised to export aggregate outputs will control outputs by scrutinising aggregate tables and figures to assess whether the outputs meet the following requirements:

the HES analysis guide
the ICO Anonymisation Code of Practice
the Anonymisation Standard for Publishing Health and Social Care Data Specification (ISB1523)
the ONS Disclosure control guidance for birth and death statistics
UCL SLMS Health Informatics requirements
Pseudonymisation ISO/TS 25237:2008 Overview for audit purposes

A joiners and leavers standard operating procedure will apply to identify training requirements for new staff and rescind access to the Data Safe Haven on completion of the study or when staff leave/staff appointment status changes.

Data analysis is carried out using the statistical analysis package STATA. Baseline data have been analysed and published (see 5c. below). HES-mortality data will be linked to bespoke study data (bloods, questionnaire and clinical data extracted from EHRs) by the study data manager at UCL IHI using the unique participant study identifier for patients enrolled in the study. Following approval of this application for a short-term extension we will submit a DARS amendment to request full renewal of the data sharing agreement with NHS Digital and request follow-up HES and ONS data for study patients from date of expiry of the data sharing agreement (2017) to date so we may carry out analyses on 10-year outcomes (CVD events and all-cause mortality) for study patients.

Project 100 — DARS-NIC-147953-SXCMS

Opt outs honoured: Y, N

Legal basis: Informed Patient consent to permit the receipt, processing and release of data by the HSCIC

Purposes: ()

Sensitive: Sensitive, and Non Sensitive

When:2016.04 — 2017.02. SMLS reported a DPA serious incident; breached contract — audit report.

Access method: Ongoing

Data-controller type:

Sublicensing allowed:

Datasets:

MRIS - Cause of Death Report
MRIS - Cohort Event Notification Report

Type of data:

Objectives:

It is proposed that the main focus of this service evaluation will be assessing the effectiveness of cardiopulmonary exercise testing to predict outcome (mortality & morbidity) following major elective surgery compared to existing risk assessment tools such as the Revised Cardiac Risk Index and the Duke Activity Status Index. In the context of preoperative assessment, CPET is used to provide information on the risks related to planned surgery (based on the fitness of the patient measured during CPET) and thereby guide perioperative care management. By conducting this service evaluation, we will be able to assess the risk stratification criteria currently used, which we apply to high and low risk patient groups in our specific sub-population of patients (both at the Whittington Hospital and University College London Hospital) and therefore improve/refine the prognostic ability of cardiopulmonary exercise testing to predict surgical outcome.

MR1090 - HiLo: Multicentre randomised trial of high dose versus low dose radioiodine — DARS-NIC-147936-1L3FD

Opt outs honoured: N (Excuses: Consent (Reasonable Expectation))

Legal basis: Informed Patient consent to permit the receipt, processing and release of data by the HSCIC, Health and Social Care Act 2012 s261(2)(c)

Purposes: No (Academic)

Sensitive: Sensitive

When:DSA runs 2022-01 – 2023-01 2017.06 — 2017.02. SMLS reported a DPA serious incident; breached contract — audit report.

Access method: Ongoing, One-Off

Data-controller type: UNIVERSITY COLLEGE LONDON (UCL)

Sublicensing allowed: No

AGD/predecessor discussions: igard-minutes-6th-february-2020---final.pdf, igard-minutes-13-september-2018.pdf, igard-minutes-30th-august-2018-final.pdf

Datasets:

MRIS - Cause of Death Report
MRIS - Cohort Event Notification Report
MRIS - Flagging Current Status Report
MRIS - Members and Postings Report

Type of data: Identifiable

Objectives:

The data supplied by the NHSIC to the Cancer Research UK and UCL Cancer Trials Centre will be used only for the approved Medical Research project MR1090.

Yielded Benefits:

Safety and efficacy data for the trial were published in 2012 (Mallick et al., Ablation with Low-Dose Radioiodine and Thyrotropin-alfa in Thyroid Cancer, New Engl J Med 2012; 366:1674-85). The results showed that there are similar rates of thyroid-remnant ablation, without evidence of residual disease after surgery, for pre-treatment with either 1.1-GBq or 3.7-GBq 131I, plus or minus thyrotropin-alfa. Thus lower dose radioiodine (1.1 GBq), with thyrotropin-alfa or thyroid hormone withdrawal, is as effective as high dose (3.7-GBq) treatment for patients with low and intermediate risk differentiated thyroid cancer. The reduction in irradiation had the benefit of a shorter period of isolation in hospital following treatment, and patients experienced fewer adverse events during and for up to 3 months after ablation, hence improving quality of life, and, incidentally, reducing healthcare and societal costs. This changed clinical practice and 1.1 GBq has now become standard of care for this population of patients, as recommended in the 2014 British Thyroid Association (BTA) guidelines. Long-term follow up results for the trial were published in 2017 (Dehbi et al., Recurrence after low-dose radioiodine ablation and recombinant human thyroid-stimulating hormone for differentiated thyroid cancer (HiLo): long-term results of an open-label, non-inferiority randomised controlled trial, Lancet Diabetes Endocrinol 2019; 7(1):44-5). These results showed that after a median follow up of 6.5 years 21 patients had confirmed recurrence (11 who had 1.1 GBq ablation and 10 who had 3.7 GBq ablation), and that recurrence was not affected by whether patients had rhTSH or hormone withdrawal prior to ablation. Cumulative recurrence rates at 3 years (1.5% vs 2.1%), 5 years (2.1% vs 2.7%) and 7 years (5.9% vs 7.3%) were similar between low-dose and high-dose radioactive iodine groups. This provides further evidence for the use of 1.1 GBq as standard of care for this population of patients.

Expected Benefits:

This Agreement permits the secure retention of the data only and no other processing.

In any future application, the applicant will be required to provide details of the expected benefits resulting from the study.

In terms of benefits to Health and/or Social Care it is expected that the evidence provided from updated long term follow up of this trial (including new malignancies and deaths) will further assure patients and clinicians of the benefits of using low dose (1.1 GBq) in this patient population.

Outputs:

This Agreement permits the secure retention of the data only and no other processing. No new outputs will be produced under this Data Sharing Agreement.

A further publication on long term follow up is planned, once median follow up is between 10 and 15 years. This will provide further updates on recurrences, and data on new primary cancers and deaths (some of which may be obtained from NHS Digital) will be included.

It is anticipated the numbers of new malignancies and deaths received from NHS Digital will continue to be low, which will provide evidence on the safety of this treatment for this population of patients.

It is anticipated these additional results will provide further evidence for the use of 1.1 GBq as standard of care for this population of patients. Guidelines (e.g. the BTA guidelines) could also be strengthened on the basis of the evidence.

Processing:

Under this Agreement, the data may be securely stored but not otherwise processed. No new data will be provided by NHS Digital under this Agreement.

The study data, including data provided by NHS Digital under previous versions of this Agreement, are currently held by UCL. Under this interim extension all devices containing data will be securely locked away in a locked cabinet at the UCL storage address specified in this Agreement.

The following provides background on the processing activities undertaken for the original study:

Identifiable data was shared with ONS to carry out the linkage between the study data and civil registration data. Participants' records were flagged with the Office for National Statistics (ONS). ONS notified the study team at UCL of participants' deaths (date and cause) and cancer events when they occurred. The flagging for long-term follow up service transferred from ONS to the HSCIC in 2008. Data was last supplied in June 2017.

New outputs are being requested on the data already flagged, to fulfil the objectives stated in section 5a above. Data were previously received every 3 months, but as the numbers received are low it is proposed bi-annual data outputs would be acceptable. In addition, there may be scope for data minimisation: patient trial number, patient initials, and date of birth would be adequate to ensure correct linkage of patients at CTC.

MR1396 - GALA-5: An Evaluation of the Tolerability and Feasibility of combining 5-Amino-Levulinic Acid (5-ALA) with Carmustine Wafers (Gliadel) in the Surgical Management of Primary Glioblastoma. — DARS-NIC-03422-Y7Y0Z

Opt outs honoured: N (Excuses: Consent (Reasonable Expectation))

Legal basis: Informed Patient consent to permit the receipt, processing and release of data by the HSCIC, Health and Social Care Act 2012 s261(7)

Purposes: No (Academic)

Sensitive: Sensitive

When:DSA runs 2018-04 – 2021-03 2016.09 — 2016.11. SMLS reported a DPA serious incident; breached contract — audit report.

Access method: Ongoing, One-Off

Data-controller type: UNIVERSITY COLLEGE LONDON (UCL)

Sublicensing allowed: No

AGD/predecessor discussions: igard-minutes-1-november-2018---final.pdf, igard-minutes-11th-october-2018.pdf, igard-minutes-4th-october-2018.pdf

Datasets:

MRIS - Flagging Current Status Report
MRIS - Cause of Death Report

Type of data: Identifiable

Objectives:

Date and cause of death are being requested to enable the assessment of patient survival in the GALA-5 trial. As some patients have become lost to follow up over time, obtaining this information for these patients will ensure that the dataset is as complete as possible. Mortality information is being requested only for patients that have become lost to follow up. The primary objective of the GALA-5 trial is to establish whether the combined use of 5-ALA and Carmustine wafers is safe and does not compromise a patient from receiving or completing standard chemoRT. The secondary objective of this study is to gather preliminary evidence as to whether the combined use of 5-ALA and Carmustine wafers at surgery has the potential to improve clinical outcome.

Yielded Benefits:

As per Clinical Trial Regulations, all trial-related documentation, including trial data, must be retained for a minimum of 5 years following the end of trial declaration. The end of the trial has not been declared yet. The datasets have been analysed; however, the manuscripts for publication are still being prepared, so no benefits have been yielded from any publications as none have been created yet.

Expected Benefits:

The aim of GALA-5 was to establish whether the combined use of 5-ALA and Carmustine wafers during surgery in patients diagnosed with Glioblastoma (GBM) is safe and does not compromise delivery of standard chemo radiotherapy. The results of the GALA-5 trial are also expected to guide the protocol for a new phase III trial that will further explore the use of 5-ALA and Carmustine wafers during surgery in patients diagnosed with Glioblastoma (GBM). This trial will also be run by the Cancer Research UK and UCL Cancer Trials Centre. The data provided will inform the clinical impact of combination therapy incorporating Carmustine wafers with Temozolomide by providing additional information regarding the cause of death in patients given this combination. The data will also inform reappraisal of NICE TA121 (https://www.nice.org.uk/guidance/ta121).
While the NICE TA121 guidelines comment on the use of carmustine wafers and temozolomide, they do not comment on the combined use of carmustine wafers and temozolomide, and are based on historical studies that antedate temozolomide and 5-ALA. By treating patients with 5-ALA, carmustine and temozolomide together, GALA-5 will therefore inform reappraisal of NICE TA121 (and hence patient treatment). Results will be disseminated to NICE by forwarding the final publication. No record level data will be disseminated to NICE.

Outputs:

The mortality data obtained from HSCIC will be used to calculate the overall survival of GALA-5 trial patients otherwise lost to follow-up. The GALA-5 trial dataset will be analysed as soon as data are received (i.e. early in 2016), and the trial results will be submitted for publication in a peer-reviewed journal.
The manuscript will most likely be submitted to the Journal of Neuro-Oncology. Once the manuscript has been accepted for publication, the results will also be entered on the clinicaltrials.gov database, the EU Clinical Trial Register, and potentially NICE (see below). No record level data will be disseminated. The publication will also be forwarded to Cancer Research UK, who will produce a lay summary that will be uploaded on to the CRUK website. End of trial reports for the GALA-5 trial, which summarised the main findings, have been submitted to the REC and MHRA. Submitting these reports within 12 months of the end of trial being declared was a requirement. No further report will be required by the MHRA.
Survival data of individual patients will not be published, no individuals will be identifiable from any publications, nor will record level data be shared with third parties.
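
The overall survival calculation described above can be illustrated with a minimal sketch. It assumes a Kaplan-Meier estimator, which the application does not name, and uses made-up follow-up times; the function name `kaplan_meier` and its inputs are hypothetical and are not taken from the GALA-5 analysis plan. Patients whose death is confirmed through mortality data count as events; all others are treated as censored at their last known follow-up.

```python
def kaplan_meier(durations, events):
    """Return [(time, survival probability)] at each distinct death time.

    durations: follow-up time per patient (e.g. days since surgery)
    events:    True if death was observed, False if the patient is censored
    Both inputs are illustrative assumptions, not trial data.
    """
    # Distinct times at which at least one death was observed
    death_times = sorted({t for t, e in zip(durations, events) if e})
    survival = 1.0
    curve = []
    for t in death_times:
        # Patients still under observation just before time t
        at_risk = sum(1 for d in durations if d >= t)
        # Deaths occurring exactly at time t
        deaths = sum(1 for d, e in zip(durations, events) if d == t and e)
        survival *= 1 - deaths / at_risk
        curve.append((t, survival))
    return curve


# Made-up example: five patients, two of them censored
example_curve = kaplan_meier(
    durations=[5, 8, 8, 12, 15],
    events=[True, True, False, True, False],
)
```

Only the aggregated curve would be reported, which is consistent with the statement that survival data of individual patients will not be published.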

Processing:

The Cancer Research UK and UCL Cancer Trials Centre will provide patient date of birth, NHS number and date of diagnosis of glioblastoma to the HSCIC in order to identify these patients. Data from the HSCIC will be integrated into the existing anonymised trial dataset in a MACRO database (internally maintained database) by the Trial Coordinator. This dataset is held at the Cancer Research UK and UCL Cancer Trials Centre at UCL. Data will be stored electronically on servers with restricted access. There will be no linkage of the received data to other datasets. No third parties have access to these data. This dataset will be analysed by the trial statistician at the Cancer Research UK and UCL Cancer Trials Centre using STATA (statistical programme software) to produce statistical outputs for publications.

Project 103 — DARS-NIC-147817-KPFRY

Opt outs honoured: Y, N

Legal basis: Section 251 approval is in place for the flow of identifiable data

Purposes: ()

Sensitive: Sensitive

When:2016.04 — 2016.11. SMLS reported a DPA serious incident; breached contract — audit report.

Access method: Ongoing

Data-controller type:

Sublicensing allowed:

Datasets:

MRIS - Scottish NHS / Registration
MRIS - Cause of Death Report
MRIS - Cohort Event Notification Report

Type of data:

Objectives:

To provide data required for an informed decision about the introduction of population screening for ovarian cancer. This involves establishing the impact of screening on ovarian cancer mortality, determining the best screening strategy and assessing the physical and psychological morbidity and health economic implications of screening.

Project 104 — DARS-NIC-294605-F1P7F

Opt outs honoured: N

Legal basis: Informed Patient consent to permit the receipt, processing and release of data by the HSCIC

Purposes: ()

Sensitive: Sensitive

When:2016.04 — 2016.08. SMLS reported a DPA serious incident; breached contract — audit report.

Access method: Ongoing

Data-controller type:

Sublicensing allowed:

Datasets:

MRIS - Scottish NHS / Registration

Type of data:

Objectives:

To develop an optimised screening procedure for ovarian cancer in women who are at high-risk due to a strong family history or genetic predisposition to cancer.

To determine the physical morbidity, resource implications, feasibility and acceptability of screening this high-risk population.

To establish a serum bank for the future assessment of novel tumour markers which may help in the prevention, diagnosis and/or treatment of ovarian cancer.

Project 105 — DARS-NIC-00656-V0Z4C

Opt outs honoured: Y

Legal basis: Health and Social Care Act 2012, Section 42(4) of the Statistics and Registration Service Act (2007) as amended by section 287 of the Health and Social Care Act (2012)

Purposes: ()

Sensitive: Non Sensitive, and Sensitive

When:2016.04 — 2016.08. SMLS reported a DPA serious incident; breached contract — audit report.

Access method: One-Off

Data-controller type:

Sublicensing allowed:

Datasets:

Hospital Episode Statistics Accident and Emergency
Hospital Episode Statistics Admitted Patient Care
Hospital Episode Statistics Critical Care
Hospital Episode Statistics Outpatients
Office for National Statistics Mortality Data

Type of data:

Objectives:

This agreement supersedes NIC-330769-C9Y8Y as a new DSA

Improving the transition of young people from child-centred to adult-centred health care systems has been a health policy priority for successive governments. There is clear evidence that poor transitions result in poor health outcomes, particularly related to drop-out from healthcare at a key time in chronic disease management. Conversely, there is now reasonable quality evidence that improving transitions improves outcomes in diseases such as diabetes and chronic renal disease/transplantation. Such evidence has resulted in a decade of clear guidance to health services on the importance of providing good transitional care in England. Yet despite this, and subsequent DH initiatives, such as the Transitions Champions programme, there appears to have been little change in the majority of health services in England. It is anecdotally believed that the majority of young people with long-standing conditions in England do not receive appropriate transitional care, although routine data on transitional care are not collected.
The objective for processing the data is to investigate, using a contemporary UK sample, the effect of transitioning from paediatric to adult care on indicators of illness management relating primarily to health service usage. UCL will also investigate how specific features of the transition to adult services are associated with these outcomes to contribute to the evidence-base regarding the features of successful transitions. This research is limited to the following three purposes:
1) Examine the health impact of transitions from paediatric to adult care.
This will help determine the extent to which current transition arrangements are fit for purpose. This entails examining whether transition itself is associated with detrimental health-related outcomes such as changes in planned health service use, increased use of inpatient, A&E and critical care services and changes in the frequency of missed outpatient appointments.

2) Guide healthcare policy to improve health transitions.
This research will identify factors associated with good transition outcomes to guide policy efforts to improve healthcare transitions. First, UCL will examine how age of transition is associated with transition-related outcomes. This may influence guidance regarding appropriate age of transition. Secondly, UCL will examine the impact of the frequency of outpatient appointments in paediatric care on adult outcomes. Finally, UCL will examine differences in transition outcomes across sentinel health conditions and specialties: diabetes, renal disease and gastroenterology. This will identify areas requiring improvement in transition approaches.

3) Contribute to the development of a measurable outcome metric for transition that could be included in the NHS outcomes framework, to drive improved attention to transition by providers.
The research will constitute a trial of an approach to measuring transition outcomes using routine health data. Health outcomes which are found to be associated with health transitions could form the basis of quality measures for transition across regions, specialties or health authorities.

Expected Benefits:

1) Improved transition-related health policy.
This research will guide Department of Health policy on the need, as well as specific policy measures, for improving transitions from paediatric care. This research will inform policy recommendations regarding appropriate age of transition and measures for reducing upheaval to service provision during the transition to adult services including frequency of appointments in late paediatric care and improving retention in adult services.
2) The development of recommendations of quality indicators for health transition.
UCL will develop recommendations of indicators of high-quality transition (based on the outcomes studied within the research). With support from the Department of Health, these recommendations would help health services identify where improved transition is justified and may subsequently contribute to the development of standards for transition quality.
Subsequent benefits resulting from this research would relate to improved health outcomes and healthcare (including, for example, retention in adult services, fewer missed appointments and less usage of emergency services) stemming from improved transition services.
Planned target date is December 2016.

Outputs:

Two key outputs will be produced based on the data analyses:
Peer-reviewed publication(s) designed to disseminate the findings regarding the effect of transition from paediatric to adult care on subsequent service use including inpatient and A&E use, missed hospital appointments and changes in the frequency of healthcare appointments. Paediatrics journals, the Lancet and the Journal of Public Health are the planned targets. UCL aims to publish by the end of 2016.
These publications will target researchers and practitioners to stimulate debate and subsequent research regarding alternative transition models, evidence-based improvements to existing transition services and reducing the negative impacts of poor transition.
The Department of Health, who are funding the research, will be sent briefing reports in mid-2016 (before they are published in peer review journals) summarising the key findings and suggesting policy measures for maximising health outcomes resulting from service transition. Policy guidance may include the development of measures for monitoring transition care and outcomes.
Outputs will include aggregated data only and will be limited to:
- Average age of transition overall and across sentinel health conditions
- Frequency of inpatient appointments pre- and post-transition
- Frequency of inpatient, A&E and critical care appointments pre- and post-transition
- Frequency of missed appointments pre- and post-transition

Due to the nature of the aggregated data in the outputs, there will be no cases of small numbers requiring suppression.

Processing:

Data will be stored in and accessed through UCL’s ‘Information Data Safe Haven’ (IDHS) which ensures the appropriate and safe handling of sensitive data (see: http://www.ucl.ac.uk/isd/itforslms/services/handling-sens-data/tech-soln). Only authorised UCL staff members will have access to the data and it will not be accessible by any third parties, nor will it be accessed outside the UK.
Data analyses will be conducted in Stata to extract summary statistics. UCL will determine age of transition with reference to paediatric and adult codes within the data. UCL will also define two additional characteristics of transition for each patient: the delay in transition (the gap between last paediatric code and first adult code), and retention in adult services (including changes in regular, planned outpatient appointments). The analysis is limited to comparing outcomes pre and post transition, and examining the effect of age of transition, delays in transition to adult services and retention in adult services on these outcomes. UCL will conduct these analyses across conditions, as well as in three sentinel conditions: renal pathologies, diabetes and gastroenterological diseases.
Outcomes are limited to use of inpatient care, A&E attendances, frequency of hospital admissions, critical care admissions, and mortality.
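
The derivation of transition characteristics described above might look roughly like the following sketch. The application states that the analysis is conducted in Stata; Python is used here purely for illustration. The `Episode` record, the "paediatric"/"adult" labels and the choice to take age at the first adult code as the age of transition are all assumptions made for this example, and are not details given in the application.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Episode:
    episode_date: date
    setting: str  # hypothetical label: "paediatric" or "adult"


def transition_summary(episodes, date_of_birth):
    """Derive per-patient transition characteristics from coded episodes.

    Returns None when the patient has no paediatric or no adult episode,
    i.e. no transition can be observed in the data.
    """
    paediatric = [e.episode_date for e in episodes if e.setting == "paediatric"]
    adult = [e.episode_date for e in episodes if e.setting == "adult"]
    if not paediatric or not adult:
        return None

    last_paediatric = max(paediatric)
    first_adult = min(adult)
    return {
        # Gap between last paediatric code and first adult code, in days.
        # A negative value would indicate overlapping paediatric and adult care.
        "delay_days": (first_adult - last_paediatric).days,
        # Assumption: age of transition is the age at the first adult code.
        "age_at_transition": (first_adult - date_of_birth).days / 365.25,
    }


# Made-up example patient
example = transition_summary(
    [
        Episode(date(2010, 1, 10), "paediatric"),
        Episode(date(2011, 6, 1), "paediatric"),
        Episode(date(2011, 12, 1), "adult"),
        Episode(date(2012, 3, 1), "adult"),
    ],
    date_of_birth=date(1994, 6, 1),
)
```

Per-patient values like these would only feed the aggregated outputs listed earlier, such as average age of transition; they would not be reported at record level.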